ChEMBL - Data Lakehouse Ready

Stream data with DagsHub Direct Data Access (DDA):

from dagshub.streaming import DagsHubFilesystem

fs = DagsHubFilesystem(".", repo_url="https://dagshub.com/DagsHub-Datasets/chembl-dataset")

fs.listdir("s3://aws-roda-hcls-datalake/chembl_29/")

Description:

ChEMBL is a manually curated database of bioactive molecules with drug-like properties. It brings together chemical, bioactivity, and genomic data to aid the translation of genomic information into effective new drugs. This representation of ChEMBL is stored in Parquet format and is most easily queried through Amazon Athena. Follow the documentation for installation instructions (setup takes under 2 minutes). New ChEMBL releases occur sporadically; the most up-to-date information on ChEMBL releases can be found here.
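
Since the description points to Amazon Athena as the easiest way to query the Parquet data, here is a minimal sketch of how such a query might be submitted from Python with boto3. The database name `chembl_29`, the table name `molecule_dictionary`, and the results bucket `s3://my-athena-results/chembl/` are assumptions for illustration only; use whatever names your own Athena setup created.

```python
def build_athena_request(database: str, table: str, output_location: str, limit: int = 10) -> dict:
    """Build the keyword arguments for Athena's StartQueryExecution call."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return {
        "QueryString": f'SELECT * FROM "{table}" LIMIT {limit}',
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def start_query(request: dict, region: str = "us-east-1") -> str:
    """Submit the query and return its execution id (needs AWS credentials)."""
    import boto3  # imported lazily so build_athena_request works without boto3

    client = boto3.client("athena", region_name=region)
    return client.start_query_execution(**request)["QueryExecutionId"]


request = build_athena_request(
    database="chembl_29",                              # hypothetical database name
    table="molecule_dictionary",                       # assumed table name
    output_location="s3://my-athena-results/chembl/",  # hypothetical results bucket
)
print(request["QueryString"])
```

Calling `start_query(request)` would submit the query; it is left out of the sketch because it requires AWS credentials and an Athena database that already points at the ChEMBL Parquet files.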

Update Frequency:

Upon request. We try to update it for every odd-numbered ChEMBL release.

Managed By:

https://aws.amazon.com/

Resources:

  1. resource:

    • Description: ChEMBL 29
    • ARN: arn:aws:s3:::aws-roda-hcls-datalake/chembl_29/
    • Region: us-east-1
    • Type: S3 Bucket
  2. resource:

    • Description: ChEMBL 27
    • ARN: arn:aws:s3:::aws-roda-hcls-datalake/chembl_27/
    • Region: us-east-1
    • Type: S3 Bucket
  3. resource:

    • Description: ChEMBL 25
    • ARN: arn:aws:s3:::aws-roda-hcls-datalake/chembl_25/
    • Region: us-east-1
    • Type: S3 Bucket
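
The resources above are listed as ARNs, while the streaming snippet near the top of this README uses an `s3://` path. The following sketch shows one way to convert between the two. The helper name `s3_arn_to_uri` is hypothetical, and the code assumes the standard `aws` partition used by the ARNs listed here.

```python
S3_ARN_PREFIX = "arn:aws:s3:::"


def s3_arn_to_uri(arn: str) -> str:
    """Turn arn:aws:s3:::bucket/prefix/ into s3://bucket/prefix/."""
    if not arn.startswith(S3_ARN_PREFIX):
        raise ValueError(f"not an S3 ARN: {arn!r}")
    return "s3://" + arn[len(S3_ARN_PREFIX):]


# ARNs copied from the resource list, keyed by ChEMBL release number.
CHEMBL_ARNS = {
    29: "arn:aws:s3:::aws-roda-hcls-datalake/chembl_29/",
    27: "arn:aws:s3:::aws-roda-hcls-datalake/chembl_27/",
    25: "arn:aws:s3:::aws-roda-hcls-datalake/chembl_25/",
}

CHEMBL_URIS = {release: s3_arn_to_uri(arn) for release, arn in CHEMBL_ARNS.items()}

print(CHEMBL_URIS[29])  # s3://aws-roda-hcls-datalake/chembl_29/
```

The resulting URI for release 29 matches the path passed to `fs.listdir` in the streaming example.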

Tags:

chemistry, genomic, molecule, life sciences, biotech blueprint, parquet

About

chembl-dataset originates from the Registry of Open Data on AWS.

Collaborators 5

Comments

Loading...