Genome Aggregation Database (gnomAD) - Data Lakehouse Ready

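Stream data with DagsHub Direct Data Access (DDA):

from dagshub.streaming import DagsHubFilesystem

# Create a streaming filesystem rooted at the current directory and backed by the DagsHub repository
fs = DagsHubFilesystem(".", repo_url="https://dagshub.com/DagsHub-Datasets/gnomad-data-lakehouse-ready-dataset")

# List the contents of the gnomAD prefix in the connected S3 bucket
fs.listdir("s3://aws-roda-hcls-datalake/gnomad")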
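
Once a listing is available, a small helper can narrow it down to Parquet files. The following is a minimal sketch, not part of the DagsHub client: `list_parquet` and `SupportsListdir` are hypothetical names, and the helper only assumes an object exposing `listdir(path)` that returns file names as strings. The layout under the gnomAD prefix is not described in this README, so the path to pass in is left to the caller.

```python
from typing import Iterable, List, Protocol


class SupportsListdir(Protocol):
    """Anything exposing listdir(path), e.g. the `fs` object created above (assumption)."""

    def listdir(self, path: str) -> Iterable[str]: ...


def list_parquet(fs: SupportsListdir, path: str) -> List[str]:
    """Return the sorted entry names under `path` that end in .parquet."""
    return sorted(name for name in fs.listdir(path) if name.endswith(".parquet"))
```

With the filesystem from the snippet above, this could be called as `list_parquet(fs, "s3://aws-roda-hcls-datalake/gnomad")`; an empty result would suggest the Parquet files sit in subdirectories of that prefix.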

Description:

The Genome Aggregation Database (gnomAD) is a resource developed by an international coalition of investigators that aggregates and harmonizes both exome and genome data from a wide range of large-scale human sequencing projects. Sign up for the gnomAD mailing list here. This dataset was derived from summary data from gnomAD release 3.1 and is available on the Registry of Open Data on AWS for ready enrollment into the Data Lake as Code.

Update Frequency:

Not updated

Managed By:

https://aws.amazon.com/

Resources:

  1. resource:
    • Description: Parquet representations of gnomAD summary data aggregated from gnomAD release 3.1
    • ARN: arn:aws:s3:::aws-roda-hcls-datalake/gnomad
    • Region: us-east-1
    • Type: S3 Bucket
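
The ARN and region above are enough to address the bucket directly, outside of DagsHub. The sketch below splits the ARN into a bucket name and key prefix, then lists the top-level "folders" under that prefix with boto3. The function names are hypothetical, and the unsigned (anonymous) client is an assumption: it only works if the bucket permits public listing, which this README does not state. boto3 is imported inside the listing function so the ARN helper can be used without it.

```python
from typing import Iterator, Tuple

RESOURCE_ARN = "arn:aws:s3:::aws-roda-hcls-datalake/gnomad"
REGION = "us-east-1"


def split_s3_arn(arn: str) -> Tuple[str, str]:
    """Split an S3 ARN into (bucket, key_prefix); key_prefix may be empty."""
    head = "arn:aws:s3:::"
    if not arn.startswith(head):
        raise ValueError(f"not an S3 ARN: {arn!r}")
    bucket, _, key_prefix = arn[len(head):].partition("/")
    return bucket, key_prefix


def iter_top_level_prefixes(arn: str = RESOURCE_ARN, region: str = REGION) -> Iterator[str]:
    """Yield the immediate sub-prefixes under the ARN's key prefix.

    Assumes anonymous listing is allowed on the bucket (not confirmed here).
    """
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    bucket, key_prefix = split_s3_arn(arn)
    # Normalise to "prefix/" so that only children of the prefix are returned
    prefix = key_prefix.rstrip("/") + "/" if key_prefix else ""
    client = boto3.client(
        "s3", region_name=region, config=Config(signature_version=UNSIGNED)
    )
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter="/"):
        for entry in page.get("CommonPrefixes", []):
            yield entry["Prefix"]
```

Calling `list(iter_top_level_prefixes())` requires network access and boto3; `split_s3_arn(RESOURCE_ARN)` works offline.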

Tags:

biology, bioinformatics, biotech blueprint, genomic, genetic, life sciences, parquet, population genetics, vcf, whole genome sequencing

Tutorials:

  1. tutorial:
About

gnomad-data-lakehouse-ready-dataset originates from the Registry of Open Data on AWS.
