Genome Aggregation Database (gnomAD)

Stream data with DagsHub Direct Data Access (DDA):

from dagshub.streaming import DagsHubFilesystem

fs = DagsHubFilesystem(".", repo_url="https://dagshub.com/DagsHub-Datasets/broad-gnomad-dataset")

fs.listdir("s3://gnomad-public-us-east-1")
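The `listdir` call above returns its result without displaying it. A minimal sketch of a helper that sorts and prints the entries, assuming `listdir` returns a list of entry names the way `os.listdir` does. `show_entries` is a hypothetical name used for illustration, not part of the DagsHub client:

```python
def show_entries(fs, path="s3://gnomad-public-us-east-1"):
    """Print and return the sorted entries found under `path`.

    `fs` is any object exposing an os-like `listdir(path)` method, for
    example the DagsHubFilesystem instance created above. Assumption:
    `listdir` returns a list of entry names (strings).
    """
    entries = sorted(fs.listdir(path))
    for name in entries:
        print(name)
    return entries
```

With the `fs` object created above, `show_entries(fs)` would print the top-level entries of the bucket, one per line.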

Description:

The Genome Aggregation Database (gnomAD) is a resource developed by an international coalition of investigators that aggregates and harmonizes both exome and genome data from a wide range of large-scale human sequencing projects. The summary data provided here are released for the benefit of the wider scientific community without restriction on use. The v2 data set (GRCh37) spans 125,748 exome sequences and 15,708 whole-genome sequences from unrelated individuals. The v3 data set (GRCh38) spans 71,702 genomes, selected as in v2. Sign up for the gnomAD mailing list here.

Update Frequency:

Data from new releases are made public as soon as they are available. New releases, including both minor and major versions, have historically been issued on the order of once per year.

Managed By:

gnomAD Production Team at the Broad Institute

Resources:

  1. resource:
    • Description: gnomAD summary data aggregated from large-scale human genome and exome sequencing projects.
    • ARN: arn:aws:s3:::gnomad-public-us-east-1
    • Region: us-east-1
    • Type: S3 Bucket
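The ARN and region listed above are enough to reach the bucket directly with an S3 client. The following is a sketch only, assuming the bucket permits anonymous (unsigned) listing and that `boto3` is installed. `bucket_from_arn` and `list_top_level` are hypothetical helper names, not part of any gnomAD or DagsHub tooling:

```python
S3_ARN_PREFIX = "arn:aws:s3:::"


def bucket_from_arn(arn):
    """Extract the bucket name from an S3 ARN such as the one listed above."""
    if not arn.startswith(S3_ARN_PREFIX):
        raise ValueError(f"not an S3 ARN: {arn!r}")
    # Drop the prefix and any object key that may follow the bucket name.
    return arn[len(S3_ARN_PREFIX):].split("/", 1)[0]


def list_top_level(arn, region="us-east-1", max_keys=50):
    """Return the top-level prefixes ("folders") of the bucket.

    Assumption: the bucket allows unsigned requests. boto3 is imported
    lazily so that bucket_from_arn stays usable without it.
    """
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    client = boto3.client(
        "s3",
        region_name=region,
        config=Config(signature_version=UNSIGNED),  # no AWS credentials needed
    )
    response = client.list_objects_v2(
        Bucket=bucket_from_arn(arn),
        Delimiter="/",  # group keys by their first path component
        MaxKeys=max_keys,
    )
    return [p["Prefix"] for p in response.get("CommonPrefixes", [])]
```

Calling `list_top_level("arn:aws:s3:::gnomad-public-us-east-1")` would issue a single unsigned listing request against the us-east-1 bucket. A single `list_objects_v2` call returns one page of results, so a bucket with many top-level prefixes would need pagination.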

Tags:

aws-pds, population genetics, population, whole genome sequencing, genomic, genetic, life sciences, bioinformatics, short read sequencing

Publication:

  1. publication:

    • Title: The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020)
    • URL: https://doi.org/10.1038/s41586-020-2308-7
    • AuthorName: Karczewski, K. J., Francioli, L. C., Tiao, G., Cummings, B. B., Alföldi, J., Wang, Q., Collins, R. L., Laricchia, K. M., Ganna, A., Birnbaum, D. P., Gauthier, L. D., Brand, H., Solomonson, M., Watts, N. A., Rhodes, D., Singer-Berk, M., England, E. M., Seaby, E. G., Kosmicki, J. A., ... MacArthur, D. G.
  2. publication:

    • Title: A structural variation reference for medical and population genetics. Nature 581, 444–451 (2020)
    • URL: https://doi.org/10.1038/s41586-020-2287-8
    • AuthorName: Collins, R. L., Brand, H., Karczewski, K. J., Zhao, X., Alföldi, J., Francioli, L. C., Khera, A. V., Lowther, C., Gauthier, L. D., Wang, H., Watts, N. A., Solomonson, M., O’Donnell-Luria, A., Baumann, A., Munshi, R., Walker, M., Whelan, C., Huang, Y., Brookings, T., ... Talkowski, M. E.
  3. publication:

    • Title: Transcript expression-aware annotation improves rare variant interpretation. Nature 581, 452–458 (2020)
    • URL: https://doi.org/10.1038/s41586-020-2329-2
    • AuthorName: Cummings, B. B., Karczewski, K. J., Kosmicki, J. A., Seaby, E. G., Watts, N. A., Singer-Berk, M., Mudge, J. M., Karjalainen, J., Kyle Satterstrom, F., O’Donnell-Luria, A., Poterba, T., Seed, C., Solomonson, M., Alföldi, J., The Genome Aggregation Database Production Team, The Genome Aggregation Database Consortium, Daly, M. J., & MacArthur, D. G.
  4. publication:

    • Title: Evaluating potential drug targets through human loss-of-function genetic variation. Nature 581, 459–464 (2020)
    • URL: https://doi.org/10.1038/s41586-020-2267-z
    • AuthorName: Minikel, E. V., Karczewski, K. J., Martin, H. C., Cummings, B. B., Whiffin, N., Rhodes, D., Alföldi, J., Trembath, R. C., van Heel, D. A., Daly, M. J., Genome Aggregation Database Production Team, Genome Aggregation Database Consortium, Schreiber, S. L., & MacArthur, D. G.
  5. publication:

    • Title: Landscape of multi-nucleotide variants in 125,748 human exomes and 15,708 genomes. Nature Communications 11, 2539 (2020)
    • URL: https://doi.org/10.1038/s41467-019-12438-5
    • AuthorName: Wang, Q., Pierce-Hoffman, E., Cummings, B. B., Karczewski, K. J., Alföldi, J., Francioli, L. C., Gauthier, L. D., Hill, A. J., O’Donnell-Luria, A. H., Genome Aggregation Database (gnomAD) Production Team, Genome Aggregation Database (gnomAD) Consortium, & MacArthur, D. G.
  6. publication:

    • Title: The effect of LRRK2 loss-of-function variants in humans. Nature Medicine (2020)
    • URL: https://doi.org/10.1038/s41591-020-0893-5
    • AuthorName: Whiffin, N., Armean, I. M., Kleinman, A., Marshall, J. L., Minikel, E. V., Goodrich, J. K., Quaife, N. M., Cole, J. B., Wang, Q., Karczewski, K. J., Cummings, B. B., Francioli, L., Laricchia, K., Guan, A., Alipanahi, B., Morrison, P., Baptista, M. A. S., Merchant, K. M., Genome Aggregation Database Production Team, ... MacArthur, D. G.
  7. publication:

    • Title: Characterising the loss-of-function impact of 5’ untranslated region variants in 15,708 individuals. Nature Communications 11, 2523 (2020)
    • URL: https://doi.org/10.1038/s41467-019-10717-9
    • AuthorName: Whiffin, N., Karczewski, K. J., Zhang, X., Chothani, S., Smith, M. J., Gareth Evans, D., Roberts, A. M., Quaife, N. M., Schafer, S., Rackham, O., Alföldi, J., O’Donnell-Luria, A. H., Francioli, L. C., Genome Aggregation Database (gnomAD) Production Team, Genome Aggregation Database (gnomAD) Consortium, Cook, S. A., Barton, P. J. R., MacArthur, D. G., & Ware, J. S.
  8. publication:

    • Title: Technical artifact drives apparent deviation from Hardy-Weinberg equilibrium at CCR5-∆32 and other variants in gnomAD. bioRxiv (p. 784157)
    • URL: https://doi.org/10.1101/784157
    • AuthorName: Karczewski, K. J., Gauthier, L. D., Daly, M. J.
  9. publication:

    • Title: Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016)
    • URL: https://doi.org/10.1038/nature19057
    • AuthorName: Lek, M., Karczewski, K., Minikel, E. et al.
About

broad-gnomad-dataset originates from the Registry of Open Data on AWS.
