1000 Genomes Phase 3 Reanalysis with DRAGEN 3.5 and 3.7

Stream data with DDA:

from dagshub.streaming import DagsHubFilesystem

fs = DagsHubFilesystem(".", repo_url="https://dagshub.com/DagsHub-Datasets/ilmn-dragen-1kgp-dataset")

fs.listdir("s3://1000genomes-dragen")
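The listing above can be wrapped in a small helper that narrows the results to particular file types. This is a minimal sketch, not part of the dataset documentation: `filter_by_suffix` and `list_bucket_entries` are hypothetical names, the file names in the offline example are made up, and the code assumes `fs.listdir` returns an iterable of entry names.

```python
from typing import Iterable, List, Tuple


def filter_by_suffix(names: Iterable[str], suffixes: Tuple[str, ...]) -> List[str]:
    """Return the names ending with any of the given suffixes, sorted."""
    return sorted(name for name in names if name.endswith(suffixes))


def list_bucket_entries(path: str = "s3://1000genomes-dragen") -> List[str]:
    """List entries under `path` through DagsHub streaming (needs network access)."""
    # Imported lazily so filter_by_suffix works without dagshub installed.
    from dagshub.streaming import DagsHubFilesystem

    fs = DagsHubFilesystem(
        ".",
        repo_url="https://dagshub.com/DagsHub-Datasets/ilmn-dragen-1kgp-dataset",
    )
    # Assumption: listdir yields entry names that can be converted to str.
    return [str(entry) for entry in fs.listdir(path)]


# Offline illustration with made-up names (not actual bucket contents).
example_names = ["sample.bam", "sample.vcf.gz", "notes.txt"]
print(filter_by_suffix(example_names, (".bam", ".vcf.gz")))
```

To filter a live listing, pass the result of `list_bucket_entries()` to `filter_by_suffix`.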

Description:

This dataset contains alignment files and short nucleotide, copy number, repeat expansion (STR) and structural variant call files from the 1000 Genomes Project Phase 3 dataset (n=3202), generated with Illumina DRAGEN v3.5.7b and v3.7.6 software. The v3.7.6 dataset also includes results from joint small variant, de novo structural variant, de novo copy number variant and repeat expansion calls on 602 trio families composed of members from the 1000 Genomes Project Phase 3 dataset, as well as DRAGEN gVCF Genotyper (v3.8.3) analysis of the entire dataset (n=3202). Improvements and new features in the v3.7.6 individual sample analyses include CYP2D6 variant calling and joint detection (see ‘DRAGEN 3.7 User Guide’ for details on these features) and the use of graph-based hg19 and hg38 reference hash tables (see ‘DRAGEN Wins at PrecisionFDA Truth Challenge V2 Showcase Accuracy Gains from Alt-aware Mapping and Graph Reference Genomes’ for details).

Update Frequency:

Files may be updated following changes to the 1000 Genomes Project dataset or the release of select new DRAGEN features or offerings.

Managed By:

https://www.illumina.com/products/by-type/informatics-products/dragen-bio-it-platform.html

Resources:

  1. resource:

    • Description: BAM, SNV-vcf, SNV-gvcf, STR-vcf, STR-bam, SV-vcf, ROH-vcf, CNV-vcf, CNV-bw, metrics and other supporting files from DRAGEN v3.5.7b analyses in a public S3 bucket.
    • ARN: arn:aws:s3:::1000genomes-dragen
    • Region: us-west-2
    • Type: S3 Bucket
  2. resource:

    • Description: BAM, SNV-vcf, SNV-gvcf, STR-vcf, STR-bam, SV-vcf, ROH-vcf, CNV-vcf, CNV-bw, cyp2d6-tsv, metrics and other supporting files from DRAGEN v3.7.6 analyses in a public S3 bucket.
    • ARN: arn:aws:s3:::1000genomes-dragen-3.7.6
    • Region: us-west-2
    • Type: S3 Bucket
  3. resource:

    • Description: BAM, SNV-vcf, SNV-gvcf, STR-vcf, STR-bam, SV-vcf, ROH-vcf, CNV-vcf, CNV-bw, cyp2d6-tsv, metrics and other supporting files from DRAGEN v3.7.6 analyses in a public S3 bucket. This bucket is a clone of the 1000genomes-dragen-3.7.6 bucket, hosted in the us-east-1 region.
    • ARN: arn:aws:s3:::1000genomes-dragen-v3.7.6
    • Region: us-east-1
    • Type: S3 Bucket
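The resource entries above give each bucket as an ARN plus a region. As a rough sketch, the bucket name can be derived from the ARN and turned into an `s3://` URI or an AWS CLI listing command. The helper names here are hypothetical, and the command assumes the AWS CLI is installed and that the buckets accept anonymous requests (they are described as public, so `--no-sign-request` should apply).

```python
from typing import List, NamedTuple


class Bucket(NamedTuple):
    arn: str
    region: str


# ARNs and regions copied from the resource list.
BUCKETS = [
    Bucket("arn:aws:s3:::1000genomes-dragen", "us-west-2"),
    Bucket("arn:aws:s3:::1000genomes-dragen-3.7.6", "us-west-2"),
    Bucket("arn:aws:s3:::1000genomes-dragen-v3.7.6", "us-east-1"),
]

S3_ARN_PREFIX = "arn:aws:s3:::"


def bucket_name(arn: str) -> str:
    """Extract the bucket name from an S3 bucket ARN."""
    if not arn.startswith(S3_ARN_PREFIX):
        raise ValueError(f"not an S3 bucket ARN: {arn!r}")
    return arn[len(S3_ARN_PREFIX):]


def s3_uri(arn: str) -> str:
    """Build the s3:// URI for the bucket root."""
    return f"s3://{bucket_name(arn)}/"


def cli_listing_command(bucket: Bucket) -> List[str]:
    """Build an anonymous AWS CLI command that lists the bucket root."""
    return [
        "aws", "s3", "ls", s3_uri(bucket.arn),
        "--region", bucket.region,
        "--no-sign-request",
    ]


for bucket in BUCKETS:
    print(" ".join(cli_listing_command(bucket)))
```

The commands are only printed, not executed, so the sketch runs without network access; each printed line can be pasted into a shell.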

Tags:

aws-pds, life sciences, health, biology, genetic, genomic, bam, vcf

About

ilmn-dragen-1kgp-dataset originates from the Registry of Open Data on AWS.
