
Open Bioinformatics Reference Data for Galaxy

Stream data with DagsHub Direct Data Access (DDA):

from dagshub.streaming import DagsHubFilesystem

fs = DagsHubFilesystem(".", repo_url="https://dagshub.com/DagsHub-Datasets/open-bio-ref-data-dataset")

fs.listdir("s3://biorefdata")
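To keep the listing manageable while exploring, the `fs.listdir` call above can be wrapped in a small helper. This is a minimal sketch, not part of the DagsHub client: `list_entries` and its `limit` parameter are hypothetical names introduced here, and the helper assumes only that the filesystem object exposes `listdir(path)` returning an iterable of names, as in the snippet above.

```python
from typing import Any, List


def list_entries(fs: Any, path: str = "s3://biorefdata", limit: int = 20) -> List[str]:
    """Return up to `limit` entry names under `path`, sorted alphabetically.

    `fs` is any object with a `listdir(path)` method, for example the
    DagsHubFilesystem instance created in the snippet above (assumption:
    `listdir` yields names that can be converted to `str`).
    """
    if limit < 0:
        raise ValueError("limit must be non-negative")
    # Sort so that repeated runs print the entries in a stable order.
    names = sorted(str(name) for name in fs.listdir(path))
    return names[:limit]
```

With the `fs` object from the snippet above, `list_entries(fs, limit=10)` would return the first ten names at the top level of the bucket. The call needs network access and the `dagshub` package installed.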

Description:

This dataset provides genomic reference data and software packages for use with Galaxy and Bioconductor applications. The reference data covers hundreds of reference genomes and is formatted for use with a variety of tools. The bundled configuration files let a local Galaxy server use the data without additional preparation. Bioconductor's AnnotationHub and ExperimentHub data are also provided; they can be downloaded and installed into any R environment through the corresponding R packages.

Update Frequency:

Periodically but not on a set schedule.

Managed By:

Galaxy and Bioconductor Projects

Resources:

  1. resource:
    • Description: The bucket holds two kinds of data: versioned objects containing the reference data, and data packages for use with the R programming language. The reference data is primarily intended for use with the Galaxy application and has been formatted for easy configuration. Additional formats may be provided in the future. The packages, as well as the AnnotationHub and ExperimentHub data, are intended for use with R / Bioconductor.

    • ARN: arn:aws:s3:::biorefdata

    • Region: ap-southeast-2

    • Type: S3 Bucket
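The ARN and region above are enough to address the bucket directly over HTTPS, without DagsHub. The sketch below uses only the Python standard library: it derives the bucket name from the ARN, builds a virtual-hosted-style S3 `ListObjectsV2` URL, and parses the XML response. The function names are illustrative, and `fetch_listing` assumes the bucket permits anonymous listing (suggested by the `aws-pds` tag, but not confirmed here).

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET
from typing import List, Union

ARN = "arn:aws:s3:::biorefdata"   # from the resource entry above
REGION = "ap-southeast-2"         # from the resource entry above
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"  # namespace of S3 XML responses


def bucket_from_arn(arn: str) -> str:
    """Extract the bucket name from an S3 ARN such as arn:aws:s3:::name."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "s3":
        raise ValueError(f"not an S3 ARN: {arn!r}")
    # The resource part may carry an object key after the bucket name.
    return parts[5].split("/", 1)[0]


def list_url(bucket: str, region: str, prefix: str = "", max_keys: int = 50) -> str:
    """Build a ListObjectsV2 URL that lists one 'directory' level under prefix."""
    query = urllib.parse.urlencode(
        {"list-type": "2", "delimiter": "/", "prefix": prefix, "max-keys": str(max_keys)}
    )
    return f"https://{bucket}.s3.{region}.amazonaws.com/?{query}"


def parse_listing(xml_data: Union[str, bytes]) -> List[str]:
    """Return sub-prefixes first, then object keys, from a ListObjectsV2 response."""
    root = ET.fromstring(xml_data)
    prefixes = [e.text for e in root.findall(f"{S3_NS}CommonPrefixes/{S3_NS}Prefix") if e.text]
    keys = [e.text for e in root.findall(f"{S3_NS}Contents/{S3_NS}Key") if e.text]
    return prefixes + keys


def fetch_listing(prefix: str = "", timeout: float = 30.0) -> List[str]:
    """List one level of the bucket; assumes anonymous listing is allowed."""
    url = list_url(bucket_from_arn(ARN), REGION, prefix)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_listing(response.read())
```

Calling `fetch_listing()` would return the top-level prefixes and keys, and passing one of the returned prefixes back in descends one level. Results are capped by `max_keys`; this sketch does not follow continuation tokens.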

Tags:

aws-pds, bioinformatics, biology, genetic, genomic, life sciences, reference index

Publication:

  1. publication:

  2. publication:

    • Title: TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages
    • URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5302158/
    • AuthorName: Tiago C. Silva, Antonio Colaprico, Catharina Olsen, Fulvio D'Angelo, Gianluca Bontempi, Michele Ceccarelli, Houtan Noushmehr
  3. publication:

    • Title: Accessible, curated metagenomic data through ExperimentHub
    • URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5862039/
    • AuthorName: Edoardo Pasolli, Lucas Schiffer, Paolo Manghi, Audrey Renson, Valerie Obenchain, Duy Tin Truong, Francesco Beghini, Faizan Malik, Marcel Ramos, Jennifer B Dowd, Curtis Huttenhower, Martin Morgan, Nicola Segata, Levi Waldron

About

open-bio-ref-data-dataset originates from the Registry of Open Data on AWS.
