DigitalCorpora Dataset for Machine Learning

Install the DagsHub Python client:

pip install dagshub

To stream this data directly from DagsHub:

from dagshub.streaming import DagsHubFilesystem

fs = DagsHubFilesystem(".", repo_url="https://dagshub.com/DagsHub-Datasets/digitalcorpora-dataset")

fs.listdir("s3://digitalcorpora")
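Once a listing is available, it is often useful to narrow it down to one kind of artifact. The following is a minimal sketch, not part of the DagsHub API: `list_by_extension` is a hypothetical helper that works with any object exposing an `os`-style `listdir(path)` method returning a list of names, which is assumed to be the case for the `fs` object created above. The extensions used are illustrative only.

```python
from typing import Iterable, List


def list_by_extension(fs, path: str, extensions: Iterable[str]) -> List[str]:
    """Return the sorted entries under `path` whose names end with one of `extensions`.

    `fs` is any object with an os-style `listdir(path)` method (assumption:
    the streaming filesystem object behaves this way). Matching is
    case-insensitive, so ".pcap" also matches "TRACE.PCAP".
    """
    wanted = tuple(ext.lower() for ext in extensions)
    return sorted(name for name in fs.listdir(path) if name.lower().endswith(wanted))
```

With the `fs` object from the streaming example, a call might look like `list_by_extension(fs, "s3://digitalcorpora", [".pcap"])`. Whether any entries match depends on how the bucket is laid out, which is not described here.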

Description

Disk images, memory dumps, network packet captures, and files for use in digital forensics research and education. All of this information is accessible through the digitalcorpora.org website, and made available at s3://digitalcorpora/. Some of these datasets implement scenarios that were performed by students, faculty, and others acting *in persona*. As such, the information is synthetic and may be used without prior authorization or IRB approval. Details of these datasets can be found at http://www.simson.net/clips/academic/2009.DFRWS.Corpora.pdf
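Tools that do not understand `s3://` addresses need the bucket and object key separately, or an HTTPS address. The sketch below splits an S3 URI and builds the standard virtual-hosted-style S3 URL. The helper names and the example key are hypothetical, and it is an assumption that the bucket permits anonymous HTTPS reads for a given object.

```python
from typing import Tuple
from urllib.parse import quote


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an S3 URI: {uri!r}")
    bucket, _, key = uri[len(prefix):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name in: {uri!r}")
    return bucket, key


def s3_uri_to_https(uri: str) -> str:
    """Build a virtual-hosted-style HTTPS URL for an S3 URI.

    Assumption: the object is publicly readable over HTTPS.
    """
    bucket, key = split_s3_uri(uri)
    # quote() percent-encodes unsafe characters but leaves "/" intact.
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"


# Hypothetical key, for illustration only.
print(s3_uri_to_https("s3://digitalcorpora/hypothetical/path/image.E01"))
```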

Additional information

Update frequency

Quarterly

Managed by

License

There are no restrictions on the use of this data.

Related datasets

BodyM Dataset

Cloud to Street – Microsoft Flood and Clouds Dataset

A2D2: Audi Autonomous Driving Dataset

Galaxy Evolution Explorer Satellite (GALEX)
