
Common Screens

Stream data with DDA:

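from dagshub.streaming import DagsHubFilesystem

# Treat the current directory as the project root and stream files from the DagsHub repo
fs = DagsHubFilesystem(".", repo_url="https://dagshub.com/DagsHub-Datasets/comonscreens-dataset")

# List the contents of the connected storage bucket
fs.listdir("s3://common-screens")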

Description:

A corpus of web screenshots and associated metadata covering over 70 million websites.

Contact:

A corpus of web screenshot and metadata data composed of over 70 million websites.

Update Frequency:

Monthly

Managed By:

https://commonscreens.com/

Resources:

  1. resource:

    • Description: Common Screens (jpeg and csv format)
    • ARN: arn:aws:s3:::common-screens
    • Region: us-west-2
    • Type: S3 Bucket
  2. resource:

    • Description: Cloudfront CDN distribution for hotlinking screenshots
    • Host: dqh5x5k6xg3n1.cloudfront.net
    • Region: us-west-2
    • Type: CloudFront Distribution
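
The two resources above can be addressed over HTTPS. The following is a minimal sketch, not an official client: it assumes the bucket permits anonymous reads, that the CloudFront distribution serves objects under the same keys as the bucket, and it uses a hypothetical object key, since the README does not document the key layout.

```python
from urllib.parse import quote

# Values taken from the resource list above.
BUCKET = "common-screens"   # from ARN arn:aws:s3:::common-screens
REGION = "us-west-2"
CDN_HOST = "dqh5x5k6xg3n1.cloudfront.net"


def s3_https_url(key: str) -> str:
    """Build a virtual-hosted-style S3 URL for an object key."""
    # quote() percent-encodes unsafe characters but keeps "/" separators.
    return f"https://{BUCKET}.s3.{REGION}.amazonaws.com/{quote(key)}"


def cdn_url(key: str) -> str:
    """Build a CloudFront URL, assuming CDN paths mirror the S3 keys."""
    return f"https://{CDN_HOST}/{quote(key)}"


if __name__ == "__main__":
    # Hypothetical key for illustration only; real keys must be discovered
    # by listing the bucket.
    example_key = "path/to/screenshot.jpg"
    print(s3_https_url(example_key))
    print(cdn_url(example_key))
```

Listing the bucket first (for example with `fs.listdir` as shown earlier) is the safer way to find real keys before constructing URLs like these.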

Tags:

aws-pds, encyclopedic, natural language processing, internet

Tutorials:

  1. tutorial:

About

comonscreens-dataset originates from the Registry of Open Data on AWS.
