Michael Ekstrand 0ce8407694
update dvc
3 years ago
README.md

Data files go

Library of Congress

https://www.loc.gov/cds/products/MDSConnect-books_all.html

Download all 42 MARC-XML files into a subdirectory called LOC.
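
After downloading, a quick check that the LOC subdirectory really holds 42 files can catch an interrupted download. The following is a minimal sketch, not part of this repository: the function name `check_loc_files` and the glob pattern `*.xml*` are assumptions, so adjust the pattern to the file names actually served on the LOC page.

```python
from pathlib import Path

EXPECTED_LOC_FILES = 42  # count stated in this README


def check_loc_files(data_dir=".", pattern="*.xml*"):
    """Return (found, ok) for the LOC subdirectory of data_dir.

    `pattern` is a hypothetical default; the real file names may differ.
    """
    loc_dir = Path(data_dir) / "LOC"
    if not loc_dir.is_dir():
        # Nothing downloaded yet, or the subdirectory has a different name.
        return 0, False
    found = sum(1 for p in loc_dir.glob(pattern) if p.is_file())
    return found, found == EXPECTED_LOC_FILES
```

Calling `check_loc_files()` from the data directory returns the number of matching files and whether it equals 42.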

OpenLibrary

https://openlibrary.org/developers/dumps

BookCrossing

http://www2.informatik.uni-freiburg.de/~cziegler/BX/

Amazon ratings

http://jmcauley.ucsd.edu/data/amazon/

Download the ratings-only file for Books.
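
To confirm that the right file was downloaded, it can help to look at its first few rows. This sketch assumes the ratings-only file is a header-less CSV with four columns (user, item, rating, timestamp); that layout and the helper name `peek_ratings` are assumptions, not something this README states.

```python
import csv
from itertools import islice


def peek_ratings(path, n=5):
    """Return the first n rows as (user, item, rating, timestamp) tuples.

    Assumes a header-less CSV with exactly four columns per row.
    """
    rows = []
    with open(path, newline="") as f:
        for user, item, rating, timestamp in islice(csv.reader(f), n):
            rows.append((user, item, float(rating), int(timestamp)))
    return rows
```

If the file has a different layout, the tuple unpacking raises a `ValueError`, which is itself a useful signal that the wrong variant was downloaded.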
