Michael Ekstrand 0ce8407694

README.md

Data files go in this directory

Library of Congress

https://www.loc.gov/cds/products/MDSConnect-books_all.html

Download all 42 MARC-XML files to a subdirectory called LOC.
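
The download step can be scripted. The following is a minimal sketch, not the repository's own tooling: `download_all` and `check_count` are hypothetical helper names, and the list of file URLs is an assumption you must fill in yourself from the Library of Congress page above, since this README does not list them. It uses only the Python standard library.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def download_all(urls, dest_dir="LOC"):
    """Download each URL into dest_dir, skipping files that already exist.

    Hypothetical helper: the caller supplies the URLs, which are not
    given in this README.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        # Name the local file after the last path component of the URL.
        name = Path(urlparse(url).path).name
        target = dest / name
        if not target.exists():
            with urllib.request.urlopen(url) as resp, open(target, "wb") as out:
                shutil.copyfileobj(resp, out)
        saved.append(target)
    return saved


def check_count(dest_dir="LOC", expected=42):
    """Return True if dest_dir holds exactly the expected number of files."""
    return sum(1 for p in Path(dest_dir).iterdir() if p.is_file()) == expected
```

After running `download_all` with the full URL list, `check_count()` gives a quick confirmation that all 42 files landed in LOC. Existing files are skipped, so an interrupted run can be restarted, although a partially written file would need to be deleted by hand first.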

OpenLibrary

https://openlibrary.org/developers/dumps

BookCrossing

http://www2.informatik.uni-freiburg.de/~cziegler/BX/

Amazon ratings

http://jmcauley.ucsd.edu/data/amazon/

Download the ratings-only file for Books.