Lazyweb: Getting public domain books out of Gutenberg en masse and onto a Kindle

I'm trying to figure out how to slurp down every single book in Gutenberg to throw onto a Kindle. It's easy enough to download the Gutenberg archive, but everything comes down either as .txt (with funky line breaks and no title/author information in the Kindle's index page) or as PDF, which I would have to pay Amazon to convert or run through a conversion program one at a time.
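For the funky line breaks, here is a rough Python sketch of the kind of cleanup I have in mind. It assumes paragraphs in the .txt files are separated by blank lines and that every other line break is just hard wrapping; `reflow` is a name made up for illustration, not part of any existing tool.

```python
import re


def reflow(text: str) -> str:
    """Join hard-wrapped lines so each paragraph becomes a single line.

    Assumption: paragraphs are separated by one or more blank lines.
    """
    # Normalize Windows-style line endings first.
    text = text.replace("\r\n", "\n")
    # Split on blank (or whitespace-only) lines to get paragraphs.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Collapse every run of whitespace inside a paragraph to one space.
    return "\n\n".join(" ".join(p.split()) for p in paragraphs)
```

This would also flatten anything that relies on its line breaks, such as verse or tables of contents, so it is only a starting point.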

There are sites like Feedbooks and Manybooks that have some of the works pre-formatted as Mobi, but no page from which to slurp down the entire library. I even tried writing a script to grab everything from Manybooks, but clicking the download link triggers something that renames the file with the proper author and title, and that doesn't happen when you use curl or wget. The only way to download every title they offer is by number, which makes it impossible for me to tell which file is which book and to remove the ones in other languages.
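For the which-file-is-which problem, one possible workaround is to read the names out of the files themselves. The sketch below only applies to plain-text files, and it assumes each one starts with a header containing `Title:`, `Author:` and `Language:` lines; it does nothing for Mobi files. The function names and the "Author - Title.txt" naming scheme are my own inventions.

```python
import re

# Assumed header layout: lines such as "Title: ...", "Author: ...",
# "Language: ..." somewhere near the top of the file.
HEADER_FIELD = re.compile(
    r"^(Title|Author|Language):[ \t]*(.+?)[ \t\r]*$", re.MULTILINE
)


def read_header(text: str, max_chars: int = 5000) -> dict:
    """Collect Title/Author/Language values from the start of a text."""
    fields = {}
    for key, value in HEADER_FIELD.findall(text[:max_chars]):
        fields.setdefault(key.lower(), value)  # keep the first occurrence
    return fields


def kindle_name(fields: dict, wanted_language: str = "English"):
    """Return 'Author - Title.txt', or None if the file should be skipped."""
    if fields.get("language") != wanted_language or "title" not in fields:
        return None
    raw = f"{fields.get('author', 'Unknown')} - {fields['title']}"
    # Replace characters that are unsafe in file names on common systems.
    return re.sub(r'[\\/:*?"<>|]+', "_", raw).strip() + ".txt"
```

Looping over a directory of numbered files, calling `read_header` on each and renaming to whatever `kindle_name` returns, would sort the English titles from the rest. Files without the assumed header simply get skipped.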

It's supposed to be a gift: most of the Western canon (especially any philosophy texts I can gather) on a single device. But I'm having a devil of a time getting it all into the right formats, and I'm a little surprised no one else has done the same and thrown it all up in a torrent. Any ideas?


This work is licensed under a Creative Commons License permitting non-commercial sharing with attribution. Boing Boing is a trademark of Happy Mutants LLC in the United States and other countries.
