How to convert all Project Madurai ebooks to 6 Inch PDF for Kindle?

http://ProjectMadurai.com hosts a large collection of old Tamil literature as HTML and A4 PDF files, all in the public domain.

Many ebook readers and tablets do not support Tamil.
To read Tamil on these devices, we can create 6 inch PDF files with the Tamil content.

Let us see how to convert all Project Madurai ebooks into 6 inch PDF files using the utilities available in GNU/Linux.

1. Get the filenames.

http://www.projectmadurai.org/pmworks.html

This page lists all the ebooks.

Copy the page content and paste it into a LibreOffice spreadsheet.
Copy the column named “unicode” and save only the filenames as a separate text file.

cat pm.txt

pmuni0001.html
pmuni0002.html
pmuni0002.html
pmuni0002.html
pmuni0002.html
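
The manual copy and paste step could also be scripted. The following is a minimal sketch, not the method used above. It assumes the links on the works page contain file names of the form pmuni followed by digits or underscores and ending in .html; names that do not fit this pattern would be missed. The function names extract_filenames and save_filenames are made up for this illustration.

```python
import re
from urllib.request import urlopen

# Assumption: Unicode editions are linked with names such as pmuni0001.html.
PMUNI_PATTERN = re.compile(r"pmuni\w+\.html")


def extract_filenames(page_html):
    """Return the pmuni*.html names found in page_html, in order, without repeats."""
    seen = set()
    names = []
    for name in PMUNI_PATTERN.findall(page_html):
        if name not in seen:
            seen.add(name)
            names.append(name)
    return names


def save_filenames(url="http://www.projectmadurai.org/pmworks.html",
                   out_path="pm.txt"):
    """Fetch the works page and write one file name per line to out_path."""
    page_html = urlopen(url).read().decode("utf-8", errors="replace")
    names = extract_filenames(page_html)
    with open(out_path, "w") as out:
        out.write("\n".join(names) + "\n")
    return names
```

Calling save_filenames() would write pm.txt in the current folder. Removing repeats while keeping the page order makes the list easy to compare against the spreadsheet column.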

2. Download these files using wget and a Python script.

cat dl-wget.py

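import subprocess

# Read the list of HTML filenames, one per line, skipping blank lines.
with open("pm.txt") as listing:
    books = [line.strip() for line in listing if line.strip()]

for filename in books:
    print("Downloading " + filename)
    bookurl = "http://www.projectmadurai.org/pm_etexts/utf8/" + filename
    # -p fetches the images the page needs, -k rewrites links for local viewing
    # (-K keeps a .orig backup), -E ensures an .html extension, and -H with
    # --domains limits cross-host fetches to www.projectmadurai.org.
    command = ["wget", "-E", "-H", "-k", "-K", "-p", "--max-redirect", "0",
               "--domains", "www.projectmadurai.org", "-e", "robots=off", bookurl]
    subprocess.call(command)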

Run the script:

python dl-wget.py

This downloads all the HTML files, along with their images, to the current folder.

In total, 593 HTML files were downloaded.
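
With this many files, it can help to check that nothing was skipped. The sketch below compares the names in pm.txt with the .html files found on disk. It searches subfolders as well, on the assumption that wget may place pages in a folder per host rather than directly in the current folder. The function names are made up for this illustration.

```python
import os


def read_names(list_path):
    """Read one file name per line from list_path, ignoring blank lines."""
    with open(list_path) as listing:
        return [line.strip() for line in listing if line.strip()]


def downloaded_names(folder):
    """Collect the base names of all .html files under folder, including subfolders."""
    found = set()
    for _root, _dirs, files in os.walk(folder):
        for name in files:
            # The .orig backups that wget -K leaves behind do not end in .html.
            if name.endswith(".html"):
                found.add(name)
    return found


def missing_downloads(expected, folder="."):
    """Return the expected names that have no matching .html file under folder."""
    found = downloaded_names(folder)
    return [name for name in expected if name not in found]
```

For example, missing_downloads(read_names("pm.txt")) returns the names that still need to be fetched, so the download script can be run again on just those.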

3. Convert to PDF

The utility wkhtmltopdf converts a given HTML file to a PDF file.

The following command produces a 6 inch PDF by using the A6 page size, a large minimum font size and narrow margins.

wkhtmltopdf -s A6 --minimum-font-size 40 -B 5 -L 5 -R 5 -T 5 source.html destination.pdf

Now, let us convert all the downloaded HTML files to 6 inch PDF files using a small shell loop.

for i in *.html; do orig=$(basename "$i" .html); echo "Converting $orig"; wkhtmltopdf -s A6 --minimum-font-size 40 -B 5 -L 5 -R 5 -T 5 "$i" "$orig-6-inch.pdf"; done

Running this command converts all 593 HTML files into 6 inch PDF files.

Now, we can read these 6 inch PDF files on a Kindle, an Android phone or a tablet.
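
The same batch conversion can be sketched in Python, which makes it easier to skip files that already have a PDF and to note which conversions failed. This is only an illustration under those assumptions. The wkhtmltopdf options are the ones used above, while the function names pdf_command and convert_all are made up here.

```python
import subprocess
from pathlib import Path


def pdf_command(html_path):
    """Build the wkhtmltopdf argument list for one HTML file."""
    html_path = Path(html_path)
    # pmuni0001.html becomes pmuni0001-6-inch.pdf in the same folder.
    pdf_path = html_path.with_name(html_path.stem + "-6-inch.pdf")
    return [
        "wkhtmltopdf",
        "-s", "A6",                                  # page size
        "--minimum-font-size", "40",
        "-B", "5", "-L", "5", "-R", "5", "-T", "5",  # page margins
        str(html_path), str(pdf_path),
    ]


def convert_all(folder="."):
    """Convert every .html file in folder and return the names that failed."""
    failed = []
    for html_path in sorted(Path(folder).glob("*.html")):
        command = pdf_command(html_path)
        if Path(command[-1]).exists():
            continue  # already converted in an earlier run
        print("Converting", html_path.stem)
        # A non-zero exit status is treated as a failure here.
        if subprocess.call(command) != 0:
            failed.append(html_path.name)
    return failed
```

Calling convert_all() in the download folder can be repeated safely after an interruption, since existing PDF files are left alone.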

4. Upload the 6 inch PDF files.

I have uploaded all the 6 inch PDF files here.
http://bit.ly/project-madurai-kindle-books

To find the actual title of a book, match its file name against the list here.
http://www.projectmadurai.org/pmworks.html
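
The lookup can also be scripted. The sketch below assumes two things that are not part of the steps above: that the spreadsheet from step 1 has been exported as a two-column CSV file (title, then HTML file name), and that the PDF files are named like pmuni0001-6-inch.pdf, as produced by the conversion loop. The file layout and the function names are hypothetical.

```python
import csv


def load_titles(csv_path):
    """Map each HTML file name to its title from a two-column CSV (title, file name)."""
    titles = {}
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            if len(row) >= 2:
                titles[row[1].strip()] = row[0].strip()
    return titles


def title_for_pdf(pdf_name, titles):
    """Return the title for a name like pmuni0001-6-inch.pdf, or None if unknown."""
    suffix = "-6-inch.pdf"
    if not pdf_name.endswith(suffix):
        return None
    # Recover the original HTML name that the PDF was generated from.
    html_name = pdf_name[:-len(suffix)] + ".html"
    return titles.get(html_name)
```
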

Download your favorite book and start reading.
