We are planning to convert all the ebooks at http://ProjectMadurai.org to epub files, so that anyone can read them on the latest devices, such as Android and iOS devices.
We will be using http://pressbooks.com to convert them to epub.
There are around 500 ebooks.
We are looking for volunteers to convert them all to epub and mobi.
In the meantime, they all need cover images.
At my request, Sathia tried a Rails app.
But it ran into a Unicode issue.
I tried using Python to add Tamil text over an image.
The Tamil text is displayed split apart.
Sample Image: http://postimg.org/image/8z7ed377r/
I got the same issue that Sathia got.
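One likely cause, stated here as an assumption and not confirmed for either the Rails app or the Python script: a Tamil syllable is stored as several Unicode code points (a base letter followed by a vowel sign or a pulli), and a drawing library that does not do complex text shaping places each code point as a separate glyph. A small sketch that lists the code points of a word:

```python
import unicodedata

def describe_codepoints(text):
    """Return (code point, Unicode name, is_combining_mark) for each character."""
    rows = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        rows.append((f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"), is_mark))
    return rows

# The word "Tamil" in Tamil script, written as escapes so that the
# individual code points are visible in the source.
word = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"

for code, name, is_mark in describe_codepoints(word):
    print(code, name, "(combining mark)" if is_mark else "")
```

The word looks like three units on screen but is stored as five code points, two of which are combining marks. A renderer has to join these to the preceding letter, which is the job of a text shaping engine such as the one inside a browser.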
So, how do we solve this?
This seems to be a serious Unicode issue, and I could not find a programmatic solution.
This is where the powerful utilities of GNU/Linux come in.
When programming cannot help much, the shell can.
There is a utility called "wkhtmltopdf" which can convert HTML to PDF.
Shall we create HTML pages with the book name and author name details?
Then we can convert them to PDF.
Then, PDF to image.
Will it work?
I installed wkhtmltopdf and tried it.
On my latest Ubuntu, wkhtmltopdf is installed, but it is too old.
I got some issues like "Not compiled with QT support".
We have to get the latest binaries from its website.
Download wkhtmltopdf from http://wkhtmltopdf.org/downloads.html
The website provides the latest binary with QT binding.
The one that comes with Ubuntu by default is very old.
Download it from the site and install it.
It provides two utilities: 1. wkhtmltopdf 2. wkhtmltoimage
We use wkhtmltoimage to convert the HTML files into images.
wkhtmltoimage --height 640 --width 429 001.html 001.jpg
Using this method, we can convert all the html files to images directly.
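For anyone who prefers to drive the conversion from Python, here is a minimal sketch of the same idea. It assumes wkhtmltoimage is on the PATH; the function names build_command and convert_all are made up for this illustration, and the output name is simply the input name with a .jpg extension.

```python
import subprocess
from pathlib import Path

def build_command(html_file, width=429, height=640):
    """Build the wkhtmltoimage command line for one HTML file."""
    html_file = Path(html_file)
    output = html_file.with_suffix(".jpg")
    return ["wkhtmltoimage", "--height", str(height), "--width", str(width),
            str(html_file), str(output)]

def convert_all(folder="."):
    """Run wkhtmltoimage on every .html file in folder (needs the binary on PATH)."""
    for html_file in sorted(Path(folder).glob("*.html")):
        subprocess.run(build_command(html_file), check=True)

# Show the command that would be run for a single file.
print(" ".join(build_command("001.html")))
```

Calling convert_all() in the folder that holds the HTML files would run the command once per file and stop at the first failure, because of check=True.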
Here is the detailed workflow:
1. Get all the book names and author names from
2. Copy them into a LibreOffice Calc spreadsheet.
3. Add a serial number in a new first column.
4. Export as a CSV file. Use $ as the delimiter.
5. Run the file parser.py
It will extract the book name and author, and generate HTML files with a background image for the cover art.
6. Download wkhtmltopdf from http://wkhtmltopdf.org/downloads.html and install it.
7. Now, using a small shell script, we can convert all the HTML files into images.
for i in *.html; do wkhtmltoimage --height 640 --width 429 "$i" "$(basename "$i" .html).jpg"; done
We get all the images.
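Step 5 of the workflow could look roughly like the sketch below. This is not the parser.py from the repo, only an illustration of the idea. It assumes the CSV columns are serial number, book name and author, in that order, separated by $; the file name background.jpg, the page styling and the 001.html naming scheme are also assumptions made for this example.

```python
import csv
import html
from pathlib import Path

# Hypothetical page template; braces in the CSS are doubled for str.format.
PAGE = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<style>
  body {{ margin: 0; width: 429px; height: 640px;
         background-image: url('{background}'); text-align: center; }}
  h1 {{ padding-top: 200px; font-size: 32px; }}
  h2 {{ font-size: 22px; }}
</style>
</head>
<body>
<h1>{title}</h1>
<h2>{author}</h2>
</body>
</html>
"""

def generate_pages(csv_path, out_dir, background="background.jpg"):
    """Write one HTML cover page per CSV row; return the list of written paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle, delimiter="$"):
            if len(row) < 3 or not row[0].strip().isdigit():
                continue  # skip blank, incomplete or header rows
            serial, title, author = (field.strip() for field in row[:3])
            page = PAGE.format(background=background,
                               title=html.escape(title),
                               author=html.escape(author))
            target = out_dir / f"{int(serial):03d}.html"
            target.write_text(page, encoding="utf-8")
            written.append(target)
    return written
```

The text is written as UTF-8 and the page declares the same charset, so the shaping of the Tamil text is left to the HTML renderer.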
Background Image source: http://pixabay.com/en/grey-background-texture-template-370125/
License: Public Domain (CC0)
All the generated cover images are uploaded here:
The source code repo: https://github.com/tshrinivasan/project-madurai-cover-images
Now, we have all the cover images.
We need volunteers to work on creating ebooks for all the books on http://ProjectMadurai.org
Contact me if you are interested in volunteering for this.
Today, FreeTamilEbooks.com enters its 2nd year with its 100th ebook.
Thanks to all the authors writing under Creative Commons licenses, to all the volunteers helping to create ebooks, and to the readers for the great welcome.
Read more in Tamil here:
We are planning a Python hackathon at DGV arts & science college, Chennai.
The students have already learnt Python.
To get them to contribute to Free Software, I had a discussion with the HOD about conducting a one-day hackathon.
There are 40 students, and we can offer 50 programs so that each student can pick one and complete it within the 6-hour span.
I have the following list.
1. Scrape flipkart.com and get the price of a given product.
2. Resize some very large photos and add some text to all the images.
3. Take two dates and calculate the number of days between them.
4. Upload images to Flickr using the Flickr API.
5. Automate blog posting using the WordPress API.
6. Analyse an Apache log file and get statistics from it.
7. Create a solver for crossword puzzles using the given number of words and a few letters.
8. Download the picture of the day from http://commons.wikimedia.org and set it as the wallpaper or show it in a widget.
9. Test a website for its availability. Send mail to some people if the site is down.
10. Back up all files in /var/www/html and the databases, and store them in a remote place.
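To give a feel for the size of these tasks, here is a minimal sketch of the Apache log analysis idea. It assumes the log is in the common or combined log format; the sample lines are made up for the example, and the name summarize is only illustrative.

```python
import re
from collections import Counter

# Matches the start of an Apache common/combined log format line.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

def summarize(lines):
    """Count requests per status code and per path; skip lines that do not parse."""
    statuses, paths = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue
        statuses[match["status"]] += 1
        paths[match["path"]] += 1
    return statuses, paths

# Made-up sample lines, only for demonstration.
sample = [
    '127.0.0.1 - - [10/Oct/2014:13:55:36 +0530] "GET /index.html HTTP/1.1" 200 2326',
    '127.0.0.1 - - [10/Oct/2014:13:55:40 +0530] "GET /missing.html HTTP/1.1" 404 209',
    '127.0.0.1 - - [10/Oct/2014:13:56:01 +0530] "GET /index.html HTTP/1.1" 200 2326',
]
statuses, paths = summarize(sample)
print(statuses.most_common())
print(paths.most_common(1))
```

A student could extend this to read a real log file line by line and report the busiest hours or the most frequent client addresses.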
Reply here with more ideas.
Tamil Software Development Program
A lot of software is needed in the Tamil linguistic world.
We are planning a program like Google Summer of Code to encourage developers to build software for Tamil.
Program Details :
1. List the required software:
List the software requirements related to Tamil, so that a developer can pick one and work on it.
The software can be a desktop application, mobile application, online application or a game.
Send the requirements to tshrinivasan AT gmail.com or comment below.
The event duration is 4 months. Don't propose big software like OCR, TTS, Speech to Text etc.
Propose small games, small components required to build big software etc.
2. Forming a Mentor team:
Not all developers are good at Tamil linguistics and grammar. We have to form a mentor team of Tamil scholars. Developers can contact them with any queries related to Tamil.
Contact me if you are interested in being a mentor.
3. Call for Programmers:
Programmers, from around the globe, can participate in this program.
They can be students, working professionals, homemakers, kids etc.
They can pick their desired software from the list.
They can work as a team or as individuals.
All software developed should be Free/Open Source Software from day one.
All code should be published on http://github.com
Developers should write a dev log on their blogs at least once a week.
At the end of 4 months, we can review the developed software and package them for release.
We can call for donations for the projects, and divide the amount among the developers.
We need a team to handle finance. They have to approach donors, collect money and share it with the developers.
Reply here with your thoughts.
Contact me at tshrinivasan AT gmail.com if you want to join the core team for this project.
We are trying to build a spell checker for Tamil using Hunspell.
For that we need a huge list of root words in Tamil.
Root words may include people's names, city names, verbs, etc.
Collecting them manually is a huge task.
We are thinking of a browser plugin, where a reader of any Tamil blog, Wikipedia page, or social media site can select a word, tag it with the relevant part of speech, and send it to a central server.
This way, any Tamil reader can help build the huge list of root words.
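On the server side, the collected submissions could be merged into a word list along these lines. This is only a sketch under assumptions: each submission is taken to be a (word, tag) pair, the tag chosen by most readers wins, and the function names are invented for the example. The output follows the basic Hunspell .dic layout, a word count on the first line and then one word per line; affix flags are left out.

```python
from collections import Counter, defaultdict

def build_wordlist(submissions, min_votes=1):
    """Keep, for each word, the tag that most readers chose.

    submissions: iterable of (word, tag) pairs sent by the plugin.
    """
    votes = defaultdict(Counter)
    for word, tag in submissions:
        word = word.strip()
        if word:
            votes[word][tag.strip().lower()] += 1
    accepted = {}
    for word, tags in votes.items():
        tag, count = tags.most_common(1)[0]
        if count >= min_votes:
            accepted[word] = tag
    return accepted

def to_dic(words):
    """Format words as a Hunspell .dic file: a word count, then one word per line."""
    ordered = sorted(words)
    return "\n".join([str(len(ordered)), *ordered]) + "\n"

# Example submissions: the same city name tagged by three readers.
submissions = [("சென்னை", "city"), ("சென்னை", "City"), ("சென்னை", "verb")]
print(to_dic(build_wordlist(submissions)))
```

Raising min_votes would require several readers to agree on a tag before a word is accepted, which may help against mistaken or careless submissions.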
We are looking for volunteers to build the central web application and a Firefox or Chrome plugin to collect the data.
Reply here if you are interested in contributing.
I am trying to build a TTS (Text To Speech) system for Tamil.
I can split the words now.
There is software written in PHP that can convert Tamil text to IPA.
Using that code, we can convert text to IPA.
I hope we can get the sound files for IPA, or we can record them.
Once we get all the sound files, we can do
Tamil->IPA->Play sound files.
Thus, we can build a TTS system.
I am planning to do it in Python.
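The Tamil->IPA->sound files chain could be sketched in Python as below. Everything here is an assumption for illustration: the Tamil to IPA step is passed in as a function (to_ipa), since that part exists only in PHP so far, there is one recorded .wav file per IPA symbol, and the files are named by code point inside a folder called sounds. Playback itself is left out.

```python
from pathlib import Path

def sound_files_for(text, to_ipa, sound_dir="sounds"):
    """Return the sound file paths to play, in order, for the given text.

    to_ipa: a function mapping text to a sequence of IPA symbols.
    """
    sound_dir = Path(sound_dir)
    files = []
    for symbol in to_ipa(text):
        if symbol.isspace():
            continue  # a real system would insert a pause here
        # Name files by code point so symbols outside ASCII are safe on disk.
        name = "_".join(f"{ord(ch):04x}" for ch in symbol) + ".wav"
        files.append(sound_dir / name)
    return files

# Stand-in converter that returns fixed symbols; not a real Tamil to IPA mapping.
fake_to_ipa = lambda text: ["a", " ", "m"]
print([f.name for f in sound_files_for("any text", fake_to_ipa)])
```

Simply playing one file per symbol will sound choppy, so joining or smoothing the recordings would probably be the next thing to work on.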
Reply here if anyone is interested in joining and contributing to this project.
Thanks to Vinodh Rajan – vinodh AT virtualvinodh.com
for sharing the Tamil->IPA conversion code.