Here are a few project ideas and their requirements. If you want to contribute to an open source project, consider starting with one of these.
I am introducing each of them in this series of blog posts.
If you pick any of these projects, add a comment here so that others can join you.
1. Clean up Epub files.
We create the epub files for FreeTamilEbooks.com using Calibre. It produces epub files with a lot of extra span and other tags. We need to remove all the unwanted tags from those epub files.
Create a command line or web application to clean up the given epub files.
If you are writing in Python, plan to create a Calibre plugin to clean the epub files.
https://archive.org/download/pazhaiya-kuppaigal/pazhaiya-kuppaigal-jothiji.epub
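As a starting point, here is a minimal Python sketch of the clean-up step. It assumes that "unwanted" means `<span>` tags without any attributes, which is only one possible rule, and that the HTML entries are UTF-8 encoded. The names `unwrap_bare_spans` and `clean_epub` are made up for this illustration. An epub is a zip archive, so the sketch copies every entry and rewrites only the HTML/XHTML ones.

```python
import re
import zipfile

# Matches an opening <span ...> (attributes captured) or a closing </span>.
SPAN_TAG = re.compile(r"<span\b([^>]*)>|</span\s*>", re.IGNORECASE)


def unwrap_bare_spans(xhtml):
    """Drop <span> tags that carry no attributes, keeping their inner text."""
    out = []
    kept = []  # one flag per open span: True if its opening tag was kept
    pos = 0
    for match in SPAN_TAG.finditer(xhtml):
        out.append(xhtml[pos:match.start()])
        pos = match.end()
        tag = match.group(0)
        if tag.startswith("</"):
            # Emit the closing tag only if its opening tag was kept.
            if not kept or kept.pop():
                out.append(tag)
        else:
            attrs = match.group(1).strip()
            if attrs.endswith("/"):
                out.append(tag)  # self-closing span: leave untouched
                continue
            kept.append(bool(attrs))
            if attrs:
                out.append(tag)
    out.append(xhtml[pos:])
    return "".join(out)


def clean_epub(src_path, dst_path):
    """Copy an epub, cleaning every HTML/XHTML entry on the way."""
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        for info in src.infolist():
            data = src.read(info)
            if info.filename.lower().endswith((".xhtml", ".html", ".htm")):
                # Assumption: the entries are UTF-8 encoded.
                data = unwrap_bare_spans(data.decode("utf-8")).encode("utf-8")
            # Passing the ZipInfo keeps entry order and compression type.
            dst.writestr(info, data)
```

Spans that carry a class or style are kept on purpose, because removing them could change how the book looks. A real tool would need more rules than this one.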
2. Download reports for Tamil Wikisource Ebooks
http://ta.wikisource provides ebook downloads.
The ebook downloads of all language Wikisources are logged in this database:
http://tools.wmflabs.org/wsexport/logs.sqlite
Create a web application or command line application to get the details of the Tamil books and create a download count report for each book.
The report should be similar to http://freetamilebooks.com/htmlbooks/download-report.html
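The counting part could look roughly like the Python sketch below. The table name `creation` and the columns `lang` and `title` are assumptions about the schema of logs.sqlite, not something confirmed here. Check the real schema first (for example with `.schema` in the sqlite3 shell) and adjust the query.

```python
import sqlite3


def download_counts(conn, lang="ta"):
    """Return (title, downloads) rows for one language, most downloaded first."""
    query = (
        "SELECT title, COUNT(*) AS downloads "
        "FROM creation "   # assumed table name
        "WHERE lang = ? "  # assumed column holding the wiki language code
        "GROUP BY title "
        "ORDER BY downloads DESC, title"
    )
    return conn.execute(query, (lang,)).fetchall()
```

To use it, open the downloaded file with `sqlite3.connect("logs.sqlite")`, pass the connection to `download_counts`, and write the rows into an HTML table for the report.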
3. Improve FreeTamilEbooks Android app
The Android app for FreeTamilEbooks has some bugs.
https://github.com/jskcse4/FreeTamilEBooks/issues
Use the app, read the issues, and fix them.
4. OCR4WikiSource – Create a web application
OCR4WikiSource is a command line application that connects Google OCR and Wikisource.
It sends PDF files to Google Drive, runs OCR on them, gets the text, and sends it to Wikisource.
Create a web application to upload any PDF file, send it to Google via the Google Vision API, get the text, and send it to Wikisource.
Links:
Here is the requirement.
https://github.com/tshrinivasan/OCR4wikisource/issues/89
A few links about it.
https://goinggnu.wordpress.com/2015/12/28/announcing-ocr4wikisource/
https://goinggnu.wordpress.com/2015/09/30/automating-google-ocr-with-python/
https://meta.wikimedia.org/wiki/WikiConference_India_2016/Submissions/Introduction_to_OCR4WikiSource
Discussion with Wikipedia developers on this.
https://phabricator.wikimedia.org/T120788
Google Vision API
https://cloud.google.com/vision
Explore the links
https://github.com/GoogleCloudPlatform/cloud-vision
http://terrenceryan.com/blog/index.php/working-with-cloud-vision-api-from-php/
https://github.com/thangman22/google-cloud-vision-php
http://blog.aimanbaharum.com/2016/04/21/ocr-with-google-cloud-vision-api/
5. Flipboard-like application for Tamil
Flipboard is a web and mobile app that shows the latest content on topics the user selects. Create a similar application that provides Tamil content from the web on various topics. Content contributors should submit links to good articles with relevant categories and tags. Users should be able to subscribe to categories and read the latest content.
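The core of such an application is matching articles to the categories a user has subscribed to. Here is a small Python sketch of that idea. The `Article` fields and the function name `latest_for_user` are invented for illustration, and a real application would keep this data in a database.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Article:
    url: str
    title: str
    categories: set
    added_at: datetime


def latest_for_user(articles, subscribed, limit=20):
    """Newest articles that share at least one category with the user."""
    wanted = set(subscribed)
    matching = [a for a in articles if a.categories & wanted]
    matching.sort(key=lambda a: a.added_at, reverse=True)
    return matching[:limit]
```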
6. Firefox plugin for Tamil Wikisource proofreading
https://addons.mozilla.org/en-US/firefox/addon/quickwikieditor/
We need to extend this plugin to send the error words and their corrections to a remote web application. From there, we can get the list of error words, search for them across all of ta.wikisource.org, and replace them with the corrected words automatically using bots.
Extend the plugin and create a web application to collect the words from the plugin.
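Once the word pairs are collected, the replacement step a bot performs on a page's text could be sketched in Python as below. This shows only the text substitution and does not talk to Wikisource. The function name `apply_corrections` and the dictionary format (error word mapped to corrected word) are assumptions for this example.

```python
import re


def apply_corrections(text, corrections):
    """Replace whole whitespace-separated words found in `corrections`."""
    # Splitting with a capture group keeps the whitespace, so layout survives.
    parts = re.split(r"(\s+)", text)
    return "".join(corrections.get(part, part) for part in parts)
```

Splitting on whitespace avoids relying on regex word boundaries, which may not behave well with Tamil combining vowel signs. The trade-off is that a word with punctuation attached to it is not matched, so a real bot would need to handle that case.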
7. Fix the Tamil TTS by IITM
https://www.iitm.ac.in/donlab/tts/
It is a very early version, not as good as the latest web version available at http://speech.ssn.edu.in/
Still, we can learn from the early version and extend it.
Explore the Android app, extract the C code from it, and create a command line app or web app with that C code as the backend.
8. Web application to update the FreeTamilEbooks book details XML file
We store all the details about the books in an XML file.
Here is the file – https://github.com/kishorek/Free-Tamil-Ebooks/blob/master/booksdb.xml
This file is the source for the Android and iOS apps of FreeTamilEbooks.
Once an ebook is released, we have to update the XML file manually, which is hard for non-technical contributors.
We need a web application that collects the ebook details in a form, adds them to the XML file, and commits the change to the repo automatically.
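The "add those details to the XML file" step might be sketched in Python as follows. The element names used here (`book` and the keys of `details`) are hypothetical. Look at the real structure of booksdb.xml and use its element names instead.

```python
import xml.etree.ElementTree as ET


def add_book(xml_text, details):
    """Append one <book> element built from `details` and return the new XML."""
    root = ET.fromstring(xml_text)
    book = ET.SubElement(root, "book")  # hypothetical element name
    for key, value in details.items():
        # Each form field becomes a child element, e.g. <title>...</title>.
        ET.SubElement(book, key).text = value
    return ET.tostring(root, encoding="unicode")
```

The web form handler would call `add_book` with the submitted fields and write the result back to the file. Committing to the repo is a separate step that is not shown here.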
9. Add ebooks automatically to GoodReads.com
We can add the details of the ebooks on FreeTamilEbooks.com to GoodReads.com
See how to do it here: http://www.wikihow.com/Add-a-New-Book-to-the-Goodreads-Database
10. Build a SaaS version of Planet-like RSS aggregation software.
It would be good to build a SaaS version of Planet or similar software, so that users can simply sign in, add RSS feeds, and start using it.
Read more at https://goinggnu.wordpress.com/2017/02/15/thinking-on-a-hosted-planet-solution-share-your-thoughts/
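The aggregation itself, merging the items of several feeds into one list ordered by date, can be sketched in Python like this. It assumes RSS 2.0 feeds whose `pubDate` values include a timezone, and it takes the feed XML as strings, so fetching the feeds and handling user accounts are left out. The function name `merge_feeds` is invented for this example.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def merge_feeds(feed_xml_list, limit=50):
    """Merge the items of several RSS 2.0 documents, newest first."""
    entries = []
    for feed_xml in feed_xml_list:
        root = ET.fromstring(feed_xml)
        for item in root.iter("item"):
            pub_date = item.findtext("pubDate")
            if not pub_date:
                continue  # skip items without a date; they cannot be ordered
            entries.append({
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
                "published": parsedate_to_datetime(pub_date),
            })
    entries.sort(key=lambda e: e["published"], reverse=True)
    return entries[:limit]
```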
There are more ideas, written down somewhere in my notebooks. I will collect them and share them soon.
All the projects should be released as Free/Open Source software only.
If you are interested in working on any of the projects above, comment here.