January 8, 2013 / tshrinivasan

Open source projects to do in Tamil computing

There are many open source projects waiting to be done in Tamil computing.

Some of them are listed here.

1. Font conversion to Unicode

Many Tamil fonts use TSCII or ASCII-based encodings.

Examples:

TAM, TAB, Vanavil, Bamini, indoword, softview, kabilan, kaniyan, shri TAM, Shri lipi, ilango, mayilai, anu, senthamizh etc. These are used in DTP centers.

Tons of documents have been created in these fonts.

To view them, we need to install the fonts locally.

These documents should be converted to Unicode so that anyone can view them without installing any special fonts.

NHM Converter is an online service that does this.
http://software.nhm.in/services/converter

We need to create a FOSS application for this.

We can write it in Python, PHP, Ruby etc.
It can be a desktop or a web application.

The TACE (Tamil All Character Encoding) format should also be considered.
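
As a starting point, here is a minimal Python sketch of how such a converter could work. It is only an illustration, not a finished tool. The glyph table below is made up for the example; real tables for Bamini, TSCII and the other fonts have to be compiled from the fonts themselves. The sketch also assumes the legacy text stores the vowel signs ெ, ே and ை before the consonant (visual order), as many legacy Tamil encodings do, while Unicode stores them after the consonant.

```python
# Toy glyph table. HYPOTHETICAL: these key-to-letter pairs are invented for
# illustration and do not describe any real font.
TOY_GLYPH_MAP = {
    "f": "\u0b95",  # க (letter KA)
    "k": "\u0bae",  # ம (letter MA)
    "n": "\u0bc6",  # ெ (vowel sign E)
    "N": "\u0bc7",  # ே (vowel sign EE)
    "i": "\u0bc8",  # ை (vowel sign AI)
}

# Vowel signs that are drawn to the left of the consonant.
PREFIX_SIGNS = {"\u0bc6", "\u0bc7", "\u0bc8"}


def to_unicode(legacy_text, glyph_map=TOY_GLYPH_MAP):
    """Convert visual-order legacy text to Unicode logical order."""
    # Step 1: replace every legacy character using the table.
    # Characters missing from the table are kept unchanged.
    mapped = [glyph_map.get(ch, ch) for ch in legacy_text]

    # Step 2: move each left-side vowel sign behind the consonant it belongs to.
    out = []
    i = 0
    while i < len(mapped):
        if mapped[i] in PREFIX_SIGNS and i + 1 < len(mapped):
            out.append(mapped[i + 1])  # consonant first
            out.append(mapped[i])      # then the vowel sign
            i += 2
        else:
            out.append(mapped[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(to_unicode("nf"))  # sign + consonant in, consonant + sign out
```

A real converter would need one table per font, and more reordering rules, for example for two-part vowel signs that surround the consonant.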

2. Spellchecker

We can create a new spellchecker or extend an existing one such as aspell, hunspell or Project SILPA.

Explore these:
www.silpa.in/

https://groups.google.com/forum/?fromgroups#!topic/freetamilcomputing/dEQgHESN9us

http://saranyaselvaraj.wordpress.com/2009/09/17/aspell-and-hunspell/

3. Grammar checker: santhi pizai thirutthi (sandhi error corrector)

4. Dictionary with Tamil meanings, English meanings, antonyms and synonyms

5. Number to string converter

Example: 100 = nooru
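
A small Python sketch of the idea, covering only 0 to 100. It uses lookup tables because Tamil number names change form when they combine (for example, the word for 20 takes a different ending before a unit). The function name is my own, and writing compound numbers as two words is an assumption; the spellings should be reviewed before real use.

```python
# Number names for 0-9.
UNITS = ["பூஜ்ஜியம்", "ஒன்று", "இரண்டு", "மூன்று", "நான்கு",
         "ஐந்து", "ஆறு", "ஏழு", "எட்டு", "ஒன்பது"]

# Number names for 10-19; these are irregular, so they are listed in full.
TEENS = ["பத்து", "பதினொன்று", "பன்னிரண்டு", "பதின்மூன்று", "பதினான்கு",
         "பதினைந்து", "பதினாறு", "பதினேழு", "பதினெட்டு", "பத்தொன்பது"]

# Round tens (20, 30, ... 90) when nothing follows them.
TENS = {2: "இருபது", 3: "முப்பது", 4: "நாற்பது", 5: "ஐம்பது",
        6: "அறுபது", 7: "எழுபது", 8: "எண்பது", 9: "தொண்ணூறு"}

# Combining form of the tens, used when a unit follows (21, 22, ...).
TENS_STEM = {2: "இருபத்து", 3: "முப்பத்து", 4: "நாற்பத்து", 5: "ஐம்பத்து",
             6: "அறுபத்து", 7: "எழுபத்து", 8: "எண்பத்து", 9: "தொண்ணூற்று"}

HUNDRED = "நூறு"  # "nooru"


def number_to_tamil(n):
    """Return the Tamil name of an integer from 0 to 100."""
    if not 0 <= n <= 100:
        raise ValueError("this sketch only covers 0 to 100")
    if n == 100:
        return HUNDRED
    if n < 10:
        return UNITS[n]
    if n < 20:
        return TEENS[n - 10]
    tens, unit = divmod(n, 10)
    if unit == 0:
        return TENS[tens]
    return TENS_STEM[tens] + " " + UNITS[unit]


if __name__ == "__main__":
    for n in (7, 15, 20, 21, 100):
        print(n, number_to_tamil(n))
```

Hundreds, thousands, lakhs and crores need more combining forms of the same kind, so the tables would grow, but the lookup approach stays the same.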

6. OCR for Tamil

The following projects are at an early stage.

http://gtamilocr.sourceforge.net

https://launchpad.net/tamilocr

Test and extend them.

7. Tamil corpus

A web application should be developed that shows a word along with all the possible grammar tags.
Logged-in users can select the relevant grammar tag for that word.

Thus, when many people contribute, a complete tagged Tamil corpus will be built.
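
The core of such an application could be as simple as the following Python sketch. It only shows how votes from many users might be stored and combined. The class name, the tag names and the rule "accept a tag once at least two users agree" are all assumptions made for this example; a real application would need a database, user login and a proper tag set.

```python
from collections import Counter


class TagVotes:
    """Collects grammar-tag votes from users and finds the agreed tag."""

    def __init__(self):
        # word -> {user -> tag}. A newer vote from the same user
        # replaces that user's older vote.
        self._votes = {}

    def vote(self, user, word, tag):
        self._votes.setdefault(word, {})[user] = tag

    def consensus(self, word, min_votes=2):
        """Return the most voted tag, or None if too few users agree."""
        counts = Counter(self._votes.get(word, {}).values())
        if not counts:
            return None
        tag, count = counts.most_common(1)[0]
        return tag if count >= min_votes else None


if __name__ == "__main__":
    votes = TagVotes()
    votes.vote("user1", "மரம்", "NOUN")  # tag names are hypothetical
    votes.vote("user2", "மரம்", "NOUN")
    votes.vote("user3", "மரம்", "VERB")
    print(votes.consensus("மரம்"))
```

Keeping every vote, instead of only the winning tag, lets us review disputed words later.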

8. Rule-based autocomplete for Tamil

9. Automatic machine translation

10. GUI and web-based tools for beginners to learn Tamil

11. Text to speech for Tamil

http://dhvani.sourceforge.net/

Test and enhance it.

12. Project for Wiktionary

Wiktionary is the wiki-based dictionary for all languages.

Example:
http://ta.wiktionary.org

We can add voice files to Wiktionary.

We need to create a web application, plus desktop and mobile clients, that displays each word and asks the user to record its pronunciation.

Once recorded, the OGG sound file should be uploaded to commons.wikimedia.org and then linked back to the same word's page in Wiktionary.

Thus, any user can record words, and the audio is uploaded automatically.
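
For the upload step, the client could prepare a request for the MediaWiki API's upload action along these lines. This Python sketch only builds the form fields and does not contact any server. The function name and the file naming pattern are hypothetical, and the CSRF token is assumed to have been fetched already after logging in.

```python
COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def build_upload_request(word, csrf_token, lang_code="ta"):
    """Build the form fields for a MediaWiki `action=upload` request.

    The recorded audio itself is sent separately as the multipart
    field named "file" when the request is posted to COMMONS_API.
    """
    # HYPOTHETICAL naming pattern; check the naming conventions on
    # Commons before using it.
    filename = f"{lang_code}-{word}.ogg"
    return {
        "action": "upload",
        "format": "json",
        "filename": filename,
        "comment": f"Pronunciation of {word}",
        "token": csrf_token,
    }


if __name__ == "__main__":
    fields = build_upload_request("அம்மா", csrf_token="PLACEHOLDER")
    print(fields["filename"])
```

After a successful upload, the client would edit the word's Wiktionary page to reference the same file name.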

13. On-screen keyboard for Tamil Wikipedia

Tamil Wikipedia needs a JavaScript-based on-screen keyboard so that users can type easily by clicking.

Some of these projects are discussed in the following collection of research papers.

http://ti2012.infitt.org/sites/default/files/Conference-book.part1.rar
http://ti2012.infitt.org/sites/default/files/Conference-book.part2.rar

Engineering college students who need a base paper for their projects can use these papers and build applications on top of them.

Add a comment if you know of any more projects for Tamil computing.

7 Comments

  1. Baiju Muthukadan / Jan 9 2013 6:33 am

    The correct SILPA URL is http://silpa.org.in/

  2. Joe Lewis / Feb 9 2013 11:01 pm

    The idea of building apps on top of the research papers is extremely appealing.
    I’m in my final year and am being forced by the college to take on some IEEE papers for my project, but my interests lie in developing useful applications. This is exactly what I was looking for. I’ll try to convince my college faculty to give me a second chance at choosing the topic.

    • tshrinivasan / Feb 9 2013 11:10 pm

      That's nice.

      They will agree to these papers.

      Please contribute to these projects.

      Tamil computing needs loads of open source projects.

      Shrini

    • tshrinivasan / Feb 9 2013 11:27 pm

      Do they need only IEEE papers?

      There are tons of open access research papers.

      Search here: http://oajse.com/

      Read these too: http://en.wikipedia.org/wiki/Open-access_journal

      http://www.guardian.co.uk/science/occams-corner/2012/oct/22/inexorable-rise-open-access-scientific-publishing

      http://www.doaj.org http://oad.simmons.edu/oadwiki/Main_Page

      Search for “text to speech” at http://oajse.com/ and you get tons of open papers.

      Wishes for your research.

  3. Siva / Feb 9 2013 11:41 pm

    Thanks.
    I am doing a project on recognition and translation from digital images. These resources will help.

    • tshrinivasan / Feb 9 2013 11:43 pm

      Thanks for the comment.

      Happy to know that this helps you.
