Building Open Source Tamil Spellchecker – Day 8 – Porting from C# to Python

Recently, Tamil Virtual Academy released 10 Tamil NLP tools as Free/Open Source Software, with source code. One of them is a spellchecker. Read more about the release here.

https://goinggnu.wordpress.com/2020/08/16/tamilvu-released-10-tamil-software-as-free-open-source-software/

The spellchecker is written in C#. I want to port it to Python so that we can extend it easily.

C# is a very new language for me. I asked on the ChennaiPy mailing list and on social media. Many friends encouraged me and came forward to help.

We first thought of porting it to Mono and then to Python, but that is a long path. As the C# code is mostly parsing a big JSON file along with many if, else and for constructs, I hope we can port it by reading it line by line and rewriting it in Python.
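
To show what such a line-by-line rewrite can look like, here is a minimal sketch. It is not taken from the actual spellchecker code. The JSON layout, the keys `suffix` and `replacement`, and the functions `load_rules` and `apply_rules` are all hypothetical, and the C# lines in the comments only show the general shape such code often has.

```python
import json

# Hypothetical rule data. The real spellchecker's JSON file has its own layout.
RULES_JSON = """
[
  {"suffix": "ing", "replacement": ""},
  {"suffix": "ed", "replacement": ""}
]
"""


def load_rules(text):
    # C# (typical shape): var rules = JsonConvert.DeserializeObject<List<Rule>>(text);
    # In Python, json.loads gives plain lists and dicts directly.
    return json.loads(text)


def apply_rules(word, rules):
    # C# (typical shape): foreach (var rule in rules) { if (word.EndsWith(rule.Suffix)) { ... } }
    for rule in rules:
        suffix = rule["suffix"]
        if word.endswith(suffix):
            # Strip the matched suffix and append the replacement.
            return word[: len(word) - len(suffix)] + rule["replacement"]
    # No rule matched, so return the word unchanged.
    return word


rules = load_rules(RULES_JSON)
print(apply_rules("porting", rules))
```

Each C# statement maps to one or two Python lines, which is why reading and rewriting line by line looked practical to us.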

I had too many doubts about the C# code and its syntax. I wanted to open the code in an IDE and run it in debug mode. As I don't have Windows, I was looking for help from friends.

I have had a few friends for decades. We don't even know when we first got to know each other. I have many such friends, mostly from free software communities. We help each other on many things.

Today, Manik, a friend from the FreeTamilComputing community, pinged me and enquired about the requirement. He spun up a Windows VM, loaded the code, compiled it and explained the basic workflow to me.

We immediately started to read the code line by line and rewrite it in Python line by line. Though he is also very new to C#, he googled for C# syntax and I googled for Python syntax. Yes. We are long-time programmers who still google for basic syntax. 🙂

https://github.com/tshrinivasan/Tamilinaiya-Spellchecker/blob/master/PythonPort/from_Csharp.py

Here is our half-cooked initial version. There are still more functions to be ported. Once ported, it will need many rounds of debugging, auditing, improving, etc. Still, it is a good start. For weeks, I was dreaming about how to do this. A quick call, helping hands and time gave a great boost to the progress.

I got a call from Rajesh, who is a C# programmer too, offering to help on this. Once the initial version is ported, by next week, I hope to get his help for a deep study and analysis.

The big dream of bringing out an open source Tamil spellchecker is happening. Happy to be a small part of this.

Tons of thanks to all friends for their helping hands and good hearts.

Read the previous days' notes on building the Tamil spellchecker.

  1. Study notes on open-tamil spellchecker – day 1
  2. Building Tamil Spellchecker – Day 2 – Bloom Filter to quick query on dataset
  3. Building Tamil Spellchecker – Day 3 – Collecting all Tamil Nouns
  4. Building Tamil Spellchecker – Day 4 – Shall we collect ALL Tamil Words?
  5. Building Tamil Spellchecker – Day 5 – started collecting ALL Tamil Words
  6. Building Open Source Tamil Spellchecker – Day 6 – How fast is bloom filter for 24 lakh words?
  7. Building Open Source Tamil Spellchecker – Day 7 – Scrapping websites to get more words
  8. https://goinggnu.wordpress.com/2020/08/29/building-open-source-tamil-spellchecker-day-8-porting-from-c-to-python/
