Annual Review – What I did in 2017


2017 was a great year for me, like all the years before it. Sharing here how it went. Yes, it is too late to write this now; blame a few lifestyle changes I am following. It is OK to do many things slowly and drop a few things. Am I right?

Health:

I had been trying to reduce my growing tummy for years. In 2017, I took a serious series of efforts.

  • Joined a Zumba dance class in September 2017. It runs thrice a week; I attend at least two classes.
  • Bought a cycle in October 2017 and rode it to office, around 40 km per round trip. Cycled twice a week for two months.
  • Exercised for 10-15 minutes on most days.

With all these efforts, my hip size came down from 39 to 34. That is a great improvement. Thanks to “Dude Academy” and Master Kannadasan for the inspiration.

Got fever four times. Those bad days reminded me how poor my health is. It is time to eat good food and take the Siddha medicines that gave me enormous immunity in my school and college days. I started taking the medicines prepared by my dad.

Stopped having cool drinks and junk food like pizza, burgers, etc.

Family:

  • Viyan got into a nearby school. He is in LKG now.
  • He sent hand-drawn Deepavali wishes postcards to all our friends.
  • Nithya learnt the Karagam and Poikal Kuthirai dances and is performing at various events. She wrote an ebook on JavaScript and released a video on Big Data and Elasticsearch.
  • Brother Arulalan joined the meteorology department in Noida as a scientist.
  • Brother Suresh got a government job.

Office:

Office work kept me learning new things. Elasticsearch, Spark, Druid, Python 3 and the AWS APIs are some great things I explored. For each technology I started working on, I completed reading one book fully.

Travel:

  • Went to Vienna, Austria for a Wikipedia hackathon. Met many interesting people and blogged the happenings of Day 1, Day 2 and Day 3.
  • Went to Yelagiri for Nithya’s Karagam dance at a government function.
  • Went to Kothagiri for the marriage of Nithya’s friend’s sister.
  • Went to Kerala with office friends and their families.
  • Went to Pondicherry and visited the science centre, Bharathiyar’s house, etc.
  • Went to Melmalaiyanur with family.
  • Went to beaches a few times in the early morning. Those were really great times.
  • Went to DakshinaChitra many times for Nithya’s dance practice sessions.
  • Went to Sriperumbudur as Nithya danced for Sri Ramanujar’s 1000th birthday celebrations.
  • Went to Kumbakonam and nearby places with office friends.
  • Went to a few resorts with office friends.

FOSS Contributions:

  • IITM and SSN College of Engineering released a good Tamil text-to-speech system. Met the IITM team and learnt how to install it on Linux. Automated the installation and released it here.
  • At the Vienna Wikipedia hackathon, enabled LinguaLibre to upload voice files to Wikimedia Commons.
  • Helped to release a Telegram bot to collaboratively translate the English strings in OpenStreetMap to Tamil.
  • Helped to release a “Send to Kindle” option for the ebooks on FreeTamilEbooks.com.
  • Got interested in adding interesting places to OSM. Cycled around Tambaram and added many places with the maps.me app.
  • Conducted two hackathons for ILUGC: one for Tamil computing and one for Wikipedia.
  • Conducted one training event on ebook making.

Society:

  • Found a social team, “Agaththi”, in Tambaram. Participating in their market for organic farmers.
  • Filed a few RTIs on YouRTI.in for issues in Tambaram and got a few resolved.

Writing:

Wrote 57 posts on this blog and 6 posts on my Tamil blog. This is low; will try to increase it next year. I am writing a book on Python in Tamil and hope to release it next year. But I read plenty of books this year. I love reading more than writing. Lazy me.

Books Read:

Thanks to PacktPub’s “one free ebook per day” offer, the Kindle Unlimited plan and all the authors of FreeTamilEbooks.com for providing tons of great books to read.

With the Kindle device, I enjoyed plenty of hours of continuous reading.

Award:

Got an award for contributions to Tamil computing in 2016 from the Tamil Literary Garden, Canada, and received Rs. 50,000 as prize money. I am working on creating an organization with this money, to make more contributions with more volunteers. Will ask for donations soon.

Lifestyle:

Lived a simple, minimalistic lifestyle.
Quit TV, Facebook and the smartphone two years ago; still able to live without them. Prioritized life in the order of health, family, office, personal growth and FOSS contributions.

Thus, 2017 was great. It gave me great lessons through bad experiences too; those were opportunities to learn about people and the world.

Hope you too had a great year. Hoping that 2018 will be even more exciting for all of us.

 


Why do I love Apache Kafka?



From Wikipedia,

https://en.wikipedia.org/wiki/Apache_Kafka

<quote>

Apache Kafka is an open-source stream processing platform developed by the Apache Software Foundation written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Its storage layer is essentially a “massively scalable pub/sub message queue architected as a distributed transaction log,” making it highly valuable for enterprise infrastructures to process streaming data. Additionally, Kafka connects to external systems (for data import/export) via Kafka Connect and provides Kafka Streams, a Java stream processing library.

</quote>

I am using Kafka as a message queue in one of my projects that gets a huge amount of real-time data, around 6,000 to 1,00,000 events per minute. I first tried to read those events with a custom Python script, but it could not keep up with that volume and missed a lot of data.

 

I was looking for a stable tool to ingest the data. Found Kafka and explored it. To my surprise, it worked well. Stress tested it with the tool “siege”, producing millions of test events. A single Kafka server received and stored all the data.
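If you want to try something similar, here is a minimal sketch using the kafka-python library (the library choice, the broker address and the topic name “events” are my assumptions for illustration):

    # pip install kafka-python
    import json
    from kafka import KafkaProducer, KafkaConsumer

    # Producer: push incoming events into Kafka instead of processing them inline.
    producer = KafkaProducer(
        bootstrap_servers='localhost:9092',
        value_serializer=lambda v: json.dumps(v).encode('utf-8'),
    )
    producer.send('events', {'user': 'demo', 'action': 'click'})
    producer.flush()

    # Consumer: read the events back at our own pace; Kafka buffers them on disk.
    consumer = KafkaConsumer(
        'events',
        bootstrap_servers='localhost:9092',
        auto_offset_reset='earliest',
        consumer_timeout_ms=5000,  # stop iterating after 5 seconds of silence
        value_deserializer=lambda b: json.loads(b.decode('utf-8')),
    )
    for message in consumer:
        print(message.offset, message.value)

Since Kafka persists everything to disk, a slow consumer no longer loses data; it just lags behind and catches up.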


It stores all the data in its own internal log format (optionally compressed) and keeps it, by default, for a week. Anyone can write to it and anyone can read from it, in a very stable way.
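The retention period can be tuned in the broker’s config/server.properties; the values below are just the defaults written out explicitly:

    # config/server.properties
    log.retention.hours=168        # keep data for one week
    log.segment.bytes=1073741824   # roll log segment files at 1 GB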

Logstash is a perfect pet for reading from Kafka. It can then write to S3, another Kafka or Elasticsearch.
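A minimal Logstash pipeline for that looks roughly like this (the topic, hosts and index name are assumptions, and the exact option names vary across Logstash versions):

    # kafka-to-es.conf
    input {
      kafka {
        bootstrap_servers => "localhost:9092"
        topics => ["events"]
      }
    }
    output {
      elasticsearch {
        hosts => ["localhost:9200"]
        index => "events-%{+YYYY.MM.dd}"
      }
    }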

Installation is very simple: just download, extract and start running it.
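Something like this, assuming a 0.10.x release (pick whatever version is current):

    # download and extract a Kafka release
    wget https://archive.apache.org/dist/kafka/0.10.2.0/kafka_2.11-0.10.2.0.tgz
    tar -xzf kafka_2.11-0.10.2.0.tgz
    cd kafka_2.11-0.10.2.0

    # Kafka needs ZooKeeper; start it first, then the broker
    bin/zookeeper-server-start.sh config/zookeeper.properties &
    bin/kafka-server-start.sh config/server.properties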

https://www.digitalocean.com/community/tutorials/how-to-install-apache-kafka-on-ubuntu-14-04

With the Confluent platform, it can read and write JSON documents easily.
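For example, through the Kafka REST proxy that ships with the Confluent platform, you can produce JSON records over plain HTTP (port 8082 is its default; the topic name is an assumption):

    # pip install requests
    import requests

    # Produce one JSON record to the "events" topic via the Kafka REST proxy.
    resp = requests.post(
        'http://localhost:8082/topics/events',
        headers={'Content-Type': 'application/vnd.kafka.json.v2+json'},
        json={'records': [{'value': {'user': 'demo', 'action': 'click'}}]},
    )
    print(resp.json())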

I strongly suggest using Kafka for any message queue requirement.


Real-Time Big Data Analysis – A Few Tools



 

Big data analysis is becoming one of the hot words in the IT industry. Everyone wants to analyse data. They all want to use tools like Hadoop, Spark, etc., which process huge amounts of data, i.e., terabytes in size. This is called “historical data analysis”.

In contrast, there is “real-time data analysis”: processing the stream of constantly incoming data immediately, as it arrives.

The typical data pipeline for real-time big data analysis is as below.

App/Site → API Server → Message Queue (Kafka) → Processor (Logstash) → Storage (Elasticsearch, Redis, MongoDB) → Visualization (Kibana)
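As a rough sketch of the first hop, the API server only has to enqueue each event into Kafka and move on; Logstash, Elasticsearch and Kibana handle the rest downstream (the kafka-python library, broker address and topic name here are assumptions):

    # pip install kafka-python
    import json
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers='localhost:9092',
        value_serializer=lambda v: json.dumps(v).encode('utf-8'),
    )

    def handle_event(event):
        """Called by the API server for every incoming event."""
        producer.send('site-events', event)  # fire-and-forget; Kafka buffers it

    handle_event({'page': '/home', 'visitor': 'anonymous'})
    producer.flush()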

A few years ago, we had to rely on Google Analytics and pay a huge amount of money to get real-time data on our site visitors, credit card swipes, etc. Nowadays, we can build the entire pipeline with Free/Open Source Software itself.


With the following links, we can set up the data pipeline easily.

https://www.digitalocean.com/community/tutorials/how-to-install-apache-kafka-on-ubuntu-14-04

http://docs.confluent.io/3.2.0/kafka-rest/docs/index.html

https://www.digitalocean.com/community/tutorials/how-to-install-elasticsearch-logstash-and-kibana-elk-stack-on-ubuntu-14-04

Setting up these things is easy. But once the real-time flow starts in production, remember, you are always on fire. You will feel like you are flying an aeroplane with so many buttons on the dashboard. You have to keep it running while solving issues in real time as they appear.

Explore these tools and learn their basics. Learning the basics will give sweet results for sure.

There are tons of new tools coming into this arena. We cannot master them all. But exploring and learning one tool deeply will help us keep moving to new tools easily.

I am exploring the following tools along with the ELK stack:

  1. Presto
  2. Spark
  3. Secor
  4. Druid
  5. Hadoop
  6. Hive

I do most of the programming in Python, but it becomes very slow when dealing with GBs of data. The Go language seems faster for working with text files, so I have started exploring Go too.

What are the new tools, technologies you are learning?

 

Image source: https://www.linkedin.com/pulse/real-time-stream-processing-big-data-platform-birendra-kumar-sahu

http://gcastd.com/

 

 

What I learnt from teaching the ELK stack in a workshop


Today, I trained a mixed group of students on doing real-time big data analysis at 4ccon, Chennai.

 


As big data is one of the trending words in the IT field, we got around 40 participants.

The participants were from the Electrical and CSE departments, along with a few working professionals.

Though we asked everyone to bring a laptop with the ELK stack preinstalled, many spot-registered participants did not bring a laptop or install anything.

That's fine. I had one full day; we could do the installations in an hour.

There were many unexpected issues.

1. Windows laptops

I never thought that people would come with Windows laptops. I did not even know that the ELK stack could run on Windows until I saw them.

I left Windows some 10 years ago and don't know how to do even basic things on it. Fortunately, Mr. Sivarama Selvan from NIC got the packages for Windows and demonstrated the following (a rough sketch of the steps is after the list):

1. Installing Java
2. Setting JAVA_HOME and PATH
3. Invoking Logstash with a sample configuration file
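Roughly, those steps look like this on the Command Prompt (the install paths and version numbers are only examples):

    :: set JAVA_HOME and PATH for the current Command Prompt session
    set JAVA_HOME=C:\Program Files\Java\jdk1.8.0_121
    set PATH=%JAVA_HOME%\bin;%PATH%

    :: run Logstash with a sample configuration file from the extracted zip
    cd C:\logstash-5.1.2
    bin\logstash.bat -f sample.conf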

Without him, I would have felt hopeless. Thanks a lot, sir.

2. Poor Internet

Though the college provided WiFi in all the rooms, we got very poor connectivity. The connection speed was too low, and the Ubuntu and Red Hat users lost their patience trying to install the packages from the repositories.

After some time, I asked them to log in to my laptop to explore the commands, and to connect to my Elasticsearch and Kibana from Chrome plugins (Elasticsearch Toolbox, Postman). As the WiFi was poor, they had to wait a long time to check even small things.

3. Windows users' behaviour

Our mixed-skilled participants found it very tough to work with the Command Prompt in Windows; many saw it for the first time. Even traversing through directories was tough, so I had to teach very basic commands like cd, dir, etc. I never thought I would be teaching MS-DOS commands in an ELK workshop.

We provided zip files for Logstash, Elasticsearch and Kibana, with sample configuration files for Logstash and Elasticsearch in another zip file.

The icons for a zip file and a folder look similar in Windows, and double-clicking a zip file opens it just like a folder. People double-clicked the zip files and edited the config files inside; when they tried to access those files from the Command Prompt, they could not reach them. It took me a long time to find the issue and train them on how to extract zip files. 😦

4. Editing Files

Some opened the sample Logstash config file in Notepad, which showed everything on a single line; changing values was tough.

Some opened it in MS Word and saved it as a .docx file.

Some had difficulty finding the file path to give in the sample config file.

5. curl for Windows

As curl is the main tool for interacting with Elasticsearch, I don't know how people can practise on Windows without it. I found curl for Windows, but downloading it over poor Internet and teaching how to install and use it in the Command Prompt was too much, so I skipped this part. I asked people to use Chrome plugins like Sense and Elasticsearch Toolbox instead; with these plugins, people can index only a few documents and cannot do bulk imports.
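For reference, these are the kind of curl calls we practise against Elasticsearch (assuming it runs on localhost:9200; the index and field names are just examples):

    # index a single document
    curl -XPUT 'http://localhost:9200/workshop/students/1' \
         -H 'Content-Type: application/json' \
         -d '{"name": "Kumar", "dept": "CSE"}'

    # search for it
    curl -XGET 'http://localhost:9200/workshop/_search?q=name:Kumar'

    # bulk import documents from a file in the Elasticsearch bulk format
    curl -XPOST 'http://localhost:9200/_bulk' \
         -H 'Content-Type: application/json' \
         --data-binary @sample-data.json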

6. ELK versions

Some had installed mixed versions of the ELK stack, which did not work as I showed on my laptop. After a deep troubleshooting session, we found the version issue and installed the same latest versions as on my laptop.

Finally, we got some hands-on learning.

Even with more than 50% of the time spent fixing these issues, I managed to explain the ELK stack. Demonstrated how to read a CSV file using Logstash, display the data on screen and send it to Elasticsearch (a sketch of such a config follows below). Then explained Elasticsearch and demonstrated indexing data, importing bulk data, search and delete. Then explored Kibana and asked them to create visualizations and dashboards, which they did with huge interest. Then demonstrated how we can get data from the Twitter stream and analyse it in Kibana.
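A Logstash config along those lines reads a CSV file, prints each event and indexes it into Elasticsearch (the file path and column names are made up for illustration):

    # csv-to-es.conf
    input {
      file {
        path => "/tmp/students.csv"
        start_position => "beginning"
        sincedb_path => "/dev/null"   # re-read the file on every run; handy for demos
      }
    }
    filter {
      csv {
        columns => ["name", "dept", "marks"]
      }
    }
    output {
      stdout { codec => rubydebug }   # display parsed events on screen
      elasticsearch {
        hosts => ["localhost:9200"]
        index => "students"
      }
    }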

Participants were happy to get some hands-on experience with the ELK stack.

I used the following links:

Config files and sample data: https://github.com/tshrinivasan/elk-training

https://www.digitalocean.com/community/tutorials/how-to-install-elasticsearch-logstash-and-kibana-elk-stack-on-ubuntu-16-04

http://ikeptwalking.com/elasticsearch-sample-data/

http://www.generatedata.com/

Sample config files: https://github.com/elastic/examples/tree/master/ElasticStack_twitter

Slides

Here are my learnings:

1. Never expect an Internet connection

Find a way to set up a quick local intranet. Always carry a WiFi router, so that VNC, SSH, web servers and file transfers are easy and fast.

Get some portable packages for GNU/Linux too.

Always be prepared to run the workshop without internet.

2. Learn a few Windows things and have software for Windows too

It is not good to ignore Windows users. When they come forward to learn something, we have to be prepared to teach them too.

Have copies of the ELK zip files, curl, PuTTY, VNC, Java setup files, the Notepad++ editor, Firefox/Chrome browsers, etc.

3. Prepare documentation and share it with participants

Prepare a how-to document covering installation, setup and examples, and share it with everyone. With this document, people can explore further once they go home. If possible, create video tutorials and share them online and offline.

4. Software versions

Make sure the software versions on your laptop and on the participants' machines are the same. The ELK stack changes a lot with every release.

5. Know the audience

Mostly, we get a mixed-skilled audience. I assumed they had basic computer skills like extracting files, understanding file paths and using the command line. When they lack these, we have to start by training them on the basics.

This was my first public training on ELK. I learnt tons of things during the preparation hours and at the workshop. Thanks to the participants; with their patience and interest in learning, the day was successful. Thanks also to the 4ccon volunteers for the wonderful event.