Project Idea – Automation script needed to download British Library books


The British Library has already digitized many Indian books (in Tamil, Bengali and other languages) and uploaded them to its website.[1] Each book is split into separate pages in .tiff format, so we need a script to automate transferring them to Internet Archive/Commons as a single PDF/DjVu file, so that we can use them in Wikisource.

https://i1.wp.com/eap.bl.uk/images/header_main.jpg
I got this request from my Wikipedia friend Bodhisattwa Mandal.
I checked a few Tamil books.
Example:
http://eap.bl.uk/database/overview_item.a4d?catId=164997;r=18467

“Access for research purposes only” is the license for this file.

But it seems that these books are very old and already in the public domain.
If so, we are free to download them and publish them anywhere.
Now, we need a program in Python or any other language to download all the books and magazines from the site http://eap.bl.uk and to provide them as individual PDF files or as a zip file of images.
Once we get the PDF or image files, we can OCR them using Google OCR and get the text out of them. Then, we can publish both the images and the text to the Wikisource sites for further proofreading and fixing, using OCR4WikiSource.
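
As a starting point, here is a minimal Python sketch of the "zip file of images" step only. It assumes the page images of one book are already downloaded into a local folder, and that each file name ends with its page number; both are my assumptions, since I have not worked out how the site names or serves the page files. The function names and the folder layout are made up for illustration.

```python
import re
import zipfile
from pathlib import Path


def page_number(path):
    """Return the last run of digits in the file name, or 0 if there is none."""
    numbers = re.findall(r"\d+", path.stem)
    return int(numbers[-1]) if numbers else 0


def bundle_book(page_dir, zip_path):
    """Pack every .tif/.tiff page in page_dir into one zip, in page order."""
    pages = sorted(
        (p for p in Path(page_dir).iterdir()
         if p.suffix.lower() in (".tif", ".tiff")),
        # Sort numerically so that page_10 comes after page_2.
        key=lambda p: (page_number(p), p.name),
    )
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_STORED) as archive:
        for page in pages:
            archive.write(page, arcname=page.name)
    return [page.name for page in pages]
```

Calling bundle_book("downloads/book_0001", "book_0001.zip") would give one zip per book. Downloading the pages and converting them to PDF/DjVu are separate steps that this sketch does not cover.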
If you are interested in contributing to this project, reply with your details in a comment or send a mail to tshrinivasan@gmail.com
Thanks.

Looking for Commercial Tamil Translators


I got a request from a college professor to translate slideshows into Tamil for the following subjects.

  1. Embedded Systems
  2. Discrete Structures

The Embedded Systems subject has electronics and programming concepts. Discrete Structures is full of high-level mathematics. Each book has around 500 pages.

The style can be a mix of Tamil and English (for tech terms). It is a paid job. We may get more content on various subjects to translate, if we complete this.

If you are interested in this translation work, comment here or send an email to tshrinivasan@gmail.com with your profile and translation experience.

Share this info with the Tamil translators you know.

 

Co-operative business model – few thoughts -2


Last Sunday, I met Mr. Ganesh to discuss the co-operative business model. Here are the points we discussed.

This is a plan to run a business on the co-operative model. Anyone can buy shares to be a part of the business. We can buy the products from the source itself, so that we can buy at low cost and in large quantities.

Read here for the initial thoughts.

https://goinggnu.wordpress.com/2017/03/18/can-we-build-co-operative-organizations-for-consumer-community/

Need to plan on the following.

1. Product to sell
2. Place
3. Workers/Salary
4. Storage place/Electricity
5. Advertisements
6. Product selling cost
7. Membership cost
8. Return of the loans
9. Benefits for the members

Investments:
To start operating, we need investment.

We can get investments, via
1. Bank loan
2. Personal Investments

Heard that if we form a co-operative society, the government will assign a person as secretary to monitor the operations and will give some funds as a loan. Need to read the laws for co-operative societies to know more about this. If you know anything about this, please share the details.

Management:
We need to get volunteers or hire some people to do all the management activities.

Need to fix the following
selecting people
roles
how long they can serve for the role
rules to resolve any issues or complaints about them

Transparency:
Transparency is one of the key pillars of this system.
All the buying prices, selling prices, profits, losses, bills, sales and accounts should be open to the public. All the processes should be recorded and put online immediately.

Buying:
We may not be able to buy from the source, i.e. the farmers, directly. But we can buy from the first-level agents. Need to plan for buying from the source/manufacturers itself, to eliminate the man in the middle.

Quality:
Need to fix Quality levels, QA process, people, inspections in all the levels.

Cost of the products:
The selling price of the products should be lower for members. Non-members should pay a little more, but still below the market rate.

Market Survey:
Need to do a market survey to choose the products to sell. They may vary with the location of the shops. We can do it ourselves or outsource it to a marketing agency.

Share cost and Dividend:
How to calculate share cost and dividend?
How many shares can one person buy?
How to keep the shareholders from interfering with the business operations?

Like this, there are so many things to be discussed and fixed. Need to create a strong system covering all the ifs and buts. The rules we create should manage the entire system, regardless of the people involved in any position.

Started to discuss the co-operative business model with friends. Have to discuss it with some local business people. They may see it from a different angle, maybe as competition. Hope they will give some interesting inputs. Will share after some discussions.

Are there any books available on how co-operative business models operate? How are the UK co-operative businesses working?

Please share your thoughts on this business model.

Will you be a member of such a business?

YouRTI.in – submit RTI anonymously


A few days back, I wrote about my search for an organization that can help to file RTI requests anonymously.

https://goinggnu.wordpress.com/2017/03/09/is-it-possible-to-create-a-system-for-anonymous-rti/

Found it today. It is http://www.YouRTI.in

 

 

It helps to file RTI requests online for free. All the requests are anonymous. They file the RTI on our behalf. Once they get the results, they publish them online. If you need privacy for your RTI requests and responses, you have to pay them.

I am wondering how they provide this service for free and how they handle threats from unknown people.

Tons of thanks for their services. I filed two RTI requests on their site. They conducted an RTI awareness program in Hyderabad recently. Looking for such an event in Chennai.

If you are interested in such an event, share the details of people who can help with this. Let us create more awareness of RTI among the public.

Can we build Co-operative organizations for consumer community ?


One big cause of the constantly rising cost of food, groceries and other goods is the “Man in the Middle”.

In India, a farmer cannot fix the rate for his product, whereas the oil companies fix their rates and change them frequently. The prices of agricultural products are fixed by brokers and agents.

We, as consumers, cannot reach the farmers and buy from them, because

  1. we can not buy in large scale
  2. we can not store food products in large scale
  3. we have to travel a lot to buy directly from farmers. We won't travel 20-100 km to buy 1 kg of rice.

But if we somehow buy the goods from the farmers, we will get them at a very low cost and the farmers will get a higher income, as there is no man in the middle.

How can we achieve this?

In my native place, Kanchipuram, there are many silk manufacturers' societies and milk sellers' societies, where a group of people join together, manufacture and sell something.

Like this, why can't we form a consumers' society?

Imagine this. We all consume many things regularly. If we all form a society/organization by contributing a small amount of money as subscription fees, we can solve all the 3 issues mentioned above.

With the pooled money,

  1. we can buy in large scale
  2. we can build storages
  3. we can create many local shops so that members can walk and buy stuff

The management of the organization should be transparent. All the decisions should be well acknowledged by the members.

This is how the open source community builds software. When we need something, we join together, discuss transparently, take decisions, roll up our sleeves, get our hands dirty at the keyboard, write code, fix issues and get it done.

Similarly, here is a social problem. We need to buy directly from farmers. How can we achieve this? Building a consumer society and connecting it with other consumer societies across the state/country would be a good solution.

Imagine a consumer society is started today. What should its guidelines be? How much should the membership cost? It needs some initial money for a shop, storage, salaries, electricity bills etc. How can we pay for them until we reach a significant number of members, so that it runs on its own self-sufficient funds? What would be your doubts and fears about joining such a society?

Share your thoughts.

Do you think, this is impossible?

It has been proven for 165 years in the UK. The quote below is from the Wikipedia page https://en.wikipedia.org/wiki/The_Co-operative_Group

Read the full page to know more.

<quote>

The Co-operative Group, commonly known as the Co-op, is a British consumer co-operative with a diverse family of retail businesses including food retail; electrical retail; financial services; insurance services; legal services and funeralcare, with in excess of 4,500 locations. It is the largest consumer co-operative in the UK and owned by more than 4 million active members.[4] Membership is open to everyone aged 16 and over, provided they share the values and principles upon which the group was founded. Members are democratically involved in setting business strategy, decide how social goals are achieved, and share in its profits – in the last quarter of 2016 over £15m was returned to members and their chosen local community causes

The Co-operative Group has over 70,000 employees across the UK.

The Co-operative Group is unusual as a co-op because it is owned by millions of UK consumers and also a number of other UK co-operatives, making the business a hybrid of a primary consumers’ co-operative and a co-operative federation.

</quote>

File:The Co-operative, Balloon Street, Manchester.jpg

Image source – https://en.wikipedia.org/wiki/File:The_Co-operative,_Balloon_Street,_Manchester.jpg


Image source – https://www.flickr.com/photos/12859033@N00/2151256125

 

I am wondering how we missed the co-op culture of the UK, when we copied so much of the politics, law and culture from the British.

Yes. The consumer co-op culture is all around the UK and it is making the country wealthier.

Why can't we create such a nation-wide networked co-operative organization in India? We can. Share your thoughts on this. Let us build a great consumer society. Let us buy directly from farmers and manufacturers.

 

 

 

 

How can we add more data to OpenStreetMaps easily?


File:OSM Logo.svg

When exploring how to create maps in Tamil using OpenStreetMap, I found that the data currently available on OSM is not on par with Google Maps.

We cannot copy data from Google Maps and import it into OSM, as that is a big copyright violation. Yes. We don't have control over the data for the area where we live.

We can do the following.

  1. Look for data sources that may have data about streets, villages and cities. I think government departments like Postal, Revenue, Rural Development and Public Works may have this data. How can we ask them to share the data publicly? Will an RTI help with this? Do you have contacts with the heads of these departments? Please help to get map data from them.
  2. Add the data manually. Edit OSM just like how we add content to Wikipedia. Drawing roads and marking important places is easy. Watch the video below for a demo.

    This way, we can add any data manually, and edit and improve the existing data. But editing in the browser is not possible for many.

    It would be nice to have a mobile app to add data to OSM. When I had a smartphone a few years ago, I searched for an OSM mobile app. I could not find any app that helped to edit OSM.

    I don't use a smartphone nowadays.

    If you have a smartphone, can you search for apps that help to edit OSM easily?

    The app should be very simple. The user should open the app. It should capture the latitude and longitude from its GPS or the mobile tower. Then it should ask for the name of the building, building number, type of the place, street name, area name, city name and, if required, a photo. Once the user has entered this data, it should be synced to OSM. Contributing should be as simple as filling in a few forms.

    If we have such an app, we can build communities of volunteers to add data to OSM with their smartphones. Just open the app and fill in the data, and they are done.
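
To make the idea concrete, here is a rough Python sketch of what such an app could do with one filled-in form: turn it into the XML body for a new OSM node. As far as I understand, the OSM editing API (version 0.6) takes new nodes in this shape, with the details as tag elements. The form field names are made up for this example, the mapping of "type of the place" to the amenity key is a simplification, and the actual upload (login, opening a changeset, sending the request) is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical form field names mapped to commonly used OSM tag keys.
FIELD_TO_TAG = {
    "name": "name",
    "building_number": "addr:housenumber",
    "street": "addr:street",
    "area": "addr:suburb",
    "city": "addr:city",
    "place_type": "amenity",
}


def build_node_xml(lat, lon, changeset_id, form):
    """Turn one filled-in form into the XML body for a new OSM node."""
    osm = ET.Element("osm")
    node = ET.SubElement(osm, "node", {
        "changeset": str(changeset_id),
        "lat": f"{lat:.7f}",
        "lon": f"{lon:.7f}",
    })
    for field, key in FIELD_TO_TAG.items():
        value = form.get(field, "").strip()
        if value:  # skip fields the user left empty
            ET.SubElement(node, "tag", {"k": key, "v": value})
    return ET.tostring(osm, encoding="unicode")
```

For example, build_node_xml(12.8342, 79.7036, 1, {"name": "Post Office", "street": "Gandhi Road"}) returns the XML text for one node with two tags. The photo and the sync step would need more work.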

If there is no such easy editing app for OSM, it is high time to create one. If you are a mobile developer, please create such an app and help to make OpenStreetMap content rich.

Wondering how Google, Apple and Bing maps collected their data. What kind of mobile apps did they use, and what data did they collect? If you have worked on these maps, please share more details about them. It will help a lot.

There may be other easier, better ways to add data to OSM. Please share the details or connect with the communities.

Let us build content rich Open Street Maps.

 

Image source : https://commons.wikimedia.org/wiki/File:OSM_Logo.svg  – CC-BY-SA

 

 

 

 

 

 

 

Why do I love Apache Kafka?



From Wikipedia,

https://en.wikipedia.org/wiki/Apache_Kafka

<quote>

Apache Kafka is an open-source stream processing platform developed by the Apache Software Foundation written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Its storage layer is essentially a “massively scalable pub/sub message queue architected as a distributed transaction log,” making it highly valuable for enterprise infrastructures to process streaming data. Additionally, Kafka connects to external systems (for data import/export) via Kafka Connect and provides Kafka Streams, a Java stream processing library.

</quote>

I am using Kafka as a message queue in one of my projects, which gets a huge amount of real-time data. I get around 6000 to 1,00,000 events per minute. I tried to read those events with a custom Python script. The script could not keep up with that volume and missed many events.

 

I was looking for a stable data reading tool. I found Kafka and explored it. To my surprise, it worked well. I stress tested it with the tool “siege”, producing millions of test events. A single Kafka server received all the data and stored it.


It keeps all the data in its own internal log format, which can be compressed. By default, it retains data for a week. Anyone can write to it and anyone can read from it, in a very stable way.
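
A week of retention adds up quickly at these rates, so it is worth estimating the disk space before going live. Here is a small back-of-the-envelope sketch in Python. The average event size is purely an assumption for illustration (I am using 1,000 bytes), and the result is the uncompressed size, ignoring replication and index overhead.

```python
def events_retained(events_per_minute, retention_days=7):
    """Number of events held if the rate stays constant for the whole window."""
    return events_per_minute * 60 * 24 * retention_days


def storage_needed_gb(events_per_minute, avg_event_bytes, retention_days=7):
    """Uncompressed size of the retained events, in GB (10**9 bytes)."""
    total_bytes = events_retained(events_per_minute, retention_days) * avg_event_bytes
    return total_bytes / 10**9
```

At the low end of 6000 events per minute, a week holds about 60 million events, which is roughly 60 GB at 1,000 bytes each. At the peak of 1,00,000 (100,000) events per minute, it is about 1 billion events, or roughly 1 TB.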

Logstash is a perfect fit for reading from Kafka. It can then write to S3, another Kafka or Elasticsearch.

Installation is very simple. Just download, extract, start running it.

https://www.digitalocean.com/community/tutorials/how-to-install-apache-kafka-on-ubuntu-14-04

With the Confluent Platform, it can read and write JSON documents easily.

I strongly suggest using Kafka for any message queue requirement.


 

 

Real Time Bigdata Analysis – Few Tools


https://media.licdn.com/mpr/mpr/AAEAAQAAAAAAAAZbAAAAJGE5Y2ZiNmU2LWRhNTgtNDhlYi05YTY0LTAwYWVmY2EyZGY5Yw.png

 

Big Data Analysis is becoming one of the hot words in the IT industry. Everyone wants to analyze data. They all want to use tools like Hadoop, Spark etc. These are used to process huge amounts of data, i.e. terabytes in size. This is called “Historical Data Analysis”.

In contrast to this, there is “Real Time Data Analysis”, which processes a constantly incoming stream of data as soon as it arrives.

The typical data pipeline for Real Time Big Data Analysis is as below.

App/Site -> API Server -> Message Queue (Kafka) -> Processor (Logstash) -> Storage (Elasticsearch, Redis, MongoDB) -> Visualization (Kibana)
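
To show how the stages hand data to each other, here is a toy version of this pipeline in plain Python. It is only a sketch: an in-memory queue stands in for Kafka, a function stands in for Logstash, a dict stands in for the storage, and a simple count stands in for a Kibana chart. The event fields (page, user) are made up for the example.

```python
import json
import queue


def api_server(event, message_queue):
    """API server stage: accept an event and put it on the queue as JSON."""
    message_queue.put(json.dumps(event))


def processor(message_queue, storage):
    """Processor stage: drain the queue, clean each event, store it."""
    while True:
        try:
            raw = message_queue.get_nowait()
        except queue.Empty:
            break
        event = json.loads(raw)
        event["page"] = event["page"].lower()  # example transformation
        storage.setdefault(event["page"], []).append(event)


def visualize(storage):
    """Visualization stage: reduce the stored events to hits per page."""
    return {page: len(events) for page, events in storage.items()}
```

In the real pipeline each stage is a separate service that runs all the time, and the queue in the middle is what lets the processor fall behind for a while without losing events.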

A few years ago, we had to rely on Google Analytics and pay huge amounts of money to get real-time data about our site visitors, credit card swipes etc. Nowadays, we can build the entire pipeline with Free/Open Source Software alone.

https://i1.wp.com/blog.infochimps.com/wp-content/uploads/2012/05/realtime-analytics.png

With the following links, we can set up the data pipeline easily.

https://www.digitalocean.com/community/tutorials/how-to-install-apache-kafka-on-ubuntu-14-04

http://docs.confluent.io/3.2.0/kafka-rest/docs/index.html

https://www.digitalocean.com/community/tutorials/how-to-install-elasticsearch-logstash-and-kibana-elk-stack-on-ubuntu-14-04

Setting these up is easy. But once the real-time flow starts in production, remember, you are always on fire. You will feel like you are flying an aeroplane, with so many buttons on the dashboard. You have to keep running while solving real-time issues as they appear.

Explore these tools and learn their basics. Learning the basics will surely give sweet results.

There are tons of new tools coming into this arena. We cannot master all of them. But exploring and learning one tool makes it easier to pick up new ones.

I am exploring the following tools along with the ELK stack.

  1. Presto
  2. Spark
  3. Secor
  4. Druid
  5. Hadoop
  6. Hive

I do most of my programming in Python. It becomes very slow when dealing with GBs of data. The Go language seems faster for working with text files, so I have started exploring Go too.

What are the new tools, technologies you are learning?

 

Image source- https://www.linkedin.com/pulse/real-time-stream-processing-big-data-platform-birendra-kumar-sahu

http://gcastd.com/

 

 

Looking for a social media manager for FreeTamilEbooks.com


We are looking for a social media manager for FreeTamilEbooks.com

It is a voluntary task.

The roles are as below

  1. Publish new ebook arrivals to mailing lists, Facebook, Google Plus, WhatsApp and Telegram groups
  2. Update the XML file https://github.com/kishorek/Free-Tamil-Ebooks/blob/master/booksdb.xml for new ebooks
  3. Monitor social media and respond to queries from readers
  4. Contact bloggers and writers to get new content

If you are interested in volunteering for this, send an email to FreeTamilEbooksTeam@gmail.com

Join our forum
https://groups.google.com/forum/#!forum/freetamilebooksforum

and post your interest.

 

Filed an RTI to get info on Tamil TTS by IITM and SSN


I blogged on the topic “How to ask IITM to release IndicTTS as Free/Open Source Software?” recently.

Had a good discussion about this on the ILUGC mailing list too.
https://www.freelists.org/post/ilugc/How-to-ask-IITM-to-release-IndicTTS-as-FreeOpen-Source-Software

As a follow-up, I filed an RTI on this.

We can request any information online through the portal https://rtionline.gov.in itself. Registration asks for our address and phone number. Then we can fill in the request form.

Note: The content box does not allow question marks or URLs.
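
If you draft the request text elsewhere, a small script can clean it up before pasting. Below is a minimal Python sketch that spells out URLs the way I did in my request ("speech DOT ssn DOT edu DOT in") and replaces question marks with full stops. I do not know the exact rules the portal checks, so treat this as an assumption-based helper; the function name is my own, and slashes in a URL path are left as they are.

```python
import re

# An http(s) URL, not counting trailing punctuation as part of it.
URL_PATTERN = re.compile(r"https?://\S*[^\s.,;:!?)]", re.IGNORECASE)


def _spell_out(match):
    """Rewrite one URL as '[ host DOT name ]', dropping scheme and query."""
    address = re.sub(r"^https?://", "", match.group(0), flags=re.IGNORECASE)
    address = address.split("?", 1)[0].rstrip("/")
    return "[ " + address.replace(".", " DOT ") + " ]"


def sanitize_rti_text(text):
    """Remove URLs and question marks from a draft RTI request."""
    text = URL_PATTERN.sub(_spell_out, text)
    # Turn each question mark into a full stop so sentences stay separated.
    return re.sub(r"\s*\?+", ".", text)
```

Read the cleaned text once before submitting, since a question without its question mark can sometimes read oddly.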

Asked for the below information.

1. Is there any government policy or G.O to release the software developed by or funded by Department of Electronics and Information Technology (DeitY) and Ministry of Communication and Information Technology (MCIT) as Free/Open Source software

2. Is IITMadras funding SSN engg college to develop a Tamil Text to speech software

3. If so, how much is the funded amount

4. Send me the project plan, roadmap, and cost splitups for the development

5. There is a open source android app for Tamil text to speech at IITM site. (IITM donlab site)
This is very very old. But the latest development by SSN college available at  [ speech DOT ssn DOT edu DOT in ]  is very new and works well. Why it is not released as Free/Open Source Software with source code

6. When can we get the latest Tamil Text to speech software from SSN college, as free/open source software with source code

Once submitted, I paid Rs 10 via the portal's online payment gateway. It was smooth.

Received an acknowledgment as below.

https://i0.wp.com/storage9.static.itmages.com/i/17/0310/h_1489149948_9704514_0b02428daa.jpeg

Fine. Let me wait for 30 days for the responses.

Will share the results here.