How to compile Tamil TTS engine from source?

Teams at IIT Madras and SSN College of Engineering have released an open source text-to-speech engine for Tamil and other Indian languages.

Check their efforts at https://www.iitm.ac.in/donlab/tts

In this blog post, let us see how to compile and install the Tamil TTS system on an Ubuntu 16.04 machine.

Go to this link

https://www.iitm.ac.in/donlab/tts/voices.php

It will ask you to sign up.
Register with a username and email address.

Soon, you will get an email with the password to log in.

If you don't get a password, send an email to hema@cse.iitm.ac.in

Once you have the password, log in at the same link.

Now, you will see two drop-down lists.

Select Language = Tamil
Select Voice type = HTS-2.3

Click “Agree & Download”

It will download a file ssn_hts_demo_tamil_male.tgz

gunzip ssn_hts_demo_tamil_male.tgz
tar xvf ssn_hts_demo_tamil_male.tar

cd ssn_hts_demo

Read the README.txt

Step 1 of the README says:
Step 1: Configure the folder
./configure --with-fest-search-path=/$FESTDIR/examples/ --with-sptk-search-path=/usr/local/SPTK/bin/ --with-hts-search-path=/usr/local/HTS-2.2beta/bin/ --with-hts-engine-search-path=/PATH TO hts_engine_API-1.06/bin/

It requires the following software:

1. Festival (we can install it using apt-get)
2. SPTK
3. HTS
4. hts_engine_api

Items 2, 3, and 4 must be downloaded as source from their respective sites and compiled.

1. To install Festival, run the command below

sudo apt-get install festival

Install a few required packages

sudo apt-get install -y festival libx11-dev build-essential g++-4.7 csh gawk bc sox tcsh default-jre
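
Before going further, it can help to confirm that the basic build tools are actually on the PATH. The snippet below is a minimal sketch and not part of the original instructions; the function name missing_tools and the tool list in the usage comment are assumptions made for illustration.

```shell
#!/bin/sh
# Print each of the given command names that is NOT found on PATH.
# Prints nothing when every command is available.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Example (the tool list is an assumption, adjust as needed):
# missing_tools festival gcc make wget tar patch
```

If the example call prints any names, install those packages before starting the builds.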

2. Download the SPTK source from http://sp-tk.sourceforge.net/

wget https://nchc.dl.sourceforge.net/project/sp-tk/SPTK/SPTK-3.10/SPTK-3.10.tar.gz

tar xvzf SPTK-3.10.tar.gz

cd SPTK-3.10
./configure --prefix=/home/ubuntu/tts/sptk
make
make install

3. HTS-HTK
Download from http://hts.sp.nitech.ac.jp/?Download

mkdir hts-htk
cd hts-htk
wget http://hts.sp.nitech.ac.jp/archives/2.3/HTS-2.3_for_HTK-3.4.1.tar.bz2

tar xvjf HTS-2.3_for_HTK-3.4.1.tar.bz2

The INSTALL file lists a few things to do.

Let us do them.

Download HTK from
http://htk.eng.cam.ac.uk/download.shtml

It requires you to register with a username, email, organization and address.
Once registered, you will get a password by mail.

Using that, you can download the packages.

http://htk.eng.cam.ac.uk/ftp/software/HTK-3.4.1.tar.gz

wget http://htk.eng.cam.ac.uk/ftp/software/HTK-3.4.1.tar.gz --user=htkuserchennai --password=sgqY=t=M

Download HDecode from
http://htk.eng.cam.ac.uk/prot-docs/hdecode.shtml

wget http://htk.eng.cam.ac.uk/ftp/software/hdecode/HDecode-3.4.1.tar.gz --user=htkuserchennai --password=sgqY=t=M
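
Passing --password on the command line leaves the password in the shell history. As a possible alternative, GNU wget can prompt for it with --ask-password. The helper below is only a sketch under that assumption; htk_fetch and HTK_BASE_URL are names made up here, and the username is whatever you registered with.

```shell
#!/bin/sh
# Base URL of the HTK download area, taken from the links in this post.
HTK_BASE_URL=http://htk.eng.cam.ac.uk/ftp/software

# htk_fetch USER PATH
# Download one archive; wget prompts for the password interactively
# instead of reading it from the command line.
htk_fetch() {
    user=$1
    path=$2
    wget --user="$user" --ask-password "$HTK_BASE_URL/$path"
}

# Example (hypothetical username):
# htk_fetch myuser HTK-3.4.1.tar.gz
# htk_fetch myuser hdecode/HDecode-3.4.1.tar.gz
```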

tar -zxvf HTK-3.4.1.tar.gz
tar -zxvf HDecode-3.4.1.tar.gz

cd htk

patch -p1 -d . < ../hts-htk/HTS-2.3_for_HTK-3.4.1.patch

./configure --prefix=/home/ubuntu/tts/hts

make

Now, I got the following error.

(cd HTKLib && make HTKLib.a) \
|| case "" in *k*) fail=yes;; *) exit 1;; esac;
make[1]: Entering directory ‘/home/ubuntu/htk/HTKLib’
gcc -Wall -Wno-switch -g -O2 -I. -DPHNALG -c -o HGraf.o HGraf.c
HGraf.c:118:77: fatal error: X11/Xlib.h: No such file or directory
compilation terminated.
<builtin>: recipe for target ‘HGraf.o’ failed
make[1]: *** [HGraf.o] Error 1
make[1]: Leaving directory ‘/home/ubuntu/htk/HTKLib’
Makefile:141: recipe for target ‘HTKLib/HTKLib.a’ failed
make: *** [HTKLib/HTKLib.a] Error 1

To solve this, run the below command

sudo apt-get install libx11-dev

https://stackoverflow.com/questions/5299989/x11-xlib-h-not-found-in-ubuntu

Thanks to the Stack Overflow community for the answer.

Run make again

make

Got another error as below.

gcc -Wall -Wno-switch -g -O2 -I. -DPHNALG -c -o esignal.o esignal.c
In file included from /usr/include/string.h:630:0,
from esignal.h:34,
from esignal.c:29:
esignal.c: In function ‘ReadHeader’:
esignal.c:974:29: error: ‘ARCH’ undeclared (first use in this function)
if (strcmp(architecture, ARCH) == 0) /* native architecture */
^
esignal.c:974:29: note: each undeclared identifier is reported only once for each function it appears in
esignal.c: In function ‘WriteHeader’:
esignal.c:1184:25: error: ‘ARCH’ undeclared (first use in this function)
architecture = ARCH;
^
esignal.c: In function ‘GetLine’:
esignal.c:1760:4: warning: ignoring return value of ‘fgets’, declared with attribute warn_unused_result [-Wunused-result]
fgets(buf, len+1, file);
^
esignal.c: In function ‘GetLong’:
esignal.c:1808:4: warning: ignoring return value of ‘fgets’, declared with attribute warn_unused_result [-Wunused-result]
fgets(buf, len+1, file);
^
<builtin>: recipe for target ‘esignal.o’ failed
make[1]: *** [esignal.o] Error 1
make[1]: Leaving directory ‘/home/ubuntu/htk/HTKLib’
Makefile:141: recipe for target ‘HTKLib/HTKLib.a’ failed
make: *** [HTKLib/HTKLib.a] Error 1

Again, Stack Overflow helped.

https://stackoverflow.com/questions/37719890/install-hts-2-3-for-htk-3-4-1-on-ubuntu-16-04-has-error

Run the below commands

sudo apt-get install g++-4.7
export CC=gcc-4.7 CXX=g++-4.7
./configure CFLAGS="-DARCH=linux" --prefix=/home/ubuntu/tts/hts
make
make install

Next is hts_engine_API.

Download it from https://sourceforge.net/projects/hts-engine/files/hts_engine%20API/hts_engine_API-1.10/

wget https://nchc.dl.sourceforge.net/project/hts-engine/hts_engine%20API/hts_engine_API-1.10/hts_engine_API-1.10.tar.gz

tar xvzf hts_engine_API-1.10.tar.gz
cd hts_engine_API-1.10
./configure --prefix=/home/ubuntu/tts/hts_engine_api
make
make install

Then, a few more commands.

cd /usr/share/doc/festival/examples/
sudo gunzip dumpfeats.gz

sudo gunzip dumpfeats.sh.gz
sudo chmod a+rx /usr/share/doc/festival/examples/dumpfeats
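
Running gunzip a second time fails because the .gz files are already gone. If you expect to repeat these steps, a small guard makes them re-runnable. This is a sketch only; gunzip_if_present is a hypothetical helper name, and for files under /usr/share/doc it would have to run with root privileges.

```shell
#!/bin/sh
# gunzip_if_present FILE...
# For each FILE, decompress FILE.gz in place only when it exists,
# so the step can be repeated without errors.
gunzip_if_present() {
    for f in "$@"; do
        if [ -f "$f.gz" ]; then
            gunzip "$f.gz"
        fi
    done
}

# Example (run as root for system paths):
# gunzip_if_present /usr/share/doc/festival/examples/dumpfeats \
#                   /usr/share/doc/festival/examples/dumpfeats.sh
```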

That's all. We have installed all the dependencies of ssn_hts_demo.
Let us install it now.

cd ssn_hts_demo

./configure --with-fest-search-path=/usr/share/doc/festival/examples --with-sptk-search-path=/home/ubuntu/tts/sptk/bin/ --with-hts-search-path=/home/ubuntu/tts/hts/bin/ --with-hts-engine-search-path=/home/ubuntu/tts/hts_engine_api/bin/
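
The four search paths all follow from the install prefixes chosen earlier, so they can be derived from one root directory instead of being typed by hand. The function below is a sketch under that assumption; ssn_configure_args is a hypothetical name, and the sptk, hts and hts_engine_api subdirectory layout simply mirrors the prefixes used in this post.

```shell
#!/bin/sh
# ssn_configure_args [ROOT] [FESTIVAL_EXAMPLES_DIR]
# Print the ./configure arguments for ssn_hts_demo, one per line.
# ROOT is the directory holding the sptk, hts and hts_engine_api prefixes.
ssn_configure_args() {
    root=${1:-/home/ubuntu/tts}
    fest=${2:-/usr/share/doc/festival/examples}
    printf '%s\n' \
        "--with-fest-search-path=$fest" \
        "--with-sptk-search-path=$root/sptk/bin/" \
        "--with-hts-search-path=$root/hts/bin/" \
        "--with-hts-engine-search-path=$root/hts_engine_api/bin/"
}

# Example (works as long as the paths contain no spaces):
# ./configure $(ssn_configure_args "$HOME/tts")
```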

sudo mv /usr/share/festival/radio_phones.scm /usr/share/festival/radio_phones.scm-old

sudo cp ~/ssn_hts_demo/radio_phones.scm /usr/share/festival/

sudo cp ~/ssn_hts_demo/Slurp.pm /usr/share/perl5/File/

gcc scripts/tamil_trans.c -o scripts/tamil_trans

That's all. We are done with all the compilation work.

Let us invoke the command to convert Tamil text to audio.

export FESTDIR=/usr
ssn_hts_demo/scripts/complete "தமிழ் வாழ்க" linux

This will create the audio file ssn_hts_demo/wav/1.wav
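
A missing FESTDIR is an easy mistake here, so a thin wrapper can check for it and confirm that a WAV file was really written. This is a sketch, not part of the demo package; tamil_tts and the DEMO_DIR variable are hypothetical names, and it assumes the script writes to wav/1.wav inside the demo folder as described above.

```shell
#!/bin/sh
# tamil_tts TEXT
# Run the demo's "complete" script and print the path of the WAV file.
# DEMO_DIR (assumed name) points at the ssn_hts_demo folder.
tamil_tts() {
    text=$1
    demo_dir=${DEMO_DIR:-ssn_hts_demo}
    if [ -z "${FESTDIR:-}" ]; then
        echo "FESTDIR is not set; try: export FESTDIR=/usr" >&2
        return 1
    fi
    "$demo_dir/scripts/complete" "$text" linux || return 1
    # -s is true only when the file exists and is not empty.
    if [ ! -s "$demo_dir/wav/1.wav" ]; then
        echo "no audio produced at $demo_dir/wav/1.wav" >&2
        return 1
    fi
    echo "$demo_dir/wav/1.wav"
}

# Example:
# export FESTDIR=/usr
# tamil_tts "தமிழ் வாழ்க"
```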

I can now play the file with any audio player and hear the text read in a good Tamil voice.

I also tried it with a much longer text.

Here is the demo.

The text I gave is:

என் சரித்திரம், உ. வே. சாமிநாதையர் எழுதிய தன்வரலாறு ஆகும். இதில் 1855ஆம் ஆண்டு முதல் 1898ஆம் ஆண்டு வரை அவருடைய வாழ்வில் நிகழ்ந்தவை பதியப்பட்டுள்ளன. இதில் அவர் தமிழ் கற்ற வரலாறு, தமிழ் நூல்களைப் பதிப்பித்த வரலாறும் பதிவுசெய்யப்பட்டு உள்ளன.தமிழ்த்தாத்தா டாக்டர் உ.வே.சா. அவர்கள் எழுதிய தன் வரலாற்று நூல் இது. இந்நூலைக் கற்றால் ‘பெருக்கத்து வேண்டும் பணிதல்’ என்ற இலக்கணத்துக்கு இதுதான் சரியான இலக்கியம் என்ற உண்மை தெளிவாகும். பேதங்களுக்கு அப்பாற்பட்ட போதம்தான் தமிழ்ஞானம் என்பது இந்நூலின் தொகுமொத்தப் பொருள் என்றால் அது மிகையாகாது. ’நன்றிக்கு வித்தாகும் நல்லொழுக்கம்’ என்ற தொடரை விளக்குவதற்காக இவர் மண்ணுலகில் பிறந்தார் என்று கொள்ள வேண்டி இருக்கிறது. டாக்டர் உ.வே.சா. அவர்களின் என் சரித்திரமும் மகாத்மா காந்திஜி அவர்களின் சத்திய சோதனையும் ஒரேதரம் உடையவை. இவற்றின் ஒவ்வோரெழுத்தும் வாய்மை நிரம்பிய வைர எழுத்துக்கள்.என் சரித்திரம் கற்றால் தமிழார்வம் வரும். வந்த தமிழார்வம் வளரும். பத்துப்பாட்டும், எட்டுத்தொகையுள் ஐந்தும், மூன்று பெரும் காப்பியங்களும், ஐம்பதிற்கும் மேற்பட்ட பிற இலக்கியங்களும், இலக்கண நூல்களும் நின்று நிலவுவதற்குக் காரணம், டாக்டர் உ.வே.சா. அவர்களின் அயரா உழைப்பே என்பதை, இந்த மன்பதை அறியும். அந்த நூல்களைக் கற்கும் முன், ’என் சரித்திரம்’ என்னும் இந்த நூலைக் கற்க வேண்டும். இதனைக் கற்றால் தமிழ் நூல்களை அச்சுக்குக் கொண்டுவர அவர்பட்ட இன்னல்கள் புரியும்.1. எங்கள் ஊர்சற்றேறக்குறைய இருநூறு வருஷங்களுக்கு முன்பு தஞ்சாவூர் ஸமஸ்தானத்தை ஆண்டு வந்த அரசர் ஒருவர் தம்முடைய பரிவாரங்களுடன் நாடு முழுவதையும் சுற்றிப் பார்க்கும் பொருட்டு ஒருமுறை தஞ்சாவூரிலிருந்து புறப்பட்டார். அங்கங்கே உள்ள இயற்கைக் காட்சிகளை யெல்லாம் கண்டு களித்தும், ஸ்தலங்களைத் தரிசித்துக்கொண்டும் சென்றார். இடையில், தஞ்சைக்குக் கிழக்கே பதினைந்து மைல் தூரத்திலுள்ள பாபநாசத்திற்கு அருகில் ஓரிடத்தில் தங்கினார். வழக்கம்போல் அங்கே போஜனம் முடித்துக்கொண்ட பிறகு தாம்பூலம் போட்டுக்கொண்டு சிறிது நேரம் சிரம பரிகாரம் செய்திருந்தார்; தம்முடன் வந்தவர்களோடு பேசிக்கொண்டு பொழுதுபோக்குகையில் பேச்சுக்கிடையே அன்று ஏகாதசி யென்று தெரிய வந்தது. 
அரசர் ஏகாதசியன்று ஒரு வேளை மாத்திரம் உணவுகொள்ளும் விரதமுடையவர்; விரத தினத்தன்று தாம்பூலம் தரித்துக்கொள்வதும் வழக்கமில்லை. அப்படியிருக்க, அவர் ஏகாதசி யென்று தெரியாமல் அன்று தாம்பூலம் தரித்துக்கொண்டார். தஞ்சாவூராக இருந்தால் அரண்மனை ஜோதிஷர் ஒவ்வொரு நாளும் காலையில் வந்து அன்றன்று திதி, வார, நக்ஷத்திர, யோக, கரண விசேக்ஷங்கள் இன்னவையென்று பஞ்சாங்கத்திலிருந்து வாசித்துச் சொல்வார். அதற்காகவே அவருக்கு மான்யங்களும் இருந்தன.அரசருடைய பிரயாணத்தில் ஜோதிஷர் உடன் வரவில்லை. அதனால் ஏகாதசியை அரசர் தெரிந்துகொள்ள முடியவில்லை. எதிர்பாராதபடி விரதத்திற்கு ஒரு பங்கம் நேர்ந்ததைப் பற்றி வருந்திய அரசர் அதற்கு என்ன பரிகாரம் செய்யலாமென்று சில பெரியோர்களைக் கேட்கத் தொடங்கினர்.(மேலும் படிக்க…)

Here is the audio.

Yes. This is the best open source text-to-speech engine for Tamil so far.

Many thanks to the IITM team and SSN College of Engineering for building the TTS engine and releasing it as open source, free of charge.

I am working on making the installation easier with a shell script.

I will share it once it is done.


8 thoughts on “How to compile Tamil TTS engine from source?”

Pingback: Installation script for Tamil Text to speech System | Going GNU
Vasanth Kumar

October 25, 2017 at 12:01 pm

Hi sir, it's really awesome to see someone with such valuable posts on Tamil TTS & NLP.
I tried to build this Tamil TTS on my 32-bit Ubuntu 16.04, but I couldn't synthesize the output, sir. I even used your installation script, but it is not working.

Error : hts_engine -m
Invalid option -m in hts_engine sir.

Please sir, post a tutorial on building a DNN-based Tamil TTS using Merlin/Ossian with the speaker data provided by IIT for Tamil.
It would be sooooooo great sir.

Thank you so much in advance. Can you help me solve my error, anna? 🙂

- tshrinivasan
  
  October 25, 2017 at 4:29 pm
  
  http://github.com/tshrinivasan/tamil-tts-install
  
  Try this repo.
  
swaraj

March 20, 2018 at 1:04 pm

Hello Shrinivasan,

I’m Swaraj from Kerala. I working with an NGO called Space Trivandrum. I’m planning to integrate malayalam TTS into Orca. For that I did some basic research. I found that Indic TTS has developed 13 languages but could not find ready to use outputs from the project. Finally reached to your blog and saw a great documentation on installing and using Tamil TTS using the data from Indic TTS. Thankyou so much for doing that. From your blog posts I understand the efforts you put into building this on your own, enquiring through mails and not getting any reply and filing RTI.

The synthesized demo audio quality just blew my mind. Out of curiosity I compiled and built the Tamil TTS using the shell scripts you provided, but I could not generate audio output. i copied the text given in the blog and saved in a .txt file format and run the python script.

This is the error on the terminal.

rm: cannot remove ‘wordpronunciation’: No such file or directory
rm: cannot remove ‘temp_2’: No such file or directory
rm: cannot remove ‘temp_1’: No such file or directory
Loading default properties from tagger models/tamil_pos.tagger
Reading POS tagger model from models/tamil_pos.tagger … done [0.3 sec].
Tagged 281 words at 156.28 words per second.
No english character in map file for ’
No english character in map file for ’
FESTDIR: Undefined variable.
System error.
Could not open sound file “wav/1.wav”.
cat: ‘*.mp3’: No such file or directory
rm: cannot remove ‘0*.mp3’: No such file or directory

hope you can help me on this. looking forward to a reply from you soon and thank you in advance.

- tshrinivasan
  
  March 20, 2018 at 2:35 pm
  
  Run these commands:
  
  echo "FESTDIR=/usr" >> $HOME/.profile
  source $HOME/.profile
  
  Then run the conversion again.
  
  Share the results.
  
  - swaraj
    
    March 20, 2018 at 5:19 pm
    
    Yes, it did work. Thank you.
Pingback: Install OFFLINE Tamil TTS – தமிழ் AI
Pingback: தமிழின் எதிர்காலமும் தகவல் தொழில்நுட்பமும் 12. ஏன் திறந்த மூலமும், திறந்த தரவுகளும், திறந்த ஆய்…