Text to Speech with the Arduino Yún

The ability to translate text to speech is a feature that many different Arduino applications can benefit from. Imagine having a voice reading out updated sensor values, or a way to give audible feedback that's way more insightful than a flashing LED - it opens up new ways for projects to interact with users.

In Project 7 text to speech is used to warn users that their voice command wasn't recognized. In the Siri Clone project it is used to speak answers to a user's questions.

The Yún's combination of a Linux processor that supports audio software and a USB port that audio interfaces can plug into is exactly what's needed to make this possible with an Arduino.

In this demo I'll take you through the steps on how to get started with text to speech on the Yún - from loading software to writing a sketch that talks to users.

Parts List

To follow along, you need:

  1. An Arduino Yún, connected to the Internet over Ethernet or Wi-Fi

  2. A microSD card, with the OpenWrt-Yun Linux file system expanded onto it

  3. A USB audio interface (used in video)

  4. A pair of desktop PC speakers with a 3.5 mm stereo jack plug and their own power supply (used in video), or a set of headphones with a 3.5 mm stereo jack plug

Working with USB Audio Interfaces

In order to play audio from the Yún, we need to connect external speakers (or headphones) to it. The easiest way to accomplish this is with a cheap USB audio interface that provides a 3.5 mm speaker output socket.

Simply plug the audio interface into the Yún's vertically mounted USB socket, then plug the speaker's jack into the speaker output socket.

OpenWrt-Yun, like other distributions of Linux, is capable of playing audio. However, this requires audio support files that are not installed by default.

To install these files:

  1. Log in to the OpenWrt-Yun command line.

  2. At the command prompt, type the following command and then press Enter:

    opkg update

  3. Type the following command, then press Enter:

    opkg install kmod-usb-audio
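
To confirm that the driver picked up the interface, one option is to look at the sound card list kept by the Linux audio layer (ALSA). The following is a minimal sketch, assuming the list is exposed at /proc/asound/cards on your Yún. The helper name has_sound_card is made up for this example and is not part of OpenWrt-Yun.

```shell
# has_sound_card: hypothetical helper, not part of OpenWrt-Yun.
# Succeeds if the ALSA card list names at least one sound card.
# Assumption: the kernel exposes the list at /proc/asound/cards.
has_sound_card() {
    cards_file="${1:-/proc/asound/cards}"
    # Card entries begin with an index number, for example " 0 [Device ...".
    # With no cards present the file holds only a "no soundcards" notice.
    grep -q '^ *[0-9]' "$cards_file" 2>/dev/null
}
```

With the interface plugged in, you could run it like this:

    has_sound_card && echo "Sound card found" || echo "No sound card found"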

Installing & Using eSpeak

eSpeak is an open source software speech synthesizer for Linux that supports English and other languages. It makes text to speech conversion easy from the command line.

For eSpeak to work on the Yún, you need to install a compiled version that is compatible with OpenWrt-Yun. To begin, first make sure that the GCC compiler is installed:

  1. At the OpenWrt-Yun command prompt, type the following command and then press Enter:

    opkg update

  2. Type the following command and then press Enter:

    opkg install binutils

  3. Type the following command and then press Enter:

    opkg -t /root install yun-gcc

To install eSpeak:

  1. Type the following command and then press Enter:

    wget --no-check-certificate http://arduinomeetslinux.com/download/espeak_1.48.04_arm71xx.ipk

  2. Type the following command and then press Enter:

    opkg install espeak_1.48.04_arm71xx.ipk

  3. Type the following command and then press Enter:

    rm espeak_1.48.04_arm71xx.ipk

Once installed, you can test eSpeak from the command line:

  • Type the following command and then press Enter:

    espeak "Hello from Arduino Meets Linux."

Try some more sample sentences by calling the 'espeak' command followed by the string of text you want translated. Multi-word strings should be wrapped in quotation marks.
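
The quotation marks matter because the shell splits an unquoted command line into separate arguments at each space. The short sketch below illustrates the difference. count_args is a made-up helper for this demonstration only and has nothing to do with eSpeak itself.

```shell
# count_args: hypothetical helper that reports how many arguments it received.
count_args() {
    echo "$#"
}

count_args Hello from Arduino      # prints 3: three separate arguments
count_args "Hello from Arduino"    # prints 1: one quoted string
```

Wrapping the sentence in quotation marks ensures the whole sentence reaches the program as a single piece of text.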

You can even see how eSpeak handles translations of numbers:

  • Type the following command and then press Enter:

    espeak "Today is Monday, May 15, 2015."

The eSpeak program accepts command-line parameters that you can use to change how the program speaks your text. The following table shows a few of the most common options that you may need to use.

Parameter   Description
a           Sets the volume. This is a number in the range 0 through 200. The default setting is 100.
p           Sets the pitch. This is a number in the range 0 through 99. The default setting is 50.
s           Sets the speed in words per minute. The default setting is 175. There is no upper limit, but the lowest setting that you can use is 80.
m           Specifies that the text contains Speech Synthesis Markup Language (SSML). You can use SSML for more precise control of eSpeak's pronunciation.
w           Writes the speech output to a WAV file.
v           Selects a language and voice. The available English voices are:
              en – standard English
              en-us – American English
              en-sc – English with a Scottish accent
              en-n – English with a northern accent
              en-rp – Received Pronunciation ("BBC English")
              en-wm – English with an accent from the West Midlands

For a full description of all of the available options, see the eSpeak documentation.

To set an option, put a dash before the parameter name, then a space, and then the value for the option. To set more than one option, separate each with a space. For example:

espeak -a 50 -v en -w /mnt/sda1/P7/test.wav "Hello"

There are also seven male and five female voice variants. To use one, append +m1, +m2, +m3, +m4, +m5, +m6, +m7, +f1, +f2, +f3, +f4, or +f5 to the language. For example:

espeak -v en-n+m3 "Hello"
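
If you find yourself repeating the same options, a small shell function can bundle them. This is only a sketch under stated assumptions: the function name say, its argument order, and the ESPEAK override variable are all invented here for illustration. Only the -v and -s options come from the table above.

```shell
# say: hypothetical wrapper around espeak, for illustration only.
# Usage: say VOICE SPEED TEXT
# ESPEAK is an assumed override variable, so the command can be previewed
# with another program (such as echo) instead of being spoken.
say() {
    "${ESPEAK:-espeak}" -v "$1" -s "$2" "$3"
}
```

For example, to speak slowly with the second female variant of the Scottish voice, or to preview a command without playing it:

    say en-sc+f2 140 "Hello from the Yun"
    ESPEAK=echo say en-n+m3 175 "Hello"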

Writing the Arduino Sketch

Because eSpeak can be called from the command line, running it from an Arduino sketch is simple using the Bridge library. In the example below we'll write a sketch that speaks to the user and counts down from five.

Start a new sketch in the Arduino IDE, and add the following #include directive at the top:

#include <Bridge.h>

Next, declare the Process object that will be used:

Process p;

In your sketch's setup() function add the following code:

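// Start the Bridge so the sketch can run commands on the Linux side
Bridge.begin();

p.runShellCommand("espeak \"Hello from an Arduino sketch. Let's count down from five.\"");

// Speak each number from 5 down to 1
for (int t=5; t>0; t--) {
    String num = String(t);
    p.runShellCommand("espeak " + num);
}

p.runShellCommand("espeak \"Hooray, we did it.\"");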

The Bridge library's Process class contains methods for starting Linux commands on the Atheros AR9331 from an Arduino sketch running on the ATmega32U4. We can use the Process object's runShellCommand() method to pass in an entire command as a string.

In this case we're calling espeak in the same way we did from the command line, by including 'espeak' followed by the text we want converted to speech. The text is surrounded by escaped quotes (\") so that the quotation marks will be included in the command string.

The for loop demonstrates how a sketch's numeric variables can be spoken as well. By converting numbers to a string using the String() function, they are easily included in the command string. The num string does not need to be surrounded in quotes because it is a single word containing no spaces.
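
Before uploading, it can help to rehearse the countdown at the OpenWrt-Yun command prompt, since the loop sends the same commands you would type by hand (espeak 5, espeak 4, and so on). A rough shell equivalent of the for loop is sketched below. The countdown function and the ESPEAK override variable are assumptions introduced here for illustration and are not part of eSpeak or the Bridge library.

```shell
# countdown: hypothetical shell equivalent of the sketch's for loop.
# Usage: countdown START
# ESPEAK is an assumed override variable; by default the real espeak is called.
countdown() {
    t="$1"
    while [ "$t" -gt 0 ]; do
        # Runs the same command the sketch builds: espeak 5, espeak 4, ...
        "${ESPEAK:-espeak}" "$t"
        t=$((t - 1))
    done
}
```

Running the first line below speaks the numbers five through one, while the second prints them instead:

    countdown 5
    ESPEAK=echo countdown 5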
