This is a replacement for snips-tts (the text-to-speech component of the Snips voice assistant). Rather than using the local (but low quality) TTS service (e.g. pico2wave), this system uses the much higher quality AWS Polly service.
It communicates with Snips via the same MQTT topics as the default TTS service, i.e. it:
- subscribes to hermes/tts/say to pick up new TTS requests.
- sends converted audio to hermes/audioServer/default/playBytes.
- closes the Snips session via hermes/tts/sayFinished.
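The three-topic flow above can be sketched as a single message handler. This is a minimal illustration, not the service's actual code: the handler uses the paho-style `on_message(client, userdata, msg)` signature, the JSON field names (`text`, `id`, `sessionId`) are assumptions about the Snips payloads, and `text_to_wav` is a hypothetical stand-in for the Polly conversion step.

```python
import json

SAY_TOPIC = "hermes/tts/say"
PLAY_BYTES_TOPIC = "hermes/audioServer/default/playBytes"
SAY_FINISHED_TOPIC = "hermes/tts/sayFinished"


def text_to_wav(text):
    """Hypothetical placeholder for the Polly + mpg123 conversion step."""
    return b"RIFF" + text.encode("utf-8")


def on_message(client, userdata, msg):
    """Handle one TTS request arriving on SAY_TOPIC."""
    # Field names below are assumptions about the Snips JSON payload.
    request = json.loads(msg.payload)
    wav = text_to_wav(request["text"])

    # Hand the converted audio to the Snips audio server.
    client.publish(PLAY_BYTES_TOPIC, wav)

    # Signal that speech is finished so Snips can move the session on.
    finished = {"id": request.get("id"), "sessionId": request.get("sessionId")}
    client.publish(SAY_FINISHED_TOPIC, json.dumps(finished))
```

To use it with a real broker, the handler would be assigned to a paho client's `on_message` attribute after subscribing to `SAY_TOPIC`; that wiring is left out here because it depends on the installed paho version.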
One of the main design features of Snips is that it's a local, rather than cloud-based, voice assistant, so depending on a cloud TTS service may seem to negate those benefits. However, the quality improvement is massive, the privacy concern is low (i.e. Polly doesn't "listen" to you or have access to your device), and it's fast.
Switching between the two can be as simple as:
systemctl stop snips-tts-polly && systemctl start snips-tts
Dependencies:
- boto3 - Python library for communicating with AWS.
- paho - Python library for interacting with Snips via MQTT.
- toml - Python library for reading the central Snips config file.
- mpg123 - binary for converting the MP3s Polly responds with into WAVs which Snips can process.
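As a rough sketch of how boto3 and mpg123 fit together, the functions below request MP3 audio from Polly and decode it to WAV. The function names, the file paths passed in, and the default voice ("Joanna") are illustrative assumptions, not taken from the service itself; `polly` is expected to be a client created with `boto3.client("polly")` with AWS credentials already configured.

```python
import subprocess


def synthesize_mp3(polly, text, voice_id="Joanna"):
    """Ask Polly for MP3 audio; voice_id is an assumed default."""
    response = polly.synthesize_speech(
        Text=text, OutputFormat="mp3", VoiceId=voice_id
    )
    # AudioStream is a file-like body holding the encoded audio.
    return response["AudioStream"].read()


def mpg123_command(mp3_path, wav_path):
    """Build the mpg123 call: -q silences output, -w writes a WAV file."""
    return ["mpg123", "-q", "-w", wav_path, mp3_path]


def text_to_wav_file(polly, text, mp3_path, wav_path):
    """Synthesize text with Polly, then decode the MP3 to WAV via mpg123."""
    with open(mp3_path, "wb") as f:
        f.write(synthesize_mp3(polly, text))
    # check=True raises CalledProcessError if mpg123 fails.
    subprocess.run(mpg123_command(mp3_path, wav_path), check=True)
    return wav_path
```

Passing the Polly client in as an argument keeps the sketch easy to exercise without network access, since any object with a compatible `synthesize_speech` method will do.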
Notes:
- The service caches speech in /tmp/tts to avoid converting text it has already converted. It doesn't currently prune this cache, so keep an eye on it.
- I've included an example systemd unit file (snips-tts-polly.service) to make it easy to daemonize the service. (Remember to copy the binary to /usr/bin.)
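Since the cache is never pruned, one way to keep /tmp/tts in check is a small age-based cleanup. The sketch below is an assumption-laden illustration: how the service actually names its cache files is not documented here, so the hash-based `cache_path` and the 30-day cutoff in `prune_cache` are hypothetical choices.

```python
import hashlib
import os
import time

CACHE_DIR = "/tmp/tts"


def cache_path(text, voice_id="Joanna", cache_dir=CACHE_DIR):
    """Map (voice, text) to a stable file name (hypothetical key scheme)."""
    key = hashlib.sha256(f"{voice_id}:{text}".encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, key + ".wav")


def prune_cache(cache_dir=CACHE_DIR, max_age_days=30, now=None):
    """Delete cached files older than max_age_days; return removed names."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    if not os.path.isdir(cache_dir):
        return removed
    for name in sorted(os.listdir(cache_dir)):
        path = os.path.join(cache_dir, name)
        # Only regular files whose last modification predates the cutoff.
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(name)
    return removed
```

A function like `prune_cache` could be run periodically, for example from cron or a systemd timer. Because /tmp is typically cleared on reboot, a device that restarts often may never need it.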
Initially developed from jarvis_listener.py by @tschmidty69