Tortoise TTS is an AI voice emulation application similar to voice.ai. With a minimum of 50 seconds of voice audio, split into 10-second clips, you can emulate anyone's voice in a natural, human-sounding way. Tortoise supports multiple voices, voice blending, and a variety of extended features, all in a self-hosted application with no restrictions.
Tortoise TTS is as simple or as advanced as you need it to be. Getting started, however, is always simple!
Using Tortoise TTS
To use Tortoise TTS, you can start with either our provided helper script or Tortoise TTS directly.
To use our script, run the following command:
tortoisetts "<PROMPT>" ARGUMENTS
This lets you provide the prompt as the first argument and the Tortoise TTS arguments after it. However, you can always choose to use Tortoise TTS directly with the following command:
echo "<PROMPT>" | tortoise_tts.py ARGUMENTS
The direct command reads the prompt from stdin and uses the Python script from the scripts/ directory.
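If you would rather drive the direct command from Python, the stdin pipe can be sketched as below. This is a minimal sketch, not part of Tortoise TTS: build_command and speak are hypothetical helper names, it assumes tortoise_tts.py is reachable on your PATH as in the command above, and it assumes -v takes a voice name and -o takes an output file path.

```python
import subprocess


def build_command(script="tortoise_tts.py", voice=None, output=None):
    """Build the argument list for the direct Tortoise TTS command.

    Only the shorthand flags -v and -o are used. Their exact meaning
    (voice name, output path) is an assumption of this sketch.
    """
    command = [script]
    if voice is not None:
        command += ["-v", voice]
    if output is not None:
        command += ["-o", output]
    return command


def speak(prompt, **options):
    """Send the prompt on stdin, like: echo "<PROMPT>" | tortoise_tts.py ARGUMENTS."""
    # echo appends a newline to the prompt, so this does the same.
    return subprocess.run(
        build_command(**options),
        input=prompt + "\n",
        text=True,
        check=True,  # raise CalledProcessError if the script exits with an error
    )
```

For example, speak("Hello there", voice="freeman", output="hello.wav") would run the script with -v freeman -o hello.wav and feed it the prompt on stdin.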
Notes
Tortoise TTS has proven to be unreliable with its long-form arguments, e.g. --output, and works better with the corresponding shorthand flags, such as -o and -v. Keep this in mind when you use either our script or Tortoise TTS directly.
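If you build argument lists in your own scripts, a small check can catch long-form arguments before they reach Tortoise TTS. This is only a sketch: long_form_flags is a hypothetical helper, and it simply looks for anything starting with two dashes without knowing which flags Tortoise TTS actually defines.

```python
def long_form_flags(arguments):
    """Return the arguments written in long form, such as --output."""
    # A bare "--" is skipped because it is not a flag by itself.
    return [arg for arg in arguments if arg.startswith("--") and len(arg) > 2]
```

You can then print a warning, or stop, whenever the returned list is not empty.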
Voice Training
Voice training is a simple process. It takes at least five 10-second audio clips of a person speaking evenly, with no pauses and no background noise, in the tone and inflection of your choice. Put the clips in a folder named after the voice, e.g. freeman/, and select that voice with -v, e.g. -v freeman. This folder should go in:
/usr/local/lib/python3.10/dist-packages/TorToiSe-2.4.2-py3.10.egg/tortoise/voices
The location may differ depending on versions and changes. You can always find it again with the following command:
find /usr -name voices
The correct folder should be easy to recognize from the example location above and your installed versions.
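The same search can be sketched in Python if you want to locate the folder from a script. This is a rough equivalent of the find command above, not something shipped with Tortoise TTS: find_voice_dirs is a hypothetical helper, and unlike find it reports only directories named voices.

```python
import os


def find_voice_dirs(root="/usr"):
    """Return every directory named "voices" below root, sorted by path."""
    matches = []
    # os.walk skips directories it is not allowed to read.
    for current, dirnames, _filenames in os.walk(root):
        if "voices" in dirnames:
            matches.append(os.path.join(current, "voices"))
    return sorted(matches)
```

If more than one folder comes back, the one whose path contains tortoise, as in the example location above, is most likely the one you want.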
A good tip for training is to find steady clips with even talking, no background noise, and no stutters. Cut the stutters out if need be. Audacity and similar apps are a solid choice for cutting these clips into the formats you need. Five 10-second clips are enough, but ten are even better. The 10-second length is not a hard requirement, but the closer the better, and longer is better than shorter.
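Before copying a voice folder into place, you can check it against these guidelines with a short script. This is a sketch under assumptions: it assumes your clips are saved as .wav files (the text above does not name a format), clip_seconds and check_voice_folder are hypothetical helpers, and the thresholds just mirror the five-clip, 10-second guidance.

```python
import wave
from pathlib import Path


def clip_seconds(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(str(path), "rb") as clip:
        return clip.getnframes() / clip.getframerate()


def check_voice_folder(folder, min_clips=5, target_seconds=10.0):
    """Summarize how a voice folder compares with the clip guidelines."""
    durations = {
        clip.name: clip_seconds(clip)
        for clip in sorted(Path(folder).glob("*.wav"))
    }
    # Shorter clips are only flagged, because the length is not a hard requirement.
    short = [name for name, seconds in durations.items() if seconds < target_seconds]
    return {
        "clip_count": len(durations),
        "enough_clips": len(durations) >= min_clips,
        "short_clips": short,
    }
```

For example, check_voice_folder("freeman") returns the number of clips found, whether there are at least five, and the names of any clips shorter than 10 seconds.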
More Info
You can find more information on the project's GitHub page.