« on: June 03, 2005, 03:39:30 pm »
Let's talk turkey regarding these voices.
Most of the voices out there--the ones you have to pay for--are disappointing. They're either flat and mechanical-sounding or, if they possess something of the dynamics of a real voice, too unstable and quirky to sound convincingly human.
AT&T, for instance, produces two good, stable, dynamic female voices: Lauren and Crystal. However, their British female, Audrey, and their preppy American voice, Julia, are disappointing. Obviously, producing a convincing, natural-sounding voice is difficult. There's no telling what will come out, or how stable and consistent a voice will be once it starts to combine different sounds and generate different sorts of sentences.
Just about all voices come with adjustable speed. Normally, the default speed is okay for standard utterances. You wouldn't want to listen to somebody who was talking a mile a minute (unless you were a salesperson--then you might be turned on by it). However, if you want to, say, create TTS scripts and then turn them into wav files, you might want a product that would allow you to adjust the speed of an utterance in isolated sentences or phrases. MASH, a Microsoft Agent Scripting application, allows you to do this. I don't use it to produce MS Agent scripts, but rather, to create more dynamic TTS files, which I then record as wav files.
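For anyone who'd rather do the per-sentence speed changes by hand, here's a rough sketch in Python. It is not how MASH does it. It just builds a script marked up with the SAPI 5 XML `<rate>` tag, on the assumption that your voice is a SAPI 5 voice that honours that tag (absspeed runs from -10, slowest, to 10, fastest). The function names `with_rate` and `build_script` are made up for illustration.

```python
from xml.sax.saxutils import escape


def with_rate(text: str, speed: int) -> str:
    """Wrap one sentence or phrase in a SAPI 5 <rate> tag.

    Assumes the voice honours SAPI 5 XML markup; absspeed is
    an absolute speed from -10 (slowest) to 10 (fastest).
    """
    if not -10 <= speed <= 10:
        raise ValueError("absspeed must be between -10 and 10")
    # escape() protects &, < and > so they aren't read as markup
    return f'<rate absspeed="{speed}">{escape(text)}</rate>'


def build_script(parts):
    """Join (text, speed) pairs into one script.

    A speed of 0 leaves the text untagged, i.e. at the default speed.
    """
    return " ".join(
        escape(text) if speed == 0 else with_rate(text, speed)
        for text, speed in parts
    )


if __name__ == "__main__":
    script = build_script([
        ("Let's talk turkey regarding these voices.", 0),   # default speed
        ("Most of them are disappointing.", -3),            # slower
        ("A few are very good!", 4),                        # faster
    ])
    print(script)
```

The output is just marked-up text. You'd still need a SAPI 5 reader that parses XML tags to speak it and save the result as a wav file, and whether a given voice actually respects the tag is something you'd have to test.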
Unfortunately, none of the high-quality voices comes with an adjustable pitch. In other words, you can't raise or lower a voice to make it, say, suddenly serious and sexy (lower/deeper pitch) or excited, even hysterical (higher pitch). All the free voices come with an adjustable pitch, but none of the for-sale voices does.
Cepstral claims that their voices have an adjustable pitch, but I think that's a bit of a cheat. I bought those voices for the adjustable-pitch factor. Boy, was I suckered. First of all, the pitch isn't really adjustable. There's no slider bar. There are only a few presets, and each one is disappointing, as far as I'm concerned. The pitch adjustments just make the voices sound distorted, rather goofy. I guess it's not easy to add an adjustable pitch to a truly dynamic, human-sounding voice. For now, higher-quality voices seem to be of a fixed pitch.
Although I've already indicated my personal favourites elsewhere, I'll say it again. For me, of all the female voices I've heard, Lauren (AT&T) is the best out there. Crystal (also AT&T) is good, too.
Heather (Acapela) has a soft, breathy, seductive timbre. She also sounds as if she's got a cold, not a bad cold, just a little plugged up. It's an intriguing sound, nonetheless. Acapela has a British female voice that sounded promising, if memory serves. I don't think you can demo that one, though. Anyway, I hope somebody markets both of them for a reasonable price. They'd be a great addition to a rather meagre selection of top-notch voices. Perhaps Nige is right: maybe NextUp.com is getting ready to release Heather. They're a good company to deal with, and they're always ready to add new voices to their list.
I must admit I'm not a great fan of NeoSpeech's Kate. I find her voice tinny, mechanical-sounding. She reminds me of an earnest reporter, one whose voice starts to grate on you if you listen to it too long. Maybe she was hired for her looks.