When AI Speaks

Have you ever listened to your friend trying to sing karaoke? It seems like they’re giving it their all, but somehow… it still sounds a bit off. You could say it’s lacking emotion.

The same happens with voiceovers - only here, the voice actor or singer is artificial intelligence.

And believe me, AI can sound many different ways:

  • Cheerful - as if it just won the lottery (or at least got an extra coffee break)

  • Angry - like your GPS repeating “Turn left!!!” for the fifth time

  • Overly dramatic - when a simple sentence like “Please bring the coffee” sounds as if a new Game of Thrones episode is about to start

  • And wayyyyyy more

Examples from our AI lab

We have a few audio synthesis samples where our AI tool tries to read text with emotion.


Did it succeed? Well… you decide.
Try to guess which voice is angry, which is happy, and which is neutral:

Unfortunately, all these recordings were… angry. Or at least, they were supposed to be angry.

Could you feel the emotion? Especially in the third recording? :)
We couldn't, not really. The last one sounded like a complete jumble of sounds, with about as much emotion as a refrigerator.

In short - all these attempts were failures. Oops.

After rounds of tweaking, fine-tuning, and more than a few late-night debates in the lab, our engineers finally cracked it. The model stopped sounding flat and robotic, and instead began shaping words with emotional nuance and precision.

Now try to guess which recording is angry, which is surprised, and which is happy:

🚨 Answers:

  1. Surprised

  2. Angry

  3. Happy

So, at first, it was a disaster - sounds were just sounds. Now the voices speak, smile, and get angry - almost like us. You just need patience and a few hundred mistakes.
