Take the case of an ad played on YouTube, say one in which a blonde model in white sunglasses playfully covers the camera lens with her hand while flashing a grin. A hip-hop track plays in the background as an unmistakably female voice announces that fashion may change, but style stays forever.
This slick, short ad is part of a trial reel posted on YouTube by WellSaid Labs, a new startup. What’s starkly different about this ad is that the woman on screen is a living human being, while the voice in the background only sounds like one.
The Seattle-based company has worked with voice actors and artificial intelligence to create synthetic voices that closely resemble human ones. It claims that the text-to-speech software it has been developing over the past year produces audio that sounds more like a human voice than a synthetic one. According to the company, it reached that level of realism by not tightly monitoring or controlling speech variables such as pronunciation, volume, and speed while training the voice model.
According to Matt Hocking, the CEO of WellSaid Labs, the voice the company is trying to produce is meant to be extremely expressive and lifelike.
Computerized voices have already gained solid ground, whether reading out news on smart speakers or giving drivers accurate road directions. However, software such as Google Assistant and Siri still tends to speak in monotonous, robotic voices, with the only notable exception being Google Duplex in certain cases.