Alexa, is this a good idea? —

Amazon uses kid’s dead grandma in morbid demo of Alexa audio deepfake

Amazon taps the emotional toll of pandemic grief to pitch a developing Alexa feature.

The 4th-gen Amazon Echo Dot smart speaker. Credit: Amazon

Amazon is figuring out how to make its Alexa voice assistant deepfake the voice of anyone, dead or alive, from just a short recording. The company demoed the feature at its re:Mars conference in Las Vegas on Wednesday, leaning on the emotional trauma of the ongoing pandemic and grief to drum up interest.

Amazon's re:Mars focuses on artificial intelligence, machine learning, robotics, and other emerging technologies, with technical experts and industry leaders taking the stage. During the second-day keynote, Rohit Prasad, senior vice president and head scientist of Alexa AI at Amazon, showed off a feature being developed for Alexa.

After noting the number of lives lost during the pandemic, Prasad played a video demo in which a child asks Alexa, "Can grandma finish reading me Wizard of Oz?" Alexa responds, "Okay," in her typical feminine, robotic voice. But next, the voice of the child's grandma comes out of the speaker to read L. Frank Baum's tale.

You can watch the demo below:

Video: Amazon re:MARS 2022 - Day 2 - Keynote.

Prasad said only that Amazon is "working on" the Alexa capability; he didn't specify what work remains or when, if ever, it'll be available.

He did, however, offer a few scant technical details.

"This required invention where we had to learn to produce a high-quality voice with less than a minute of recording versus hours of recording in a studio," he said. "The way we made it happen is by framing the problem as a voice-conversion task and not a speech-generation task."

Prasad very briefly discussed how the feature works.
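Amazon hasn't published any of the underlying architecture, but the distinction Prasad draws is a real one in speech research: instead of training a full text-to-speech model on hours of a target speaker's studio recordings, a voice-conversion system generates speech in a stock voice and then re-renders it in the target speaker's timbre, captured from a short sample. The sketch below illustrates that framing; every function and component in it is a hypothetical stub for illustration, not Amazon's implementation.

```python
# Illustrative sketch only: Amazon hasn't published its architecture, so
# every component below is a hypothetical stub meant to show the difference
# between the two problem framings Prasad described.
import numpy as np

def synthesize_speech(text: str) -> np.ndarray:
    """Ordinary TTS: text in, audio in a stock voice out (stubbed as silence)."""
    return np.zeros(16000)  # 1 second of 16 kHz audio

def extract_speaker_embedding(sample: np.ndarray) -> np.ndarray:
    """Summarize a target speaker's timbre from a short clip (stubbed)."""
    return np.random.rand(256)  # e.g., a 256-dim voice "fingerprint"

def convert_voice(audio: np.ndarray, speaker: np.ndarray) -> np.ndarray:
    """Re-render existing speech in the target speaker's voice (stubbed)."""
    return audio  # a real model would swap timbre while keeping the words

# Speech-generation framing: train a whole TTS model per target voice,
# which typically requires hours of clean studio recordings.

# Voice-conversion framing (what Prasad described): synthesize once in a
# stock voice, then convert using an embedding from under a minute of audio.
grandma_clip = np.random.rand(16000 * 45)  # ~45 seconds of the target voice
grandma = extract_speaker_embedding(grandma_clip)
base_audio = synthesize_speech("Dorothy lived in the midst of the great Kansas prairies")
output = convert_voice(base_audio, grandma)
```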

Of course, deepfaking has earned a controversial reputation. Still, there has been some effort to use the tech as a tool rather than a means for creepiness.

Audio deepfakes specifically, as noted by The Verge, have been used in media to patch over a podcaster's flubbed line or to stand in for a project's star who dies suddenly, as happened with the Anthony Bourdain documentary Roadrunner.

There are even instances of people using AI to create chatbots that converse as if they were a lost loved one, the publication noted.

Alexa wouldn't even be the first consumer product to use deepfake audio to fill in for a family member who can't be there in person. The Takara Tomy smart speaker, as pointed out by Gizmodo, uses AI to read children bedtime stories in a parent's voice. Parents reportedly "upload" their voices by reading a script for about 15 minutes. This differs from what Amazon's video demo implies, though: with the Takara Tomy device, the product's owner chooses to provide their own voice, rather than someone affiliated with the owner (and Amazon didn't get into how permissions, particularly for deceased people, might work with its feature).

Beyond worries about deepfakes being used for scams, rip-offs, and other nefarious activity, there are already some troubling things about how Amazon is framing the feature, which doesn't even have a release date yet.

Before showing the demo, Prasad talked about Alexa giving users a "companionship relationship."

"In this companionship role, human attributes of empathy and affect are key for building trust," the exec said. "These attributes have become even more important in these times of the ongoing pandemic, when so many of us have lost someone we love. While AI can't eliminate that pain of loss, it can definitely make their memories last."

Prasad added that the feature "enables lasting personal relationships."

It's true that countless people are in serious search of human "empathy and affect" in response to emotional distress initiated by the COVID-19 pandemic. However, Amazon's AI voice assistant isn't the place to satisfy those human needs. Alexa also can't enable "lasting personal relationships" with people who are no longer with us.

It's not hard to believe that there are good intentions behind this developing feature and that hearing the voice of someone you miss can be a great comfort. We could even see ourselves having fun with a feature like this, theoretically. Getting Alexa to make a friend sound like they said something silly is harmless. And as we've discussed above, there are other companies leveraging deepfake tech in ways that are similar to what Amazon demoed.

But framing a developing Alexa capability as a way to revive a connection to late family members is a giant, unrealistic, problematic leap. Meanwhile, tugging at the heartstrings by bringing in pandemic-related grief and loneliness feels gratuitous. There are some places Amazon doesn't belong, and grief counseling is one of them.
