ISD Team
25 Apr 2026

A new study finds that AI-generated voice clones can be easier to understand than the human voices they were cloned from, particularly in background noise.

Key Findings

  • Voice clones created from as little as 10 seconds of recorded speech can be up to 20% more intelligible than the human speaker’s natural voice in background noise.
  • This advantage held across multiple experiments, including tests with:
    • Elderly listeners (who may have age-related hearing loss)
    • American listeners (to check for accent effects)
    • Simulations mimicking cochlear implant processing

Research Details

Led by Patti Adank (University College London and University of Roehampton) and Han Wang, the study analyzed over 100 acoustic features but could not fully pinpoint why the clones performed better. The researchers were surprised by the result, as they initially expected unfamiliar synthetic voices to be less intelligible.

Implications

The finding could improve applications such as:

  • Digital voice assistants
  • Automated customer service systems
  • Accessibility tools for people with hearing difficulties

Unlike traditional text-to-speech systems (which often require hours of recordings), modern voice cloning needs only seconds of audio.

Researcher Quote

“I thought initially that voice clones would be less intelligible because they were unfamiliar. I found they were up to 20% more intelligible, which was quite shocking.”
— Patti Adank

The team plans further research into text-to-speech systems and digital signal processing to understand and potentially recreate this “voice cloning intelligibility benefit.”
