Accent reduction is a goal that many non-native English speakers pursue. A big obstacle is that, although they often listen to native speakers, they have never heard their own voice with a native accent. If learners could listen to their own voice speaking with a native accent, they could notice the differences and improve how they sound. Think of it as a personal AI accent coach, but with your own voice.

Now, how can we change the accent of a non-native English speaker to sound like a native American speaker? Well, it turns out that by carefully adjusting the prosody and timbre in a TTS model such as StyleTTS2, we can achieve this accent transformation with ease.

I created a demo on HuggingFace called AccentCoach that can transform any accent into an American accent. Technically, it is an STTS pipeline: Speech-to-Text (Whisper) followed by Text-to-Speech (StyleTTS2). The output sounds a bit robotic, but it is good enough to assist English language learners. Let’s hear two examples first.

Einstein’s distinct accent:


Here is Einstein’s American accent produced by AccentCoach:


Arnold Schwarzenegger’s accent:


Here is Arnold Schwarzenegger’s American accent produced by AccentCoach:


Curious about how you’d sound with an American accent? Give the demo a try!

Attention: I highly recommend cloning the Space and running it either locally or on a powerful GPU on HuggingFace. Inference might take more than 10 seconds on HF’s free vCPUs, whereas it takes less than a second on an Nvidia 3090.
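To see where the time goes on your own hardware, the two-stage pipeline described earlier (Whisper, then StyleTTS2) can be sketched with each stage timed separately. This is only a minimal sketch, not the code behind the demo: `run_pipeline` and `whisper_transcribe` are names I made up for illustration, the speech-to-text helper assumes the `openai-whisper` package, and the text-to-speech stage is left as a plain callable because the StyleTTS2 call depends on how the model is packaged.

```python
import time
from typing import Callable, Dict, Tuple


def run_pipeline(
    audio_path: str,
    transcribe: Callable[[str], str],
    synthesize: Callable[[str, str], bytes],
) -> Tuple[bytes, Dict[str, float]]:
    """Run speech-to-text, then text-to-speech, timing each stage.

    `synthesize` receives the transcript and the original recording,
    so it can use the recording as the reference for the speaker's voice.
    """
    timings: Dict[str, float] = {}

    start = time.perf_counter()
    text = transcribe(audio_path)
    timings["speech_to_text"] = time.perf_counter() - start

    start = time.perf_counter()
    audio = synthesize(text, audio_path)
    timings["text_to_speech"] = time.perf_counter() - start

    return audio, timings


def whisper_transcribe(audio_path: str, model_name: str = "base") -> str:
    """Speech-to-text stage, assuming the `openai-whisper` package is installed."""
    import whisper  # imported lazily so the sketch loads without the package

    model = whisper.load_model(model_name)
    return model.transcribe(audio_path)["text"].strip()
```

Pass `whisper_transcribe` as the first stage and whatever function wraps your StyleTTS2 setup as the second. The returned timings show which stage dominates on a CPU versus a GPU.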

🐷 AccentCoach on HuggingFace 🐷


Run AccentCoach Locally

I tested this on Arch Linux, but the steps should also apply to Mac and Windows with minor modifications. An Nvidia card with at least 8GB of memory (VRAM) is recommended. However, you can run it on a CPU or on any other hardware that PyTorch supports.

git lfs install
git clone https://huggingface.co/spaces/otioss/AccentCoach
cd AccentCoach
python -m venv ac_env
source ac_env/bin/activate
pip install -r requirements.txt
# Install espeak-ng with your system's package manager (pacman shown here for Arch Linux):
sudo pacman -S espeak-ng

# Now run the app:
python accent_gradio.py
# Open the URL http://127.0.0.1:7860 in your browser.
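
If you are not sure which hardware PyTorch will use on your machine, a quick check along these lines can help. It is a minimal sketch and not part of the AccentCoach repository: `pick_device` is a hypothetical helper, and it falls back to the CPU when PyTorch is not installed or no accelerator is found.

```python
def pick_device() -> str:
    """Return the name of the best device PyTorch can see: cuda, mps, or cpu."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return "cpu"

    if torch.cuda.is_available():
        return "cuda"  # Nvidia GPU

    # The MPS backend (Apple Silicon) only exists in newer PyTorch builds.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


if __name__ == "__main__":
    print(f"PyTorch would run on: {pick_device()}")
```

Run it inside the activated `ac_env` environment so it checks the same PyTorch build the app uses.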