Teaching Pronunciation with AI

Ross Thorburn

24th December 2024

Pronunciation sometimes gets called “the Cinderella of language teaching” (Kelly, 1969:87). It’s overlooked and undervalued compared to its siblings, grammar and vocabulary. Teacher training programs rarely address it in depth, and coursebooks either avoid it or present activities that don’t match your students’ needs. After all, coursebook writers have never met your learners. But knowing your students doesn’t necessarily help you teach pronunciation. The more you interact with your learners, the better you understand their pronunciation. This can make it harder to notice and correct pronunciation errors. Complicating all of these issues are the tools meant to help—like the IPA. To many teachers, phonemic script feels like a code created by a secret club, one they’re not a member of.

In this post, I'll show you how to use AI to address these challenges. We’ll explore how to use AI to identify what aspects of pronunciation to teach. Then we’ll look at generating pronunciation materials which match the challenges that your students actually face. Finally, we’ll consider how students can use AI to get feedback on their pronunciation.

What to Teach?

Choosing a pronunciation focus for a lesson can feel like trying to push a basketball through a letterbox. The sounds your students struggle with rarely align with your target language, leaving gaps however you approach it. You could start with your students’ pronunciation mistakes - but these might not match your lesson aims or target language. You could start with a vocabulary list and look for pronunciation features - but this can be time-consuming and inefficient.

AI can analyse target language for common pronunciation features, and generate a list to choose from. You can then browse this list for the feature that most closely matches your students’ challenges.

Start by giving an AI text generator a text, dialogue or list of target language. Ask it to find some examples of pronunciation features which could be focused on.

Prompt: Analyse the following target language for common pronunciation features relevant to EFL learners. Focus on features shared by at least two items, such as marked phonemes (e.g., /θ/, /ð/), elision, assimilation, weak forms, word stress, sentence stress, or intonation patterns. Target language: [lexis, grammar, text, etc.].

Although AI can create a list of features, it’s up to you to pick the most relevant one. (Just like the coursebook writer, AI doesn’t know your students.) Once you’ve chosen a feature, use this as a starting point for creating materials.

Generating Phonemic Script

As a new teacher, I avoided pronunciation like a wet paint sign. I was scared my students would notice they knew more about phonemic script than I did. If that sounds like you, AI can help.

AI can generate IPA which describes your students’ problems. For example:

Prompt: My students often mix up the words [word #1] and [word #2]. Write in IPA what phonemes they are likely getting wrong.

Once you know the phonemes your students are mixing up, you can then use phonemic script when creating materials.

Tongue Twisters

Tongue twisters are phrases that are tricky (but fun!) to say. They repeat or quickly switch between similar sounds. While native speakers find them challenging, for ESL students they can be impossible. But, with a little tweaking, tongue twisters can help students practice pronunciation in a fun, familiar format. After all, your students should already know the concept from their first language.

Custom Tongue Twisters

If you know which English sounds your students mix up, or have difficulty saying, you can ask AI to generate a tongue twister that focuses on these sounds.

Prompt: Create [number] tongue twisters targeting the sounds [IPA sounds]. Use short, simple sentences.

You could use these:

- as listening practice. The teacher reads a tongue twister and the students note the sounds they hear.
- using pair work. Students take turns reading aloud and correcting each other.
- in group competitions. Learners time each other and compete to be the fastest to repeat a tongue twister a certain number of times.
- with lipreading. Give students a list of tongue twisters. Students face each other and silently mouth a tongue twister. Their partner must guess which one they’re reading.
- to generate AI pronunciation feedback. Ask students to read tongue twisters aloud for the Pronunciation Feedback activity (below).

The idea of saying something aloud with difficult sounds doesn’t need to be limited to short sentences like “Red lorry, yellow lorry” (that one still gets me). Generative AI can create whole paragraphs packed with challenging sounds for your students.

Prompt: Write a paragraph which contains many examples of the sounds [sounds in IPA]. Use simple words and short sentences. The paragraph must make sense semantically.

These are ideal for running dictations, where learners must remember and repeat a text for their partner to write. This stretches the pronunciation of one student, and the listening of another.

Jazz Chants

Jazz chants use rhythm, stress, and intonation to help students sound more natural. AI can adapt existing chants or generate new ones focusing on sounds your students struggle with. This reinforces rhythm and intonation while practicing pronunciation.

Generating a Jazz Chant

We can use an AI text generator to create a new jazz chant based on an existing one, incorporating sounds (or words) which are challenging for your students.

Prompt: Read this jazz chant. Analyse how this would help ELT learners with their pronunciation. Then create a new jazz chant which focuses on the sounds: [insert sounds here]. Jazz chant: [transcript of example jazz chant].

Minimal Pairs

A minimal pair is two words with identical pronunciation, except for one sound. These pairs of words (like 'pin' and 'bin') target specific pronunciation issues (like /p/ and /b/). AI can quickly generate tailored minimal pairs based on sounds your students confuse, ready for use in various activities.

Generating Minimal Pairs

After identifying which sounds your students get confused, you can use an AI text generator to create targeted minimal pairs.

Prompt: Create minimal pairs for the phonemes [phoneme #1 in IPA] (as in [example of this in a word]) and [phoneme #2 in IPA] (as in [example of this in a word]).

1. The words should differ only in these sounds.

2. All other sounds should remain exactly the same.

- Good example: pin and bin (only /p/ and /b/ differ).

- Bad example: how and hop (because both the vowel and the final consonant differ).

If no minimal pairs exist for these sounds, tell me that the task is impossible.

Creating minimal pairs with AI

Once you create a list of minimal pairs, you can use these in the games or activities below.
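Because AI-generated pairs can contain mistakes, it may be worth checking them before class. Here is a minimal sketch in Python, assuming you type each word's phonemic transcription by hand as a list of phonemes. The function name and the transcriptions are illustrative and not part of any AI tool.

```python
def is_minimal_pair(phonemes_a, phonemes_b):
    """Return True if two transcriptions differ in exactly one sound."""
    # Words with a different number of sounds are rejected by this simple check.
    if len(phonemes_a) != len(phonemes_b):
        return False
    # Count the positions where the two words have different sounds.
    differences = sum(1 for a, b in zip(phonemes_a, phonemes_b) if a != b)
    return differences == 1


# Hand-typed transcriptions (an assumption; check them against a dictionary).
pin = ["p", "ɪ", "n"]
bin_ = ["b", "ɪ", "n"]
how = ["h", "aʊ"]
hop = ["h", "ɒ", "p"]

print(is_minimal_pair(pin, bin_))  # True
print(is_minimal_pair(how, hop))   # False
```

The check is only as reliable as the transcriptions you give it, so it supports your own judgement and does not replace it.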

Minimal Pairs Spot the Difference

In Minimal Pairs Spot the Difference, students read aloud two lists of words or sentences with subtle pronunciation differences. They have to find the differences between the lists. AI can generate these lists for any target sounds.

Generate materials for this using the following prompt:

Prompt: Create two lists of [number] words each for a minimal pairs activity. The words in List A and List B should mostly be the same, but around half the words should differ by one sound (minimal pairs). The lists should focus on the following sounds: [details of the sounds to include]. Make sure the differences between the minimal pairs are a single sound, and that the other sounds in the words remain the same. Provide both lists in a format like this:

List A:

1. [word]

2. [word]

...

List B:

1. [word]

2. [word]

...

If no minimal pairs exist for a certain sound, let me know that the task is impossible.

Minimal pairs spot the difference from ChatGPT

After checking, these lists can be given directly to your students. Minimal Pairs Spot the Difference can be adapted for different levels by limiting the number of readings. More advanced students could create their own lists to challenge other pairs of students.
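If you want an answer key for the activity, a few lines of Python can list where the two lists differ. This is a small sketch; the word lists below are invented for illustration and would be replaced with the lists your AI tool produced.

```python
def find_differences(list_a, list_b):
    """Return (position, word_a, word_b) for every numbered item that differs."""
    return [
        (number, a, b)
        for number, (a, b) in enumerate(zip(list_a, list_b), start=1)
        if a.lower() != b.lower()
    ]


# Hypothetical lists focusing on /ɪ/ and /iː/, and /f/ and /v/.
list_a = ["ship", "light", "fan", "bed"]
list_b = ["sheep", "light", "van", "bed"]

for number, a, b in find_differences(list_a, list_b):
    print(f"{number}. {a} / {b}")
# 1. ship / sheep
# 3. fan / van
```

This only shows which items differ. Whether each difference really is a single sound still needs checking by you.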

Minimal Pairs Code

This activity asks students to send messages to each other using a code. This is a little like communicating using Morse Code. But instead of dots and dashes, students use a code made up of difficult-to-pronounce words. The code contains all 26 letters of the alphabet. Next to each letter is a difficult-to-pronounce English word (like A: fan, B: van; C: fail, D: vale). Each word’s minimal pair partner, a similar-sounding word, sits next to a different letter.

In class, give each student a copy of the code. One student acts as a “spy,” sending a message by reading aloud words from the code. The other students act as decoders, listening carefully to each word and matching it to a letter. By matching the words to the letters, the decoders spell the message. Students must pay careful attention to accurate pronunciation to successfully code and decode. AI can generate a code based on the pronunciation problems that your students have.

Prompt: Create a list of pairs of words focusing on specific minimal pair sounds. Each pair should correspond to two letters of the alphabet (from A to Z) and include minimal pairs for the following sounds: [insert sounds here, e.g. /f/ and /v/, /e/ and /eɪ/, /p/ and /b/]. For example: A: Fan, B: Van; C: Fail, D: Vale. Do not include any additional information about the words in the output. Use real English words only.

Using the code above, you’d spell “ChatGPT” by saying, “Fail, bile, fan, bore, pile, van, bore.”
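To prepare messages quickly, or to check what the decoders should end up with, the encoding step can be sketched in Python. The dictionary below contains only the four example letters from the prompt (A to D); a real code would need all 26 letters, and the function names are illustrative.

```python
# Only the four example letters from the prompt; a full code needs A to Z.
CODE = {"A": "fan", "B": "van", "C": "fail", "D": "vale"}


def encode(message, code):
    """Turn a message into the list of words the 'spy' reads aloud."""
    # Letters that are missing from the code are skipped in this sketch.
    return [code[letter] for letter in message.upper() if letter in code]


def decode(words, code):
    """Turn the words the decoders heard back into letters."""
    reverse = {word: letter for letter, word in code.items()}
    return "".join(reverse[word] for word in words)


spoken = encode("cab", CODE)
print(", ".join(spoken))     # fail, fan, van
print(decode(spoken, CODE))  # CAB
```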

Sentence Stress

Sentence stress is about making certain words louder and longer, adding emphasis to the most important parts of a sentence. Changing the stress can change the meaning of a sentence. For example, in the sentence, “He didn’t eat my sandwich”:

"HE didn't eat my sandwich." Someone else ate the sandwich, but not him.
"He DIDN'T eat my sandwich." In spite of what you think, he did not eat the sandwich.
"He didn't EAT my sandwich." He did something with the sandwich (maybe threw it away or hid it), but he didn't eat it.
"He didn't eat MY sandwich." He ate someone else's sandwich, but not mine.
"He didn't eat my SANDWICH." He ate something of mine, but it wasn't my sandwich.

Creating a Sentence Stress Puzzle

AI can generate sentences that highlight how stressing different words changes the overall meaning.

Prompt: Choose a simple, everyday sentence (5-7 words) and demonstrate how emphasising different words changes its meaning. Format the emphasised words in CAPS, first show the original sentence, and then create variations where different words are stressed in turn. For each variation, explain what new meaning or implication is created by that particular emphasis.
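If you already have a sentence and only need the capitalised versions for a handout, the formatting step can be done without AI. This is a minimal Python sketch; the function name is illustrative, and it produces the variations only, so the explanation of each meaning still comes from you or from the prompt above.

```python
def stress_variations(sentence):
    """Return one copy of the sentence per word, with that word in CAPS."""
    words = sentence.split()
    variations = []
    for i in range(len(words)):
        # Capitalise the word at position i and leave the others unchanged.
        stressed = words[:i] + [words[i].upper()] + words[i + 1:]
        variations.append(" ".join(stressed))
    return variations


for line in stress_variations("He didn't eat my sandwich."):
    print(line)
# HE didn't eat my sandwich.
# He DIDN'T eat my sandwich.
# He didn't EAT my sandwich.
# He didn't eat MY sandwich.
# He didn't eat my SANDWICH.
```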

In class, your students can take turns reading the sentences, stressing different words, while their classmates guess the meaning. You can even create sentences that match your target language.

Prompt: Look at the following sentences from a [classroom dialogue/written text/etc]. Choose [number] of these and demonstrate how emphasising different words changes the meaning of these sentences. Format the emphasised words in CAPS, first show the original sentence, and then create variations where each word is stressed in turn. For each variation, explain what new meaning or implication is created by that particular emphasis using simple English. Sentences: ### [target language]

Sentence stress with target language from Claude

Connected Speech

Connected speech is about words and sounds blending and transforming as they meet. It’s a little like colours on an artist’s palette, merging to create new shades, like green forming on the boundary of the yellow and blue paints. Similarly, at the boundary between the words ‘good’ and ‘morning’, the /d/ (in ‘good’) might change to a /b/ sound, influenced by the /m/ in ‘morning’.

Connected speech is vital for listening. Without knowing connected speech patterns, students might only understand classroom English or very slow speech. One of the best ways to learn what connected speech sounds like is to use it to speak. As well as supporting listening, producing connected speech can help students sound more natural.

There are many different features of connected speech that we don’t have space to go into here. But if you have a specific feature of connected speech that you want to teach your students about, AI can help generate examples.

Generating Sample Sentences

Imagine that you’ve got a pronunciation feature you plan to teach your students about. Use the prompt below to generate sentences for students to practice reading aloud to hear and feel where this takes place.

Prompt: Explain the connected speech feature called [elision/assimilation/catenation/vowel reduction]. Then write ten sentences. The sentences should include simple grammar and common vocabulary. Each sentence needs to include at least one example of this feature of connected speech.

List the ten sentences (e.g. “Come and get it”). These should be written in standard English.

Write the sentences indicating where the feature takes place (e.g. “Come an’ get it”).

Explain where the feature takes place in each sentence (e.g. “an’ get”: “and” is reduced to “an’”).

The feature is likely to take place at word boundaries. (i.e. the pronunciation of words will be impacted by the previous or following word.) Think through why each of these is an example of this pronunciation feature and not another feature of connected speech, such as assimilation, contraction, catenation, weak forms, etc.

Pronunciation Feedback

Intelligibility (being understood by others) lies at the heart of effective pronunciation teaching. But the better you know your students, the harder this is to judge. Over time, we become more sympathetic listeners. This can cause us to overlook pronunciation which we can understand, but might be unintelligible to others. And if you don’t know your students, you might be hesitant to correct pronunciation that you don’t understand. There’s little daylight between accents and errors. For these reasons, speech-to-text AI is ideal for giving feedback on intelligibility.

Speech-to-text AI transcribes what you say. Students can speak or read something aloud, and receive a transcription. They can compare the transcription with what they were trying to say. Differences suggest mistakes. This activity is ideal for repetition. Students can read the same text again and again, each time trying to decrease the number of transcription errors.

AI can transcribe ‘live’ - as the student talks, the transcription appears. Some students might find this distracting. Alternatively, students can record themselves, then upload the audio file and get a transcript. The recording could be of a tongue twister, an upcoming speech or even of a classroom discussion or conversation.

Giving students a transcript and asking them to find errors might be too much for some learners. You can help by:

- limiting the length of the recording and the transcript. Reading aloud for one minute might generate enough feedback for your students.
- controlling the input. You might give learners something to read aloud (like a jazz chant, a tongue twister, a monologue, a dialogue or even a poem). This can make the feedback more targeted.
- getting students to compare the transcript with the original which they read aloud.
- getting an AI text generator to compare the transcript with the original which they read aloud.
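Comparing a transcript with the original can also be done with a short script. The sketch below uses Python's standard difflib module to list the words where the transcript departs from the text the student read. The function names and the sample transcript are invented for illustration, and the word-splitting is deliberately simple (lower case, straight apostrophes only).

```python
import difflib
import re


def split_words(text):
    """Lower-case a text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def transcript_differences(original, transcript):
    """Return (expected, heard) pairs where the transcript differs."""
    expected = split_words(original)
    heard = split_words(transcript)
    matcher = difflib.SequenceMatcher(a=expected, b=heard, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append((" ".join(expected[i1:i2]), " ".join(heard[j1:j2])))
    return differences


original = "Red lorry, yellow lorry."
transcript = "Red lolly yellow lorry"  # hypothetical speech-to-text output
print(transcript_differences(original, transcript))
# [('lorry', 'lolly')]
```

A difference only suggests a possible pronunciation problem. Speech-to-text tools make their own mistakes, so students should treat the list as a starting point for another attempt.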

Limitations

We’ve looked at several ways that AI can help teach pronunciation: giving you options for what to teach, generating materials for use in class, and giving students feedback. However, there are limits on what AI can do.

AI can analyse the target language in your coursebook. It can suggest aspects of pronunciation for you to focus on. But it can’t decide what ought to be taught. You need to make the call on which pronunciation feature will be most valuable to your learners. You also need to check that the pronunciation features the AI has listed are actually there.

While AI is good at creating natural sounding texts, it isn’t quite as good at making pronunciation materials. Check that the sounds that you’re trying to practice are present in the materials AI generates. Although this advice goes for any AI generated materials, it is especially important for pronunciation materials, as this is something of a niche use of AI.

Speech-to-text AI is a fantastic tool for generating pronunciation feedback. Some students might find it easier to accept feedback on their pronunciation from a machine than from a person. But this doesn’t mean that you get a pass on correcting pronunciation errors. There are many aspects of pronunciation which a speech-to-text AI will not pick up on, especially suprasegmental features (like word stress, sentence stress, and connected speech). Remember to focus on these in class.

Conclusions

AI can help tackle the challenges of teaching pronunciation. It can help identify areas to focus on, generate custom materials like tongue twisters and minimal pairs, and give feedback that students can use independently. As with all uses of AI, teachers need to remain in control. With the right prompts and some creativity, AI can make pronunciation practice more focused, fun, and effective.

References

Kelly, L. G. (1969). 25 centuries of language teaching. Newbury House.

About the Author

Ross Thorburn

Ross Thorburn is a teacher trainer, materials writer and consultant based in Shanghai. Ross started his career in language teaching in 2006. He holds a Trinity DipTESOL, a Trinity FTCL TESOL, an IDLTM from the University of Queensland and a Master’s Degree in Language Education from NILE. Ross is also a keen researcher and has published research articles on teacher training, teacher motivation, task-based learning and young learners. In 2020, Ross published his first book, Inside Online Language Teaching. He also is the host of the TEFL Training Institute podcast.
