How to convert a speech MP3 to text

Written by Scott Knickelbine
Speech recognition software can help you transcribe an audio file without having to type it yourself. (typing hands image by Tom Davison from

If you have a speech or dictation saved as an MP3 file and want to transcribe it, you have two options: listen to the file and type it out yourself, or have a speech recognition program do the typing for you. Converting speech audio to text takes some setup and preparation, but if the recording is clear enough, speech recognition software can save you a lot of time.

Things you need

  • MP3 speech file
  • Audio conversion software (Microsoft Sound Recorder, Audacity, etc.)
  • Speech recognition software (Dragon NaturallySpeaking, Wave to Text, etc.)


  1. Open your audio conversion software.
  2. Import the MP3 file you wish to transcribe.
  3. Save a copy of the file as a WAV file.
  4. Open your speech recognition software.
  5. Create a new dictation source for the WAV file. You might want to give it the name of the person speaking on the file.
  6. Click on the menu option to have the program begin taking dictation from the WAV file.
  7. Review the text that the program creates, correcting any errors. If you're planning on transcribing many MP3 files from the same speaker, you may wish to have the speech recognition program learn your corrections.
  8. Save the text file, or import it into a word processing program.
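If you have many files to convert, steps 1 through 3 can also be scripted. The following is a minimal sketch, not part of the original procedure: it assumes the free ffmpeg command-line tool is installed and on your PATH, which is a substitute for the Sound Recorder or Audacity route described above. The function names are hypothetical, and the mono 16 kHz output is an assumption that suits many speech recognition programs; check what your own program expects.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(mp3_path, wav_path, sample_rate=16000):
    """Return the ffmpeg argument list for an MP3-to-WAV conversion.

    The mono, 16 kHz defaults are assumptions; adjust them to whatever
    your speech recognition program expects.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", str(mp3_path),      # input MP3 file
        "-ac", "1",               # mix down to a single (mono) channel
        "-ar", str(sample_rate),  # output sample rate in Hz
        str(wav_path),            # output format is inferred from ".wav"
    ]


def convert_mp3_to_wav(mp3_path, wav_path=None):
    """Convert mp3_path to a WAV file and return the new file's path."""
    mp3_path = Path(mp3_path)
    # Default to the same name with a .wav extension, next to the MP3.
    wav_path = Path(wav_path) if wav_path else mp3_path.with_suffix(".wav")
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    # check=True raises CalledProcessError if ffmpeg reports a failure.
    subprocess.run(build_ffmpeg_command(mp3_path, wav_path), check=True)
    return wav_path
```

Calling `convert_mp3_to_wav("speech.mp3")` (a placeholder file name) would write `speech.wav` alongside the original, ready to be opened in your speech recognition software in step 4.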

Tips and warnings

  • Nearly all speech recognition programs work with WAV files, not with MP3s. This is why you need to convert your file to a WAV before you can have the speech recognition program transcribe it.
  • Speech recognition programs become more accurate the more they can "learn" the speaking style of an individual speaker. Using them for audio file transcription will work best if you're transcribing many files from the same speaker.
  • Audio quality means a lot to a speech recognition program. Your transcriptions will be more accurate if the MP3 was recorded with high-quality sound, with the speaker's mouth a consistent distance from the microphone and with little or no background noise.
  • MP3 files are compressed; WAV files normally are not. This means the WAV file you create from your MP3 will take much more disk space than the original MP3 did, roughly 10 MB per minute at CD quality (44.1 kHz, 16-bit stereo). Make sure you have plenty of hard drive space before you begin your project.
  • Low-quality audio files and files with multiple speakers, cross talk or background noise will produce very inaccurate transcriptions. In these cases, correcting the text may take longer than typing the transcript yourself.
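To judge how much disk space a project will need, you can estimate file sizes from the recording length. This is a rough sketch under stated assumptions: the WAV holds uncompressed PCM audio with a standard 44-byte header, the MP3 was encoded at a constant bit rate (128 kbps is an assumed, common value), and the function names are hypothetical.

```python
WAV_HEADER_BYTES = 44  # size of the canonical PCM WAV header


def estimate_wav_bytes(duration_seconds, sample_rate=44100,
                       channels=2, bits_per_sample=16):
    """Estimate the size of an uncompressed PCM WAV file in bytes."""
    bytes_per_second = sample_rate * channels * (bits_per_sample // 8)
    return WAV_HEADER_BYTES + int(duration_seconds * bytes_per_second)


def estimate_mp3_bytes(duration_seconds, bitrate_kbps=128):
    """Estimate the size of a constant-bit-rate MP3 file in bytes."""
    # Bit rate is in kilobits per second; divide by 8 to get bytes.
    return int(duration_seconds * bitrate_kbps * 1000 / 8)
```

Under these assumptions, a one-hour speech comes to about 635 MB as a CD-quality stereo WAV, compared with about 58 MB as a 128 kbps MP3. Saving the WAV as mono at a lower sample rate, if your speech recognition program accepts it, reduces the size considerably.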
