Converting Any Audio or Video Clip into Text Using “Whisper AI”

Artificial Intelligence (AI) has revolutionized various fields, and “Whisper AI” is a prime example of its capabilities. This tool allows for converting any audio or video clip into text, providing either a text file or a subtitle file that can be used on platforms like YouTube. “Whisper AI” is currently free and supports a wide range of languages.

Getting Started with Whisper AI

“Whisper AI” can run in the browser through Google Colaboratory, so you do not need to install any software on your device. All you need is a Google account. Here is a step-by-step guide:

  • Create a Google Account: If you don’t have one already, create a Google account, which is free of charge.
  • Access Google Drive: Log in to your Google account, navigate to the main Google page, and select “Drive” from the options at the top.
  • Create a New Project: Click on “New” in the Drive menu.
  • Add Google Colaboratory: From the menu, select “More,” then “Connect more apps,” and search for “Colaboratory.” Click “Install” to add it to your apps.
  • Open Google Colaboratory: Return to the “New” menu and select “Google Colaboratory” to create a new file. You can rename this file for future reference.

For example, we will name this file “Test”.

  • Set Up Runtime Environment: In the “Runtime” menu, choose “Change runtime type.” Select “GPU” under the “Hardware accelerator” menu and click “Save.”
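
Before installing anything, it can help to confirm that the runtime really has a GPU attached. The sketch below is one possible check to run in a code cell. It assumes PyTorch can be imported in the notebook (an assumption about the Colab image, not something this guide guarantees) and otherwise falls back to looking for the nvidia-smi tool. The function name gpu_available is hypothetical.

```python
import shutil


def gpu_available() -> bool:
    """Return True if a CUDA GPU appears to be attached to this runtime."""
    try:
        import torch  # assumption: PyTorch is present in the notebook image
    except ImportError:
        # Fallback heuristic: the NVIDIA driver tool is on the PATH.
        return shutil.which("nvidia-smi") is not None
    return torch.cuda.is_available()


print("GPU available:", gpu_available())
```

If this prints False, reopen “Change runtime type” and check that “GPU” was saved.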

Installing Whisper AI

To install “Whisper AI,” use the following code in Google Colaboratory:

!pip install git+https://github.com/openai/whisper.git
!sudo apt update && sudo apt install ffmpeg

You can find this code on the official “Whisper AI” GitHub page. Copy and paste the code into the specified area and click “Run.” The installation process may take up to a minute.
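
Once the cell finishes, a quick check can confirm that the pieces are in place. The following is a minimal sketch using only the Python standard library; the function name check_whisper_setup and the dictionary keys are hypothetical, and it only tests whether the package and the two command-line tools can be found, not whether they work.

```python
import importlib.util
import shutil


def check_whisper_setup() -> dict:
    """Report whether the Whisper package, its CLI, and ffmpeg can be found."""
    return {
        # The pip install above provides the "whisper" Python package ...
        "whisper_package": importlib.util.find_spec("whisper") is not None,
        # ... and a "whisper" command-line entry point.
        "whisper_cli": shutil.which("whisper") is not None,
        # ffmpeg is the external tool installed by the apt command.
        "ffmpeg_binary": shutil.which("ffmpeg") is not None,
    }


for name, found in check_whisper_setup().items():
    print(f"{name}: {'found' if found else 'missing'}")
```

Any line reporting “missing” suggests rerunning the installation cell.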


Using Whisper AI

  • Explore Features: To view all available options, type !whisper -h in the code cell and click “Run.” This command lists every option, including the supported languages and the available models.

Now let's try a simple example: we have an audio clip on the desktop and want to convert it to text.

  • Upload Audio File: Click the “Files” tab on the left and drag your audio file into the file list.
  • Transcribe Audio: Press “+Code” and type the following command, replacing "Test transcribe.mp3" with your file name:
   !whisper "Test transcribe.mp3" --model medium

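Adjust the transcription quality by specifying the desired model, such as “medium.” Larger models are generally more accurate but run more slowly and need more memory, while smaller models are faster at some cost in accuracy.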
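
To make the model choice explicit, the command can also be assembled in Python before it is run. This is only a sketch: the helper build_whisper_command is hypothetical, and MODEL_SIZES is an illustrative subset of model names ordered from smallest to largest, so check the output of !whisper -h for the full list your installed version accepts.

```python
import shlex

# Illustrative subset of model names, smallest to largest (assumption:
# the installed version may accept additional names).
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")


def build_whisper_command(audio_path: str, model: str = "medium") -> list:
    """Build the argument list for the whisper command-line tool."""
    if model not in MODEL_SIZES:
        raise ValueError(f"unknown model {model!r}; expected one of {MODEL_SIZES}")
    return ["whisper", audio_path, "--model", model]


cmd = build_whisper_command("Test transcribe.mp3", model="small")
# shlex.join quotes the file name so the line can be pasted after "!".
print(shlex.join(cmd))
```

The printed line can be pasted into a code cell after an exclamation mark, or the list can be passed to subprocess.run(cmd, check=True).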

  • Specify Language: To shorten processing time, specify the spoken language by adding --language en to the command (replace en with the appropriate language code). Whisper then skips automatic language detection.
  • Review Outputs: Once the transcription is complete, you’ll find a set of outputs in the “Files” list. Download the “srt” and “txt” files by clicking on each and selecting “Download.”
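
The “srt” file can also be inspected directly in the notebook before downloading it. The sketch below parses subtitle text into numbered cues with the standard library; parse_srt and Cue are hypothetical names, the sample text is made up for illustration, and the parser assumes the usual SRT layout (index line, timing line, then one or more text lines).

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    index: int
    start: str
    end: str
    text: str


# Timing line such as "00:00:02,500 --> 00:00:05,000".
TIMING = re.compile(r"(\d{2,}:\d{2}:\d{2},\d{3}) --> (\d{2,}:\d{2}:\d{2},\d{3})")


def parse_srt(content: str) -> List[Cue]:
    """Split SRT text into cues, skipping blocks that do not look valid."""
    cues = []
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = [line.strip() for line in block.strip().splitlines()]
        if len(lines) < 3 or not lines[0].isdigit():
            continue
        match = TIMING.match(lines[1])
        if match is None:
            continue
        cues.append(Cue(int(lines[0]), match.group(1), match.group(2), " ".join(lines[2:])))
    return cues


# Made-up sample in SRT layout, for illustration only.
sample = (
    "1\n00:00:00,000 --> 00:00:02,500\nHello and welcome.\n\n"
    "2\n00:00:02,500 --> 00:00:05,000\nThis is a test.\n"
)
for cue in parse_srt(sample):
    print(f"{cue.index}: [{cue.start} to {cue.end}] {cue.text}")
```

To use it on a real output, read the file first, for example parse_srt(open("Test transcribe.srt", encoding="utf-8").read()). The exact output file name is an assumption here, so use the name shown in the “Files” list.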

Conclusion

“Whisper AI” provides an efficient way to transcribe audio and video files into text with high accuracy. While the tool is powerful, it’s important to review the generated text and subtitles to ensure accuracy. This guide helps you set up and use “Whisper AI” effectively, leveraging AI to streamline your transcription needs.
