Speech to Text Conversion in Python

In this article, we will walk through converting speech to text in Python using the SpeechRecognition library.

Speech recognition is the process of recognizing spoken words and representing them as text. It is useful in many applications, such as self-driving cars and home surveillance.


Prerequisites for Python speech to text conversion

Before diving into Python speech to text conversion, we need to install the necessary libraries.

Step 1: Install SpeechRecognition library

pip install SpeechRecognition

The SpeechRecognition library is used for the Speech to Text conversion. Moreover, it supports various offline/online speech recognition engines and APIs.

Step 2: Install PyAudio module

pip install PyAudio

The PyAudio library provides Python bindings for PortAudio, a cross-platform audio input/output library. PyAudio lets you record and play audio on any supported platform.
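After installing, it can help to confirm that both modules are visible to the interpreter. The following is a minimal sketch using only the standard library; the helper name missing_modules is hypothetical. Note that the import names (speech_recognition, pyaudio) differ from the package names used with pip.

```python
import importlib.util


def missing_modules(import_names):
    """Return the names from import_names that cannot be found on this system."""
    return [name for name in import_names
            if importlib.util.find_spec(name) is None]


# Import names differ from the PyPI names (SpeechRecognition, PyAudio).
missing = missing_modules(["speech_recognition", "pyaudio"])
print("Missing modules:", missing or "none")
```

If a name is reported as missing, rerun the matching pip command from the steps above.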

Understanding Python speech to text conversion using SpeechRecognition module

Step 1: Import the necessary library/module

To convert speech to text with the SpeechRecognition module, we first import it into our program so that its classes and functions are available.
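A short sketch of the import step follows. The alias sr is a common convention rather than a requirement, and the try/except guard is an addition here so that a missing installation produces a readable message instead of a traceback.

```python
try:
    # Published on PyPI as "SpeechRecognition", imported as "speech_recognition".
    import speech_recognition as sr
except ImportError:
    sr = None  # the library is not installed on this system
    print("SpeechRecognition is not installed; run: pip install SpeechRecognition")
else:
    print("Using SpeechRecognition", sr.__version__)
```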

Step 2: Initialize the Speech Recognizer

To accept audio input and recognize the speech in it, we need to initialize a Recognizer object.
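Initializing the recognizer might look like the sketch below. The wrapper function make_recognizer is hypothetical and exists only to give a clear error when the library is missing; the essential call is sr.Recognizer().

```python
try:
    import speech_recognition as sr
except ImportError:
    sr = None  # library not installed; see the prerequisites


def make_recognizer():
    """Create the Recognizer that later steps use to record and recognize audio."""
    if sr is None:
        raise RuntimeError("SpeechRecognition is not installed")
    # Recognizer exposes record() and the recognize_*() methods.
    return sr.Recognizer()
```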

Step 3: Set the source of input audio/voice

The SpeechRecognition module accepts two types of input:

  • Pre-recorded audio file
  • Voice input through default Microphone

When the input is recorded directly through the default microphone, the Microphone() object is used to fetch the audio.

Note: We need to install the PyAudio module in order to accept the input in audio format from the default microphone.
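The microphone case can be sketched as follows. The function name record_from_microphone and its default of 5 seconds are assumptions made for illustration; Microphone() is used as a context manager so the audio stream is closed when the block ends.

```python
def record_from_microphone(seconds=5, device_index=None):
    """Record `seconds` of audio from a microphone and return it as AudioData.

    device_index=None selects the system default microphone.
    """
    import speech_recognition as sr  # third-party; Microphone needs PyAudio

    recognizer = sr.Recognizer()
    # The audio stream is opened on entry and closed when the block exits.
    with sr.Microphone(device_index=device_index) as source:
        print(f"Speak now ({seconds} seconds)...")
        return recognizer.record(source, duration=seconds)
```

Calling record_from_microphone() requires a working input device, so it is best tried interactively.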

To convert a pre-recorded audio file to text, use the AudioFile() object as the input source instead of Microphone().
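A sketch of the file-based variant is shown below. The function name audio_from_file is hypothetical, and the path you pass (for example a WAV file of your own) is an assumption; AudioFile() works with WAV, AIFF, and FLAC files.

```python
def audio_from_file(path):
    """Load a whole audio file and return its contents as AudioData."""
    import speech_recognition as sr  # third-party; imported when the function runs

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        # With no duration given, record() reads until the end of the file.
        return recognizer.record(source)
```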

Step 4: Define the time limit for recording the audio from the microphone

The record() method takes the input source and the length of time for which the input audio should be recorded.

  • source: Defines the source of input such as audio file, input from microphone, etc.
  • duration: The time period (in seconds) for which the microphone would be active and accept the input voice from the user.
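To experiment with the duration argument without a microphone, one option is to generate a short test file and record only part of it. In the sketch below, the helper names and the silent WAV file are hypothetical stand-ins for real speech; record() also accepts an offset argument (in seconds) to skip the start of the source.

```python
import wave


def write_silent_wav(path, seconds=3, sample_rate=16000):
    """Write a mono 16-bit WAV of silence (a stand-in for real speech)."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * sample_rate * seconds)


def record_clip(path, duration=None, offset=None):
    """Read at most `duration` seconds, skipping the first `offset` seconds."""
    import speech_recognition as sr  # third-party; imported when the function runs

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        return recognizer.record(source, duration=duration, offset=offset)


def clip_seconds(audio):
    """Length of an AudioData clip in seconds, computed from its raw bytes."""
    return len(audio.frame_data) / (audio.sample_rate * audio.sample_width)
```

For example, after write_silent_wav("silence.wav"), calling clip_seconds(record_clip("silence.wav", duration=1)) should report a length close to one second.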

Step 5: Convert the speech to text using a speech recognition engine or an API

The record() function captures the user's voice as audio data, which is then sent to a speech recognition engine, such as the Google speech recognition engine. The system must stay connected to the Internet in order to use the Google recognition engine.

The recognize_google() function takes the recorded audio as a parameter and returns the recognized speech as text. To recognize speech in another language, such as Spanish or Japanese, pass the language to the function through its language parameter.
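The recognition step can be sketched as below. The function name transcribe is hypothetical, and the language tags in the comments ("es-ES", "ja-JP") are assumed examples of the codes the service accepts. The two exception types are raised by the library when speech cannot be understood or when the service cannot be reached.

```python
def transcribe(audio, language="en-US"):
    """Send AudioData to the Google engine and return the text, or None."""
    import speech_recognition as sr  # third-party; imported when the function runs

    recognizer = sr.Recognizer()
    try:
        # Assumed example tags: "es-ES" for Spanish, "ja-JP" for Japanese.
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        # The engine could not make out any words in the audio.
        return None
    except sr.RequestError as error:
        # The service was unreachable, for example with no Internet connection.
        raise RuntimeError(f"Recognition request failed: {error}") from error
```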


Implementation of Python Speech to text conversion using SpeechRecognition library
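Putting the steps together, a complete program might look like the following sketch. It is one possible arrangement rather than a definitive implementation: the function name speech_to_text, the optional audio_path argument, and the 5-second default are all assumptions made for illustration.

```python
def speech_to_text(audio_path=None, seconds=5, language="en-US"):
    """Convert speech to text from a file (if audio_path is given) or the microphone."""
    import speech_recognition as sr  # third-party; Microphone also needs PyAudio

    # Step 2: initialize the recognizer.
    recognizer = sr.Recognizer()

    # Step 3: choose the input source.
    source = sr.AudioFile(audio_path) if audio_path else sr.Microphone()

    # Step 4: record for a limited time.
    with source as active_source:
        audio = recognizer.record(active_source, duration=seconds)

    # Step 5: send the audio to the Google engine (requires Internet access).
    try:
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        return ""  # nothing intelligible was heard
    except sr.RequestError as error:
        raise ConnectionError(f"Could not reach the recognition service: {error}") from error
```

Calling print(speech_to_text()) records from the default microphone for five seconds and prints whatever text the engine returns.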



Conclusion

In this article, we covered how to convert speech to text in Python using the SpeechRecognition library.


By admin
