Speech recognition is an interdisciplinary subfield of computer science and computational linguistics focused on developing computer-based methods and technologies to translate spoken language into text. It is also known as automatic speech recognition (ASR), computer speech recognition, or speech-to-text (STT).
Speech recognition applications include voice user interfaces such as voice dialing (e.g., "call home"), call routing (e.g., "I would like to make a collect call"), and home automation (e.g., "turn off the kitchen lights"). Productivity applications include searching audio recordings (e.g., by creating a transcript), simple data entry (e.g., speaking a credit card number aloud), preparation of structured documents (e.g., a radiology report), determining speaker characteristics,[1] speech-to-text processing (e.g., in word processors or email), and controlling aircraft (usually termed direct voice input).
Automatic pronunciation assessment is used in education, such as for spoken language learning.
The term voice recognition[2][3][4] or speaker identification[5][6][7] refers to identifying the speaker, rather than what they are saying. Recognizing the speaker can simplify the task of translating speech in systems that have been trained on a specific person's voice, and it can also be used to authenticate or verify a speaker's identity as part of a security process.
When you speak to someone, they don't just recognize what you say: they recognize who you are. WhisperID will let computers do that, too, figuring out who you are by the way you sound.