What is a voice recognition system?
A voice recognition system is a computerized system that can read and decipher human speech as well as comprehend and execute spoken instructions. To process human speech, the system must first convert analogue audio signals to digital ones using suitable infrastructure. This infrastructure must include a digital database containing a vocabulary of words or syllables and a quick means of comparing this data with the received signals. Speech patterns are stored on the hard drive and loaded into memory when the program runs. These systems currently operate using data collected from various platforms such as the internet.
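The pipeline described above (digitize the signal, then compare it against a stored vocabulary) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the "words" are pure tones, the names digitize, distance, tone, VOCABULARY and recognize are hypothetical, and the comparison is a plain mean squared difference. Real systems extract acoustic features and use statistical or neural models rather than raw sample matching.

```python
import math

def digitize(signal, num_samples=80, sample_rate=8000, levels=256):
    """Sample a continuous signal (a function of time in seconds, with values
    in [-1, 1]) and quantize each sample to an integer in [0, levels - 1]."""
    samples = []
    for i in range(num_samples):
        value = max(-1.0, min(1.0, signal(i / sample_rate)))  # clip to [-1, 1]
        samples.append(round((value + 1.0) / 2.0 * (levels - 1)))
    return samples

def distance(a, b):
    """Mean squared difference between two equal-length sample lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def tone(freq):
    """Stand-in for an analogue signal: a pure sine wave at freq Hz."""
    return lambda t: math.sin(2 * math.pi * freq * t)

# Hypothetical vocabulary: each "word" is represented by one stored pattern.
VOCABULARY = {
    "yes": digitize(tone(300)),
    "no": digitize(tone(700)),
    "stop": digitize(tone(1200)),
}

def recognize(signal):
    """Digitize the received signal and return the closest vocabulary entry."""
    received = digitize(signal)
    return min(VOCABULARY, key=lambda word: distance(received, VOCABULARY[word]))
```

For example, recognize(tone(310)) picks the entry whose stored pattern is closest to a slightly off-pitch 310 Hz tone. The vocabulary is held in memory as a dictionary, mirroring the idea of loading stored speech patterns when the program runs.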
Use of "digital help" systems (digital assistants) has increased significantly in recent years. Examples include Apple's Siri, Microsoft's Cortana and Amazon's Alexa, all of which answer users' questions, play music, read audiobooks and even control other home devices via smart home technologies. These options, among others, are all accessed through the user's voice commands and the system's spoken responses.
Digital help is evolving as AI advances, with deep learning applications playing a central role. Over time, these "helpers" become better at understanding users' intentions, taking into account the verbal context as well as users' location and behavior based on their previous queries. Thus, systems can address users' needs precisely and enable them in turn to act accordingly (e.g. ordering a certain product online after completing a search).
Using voice recognition systems for data searches
There is a great difference between a textual search and a voice search. While textual searches use a limited number of words (usually between 2 and 4) so as to focus the search results, voice searches usually rely on many words. The user tends to ask a question in natural form, using question words such as who, how, what, where, why and when, and expects the search engine to provide a precise answer even though the question includes conjunctions and elaboration.
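The distinction above can be sketched as a simple heuristic. This is only an illustration under stated assumptions: the function name looks_like_voice_query and the four-word threshold are hypothetical choices based on the figures in the text, not a method used by any actual search engine.

```python
# Question words listed in the text above.
QUESTION_WORDS = {"who", "how", "what", "where", "why", "when"}

def looks_like_voice_query(query, max_typed_words=4):
    """Guess whether a query reads like a spoken, natural-form question.

    Assumption: typed queries stay within max_typed_words words and
    contain no question word; anything else is treated as voice-like.
    """
    words = query.lower().strip("?!. ").split()
    has_question_word = any(word in QUESTION_WORDS for word in words)
    return has_question_word or len(words) > max_typed_words
```

Under this heuristic, a query such as "pizza near me" is treated as typed, while "where can I find the best pizza near me?" is treated as voice-like.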
The users' choice of words during a voice search provides information regarding their intention (is the user researching a certain subject or interested in purchasing a product?). The system can also provide advertisers with many insights into users' intentions and display data customized according to the search.
Despite technological advances and impressive achievements, voice recognition systems are far from perfect. Current systems can still confuse phonetically similar words that differ in meaning, and they can fail because of background noise interference, among other problems. Nevertheless, these challenges are expected to be addressed before long.
According to Comscore, by 2020, 50% of data searches will be performed by voice; Gartner predicts that in 2018 voice-based interactions will account for 30% of all interactions with devices, since people can speak four times faster than they can type.
Voice recognition systems mainly interest private consumers, who can use them to search for general and local information on shopping, weather, news, etc. Businesses use these systems to adapt their ads to the voice searches performed.
Yet it seems that soon enough digital help and voice searches will also serve organizations for professional purposes such as scheduling, searching through organizational databases and documenting knowledge via digital help.
It's just a matter of time and patience.