The future of healthcare apps: how voice recognition can help diagnose heart conditions.
Disorders of the heart and blood vessels, or cardiovascular diseases (CVDs), are among the most widespread health issues. According to the World Health Organization, CVDs are the leading cause of death globally. The latest available data suggest that around 18 million people die from CVDs each year, more than 30% of all deaths.
According to the European Society of Cardiology, which includes representatives from more than 50 countries, ischaemic heart disease, strokes, and other CVDs far outstrip cancer or any other cause of death.
Thus, combating heart disease has become a task of prime importance. Modern technologies are already being used to treat and prevent CVDs: cardiologists rely on artificial intelligence, virtual and augmented reality, and big data analysis to boost their efficiency.
Healthcare professionals are looking for new sources of information that can help assess the risk of critical heart conditions. Wearable devices and smartphones are an invaluable source of such data, and their full potential is yet to be harnessed. Recent medical research has even turned to voice analysis to detect heart disease.
Heart Disease and Voice
Some indicators of cardiac issues are easy to miss. For example, the typical symptoms of heart block (a problem with the electrical impulses that control the heartbeat) are:
- slow or irregular heartbeats;
- shortness of breath;
- lightheadedness, fainting;
- the feeling of discomfort or pain in the chest;
- difficulty exercising (because blood isn’t being pumped around the body properly).
For most people, a change in the voice isn’t the first thing that comes to mind regarding heart disease. However, conditions such as congestive heart failure (CHF) or aortic aneurysm affect blood vessels around the heart, which can compress vocal cord nerves. The voice of a person suffering from those conditions may become weak and hoarse.
Heart failure and aneurysms can be extremely dangerous. CHF reduces the pumping power of the heart muscle to the point of being life-threatening. An aneurysm is a bulge in the wall of a blood vessel that can lead to internal bleeding and even death.
Such vocal indicators of heart disease are so subtle that special equipment is needed to detect them. In 2016, the Mayo Clinic, an American nonprofit academic medical center, teamed up with Beyond Verbal, a voice analytics company, to search for links between vocal features and coronary artery disease (CAD). The “signature” of this condition is plaque buildup in the arteries, which can cause heart attacks. CAD is among the most common heart diseases.
In the study, 120 patients made 30-second voice recordings in English, which were then processed with voice analysis software. The program found a link between a biomarker in the voice signal and CAD: its presence signified a 19-fold increase in the likelihood of the condition.
“What it sounds like specifically is not something that can be articulated; it’s not something that the human ear can detect. It’s similar to eyesight in that we can see a certain spectrum, but much more exists.”
— Yuval Mor, CEO of Beyond Verbal
Mor expressed hope that a smartphone app would emerge to help patients monitor their health more efficiently.
More recently published research on vocal biomarkers of heart disease provided similar results. A group of scientists analyzed recordings of telephone conversations between patients and nurses. The data came from an Israeli hospital offering telemedicine services to patients with chronic conditions.
The project used audio recordings in Hebrew or Russian from 8,316 non-CHF patients and 2,267 CHF patients. With the help of a machine learning model, the researchers detected a vocal marker associated with an increased risk of death and hospitalization.
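The study does not spell out its full pipeline here, but a common approach to this kind of vocal-biomarker modeling is to summarize each recording as acoustic features (such as MFCCs) and train a classifier on them. Below is a minimal sketch in Python using librosa and scikit-learn; the file paths and labels are hypothetical placeholders, and this illustrates the general technique only, not the researchers’ actual method.

```python
# A minimal sketch of vocal-biomarker classification: summarize each
# recording as mean MFCC features and fit a classifier. This shows the
# general technique only, not the published study's actual pipeline.
# The file paths and labels are hypothetical placeholders.
import librosa
import numpy as np
from sklearn.linear_model import LogisticRegression

def extract_features(path):
    """Load a recording and summarize it as mean MFCC coefficients."""
    signal, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)
    return mfcc.mean(axis=1)  # one 13-dimensional vector per recording

recordings = ["patient_001.wav", "patient_002.wav"]  # hypothetical files
labels = [1, 0]                                      # 1 = CHF, 0 = non-CHF

X = np.array([extract_features(path) for path in recordings])
y = np.array(labels)

# Real work would need thousands of recordings and a held-out test set.
model = LogisticRegression().fit(X, y)
print(model.predict_proba(X))  # per-recording risk estimates
```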
“Our study, therefore, supports the use of vocal biomarkers. They are noninvasive and can be incorporated to any smartphone or even landline phone, for risk assessment of heart failure patients in telemedicine settings.”
— Elad Maor, MD, PhD, and colleagues
Voice Recognition - Basic Rules and Tools
The use of speech and voice recognition software is on the rise. It enables dictation, automatic translation, and hands-free control of various devices and equipment. The technology is the cornerstone of personal assistants in vehicles and smartphones, such as Apple’s Siri, Google Assistant, and Microsoft’s Cortana. Research and Markets estimates the global speech and voice recognition market will reach $26.8 billion by 2025.
But how can computers “understand” human speech? Speech recognition is a difficult task, and software has to satisfy a number of requirements to handle it. The most important are:
- Homonyms: speech recognition must take context into account to deal with words that sound alike but have several meanings (“bat,” “lie,” “ring,” etc.);
- Background noise: the system should filter out interfering sounds, including the voices of other people talking nearby;
- The speed and fluency of natural speech: the instrument has to be smart enough to determine where one word ends and the next begins (a deliberately naive approach to this boundary problem is sketched after this list);
- Speaker variability: different people speak in different ways, and voice recognition systems must be able to understand any voice.
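To make the word-boundary challenge concrete, here is a deliberately naive sketch: mark frames as speech or silence by short-term energy, and treat silent gaps as potential boundaries. Real recognizers use far more robust techniques; the `signal` variable here is a hypothetical mono waveform.

```python
# A naive illustration of the word-boundary problem: mark frames as
# speech or silence by short-term energy, then treat silent gaps as
# potential boundaries. Real recognizers use far more robust methods.
import numpy as np

def energy_segments(signal, sr, frame_ms=25, threshold=0.01):
    """Return (start, end) sample indices of high-energy regions."""
    frame_len = int(sr * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy = (frames ** 2).mean(axis=1)
    voiced = energy > threshold
    segments, start = [], None
    for i, is_voiced in enumerate(voiced):
        if is_voiced and start is None:
            start = i * frame_len           # speech begins
        elif not is_voiced and start is not None:
            segments.append((start, i * frame_len))  # speech ends
            start = None
    if start is not None:
        segments.append((start, n_frames * frame_len))
    return segments

# Hypothetical usage: "signal" is a mono waveform as a float numpy array.
# for start, end in energy_segments(signal, sr=16000): ...
```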
A mix of linguistics, mathematics, and computing is employed to overcome these difficulties and allow machines to process spoken language. Four essential methods are used:
- Pattern matching: the computer tries to distinguish between various sound patterns, comparing a chunk of sound to a database of stored patterns;
- Pattern and feature analysis: each word is disassembled into parts and identified from key features, such as its vowels;
- Language models and statistical analysis: grammar patterns and the probability that certain words or sounds will follow one another speed up recognition and improve accuracy (a toy example follows this list);
- AI voice recognition: artificial neural networks are trained to recognize the patterns of word sounds.
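As a toy illustration of the statistical method, the snippet below estimates bigram probabilities, i.e., how likely one word is to follow another, from a tiny hypothetical corpus. A real system would combine such language-model scores with acoustic evidence.

```python
# A toy bigram language model, illustrating the "statistical analysis"
# method: estimate how likely one word is to follow another, so the
# recognizer can prefer plausible word sequences. Corpus is hypothetical.
from collections import Counter, defaultdict

corpus = "the patient has a weak heart the patient has shortness of breath".split()

# Count how often each word follows each preceding word.
bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def next_word_prob(prev, nxt):
    """P(nxt | prev), estimated from the toy corpus."""
    total = sum(bigram_counts[prev].values())
    return bigram_counts[prev][nxt] / total if total else 0.0

# "patient has" is far more plausible than "patient heart":
print(next_word_prob("patient", "has"))    # 1.0 in this toy corpus
print(next_word_prob("patient", "heart"))  # 0.0
```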
Medical Voice Recognition Software
The healthcare sector accounts for around half of the speech recognition software on the market, according to the “Voice Tech Landscape” report. In medical settings, voice recognition software can:
- get information from medical records;
- assist patients with disabilities in satisfying their needs;
- confirm doctor appointments;
- instruct nurses on specific procedures;
- provide the patients with information about schedules, available hospital units, waiting lists, etc.;
- check on prescription details;
- facilitate administrative routines.
“One of our custom software development solutions for healthcare, Triage, is equipped with an AI-powered voice attendant. The patient is asked to answer a series of questions with the help of a voice prompt that supports multiple languages. The voice-fingerprinting technology helps to identify the patients and locate their records.”
— Vlad Medvedovsky at Proxet, a custom software development solutions company
Let’s compare the types of voice recognition software and their suitability for different circumstances. Back-end speech recognition software records spoken words and translates them into text; the draft document, together with the voice recording, is then sent to a transcriptionist or a doctor. This technology is well suited to drawing up healthcare documents and electronic health records.
Front-end speech recognition converts spoken words into text in real time. It eliminates the need for medical transcriptionists but is prone to occasional errors that require human correction. It is useful for taking personal notes or writing short medical reports.
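To make the front-end, real-time style concrete, here is a hedged sketch of streaming recognition with one popular cloud API, Google Cloud Speech-to-Text (covered in the next section). It assumes configured Google Cloud credentials; `mic_chunks()` is a hypothetical generator yielding raw audio chunks from a microphone.

```python
# Sketch of "front-end" (real-time) recognition using the Google Cloud
# Speech-to-Text streaming API. Assumes configured Google Cloud
# credentials; mic_chunks() is a hypothetical generator that yields raw
# 16 kHz LINEAR16 audio chunks from a microphone.
from google.cloud import speech

client = speech.SpeechClient()
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
)
streaming_config = speech.StreamingRecognitionConfig(
    config=config,
    interim_results=True,  # emit partial transcripts as the user speaks
)

requests = (
    speech.StreamingRecognizeRequest(audio_content=chunk)
    for chunk in mic_chunks()  # hypothetical audio source
)
for response in client.streaming_recognize(config=streaming_config, requests=requests):
    for result in response.results:
        if result.is_final:
            print(result.alternatives[0].transcript)
```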
How to Make a Voice Recognition Program
How you develop a voice recognition app depends on your resources and business goals. You don’t need to write code from scratch, as there are many tools and libraries to serve as “building blocks.”
One way is to use voice recognition APIs (Application Programming Interfaces) provided by major cloud services. Some of the most popular are listed below, followed by a short usage sketch:
- Google Speech API;
- Bing Speech API (Microsoft Cognitive Services);
- Wit.ai by Facebook;
- Amazon Alexa Voice Service.
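For comparison with the streaming sketch above, here is a minimal batch-style call to Google Cloud Speech-to-Text that transcribes a short audio file. Configured credentials are assumed, and `appointment.wav` is a hypothetical recording.

```python
# Minimal batch sketch: transcribe a short WAV file with Google Cloud
# Speech-to-Text (the google-cloud-speech package). Assumes configured
# credentials; "appointment.wav" is a hypothetical recording.
from google.cloud import speech

client = speech.SpeechClient()

with open("appointment.wav", "rb") as f:
    audio = speech.RecognitionAudio(content=f.read())

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
)

response = client.recognize(config=config, audio=audio)
for result in response.results:
    print(result.alternatives[0].transcript)
```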
Relying on the resources and algorithms of technology giants is convenient and easy, but these services aren’t free, and customization is limited.
Another way is to harness open-source libraries, which make a variety of techniques and programming languages accessible. Here are several examples:
- CMU Sphinx: a group of voice recognition systems developed at Carnegie Mellon University that serve different purposes. The toolkit is flexible in terms of programming languages: Sphinx-4 is written in Java, PocketSphinx in C, and bindings let you build your software in C# or Python (a short Python sketch follows this list);
- HTK (Hidden Markov Model Toolkit): suitable for the statistical analysis method of voice recognition. Microsoft owns it, but users are allowed to modify the source code. It is written in C;
- Kaldi: a relatively new, easy-to-use toolkit written in C++;
- Wav2Letter++: a library released by Facebook as “the fastest state-of-the-art speech recognition system available.” It’s a machine-learning-driven speech-to-text tool written in C++.
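As a quick way to try the open-source route, CMU Sphinx can be used from Python through the community SpeechRecognition package, which wraps PocketSphinx for fully offline transcription; `notes.wav` below is a hypothetical recording.

```python
# Minimal offline sketch using CMU Sphinx (PocketSphinx) through the
# SpeechRecognition package: pip install SpeechRecognition pocketsphinx
# "notes.wav" is a hypothetical recording.
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.AudioFile("notes.wav") as source:
    audio = recognizer.record(source)  # read the entire file

try:
    print(recognizer.recognize_sphinx(audio))  # runs entirely offline
except sr.UnknownValueError:
    print("Sphinx could not understand the audio")
```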
Armed with these tools, even individual computing enthusiasts can create simple speech recognition systems. Building a voice recognition free app as your pet project is one thing; custom software development for business is quite another.
Proxet has a team of professionals who will equip your enterprise with the power of speech and voice recognition software. We’ve completed several projects for clients in healthcare, some of which used voice recognition. Our team has all the skills and experience needed to satisfy your business needs quickly and efficiently.