Imagine going to a clinic for stomach trouble. You have to make an appointment and wait in the lobby before you can talk to the doctor. Then you describe your symptoms, perhaps go through a few more tests, and only after another conversation with the doctor do you get a diagnosis. That diagnosis might not be fully accurate, by the way, in which case the process starts over again.
This is why newer healthcare systems are turning to technology for assistance: instead of people doing all of the work, machines take on part of it. Machine learning and healthcare can go hand in hand to improve diagnosis and support healthcare-related decisions. It can also save time and simplify the process.
"A lot of the technology is not really designed to connect with patients. Physicians and nurses don't necessarily give electronic health records (EHRs) a passing grade."
— Dr. Steven Waldren, Vice President & Chief Medical Informatics Officer at American Academy of Family Physicians.
According to Dr. Waldren, the average patient visit to a primary care physician (PCP) takes about 18 minutes, and only 27% of that time is dedicated to face-to-face time with the patient. Overall, 49% is consumed by EHR and desk work.
But how exactly do machine learning classifiers help solve these issues?
What Are ML Classifiers
First of all, what is a classifier?
Simply put, classifiers in machine learning are algorithms that automatically order and categorize data into classes. Think of an algorithm that looks into our emails and categorizes them into “spam” and “not spam.” You get the idea.
“Classes” are sometimes also called targets, labels, or categories; in this context they all mean the same thing. Generally speaking, training data is fed to a classifier so that it learns how to identify and categorize new data correctly. In the case of emails, the classifier is given examples of spam and non-spam emails so that it can learn to recognize spam.
Sounds easy enough, right? Now let’s take a brief look at different types of machine learning models.
Machine Learning Models
In short, there are four main types of machine learning model:
- Supervised learning
- Semi-supervised learning
- Unsupervised learning
- Reinforcement learning
Supervised learning
For supervised learning models, labeled data and examples are fed to the machine to help build the model. The dataset includes both the inputs and the desired outputs, which lets the machine work out the relationship between them. The machine then makes predictions and is corrected by the operators until it reaches the desired performance.
As an example of supervised machine learning, go back to our email scenario: the machine is fed emails that are already labeled as spam or non-spam, learns from that dataset, and then decides whether new emails are spam.
Alternatively, you can take a look at this fun article on how to train machines to identify different Iris species.
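To make the email scenario concrete, here is a minimal sketch of supervised learning using only the Python standard library. The training emails, the function names train and predict, and the word-counting rule are all assumptions made for illustration; real spam filters are far more sophisticated.

```python
from collections import Counter

# Hypothetical labeled training data: (email text, label).
TRAINING_EMAILS = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday with the team", "not spam"),
]

def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {}
    for text, label in examples:
        word_counts.setdefault(label, Counter()).update(text.split())
    return word_counts

def predict(word_counts, text):
    """Pick the label whose training emails share the most words with `text`."""
    scores = {
        label: sum(counts[word] for word in text.split())
        for label, counts in word_counts.items()
    }
    return max(scores, key=scores.get)

model = train(TRAINING_EMAILS)
print(predict(model, "claim your free prize"))      # spam
print(predict(model, "agenda for the team lunch"))  # not spam
```

The labels in the training data play the role of the "desired outputs": the model never sees a rule for spam, only examples of it.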
Semi-supervised learning
In semi-supervised learning, both labeled and unlabeled data are fed to the machine. Labeled data carries the tags needed to organize and use it, while unlabeled data lacks such tags.
The idea remains the same, however: the machine uses the labeled examples to work out labels for the unlabeled ones.
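One simple way to picture this is "self-training": start from a few labeled points and repeatedly give the closest unlabeled point the label of its nearest labeled neighbor. The sketch below assumes one-dimensional data and a hypothetical function name, self_train; it is one possible approach among many, not the definitive semi-supervised algorithm.

```python
def self_train(labeled, unlabeled):
    """Label unlabeled points one at a time, nearest-to-known first."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        # Find the (unlabeled point, labeled neighbor) pair with the smallest gap.
        point, (_, label) = min(
            ((p, known) for p in remaining for known in labeled),
            key=lambda pair: abs(pair[0] - pair[1][0]),
        )
        # Treat the newly labeled point as known from now on.
        labeled.append((point, label))
        remaining.remove(point)
    return dict(labeled)

# Two labeled points and five unlabeled ones (made-up values).
result = self_train([(1.0, "A"), (9.0, "B")], [2.0, 3.0, 8.0, 7.0, 4.0])
print(result[4.0], result[7.0])  # A B
```

Only two points were labeled by hand, yet every point ends up with a label because each newly labeled point helps label the next one.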
Unsupervised learning
Unlike the previous two approaches, unsupervised learning works without human-provided labels: no labeled datasets, no operators correcting the output, just the algorithms. It looks at a large body of data and identifies the correlations and relationships within it, either grouping the data into clusters or arranging it in a more organized way.
Generally, the more data it processes, the better the patterns it finds. There are two main sub-categories of unsupervised learning: clustering, which groups similar data together, and dimensionality reduction, which reduces the number of variables while keeping the most relevant information.
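As an illustration of clustering, here is a minimal sketch of k-means, a widely used clustering algorithm, on one-dimensional data. The readings and the starting centroids are made-up values, and k_means is a hypothetical helper written for this example. It covers clustering only, not dimensionality reduction.

```python
def k_means(points, centroids, iterations=10):
    """Plain k-means on 1-D points, starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

readings = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]  # hypothetical unlabeled measurements
centroids, clusters = k_means(readings, centroids=[0.0, 5.0])
print(centroids)  # [1.5, 8.5]
print(clusters)   # [[1.0, 1.5, 2.0], [8.0, 8.5, 9.0]]
```

Note that no labels appear anywhere: the two groups emerge purely from how close the values are to each other.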
Reinforcement learning
Lastly, we have reinforcement learning, which mostly concerns decision-making: the machine finds solutions to a problem through trial and error and is rewarded or punished based on the decisions it makes. The goal for the machine? Maximize its reward.
The machine tries to solve the problem without any hints from the operator. One prime example is self-driving cars, where the machine learns the optimal ways to drive based on the criteria set, which can be understood as the rules of a game. Here’s a detailed study on the use of reinforcement learning in autonomous driving, conducted by Stanford.
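A real driving task is far too large for a short example, so the sketch below uses a toy stand-in: an agent on a five-position track that is rewarded only when it reaches the last position. It uses Q-learning, one common reinforcement learning algorithm. The environment, the constants, and the function names are all assumptions made for illustration.

```python
import random

N_STATES = 5             # positions 0..4 on a straight track; 4 is the goal
ACTIONS = [-1, +1]       # step left or step right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (illustrative values)

def step(state, action):
    """Move along the track; reaching the goal pays a reward of 1."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def train(episodes=200, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(len(ACTIONS))  # explore by acting at random
            next_state, reward = step(state, ACTIONS[a])
            # Q-learning update: nudge the estimate toward
            # reward + discounted value of the best next action.
            target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q

q_table = train()
# Greedy policy for every non-goal position.
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
print(policy)  # ['right', 'right', 'right', 'right']
```

Nobody tells the agent that moving right is correct. It discovers this only because moving right eventually leads to the reward.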
With these out of the way, let’s look at how machine learning classifiers work with data.
How It Works with Data
Here comes the more technical part — but bear with us, we will try our best to explain everything in layman's terms.
Now you should understand that classifiers are essentially algorithms that “classify” or “categorize” things. But how exactly do they do that? Well, there are two major types of ML classifiers.
Lazy learners: these classifiers simply store the training data, and generalization is delayed until a query is made to the system. They are useful with large datasets that have only a few attributes. Examples of lazy learners include k-nearest neighbor and case-based reasoning.
Eager learners: as opposed to lazy learners, eager learners build their model from the training data before any prediction is requested. Because of this, they take more time to train but less time to predict. Notable examples include decision trees and naive Bayes.
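The difference is easiest to see in where the work happens. In the sketch below, the lazy class only stores data in fit() and does its computation in predict(), while the eager class summarizes the data in fit() so that predict() is cheap. Both classes are hypothetical and written for this illustration. The eager one uses a simple per-class average as a stand-in, rather than a decision tree or naive Bayes.

```python
class LazyNearestNeighbor:
    """Lazy: fit() only stores the data; predict() does the real work."""

    def fit(self, xs, labels):
        self.examples = list(zip(xs, labels))
        return self

    def predict(self, x):
        # Every query scans all stored examples.
        _, label = min(self.examples, key=lambda ex: abs(ex[0] - x))
        return label


class EagerCentroid:
    """Eager: fit() boils the data down to one mean per class up front."""

    def fit(self, xs, labels):
        grouped = {}
        for x, label in zip(xs, labels):
            grouped.setdefault(label, []).append(x)
        self.means = {label: sum(v) / len(v) for label, v in grouped.items()}
        return self

    def predict(self, x):
        # Each query only compares against the precomputed class means.
        return min(self.means, key=lambda label: abs(self.means[label] - x))


xs = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = ["low", "low", "low", "high", "high", "high"]
lazy = LazyNearestNeighbor().fit(xs, labels)
eager = EagerCentroid().fit(xs, labels)
print(lazy.predict(2.6), eager.predict(2.6))  # low low
print(lazy.predict(7.4), eager.predict(7.4))  # high high
```

With six points the difference is invisible, but with millions of stored examples the lazy predict() would have to scan all of them on every query.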
K-nearest neighbor … what? What do decision trees mean? No worries, we will get to them now.
K-nearest neighbor
K-nearest neighbor, also known as K-NN, classifies a data point based on how close it is to its neighbors: it takes a set of labeled points and uses them to label new ones. “K” is the number of nearest neighbors that take part in the majority vote. Finding a good value for K is a process of trial and error.
K-nearest neighbor is easy to implement and, with a suitable K, can cope reasonably well with noisy data. In terms of real-world applications, K-nearest neighbor has been used in diagnosing heart disease patients with impressive results, which we will get to later.
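Here is a minimal sketch of the voting process described above. The two-dimensional points and class names are made up, and knn_predict is a hypothetical helper; it assumes Python 3.8 or later for math.dist.

```python
from collections import Counter
import math

def knn_predict(training, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    by_distance = sorted(training, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Made-up labeled points: ((feature 1, feature 2), class).
training = [
    ((62, 35), "class A"),
    ((65, 41), "class A"),
    ((70, 38), "class A"),
    ((88, 63), "class B"),
    ((92, 58), "class B"),
    ((95, 67), "class B"),
]
print(knn_predict(training, (68, 40)))  # class A
print(knn_predict(training, (90, 60)))  # class B
```

Changing k changes how many neighbors get a vote, which is why choosing it usually involves trying several values and comparing the results.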
Decision tree
A decision tree, in theory, isn’t that different from your typical flow chart. This classification method uses if-then logic to split the data step by step into smaller subsets, ultimately creating a tree-like structure, like the one in the image below.
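The if-then structure can be sketched directly in code. The tree below is written by hand rather than learned from data, and its features (has_link, known_sender) are hypothetical, chosen only to continue the email example.

```python
# A hand-written tree: internal nodes test one feature, leaves hold a class.
# The features and the tree shape are hypothetical, for illustration only.
TREE = {
    "feature": "has_link",
    "yes": {
        "feature": "known_sender",
        "yes": "not spam",
        "no": "spam",
    },
    "no": "not spam",
}

def classify(node, email):
    """Walk from the root to a leaf, following one if-then branch per node."""
    while isinstance(node, dict):
        branch = "yes" if email[node["feature"]] else "no"
        node = node[branch]
    return node

print(classify(TREE, {"has_link": True, "known_sender": False}))   # spam
print(classify(TREE, {"has_link": True, "known_sender": True}))    # not spam
print(classify(TREE, {"has_link": False, "known_sender": False}))  # not spam
```

A learning algorithm would choose the questions and their order from training data; here they are fixed so that the flow-chart idea stays visible.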
Naive Bayes classifier
The naive Bayes classifier applies Bayes’ Theorem together with a “naive” assumption: within a class, the presence of a particular feature is unrelated to the presence of any other feature. Still confused? Imagine a potato. What are its attributes? Is it brown? Yellow inside? Hard or soft? Each of these properties contributes to how we identify a potato, independently of the others.
Even though this classification model might not be the most accurate out there, it requires little training data to acquire the desired results, which is its biggest advantage.
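The sketch below shows the independence assumption at work: the score for each class is its prior probability multiplied by one separate probability per feature. The training rows are made up, the function names are hypothetical, and the add-one smoothing assumes each feature has exactly two possible values.

```python
from collections import Counter, defaultdict

def train(rows):
    """rows: list of (features dict, label). Returns the counts naive Bayes needs."""
    label_counts = Counter(label for _, label in rows)
    # value_counts[(label, feature)][value] = times seen in training
    value_counts = defaultdict(Counter)
    for features, label in rows:
        for name, value in features.items():
            value_counts[(label, name)][value] += 1
    return label_counts, value_counts

def predict(model, features):
    label_counts, value_counts = model
    total = sum(label_counts.values())
    scores = {}
    for label, n in label_counts.items():
        score = n / total  # prior P(label)
        for name, value in features.items():
            seen = value_counts[(label, name)]
            # Each feature contributes its own factor, independent of the others.
            # Add-one smoothing, assuming two possible values per feature.
            score *= (seen[value] + 1) / (n + 2)
        scores[label] = score
    return max(scores, key=scores.get)

# Made-up training examples.
rows = [
    ({"skin": "brown", "firmness": "hard"}, "potato"),
    ({"skin": "brown", "firmness": "hard"}, "potato"),
    ({"skin": "yellow", "firmness": "hard"}, "potato"),
    ({"skin": "yellow", "firmness": "soft"}, "banana"),
    ({"skin": "yellow", "firmness": "soft"}, "banana"),
    ({"skin": "brown", "firmness": "soft"}, "banana"),
]
model = train(rows)
print(predict(model, {"skin": "brown", "firmness": "hard"}))   # potato
print(predict(model, {"skin": "yellow", "firmness": "soft"}))  # banana
```

Six examples are enough to produce sensible answers here, which reflects the point above about naive Bayes needing little training data.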
Now, let’s take a look at how classifiers are used in medical diagnosis, shall we?
Classifier Use Cases in Healthcare and Real-World Examples
With the advance of AI and machine learning in recent years, there has been a surge of machine learning use cases in healthcare. But to make sure we don’t get ahead of ourselves, we will focus only on how classifiers have been used in healthcare.
Back in 2012, a paper was published by a group of researchers on the use of K-NN in diagnosing heart disease patients — with impressive results. According to their reports, applying K-NN achieved an accuracy of 97.4% in diagnosing heart diseases, which was higher than other published findings on that benchmark dataset at the time.
Remember how we mentioned the use of labeled and unlabeled data in machine learning classification? Labeling data can be a time-consuming process, so in order to develop a competent and reliable classification system, a group of researchers explored the possibility of using a semi-supervised model, trained on a small amount of labeled data alongside an abundance of unlabeled data.
Babylon is a healthcare company that uses AI in its primary diagnosis.
Intelligent Medical Diagnosis
"We took artificial intelligence with a powerful algorithm, and gave it the ability to imagine alternate realities and consider ‘would this symptom be present if it was a different disease?’ This allows the artificial intelligence to tease apart the potential causes of a patient's illness and score more highly than over 70% of the doctors on these written test cases."
— Dr. Jonathan Richens, Research scientist.
In fact, the use of AI in healthcare has surged in recent years. According to one report, the market size was estimated at USD 693.57 million in 2020 and is expected to grow at a CAGR of 15.87% to reach USD 1,945.62 million by 2027.
Regarding AI use cases in healthcare, AI can “take full advantage of the data and increases the speed and objectivity of diagnoses.”
— Dr. Nanditha Mallesh, Research scientist at University of Bonn - Institute for Genomics Statistics and Bioinformatics.
All in all, machine learning, particularly through the use of classifiers, can provide an objective and often more accurate result compared to human diagnosis. As Dr. Peter Krawitz and his colleagues put it, the use of AI in healthcare “increases the speed as well as the objectivity of the analyses, compared to established processes.”
While the systems we have today are not fully matured yet, we are seeing promising results — and it’s likely that in the near future we will have more automated healthcare systems that can save more lives.
Here at Proxet, we provide software development services — including those related to machine learning and healthcare. Should you be in need of any intelligent medical solutions, we have the right experience and expertise to help you achieve your vision. Contact us today.