Aria has been working with a source for the past four weeks to get information about hospitalized patients with COVID-19.
In her short career as a political reporter, she has never dealt with technical data, so imagine her surprise when she realized the information her source sent her was not in plain English.
The hospital has been using a machine learning model to predict whether patients sick with COVID-19 will end up in the hospital. The report came with a bunch of numbers, and it measured the quality of the model using two weird terms: “precision” and “recall.”
Aria has to write her article, but first, she needs to figure out how to interpret the numbers.
Which of the following statements accurately represent what precision and recall are?
Precision: Among all the patients classified by the model as potential hospitalizations, the fraction that ended up in the hospital.
Precision: Among all the patients that ended up in the hospital, the fraction that the model correctly classified.
Recall: Among all the patients classified by the model as potential hospitalizations, the fraction that ended up in the hospital.
Recall: Among all the patients that ended up in the hospital, the fraction that the model correctly classified.
1, 4
In machine learning, especially when building classification models, precision and recall are two of the most critical metrics we need to understand.
Let’s start with the definition from Wikipedia:
Precision is the fraction of relevant instances among the retrieved instances, while recall is the fraction of relevant instances that were retrieved.
Using this definition, we can break down what precision is by looking at two components:
- Relevant instances: These are all patients the model predicted as positive and actually ended up in the hospital. We call these “true positive” patients.
- Retrieved instances: These are all the patients the model predicted would end up in the hospital.
The precision is the number of relevant instances divided by retrieved instances. We want to express how “precise” the model was: out of every patient the model predicted would end up in the hospital, how many were actually hospitalized. Looking at the available choices, the first is the correct definition of precision.
Let’s do the same with recall. We need two components:
- Relevant instances: These are all patients the model predicted as positive and actually ended up in the hospital. We call these “true positive” patients.
- Positive instances: The number of patients that ended up in the hospital.
The recall is the number of relevant instances divided by the number of positive instances. In other words, we want to express the capacity of the model to find positive patients: out of every patient that ended up in the hospital, how many did the model retrieve. Looking at the choices, the fourth is the correct definition of recall.
In summary, the first and fourth choices are the correct answer to this question.
Recommended reading