Evaluating logistic regression

Published October 27, 2022

Eva is working on a logistic regression model.

She works for an auto manufacturer that wants to predict whether previous customers are potential future buyers.

Eva has a well-balanced dataset and is ready to evaluate her model after a few iterations.

Which of the following are viable ways for Eva to evaluate her model?

1. Eva could evaluate the model by computing its accuracy.
2. Eva could evaluate the model by computing its log loss.
3. Eva could evaluate the model using the Mean Squared Error.
4. Eva could evaluate the model using the Mean Absolute Error.

Answer

1, 2

Eva is building a logistic regression model to solve a binary classification problem. She is only concerned about two classes: potential buyers and non-potential buyers.

Eva could compute the accuracy of her model and use it to determine how good it is. Accuracy can be misleading on imbalanced datasets, but Eva's dataset is well balanced, so that is not a concern here.
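As a rough illustration of the accuracy computation, here is a minimal sketch in plain Python. The labels, the predicted probabilities, and the 0.5 decision threshold are made-up assumptions for the example, not values from Eva's model.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    # Turn each predicted probability into a hard 0/1 class prediction.
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical data: 1 = potential buyer, 0 = non-potential buyer.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.7, 0.8, 0.3]

print(accuracy(y_true, y_prob))  # 0.75 (6 of the 8 predictions are correct)
```

Note that the example labels are split evenly between the two classes, which mirrors the balanced setting where accuracy is a reasonable summary.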

Eva could also compute the log loss of her model to evaluate it. Log loss measures how close the probabilities predicted by the model are to the actual labels. The closer they are, the lower the log loss.
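The relationship between predicted probabilities and log loss can be sketched with the standard binary cross-entropy formula. The two sets of predictions below are hypothetical, and the `eps` clipping is an assumption added only to avoid taking the logarithm of zero.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Average binary cross-entropy between labels and predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        # Clip the probability so that log(0) never occurs.
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical labels and two sets of predictions for the same examples.
y_true = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]    # probabilities close to the labels
hesitant = [0.6, 0.4, 0.55, 0.45]   # probabilities close to 0.5

print(log_loss(y_true, confident))
print(log_loss(y_true, hesitant))
```

Both sets of predictions would classify every example correctly at a 0.5 threshold, so their accuracy is identical, yet the first set yields the lower log loss because its probabilities sit closer to the true labels.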

Finally, Eva cannot use the Mean Squared Error (MSE) or the Mean Absolute Error (MAE) to evaluate her model. These metrics are meant for regression problems, where the model's output is a continuous value. Eva is working on a classification problem, where the model's output is a probability bounded between 0 and 1.

Recommended reading

Check out “Logistic Regression for Machine Learning” for an introduction to logistic regression.
“Intuition behind Log-loss score” is a great explanation of log loss and how it works.