What makes a good regression model
- When you’re evaluating a linear regression model (OLS or a variant), you’re looking for a good R^2 score and normally distributed residuals. This means you should check your R^2, but also test whether the residuals of your model are normally distributed, using tests such as the Shapiro-Wilk test or the Anderson-Darling test.
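As a minimal sketch of that residual check, assuming SciPy and NumPy are available (the data here is synthetic and purely illustrative), you can fit a line and run both normality tests on the residuals:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical data: a linear relationship with Gaussian noise.
x = rng.uniform(0, 10, 200)
y = 3.0 * x + 2.0 + rng.normal(0, 1.5, 200)

# Fit an OLS line and compute residuals.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

# Shapiro-Wilk: the null hypothesis is that the residuals are normal,
# so a large p-value means no evidence against normality.
stat, p_value = stats.shapiro(residuals)
print(f"Shapiro-Wilk p-value: {p_value:.3f}")

# Anderson-Darling reports a statistic to compare against critical values
# (index 2 corresponds to the 5% significance level).
result = stats.anderson(residuals, dist="norm")
print(f"A-D statistic: {result.statistic:.3f}, "
      f"5% critical value: {result.critical_values[2]:.3f}")
```

If the Shapiro-Wilk p-value is small (or the Anderson-Darling statistic exceeds its critical value), the normality assumption is suspect and the model may be misspecified.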
- If you’re dealing with a logistic regression model, you would want high precision and recall scores, and a high F1 score. This ensures that the extent of misclassification (false positives and false negatives) is low.
- The same criteria you use to evaluate logistic regression models (as above) apply to classification models built with other techniques as well; logistic regression is essentially a linear model, but applied to a supervised classification problem with labeled training data.
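These classification metrics can be computed directly; here is a minimal sketch assuming scikit-learn, with a synthetic dataset standing in for real labeled data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Hypothetical binary classification data.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
pred = model.predict(X_test)

# Precision: of the predicted positives, how many were right.
# Recall: of the actual positives, how many were found.
prec = precision_score(y_test, pred)
rec = recall_score(y_test, pred)
f1 = f1_score(y_test, pred)
print(f"precision={prec:.3f} recall={rec:.3f} F1={f1:.3f}")
```

The same three calls work unchanged for any classifier that produces hard labels, which is why the criteria carry over from logistic regression to other techniques.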
In practice, two other considerations play key roles.
- Reducible and irreducible error should be understood. Reducible error (bias and variance) is what model choice can control, and metrics such as R^2 help you drive it down; irreducible error is the noise inherent in the data, which no model can remove. Choosing the right order for your regression model matters here: a polynomial regression model with higher-order features can reduce bias on non-linear data, but it also runs the risk of overfitting.
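The bias-variance trade-off behind choosing the model order can be seen with cross-validation. A minimal sketch, assuming scikit-learn and synthetic data whose true signal is quadratic: an underfit degree-1 model scores poorly, degree 2 matches the signal, and much higher degrees start to overfit:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, 120).reshape(-1, 1)
y = 0.5 * x.ravel() ** 2 + rng.normal(0, 0.5, 120)  # true signal is quadratic

# Mean cross-validated R^2 for increasing polynomial degree.
scores = {}
for degree in (1, 2, 8, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    scores[degree] = cross_val_score(model, x, y, cv=5, scoring="r2").mean()
    print(f"degree {degree:2d}: mean CV R^2 = {scores[degree]:.3f}")
```

Cross-validated (rather than in-sample) R^2 is what exposes the overfitting: an over-parameterised polynomial can fit the training folds almost perfectly while generalising worse than the right-order model.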
- Feature selection and feature engineering should be done well. These often make the difference between good models and really good ones. Good feature engineering also draws on domain knowledge, in addition to what various transformation functions or encoders can do.
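As a small illustration of the transformation-and-encoder side of this, here is a hedged sketch assuming pandas and NumPy, with made-up income and city columns: a domain-informed log transform tames a skewed numeric feature, and one-hot encoding makes a categorical feature usable in a linear model:

```python
import numpy as np
import pandas as pd

# Hypothetical raw data: skewed income plus a categorical city column.
df = pd.DataFrame({
    "income": [30_000, 45_000, 1_200_000, 60_000],
    "city": ["Pune", "Mumbai", "Mumbai", "Delhi"],
})

# Domain-informed transform: log-scale the heavily skewed income
# (log1p handles zeros gracefully).
df["log_income"] = np.log1p(df["income"])

# One-hot encode the categorical feature for use in a linear model.
encoded = pd.get_dummies(df, columns=["city"])
print(encoded.columns.tolist())
```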
Linear Regression (for predicting continuous variables)
- R squared should be high (the definition of “high” varies from domain to domain and depends on the predictive power in the data)
- R squared and adjusted R squared should be close to each other (a large gap implies too many variables, and you might want to drop a few to keep only the important ones)
- Residuals (errors) should be normally distributed
- There should be one or more coefficients with significant p-values (the probability of a variable’s coefficient being zero must be very low, i.e. less than the alpha cut-off)
Logistic Regression (for predicting categorical variables)
- Precision, recall and F1 score must be high
- Although the literature says both must be high, once you understand these terms you will appreciate that this is more of a trade-off. A good model keeps both high, but the choice of probability threshold is also key to determining these metrics
- So once you have built a model whose accuracy you cannot increase further, experiment with the probability cut-off to determine what precision-recall balance makes sense for your problem
- F1 score is just the harmonic mean of precision and recall
- Rank ordering should be maintained (when observations are grouped into deciles by predicted probability, the observed event rate should fall steadily from the top decile to the bottom)
- There should be a significant “lift” in the ROC curve (the curve should bow well above the diagonal, i.e. the AUC should be clearly above 0.5)
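The probability cut-off experiment described above can be sketched as follows, assuming scikit-learn and synthetic data; sweeping the threshold makes the precision-recall trade-off concrete:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_tr, y_tr)
proba = model.predict_proba(X_te)[:, 1]

# Sweep the probability cut-off: raising it predicts fewer positives,
# which trades recall away in exchange for (usually) higher precision.
results = {}
for threshold in (0.3, 0.5, 0.7):
    pred = (proba >= threshold).astype(int)
    p = precision_score(y_te, pred)
    r = recall_score(y_te, pred)
    results[threshold] = (p, r)
    print(f"threshold {threshold}: precision={p:.2f} recall={r:.2f}")
```

Which row of this table you pick is a business decision: a fraud screen might accept lower precision for higher recall, while a spam filter usually wants the reverse.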
These are some of the methods that will “tell” you how your model is doing. Fixing the problems they reveal is a very data-specific job and a different ball game altogether.