Thursday, 22 March 2018

Characteristics of a good regression model

Unknown March 22, 2018 LATEST UPDATES, LEARN,

What makes a good regression model

When you evaluate a linear regression model (OLS or another estimator), look for a good R^2 score and normally distributed residuals. Check R^2 first, then test the residuals for normality with the Shapiro-Wilk or Anderson-Darling test.
For a logistic regression model, look for high precision, high recall and a high F1 score. Together these indicate that the model makes few misclassifications (false positives and false negatives).
The same criteria apply to classification models built with other techniques. Logistic regression is essentially a linear model applied to a supervised classification problem with labeled training data.

In practice, two other considerations play key roles.

Reducible error should be low, and irreducible error should be understood. Reducible error is the part that depends on the model, and metrics such as R^2 help you choose models where it is low. Irreducible error is the noise left over even with the best possible model, so no choice of model order can remove it; only better data or more informative features can lower it. What you can check is whether your regression model is of the right order. A higher-order polynomial model reduces underfitting because it works higher-order features into the model, but it also increases the risk of overfitting.
Feature selection and feature engineering should be done well. These often make the difference between good models and really good ones. Good feature engineering also draws on domain knowledge, in addition to what transformation functions and encoders can do.
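
The trade-off around model order can be sketched on synthetic data. The example below is a minimal illustration, not a recipe: the quadratic ground truth, the noise level, the sample sizes and the helper names (fit_polynomial, mse, make_data) are all assumptions made for the demonstration. It fits polynomials of degree 1, 2 and 6 and compares the training error with the error on held-out data.

```python
import random


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest power first) via normal equations."""
    n = degree + 1
    # Build A^T A and A^T y, where A[i][j] = xs[i] ** j.
    ata = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    aty = [sum((x ** r) * y for x, y in zip(xs, ys)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, n):
            factor = ata[row][col] / ata[col][col]
            for k in range(col, n):
                ata[row][k] -= factor * ata[col][k]
            aty[row] -= factor * aty[col]
    # Back substitution.
    coefs = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(ata[row][k] * coefs[k] for k in range(row + 1, n))
        coefs[row] = (aty[row] - tail) / ata[row][row]
    return coefs


def mse(coefs, xs, ys):
    """Mean squared error of the polynomial on the given points."""
    total = 0.0
    for x, y in zip(xs, ys):
        prediction = sum(c * x ** i for i, c in enumerate(coefs))
        total += (prediction - y) ** 2
    return total / len(xs)


def make_data(n, noise_sd, rng):
    """Assumed ground truth: y = 1 + 2x - 3x^2 plus Gaussian noise."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [1.0 + 2.0 * x - 3.0 * x * x + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


rng = random.Random(0)
NOISE_SD = 0.3  # noise variance 0.09 is the irreducible part in this toy setup
train_x, train_y = make_data(30, NOISE_SD, rng)
valid_x, valid_y = make_data(200, NOISE_SD, rng)

results = {}
for degree in (1, 2, 6):
    coefs = fit_polynomial(train_x, train_y, degree)
    results[degree] = (mse(coefs, train_x, train_y), mse(coefs, valid_x, valid_y))
    print(f"degree={degree} train_mse={results[degree][0]:.3f} valid_mse={results[degree][1]:.3f}")
```

Training error can only go down as the degree rises, so it cannot tell you when to stop. The held-out error is the one to watch: it typically stops improving once the degree matches the underlying relationship, and on average it cannot fall below the noise variance, which is the irreducible part.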

Linear Regression (for predicting continuous variables)

R squared should be high (what counts as “high” varies from domain to domain and depends on how much predictive power the data holds)
R squared and adjusted R squared should be close to each other (a large gap implies that the model uses too many variables, so consider dropping a few and keeping only the important ones)
Residuals (errors) should be normally distributed
At least one coefficient should be statistically significant (its p-value, the probability of seeing an estimate at least this far from zero if the true coefficient were zero, must be below the alpha cut-off)
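
The checks in this list can be sketched for the simplest case, a single predictor. The example below is an illustration under stated assumptions: the data are synthetic, simple_ols is a hypothetical helper written for this post, and the t statistic of the slope stands in for the p-value check (a magnitude well above 2 roughly corresponds to significance at alpha = 0.05 for samples of this size).

```python
import math
import random


def simple_ols(xs, ys):
    """Fit y = intercept + slope * x and return basic diagnostics."""
    n = len(xs)
    p = 1  # number of predictors, excluding the intercept
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R^2 penalises each extra predictor.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    # Standard error of the slope and the resulting t statistic.
    slope_se = math.sqrt(ss_res / (n - p - 1) / sxx)
    return {
        "slope": slope,
        "intercept": intercept,
        "r2": r2,
        "adj_r2": adj_r2,
        "t_slope": slope / slope_se,
        "residuals": residuals,
    }


# Synthetic data (an assumption for the demo): y = 2 + 0.5x + noise.
rng = random.Random(1)
xs = [rng.uniform(0.0, 10.0) for _ in range(50)]
ys = [2.0 + 0.5 * x + rng.gauss(0.0, 1.0) for x in xs]

fit = simple_ols(xs, ys)
print(f"R^2={fit['r2']:.3f} adjusted R^2={fit['adj_r2']:.3f} t(slope)={fit['t_slope']:.1f}")
```

With one predictor the gap between R squared and adjusted R squared is small by construction; it widens as weak predictors are added. For the residual check, SciPy users can pass fit["residuals"] to scipy.stats.shapiro, which runs the Shapiro-Wilk test; it is left out here to keep the sketch dependency-free.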

Logistic Regression (for predicting categorical variables)

Precision, recall and F1 score must be high

Although the literature says both precision and recall must be high, once you understand these terms you will see that this is more of a trade-off. A good model scores high on both, but the choice of probability threshold also largely determines these metrics.
So once you have built a model whose accuracy you can’t increase any further, experiment with the probability cut-off to find the balance of precision and recall that makes sense for your problem.
F1 score is the harmonic mean of precision and recall: F1 = 2 * precision * recall / (precision + recall).
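
The effect of the probability cut-off can be seen on a handful of scored observations. The sketch below uses made-up scores and labels, and precision_recall_f1 is a hypothetical helper, not a library function; it simply recomputes the three metrics at several thresholds.

```python
def precision_recall_f1(labels, scores, threshold):
    """Precision, recall and F1 when score >= threshold is predicted positive."""
    tp = fp = fn = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 1:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up predicted probabilities and true labels (assumptions for the demo).
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.55, 0.60, 0.70, 0.80, 0.90, 0.95]
labels = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1]

sweep = {}
for threshold in (0.3, 0.5, 0.7):
    sweep[threshold] = precision_recall_f1(labels, scores, threshold)
    precision, recall, f1 = sweep[threshold]
    print(f"threshold={threshold:.1f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

On this toy data a cut-off of 0.3 catches every positive (recall 1.0) at a precision of about 0.67, while a cut-off of 0.7 makes no false alarms (precision 1.0) at a recall of about 0.67. Which end to favour depends on what a false positive and a false negative cost in your problem.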

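Rank ordering should be maintained: when observations are sorted by predicted probability and grouped into buckets, the observed event rate should fall steadily from the highest-scoring bucket to the lowest.
The ROC curve should show a significant “lift” above the diagonal, which corresponds to an AUC well above 0.5.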
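
Both checks can be sketched in a few lines. The example below is an illustration on made-up data: event_rate_by_bucket and auc are hypothetical helpers, the bucket count of 4 is arbitrary (deciles are common on real data), and the AUC is computed by directly comparing every positive with every negative, which is fine for small samples.

```python
def event_rate_by_bucket(labels, scores, n_buckets):
    """Observed event rate per bucket, from the highest-scoring bucket down."""
    ranked = [label for _, label in
              sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)]
    size = len(ranked) // n_buckets
    rates = []
    for b in range(n_buckets):
        start = b * size
        # The last bucket absorbs any remainder.
        end = (b + 1) * size if b < n_buckets - 1 else len(ranked)
        bucket = ranked[start:end]
        rates.append(sum(bucket) / len(bucket))
    return rates


def auc(labels, scores):
    """Probability that a random positive outscores a random negative."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Made-up predicted probabilities and true labels (assumptions for the demo).
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.55, 0.60, 0.70, 0.80, 0.90, 0.95]
labels = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1]

rates = event_rate_by_bucket(labels, scores, n_buckets=4)
is_rank_ordered = all(a >= b for a, b in zip(rates, rates[1:]))
model_auc = auc(labels, scores)
print("event rate by bucket:", [round(r, 3) for r in rates])
print("rank ordering holds:", is_rank_ordered)
print(f"AUC={model_auc:.3f}")
```

Here the event rate falls from 1.0 in the top bucket to 0.0 in the bottom one, so rank ordering holds, and the AUC of about 0.889 sits well above the 0.5 of a random ranking.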

These are some of the checks that will “tell” you how your model is doing. Fixing the problems they reveal is a very data-specific job and a different ball game altogether.

NATIONAL ASSOCIATION OF STATISTICS STUDENTS OF NIGERIA FPN CHAPTER
