A little closer to Cook’s distance

Source: Analytics Vidhya

Cook’s distance and outliers? How can they relate to each other?

Formula:
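
For reference, the usual textbook definition of Cook's distance for observation i is sketched below. The notation is an assumption, since this text defines none: n is the number of observations, p the number of fitted coefficients (intercept included), s² the mean squared error of the full fit, e_i the residual, h_ii the leverage, and ŷ_j(i) the fitted value for observation j when observation i is left out.

```latex
D_i \;=\; \frac{\sum_{j=1}^{n} \left( \hat{y}_j - \hat{y}_{j(i)} \right)^2}{p \, s^2}
    \;=\; \frac{e_i^2}{p \, s^2} \cdot \frac{h_{ii}}{\left( 1 - h_{ii} \right)^2},
\qquad
s^2 = \frac{1}{n - p} \sum_{j=1}^{n} e_j^2
```

The second form shows why a point needs both a sizeable residual and high leverage to score a large distance.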

The cut-off values — controversial:

  1. If a data point has a Cook’s distance of more than three times the mean Cook’s distance of all observations, it is a possible outlier
  2. Any point with a Cook’s distance over 4/n, where n is the number of observations, should be examined
  3. Find the potential outlier’s percentile value using the F-distribution. A percentile of over 50 indicates a highly influential point
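
The three rules of thumb above could be applied to an array of Cook's distances roughly as follows. This is a minimal sketch, not the article's own code: the function name `flag_influential`, the sample distances, and the choice of an F-distribution with p and n - p degrees of freedom (the conventional reference distribution for Cook's D) are all assumptions made for illustration.

```python
import numpy as np
from scipy import stats


def flag_influential(cooks_d, n_params):
    """Apply three common Cook's distance cut-offs.

    cooks_d  : 1-D array of Cook's distances, one per observation
    n_params : number of fitted coefficients p, intercept included
    """
    cooks_d = np.asarray(cooks_d, dtype=float)
    n = cooks_d.size

    # Rule 1: more than three times the mean Cook's distance
    over_3x_mean = cooks_d > 3 * cooks_d.mean()

    # Rule 2: more than 4/n
    over_4_over_n = cooks_d > 4 / n

    # Rule 3: percentile of each distance under F(p, n - p); above 50 is flagged
    percentile = 100 * stats.f.cdf(cooks_d, n_params, n - n_params)
    over_f_median = percentile > 50

    return over_3x_mean, over_4_over_n, over_f_median, percentile


# Hypothetical distances for ten observations from a two-coefficient model
d = np.array([0.01, 0.02, 0.005, 0.03, 0.015, 0.02, 0.01, 0.025, 0.6, 1.5])
r1, r2, r3, pct = flag_influential(d, n_params=2)
print(np.flatnonzero(r1), np.flatnonzero(r2), np.flatnonzero(r3))
```

The rules do not have to agree with each other, which is part of why the cut-offs are debated; treating the flags as prompts for a closer look rather than as automatic deletions is the safer reading.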

When to use Cook’s D

  1. When you suspect influence problems
  2. When graphical displays may not be adequate
  3. When performing a least-squares regression analysis
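
import statsmodels.formula.api as smf

# df is the DataFrame with the price and bedrooms columns
model = smf.ols(formula='price~bedrooms', data=df).fit()
infl = model.get_influence()  # influence diagnostics for the fitted model
sm_fr = infl.summary_frame()  # one row per observation, including cooks_d
sm_fr.head(10)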
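
To see what the `cooks_d` column of the summary frame measures, the same quantity can be reproduced by hand for a one-predictor regression. The sketch below uses only NumPy and made-up data; the function name `cooks_distance` and the sample values are hypothetical and are not taken from the housing data.

```python
import numpy as np


def cooks_distance(x, y):
    """Cook's distance for a simple least-squares regression of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept
    n, p = X.shape

    # Leverage: diagonal of the hat matrix H = X (X'X)^-1 X'
    hat = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)

    # Ordinary least-squares fit, residuals, and mean squared error
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    mse = resid @ resid / (n - p)

    # Squared residual scaled by p * MSE, weighted by the leverage term
    return resid**2 / (p * mse) * hat / (1 - hat) ** 2


# Hypothetical data: five points near a line plus one extreme x value
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1, 8.0])
d = cooks_distance(x, y)
print(np.round(d, 3))
print("most influential index:", int(np.argmax(d)))
```

Computing the distance from the leverage and the residuals avoids refitting the model n times, which is how the leave-one-out definition would otherwise be evaluated.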

How to interpret Cook’s Distance plots

df.bedrooms.max()     # output: 33
df.bedrooms.idxmax()  # output: 15856
df.drop(df.loc[df['bedrooms']==33].index, inplace=True)  # drop the row where bedrooms == 33
f = 'price~bedrooms'
model = smf.ols(formula=f, data=df).fit()
model.summary()  # running the OLS model again
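
The effect of dropping a single influential row can be illustrated on a toy dataset. This is a sketch with invented numbers, not the housing data: it fits a straight line with and without the most extreme x value and compares the slopes.

```python
import numpy as np

# Hypothetical data: five points near a line plus one extreme x value
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1, 8.0])

# Least-squares line through all six points
slope_all, intercept_all = np.polyfit(x, y, deg=1)

# Drop the observation with the largest x, mirroring the bedrooms == 33 row
keep = x != x.max()
slope_kept, intercept_kept = np.polyfit(x[keep], y[keep], deg=1)

print(f"slope with all points:     {slope_all:.3f}")
print(f"slope without the extreme: {slope_kept:.3f}")
```

A large shift in the fitted coefficients after removing one row is exactly what a high Cook's distance signals, so comparing the two model summaries is a useful check before deciding to drop the point for good.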

Other cases of influence statistics

International Hospitality and Tourism Management Graduate . Junior Data Scientist

Ly Nguyenova
