How To Find Outliers in Data Using Python

…And How To Handle Them

Eric Kleppen
11 min read · Feb 21, 2022
Photo by Will Myers on Unsplash

This article was originally written for CareerFoundry.com. Check them out if you’re interested in becoming a UX designer, UI designer, web developer, or data analyst from scratch.

Outliers are important

Identifying and dealing with outliers can be tough, but it is an essential part of the data analytics process and of feature engineering for machine learning. So how do we find outliers? Luckily, several methods for identifying them take only a few lines of Python. Before diving into those methods, let’s first review the definition of an outlier and load a dataset. By the end of the article, you will not only have a better understanding of how to find outliers, but also know how to work with them when preparing your data for machine learning.

We’ll cover all of this using the following headings:

  1. What is an outlier?
  2. How do you find outliers in your dataset?
  3. Finding outliers using statistical methods
  4. Working with outliers using statistical methods
  5. Wrapping up and next steps

What is an outlier?
