Pandas dropna() - Drop Null/NA Values from DataFrame Examples

1. Pandas DataFrame dropna() Function

The Pandas DataFrame dropna() function removes rows or columns that contain Null/NaN values. By default, it returns a new DataFrame and leaves the source DataFrame unchanged.

We can create null values using None, pandas.NaT, and numpy.nan.
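
As a minimal sketch of how these three markers behave, the snippet below builds a small DataFrame and asks pandas which cells it treats as missing. The column names and values are made up for illustration and are not taken from the original examples.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: each column uses a different null marker
df = pd.DataFrame({
    "Name": ["A", None, "C"],                                # None
    "Salary": [100.0, np.nan, 300.0],                        # numpy.nan
    "Joined": [pd.Timestamp("2020-01-01"), pd.NaT, pd.NaT],  # pandas.NaT
})

# isna() returns True wherever pandas considers the value missing
null_mask = df.isna()
print(null_mask)
```

All three markers are reported as missing by isna(), which is why dropna() treats them the same way.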

The dropna() function accepts the following parameters:

  • axis: possible values are {0 or ‘index’, 1 or ‘columns’}, default 0. If 0, drop rows with null values. If 1, drop columns with missing values.
  • how: possible values are {‘any’, ‘all’}, default ‘any’. If ‘any’, drop the row/column if any of the values is null. If ‘all’, drop the row/column if all the values are missing.
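  • thresh: an int value giving the minimum number of non-null values a row/column must contain to be kept. Rows/columns with fewer non-null values are dropped.
  • subset: labels along the other axis to check for null values. When dropping rows, these are column labels; when dropping columns, these are index labels.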
  • inplace: a boolean value. If True, the source DataFrame is changed and None is returned.

Let’s look at some examples of using the dropna() function.

2. Pandas Drop All Rows with any Null/NaN/NaT Values

This is the default behavior of the dropna() function.
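
A minimal sketch of the default call, assuming a small made-up DataFrame (the column names and values here are illustrative, not the original example data):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with missing values in rows 1 and 2
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "ID": [1, 2, np.nan],
    "Role": ["CEO", None, None],
})

# With no arguments, any row containing at least one null value is dropped
df1 = df.dropna()
print(df1)

# The source DataFrame is left untouched
print(len(df))
```

Only the first row survives here because it is the only one with no missing value.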

Output:

3. Drop All Columns with Any Missing Value

We can pass axis=1 to drop columns with missing values.
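
The following sketch shows the column-wise variant on made-up data; the column names are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: "ID" and "Role" each contain one null value
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "ID": [1, np.nan, 3],
    "Role": ["CEO", None, "Dev"],
})

# axis=1 switches from dropping rows to dropping columns
df1 = df.dropna(axis=1)
print(df1)
```

Only the "Name" column remains, since it is the only column without a missing value.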

Output:

4. Drop Row/Column Only if All the Values are Null
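
Passing how='all' keeps a row or column as long as it has at least one non-null value. The sketch below uses made-up data with one fully empty row and one fully empty column; the names are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: row 2 and the "Role" column are entirely null
df = pd.DataFrame({
    "Name": ["A", "B", None],
    "ID": [1, np.nan, np.nan],
    "Role": [None, None, None],
})

# Drop rows only when every value in the row is null
rows_kept = df.dropna(how="all")
print(rows_kept)

# Drop columns only when every value in the column is null
cols_kept = df.dropna(axis=1, how="all")
print(cols_kept)
```

Row 1 still has a null "ID" and "Role", but it is kept because "Name" is present.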

Output:

5. DataFrame Drop Rows/Columns when the threshold of null values is crossed
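
The thresh parameter counts non-null values, not null ones: a row is kept only if it has at least thresh non-null values. Here is a sketch with a made-up three-column DataFrame (names and values are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with 0, 1, 2 and 3 null values per row
df = pd.DataFrame({
    "Name": ["A", "B", "C", None],
    "ID": [1, 2, np.nan, np.nan],
    "Salary": [100.0, np.nan, np.nan, np.nan],
})

# Keep only rows that have at least 2 non-null values
df1 = df.dropna(thresh=2)
print(df1)
```

Because this frame has three columns, requiring 2 non-null values means a row can have at most one null value and still be kept.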

Output:

With thresh=2, a row is kept only if it has at least 2 non-null values, so rows with fewer non-null values are dropped.

6. Define Labels to look for null values
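
When dropping rows, subset takes a list of column labels, and only those columns are checked for null values. A minimal sketch on made-up data (the column names are illustrative assumptions):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: nulls in both "ID" and "Role"
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "ID": [1, np.nan, 3],
    "Role": [None, "CTO", None],
})

# Only the "ID" column is inspected; nulls in "Role" are ignored
df1 = df.dropna(subset=["ID"])
print(df1)
```

Rows 0 and 2 are kept even though their "Role" is missing, because "Role" is not part of the subset.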

Output:

We can specify the index values in the subset when dropping columns from the DataFrame.
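
A sketch of the column-wise case, assuming a made-up DataFrame with the default integer index; the data is chosen so that the "ID" column has its only missing value outside the checked rows.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: "ID" is null at index 0, "Role" is null at index 1
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "ID": [np.nan, 2, 3, 4],
    "Role": ["CEO", None, "Dev", "QA"],
})

# With axis=1, subset lists index labels: only rows 1 and 2 are inspected
df1 = df.dropna(axis=1, subset=[1, 2])
print(df1)
```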

Output:

The ‘ID’ column is not dropped because missing values are looked for only in the rows with index 1 and 2.

7. Dropping Rows with NA inplace

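We can pass inplace=True to change the source DataFrame itself, in which case the function returns None. This avoids binding the result to a new variable, although pandas may still create an intermediate copy internally, so memory savings are not guaranteed.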
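
A minimal sketch of the in-place call on made-up data (column names and values are illustrative assumptions):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with missing values in rows 1 and 2
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "ID": [1, 2, np.nan],
    "Role": ["CEO", None, None],
})

# inplace=True modifies df directly; the call itself returns None
result = df.dropna(inplace=True)
print(result)
print(df)
```

Since the return value is None, writing df = df.dropna(inplace=True) would discard the DataFrame, so the two styles should not be mixed.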

Output:

By admin
