Pandas Drop Duplicate Rows - drop_duplicates() function Examples

Pandas drop_duplicates() Function Syntax

The Pandas drop_duplicates() function removes duplicate rows from a DataFrame. It accepts the following parameters:

  • subset: column label or sequence of labels to consider for identifying duplicate rows. By default, all the columns are used to find the duplicate rows.
  • keep: allowed values are {‘first’, ‘last’, False}, default ‘first’. If ‘first’, all duplicate rows except the first occurrence are deleted. If ‘last’, all duplicate rows except the last occurrence are deleted. If False, every row that has a duplicate is deleted.
  • inplace: if True, the source DataFrame is changed and None is returned. By default, source DataFrame remains unchanged and a new DataFrame instance is returned.
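
To make the parameters above concrete, here is a minimal sketch that writes out the default values explicitly and compares the result with a bare drop_duplicates() call. The sample data and the variable names (source_df, explicit, implicit) are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical sample data: rows 0 and 1 are identical in every column.
source_df = pd.DataFrame({'A': [1, 1, 2], 'B': [2, 2, 3]})

# Defaults written out: compare all columns, keep the first occurrence,
# and return a new DataFrame instead of modifying source_df.
explicit = source_df.drop_duplicates(subset=None, keep='first', inplace=False)
implicit = source_df.drop_duplicates()

print('Same result:', explicit.equals(implicit))
print('Rows before:', len(source_df), 'Rows after:', len(explicit))
```

Both calls should give the same DataFrame, and source_df keeps all of its rows because inplace is False.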

Pandas Drop Duplicate Rows Examples

Let’s look into some examples of dropping duplicate rows from a DataFrame object.

1. Drop Duplicate Rows Keeping the First One

This is the default behavior when no arguments are passed.


The source DataFrame rows 0 and 1 are duplicates. The first occurrence is kept and the rest of the duplicates are deleted.
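
The behavior described above can be sketched as follows. The sample data is an assumption chosen so that rows 0 and 1 are identical; the names d1, source_df and result_df are illustrative.

```python
import pandas as pd

# Assumed sample data: rows 0 and 1 are identical in all columns.
d1 = {'A': [1, 1, 1, 2], 'B': [2, 2, 2, 3], 'C': [3, 3, 4, 5]}
source_df = pd.DataFrame(d1)
print('Source DataFrame:\n', source_df)

# keep='first' is the default, so no arguments are needed.
result_df = source_df.drop_duplicates()
print('Result DataFrame:\n', result_df)
```

With this data, the rows at index 0, 2 and 3 remain in result_df, and source_df is left unchanged.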

2. Drop Duplicates and Keep Last Row


The row at index 0 is deleted, and the last duplicate row, at index 1, is kept in the output.
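
A minimal sketch of this case, assuming a small hypothetical DataFrame in which rows 0 and 1 are identical (the data and variable names are not from the original article):

```python
import pandas as pd

# Assumed sample data: rows 0 and 1 are identical in all columns.
d1 = {'A': [1, 1, 1, 2], 'B': [2, 2, 2, 3], 'C': [3, 3, 4, 5]}
source_df = pd.DataFrame(d1)
print('Source DataFrame:\n', source_df)

# keep='last' drops earlier duplicates and retains the last occurrence.
result_df = source_df.drop_duplicates(keep='last')
print('Result DataFrame:\n', result_df)
```

With this data, the rows at index 1, 2 and 3 remain in result_df.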

3. Delete All Duplicate Rows from DataFrame


Both duplicate rows, at index 0 and 1, are dropped from the result DataFrame.
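
A minimal sketch of this case, assuming a small hypothetical DataFrame in which rows 0 and 1 are identical (the data and variable names are not from the original article):

```python
import pandas as pd

# Assumed sample data: rows 0 and 1 are identical in all columns.
d1 = {'A': [1, 1, 1, 2], 'B': [2, 2, 2, 3], 'C': [3, 3, 4, 5]}
source_df = pd.DataFrame(d1)
print('Source DataFrame:\n', source_df)

# keep=False drops every row that has a duplicate, keeping none of them.
result_df = source_df.drop_duplicates(keep=False)
print('Result DataFrame:\n', result_df)
```

With this data, only the rows at index 2 and 3 remain in result_df.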

4. Identify Duplicate Rows based on Specific Columns


The columns ‘A’ and ‘B’ are used to identify duplicate rows. Hence, rows 0, 1, and 2 are duplicates. So, rows 1 and 2 are removed from the output.
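
A minimal sketch of this case, assuming a small hypothetical DataFrame in which rows 0, 1 and 2 share the same values in columns A and B (the data and variable names are not from the original article):

```python
import pandas as pd

# Assumed sample data: rows 0, 1 and 2 match on A and B but not all on C.
d1 = {'A': [1, 1, 1, 2], 'B': [2, 2, 2, 3], 'C': [3, 3, 4, 5]}
source_df = pd.DataFrame(d1)
print('Source DataFrame:\n', source_df)

# Only columns A and B are compared; column C is ignored when matching.
result_df = source_df.drop_duplicates(subset=['A', 'B'])
print('Result DataFrame:\n', result_df)
```

With this data, the rows at index 0 and 3 remain in result_df. Row 2 is removed even though its C value differs, because C is not part of the subset.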

5. Remove Duplicate Rows in place
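
A minimal sketch of in-place removal, assuming a small hypothetical DataFrame in which rows 0 and 1 are identical. The data and the variable names (source_df, returned) are illustrative, not taken from the original article.

```python
import pandas as pd

# Assumed sample data: rows 0 and 1 are identical in all columns.
d1 = {'A': [1, 1, 1, 2], 'B': [2, 2, 2, 3], 'C': [3, 3, 4, 5]}
source_df = pd.DataFrame(d1)
print('Source DataFrame:\n', source_df)

# With inplace=True the call returns None and source_df itself is modified.
returned = source_df.drop_duplicates(inplace=True)
print('Return value:', returned)
print('Source DataFrame after dropping duplicates:\n', source_df)
```

Because the call returns None, assigning its result back to source_df would discard the data. Either use inplace=True without assignment, or assign the result of a call that leaves inplace at its default.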



By admin
