How to Select Rows with Head and Tail in Pandas
You can combine the Pandas methods head() and tail() to show the first and last N rows of a DataFrame at the same time. Let's review several ways to combine head() and tail() functionality.
Can we use head and tail as one Pandas function? The answer is yes! Check Step 3.
Step 1: Create a Sample DataFrame
Let's use this example DataFrame for this article:
import pandas as pd
data = {'productivity': [80, 20, 60, 30, 50, 55, 95],
        'salary': [3500, 1500, 2000, 1000, 2000, 1500, 4000],
        'age': [25, 30, 40, 35, 20, 40, 22]}
data_ix = ['Tim', 'Jim', 'Kim', 'Bim', 'Dim', 'Sim', 'Lim']
df = pd.DataFrame(data, index=data_ix)
| | productivity | salary | age |
---|---|---|---|
Tim | 80 | 3500 | 25 |
Jim | 20 | 1500 | 30 |
Kim | 60 | 2000 | 40 |
Bim | 30 | 1000 | 35 |
Dim | 50 | 2000 | 20 |
Sim | 55 | 1500 | 40 |
Lim | 95 | 4000 | 22 |
Step 2: Pandas head() and tail() together
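The simplest way to show the first and last N rows of a DataFrame in Pandas is to concatenate the results of head and tail with pd.concat (the older df.head(rows).append(df.tail(rows)) form no longer works, because DataFrame.append was removed in Pandas 2.0):
rows = 2
pd.concat([df.head(rows), df.tail(rows)])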
The result is:
| | productivity | salary | age |
---|---|---|---|
Tim | 80 | 3500 | 25 |
Jim | 20 | 1500 | 30 |
Sim | 55 | 1500 | 40 |
Lim | 95 | 4000 | 22 |
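If this pattern is needed repeatedly, it can be wrapped in a small helper. The sketch below uses a hypothetical function name, head_tail, which is not part of Pandas. It assumes n is non-negative and returns the whole frame when the two slices would overlap, so that no row is shown twice.

```python
import pandas as pd

data = {'productivity': [80, 20, 60, 30, 50, 55, 95],
        'salary': [3500, 1500, 2000, 1000, 2000, 1500, 4000],
        'age': [25, 30, 40, 35, 20, 40, 22]}
df = pd.DataFrame(data, index=['Tim', 'Jim', 'Kim', 'Bim', 'Dim', 'Sim', 'Lim'])


def head_tail(frame, n=2):
    """Hypothetical helper (not a Pandas API): first and last n rows, n >= 0."""
    # If the frame has at most 2 * n rows, head and tail would overlap,
    # so return the frame unchanged instead of repeating rows.
    if len(frame) <= 2 * n:
        return frame
    return pd.concat([frame.head(n), frame.tail(n)])


preview = head_tail(df, 2)
print(preview)
```

Calling head_tail(df, 4) on the 7-row sample returns all rows, because 2 * 4 exceeds the number of rows.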
Step 3: head() and tail() with iloc
The second option is to use the indexer iloc, which can simulate a combination of head() and tail() behavior. Let's show the first 2 rows and the last one from the sample DataFrame:
import numpy as np
df.iloc[np.r_[0:2, -1:0]]
The result is:
| | productivity | salary | age |
---|---|---|---|
Tim | 80 | 3500 | 25 |
Jim | 20 | 1500 | 30 |
Lim | 95 | 4000 | 22 |
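To see why this works, it may help to look at what np.r_ produces: it expands each slice and concatenates the results into one array of integer positions, which iloc then selects. A minimal sketch reusing the sample data from Step 1; the variable names positions and last_two are only illustrative.

```python
import numpy as np
import pandas as pd

data = {'productivity': [80, 20, 60, 30, 50, 55, 95],
        'salary': [3500, 1500, 2000, 1000, 2000, 1500, 4000],
        'age': [25, 30, 40, 35, 20, 40, 22]}
df = pd.DataFrame(data, index=['Tim', 'Jim', 'Kim', 'Bim', 'Dim', 'Sim', 'Lim'])

# np.r_ turns the two slices into a single array of positions.
positions = np.r_[0:2, -1:0]
print(positions.tolist())  # [0, 1, -1]

# iloc accepts that array, so negative positions count from the end.
print(df.iloc[positions])

# Widening the second slice selects the last two rows instead of one.
last_two = np.r_[0:2, -2:0]
print(df.iloc[last_two].index.tolist())  # ['Tim', 'Jim', 'Sim', 'Lim']
```

The slice -1:0 is expanded by np.r_ like a range from -1 up to (but not including) 0, so it yields only -1. Written directly as df.iloc[-1:0], the same slice would select no rows.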
Step 4: Show First and Last Rows with pd.option_context
One more way to show rows from the beginning and the end at the same time is pd.option_context. Let's say that we want to show the first and last 2 rows of a DataFrame. Then we can use 'display.max_rows', 4, which shows only 4 rows, split equally between the start and the end of the DataFrame:
with pd.option_context('display.max_rows', 4):
    print(df)
The output is:
| | productivity | salary | age |
---|---|---|---|
Tim | 80 | 3500 | 25 |
Jim | 20 | 1500 | 30 |
.. | ... | ... | ... |
Sim | 55 | 1500 | 40 |
Lim | 95 | 4000 | 22 |
[7 rows x 3 columns]
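Note: The displayed rows are split evenly between the start and the end, so this works as expected only for even values of display.max_rows.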
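Unlike the previous steps, this approach changes only how the DataFrame is displayed, not which rows it contains, and the setting applies only inside the with block. The following sketch illustrates that under the default display options; the variable names before, inside, text and after are illustrative.

```python
import pandas as pd

data = {'productivity': [80, 20, 60, 30, 50, 55, 95],
        'salary': [3500, 1500, 2000, 1000, 2000, 1500, 4000],
        'age': [25, 30, 40, 35, 20, 40, 22]}
df = pd.DataFrame(data, index=['Tim', 'Jim', 'Kim', 'Bim', 'Dim', 'Sim', 'Lim'])

before = pd.get_option('display.max_rows')

with pd.option_context('display.max_rows', 4):
    inside = pd.get_option('display.max_rows')
    # repr() builds the same text that print(df) would show.
    text = repr(df)

after = pd.get_option('display.max_rows')

print(text)
print(inside, before == after)  # the option is restored after the block
print(len(df))                  # the DataFrame itself still has all 7 rows
```

If the selected rows are needed for further processing rather than for display, the concat or iloc approaches are the ones that return a new DataFrame.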
Step 5: Pandas head(), tail() and middle
The final option described here uses the function concat and row selection by slicing. This technique, like iloc, allows showing an asymmetric combination of the first and last rows of a DataFrame:
pd.concat([df[:1], df[-2:]])
| | productivity | salary | age |
---|---|---|---|
Tim | 80 | 3500 | 25 |
Sim | 55 | 1500 | 40 |
Lim | 95 | 4000 | 22 |
Now, if we need to add a row from the middle of our DataFrame, then we can use shape and floor division to calculate the middle position and concatenate that row with the rest:
start = 1
mid = df.shape[0]//2
mid_end = mid + 1
end = 2
pd.concat([df[:start], df[mid:mid_end], df[-end:]])
The output is:
| | productivity | salary | age |
---|---|---|---|
Tim | 80 | 3500 | 25 |
Bim | 30 | 1000 | 35 |
Sim | 55 | 1500 | 40 |
Lim | 95 | 4000 | 22 |
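The same first, middle and last selection can also be written with iloc and np.r_ from Step 3. The sketch below compares both forms on the sample data; by_concat and by_iloc are illustrative names.

```python
import numpy as np
import pandas as pd

data = {'productivity': [80, 20, 60, 30, 50, 55, 95],
        'salary': [3500, 1500, 2000, 1000, 2000, 1500, 4000],
        'age': [25, 30, 40, 35, 20, 40, 22]}
df = pd.DataFrame(data, index=['Tim', 'Jim', 'Kim', 'Bim', 'Dim', 'Sim', 'Lim'])

# Floor division gives the middle position: 7 // 2 == 3, the row 'Bim'.
mid = df.shape[0] // 2

# Slicing plus concat, as in the example above.
by_concat = pd.concat([df[:1], df[mid:mid + 1], df[-2:]])

# The same positions collected into one array for iloc.
by_iloc = df.iloc[np.r_[0:1, mid:mid + 1, -2:0]]

print(by_iloc)
print(by_iloc.equals(by_concat))
```

For a DataFrame with an even number of rows, floor division picks the later of the two central rows, since positions start at 0.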
These are just a few examples that may help you display a DataFrame the way you like.
One more option is to combine head(), tail() and sample in order to get the first, the last and some random rows. Below we can see how to get the first, the last and two random rows:
pd.concat([df[:1], df.sample(2), df[-1:]])
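Because df.sample(2) draws from the whole DataFrame, it may pick the first or the last row again and produce duplicates. One possible way around that, sketched below, is to sample only from the interior rows. The random_state argument is an addition for reproducibility and is not part of the original example; middle and result are illustrative names.

```python
import pandas as pd

data = {'productivity': [80, 20, 60, 30, 50, 55, 95],
        'salary': [3500, 1500, 2000, 1000, 2000, 1500, 4000],
        'age': [25, 30, 40, 35, 20, 40, 22]}
df = pd.DataFrame(data, index=['Tim', 'Jim', 'Kim', 'Bim', 'Dim', 'Sim', 'Lim'])

# Exclude the first and last rows before sampling so they cannot repeat.
middle = df.iloc[1:-1].sample(2, random_state=0)

# First row, two random interior rows, last row.
result = pd.concat([df[:1], middle, df[-1:]])
print(result)
```

The sampled rows appear in random order; appending .sort_index() to middle would not restore the original order here, since the index holds names rather than positions.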