Python and Pandas group by and sum examples

In this article you can find two examples how to use pandas and python with functions: group by and sum. You can see the example data below. This article describes how to group by and sum by two and more columns with pandas.

  • table 1
Country Company Date Sells
0 India Samsung 10/09/18 15
1 India Samsung 10/09/18 81
2 USA Samsung 10/09/18 29
3 France Samsung 10/10/18 33
4 India Samsung 10/10/18 21
5 India LG 10/10/18 42
6 Germany LG 10/09/18 67
7 USA LG 10/09/18 35
8 Brazil LG 10/09/18 2
9 Tony LG 10/10/18 34
10 India Sony 10/10/18 21
11 Germany Sony 10/10/18 50
12 India Sony 10/10/18 10
13 India Sony 10/10/18 26
14 Brazil Sony 10/10/18 53

table 2

Company Date Sells Promo
0 Samsung 08/09/2018 11 54
1 Samsung 10/09/2018 8 56
2 Samsung 09/09/2018 43 54
3 Samsung 04/09/2018 4 4
4 Sony 08/09/2018 45 3
5 Sony 10/09/2018 534 34
6 Sony 09/09/2018 42 32
7 Sony 04/09/2018 32 15

Now lets say that we want to get information about the sells per country in the first table using Python and pandas. The next example shows how this can be done:

import pandas as pd

df = pd.DataFrame({'Company': ['Samsung', 'Samsung', 'Samsung', 'Samsung', 'Samsung',
                               'LG', 'LG', 'LG', 'LG', 'LG', 'Sony', 'Sony', 'Sony',
                               'Sony', 'Sony'],
                   'Date': ['10/9/2018', '10/9/2018', '10/9/2018', '10/10/2018',
                            '10/10/2018', '10/10/2018', '10/9/2018', '10/9/2018',
                            '10/9/2018', '10/10/2018', '10/10/2018', '10/10/2018',
                            '10/10/2018', '10/10/2018', '10/10/2018'],
                   'Country': ['India', 'India', 'USA', 'France', 'India', 'India',
                               'Germany', 'USA', 'Brazil', 'Brazil', 'India', 'Germany',
                               'India', 'India', 'Brazil'],
                   'Sells': [15, 81, 29, 33, 21, 42, 67, 35, 2, 34, 21, 50, 10, 26, 53]})

print(df.groupby(['Company', 'Country']).sum())

result:

Company Country Sells
LG Brazil 36
Germany 67
India 42
USA 35
Samsung France 33
India 117
USA 29
Sony Brazil 53
Germany 50
India 57

now lets see how to get sells in a country per company. This can be done by:

print(df.groupby(['Country', 'Company']).sum())

the result is:

Sells
Country Company
Brazil LG 36
Sony 53
France Samsung 33
Germany LG 67
Sony 50
India LG 42
Samsung 117
Sony 57
USA LG 35
Samsung 29

Now lets have a look on the second table and get sells per date:

df = pd.DataFrame({'Date': ['08/09/2018', '10/09/2018', '09/09/2018', '04/09/2018', '08/09/2018', '10/09/2018',
                            '09/09/2018', '04/09/2018'],
                   'Company': ['Samsung', 'Samsung', 'Samsung', 'Samsung', 'Sony', 'Sony', 'Sony', 'Sony'],
                   'Promotions': [54, 56, 54, 4, 3, 34, 32, 15]})

print(df)
group = df['Promotions'].groupby(df['Date']).sum()
print(group)

result:

Date
04/09/2018 19
08/09/2018 57
09/09/2018 86
10/09/2018 90
Name: Promotions, dtype: int64

You can have also new column as a total from your data:

import pandas as pd
df = pd.DataFrame({'Date': ['08/09/2018', '10/09/2018', '09/09/2018', '04/09/2018', '08/09/2018', '10/09/2018',
                            '09/09/2018', '04/09/2018'],
                   'Company': ['Samsung', 'Samsung', 'Samsung', 'Samsung', 'Sony', 'Sony', 'Sony', 'Sony'],
                   'Promotions': [54, 56, 54, 4, 3, 34, 32, 15],
                   'Sells': [23, 34, 54, 6, 96, 45, 3, 12]})

df['Total'] = df['Promotions'].groupby(df['Date']).transform('sum')
print(df)

result:

Company Date Promotions Sells Total
0 Samsung 8/9/2018 54 23 57
1 Samsung 10/9/2018 56 34 90
2 Samsung 9/9/2018 54 54 86
3 Samsung 4/9/2018 4 6 19
4 Sony 8/9/2018 3 96 57
5 Sony 10/9/2018 34 45 90
6 Sony 9/9/2018 32 3 86
7 Sony 4/9/2018 15 12 19