Python/pandas convert string column to date

Often with Python and Pandas you import data from outside - CSV, JSON etc - and the data format could be different from the one you expect. For example dates and numbers can come as strings. This cause problems when you need to group and sort by this values stored as strings instead of a their correct type. Fortunately pandas offers quick and easy way of converting dataframe columns. In this article we can see how date stored as a string is converted to pandas date.

You can see previous posts about pandas here:

Below is the code example which is used for this conversion:

df['Date'] = pd.to_datetime(df['Date'])

or you can see/use only the converted date of a single column by:

print(pd.to_datetime(df['Date']))

import pandas as pd

df = pd.DataFrame({'Company': ['Samsung', 'Samsung', 'Samsung', 'Samsung', 'Samsung',
                               'LG', 'LG', 'LG', 'LG', 'LG', 'Sony', 'Sony', 'Sony',
                               'Sony', 'Sony'],
                   'Date': ['10/9/2018', '10/9/2018', '10/9/2018', '10/10/2018',
                            '10/10/2018', '10/10/2018', '10/9/2018', '10/9/2018',
                            '10/9/2018', '10/10/2018', '10/10/2018', '10/10/2018',
                            '10/10/2018', '10/10/2018', '10/10/2018'],
                   'Country': ['India', 'India', 'USA', 'France', 'India', 'India',
                               'Germany', 'USA', 'Brazil', 'Brazil', 'India', 'Germany',
                               'India', 'India', 'Brazil'],
                   'Sells': [15, 81, 29, 33, 21, 42, 67, 35, 2, 34, 21, 50, 10, 26, 53]})


print(df.groupby(['Country',  'Date']).sum().groupby(level=[0]).cumsum())

print(pd.to_datetime(df['Date']))
df['Date'] = pd.to_datetime(df['Date'])

print(df.groupby(['Country', 'Date']).sum().groupby(level=[0]).cumsum())

print(df.groupby(['Country',  'Date']).sum().groupby(level=[0]).groups)

result:

before convert of Date column

		Sells
Country	Date
Brazil	10/10/2018	87
	10/9/2018	89
France	10/10/2018	33
Germany	10/10/2018	50
	10/9/2018	117
India	10/10/2018	120
	10/9/2018	216
USA	10/9/2018	64

after convert of Date column

Country	Date
Brazil	10/9/2018	2
	10/10/2018	89
France	10/10/2018	33
Germany	10/9/2018	67
	10/10/2018	117
India	10/9/2018	96
	10/10/2018	216
USA	10/9/2018	64

Now the column date is properly formatted and the sum is based on sorted data in ascending order. Seeing the converted dates only:

0    2018-10-09
1    2018-10-09
2    2018-10-09
3    2018-10-10
4    2018-10-10
5    2018-10-10
6    2018-10-09
7    2018-10-09
8    2018-10-09
9    2018-10-10
10   2018-10-10
11   2018-10-10
12   2018-10-10
13   2018-10-10
14   2018-10-10

If you want to see only the groups than you can use the last line of code which product:

{'USA': MultiIndex(levels=[['Brazil', 'France', 'Germany', 'India', 'USA'], [2018-10-09 00:00:00, 2018-10-10 00:00:00]],
           labels=[[4], [0]],
           names=['Country', 'Date']), 'Brazil': MultiIndex(levels=[['Brazil', 'France', 'Germany', 'India', 'USA'], [2018-10-09 00:00:00, 2018-10-10 00:00:00]],
           labels=[[0, 0], [0, 1]],
           names=['Country', 'Date']), 'France': MultiIndex(levels=[['Brazil', 'France', 'Germany', 'India', 'USA'], [2018-10-09 00:00:00, 2018-10-10 00:00:00]],
           labels=[[1], [1]],
           names=['Country', 'Date']), 'Germany': MultiIndex(levels=[['Brazil', 'France', 'Germany', 'India', 'USA'], [2018-10-09 00:00:00, 2018-10-10 00:00:00]],
           labels=[[2, 2], [0, 1]],
           names=['Country', 'Date']), 'India': MultiIndex(levels=[['Brazil', 'France', 'Germany', 'India', 'USA'], [2018-10-09 00:00:00, 2018-10-10 00:00:00]],
           labels=[[3, 3], [0, 1]],
           names=['Country', 'Date'])}

> Python Basics

> Advanced Tutorials

> Python Errors

> Pandas Advanced

> Pandas Count

> Pandas Column

> Pandas Basics

> Pandas DataFrame

> Pandas Row

> User Interface

> Advanced Linux

> Troubleshoot

> Video & Sound

> Linux Commands

> MySQL

> SQL Basics

> Python

> DB apps

> JupyterLab

> Jupyter Tips

> Jupyter Display

> Regex in Text Editor

> Regex Basics

> Regex Match

> Regex Date

> PyCharm Advanced

> Git and PyCharm

> PyCharm Error

> PyCharm Tips

> Linux Mint Applications

> VIrtual Machine

> Miscellaneous

> Java

> Automation

> Windows

> Office

> Cheat Sheet