MULTI-INDEXED DATAFRAME COLUMNS

Up to now we’ve been using multi-level indexing for rows, but it also works with columns. In this part we’ll create a four-dimensional DataFrame with two levels of indexing for the rows and two levels of indexing for the columns.

The DataFrame represents the total sales and purchases of three companies in 2017, 2018 and 2019, split into two six-month periods per year. The example uses randomly generated mock integer data, so the values you get will differ from the ones shown here. Have a look:

In [2]:

import numpy as np
import pandas as pd

# Here's the MultiIndex for the rows.
rows = pd.MultiIndex.from_product([[2017, 2018, 2019], ['Jan-Jun', 'Jul-Dec']],
                                  names=['year', 'period'])

# Here's the MultiIndex for the columns.
columns = pd.MultiIndex.from_product([['Company A', 'Company B', 'Company C'],
                                      ['sales', 'purchases']],
                                     names=['company', 'total value'])

# some mock data
data = np.random.randint(20000, 100000, (6, 6))

# and the DataFrame itself
a = pd.DataFrame(data, index=rows, columns=columns)
a

Out[2]:

	company	Company A		Company B		Company C
	total value	sales	purchases	sales	purchases	sales	purchases
year	period
2017	Jan-Jun	37881	57154	31772	42543	76570	50100
2017	Jul-Dec	76179	91352	63050	81077	37992	78935
2018	Jan-Jun	64322	28778	21564	96446	78386	97677
2018	Jul-Dec	29510	41779	54760	58096	88186	56180
2019	Jan-Jun	92856	47639	78972	30000	44988	54410
2019	Jul-Dec	56761	31147	94927	33529	37692	89675
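Because the mock data is random, every run produces a different table. If you want repeatable numbers, one option is to draw them from a seeded generator. The sketch below rebuilds the same DataFrame that way and then prints its shape and level names, which is a quick way to check that both axes are multi-indexed. The seed value 0 and the name rng are assumptions made for this illustration only and are not part of the original example.

```python
import numpy as np
import pandas as pd

# The same two MultiIndex objects as in the example above.
rows = pd.MultiIndex.from_product([[2017, 2018, 2019], ['Jan-Jun', 'Jul-Dec']],
                                  names=['year', 'period'])
columns = pd.MultiIndex.from_product([['Company A', 'Company B', 'Company C'],
                                      ['sales', 'purchases']],
                                     names=['company', 'total value'])

# Assumption: the seed 0 is arbitrary; any fixed seed gives repeatable mock data.
rng = np.random.default_rng(0)
data = rng.integers(20000, 100000, size=(6, 6))

a = pd.DataFrame(data, index=rows, columns=columns)

# Inspect the structure: 6 rows x 6 columns, two levels on each axis.
print(a.shape)
print(list(a.index.names), a.index.nlevels)
print(list(a.columns.names), a.columns.nlevels)
```

The four "dimensions" mentioned earlier are simply the two row levels (year, period) plus the two column levels (company, total value).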

You can now index the DataFrame to access the data that you need. Here are some examples:

In [3]:

# sales and purchases for Company B
a['Company B']

Out[3]:

	total value	sales	purchases
year	period
2017	Jan-Jun	31772	42543
2017	Jul-Dec	63050	81077
2018	Jan-Jun	21564	96446
2018	Jul-Dec	54760	58096
2019	Jan-Jun	78972	30000
2019	Jul-Dec	94927	33529

In [5]:

# just the sales for Company C
a['Company C']['sales']

Out[5]:

year  period 
2017  Jan-Jun    76570
      Jul-Dec    37992
2018  Jan-Jun    78386
      Jul-Dec    88186
2019  Jan-Jun    44988
      Jul-Dec    37692
Name: sales, dtype: int32
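The chained form a['Company C']['sales'] first selects a sub-DataFrame and then a column from it. As a hedged alternative, the sketch below shows a few other ways to reach the same kind of data in one step: a tuple key for a single column, xs to take one inner-level label across all companies, and loc with one tuple per axis for a single cell. The DataFrame is rebuilt here so the snippet runs on its own; the seeded generator is an assumption for repeatability and not part of the original example.

```python
import numpy as np
import pandas as pd

rows = pd.MultiIndex.from_product([[2017, 2018, 2019], ['Jan-Jun', 'Jul-Dec']],
                                  names=['year', 'period'])
columns = pd.MultiIndex.from_product([['Company A', 'Company B', 'Company C'],
                                      ['sales', 'purchases']],
                                     names=['company', 'total value'])

# Assumption: a fixed seed (0) is used only to make this sketch repeatable.
data = np.random.default_rng(0).integers(20000, 100000, size=(6, 6))
a = pd.DataFrame(data, index=rows, columns=columns)

# One column addressed by a (company, total value) tuple.
# Same values as a['Company C']['sales'], but the Series is named by the tuple.
c_sales = a[('Company C', 'sales')]
print(c_sales)

# The sales columns of all three companies: pick 'sales' on the inner column level.
all_sales = a.xs('sales', axis=1, level='total value')
print(all_sales)

# A single cell: one tuple for the row, one tuple for the column.
value = a.loc[(2018, 'Jan-Jun'), ('Company B', 'sales')]
print(value)
```

The tuple form does the lookup in a single step, which also makes it the safer choice if you later want to assign to the selection rather than just read it.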