Spread the love

Learn how to make beautiful GUI apps

Comprehensive, for Kivy beginners, easy to follow.

Get the book here (PDF) or on Amazon:

ebook / paperback (black and white) / paperback (full color)

DATA AGGREGATION ON MULTIINDICES

We can aggregate hierarchically indexed data using common aggragation methods like mean, sum, min or max. Let’s demonstrate it on the following multi-indexed DataFrame that we used in one of the preceding articles in the series:

In [1]:

import numpy as np
import pandas as pd

rows = pd.MultiIndex.from_product([[2017, 2018, 2019], ['Jan-Jun', 'Jul-Dec']],
                                  names=['year', 'period'])

columns = pd.MultiIndex.from_product([['Company A', 'Company B', 'Company C'], ['sales', 'purchases']],
                                  names=['company', 'total value'])

data = np.random.randint(20000, 100000, (6, 6))
a = pd.DataFrame(data, index=rows, columns=columns)
a

Out[1]:

	company	Company A		Company B		Company C
	total value	sales	purchases	sales	purchases	sales	purchases
year	period
2017	Jan-Jun	93991	83742	53339	64714	87043	56829
2017	Jul-Dec	89819	88190	52651	62935	81435	84043
2018	Jan-Jun	40709	46692	37939	33882	93266	57254
2018	Jul-Dec	41856	52878	99839	64401	65748	40280
2019	Jan-Jun	86673	43148	70669	81171	88066	97462
2019	Jul-Dec	21522	24968	49673	49746	84734	45877

Here are some examples. First let’s find the minimum sales and purchases for each year:

In [2]:

a_min = a.min(level='year')
a_min

Out[2]:

company	Company A		Company B		Company C
total value	sales	purchases	sales	purchases	sales	purchases
year
2017	89819	83742	52651	62935	81435	56829
2018	40709	46692	37939	33882	65748	40280
2019	21522	24968	49673	49746	84734	45877

And now the maximum purchases and sales:

In [3]:

a_max = a.max(level='year')
a_max

Out[3]:

company	Company A		Company B		Company C
total value	sales	purchases	sales	purchases	sales	purchases
year
2017	93991	88190	53339	64714	87043	84043
2018	41856	52878	99839	64401	93266	57254
2019	86673	43148	70669	81171	88066	97462

Next, let’s find the mean purchases and sales for each year:

In [4]:

a_mean = a.mean(level='year')
a_mean

Out[4]:

company	Company A		Company B		Company C
total value	sales	purchases	sales	purchases	sales	purchases
year
2017	91905.0	85966.0	52995.0	63824.5	84239.0	70436.0
2018	41282.5	49785.0	68889.0	49141.5	79507.0	48767.0
2019	54097.5	34058.0	60171.0	65458.5	86400.0	71669.5

And finally, let’s sum the purchases and sales for each year:

In [5]:

a_sum = a.sum(level='year')
a_sum

Out[5]:

company	Company A		Company B		Company C
total value	sales	purchases	sales	purchases	sales	purchases
year
2017	183810	171932	105990	127649	168478	140872
2018	82565	99570	137778	98283	159014	97534
2019	108195	68116	120342	130917	172800	143339

You can also use the axis keyword if you want to use the methods on the columns too. For example, let’s find the minimum sales and purchases among all the three companies:

In [8]:

a_all_min = a_min.min(axis=1, level='total value')
a_all_min

Out[8]:

total value	sales	purchases
year
2017	52651	56829
2018	37939	33882
2019	21522	24968

Your Panda3D Magazine

Make Awesome Games and Other 3D Apps

with Panda3D and Blender using Python.

Cool stuff, easy to follow articles.

Get the magazine here (PDF).

Python Jumpstart Course

Learn the basics of Python, including OOP.

with lots of exercises, easy to follow

The course is available on Udemy.

Blender Jumpstart Course

Learn the basics of 3D modeling in Blender.

step-by-step, easy to follow, visually rich

The course is available on Udemy and on Skillshare.

Spread the love

pandas Part 23 – Data Aggragation on MultiIndices

Like this:

Leave a ReplyCancel reply

pandas Part 23 – Data Aggragation on MultiIndices

Share this:

Like this:

Leave a ReplyCancel reply