Spread the love

Learn how to make beautiful GUI apps

Comprehensive, for Kivy beginners, easy to follow.

Get the book here (PDF) or on Amazon:

ebook / paperback (black and white) / paperback (full color)

INDEXING AND SLICING MULTIPLY INDEXED DATAFRAMES

In the previous part of the Pandas series we were talking about indexing and slicing multiply indexed Series objects. Today we’ll be talking about multiply indexed DataFrames. They behave very much like Series objects as far as indexing and slicing is concerned. Here’s the example from one of the previous parts that we’ll be working on:

In [1]:

import numpy as np
import pandas as pd

rows = pd.MultiIndex.from_product([[2017, 2018, 2019], ['Jan-Jun', 'Jul-Dec']],
                                  names=['year', 'period'])

columns = pd.MultiIndex.from_product([['Company A', 'Company B', 'Company C'], ['sales', 'purchases']],
                                  names=['company', 'total value'])

data = np.random.randint(20000, 100000, (6, 6))
a = pd.DataFrame(data, index=rows, columns=columns)
a

Out[1]:

	company	Company A		Company B		Company C
	total value	sales	purchases	sales	purchases	sales	purchases
year	period
2017	Jan-Jun	52016	75802	46574	35849	37107	89150
2017	Jul-Dec	46388	79543	47696	86961	32138	64179
2018	Jan-Jun	94779	90781	64409	62316	49368	51174
2018	Jul-Dec	21693	68970	79971	56702	52753	97088
2019	Jan-Jun	49221	60597	36683	47390	83167	30133
2019	Jul-Dec	29882	80608	76448	27891	94124	43042

Here are some examples of indexing and slicing:

In [2]:

# Let's access all data relevant to Company C.
a['Company C']

Out[2]:

	total value	sales	purchases
year	period
2017	Jan-Jun	37107	89150
2017	Jul-Dec	32138	64179
2018	Jan-Jun	49368	51174
2018	Jul-Dec	52753	97088
2019	Jan-Jun	83167	30133
2019	Jul-Dec	94124	43042

In [3]:

# And now we need just the sales data of Company A.
a['Company A', 'sales']

Out[3]:

year  period 
2017  Jan-Jun    52016
      Jul-Dec    46388
2018  Jan-Jun    94779
      Jul-Dec    21693
2019  Jan-Jun    49221
      Jul-Dec    29882
Name: (Company A, sales), dtype: int32

In [4]:

# We can also use the loc and iloc indexers, the former for explicit label indices and the latter for 
# implicit positional integer indices. 
a.iloc[:2, :2]

Out[4]:

	company	Company A
	total value	sales	purchases
year	period
2017	Jan-Jun	52016	75802
2017	Jul-Dec	46388	79543

In [5]:

# You can also pass a tuple of multiple indices.
a.loc[:, ('Company B', 'purchases')]

Out[5]:

year  period 
2017  Jan-Jun    35849
      Jul-Dec    86961
2018  Jan-Jun    62316
      Jul-Dec    56702
2019  Jan-Jun    47390
      Jul-Dec    27891
Name: (Company B, purchases), dtype: int32

As far as slicing is concerned, the optimal way of doing this is by means of the Pandas IndexSlice object. Have a look:

In [8]:

idx = pd.IndexSlice
a.loc[idx[:, 'Jan-Jun'], idx[:, 'sales']]

Out[8]:

	company	Company A	Company B	Company C
	total value	sales	sales	sales
year	period
2017	Jan-Jun	52016	46574	37107
2018	Jan-Jun	94779	64409	49368
2019	Jan-Jun	49221	36683	83167

Your Panda3D Magazine

Make Awesome Games and Other 3D Apps

with Panda3D and Blender using Python.

Cool stuff, easy to follow articles.

Get the magazine here (PDF).

Python Jumpstart Course

Learn the basics of Python, including OOP.

with lots of exercises, easy to follow

The course is available on Udemy.

Blender Jumpstart Course

Learn the basics of 3D modeling in Blender.

step-by-step, easy to follow, visually rich

The course is available on Udemy and on Skillshare.

Spread the love

pandas Part 20 – Indexing and Slicing Multiply Indexed DataFrames

Like this:

Leave a ReplyCancel reply

pandas Part 20 – Indexing and Slicing Multiply Indexed DataFrames

Share this:

Like this:

Leave a ReplyCancel reply