In the previous parts of this series we discussed two of the three fundamental data types in pandas, the Series and the DataFrame. In this part we’ll be talking about the third one, the Index.
Both Series and DataFrame objects contain an index, but an Index is also a type on its own. It’s similar to a one-dimensional NumPy array, but unlike a NumPy array, an Index object is immutable.
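To see where Index objects show up inside the other two types, here is a minimal sketch. The labels and values (`a`, `b`, `c`, `row1`, `x` and so on) are made up for illustration and don't come from the earlier parts of the series:

```python
import pandas as pd

# Hypothetical example data, chosen only for illustration
s = pd.Series([10, 20, 30], index=["a", "b", "c"])
df = pd.DataFrame({"x": [1, 2], "y": [3, 4]}, index=["row1", "row2"])

# The row labels of a Series or DataFrame are stored in an Index object
series_index = s.index
frame_index = df.index

# The column labels of a DataFrame are stored in an Index as well
frame_columns = df.columns

print(isinstance(series_index, pd.Index))
print(list(frame_index), list(frame_columns))
```

So whenever you work with a Series or a DataFrame, you're already working with Index objects behind the scenes.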
You can create an Index from a list of integers:
import numpy as np
import pandas as pd
index = pd.Index([3, 6, 9, 12, 15])
index
You can access particular elements using the square bracket notation:
index[1]
Slicing also works as expected:
index[1:4]
Index objects also have the attributes you know from NumPy arrays, such as size, shape, ndim and dtype:
index.size, index.shape, index.ndim, index.dtype
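If you'd like to check the similarity to NumPy arrays yourself, the following sketch may help. It converts the Index to an array with `to_numpy()` and compares the attributes. The variable names are my own choice:

```python
import numpy as np
import pandas as pd

index = pd.Index([3, 6, 9, 12, 15])
array = index.to_numpy()  # the same values as a plain NumPy array

# Single-element access gives a scalar, slicing gives a new Index
second = index[1]
middle = index[1:4]

# The attributes of the Index agree with those of the NumPy array
same_attributes = (
    index.size == array.size
    and index.shape == array.shape
    and index.ndim == array.ndim
    and index.dtype == array.dtype
)

print(isinstance(array, np.ndarray), second, list(middle), same_attributes)
```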
But you can’t modify the elements of an Index object, because Index objects are immutable. If you try, pandas raises a TypeError:
index[1] = 7
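If you do need an Index with a different value, you have to build a new one. Here is a small sketch that catches the error and then creates a changed copy through a plain Python list. The round trip via `tolist()` is just one possible approach, not the only way to do it:

```python
import pandas as pd

index = pd.Index([3, 6, 9, 12, 15])

# Item assignment is rejected by pandas with a TypeError
try:
    index[1] = 7
    assignment_failed = False
except TypeError:
    assignment_failed = True

# To "change" a value, build a new Index instead, here via a list round trip
values = index.tolist()
values[1] = 7
new_index = pd.Index(values)

print(assignment_failed, list(index), list(new_index))
```

The original Index stays untouched, which is why it's safe to share the same Index between several Series and DataFrame objects.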
SET OPERATIONS ON INDEX OBJECTS
You can also treat an Index object much like an ordered set, although unlike a true set it may contain duplicate values. The usual set operations are available as methods, so you can make unions, differences, symmetric differences and intersections of Index objects:
# let's define two Index objects
index1 = pd.Index([1, 2, 3, 4, 5])
index2 = pd.Index([2, 4, 6, 8])
# let's make a union of the two Index objects
index1.union(index2)
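# older pandas versions also accepted the pipe operator as a short notation; since pandas 2.0
# the | operator works element-wise instead, so use union() in current code
# index1 | index2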
# now let's check out the difference of the two objects, so all elements from index1 that are not in index2
index1.difference(index2)
# and now the symmetric difference, so all elements from index1 that are not in index2
# plus all elements from index2 that are not in index1
index1.symmetric_difference(index2)
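# older pandas versions also accepted the ^ operator here; since pandas 2.0 it works
# element-wise instead, so use symmetric_difference() in current code
# index1 ^ index2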
# and the intersection
index1.intersection(index2)
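# older pandas versions also accepted the & operator here; since pandas 2.0 it works
# element-wise instead, so use intersection() in current code
# index1 & index2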
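To make the results of the four set operations easier to compare, here is a compact sketch that uses only the method forms on the same two Index objects and prints the results as plain lists. The result variable names are my own:

```python
import pandas as pd

index1 = pd.Index([1, 2, 3, 4, 5])
index2 = pd.Index([2, 4, 6, 8])

# All elements that appear in at least one of the two Index objects
union = index1.union(index2)

# Elements of index1 that are not in index2, and the other way round
only_in_1 = index1.difference(index2)
only_in_2 = index2.difference(index1)

# Elements that appear in exactly one of the two Index objects
in_exactly_one = index1.symmetric_difference(index2)

# Elements that appear in both Index objects
in_both = index1.intersection(index2)

for name, result in [
    ("union", union),
    ("only in index1", only_in_1),
    ("only in index2", only_in_2),
    ("symmetric difference", in_exactly_one),
    ("intersection", in_both),
]:
    print(name, list(result))
```

Note that the difference depends on the order of the operands, while the other three operations contain the same elements whichever Index you call the method on.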
That’s all we need to know about the Index object in pandas for now. In the next part we’ll be talking about data selection in pandas.