pandas Part 4 – the DataFrame Class

In the previous two parts of this series we discussed the Series class in general and how to create Series objects in pandas. In this part we’ll look at another fundamental pandas data type, the DataFrame.

You can think of a DataFrame as a collection of Series objects, one per column, all sharing the same index. It's easier to see this in action than to imagine it, so let’s create a DataFrame.

First we’ll create two Series objects with the same indices, then we’ll make a DataFrame from them. Here are our Series objects:

In [2]:
import numpy as np
import pandas as pd

atomic_numbers = pd.Series({'helium': 2, 'oxygen': 8, 'sodium': 11, 'carbon': 6, 'sulfur': 16})
atomic_numbers
Out[2]:
helium     2
oxygen     8
sodium    11
carbon     6
sulfur    16
dtype: int64
In [3]:
atomic_masses = pd.Series({'helium': 4.003, 'oxygen': 15.999, 'sodium': 22.990, 'carbon': 12.011, 'sulfur': 32.066})
atomic_masses
Out[3]:
helium     4.003
oxygen    15.999
sodium    22.990
carbon    12.011
sulfur    32.066
dtype: float64

These two Series can now be combined into a single DataFrame. To do that we’ll pass a dictionary to the constructor of the DataFrame class. The keys will be the names of the two columns that will be created and the corresponding values will be the two Series objects we just created:

In [4]:
elements = pd.DataFrame({'atomic number': atomic_numbers,
                        'atomic mass': atomic_masses})
elements
Out[4]:
        atomic number  atomic mass
helium              2        4.003
oxygen              8       15.999
sodium             11       22.990
carbon              6       12.011
sulfur             16       32.066
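
The two Series above have identical indices, so every cell gets a value. As a minimal sketch of what happens otherwise, the example below uses two hypothetical, shortened Series (numbers and masses are illustration names, not part of the example above) whose labels only partly overlap. The constructor aligns them on their labels and fills the gaps with NaN:

```python
import pandas as pd

# Hypothetical data: the two Series share only 'oxygen' and 'sodium'.
numbers = pd.Series({'helium': 2, 'oxygen': 8, 'sodium': 11})
masses = pd.Series({'oxygen': 15.999, 'sodium': 22.990, 'carbon': 12.011})

# The DataFrame constructor aligns the Series on their index labels,
# so the resulting index is the union of both indices.
partial = pd.DataFrame({'atomic number': numbers,
                        'atomic mass': masses})

# Labels missing from one of the Series show up as NaN in that column.
missing_per_column = partial.isna().sum()

print(partial)
print(missing_per_column)
```

Because NaN is a floating-point value, the 'atomic number' column is stored as float64 here rather than int64.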

The elements DataFrame is a two-dimensional data structure that combines the information from both Series. Just like the Series class has the values and index attributes, the DataFrame class has the index and columns attributes, both of which give us Index objects:

In [5]:
elements.index
Out[5]:
Index(['helium', 'oxygen', 'sodium', 'carbon', 'sulfur'], dtype='object')
In [6]:
elements.columns
Out[6]:
Index(['atomic number', 'atomic mass'], dtype='object')
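
Here is a short, self-contained sketch of how these attributes can be used together. It rebuilds a shortened version of the elements DataFrame (three rows only, to keep it brief), and the names row_labels, column_labels and data are just illustration names:

```python
import pandas as pd

# A shortened version of the elements DataFrame from this article.
atomic_numbers = pd.Series({'helium': 2, 'oxygen': 8, 'sodium': 11})
atomic_masses = pd.Series({'helium': 4.003, 'oxygen': 15.999, 'sodium': 22.990})
elements = pd.DataFrame({'atomic number': atomic_numbers,
                         'atomic mass': atomic_masses})

# Both attributes are Index objects, so they convert to plain lists.
row_labels = list(elements.index)
column_labels = list(elements.columns)

# shape is a (rows, columns) tuple.
n_rows, n_cols = elements.shape

# to_numpy() returns the underlying data as a two-dimensional NumPy array.
data = elements.to_numpy()

print(row_labels)
print(column_labels)
print(n_rows, n_cols)
```

The row labels come from the shared index of the two Series and the column labels from the keys of the dictionary passed to the constructor.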

You can use a column name to obtain a Series of all the values in that column:

In [7]:
elements['atomic mass']
Out[7]:
helium     4.003
oxygen    15.999
sodium    22.990
carbon    12.011
sulfur    32.066
Name: atomic mass, dtype: float64
In [8]:
type(elements['atomic mass'])
Out[8]:
pandas.core.series.Series
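
The following sketch, again using a shortened version of the elements DataFrame, shows that the column we get back keeps the DataFrame's index, and contrasts it with selecting by a list of column names, which returns a DataFrame instead. The variable names are only for illustration:

```python
import pandas as pd

# A shortened version of the elements DataFrame from this article.
atomic_numbers = pd.Series({'helium': 2, 'oxygen': 8, 'sodium': 11})
atomic_masses = pd.Series({'helium': 4.003, 'oxygen': 15.999, 'sodium': 22.990})
elements = pd.DataFrame({'atomic number': atomic_numbers,
                         'atomic mass': atomic_masses})

# A single column name returns a Series named after the column.
masses = elements['atomic mass']

# A list of column names returns a DataFrame, even with one name in it.
masses_frame = elements[['atomic mass']]

# The column Series has the same index as the DataFrame it came from,
# so individual values can be looked up by row label.
shares_index = masses.index.equals(elements.index)
sodium_mass = masses['sodium']

print(type(masses).__name__, type(masses_frame).__name__)
print(shares_index, sodium_mass)
```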

In the next part we’ll look at other ways to create DataFrame objects. There are quite a few of them.
