In the previous part of the openpyxl series we were iterating over the data in the spreadsheet. In this part we’ll have a look at how to convert spreadsheet data into a dictionary.
This time we’ll be using a spreadsheet that will contain some more useful data. It’s saved as wb2.xlsx. I just entered the data (which is fictional, by the way) by hand and saved the workbook. Here it is:
company | city | category | income | number_of_employees |
Roses and Pansies | Toronto | gardening | 456845.00 | 26 |
Reader’s Paradise | Sydney | publisher | 854782.00 | 44 |
Dogs and Cats | Madrid | pets | 458784.00 | 36 |
Enjoy Your Meal | New Orleans | restaurant | 447884.00 | 18 |
Bloom Store | London | gardening | 1254654.00 | 84 |
Tastes of the South | Miami | restaurant | 587454.00 | 42 |
Pet For You | Athens | pets | 987887.00 | 21 |
Eat ‘Em Up | Helsinki | restaurant | 1254477.00 | 53 |
Mr. Bunny | Vienna | pets | 254478.00 | 12 |
As you can see, there are 10 rows (1-10) and 5 columns (A-E). The first row contains the headers. As we are going to work with this workbook now, let’s import it:
>>> from openpyxl import load_workbook
>>> workbook = load_workbook(filename="wb2.xlsx")
>>> sheet = workbook.active
In the previous part we saw that the data was returned as tuples. Suppose we want to retrieve all the rows, including the headers row:
>>> for value in sheet.iter_rows(values_only=True):
... print(value)
...
('company', 'city', 'category', 'income', 'number_of_employees')
('Roses and Pansies', 'Toronto', 'gardening', 456845, 26)
("Reader's Paradise", 'Sydney', 'publisher', 854782, 44)
('Dogs and Cats', 'Madrid', 'pets', 458784, 36)
('Enjoy Your Meal', 'New Orleans', 'restaurant', 447884, 18)
('Bloom Store', 'London', 'gardening', 1254654, 84)
('Tastes of the South', 'Miami', 'restaurant', 587454, 42)
('Pet For You', 'Athens', 'pets', 987887, 21)
("Eat 'Em Up", 'Helsinki', 'restaurant', 1254477, 53)
('Mr. Bunny', 'Vienna', 'pets', 254478, 12)
But sometimes tuples is not what you need. Luckily, you can use all the Python techniques to convert the data to any other format. The dictionary is a format that often comes in handy, so let’s convert the data to a dictionary:
>>> companies = {}
>>> for row in sheet.iter_rows(min_row=2, values_only=True):
... company_name = row[0]
... company_data = {
... 'city': row[1],
... 'category': row[2],
... 'income': row[3],
... 'number of employers': row[4]
... }
... companies[company_name] = company_data
As you can see, we’re using the names of the companies as keys and the other company data is presented as a dictionary too.
Now we can retrieve the data like so:
>>> companies['Enjoy Your Meal']
{'city': 'New Orleans', 'category': 'restaurant', 'income': 447884, 'number of employers': 18}
>>> companies['Pet For You']['income']
987887
Another data type we may often want to convert our spreadsheet data into is the class. In the next part we’ll see how to do that.