Dictionaries#
Dictionaries are mappings. They give the relationships between things.
This will be clearer with an example.
Let’s say we are interested in the English Premier League (EPL).
In particular, we are interested in wage bills of each team. We will confine our attention to teams outside London, suspecting that teams in London have to spend more in wages, because London is such an expensive city.
We consult the data on 2022-2023 EPL wages, and the EPL league table and we find these values:
Manchester City came first in the League, and spent an estimated £185,240,000 on wages.
Manchester United came third in the League, and spent an estimated £212,135,000 on wages.
Liverpool came fifth, and spent £160,868,000.
We want a way to store the correspondence (or mapping) between the name of team and their wage bill. If we have the team name, we want to be able to fetch the corresponding wage bill.
One way of doing this is a basic Python data type called a dictionary.
Here is an empty dictionary:
wages = {}
wages
{}
type(wages)
dict
We can fill this dictionary by entering pairs of values. The pair is in the form of a key and a value. To start with, let us put the key and value into their own variables.
key = 'Manchester City'
value = 185_240_000
value
185240000
By the way, notice the underscores in 185_240_000
. These are just for decoration, to make it easier for us humans to read the number we typed in. Notice that Python still reads this number as if it had no underscores. The underscores look nice where they are, because they delimit the thousands and millions, but Python just ignores them; you can put them anywhere you want:
18_52_4000_0
185240000
Now back to the key and the value. We enter this pair into the dictionary like this:
wages[key] = value
wages
{'Manchester City': 185240000}
Notice that the dictionary is now no longer empty. It has one entry, where the entry consists of our key and our value. This entry says:
If you ask “Manchester City”, I’ll answer 185240000
That is, if you give the dictionary the key “Manchester City” it will return the corresponding value 18524000.
You give the dictionary the key by indexing. Indexing occurs where we follow the value (here wages
) by square brackets []
. Inside the square brackets we put the key for which we want the value:
# Indexing into the dictionary with a key gives
# the corresponding value.
wages['Manchester City']
185240000
Now let us enter the other key value pairs:
wages['Manchester United'] = 212_135_000
wages['Liverpool'] = 160_868_000
wages
{'Manchester City': 185240000,
'Manchester United': 212135000,
'Liverpool': 160868000}
Notice the dictionary has three entries now. Each entry is the pair of key (the team name) and value (the corresponding wage bill.
We can fetch the wage bills corresponding to each team, using the team name:
wages['Manchester United']
212135000
wages['Liverpool']
160868000
Here we first created an empty dictionary, and filled in the three elements with assignment statements like:
wages['Manchester United'] = 212135000
Building dictionaries with {}
#
We can also create dictionaries directly with key, value pairs, like this:
# Creating the dictionary and adding the key, value pairs.
wages_again = {'Manchester City': 185_240_000,
'Manchester United': 212_135_000,
'Liverpool': 160_868_000}
wages_again
{'Manchester City': 185240000,
'Manchester United': 212135000,
'Liverpool': 160868000}
Notice the syntax for the value on the left hand side that creates the dictionary. There is:
An open curly bracket
{
followed by none or more of:Key value pairs, written as:
The key then
colon
:
the value then
a comma (if more key value pairs follow).
A closing curly bracket
}
.
Building dictionaries with keyword arguments#
In fact there is another common way to create dictionaries, using the dict
function. (We say “function”, but in fact it’s a Python class. That doesn’t matter for our purposes, you use the dict
class by calling it, like a function).
You can call dict
in various ways — try a cell with dict?
to see the help for dict
.
However, the most common use of dict
is to make clever use of keyword arguments. Again, an example may be clearer than an explanation:
more_wages = dict(Liverpool=160_868_000,
Everton=80_707_000,
Leeds=20_800_000)
more_wages
{'Liverpool': 160868000, 'Everton': 80707000, 'Leeds': 20800000}
Notice the trick. We pass the value for Liverpool with a keyword argument, with name “Liverpool” and value 160_868_000
. dict
then creates the dict with the keys from the keyword argument names, and the values from the keyword argument values.
Perhaps you noticed that we didn’t use “Manchester United” or “Manchester City” in the dict
example above. Why? (We suggest you try and work this out before going to the next cell).
We didn’t use “Manchester United” with dict
because it would have failed with an error:
bad_dict = dict(Manchester United=212_135_000)
Cell In[13], line 1
bad_dict = dict(Manchester United=212_135_000)
^
SyntaxError: invalid syntax. Perhaps you forgot a comma?
What happened there? Keyword arguments to functions have to be valid variable names in Python, because, in general, they do become variables inside the function. For example, here’s a function with a keyword argument (my_var=10
):
def my_func(a, my_var=10): # Notice the my_var=10 keyword argument.
return a * my_var
my_func(2) # Uses the default my_var=10
20
my_func(2, my_var=100) # Sets a not-default value for my_var
200
Therefore, keyword argument names must be valid Python variable names, and these cannot contain spaces.
The upshot is, the dict(first_key=first_value)
trick is a nice one, but it will only work if all your keys are strings that are also valid Python variable names, such as Liverpool
and Leeds
above.
Keys are unique#
Consider our original wages
dictionary:
wages
{'Manchester City': 185240000,
'Manchester United': 212135000,
'Liverpool': 160868000}
Let’s say I discovered that Liverpool had hired someone else, and their actual wage bill was £164,000,000. You can imagine that I could change the original dictionary with:
wages['Liverpool'] = 164_000_000
# Show the result
wages
{'Manchester City': 185240000,
'Manchester United': 212135000,
'Liverpool': 164000000}
Notice that the original wages
value (dictionary) has changed, with a new value for the key Liverpool
.
A technical note: we have discovered that the dictionary value is mutable — meaning, you can change the contents of the value, without making a new copy. See the mutable / immutable page for more detail.
Keys are unique#
Now - returning to the dictionary. Notice that with:
wages['Liverpool'] = 164_000_000
we didn’t make a new repeated key, we replaced the value corresponding to the original Liverpool
key. Perhaps that is obvious in this case, as wages['Liverpool']
on the left hand side is going to point to the value for wages['Liverpool']
and that value will be set to the new number 164_000_000
. Slightly less obvious is the fact that this will happen when we build the dictionary in one go:
# Creating the dictionary and adding the key, value pairs.
wages_at_once = {'Manchester City': 185_240_000,
'Manchester United': 212_135_000,
'Liverpool': 160_868_000,
'Liverpool': 164_000_000}
wages_at_once
{'Manchester City': 185240000,
'Manchester United': 212135000,
'Liverpool': 164000000}
Notice that, even here, when we construct the dictionary, Python overwrites the
first value associated with the Liverpool
key, with the second value, as it
builds the dictionary. This means that it is not possible to create
a dictionary with duplicate keys — the keys of the dictionary are unique.
Iterating over keys and values#
You can get the keys of the dictionary with:
wages.keys()
dict_keys(['Manchester City', 'Manchester United', 'Liverpool'])
The thing that comes back from the .keys()
method is a special container that
refers to the keys in the dictionary. You can use the container after the in
of a for
loop, like this:
for key in wages.keys():
print(key)
Manchester City
Manchester United
Liverpool
Notice the form of the for
line above. It is for variable in expression:
, where, in our case:
The variable is
key
.The expression is
wages.keys()
.
As you know, we can use any name for the variable.
The expression can be anything that represents a sequence of things. In Python
technical terms, it must be an iterable — something that we can request
a sequence of things from. The thing that comes back from wages.keys()
is
such an iterable.
Another way you can use an iterable such as wages.keys()
is to pass it to
constructor functions like list
or tuple
or np.array
. In that case,
the constructor gets all the values from the iterator thing, and puts them into
the relevant container
list(wages.keys())
['Manchester City', 'Manchester United', 'Liverpool']
import numpy as np
np.array(wages.keys())
array(dict_keys(['Manchester City', 'Manchester United', 'Liverpool']),
dtype=object)
In fact, the dictionary is itself iterable. That is, you can also ask the dictionary to give a sequence of things. And in fact, that sequence of things is just the keys:
# Asking the dictionary to iterate, gives the keys.
for key in wages: # Identical to "for key in wages.keys():"
print(key)
Manchester City
Manchester United
Liverpool
# Asking the dictionary to iterate, gives the keys.
list(wages)
['Manchester City', 'Manchester United', 'Liverpool']
As you might guess, you can also iterate over the corresponding values of the dictionary.
for val in wages.values():
print(val)
185240000
212135000
164000000
list(wages.values())
[185240000, 212135000, 164000000]
Notice that values arrive in the same order as the corresponding keys.
Key, value pairs#
Think of the dictionary as a store of key, value pairs. You find a value with the corresponding key. Put another way, you can look up the value using the key. Or again, the dictionary is a mapping between keys and values. As you have seen, or example dictionary has keys:
list(wages.keys()) # or list(wages)
['Manchester City', 'Manchester United', 'Liverpool']
and values:
list(wages.values())
[185240000, 212135000, 164000000]
len()
gets the number of key, value pairs in the dictionary. Of course this is the same as the number of keys, and the same as the number of values:
len(wages)
3
We can see the component key, value pairs, by using the .items()
method of the dictionary. One item is a pair — a single key, value pair.
for pair in wages.items():
print(pair)
('Manchester City', 185240000)
('Manchester United', 212135000)
('Liverpool', 164000000)
Here’s the last pair
, left from the execution of the loop above:
pair
('Liverpool', 164000000)
We may want to put the key and the value into their own variables, rather than keeping them together in a tuple like this. This is a job for unpacking. Unpacking occurs when you have a sequence on the right hand side of an expression, and multiple values on the left. Here we unpack the pair into two variables, key
and value
:
# Unpacking.
# Two variables on the left, sequence of length two on the right.
key, value = pair
print('Key is', key)
print('Value is', value)
Key is Liverpool
Value is 164000000
You will often see this happening in for
loops, like this:
for key, value in wages.items():
print('This key is', key)
print('This value is', value)
This key is Manchester City
This value is 185240000
This key is Manchester United
This value is 212135000
This key is Liverpool
This value is 164000000
Update#
Dictionaries have an update()
method that pulls key, value pairs across from another dictionary:
wages
{'Manchester City': 185240000,
'Manchester United': 212135000,
'Liverpool': 164000000}
new_wages = {'West Bromwich Albion': 4_716_200,
'Hull City': 520_000}
result = wages.update(new_wages)
result
Notice that Jupyter / IPython does not display anything for the result.
This is because the thing that came back from that operation was the value None
:
result is None
True
Python sends back that result to remind you that the update()
is not creating a new dictionary, it is updating the original dictionary, in this case, wages
. Notice that wages
now has the new keys and values:
# Update modified 'wages'
wages
{'Manchester City': 185240000,
'Manchester United': 212135000,
'Liverpool': 164000000,
'West Bromwich Albion': 4716200,
'Hull City': 520000}
update()
will overwrite key, value pairs that exist already in the left-hand-side dictionary.
# Notice new, different value for Man City.
more_new_wages = {'Sheffield': 15_679_600,
'Manchester City': 100_000_000}
wages.update(more_new_wages) # Returns None
# wages modified
wages
{'Manchester City': 100000000,
'Manchester United': 212135000,
'Liverpool': 164000000,
'West Bromwich Albion': 4716200,
'Hull City': 520000,
'Sheffield': 15679600}
Keys don’t have to be strings#
So far we have used strings as the dict
keys. Strings are the most common
types of keys, but in fact, you can use many immutable types as
keys. Just for example, let’s say we wanted to be able to look up the EPL
teams by their league position, rather than by name. We only have and want
the data for teams outside London, so we can’t use a list — we don’t have
a continuous series of positions to record. In that case one might use the
position number for keys, and the team names as values:
# Notice that Arsenal was second, but that's in London, so we don't include it here.
by_position = {1: 'Manchester City',
3: 'Manchester United',
4: 'Newcastle',
5: 'Liverpool'}
by_position
{1: 'Manchester City', 3: 'Manchester United', 4: 'Newcastle', 5: 'Liverpool'}
Notice that the keys here are numbers, and the values are strings. As ever, the we get the values by indexing with the corresponding key:
by_position[4]
'Newcastle'
We can use many other types as keys, but in practice you won’t need this feature often in your programming life; you can look up dictionaries in more detail when you need to.