Lists#

Thus far we have seen the following data types:

  • integer (counting number) - int.

  • floating point number (number with decimal point) - float.

  • string (text) - str.

  • boolean (True or False value) - bool.

In data analysis, we often want to collect together several numbers, or strings, into a sequence. This allows us to collect values together into one place, and refer to all the values with a single name. Now that we can refer to all the values with a single name, we can write code that works on the whole sequence at once.

For example, remember our problem of the three girls in a family of four.

You have seen that we can make four random values, that can be 0 or 1, like this:

# Load the Numpy package, and rename to "np"
import numpy as np
# Make random number generator.
rng = np.random.default_rng()
# Get four random numbers between 0 and 1.
a = rng.integers(0, 2)
b = rng.integers(0, 2)
c = rng.integers(0, 2)
d = rng.integers(0, 2)

Here, each random number gets its own variable. Then we added the numbers like this:

a + b + c + d
3

This gave us the number of girls for one simulated family.

Now imagine that we were trying to simulate a family of 12 children. We would have to give each child their own variable a through l (small L), and then have a long line like this:

a + b + c + d + e + f + g + h + i + j + k + l

That would be annoying in this case, and impractical when we have to simulate larger units, such as the 100 members of Robert Swain’s jury panel.

Sequences are part of the solution to this problem. Sequences are values that contain zero or more other values. There are various types of sequences, but the two we will use the most are:

  • Lists - list

  • Arrays - np.array

We will see arrays soon. This page is about lists.

Making a list#

You make a list using square brackets - [ and ].

Here we make a list that contains three integers:

my_numbers = [1, 2, 3]
my_numbers
[1, 2, 3]

[1, 2, 3] is a list expression because it is a recipe that results in a value that is a list). We confirm it is an expression by showing that the notebook displays a value:

[1, 2, 3]
[1, 2, 3]

What kind of value is it?

type(my_numbers)
list

Lists can be empty. To make an empty list, put nothing inside the square brackets, like this:

my_empty_list = []
my_empty_list
[]

A list can have any kind of value as its elements. For example, here is a list of strings:

some_strings = ["Captain", "Agent", "Jessica", "Luke"]
some_strings
['Captain', 'Agent', 'Jessica', 'Luke']

Functions and lists#

Python has some built-in functions that operate on sequences, like lists.

A built-in function is a function that we can use in any Python session, without importing it from a library.

One of these built-in functions is len. Here is a call expression, using the function len on our list:

len(my_numbers)
3

As you might predict, calling len with the list as an argument, returns the number of elements in the list.

Another function that operates on lists is sum. This adds all the elements in the list together, and returns the result:

sum(my_numbers)
6

A half-solution for three girls#

Now we have another way of solving the problem of making the family of four children above. Use Control-Enter a few times on the cell below to convince yourself it returns numbers between 0 and 4 - simulated numbers of girls in a family of four children.

# Get four random numbers between 0 and 1, put them in a list.
# The opening square brackets tell Python the statement is not finished
# when it reaches the end of the first line.
family = [rng.integers(0, 2),
          rng.integers(0, 2),
          rng.integers(0, 2),
          rng.integers(0, 2)]
# Add all the values in the list
sum(family)
2