Boolean arrays#
# Import the array library
import numpy as np
# Make random number generator.
rng = np.random.default_rng()
We continue to improve on our implementation for solving the three girls problem.
The random choice page gave us a way to simulate the four children of a family in one go:
sexes = [0, 1]
children = rng.choice(sexes, size=4)
children
array([0, 0, 0, 1])
The next step is to get the count of girls from our four simulated children. Because girls are 1 and boys are 0, we can do that with:
# Add up the integers to count the number of girls.
count = np.sum(children)
count
1
Our interest in whether the count
value is equal to 3. We can look at that
number and write down “Yes” if the number is equal to 3 and “No” otherwise, but
we would like the computer to do that routine work for us. We use
comparison:
is_three = count == 3
is_three
False
True means our simulation found a family with three girls, and False means we found a family some other number of girls.
In a while, we are going to simulate a very large number of these families, but for now, let us simulate 5 families, in a somewhat laborious way:
# Make an array to store the counts for each family.
counts = np.zeros(5)
# Make five families, store the counts.
family = rng.choice(sexes, size=4)
counts[0] = np.sum(family)
# Second family
family = rng.choice(sexes, size=4)
counts[1] = np.sum(family)
# Third
family = rng.choice(sexes, size=4)
counts[2] = np.sum(family)
# Fourth
family = rng.choice(sexes, size=4)
counts[3] = np.sum(family)
# Fifth.
family = rng.choice(sexes, size=4)
counts[4] = np.sum(family)
# Show the counts
counts
array([1., 3., 1., 2., 0.])
Each value in counts
is the number of girls in one simulated family.
Now we have 5 numbers for which we want to ask the question - is this number equal to 3? We would like five corresponding True or False values.
Here is where arrays continue to work their magic - we can get this result with a single expression:
are_three = counts == 3
are_three
array([False, True, False, False, False])
are_three
is an array with 5 elements, one for every element in the array we
compared, counts
.
are_three
is a Boolean array because it contains only Boolean (True, False) values.
We can see what kind of data the array contains by looking at the dtype
attribute of the array. Remember, an attribute is a value attached to another value. In this case it is a value attached to the are_three
value.
are_three.dtype
dtype('bool')
Each element in are_three
has the result of the comparison for the
corresponding element. The code above is equivalent to doing:
# Make an array of Boolean type (the "dtype" argument)
are_three_longhand = np.zeros(5, dtype=bool)
# Do the comparisons one by one.
are_three_longhand[0] = counts[0] == 3
are_three_longhand[1] = counts[1] == 3
are_three_longhand[2] = counts[2] == 3
are_three_longhand[3] = counts[3] == 3
are_three_longhand[4] = counts[4] == 3
# Show the result
are_three_longhand
array([False, True, False, False, False])
Now we want to know how many of the counts
values are equal to 3. This is
the same as asking how many True values there are in are_three
(or
are_three_longhand
.
We can do this using the np.count_nonzero
function. It accepts an array as its argument, and returns the number of non-zero values in the array. It turns out that np.count_nonzero
treats True as non-zero, and False as zero, so np.count_nonzero
on a Boolean array counts the number of True values:
my_booleans = np.array([True, False, True])
np.count_nonzero(my_booleans)
2
To see the number of times we found 3 in counts
:
np.count_nonzero(are_three)
1