Gender equality statistics

The data are a version of the World Bank's dataset on gender and inequality, usually housed at World Bank gender-statistics.

The version I’ve used here was from 2017. The 2017 version had data on health care expenditure that the current data does not have.

You can get a copy of the 2017 data at: Gender_StatsData.csv.

See the Gender Statistics dataset page for more detail on my processing of the data.

In summary, I’ve rearranged the data, calculated averages over the years 2012 to 2016 for a set of columns I was interested in, and kept only the rows corresponding to actual countries, dropping summary groups like “Arab World” and “Lower middle income”.
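To make those three steps concrete, here is a minimal sketch on a tiny made-up table. It is not the actual processing code (see the dataset page for that). The layout of the raw table (one row per country and indicator, one column per year), the indicator codes `IND.A` and `IND.B`, the output names `indicator_a` and `indicator_b`, the `aggregate_codes` list and all the values are assumptions for illustration.

```python
import pandas as pd

nan = float('nan')

# Toy stand-in for the raw file: one row per country/indicator pair,
# one column per year. Names and values are invented for illustration.
raw = pd.DataFrame({
    'Country Name': ['Aruba', 'Aruba', 'Arab World', 'Belize', 'Belize'],
    'Country Code': ['ABW', 'ABW', 'ARB', 'BLZ', 'BLZ'],
    'Indicator Code': ['IND.A', 'IND.B', 'IND.A', 'IND.A', 'IND.B'],
    '2011': [100.0, 100.0, 100.0, 100.0, 100.0],
    '2012': [1.0, 10.0, 5.0, 2.0, 20.0],
    '2013': [2.0, 11.0, 5.0, 3.0, nan],
    '2014': [3.0, 12.0, 5.0, 4.0, 22.0],
    '2015': [4.0, 13.0, 5.0, 5.0, nan],
    '2016': [5.0, 14.0, 5.0, 6.0, 24.0],
})

# Step 1: keep only actual countries. This list of summary-group codes
# is hypothetical; a real list would be much longer.
aggregate_codes = ['ARB']
countries = raw[~raw['Country Code'].isin(aggregate_codes)].copy()

# Step 2: average each row over 2012 to 2016 (missing values are skipped).
years = [str(year) for year in range(2012, 2017)]
countries['avg'] = countries[years].mean(axis=1)

# Step 3: rearrange so there is one row per country, one column per indicator.
wide = (countries
        .set_index(['Country Name', 'Country Code', 'Indicator Code'])['avg']
        .unstack('Indicator Code')
        .reset_index())

# Give the columns short, code-friendly names (hypothetical names).
wide = wide.rename(columns={'Country Name': 'country_name',
                            'Country Code': 'country_code',
                            'IND.A': 'indicator_a',
                            'IND.B': 'indicator_b'})
print(wide)
```

Note that `mean` skips missing values by default, so a country with data for only some of the five years still gets an average of the years it does have.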

The data file is gender_stats.csv.

I have also made a version with fewer columns for the starting pages on Pandas, called gender_stats_min.csv:

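import pandas as pd
# Load the full processed dataset.
df = pd.read_csv('gender_stats.csv')
# Keep only the columns needed for the starting pages.
df = df.loc[:, ['country_name', 'country_code', 'gdp_us_billion',
                'mat_mort_ratio', 'population']]
# Write without the row index, so the file has only the data columns.
df.to_csv('gender_stats_min.csv', index=False)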
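The `index` argument in the `to_csv` call above stops Pandas from writing the row labels as an extra column. The following sketch shows the difference on a small made-up data frame written to a temporary directory. The values and file names are invented for illustration.

```python
import os
import tempfile

import pandas as pd

# Tiny made-up data frame; the values are not real statistics.
toy = pd.DataFrame({'country_name': ['Aruba', 'Belize'],
                    'population': [0.1, 0.4]})

with tempfile.TemporaryDirectory() as tmp_dir:
    with_index = os.path.join(tmp_dir, 'with_index.csv')
    no_index = os.path.join(tmp_dir, 'no_index.csv')
    # Default: the row labels 0, 1 are written as an unnamed first column.
    toy.to_csv(with_index)
    # index=False: only the data columns are written.
    toy.to_csv(no_index, index=False)
    cols_with_index = list(pd.read_csv(with_index).columns)
    cols_no_index = list(pd.read_csv(no_index).columns)

print(cols_with_index)
print(cols_no_index)
```

Reading the first file back gives an extra `Unnamed: 0` column, which is usually not what you want in a shared data file.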

See gender stats data dictionary for a list of the column names and their meaning.

The data is licensed under CC-BY. Please attribute it to the World Bank Data Catalogue with this URL: https://datacatalog.worldbank.org/dataset/gender-statistics.