Why are we lacking data about women?

We all rely on research in some way to do our jobs. But sometimes that research hasn’t included women, or has assumed women and men are the same.

When my sister-in-law Louise was studying fashion design, her required course materials included a specific brand and model of dressmaking scissors. She could scarcely believe how big they were, compared with her hands. They were difficult and slow to use, and it never got easier. Risking losing marks for having the wrong equipment, Louise replaced them with a pair that fit her hands.

Fashion design is dominated by women, but Louise’s course requirements had been designed around people with much larger hands than hers – I would guess, men. This failure to account for women and men having physical differences cost Louise, and probably most of her classmates, at the very least, time and money. Perhaps marks.

You might hear a story like this and think, that’s just one fashion school with outdated requirements. But increasingly, we’re asking where the data we use to develop things like medicines, tech and safety equipment came from, and often finding it came solely, or mostly, from men.

The gender data gap

Author and researcher Caroline Criado-Perez coined the term “gender data gap” to describe what happens when research used to develop products for both men and women, or for women alone, uses only (or almost only) male subjects. This can mean the resulting product performs more poorly for women.

And ‘to perform more poorly’ may mean fatally so. Looking behind statistics that might seem hard to explain, like women’s higher rate of injury and death in car accidents, Criado-Perez found cars are safety tested with crash test dummies designed around an average male’s weight and measurements. Similarly, the UK’s Trades Union Congress (TUC) lists multiple examples of workplace safety equipment not designed to fit women, alongside cases of serious injuries and fatalities directly linked to poorly fitting equipment. It’s not uncommon to find medicines trialed only on men, even when the drug is only for women.

The topic is significant enough that Criado-Perez’s book Invisible Women: Exposing Data Bias in a World Designed For Men won both the Royal Society’s Science Book Prize and the Financial Times Business Book of the Year award in 2019. Several conferences to be held in 2020 (for example, Gender Data Gap Conference in Stockholm and Measure what Matters in Canberra) aim to educate participants to think about, and act on, the gender data gap.

Where are the gender data gaps?

Everywhere you look, according to Criado-Perez’s research. But since we’re in cybersecurity, let’s draw some examples from tech, software and the internet.

Software doesn’t identify women’s voices or faces so well

MIT Media Lab researcher Joy Buolamwini found facial recognition software was more likely to misidentify women, and especially Black women.

While it correctly identified white men 99 percent of the time, the results fell to 93 percent when identifying white women, and plummeted to just 35 percent for Black women.

Voice recognition systems also fail more frequently for women’s voices. University of Washington’s Dr. Rachael Tatman found YouTube’s auto-captioning accurately identified men’s speech 13 percent more often than women’s.

Twitter reports many women are, in fact, men

In 2015, while Head of Social Insight at OgilvyOne, Karin Robinson analyzed her own Twitter following. She found that Twitter’s algorithm, which supposedly identifies users’ age, gender and income for marketers to use in targeting ads, was out by a significant margin when it came to her followers: it under-counted women and over-counted men.

While the consequences of misidentifying users’ gender for marketing purposes might seem trivial, it may also mean analyses Twitter performs on its own data, like checking whether it’s handling online abuse or bullying complaints in a gender-neutral way, will be compromised.

AI taught itself to discriminate against women

Amazon realized in 2015 that the AI it created to shortlist candidates’ resumes had developed bias similar to that of humans when shortlisting candidates for roles.

The technology had learned its ways from patterns of employment over the past 10 years.

Because most people who apply for, and are appointed to, software developer roles are men, the AI decided a resume containing words like ‘women’s’ (for example, ‘women’s college’) should go to the bottom of the pile.

Amazon no longer uses the AI for shortlisting, in part because of this finding.

What can business leaders in tech do?

Even if you think you don’t perform or use research, you probably do, somewhere along the line, whether it’s user testing software or running customer satisfaction surveys. If you’re not looking to understand how well you’re doing and how to improve, you should be.

There are several things you can do to reduce gender bias in the data you rely upon to do your work.

1. Consider women a potential market you’ve not yet reached

Many businesses are now starting to realize that determining your potential audience by your current audience limits growth. If you make design software, for example, and know it’s currently most popular with men aged 35 to 45, rather than aiming to increase your market by reaching more of the same people, there may be bigger gains in catering to groups you’re not reaching.

Many brands once thought altering their product or marketing to reach more women would always alienate an existing male market, but prominent examples show the male market is not so easily threatened. For example, movie studios have released action films centered on female characters, like Mad Max: Fury Road and Star Wars: The Force Awakens. Their box office success eclipsed the concerns that preceded their release: that the popular franchises were taking a huge risk with an existing male fan base.

2. Check where your data comes from

One reason voice recognition software fails more often with female voices may be that popular speech and language open data sources, such as TED talks, are male-dominated.

Asking your researchers, or companies you commission research from, where their data comes from, and whether broader sources are available, will make results less likely to favor groups already over-represented.
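If the data comes with information about who is in it, a quick count can show whether one group dominates. The sketch below is only an illustration: the speaker labels, the 50/50 target and the 10-point tolerance are all made-up assumptions, not figures from any real dataset.

```python
from collections import Counter

# Hypothetical speaker labels for ten recordings in a speech dataset.
speaker_labels = ["man"] * 7 + ["woman"] * 3


def representation(labels):
    """Return each group's share of the dataset as a fraction of the total."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def under_represented(shares, expected, tolerance=0.1):
    """List groups whose share falls more than `tolerance` below the expected share."""
    return [
        group
        for group, target in expected.items()
        if shares.get(group, 0.0) < target - tolerance
    ]


shares = representation(speaker_labels)
# Assumed target: the audience is split roughly evenly between women and men.
flagged = under_represented(shares, {"man": 0.5, "woman": 0.5})

print(shares)   # share of recordings per group
print(flagged)  # groups that fall short of the assumed target
```

A check like this only tells you who is missing. Deciding what share each group should have depends on who you intend the product to serve.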

3. Research your cold audiences

People who love what you do are always easiest to research, but they’ll teach you the least. You shouldn’t be trying to improve your product or expand your market by always asking fans, or existing customers, what they think.

It’s much harder to connect with ‘the one that got away’ (or even, the one who’s never heard of you) but they’re who you most need to talk to.

4. To recruit more women, change your strategy

Caroline Criado-Perez notes in her book that one reason for the gender data gap is that women volunteer to take part in research less often than men, and drop out of studies more often.

What reasons lie behind this? It’s hard to say, but we know women are more likely to face financial pressure and have caring responsibilities, so they’ll naturally struggle more to give up their time. Considering how you design research, and whether it’s equally appealing to men and women, could help. Here are some research design factors to consider.

Recruitment methods

Asking staff to share an invitation on social media will likely reach people like your staff.

If requested to do so, a good research company will be able to find you participants matching a balance of profiles, such as age, profession and gender.

Inviting women specifically in the advertising description can also help.


Timing

A lunchtime focus group might be easier for women with caring responsibilities than one held after work.

Time commitment

A 20-minute interview will attract more participants than a one-hour interview.


Personal safety

Review and reduce any potential personal safety risks for those who take part. For example, how are you making sure participants never see each other’s personal data? If you’re asking people to go somewhere to take part, do the lighting, parking or connections to public transport feel safe?

Women may feel more comfortable meeting a researcher in a public place, like a coffee shop or library, rather than going to an unknown office building.


Location

Even with studies that are well paid, more people will be able to take part if they don’t have to travel. Consider whether the researcher can go to the participant or meet over video conferencing rather than face-to-face.

5. Delineate data for men and women, and analyze it separately

In Invisible Women, Criado-Perez says sometimes the gender data gap problem isn’t that the data on women is missing, but that it’s mixed in with men’s data or not analyzed separately to see if there are differences. This hides when something isn’t working for one gender or another. For example, if your male users are having an above-average experience and your female users a below-average one, results that don’t separate men from women will average the two out and tell you it’s fine for everyone.

By gathering gender information about participants and making sure any analysis looks for potential differences between men and women, you can identify gender-related problems early.
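To see how a combined average can hide a gap, here is a minimal sketch using invented satisfaction scores. The scores, the field names and the `mean_by_group` helper are all hypothetical, chosen only to show the effect.

```python
from statistics import mean

# Hypothetical satisfaction scores (1 to 10) with self-reported gender.
responses = [
    {"gender": "man", "score": 9},
    {"gender": "man", "score": 8},
    {"gender": "man", "score": 10},
    {"gender": "woman", "score": 5},
    {"gender": "woman", "score": 4},
    {"gender": "woman", "score": 6},
]


def mean_by_group(rows, key, value):
    """Average `value` separately for each distinct `key` in the rows."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row[value])
    return {group: mean(values) for group, values in groups.items()}


# The combined average looks acceptable on its own...
overall = mean(row["score"] for row in responses)
# ...but splitting by gender shows who the product is not working for.
by_gender = mean_by_group(responses, "gender", "score")

print(f"overall: {overall}")
for group, average in by_gender.items():
    print(f"{group}: {average}")
```

With these made-up numbers the overall score is 7, while men average 9 and women average 5. Real studies would also need enough participants in each group for the difference to mean something.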

A product can only be as good as the research used to develop it. It’s good business sense to make sure the data that informs you includes your whole intended audience, not just those who are easiest to engage.

The notion of the gender data gap has given consumers a language to identify and object to the gender data biases behind products, from software to scissors. Business leaders can build a better reputation, and better products, by asking the right questions. Make sure the research you rely on is representative and allows for the possibility that the gender of the person using your product might change how good it is for them.

About authors

Suraya Casey is a freelance writer, editor and content strategist based in New Zealand. Her interests include cybersecurity, technology, climate, transport, healthcare and accessibility.