How to tell good LGBTQ+ stories with bad data

Concepts and methods to help you do rigorous journalism even when the data is tricky, Part 1

Posted on: February 20, 2024

Data on LGBTQ+ people can be scarce, inconsistent, or less methodologically rigorous than what data journalists are used to relying on. At the end of the day, though, doing good LGBTQ+ data journalism is just doing good data journalism.

In addition to our list of tips on how to wrangle this kind of data, we — Kae Petrin and Jasmine Mithani — wanted to share our insights into the ethical and conceptual issues we’ve encountered, and how to leverage those insights into rigorous, data-driven journalism.

Here’s what we have learned about crafting a pitch and framing a story on LGBTQ+ data.

This article is about concepts and ethics — if you’re interested in technical and reporting solutions, don’t miss Part 2.

External identification versus self-identification

Broadly, there are a few types of data: self-reported data, externally reported data, and external interpretations of self-reported data. Each has its own advantages and problems when it comes to LGBTQ+ communities.

For instance, if you want to understand research studies about “transition regret” for trans people, you have to ask, Does a particular study measure self-reported feelings of regret? Or does it measure people who stopped undergoing hormone therapy, and make assumptions about why they did so?

Externally interpreted data of this kind may be more standardized than self-reports, but it may also assign inaccurate meaning to a personal experience. Studies that also ask people why they stopped gender-affirming medical care often indicate that they did so for many reasons besides regret. There’s a long history of similar definitional contradictions: for instance, in studies of men who have sex with men while still self-describing as straight, and in studies of any number of other LGBTQ+ groups.
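To make the distinction concrete, here is a minimal Python sketch that tallies the same records two ways: once by observed behavior (stopping care) and once by respondents’ own stated reasons. The rows, field names, and helper functions are all hypothetical, invented purely for illustration; they do not come from any real study.

```python
from collections import Counter

# Hypothetical survey rows, invented for illustration only.
# "stopped_care" is an externally observable behavior;
# "stated_reason" is the respondent's own account (None if not applicable).
responses = [
    {"stopped_care": True, "stated_reason": "cost"},
    {"stopped_care": True, "stated_reason": "regret"},
    {"stopped_care": True, "stated_reason": "lost insurance"},
    {"stopped_care": True, "stated_reason": "family pressure"},
    {"stopped_care": False, "stated_reason": None},
    {"stopped_care": False, "stated_reason": None},
]


def count_stopped(rows):
    """Count people who stopped care, regardless of why."""
    return sum(1 for r in rows if r["stopped_care"])


def count_reasons(rows):
    """Tally self-reported reasons among people who stopped care."""
    return Counter(r["stated_reason"] for r in rows if r["stopped_care"])


stopped = count_stopped(responses)
reasons = count_reasons(responses)

print(f"Stopped care (behavior): {stopped}")
print(f"Self-reported regret: {reasons['regret']}")
for reason, n in reasons.most_common():
    print(f"  {reason}: {n}")
```

In this made-up sample, treating everyone who stopped care as a case of regret would overstate regret several times over. When a study publishes only the behavioral count, that is a limitation worth naming in the story.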

As with any data reporting, it’s important to walk audiences through limitations and nuances of this sort.

Police records — which are fully externally reported, but sometimes involve verbal self-reports to officers — consistently under-report gender-motivated and other LGBTQ+-related hate crimes. That’s in part because police often fail to recognize transgender people at all.

Likewise, many public records may list people under outdated names and gender markers. This can make identifying trans people during breaking news particularly difficult. It’s even harder to go back to an anonymous database row and check that the person’s gender has been correctly described.

Behavior does not directly imply identity or vice-versa

Data can be a way to break through stereotypes about the lives of LGBTQ+ people. For instance, despite the Catholic Church generally considering homosexual activity to be a sin, similar shares of both heterosexual and queer people identify as Catholic.

But behavior can also be a poor proxy for LGBTQ+ identity. The U.S. Census Bureau has repeatedly refined its attempts to gather information on same-sex couples. But even with improving methods, historical census data measured only same-sex couples who lived together. This likely undercounted lower-income queer couples, who may be less likely to live together, and excluded bisexual and trans people who live with someone of a different legal sex.

The result is not an actual count of LGBTQ+ people — it’s a count of a subgroup that has a specific behavior. Think through when something can, and can’t, be generalized to broader communities.

Feelings versus action in polling

Much polling on LGBTQ+ populations asks people how certain laws have affected their lives. Usually people will say that a law has increased their fear of discrimination, and many fewer will respond that they have experienced increased discrimination.

The chilling effect of laws deserves coverage, because it is one way people’s lives are restricted. It is also often an intended effect of these laws: not to result in arrests or charges, but to scare people into hiding their identities. But focusing reporting too heavily on emotional safety can give a skewed picture of what is actually happening.

Consider reporting, when available, both poll questions that show feelings and those that show self-reported experiences. Be attentive to the distinction between “I have considered moving states because of anti-LGBTQ+ legislation” and “I have moved states because of anti-LGBTQ+ legislation,” for instance, or “I am scared of being harassed” and “I have been harassed.”
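One way to keep the two kinds of questions apart is to compute and report each share separately, never as a single combined figure. The Python sketch below assumes a hypothetical poll with one “feeling” question and one “experience” question; the responses, field names, and the `share` helper are invented for illustration.

```python
# Hypothetical poll responses, invented for illustration only.
# "considered_moving" captures a feeling or intention;
# "has_moved" captures a self-reported experience.
poll = [
    {"considered_moving": True, "has_moved": False},
    {"considered_moving": True, "has_moved": True},
    {"considered_moving": True, "has_moved": False},
    {"considered_moving": False, "has_moved": False},
    {"considered_moving": False, "has_moved": False},
]


def share(rows, field):
    """Return the fraction of respondents who answered yes to `field`."""
    if not rows:
        raise ValueError("no responses to summarize")
    return sum(1 for r in rows if r[field]) / len(rows)


considered = share(poll, "considered_moving")
moved = share(poll, "has_moved")

print(f"Considered moving: {considered:.0%}")
print(f"Have moved: {moved:.0%}")
```

Reporting the two shares side by side lets readers see the gap between intention and action. A real analysis would also need the poll’s survey weights and margin of error, which this sketch leaves out.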

Don’t overlook stories about the enforcement, outcomes, and consequences of such laws in favor of quick-hit polling data. Spending some time digging into action — or inaction — based on new laws and policies can yield important stories.

Think critically while reporting on polling about LGBTQ+ people

Public opinion doesn’t dictate human dignity or rights.

Historically, the general population has been willing to take away rights from minority groups (see the willingness to incarcerate Japanese Americans during World War II) and hostile to expanding rights for the oppressed.

Consider adding this context into stories explicitly.

What cisgender and heterosexual people think about LGBTQ+ people is important, because those are the groups legislating rights. But just because public opinion coalesces around a specific point of view does not mean it is backed by science, evidence, or human and civil rights principles. Think about how newsrooms should consider climate change, for example: It exists and should be covered accordingly, no matter how many people still tell pollsters they don’t believe in it.

Story framing and textual production

Sometimes the existence of the data shouldn’t be the headline of a story (e.g. “TK insight, survey says”). Instead, use it as a hint of where a story could be: as a jumping-off point to ask deeper questions about a trend, or a source for story ideas. Cite the data as a piece of supporting evidence in a story where you also consult outside experts and people with lived experience. Use the opportunity to report, and explain to an audience, the limitations of information that come with its source and production process. This increases readers’ data literacy and produces higher-quality journalism.

Sometimes, it might make sense to use a notable data finding as part of a lede. But often, LGBTQ+ data needs more context. It may make more sense to hold the data points for further down, in a nutgraf or another section where you have the space to ensure that the data isn’t misinterpreted.

When data is only one piece of evidence within a story, all of the story copy needs to consistently communicate that. Proactively flag this with headline writers in your newsroom, and consider mentioning the data release as a time peg only in the dek. If possible, loop in the audience team too to make sure your message is the same across platforms.

In the absence of good data, write about it

Have you hit a wall with your search for accurate data? Do you have a burning question that cannot be answered because the proper demographic information wasn’t collected? This doesn’t have to be a dead-end for your story. Consider writing about the effect of the data gap.

Data runs our lives, and its absence is felt keenly among more than just journalists. Demographic data in particular is used for resource allocation and grant funding; a lack of quantitative information at scale likely produces downstream effects for a variety of health researchers, scientists, nonprofits and lawmakers.

Increasingly, LGBTQ+ people have good reasons to fear and distrust systems that collect data. So even as there are efforts to collect more and better data, there are more and more stories about the absence or misuse of data. (For instance, educational technology companies have come under fire for flagging LGBTQ+-related terms and outing students.)

For the same reasons, newsrooms should also give increasing thought to their data release and publication policies when they do obtain good data.

Resources for building a practice around LGBTQ+ data

Want to understand and troubleshoot these issues in your own reporting? Kevin Guyan’s Queer Data provides a useful conceptual overview of some of the quandaries. The book LGBTQ+ Stats offers a historical bird’s-eye view of what data does and doesn’t exist, and why.

The Urban Institute also published an extensive “Do No Harm” guide on working with data about gender and sexuality. Kae and Jasmine have written separately on visualizing missing or bad data and why data gap stories are important.

There’s also always the News Nerdery Slack; post your questions in the #helpme channel (and ping Kae and Jasmine for input if desired). The Trans Journalists Association also has a members’ Slack for journalists of all gender identities that includes data reporters and editors willing to offer insight.

This article is Part 1 of 2. Also see: 7 tips for data-driven journalism about LGBTQ+ communities

Credits

Jasmine Mithani

Jasmine Mithani is the data visuals reporter at The 19th, an independent newsroom covering the intersection of gender, politics and policy. She also writes the newsletter Data + Feelings about being human and being data. Her experience in journalism spans outlets national to hyper-local, including FiveThirtyEight, National Public Radio, and South Side Weekly.
- FiveThirtyEight
- @jazzmyth

Kae Petrin

Kae is a data & graphics reporter on Chalkbeat’s data visuals team, where they collaborate with local reporters to tell data-driven stories about education. Previously, Kae created graphics, built newsroom-wide tools, and produced investigative reporting for St. Louis-based radio and print publications. Kae co-founded the Trans Journalists Association in 2020 with a collective of transgender and nonbinary media-makers.
- Chalkbeat
- @petrinkae
