Investigating 13,000 “Good” Nursing Homes in Germany
We built a web platform to make choosing a quality nursing home less painful
Over the past two years, several major nursing-home scandals emerged in Germany: residents died after missing doses of medication; nurses called ambulances when they lacked basic resources to treat patients; doctors revealed patient abuse to local newspapers. Yet, shortly before those scandals broke, Germany’s national inspection system gave passing grades to those very same nursing homes.
In early 2015, Correctiv data fellow Vanessa Wormer began looking into the grading system used to evaluate nursing homes across the country, and found that the system itself seemed to be fundamentally flawed. All nursing homes throughout Germany received passing grades, according to the inspection reports. The average one received a grade of “very good.” Almost none received a rating of “good” or worse.
We wondered, how could this be?
To find out more, my colleagues Stefan Wehrmeyer and Daniel Drepper spent several months defining, requesting, and working with national and county-level nursing home data (information that, in the end, cost more than ).
Beginning to Look for Gaps
After some initial research, the scope of the project expanded quickly. Correctiv decided to develop a book and a TV documentary, along with the data analysis. Lead reporter Daniel Drepper spent the last year researching the book, looking into the country’s nursing home infrastructure, understanding the bureaucratic and financial obstacles that German nursing homes face, and telling the stories of four nurses with powerful insight into their working conditions. With all this material developed for the book, we had a better idea of the main story we wanted to tell online, and we began to see where we needed more data. But there was still a lot to figure out.
When I started my fellowship at Correctiv at the end of March, I began cleaning and analyzing the data to discover which stories might take shape. This wasn’t the easiest task. Coming in with zero knowledge of Germany’s nursing home system and zero knowledge of the German language, I hit a few challenges while working with the data. But together with my colleagues, I worked through the key points we wanted to understand, and we started to see where gaps in the data might be.
Inspecting the Inspectors
Germany has about 13,000 nursing homes, and a mandatory insurance pays half of the cost for each resident. The rest has to be paid by citizens themselves, which is about a month. That’s why people tend to use cheap nursing homes. That’s also why transparency about nursing-home quality is very important. But the big problem is that there is no transparency.
Federal inspectors use 77 questions to define the quality of a nursing home. Most of these questions are very superficial and easy to manipulate, although some indicate real problems. Do the residents drink enough fluids? Are they getting proper food and medical treatment? Nursing homes can mask critical issues by excelling in other, less-important ones—perhaps by offering a well-designed menu or a nice garden. That’s how some nursing homes end up receiving “very good” grades, despite issues with medical treatment and chronic pain, or with properly feeding their patients.
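To see how an average can hide a serious problem, here is a minimal sketch in Python. The grades are invented for illustration and do not come from the inspection data. The sketch also assumes that the overall grade is a plain mean over all 77 questions, on a scale where 1.0 is best and 5.0 is worst; the real weighting may differ.

```python
from statistics import mean

# Hypothetical home: top marks on 74 questions, failing marks on 3 critical ones.
# Assumed scale: 1.0 is best, 5.0 is worst.
grades = [1.0] * 74 + [5.0] * 3

overall = round(mean(grades), 2)  # what a plain average reports
worst = max(grades)               # what the average hides

print(f"overall grade: {overall}")
print(f"worst single grade: {worst}")
```

Under these assumptions the three failing answers barely move the overall figure, which is why looking at individual questions says more than the headline grade.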
Sites like the official AOK insurance company website provide information about individual nursing homes and allow users to filter through the data, but because of all the “very good” grades, it’s hard to see real differences between them. So, people usually sort by cost. While cost is certainly an important factor when choosing a nursing home, it shouldn’t be the deciding one.
We decided to concentrate on just the most important categories, using 17 out of the 77 inspection questions for our platform. And thanks to the research done for the book, where we spoke with experts and nurses about the inspection process, we learned which variables might be particularly interesting and point to deeper issues. Those include:
The size of the rooms (to check for signs of overcrowding)
The portion of part-time workers (to see if people receive the full amount of care they need)
The number of trainees (to see how much work is done by untrained people)
Cleaning the Data
For the most part, federal data on the individual nursing homes was structurally clean, in XML files. The data from the statistical offices in the counties? Not so much.
Our tools of choice: CSVkit, Python, R and Excel. Before we dug into the files, we created a data dictionary for the information that we wanted to bring over into a clean format. This provided us with the basic structure that could be used later on, as we mapped the data to the individual nursing homes.
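A data dictionary of this kind can be as simple as a mapping from source headers to clean column names. The sketch below is only an illustration: the German headers, the clean names, and the sample row are all hypothetical, not the ones we actually used.

```python
import csv
import io

# Hypothetical data dictionary: source header -> clean column name.
DATA_DICTIONARY = {
    "Kreis": "county",
    "Pflegeheime": "nursing_homes",
    "Verfügbare Plätze": "available_places",
}

def clean_rows(raw_csv: str) -> list:
    """Keep only the columns named in the dictionary and rename them."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    cleaned = []
    for row in reader:
        cleaned.append({
            clean: (row.get(source) or "").strip()
            for source, clean in DATA_DICTIONARY.items()
        })
    return cleaned

# Invented sample with one extra column (a footnote marker) that gets dropped.
sample = "Kreis,Pflegeheime,Verfügbare Plätze,Fußnote\nErfurt, 12 ,900,a\n"
rows = clean_rows(sample)
print(rows)
```

Columns that are not in the dictionary are dropped, so every state file ends up with the same structure regardless of how the original sheet was laid out.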
Then we worked through inconsistencies in the data, like redacted information, unclear abbreviations, and hard-to-parse footnotes. We also mapped nursing home locations to their correct AGS (Amtlicher Gemeindeschlüssel), the official identification key that the statistics office assigns to each municipality and that also encodes its state and district.
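For readers unfamiliar with the AGS, here is a rough Python sketch of how such a key can be normalized and split. It assumes full eight-digit municipal keys (two digits for the state, one for the government district, two for the county, three for the municipality) and that a spreadsheet may have stripped a leading zero. The function names are hypothetical, and shorter county-level keys would need different handling.

```python
def normalize_ags(value) -> str:
    """Return an 8-digit municipal AGS as a string, restoring a lost leading zero."""
    digits = str(value).strip()
    if not digits.isdigit() or len(digits) > 8:
        raise ValueError(f"not a valid AGS: {value!r}")
    return digits.zfill(8)

def split_ags(ags: str) -> dict:
    """Split an 8-digit AGS into its parts."""
    return {
        "state": ags[:2],
        "district": ags[2:3],
        "county": ags[3:5],
        "municipality": ags[5:],
        "county_key": ags[:5],  # handy prefix for joining with county-level data
    }

# Munich's key as it might look after a spreadsheet turned it into a number.
ags = normalize_ags(9162000)
parts = split_ags(ags)
print(ags, parts)
```

Storing the key as a string from the start avoids the lost-zero problem, which affects every state whose code begins with 0.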
I loaded each sheet into R to see a summary of the data, to spot null values, and to start combining data the way we wanted. For example, for the state of Thuringia, I used a combination of a bash script and CSVkit to put the data together. For others, like Berlin and Brandenburg, I just did a simple copy and paste, since there were cells that were merged. I also used R to move the columns around and spit out a new CSV that could be used with the other data files.
My colleague Stefan mostly used the Python library pandas to combine the other states’ data. The resulting CSVs from each Python notebook can be found in the folder labeled “csvs.” The hardest part was transforming the data from the ugly Excel format into an actual working file. We also ran a lot of checks to confirm that our scripts had not accidentally omitted any data.
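The kind of check we mean can be sketched in a few lines of Python. This version uses only the standard library rather than pandas, and the state files, column names, and figures are invented for illustration. The idea is simply to compare row counts and a column total before and after combining.

```python
import csv
import io

def read_rows(text: str) -> list:
    """Parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def combine(sources: dict) -> list:
    """Stack per-state rows into one list, tagging each row with its state."""
    combined = []
    for state, text in sources.items():
        for row in read_rows(text):
            combined.append({"state": state, **row})
    return combined

def check_nothing_lost(sources: dict, combined: list, column: str) -> None:
    """Compare row counts and a column total before and after combining."""
    rows_in = sum(len(read_rows(t)) for t in sources.values())
    total_in = sum(int(r[column]) for t in sources.values() for r in read_rows(t))
    total_out = sum(int(r[column]) for r in combined)
    assert rows_in == len(combined), "row count changed"
    assert total_in == total_out, f"{column} total changed"

# Invented per-state inputs.
sources = {
    "Berlin": "county,places\nBerlin,100\n",
    "Brandenburg": "county,places\nPotsdam,40\nCottbus,30\n",
}
combined = combine(sources)
check_nothing_lost(sources, combined, "places")
print(len(combined), "rows combined")
```

A check like this will not catch every mistake, but it fails loudly when a script silently drops a sheet or a row.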
Once we had all the files in one CSV, I did an initial analysis. I mostly used R and SQL to query the data and answer some initial questions on a county-by-county basis. Since we were partnering with the public TV station, a national paper, and some regional papers, I created pivot tables to share some findings with them.
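A county-by-county query of this sort might look like the following sketch, which runs SQL through Python’s built-in sqlite3 module. The table layout, column names, and rows are hypothetical and stand in for the real combined file.

```python
import sqlite3

# In-memory database with a hypothetical table of nursing homes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE homes (name TEXT, county TEXT, places INTEGER)")
conn.executemany(
    "INSERT INTO homes VALUES (?, ?, ?)",
    [("Home A", "Erfurt", 80), ("Home B", "Erfurt", 120), ("Home C", "Jena", 60)],
)

# One row per county: number of homes and total places.
by_county = conn.execute(
    """
    SELECT county, COUNT(*) AS homes, SUM(places) AS places
    FROM homes
    GROUP BY county
    ORDER BY county
    """
).fetchall()
conn.close()

for county, homes, places in by_county:
    print(county, homes, places)
```

The grouped result is essentially what a pivot table shows, which makes it easy to hand the same numbers to partners who prefer a spreadsheet.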
What We Found
We found several gaps in the data, and in the end, we needed more context to tell any of the stories we’d hoped to tell.
One of the biggest problems: strict restrictions on the release of data in Germany. For example, we weren’t able to get the number of people working for each nursing home, let alone their qualifications or their salary, because that would infringe on the business secrets of those nursing homes.
With those limitations in mind, we focused our platform on a few things:
Informing the public about the data that exists
Highlighting the gaps and problems with the data
Teaching people how to make the most informed decision about a nursing home
We also wanted to empower the public to ask more questions when they visit nursing homes and to request the inspection reports, which they can do by connecting via our site to Fragdenstaat (the German equivalent of MuckRock, run by Stefan). By clicking on the Fragdenstaat link, users land on a pre-filled FOI request asking for nursing home reports. Once people get their documents for their nursing home, the documents are automatically uploaded to our platform.
Designing the Platform
There are existing platforms that rate nursing homes, but none take a journalistic approach, none explain the shortcomings of their data, and none explain how to choose a nursing home or where to get more information.
We designed correctiv.org/pflege to help users navigate the data easily and understand how to make an informed choice.
A few essential things about the platform we built:
Each user enters their home address, so that they learn about the nursing homes closest to them, sorted not by cost (like the AOK platform) but by distance from their home.
Each nursing home gets its own URL, so that people can easily share links to specific facilities. (And in the past, we’ve had good results with assigning specific URLs to entities, because it led to additional search engine traffic.)
Individual nursing home pages start with two short paragraphs of automatically generated text. We translated data points on individual nursing homes into text so they would be easier to process for an audience that’s probably a bit older than the typical internet user.
Context matters. We explain where the data comes from, and we also say why the data shouldn’t be trusted as a primary factor in any decision.
We help people ask the right things when visiting a nursing home, with a list of follow-up questions we generated based on important criteria found in the inspection reports.
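Two of the points above, sorting by distance and turning data points into text, can be sketched roughly as follows. This is not the platform’s actual code: the homes, coordinates, field names, and sentence template are invented, the user’s address is assumed to be geocoded already, and the distance is a straight-line haversine estimate rather than a travel distance.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371  # mean Earth radius

def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points via the haversine formula."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def nearest_first(user, homes):
    """Sort homes by distance from the user's (lat, lon) position."""
    return sorted(
        homes,
        key=lambda h: distance_km(user[0], user[1], h["lat"], h["lon"]),
    )

def summary(home):
    """Turn a home's data points into a short sentence (hypothetical template)."""
    return f"{home['name']} has {home['places']} places."

# Invented example data.
user = (52.52, 13.405)
homes = [
    {"name": "Home Potsdam", "lat": 52.39, "lon": 13.06, "places": 90},
    {"name": "Home Mitte", "lat": 52.53, "lon": 13.40, "places": 60},
]
ordered = nearest_first(user, homes)
for home in ordered:
    print(summary(home))
```

Sorting on distance rather than price is a one-line change in code, but it changes which homes a reader sees first.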
A Home for Nursing Home Data
Choosing a nursing home is a big decision, and for some, of course, it’s a final move. But with so many problems emerging in German nursing homes, the choice can be terribly complicated. Our goal was to make as much information as possible accessible—and understandable.
In the end, the platform doesn’t provide all the information one would want before choosing a nursing home. But it does provide all the information that’s available, and it puts it in context. For Germany, it’s the first platform on nursing home quality that combines available data with a journalistic approach, giving people a sense of what they can (and need to) do before deciding on a nursing home.
In the coming weeks, we plan to write more stories using the data. We will also publish several stories that originate from the original research for the book and hope to do follow-up pieces as people start coming to us with their individual stories about issues with certain nursing homes.
We hope people will contribute their reports to the platform and request data on the nursing homes in their states.
Sandhya Kambhampati just finished up her 2016 Knight-Mozilla fellowship at Correctiv in Berlin, Germany, and is currently looking for her next job. During her fellowship, she worked on investigations at Correctiv and on a guide for newsrooms on on-boarding and off-boarding, and taught at conferences. Follow her @sandhya__k