Our Custom HMIS Data Quality Reports
For many years, the main data quality report in our HMIS was pretty much the APR (except for HPRP). The conventional wisdom was "if your APR looks alright, your data must be fine." Once we started participating in the AHAR in a meaningful way, submitting HMIS APRs, and using other reports like the NOFA, we quickly realized that the definition of "missing" varies greatly between reports and that there is more to clean data than having complete assessment data. There was a Missing and Nulls report, but it wasn't heavily used, and in general its definition of missing was no more stringent than the APR's.
When I was supporting the HPRP projects, I created a way of checking MANY aspects of data quality, some borrowed from the HPRP Data Quality reports that Bowman put out, and some I created from scratch because I came across an error people were making that was not being checked.
You may notice that not all of the things I added indicate definite data quality issues. Some of them merely point to issues that could be either programmatic in nature or actual data quality problems. Anyway, here they are:
No Case Management Service Transaction - Case Management service transactions were a big sticking point with HUD, because they wanted to be sure each household was receiving an intake. The Case Management service transaction signaled that there was an intake or recertification done.
Exit or Serve - any current households where the latest Service Transaction was over 35 days ago.
Missing Recertification - this compared the number of Case Management Service Transactions to the length of stay, and if there wasn't one Case Management for every 90ish days, it would be assumed that either the recertification had been done but not updated in HMIS, or the client didn't get a recertification.
Duplicate Entry Exits - self-explanatory!
Red Flag Destinations - any clients with a Destination that didn't seem right, like home ownership, or shelter, Don't Know, or Other with no explanation entered.
No HOH - any household without a Head of Household designated.
Type of Living Situation Does Not Match Housing Status - compares Type of Living Situation to the Housing Status at Entry, and if they are not congruent, it shows as an error. For example, if the Type of Living Situation is "Place Not Meant for Habitation" and the Housing Status is "Imminently Losing Their Housing", one of those answers is clearly not correct.
Stably Housed at Entry - sometimes users didn't save an updated Housing Status at Exit correctly, and the Housing Status at Entry would accidentally show as Stably Housed.
Children Only Households - self-explanatory!
Income Decreased - shows any clients whose income decreased between program entry and program exit.
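To make a couple of the date-based checks above concrete, here is a minimal Python sketch of "Exit or Serve" and "Missing Recertification". It is not how the actual reports are built. The `Household` record and its field names are hypothetical, the choice to flag a current household with no services at all is my assumption, and "one Case Management for every 90ish days" is interpreted here as one per started 90-day period, which is also an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Household:
    # Hypothetical record shape, for illustration only.
    household_id: str
    entry_date: date
    exit_date: Optional[date]          # None means the household is still enrolled
    service_dates: List[date]          # dates of all service transactions
    case_management_dates: List[date]  # dates of Case Management transactions only


def exit_or_serve(h: Household, today: date, max_gap_days: int = 35) -> bool:
    """Flag current households whose latest service is over max_gap_days old."""
    if h.exit_date is not None:
        return False  # already exited, so not a "current" household
    if not h.service_dates:
        return True   # assumption: a current household with no services is flagged
    return (today - max(h.service_dates)).days > max_gap_days


def missing_recertification(h: Household, today: date, interval_days: int = 90) -> bool:
    """Flag households with fewer Case Management transactions than expected."""
    end = h.exit_date or today
    length_of_stay = (end - h.entry_date).days
    # Assumption: one transaction per started interval, with a minimum of one (the intake).
    expected = max(1, -(-length_of_stay // interval_days))  # ceiling division
    return len(h.case_management_dates) < expected
```

Both functions only flag a household for review. Whether a flag is a real error or just an unusual situation still has to be decided by someone who knows the case.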
As I began to use these reports, it made users nervous that some of these issues are not *necessarily* Data Quality problems, but rather negative outcomes or unlikely things. "Income Decreased" is a great example: it is certainly *possible* that an HPRP client actually did leave the program with less income than when they began it. However, since this is considered a bad outcome, and since there was a popular misconception that every income record should have an End Date, I wanted users to simply verify that the clients showing a decrease really had a decreased income and that it was not a data entry mistake.
There are a few issues to balance here:
- We want to catch potential errors in data entry.
- Some people like their "Data Quality Reports" to show all zeroes.
- We don't want users modifying correct but unfortunate or unusual data to make their Data Quality reports show all zeroes.
So to head this off, I added a Summary page that broke out the "Data Quality Issues" from the "Possible Data Quality Issues", and tried to clearly define which kind of issue they were looking at in the header of each tab. If it is a definite data quality issue, I indicate that, and if it is only a possible data quality issue, I indicate that along with the reminder to leave data that reflects reality alone.
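The split between definite and possible issues can be sketched in a few lines of Python. This is only an illustration of the idea, not the ART report itself. All names here are hypothetical, and except for "Income Decreased" (which is described above as a possible issue), the way each check is classified below is my assumption for the sake of the example.

```python
from collections import Counter
from enum import Enum


class Severity(Enum):
    DEFINITE = "Data Quality Issues"
    POSSIBLE = "Possible Data Quality Issues"


# Assumed classification, for illustration only.
CHECK_SEVERITY = {
    "No HOH": Severity.DEFINITE,
    "Duplicate Entry Exits": Severity.DEFINITE,
    "Income Decreased": Severity.POSSIBLE,
    "Red Flag Destinations": Severity.POSSIBLE,
}


def build_summary(flagged):
    """Count flagged records per check, grouped by severity.

    `flagged` is an iterable of (check_name, client_id) pairs.
    """
    summary = {severity: Counter() for severity in Severity}
    for check_name, _client_id in flagged:
        summary[CHECK_SEVERITY[check_name]][check_name] += 1
    return summary


def header_for(check_name):
    """Build the tab header that tells users which kind of issue this is."""
    if CHECK_SEVERITY[check_name] is Severity.DEFINITE:
        return f"{check_name}: definite data quality issue. Please correct these records."
    return (
        f"{check_name}: possible data quality issue. Verify each record, "
        "and leave data that reflects reality alone."
    )
```

Keeping the classification in one lookup means the Summary page and the tab headers can never disagree about which kind of issue a check represents.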
Differentiating between these two kinds of data quality checks helped me broaden the range of things we were looking at. Including the things that come from the Bowman Data Quality reports, we wound up with a fairly exhaustive list of checks. The list was so long that I broke it out into multiple reports, because I did not want any one report to get so unwieldy that it was using multiple universes and such. Thinking back on this, I could probably have combined some of them into one report, but as it is, I divided the reports into Entry Exits, Assessments, Households, and Services. Each report examines the various aspects of data entry in different ways. Also, and most importantly, each report tab contains a sometimes lengthy explanation of the issue it is aiming to resolve. Each explanation aims to answer these questions:
- What data is showing here?
- Why is this considered an issue?
- How might this have happened in HMIS?
- Should I fix it? If so, how can I fix it?
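One way to keep every tab's explanation consistent is to treat the four questions as a small template. Here is a rough Python sketch of that idea. The `TabExplanation` class is hypothetical, and the example wording is paraphrased from the "Income Decreased" discussion above, not copied from the actual report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TabExplanation:
    # Hypothetical structure: one field per question a report tab should answer.
    what_is_showing: str
    why_an_issue: str
    how_it_happened: str
    should_i_fix: str

    def render(self) -> str:
        """Return the explanation as question/answer text for the tab header."""
        pairs = [
            ("What data is showing here?", self.what_is_showing),
            ("Why is this considered an issue?", self.why_an_issue),
            ("How might this have happened in HMIS?", self.how_it_happened),
            ("Should I fix it? If so, how can I fix it?", self.should_i_fix),
        ]
        return "\n".join(f"{question} {answer}" for question, answer in pairs)


# Illustrative wording only, paraphrased from this post.
income_decreased = TabExplanation(
    what_is_showing="Clients whose income decreased between program entry and program exit.",
    why_an_issue="A decrease is a bad outcome, so it is worth confirming it is real.",
    how_it_happened="An income record may have been given an End Date by mistake.",
    should_i_fix="Only if the income did not really decrease. Leave accurate data alone.",
)

print(income_decreased.render())
```

Because every field is required, a tab cannot be created with one of the four answers missing.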
Of course, this was only the report-designing half of it. To go along with the reports, we had to create a "Data Quality Standards" document for the HMIS implementation that would lay out the expectations for all the agencies. One problem was that, with four people and 80 counties, we don't really have the capacity to check up on each agency every month. So the way it's worded in the Standards document is that agencies are responsible for running and correcting the reports monthly, and we spot check, sending reports that need attention on to the agencies to prompt them to sort out their data.
One problem with these reports is that there is currently no way to run a single report that returns total data quality metrics for all the agencies at once. You have to run them one agency at a time, and there are multiple reports, too, so it is kind of unwieldy. This is an area I hope to work out as things settle and more people start using the reports.
In the meantime, here are some screenshots of some of the reports!
Here's the Summary page of the Assessments report:
And here is the report tab for "Housing Data at Entry":
Moving on to the Entry Exits report, here's the Summary page for that one:
And here's the report tab in that report called "Future Entry Exits"!
Next, the Households report's Summary page:
And finally the Children Only Households report tab:
Custom ART reporting offers unlimited ways to steer your data quality toward perfection! Combining good report design with relevant and accurate data quality checks (plus the explanations!) is critical to getting the information your users need when they need it.
Hope this helps some people! Leave a comment if you have suggestions, criticisms, or nice things to say. :)