The New Things to Data Quality Check Post-2014 Data Standards
Happy New Year! This January we are scrutinizing data quality in preparation for the Point in Time Count.
Ever since the new Data Standards came out, we have been watching the new data being saved to see where there may be confusion out in the land! Here are some things we have found to be issues to watch for.
Length of Time Homeless Questions!! OH MY!
These questions really stump people. There are so many ways to enter data that does not align with other answers that it really is almost funny. Here is a list of the ones we are checking for:
- Obviously, if any of them are missing or Don't Know/Refused, this is flagged.
- Compare Program Type Code to Number of Times Homeless in the Past Three Years. If the Program Type is not Prevention, the Number of Times Homeless must be greater than 0. (Returns "Incorrect".)
- Number of Times Homeless in the Past Three Years is "4 or more" but the "If 4 or more, ..." question is not answered. (Returns "Missing".)
- Number of Times Homeless in the Past Three Years is 0 but Number of Months Homeless in the Past Three Years is greater than zero. (Returns "Incorrect".)
- Number of Times Homeless in the Past Three Years is "4 or more" and the Number of Months Homeless in the Past Three Years is less than 3. (Returns "Unlikely".)
- Program Type is not Prevention or Emergency Shelter but the Number of Months Homeless Immediately Prior to Entry is 0. (Returns "Incorrect".)
- I'm sure there are more!?! Leave your ideas in the comments!
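For anyone who thinks better in code, here is a minimal Python sketch of the checks listed above. It is only an illustration, not necessarily how any real report is built: the function name, the field names, and the picklist labels are all made up, and the "Missing or DKR" label is an assumption, since the list only says those answers get "flagged".

```python
# Hypothetical picklist labels; substitute whatever your HMIS actually stores.
PREVENTION = "Prevention"
EMERGENCY_SHELTER = "Emergency Shelter"
FOUR_OR_MORE = "4 or more"
DKR = {"Don't Know", "Refused"}


def is_unanswered(value) -> bool:
    """True for a missing answer or a Don't Know/Refused answer."""
    return value is None or value in DKR


def check_length_of_time_homeless(
    program_type,
    times_homeless,          # "0", "1", "2", "3" or "4 or more"
    months_past_3_years,     # whole number of months, or None
    months_prior_to_entry,   # whole number of months, or None
    if_four_or_more=None,    # answer to the "If 4 or more, ..." question
):
    """Return a list of (check, result) pairs for one entry assessment."""
    flags = []

    # Missing or Don't Know/Refused answers are flagged first.
    answers = {
        "Number of Times Homeless": times_homeless,
        "Months Homeless Past 3 Years": months_past_3_years,
        "Months Homeless Prior to Entry": months_prior_to_entry,
    }
    for field, value in answers.items():
        if is_unanswered(value):
            flags.append((field, "Missing or DKR"))

    times_is_zero = times_homeless == "0"
    times_is_4_plus = times_homeless == FOUR_OR_MORE
    # Only compare month counts that are real numbers.
    months_3y = months_past_3_years if isinstance(months_past_3_years, int) else None
    months_prior = months_prior_to_entry if isinstance(months_prior_to_entry, int) else None

    # Outside of Prevention, the client must have been homeless at least once.
    if program_type != PREVENTION and times_is_zero:
        flags.append(("Program Type vs. Times Homeless", "Incorrect"))
    # "4 or more" requires the follow-up question.
    if times_is_4_plus and is_unanswered(if_four_or_more):
        flags.append(("If 4 or more follow-up", "Missing"))
    # Zero times homeless cannot add up to months of homelessness.
    if times_is_zero and months_3y is not None and months_3y > 0:
        flags.append(("Times Homeless vs. Months Past 3 Years", "Incorrect"))
    # Four or more episodes in fewer than 3 months is possible but unlikely.
    if times_is_4_plus and months_3y is not None and months_3y < 3:
        flags.append(("Times Homeless vs. Months Past 3 Years", "Unlikely"))
    # Outside of Prevention and Emergency Shelter, 0 months prior to entry is wrong.
    if program_type not in (PREVENTION, EMERGENCY_SHELTER) and months_prior == 0:
        flags.append(("Months Homeless Prior to Entry", "Incorrect"))
    return flags
```

Returning every flag for a record, rather than stopping at the first one, lets a user fix all of the conflicting answers in one pass.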
Entry Dates Compared to Move-In Dates for Rapid Rehousing
After all this time without a Move-In Date field, some of our case managers began to treat the Entry Date as the Move-In Date. Some of them wanted to avoid showing overlapping program stays with shelters, some of them just didn't realize the difference, and others were actually getting it right. :) Anyway, we have had to clarify the difference between the Entry Date and the Move-In Date so that we can use the Move-In Date to determine just how rapid "Rapid Rehousing" actually is. In our Data Quality Reports, I have subtracted the Entry Date from the Move-In Date (that is, Move-In Date minus Entry Date, in days) and if the answer is anywhere from -1 to 1, I have it return "Questionable". If it's actually less than -1, meaning the Move-In Date falls before the Entry Date (and yes, there are quite a few like this!), I have it return "Not Correct" because it does not make sense to have already housed a household prior to them entering a Rapid Rehousing program!
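Here is a rough Python sketch of that date comparison. The function name is made up, and it assumes the difference is taken as Move-In Date minus Entry Date in whole days, so that a negative number means the household was housed before it entered the program. A missing Move-In Date is left for a separate missing-data check.

```python
from datetime import date
from typing import Optional


def check_move_in_date(entry_date: date, move_in_date: Optional[date]) -> Optional[str]:
    """Compare a Rapid Rehousing Move-In Date to the Entry Date.

    Returns "Not Correct", "Questionable", or None when nothing is flagged.
    """
    if move_in_date is None:
        return None  # assumption: missing dates are handled by another check

    # Negative means the move-in happened before program entry.
    days = (move_in_date - entry_date).days

    if days < -1:
        return "Not Correct"
    if -1 <= days <= 1:
        return "Questionable"
    return None


# Example: entry and move-in on the same day is suspicious but not impossible.
same_day = check_move_in_date(date(2015, 1, 10), date(2015, 1, 10))
```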
Permanently Housed at Entry should always be No for Rapid Rehousing projects
No Rapid Rehousing clients should have Entry Data indicating that they came into the program "Permanently Housed", because then they wouldn't be homeless, which would make them ineligible for the services.
Permanently Housed at Exit should ideally be Yes for Rapid Rehousing projects. If it isn't, then our reports return "Questionable" since it is technically possible that the program stay didn't end well. It's just as possible that the user forgot to update it.
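As a small sketch of the two Permanently Housed checks in Python: the function name and the "Rapid Rehousing" label are hypothetical, and the "Incorrect" label for the entry check is an assumption, since only the exit check has a named result ("Questionable") above.

```python
from typing import Optional

RRH = "Rapid Rehousing"  # hypothetical program type label


def check_permanently_housed(
    program_type: str,
    housed_at_entry: Optional[str],
    housed_at_exit: Optional[str] = None,
    has_exited: bool = False,
) -> dict:
    """Return {"entry": result, "exit": result}; None means nothing flagged."""
    results = {"entry": None, "exit": None}
    if program_type != RRH:
        return results

    # Already housed at entry would mean the client was not homeless.
    # Missing answers are assumed to be caught by a separate check.
    if housed_at_entry == "Yes":
        results["entry"] = "Incorrect"  # assumed label

    # Anything other than "Yes" at exit may be real, or may be a forgotten update.
    if has_exited and housed_at_exit != "Yes":
        results["exit"] = "Questionable"
    return results
```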
Health Insurance data!
Since this is a totally new field, it's important to add this data element into whatever Data Quality checking you are already using. Similar to our error checking with Income and Non-Cash, we check for consistency between the Yes/No question and the subassessments. For example, if the user answers "No" to the question asking if the client is covered by health insurance, but then they have a subassessment record that indicates that they are getting Medicare, the user would need to go back and correct whichever piece of that 2-part question is incorrect.
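The consistency test between the Yes/No question and the subassessment can be sketched like this in Python. The function name, the shape of the `sources` dictionary, and the "Conflict" label are all assumptions for illustration. The reverse case (answering "Yes" with no insurance source marked "Yes") is my own extension of the example above.

```python
from typing import Optional


def check_health_insurance(covered: Optional[str], sources: dict) -> Optional[str]:
    """Compare the Yes/No answer with the subassessment records.

    `sources` maps an insurance type (for example "Medicare") to "Yes" or "No".
    Returns "Conflict" (an assumed label) or None when the two parts agree.
    """
    any_source_yes = any(answer == "Yes" for answer in sources.values())

    # "No" to being covered, but a subassessment record says otherwise.
    if covered == "No" and any_source_yes:
        return "Conflict"
    # Assumed extension: "Yes" to being covered, but no source is marked "Yes".
    if covered == "Yes" and not any_source_yes:
        return "Conflict"
    return None


# Example from the paragraph above: "No" overall, but Medicare is recorded.
result = check_health_insurance("No", {"Medicare": "Yes", "Medicaid": "No"})
```

The same two-part pattern would fit the Income and Non-Cash checks, with a different dictionary of sources.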
Disability Subassessments
Prior to October 1, 2014, the disability data reporting was actually pretty lax. Our users were not answering "Yes" to "Disability Determination" in a reliable way, simply because it was not affecting anything when they didn't. Well, now this affects the AHAR (at the very least), so we had to have it all corrected. We also added it to our Data Quality reports to be sure users were answering the subassessment records correctly.
Aside from answering "Yes" to "Disability Determination", we are also checking to be sure the "If yes, ..." questions that apply are also being answered within the subassessment record.
The Military Information Subassessment for SSVF projects
Many users do not realize they have to answer for each intervention, and they will sometimes only select an answer for the intervention the client was involved with. I had to add this to our Data Quality reports as well. Be sure also to only have it check this for adults whose Veteran Status is "Yes"!
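A minimal Python sketch of this check is below. The function name and parameters are hypothetical, the list of required questions is passed in rather than hard-coded because it should come from your own subassessment setup, and treating "adult" as age 18 or older is an assumption.

```python
ADULT_AGE = 18  # assumption: adults are clients aged 18 or older


def check_military_info(is_ssvf, age, veteran_status, answers, required_questions):
    """Return the required questions left unanswered, in their original order.

    `answers` maps a question name to the user's answer (None or absent if blank).
    Only adult veterans in SSVF projects are checked.
    """
    if not is_ssvf or age is None or age < ADULT_AGE or veteran_status != "Yes":
        return []
    return [q for q in required_questions if answers.get(q) is None]


# Illustrative subset of questions only; use the full list from your own setup.
questions = ["World War II", "Korean War", "Vietnam War"]
unanswered = check_military_info(True, 45, "Yes", {"Vietnam War": "Yes"}, questions)
```

In the example, the user answered only the one question that applied to the client, so the other two come back as unanswered.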
Client Location
Prior to this data element coming out, some of our users had no idea what the CoC boundaries were, or that they even were a thing. We had to train on this one, and even still, we get surprising questions on it. The data quality check here is to test for the client being a Head of Household (either in the Households mechanism or in the Relationship to Head of Household question on the Assessment) and if the client is, this question should be answered.
Our users have not really struggled with this one too much, as when we set up the assessment, we put it right after the Relationship to Head of Household question, indented the Client Location question and changed the wording to read "If Head of Household, specify Client Location". This way it's clear to the user that the answer is contingent on the client being a Head of Household.
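The Client Location test could look something like this in Python. This is a sketch only: the function name and parameters are made up, and "Self" is a placeholder for however your system labels the Head of Household in the relationship question.

```python
from typing import Optional

SELF = "Self"  # hypothetical label for the Head of Household relationship answer


def check_client_location(
    is_hoh_in_household: bool,
    relationship_to_hoh: Optional[str],
    client_location: Optional[str],
) -> Optional[str]:
    """Flag a Head of Household who has no Client Location recorded."""
    # Either source can identify the Head of Household.
    is_head = is_hoh_in_household or relationship_to_hoh == SELF
    if is_head and not client_location:
        return "Missing"
    return None
```

Checking both the Households mechanism and the assessment question means a client is still caught when only one of the two was filled in.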
Splitting Out Data at Entry and Data at Exit
Since there are so many more distinctions now between what is collected at Entry and at Exit for various Program Types, one big change our Data Quality report has seen is the addition of a second data block that looks at the data associated with a client's Exit. A couple of examples are In Permanent Housing and Move-In Date for Rapid Rehousing projects and Housing Assessment at Exit for Prevention projects. This gives a fuller picture of what is happening during a program stay.
Housing Assessment at Exit for Prevention Projects
The most common issue with this data element is that it gets skipped. In our Exit Assessment, we created header rows to make it clear what users are supposed to be answering when and for whom, but there are a lot of changes and they're still getting used to all this.
So in the Data Quality report, I have added this to the "At Exit" data block and it only checks it for Prevention projects that are NOT SSVF. It basically looks for nulls, including the various "If..." questions being answered when appropriate.
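Here is one way that "At Exit" check might be sketched in Python. All of the names are hypothetical, and the mapping from a Housing Assessment answer to the "If..." questions it requires is passed in as a parameter, since it depends on how your Exit Assessment is set up.

```python
PREVENTION = "Prevention"  # hypothetical program type label


def check_housing_assessment_at_exit(
    program_type,
    is_ssvf,
    has_exited,
    assessment,          # the Housing Assessment at Exit answer, or None
    follow_up_answers,   # maps an "If..." question name to its answer
    required_follow_ups, # maps an assessment answer to the "If..." questions it needs
):
    """Return the names of fields that are null but should be answered."""
    # Only exited clients in Prevention projects that are NOT SSVF are checked.
    if program_type != PREVENTION or is_ssvf or not has_exited:
        return []
    if assessment is None:
        return ["Housing Assessment at Exit"]
    return [
        question
        for question in required_follow_ups.get(assessment, [])
        if follow_up_answers.get(question) is None
    ]


# Illustrative mapping only; build the real one from your own assessment.
follow_ups = {"Moved to new housing unit": ["Subsidy Information"]}
missing = check_housing_assessment_at_exit(
    "Prevention", False, True, "Moved to new housing unit", {}, follow_ups
)
```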
General Missing Data
Missing Name, Relationship to Head of Household, Client Location, In Permanent Housing/Move-In Date (RRH), Length of Time Homeless questions, Last Perm Address (SSVF), Military Info (SSVF), Percent of AMI (SSVF), and Housing Assessment at Exit (Prevention) all need the obvious attention, checking for nulls or "Data not collected" answers. Some of them must be collected at Entry and Exit, some only at one or the other, so that has to be incorporated as well. I also have the report show Don't Know/Refused answers so that they can be sure those really were not collected.
RHY? HOPWA? PATH?
None of these programs have been incorporated into our Data Quality reports yet but they will need to be very soon!
If you have ideas, comments, or questions, feel free to comment! I am considering writing on a few of these topics more in depth (with code), so if there's one that interests you, please say so.