Love is Blind, But Data Shouldn’t Be: Spotting the Red Flags

data-validation language-python project-chartwatch post-miscellaneous
Kayla Vanderkruk, Chloe Pou-Prom, Maitreyee Sidhaye (DSAA, Unity Health Toronto)
2025-03-31

Back in November 2024, as the hospital network prepared for the launch of its new EPR system, Epic 🐄, DSAA was busy at work building data validation dashboards to ensure smooth sailing during the transition. Our mission? To validate that the brand-new data we’re seeing looks like what we’d expect from historical data. Once we have “enough good” data we can proceed to the next step, model validation, before relaunching our AI tools 🚀.

Dashboard construction

We created a number of dashboards and tests for ourselves with a variety of tools (e.g., pointblank). One of the dashboards created was dedicated to patient labs and vitals, with the general goal of highlighting potential issues for any of our inpatient tools and, more specifically, to help prioritize the relaunch of CHARTwatch.
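To give a flavour of the kind of rules such a dashboard encodes, here is a minimal pandas sketch (not our actual pointblank validation plan; the column names, measures, and plausible ranges below are illustrative only):

```python
import pandas as pd

def validate_labs(df: pd.DataFrame) -> dict:
    """Run a few basic sanity checks on a long-format labs table.

    Assumes columns 'measure' and 'value'; the plausible-range table
    below is a made-up illustration, not our real validation rules.
    """
    plausible = {"sodium": (110, 170), "potassium": (1.5, 9.0)}  # mmol/L
    report = {}
    for measure, (lo, hi) in plausible.items():
        vals = df.loc[df["measure"] == measure, "value"]
        report[measure] = {
            "n": int(vals.size),                                  # rows seen
            "n_missing": int(vals.isna().sum()),                  # null results
            "n_out_of_range": int(((vals < lo) | (vals > hi)).sum()),
        }
    return report
```

In practice a tool like pointblank wraps checks like these into a reusable validation plan with a rendered report, which is what makes it dashboard-friendly.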

When we first started building this dashboard, we had a bit of a problem: we had no idea what the new data would look like. So, we focused our efforts on one measure at a time, building a few different visualizations and generating summary statistics in the hopes of creating a comprehensive validation tool. We included medians, IQRs, and counts (quick gut-check stats), along with violin and QQ plots for a more in-depth investigation of each lab and vital.
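As a rough sketch of those gut-check stats and of the quantile matching that underlies a QQ plot (the function names are our own illustration, not the dashboard's actual code):

```python
import numpy as np
import pandas as pd

def summarize(values: pd.Series) -> dict:
    """Quick gut-check stats for one lab or vital."""
    q1, med, q3 = values.quantile([0.25, 0.5, 0.75])
    return {"n": int(values.count()), "median": float(med), "iqr": float(q3 - q1)}

def qq_points(old: pd.Series, new: pd.Series, n: int = 99) -> np.ndarray:
    """Matched quantiles of historical vs new data.

    Plotting these points gives a QQ plot: points hugging the y = x line
    suggest the two distributions agree; systematic bends flag a shift.
    """
    probs = np.linspace(0.01, 0.99, n)
    return np.column_stack([old.quantile(probs), new.quantile(probs)])
```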

Vibe check

Now fast-forward to today: with data flowing in from Epic, we get to put our dashboards to the test! The first challenge we faced is that, for the time being, patient labs and vitals are limited to current patients only. We’re working with many people across the organization to ensure we have historic data access, but until then, having only daily data makes it tricky to compare the new data to the historical data. Any deviation observed between the pre- and post-go-live data raises the question: is the data weird, or are the current patients sicker?
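One way to put a number on that kind of distributional deviation (a generic technique, not necessarily what our dashboard uses) is the two-sample Kolmogorov–Smirnov statistic, i.e., the largest gap between the two empirical CDFs:

```python
import numpy as np

def ks_statistic(a, b) -> float:
    """Two-sample KS statistic: max gap between empirical CDFs of a and b."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])  # evaluate both CDFs at every observed value
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.abs(cdf_a - cdf_b).max())
```

A value near 0 means the daily data and the historical data look alike; a value near 1 flags a big shift. But with only current inpatients in hand, a large statistic could just as easily mean sicker patients as weird data.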

Working with what we currently have, we dove in right away. Some measures are giving off “good” vibes, like the sodium lab (above) 💚 🧂, whereas others are raising yellow and even (possibly) red flags, demanding closer attention 🕵️

What’s next

These are only three of the 200+ measures we’re validating, and we have a ways to go to understand each of them. Our next step, once the data pipelines are in place, will be to compare the historical data to the entire post-go-live dataset, providing larger samples and stronger vibes.

It’s clear DSAA still has its hands full with data and model validation before putting their AI tools back into action. But on the bright side 🌞, there have been plenty of opportunities for growth and learning along the way. Some of the highlights include: