DSAA Blog: Ooh na na... where are my sodium labs?

February 2025 update: It happened again!

Silent deployment

Our team had been working actively on developing CHARTwatch, an early warning system for patients in general internal medicine at St. Michael’s Hospital. In November 2019 we were ready to move to a silent deployment phase, which means our entire pipeline was running (from data extraction to data processing to model prediction), but no outputs were going to the end-user.

Typically, the goal of the silent deployment phase is to uncover unexpected behaviors with the data, system, or model. During model development and evaluation, we had only worked with historical extracts of the data. When moving from historical data to live data, there’s the risk of running into data issues (Cohen et al. 2021).

The data can be different due to external factors. For example, all of our models were trained on data prior to COVID-19, but shortly after the beginning of our silent deployment phase, we began to observe cases of COVID-19 in the hospital.
The data can be different due to data entry errors. For example, a body temperature could incorrectly be entered as 3700 °C instead of 37.00 °C.
The data can be different due to selection bias. For example, during training we excluded patients with really short and really long visits, as they were rare. However, we may encounter these kinds of visits in the live data.
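
The data entry example lends itself to a simple plausibility check. The R sketch below flags temperatures outside an assumed range. The bounds, variable names, and sample values are hypothetical and chosen only for illustration; they are not taken from CHARTwatch.

```r
# Hypothetical plausibility bounds for body temperature in degrees Celsius.
# Real limits would need clinical input.
temp_bounds <- c(lower = 25, upper = 45)

# Toy readings, including one with a misplaced decimal point
temperatures <- c(36.8, 37.0, 3700, 38.2)

# TRUE where a reading falls outside the plausible range
implausible <- temperatures < temp_bounds[["lower"]] |
  temperatures > temp_bounds[["upper"]]

# Readings that would be sent for review rather than passed to the model
temperatures[implausible]
```

A check like this only catches values that are impossible, not values that are wrong but plausible, so it complements monitoring rather than replacing it.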

Monitoring labs

We had set up a monitoring dashboard to track model inputs and model outputs. On close inspection, we made a discovery that was unquestionably odd… our pipeline had not recorded a single sodium lab since we had moved to silent deployment!

Figure 1: Daily counts of lab measurements: this includes counts for calcium (CA), chloride (CL), glucose (GLPOC), potassium (K), and sodium (NA).

Did this make sense? NA! Sodium is measured in routinely ordered blood tests. It’ll usually get ordered alongside other tests (such as calcium, chloride, glucose, and potassium) as part of a basic metabolic panel. In Figure 1, we look at the daily counts of labs on units in which CHARTwatch was silently deployed. The other labs were regularly measured, but our pipeline had not detected a single sodium lab. There was NA way sodium would be missing!
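
A check of this kind can also be automated instead of relying on someone spotting a flat line on a dashboard. The following is a minimal R sketch, assuming a toy extract with hypothetical column names (date, lab_code). It counts labs per day and lists any expected lab that never appears. The missing lab_code value stands in for a sodium row whose code was lost during extraction.

```r
# Lab codes we expect to see on the monitored units (from Figure 1)
expected_labs <- c("CA", "CL", "GLPOC", "K", "NA")

# Toy extract with hypothetical column names. NA_character_ represents a
# row whose lab code was turned into a missing value during extraction.
labs <- data.frame(
  date = as.Date(c("2019-11-04", "2019-11-04", "2019-11-04",
                   "2019-11-05", "2019-11-05", "2019-11-05")),
  lab_code = c("CA", "K", NA_character_, "CL", "GLPOC", "K"),
  stringsAsFactors = FALSE
)

# A factor with explicit levels keeps labs with zero rows in the table
lab_factor <- factor(labs$lab_code, levels = expected_labs)
daily_counts <- table(date = labs$date, lab = lab_factor)

# Expected labs that were never observed over the whole period
total_counts <- colSums(daily_counts)
missing_labs <- names(total_counts)[total_counts == 0]
missing_labs
```

Using explicit factor levels matters here: without them, a lab with no rows simply disappears from the table and there is nothing to alert on.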

The NA bug

After hours of detective work, we found the issue:

In R, the programming language we used to develop CHARTwatch, the symbol NA stands for “not available” and is used to represent missing data.
In chemistry, Na is the symbol for the element sodium.

Figure 2: Daily counts of lab measurements after fixing the NA bug

Depending on the context, the symbol meant something different! Sodium appeared in our data under the uppercase code NA (as in Figure 1), and our data extraction pipeline was reading that code as “not available”!

The fix was quite straightforward. We updated the parameters of one of our function calls to specify that "" (empty string) should be used to represent “not available”, instead of "NA". From the documentation of the RODBC package:

na.strings: character string(s) to be mapped to NA when reading character data, default “NA”

After deploying this fix, sodium counts were back to normal (as seen in Figure 2).
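
The behaviour is easy to reproduce without a database. The sketch below uses base R's read.csv() as a stand-in, since it has an na.strings argument with the same default of "NA". The CSV contents and column names are made up for illustration, and this is not the RODBC call from our pipeline.

```r
# Stand-in for the database extract: a small CSV of lab codes and values.
# The second-to-last row has an empty value to show how "" is handled.
csv_text <- "lab_code,value
NA,140
K,4.1
CL,
NA,138"

# Default behaviour: the string "NA" is mapped to a missing value,
# so both sodium rows lose their lab code.
default_read <- read.csv(text = csv_text, stringsAsFactors = FALSE)
sum(is.na(default_read$lab_code))

# With na.strings = "", only the empty string is treated as missing,
# and "NA" survives as an ordinary lab code.
fixed_read <- read.csv(text = csv_text, na.strings = "",
                       stringsAsFactors = FALSE)
table(fixed_read$lab_code)
```

In both reads the empty value in the CL row still comes back as missing, because blank fields in numeric columns are treated as missing regardless of na.strings.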

While the fix was a simple one-line change, the problem we uncovered led to plenty of follow-up questions!

Were there other cases where the same symbol meant two different things based on the context?
What does our electronic health record use to represent a missing value? Does it use a number that’s biologically impossible (e.g., a body temperature of -1000)? Does it use a specific symbol or term (e.g., “not measured”, “missing”)?
How are these decisions made?
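
The first question can be partly answered with a quick scan. As a rough R sketch, the code below compares a list of known codes against tokens that tools commonly treat as missing. Both lists are assumptions for illustration: the lab codes come from Figure 1, and the missing-value tokens are a hypothetical starting set, not an inventory of what our systems use.

```r
# Codes that appear in the data (here, the lab codes from Figure 1)
lab_codes <- c("CA", "CL", "GLPOC", "K", "NA")

# Hypothetical set of strings that some tool in the pipeline might
# interpret as a missing value
na_tokens <- c("NA", "N/A", "NULL", "NaN", "")

# Compare case-insensitively, since some readers ignore case
collisions <- lab_codes[toupper(lab_codes) %in% toupper(na_tokens)]
collisions
```

Running a scan like this over every coded field before deployment would have flagged sodium ahead of time.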

Recently, there’s been a push to improve data quality standards, with proposals such as “Datasheets for Datasets” (Gebru et al. 2021) and an explosion of feature stores, model stores, and evaluation stores ¹.

Takeaways

NA (sodium) ≠ NA (not available)
Silent deployment is important.
Thorough metadata and data quality standards are important for mitigating these kinds of issues.

References

Cohen, Joseph Paul, Tianshi Cao, Joseph D. Viviano, Chin-Wei Huang, Michael Fralick, Marzyeh Ghassemi, Muhammad Mamdani, Russell Greiner, and Yoshua Bengio. 2021. “Problems in the Deployment of Machine-Learned Models in Health Care.” CMAJ 193 (35): E1391–94. https://doi.org/10.1503/cmaj.202066.

Gebru, Timnit, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, and Kate Crawford. 2021. “Datasheets for Datasets.” Communications of the ACM 64 (12): 86–92. http://arxiv.org/abs/1803.09010.

¹ What kind of “store” do we think is next? 🤔
