If I had a nickel for every time NA (sodium) got interpreted as NA (Not Available)…

language-python language-R project-chartwatch post-miscellaneous

I’d have 2 nickels which isn’t a lot but it’s weird that it happened twice

Chloe Pou-Prom (DSAA, Unity Health Toronto)
2025-02-07

The hospital recently launched a new electronic patient record (EPR). Many of our AI tools have been put on pause due to this change. As we prepare to bring our tools back online, we’ve been monitoring data from the new EPR. I wanted to use this opportunity to learn some Python, so I decided to build a Quarto dashboard with shiny and pandas.

Little did I know I would run into a familiar bug…

This happened when processing data with R. And it’s happened again with Python:

import pandas as pd

df = pd.read_excel("labs.xlsx")

What I expect:

| Lab name | Lab abbreviation |
|---|---|
| Sodium | NA |
| Sodium in urine | NAUR |
| Sodium in urine (24 hours) | NA24HUR |

What I get:

| Lab name | Lab abbreviation |
|---|---|
| Sodium | Missing value |
| Sodium in urine | NAUR |
| Sodium in urine (24 hours) | NA24HUR |

Luckily, it’s a quick fix:

pd.read_excel("labs.xlsx", keep_default_na=False)
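One thing to watch for: keep_default_na=False switches off every default missing-value marker, so blank cells come back as empty strings instead of NaN. Here is a minimal sketch of the difference. It assumes a made-up CSV stand-in for labs.xlsx (so it runs without a file) and hypothetical column names; read_csv and read_excel both accept the keep_default_na and na_values parameters.

```python
import io

import pandas as pd

# Hypothetical stand-in for labs.xlsx, written as CSV so the example is self-contained.
# The last row has a blank abbreviation on purpose.
raw = (
    "lab_name,lab_abbreviation\n"
    "Sodium,NA\n"
    "Sodium in urine,NAUR\n"
    "Sodium in urine (24 hours),NA24HUR\n"
    "Unlabelled test,\n"
)

# Default behaviour: the string "NA" is one of pandas' built-in missing-value markers.
default = pd.read_csv(io.StringIO(raw))

# Defaults switched off: "NA" survives, but the blank cell is read as an empty string.
keep_all = pd.read_csv(io.StringIO(raw), keep_default_na=False)

# Defaults switched off, then only blank cells opted back in as missing.
blanks_only = pd.read_csv(io.StringIO(raw), keep_default_na=False, na_values=[""])

print(default["lab_abbreviation"].tolist())
print(keep_all["lab_abbreviation"].tolist())
print(blanks_only["lab_abbreviation"].tolist())
```

If blank cells should still count as missing, the third call is probably the one to reach for.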

A quick search on Stack Overflow shows that people run into this regularly. And it isn’t just an issue for people working with sodium or lab values: it also happens with country codes, since Namibia’s ISO 3166-1 alpha-2 code is NA.
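For the country-code case (Namibia’s two-letter code is NA), you may want to keep "NA" as text in one column while still treating it as missing in another. A rough sketch, assuming a made-up table with hypothetical columns iso2 and n_sites: passing a dict to na_values sets the missing-value markers per column, and with keep_default_na=False the columns left out of the dict get no markers at all.

```python
import io

import pandas as pd

# Made-up data for illustration only: "NA" in iso2 means Namibia,
# while "NA" in n_sites means the count was not recorded.
raw = (
    "iso2,n_sites\n"
    "NA,3\n"
    "CA,NA\n"
    "NO,5\n"
)

countries = pd.read_csv(
    io.StringIO(raw),
    keep_default_na=False,               # no built-in markers anywhere
    na_values={"n_sites": ["", "NA"]},   # markers apply to n_sites only
)

print(countries)
```

The same two parameters are available on read_excel, so the idea should carry over to a spreadsheet.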

Anyway, tune in for a future update when I inevitably encounter this bug when coding in Julia! (Just kidding… Maybe…)