I’d have 2 nickels which isn’t a lot but it’s weird that it happened twice
The hospital recently launched a new electronic patient record (EPR). Many of our AI tools have been put on pause due to this change. As we prepare to bring our tools back online, we’ve been monitoring data from the new EPR. I wanted to use this opportunity to learn some Python, so I decided to build a Quarto dashboard with shiny and pandas.
Little did I know I would run into a familiar bug…
This happened when processing data with R. And it’s happened again with Python:
import pandas as pd
df = pd.read_excel("labs.xlsx")
What I expect:
Lab name | Lab abbreviation |
---|---|
Sodium | NA |
Sodium in urine | NAUR |
Sodium in urine (24 hours) | NA24HUR |
What I get:
Lab name | Lab abbreviation |
---|---|
Sodium | Missing value |
Sodium in urine | NAUR |
Sodium in urine (24 hours) | NA24HUR |
Luckily, it’s a quick fix:
pd.read_excel("labs.xlsx", keep_default_na=False)
A quick search on Stack Overflow shows that this is actually something people regularly encounter. And this isn’t just an issue for people working with sodium and/or lab values data: this also happens with country codes (Namibia’s ISO code is NA).
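To see the country-code version of this without needing a spreadsheet, here is a minimal sketch using `pd.read_csv`, which accepts the same `keep_default_na` and `na_values` parameters as `pd.read_excel`. The inline data, including the row with a blank code, is made up for illustration. One caveat the sketch tries to cover: with `keep_default_na=False` on its own, blank fields are read as empty strings rather than missing values, so I pass `na_values=[""]` to keep treating blanks as missing.

```python
import io

import pandas as pd

# Hypothetical data: Namibia's ISO code is the literal string "NA",
# and the last row has a genuinely blank code.
csv_text = (
    "Country,Code\n"
    "Namibia,NA\n"
    "Canada,CA\n"
    "Unknown,\n"
)

# Default parsing: "NA" is on pandas' built-in list of missing-value markers.
default = pd.read_csv(io.StringIO(csv_text))

# Turn off the built-in list, but still treat blank fields as missing.
fixed = pd.read_csv(
    io.StringIO(csv_text),
    keep_default_na=False,
    na_values=[""],
)

print(default["Code"].isna().tolist())  # which codes were read as missing
print(fixed["Code"].isna().tolist())
print(fixed.loc[0, "Code"])             # Namibia's code, kept as text
```

If only one column is affected, `na_values` also accepts a dict mapping column names to their own missing-value markers, which may be a gentler option than changing the rules for the whole file.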
Anyway, tune in for a future update when I inevitably encounter this bug when coding in Julia! (Just kidding… Maybe…)