Application of Positive and Unlabeled Learning Algorithms to Hospital Administrative Data

Finding unlabeled sepsis cases to increase the quality of hospital administrative data for health economics and health services research

Project description

In Positive and Unlabeled (PU) learning problems, the main assumption is that only some of the positive examples in a dataset are labeled as positive. The remaining examples are not confirmed negatives; they are unlabeled and consist of a mixture of both positive and negative examples.

The current state of research on the coding of sepsis indicates that hospital administrative data can, at least in part, be classified as PU data. This poses a major challenge for health economics and health services research, as hospitalizations for certain diseases, inpatient complications, and other indicators (e.g., robotic-assisted surgery) might be underestimated.

In this project, we apply PU learning algorithms to hospital administrative data to find positive examples among the unlabeled examples. Our results will indicate whether existing learning algorithms can be used to enhance the quality of hospital administrative data.
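To illustrate the idea of finding positives among unlabeled records, the following is a minimal sketch of one well-known PU correction (Elkan and Noto, 2008), not necessarily the method used in this project. It assumes that coded cases are a random sample of all true cases ("selected completely at random"), that the share of true cases that get coded, c, is known, and that a classifier has already produced scores g = P(coded | record). The function name, the value of c, and the scores are hypothetical and chosen only for illustration.

```python
def hidden_positive_probability(g, c):
    """Probability that an UNLABELED record is a true positive.

    g: classifier score P(s=1 | x), the probability that the record is coded.
    c: assumed label frequency P(s=1 | y=1), the share of true cases coded.
    Under the selected-completely-at-random assumption, g = c * P(y=1 | x),
    which gives P(y=1 | x, s=0) = (1 - c) / c * g / (1 - g).
    """
    if not 0.0 < c < 1.0:
        raise ValueError("c must lie strictly between 0 and 1")
    if g >= c:
        # Scores at or above c imply P(y=1 | x) >= 1, so cap at certainty.
        return 1.0
    return (1.0 - c) / c * g / (1.0 - g)


# Hypothetical values: half of all true cases are coded, and three
# uncoded hospital records received these classifier scores.
C_ASSUMED = 0.5
unlabeled_scores = [0.05, 0.25, 0.5]

# Expected number of true cases hidden among the uncoded records.
expected_hidden = sum(
    hidden_positive_probability(g, C_ASSUMED) for g in unlabeled_scores
)
print(f"Expected hidden positives: {expected_hidden:.2f}")
```

In practice c is unknown and would itself have to be estimated, for example from held-out coded cases, and the random-sampling assumption may not hold for administrative coding.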

Project team

Dr. Justus Vogel
Johannes Cordier


Funding source




12 months (01.01.2024-31.12.2024)