Kaplan-Meier Survival Analysis - Study Notes

Gamze Bulut
Apr 1, 2025

Have you ever wondered how survival probabilities are calculated?

Well, even if you haven’t, here are my study notes to spark your curiosity. 🙂

In survival studies, we track patients over time to observe if and when certain “events” occur—like death, relapse, or recovery. This type of analysis helps us estimate the probability that a patient with a specific diagnosis will survive beyond a given number of years. Let's explore some key definitions to understand how this works:

✅ What is Time-to-Event (Survival) Data?

Data where the outcome is the time until an event occurs (e.g., death, relapse, failure).
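It needs special handling because:
- Event times are usually right-skewed (not normally distributed) and can never be negative
- It may involve censoring (for some subjects we only know the event had not happened by their last observation, not the exact time of the event)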
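To make this concrete, here is a minimal sketch of how time-to-event data is commonly stored: one follow-up duration plus one flag saying whether the event was actually observed. The `Subject` class, its field names, and the numbers are all made up for illustration; they are not from any real study or library.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    duration: float  # follow-up time in months (hypothetical unit)
    event: bool      # True = event observed, False = censored


# Made-up records, purely for illustration.
subjects = [
    Subject(5.0, True),    # event seen at 5 months
    Subject(12.0, False),  # still event-free when last seen at 12 months
    Subject(8.5, True),
    Subject(24.0, False),
    Subject(3.0, True),
]

n_events = sum(s.event for s in subjects)
n_censored = len(subjects) - n_events
print(f"{n_events} events, {n_censored} censored")
```

The key point is that a censored row still carries information: that subject survived at least as long as the recorded duration.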


✅ Why Can't We Use Linear Regression?

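Linear regression assumes:
- Normally distributed residuals
- No censored data (everyone has a complete outcome)
But in survival data:
- The distribution of event times is usually right-skewed
- Censoring is common (e.g., patient drops out or event hasn't happened yet)
With right-censored data, dropping the censored subjects or treating their last follow-up time as the event time both make survival look shorter than it really is.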
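A small simulation can show why ignoring censoring is a problem. This is only a sketch under assumptions I picked for illustration: survival times drawn from an exponential distribution with a mean of 10 years, and a study that stops following everyone at 8 years. It compares a plain average (the simplest possible "regression", with an intercept only) against the true average.

```python
import random

rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_MEAN = 10.0         # assumed mean survival time, in years
STUDY_LENGTH = 8.0       # assumed follow-up cutoff, in years

# True event times (in real life we would never see all of these).
true_times = [rng.expovariate(1 / TRUE_MEAN) for _ in range(1000)]

# What the study records: the event time, or the cutoff if censored.
observed = [min(t, STUDY_LENGTH) for t in true_times]
event = [t <= STUDY_LENGTH for t in true_times]

true_mean = sum(true_times) / len(true_times)

# Mistake 1: treat the censoring time as if it were the event time.
naive_mean = sum(observed) / len(observed)

# Mistake 2: throw away the censored subjects.
events_only = [t for t, e in zip(observed, event) if e]
events_only_mean = sum(events_only) / len(events_only)

print(f"true mean:        {true_mean:.2f}")
print(f"naive mean:       {naive_mean:.2f}")
print(f"events-only mean: {events_only_mean:.2f}")
```

Both shortcuts come out below the true mean, because the longest survivors are exactly the ones whose times get cut off or discarded.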

✅ What is Right Censoring?

Occurs when we stop observing a subject before the event happens, so we only know that the event time is later than the last observation.
Common reasons:
- The study ends while the patient is still event-free
- Patient is lost to follow-up
- Patient drops out of the study
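The reasons above can be sketched as a small function that turns one subject's history into the (time, event observed) pair used in survival analysis. The function name `observe` and its parameters are hypothetical, chosen just for this example, and times are measured in months from study entry.

```python
from typing import Optional, Tuple


def observe(
    event_time: Optional[float],    # None if the event never happens
    dropout_time: Optional[float],  # None if the patient never leaves early
    study_end: float,
) -> Tuple[float, bool]:
    """Return (observed_time, event_observed) for one subject."""
    # Observation stops at dropout or at the end of the study, whichever is first.
    stop = study_end if dropout_time is None else min(dropout_time, study_end)
    if event_time is not None and event_time <= stop:
        return event_time, True   # event seen while under observation
    return stop, False            # right-censored at the last observation


print(observe(14, None, 36))   # event happens during the study
print(observe(50, None, 36))   # study ends first
print(observe(None, 20, 36))   # lost to follow-up at month 20
print(observe(30, 20, 36))     # event happens, but after the patient left
```

In the last case the event does occur, but nobody is watching anymore, so the record still counts as censored at month 20.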