Kaplan-Meier Survival Analysis - Study Notes
- Gamze Bulut
- Apr 1
- 2 min read

Have you ever wondered how survival probabilities are calculated?
Well, even if you haven’t, here are my study notes to spark your curiosity. 🙂
In survival studies, we track patients over time to observe if and when certain “events” occur—like death, relapse, or recovery. This type of analysis helps us estimate the probability that a patient with a specific diagnosis will survive beyond a given number of years. Let's explore some key definitions to understand how this works:
✅ What is Time-to-Event (Survival) Data?
Data where the outcome is the time until an event occurs (e.g., death, relapse, failure).
Special because:
It is right-skewed (not normally distributed)
It may involve censoring (we don't know the exact time of event for some subjects)
✅ Why Can't We Use Linear Regression?
Linear regression assumes:
Normally distributed residuals
No censored data (everyone has a complete outcome)
But in survival data:
Distribution is right-skewed
Censoring is common (e.g., patient drops out or event hasn't happened yet)
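To see why censoring is a problem for ordinary averaging or regression, here is a tiny sketch. The numbers are made up for illustration (they are not from any real study), and `study_length` is a hypothetical follow-up cutoff. It compares the true mean survival time with two naive shortcuts: treating censored times as if they were event times, and throwing censored patients away.

```python
# Hypothetical true survival times in months (unknown in a real study).
true_times = [5, 8, 12, 20, 30, 45]
study_length = 24  # assumed cutoff: events after this are never observed

# What we actually get to record: the event time if it happened in time,
# otherwise the time at which we stopped watching (a censored value).
observed = [min(t, study_length) for t in true_times]
event_seen = [t <= study_length for t in true_times]

true_mean = sum(true_times) / len(true_times)

# Shortcut 1: pretend censored times are event times.
naive_mean = sum(observed) / len(observed)

# Shortcut 2: drop the censored patients entirely.
dropped_mean = sum(t for t, e in zip(observed, event_seen) if e) / sum(event_seen)

print(true_mean, naive_mean, dropped_mean)
```

In this toy example both shortcuts come out below the true mean, because the longest survivors are exactly the ones whose times get cut off or discarded.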
✅ What is Right Censoring?
Occurs when we stop observing a subject before the event happens.
Common reasons:
The study ends before the patient has had the event
Patient is lost to follow-up
Patient withdraws from the study
Example: A patient is still alive at study end → we don't know when they will die → their survival time is right-censored.
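In practice, right-censored data is usually stored as two values per patient: how long they were observed, and whether the event was actually seen. Here is a minimal sketch of that encoding. The `Observation` class and the `observe` helper are hypothetical names I made up for illustration, and the times are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    duration: float  # how long the patient was under observation
    event: bool      # True if the event was seen, False if right-censored


def observe(event_time: Optional[float],
            dropout_time: Optional[float],
            study_end: float) -> Observation:
    """Turn a patient's (hypothetical) timeline into a censored record."""
    # Observation stops at the study end or at dropout, whichever is first.
    stop = study_end if dropout_time is None else min(study_end, dropout_time)
    if event_time is not None and event_time <= stop:
        return Observation(event_time, True)   # event observed
    return Observation(stop, False)            # right-censored at `stop`


records = [
    observe(event_time=14, dropout_time=None, study_end=60),    # event at month 14
    observe(event_time=None, dropout_time=22, study_end=60),    # lost to follow-up
    observe(event_time=75, dropout_time=None, study_end=60),    # alive at study end
]
for r in records:
    print(r)
```

The third patient is the one from the example above: the event would happen at month 75, but all we can record is "still event-free at month 60".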
Still here? Brave. Now let’s survive these equations together.
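For reference, here is a sketch of the standard textbook definitions in LaTeX. The notation is my assumption: T is the time until the event, f(t) is its density, t_i are the distinct observed event times, d_i is the number of events at t_i, and n_i is the number of patients still at risk just before t_i.

```latex
% Survival function: probability of surviving beyond time t
S(t) = P(T > t)

% Hazard function: instantaneous event rate at t, given survival up to t
h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
     = \frac{f(t)}{S(t)}
     = -\frac{d}{dt} \log S(t)

% Cumulative hazard and its link back to the survival function
H(t) = \int_0^t h(u)\, du, \qquad S(t) = \exp\bigl(-H(t)\bigr)

% Kaplan-Meier estimator of the survival function
\hat{S}(t) = \prod_{t_i \le t} \left( 1 - \frac{d_i}{n_i} \right)
```

The last line is the Kaplan-Meier idea in one formula: at each event time, multiply by the fraction of at-risk patients who made it past that time.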


Okay, that escalated quickly! But don’t worry: these formulas are just fancy ways of describing how survival works over time.
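To make this concrete, here is a minimal from-scratch sketch of the Kaplan-Meier estimate in plain Python. It assumes the usual convention that a patient censored at time t still counts as "at risk" at t, and the data is invented for illustration. The function name `kaplan_meier` is my own, not a library API.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: observed time for each patient
    events:    True if the event was observed, False if right-censored
    """
    # Only times with at least one observed event change the estimate.
    event_times = sorted({t for t, e in zip(durations, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # n_i: patients still under observation just before time t
        at_risk = sum(1 for d in durations if d >= t)
        # d_i: patients who had the event exactly at time t
        died = sum(1 for d, e in zip(durations, events) if e and d == t)
        # Multiply by the fraction of at-risk patients who survived past t.
        survival *= 1 - died / at_risk
        curve.append((t, survival))
    return curve


# Made-up data: six patients, two of them right-censored.
durations = [2, 3, 3, 5, 6, 8]
events = [True, True, False, True, False, True]
curve = kaplan_meier(durations, events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.3f}")
```

Notice that the censored patients never cause a drop in the curve themselves; they only shrink the at-risk group for later event times. That is how Kaplan-Meier uses their partial information instead of discarding it.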

The derivation part was a little over my calculus level, but I hope these notes still help you understand the formulas and what they mean. 🤗


