What Is Survival Analysis?
Essay by samiresh • May 31, 2016 • Case Study
Survival Analysis
Survival analysis is also known by other names, including failure time analysis, reliability analysis, event history analysis, duration analysis, transition analysis, and time-to-event analysis.
What is Survival Analysis?
Survival analysis focuses on how long subjects remain in the sample before an event occurs. Subjects are tracked until the event of interest happens or until they leave the study. When the event of interest happens, the subject is said to fail. The failure can be a good or a bad event, such as getting a job or dying from a disease. A subject is said to be censored when it leaves the study, or the study ends, before the event of interest is observed. For example, if individuals are tracked to see how long they live before dying of a disease, an individual who dies in an accident is censored at the time of the accident. Survival analysis also focuses on the hazard rate, which describes how likely an individual is to experience the event at time t, given that the individual is still at risk at that time (in discrete time, this is a conditional probability).
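In symbols, the survival function and the hazard rate described above are commonly written as follows. The notation is an assumption, since the essay does not introduce symbols: T denotes the random time until the event, S(t) the survival function, and h(t) the hazard rate.

```latex
S(t) = \Pr(T > t), \qquad
h(t) = \lim_{\Delta t \to 0}
       \frac{\Pr\left(t \le T < t + \Delta t \mid T \ge t\right)}{\Delta t}
```

When time is recorded in discrete steps, the hazard reduces to the conditional probability h(t) = Pr(T = t | T ≥ t), which is the form used in the worked table at the end of this essay.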
Applications of Survival Analysis
Survival analysis can be applied in many areas, including medicine, finance, engineering, business and economics, and the social sciences.
- Medical:
  - The time to recurrence of an illness or other medical condition.
- Finance:
  - The time until borrowers default on their loans, or how long they continue to repay.
- Industrial Engineering:
  - The failure time distributions of industrial machine components, electronic equipment, and automobile components.
- Business and Economics:
  - The time until new firms exit the business, or how long they survive.
  - The time until companies adopt a new technology (some have adopted it, others have not yet).
- Social:
  - The time to marriage, pregnancy, retirement, or getting a first job.
Variables in Survival Analysis
Survival analysis uses two main variables: a time variable and an event variable.
- Time variable: the length of time until the individual experiences the event or leaves the study.
- Event variable: a dummy variable that is 1 when the event has happened and 0 when it has not yet happened.
- Censor variable: can be used instead of the event variable. It is 1 when the event has not yet happened and 0 when the event has happened.
It is therefore very important to understand the type and meaning of your variables, because the event variable and the censor variable have opposite meanings.
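Because the two indicators are complements of each other, converting between them is a one-line operation. The sketch below uses made-up values, and the names `event`, `censor`, and `event_to_censor` are hypothetical, chosen only for illustration.

```python
def event_to_censor(indicator):
    """Flip a 0/1 event indicator into a 0/1 censor indicator (or back)."""
    if any(v not in (0, 1) for v in indicator):
        raise ValueError("indicator values must be 0 or 1")
    return [1 - v for v in indicator]


# Made-up data: follow-up time in days and whether the event was observed.
time = [5, 8, 12, 12, 20]
event = [1, 0, 1, 1, 0]           # 1 = event happened, 0 = not yet

censor = event_to_censor(event)   # 1 = censored (no event yet), 0 = event happened
print(list(zip(time, event, censor)))
```

Applying the same function twice returns the original indicator, which is a quick way to check which coding a dataset uses.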
Models in Survival Analysis
In survival analysis, there are three main types of model:
- Non-Parametric
- Semi-Parametric
- Parametric
Non-Parametric Models
A non-parametric model analyzes only time and event, without any other explanatory variables. It can also show the survival function separately for different groups in the study.
Install the [pic 1] package to use the functions for survival analysis, and install the [pic 2] package to use the dataset. [pic 3]
[pic 4]
It is a data frame with 7 variables and 2,843 rows. [pic 5]
[pic 6]
Here, we want to track the subjects who died from AIDS contracted through blood. We select only the rows whose T.categ column has the value “blood”, keep all columns, and name the new data frame BloodAids.
[pic 7]
To study the survival function, we use the [pic 8] function to create the survival curve, which plots survival probability against time, and the [pic 9] function to store the time and event information, where time is the length of time and event is 0 or 1 (1 = dead, 0 = not dead).
First, we calculate the length of time, which is the number of days since diagnosis. Then we specify the event variable by recoding the values in the status column to 0 and 1. (Note that D in the status column means dead and becomes 1, and A means not dead and becomes 0.) [pic 10]
[pic 11]
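The preparation steps described above (filter one transmission category, compute the length of time, recode the status) can be sketched in Python on a few made-up rows. The columns `status` and `T.categ` are named in the text; `diag` and `death` are assumed column names holding day numbers, and none of the values below come from the dataset.

```python
# Made-up rows for illustration only.
rows = [
    {"diag": 10000, "death": 10150, "status": "D", "T.categ": "blood"},
    {"diag": 10020, "death": 10400, "status": "A", "T.categ": "other"},
    {"diag": 10100, "death": 10130, "status": "D", "T.categ": "blood"},
    {"diag": 10200, "death": 10900, "status": "A", "T.categ": "blood"},
]

# Keep only the rows whose transmission category is "blood" (all columns kept).
blood_aids = [r for r in rows if r["T.categ"] == "blood"]

# Length of time: days from diagnosis to death or to the end of observation.
time = [r["death"] - r["diag"] for r in blood_aids]

# Event variable: 1 = dead (status "D"), 0 = not dead (status "A").
event = [1 if r["status"] == "D" else 0 for r in blood_aids]

print(list(zip(time, event)))
```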
To plot the survival curve for this sample, the syntax is:
[pic 12]
The result is shown in the figure below.
[pic 13]
To see a summary of the survival probability at each time t, use [pic 14].
[pic 15]
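To make the columns of such a summary concrete, here is a minimal sketch that computes a survival curve from raw (time, event) pairs. It assumes the curve is the Kaplan-Meier (product-limit) estimate, which matches the arithmetic in the worked table at the end of this essay; the function name `kaplan_meier` and the data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return (t, n_risk, n_event, survival) for each distinct event time."""
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    n_risk = len(times)       # everyone is at risk at the start
    survival = 1.0
    table = []
    for t in sorted(set(times)):
        d = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        c = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 0)
        if d > 0:
            # Multiply by the share of at-risk subjects surviving time t.
            survival *= 1 - d / n_risk
            table.append((t, n_risk, d, survival))
        # Subjects who failed or were censored at t leave the risk set.
        n_risk -= d + c
    return table


# Made-up data: 1 = event observed, 0 = censored.
times = [2, 3, 3, 5, 7, 8]
events = [1, 1, 0, 1, 0, 1]

km = kaplan_meier(times, events)
for t, n_risk, n_event, s in km:
    print(f"time={t} n.risk={n_risk} n.event={n_event} survival={s:.4f}")
```

Censored subjects never lower the curve directly; they only shrink the number at risk for later times.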
Next, we use the non-parametric model to estimate the survival function by group, which is gender in this case. We use:
[pic 16]
Then plot the survival curves.
[pic 17]
The resulting plot below has two curves, one for each gender.
[pic 18]
The summary of the plot can be reported using [pic 19]. There is a separate report for each gender, and reading the survival values of each group in the summary tells us which curve belongs to which gender. In the reported result below, the time column gives each time t, the n.risk column is the number of subjects still at risk at time t, the n.event column is the number of subjects experiencing the event at time t, and the survival column is the survival probability at time t. The report also gives the standard error and the lower and upper limits of the 95% confidence interval.
[pic 20]
[pic 21]
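The essay does not say how the standard error and confidence limits are computed. One common choice for a product-limit survival estimate is Greenwood's formula, sketched below under that assumption, where n_i is the number at risk and d_i the number of events at event time t_i (symbols introduced here for illustration).

```latex
\widehat{\operatorname{Var}}\bigl[\hat S(t)\bigr]
  = \hat S(t)^2 \sum_{t_i \le t} \frac{d_i}{n_i \,(n_i - d_i)},
\qquad
\hat S(t) \pm 1.96 \sqrt{\widehat{\operatorname{Var}}\bigl[\hat S(t)\bigr]}
```

Software may instead compute the limits on a transformed scale (for example, on the log of the survival probability), so the limits in a given report can differ from this plain interval.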
The table below shows an example of how to calculate the hazard function and the survival function at each time t.
Time t (A) | Number of subjects at risk at time t (B) | Number of subjects experiencing the event (C) | Number of censored observations (D) | Hazard Function (C)/(B) | Cumulative Hazard Function | Survival Function |
1 | 100 | 4 | 2 | 4/100=0.04 | 0.04 | 1-0.04= 0.96 |
2 | 100-4-2= 94 | 6 | 3 | 6/94= 0.064 | 0.04+0.064=0.104 | 0.96*(1-0.064)= 0.899 |
3 | 94-6-3= 85 | 1 | 10 | 1/85= 0.012 | 0.104+0.012= 0.116 | 0.899*(1-0.012)= 0.888 |
4 | 85-1-10= 74 | 5 | 2 | 5/74= 0.068 | 0.116+0.068= 0.184 | 0.888*(1-0.068)= 0.828 |
...
...
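The arithmetic in the four rows shown above can be reproduced with a short sketch. The function name `life_table` and the dictionary keys are hypothetical; the inputs are the event and censoring counts from the table. The code carries full precision between steps, so the last cumulative hazard rounds to 0.183 rather than the 0.184 obtained in the table by adding already-rounded values.

```python
def life_table(n_start, events, censored):
    """Compute hazard, cumulative hazard, and survival for each time step."""
    n_risk = n_start
    cum_hazard = 0.0
    survival = 1.0
    rows = []
    for t, (d, c) in enumerate(zip(events, censored), start=1):
        hazard = d / n_risk          # (C) / (B)
        cum_hazard += hazard         # running sum of hazards
        survival *= 1 - hazard       # running product of (1 - hazard)
        rows.append({
            "t": t,
            "n_risk": n_risk,
            "hazard": hazard,
            "cum_hazard": cum_hazard,
            "survival": survival,
        })
        # Failures and censored subjects leave the risk set.
        n_risk -= d + c
    return rows


# Counts taken from the table: events (C) and censored observations (D).
rows = life_table(100, events=[4, 6, 1, 5], censored=[2, 3, 10, 2])
for r in rows:
    print(r["t"], r["n_risk"], round(r["hazard"], 3),
          round(r["cum_hazard"], 3), round(r["survival"], 3))
```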