R语言使用ipwtm函数对生存数据进行时点加权
这是一个HIV患者的生存数据,比较的是一种HAART治疗治疗方法对患者生存的影响。ipwtm函数对生存数据加权很简单,其实就是一句话代码,exposure就是放入你的研究变量,numerator这里填入权重分子,如果按照Robins的算法这里填入1,属于不稳定权重,denominator这里属于权重分母,填入我们的协变量,它对权重的计算有点类似ipwpoint函数,不过更复杂,但是大概的原理是分子
后台粉丝希望我介绍一下能进行生存数据时点加权的ipwtm函数,今天简单分享一下。ipwtm函数属于ipw包,我们先导入包和数据。
library(ipw)
data(haartdat)
head(haartdat,6)
## patient tstart fuptime haartind event sex age cd4.sqrt endtime dropout
## 1 1 -100 0 0 0 1 22 23.83275 2900 0
## 2 1 0 100 0 0 1 22 25.59297 2900 0
## 3 1 100 200 0 0 1 22 23.47339 2900 0
## 4 1 200 300 0 0 1 22 24.16609 2900 0
## 5 1 300 400 0 0 1 22 23.23790 2900 0
## 6 1 400 500 0 0 1 22 24.85961 2900 0
这是一个HIV患者的生存数据,比较的是一种HAART治疗治疗方法对患者生存的影响。patient患者的ID,tstart开始随访的时间,fuptime随访终止的时间,haartind是否采用HAART治疗,event结局指标,是否死亡。sex性别,age随访时候的年龄,cd4.sqrt:CD4细胞的平方根。 ipwtm函数对生存数据加权很简单,其实就是一句话代码,exposure就是放入你的研究变量,numerator这里填入权重分子,如果按照Robins的算法这里填入1,属于不稳定权重,denominator这里属于权重分母,填入我们的协变量,它对权重的计算有点类似ipwpoint函数,不过更复杂,但是大概的原理是分子这个地方的变量生成一个模型,分母这个地方的变量生成一个模型,然后对预测值相除,当然过程中还有复杂的处理。tstart开始时间,timevar结束时间,data就是你的数据。type的类型有”first”, “cens” and “all”. “first”是权重由最低值进行转换。
temp <- ipwtm(
exposure = haartind,
family = "survival",
numerator = ~ sex + age,
denominator = ~ sex + age + cd4.sqrt,
id = patient,
tstart = tstart,
timevar = fuptime,
type = "first",
data = haartdat)
跑出结果后权重就是在temp中
我们可以把权重提取出来绘图
ipwplot(weights = temp$ipw.weights, timevar = haartdat$fuptime,
binwidth = 100, ylim = c(-1.5, 1.5), main = "Stabilized inverse probability weights")

这个图画法其实很简单,X就是我们的随访时间,Y轴就是权重取了个对数,设定一段宽度,每隔一段画一个箱线图。如果把ype设置为”cens”,进行的是逆概率删减权重(IPCW),我也不是很了解。
temp2 <- ipwtm(
exposure = dropout,
family = "survival",
numerator = ~ sex + age,
denominator = ~ sex + age + cd4.sqrt,
id = patient,
tstart = tstart,
timevar = fuptime,
type = "cens",
data = haartdat)
ipwplot(weights = temp2$ipw.weights, timevar = haartdat$fuptime,
binwidth = 100, ylim = c(-1.5, 1.5), main = "Stabilized inverse probability of censoring weights")

下面是加权和不加权对结果的影响
加权的
require(survival)
## 载入需要的程辑包:survival
summary(coxph(Surv(tstart, fuptime, event) ~ haartind + cluster(patient),
data = haartdat, weights = temp$ipw.weights*temp2$ipw.weights))
## Call:
## coxph(formula = Surv(tstart, fuptime, event) ~ haartind, data = haartdat,
## weights = temp$ipw.weights * temp2$ipw.weights, cluster = patient)
##
## n= 19175, number of events= 31
##
## coef exp(coef) se(coef) robust se z Pr(>|z|)
## haartind -0.9363 0.3921 0.4299 0.4527 -2.068 0.0386 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## haartind 0.3921 2.551 0.1614 0.9521
##
## Concordance= 0.597 (se = 0.029 )
## Likelihood ratio test= 5.57 on 1 df, p=0.02
## Wald test = 4.28 on 1 df, p=0.04
## Score (logrank) test = 5.07 on 1 df, p=0.02, Robust = 4.83 p=0.03
##
## (Note: the likelihood ratio and score tests assume independence of
## observations within a cluster, the Wald and robust score tests do not).
不加权的
summary(coxph(Surv(tstart, fuptime, event) ~ haartind, data = haartdat))
## Call:
## coxph(formula = Surv(tstart, fuptime, event) ~ haartind, data = haartdat)
##
## n= 19175, number of events= 31
##
## coef exp(coef) se(coef) z Pr(>|z|)
## haartind -0.5862 0.5565 0.4368 -1.342 0.18
##
## exp(coef) exp(-coef) lower .95 upper .95
## haartind 0.5565 1.797 0.2364 1.31
##
## Concordance= 0.569 (se = 0.032 )
## Likelihood ratio test= 1.97 on 1 df, p=0.2
## Wald test = 1.8 on 1 df, p=0.2
## Score (logrank) test = 1.85 on 1 df, p=0.2
加权的haartind是0.3921,不加权的haartind是0.5565,加权后影响变小。
绘制加权后的生存函数曲线
f1<-survfit(Surv(tstart, fuptime, event) ~ haartind + cluster(patient),
data = haartdat, weights = temp$ipw.weights*temp2$ipw.weights)
ggsurvplot(f1, data = haartdat)

DAMO开发者矩阵,由阿里巴巴达摩院和中国互联网协会联合发起,致力于探讨最前沿的技术趋势与应用成果,搭建高质量的交流与分享平台,推动技术创新与产业应用链接,围绕“人工智能与新型计算”构建开放共享的开发者生态。
更多推荐


所有评论(0)