These models are fitted using the Julia package Optim (methods included: “NM” (NelderMead), “N” (Newton), “LBFGS” (LBFGS), “CG” (ConjugateGradient), “GD” (GradientDescent)). Thus, the user needs to specify the initial points and to check the convergence of the optimisation step, as usual.
A description of these hazard models is presented below as well as the available baseline hazards.
1.1 General Hazard model
The GH model is formulated in terms of the hazard structure
where \({\bf x}\in{\mathbb R}^p\) are the covariates that affect the hazard level; \(\tilde{\bf x} \in {\mathbb R}^q\) are the covariates the affect the time level (typically \(\tilde{\bf x} \subset {\bf x}\)); \(\alpha \in {\mathbb R}^q\) and \(\beta \in {\mathbb R}^p\) are the regression coefficients; and \(\theta \in \Theta\) is the vector of parameters of the baseline hazard \(h_0(\cdot)\).
This hazard structure leads to an identifiable model as long as the baseline hazard is not a hazard associated to a member of the Weibull family of distributions (Chen and Jewell 2001).
1.2 Accelerated Failure Time (AFT) model
The AFT model is formulated in terms of the hazard structure
\[
h(t; \beta, \theta, {\bf x}) = h_0\left(t \exp\{{\bf x}^{\top}\beta\}; \theta\right) \exp\{{\bf x}^{\top}\beta\}.
\] where \({\bf x}\in{\mathbb R}^p\) are the available covariates; \(\beta \in {\mathbb R}^p\) are the regression coefficients; and \(\theta \in \Theta\) is the vector of parameters of the baseline hazard \(h_0(\cdot)\).
1.3 Proportional Hazards (PH) model
The PH model is formulated in terms of the hazard structure
where \({\bf x}\in{\mathbb R}^p\) are the available covariates; \(\beta \in {\mathbb R}^p\) are the regression coefficients; and \(\theta \in \Theta\) is the vector of parameters of the baseline hazard \(h_0(\cdot)\).
1.4 Accelerated Hazards (AH) model
The AH model is formulated in terms of the hazard structure
where \(\tilde{\bf x}\in{\mathbb R}^q\) are the available covariates; \(\alpha \in {\mathbb R}^q\) are the regression coefficients; and \(\theta \in \Theta\) is the vector of parameters of the baseline hazard \(h_0(\cdot)\).
2 Available baseline hazards
The current version of the HazReg.jl Julia package implements the following parametric baseline hazards for the models discussed in the previous section.
Weibull (Weibull) distribution. (only for AFT, PH, and AH models)
All positive parameters are transformed into the real line using a log link (reparameterisation).
3 Illustrative example: Julia code
In this example, we analyse the LeukSurv data set from the R package spBayesSurv. This data set contains information about the survival of acute myeloid leukemia in 1,043 patients.
For the GH model, we consider the hazard level covariates (\({\bf x}\)) age (standardised), sex, wbc (white blood cell count at diagnosis, standardised), and tpi (the Townsend score, standardised); and the time level covariates (\({\bf x}\)) age (standardised), wbc (white blood cell count at diagnosis, standardised), and tpi (the Townsend score, standardised). For the PH, AFT, and AH models, we consider the covariates age (standardised), sex, wbc (white blood cell count at diagnosis, standardised), and tpi (the Townsend score, standardised).
For illustration, we fit the 4 models with both (3-parameter) PGW and (2-parameter) LL baseline hazard. In addition, we fit the GH model with GenGamma, EW, LogNormal, LogLogistic, and Gamma baseline hazards. We compare these models in terms of AIC (BIC can be used as well). We summarise the best selected model with the available tools in this package.
# Confidence intervals for the survival function based on a normal approximation# at specific time points t0# Hessian and asymptotic covariance matrixHESS = ForwardDiff.hessian(OPTLLGH[2], OPTLLGH[1].minimizer);Sigma =inv(HESS);#= A "hackish" workaround to a bug in ForwardDiff, which mayproduce non-symmetric hessian matrices.Here, I am replacing the lower diagonal of Sigma by its upper diagonal=#ps =size(Sigma, 1)for i in1:ps-1for j in i+1:ps Sigma[i, j] = Sigma[j, i]endend# Reparameterised MLE r_MLE = OPTLLGH[1].minimizer;# The function to obtain approximate CIs based on Monte Carlo simulations # from the asymptotic normal distribution of the MLEs# t0 : time where the confidence interval will be calculated# level : confidence level# nmc : number of Monte Carlo iterationsfunctionConfIntSurv(t0::Float64, level::Float64, nmc::Int64) p0 =size(des_t)[2] p1 =size(des)[2] mc =fill(0.0, nmc)functionS_par(par::Vector{Float64}) outs =mean( exp.( -chLogLogistic(t0.*exp.(des_t * par[3:(2+p0)]), par[1], par[2]) .*exp.( des * par[(3+p0):(2+p0+p1)].- des_t * par[3:(2+p0)]) ) )return outsendfor i in1:nmc mv_normal =MvNormal(r_MLE, Sigma) val =rand(mv_normal, 1) val1 = [val[1],exp(val[2]),val[3:end]...] mc[i] =S_par(val1)end L =quantile(mc,(1-level)*0.5) U =quantile(mc,(1+level)*0.5) M =S_par(MLE)return [L,M,U]end# times for CIs calculationstimesCI = [1.0,2.5,5,7.5,10,12.5];CIS =zeros(length(timesCI), 4);for k in1:length(timesCI)CIS[k,:] =vcat(timesCI[k], ConfIntSurv(timesCI[k],0.95,10000))endCIS =DataFrame(CIS, :auto);rename!( CIS, ["year","lower","population survival","upper"] );println(CIS)
Chen, Y. Q., and N. P. Jewell. 2001. “On a General Class of Semiparametric Hazards Regression Models.”Biometrika 88 (3): 687–702.
Chen, Y. Q., and M. C. Wang. 2000. “Analysis of Accelerated Hazards Models.”Journal of the American Statistical Association 95 (450): 608–18.
Cox, D. R. 1972. “Regression Models and Life-Tables.”Journal of the Royal Statistical Society: Series B (Methodological) 34 (2): 187–202.
Kalbfleisch, J. D., and R. L. Prentice. 2011. The Statistical Analysis of Failure Time Data. John Wiley & Sons.
Rubio, F. J., L. Remontet, N. P. Jewell, and A. Belot. 2019. “On a General Structure for Hazard-Based Regression Models: An Application to Population-Based Cancer Research.”Statistical Methods in Medical Research 28: 2404–17.