Data
Description of the cohort
The data used for this study were obtained from Jimma University Specialized Hospital HIV/AIDS clinic, located 352 Km Southwest of Addis Ababa, Ethiopia. The Hospital gives VCT, PMTCT and free ART service for people living in Jimma Town and Southwest Ethiopia. Patients begin ART after they have been checked for medical eligibility and are counseled for adherence for ART. Patients presenting with WHO stage 4 disease and/or a CD4 count lower than 200 were eligible to start ART. Those who started ART, have a regular followup for drug adverse effects, management of opportunistic infection, TB screen and counseling related to family planning. In addition, CD4 count is measured at each visit. Viral load measurement is not available. Adverse event monitoring is conducted by clinicians during medical visits in accordance with national guidelines.
Decisions on which treatment regimen to start or substitute are made by the clinician in consultation with the patient. During the study period, the standard firstline regimens to be initiated were d4T + 3TC + NVP (1), d4T + 3TC + EFV (2), AZT + 3TC + NVP (3), AZT + 3TC + EFV (4), TDF + 3TC + EFV (5), and TDF + 3TC + NVP (6). If a patient suffered from side effects/toxicities related to the NRTI’s (d4T, AZT or TDF), and was not in need of secondline therapy for virologic failure, the recommendation was to substitute d4T with either AZT or TDF, to substitute AZT either with d4T or TDF, and to substitute TDF with either d4T or AZT. Similarly, patients initiated on the NNRTI EFV could substitute with NVP, while those on NVP could substitute with EFV. In addition, upon the recommendation of WHO, in Oct 2012 the hospital started to phase out d4T backbone by replacing either AZT or TDF for patients who were on d4T based regimen.All data, including demographic, clinical conditions, laboratory test results and medications are recorded and entered in to the database by a data entry clerk at the clinic. In addition, data on prescribed ARV along with start and stop dates of the drug and reasons for discontinuation are documented. Use of Jimma University Hospital HIV/AIDS clinic data and analysis of deidentified data was approved by the Human Research Ethics Committee of Jimma University.
Study population
All ART naive patients, aged 18 years or older and who initiated a standard, publicsector, firstline ART regimen at the clinic between between January 1, 2007 and December 31, 2011 were eligible for this analysis. The data was closed for analysis at the end of August, 2013.
Outcome
The primary outcome was defined as the timetotreatment change (treatment modification or regimen switching). For the purpose of this study, treatment change is defined as changing at least one ARV in the regimen without initiating a secondline therapy. ARV dosage adjustments were not considered as treatment change. Time zero was defined as the day of ART initiation and each recurrent treatment change time was measured from the beginning of the patient’s ART initiation in months. Persontime of the study subject ended at the earliest of initiation on secondline therapy, lost to follow up, death, transfer or closure of the data set for analysis (August 25, 2013).
Multistate survival model
Model formulation
Possible transition between treatment combinations are presented in Fig. 1 which illustrates the treatment history of patients under ART. From here onward we use the term state to denote a specific treatment combination. The model has 6 transient states (which represents 6 firstline treatment combinations): d4T + 3TC + NVP (1), d4T + 3TC + EFV (2), AZT + 3TC + NVP (3), AZT + 3TC + EFV (4), TDF + 3TC + EFV (5), and TDF + 3TC + NVP (6). The model assumes that every patient can switch to all the regimen at one point or other. However, a patient can only switch to one regime at a time. For example, a patient who started treatment with d4T + 3TC + NVP State 1) is at risk of making one of the following transitions at a particular time; 1→2,1→3,1→4,1→5 and 1→6. If the patients made the transition 1→2 or 1→3, the subject has undergone a single drugsubstitution (treatment modification). Transition 1→2 implies that the patient has substituted their NNRTI’s NVP by EFV without changing their NRTI treatment (d4T). However, transition 1→3 implies that the patient has substituted their NRTI’s d4T by AZT without changing their NNRTI treatment. Transitions 1→4 or 1→5, imply regimen switching, substituting both NNRTI and NRTI at the same time. After making one of these possible transitions patients will be at risk of making further transition.
Let (X
_{
t
})_{
t>0} be a multistate process with a state space {1,2,…,6}. The stochastic process (X
_{
t
})_{
t>0} is defined as X
_{
t
}=ℓ, if the process is in state ℓ at time t (in months). As mentioned above, for the case study presented in this paper, there are 6 possible first lines regimens which implies that the initial state of the patient X
_{0}∈{1,2,…,6}.
Our main interest is to model the transition from ℓ
^{th} regimen (state ℓ) to j
^{th} regimen (state j) at time t. A transition will be simply denoted by ℓ
j. The distribution of this multistate process is characterized by the transition intensities, or hazard rate, a
_{
ℓ
j
}(t), which expresses the instantaneous risk of a transition from state ℓ into state j at time t, that is
$$ {}\begin{aligned} a_{\ell j}(t)={\lim}_{\Delta t \rightarrow 0} \frac{P(X_{(t+\Delta t)}=jX_{t}=\ell,{F}_{t^{}})}{\Delta t}, \ell, j \in \{1,2,...,6\}, \ell \neq j. \end{aligned} $$
(1)
Here, \({F}_{t^{}}\phantom {\dot {i}\!}\) represents process history prior to time t. In our application, time t represents time since ART initiation. The cumulative transition hazards is defined as \(A_{\ell j}(t) = \int _{0}^{t} a_{\ell j}(u)du, u \leq t, \text {where}\ A_{\ell j}(t) = 0\) if a direct transition between state ℓ and j is impossible. These intensities can be gathered in to a 6×6 matrix A(t) with diagonal elements \(A_{\ell \ell }(t)=\sum _{\substack {j=1,j \neq \ell }}^{6} A_{\ell j}(t), \ell, j = \{1,2, \ldots, 6\}\). Note that individuals who have no transition should remain on their initial regimen (starting state) after ART initiation.
A central issue related to ART management is the ability to estimate the probability of the future treatment combination of the patient (i.e. the patient state) given all the information available until the present moment. For example, given a patient who substituted his NRTI’s d4T by AZT without changing his NNRTI after 6 months and (i.e., the current state of the patient is either in state 3 or state 4, depending on the initial NNRTI component) who has had no further events at one year post ART, one may be interested in estimating the probability of staying on this combination for additional 6 months as well as in comparing this probability to a patient who did not substitute their NRTI (d4T). We propose to use the transition probabilities for longterm prediction of a patient’s state. Let s be the time at which the prediction is made measured from the time origin of the patient (start of treatment) and let us denote the event history of the patient up to time s by X
_{
u
},0≤u≤s. Then, the transition probability from state ℓ to state j in the time interval [s,t], given information available until time s, is defined as
$$ {}\begin{aligned} P_{\ell j}(s,t) = P(X_{t} = j \mid X_{s}=\ell,X_{u}), s \leq t, \ell,j \in \{1,2,...,6\}, u \in\, [0,s]\!. \end{aligned} $$
(2)
In order to estimate P
_{
ℓ
j
}(s,t) we proposed to use a Markov model [28]. The model assumes that the future course of the patient depends on the patient’s state in the current time but not on the patient’s history before the current state. This means that, conditional on the present state, the past has no influence on the risk. This implies that
$$ {}\begin{aligned} a_{\ell j}(t)dt=P(X_{(t+\Delta t)^{}}=jX_{t^{}}=\ell), \ell, j \in \{1,2,...,6\}, \ell \neq j \end{aligned} $$
(3)
and
$$ {}\begin{aligned} P_{\ell j}(s,t) = P(X_{t} = j \mid X_{s}=\ell), s \leq t,\ell,j \in \{1,2,...,6\}. \end{aligned} $$
(4)
Analogous to A(t),these probabilities can be gathered in to a 6×6 matrix M(s,t) with (P
_{
ℓ
j
}(s,t)) as its (ℓ,j)^{th} entry. A single element (P
_{
ℓ
j
}(s,t)) combines both direct and indirect transition from state ℓ to state j. The matrix is fully presented in Additional file 1: Section S.2.
Inference
In this section we present nonparametric approaches for time continuous Markov models with finite state space and under independent right censoring. We consider n individual multistate processes \(\left (X^{(i)}_{t}\right)_{t \geq 0}, X^{(i)}_{t} \in \{1,...,6\},\ \text {and}\ i=1,2,...,n\). We assume that the n process are all, conditional on the initial state \(X^{(i)}_{0}\), independent multistate processes. Observation of the individual multistate data is subject to a right censoring time C
_{
i
}. Our notation and ideas are based on a counting process formulation [29, 30].
Let N
_{
ℓ
j;i
}(t) be the counting process denoting individual i
^{′}
s number of observed direct (without visiting another state in between) ℓ→j transition in [0,t],ℓ,j∈1,2,…,6,ℓ≠j. Here, time t refers to the time since the patient entered the initial state (i.e., the time since ART initiation). Let Y
_{
ℓ;i
}(t) be an indicator variable which represent the at risk process where we have
$${}{\begin{aligned} Y_{\ell;i}(t) = \left\{ \begin{array}{cl} 1 & : \text{if individual}~ i~ \text{is in state}~ \ell~ \text{and under observation before time}~ t,\\ 0 & :\text{otherwise.} \end{array}\right. \end{aligned}} $$
Let the aggregated process \(N_{\ell j}(t) = \sum _{i=1}^{n} N_{\ell j;i} (t), \ell \neq j\) and \(Y_{\ell }(t) = \sum _{i=1}^{n} Y_{\ell ;i}(t)\) denote, respectively, the number of observed direct ℓ→j transitions during the time interval [0,t] and the number of individuals to be observed at risk in state ℓ just prior to time t. We define the the increment of N
_{
ℓ
j
}(t) as △N
_{
ℓ
j
}(t)=N
_{
ℓ
j
}(t)−N
_{
ℓ
j
}(t
^{−}) for the increment of N
_{
ℓ
j
}(t) which gives the number of ℓ→j transitions observed exactly at time t.
Nonparametric estimation of baseline hazards
From the definition of the transition intensities in Eq. (3) a
_{
ℓ
j
}
d
t=P(X
_{(t+d
t)−}=j∣X
_{
t−}=ℓ),ℓ≠j. Hence, if we observe no ℓ→j transition at t (i.e △N
_{
ℓ
j
}(t)=0) we estimate the increment a
_{
ℓ
j
}(t)d
t of the cumulative hazard as 0. If we do observe ℓ→j transition at t (i.e △N
_{
ℓ
j
}(t)>0), we estimate this conditional transition probability by
$$ \triangle \hat{A}_{\ell j}(t) = \frac{\triangle N_{\ell j}(t)}{Y_{\ell}(t)}, $$
(5)
summing up over these increments yields the NelsonAalen estimators [29]
$$ \hat{A}_{\ell j}(t) =\sum\limits_{s \leq t} \frac{\triangle N_{\ell j}(s)}{Y_{\ell} (s)}, \ell \neq j, $$
(6)
where summation is over all observed event times in [0,t] and its variance is given by
$$ \hat{\delta}_{\ell j}^{2}(t) =\sum\limits_{s \leq t} \frac{\triangle N_{\ell j}(s)}{Y_{\ell}^{2}(s)}, \ell \neq j. $$
(7)
Nonparametric estimation of Transition probabilities
The transition probabilities are a complex function of the transition hazards, because the state occupied at some time t may potentially result from a complex nested series of competing risks experiments and there may also be more than one possible sequence of competing risks experiments leading to being in a certain state at a certain time [31]. Under the Markov model the transition probabilities defined in (4) are the solution of a set of differential equations [29]
$$ \frac{d}{dt}\mathbf{M}(s,t) =\mathbf{A}^{T}(t)\mathbf{M}(s,t), $$
(8)
where M(s,t) is transition probability matrix with (ℓ,j) element P
_{
ℓ
j
}(s,t)=P(X
_{
t
}=j∣X
_{
s
}=ℓ) and A(t) is a matrix with off diagonal elements A
_{
ℓ
j
}(t)=a
_{
ℓ
ℓ
}(t) and diagonal elements \(A_{\ell \ell }(t)=\sum _{\substack {j=1,j \neq \ell }}^{6} a_{\ell j}(t)\). In coordinates, (8) is \(d/dt P_{\ell j}(s,t) =\sum _{k} P_{\ell k}(s,t)A_{kj}(t)\) and for any fixed initial state ℓ, the vector of transition probabilities from ℓ, (P
_{
ℓ1}(s,t),P
_{
ℓ2}(s,t),...,P
_{
ℓ6}(s,t)) satisfies this equation. Even though this equation can not be solved in general due to the nonconstancy over time of the matrix A(t), under Markov assumption, the transition probabilities satisfy
$$ P_{\ell j}(s,t)=\sum_{r} P_{\ell r}(s,u)P_{rj}(u,t)\;\;, s \le u \le t, $$
(9)
and based on a partition s=t
_{0}<t
_{1}<t
_{2}<…<t
_{
k−1}<t
_{
k
}=t of the time interval [s,t], the matrix of transition probabilities M(s,t) can be approximated by [31]
$$ \mathbf{M}(s,t) \approx \prod\limits_{k=1}^{K} (\mathbf{I}+\triangle \mathbf{A}(t_{k})), $$
(10)
where I is the (6×6) identity matrix, the (ℓ,j)^{th} element of △A(t
_{
k
}) is equal to A
_{
ℓ
j
}(t
_{
k
})−A
_{
ℓ
j
}(t
_{
k−1}), and \(A_{\ell \ell }(t)=\sum _{\substack {j=1,j \neq l }}^{6} A_{lj}(t)\). Computing the approximation for ever finer partition [s,t] approaches a limit, namely a matrixvalued product integral [31, 32], which equals the matrix of transition probabilities,
$$ \mathbf{M}(s,t)=\prod\limits_{u \in (s,t]}(\mathbf{I}+d\mathbf{A}(u)), $$
(11)
where u ranges from s to t and d
A(u) is defined as d
A
_{
ℓ
j
}(u)=a
_{
ℓ
j
}(u)d
u,ℓ,j∈1,2,...,6 [29]. Therefore, for Markov models, given A(t), the product integration is the mapping that switches from cumulative transition hazards to the matrix of transition probabilities while all cumulative transition hazards are involved.
An estimator of M(s,t) can be obtained by replacing A(u) with the NelsonAalen estimators, \(\hat {\mathbf {A}}(u)\), and by defining \(d\hat {\mathbf {A}}(u)\) as the matrix with entries \(\triangle \hat {A}(u)= \hat {A}_{\ell j}(u)  \hat {A}_{\ell j}(u^{})\) (i.e., the increment of the NelsonAalen estimators at time u). This results in the AalenJohansen type estimator [29],
$$ \hat{\mathbf{M}}(s,t) = \prod\limits_{\substack{u \in (s,t]}}(\mathbf{I} + \Delta \hat{\mathbf{A}}(u)), $$
(12)
in which u indicates all event times in (s,t]. Note that the transition probability matrix defined in (11) is calculated by means of a product integral, while its estimator in (12) is based on a finite product, which only changes at event times.
The transition probabilities can be used for two types of prediction: forward and fixed horizon [30, 33]. In the former case, at a given fixed time s the probabilities of possible future events are evaluated for varying time horizons t. In the latter case, the prediction is made from several starting points to one future fixed time point. In both cases, Aalentype or Greenwood type estimators of the variancecovariance matrix of M(s,t) can be calculated directly or through a recursion formula which can for instance be used to construct pointwise confidence intervals around the estimated transition probability curves [30]. In our application we use forward prediction type.
Robustness of Firstline HAART towards Treatment Modification/change
The primary aim of this study is to quantify the robustness of first line treatments to treatment modification. Given the individuals initial state ℓ at time s, the waiting time in state ℓ. i.e., the duration of stay at state ℓ, can be used as a summary measure of the model. The waiting time in state ℓ is generated with hazard
$$a_{\ell.}(t)=\sum\limits_{j=1,j \neq \ell}^{6} a_{\ell j}(t), t>=0. $$
We define the total cumulative transition hazard out of state ℓ as \( A_{\ell.}(t)=\int _{0}^{t} a_{\ell.}(u)du=\sum _{j=1,j \neq \ell }^{6} A_{\ell j}(t)\). Using these quantities one can evaluate the probability of no events during a period. The survival function of the waiting time in the initial state ℓ, i.e., probability to stay in state ℓ until time t, given that the individual had already been in the respective state at time s,s≤t is given by,
$$\begin{array}{*{20}l} P(X_{t}=\ellx_{s}=\ell) &= \prod\limits_{\substack{u \in (s,t]}}(1a_{\ell.}(u)du)\\ &=exp \left(\int_{s}^{t}a_{\ell.}(u)du\right)\\&=\text{exp}(A_{\ell.}(t)) \ell=1,...,6. \end{array} $$
These quantities are essentially common survival probabilities with all cause hazard a
_{
ℓ.}(u), taking time s as time origin [31]. However, this can also be seen as a solution of the product integral in (11). Since, the ℓ
^{th} row of the AalenJohansen type estimator of \(\hat {\textbf {M}}(s,t)\) contains the estimates \(\hat {P}_{\ell j}(s, t)\) for ℓ≠j and the diagonal element is such that the sum over the ℓ
^{th} row equals 1, the AalenJohansen estimator of P(X
_{
t
}=ℓx
_{
s
}=ℓ) is just \(\hat {P}_{\ell \ell }(s,t)\).
The multistate model formulated above allows us to evaluate whether treatment modification reflect a substitution of NNRTI, substitution of NRTI or substitution of both NNRTI and NRTI by initial treatment combinations. We propose to use the following measures of HAART robustness to treatment modification:

1.
Probability of NNRTI substitution
P
_{12} for state 1 P
_{34} for state 3 P
_{56} for state 5
P
_{21} for state 2 P
_{43} for state 4 P
_{65} for state 6

2.
Probability of NRTI substitution
P
_{13}+P
_{16} for state 1 P
_{35}+P
_{36} for state 3 P
_{54}+P
_{52} for state 5
P
_{24}+P
_{25} for state 2 P
_{42}+P
_{45} for state 4 P
_{63}+P
_{61} for state 6

3.
Probability of regimen switching
P
_{14}+P
_{15} for state 1 P
_{32}+P
_{35} for state 3 P
_{53}+P
_{51} for state 5
P
_{23}+P
_{26} for state 2 P
_{41}+P
_{46} for state 4 P
_{64}+P
_{62} for state 6