Friday, 15 August 2014

python - Difference between BUGS model and PyMC? -



python - Difference between BUGS model and PyMC? -

i'm unable replicate results provided bugs code using pymc. bugs model andersen-gill multiplicative intensity cox ph model.

model { # set info for(i in 1:nsubj) { for(j in 1:t) { # risk set = 1 if obs.t >= t y[i,j] <- step(obs.t[i] - t[j] + eps) # counting process jump = 1 if obs.t in [ t[j], t[j+1] ) # i.e. if t[j] <= obs.t < t[j+1] dn[i, j] <- y[i, j] * step(t[j + 1] - obs.t[i] - eps) * fail[i] } useless[i] <- pscenter[i] + hhcenter[i] + ncomact[i] + rleader[i] + dleader[i] + inter1[i] + inter2[i] } # model for(j in 1:t) { for(i in 1:nsubj) { dn[i, j] ~ dpois(idt[i, j]) # likelihood idt[i, j] <- y[i, j] * exp(beta[1]*pscenter[i] + beta[2]* hhcenter[i] + beta[3]*ncomact[i] + beta[4]*rleader[i] + beta[5]*dleader[i] + beta[6]*inter1[i] + beta[7]*inter2[i]) * dl0[j] # intensity } dl0[j] ~ dgamma(mu[j], c) mu[j] <- dl0.star[j] * c # prior mean hazard } c ~ dgamma(0.0001, 0.00001) r ~ dgamma(0.001, 0.0001) (j in 1 : t) { dl0.star[j] <- r * (t[j + 1] - t[j]) } # next line indicates number of covariates , corresponding betas for(i in 1:7) {beta[i] ~ dnorm(0.0,0.00001)} }

i utilize next initial values

list(beta=c(-.36,-.26,-.29,-.22,-.61,-9.73,-.23), c=0.01, r=0.01, dl0=c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1))

i utilize single chain (for now) , 5000 iterations burn-in. run estimation 10000 additional iterations , receive same point estimates reported in paper. these close before frequentist estimates.

openbugs> samplesstats('beta') mean sd mc_error val2.5pc median val97.5pcstart sample beta[1] 3.466 0.8906 0.03592 1.696 3.48 5.175 501 9500 beta[2] -0.04155 0.06253 0.002487 -0.1609 -0.04355 0.08464 501 9500 beta[3] -0.009709 0.07353 0.002008 -0.1544 -0.01052 0.1365 501 9500 beta[4] 0.3535 0.1788 0.004184 -0.01523 0.3636 0.6724 501 9500 beta[5] 0.08454 0.1652 0.004261 -0.2464 0.08795 0.3964 501 9500 beta[6] -4.109 1.325 0.05224 -6.617 -4.132 -1.479 501 9500 beta[7] 0.1413 0.08594 0.003381 -0.03404 0.1423 0.3031 501 9500 openbugs> samplesstats('c') mean sd mc_error val2.5pc median val97.5pcstart sample c 4.053 1.08 0.02896 2.202 3.974 6.373 1001 10000 openbugs> samplesstats('r') mean sd mc_error val2.5pc median val97.5pcstart sample r 0.01162 0.002929 7.846e-5 0.007387 0.01119 0.01848 1001 10000

i tried replicate in pymc 2.3.2 next code. total replication code available here

def cox_model(dta): (t, obs_t, pscenter, hhcenter, ncomact, rleader, dleader, inter1, inter2, fail) = load_data_cox() t = len(t) - 1 nsubj = len(obs_t) # risk set equals 1 if obs_t >= t y = np.array([[int(obs >= time) time in t] obs in obs_t]) # counting process. jump = 1 if obs_t \in [t[j], t[j+1]) dn = np.array([[y[i,j]*int(t[j+1] >= obs_t[i])*fail[i] in range(nsubj)] j in range(t)]) c = gamma('c', .0001, .00001, value=.1) r = gamma('r', .001, .0001, value=.1) dl0_star = r*np.array([t[j+1] - t[j] j in range(t)]) mu = dl0_star * c # prior mean hazard dl0 = gamma('dl0', mu, c, value=np.ones(t)) beta = normal('beta', np.zeros(7), np.ones(7)*.00001, value=np.array([-.36, -.26, -.29, -.22, -.61, -9.73, -.23])) @deterministic def idt(b1=beta, dl0=dl0): mu_ = [[y[i,j] * np.exp(b1[0]*pscenter[i] + b1[1]*hhcenter[i] + b1[2]*ncomact[i] + b1[3]*rleader[i] + b1[4]*dleader[i] + b1[5]*inter1[i] + b1[6]*inter2[i])*dl0[j] in range(nsubj)] j in range(t)] # intensity homecoming mu_ dn_like = poisson('dn_like', idt, value=dn, observed=true) homecoming locals() m = mcmc(cox_model()) m.sample(15000)

however, not come close same point estimates. like

beta: mean sd mc error 95% hpd interval ------------------------------------------------------------------ -0.537 1.094 0.099 [-2.549 1.492] 0.276 0.048 0.004 [ 0.184 0.36 ] -1.092 0.385 0.038 [-1.559 -0.371] -1.461 0.746 0.073 [-2.986 -0.496] -1.865 0.382 0.038 [-2.471 -1.329] 3.778 1.539 0.133 [ 1.088 6.623] -0.449 0.109 0.01 [-0.661 -0.26 ] posterior quantiles: 2.5 25 50 75 97.5 |---------------|===============|===============|---------------| -2.892 -1.274 -0.385 0.268 1.253 0.191 0.244 0.278 0.305 0.374 -1.553 -1.434 -1.179 -0.793 -0.258 -3.132 -1.856 -1.196 -0.904 -0.526 -2.471 -2.199 -1.864 -1.632 -1.201 1.287 2.685 3.601 4.72 7.262 -0.714 -0.519 -0.445 -0.368 -0.273

most worryingly, signs different. thought maybe convergence issue, ran overnight 50,000 iterations without much change. maybe there's bug or difference in pymc model, particularly dl0 specification?

i've tried different starting values. i've tried letting model run many iterations. i've centered priors on point estimates bugs.

i think issue non-convergence, thought, , substantial difference between pymc2 , bugs implementations step methods , burn-in period. investigate thisi made alter idt create run faster, has same values idt:

x = np.array([pscenter, hhcenter, ncomact, rleader, dleader, inter1, inter2]).t @deterministic def idt(beta=beta, dl0=dl0): intensity = np.exp(np.dot(x, beta)) homecoming np.transpose(y[:,:t] * np.outer(intensity, dl0))

with in place, plotting trace of beta shows mcmc has not converged in 50,000 iterations:

there few things recommend: different starting values, different step methods, , burn-in interval comparable bugs burn-in:

vars = cox_model(dta) pm.map(vars).fit(method='fmin_powell') m = pm.mcmc(vars) m.use_step_method(pm.adaptivemetropolis, m.beta) m.sample(iter=50000, burn=25000, thin=25)

plotting trace in case shows more promising:

this produces point estimates much more bugs estimates have above:

beta: mean sd mc error 95% hpd interval ------------------------------------------------------------------ 3.436 0.861 0.035 [ 1.827 5.192] -0.039 0.063 0.002 [-0.155 0.081] -0.028 0.073 0.003 [-0.159 0.119] 0.338 0.174 0.007 [ 0.009 0.679] 0.069 0.164 0.007 [-0.263 0.371] -4.022 1.29 0.055 [-6.552 -1.497] 0.136 0.085 0.003 [-0.027 0.307]

python bayesian pymc survival-analysis winbugs

No comments:

Post a Comment