> Ok, based on this specific data set, what would be a reasonable value? Would any of these reasonable values avoid a fake convergence?
Convergence can be declared because the objective function (the log likelihood) is no longer changing, or because the parameter values are no longer changing. In your case the convergence happened because the change in the log likelihood from one iteration to the next was very small.
> What are actually the estimated Cox coefs for this data set? I feel that as you make the tolerance more strict the coef of z3 will increase. For tolerance of 10^-100 it did increase a lot when the warning came up.
Theoretically, perhaps the coefficient of z3 is infinite. You can set TolFun to 0. Here's the difference between that and TolFun set to 1e-12:
    >> opt.TolFun = 1e-12;
    >> [b,logL,H,stats] = coxphfit(Zsam,time,'censoring',status,'baseline',0,'opt',opt); b', logL
    Iterations terminated because relative function value changing by less than OPTIONS.TolFun
    ans =
      -0.455477736374803  -0.570001118740885  25.319779735999436   0.002985662045312
    logL =
        -2.073455924088337e+02
    >> opt.TolFun = 0;
    >> [b,logL,H,stats] = coxphfit(Zsam,time,'censoring',status,'baseline',0,'opt',opt); b', logL
    Iterations terminated because norm of the current step is less than OPTIONS.TolX
    Warning: Matrix is close to singular or badly scaled.
             Results may be inaccurate. RCOND = 2.769358e-17.
    > In coxphfit at 204
    ans =
      -0.455477736374802  -0.570001118740886  30.544278608655912   0.002985662045312
    logL =
        -2.073455924087432e+02
So you can see that with TolFun set to 0, the coefficient of z3 increased by about 5 (from 25.3 to 30.5) compared with the previous value, but the log likelihood changed by only about 9e-11. Then it reached a point where the problem became singular, and it stopped there.
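To see why a function-value tolerance can stop the iterations while the coefficient is still moving, here is a rough sketch on a toy problem. It is not the Cox partial likelihood and not the algorithm coxphfit actually uses. The one-parameter function f, the variable names, and the use of an absolute (rather than relative) change test are all assumptions made for illustration. The function f increases toward 0 as b grows, so it has no finite maximizer, which is the same kind of monotone behavior discussed above.

```matlab
% Toy monotone "log likelihood" (an assumption for illustration only):
% f(b) = -log(1 + exp(-b)) keeps increasing toward 0 as b -> Inf.
f = @(b) -log1p(exp(-b));
g = @(b) 1 ./ (1 + exp(b));             % first derivative of f
h = @(b) -exp(b) ./ (1 + exp(b)).^2;    % second derivative of f

tolFun = 1e-8;   % hypothetical tolerance on the absolute change in f
b = 0;
for iter = 1:100
    step   = -g(b) / h(b);              % Newton step
    change = abs(f(b + step) - f(b));   % change in the objective
    b = b + step;
    if change < tolFun                  % function-value stopping test
        break
    end
end
fprintf('iterations: %d, b = %.4f, last step = %.4f, last change = %.3g\n', ...
        iter, b, step, change);
```

In this sketch the Newton step works out to 1 + exp(-b), so it never shrinks below 1, while the change in f decays roughly like exp(-b). The function-value test is therefore the one that fires, even though the parameter would keep growing by about 1 per iteration if allowed to continue.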
> What about the problems I mentioned, when all observations for z3=0 are censored? Does MATLAB check this before attempting to maximize a likelihood? And what about the case when the last event of one group is earlier than the other? Does MATLAB check for these cases? Otherwise, it attempts to maximize a monotone likelihood. So every result without a warning would be misleading, I think.
MATLAB doesn't do that. It doesn't even try to determine that z3 is a binary variable.
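In the meantime, a check along these lines can be done by hand before fitting. The following is only a minimal sketch covering the first case you mention (a level of a binary covariate with no observed events). The data and the variable names z3, cens, and nEvents are made up for illustration, and cens is assumed to follow the convention of the 'censoring' argument, where 1 marks a censored observation and 0 an observed event.

```matlab
% Hypothetical data: z3 is a binary covariate, cens marks censoring
% (1 = censored, 0 = event observed).
z3   = [0 0 0 1 1 1 1]';
cens = [1 1 1 0 1 0 0]';

% Count the observed events at each level of z3.
levels  = unique(z3);
nEvents = zeros(size(levels));
for k = 1:numel(levels)
    nEvents(k) = sum(cens(z3 == levels(k)) == 0);
end

% A level with no events is a sign the likelihood may be monotone in
% the coefficient of z3.
if any(nEvents == 0)
    warning('coxcheck:noEvents', ...
            'Some level of z3 has no observed events; its coefficient may diverge.');
end
```

With this made-up data, every observation with z3 equal to 0 is censored, so the warning is issued. This does not cover the other situation you describe, and it is not a substitute for a general diagnostic.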
I will think about this. Something like what you suggest is done in glmfit for a binary regression when the predictors can perfectly separate the two classes. In the absence of any clever diagnostics from coxphfit, there are a couple of indications of what has happened:
1. There's a coefficient of about 16. Since the model for the hazard is exp(X*B), a value of 16 corresponds to a hazard ratio of exp(16), roughly 8.9 million, so it is basically saying that there's an infinitely larger hazard for x=1 compared with x=0.