and fortunately, I do have a Fortran compiler installed on my Linux system that will enable me to call the PDL logistic regression module from a regular Perl script.
So after learning some details, I can readily generate the requisite jackknife b' vectors (there will be a LOT of them).
But before embarking on this effort in earnest, permit me to ask whether the new population selection protocol that I mentioned in a recent off-line email might enable us to embed a sampling procedure that would be compatible with the underlying assumption of a binomial distribution implicit in a logistic regression.
This new population selection protocol guarantees that for any cell (of the 32 in our present typical run), I will have no fewer than 40 inputs, and therefore at least (40^2 - 40)/2 = 780 pairs for submission to Arthur's program. (Under the new protocol, I will often start with far more than 40, somewhere between 40 and 125.)
So even if his program does not compare 280 of these 780 pairs (due to missing atomic coordinates or segments being from the same protein), I will still have, at a MINIMUM, a set of 500 pairs that will make it through his program.
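
As a quick check of the arithmetic above, here is a minimal Python sketch. The function name `pair_count` and the variable names are illustrative only, and the figure of 280 skipped pairs is simply the hypothetical from the paragraph above, not a measured value.

```python
from math import comb

def pair_count(n_inputs: int) -> int:
    """Number of unordered pairs among n_inputs items: (n^2 - n) / 2."""
    return comb(n_inputs, 2)

min_inputs = 40                           # guaranteed minimum inputs per cell
total_pairs = pair_count(min_inputs)      # (40^2 - 40) / 2 = 780
skipped_pairs = 280                       # hypothetical number of pairs not compared
surviving_pairs = total_pairs - skipped_pairs  # 500

print(total_pairs, surviving_pairs)
```

The same function gives the pair count for any larger starting population, for example `pair_count(125)` for a cell at the top of the 40 to 125 range.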
Is this minimal N of 500 large enough to permit a random sampling of pairs from the 780 to generate the n0 and n1 for the cell? Or perhaps even multiple samplings of the 780?
If so, it seems to me that this would be a legitimate way to feed the logistic regression model with data compatible with the assumption of a binomial distribution.
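
To make the proposal concrete, here is a rough Python sketch of what I have in mind. Everything in it is an assumption for illustration: the 0/1 coding of each pair's outcome, the placeholder data, the sample size of 100, and the function names are all hypothetical and not taken from Arthur's program. It shows only the mechanics of drawing the sample, not whether the resulting counts actually satisfy the binomial assumption, which is the question I am asking.

```python
import random

def sample_cell_counts(outcomes, sample_size, rng):
    """Draw sample_size pairs without replacement and return (n0, n1)."""
    sample = rng.sample(outcomes, sample_size)
    n1 = sum(sample)          # pairs coded 1
    n0 = sample_size - n1     # pairs coded 0
    return n0, n1

def repeated_cell_counts(outcomes, sample_size, n_draws, rng):
    """Repeat the draw n_draws times (the 'multiple samplings' idea)."""
    return [sample_cell_counts(outcomes, sample_size, rng) for _ in range(n_draws)]

rng = random.Random(0)  # fixed seed so a run can be repeated

# Placeholder data: 500 surviving pairs for one cell, each coded 0 or 1.
# Real values would come from the output of the comparison program.
outcomes = [rng.randint(0, 1) for _ in range(500)]

n0, n1 = sample_cell_counts(outcomes, 100, rng)
draws = repeated_cell_counts(outcomes, 100, 5, rng)
print(n0, n1)
print(draws)
```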
Please advise. It's not that I'm not willing to do the jackknifing as you have laid it out - I just want to be sure that there's no alternative we can pursue in order to come into compliance with the initial assumption of a binomial distribution.