Topic: Correct way to normalize an rmsd-based distance metric used in
repeated trials of pairs

 Halitsky Posts: 600
Posted: Apr 14, 2012 4:32 PM

Here are the coefficients and sig values for the two correlations c/e
vs c/L and c/u vs c/L for all ten length intervals for the b.1
(immunoglobulin) fold:

                  c/e vs c/L           c/u vs c/L
Len Int     N     Coeff   Sig          Coeff   Sig

13-22       988   0.565   3.0E-84      0.606   3.2E-100
23-32       810   0.569   1.4E-70      0.655   1.8E-100
33-42       700   0.617   8.9E-75      0.643   7.5E-83
43-52       565   0.657   3.8E-71      0.609   1.0E-58
53-62       517   0.683   3.6E-72      0.642   2.0E-61
63-72       402   0.750   9.8E-74      0.669   1.9E-53
73-82       381   0.714   1.3E-60      0.683   9.7E-54
83-92       333   0.667   3.5E-44      0.485   5.1E-21
93-102      248   0.710   2.3E-39      0.405   3.5E-11
103-112     184   0.607   6.4E-20      0.451   1.3E-10

Please let me know whether these two linear correlations are OK to use (of course
assuming that they are at least as good, if not better, for the other
five folds ...). If you want to see the underlying raw data, I can
send you the ten spreads off-line.
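
For anyone who wants to check how a Sig value relates to its Coeff and N, here is a minimal sketch. It assumes (the post does not say so) that each Coeff is a Pearson r and each Sig is the two-sided p-value from the usual t-test with N-2 degrees of freedom. The function names are made up for illustration, and only the Python standard library is used.

```python
import math


def _clamp(v, tiny=1e-300):
    """Keep a Lentz-iteration term away from zero."""
    return tiny if abs(v) < tiny else v


def _beta_cf(a, b, x, max_iter=500, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 / _clamp(1.0 - qab * x / qap)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even-numbered term of the fraction.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 / _clamp(1.0 + aa * d)
        c = _clamp(1.0 + aa / c)
        h *= d * c
        # Odd-numbered term of the fraction.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 / _clamp(1.0 + aa * d)
        c = _clamp(1.0 + aa / c)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    if x > (a + 1.0) / (a + b + 2.0):
        # Use the symmetry relation where the fraction converges faster.
        return 1.0 - reg_inc_beta(b, a, 1.0 - x)
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    return math.exp(log_front) * _beta_cf(a, b, x) / a


def pearson_p_two_sided(r, n):
    """Two-sided p-value for H0: rho = 0, given Pearson r from n pairs.

    With t = r*sqrt((n-2)/(1-r^2)) and df = n-2, the p-value equals
    I_x(df/2, 1/2) at x = df/(df + t^2), which simplifies to x = 1 - r^2.
    """
    if n < 3:
        raise ValueError("need at least 3 pairs")
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    return reg_inc_beta((n - 2) / 2.0, 0.5, 1.0 - r * r)


# First row of the table: c/e vs c/L, interval 13-22.
print(pearson_p_two_sided(0.565, 988))
```

If the assumption holds, the printed value should land in the neighborhood of the tabulated 3.0E-84. An exact match is not expected, because the posted coefficients are rounded to three decimals and a p-value this small is very sensitive to the third decimal of r.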

Also, note how the coefficients of the c/e vs c/L correlations are
lower for the first two length intervals, while the coefficients of
the c/u vs c/L correlations are lower for the last three length
intervals. This probably reflects an intrinsic property of the data
which is responsible for the difficulty I've had in certain folds
(like b.1) when I've tried to get a direct correlation between u and e
within length intervals.
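
One rough way to judge whether these drops in the coefficients are larger than the shrinking N alone would explain is to put an approximate confidence interval around each coefficient with the Fisher z-transform. The sketch below assumes the coefficients are Pearson r values computed from independent pairs; the function name `fisher_ci` is hypothetical, and the two rows used are the c/u vs c/L entries for intervals 13-22 and 93-102 from the table above.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a Pearson r via Fisher's z."""
    if n <= 3:
        raise ValueError("need more than 3 pairs")
    z = math.atanh(r)                      # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)            # approximate standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    # Back-transform the interval endpoints to the r scale.
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# (interval, N, Coeff) for c/u vs c/L, copied from the table.
rows = [("13-22", 988, 0.606), ("93-102", 248, 0.405)]
for label, n, r in rows:
    lo, hi = fisher_ci(r, n)
    print(f"{label:>7}  N={n:4d}  r={r:.3f}  95% CI=({lo:.3f}, {hi:.3f})")
```

If the intervals for two rows do not overlap, the difference between those coefficients is unlikely to be sampling noise alone. This is only a quick screen, not a formal test of the difference between two correlations.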

