Yes, pseudo-R^2 statistics were among the options, but I haven't looked at them yet. I was steered away from them by this paper (Zheng, B. and A. Agresti. 2000. Summarizing the predictive power of a generalized linear model. Statistics in Medicine 19: 1771-1781), which justified my calculation of the correlation between predicted and measured values. I also thought I'd try to clear up the reasons behind the apparently contradictory scores returned by the methods I've tried so far: they seem highly polarised, saying either that the model is exceptionally good or that it's total twaddle.
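For anyone following along, the Zheng & Agresti summary is simply the Pearson correlation between fitted and observed responses. A minimal sketch on made-up data (an ordinary least-squares fit stands in for the GLM here; with statsmodels you would take `.fittedvalues` from a `GLM(...).fit()` result instead):

```python
import numpy as np

rng = np.random.default_rng(0)  # hypothetical data for illustration
n = 31                          # matching my sample size
X = rng.normal(size=(n, 4))
y = X @ np.array([1.0, 0.5, 0.0, -0.3]) + rng.normal(scale=0.5, size=n)

# Fit a linear model as a stand-in for the GLM.
Xd = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
y_hat = Xd @ beta

# Zheng & Agresti's summary of predictive power:
# the correlation between observed and fitted values.
r = np.corrcoef(y, y_hat)[0, 1]
print(r)
```

Squaring `r` gives something directly comparable to an R^2, which is part of the measure's appeal.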
Regarding the number of candidate explanatory variables offered to the backward variable-selection stage, it was well below the sample size (9 << 31). Four of these nine were chosen from ten related alternative measures by comparing their explanatory power in univariate relationships; the remaining five were selected on rational grounds from a larger list, without any regression, as those most likely to affect the response variable.
Your guidance that I should probably retain only the most strongly explanatory variables is useful.
Can you point me to any further reading on methods of model selection more appropriate than stepwise variable selection? I understood it to be common practice to use the AIC as an indication of the relative appropriateness of two models that differ in the (number of) explanatory variables used. This almost implies nested models, where one is a stepwise modification of the other. I appreciate the caveats about the R^2 that can be expected from even random explanatory variables when their number approaches the sample size, but is there any better alternative for the numerically enthusiastic ecologist?
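To make the AIC comparison concrete, here is a rough sketch of how I understand it for two nested Gaussian models (made-up data; the names and the helper `gaussian_aic` are mine, not from any package). The penalty term 2k is what stops the larger model from winning automatically:

```python
import numpy as np

def gaussian_aic(y, y_hat, k):
    # AIC = 2k - 2 log L. For Gaussian errors with the MLE of sigma^2,
    # -2 log L = n * (log(2*pi*sigma2) + 1), where sigma2 = RSS / n.
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)
    sigma2 = rss / n
    return 2 * k + n * (np.log(2 * np.pi * sigma2) + 1)

rng = np.random.default_rng(1)  # hypothetical data for illustration
n = 31
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)  # a deliberately irrelevant variable
y = 2.0 * x1 + rng.normal(scale=0.5, size=n)

def fit(cols):
    # Least-squares fit; returns fitted values and parameter count
    # (coefficients plus one for the estimated error variance).
    Xd = np.column_stack([np.ones(n)] + cols)
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return Xd @ beta, Xd.shape[1] + 1

aic_small = gaussian_aic(y, *fit([x1]))
aic_large = gaussian_aic(y, *fit([x1, x2]))
print(aic_small, aic_large)  # lower AIC is preferred
```

The irrelevant variable always shrinks the residual sum of squares a little, so the raw fit of the larger model looks better; the comparison only becomes meaningful once the 2k penalty is included.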