I run a Monte Carlo simulation of a black-box code, i.e., I assign probability distributions to the code inputs and obtain a Monte Carlo sample of the output variable Y. Y doesn't have to be Gaussian, because the input distributions aren't necessarily Gaussian, and even if they were, the output depends nonlinearly on the inputs.
My bosses asked me to give them a "plausible range" for the variable Y. Trying to rephrase this question in a statistical framework, I thought about finding a lower bound L and an upper bound U for Y such that p(L<=Y<=U) is equal to, say, 95%. In practice, that's percentile estimation. For example, if I were to set L=-inf, then U would be precisely the 95th percentile of the distribution of Y, so the problem would become estimating the 95th percentile of Y.
Questions:
1. Is there a preferred way to select L and U? I don't think so, since I don't know what the distribution of Y is. So I was thinking of just selecting two percentiles "symmetrical about the median", such that p(L<=Y<=U) = alpha. For example, if alpha = .95, I just choose L as the 2.5th percentile and U as the 97.5th percentile.
2. How do I estimate L and U? I know I could just load my samples into R and use the bootstrap. However, I'd also prefer to have an analytical formula, for a variety of reasons. I have fairly large samples (usually N ~= 2000), so I guess there should be some expression for the confidence intervals of percentiles, based on the CLT. Can you post them?
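To make question 2 concrete, here is a minimal sketch of the kind of calculation I have in mind, written in Python rather than R. It is only my assumption of what such a formula could look like, not a confirmed answer. It relies on the fact that the number of sample points below the true p-quantile is Binomial(N, p), and uses the normal approximation to that binomial to pick two order statistics that bracket the quantile. The function name `quantile_ci` and the lognormal stand-in for Y are hypothetical.

```python
import math
import random
from statistics import NormalDist


def quantile_ci(sample, p, conf=0.95):
    """Return (estimate, lower, upper) for the p-quantile of `sample`.

    Sketch of a distribution-free interval: the count of observations
    below the true p-quantile is Binomial(n, p), approximated here by a
    normal distribution (reasonable when n*p and n*(1-p) are large).
    """
    xs = sorted(sample)
    n = len(xs)
    # Two-sided standard normal critical value, e.g. about 1.96 for conf=0.95.
    z = NormalDist().inv_cdf(0.5 + conf / 2.0)
    half_width = z * math.sqrt(n * p * (1.0 - p))
    # 1-based ranks of the bracketing order statistics, clipped to [1, n].
    lo_rank = max(1, math.floor(n * p - half_width))
    hi_rank = min(n, math.ceil(n * p + half_width) + 1)
    # Simple point estimate: the order statistic of rank ceil(n*p).
    point_rank = min(n, max(1, math.ceil(n * p)))
    return xs[point_rank - 1], xs[lo_rank - 1], xs[hi_rank - 1]


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical stand-in for the black-box output Y (skewed, non-Gaussian).
    y = [random.lognormvariate(0.0, 1.0) for _ in range(2000)]
    for p in (0.025, 0.975):
        est, lo, hi = quantile_ci(y, p)
        print(f"p={p}: estimate={est:.4f}, 95% CI=({lo:.4f}, {hi:.4f})")
```

With this approach, L and U (the 2.5th and 97.5th percentiles) each get their own confidence interval. The intervals are read directly off the sorted sample, so no assumption about the shape of the distribution of Y is needed. The interval for U will typically be wider than the one for L when Y has a long right tail.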