"Ben" wrote in message <firstname.lastname@example.org>... > I have the number concentration per particle size for a range of sizes and for 24 instrument channels. The plot seems to follow a normal distribution and I would like to fit the data to one. My data is not in the form of a probability density function since the y-axis is not probability, but rather concentration. Thus the equation I would like to fit to is: y = H.*exp(-.5.*((dp-mu)/sigma)^2), where H is the height of the distribution. > > Following the methods outlined from the "Curve Fitting and Distribution Fitting" tutorial (http://www.mathworks.com/products/statistics/examples.html?file=/products/demos/shipping/stats/cfitdfitdemo.html), I attempted to use nlinfit to determine the optimal parameters H, mu, sigma. The result is not a good fit, however. I'm getting some errors when I run the code, namely, "Warning: Rank deficient, rank = 2, tol = 5.617004e-06." This, along with the terrible fit, makes me think I'm using this incorrectly. > > Does anyone know what's going on here? > > Here is my code: > %data > N_data = [ > 15284177237 > 23286616170 > 26033896374 > 39848389636 > 95606590189 > 1.95004E+11 > 4.1652E+11 > 6.90045E+11 > 9.16205E+11 > 1.13311E+12 > 1.34397E+12 > 1.52057E+12 > 1.70126E+12 > 1.87655E+12 > 1.84008E+12 > 1.75936E+12 > 1.527E+12 > 1.09247E+12 > 6.12101E+11 > 2.26027E+11 > 49697188712 > 17322725331 > 10247909480 > 5456170587]; > > dp_data = [ > 1.05E-08 > 1.14E-08 > 1.22E-08 > 1.31E-08 > 1.41E-08 > 1.51E-08 > 1.62E-08 > 1.75E-08 > 1.88E-08 > 2.02E-08 > 2.18E-08 > 2.34E-08 > 2.52E-08 > 2.72E-08 > 2.93E-08 > 3.15E-08 > 3.40E-08 > 3.67E-08 > 3.96E-08 > 4.27E-08 > 4.61E-08 > 4.98E-08 > 5.38E-08 > 5.81E-08]; > > %interpolate between points to get initial guess for height of dist. > dp_space = logspace(log10(dp_star_meas(1)*0.75),log10(dp_star_meas(length(dp_star_meas))*1.25),1000); > N_data_intr = interp1(dp_data,N_data,dp_space,'spline',0); > initial_guesses(1) = peak(N_data_intr); > > %calculate mean and std for initial guess values > for i = 1:length(N_data) > mu = mu + dp_space(i)*N_data_intr(i); > N_tot = N_tot + N_data_intr(i); > end > mu = mu/N_tot; > for i = n:m > f = f + N_data_intr(i)*(dp_space(i) - mu)^2; > end > std = sqrt(f/(N_tot)); > init_guesses(2) = mu; > init_guesses(3) = std; > > %define function for nlinfit > norm_func = @(p,dp) p(1).*exp(-.5.*((dp-p(2))/p(3)).^2); > > [bestfit,resid]=nlinfit(dp_data,log(N_data),@(p,x) log(norm_func(p,x)),init_guesses); > > here is an image of the final fit versus the data: http://s2.postimage.org/xeu65jqyx/nlinfit_graph.png > > I appreciate the help
You fit the log of your data against the log of your fit function. This introduces a distortion to the estimates of your parameters. Why don't you try the direct way [bestfit,resid]=nlinfit(dp_data,N_data,norm_func,init_guesses); ?