"Matt J" wrote in message <firstname.lastname@example.org>...
-----
> If you believe you're getting the correct solution without scaling, why are you trying to fix it?
===
I know that "if it ain't broke, don't fix it". Still, I thought that making my results independent of a parameter that has no meaning was a good thing. (Right now I have a factor of 1e4 only because my matrix happens to hold values in a particular physical unit of measurement. If I changed the unit I could be dealing with 1e-6 or whatever. So why not scale in such a way that my algorithm works the same for every unit?)
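To make the idea concrete, here is a minimal sketch of a diagonal change of variables, p = scale .* z, wrapped around an objective. It is written in plain Python rather than MATLAB, and the objective `f`, the `scale` vector, and the helper `make_scaled` are all hypothetical, chosen only for illustration; this is not my actual cost function. It also shows that a step tolerance applied to the scaled variables z corresponds to a different tolerance on the original parameters p.

```python
def make_scaled(f, scale):
    """Return g(z) = f(scale * z) and a helper mapping z back to p."""
    def to_original(z):
        return [s * zi for s, zi in zip(scale, z)]

    def g(z):
        return f(to_original(z))

    return g, to_original


def f(p):
    # Hypothetical objective: the second parameter lives in "small" units,
    # so its curvature is 1e4 times larger than that of the first one.
    return (p[0] - 3.0) ** 2 + 1.0e4 * (p[1] - 0.02) ** 2


scale = [1.0, 0.01]  # assumed typical magnitudes of the two parameters
g, to_original = make_scaled(f, scale)

# In the scaled variables the objective is (z0 - 3)^2 + (z1 - 2)^2,
# i.e. both directions have the same curvature.
z_min = [3.0, 2.0]
p_min = to_original(z_min)

# A step tolerance on z is a per-parameter tolerance scale[i] * tol_z on p.
tol_z = 1e-6
tol_p = [s * tol_z for s in scale]

print("minimizer in original units:", p_min)
print("equivalent tolerance in original units:", tol_p)
```

The optimizer would then be run on `g` instead of `f`, and its result mapped back with `to_original`. Whether this helps or hurts in practice depends on how the solver's stopping tests and finite differences react to the new variables.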
-----
> In any case, what is the basis of your belief that the solution is correct? What have you done to verify that the solution you get without scaling is indeed a local/global minimum? There are various things you can do to check, like plot the objective function through the final point along the direction of the gradient (and maybe along a few other directions as well, like the axes of your "untouched parameters"). You could also run the algorithm on simulated data, where you do know ground truth.
====
I'm writing a segmentation algorithm. The parameters I'm getting define the surface of an object in my image. If I superimpose the extracted surface on the original image I can distinguish by eye between at least three conditions: "absolutely perfect", "looks correct with some minor problems", and "blatantly wrong". Without scaling I always get "absolutely perfect", as far as I can tell. With scaling I get "blatantly wrong": my surface is totally mispositioned.
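The check suggested in the quote (looking at the objective along lines through the final point) could be sketched roughly as follows. This is plain Python, not MATLAB, and the objective `f`, the points `p_final` and `p_bad`, and the helpers `line_scan` and `looks_like_minimum` are hypothetical names used only for illustration. The scan directions here are the coordinate axes; the gradient direction could be passed in the same way.

```python
def line_scan(f, p, d, half_width=1e-2, n=21):
    """Evaluate f at p + t*d for n evenly spaced t in [-half_width, half_width]."""
    ts = [half_width * (2.0 * i / (n - 1) - 1.0) for i in range(n)]
    return [(t, f([pi + t * di for pi, di in zip(p, d)])) for t in ts]


def looks_like_minimum(f, p, directions):
    """True if no scanned point along any direction has a lower value than f(p)."""
    f_p = f(p)
    return all(value >= f_p
               for d in directions
               for _, value in line_scan(f, p, d))


def f(p):
    # Hypothetical objective with a known minimum at (1, -2).
    return (p[0] - 1.0) ** 2 + 1.0e4 * (p[1] + 2.0) ** 2


directions = [[1.0, 0.0], [0.0, 1.0]]  # coordinate axes

p_final = [1.0, -2.0]  # candidate point returned by an optimizer
p_bad = [1.0, -1.9]    # a point that is clearly not a minimum

ok_final = looks_like_minimum(f, p_final, directions)
ok_bad = looks_like_minimum(f, p_bad, directions)
print("p_final passes the scan:", ok_final)
print("p_bad passes the scan:", ok_bad)
```

A scan like this can only reject a candidate point; passing it along a handful of directions does not prove that the point is a minimum.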
------------
> stepdir = -inv(Hessian)*gradient
>
> This is superior to what steepest descent does. If you apply stepdir to the problem x^2+1e4y^2 you will see that it reaches the minimum in 1 iteration, as opposed to steepest descent which might take hundreds of iterations.
>
> The reason you might be getting "dramatically different" behavior with scaling is partly because we're not sure how the scaling you're doing is playing with fminunc's stopping parameters like TolX (you say that it's really a local min, but I don't know what you've done to check this). It also might be affecting the finite difference computations used by fminunc, making them either better or worse. Probably worse, since you're getting poorer results... You could try making FMINUNC compute the Hessian analytically, both with and without scaling, to experiment with the effect.
==========
Ok, got it. I see that in the first iterations fminunc calls my function only once or twice per iteration. How can it approximate the Hessian from so few function calls? (That's off topic, just out of curiosity.)
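The quoted claim about x^2+1e4y^2 is easy to try out. Below is a minimal sketch in plain Python (not MATLAB, and not what fminunc does internally): it applies the exact Newton step -inv(Hessian)*gradient and, for comparison, steepest descent with an exact line search. The starting point, the tolerance, and the function names are my own assumptions; the start (1e4, 1) is deliberately chosen to be unfavorable for steepest descent.

```python
import math

# Objective from the quoted example: f(x, y) = A[0]*x^2 + A[1]*y^2.
A = (1.0, 1.0e4)


def grad(p):
    return (2.0 * A[0] * p[0], 2.0 * A[1] * p[1])


def norm(v):
    return math.hypot(v[0], v[1])


def newton(p, tol=1e-6, max_iter=100):
    """Newton iteration; the Hessian is the constant diag(2*A[0], 2*A[1])."""
    g0 = norm(grad(p))
    for k in range(max_iter):
        g = grad(p)
        if norm(g) <= tol * g0:
            return p, k
        # stepdir = -inv(Hessian) * gradient, written out for a diagonal Hessian
        p = (p[0] - g[0] / (2.0 * A[0]), p[1] - g[1] / (2.0 * A[1]))
    return p, max_iter


def steepest_descent(p, tol=1e-6, max_iter=200000):
    """Steepest descent with the exact line search for a quadratic."""
    g0 = norm(grad(p))
    for k in range(max_iter):
        g = grad(p)
        if norm(g) <= tol * g0:
            return p, k
        # Optimal step length along -g: alpha = (g'g) / (g'Hg)
        gHg = 2.0 * A[0] * g[0] ** 2 + 2.0 * A[1] * g[1] ** 2
        alpha = (g[0] ** 2 + g[1] ** 2) / gHg
        p = (p[0] - alpha * g[0], p[1] - alpha * g[1])
    return p, max_iter


start = (1.0e4, 1.0)  # assumed start, unfavorable for steepest descent
_, newton_iters = newton(start)
_, sd_iters = steepest_descent(start)
print("Newton iterations:", newton_iters)
print("Steepest-descent iterations:", sd_iters)
```

The steepest-descent count depends strongly on the starting point and on the tolerance, so the number printed here should not be read as typical. The Newton step does not care about the 1e4 factor at all, since the Hessian absorbs it.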
To understand what happens with TolX (which, I stress, is at least 1e7 times smaller than my required precision, even on the scaled parameters), I opened a thread asking how the reported step size is calculated, but I did not get any answer.