"Bruno Luong" <email@example.com> wrote in message > When you scale the parameters, you affect (1) the non-linearity AND (2) the conditioning of the cost function. You can't overlook either aspect.
Thank you, very interesting. I have to note that even without formal training on the topic I got most of the points right! I use regularization, I spent a lot of time constructing a first guess that is "almost the solution", I compute the gradient (it took me a bit to get it correct :-p ) and so on. The reason the whole topic started is this point in your old post: "At least, the correct rescaling should be carried out on the unknown space so that the variation among unknowns affects the cost function comparably. Do not brutally mix parameters with different physical scales together."
The optimization worked correctly before scaling the parameters. But I knew it wasn't right to mix parameters that affect the cost function so differently, so I scaled them hoping to improve the performance of the minimization. It turned out the other way round! If I scale the parameters so that each unknown affects the cost function comparably... I get stuck in local minima!! (Worse, for me, than my initial guess!) Without scaling, some unknowns are simply untouched by the minimizer and stay at the initial guess... and yet everything works better and local minima are avoided.
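For anyone following the thread, here is a minimal sketch (in Python/SciPy rather than MATLAB) of what "scaling the unknowns" means in practice. The cost function and the characteristic scales are made up for illustration; the idea is just to optimize in a normalized space u = x / s and map back with x = u * s, so each unknown perturbs the cost comparably.

```python
import numpy as np
from scipy.optimize import minimize

# Toy cost with unknowns on very different physical scales:
# x[0] is around 1e-3, x[1] is around 1e3 (illustrative only).
def cost(x):
    return (x[0] - 2e-3) ** 2 / 1e-6 + (x[1] - 5e3) ** 2 / 1e6

x0 = np.array([1e-3, 1e3])  # initial guess

# Unscaled: the optimizer sees wildly different sensitivities per unknown.
res_raw = minimize(cost, x0)

# Scaled: pick characteristic scales s (here, the magnitude of the guess),
# optimize over u = x / s, then map the result back with x = u * s.
s = np.array([1e-3, 1e3])
res_scaled = minimize(lambda u: cost(u * s), x0 / s)
x_scaled = res_scaled.x * s
```

Whether this helps or hurts depends on the problem, as my experience above shows; on this convex toy cost both runs reach the same minimum, but on a non-convex cost the scaled search path can differ and land in a different basin.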