Search All of the Math Forum:
Views expressed in these public forums are not endorsed by
NCTM or The Math Forum.


Math Forum
»
Discussions
»
sci.math.*
»
sci.math
Notice: We are no longer accepting new posts, but the forums will continue to be readable.
Topic:
Step Control for Gradient Descent ?
Replies:
5
Last Post:
Feb 23, 2009 1:46 PM




Re: Step Control for Gradient Descent ?
Posted:
Feb 18, 2009 8:34 PM


On Feb 18, 3:31 pm, rgvick...@gmail.com wrote: > > How large is the problem? (That is, how many variables and constraints > do you have?) If it is small (say, no more than 200300 variables and > 100200 constraints) you can try using readilyavailable software, > such as the builtin EXCEL Solver.
Roughly 1000 variables with perhaps 50 or 100 constraints.
> Alternatively, if the problem is > much smaller, you could try using the latest Global optimization > package in the LINGO solver, which is available from the LINDO Corp. I > think that limitedcapacity "student" versions are available for free, > while largercapacity implementations are available for a modest > costor much larger versions for around $1000. If you are at a > university that has a site licence for some optimization packages, why > not try using them? >
Certainly, I can't use Excel for anything other than testing, as eventually this has to be integrated with other software. However, buying software might be an option down the road when I have more time to redesign what is in front of me. There are other reasons to keep along this track, but I will spare you the details. For now, I'm just trying to squeeze a tiny bit more juice out of the current solver until I have time to integrate a different one. I heard that people had worked on dynamic stepping for gradient descent, and so I thought I would ask around.
> Why am I making an issue of this? Well, there are two aspects to your > problem: (1) getting a local optimum in reasonable time and with > reasonable accuracy; and (2) getting a global optimum. Let's just look > at (1) for the moment. It has long been known through examples that > simple gradient searches (of the type you seem to be using) are > dangerous: you can have convergence to a NONOPTIMAL point.
What do you mean nonoptimal? Of course it can converge on a local minimum, but that is a problem with any local method, as far as I understand. Newton's method, BFGS, conjugate gradient all have this problem. Gradient methods will find the local minimum, given enough time. Or do you have a counterexample of this?
> Furthermore, even in the absence of constraints, the method can be > incredible slow, and can even converge to a nonoptimal point when > using finitewordlength arithmetic on a real computer (even though it > converges theoretically if one uses infiniteprecision computations). > That is why numerous teams of researchers working for decades have > devoted so much effort to dealing with such problems. Typical, > powerful implementations would include better unconstrained > optimization methods (such as conjugategradient or quasiNewton > methods) having quadratic convergence or better. And there is the > important issue of how to handle constraints, especially of inequality > type. There a typical, powerful approach is the reducedgradient or > generalized reducedgradient method, perhaps with modern "augmented > Lagrangian" methods. For example, some algorithms (such as MINOS) for > nonlinearlyconstrained problems use generalized reducedgradient > methods to handle linear constraints and augmented lagrangians to deal > with nonlinear constraints. >
Alright, I will look into these as well.
> As I said, teams of researchers have spent decades developing powerful > methods to handle such problemsbecause realworls problems are too > hard to be handled naivelyand have come up with techniques that > will be orders of magnitudes better than what you are attempting. Why > reinvent the wheel, and ineffectively at that? Even a tool such as > the EXCEL Solver (also available on opensource3 spreadsheets) embody > much of this advanced research, and utilize quasiNewton mehtods > together with generalized reducedgradient methods. The LINGO package > uses similar recent researh results. > > Next, there is the issue (2): the question of global optimization. It > would likely pay you to read some of the extant research on this > topic; even doing a Google search will turn up a lot of useful > material free of charge. It is still a difficult problem, and without > some special structure to exploit, one can probably never be > guaranteed that a true global optimum has been reached. There are, > however, numerous effective heuristics that seem to have good > performance. Often one must apply (local) optimization packages > numerous times, perhaps using several randomlygenerated starting > points. Other techniques, such as simulated annealing, taboo search or > genetic algorithms are brought to bear on such problems. > > You might get better information and advice by posting your message to > the Operations Research newsgroup 'sci.opresearch'. The majority of > postings over there deal with issues related to optimization. > > Good luck. > > R.G. Vickson > Adjunct Professor, University of Waterloo > >
Thanks Ray. This gives me a good start.
Dave



