
Topic: Step Control for Gradient Descent ?
Replies: 5   Last Post: Feb 23, 2009 1:46 PM

RGVickson@shaw.ca

Posts: 1,677
Registered: 12/1/07
Re: Step Control for Gradient Descent ?
Posted: Feb 18, 2009 9:17 PM

On Feb 18, 5:34 pm, dave.rud...@usask.ca wrote:
> On Feb 18, 3:31 pm, rgvick...@gmail.com wrote:
>
>
>

> > How large is the problem? (That is, how many variables and constraints
> > do you have?) If it is small (say, no more than 200-300 variables and
> > 100-200 constraints) you can try using readily-available software,
> > such as the built-in EXCEL Solver.

>
> Roughly 1000 variables with perhaps 50 or 100 constraints.
>

> > Alternatively, if the problem is
> > much smaller, you could try using the latest Global optimization
> > package in the LINGO solver, which is available from the LINDO Corp. I
> > think that limited-capacity "student" versions are available for free,
> > while larger-capacity implementations are available for a modest
> > cost---or much larger versions for around $1000. If you are at a
> > university that has a site licence for some optimization packages, why
> > not try using them?

>
> Certainly, I can't use Excel for anything other than testing, as
> eventually this has to be integrated with other software. However,
> buying software might be an option down the road when I have more time
> to redesign what is in front of me. There are other reasons to keep
> along this track, but I will spare you the details. For now, I'm just
> trying to squeeze a tiny bit more juice out of the current solver
> until I have time to integrate a different one. I heard that people
> had worked on dynamic stepping for gradient descent, and so I thought
> I would ask around.
>
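
Since the thread topic is step control: the "dynamic stepping" Dave asks
about is commonly implemented as a backtracking (Armijo) line search,
which starts from a nominal step and shrinks it until a
sufficient-decrease test passes. A minimal Python sketch; the test
function and all parameter values below are illustrative, not taken from
the thread:

import numpy as np

def backtracking_gradient_descent(f, grad, x0, alpha0=1.0, beta=0.5,
                                  c=1e-4, tol=1e-8, max_iter=10000):
    """Gradient descent with an Armijo backtracking line search.

    At each iterate, start from step alpha0 and shrink by the factor
    beta until the sufficient-decrease condition
        f(x - alpha*g) <= f(x) - c*alpha*||g||^2
    holds. Stop when the gradient norm falls below tol.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        gnorm2 = g @ g
        if np.sqrt(gnorm2) < tol:
            break
        fx = f(x)
        alpha = alpha0
        while f(x - alpha * g) > fx - c * alpha * gnorm2:
            alpha *= beta
        x = x - alpha * g
    return x

# Hypothetical badly scaled quadratic, for illustration only:
f = lambda x: 0.5 * (x[0]**2 + 100.0 * x[1]**2)
grad = lambda x: np.array([x[0], 100.0 * x[1]])
print(backtracking_gradient_descent(f, grad, [1.0, 1.0]))  # ~ [0, 0]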

> > Why am I making an issue of this? Well, there are two aspects to your
> > problem: (1) getting a local optimum in reasonable time and with
> > reasonable accuracy; and (2) getting a global optimum. Let's just look
> > at (1) for the moment. It has long been known through examples that
> > simple gradient searches (of the type you seem to be using) are
> > dangerous: you can have convergence to a NON-OPTIMAL point.

>
> What do you mean non-optimal? Of course it can converge on a local
> minimum, but that is a problem with any local method, as far as I
> understand.


By non-optimal, I mean not even a local minimum, even in a "convex"
problem where any local minimum is automatically a global minimum. The
successive points can get jammed up along a constraint, but at a point
that is very far from satisfying optimality conditions.
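
One way to make "far from satisfying optimality conditions" concrete:
at a constrained stationary point, a projected gradient step returns the
point itself, so the (scaled) length of that step is a first-order
optimality measure; a jammed point shows a clearly nonzero value even
when the iterates have stopped moving. A minimal Python sketch, assuming
simple box constraints (the thread does not specify the constraint
type):

import numpy as np

def projected_grad_norm(x, grad, lo, hi, t=1.0):
    """First-order optimality measure for min f(x) s.t. lo <= x <= hi.

    At a stationary point, projecting a gradient step back onto the box
    returns x itself, so this measure is zero. A point merely jammed
    against a constraint gives a large value even if successive iterates
    barely move.
    """
    step = np.clip(x - t * grad(x), lo, hi)  # projected gradient step
    return np.linalg.norm(step - x) / t

# Hypothetical example: f = (x0 - 3)^2 + x1^2 over the box [0, 1]^2,
# whose constrained minimum is at (1, 0).
grad = lambda x: np.array([2.0 * (x[0] - 3.0), 2.0 * x[1]])
lo, hi = np.array([0.0, 0.0]), np.array([1.0, 1.0])
print(projected_grad_norm(np.array([1.0, 1.0]), grad, lo, hi))  # 1.0: stuck, not optimal
print(projected_grad_norm(np.array([1.0, 0.0]), grad, lo, hi))  # 0.0: optimal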


> Newton's method, BFGS, and conjugate gradient all have this
> problem. Gradient methods will find the local minimum, given enough
> time. Or do you have a counter-example of this?


No, of course not (although some practical problems are hard to solve
when roundoff errors are present, and getting a solution may take a
long time in "bad" cases). These methods are designed to find local
optima. The issue of global optimization is separate.

RGV

>
>
>

> > Furthermore, even in the absence of constraints, the method can be
> > incredibly slow, and can even converge to a non-optimal point when
> > using finite-wordlength arithmetic on a real computer (even though it
> > converges theoretically if one uses infinite-precision computations).
> > That is why numerous teams of researchers working for decades have
> > devoted so much effort to dealing with such problems. Typical,
> > powerful implementations would include better unconstrained
> > optimization methods (such as conjugate-gradient or quasi-Newton
> > methods) having superlinear convergence or better. And there is the
> > important issue of how to handle constraints, especially of inequality
> > type. There, a typical, powerful approach is the reduced-gradient or
> > generalized reduced-gradient method, perhaps combined with modern
> > "augmented Lagrangian" methods. For example, some algorithms (such as
> > MINOS) for nonlinearly-constrained problems use generalized
> > reduced-gradient methods to handle linear constraints and augmented
> > Lagrangians to deal with nonlinear constraints.
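
For a present-day illustration of this combination (quasi-Newton updates
plus direct handling of the constraints), SciPy's SLSQP interface, which
postdates this 2009 thread, follows the same design. A sketch with a
made-up objective and one inequality constraint:

import numpy as np
from scipy.optimize import minimize

# Hypothetical objective and constraint, for illustration only:
def f(x):
    return (x[0] - 1.0) ** 2 + (x[1] - 2.5) ** 2

# SciPy expects inequality constraints in the form g(x) >= 0.
constraints = [{"type": "ineq", "fun": lambda x: x[0] - 2.0 * x[1] + 2.0}]
bounds = [(0.0, None), (0.0, None)]

# SLSQP maintains a quasi-Newton approximation of the Lagrangian's
# Hessian and treats the constraints directly, rather than taking raw
# gradient steps.
res = minimize(f, x0=np.array([2.0, 0.0]), method="SLSQP",
               bounds=bounds, constraints=constraints)
print(res.x, res.fun)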

>
> Alright, I will look into these as well.
>
>
>

> > As I said, teams of researchers have spent decades developing powerful
> > methods to handle such problems---because real-world problems are too
> > hard to be handled naively---and have come up with techniques that
> > will be orders of magnitude better than what you are attempting. Why
> > re-invent the wheel, and ineffectively at that? Even a tool such as
> > the EXCEL Solver (also available in open-source spreadsheets) embodies
> > much of this advanced research, utilizing quasi-Newton methods
> > together with generalized reduced-gradient methods. The LINGO package
> > uses similar recent research results.

>
> > Next, there is the issue (2): the question of global optimization. It
> > would likely pay you to read some of the extant research on this
> > topic; even doing a Google search will turn up a lot of useful
> > material free of charge. It is still a difficult problem, and without
> > some special structure to exploit, one can probably never be
> > guaranteed that a true global optimum has been reached. There are,
> > however, numerous effective heuristics that seem to have good
> > performance. Often one must apply (local) optimization packages
> > numerous times, perhaps using several randomly-generated starting
> > points. Other techniques, such as simulated annealing, tabu search,
> > or genetic algorithms, are also brought to bear on such problems.
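
A minimal Python sketch of the multi-start heuristic described above
(local solves from several random starting points, keeping the best);
the one-dimensional multimodal objective is made up for illustration,
and there is still no guarantee that the best local solution found is
the global one:

import numpy as np
from scipy.optimize import minimize

def multistart(f, bounds, n_starts=20, seed=0):
    """Run a local solver from random starting points; keep the best."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds).T
    best = None
    for _ in range(n_starts):
        x0 = rng.uniform(lo, hi)                 # random starting point
        res = minimize(f, x0, method="L-BFGS-B", bounds=bounds)
        if best is None or res.fun < best.fun:   # keep the best local optimum
            best = res
    return best

# Hypothetical multimodal objective with several local minima:
f = lambda x: np.sin(3.0 * x[0]) + 0.1 * x[0] ** 2
print(multistart(f, bounds=[(-5.0, 5.0)]).x)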

>
> > You might get better information and advice by posting your message to
> > the Operations Research newsgroup 'sci.op-research'. The majority of
> > postings over there deal with issues related to optimization.

>
> > Good luck.
>
> > R.G. Vickson
> > Adjunct Professor, University of Waterloo

>
> Thanks Ray. This gives me a good start.
>
> Dave




