Search All of the Math Forum:

Views expressed in these public forums are not endorsed by NCTM or The Math Forum.

Topic: Step Control for Gradient Descent ?
Replies: 5   Last Post: Feb 23, 2009 1:46 PM

 Messages: [ Previous | Next ]
 Dave Rudolf Posts: 37 Registered: 10/3/05
Re: Step Control for Gradient Descent ?
Posted: Feb 18, 2009 8:34 PM

On Feb 18, 3:31 pm, rgvick...@gmail.com wrote:
>
> How large is the problem? (That is, how many variables and constraints
> do you have?) If it is small (say, no more than 200-300 variables and
> 100-200 constraints) you can try using readily-available software,
> such as the built-in EXCEL Solver.

Roughly 1000 variables with perhaps 50 or 100 constraints.

> Alternatively, if the problem is
> much smaller, you could try using the latest Global optimization
> package in the LINGO solver, which is available from the LINDO Corp. I
> while larger-capacity implementations are available for a modest
> cost---or much larger versions for around \$1000. If you are at a
> university that has a site licence for some optimization packages, why
> not try using them?
>

Certainly, I can't use Excel for anything other than testing, as
eventually this has to be integrated with other software. However,
buying software might be an option down the road when I have more time
to redesign what is in front of me. There are other reasons to keep
along this track, but I will spare you the details. For now, I'm just
trying to squeeze a tiny bit more juice out of the current solver
until I have time to integrate a different one. I heard that people
had worked on dynamic stepping for gradient descent, and so I thought

> Why am I making an issue of this? Well, there are two aspects to your
> problem: (1) getting a local optimum in reasonable time and with
> reasonable accuracy; and (2) getting a global optimum. Let's just look
> at (1) for the moment. It has long been known through examples that
> simple gradient searches (of the type you seem to be using) are
> dangerous: you can have convergence to a NON-OPTIMAL point.

What do you mean non-optimal? Of course it can converge on a local
minimum, but that is a problem with any local method, as far as I
understand. Newton's method, BFGS, conjugate gradient all have this
problem. Gradient methods will find the local minimum, given enough
time. Or do you have a counter-example of this?

> Furthermore, even in the absence of constraints, the method can be
> incredible slow, and can even converge to a non-optimal point when
> using finite-wordlength arithmetic on a real computer (even though it
> converges theoretically if one uses infinite-precision computations).
> That is why numerous teams of researchers working for decades have
> devoted so much effort to dealing with such problems. Typical,
> powerful implementations would include better unconstrained
> optimization methods (such as conjugate-gradient or quasi-Newton
> methods) having quadratic convergence or better. And there is the
> important issue of how to handle constraints, especially of inequality
> type. There a typical, powerful approach is the reduced-gradient or
> generalized reduced-gradient method, perhaps with modern "augmented
> Lagrangian" methods. For example, some algorithms (such as MINOS) for
> nonlinearly-constrained problems use generalized reduced-gradient
> methods to handle linear constraints and augmented lagrangians to deal
> with nonlinear constraints.
>

Alright, I will look into these as well.

> As I said, teams of researchers have spent decades developing powerful
> methods to handle such problems---because real-worls problems are too
> hard to be handled naively---and have come up with techniques that
> will be orders of magnitudes better than what you are attempting. Why
> re-invent the wheel, and ineffectively at that? Even a tool such as
> the EXCEL Solver (also available on open-source3 spreadsheets) embody
> much of this advanced research, and utilize quasi-Newton mehtods
> together with generalized reduced-gradient methods. The LINGO package
> uses similar recent researh results.
>
> Next, there is the issue (2): the question of global optimization. It
> would likely pay you to read some of the extant research on this
> topic; even doing a Google search will turn up a lot of useful
> material free of charge. It is still a difficult problem, and without
> some special structure to exploit, one can probably never be
> guaranteed that a true global optimum has been reached. There are,
> however, numerous effective heuristics that seem to have good
> performance. Often one must apply (local) optimization packages
> numerous times, perhaps using several randomly-generated starting
> points. Other techniques, such as simulated annealing, taboo search or
> genetic algorithms are brought to bear on such problems.
>
> You might get better information and advice by posting your message to
> the Operations Research newsgroup 'sci.op-research'. The majority of
> postings over there deal with issues related to optimization.
>
> Good luck.
>
> R.G. Vickson
> Adjunct Professor, University of Waterloo
>
>

Thanks Ray. This gives me a good start.

Dave

Date Subject Author
2/18/09 Dave Rudolf
2/18/09 rgvickson@gmail.com
2/18/09 RGVickson@shaw.ca
2/18/09 Dave Rudolf
2/18/09 RGVickson@shaw.ca
2/23/09 Dave Rudolf