===Iterative methods===
{{Main|Iterative method}}
The [[iterative methods]] used to solve problems of [[nonlinear programming]] differ according to whether they [[subroutine|evaluate]] [[Hessian matrix|Hessians]], gradients, or only function values. While evaluating Hessians (H) and gradients (G) improves the rate of convergence for functions for which these quantities exist and vary sufficiently smoothly, such evaluations increase the [[Computational complexity theory|computational complexity]] (or computational cost) of each iteration. In some cases, the computational complexity may be excessively high.

One major criterion for optimizers is the number of required function evaluations, as these often dominate the overall computational effort, usually far exceeding the work done within the optimizer itself, which mainly has to operate over the N variables. Derivatives provide detailed information for such optimizers but are even harder to calculate; for example, approximating the gradient takes at least N+1 function evaluations. Approximations of the second derivatives (collected in the Hessian matrix) take a number of function evaluations on the order of N². Newton's method requires the second-order derivatives, so for each iteration the number of function calls is on the order of N², but for a simpler pure gradient optimizer it is only N. However, gradient optimizers usually need more iterations than Newton's algorithm. Which one is best with respect to the number of function calls depends on the problem itself (see the illustrative sketches following the list below).

* Methods that evaluate Hessians (or approximate Hessians, using [[finite difference]]s):
** [[Newton's method in optimization|Newton's method]]
** [[Sequential quadratic programming]]: A Newton-based method for small- to medium-scale ''constrained'' problems. Some versions can handle large-dimensional problems.
** [[Interior point methods]]: This is a large class of methods for constrained optimization, some of which use only (sub)gradient information and others of which require the evaluation of Hessians.
* Methods that evaluate gradients, or approximate gradients in some way (or even subgradients):
** [[Coordinate descent]] methods: Algorithms which update a single coordinate in each iteration
** [[Conjugate gradient method]]s: [[Iterative method]]s for large problems. (In theory, these methods terminate in a finite number of steps with quadratic objective functions, but this finite termination is not observed in practice on finite-precision computers.)
** [[Gradient descent]] (alternatively, "steepest descent" or "steepest ascent"): A (slow) method of historical and theoretical interest, which has had renewed interest for finding approximate solutions of enormous problems.
** [[Subgradient method]]s: An iterative method for large [[Rademacher's theorem|locally]] [[Lipschitz continuity|Lipschitz functions]] using [[subgradient|generalized gradients]]. Following Boris T. Polyak, subgradient-projection methods are similar to conjugate-gradient methods.
** Bundle method of descent: An iterative method for small- to medium-sized problems with locally Lipschitz functions, particularly for [[convex optimization|convex minimization]] problems (similar to conjugate gradient methods).
** [[Ellipsoid method]]: An iterative method for small problems with [[quasiconvex function|quasiconvex]] objective functions and of great theoretical interest, particularly in establishing the polynomial-time complexity of some combinatorial optimization problems. It has similarities with Quasi-Newton methods.
** [[Frank–Wolfe algorithm|Conditional gradient method (Frank–Wolfe)]] for approximate minimization of specially structured problems with [[linear constraints]], especially with traffic networks. For general unconstrained problems, this method reduces to the gradient method, which is regarded as obsolete (for almost all problems).
** [[Quasi-Newton method]]s: Iterative methods for medium-large problems (e.g. N<1000).
** [[Simultaneous perturbation stochastic approximation]] (SPSA) method for stochastic optimization; uses random (efficient) gradient approximation.
* Methods that evaluate only function values: If a problem is continuously differentiable, then gradients can be approximated using finite differences, in which case a gradient-based method can be used.
** [[Interpolation]] methods
** [[Pattern search (optimization)|Pattern search]] methods, which have better convergence properties than the [[Nelder–Mead method|Nelder–Mead heuristic (with simplices)]], which is listed below.
** [[Mirror descent]]
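
The evaluation counts discussed above can be made concrete with a minimal sketch (illustrative only; the helper names <code>approx_gradient</code> and <code>approx_hessian</code> are not from any particular library): a forward-difference gradient of an objective of N variables uses N+1 evaluations of the function, while a finite-difference Hessian uses on the order of N² evaluations.

<syntaxhighlight lang="python">
import numpy as np

def approx_gradient(f, x, h=1e-6):
    """Forward-difference gradient: uses N+1 evaluations of f."""
    n = len(x)
    fx = f(x)                          # 1 evaluation at the base point
    grad = np.empty(n)
    for i in range(n):                 # N further evaluations, one per coordinate
        e = np.zeros(n)
        e[i] = h
        grad[i] = (f(x + e) - fx) / h
    return grad

def approx_hessian(f, x, h=1e-4):
    """Finite-difference Hessian: uses on the order of N^2 evaluations of f."""
    n = len(x)
    fx = f(x)
    H = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei) - f(x + ej) + fx) / h**2
    return 0.5 * (H + H.T)             # symmetrize to reduce rounding noise
</syntaxhighlight>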
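A second sketch (again illustrative, using an arbitrary convex quadratic rather than any particular application) shows the trade-off between the two families: a single Newton iteration reaches the minimizer of a quadratic, whereas gradient descent needs many cheaper iterations to reach comparable accuracy.

<syntaxhighlight lang="python">
import numpy as np

# Illustrative convex quadratic f(x) = 1/2 x^T A x, minimized at the origin.
A = np.array([[4.0, 1.0], [1.0, 2.0]])
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x                 # exact gradient
hess = lambda x: A                     # exact (constant) Hessian

x0 = np.array([3.0, -2.0])

# Newton's method: one iteration solves a quadratic problem exactly.
x_newton = x0 - np.linalg.solve(hess(x0), grad(x0))
print(x_newton)                        # essentially [0, 0], the minimizer

# Gradient descent: each iteration avoids the Hessian and is cheaper,
# but many more iterations are needed.
x, steps = x0.copy(), 0
while np.linalg.norm(grad(x)) > 1e-8:
    x = x - 0.05 * grad(x)
    steps += 1
print(steps, x)                        # on the order of a few hundred steps
</syntaxhighlight>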