Nonlinear programming
- Stationary point: x_0 is a stationary point if d f(x)/dx evaluated at x_0 equals 0. For simplicity, denote d f(x)/dx evaluated at x_0 as d f(x_0)/dx.
- local extremal/extreme point (extremum): x_0 is an extremum if d^k f(x_0)/dx^k = 0 for all k (1 <= k < n), and d^n f(x_0)/dx^n is nonzero with n even (n >= 2). Further,
- x_0 is a local/relative minimum (point) if d^n f(x_0)/dx^n > 0. A local minimum is the smallest value in some local neighborhood.
- x_0 is a local/relative maximum (point) if d^n f(x_0)/dx^n < 0. A local maximum is the largest value in some local neighborhood.
- Minimum (resp., maximum) can mean
either minimum (resp., maximum) point or minimum (resp., maximum)
value.
- saddle point: x_0 is a saddle point if d^k f(x_0)/dx^k = 0 for all k (1 <= k < n), and d^n f(x_0)/dx^n is nonzero with n odd (n >= 3). (A code sketch classifying stationary points follows this list.)
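A minimal sketch of this classification, assuming SymPy is available (the function name classify and the cutoff max_order are illustrative, not from the source):

    # Classify a stationary point x0 of f by the order n of its first
    # nonvanishing derivative, following the rules above.
    import sympy as sp

    x = sp.symbols('x')

    def classify(f, x0, max_order=8):
        for n in range(1, max_order + 1):
            d = sp.diff(f, x, n).subs(x, x0)
            if d != 0:
                if n == 1:
                    return 'not a stationary point'
                if n % 2 == 0:                   # n even: extremum
                    return 'local min' if d > 0 else 'local max'
                return 'saddle point'            # n odd (n >= 3)
        return 'inconclusive'

    print(classify(x**4, 0))  # local min (n = 4, d^4 f(0)/dx^4 = 24 > 0)
    print(classify(x**3, 0))  # saddle point (n = 3)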
- Convex programming
- Optimize a convex objective function
under convex constraints.
- Unique features
- The problem can be solved numerically
with great efficiency
- Any local optimal solution is also a global optimal solution (and it is unique when the objective is strictly convex).
- No heuristic stopping criteria are needed; provable lower bounds give a rigorous stopping criterion for any desired accuracy.
- Convex programming can handle non-differentiable/non-smooth problems as well as smooth problems.
- Its duality can provide certificates that prove infeasibility or lower bounds on the objective.
- The duality theory can be used for sensitivity analysis w.r.t. changes in the objective and the constraints (see the sketch after this list).
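A minimal sketch of these features, assuming CVXPY is installed (the specific problem is illustrative):

    # A tiny convex program: the solver returns the global optimum, and the
    # constraint's dual variable supports the sensitivity analysis noted above.
    import cvxpy as cp

    x = cp.Variable(2)
    constraints = [cp.sum(x) >= 1]     # affine, hence convex, constraint
    prob = cp.Problem(cp.Minimize(cp.sum_squares(x)), constraints)
    prob.solve()

    print(prob.value)                  # 0.5, the global optimum
    print(constraints[0].dual_value)   # sensitivity of the optimum to this constraint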
- Geometric programming
- monomial function: f(x) = c * x_1^{alpha_1} * x_2^{alpha_2} * ... * x_n^{alpha_n}, where
- c > 0 (strictly positive, so that beta = log c below is defined)
- domain of f: x_1 > 0, x_2 > 0, ..., x_n > 0
- alpha_i \in R, for all i.
- posynomial function: a sum of monomial
functions.
- geometric programming minimizes a posynomial objective subject to posynomial constraints.
- transformation of variables: y_i = log x_i, i.e., x_i = exp(y_i).
- monomial function: log f(x) = alpha_1 *
y_1 + ... + alpha_n * y_n + beta is affine in y (where beta = log c).
- posynomial function: log f(x) = log \sum_{k=1}^r
exp(alpha_{1k} * y_1 + ... + alpha_{nk} * y_n + beta_k) is convex in y.
- geometric programming in convex form:
- take the log and replace x_i by exp(y_i), e.g., from f(x) we get log f(exp(y_1), ..., exp(y_n)) (see the sketch below).
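A minimal sketch, assuming CVXPY is installed; its gp=True mode applies this log/exp change of variables internally (the particular objective and constraints are illustrative):

    # A tiny geometric program: monomial objective with a posynomial
    # constraint, solved in convex form via the log transformation.
    import cvxpy as cp

    x = cp.Variable(pos=True)
    y = cp.Variable(pos=True)
    objective = cp.Minimize(1 / (x * y))   # monomial objective
    constraints = [x + y <= 2]             # posynomial constraint
    prob = cp.Problem(objective, constraints)
    prob.solve(gp=True)
    print(x.value, y.value, prob.value)    # x = y = 1, optimal value 1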
- Semi-definite Programming (SDP)
- linear objective function with linear
matrix inequality (LMI) constraints
- LMI: \sum_{i=1}^n F_i * x_i <= 0 (negative semidefinite), where the F_i are symmetric matrices; a constant term F_0 is often included as well. The left-hand side can be regarded as a linear combination of the matrices F_i with scalar weights x_i.
- SDP is a convex program.
- LMI is a generalization of the linear constraints Ax <= b in linear programming (LP) (see the sketch below).
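A minimal sketch, assuming CVXPY with an SDP-capable solver such as SCS (the matrices are illustrative); the LMI is written in the equivalent positive-semidefinite direction:

    # Minimize a linear objective subject to an LMI: a linear combination of
    # symmetric matrices constrained in the semidefinite order.
    import cvxpy as cp
    import numpy as np

    F1 = np.array([[1.0, 0.0], [0.0, 0.0]])    # symmetric matrix coefficients
    F2 = np.array([[0.0, 0.0], [0.0, 1.0]])
    x = cp.Variable(2)

    lmi = x[0] * F1 + x[1] * F2 >> np.eye(2)   # i.e., I - x1*F1 - x2*F2 <= 0
    prob = cp.Problem(cp.Minimize(x[0] + x[1]), [lmi])
    prob.solve()
    print(x.value, prob.value)                 # x = (1, 1), optimal value 2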
- What does regression mean?
- Regression means
- given an observation y, estimating the value of the underlying parameter x that produced y. It is like moving backward (in a logical sense, not necessarily a temporal sense) to find the cause, given the observation.
- For example, estimating engine efficiency
based on oil pressure.
- Regression generalizes classification, since x can be any real-valued quantity, including a class index. Many classification algorithms can be understood as quantizing the output of a regression. As in classification, the conditional model of x can be obtained from a joint model (which includes a model of y), or it can be learned directly. Curve fitting is the common special case where y is assumed to be a deterministic function of x plus additive noise (usually Gaussian), i.e., y = f(x) + n (see the sketch below). Methods for curve fitting include radial basis functions, feed-forward neural networks, and mixtures of experts.
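A minimal sketch of curve fitting as regression, assuming NumPy (the linear model and noise level are illustrative); the unknown coefficients play the role of the parameter x, and the noisy measurements play the role of the observation y:

    # Fit y = f(t) + n by least squares, where f is linear in the unknown
    # coefficients and n is Gaussian noise.
    import numpy as np

    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 50)
    y = 2.0 * t + 1.0 + 0.1 * rng.standard_normal(t.size)   # y = f(t) + n

    A = np.column_stack([t, np.ones_like(t)])      # design matrix for a line
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares estimate
    print(coef)                                    # close to the true (2.0, 1.0)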
- Constrained vs. unconstrained optimization
- Typically, constrained optimization problems are more difficult to solve than unconstrained ones, since the boundary conditions of the feasible set must be handled. Usually, the more constraints, the more difficult the problem. However, if the feasible set becomes very small because of many constraints, the additional constraints can actually make the problem easier to solve (see the sketch below).
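A minimal sketch of this contrast, assuming SciPy (the objective and constraint are illustrative):

    # The same objective solved unconstrained and with one inequality
    # constraint; the constrained optimum moves onto the boundary.
    from scipy.optimize import minimize

    f = lambda v: (v[0] - 1.0) ** 2 + (v[1] - 2.0) ** 2

    free = minimize(f, x0=[0.0, 0.0])                            # unconstrained
    con = {'type': 'ineq', 'fun': lambda v: 1.0 - v[0] - v[1]}   # v1 + v2 <= 1
    bound = minimize(f, x0=[0.0, 0.0], constraints=[con])        # constrained

    print(free.x)   # about (1, 2): the unconstrained minimum
    print(bound.x)  # about (0, 1): on the boundary v1 + v2 = 1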
Key Techniques: