riso.numerical
Class LBFGS

java.lang.Object
  |
  +--riso.numerical.LBFGS

public class LBFGS
extends java.lang.Object

This class contains code for the limited-memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) algorithm for large-scale multidimensional unconstrained minimization problems. This file is a translation of Fortran code written by Jorge Nocedal. The only modification to the algorithm is the addition of a cache to store the result of the most recent line search. See solution_cache below. LBFGS is distributed as part of the RISO project. Following is a message from Jorge Nocedal:

   From: Jorge Nocedal [mailto:nocedal@dario.ece.nwu.edu]
   Sent: Friday, August 17, 2001 9:09 AM
   To: Robert Dodier
   Subject: Re: Commercial licensing terms for LBFGS?
   
   Robert:
   The code L-BFGS (for unconstrained problems) is in the public domain.
   It can be used in any commercial application.
   
   The code L-BFGS-B (for bound constrained problems) belongs to
   ACM. You need to contact them for a commercial license. It is
   algorithm 778.
   
   Jorge
 

This code is derived from the Fortran program lbfgs.f. The Java translation was effected mostly mechanically, with some manual clean-up; in particular, array indices start at 0 instead of 1. Most of the comments from the Fortran code have been pasted in here as well.

Here's some information on the original LBFGS Fortran source code, available at http://www.netlib.org/opt/lbfgs_um.shar. This info is taken verbatim from the Netlib blurb on the Fortran source.

 	file    opt/lbfgs_um.shar
 	for     unconstrained optimization problems
 	alg     limited memory BFGS method
 	by      J. Nocedal
 	contact nocedal@eecs.nwu.edu
 	ref     D. C. Liu and J. Nocedal, ``On the limited memory BFGS method for
 	,       large scale optimization methods'' Mathematical Programming 45
 	,       (1989), pp. 503-528.
 	,       (Postscript file of this paper is available via anonymous ftp
 	,       to eecs.nwu.edu in the directory pub/lbfgs/lbfgs_um.)
 


Nested Class Summary
static class LBFGS.ExceptionWithIflag
          Specialized exception class for LBFGS; contains the iflag value returned by lbfgs.
 
Field Summary
static double gtol
          Controls the accuracy of the line search mcsrch.
static double[] solution_cache
          The solution vector as it was at the end of the most recently completed line search.
static double stpmax
          Specify upper bound for the step in the line search.
static double stpmin
          Specify lower bound for the step in the line search.
 
Constructor Summary
LBFGS()
           
 
Method Summary
static void daxpy(int n, double da, double[] dx, int ix0, int incx, double[] dy, int iy0, int incy)
          Compute the sum of a vector times a scalar plus another vector.
static double ddot(int n, double[] dx, int ix0, int incx, double[] dy, int iy0, int incy)
          Compute the dot product of two vectors.
static void lb1(int[] iprint, int iter, int nfun, double gnorm, int n, int m, double[] x, double f, double[] g, double[] stp, boolean finish)
          Print debugging and status messages for lbfgs.
static void lbfgs(int n, int m, double[] x, double f, double[] g, boolean diagco, double[] diag, int[] iprint, double eps, double xtol, int[] iflag)
          Solve the unconstrained minimization problem min f(x) using the limited-memory BFGS method.
static int nfevaluations()
          This method returns the total number of evaluations of the objective function since the last time LBFGS was restarted.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

gtol

public static double gtol
Controls the accuracy of the line search mcsrch. If the function and gradient evaluations are inexpensive with respect to the cost of the iteration (which is sometimes the case when solving very large problems) it may be advantageous to set gtol to a small value. A typical small value is 0.1. Restriction: gtol should be greater than 1e-4.


stpmin

public static double stpmin
Specify lower bound for the step in the line search. The default value is 1e-20. This value need not be modified unless the exponent is too large for the machine being used, or unless the problem is extremely badly scaled (in which case the exponent should be increased).


stpmax

public static double stpmax
Specify upper bound for the step in the line search. The default value is 1e20. This value need not be modified unless the exponent is too large for the machine being used, or unless the problem is extremely badly scaled (in which case the exponent should be increased).


solution_cache

public static double[] solution_cache
The solution vector as it was at the end of the most recently completed line search. This will usually differ from the contents of the array x passed to lbfgs, because x is modified in place by line-search steps. A caller that wants to stop the optimization iterations before LBFGS.lbfgs stops on its own (by reaching a very small gradient) should copy this vector instead of using x. When LBFGS.lbfgs stops on its own, x and solution_cache are the same.

Constructor Detail

LBFGS

public LBFGS()
Method Detail

nfevaluations

public static int nfevaluations()
This method returns the total number of evaluations of the objective function since the last time LBFGS was restarted. The total number of function evaluations increases by the number of evaluations required for the line search; the total is only increased after a successful line search.


lbfgs

public static void lbfgs(int n,
                         int m,
                         double[] x,
                         double f,
                         double[] g,
                         boolean diagco,
                         double[] diag,
                         int[] iprint,
                         double eps,
                         double xtol,
                         int[] iflag)
                  throws LBFGS.ExceptionWithIflag
This subroutine solves the unconstrained minimization problem
     min f(x),    x = (x_1, x_2, ..., x_n),
 
using the limited-memory BFGS method. The routine is especially effective on problems involving a large number of variables. In a typical iteration of this method an approximation Hk to the inverse of the Hessian is obtained by applying m BFGS updates to a diagonal matrix Hk0, using information from the previous m steps. The user specifies the number m, which determines the amount of storage required by the routine. The user may also provide the diagonal matrices Hk0 if not satisfied with the default choice. The algorithm is described in "On the limited memory BFGS method for large scale optimization", by D. Liu and J. Nocedal, Mathematical Programming B 45 (1989) 503-528.

The user is required to calculate the function value f and its gradient g. In order to allow the user complete control over these computations, reverse communication is used: the routine must be called repeatedly under the control of the parameter iflag. The steplength is determined at each iteration by means of the line search routine mcsrch, which is a slight modification of the routine CSRCH written by Moré and Thuente.

The only variables that are machine-dependent are xtol, stpmin and stpmax. Progress messages are printed to System.out, and non-fatal error messages are printed to System.err. Fatal errors cause an exception to be thrown, as listed below.
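The description above says that Hk is built from m BFGS updates applied to the diagonal Hk0. One common way to apply such an approximation to a vector without ever forming Hk is the textbook "two-loop recursion". The following is a minimal sketch of that recursion, not the code of this class; the names TwoLoopSketch, applyInverseHessian, s, y and h0 are hypothetical, and the stored pairs are assumed to satisfy y[i] . s[i] > 0.

```java
final class TwoLoopSketch {
    /**
     * Returns H * g, where H is the limited-memory inverse-Hessian approximation
     * defined by the diagonal h0 and the correction pairs (s[i], y[i]),
     * ordered from oldest (index 0) to newest. Assumes y[i] . s[i] > 0.
     */
    static double[] applyInverseHessian(double[] g, double[][] s, double[][] y, double[] h0) {
        int k = s.length;
        int n = g.length;
        double[] q = g.clone();
        double[] alpha = new double[k];
        double[] rho = new double[k];

        // First loop: newest pair to oldest.
        for (int i = k - 1; i >= 0; i--) {
            rho[i] = 1.0 / dot(y[i], s[i]);
            alpha[i] = rho[i] * dot(s[i], q);
            for (int j = 0; j < n; j++) q[j] -= alpha[i] * y[i][j];
        }

        // Apply the diagonal starting matrix (Hk0 in the text above).
        for (int j = 0; j < n; j++) q[j] *= h0[j];

        // Second loop: oldest pair to newest.
        for (int i = 0; i < k; i++) {
            double beta = rho[i] * dot(y[i], q);
            for (int j = 0; j < n; j++) q[j] += s[i][j] * (alpha[i] - beta);
        }
        return q; // the search direction would be the negative of this vector
    }

    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int j = 0; j < a.length; j++) sum += a[j] * b[j];
        return sum;
    }

    public static void main(String[] args) {
        double[][] s = { {1.0, 0.0}, {1.0, 2.0} };
        double[][] y = { {2.0, 1.0}, {3.0, 1.0} };
        double[] r = applyInverseHessian(y[1], s, y, new double[] {1.0, 1.0});
        System.out.println(r[0] + " " + r[1]);
    }
}
```

A useful sanity check on any such sketch is the secant condition: applying the approximation to the newest y must return the newest s, which is what the main method exercises. Storage grows with m times n, which is why the choice of m controls the memory used by the routine.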

Parameters:
n - The number of variables in the minimization problem. Restriction: n > 0.
m - The number of corrections used in the BFGS update. Values of m less than 3 are not recommended; large values of m will result in excessive computing time. 3 <= m <= 7 is recommended. Restriction: m > 0.
x - On initial entry this must be set by the user to the values of the initial estimate of the solution vector. On exit with iflag = 0, it contains the values of the variables at the best point found (usually a solution).
f - Before initial entry and on a re-entry with iflag = 1, it must be set by the user to contain the value of the function f at the point x.
g - Before initial entry and on a re-entry with iflag = 1, it must be set by the user to contain the components of the gradient g at the point x.
diagco - Set this to true if the user wishes to provide the diagonal matrix Hk0 at each iteration. Otherwise it should be set to false in which case lbfgs will use a default value described below. If diagco is set to true the routine will return at each iteration of the algorithm with iflag = 2, and the diagonal matrix Hk0 must be provided in the array diag.
diag - If diagco = true, then on initial entry or on re-entry with iflag = 2, diag must be set by the user to contain the values of the diagonal matrix Hk0. Restriction: all elements of diag must be positive.
iprint - Specifies output generated by lbfgs. iprint[0] specifies the frequency of the output:
  • iprint[0] < 0: no output is generated,
  • iprint[0] = 0: output only at first and last iteration,
  • iprint[0] > 0: output every iprint[0] iterations.
iprint[1] specifies the type of output generated:
  • iprint[1] = 0: iteration count, number of function evaluations, function value, norm of the gradient, and steplength,
  • iprint[1] = 1: same as iprint[1]=0, plus vector of variables and gradient vector at the initial point,
  • iprint[1] = 2: same as iprint[1]=1, plus vector of variables,
  • iprint[1] = 3: same as iprint[1]=2, plus gradient vector.
eps - Determines the accuracy with which the solution is to be found. The subroutine terminates when
            ||g|| < eps * max(1, ||x||),
		
where ||.|| denotes the Euclidean norm.
xtol - An estimate of the machine precision (e.g. 1e-16 on a SUN station 3/60). The line search routine will terminate if the relative width of the interval of uncertainty is less than xtol.
iflag - This must be set to 0 on initial entry to lbfgs. A return with iflag < 0 indicates an error, and iflag = 0 indicates that the routine has terminated without detecting errors. On a return with iflag = 1, the user must evaluate the function f and gradient g. On a return with iflag = 2, the user must provide the diagonal matrix Hk0. The following negative values of iflag, each indicating an error, are possible:
  • iflag = -1 The line search routine mcsrch failed. One of the following messages is printed:
    • Improper input parameters.
    • Relative width of the interval of uncertainty is at most xtol.
    • More than 20 function evaluations were required at the present iteration.
    • The step is too small.
    • The step is too large.
    • Rounding errors prevent further progress. There may not be a step which satisfies the sufficient decrease and curvature conditions. Tolerances may be too small.
  • iflag = -2 The i-th diagonal element of the diagonal inverse Hessian approximation, given in diag, is not positive.
  • iflag = -3 Improper input parameters for lbfgs (n or m is not positive).
Throws:
LBFGS.ExceptionWithIflag
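
The reverse-communication protocol described above can be sketched as a driver loop. To keep the example self-contained it does not call this class: ReverseCommMinimizer is a hypothetical interface with the same parameter list as lbfgs, and toyStep is a fixed-step gradient descent stand-in (NOT L-BFGS) that only mimics the iflag protocol and the eps stopping test. The objective, the class names and the maxCalls budget are assumptions made for illustration.

```java
/** Hypothetical interface mirroring the parameter list of lbfgs. */
interface ReverseCommMinimizer {
    void step(int n, int m, double[] x, double f, double[] g, boolean diagco,
              double[] diag, int[] iprint, double eps, double xtol, int[] iflag)
        throws Exception;
}

final class ReverseCommDemo {
    /** Demo objective: f(x) = sum_i (x_i - 1)^2. */
    static double value(double[] x) {
        double f = 0.0;
        for (double xi : x) f += (xi - 1.0) * (xi - 1.0);
        return f;
    }

    static void gradient(double[] x, double[] g) {
        for (int i = 0; i < x.length; i++) g[i] = 2.0 * (x[i] - 1.0);
    }

    /** Driver: evaluate f and g, call the minimizer, repeat while iflag is 1. */
    static double[] minimize(ReverseCommMinimizer opt, double[] x0, int maxCalls) {
        int n = x0.length, m = 5;
        double[] x = x0.clone();
        double[] g = new double[n];
        double[] diag = new double[n];      // not supplied by the caller when diagco is false
        int[] iprint = { -1, 0 };           // iprint[0] < 0: no output
        int[] iflag = { 0 };                // must be 0 on initial entry
        double eps = 1e-8, xtol = 1e-16;
        int calls = 0;
        do {
            double f = value(x);            // f is passed by value: recompute before each call
            gradient(x, g);
            try {
                opt.step(n, m, x, f, g, false, diag, iprint, eps, xtol, iflag);
            } catch (Exception e) {
                throw new IllegalStateException("minimizer reported a fatal error", e);
            }
            calls++;
        } while (iflag[0] == 1 && calls < maxCalls);
        return x;
    }

    /** Toy stand-in, NOT L-BFGS: one gradient-descent step per call. */
    static void toyStep(int n, int m, double[] x, double f, double[] g, boolean diagco,
                        double[] diag, int[] iprint, double eps, double xtol, int[] iflag) {
        double gnorm = 0.0, xnorm = 0.0;
        for (int i = 0; i < n; i++) { gnorm += g[i] * g[i]; xnorm += x[i] * x[i]; }
        gnorm = Math.sqrt(gnorm);
        xnorm = Math.sqrt(xnorm);
        if (gnorm < eps * Math.max(1.0, xnorm)) { iflag[0] = 0; return; } // converged
        for (int i = 0; i < n; i++) x[i] -= 0.25 * g[i];                  // modifies x in place
        iflag[0] = 1;                                                     // ask for f and g at new x
    }

    public static void main(String[] args) {
        double[] x = minimize(ReverseCommDemo::toyStep, new double[] { -1.2, 1.0 }, 1000);
        System.out.println(x[0] + " " + x[1]);
    }
}
```

Because the parameter lists match, a method reference to lbfgs should be usable in place of toyStep. If the loop ends because the call budget runs out rather than because iflag reached 0, the note on solution_cache applies: x may hold a line-search trial point, so solution_cache is the vector to copy.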

lb1

public static void lb1(int[] iprint,
                       int iter,
                       int nfun,
                       double gnorm,
                       int n,
                       int m,
                       double[] x,
                       double f,
                       double[] g,
                       double[] stp,
                       boolean finish)
Print debugging and status messages for lbfgs. Depending on the parameter iprint, this can include number of function evaluations, current function value, etc. The messages are output to System.out.

Parameters:
iprint - Specifies output generated by lbfgs.

iprint[0] specifies the frequency of the output:

  • iprint[0] < 0: no output is generated,
  • iprint[0] = 0: output only at first and last iteration,
  • iprint[0] > 0: output every iprint[0] iterations.

iprint[1] specifies the type of output generated:

  • iprint[1] = 0: iteration count, number of function evaluations, function value, norm of the gradient, and steplength,
  • iprint[1] = 1: same as iprint[1]=0, plus vector of variables and gradient vector at the initial point,
  • iprint[1] = 2: same as iprint[1]=1, plus vector of variables,
  • iprint[1] = 3: same as iprint[1]=2, plus gradient vector.
iter - Number of iterations so far.
nfun - Number of function evaluations so far.
gnorm - Norm of gradient at current solution x.
n - Number of free parameters.
m - Number of corrections kept.
x - Current solution.
f - Function value at current solution.
g - Gradient at current solution x.
stp - Current stepsize.
finish - Whether this method should print the "we're done" message.

daxpy

public static void daxpy(int n,
                         double da,
                         double[] dx,
                         int ix0,
                         int incx,
                         double[] dy,
                         int iy0,
                         int incy)
Compute the sum of a vector times a scalar plus another vector. Adapted from the subroutine daxpy in lbfgs.f. There could well be faster ways to carry out this operation; this code is a straight translation from the Fortran.
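
As an illustration of the operation, the following sketch updates dy in place with dy := da * dx + dy. It assumes that ix0 and iy0 are zero-based starting offsets and that both increments are positive; how this class treats negative increments is not stated here. DaxpySketch is a hypothetical name and this is not the translated source.

```java
final class DaxpySketch {
    /** For i = 0..n-1: dy[iy0 + i*incy] += da * dx[ix0 + i*incx]. Assumes incx, incy > 0. */
    static void daxpy(int n, double da, double[] dx, int ix0, int incx,
                      double[] dy, int iy0, int incy) {
        if (n <= 0 || da == 0.0) return;   // nothing to do
        int ix = ix0;
        int iy = iy0;
        for (int i = 0; i < n; i++) {
            dy[iy] += da * dx[ix];
            ix += incx;
            iy += incy;
        }
    }

    public static void main(String[] args) {
        double[] dx = { 1.0, 2.0, 3.0 };
        double[] dy = { 10.0, 20.0, 30.0 };
        daxpy(3, 2.0, dx, 0, 1, dy, 0, 1);
        System.out.println(dy[0] + " " + dy[1] + " " + dy[2]);
    }
}
```

The offset arguments let a caller operate on a slice of a larger work array without copying it.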


ddot

public static double ddot(int n,
                          double[] dx,
                          int ix0,
                          int incx,
                          double[] dy,
                          int iy0,
                          int incy)
Compute the dot product of two vectors. Adapted from the subroutine ddot in lbfgs.f. There could well be faster ways to carry out this operation; this code is a straight translation from the Fortran.
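
A minimal sketch of the same idea for the dot product follows, under the same assumptions as for daxpy: zero-based starting offsets ix0 and iy0, and positive increments only. DdotSketch is a hypothetical name and this is not the translated source.

```java
final class DdotSketch {
    /** Returns the sum over i = 0..n-1 of dx[ix0 + i*incx] * dy[iy0 + i*incy]. Assumes incx, incy > 0. */
    static double ddot(int n, double[] dx, int ix0, int incx,
                       double[] dy, int iy0, int incy) {
        double sum = 0.0;
        int ix = ix0;
        int iy = iy0;
        for (int i = 0; i < n; i++) {
            sum += dx[ix] * dy[iy];
            ix += incx;
            iy += incy;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] g = { 3.0, 4.0 };
        // Euclidean norm of g, as used in the stopping test on eps.
        double gnorm = Math.sqrt(ddot(2, g, 0, 1, g, 0, 1));
        System.out.println(gnorm);
    }
}
```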