Root solver by Stochastic Approximation

Usage

optim_sa(
  f,
  init = 0,
  function.args = list(),
  ...,
  method = c("standard", "discrete", "interpolate"),
  projection = identity,
  control = list(niter = 100, alpha = 0.25, stepmult = 1),
  verbose = TRUE,
  burn.in = 0.75
)

Arguments

f: Stochastic function
init: Initial value to evaluate f in
function.args: Additional arguments passed to f
...: Additional arguments passed to f
method: Method ('standard': standard Robbins-Monro, 'discrete' or 'interpolate' for integer stochastic optimization. See details section)
projection: Optional projection function that can be used to constrain the parameter value (e.g., function(x) max(x, tau)). Applied at the end of each iteration of the optimization.
control: Control arguments (niter number of iterations, alpha and stepmult control the step-length of the Polyak-Ruppert algorithm as described in the details section).
verbose: if TRUE additional messages are printed throughout the optimization
burn.in: Burn-in (fraction of initial values to disregard) when applying Polyak–Ruppert averaging (alpha<1). Alternatively, burn.in can be given as an integer defining the absolut number of iterations to disregard.

Details

The aim is to find the root of the function $ M(\theta) = E f(\theta)$, where we are only able to observe the stochastic function $f(\theta)$. We will assume that $M$ is non-decreasing with a unique root $\theta^\ast$. The Robbins-Monro algorithm works through the following update rule from an initial start value $\theta_0$: $$\theta_{n+1} = \theta_{n} - a_n f(\theta_n)$$ where $\sum_{n=0}^\infty a_n = \infty$ and $\sum_{n=0}^\infty a_n^2 < \infty$.

By averaging the iterates $$\overline{\theta}_n = \frac{1}{n}\sum_{i=1}^{n-1}\theta_i$$ we can get improved stability of the algorithm that is less sensitive to the choice of the step-length $a_n$. This is known as the Polyak-Ruppert algorithm and to ensure convergence longer step sizes must be made which can be guaranteed by using $a_n = Kn^{-\alpha}$ with $0<\alpha<1$. The parameters $K$ and $\alpha$ are controlled by the stepmult and alpha parameters of the control argument.

For discrete problems where $\theta$ must be an integer, we follow (Dupac & Herkenrath, 1984). Let $\lfloor x\rfloor$ denote the integer part of $x$, and define either (method="discrete"): $$g(\theta) = I(U<\theta-\lfloor\theta\rfloor)f(\lfloor\theta\rfloor) + I(U\geq\theta-\lfloor\theta\rfloor)f(\lfloor\theta\rfloor+1)$$ where $U\sim Uniform([0,1])$, or (method="interpolate"): $$g(\theta) = (1-\theta+\lfloor\theta\rfloor)f(\lfloor\theta\rfloor) + (\theta-\lfloor\theta\rfloor)f(\lfloor\theta\rfloor+1).$$ The stochastic approximation method is then applied directly on $g$.

Dupač, V., & Herkenrath, U. (1984). On integer stochastic approximation. Aplikace matematiky, 29(5), 372-383.

Examples

# Finding approximate median u, P(X<=u) = .5, of X~Exp(1):
f <- function(x) mean(rexp(10) <= x) - 0.5
res <- carts:::optim_sa(f, 0, control=list(niter=2000, alpha=.5))
res$estimate
#> [1] 0.682097