EIn(x) = [∆n(x)]+ + σn(x)ϕ(∆n(x)/σn(x)) − |∆n(x)|Φ(−|∆n(x)|/σn(x)), (8)
where ∆n(x) := µn(x) − fn∗ is the expected difference in quality between the proposed point x and the previous best.
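As a concrete illustration, the closed form (8) is inexpensive to evaluate numerically. The sketch below is a minimal implementation, assuming the posterior mean µn(x), posterior standard deviation σn(x), and incumbent value fn∗ are already available as arrays; the function name and the use of numpy/scipy are illustrative choices, not taken from the text.

# Sketch: closed-form expected improvement, following (8).
# mu_n, sigma_n: posterior mean and standard deviation at a batch of points;
# f_star: best objective value observed so far. Names are illustrative.
import numpy as np
from scipy.stats import norm

def expected_improvement(mu_n, sigma_n, f_star):
    """Evaluate EI_n(x) from the posterior mean/std at a batch of points."""
    delta = mu_n - f_star                      # Delta_n(x) = mu_n(x) - f_n*
    with np.errstate(divide="ignore", invalid="ignore"):
        z = delta / sigma_n
        ei = (np.maximum(delta, 0.0)
              + sigma_n * norm.pdf(z)
              - np.abs(delta) * norm.cdf(-np.abs(delta) / sigma_n))
    # Where sigma_n = 0 the improvement is deterministic, so EI = [Delta]^+.
    return np.where(sigma_n > 0.0, ei, np.maximum(delta, 0.0))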
The expected improvement algorithm then evaluates at the point with the largest expected improvement,
xn+1 = argmax EIn(x), (9)
breaking ties arbitrarily. This algorithm was first proposed by Močkus (Močkus, 1975) but was popularized by Jones et al. (1998). The latter article also used the name “Efficient Global Optimization” or EGO.
Implementations use a variety of approaches for solving (9). Unlike the objective f in our original optimization problem (1), EIn(x) is inexpensive to evaluate and allows easy evaluation of first- and second-order derivatives. Implementations of the expected improvement algorithm can then use a continuous first- or second-order optimization method to solve (9). For example, one technique that has worked well for the author is to calculate first derivatives and use the quasi-Newton method L-BFGS-B (Liu and Nocedal, 1989).
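To make this concrete, the following is a rough sketch of multi-start L-BFGS-B applied to (9). It assumes a scikit-learn-style GP posterior object gp exposing predict(..., return_std=True), a box-constrained domain given by bounds, and the expected_improvement helper sketched above; none of these names come from the text. For brevity it lets scipy approximate gradients by finite differences rather than supplying the analytic first derivatives mentioned above.

# Sketch: solving (9) by multi-start L-BFGS-B on -EI_n(x).
import numpy as np
from scipy.optimize import minimize

def propose_next_point(gp, f_star, bounds, n_restarts=20, seed=0):
    """Return an approximate x_{n+1} = argmax_x EI_n(x) over a box domain."""
    rng = np.random.default_rng(seed)
    dim = len(bounds)

    def neg_ei(x):
        # Posterior mean/std at a single point, via a scikit-learn-style GP.
        mu, sigma = gp.predict(x.reshape(1, -1), return_std=True)
        return -expected_improvement(mu, sigma, f_star)[0]

    best_x, best_val = None, np.inf
    # L-BFGS-B is a local method, so restart from several random points
    # and keep the best local maximizer of EI found.
    starts = rng.uniform([lo for lo, _ in bounds],
                         [hi for _, hi in bounds],
                         size=(n_restarts, dim))
    for x0 in starts:
        res = minimize(neg_ei, x0, method="L-BFGS-B", bounds=bounds)
        if res.fun < best_val:
            best_x, best_val = res.x, res.fun
    return best_x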
Figure 3 shows the contours of EIn(x) in terms of ∆n(x) and the posterior standard deviation σn(x). EIn(x) is increasing in both ∆n(x) and σn(x). Curves of ∆n(x) versus σn(x) with equal EI show how EI balances between evaluating at points with high expected quality (high ∆n(x)) and points with high uncertainty (high σn(x)). In the context of optimization, evaluating at points with high expected quality relative to the previous best point is valuable because good approximate global optima are likely to reside at such points. On the other hand, evaluating at points with high uncertainty is valuable because it teaches us about the objective in locations where we have little knowledge, which tend to be far from where we have previously measured. A point that is substantially better than any we have seen previously may very well reside there.
Figure 3: Contour plot of EIn(x), the expected improvement (8), in terms of ∆n(x) (the expected difference in quality between the proposed point and the best previously evaluated point) and the posterior standard deviation σn(x). Blue indicates smaller values and red larger ones. The expected improvement is increasing in both quantities, and curves of ∆n(x) versus σn(x) with equal EI define an implicit tradeoff between evaluating at points with high expected quality (high ∆n(x)) and points with high uncertainty (high σn(x)).
Figure 1 shows EI(x) in the bottom panel. We see this tradeoff, with the largest expected improvement occurring where the posterior standard deviation is high (far away from previously evaluated points) and where the posterior mean is also high. The smallest expected improvement is 0, at points we have previously evaluated. The posterior standard deviation is 0 at these points, and the posterior mean is necessarily no larger than the best previously evaluated point. The expected improvement algorithm would evaluate next at the point indicated with an x, where EI is maximized.
Choosing where to evaluate based on a tradeoff between high expected performance and high uncertainty appears in other domains, including multi-armed bandits (Mahajan and Teneketzis, 2008) and reinforcement learning (Sutton and Barto, 1998), and is often called the “exploration vs. exploitation tradeoff” (Kaelbling et al., 1996).