Bayesian Optimization

Overview of BayesOpt


BayesOpt consists of two main components: a Bayesian statistical model for modeling the objective function, and an acquisition function for deciding where to sample next. After evaluating the objective according to an initial space-filling experimental design, often consisting of points chosen uniformly at random, these two components are used iteratively to allocate the remainder of a budget of N function evaluations, as shown in Algorithm 1.

Algorithm 1 Basic pseudo-code for Bayesian optimization

Place a Gaussian process prior on f
Observe f at n0 points according to an initial space-filling experimental design. Set n = n0.
while n ≤ N do
    Update the posterior probability distribution on f using all available data
    Let xn be a maximizer of the acquisition function over x, where the acquisition function is computed using the current posterior distribution
    Observe yn = f(xn)
    Increment n
end while
Return a solution: either the point evaluated with the largest f(x), or the point with the largest posterior mean
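The loop in Algorithm 1 can be sketched end to end. The following is an illustrative toy, not a definitive implementation: it assumes a squared-exponential kernel with an arbitrary length-scale of 0.2, substitutes three evenly spaced points for the space-filling initial design, and maximizes the expected improvement acquisition function by grid search over [0, 1].

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)  # vectorized standard-normal error function

def rbf(a, b, ls=0.2):
    # Squared-exponential kernel matrix between 1-d point sets a and b.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior(X, y, Xq, jitter=1e-6):
    # Noise-free GP regression: posterior mean and std dev at query points Xq.
    K = rbf(X, X) + jitter * np.eye(len(X))   # jitter for numerical stability
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Kq = rbf(Xq, X)
    v = np.linalg.solve(L, Kq.T)
    mu = Kq @ alpha
    var = np.clip(1.0 - np.sum(v * v, axis=0), 0.0, None)
    return mu, np.sqrt(var)

def expected_improvement(mu, sigma, best):
    # Closed-form EI (for maximization) under a N(mu, sigma^2) posterior.
    s = np.maximum(sigma, 1e-12)
    z = (mu - best) / s
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + s * pdf

def bayes_opt(f, budget=10):
    # Algorithm 1: initial design, then EI-guided evaluations on a fixed grid.
    grid = np.linspace(0.0, 1.0, 201)         # candidate points for the
                                              # acquisition-function maximizer
    X = np.array([0.0, 0.5, 1.0])             # stand-in initial design, n0 = 3
    y = f(X)
    for _ in range(budget - len(X)):          # while n <= N do
        mu, sigma = gp_posterior(X, y, grid)  # update the posterior on f
        x_next = grid[np.argmax(expected_improvement(mu, sigma, y.max()))]
        X = np.append(X, x_next)
        y = np.append(y, f(np.array([x_next]))[0])  # observe yn = f(xn)
    return X[np.argmax(y)], y.max()           # best point evaluated so far
```

For example, `bayes_opt(lambda x: -(x - 0.6) ** 2, budget=10)` spends ten evaluations on a one-dimensional quadratic and returns a point near its maximizer x = 0.6.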


The statistical model, which is invariably a Gaussian process, provides a Bayesian posterior probability distribution that describes potential values for f(x) at a candidate point x. Each time we observe f at a new point, this posterior distribution is updated. We discuss Bayesian statistical modeling using Gaussian processes in detail in Section 3. The acquisition function measures the value that would be generated by evaluation of the objective function at a new point x, based on the current posterior distribution over f. We discuss expected improvement, the most commonly used acquisition function, in Section 4.1, and then discuss other acquisition functions in Sections 4.2 and 4.3.



One iteration of BayesOpt from Algorithm 1 using GP regression and expected improvement is illustrated in Figure 1. The top panel shows noise-free observations of the objective function with blue circles at three points. It also shows the output of GP regression. We will see below in Section 3 that GP regression produces a posterior probability distribution on each f(x) that is normally distributed with mean µn(x) and variance σn²(x). This is pictured in the figure with µn(x) as the solid red line, and a 95% Bayesian credible interval for f(x), µn(x) ± 1.96 × σn(x), as dashed red lines. The mean can be interpreted as a point estimate of f(x). The credible interval acts like a confidence interval in frequentist statistics, and contains f(x) with probability 95% according to the posterior distribution. The mean interpolates the previously evaluated points. The credible interval has 0 width at these points, and grows wider as we move away from them.
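The interpolation and zero-width properties just described can be checked directly in the simplest special case of the GP regression formulas of Section 3: a single noise-free observation under a prior with mean 0 and a unit-variance squared-exponential kernel (the length-scale 0.2 below is an arbitrary choice for illustration).

```python
import numpy as np

def k(a, b, ls=0.2):
    # Squared-exponential kernel with unit variance, so k(x, x) = 1.
    return np.exp(-0.5 * ((a - b) / ls) ** 2)

def posterior_1pt(x, x1, y1):
    # Noise-free GP posterior after one observation (x1, y1), prior mean 0:
    # mean mu_n(x) = k(x, x1) * y1, variance sigma_n^2(x) = 1 - k(x, x1)^2.
    mu = k(x, x1) * y1
    sigma = np.sqrt(1.0 - k(x, x1) ** 2)
    return mu, sigma

# Posterior at the observed point 0.3 and at two points farther away.
mu, sigma = posterior_1pt(np.array([0.3, 0.4, 0.9]), 0.3, 2.0)
# 95% credible interval at each x: mu ± 1.96 * sigma.
```

At x = 0.3 the mean equals the observed value 2.0 and the interval has width 0; moving to 0.4 and then 0.9, σn(x) grows toward the prior standard deviation of 1, exactly the widening pictured in Figure 1.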
The bottom panel shows the expected improvement acquisition function that corresponds to this posterior. Observe that it takes value 0 at points that have previously been evaluated. This is reasonable when evaluations of the objective are noise-free, because evaluating these points provides no useful information toward solving (1). Also observe that it tends to be larger for points with larger credible intervals, because observing a point where we are more uncertain about the objective tends to be more useful in finding good approximate global optima. Finally, observe that it tends to be larger for points with larger posterior means, because such points tend to be near good approximate global optima.
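All three observations follow from the closed form of expected improvement developed in Section 4.1, EI(x) = (µn(x) − f*) Φ(z) + σn(x) φ(z) with z = (µn(x) − f*)/σn(x), where f* is the best value observed so far and Φ, φ are the standard normal CDF and density:

```python
import math

def expected_improvement(mu, sigma, best):
    # EI (for maximization) when the posterior on f(x) is N(mu, sigma^2).
    if sigma <= 0.0:
        # At a previously evaluated point (noise-free), mu = f(x) <= best,
        # so the improvement is known to be 0.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * Phi + sigma * phi
```

Differentiating gives ∂EI/∂σ = φ(z) > 0 and ∂EI/∂µ = Φ(z) > 0, so EI strictly increases with both the width of the credible interval and the posterior mean, and it vanishes at evaluated points where σn(x) = 0 and µn(x) ≤ f*.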
We now discuss the components of BayesOpt in detail, first discussing GP regression in Section 3 and then acquisition functions in Section 4, starting with expected improvement in Section 4.1. We then discuss more sophisticated acquisition functions (knowledge gradient, entropy search, and predictive entropy search) in Sections 4.2 and 4.3. Finally, in Section 5 we discuss extensions of the basic problem described in Section 1, including measurement noise, parallel function evaluations, constraints, multi-fidelity observations, and others.


