Hybrid Multi-Gradient Explorer Algorithm for
                        Global Multi-Objective Optimization
                                              Vladimir Sevastyanov1
                                        eArtius, Inc., Irvine, CA 92614, US


                                              EXTENDED ABSTRACT


          The Hybrid Multi-Gradient Explorer (HMGE) algorithm for global multi-objective
          optimization of objective functions considered in a multi-dimensional domain is presented
          (patent pending). The proposed hybrid algorithm relies on genetic variation operators for
          creating new solutions, but in addition to a standard random mutation operator, HMGE
          uses a gradient mutation operator, which improves convergence. Thus, random mutation
          helps find the global Pareto frontier, and gradient mutation improves convergence to the
          Pareto frontier. In this way the HMGE algorithm combines the advantages of both
          gradient-based and GA-based optimization techniques: it is as fast as the pure gradient-based
          MGE algorithm, and is able to find the global Pareto frontier similarly to genetic algorithms
          (GAs). HMGE employs the Dynamically Dimensioned Response Surface Method (DDRSM) for
          calculating gradients. DDRSM dynamically recognizes the most significant design
          variables, and builds local approximations based only on those variables. This allows one to
          estimate gradients at the price of 4-7 model evaluations without significant loss of accuracy.
          As a result, HMGE efficiently optimizes highly non-linear models with dozens or
          hundreds of design variables, and with multiple Pareto fronts. HMGE efficiency is 2-10
          times higher when compared to the most advanced commercial GAs.


                                                  I. Introduction

     Hybrid Multi-Gradient Explorer (HMGE) algorithm is a novel multi-objective optimization algorithm
which combines a genetic algorithm technique with a gradient-based technique. Both techniques have
strong and weak points:
      • Gradient-based techniques have high convergence to a local Pareto front, but a low ability to find the global
            Pareto frontier and disjoint parts of a Pareto frontier;
      • GAs have a strong ability to find global optimal solutions, but have low convergence.
     It is apparent that creating a hybrid optimization algorithm which combines the strong points of both approaches
is the most efficient solution, and this has been one of the most popular research topics over the last few years [1,2,3,4,5].
However, there is a serious obstacle to developing efficient hybrid algorithms: all known methods of gradient
evaluation are either computationally expensive or have low accuracy. For instance, the traditional finite difference
method is prohibitively expensive because it requires N+1 model evaluations to estimate a gradient, where N
is the number of design variables. An alternative method of local Jacobian estimation [1] is based on the current
population layout, and does not require the evaluation of additional points. Clearly, this approach is very efficient, but the
accuracy of the gradient estimation depends on the distribution of the points, and is very low in most cases. In the
authors' opinion [1], it performs well only for smooth models with a low level of non-linearity.
     Ref. [4] gives a good example of a hybrid optimization algorithm which combines the SPEA and NSGA genetic
algorithms with the gradient-based SQP algorithm. The SPEA-SQP and NSGA-SQP hybrid algorithms improve
convergence by a factor of 5-10 compared with SPEA and NSGA. However, the algorithms still require 4,000-11,000
model evaluations for the benchmark problems ZDT1-ZDT6, which makes them unsuitable for
optimizing computationally expensive simulation models.
     The presented HMGE optimization algorithm employs the in-house-developed Dynamically Dimensioned Response
Surface Method (DDRSM) [6], designed to estimate gradients. The key point of DDRSM is its ability to dynamically

1 Chief Executive Officer
recognize the most significant design variables, and to build local approximations based only on those variables. As a result,
DDRSM requires just about 4-7 model evaluations to estimate gradients without a significant decrease in accuracy,
and regardless of the task dimension. Thus, DDRSM reconciles the contradicting requirements of estimating
gradients efficiently and accurately, and enables the development of efficient and scalable optimization algorithms.
HMGE is one such algorithm.
     HMGE was tested on tasks with different levels of non-linearity and a large variety of task dimensions, ranging
from a few design variables up to thousands. It consistently shows high convergence and high computational
efficiency.
     The optimization problem that the proposed algorithm solves is formally stated in (1):

                    \min_X F(X) = [F_1(X), F_2(X), \ldots, F_m(X)]^T

                    subject to: q_j(X) \le 0, \; j = 1, 2, \ldots, k                    (1)

                    X = \{x_1, x_2, \ldots, x_n\}, \quad X \in S \subset \mathbb{R}^n

     where S \subset \mathbb{R}^n is the design space.



      The remainder of this paper is organized as follows. Section II describes the proposed optimization algorithm.
Sections III and IV present benchmark problems (unconstrained and constrained, respectively), simulation results,
and inferences from the simulation runs. Finally, Section V presents a brief conclusion of the study.

                            II. Hybrid Multi-Gradient Explorer Algorithm
     The main idea of the HMGE algorithm is to use both gradient-based and random mutation operators.
Theoretical analysis and numerous experiments with different benchmarks have shown that using just one of two
mutation mechanisms reduces the overall efficiency of the optimization algorithm. An algorithm performing just
random mutation is a genetic algorithm with low convergence. In turn, using just gradient-based mutation improves
the convergence towards the nearest local Pareto frontier, but reduces the ability of the algorithm to find the global
Pareto frontier. Therefore, a fast global optimization algorithm has to use both gradient-based and random mutation
operators. This important consideration separates local from global optimization technology, and fast from slow
optimization algorithms, and it deserves special attention.
     The gradient-based mutation operator always indicates a direction towards the nearest local Pareto front. If the
global Pareto front is located in the opposite (or a sufficiently different) direction, then such mutation will never find
dominating solutions. Thus, random mutation is a critical part of any hybrid optimization algorithm. On the other
hand, any global optimization algorithm needs to use gradients to improve convergence. This is why HMGE uses
both random and gradient-based mutation operators.
     Another important question in the design of HMGE is how to estimate gradients without reducing the
algorithm's efficiency. HMGE employs the Dynamically Dimensioned Response Surface Method (DDRSM) [6] to
estimate gradients because DDRSM is equally efficient for any task dimension, and requires only a low number (4-7) of
model evaluations to estimate gradients for a large variety of task dimensions.

     The HMGE algorithm is an evolutionary multi-objective optimization algorithm combined with a gradient-based
technique. A number of evolutionary optimization algorithms could serve as a foundation for developing hybrid
algorithms such as HMGE, and any evolutionary algorithm would benefit from the proposed gradient-based technique.
But the most advanced evolutionary technique combined with the proposed gradient-based technique yields the
biggest synergy in hybrid-algorithm efficiency. After careful consideration, NSGA-II [10] and AMGA [11]
were selected, and several concepts were borrowed from them: Pareto ranking, the crossover formulation, the mutation
operator, the two-tier fitness assignment mechanism, the preservation of elite and diverse solutions, and the archiving of solutions.
     As a hybrid evolutionary optimization algorithm, HMGE relies on genetic and gradient-based variation operators
for creating new solutions. HMGE employs a generational scheme, since during a particular iteration only solutions
created before that iteration take part in the selection process. Similarly to the AMGA algorithm, HMGE generates a
small number of new solutions per iteration. HMGE works with a small population size and maintains a large
external archive of all solutions obtained. At each iteration HMGE creates a small number of solutions using genetic
and gradient-based variation operators, and all of these solutions are used to update the archive. The parent population
is created from the archive. The creation of the mating pool is based on binary tournament selection and is similar to
the one used in NSGA-II. Genetic and gradient-based variation operators are used to create the offspring population.

A large archive size and the concept of Pareto ranking borrowed from NSGA-II [10] help to maintain the diversity of
the solutions, and to obtain a large number of non-dominated solutions.

         The pseudo-code of the proposed HMGE algorithm is as follows.

 1. Begin
 2. Generate the required number of initial points X1,…,XN using Latin Hypercube sampling
 3. Set iteration number i=0
 4. Add newly calculated points to the archive
 5. Sort points using the crowding distance
 6. Select m best individuals and produce k children using the SBX crossover operator
 7. Select the first child as current
 8. If i is a multiple of p then go to 9 else go to 10
 9. Improve the current individual by the MGA analysis in the current subregion
 10. Apply the random mutation operator to the current individual
 11. If i is a multiple of 10 then go to 12 else go to 13
 12. Set the subregion size to its default value and go to 14
 13. Decrease the subregion size by multiplying it by Sd (Sd < 1)
 14. If the current individual is not the last, select the next one and go to 8, else go to 15
 15. Increment i
 16. If convergence is not reached and the maximum number of evaluations is not exceeded, go to 4
 17. Report all the solutions found
 18. End

    The above pseudo-code explains the main steps of the HMGE algorithm. Essentially, HMGE works as a genetic
algorithm with elements of gradient-based techniques.
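
    For readers who prefer code to numbered steps, the following Python sketch mirrors the pseudo-code above. It is an illustration only: lhs_sample, select_best, sbx_crossover, mga_improve, random_mutation, and trim_archive are hypothetical stand-ins for the operators described in this paper, and the stopping test is simplified to an evaluation budget.

# Illustrative sketch of the HMGE main loop (not the eArtius implementation).
# All helpers named below are hypothetical stand-ins for the operators
# described in the text of this paper.
def hmge(evaluate, n_init, m_parents, k_children, p, s_default, s_d,
         max_evals, archive_limit):
    points = lhs_sample(n_init)                         # step 2: Latin Hypercube start
    archive, n_evals, i = [], 0, 0                      # step 3
    subregion = s_default
    while True:
        for x in points:                                # step 4: evaluate, archive
            archive.append((x, evaluate(x)))
            n_evals += 1
        archive = trim_archive(archive, archive_limit)  # step 5: crowding distance
        parents = select_best(archive, m_parents)       # step 6: pick m best
        points = sbx_crossover(parents, k_children)     # SBX children
        for j, child in enumerate(points):              # steps 7-14: per child
            if i % p == 0:
                child = mga_improve(child, subregion)   # step 9: gradient mutation
            points[j] = random_mutation(child)          # step 10: random mutation
            subregion = s_default if i % 10 == 0 else subregion * s_d  # steps 11-13
        i += 1                                          # step 15
        if n_evals >= max_evals:                        # step 16 (budget only)
            break
    return archive                                      # step 17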
    The following two sub-sections explain the GA-based and gradient-based aspects of HMGE in detail.

    A. GA-Based Elements of HMGE Algorithm
    HMGE employs operators typical for genetic algorithms, as follows.

     Simulated Binary Crossover (SBX) emulates the search features of the single-point crossover used in
binary-coded genetic algorithms. The operator employs interval schemata processing, which means that
common interval schemata of the parents are preserved in the offspring. The SBX crossover mostly generates
offspring near the parents. Thus, the crossover guarantees that the spread of the children is proportional to the spread
of the parents [8].

     The algorithm receives two parent points p0 and p1 and produces two children X0 and X1. It comprises the
following steps [9]:
 1. Increment the current coordinate index j
 2. Get a random value u in the range [0; 1]
 3. Find β_q such that the area under the probability density curve from 0 to β_q equals the
chosen random number u
 4. Calculate the child coordinates as follows:
      X_{0j} = 0.5[(1 + β_q) p_{0j} + (1 − β_q) p_{1j}]
      X_{1j} = 0.5[(1 − β_q) p_{0j} + (1 + β_q) p_{1j}]
 5. If j is less than the number of independent variables, go to 1
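
    A minimal per-coordinate Python sketch of these steps is given below. The closed-form inversion used to find β_q assumes the standard SBX polynomial probability density with distribution index eta, which this paper does not specify.

import random

# Minimal SBX sketch; eta (the distribution index) is an assumption, since
# the paper does not state which probability density is used.
def sbx(p0, p1, eta=2.0):
    x0, x1 = [], []
    for a, b in zip(p0, p1):                          # steps 1 and 5: each coordinate
        u = random.random()                          # step 2
        if u <= 0.5:                                  # step 3: invert the CDF for beta_q
            beta_q = (2.0 * u) ** (1.0 / (eta + 1.0))
        else:
            beta_q = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0))
        x0.append(0.5 * ((1.0 + beta_q) * a + (1.0 - beta_q) * b))   # step 4
        x1.append(0.5 * ((1.0 - beta_q) * a + (1.0 + beta_q) * b))
    return x0, x1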

    Random mutation is an important part of genetic algorithms in general, and of HMGE in particular. The main idea
of the HMGE random mutation operator is to move the initial point in a random direction within a given mutation range.
    The random mutation operator comprises the following steps:
 1. Increment the current coordinate index j
 2. Get a random value in the range [0; 1]
 3. If the random number is less than 0.5 go to 4, else go to 9
 4. Get a random value in the range [0; 1]
 5. If the random number is less than 0.5 go to 6, else go to 7
 6. Set the positive direction for the move and go to 8
 7. Set the negative direction for the move
 8. Calculate the new coordinate by moving in the selected direction with a step equal to r
percent of the design space
 9. If j is less than the number of independent variables, go to 1
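
    A direct Python transcription of these steps might look as follows; the clamping to the variable bounds and the default value of r are assumptions, since the paper does not state how boundary violations are handled.

import random

# Sketch of the random mutation operator from the steps above; r is the step
# size as a fraction of each variable's range (an assumed default), and
# clamping to [lower, upper] is an assumption.
def random_mutation(x, lower, upper, r=0.05):
    y = list(x)
    for j in range(len(y)):                          # steps 1 and 9: every coordinate
        if random.random() < 0.5:                    # step 3: mutate this coordinate?
            step = r * (upper[j] - lower[j])         # r percent of the design space
            if random.random() < 0.5:                # steps 4-7: choose a direction
                y[j] = min(y[j] + step, upper[j])    # positive move (clamped)
            else:
                y[j] = max(y[j] - step, lower[j])    # negative move (clamped)
    return y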

     HMGE maintains an archive of the best and most diverse solutions obtained over the optimization process. The archived
solutions are used for the selection of parents for the SBX crossover and mutation operators.
     It is important to keep as many solutions as possible in the archive. On the other hand, the size of the archive
determines the computational complexity of the proposed algorithm, and needs to be kept at a reasonable level
by removing the solutions that are worst in a certain sense. For instance, non-Pareto-optimal points should be removed before
any other points.

      HMGE maintains the archive size using the following steps:

 1. Add newly created solutions to the archive
 2. If the archive size does not exceed the given threshold then stop
 3. Sort points by the crowding distance method; the worst points are now located at the
end of the archive
 4. Remove as many of the last solutions in the archive as needed to match the archive size requirement
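
    A compact Python sketch of this maintenance procedure is shown below. The crowding distance follows the NSGA-II definition, which is an assumption consistent with the concepts this paper borrows from NSGA-II.

# Crowding distance per the NSGA-II definition (assumed here); archive items
# are (x, objectives) pairs as in the main-loop sketch above.
def crowding_distances(objs):
    n, m = len(objs), len(objs[0])
    dist = [0.0] * n
    for k in range(m):
        order = sorted(range(n), key=lambda i: objs[i][k])
        span = objs[order[-1]][k] - objs[order[0]][k] or 1.0
        dist[order[0]] = dist[order[-1]] = float("inf")   # always keep extreme points
        for a, b, c in zip(order, order[1:], order[2:]):
            dist[b] += (objs[c][k] - objs[a][k]) / span
    return dist

def trim_archive(archive, limit):
    if len(archive) <= limit:                             # step 2
        return archive
    dist = crowding_distances([f for _, f in archive])
    order = sorted(range(len(archive)), key=lambda i: -dist[i])   # step 3
    return [archive[i] for i in order[:limit]]            # step 4: drop most crowded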


      B. Gradient-Based Elements of HMGE Algorithm
     As can be seen from the HMGE pseudo-code, the Multi-Gradient Analysis (MGA) [6, 7] is used to perform the
gradient-based mutation and improve a given point with respect to all objectives. Essentially, MGA determines a
direction of simultaneous improvement for all objective functions, and performs a step in that direction. To
achieve this goal, HMGE needs to estimate gradients.
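
    MGA itself is described in [6, 7] and is not reproduced here. As a rough illustration of the idea of a simultaneous-improvement direction, the generic construction below sums the normalized negative gradients, which yields a common descent direction when the objective gradients are not too strongly conflicting; it is not necessarily the MGA formulation.

import math

# Generic common-descent-direction sketch (not necessarily MGA): the sum of
# normalized negative gradients points into every objective's descent
# half-space when the gradients are not too strongly conflicting.
def common_descent_direction(gradients):
    n = len(gradients[0])
    d = [0.0] * n
    for g in gradients:
        norm = math.sqrt(sum(gi * gi for gi in g)) or 1.0
        for j in range(n):
            d[j] -= g[j] / norm                      # accumulate -g / ||g||
    return d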
     Since real design optimization tasks involve computationally expensive simulation models, a model evaluation
typically takes hours or even days of computational time. Thus, estimating gradients is the most challenging part of
any optimization algorithm, because two contradicting requirements need to be satisfied: (a) estimate the gradient
in a computationally cheap way, and (b) provide high enough accuracy of the gradient estimation.
     The traditional finite difference method is accurate, but it requires N+1 model evaluations (N being the number of
design variables) to estimate a gradient. A hybrid optimization algorithm which employs the finite difference method
will be slower than a pure genetic algorithm unless it is applied to low-dimensional models. Such
computationally expensive gradient estimation rules out hybrid optimization
algorithms for models with more than about 10 design variables.
     Alternative methods of gradient estimation such as [1,5] have low accuracy, and converge well only for the
simplest benchmarks.
     The presented method of gradient estimation is based on the Dynamically Dimensioned Response Surface Method
(DDRSM) [6], and satisfies both of the previously mentioned requirements: it is computationally cheap and accurate
at the same time. DDRSM requires just 4-7 model evaluations to estimate gradients at each step, regardless of the task
dimension.
     DDRSM (patent pending) is a response surface method which allows one to build local approximations of all
objective functions (and other output variables), and to use the approximations to estimate gradients.
     All known response surface methods suffer from the curse of dimensionality. The curse of
dimensionality is the exponential increase in volume associated with adding extra dimensions
to a design space [12], which in turn requires an exponential increase in the number of sample points needed to build a
sufficiently accurate response surface model. This is a significant limitation for all known response surface approaches, forcing

engineers to artificially reduce the optimization task dimension by assigning constant values to most of the design
variables.
     DDRSM resolves the curse of dimensionality in the following way.
     DDRSM is based on the realistic assumption that most real-life design problems have a few significant design
variables, while the rest of the design variables are not significant. Based on this assumption, DDRSM estimates the
most significant projections of the gradients of all output variables at each optimization step.
     In order to achieve this, DDRSM generates 5-7 sample points in the current sub-region, and uses the points to
recognize the most significant design variables for each objective function. Then DDRSM builds local
approximations which are used to estimate the gradients.
     Since an approximation does not include non-significant variables, the estimated gradient has only the projections
that correspond to significant variables; all other projections of the gradient are equal to zero. Ignoring
non-significant variables slightly reduces the accuracy, but allows gradients to be estimated at the price of 5-7 evaluations
for tasks of practically any dimension.
     DDRSM recognizes the most significant design variables for each output variable (objective functions and
constraints) separately. Thus, each output variable has its own list of significant variables that will be included in its
approximating function. Also, DDRSM recognizes significant variables anew at each optimization
step, each time it needs to estimate gradients. This is crucial because the topology of the objective functions and
constraints can differ in different parts of the design space throughout the optimization process.
     As follows from the previous explanation, DDRSM dynamically reduces the task dimension in each sub-region,
and does so independently for each output variable by ignoring non-significant design variables. The same variable can
be critically important for one of the objective functions in the current sub-region, and not significant for other
objective functions and constraints. Later, in a different sub-region, the lists of significant design variables can be very
different, but DDRSM will still recognize the most relevant design variables and estimate the gradients regardless.
Thus, gradient-based mutation can be performed reliably and accurately in any circumstance.
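
    The exact DDRSM screening rule and approximating function are not published in this paper, so the Python sketch below only illustrates the general scheme under simple assumptions: a handful of local samples, correlation-based screening of significant variables, and a least-squares linear fit over the screened variables only (zero gradient projections elsewhere).

import numpy as np

# DDRSM-style gradient sketch under assumed details: correlation screening,
# a linear local model, and numpy least squares. The actual DDRSM of [6]
# is patent-pending and not reproduced here.
def ddrsm_like_gradient(f, x, radius, n_samples=6, n_keep=4):
    x = np.asarray(x, dtype=float)
    X = x + np.random.uniform(-radius, radius, (n_samples, x.size))  # local samples
    y = np.array([f(xi) for xi in X])                 # 4-7 model evaluations
    dX, dy = X - x, y - y.mean()
    # screen: rank variables by |correlation| between coordinate offset and response
    corr = np.abs((dX * dy[:, None]).sum(axis=0) /
                  (np.linalg.norm(dX, axis=0) * np.linalg.norm(dy) + 1e-12))
    keep = np.argsort(corr)[-n_keep:]                 # "significant" variables
    # least-squares linear fit over the significant variables only
    A = np.hstack([dX[:, keep], np.ones((n_samples, 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    grad = np.zeros(x.size)
    grad[keep] = coef[:-1]                            # non-significant projections stay zero
    return grad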

                                          III. Benchmark Problems
     In this section, a set of unconstrained problems is used to test the HMGE algorithm's performance.
     Zitzler et al. described six problems (ZDT1 to ZDT6) [13], which have been further studied by other
researchers. All of the problems are used here except ZDT5, whose variables are discrete and therefore unsuitable for
the gradient-based technique employed in the HMGE algorithm.
     In this study seven multi-objective optimization algorithms have been compared to the proposed HMGE
algorithm. The algorithms can be split into the following three groups:
     •      Two well-known multi-objective evolutionary algorithms: Non-dominated Sorting Genetic Algorithm
     (NSGA-II) [10], and Strength Pareto Evolutionary Algorithm (SPEA) [14]. The optimization results of these
     two algorithms applied to the benchmark problems ZDT1 to ZDT6 have been taken from [4];
     •      Two hybrid multi-objective optimization algorithms, SPEA-SQP and NSGA-SQP, presented in [4];
     optimization results for these algorithms applied to the ZDT1 to ZDT6 benchmark problems are also taken from
     [4];
     •      Three state of the art multi-objective optimization algorithms developed by a leading company on the
     Process Integration and Design Optimization (PIDO) market: Pointer, NSGA-II, and AMGA. These
     commercial algorithms represent the highest level of optimization technology currently available on the PIDO
     market.

    NSGA-II and AMGA are pure multi-objective optimization algorithms, suitable for comparison with HMGE.
Pointer is more questionable with regard to multi-objective optimization because it works as an automatic
optimization engine that controls four different optimization algorithms, only one of which is a true
multi-objective algorithm; the other three use a weighted-sum method for solving multi-objective
optimization tasks. Thus, Pointer is not the most suitable algorithm for scientific comparison with other
multi-objective techniques. However, Pointer is a capable optimization tool that is
widely used for multi-objective optimization in engineering practice, so testing Pointer on the ZDT1-ZDT6
benchmark problems makes practical sense.
    For the algorithms AMGA, NSGA-II, Pointer, and HMGE, only the default parameter values have been used, to
make sure that all algorithms are compared under equal conditions.

    The benchmark ZDT1 has 30 design variables and multiple Pareto fronts. The optimization task formulation
used is as follows:
                    Minimize F_1 = x_1
                    Minimize F_2 = g \left[ 1 - \sqrt{F_1 / g} \right]                    (2)
                    g = 1 + \frac{9}{n-1} \sum_{i=2}^{n} x_i
                    0 \le x_i \le 1, \; i = 1, \ldots, n; \; n = 30

         The Pareto-optimal region for the problem ZDT1 corresponds to x1 ∈ [0;1], and xi = 0, i = 2,...,30.
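
    For reference, a direct Python transcription of (2) is straightforward; on the Pareto-optimal front g = 1, so F2 = 1 - sqrt(F1).

import math

# ZDT1 objectives per formulation (2); x is a list of n = 30 values in [0, 1].
def zdt1(x):
    n = len(x)
    f1 = x[0]
    g = 1.0 + 9.0 * sum(x[1:]) / (n - 1)
    f2 = g * (1.0 - math.sqrt(f1 / g))
    return f1, f2

# On the Pareto-optimal front x2..xn = 0, so g = 1 and f2 = 1 - sqrt(f1):
# zdt1([0.25] + [0.0] * 29) -> (0.25, 0.5)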




   FIG.1 Optimization results for the benchmark problem ZDT1 (2) found by HMGE algorithm and three
    state of the art algorithms: Pointer, NSGA-II, and AMGA. HMGE spent 300 evaluations while other
      algorithms spent 10 times more model evaluations: NSGA-II—3500; AMGA and Pointer—5000.

     As follows from FIG.1, HMGE is able to find the global Pareto frontier, and covers it evenly at the
price of 300 model evaluations.
     The NSGA-II algorithm spent 3,500 model evaluations, and evenly covered the entire Pareto frontier with a high
enough level of diversity. However, only a few of the points found by NSGA-II belong to the global Pareto
frontier; the rest are dominated by the points found by HMGE. This can be clearly seen on
both the left (objective space) and right (design space) diagrams (see FIG.1).
     Pointer spent 5000 model evaluations, and found a few Pareto-optimal points in the middle part of the Pareto
frontier. The diversity of the points is low, and most of the Pareto frontier is not covered.
     The AMGA algorithm spent 5000 model evaluations, and did not approach the global Pareto frontier at all.

    The benchmark ZDT2 has 30 design variables and multiple Pareto fronts. The optimization task formulation
used is as follows:
                    Minimize F_1 = x_1
                    Minimize F_2 = g \left[ 1 - (F_1 / g)^2 \right]                    (3)
                    g = 1 + \frac{9}{n-1} \sum_{i=2}^{n} x_i
                    0 \le x_i \le 1, \; i = 1, \ldots, n; \; n = 30


The Pareto-optimal region for the problem ZDT2 corresponds to x1 ∈ [0;1], and xi = 0, i = 2,...,30.




     FIG.2 Results for the benchmark problem ZDT2 (3) found by HMGE algorithm and three state of the
 art algorithms: Pointer, NSGA-II, and AMGA. HMGE spent 400 evaluations while other algorithms spent
             7-12 times more model evaluations: NSGA-II—3000; AMGA and Pointer—5000.


     As follows from FIG.2, HMGE has found the global Pareto frontier, and evenly covered it at the price of 400
model evaluations.
     The NSGA-II algorithm spent 3,000 model evaluations, and evenly covered the entire Pareto frontier with a high
enough level of diversity. However, just a few of the points found by NSGA-II belong to the global Pareto
frontier; the rest are dominated by the points found by HMGE. This can be clearly seen on
both the left (objective space) and right (design space) diagrams (see FIG.2).
     Pointer spent 5000 model evaluations, and found a few Pareto-optimal points in the top part of the Pareto
frontier. The diversity of the points is low, and most of the Pareto frontier is not covered.
     The AMGA algorithm spent 5000 model evaluations, and did not approach the global Pareto frontier at all.

    The following benchmark ZDT3 (4) has 30 design variables and multiple discontinuous Pareto fronts.

                    Minimize F_1 = x_1
                    Minimize F_2 = g \left[ 1 - \sqrt{F_1 / g} - (F_1 / g) \sin(10 \pi F_1) \right]                    (4)
                    g = 1 + \frac{9}{n-1} \sum_{i=2}^{n} x_i
                    0 \le x_i \le 1, \; i = 1, \ldots, n; \; n = 30


    The Pareto-optimal region for the problem ZDT3 corresponds to x1 ∈ [0;1] and xi = 0, i = 2,...,30. The
Pareto front is discontinuous for the ZDT3 problem, which means that not all points satisfying x1 ∈ [0;1] are Pareto
optimal.




FIG.3 Results for the benchmark problem ZDT3 found by HMGE algorithm and three state of the art
   algorithms: Pointer, NSGA-II, and AMGA. HMGE spent 800 evaluations while other algorithms spent
            significantly more model evaluations: NSGA-II—4000; AMGA and Pointer—5000.


    As follows from FIG.3, HMGE was able to find the global Pareto frontier for the benchmark ZDT3, and covered it
evenly at the price of 800 evaluations. NSGA-II spent 4000 model evaluations, and found all 5 disjoint parts of the
Pareto frontier. However, most of its points are dominated by the solutions found by HMGE (see FIG.3).
     The AMGA and Pointer algorithms spent 5000 evaluations, and found only local Pareto frontiers. Pointer was able
to find a few globally Pareto-optimal points in the top part of the Pareto frontier. The AMGA algorithm was not able to
approach the global Pareto frontier at all.

    The following benchmark ZDT4 (5) has 10 design variables and multiple local Pareto fronts.

                    Minimize F_1 = x_1
                    Minimize F_2 = g(X) \cdot h(F_1(X), g(X))
                    g(X) = 1 + 10(n-1) + \sum_{i=2}^{n} \left[ x_i^2 - 10 \cos(4 \pi x_i) \right]                    (5)
                    h(F_1, g) = 1 - \sqrt{F_1 / g}, \quad x_1 \in [0;1], \; x_i \in [-5;5], \; i = 2, \ldots, n; \; n = 10

    The global Pareto-optimal front corresponds to x1 ∈ [0;1], xi = 0, i = 2,...,10. There exist 21^9 local
Pareto-optimal solutions, and about 100 distinct local Pareto fronts [4].

     The following FIG.4 shows the Pareto-optimal points found by the HMGE algorithm. The relatively small number of
variables (2 objectives and 10 design variables) makes it possible to show all of them on six scatter plots, and to see
exactly how precise the Pareto-optimal solution is.




FIG.4 Optimization results for the benchmark problem ZDT4 found by HMGE algorithm.

     The diagrams in FIG.4 allow one to see the values of both objectives and all design variables. Values of the
variable x1 cover the interval [0;1] evenly and completely; the rest of the design variables have the exact values x2,…,x10 = 0.
This means that HMGE has found the global Pareto frontier precisely, and covered it completely after 700 model
evaluations.




       FIG.5A Optimization results for the benchmark problem ZDT4 found by the HMGE algorithm and two
state of the art algorithms: Pointer and NSGA-II. HMGE spent 700 evaluations while the other algorithms spent
                                         5000 model evaluations each.
       FIG.5B Optimization results for the benchmark problem ZDT4 found by the AMGA algorithm after 5000
                                              model evaluations.

     As follows from FIG.5A, the HMGE algorithm has found the global Pareto frontier; the Pareto frontier is covered
completely and evenly after 700 evaluations. Pointer spent 5000 evaluations, and was able to cover the half of the Pareto
frontier with lower values of F1. NSGA-II found points in the same part of the Pareto frontier as Pointer; however,
the points found by NSGA-II are dominated by the HMGE points.
     The AMGA algorithm also spent 5000 model evaluations, but with unsatisfactory results. Comparison with FIG.5A
shows that the AMGA algorithm failed to find the global Pareto frontier.

    The following benchmark ZDT6 (6) has 10 design variables and multiple local Pareto fronts.


                    Minimize F_1 = 1 - \exp(-4 x_1) \cdot \sin^6(6 \pi x_1)
                    Minimize F_2 = g(X) \cdot h(F_1(X), g(X))
                    g(X) = 1 + 9 \cdot \left[ \left( \sum_{i=2}^{n} x_i \right) / (n-1) \right]^{0.25}                    (6)
                    h(F_1, g) = 1 - (F_1 / g)^2, \quad x_i \in [0;1], \; i = 1, \ldots, n; \; n = 10




    The global Pareto-optimal front corresponds to x1 ∈ [0;1], xi = 0, i = 2,...,10. The Pareto-optimal front is
non-convex, but the most challenging obstacle is that the density of solutions across the Pareto-optimal region is
highly non-uniform.




           FIG.6 Optimization results for the benchmark problem ZDT6 found by HMGE algorithm.

     The diagrams in FIG.6 allow one to see the values of both objectives and all design variables. Values of the
variable x1 cover the interval [0;1] completely, but not evenly, because the density of solutions across the Pareto front is
non-uniform by the nature of the model. The rest of the design variables have the exact values x2,…,x10 = 0. This means that
HMGE has found the global Pareto frontier precisely, and covered it completely with 151 Pareto-optimal points after 1000
model evaluations.


    FIG.7A Results for the benchmark problem ZDT6 found by the HMGE algorithm and two state of the art
algorithms: Pointer and NSGA-II. HMGE spent 1000 evaluations while the other algorithms spent 5000 model
                                           evaluations each.
    FIG.7B Optimization results for the benchmark problem ZDT6 found by the AMGA algorithm after 5000
                                          model evaluations.

     As follows from FIG.7A, neither Pointer nor NSGA-II covered the entire interval [0;1] for the variable x1,
which indicates that the non-uniform density of Pareto-optimal solutions posed by the ZDT6 task is a significant obstacle
for these algorithms. In contrast, HMGE covered the [0;1] interval entirely, and spent five times fewer
model evaluations (1000 versus 5000). The AMGA algorithm failed to find the global Pareto frontier for the ZDT6 problem (see FIG.7B).

                      Table 1 Objective evaluations for ZDT1-ZDT6 benchmark problems

Problem    SPEA      NSGA       SPEA-SQP    NSGA-SQP    Pointer    NSGA-II    AMGA            HMGE
ZDT1       20,000    25,050     4,063       4,290       5,000      3,500      5,000           300
ZDT2       20,000    25,050     3,296       3,746       5,000      4,000      5,000           400
ZDT3       20,000    25,050     11,483      11,794      5,000      4,000      5,000           800
ZDT4       80,000    100,050    76,778      93,643      5,000      5,000      5,000/failed    700
ZDT6       20,000    25,050     2,042       2,115       5,000      5,000      5,000/failed    1,500

     Table 1 shows that HMGE obtains its results with far fewer objective evaluations than all of the other algorithms
on every problem. For instance, for the ZDT1 and ZDT2 problems HMGE needs roughly 8-12 times fewer objective
evaluations than SPEA-SQP, NSGA-SQP, Pointer, NSGA-II, and AMGA. For ZDT4 HMGE
is 7 times faster than Pointer and NSGA-II, and more than 100 times faster than the SPEA-SQP and
NSGA-SQP optimization algorithms.
      It can also be observed that the HMGE solutions exhibit better diversity on all benchmark problems
ZDT1-ZDT6 compared with most of the other optimization algorithms (see FIG.1-7, and the scatter plots published in
[4]). Only NSGA-II shows a level of diversity comparable to the HMGE results in the objective space (see FIG.7);
however, in the design space (FIG.7, right diagram) NSGA-II still shows low diversity even after 5000
evaluations.
      It is important to mention that for ZDT6, which is designed to cause the non-uniform density difficulty, HMGE has
shown a high level of solution diversity.
      Apparently, the SPEA-SQP and NSGA-SQP algorithms spend N+1 model evaluations to estimate gradients in the
SQP part of the hybrid algorithms. This significantly reduces the overall efficiency of the algorithms for tasks with
N=30 design variables. In contrast, HMGE spends just 4-7 model evaluations to estimate gradients. This is probably
the reason why the SPEA-SQP and NSGA-SQP algorithms are 8-12 times less efficient than HMGE.

                           IV.    Constrained Optimization Benchmark Problems
    The test problems for evaluating the constrained optimization performance of the HMGE algorithm were chosen
from the benchmark domains commonly used in past multi-objective GA research.

    The following BNH benchmark problem was used by Binh and Korn [15].

                    Minimize F_1 = 4 x_1^2 + 4 x_2^2
                    Minimize F_2 = (x_1 - 5)^2 + (x_2 - 5)^2
                    c_1 = (x_1 - 5)^2 + x_2^2 - 25 \le 0                    (7)
                    c_2 = -(x_1 - 8)^2 - (x_2 + 3)^2 + 7.7 \le 0
                    x_1 \in [0;5], \; x_2 \in [0;3]
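
    A direct Python transcription of (7) is given below. Returning the raw constraint values (feasible when both are <= 0) is a common convention and an assumption here, since the paper does not show how HMGE encodes constraints internally.

# BNH from (7): objectives plus constraint values; a point is feasible when
# both constraint values are <= 0 (assumed convention).
def bnh(x1, x2):
    f1 = 4.0 * x1**2 + 4.0 * x2**2
    f2 = (x1 - 5.0)**2 + (x2 - 5.0)**2
    c1 = (x1 - 5.0)**2 + x2**2 - 25.0            # feasible when c1 <= 0
    c2 = -(x1 - 8.0)**2 - (x2 + 3.0)**2 + 7.7    # feasible when c2 <= 0
    return (f1, f2), (c1, c2)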

   The following FIG.8 illustrates the optimization results for the benchmark problem (7), showing the solutions found by
the HMGE algorithm and three state of the art algorithms: NSGA-II, Pointer, and AMGA.




       FIG.8A Results for the benchmark problem BNH found by the HMGE and NSGA-II algorithms. Both
             algorithms spent 1000 model evaluations, and showed equally good results.
   FIG.8B Optimization results for the benchmark problem BNH found by the Pointer and AMGA algorithms.
 Pointer spent 3000 model evaluations, and did not cover the part of the Pareto frontier corresponding to
high values of F1. AMGA spent 5000 model evaluations, and showed even worse results in the top part of
                                           the Pareto frontier.

    The BNH problem is fairly simple since the constraints do not introduce additional difficulty in finding the
Pareto-optimal solutions. It was observed that both the HMGE and NSGA-II methods performed equally well within
1000 objective evaluations, and gave a dense sampling of solutions along the true Pareto-optimal front. However,
the Pointer and AMGA algorithms did not show such good results even after 3000 (Pointer) and 5000 (AMGA)
evaluations.
     The following OSY benchmark problem was used by Osyczka and Kundu [16]. The OSY problem (FIG.9) is
relatively difficult because the constraints divide the Pareto frontier into five regions, which makes it difficult for
optimization algorithms to find all parts of the Pareto frontier.

                    Minimize f_1(X) = -\left[ 25(x_1-2)^2 + (x_2-2)^2 + (x_3-1)^2 + (x_4-4)^2 + (x_5-1)^2 \right]
                    Minimize f_2(X) = x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 + x_6^2
                    C_1(X) = x_1 + x_2 - 2 \ge 0
                    C_2(X) = 6 - x_1 - x_2 \ge 0
                    C_3(X) = 2 - x_2 + x_1 \ge 0                    (8)
                    C_4(X) = 2 - x_1 + 3 x_2 \ge 0
                    C_5(X) = 4 - (x_3 - 3)^2 - x_4 \ge 0
                    C_6(X) = (x_5 - 3)^2 + x_6 - 4 \ge 0
                    x_1, x_2, x_6 \in [0;10]; \; x_4 \in [0;6]; \; x_3, x_5 \in [1;5]




       FIG.9 Optimization results for the benchmark problem OSY found by HMGE algorithm, and three
  state of the art optimization algorithms NSGA-II, Pointer, and AMGA. HMGE algorithm has spent 2000
     model evaluations, and outperformed all other algorithms that spent 3000 model evaluations each.

     The constraints of the OSY problem divide the Pareto-optimal set into five regions. This requires a genetic
algorithm to maintain its population in the disjoint parts of the design space determined by the intersections of the
constraint boundaries. In terms of non-genetic algorithms, sample points need to be generated in all disjoint parts of the design
space related to the parts of the Pareto frontier.
     As follows from FIG.9, the Pointer and NSGA-II algorithms were not able to recognize and populate all the necessary
disjoint areas. As a result, Pointer did not find some of the Pareto frontier segments. NSGA-II was not able to find the
Pareto frontier at all, although it reproduced the correct shape of the frontier.
     The AMGA and HMGE algorithms demonstrated better performance in finding the global Pareto frontier. The AMGA
algorithm spent 3000 evaluations, and found optimal points on all parts of the Pareto frontier; however, it could not
cover the Pareto frontier evenly. HMGE spent 2000 evaluations, and outperformed AMGA: it covered the Pareto
frontier completely and evenly (see FIG.9).


V.      Conclusion
     In this study, a gradient-based technique is incorporated into a genetic algorithm framework, and the new Hybrid
Multi-Gradient Explorer (HMGE) algorithm for multi-objective optimization is developed. The Dynamically
Dimensioned Response Surface Method (DDRSM) is used for fast gradient estimation to perform the gradient mutation.
     The HMGE algorithm provides an appropriate balance between gradient-based and GA-based techniques in the
optimization process. As a result, HMGE is very efficient in finding the global Pareto frontier, and demonstrates high
convergence towards local Pareto frontiers.
     DDRSM requires just 4-7 model evaluations to estimate gradients regardless of the task dimension, and provides
high accuracy even for highly non-linear models. The synergy of these features brings HMGE to an
unparalleled level of efficiency and scalability compared to state of the art commercial multi-objective
optimization algorithms.
     HMGE is believed to be the first global multi-objective optimization algorithm which (a) has the high convergence
typical of gradient-based methods; (b) is very efficient in finding the global Pareto frontier; and (c) efficiently and
accurately solves multi-objective optimization tasks with dozens or hundreds of design variables.
     Comparison of HMGE with the SPEA-SQP and NSGA-SQP hybrid algorithms shows that HMGE spends 2-50
times fewer model evaluations in all the test cases and shows better diversity of the optimal points. This confirms that
HMGE's scheme of hybridizing gradient-based and GA-based techniques is more efficient than that of [4].
     Comparison of HMGE with the state of the art commercial multi-objective optimization algorithms NSGA-II,
AMGA, and Pointer on a number of challenging benchmarks has shown that HMGE finds global Pareto frontiers
2-10 times faster. This eliminates the need for DOE and surrogate models for global approximation, and instead
allows one to apply HMGE directly to the optimization of computationally expensive simulation models, even
without parallelization and cluster computing. Since the HMGE algorithm supports parallelization as well,
optimization time can be reduced by an additional factor of 4-8.
     HMGE is the best choice for solving global multi-objective optimization tasks for simulation models with
moderate evaluation time, when 200-500 model evaluations are considered a reasonable budget for finding
Pareto-optimal solutions.
     HMGE and other eArtius optimization algorithms are implemented in the eArtius design optimization product
Pareto Explorer. The algorithms are also available as plug-ins for the most popular design optimization environments:
Noesis OPTIMUS, ESTECO modeFRONTIER, and Simulia Isight. Additional information about eArtius design
optimization technology can be found on the website www.eartius.com.

                                                          References
1. M. Brown and R. E. Smith, "Directed Multi-Objective Optimization," International Journal of Computers, Systems
and Signals, Vol. 6, No. 1, 2005.
2. P. Bosman and E. de Jong, "Combining Gradient Techniques for Numerical Multi-Objective Evolutionary
Optimization," Genetic and Evolutionary Computation Conference (GECCO), 2006.
3. R. Salomon, "Evolutionary Algorithms and Gradient Search: Similarities and Differences," IEEE Transactions on
Evolutionary Computation, Vol. 2, No. 2, July 1998.
4. X. Hu, Z. Huang, and Z. Wang, "Hybridization of the Multi-Objective Evolutionary Algorithms and the
Gradient-Based Algorithms," The 2003 Congress on Evolutionary Computation (CEC '03), 2003.
5. P. Shukla, "Gradient Based Stochastic Mutation Operators in Evolutionary Multi-objective Optimization,"
Adaptive and Natural Computing Algorithms: 8th International Conference, Warsaw, Poland, 2007.
6. V. Sevastyanov and O. Shaposhnikov, "Gradient-based Methods for Multi-Objective Optimization," Patent
Application Serial No. 11/116,503, filed April 28, 2005.
7. L. Levitan and V. Sevastyanov, "The Exclusion of Regions Method for Multi-Objective Optimization," US Patent
No. 7,593,834, 2009.
8. D. Ortiz-Boyer, C. Hervás-Martínez, and N. García-Pedrajas, "CIXL2: A Crossover Operator for Evolutionary
Algorithms Based on Population Features," Journal of Artificial Intelligence Research (JAIR), Vol. 24, 2005.
9. M. M. Raghuwanshi, P. M. Singru, U. Kale, and O. G. Kakde, "Simulated Binary Crossover with Lognormal
Distribution," Proceedings of the 7th Asia-Pacific Conference on Complex Systems (Complex 2004).
10. K. Deb, S. Agrawal, A. Pratap, and T. Meyarivan, "A Fast and Elitist Multi-Objective Genetic Algorithm:
NSGA-II," IEEE Transactions on Evolutionary Computation, 6(2):182-197, 2002.
11. S. Tiwari, G. Fadel, P. Koch, and K. Deb, "AMGA: An Archive-based Micro Genetic Algorithm for
Multi-Objective Optimization," Proceedings of the 10th Annual Conference on Genetic and Evolutionary
Computation, Atlanta, GA, USA, 2008.
12. R. E. Bellman, Dynamic Programming, Princeton University Press, Princeton, NJ, 1957.
13. E. Zitzler, K. Deb, and L. Thiele, "Comparison of Multiobjective Evolutionary Algorithms: Empirical Results,"
Evolutionary Computation, 8(2):125-148, 2000.
14. E. Zitzler and L. Thiele, "An Evolutionary Algorithm for Multiobjective Optimization: The Strength Pareto
Approach," Technical Report 43, Computer Engineering and Networks Laboratory, Swiss Federal Institute of
Technology, Zurich, Switzerland.
15. T. Binh and U. Korn, "MOBES: A Multiobjective Evolution Strategy for Constrained Optimization Problems,"
Proceedings of the 3rd International Conference on Genetic Algorithms (MENDEL 1997), Brno, Czech Republic,
pp. 176-182.
16. A. Osyczka and S. Kundu, "A New Method to Solve Generalized Multicriteria Optimization Problems Using the
Simple Genetic Algorithm," Structural Optimization, 10:94-99, 1995.




BINARY SINE COSINE ALGORITHMS FOR FEATURE SELECTION FROM MEDICAL DATABINARY SINE COSINE ALGORITHMS FOR FEATURE SELECTION FROM MEDICAL DATA
BINARY SINE COSINE ALGORITHMS FOR FEATURE SELECTION FROM MEDICAL DATA
 
BINARY SINE COSINE ALGORITHMS FOR FEATURE SELECTION FROM MEDICAL DATA
BINARY SINE COSINE ALGORITHMS FOR FEATURE SELECTION FROM MEDICAL DATABINARY SINE COSINE ALGORITHMS FOR FEATURE SELECTION FROM MEDICAL DATA
BINARY SINE COSINE ALGORITHMS FOR FEATURE SELECTION FROM MEDICAL DATA
 

Hybrid Multi-Gradient Explorer Algorithm for Global Multi-Objective Optimization

algorithms with a gradient-based SQP algorithm. The SPEA-SQP and NSGA-SQP hybrid algorithms improve convergence 5-10 times compared with SPEA and NSGA. However, they still require 4,000-11,000 model evaluations for the benchmark problems ZDT1-ZDT6, which rules them out for optimizing computationally expensive simulation models.
The presented HMGE optimization algorithm employs the in-house developed Dynamically Dimensioned Response Surface Method (DDRSM) [6] to estimate gradients. The key point of DDRSM is its ability to dynamically recognize the most significant design variables and to build local approximations based only on those variables. As a result, DDRSM requires just about 4-7 model evaluations to estimate gradients, without a significant decrease in accuracy and regardless of the task dimension. Thus, DDRSM satisfies the contradicting requirements of estimating gradients both efficiently and accurately, and enables the development of efficient and scalable optimization algorithms; HMGE is one such algorithm. HMGE was tested on tasks with different levels of non-linearity and a large variety of task dimensions, ranging from a few design variables up to thousands. It consistently shows high convergence and high computational efficiency.
     The optimization problem that the proposed algorithm solves is formally stated in (1):

\min_{X} \; F(X) = [F_1(X), F_2(X), \dots, F_m(X)]^T
\text{subject to: } q_j(X) \le 0, \quad j = 1, 2, \dots, k \qquad (1)
X = \{x_1, x_2, \dots, x_n\}, \quad X \in S \subset \mathbb{R}^n

where S \subset \mathbb{R}^n is the design space.
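For concreteness, formulation (1) maps naturally onto a small data structure. The following Python sketch is illustrative only; the names MOProblem, objectives, and constraints are not part of HMGE:

from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

@dataclass
class MOProblem:
    # Objectives F_1..F_m, each mapping a design point X to a scalar.
    objectives: List[Callable[[Sequence[float]], float]]
    # Inequality constraints q_j(X) <= 0, j = 1..k.
    constraints: List[Callable[[Sequence[float]], float]]
    # Box bounds defining the design space S within R^n.
    bounds: List[Tuple[float, float]]

    def evaluate(self, x):
        return [f(x) for f in self.objectives]

    def is_feasible(self, x):
        return all(q(x) <= 0.0 for q in self.constraints) and \
               all(lo <= xi <= hi for xi, (lo, hi) in zip(x, self.bounds))

The constrained benchmarks of section IV fit this shape directly.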
The remainder of this paper is organized as follows. Section 2 describes the proposed optimization algorithm. Section 3 presents benchmark problems, simulation results, and inferences from the simulation runs. Finally, section 4 gives a brief conclusion of the study.

II. Hybrid Multi-Gradient Explorer Algorithm

     The main idea of the HMGE algorithm is to use both gradient-based and random mutation operators. Theoretical analysis and numerous experiments with different benchmarks have shown that using just one of the two mutation mechanisms reduces the overall efficiency of the optimization algorithm. An algorithm performing only random mutation is a genetic algorithm with low convergence. In turn, using only gradient-based mutation improves convergence towards the nearest local Pareto frontier, but reduces the ability of the algorithm to find the global Pareto frontier. Therefore, a fast global optimization algorithm has to use both gradient-based and random mutation operators.
     This is an important consideration that makes the difference between local and global optimization technology, and between fast and slow optimization algorithms, so it deserves special attention. The gradient-based mutation operator always indicates a direction towards the nearest local Pareto front. If the global Pareto front is located in an opposite (or sufficiently different) direction, then such mutation will never find dominating solutions. Thus, random mutation is a critical part of any hybrid optimization algorithm. On the other hand, any global optimization algorithm needs to use gradients to improve convergence. This is why HMGE uses both random and gradient-based mutation operators.
     Another important question in the design of HMGE is how to estimate gradients without reducing the algorithm's efficiency. HMGE employs the Dynamically Dimensioned Response Surface Method (DDRSM) [6] to estimate gradients because DDRSM is equally efficient for any task dimension, and requires a low number (4-7) of model evaluations to estimate gradients across a large variety of task dimensions.
     The HMGE algorithm is an evolutionary multi-objective optimization algorithm combined with a gradient-based technique. A number of evolutionary optimization algorithms could serve as the basis for developing hybrid algorithms such as HMGE, and any evolutionary algorithm would benefit from the proposed gradient-based technique.
However, the most advanced evolutionary technique combined with the proposed gradient-based technique yields the biggest synergy in hybrid algorithm efficiency. After careful consideration, NSGA-II [10] and AMGA [11] were selected as the source of several concepts: Pareto ranking, the crossover formulation, mutation, the two-tier fitness assignment mechanism, preservation of elite and diverse solutions, and archiving of solutions.
     As a hybrid evolutionary optimization algorithm, HMGE relies on genetic and gradient-based variation operators for creating new solutions. HMGE employs a generational scheme: during a particular iteration, only solutions created before that iteration take part in the selection process. Similarly to the AMGA algorithm, HMGE generates a small number of new solutions per iteration. HMGE works with a small population size and maintains a large external archive of all solutions obtained. At each iteration HMGE creates a small number of solutions using genetic and gradient-based variation operators, and all of these solutions are used to update the archive. The parent population is created from the archive. The creation of the mating pool is based on binary tournament selection and is similar to the one used in NSGA-II. Genetic and gradient-based variation operators are then used to create the offspring population.
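The paper does not detail the tournament rule beyond its similarity to NSGA-II. A minimal sketch of an NSGA-II-style binary tournament, assuming each archive member carries precomputed rank and crowding attributes, could look as follows:

import random

def binary_tournament(archive, pool_size):
    """Pick pool_size parents; each pick compares two random archive members."""
    pool = []
    for _ in range(pool_size):
        a, b = random.sample(archive, 2)
        # Prefer the lower (better) Pareto rank; break ties by the
        # larger crowding distance to preserve diversity.
        if (a.rank, -a.crowding) <= (b.rank, -b.crowding):
            pool.append(a)
        else:
            pool.append(b)
    return pool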
     A large archive size and the concept of Pareto ranking borrowed from NSGA-II [10] help to maintain the diversity of the solutions and to obtain a large number of non-dominated solutions.
     The pseudo-code of the proposed HMGE algorithm is as follows.

1. Begin
2. Generate the required number of initial points X1,…,XN using Latin Hypercube sampling
3. Set iteration number i = 0
4. Add newly calculated points to the archive
5. Sort points using crowding distance
6. Select the m best individuals and produce k children using the SBX crossover operator
7. Select the first child as current
8. If i is a multiple of p then go to 9, else go to 10
9. Improve the current individual by MGA analysis in the current subregion
10. Apply a random mutation operator to the current individual
11. If i is a multiple of 10 then go to 12, else go to 13
12. Set the subregion size to its default value
13. Decrease the subregion size by multiplying it by Sd (Sd < 1)
14. If the current individual is not the last, select the next one and go to 8, else go to 15
15. Increment i
16. If convergence is not reached and the maximum number of evaluations is not exceeded, go to 4
17. Report all the solutions found
18. End

     The above pseudo-code shows the main steps of the HMGE algorithm. Essentially, HMGE works as a genetic algorithm with elements of gradient-based techniques. The following two sub-sections explain the GA and gradient aspects of HMGE in detail.

A. GA-Based Elements of HMGE Algorithm

     HMGE employs operators typical for genetic algorithms, as follows.
     Simulated Binary Crossover (SBX) simulates the search features of the single-point crossover used in binary-coded genetic algorithms. This operator employs interval schemata processing, which means that common interval schemata of the parents are preserved in the offspring. The SBX crossover mostly generates offspring near the parents; thus, the crossover guarantees that the extent of the children is proportional to the extent of the parents [8]. The algorithm receives two parent points p0 and p1 and produces two children X0 and X1. It comprises the following steps [9] (a Python sketch follows the list):

1. Increment the current coordinate index j
2. Get a random value u in the range [0; 1]
3. Find \beta_q such that the area under the probability curve from 0 to \beta_q is equal to the chosen random number u
4. Calculate the child coordinates as follows:
   X_{0j} = 0.5[(1 + \beta_q) p_{0j} + (1 - \beta_q) p_{1j}]
   X_{1j} = 0.5[(1 - \beta_q) p_{0j} + (1 + \beta_q) p_{1j}]
5. If j is less than the number of independent variables, go to 1
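A sketch of the SBX steps above in Python, using the standard closed-form inverse for \beta_q. The distribution index eta is a tuning parameter whose value in HMGE is not stated in the paper:

import random

def sbx_crossover(p0, p1, eta=10.0):
    """Simulated Binary Crossover: returns two children near the parents."""
    c0, c1 = list(p0), list(p1)
    for j in range(len(p0)):
        u = random.random()
        # Closed-form beta_q: the area under the SBX probability
        # density from 0 to beta_q equals the random number u.
        if u <= 0.5:
            beta_q = (2.0 * u) ** (1.0 / (eta + 1.0))
        else:
            beta_q = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0))
        c0[j] = 0.5 * ((1 + beta_q) * p0[j] + (1 - beta_q) * p1[j])
        c1[j] = 0.5 * ((1 - beta_q) * p0[j] + (1 + beta_q) * p1[j])
    return c0, c1

A full implementation would also respect variable bounds and typically recombines each coordinate only with some probability.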
     Random mutation is an important part of genetic algorithms in general, and of HMGE in particular. The main idea of the HMGE random mutation operator is to move the initial point in a random direction within a given mutation range. The random mutation operator comprises the following steps:

1. Increment the current coordinate index j
2. Get a random value in the range [0; 1]
3. If the random number is less than 0.5 go to 4, else go to 9
4. Get a random value in the range [0; 1]
5. If the random number is less than 0.5 go to 6, else go to 7
6. Set the positive direction for the move
7. Set the negative direction for the move
8. Calculate the new coordinate by moving in the selected direction with a step equal to r percent of the design space
9. If j is less than the number of independent variables, go to 1

     HMGE maintains an archive of the best and most diverse solutions obtained over the optimization process. The archived solutions are used for the selection of parents for the SBX crossover and mutation operators. It is important to keep as many solutions as possible in the archive. On the other hand, the size of the archive determines the computational complexity of the proposed algorithm, and needs to be kept at a reasonable level by removing the worst solutions in a certain sense. For instance, non-Pareto-optimal points should be removed before any other points. HMGE maintains the archive size using the following steps (a sketch of the truncation follows the list):

1. Add newly created solutions to the archive
2. If the archive size does not exceed the given threshold, stop
3. Sort points by the crowding distance method, so that the worst points are located at the end of the archive
4. Remove as many of the last solutions in the archive as needed to match the archive size requirement
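A sketch of the archive truncation steps above, with crowding distances computed NSGA-II-style over the objective vectors. In HMGE, removal of dominated points would precede this crowding-based step:

def crowding_distances(F):
    """NSGA-II crowding distance for a list of objective vectors F."""
    n, m = len(F), len(F[0])
    dist = [0.0] * n
    for k in range(m):
        order = sorted(range(n), key=lambda i: F[i][k])
        span = F[order[-1]][k] - F[order[0]][k] or 1.0
        dist[order[0]] = dist[order[-1]] = float("inf")  # keep extremes
        for idx in range(1, n - 1):
            dist[order[idx]] += (F[order[idx + 1]][k] - F[order[idx - 1]][k]) / span
    return dist

def truncate_archive(archive, F, max_size):
    """Keep the max_size most widely spaced solutions (steps 1-4 above)."""
    if len(archive) <= max_size:
        return archive, F
    d = crowding_distances(F)
    keep = sorted(range(len(archive)), key=lambda i: -d[i])[:max_size]
    return [archive[i] for i in keep], [F[i] for i in keep]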
B. Gradient-Based Elements of HMGE Algorithm

     As can be seen from the HMGE pseudo-code, Multi-Gradient Analysis (MGA) [6, 7] is used to perform a gradient-based mutation and improve a given point with respect to all objectives. Essentially, MGA determines a direction of simultaneous improvement for all objective functions, and performs a step in this direction.
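The specifics of MGA are given in [6, 7] and are not reproduced here. Purely as an illustration, one simple hypothetical way (not eArtius' method) to obtain such a direction is to combine the normalized negative gradients and accept the result only if it decreases every objective to first order:

import math

def common_improvement_direction(grads):
    """Hypothetical stand-in for MGA: combine the normalized negative
    gradients of all objectives (minimization) and keep the result only
    if it improves every objective to first order."""
    n = len(grads[0])
    d = [0.0] * n
    for g in grads:
        norm = math.sqrt(sum(gi * gi for gi in g)) or 1.0
        for j in range(n):
            d[j] -= g[j] / norm  # step against each normalized gradient
    # d is a simultaneous-improvement direction only if it is a descent
    # direction for each objective: <d, grad F_i> < 0 for all i.
    if all(sum(dj * gj for dj, gj in zip(d, g)) < 0.0 for g in grads):
        return d
    return None  # no simultaneous improvement direction found here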
     In order to achieve this goal, HMGE needs to estimate gradients. Since real design optimization tasks involve computationally expensive simulation models, a model evaluation typically takes hours or even days of computational time. Thus, estimating gradients is the most challenging part of any optimization algorithm, because two contradicting requirements must be satisfied: (a) estimate the gradient in a computationally cheap way, and (b) provide high enough accuracy of the gradient estimation.
     The traditional finite difference method is accurate, but it requires N+1 model evaluations (N being the number of design variables) to estimate a gradient. A hybrid optimization algorithm which employs the finite difference method will be slower than a pure genetic algorithm, unless the algorithm is used for the optimization of low-dimensional models. Computationally expensive gradient estimation eliminates the possibility of developing hybrid optimization algorithms for models with more than 10 design variables. Alternative methods of gradient estimation such as [1,5] have low accuracy, and converge well only for the simplest benchmarks.
     The presented method of gradient estimation is based on the Dynamically Dimensioned Response Surface Method (DDRSM) [6], and satisfies both of the previously mentioned requirements: it is computationally cheap and accurate at the same time. DDRSM requires just 4-7 model evaluations to estimate gradients on each step, regardless of the task dimension. DDRSM (patent pending) is a response surface method which builds local approximations of all objective functions (and other output variables), and uses the approximations to estimate gradients.
     All known response surface methods suffer from the curse of dimensionality. The curse of dimensionality is the problem caused by the exponential increase in volume associated with adding extra dimensions to a design space [12], which in turn requires an exponential increase in the number of sample points needed to build an accurate enough response surface model. This is a significant limitation for all known response surface approaches, forcing engineers to artificially reduce the optimization task dimension by assigning constant values to most of the design variables.
     DDRSM resolves the curse of dimensionality problem in the following way. DDRSM is based on the realistic assumption that most real-life design problems have a few significant design variables, while the rest of the design variables are not significant. Based on this assumption, DDRSM estimates the most significant projections of the gradients of all output variables on each optimization step. To achieve this, DDRSM generates 5-7 sample points in the current sub-region, and uses the points to recognize the most significant design variables for each objective function. Then DDRSM builds local approximations which are utilized to estimate the gradients. Since an approximation does not include non-significant variables, the estimated gradient has only the projections that correspond to significant variables; all other projections of the gradient are equal to zero. Ignoring non-significant variables slightly reduces the accuracy, but allows gradients to be estimated at the price of 5-7 evaluations for tasks of practically any dimension.
     DDRSM recognizes the most significant design variables for each output variable (objective functions and constraints) separately. Thus, each output variable has its own list of significant variables that will be included in its approximating function. Also, DDRSM recognizes significant variables over and over again on each optimization step, each time it needs to estimate gradients. This is crucial because the topology of objective functions and constraints can diverge in different parts of the design space throughout the optimization process.
     As follows from the previous explanation, DDRSM dynamically reduces the task dimension in each sub-region, and does so independently for each output variable by ignoring non-significant design variables. The same variable can be critically important for one of the objective functions in the current sub-region, and not significant for other objective functions and constraints. Later, in a different sub-region, the lists of significant design variables can be very different, but DDRSM will still be able to recognize the most relevant design variables and estimate the gradients regardless. Thus, gradient-based mutation can be performed reliably and accurately in any circumstance.
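DDRSM's actual variable-recognition procedure is proprietary (patent pending) and not detailed in the paper. The sketch below only mirrors the description above under stated assumptions: a handful of samples in the current sub-region, a crude sensitivity ranking, and a local linear fit over the few retained variables. numpy is assumed, and the sensitivity measure and n_keep are illustrative choices:

import numpy as np

def ddrsm_like_gradient(f, x, radius, n_samples=6, n_keep=4, rng=np.random):
    """Illustrative DDRSM-style gradient estimate at point x (1-D array)."""
    n = x.size
    # 1) Evaluate a few perturbed points in the current sub-region.
    X = x + rng.uniform(-radius, radius, size=(n_samples, n))
    y = np.array([f(xi) for xi in X])
    # 2) Rank variables by a crude sensitivity measure: |correlation|
    #    between each coordinate and the output over the samples.
    dX = X - X.mean(axis=0)
    dy = y - y.mean()
    denom = np.linalg.norm(dX, axis=0) * (np.linalg.norm(dy) or 1.0)
    sens = np.abs(dX.T @ dy) / np.where(denom == 0.0, 1.0, denom)
    keep = np.argsort(sens)[-n_keep:]          # most significant variables
    # 3) Fit a local linear model y ~ a + b.X over the kept variables only.
    A = np.hstack([np.ones((n_samples, 1)), X[:, keep]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    # 4) Gradient projections: fitted slopes for kept variables, zero elsewhere.
    grad = np.zeros(n)
    grad[keep] = coef[1:]
    return grad

With n_samples=6, the cost per gradient matches the paper's stated 4-7 evaluations, independent of n.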
III. Benchmark Problems

     In this section, a set of unconstrained problems is used to test the HMGE algorithm's performance. Zitzler et al. described six problems (ZDT1 to ZDT6) [13], which have been further studied by other researchers. All of the problems are used here except ZDT5, since its variables are discrete and are not suitable for the gradient-based technique employed in the HMGE algorithm.
     In this study, seven multi-objective optimization algorithms have been compared to the proposed HMGE algorithm. The algorithms can be split into the following three groups:
     • Two well-known multi-objective evolutionary algorithms: the Non-dominated Sorting Genetic Algorithm (NSGA-II) [10] and the Strength Pareto Evolutionary Algorithm (SPEA) [14]. The optimization results of these two algorithms applied to the benchmark problems ZDT1 to ZDT6 have been taken from [4];
     • Two hybrid multi-objective optimization algorithms, SPEA-SQP and NSGA-SQP, presented in [4]; optimization results for these algorithms applied to the ZDT1 to ZDT6 benchmark problems are also taken from [4];
     • Three state of the art multi-objective optimization algorithms developed by a leading company of the Process Integration and Design Optimization (PIDO) market: Pointer, NSGA-II, and AMGA.
     These commercial algorithms represent the highest level of optimization technology currently available on the PIDO market. NSGA-II and AMGA are pure multi-objective optimization algorithms suitable for comparison with HMGE. Pointer is a more questionable algorithm with regard to multi-objective optimization, because it works as an automatic optimization engine that controls four different optimization algorithms, only one of which is a true multi-objective algorithm; the three other algorithms use a weighted sum method for solving multi-objective optimization tasks. Thus, Pointer is not the most suitable algorithm for scientific comparison with other multi-objective techniques. However, it is a capable optimization tool, and it is widely used for multi-objective optimization in engineering practice. Therefore, testing Pointer on the ZDT1-ZDT6 benchmark problems makes practical sense.
     For the algorithms AMGA, NSGA-II, Pointer, and HMGE, only the default parameter values have been used, to make sure that all algorithms are compared under equal conditions.
     The benchmark ZDT1 has 30 design variables and multiple Pareto fronts. The optimization task formulation used is as follows:
\text{Minimize } F_1 = x_1
\text{Minimize } F_2 = g \left[ 1 - \sqrt{F_1 / g} \right] \qquad (2)
g = 1 + \frac{9}{n-1} \sum_{i=2}^{n} x_i
0 \le x_i \le 1, \quad i = 1, \dots, n; \quad n = 30

     The Pareto-optimal region for the problem ZDT1 corresponds to x_1 \in [0;1] and x_i = 0, i = 2, \dots, 30.

FIG.1 Optimization results for the benchmark problem ZDT1 (2) found by the HMGE algorithm and three state of the art algorithms: Pointer, NSGA-II, and AMGA. HMGE spent 300 evaluations while the other algorithms spent more than 10 times as many: NSGA-II — 3,500; AMGA and Pointer — 5,000.

     As follows from FIG.1, HMGE is able to find the global Pareto frontier, and covers it evenly at the price of 300 model evaluations. The NSGA-II algorithm spent 3,500 model evaluations and evenly covered the entire Pareto frontier with a high enough level of diversity; however, only a few points found by NSGA-II belong to the global Pareto frontier. The rest of the points are dominated by the points found by the HMGE algorithm. This can be clearly seen in both the left (objective space) and right (design space) diagrams of FIG.1. Pointer spent 5,000 model evaluations and found a few Pareto optimal points in the middle part of the Pareto frontier; the diversity of the points is low, and most of the Pareto frontier is not covered. The AMGA algorithm spent 5,000 model evaluations and did not approach the global Pareto frontier at all.
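For reference, ZDT1 as stated in (2), written out in Python:

def zdt1(x):
    """ZDT1, n = 30: returns (F1, F2) for 0 <= x_i <= 1."""
    n = len(x)
    f1 = x[0]
    g = 1.0 + 9.0 * sum(x[1:]) / (n - 1)
    f2 = g * (1.0 - (f1 / g) ** 0.5)
    return f1, f2

On the global front g = 1, so the trade-off curve is F2 = 1 - \sqrt{F_1}.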
     The benchmark ZDT2 has 30 design variables and multiple Pareto fronts. The optimization task formulation used is as follows:

\text{Minimize } F_1 = x_1
\text{Minimize } F_2 = g \left[ 1 - (F_1 / g)^2 \right] \qquad (3)
g = 1 + \frac{9}{n-1} \sum_{i=2}^{n} x_i
0 \le x_i \le 1, \quad i = 1, \dots, n; \quad n = 30

     The Pareto-optimal region for the problem ZDT2 corresponds to x_1 \in [0;1] and x_i = 0, i = 2, \dots, 30.

FIG.2 Results for the benchmark problem ZDT2 (3) found by the HMGE algorithm and three state of the art algorithms: Pointer, NSGA-II, and AMGA. HMGE spent 400 evaluations while the other algorithms spent 7-12 times more model evaluations: NSGA-II — 3,000; AMGA and Pointer — 5,000.

     As follows from FIG.2, HMGE found the global Pareto frontier and evenly covered it at the price of 400 model evaluations. The NSGA-II algorithm spent 3,000 model evaluations and evenly covered the entire Pareto frontier with a high enough level of diversity; however, just a few points found by NSGA-II belong to the global Pareto frontier. The rest of the points are dominated by the points found by the HMGE algorithm. This can be clearly seen in both the left (objective space) and right (design space) diagrams of FIG.2. Pointer spent 5,000 model evaluations and found a few Pareto optimal points in the top part of the Pareto frontier; the diversity of the points is low, and most of the Pareto frontier is not covered. The AMGA algorithm spent 5,000 model evaluations and did not approach the global Pareto frontier at all.
     The following benchmark ZDT3 (4) has 30 design variables and multiple discontinuous Pareto fronts.

\text{Minimize } F_1 = x_1
\text{Minimize } F_2 = g \left[ 1 - \sqrt{F_1 / g} - (F_1 / g) \sin(10 \pi F_1) \right] \qquad (4)
g = 1 + \frac{9}{n-1} \sum_{i=2}^{n} x_i
0 \le x_i \le 1, \quad i = 1, \dots, n; \quad n = 30

     The Pareto-optimal region for the problem ZDT3 corresponds to x_1 \in [0;1] and x_i = 0, i = 2, \dots, 30. The Pareto front of the ZDT3 problem is discontinuous, which means that not all points satisfying x_1 \in [0;1] are Pareto optimal.
FIG.3 Results for the benchmark problem ZDT3 found by the HMGE algorithm and three state of the art algorithms: Pointer, NSGA-II, and AMGA. HMGE spent 800 evaluations while the other algorithms spent significantly more model evaluations: NSGA-II — 4,000; AMGA and Pointer — 5,000.

     As follows from FIG.3, HMGE was able to find the global Pareto frontier for the benchmark ZDT3 and cover it evenly at the price of 800 evaluations. NSGA-II spent 4,000 model evaluations and found all 5 disjoint parts of the Pareto frontier; however, most of its points are dominated by the solutions found by HMGE (see FIG.3). The AMGA and Pointer algorithms spent 5,000 evaluations each and found only local Pareto frontiers. Pointer was able to find a few global Pareto optimal points in the top part of the Pareto frontier; AMGA was not able to approach the global Pareto frontier at all.
     The following benchmark ZDT4 (5) has 10 design variables and multiple local Pareto fronts.

\text{Minimize } F_1 = x_1
\text{Minimize } F_2 = g(X) \cdot h(F_1(X), g(X)) \qquad (5)
g(X) = 1 + 10(n-1) + \sum_{i=2}^{n} \left[ x_i^2 - 10 \cos(4 \pi x_i) \right]
h = 1 - \sqrt{F_1 / g}, \quad x_1 \in [0;1], \; x_i \in [-5;5], \; i = 2, \dots, n; \; n = 10

     The global Pareto-optimal front corresponds to x_1 \in [0;1], x_i = 0, i = 2, \dots, 10. There exist 21^9 local Pareto-optimal solutions, and about 100 distinct Pareto fronts [4].
     The following FIG.4 shows the Pareto optimal points found by the HMGE algorithm. A relatively small number of variables (2 objectives and 10 design variables) makes it possible to show all of them on six scatter plots, and to see exactly how precise the Pareto optimal solution is.
FIG.4 Optimization results for the benchmark problem ZDT4 found by the HMGE algorithm.

     The diagrams in FIG.4 allow one to see the values of both objectives and of all design variables. Values of the variable x_1 cover the interval [0;1] evenly and completely; the rest of the design variables have the exact values x_2 = … = x_10 = 0. This means that HMGE has found the global Pareto frontier precisely, and covered it completely, after 700 model evaluations.

FIG.5A Optimization results for the benchmark problem ZDT4 found by the HMGE algorithm and two state of the art algorithms: Pointer and NSGA-II. HMGE spent 700 evaluations while the other algorithms spent 5,000 model evaluations each.
FIG.5B Optimization results for the benchmark problem ZDT4 found by the AMGA algorithm after 5,000 model evaluations.

     As follows from FIG.5A, the HMGE algorithm has found the global Pareto frontier; the Pareto frontier is covered completely and evenly after 700 evaluations. Pointer spent 5,000 evaluations and was able to cover only the half of the Pareto frontier with lower values of F_1. NSGA-II found points in the same part of the Pareto frontier as Pointer; however, the points found by NSGA-II are dominated by HMGE points. The AMGA algorithm also spent 5,000 model evaluations, but with unsatisfactory results: comparison with FIG.5A shows that AMGA failed to find the global Pareto frontier (FIG.5B).
     The following benchmark ZDT6 (6) has 10 design variables and multiple local Pareto fronts.

\text{Minimize } F_1 = 1 - \exp(-4 x_1) \cdot \sin^6(6 \pi x_1)
\text{Minimize } F_2 = g(X) \cdot h(F_1(X), g(X)) \qquad (6)
g(X) = 1 + 9 \cdot \left[ \left( \sum_{i=2}^{n} x_i \right) / (n-1) \right]^{0.25}
h(F_1, g) = 1 - (F_1 / g)^2, \quad x_i \in [0;1], \; i = 1, \dots, n; \; n = 10

     The global Pareto-optimal front corresponds to x_1 \in [0;1], x_i = 0, i = 2, \dots, 10. The Pareto-optimal front is non-convex, but the most challenging obstacle is that the density of solutions across the Pareto-optimal region is highly non-uniform.

FIG.6 Optimization results for the benchmark problem ZDT6 found by the HMGE algorithm.

     The diagrams in FIG.6 allow one to see the values of both objectives and of all design variables. Values of the variable x_1 cover the interval [0;1] completely, but not evenly, because the density of solutions across the Pareto front is non-uniform by the nature of the model. The rest of the design variables have the exact values x_2 = … = x_10 = 0. This means that HMGE has found the global Pareto frontier precisely, and covered it completely with 151 Pareto optimal points after 1,000 model evaluations.
FIG.7A Results for the benchmark problem ZDT6 found by the HMGE algorithm and two state of the art algorithms: Pointer and NSGA-II. HMGE spent 1,000 evaluations while the other algorithms spent 5,000 model evaluations each.

FIG.7B Optimization results for the benchmark problem ZDT6 found by the AMGA algorithm after 5,000 model evaluations.

     As follows from FIG.7A, neither Pointer nor NSGA-II covered the entire interval [0;1] for the variable x_1, which indicates that the non-uniform density of Pareto optimal solutions posed by the ZDT6 task is a significant obstacle for these algorithms. In contrast, HMGE covered the [0;1] interval entirely, and spent 5000/1000 = 5 times fewer model evaluations. The AMGA algorithm failed to find the global Pareto frontier for the ZDT6 problem (see FIG.7B).

Table 1 Objective evaluations for the ZDT1-ZDT6 benchmark problems

          SPEA     NSGA      SPEA-SQP   NSGA-SQP   Pointer   NSGA-II   AMGA           HMGE
ZDT1      20,000   25,050    4,063      4,290      5,000     3,500     5,000          300
ZDT2      20,000   25,050    3,296      3,746      5,000     4,000     5,000          400
ZDT3      20,000   25,050    11,483     11,794     5,000     4,000     5,000          800
ZDT4      80,000   100,050   76,778     93,643     5,000     5,000     5,000/failed   700
ZDT6      20,000   25,050    2,042      2,115      5,000     5,000     5,000/failed   1,500

     Table 1 shows that the results of the HMGE algorithm are obtained with far fewer objective evaluations than any other algorithm requires, for all problems. For instance, for the ZDT1 and ZDT2 problems HMGE needs only about 1/8 to 1/12 as many objective evaluations as SPEA-SQP, NSGA-SQP, Pointer, NSGA-II, and AMGA. For ZDT4, HMGE is 7 times faster than Pointer and NSGA-II, and more than 100 times faster when compared with the SPEA-SQP and NSGA-SQP optimization algorithms.
     It can also be observed that the solutions of HMGE exhibit better diversity for all benchmark problems ZDT1-ZDT6 compared with most of the other optimization algorithms (see FIG.1-7 and the scatter plots published in [4]). Only NSGA-II shows a level of diversity in the objective space comparable to the HMGE results (see FIG.7); however, in the design space (FIG.7, right diagram) NSGA-II still shows low diversity even after 5,000 evaluations. It is important to mention that for ZDT6, which is designed to create the non-uniform density difficulty, HMGE has shown a high level of solution diversity.
     Apparently, the SPEA-SQP and NSGA-SQP algorithms spend N+1 model evaluations to estimate gradients in the SQP part of the hybrid algorithms. This significantly reduces the overall efficiency of those algorithms for tasks with N = 30 design variables. In contrast, HMGE spends just 4-7 model evaluations to estimate gradients. This is probably the reason why the SPEA-SQP and NSGA-SQP algorithms are 8-12 times less efficient than HMGE.

IV. Constrained Optimization Benchmark Problems

     The test problems for evaluating the constrained optimization performance of the HMGE algorithm were chosen from the benchmark domains commonly used in past multi-objective GA research.
     The following BNH benchmark problem was used by Binh and Korn [15].

\text{Minimize } F_1 = 4 x_1^2 + 4 x_2^2
\text{Minimize } F_2 = (x_1 - 5)^2 + (x_2 - 5)^2 \qquad (7)
c_1 = (x_1 - 5)^2 + x_2^2 - 25 \le 0
c_2 = -(x_1 - 8)^2 - (x_2 + 3)^2 + 7.7 \le 0
x_1 \in [0;5], \quad x_2 \in [0;3]
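BNH as stated in (7), written out in Python for concreteness; the bounds x_1 \in [0;5], x_2 \in [0;3] are assumed to be enforced by the caller:

def bnh(x1, x2):
    """BNH problem (7): two objectives and two inequality constraints."""
    f1 = 4.0 * x1**2 + 4.0 * x2**2
    f2 = (x1 - 5.0) ** 2 + (x2 - 5.0) ** 2
    c1 = (x1 - 5.0) ** 2 + x2**2 - 25.0              # feasible when c1 <= 0
    c2 = -((x1 - 8.0) ** 2) - (x2 + 3.0) ** 2 + 7.7  # feasible when c2 <= 0
    feasible = c1 <= 0.0 and c2 <= 0.0
    return (f1, f2), feasible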
     FIG.8 illustrates the optimization results for the benchmark problem (7): the solutions found by the HMGE algorithm and by three state of the art algorithms, NSGA-II, Pointer, and AMGA.

FIG.8A Results for the benchmark problem BNH found by the HMGE and NSGA-II algorithms. Both algorithms spent 1,000 model evaluations and showed equally good results.

FIG.8B Optimization results for the benchmark problem BNH found by the Pointer and AMGA algorithms. Pointer spent 3,000 model evaluations and did not cover the part of the Pareto frontier corresponding to high values of F_1. AMGA spent 5,000 model evaluations and showed even worse results in the top part of the Pareto frontier.

     The BNH problem is fairly simple, since the constraints do not introduce additional difficulty in finding the Pareto optimal solutions. It was observed that both the HMGE and NSGA-II methods performed equally well within 1,000 objective evaluations, and gave a dense sampling of solutions along the true Pareto-optimal front. However, the Pointer and AMGA algorithms did not show such good results even after 3,000 (Pointer) and 5,000 (AMGA) evaluations.
     The following OSY benchmark problem was used by Osyczka and Kundu [16]. The OSY problem (FIG.9) is relatively difficult, because the constraints divide the Pareto frontier into five regions, which makes it hard for optimization algorithms to find all parts of the Pareto frontier.

\text{Minimize } f_1(X) = -\left[ 25 (x_1 - 2)^2 + (x_2 - 2)^2 + (x_3 - 1)^2 + (x_4 - 4)^2 + (x_5 - 1)^2 \right]
\text{Minimize } f_2(X) = x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 + x_6^2 \qquad (8)
C_1(X) = x_1 + x_2 - 2 \ge 0
C_2(X) = 6 - x_1 - x_2 \ge 0
C_3(X) = 2 - x_2 + x_1 \ge 0
C_4(X) = 2 - x_1 + 3 x_2 \ge 0
C_5(X) = 4 - (x_3 - 3)^2 - x_4 \ge 0
C_6(X) = (x_5 - 3)^2 + x_6 - 4 \ge 0
x_1, x_2, x_6 \in [0;10]; \quad x_4 \in [0;6]; \quad x_3, x_5 \in [1;5]

FIG.9 Optimization results for the benchmark problem OSY found by the HMGE algorithm and three state of the art optimization algorithms: NSGA-II, Pointer, and AMGA. The HMGE algorithm spent 2,000 model evaluations and outperformed all of the other algorithms, which spent 3,000 model evaluations each.

     The constraints of the OSY problem divide the Pareto-optimal set into five regions. This requires a genetic algorithm to maintain its population in disjoint parts of the design space determined by intersections of the constraint boundaries. In terms of non-genetic algorithms, sample points need to be generated in all disjoint parts of the design space related to the corresponding parts of the Pareto frontier. As follows from FIG.9, the Pointer and NSGA-II algorithms were not able to recognize and populate all the necessary disjoint areas. As a result, Pointer did not find some of the Pareto frontier segments. NSGA-II was not able to find the Pareto frontier at all, although it did reproduce the correct shape of the frontier. The AMGA and HMGE algorithms demonstrated better performance in finding the global Pareto frontier. The AMGA algorithm spent 3,000 evaluations and found optimal points on all parts of the Pareto frontier; however, it could not cover the Pareto frontier evenly. HMGE spent 2,000 evaluations and outperformed AMGA: it covered the Pareto frontier completely and evenly (see FIG.9).
V. Conclusion

     In this study, a gradient-based technique is incorporated into a genetic algorithm framework, and the new Hybrid Multi-Gradient Explorer (HMGE) algorithm for multi-objective optimization is developed. The Dynamically Dimensioned Response Surface Method (DDRSM) is used for fast gradient estimation in the gradient mutation operator.
     The HMGE algorithm provides an appropriate balance between gradient-based and GA-based techniques in the optimization process. As a result, HMGE is very efficient in finding the global Pareto frontier, and demonstrates high convergence towards local Pareto frontiers. DDRSM requires just 4-7 model evaluations to estimate gradients regardless of the task dimension, and provides high accuracy even for models with strong non-linearity. The synergy of these features brings HMGE to an unparalleled level of efficiency and scalability when compared with state of the art commercial multi-objective optimization algorithms.
     HMGE is believed to be the first global multi-objective optimization algorithm which (a) has the high convergence typical of gradient-based methods; (b) is very efficient in finding the global Pareto frontier; and (c) efficiently and accurately solves multi-objective optimization tasks with dozens and hundreds of design variables.
     Comparison of HMGE with the SPEA-SQP and NSGA-SQP hybrid algorithms shows that HMGE requires 2-50 times fewer model evaluations in all the test cases, and shows a better diversity of optimal points. This confirms that HMGE's scheme for hybridizing gradient- and GA-based techniques is more efficient than that of [4]. Comparison of HMGE with the state of the art commercial multi-objective optimization algorithms NSGA-II, AMGA, and Pointer on a number of challenging benchmarks has shown that HMGE finds the global Pareto frontiers 2-10 times faster. This eliminates the need for DOE and surrogate models for global approximation, and instead allows one to apply HMGE directly to the optimization of computationally expensive simulation models, even without the use of parallelization and cluster computing. Since the HMGE algorithm supports parallelization as well, it allows an additional 4-8 times reduction of optimization time. HMGE is the best choice for solving global multi-objective optimization tasks for simulation models with moderate evaluation time, when 200-500 model evaluations are considered a reasonable budget for finding Pareto optimal solutions.
     HMGE and other eArtius optimization algorithms are implemented in the eArtius design optimization product Pareto Explorer. The algorithms are also available as plug-ins for the most popular design optimization environments: Noesis OPTIMUS, ESTECO modeFRONTIER, and Simulia Isight. Additional information about eArtius design optimization technology can be found on the website www.eartius.com.

References

1 Brown, M., and Smith, R. E., "Directed Multi-Objective Optimization," International Journal of Computers, Systems and Signals, Vol. 6, No. 1, 2005.
2 Bosman, P., and de Jong, E., "Combining Gradient Techniques for Numerical Multi-Objective Evolutionary Optimization," Genetic and Evolutionary Computation Conference (GECCO 2006), 2006.
3 Salomon, R., "Evolutionary Algorithms and Gradient Search: Similarities and Differences," IEEE Transactions on Evolutionary Computation, Vol. 2, No. 2, July 1998.
4 Hu, X., Huang, Z., and Wang, Z., "Hybridization of the Multi-Objective Evolutionary Algorithms and the Gradient-Based Algorithms," The 2003 Congress on Evolutionary Computation (CEC '03), 2003.
5 Shukla, P., "Gradient Based Stochastic Mutation Operators in Evolutionary Multi-Objective Optimization," Adaptive and Natural Computing Algorithms: 8th International Conference, Warsaw, Poland, 2007.
6 Sevastyanov, V., and Shaposhnikov, O., "Gradient-Based Methods for Multi-Objective Optimization," Patent Application Serial No. 11/116,503, filed April 28, 2005.
7 Levitan, L., and Sevastyanov, V., "The Exclusion of Regions Method for Multi-Objective Optimization," US Patent No. 7,593,834, 2009.
8 Ortiz-Boyer, D., Hervás-Martínez, C., and García-Pedrajas, N., "CIXL2: A Crossover Operator for Evolutionary Algorithms Based on Population Features," Journal of Artificial Intelligence Research (JAIR), Vol. 24, 2005.
9 Raghuwanshi, M. M., Singru, P. M., Kale, U., and Kakde, O. G., "Simulated Binary Crossover with Lognormal Distribution," Proceedings of the 7th Asia-Pacific Conference on Complex Systems (Complex 2004), 2004.
10 Deb, K., Agrawal, S., Pratap, A., and Meyarivan, T., "A Fast and Elitist Multi-Objective Genetic Algorithm: NSGA-II," IEEE Transactions on Evolutionary Computation, Vol. 6, No. 2, 2002, pp. 182-197.
11 Tiwari, S., Fadel, G., Koch, P., and Deb, K., "AMGA: An Archive-based Micro Genetic Algorithm for Multi-Objective Optimization," Proceedings of the 10th Annual Conference on Genetic and Evolutionary Computation, Atlanta, GA, USA, 2008.
12 Bellman, R. E., Dynamic Programming, Princeton University Press, Princeton, NJ, 1957.
13 Zitzler, E., Deb, K., and Thiele, L., "Comparison of Multiobjective Evolutionary Algorithms: Empirical Results," Evolutionary Computation, Vol. 8, No. 2, 2000, pp. 125-148.
14 Zitzler, E., and Thiele, L., "An Evolutionary Algorithm for Multiobjective Optimization: The Strength Pareto Approach," Technical Report 43, Computer Engineering and Networks Laboratory, Swiss Federal Institute of Technology, Zurich, Switzerland.
15 Binh, T. T., and Korn, U., "MOBES: A Multiobjective Evolution Strategy for Constrained Optimization Problems," Proceedings of the 3rd International Conference on Genetic Algorithms (MENDEL 1997), Brno, Czech Republic, 1997, pp. 176-182.
16 Osyczka, A., and Kundu, S., "A New Method to Solve Generalized Multicriteria Optimization Problems Using the Simple Genetic Algorithm," Structural Optimization, Vol. 10, 1995, pp. 94-99.