## Research

### Dependency structure matrix genetic algorithm II

**Introduction**

In 2015, we proposed the dependency structure matrix genetic algorithm II (DSMGA-II) [1], which is currently a state-of-the-art discrete genetic algorithm. Based on the dependency structure matrix (DSM), a new linkage model called the incremental linkage set (ILS) is adopted in DSMGA-II to provide potential models for mixing. Restricted mixing and back mixing are the major recombination operators of DSMGA-II. Such a combination significantly reduces the number of function evaluations (NFE) compared with those of other optimal mixing operators. Experiment results show that DSMGA-II requires fewer function evaluations than the linkage tree genepool optimal mixing evolutionary algorithm (LT-GOMEA) [2] and the hierarchical Bayesian optimization algorithm (hBOA) [3], two previously used state-of-the-art genetic algorithms, on various benchmark problems.

DSMGA-II consists of four major components: pairwise linkage detection, model building, restricted mixing and back mixing. First, DSMGA-II randomly initializes a population. Then, to enhance the quality of the pairwise linkage model and reduce noise in the population, DSMGA-II performs bit-flipping greedy hill climbing (GHC) on each chromosome. Performing GHC before linkage model building can further ensure pairwise linkage information between genes by ruling out trivial cases that can be solved without linkage information. After initializing the population, only the selected chromosomes are used to build the linkage model. DSMGA-II adopts mutual information as a pairwise linkage measure and stores linkage information in a DSM. Then, recombination via the restricted and back mixings proceeds with the original population. To prevent overﬁtting due to frequent model building, the algorithm updates DSM once every O(

*l*) generations, where*l*is the problem size. Note that the population after selection is only used for updating the DSM; the rest of the algorithm proceeds with the original population. The pseudo code of DSMGA-II is presented in Algorithms 1, 2, and 3.**Performance of DSMGA-II**

Six types of linkage benchmark problems are tested, including four classical linkage-underlying problems and two real-world problems. These benchmark problems cover the wide range of different characteristics of real-world problems. Their mathematical formulations are listed in the following table.

The experimental results are shown in Figures 1 to 3. Figure 1 shows that the diﬀerences between DSMGA-II and the other algorithms become larger as the degree of overlap decreases. During the initial back mixing process, a few function evaluations are performed to refine the graph. As the linkage information becomes clearer, the correct subsolution stands out, and back mixing causes the subsolution to quickly dominate the population. This approach makes the algorithm more eﬃcient. However, if the problem structures severely overlap, the graph reﬁning process is prolonged, and hence, the NFE increases. The results indicate that DSMGA-II is capable of handling problems with overlapping structures, even for randomly generated problem landscapes. In other words, the ILS model expresses overlapping relations well.

**Figure 1: Scalability of DSMGA-II, LT-GOMEA and hBOA on NK-landscape problems with various degrees of overlap.**

**Figure 2: Scalability of DSMGA-II, LT-GOMEA and hBOA on the problems of deceptive variants.**

The results pertaining to deceptive variants are shown in Figure 2. For the concatenated trap, the slope of DSMGA-II decreases asa the problem becomes larger. For the cyclic trap, the scalabilities of all algorithms appear very similar to that of DSMGA-II, which is the lowest by a constant factor. As previously mentioned, the cyclic trap cannot be solved eﬃciently merely by the correct problem decomposition. The use of linkage information is also key. ILS automatically extends the mask for trials and stops on the ﬁrst successful recombination to avoid spending unnecessary function evaluations, while many other genetic algorithms utilize the linkage models determined by certain thresholds, which are sensitive to the parameter settings. The folded trap contains a large number of local optima that reside on plateaus. Performing an eﬃcient search without losing too much diversity is the key to conquering such a diﬃculty. Back mixing leads to the drift of subsolutions as necessary, and restricted mixing checks whether the trial solution is unique in the population to maintain diversity. As a result, DSMGA-II shows a good ability to address attractions without trading eﬃciency for diversity.

**Figure 3: Scalability of DSMGA-II, LT-GOMEA and hBOA on Spin-glass and MAX-SAT (*LT- GOMEA fails to reach the global optima for two instances with**

*l*= 100 on MAX-SAT). For the Ising spin-glass problems, the slope of DSMGA-II decreases as the problem becomes larger (Figure 3). This trend is probably observed because the problems contain a large number of plateaus, for which the advantages of DSMGA-II hold. The overall amount of time spent on ﬁtness evaluations appears to grow polynomially as O(

*n*^{3}) for DSMGA-II, which is close to the best known result of problem-speciﬁc algorithms for Ising spin-glass problems. For MAX-SAT problems, although the NFE of DSMGA-II still grows exponentially (because it is a well-known NP-complete problem), DSMGA-II consumes fewer functional evaluations than hBOA and LT-GOMEA. In 2017, we proposed modifying the linkage model [4] by separating two different types of linkage and produced a two-edge graphical model (Figure 4). With this modified linkage model, DSMGA-II consumes 5~30% fewer function evaluations on the abovementioned test problems.

In summary, genetic algorithms have been widely used for black-box optimization in many different areas of engineering. Having a more competent optimizer means that we can have a better performing system, a stronger architecture, a faster bullet train (Figure 5), and most importantly, a more convenient life.

**Figure 4: (a) The original linkage model in DSMGA-II. (b) The two-edge graphical model, where the homogenous linkages are black and the heterogeneous linkages are red. (c) and (d) Two cases of ILS for different alleles.**

**Figure 5: The shape of the nose cone of the N700 bullet train was designed by genetic algorithms.**

**References**

1. Hsu, S.-H and Yu, T.-L. (2015). Optimization by pairwise linkage detection, incremental linkage set, and restricted/back mixing: DSMGA-II.

*Annual Conference on Genetic and Evolutionary Computation*, 519–526.2. Thierens, D. and Bosman, P. A. (2011). Optimal mixing evolutionary algorithms. In

*Proceedings of the 13th annual conference on Genetic and Evolutionary Computation Conference*, 617–624.3. Pelikan, M. and Goldberg, D. E. (2001). Escaping hierarchical traps with competent genetic algorithms. In

*Proceedings of Genetic and Evolutionary Computation Conference*, 511-518.4. Chen, P.-L., Peng, C.-J., & Yu, T.-L. (2017). Two-edge graphical linkage model for DSMGA-II. In

*Proceedings of the Genetic and Evolutionary Computation Conference*, 745–752.**Tian-Li Yu**

Department of Electrical Engineering