Journal of Computer Science and Technology ›› 2019, Vol. 34 ›› Issue (6): 1241-1257.doi: 10.1007/s11390-019-1973-1

Special Issue: Artificial Intelligence and Pattern Recognition; Data Management and Data Mining

• Artificial Intelligence and Pattern Recognition • Previous Articles     Next Articles

Large-Scale Estimation of Distribution Algorithms with Adaptive Heavy Tailed Random Projection Ensembles

Momodou L. Sanyang1,2, Ata Kabán1, Member, IEEE   

  1. 1 School of Computer Science, University of Birmingham, Edgbaston, Birmingham B15 2TT, U. K.;
    2 School of Information Technology and Communication, University of The Gambia, Serekunda, The Gambia
  • Received:2018-09-06 Revised:2019-09-13 Online:2019-11-16 Published:2019-11-16
  • About author:Momodou L. Sanyang received his Ph.D. degree in computer science from the University of Birmingham, Birmingham, in 2017. Prior to that, he received his B.Sc. degree in mathematics and physics from the University of The Gambia (UTG), Serekunda, The Gambia, in 2006, and his M.Sc. degree in information systems and applications from the "National" Tsing Hua University, Hsinchu, in 2010. He is currently a senior lecturer in computer science at the University of The Gambia (UTG), and a honorary research fellow at the University of Birmingham, UK. His research interests include machine learning and data mining, dimensionality reduction, random projections, evolutionary computation and black box optimisation in high-dimensional settings.
  • Supported by:
    The work of Momodou L. Sanyang was partly funded by a Ph.D. scholarship from the Islamic Development Bank. The work of Ata Kabán is funded by the Engineering and Physical Sciences Research Council of UK under Fellowship Grant EP/P004245/1.

We present new variants of Estimation of Distribution Algorithms (EDA) for large-scale continuous optimisation that extend and enhance a recently proposed random projection (RP) ensemble based approach. The main novelty here is to depart from the theory of RPs that require (sub-)Gaussian random matrices for norm-preservation, and instead for the purposes of high-dimensional search we propose to employ random matrices with independent and identically distributed entries drawn from a t-distribution. We analytically show that the implicitly resulting high-dimensional covariance of the search distribution is enlarged as a result. Moreover, the extent of this enlargement is controlled by a single parameter, the degree of freedom. For this reason, in the context of optimisation, such heavy tailed random matrices turn out to be preferable over the previously employed (sub-)Gaussians. Based on this observation, we then propose novel covariance adaptation schemes that are able to adapt the degree of freedom parameter during the search, and give rise to a flexible approach to balance exploration versus exploitation. We perform a thorough experimental study on high-dimensional benchmark functions, and provide statistical analyses that demonstrate the state-of-the-art performance of our approach when compared with existing alternatives in problems with 1 000 search variables.

Key words: covariance adaptation; estimation of distribution algorithm; random projection ensemble; t-distribution;

[1] Larrañaga P, Lozano J A. Estimation of Distribution Algorithms:A New Tool for Evolutionary Computation (2002 edition). Kluwer Academic Publishers, 2002.
[2] Yuan B, Gallagher M. On the importance of diversity maintenance in estimation of distribution algorithms. In Proc. the 2005 Genetic and Evolutionary Computation Conference, June 2005, pp.719-729.
[3] Friedman J H. An overview of predictive learning and function approximation. In From Statistics to Neural Networks:Theory and Pattern Recognition Applications, Cherkassky V, Friedman J H, Wechsler H (eds.), Springer-Verlag Berlin Heidelberg, 1994, pp.1-61.
[4] Kabán A, Bootkrajang J, Durrant R J. Toward large-scale continuous EDA:A random matrix theory perspective. Evolutionary Computation, 2016, 24(2):255-291.
[5] Dong W, Yao X. Covariance matrix repairing in Gaussian based EDAs. In Proc. the 2007 IEEE Congress on Evolutionary Computation, September 2007, pp.415-422.
[6] Paul T K, Iba H. Linear and combinatorial optimizations by estimation of distribution algorithms. In Proc. the 9th MPS Symposium on Evolutionary Computation, Jan. 2002, pp.99-106.
[7] Ros R, Hansen N. A simple modification in CMA-ES achieving linear time and space complexity. In Proc. the 10th International Conference on Parallel Problem Solving from Nature, September 2008, pp.296-305.
[8] Bosman P. On empirical memory design, faster selection of Bayesian factorizations and parameter-free Gaussian EDAs. In Proc. the 2009 Genetic and Evolutionary Computation Conference, July 2009, pp.389-396.
[9] de Bonet J S, Isbell C L, Viola P. MIMIC:Finding optima by estimating probability densities. In Proc. the 1997 International Conference on Neural Information Processing Systems, December 1996, pp.424-430.
[10] Bosman P A N, Thierens D. An algorithmic framework for density estimation based evolutionary algorithms. Technical Report, Utrecht University, 1999., June 2019.
[11] Armañanzas R, Inza I, Santana R et al. A review of estimation of distribution algorithms in bioinformatics. BioData Mining, 2008, 1:Article No. 6.
[12] Weicker K, Weicker N. On the improvement of coevolutionary optimizers by learning variable interdependencies. In Proc. the 1999 Congress on Evolutionary Computation, July 1999, pp.1627-1632.
[13] Dong W, Chen T, Tiňo P, Yao X. Scaling up estimation of distribution algorithm for continuous optimisation. IEEE Transaction of Evolutionary Computation, 2013, 17(6):797-822.
[14] Yang Z, Tang K, Yao X. Multilevel cooperative convolution for large scale optimization. In Proc. IEEE World Congress on Computational Intelligence, June 2008, pp.1663-1670.
[15] Molina D, Lozano M, Sánchez A M, Herrera F. Memetic algorithms based on local search chains for large scale continuous optimisation problems:MA-SSW-Chains. Soft Computing, 2011, 15(11):2201-2220.
[16] Shin S Y, Cho D Y, Zhang B T. Function optimization with latent variable models. In Proc. the 3rd International Symposium on Adaptive Systems, March 2001, pp.145-152.
[17] Sanyang M L, Kabán A. REMEDA:Random embedding EDA for optimising functions with intrinsic dimension. In Proc. the 14th International Conference on Parallel Problem Solving from Nature, September 2016, pp.859-868.
[18] Sanyang M L, Kabán A. Heavy tails with parameter adaptation in random projection based continuous EDA. In Proc. the 2015 IEEE Congress on Evolutionary Computation, May 2015, pp.2074-2081.
[19] Yao X, Liu Y, Lin G. Evolutionary programming made faster. IEEE Transaction on Evolutionary Computation, 1999, 3(2):82-102.
[20] Gao B, Wood I. TAM-EDA:Multivariate t distribution, archive and mutation based estimation of distribution algorithm. ANZIAM Journal, 2014, 54:720-746.
[21] Gao B. Estimation of distribution algorithms for singleand multi-objective optimization[Ph.D. Thesis]. School of Mathematics and Physics, The University of Queensland, 2014.
[22] Sanyang M L, Durrant R J, Kabán A. How effective is Cauchy-EDA in high dimensions? In Proc. the 2016 IEEE Congress on Evolutionary Computation, July 2016, pp.3409-3416.
[23] Dang D C, Lehre P K, Nguyen P T H. Level-based analysis of the univariate marginal distribution algorithm. Algorithmica, 2018, 81(2):668-702.
[24] Kabán A. On compressive ensemble induced regularisation:How close is the finite ensemble precision matrix to the infinite ensemble? In Proc. the 2017 International Conference on Algorithmic Learning Theory, October 2017, pp.617-628.
[25] Kabán A. New bounds for compressive linear least squares regression. In Proc. the 17th International Conference on Artificial Intelligence and Statistics, April 2014, pp.448-456.
[26] Achlioptas D. Database-friendly random projections:Johnson-Lindenstrauss with binary coins. Journal of Computer and System Sciences, 2003, 66(4):671-687.
[27] Grahl J, Bosman P A, Rothlauf F. The correlation-triggered adaptive variance scaling IDEA. In Proc. the 2006 Genetic and Evolutionary Computation Conference, July 2006, pp.397-404.
[28] Eiben A E, Hinterding R, Michalewicz Z. Parameter control in evolutionary algorithms. IEEE Transaction on Evolutionary Computation, 1999, 3(2):124-141.
[29] Wong Y Y, Lee K H, Leung K S, Ho C W. A novel approach in parameter adaptation and diversity maintenance for genetic algorithms. Soft Computing, 2003, 7(8):506-515.
[30] Beyer H G, Schwefel H P. Evolution strategies-A comprehensive introduction. Natural Computing, 1999, 1(1):3-52.
[31] Tang K, Li X D, Suganthan P N, Yang Z, Weise T. Benchmark functions for the CEC'2010 special session and competition on large scale global optimization. Technical report, Nature Inspired Computation and Applications Laboratory, 2012., June 2019.
[32] Abramowitz M, Stegun I A, Morse P M (eds.). Handbook of Mathematical Functions with Formulas, Graphs and Mathematical Tables. National Bureau of Standards, 1964.
[1] Einollah Pira. Using Markov Chain Based Estimation of Distribution Algorithm for Model-Based Safety Analysis of Graph Transformation [J]. Journal of Computer Science and Technology, 2021, 36(4): 839-855.
[2] Concha Bielza, Juan A. Fernández del Pozo, and Pedro Larrañaga. Parameter Control of Genetic Algorithms by Learning and Simulation of Bayesian Networks —— A Case Study for the Optimal Ordering of Tables [J]. , 2013, 28(4): 720-731.
[3] Nan Ding, Shu-De Zhou, and Zeng-Qi Sun. Histogram-Based Estimation of Distribution Algorithm: A Competent Method for Continuous Optimization [J]. , 2008, 23(1): 35-43 .
Full text



[1] Liu Mingye; Hong Enyu;. Some Covering Problems and Their Solutions in Automatic Logic Synthesis Systems[J]. , 1986, 1(2): 83 -92 .
[2] Chen Shihua;. On the Structure of (Weak) Inverses of an (Weakly) Invertible Finite Automaton[J]. , 1986, 1(3): 92 -100 .
[3] Gao Qingshi; Zhang Xiang; Yang Shufan; Chen Shuqing;. Vector Computer 757[J]. , 1986, 1(3): 1 -14 .
[4] Chen Zhaoxiong; Gao Qingshi;. A Substitution Based Model for the Implementation of PROLOG——The Design and Implementation of LPROLOG[J]. , 1986, 1(4): 17 -26 .
[5] Huang Heyan;. A Parallel Implementation Model of HPARLOG[J]. , 1986, 1(4): 27 -38 .
[6] Min Yinghua; Han Zhide;. A Built-in Test Pattern Generator[J]. , 1986, 1(4): 62 -74 .
[7] Tang Tonggao; Zhao Zhaokeng;. Stack Method in Program Semantics[J]. , 1987, 2(1): 51 -63 .
[8] Zhong Renbao; Xing Lin; Ren Zhaoyang;. An Interactive System SDI on Microcomputer[J]. , 1987, 2(1): 64 -71 .
[9] Min Yinghua;. Easy Test Generation PLAs[J]. , 1987, 2(1): 72 -80 .
[10] Zhu Hong;. Some Mathematical Properties of the Functional Programming Language FP[J]. , 1987, 2(3): 202 -216 .

ISSN 1000-9000(Print)

CN 11-2296/TP

Editorial Board
Author Guidelines
Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
  Copyright ©2015 JCST, All Rights Reserved