Metaheuristics: Local Methods
The term heuristic in the expression heuristic optimization originates from ancient Greek, where heuriskein (εὑρίσκειν) means to discover or to learn. There is an even more subtle meaning of this word, as revealed by the following example: assume that the readers of this book are challenged to find and mark in the text all positions of the words metaheuristic or metaheuristics. Each of us has a specific strategy to meet the challenge. Nevertheless, in general, there are two categories of readers: readers who systematically analyze the text, trying not to miss any occurrence, and readers who scan the text diagonally, relying on their visual capacity for pattern recognition, without actually looking at every word. We say that the first category of readers performs an exhaustive search (like a computer), while the second performs a heuristic search (like a living entity, not necessarily consciously). Obviously, the search duration of the first type of readers is usually longer than that of the second. It is also very likely that the first type of readers will find more occurrences, while the second type could miss quite a few. Nevertheless, when comparing the performance of the two categories, the number of occurrences found by the first category proves surprisingly close to the number marked by the second, despite the big difference between the search durations.
Chapters 1 and 2 of this book are devoted to the description of a collection of optimization methods that simulate the approach of the second type of readers. Such methods are inspired either by the behavior of biological entities/systems or by the evolution of some natural phenomena. Next, the discussion focuses on a special class of optimization problems in engineering, more specifically on the class of granular optimization. This concept is explained in the following.
The optimization problem of concern is formulated in the context of a cost function (or criterion), as defined below:

f : S → R,  x ↦ f(x),    [1.1]
where the search space S is usually a bounded subset of R^(n_x). (Sometimes, f is referred to as fitness.) What makes the criterion f so special? On the one hand, it has a fractal-like variation, with many ruptures (and thus with many local extremes). On the other hand, it is quite difficult (if not impossible) to evaluate its derivatives in closed form (provided that they exist). In terms of regularity, the criterion f is locally continuous or differentiable, but this property does not extend to the global variation. (The criterion could be globally smooth, but this very seldom happens.) However, the criterion f can be evaluated at any point x of the search space S, according to some a priori known mathematical expression. Thus, in order to solve the optimization problem:

x^opt = arg opt_{x ∈ S} f(x),    [1.2]
which means to find the global optimum of f and the corresponding optimum point x^opt, it is only possible to compare criterion values at various points of the search space. As the search has to end after a reasonable duration, only a finite discrete subset of S, say D, can be used for this purpose. Since we aim not to miss the global optimum, the subset D has to include a very large number of points to test.
Problem [1.2] is then relaxed, being replaced by the following problem:

x^opt = arg opt_{x ∈ D} f(x),  with D ⊂ S finite,    [1.3]
where, as already mentioned, D is a finite discrete subset. Due to their large number, the test points of D are referred to as granules or grains. Consequently, [1.3] becomes a granular optimization problem. Solving problem [1.3] means, in this context, finding the grain of D located as close as possible to the global optimum point of f.
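As a minimal sketch of problem [1.3], the granular search below evaluates a criterion on a finite grid of grains and keeps the best one. The criterion f and the grid bounds are illustrative assumptions, not taken from the text; they merely mimic a rugged function with many local minima.

```python
import math

# Illustrative criterion (an assumption, not from the text): a smooth
# bowl plus a fast oscillation, which creates many local minima.
def f(x):
    return x * x + 0.5 * math.sin(25.0 * x)

# D: a finite discrete subset of "grains" covering S = [-2, 2].
n = 4001
D = [-2.0 + 4.0 * k / (n - 1) for k in range(n)]

# Granular search: compare criterion values at the grains and keep the
# grain whose value is best (smallest, for a minimization problem).
best = min(D, key=f)
```

Note that this brute-force variant still visits every grain; the methods discussed next aim precisely at avoiding such an exhaustive pass over D.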
The following example can illustrate the principle of granulation. Assume that the following adage has to be translated: Εν αρχέος ο αριθμός. (/En arheos o aritmos./) It belongs to the famous mathematician and physicist Archimedes, and we want the closest translation. Obviously, the criterion here is the map between ancient Greek and a modern language, say English. The search space S is a dictionary with many words (a few tens of thousands), granular by nature (a grain being associated here with a word). In order to perform the translation, we may first want to isolate the subdictionary D including all words of S that begin with α (/a/), ε (/e/) and ο (/o/). Next, an appropriate search strategy has to be set, as checking all words of D (still very large) is unthinkable. By adopting a heuristic-like strategy, the subdictionary size can be reduced as soon as the next letters composing the words to translate are taken into account. Finally, D only includes the words to translate and we thus obtain a first, but coarse, translation: in, ancient/antique, is/was, number. However, a refinement is necessary so that the proper meaning of the adage is found. A second optimization problem can thus be stated, this time by accounting for all synonyms of the already found words. We can even add all usual expressions containing those words and synonyms. Since the size of the restricted subdictionary remains small, we can now adopt exhaustive search as a suitable strategy. We then reach the translation closest to the spirit of the adage: the number was in the beginning.
The methods for solving granular optimization problems should lead to numerical computer algorithms; otherwise, they are not really useful in engineering. The strategies adopted in the previous example can easily be followed by a human being, but the computer needs programming in order to perform similar reasoning. Thus, first, the words of the dictionary S are recorded in memory locations addressed by the American Standard Code for Information Interchange (ASCII) codes of their letters (S thus becomes an electronic dictionary). In this manner, the exhaustive search is avoided (there is no need to parse all dictionary addresses until the word is found). Second, an expert system of usual expressions in the modern language has to be designed and implemented. Here, again, the memory addresses have to be generated in such a way that the exhaustive search can be avoided (if possible). Third, in order to increase the search speed, a statistical database pointing to the most frequently used words can be built into the dictionary. Modern automatic translators are based on heuristic strategies of this kind.
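The letter-based addressing scheme described above behaves like a hash table: the characters of a word determine the memory address of its entry directly, so no exhaustive scan of the dictionary is needed. A minimal sketch under that reading follows; the miniature dictionary contents are hypothetical, with the glosses taken from the coarse translation given earlier.

```python
# Hypothetical mini-dictionary: transliterated words mapped to English
# glosses. A Python dict is a hash table, so each entry's location is
# derived from the word's characters; lookup therefore costs O(1) on
# average, instead of the O(|S|) scan an exhaustive search would need.
dictionary = {
    "en": "in",
    "arheos": "ancient/antique",
    "o": "is/was",
    "aritmos": "number",
}

def translate(word):
    # Directly addressed access; unknown words fall back to a marker.
    return dictionary.get(word, "<unknown>")

# Coarse, word-by-word translation of the adage.
coarse = [translate(w) for w in ["en", "arheos", "o", "aritmos"]]
```

The refinement stage sketched in the text (synonyms, usual expressions) would then operate exhaustively, but only on this small result set.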
Concerning problem [1.3], the main goal is to design solving methods that avoid the exhaustive search or, at least, that adopt this strategy only as a final stage, on a restricted search subset. Moreover, such methods should lead to numerical algorithms that can be implemented on a computer. Obviously, by avoiding the exhaustive search, the global optimum could be missed but, in turn, there is hope that the search speed increases significantly, while the accuracy remains good enough. Preference is given to methods allowing the user to control the trade-off between the search speed and the accuracy of the optimal solution, in spite of their higher complexity (some of these methods are described in this chapter). For the other (usually less complex) methods, it is up to the user whether to select one of them in applications.
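One simple way to realize this trade-off, sketched below under the same illustrative criterion as before (an assumption, not the text's method), is a two-stage search: a coarse granular pass over the whole space, followed by an exhaustive pass restricted to a neighborhood of the coarse winner. The grain counts are the user's speed/accuracy knobs.

```python
import math

# Illustrative rugged criterion (an assumption, not from the text).
def f(x):
    return x * x + 0.5 * math.sin(25.0 * x)

def grid(a, b, n):
    # n equally spaced grains covering the interval [a, b].
    return [a + (b - a) * k / (n - 1) for k in range(n)]

def two_stage_search(a, b, n_coarse=81, n_fine=401, radius=0.1):
    # Stage 1: coarse granular search over the whole search space.
    x0 = min(grid(a, b, n_coarse), key=f)
    # Stage 2: exhaustive search on a restricted subset around x0.
    # n_coarse and n_fine control the speed/accuracy trade-off.
    return min(grid(x0 - radius, x0 + radius, n_fine), key=f)

x_hat = two_stage_search(-2.0, 2.0)
```

With n_coarse + n_fine evaluations instead of a single dense grid, the search is much faster, at the (controlled) risk of the coarse stage missing the basin of the global optimum.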
The heuristic methods that can be implemented on a computer are referred to as metaheuristics. They rely on the following basic principle: the search for the optimum actually simulates either the behavior of a biological system or the evolution of a natural phenomenon that includes an intrinsic optimality mechanism. For this reason, a new branch of optimization has developed over the past 20 years, namely Evolutionary Programming. All numerical algorithms designed from metaheuristics are included in this class of optimization techniques.
In general, all metaheuristics use a pseudo-random engine to select some parameters or operations that yield an estimate of the optimal solution. Therefore, the procedures that generate pseudo-random (numerical) sequences (PRSs) are crucial in metaheuristics design. When speaking about pseudo-random numbers, their probability density should be specified a priori. In the case of metaheuristics, two types of probability densities are generally considered: uniform and normal (Gaussian). Thus, the following types of generators are useful:
1) uniformly distributed pseudo-random sequence generators (U-PRSGs);
2) prescribed probability distribution pseudo-random sequence generators (P-PRSGs).
In Appendices 1 and 2 of this book, algorithms of both generator types are described. They are simple and effective. Sometimes (although rather seldom), more complex and sophisticated algorithms are preferred in applications.
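The appendix algorithms are not reproduced here; as an illustrative sketch only, a U-PRSG can be built from a linear congruential generator (the well-known "minimal standard" parameters are an assumption, not necessarily the appendix's choice), and a P-PRSG for the normal density can be obtained from it via the Box-Muller transform.

```python
import math

# Sketch of a U-PRSG: a linear congruential generator with the
# "minimal standard" parameters a = 16807, m = 2^31 - 1.
class UPRSG:
    def __init__(self, seed=12345):
        self.state = seed  # any nonzero seed in [1, m-1]

    def next(self):
        # x_{k+1} = a * x_k mod m, scaled to the open interval (0, 1).
        self.state = (16807 * self.state) % 2147483647
        return self.state / 2147483647

# Sketch of a P-PRSG for the normal (Gaussian) density: the Box-Muller
# transform maps two independent uniform samples to a Gaussian sample.
def gaussian(u, mean=0.0, std=1.0):
    u1, u2 = u.next(), u.next()
    return mean + std * math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)

gen = UPRSG(seed=42)
samples = [gen.next() for _ in range(10000)]  # uniform on (0, 1)
g = gaussian(gen)                             # one Gaussian sample
```

In practice the pseudo-random engine is the only stochastic component of a metaheuristic, so its statistical quality directly conditions the quality of the optimum estimates.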
Unfortunately, the use of pseudo-random numbers in conjunction with metaheuristics makes it impossible to formulate any general and sound result related to convergence. The only way to deal with convergence is to state some conjectures for those metaheuristics that are seemingly successful in most applications. If the natural optimality mechanism...