In this instance of the competition, we continue the successful 2013 and 2015 competitions and rely on an unchanged problem set and similar measures. However, we try out two new aspects regarding the evaluation criteria of the participants:
The aim of the competition is to provide a common platform that encourages fair and easy comparisons across different niching algorithms. The competition allows participants to run their own niching algorithms on 20 benchmark multimodal functions with different characteristics and levels of difficulty. Researchers are welcome to evaluate their niching algorithms using this benchmark suite, and report the results by submitting a paper to the main tracks of GECCO (i.e., submitting via the online submission system of GECCO'2016).
The GECCO 2016 competition on Niching Methods for Multimodal Optimization largely follows the procedures of the 2013/2015 CEC niching competitions. The set of problems is the same, and the interface is not changed. However, we modify the submission report format (the files that contain the best solutions found within a run) slightly by adding some new information. The participants have to follow the format described in the SUBMISSION section bellow.
The new formats includes information for the final solution set found by the algorithm for each execution run. It includes the search space coordinates of a reported solution, its fitness value and information on the resources used until this solution has been found (number of used fitness evaluations, and the time). Time measurements are not essential but rather needed for checking that the generated order is correct. It may also be employed for rough comparison of the overall time (wall time) the algorithms need for generating a solution set, but this is of no importance for this issue of the competition.
The other important change concerns the performance criteria. As before, we fix the maximum amount of evaluations per run and problem and count the number of detected peaks as our first measure. This is also known as recall in information retrieval, or sensitivity in binary classification. Recall is defined as the number of successfully detected items (out of the set of relevant items), divided by the number of relevant items.
Additionally, we will also look at 2 more measures, namely the static F1 measure (after the budget has been used up), and the (dynamic) F1 measure integral. The F1 measure is usually understood as the product of precision and recall. Precision is meant as the fraction of relevant detected items compared to the overall number of detected items, in our case this is the number of detected peaks divided by the number of solutions in the report ("best") file. That is, the higher the number of duplicates or non-optimal solutions in the report, the worse the computed performance indicator gets. Ideally, the report file shall provide all sought peaks and only these, which would result in an F1 value of 1 (as both recall and precision would be 1).
The static F1 measure uses the whole report file and computes the F1 measures of complete runs, and these are then averaged over the number of runs and problems. The 3rd measure is more fine-grained and also looks at the time (in function evaluation counts) when solutions are detected. Therefore, track the report file line by line and compute the F1 value for each point in time when a new solution (line) is written, using all the information that is available up to that point. Thus we compute the F1 value "up to that time", and doing that for each step results in a curve. The F1 measure integral is the area under-the-curve (AUC), divided by the maximum number of function evaluations allowed for that specific problem. It is therefore also bounded by 1.
In the case if you cannot attend GECCO'16, we encourage you to submit a short report about your algorithm and results. You can submit the report directly to the special session organizer Michael Epitropakis, in order to be counted in the competition. More specifically, the authors are expected to submit the solutions found in this report, in order for us to validate the results. Please follow the format described in the SUBMISSION section bellow.
Test suite for the competition as well as the performance measures are implemented in Matlab, python, Java, and C/C++, and available for download here or from GitHub. Please refer to the following technical report for more information about these benchmark functions:
Reference: X. Li, A. Engelbrecht, and M.G. Epitropakis, ``Benchmark Functions for CEC'2013 Special Session and Competition on Niching Methods for Multimodal Function Optimization'', Technical Report, Evolutionary Computation and Machine Learning Group, RMIT University, Australia, 2013 [.bib].
This competition is supported by the newly established IEEE CIS Task force on Multi-modal Optimization
Each participant can submit more that one entries (algorithms). For each entry (algorithm) the participant has to provide a compressed folder, named preferably as the algorithm's name (eg DEnrand.zip), with the following ASCII text files. For each benchmark problem (20 benchmarks) and for each independent execution run (50 runs), the participant has to save the final solution set of the execution in a separate text file named for example: for benchmark 1, execution run 1 the filename should be: problem001run001.dat. A total of 50 x 20 = 1000 ASCII text files have to be submitted (compressed as stated above). Each text file should contain the final solution set that will be submitted to the organizers. Manual pre/post processing of the files is not allowed.
The basic format of each ASCII text file is:
Solution stands for the search space coordinates of a reported solution (tab or space separated), then we use the equal sign to mark that now the fitness values (only one in our case) follow. The "@" ends this block, and the last two columns stand for the number of used fitness evaluations until this solution has been found, and the time in milliseconds. As mentioned above, the latter is not essential but rather needed for checking that the generated order is correct. It may also be employed for rough comparison of the overall time (wall time) the algorithms need for generating a solution set, but this is of no importance for this issue of the competition.
In this version of the competition we will adopt three ranking procedures to explore different aspects of the submitted algorithms:
Final Ranking based on all three scenarios.
The final ranking for the competition is calculated as the ranking based on the average score across all three previous mentioned scenarios.
Analysis on all scenarios can be found in the competition's presentation which can be downloaded from here.
Data Science Institute,
Department of Management Science,
Lancaster LA1 4YX, UK.
ERCIS, WWU Muenster,
48149 Münster, Germany
South African Research Chair in Artificial Intelligence,
Department of Computer Science,
School of Information Technology,
University of Pretoria,
Pretoria 0002, South Africa.