Optimizing Recommendations for Clustering Algorithms Using Meta-Learning

Document Type

Conference Proceeding

Date of Original Version



The field of machine learning has seen explosive growth over the past decade, largely due to advances in technology and improved implementations. As powerful as machine learning solutions can be, they still rely on human input to select the optimal algorithms and parameters. Clustering algorithms, in particular, are typically chosen by trial and error: researchers run a number of algorithms and keep whichever provides the most desirable result. This study uses a process called meta-learning to analyze datasets and extract a series of meta-features. These meta-features can then be used to intelligently recommend an optimal clustering algorithm without the cost of manually running each candidate. To accomplish this, we experiment with 135 datasets and predict their expected outcomes using only their meta-features. The outcomes being optimized are performance (accuracy) and runtime. Results are ranked separately for performance and runtime, and we determine how accurately the learning model chooses the optimal algorithm for each objective. With respect to runtime, we are able to predict the top-performing algorithm 71.1% of the time, one of the top two algorithms 89.6% of the time, and one of the top three 93.3% of the time. With respect to performance, the predicted algorithm is among the top two 50.4% of the time and among the top three 63.7% of the time.
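The top-one, top-two, and top-three rates reported above can be read as top-k hit rates: the fraction of datasets for which the recommended algorithm appears among the k best algorithms in the observed ranking. The following is a minimal sketch of that calculation, not the authors' evaluation code. The function name `top_k_hit_rate`, the algorithm names, and the toy rankings are all hypothetical and chosen only for illustration.

```python
def top_k_hit_rate(predicted, actual_rankings, k):
    """Return the fraction of datasets whose predicted algorithm
    is among the k best algorithms in the observed ranking."""
    hits = sum(
        1
        for pred, ranking in zip(predicted, actual_rankings)
        if pred in ranking[:k]
    )
    return hits / len(predicted)


# Hypothetical toy data (not from the study): one recommendation per dataset,
# plus the observed ranking of algorithms for that dataset, best first.
predicted = ["kmeans", "dbscan", "agglomerative", "kmeans"]
actual_rankings = [
    ["kmeans", "dbscan", "agglomerative"],
    ["kmeans", "dbscan", "agglomerative"],
    ["kmeans", "dbscan", "agglomerative"],
    ["dbscan", "agglomerative", "kmeans"],
]

for k in (1, 2, 3):
    print(f"top-{k} hit rate: {top_k_hit_rate(predicted, actual_rankings, k):.2f}")
```

The same function would be applied once to the runtime rankings and once to the performance rankings, since the two objectives are ranked separately.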

Publication Title

Proceedings of the International Joint Conference on Neural Networks