Algorithms for predicting a dichotomy in a dataset have always fascinated data scientists, and this remains one of the active research areas in machine learning. Classification algorithms are best suited to this kind of YES/NO identification. Some of the arrows in the quiver are Logistic Regression, Bayesian Inference, Neural Networks, Support Vector Machines and even Decision Trees.

How can we select the best among these algorithms? The simple answer is: the algorithm that works well with the data. Here the definition of “well” is complex and, in most situations, dynamic. A good selection weighs the time and space complexity of each algorithm against the bias and variance it shows on the data (both the training sample and the data to be scored).

The computational cost of a non-linear SVM with a polynomial kernel is higher than that of a Decision Tree, Logistic Regression, or a Perceptron trained with back propagation. On the other hand, a trained SVM only has to keep its support vectors, so its space requirement is usually lower than that of competing classification techniques. The bias-variance tradeoff usually requires a causal-correlation analysis of the data, and most of the time Neural Networks perform well compared to other algorithms. But any perceptron-based classification strategy (the same is true of regression and classification techniques other than support-vector-based ones) has a space complexity of N, which may lead to higher bias. An analytics platform built on algorithms with higher space complexity also needs more memory, which is still an open computing question for Big Data.

An empirical comparison of supervised learning algorithms can be found here: http://www.cs.cornell.edu/~caruana/ctp/ct.papers/caruana.icml06.pdf. The IEEE International Conference on Data Mining (ICDM) identified C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naïve Bayes and CART as the top 10 DM/ML algorithms (2006, http://www.cs.uvm.edu/~icdm/algorithms/index.shtml).
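To make the selection question a little more concrete, here is a minimal sketch in plain Python. It is not a benchmark. It assumes a synthetic, well-separated two-class dataset and two toy models chosen only for illustration (a single-layer perceptron and a 3-nearest-neighbour classifier); names such as `make_data`, `cross_validate` and `stored_numbers` are hypothetical helpers, not part of any library. Each model is scored by 5-fold cross-validated accuracy, and the count of numbers the fitted model has to keep serves as a crude stand-in for space complexity.

```python
import random


def make_data(n, seed=0):
    """Two Gaussian blobs labelled 0 and 1 (synthetic, for illustration only)."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        centre = 3.0 if label else -3.0
        data.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), label))
    rng.shuffle(data)
    return data


class Perceptron:
    """Linear classifier: keeps only a weight vector and a bias."""

    def fit(self, rows, epochs=20):
        self.w, self.b = [0.0, 0.0], 0.0
        for _ in range(epochs):
            for x, y in rows:
                err = y - self.predict(x)  # -1, 0 or +1
                self.w = [wi + err * xi for wi, xi in zip(self.w, x)]
                self.b += err
        return self

    def predict(self, x):
        score = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if score > 0 else 0

    def stored_numbers(self):
        return len(self.w) + 1


class KNN:
    """k-nearest neighbours: keeps every training row."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, rows):
        self.rows = list(rows)
        return self

    def predict(self, x):
        def sq_dist(row):
            return sum((a - b) ** 2 for a, b in zip(row[0], x))

        nearest = sorted(self.rows, key=sq_dist)[: self.k]
        votes = sum(y for _, y in nearest)
        return 1 if 2 * votes > self.k else 0

    def stored_numbers(self):
        # Each stored row holds its features plus one label.
        return sum(len(x) + 1 for x, _ in self.rows)


def cross_validate(make_model, data, folds=5):
    """Return (mean held-out accuracy, numbers stored by the last fitted model)."""
    scores = []
    for f in range(folds):
        test = data[f::folds]
        train = [row for i, row in enumerate(data) if i % folds != f]
        model = make_model().fit(train)
        correct = sum(model.predict(x) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / folds, model.stored_numbers()


data = make_data(200)
candidates = [("perceptron", Perceptron), ("kNN", lambda: KNN(3))]
results = {name: cross_validate(factory, data) for name, factory in candidates}
for name, (accuracy, size) in results.items():
    print(f"{name}: accuracy={accuracy:.3f}, stored numbers={size}")
```

On a dataset this easy both models should score close to 1.0, so accuracy alone does not separate them. The difference shows up in memory: the perceptron keeps 3 numbers, while kNN keeps 480 (160 training rows per fold, 3 values each). On real data the accuracy column would matter as well, and the choice would come down to weighing the two.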
Opinions differ from person to person: statisticians tend to favor algorithms rooted in regression and probability theory, such as Logistic Regression and Naïve Bayes, whereas computer scientists lean towards SVM and Neural Networks.
In my next post I’ll share some tips on choosing algorithms (I am more of a computer engineer than a statistician).