Top 10 data mining algorithms

This selection is the subject of a paper (1) that appeared online in December 2007 in Knowledge and Information Systems.

In their paper, Wu and co-authors describe each algorithm and give an overview of current and future research on it. Here is the (unordered) list of the 10 chosen algorithms:
  • C4.5
  • K-means
  • SVM
  • Apriori
  • EM
  • PageRank
  • AdaBoost
  • kNN
  • Naive Bayes
  • CART
If you want to know why these algorithms were chosen, have a look at the article. At first, I was surprised to see PageRank in the list. Although it is a famous algorithm, it is not usually thought of as a data mining algorithm. However, the authors show its links to social networks and therefore to data mining.
The sections on the individual algorithms were written by different authors, so the style varies a lot throughout the article. The part on SVM is structured as answers to a few specific questions, which makes it very interesting. The AdaBoost part is written in a very exciting way (if you don't know the algorithm, you will want to learn more about it). Finally, the CART part is the longest description among the 10 algorithms (a bit too long, in my opinion). Overall, this paper is a good overview of state-of-the-art algorithms in data mining.
(1) Xindong Wu, Vipin Kumar, J. Ross Quinlan, Joydeep Ghosh, Qiang Yang, Hiroshi Motoda, Geoffrey J. McLachlan, Angus Ng, Bing Liu, Philip S. Yu, Zhi-Hua Zhou, Michael Steinbach, David J. Hand and Dan Steinberg. Top 10 Algorithms in Data Mining. Knowledge and Information Systems, 14(1): 1-37, 2008.
