Title: Distributed Data Mining of Large Classifier Ensembles
Author(s): G. Tsoumakas, I. Vlahavas.
Availability: PDF (Acrobat Reader) file, 6 pages.
Appeared in: Proc. (companion volume) 2nd Hellenic Conference on AI (SETN '02), I. Vlahavas, C. Spyropoulos (Eds.), pp. 249-255, Thessaloniki, Greece, 2002.
Abstract: Classifier ensembles are now often used for distributed data mining, both to discover knowledge from inherently distributed information sources and to scale up learning to very large databases. One of the most successful methods for combining multiple classifiers is Stacking. However, this method suffers from very high computational cost when the number of distributed nodes is large. This paper presents a new classifier combination strategy that scales up efficiently and achieves both high predictive accuracy and tractability on problems of high complexity. It induces a global model by learning from the averages of the local classifiers' outputs. In this way, a large number of classifiers can be combined quickly and effectively.
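The idea in the abstract, learning a global model from the averaged outputs of local classifiers, can be illustrated with a small sketch. This is not the authors' implementation: the toy `CentroidClassifier`, the synthetic two-class data, and the helper names `average_outputs` and `make_data` are all assumptions made for illustration, and the paper's actual base learners and meta-learner may differ.

```python
import math
import random


class CentroidClassifier:
    """Toy soft classifier (hypothetical stand-in for any local learner)."""

    def fit(self, X, y):
        self.labels = sorted(set(y))
        self.centroids = {}
        for label in self.labels:
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_proba(self, x):
        # Normalised exp(-distance) to each class centroid.
        scores = [math.exp(-math.dist(x, self.centroids[l])) for l in self.labels]
        total = sum(scores)
        return [s / total for s in scores]

    def predict(self, x):
        probs = self.predict_proba(x)
        return self.labels[probs.index(max(probs))]


def average_outputs(local_models, x):
    """Average the class-probability vectors of all local classifiers for x."""
    outputs = [m.predict_proba(x) for m in local_models]
    return [sum(col) / len(outputs) for col in zip(*outputs)]


def make_data(rng, n_per_class):
    """Two Gaussian blobs in 2-D, labelled 0 and 1 (synthetic, illustrative only)."""
    X, y = [], []
    for label, centre in ((0, (0.0, 0.0)), (1, (3.0, 3.0))):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0)])
            y.append(label)
    return X, y


rng = random.Random(0)

# Each "node" trains a local classifier on its own data partition.
n_nodes = 10
local_models = [CentroidClassifier().fit(*make_data(rng, 20)) for _ in range(n_nodes)]

# Meta-level training set: averaged local outputs on separate data.
X_meta, y_meta = make_data(rng, 30)
Z_meta = [average_outputs(local_models, x) for x in X_meta]
global_model = CentroidClassifier().fit(Z_meta, y_meta)

# Evaluate the global model on fresh data, again via averaged local outputs.
X_test, y_test = make_data(rng, 50)
Z_test = [average_outputs(local_models, x) for x in X_test]
correct = sum(global_model.predict(z) == t for z, t in zip(Z_test, y_test))
accuracy = correct / len(y_test)
print(f"meta-level input size: {len(Z_meta[0])}, accuracy: {accuracy:.2f}")
```

In this sketch the meta-level input has one value per class, however many nodes take part. A Stacking-style combiner would typically concatenate the local outputs instead, so its input would grow with the number of nodes (here `n_nodes` times the number of classes), which is the scaling cost the abstract refers to.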

This paper has been cited by the following:

1 Yousry El-Gamal, Osama Badawy, Ashraf Al-Jerjawy. "Combining Multiple Classifiers System (CMC System) for Mining Distributed Databases using Meta-Learning Approaches", Proc. 5th International Conference on Information and Computer Science (ICICS 2004), 2004. Available at: http://faculty.kfupm.edu.sa/COE/sadiq/proceedings/ICICS2004/toc.htm
2 Yanyan Wei, Taoshen Li. "A Researches on Learning Mechanism Based on Stacking Framework", Journal of Guangxi Academy of Sciences, Vol.20, No.4, pp.231-233, 2004.
3 Mohamed Aounallah and Guy Mineau. "Rule Confidence Produced From Disjoint Databases: a Statistically Sound Way to Regroup Rules Sets". In IADIS International Conference, Applied Computing 2004, pp. II-27 - II-32, Lisbon, Portugal, March 23-26, 2004.
4 D. Caragea. "Learning classifiers from distributed, semantically heterogeneous, autonomous data sources". PhD Thesis, Iowa State University, USA, 2004.
5 Mohamed Aounallah and Guy Mineau. "Le forage distribué des données: une méthode simple, rapide et efficace". Revue des Nouvelles Technologies de l'Information, Extraction et Gestion de Connaissances, 1(E-6): 95-106, 2006.
6 S. Mitchell. "Machine assistance in collection building: New tools, research, issues, and reflections". Information Technology and Libraries, 25(4), pp. 190-216, 2006.
7 Yanyan Wei, Taoshen Li. "A New Approach of Stacking Based on Voting". Computer Engineering, Vol.32, No.7, pp.199-201, 2006.
8 Yanyan Wei, Taoshen Li. "A Meta-learning Strategy Based on Stacking Framework". Journal of Communication and Computer 3(4), ISSN1548-7709, 2006.
9 Mohamed Aounallah. "Le forage distribué des données: une approche basée sur l'agrégation et le raffinement de modèles". PhD Thesis, University of Laval, Canada, 2006.
10 Mohamed Aounallah and Guy Mineau. "Distributed Data Mining: Why Do More Than Aggregating Models", Proc. International Joint Conference on Artificial Intelligence (IJCAI 07), pp. 2645-2650, Hyderabad, India, January 6-12, 2007.