Title: An Ensemble of Classifiers for coping with Recurring Contexts in Data Streams
Author(s): I. Katakis, G. Tsoumakas, I. Vlahavas.
Keywords: text classification, ensemble, data stream, concept drift, classification.
Appeared in: 18th European Conference on Artificial Intelligence, IOS Press, pp. 763-764, Patras, Greece, 2008.
Abstract: This paper proposes a general framework for classifying data streams by exploiting incremental clustering in order to dynamically build and update an ensemble of incremental classifiers. To achieve this, a transformation function that maps batches of examples into a new conceptual feature space is proposed. An incremental clustering algorithm is then applied in order to group different concepts and identify reoccurring themes. The ensemble is produced by creating and training an incremental classifier for every concept discovered in the data stream. An experimental study is performed using three new real-world datasets from the text domain, a basic implementation of the proposed framework and three baseline methods for dealing with drifting concepts. Results are promising and encourage further investigation.
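The pipeline outlined in the abstract (map each batch into a conceptual space, cluster the resulting vectors incrementally, keep one incremental classifier per cluster) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the transformation function, the clustering algorithm, or the base classifier, so the per-class feature means, the distance-threshold clustering, and the nearest-centroid classifier below are all assumptions chosen for brevity, and every name (`conceptual_vector`, `ConceptEnsemble`, `threshold`) is hypothetical.

```python
import math
from collections import defaultdict


def conceptual_vector(batch, n_features, classes):
    """Map a batch of (features, label) pairs to a single vector.

    Assumed transformation: the mean of every feature, computed per class
    and concatenated in the order given by `classes`.
    """
    sums = {c: [0.0] * n_features for c in classes}
    counts = {c: 0 for c in classes}
    for x, y in batch:
        counts[y] += 1
        for i, v in enumerate(x):
            sums[y][i] += v
    vec = []
    for c in classes:
        n = max(counts[c], 1)  # avoid division by zero for absent classes
        vec.extend(s / n for s in sums[c])
    return vec


class NearestCentroid:
    """Toy incremental classifier (an assumption, not the paper's choice)."""

    def __init__(self):
        self.sums = {}
        self.counts = defaultdict(int)

    def learn(self, x, y):
        if y not in self.sums:
            self.sums[y] = [0.0] * len(x)
        self.sums[y] = [s + v for s, v in zip(self.sums[y], x)]
        self.counts[y] += 1

    def predict(self, x):
        if not self.sums:
            return None

        def dist(y):
            centroid = [s / self.counts[y] for s in self.sums[y]]
            return math.dist(centroid, x)

        return min(self.sums, key=dist)


class ConceptEnsemble:
    """One incremental classifier per cluster of conceptual vectors."""

    def __init__(self, n_features, classes, threshold):
        self.n_features = n_features
        self.classes = classes
        self.threshold = threshold  # assumed: max distance to join a cluster
        self.centroids = []  # cluster centres in the conceptual space
        self.sizes = []      # number of batches assigned to each cluster
        self.models = []     # one classifier per cluster
        self.active = None   # cluster matched by the most recent batch

    def predict(self, x):
        if self.active is None:
            return None
        return self.models[self.active].predict(x)

    def update(self, batch):
        """Assign a labeled batch to a concept and train that concept's model."""
        z = conceptual_vector(batch, self.n_features, self.classes)
        best, best_d = None, float("inf")
        for k, c in enumerate(self.centroids):
            d = math.dist(c, z)
            if d < best_d:
                best, best_d = k, d
        if best is None or best_d > self.threshold:
            # No sufficiently close cluster: treat the batch as a new concept.
            self.centroids.append(list(z))
            self.sizes.append(1)
            self.models.append(NearestCentroid())
            best = len(self.centroids) - 1
        else:
            # Recurring concept: move the cluster centre towards the batch.
            n = self.sizes[best] + 1
            self.centroids[best] = [
                c + (v - c) / n for c, v in zip(self.centroids[best], z)
            ]
            self.sizes[best] = n
        for x, y in batch:
            self.models[best].learn(x, y)
        self.active = best
        return best
```

In this sketch, a batch whose conceptual vector lies within `threshold` of an existing cluster centre reuses and further trains that cluster's classifier, which is how a recurring concept would be recognised; otherwise a new classifier is created. Predictions for incoming examples come from the classifier of the most recently matched concept, which is one possible policy among several.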

This paper has been cited by the following:

1 Žliobaitė I., "Learning under Concept Drift: an Overview", Vilnius University, Faculty of Mathematics and Informatics, Technical Report, 2009.
2 Kosina, P., "Stream Mining and Meta-Learning", Diploma Thesis, Masaryk University, Faculty of Informatics, 2009.
3 Gama, J., Kosina, P. (2009) Tracking Recurring Concepts with Meta-learners, Proc. 14th Portuguese Conference on Artificial Intelligence, EPIA 2009, Aveiro, Portugal, October 12-15, 2009, pp. 423-434.
4 Giannikopoulos, P., Varlamis, I., Eirinaki, M. (2009) Mining Frequent Generalized Patterns for Web Personalization in the presence of Taxonomies, in International Journal of Data Warehousing and Mining, Vol. 6, No. 1, pp. 4-15, August 2009.
5 Žliobaitė, I. and Krilavičius, T. CLAN: Clustering for Credit Risk Assessment, Technical Report, Vilnius University, Department of Informatics, 2009.
6 Kantardzic, M., Ryu, J.W., Walgampaya, C. (2010) Building a new classifier in an ensemble using streaming unlabeled data. In Proceedings of the 23rd international conference on Industrial engineering and other applications of applied intelligent systems - Volume Part II (IEA/AIE'10), Nicolas Garcia Pedrajas, Francisco Herrera, Jose Benitez, Colin Fyfe, and Moonis Ali (Eds.), Trends in Applied Intelligent Systems, Lecture Notes In Computer Science, Vol. Part II. Springer-Verlag, 6097, Berlin, Heidelberg, 77-86.
7 Gama, J., Kosina, P. (2010) "Recurring Concepts and Meta-learners", 3rd Planning to Learn Workshop (PlanLearn) at ECAI 2010, Pavel Brazdil, Abraham Bernstein, Jorg-Uwe Kietz (eds) pp. 79-83, August 17, 2010, Lisbon, Portugal.
8 Ashok Venkatesan, Narayanan C. Krishnan, Sethuraman Panchanathan (2010), "Cost-sensitive Boosting for Concept Drift", Proceedings of the First International Workshop on Handling Concept Drift in Adaptive Information Systems: Importance, Challenges and Solutions, HaCDAIS 2010, pp. 41-47, Held in conjunction with ECML/PKDD 2010, September 24, 2010, Barcelona, Spain.
9 Žliobaitė I., "Adaptive Training Set Formation", PhD Thesis, Vilnius University, 2010.
10 J.W. Ryu, M. Kantardzic, and C. Walgampaya, "Ensemble Classifier based on Misclassified Streaming Data", in Proceedings of The Tenth IASTED International Conference on Artificial Intelligence and Applications (AIA 2010), February 15-17, 2010, Innsbruck, Austria.
11 Sarah N. Kohail (2011) Learning Concept Drift Using Adaptive Training Set Formation Strategy, MSc Thesis, Faculty of Information Technology, The Islamic University of Gaza, October 2011.
12 Sobolewski, P., Woźniak, M. (2011) Artificial recurrence for classification of streaming data with concept shift, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 6943 LNAI, pp. 76-87.
13 Hoens, T.R., Chawla, N.V., Polikar, R. (2011) Heuristic Updatable Weighted Random Subspaces for non-stationary environments, Proceedings - IEEE International Conference on Data Mining, ICDM, art. no. 6137228, pp. 241-250.
14 Sobolewski, P., Woźniak, M. (2012) Data with shifting concept classification using simulated recurrence. In Proceedings of the 4th Asian conference on Intelligent Information and Database Systems - Volume Part I (ACIIDS'12), Jeng-Shyang Pan, Shyi-Ming Chen, and Ngoc Thanh Nguyen (Eds.), Vol. Part I. Springer-Verlag, Berlin, Heidelberg, 403-412.