
Title: Clustering Classifiers for Knowledge Discovery from Physically Distributed Databases
Author(s): G. Tsoumakas, L. Angelis, I. Vlahavas.
Availability: PDF file (20 pages).
Keywords: Multi DBs, Knowledge discovery, Machine learning.
Appeared in: Data and Knowledge Engineering, Elsevier, 49(3), pp. 223-242, 2004.
Abstract: Most distributed classification approaches view data distribution as a technical issue and combine local models aiming at a single global model. This, however, is unsuitable for inherently distributed databases, which are often described by more than one classification model, and these models might differ conceptually. In this paper we present an approach for clustering distributed classifiers in order to discover groups of similar classifiers and thus similar databases with respect to a specific classification task. We also show that clustering distributed classifiers as a pre-processing step for classifier combination enhances the achieved predictive performance of the ensemble.
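Illustration: the idea of grouping similar classifiers can be sketched as follows. This is a minimal sketch under stated assumptions, not necessarily the procedure used in the paper: it assumes each local classifier has labelled the same shared evaluation set, takes the pairwise disagreement rate as the distance between two classifiers, and merges clusters by average linkage until a hypothetical `threshold` is exceeded. The function names are illustrative only.

```python
from itertools import combinations


def disagreement(preds_a, preds_b):
    """Fraction of instances on which two prediction lists differ."""
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)


def cluster_classifiers(predictions, threshold):
    """Average-linkage agglomerative clustering of classifiers.

    predictions: one list of predicted labels per classifier, all made on
    the same shared evaluation set (an assumption of this sketch).
    threshold: merging stops once the closest pair of clusters is farther
    apart than this value (a hypothetical parameter).
    """
    n = len(predictions)
    # Pairwise distances between individual classifiers, keyed by (i, j), i < j.
    dist = {(i, j): disagreement(predictions[i], predictions[j])
            for i, j in combinations(range(n), 2)}
    clusters = [[i] for i in range(n)]

    def linkage(c1, c2):
        # Mean distance over all cross-cluster classifier pairs.
        pairs = [(min(i, j), max(i, j)) for i in c1 for j in c2]
        return sum(dist[p] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Find the closest pair of clusters (positions a < b).
        (a, b), d = min(
            (((a, b), linkage(clusters[a], clusters[b]))
             for a, b in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1])
        if d > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Four classifiers evaluated on four shared instances (made-up labels).
example_predictions = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 1],
]
groups = cluster_classifiers(example_predictions, threshold=0.3)
```

In this made-up example, classifiers 0 and 1 disagree on one instance out of four, as do classifiers 2 and 3, while the two pairs disagree with each other on most instances. Each resulting group could then be combined separately, in the spirit of the pre-processing step the abstract describes.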

This paper has been cited by the following:

1 Neves, J.B.J., Vieira, M.T.P. (2004) A Hypotheses-based Method for Identifying Skewed Itemsets, Proc. Brazilian Symposium on Databases, Brasilia, Brazil, October 2004.
2 Viswanathan, M., Yang, Y.K., Whangbo, T.K. (2005) Distributed Data Mining on Clusters with Bayesian Mixture Modeling, Proc. 2nd Int. Conf. on Fuzzy Systems and Knowledge Discovery (FSKD 2005), LNAI 3613, pp. 1207-1216
3 an, Chi-kit (2005) Design and analysis of agent-based FMS control Systems, PhD Thesis, Department of Industrial and Manufacturing Systems Engineering, The University of Hong Kong.
4 Zhang, S., McClean, S., and Scotney, B. (2006) Model-based clustering on semantically heterogeneous distributed databases on the internet, AAAI 2006 Fall Symposium on Semantic Web for Collaborative Knowledge Acquisition.
5 Fu, H., Kechadi, T. (2006) A new Distributed Data Mining system on Grid, Proc. ECML/PKDD 2006 Workshop on Autonomic Computing: A New Challenge for Machine Learning, Berlin, Germany, September 2006.
6 Pedrycz, W. (2007) Collaborative and Knowledge-Based Fuzzy Clustering, International Journal of Innovative Computing, Information and Control, 3(1), pp 1-12, February 2007.
7 Pedrycz, W. (2007) Distributed and Collaborative Soft Computing: An Emerging Development Environment, Proc. 2007 International Conference on Computing: Theory and Applications (ICCTA'07), pp. 231-240.
8 Brazhnik, O. (2007) Databases and the geometry of knowledge, Data and Knowledge Engineering, Volume 61, Issue 2, May 2007, pp. 207-227.
9 Brazhnik, O., , J.F. (2007) Anatomy of data integration, Journal of Biomedical Informatics 40(3), June 2007, pp. 252-269.
10 Pedrycz, W. (2007) Knowledge-based clustering in computational intelligence, Studies in Computational Intelligence 63, pp 317-341, 2007.
11 Zhang, S., McClean, S., Scotney, B. (2007) Knowledge Discovery from Semantically Heterogeneous Aggregate Databases Using Model-Based Clustering, Proc. 24th British National Conference on Databases, BNCOD 24, Glasgow, UK, July 3-5, 2007, pp. 190-202.
12 Cios, K., Swiniarski, R., Pedrycz, W., Kurgan, L. (2007) Data Mining: A Knowledge Discovery Approach, Springer, ISBN: 978-0387333335.
13 Pedrycz W. (2007) Distributed and collaborative fuzzy modeling, Iranian Journal of Fuzzy Systems 4(1), pp. 1-19, Apr 2007
14 Pedrycz, W., Rai, P., and Zurada, J. (2008) Experience-consistent modeling for radial basis function neural networks, International Journal of Neural Systems, 18(4), pp. 279-292
15 Yang, W., Huang, S. (2008). Data privacy protection in multi-party clustering. Data and Knowledge Engineering, 67 (1), 185-199.
16 McClean, S., Scotney, B., Morrow, P., Greer, K. (2008) Integrating semantically heterogeneous aggregate views of distributed databases, Distributed and Parallel Databases 24 (1-3), pp. 73-94
17 Czarnowski, I., Jedrzejowicz, P., Wierzbowska, I. (2008) An A-Team Approach to Learning Classifiers from Distributed Data Sources, Proc. Second KES International Symposium, KES-AMSTA 2008, Incheon, Korea, March 26-28, 2008, pp. 536-546.
18 Pedrycz, W., (2009) Metastructural facets of granular computing. Int. J. Knowledge Engineering and Soft Data Paradigms, 1(1), 11-25.
19 Czarnowski, I., Jedrzejowicz, P. (2009) Distributed learning algorithm based on data reduction, ICAART 2009 - Proceedings of the 1st International Conference on Agents and Artificial Intelligence, pp. 198-203.
20 Forestier, G., Gancarski, P., Wemmert, C., (2010) Collaborative clustering with background knowledge, Data and Knowledge Engineering, 69(2), pp. 211-228.
21 Czarnowski, I. (2010) Prototype selection algorithms for distributed learning, Pattern Recognition 43 (6) pp. 2292-2300.
22 Czarnowski, I., Jedrzejowicz, P., Wierzbowska, I. (2010) An A-Team approach to learning classifiers from distributed data sources, International Journal of Intelligent Information and Database Systems, 4 (3), pp. 245-263.
23 Pedrycz, W. (2010) The development of granular metastructures and their use in a multifaceted representation of data and models, Kybernetes, 39(7), pp. 1184-1200.
24 Czarnowski, I., Jedrzejowicz, P. (2010) Cluster Integration for the Cluster-Based Instance Selection, Computational Collective Intelligence. Technologies and Applications, Lecture Notes in Computer Science, 2010, Volume 6421/2010, pp. 353-362.
25 Peteiro-Barral, D., Guijarro-Berdiñas, B., Pérez-Sánchez, B. (2011) On the effectiveness of distributed learning on different class-probability distributions of data, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7023 LNAI, pp. 114-123.
26 Czarnowski, I., Jedrzejowicz, P. (2011) An agent-based framework for distributed learning, Engineering Applications of Artificial Intelligence, 24 (1), pp. 93-102.
27 Peteiro-Barral, D., Guijarro-Berdiñas, B., Pérez-Sánchez, B. (2011) Dealing with "very large" datasets: An overview of a promising research line: Distributed learning, ICAART 2011 - Proceedings of the 3rd International Conference on Agents and Artificial Intelligence, 1, pp. 476-481.
28 Czarnowski, I. (2011) Distributed learning with data reduction, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 6660, pp. 3-121.
29 Khot, L.R., Panigrahi, S., Doetkott, C., Chang, Y., Glower, J., Amamcharla, J., Logue, C., Sherwood, J. (2012) Evaluation of technique to overcome small dataset problems during neural-network based contamination classification of packaged beef using integrated olfactory sensor system, LWT - Food Science and Technology, 45 (2), pp. 233-240.