FUSION
FUnctionality Sharing In Open eNvironments
Heinz Nixdorf Chair for Distributed Information Systems
 

Aggregation of similarity measures in schema matching based on generalized mean

Title: Aggregation of similarity measures in schema matching based on generalized mean
Authors: Faten A Elshwimy, Amany Sarhan, Elsayed A Sallam
Source: 2014 IEEE 30th International Conference on Data Engineering Workshops (ICDEW)
Date: 2014-04-01
Type: Workshop Paper
Abstract:

Schema matching is a critical step in integrating heterogeneous e-Business and shared-data applications. Most existing schema matching approaches rely heavily on similarity-based techniques, which attempt to discover correspondences from various element similarity measures, each computed by an individual base matcher. It is accepted that aggregating the results of multiple base matchers is a promising technique for obtaining more accurate matching correspondences. Some current matching systems use experimentally determined weights to aggregate similarities across element matchers, while others use machine learning approaches to find optimal weights for the different matchers. However, both approaches have their own deficiencies. To overcome the limitations of existing aggregation strategies and to achieve better performance, this paper proposes a new aggregation strategy, called the AHGM strategy, which aggregates multiple element matchers based on the concept of generalized mean. In particular, we first develop a practical way to obtain optimal weights for each matcher in the given aggregation task. We then use these weights in our aggregation method to improve the performance of matcher combination. To validate the proposed strategy, we conducted a set of experiments, and the obtained results are encouraging.
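
The abstract does not state the AHGM formula or how its weights are obtained, so the following is only a minimal sketch of the underlying concept: a weighted generalized (power) mean applied to the similarity scores that several base matchers assign to one candidate element pair. The function name `weighted_generalized_mean`, the exponent parameter `p`, and the sample scores and weights are assumptions chosen for illustration and are not taken from the paper.

```python
import math
from typing import Sequence


def weighted_generalized_mean(
    scores: Sequence[float], weights: Sequence[float], p: float
) -> float:
    """Combine matcher similarity scores in [0, 1] with a weighted power mean.

    Illustrative only: this is the textbook weighted generalized mean,
    not the AHGM strategy from the paper.
    p = 1 gives the weighted arithmetic mean, p -> 0 the weighted
    geometric mean, and p = -1 the weighted harmonic mean.
    """
    if not scores or len(scores) != len(weights):
        raise ValueError("scores and weights must be non-empty and equal in length")
    if any(s < 0.0 or s > 1.0 for s in scores):
        raise ValueError("similarity scores must lie in [0, 1]")
    if any(w < 0.0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("at least one weight must be positive")

    # Normalize weights to sum to 1 and drop matchers with zero weight.
    pairs = [(s, w / total) for s, w in zip(scores, weights) if w > 0.0]

    # For p <= 0 a single zero score forces the mean to zero (limit value).
    if p <= 0 and any(s == 0.0 for s, _ in pairs):
        return 0.0
    if p == 0:
        # Limit case p -> 0: weighted geometric mean.
        return math.exp(sum(w * math.log(s) for s, w in pairs))
    return sum(w * s ** p for s, w in pairs) ** (1.0 / p)


if __name__ == "__main__":
    # Hypothetical scores from three base matchers (e.g. name, type, structure).
    sims = [0.9, 0.6, 0.3]
    wts = [0.5, 0.3, 0.2]
    for exponent in (-1, 0, 1, 2):
        print(exponent, round(weighted_generalized_mean(sims, wts, exponent), 4))
```

In this sketch, a larger `p` lets high scores dominate the aggregate, while a smaller `p` pulls the result toward the lowest score, which is one reason a generalized mean can be tuned between optimistic and pessimistic combination of matchers.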