International Journal of Computer Applications (0975 – 8887)
Volume 94 – No 2, May 2014
Falcon-AO++: An Improved Ontology Alignment System
Fatsuma Jauro
S.B. Junaidu
S.E. Abdullahi
Department of Mathematics
Ahmadu Bello University, Zaria.
Iya Abubakar Computer Center,
Ahmadu Bello University, Zaria
Department of Mathematics
Ahmadu Bello University, Zaria.
ABSTRACT
With the semantic web, data becomes machine-readable and
ontologies define the data. Ontologies in any domain are
heterogeneous due to rapid increase in ontology development
and differences in views of developers. Agents can fully
understand the data only if the correspondences between
ontologies are known. Various ontology alignment systems
have been developed to automatically discover such
correspondences. However, human involvement is still
indispensable because the results provided by fully automatic
systems are not always complete or precise. This paper
introduces Falcon-AO++, an extension of the Falcon-AO
alignment system that supports the interactive contribution of
a domain expert in the matching process. The evaluation
results show that combining the contribution of an expert with
the matching ability of matchers can improve alignment results.
General Terms
Ontology alignment
Keywords
Ontology matching, user input
1. INTRODUCTION
In the semantic web, ontologies describe a domain by defining
terms and relations, enabling machines to understand the
meaning of data and to reason about it. Numerous ontologies
have been developed by different designers to describe the
same domain. These ontologies are heterogeneous because the
views of their developers differ. The heterogeneity may lie
in the terms used, the depth of modelling, or the area of
coverage. It leads to variation in meaning, or ambiguity in
the interpretation of terms. This is known as semantic
heterogeneity.
Overcoming semantic heterogeneity is typically achieved in
two steps [1]: (i) matching entities to determine an alignment
and (ii) interpreting an alignment according to application
needs, such as data translation or query answering. This paper
focuses on ontology matching.
Ontology alignment or matching tries to solve semantic
heterogeneity problems by means of discovering a map
between similar terms (entities) of two different ontologies of
the same domain. This enables applications using the
ontologies to understand information and interoperate.
Different systems have been developed to handle the alignment
process automatically, such as AOAS [2], OMReasoner
[3], RiMOM [4], CIDER [5] and Falcon-AO [6]. However,
the challenges faced by fully automatic methods are manifold,
including vocabulary differences (e.g., due to synonymy and
homonymy), modeling differences (e.g., due to different
model granularity or different attribute formats) and different
points of view on the modeled reality [7]. For many real-world
datasets, fully automatic state-of-the-art tools still yield
results at a quality level that is unsatisfying for many use
cases [8]. Several researchers have pointed out the need to
involve humans in the alignment process. According to Sarasua
et al. [9], ontology alignment is still one of those problems that
we cannot automate completely, and having a human in the
loop might increase the quality of the results of
machine-driven approaches.
Paulheim et al. [8] likewise state that there is an upper bound
to the quality of the alignment which is hard to exceed by
fully automatic ontology matching tools. Furthermore,
ontology matching is "a very challenging problem for both
man and machine", which calls for semi-automatic approaches
combining the strengths of automatic matching algorithms
and the expertise of domain experts in the matching process.
In this paper, Falcon-AO++, a semi-automatic ontology
alignment system that combines the knowledge of domain
experts and the matching ability of matchers to align
ontologies is introduced. The system is an extension of the
Falcon-AO automatic ontology alignment system [6].
The paper is organised as follows: review of some ontology
alignment systems is given in Section 2, architecture of
Falcon-AO++ in Section 3, implemented processes in Section
4, evaluation and results in Section 5, and directions for future
improvement in Section 6.
2. RELATED WORK
Different tools have been developed to align ontologies, and
most of them apply different matching techniques. AOAS
[2],[10] is a domain-specific ontology matching system
developed specifically to align anatomical ontologies. It uses
different techniques and an external resource, the Unified
Medical Language System (UMLS). The anatomy track of
OAEI (Ontology Alignment Evaluation Initiative) showed that
domain-specific alignment systems perform better than
domain-independent systems [11]. However, such systems fail
in other domains. Falcon-AO++ is a domain-independent
system. CHIMAERA [12] semi-automatically aligns and
merges ontologies. The system requires user intervention at
almost every stage, which places a burden on the user.
Falcon-AO++ only requires users to provide input information.
OMReasoner [3] automatically performs ontology alignment
using a string technique and WordNet. It also performs
reasoning about ontologies but does not consider structural
information, which is very important, especially when concept
names are meaningless. Falcon-AO++ uses structural
information to match entities. RiMOM [4] is an alignment
tool that employs different techniques to match ontologies at
about six different stages. The system can handle large
ontologies but requires a large amount of memory and a very
long time. Falcon-AO++ uses a partitioning technique for large
ontologies, which minimizes the amount of memory consumed
and the execution time. ONION [13] resolves terminological
heterogeneity in ontologies and produces articulation rules for
mappings. Similar to Falcon-AO++, it has a GUI and a human
expert is involved. In ONION, the expert chooses, deletes, or
modifies suggested matches using the GUI tool while in
Falcon-AO++ a user only gives input information. Falcon-AO
[6] is a fully automatic and domain-independent ontology
alignment system. It uses string and structural techniques to
align ontologies. It also uses a partitioning technique for large
ontologies and employs the idea of virtual documents. It
considers both comment and label information attached to
entities. The major drawbacks of this system are: (1) when the
structures of the given ontologies are dissimilar, some
alignments are lost; (2) the system accepts matches with high
similarity from a string matcher, which leads to erroneous
alignments when identical terms are used to mean different
things.
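The second drawback can be illustrated with a minimal sketch.
It assumes a generic string matcher built on Python's
difflib.SequenceMatcher as a stand-in for the string matcher of
Falcon-AO, and a hypothetical acceptance threshold of 0.9. The
entity names and glosses are invented for illustration only.

```python
from difflib import SequenceMatcher

# Hypothetical acceptance threshold; the paper does not state the value
# used by Falcon-AO.
THRESHOLD = 0.9


def string_similarity(a: str, b: str) -> float:
    """Generic string similarity in [0, 1]; a stand-in, not the I-SUB measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def accept(name1: str, name2: str) -> bool:
    """Accept a candidate match purely on the names of the two entities."""
    return string_similarity(name1, name2) >= THRESHOLD


# Invented example: the same label used with different intended meanings.
entity_a = {"name": "Chair", "gloss": "person who chairs a conference session"}
entity_b = {"name": "Chair", "gloss": "piece of furniture in a meeting room"}

# The names are identical, so the pair is accepted even though the glosses
# show that the two entities denote different things.
wrong_match_accepted = accept(entity_a["name"], entity_b["name"])
```

A purely name-based decision has no access to the intended
meaning, which is the gap that a user-supplied disjoint input is
meant to close.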
Falcon-AO was extended with a user-input strategy so that
domain experts can assist the system by providing input
information. The aim is to improve the results produced by
Falcon-AO. An evaluation of the two systems, Falcon-AO
and Falcon-AO++, was carried out using standard benchmarks
to examine the impact of user interaction on alignment results.
3. ARCHITECTURE OF FALCON-AO++
The semi-automatic approach implements the idea that a
domain expert can provide input information to an alignment
system. Figure 1 presents the architecture of Falcon-AO++
which is an extension of the architecture of Falcon-AO [14].
In the system architecture, all the components apart from the
integrated User Restrictions are components of Falcon-AO
[14]. A brief description of the components is given below:
3.1 User Restrictions
The system receives two ontologies as input. User input is
generated by the User Restrictions component in the
following stages:
3.1.1 Parsing and Entity Retrieval Component
The parsing and entity retrieval component parses the input
ontologies and retrieves all the entities in the ontologies to
provide easy access.
3.1.2 Constraint Specification
This component allows a user to select pairs of entities and a
constraint for each pair. Two constraints are supported in this
paper:
3.1.2.1 Equivalent Constraint
This constraint allows two entities to be defined as equivalent.
It prevents the possible loss of an alignment.
3.1.2.2 Disjoint Constraint
This constraint allows two entities to be defined as entirely
different even if their strings are exactly the same. It
prevents possible erroneous alignments.
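The intended effect of the two constraints on a set of candidate
alignments can be sketched as follows. This is an illustration
under stated assumptions, not the Falcon-AO++ implementation:
the names Constraint, UserInput and apply_user_inputs are
hypothetical, an alignment is simplified to a pair of entity
names, and the example entities are invented.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Set, Tuple

Pair = Tuple[str, str]  # simplified alignment: (entity in O1, entity in O2)


class Constraint(Enum):
    EQUIVALENT = "equivalent"
    DISJOINT = "disjoint"


@dataclass(frozen=True)
class UserInput:
    """One user-selected pair of entities with its constraint (hypothetical type)."""
    entity1: str
    entity2: str
    constraint: Constraint


def apply_user_inputs(candidates: Iterable[Pair],
                      inputs: Iterable[UserInput]) -> Set[Pair]:
    """Add pairs declared equivalent and drop pairs declared disjoint."""
    result = set(candidates)
    for item in inputs:
        pair = (item.entity1, item.entity2)
        if item.constraint is Constraint.EQUIVALENT:
            result.add(pair)       # guards against a lost alignment
        else:
            result.discard(pair)   # guards against an erroneous alignment
    return result


# Invented candidate alignments, as an automatic matcher might produce them.
automatic = {("o1:Paper", "o2:Paper"), ("o1:Chair", "o2:Chair")}
user_inputs = [
    UserInput("o1:Author", "o2:Writer", Constraint.EQUIVALENT),
    UserInput("o1:Chair", "o2:Chair", Constraint.DISJOINT),
]
final = apply_user_inputs(automatic, user_inputs)
```

In this sketch the equivalent input recovers a pair that the
matchers missed, and the disjoint input removes a pair that was
accepted only because the two names are identical.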
3.1.3 Rule Implementation
The rule implementation component defines and implements
a rule, using Jena, based on the specified constraint.
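Section 4.1.3 states that an equivalent constraint is expressed
as SameAs and a disjoint constraint as DisjointWith. The sketch
below is written in Python rather than with Jena, and assumes
that these correspond to the OWL properties owl:sameAs and
owl:disjointWith. It serialises one constraint as one N-Triples
statement. The function name and the example IRIs are
hypothetical.

```python
OWL = "http://www.w3.org/2002/07/owl#"

# Assumed mapping from the two user constraints to OWL properties.
PREDICATE = {
    "equivalent": OWL + "sameAs",
    "disjoint": OWL + "disjointWith",
}


def constraint_to_ntriple(entity1_iri: str, entity2_iri: str,
                          constraint: str) -> str:
    """Return one N-Triples statement relating the two entities."""
    predicate = PREDICATE[constraint]  # raises KeyError for unknown constraints
    return f"<{entity1_iri}> <{predicate}> <{entity2_iri}> ."


# Hypothetical IRIs, for illustration only.
statement = constraint_to_ntriple(
    "http://example.org/o1#Author",
    "http://example.org/o2#Writer",
    "equivalent",
)
```

Statements of this form could then be stored in a temporary
ontology file, which is the role that file plays in Section 4.1.2.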
3.2 Model Pool
In the model pool, the ontology parser checks the validity
(parses) of given ontologies and creates a model (an internal
representation) of the ontologies. The model coordinator
adjusts the models using some coordination rules.
[Architecture diagram. Labelled components: Graphical User
Interface; Central Controller; Model Pool (Ontology Parser,
Model Coordinator); User Restrictions (Parsing and Entity
Retrieval, Constraint Specification, Rule Implementation);
Matcher Library (V-DOC, GMO, I-SUB, PBM); Alignment Set
(Alignment Generator, Alignment Evaluator); Repository. Other
labels: Ontologies, User Input, Iteration, Model Construction,
Matcher Execution, Similarity Combination, RDF/XML Format,
Alignments.]
Figure 1: Architecture of Falcon-AO++
3.3 Matcher Library
The Matcher Library controls the four matchers used by
the system. The matchers are described below.
3.3.1 PBM (Partition Block Matcher)
PBM [15] uses a divide-and-conquer approach to match
ontologies. It is employed when the ontologies to be matched
are large. The matcher first partitions the entities of each input
ontology into clusters based on their proximity in the graph,
generates blocks by reassigning the links between entities, and
matches similar blocks from the two ontologies by
distributing anchors. Finally, the matched blocks are passed to
V-DOC and GMO for final alignment and output.
3.3.2 I-SUB
The I-SUB [14] matcher takes the names of entities as strings
of characters and computes the similarity between the strings.
The matcher considers not only the commonalities between
strings but also their differences.
3.3.3 V-DOC (Virtual Document)
The V-DOC [16] matcher is based on the idea that the meaning
of an entity is encoded in its documentation. It therefore creates
virtual documents for all the entities. The virtual document of
an entity contains the entity's name, any description
attached to it, and the names of its neighbours. The
similarity between two entities comes from the existence of
shared words in their virtual documents.
3.3.4 GMO (Graph Matching for Ontologies)
GMO [17] uses a structural matching approach. It uses RDF
bipartite graphs to represent ontologies and computes structural
similarities between domain entities and between statements
(triples) in the ontologies by recursively propagating
similarities in the bipartite graphs. It receives alignments found
by V-DOC and I-SUB as external input and outputs additional
alignments.
3.4 Central Controller
The Central Controller controls the execution of matchers and
other processes.
3.5 Alignment Set
In the Alignment Set, the alignment generator generates
alignments, and the alignment evaluator can be used to
evaluate alignments against a reference alignment.
3.6 Repository
The Repository stores any reusable data for the alignment
process.
4. IMPLEMENTATION
The semi-automatic approach (user input) was implemented
as part of Falcon-AO. The implemented processes are discussed
below.
4.1 User Input Processes
Different processes have been implemented to process
user-specified input:
4.1.1 Parsing and Entity Retrieval Process
The Parsing and Entity Retrieval Process is the first process
that executes. It parses the input ontologies, retrieves all the
entities from the two ontologies and presents them on the GUI.
4.1.2 Model Creation and User Constraint Specification Process
This process implements user-specified input by allowing the
user to select pairs of entities and a constraint for each pair. The
process first creates an empty temporary ontology file and an
empty model of it. Each selected entity is saved in the
temporary ontology file, then the next process is invoked.
4.1.3 Rule Implementation
As soon as the user selects a constraint, the rule
implementation process generates and implements a rule
defining the relation between the entities in the same
temporary file created in the previous stage. Rule
implementation is done by the GenericRuleReasoner provided
by Jena. If the selected constraint is equivalent, the rule
defines the entities as SameAs. Otherwise, if the selected
constraint is disjoint, the rule defines the entities as
DisjointWith.
4.1.4 Data Structures and Integration with Falcon-AO
After the rules have been implemented, the entities are further
saved in data structures of the same type as those of Falcon-AO.
These data structures are then integrated with the data
structures of Falcon-AO for final output.
4.2 GUI of Falcon-AO++
Figure 2 shows the graphical user interface of Falcon-AO++.
The GUI of Falcon-AO was modified to display the entities
retrieved from the input ontologies. A list of available
constraints is also provided on the GUI.
Figure 2: GUI of Falcon-AO++ showing retrieved entities and constraints
5. EVALUATION AND RESULT
The Ontology Alignment Evaluation Initiative (OAEI) is a
coordinated international initiative which organizes the
evaluation of the increasing number of ontology matching
systems. To examine the impact of user input on the quality of
alignment results, the OAEI conference track dataset, consisting
of different ontologies and reference alignments, was used.
The alignment results of the system are compared with the
alignments provided in the reference alignment. For the
purpose of evaluation, the standard measures of Precision,
Recall, and F-measure for evaluating ontology alignment
systems were used:
\[ \text{Precision} = \frac{\text{No. of correct found alignments}}{\text{No. of found alignments}} \]
\[ \text{Recall} = \frac{\text{No. of correct found alignments}}{\text{No. of existing alignments in reference alignment}} \]
\[ \text{F-measure} = \frac{2 \ast \text{Precision} \ast \text{Recall}}{\text{Precision} + \text{Recall}} \]
5.1 Experimental Setup
The experiments were performed on a PC with Intel (R) Core
(TM) i3 CPU, 2.13GHz, 4 GB RAM, Windows 7, and Java
NetBeans IDE 7.2.
5.2 Result and Discussion
Table 1 gives the comparative alignment results of Falcon-AO
and Falcon-AO++. The input ontologies are the ontologies
to be matched (i.e., OAEI conference track ontologies). For
each pair of input ontologies, the table gives the number of
existing alignments in the reference alignment, the number of
alignments found by each system, and the number of correct
alignments found by each system. For Falcon-AO++, the
number of inputs provided by the user (E for equivalent and
D for disjoint) is also specified.
Table 1: Alignment results of Falcon-AO and Falcon-AO++
                           Reference    Falcon-AO         Falcon-AO++
S/No  Input ontologies     alignments   Found   Correct   E   D   Found   Correct
1     Cmt, conference      15           18      9         1   1   18      10
2     Cmt, confOf          16           11      6         2   1   12      8
3     Cmt, Edas            13           13      9         2   2   13      11
4     Cmt, Ekaw            11           11      6         3   2   12      9
5     Cmt, Iasted          4            8       4         0   3   5       4
6     Cmt, Sigkdd          12           13      10        1   2   12      11
Reference alignments: No. of existing alignments in reference
alignment. Found: No. of found alignments. Correct: No. of
correct found alignments. E, D: No. of equivalent and disjoint
inputs.
The values in Table 1 were used to obtain values for
precision, recall, and f-measure for Falcon-AO and
Falcon-AO++ using the previously specified formulas. The
results are shown in Table 2.
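As a worked check, the computation can be sketched in a few
lines of Python. The counts are copied from Table 1; the
function name evaluate and the tuple layout are choices made
for this illustration, and the zero guards are an added
assumption for empty inputs.

```python
# Rows of Table 1: (pair, reference alignments,
#                   Falcon-AO found, Falcon-AO correct,
#                   Falcon-AO++ found, Falcon-AO++ correct)
TABLE_1 = [
    ("cmt-conference", 15, 18, 9, 18, 10),
    ("cmt-confOf", 16, 11, 6, 12, 8),
    ("cmt-edas", 13, 13, 9, 13, 11),
    ("cmt-ekaw", 11, 11, 6, 12, 9),
    ("cmt-iasted", 4, 8, 4, 5, 4),
    ("cmt-sigkdd", 12, 13, 10, 12, 11),
]


def evaluate(reference: int, found: int, correct: int):
    """Return (precision, recall, f_measure) for one alignment result."""
    precision = correct / found if found else 0.0
    recall = correct / reference if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


for pair, ref, found, correct, found_pp, correct_pp in TABLE_1:
    base = evaluate(ref, found, correct)            # Falcon-AO
    extended = evaluate(ref, found_pp, correct_pp)  # Falcon-AO++
    print(pair,
          " ".join(f"{v:.4f}" for v in base),
          " ".join(f"{v:.4f}" for v in extended))
```

For the first pair, Falcon-AO finds 18 alignments of which 9 are
correct, against 15 reference alignments, which gives a
precision of 9/18 = 0.5 and a recall of 9/15 = 0.6, in agreement
with Table 2.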
Six pairs of input ontologies were used. For each pair, the
values for precision, recall, and f-measure were computed. In
the second result in Table 1, the user provided 2 inputs as
equivalent and 1 input as disjoint. For these three inputs the
values for precision, recall, and f-measure increased by
22.22%, 33.33% and 28.57%, respectively, relative to the
Falcon-AO values shown in Table 2. Recall increases more
than precision because the number of equivalent inputs is
higher than the number of disjoint inputs. In the last pair,
1 input was provided as equivalent and 2 inputs as disjoint.
With these there was a 19.17%, 10.00% and 14.58% increase in
precision, recall and f-measure respectively. In this case,
precision increases more than recall because more inputs were
provided for disjoint. This shows that equivalent input
increases recall and disjoint input increases precision.
However, when the same number of inputs is provided for
equivalent and for disjoint, the increase is the same for all
three measures. This was observed for the first and third
pairs, where an 11.11% increase and a 22.22% increase,
respectively, were recorded for all three measures.
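The percentages quoted above are consistent with relative
increases over the Falcon-AO values, that is, (new - old) / old
expressed as a percentage. A short sketch using the second row
of Table 2 (the function name relative_gain is chosen for this
illustration):

```python
def relative_gain(before: float, after: float) -> float:
    """Relative increase of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# Second row of Table 2 (cmt, confOf): precision, recall, f-measure.
falcon_ao = (0.54545456, 0.375, 0.44444445)
falcon_ao_pp = (0.6666667, 0.5, 0.5714286)

gains = [relative_gain(old, new) for old, new in zip(falcon_ao, falcon_ao_pp)]
print(", ".join(f"{g:.2f}%" for g in gains))
```

The same function applied to the other rows of Table 2 gives
the remaining percentages discussed in this section.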
From the results in Table 1 and Table 2, the following
observations are made:
1. The highest gains in recall and f-measure were obtained
in the fourth input pair, at 49.99% and 43.48% respectively,
together with a 37.49% gain in precision. This is because
more user inputs were provided for this pair. The fifth
pair, for which only disjoint inputs were given, shows a
larger gain in precision (from 0.5 to 0.8) but no change
in recall.
2. The smallest gain in precision and f-measure was
obtained in the first input pair, at 11.11% (the same gain
was obtained for recall). This is because fewer user
inputs were provided.
From the results and observations, it is clear that even the
minimum input provided by a user improves the values for
precision and f-measure, and improves or preserves recall.
This shows that user input has a positive impact on alignment
results. The comparative result in terms of average precision,
recall and f-measure is shown in Figure 3.
Table 2: Evaluation results of Falcon-AO and Falcon-AO++
                         Falcon-AO                               Falcon-AO++
S/No  Input ontologies   Precision    Recall       F-measure    Precision    Recall       F-measure
1     cmt, conference    0.5          0.6          0.54545456   0.5555556    0.6666667    0.6060606
2     cmt, confOf        0.54545456   0.375        0.44444445   0.6666667    0.5          0.5714286
3     cmt, edas          0.6923077    0.6923077    0.6923077    0.84615386   0.84615386   0.84615386
4     cmt, ekaw          0.54545456   0.54545456   0.54545456   0.75         0.8181818    0.7826087
5     cmt, iasted        0.5          1.0          0.6666667    0.8          1.0          0.8888889
6     cmt, sigkdd        0.7692308    0.83333333   0.8          0.9166667    0.9166667    0.9166667
      Average            0.5920746    0.674372133  0.61572133   0.75584048   0.79127818   0.76863456
[Bar chart: average Precision, Recall and F-measure (vertical
axis from 0 to 1) for Falcon-AO and Falcon-AO++]
Figure 3: Comparative Result of Falcon-AO and Falcon-AO++
The comparative result in Figure 3 indicates clearly that
Falcon-AO++ performs better than Falcon-AO. This shows
that user assistance can significantly improve results of an
alignment system.
6. CONCLUSION AND FUTURE WORK
In this paper, an ontology alignment system, Falcon-AO++,
based on Falcon-AO has been presented. The system
supports the contribution of a domain expert. The two systems
were evaluated using standard benchmark ontologies. The
evaluation results have shown that user (domain expert)
interaction has a positive impact on alignment results.
Although user interaction improves results of an alignment
system, providing input information is still a bottleneck since
experts have to be manually involved especially when many
inputs are passed. As a future improvement, it would be good
to integrate more methods that would reduce the level of
expert interaction. Presently, the system supports only
one-to-one mappings and considers only the equivalence
relation. As another direction for future work, the system can
be improved to support different kinds of mappings and also
different relations. Also, V-DOC can be extended to consider
further neighbours rather than only one-step neighbours.
7. REFERENCES
[1] Shvaiko, P., & Euzenat, J. (2013). Ontology Matching:
State of the Art and Future Challenges. IEEE, pp.1-15.
[2] Jean-Mary, Y., Shironoshita, E., and Kabuka, M. (2009).
Ontology Matching with Semantic Verification. Journal
of Web Semantics. 7(3), 235–251
[3] Shen, G., Jin, L., Zhao, Z., Jia, Z., He, W., and Huang, Z.
(2011). OMReasoner: Using Reasoner for Ontology
Matching : results for OAEI 2011.
[4] Saruladha, K., Aghila, G., and Sathiya, B. (2011). A
Comparative Analysis of Ontology and Schema
Matching Systems. International Journal of Computer
Application. 34(8), 14-21.
[5] Gracia, J., and Mena, E. (2008). Ontology matching with
CIDER: Evaluation report for the OAEI 2008. In Proc.
of 3rd Ontology Matching Workshop (OM’08), at 7th
International Semantic Web Conference (ISWC’08),
Karlsruhe, Germany.
[6] Jian, N., Hu, W., Cheng, G., and Qu, Y. (2005). Falcon-AO: aligning ontologies with Falcon. In Proceedings of
K-CAP Workshop on Integrating Ontologies. pp. 85–91.
[7] Granitzer, M., Sabol, V., Weng, K. O., Lukose, D., and
Tochtermann, K. (2010). Ontology Alignment—A
Survey with Focus on Visually Supported Semi-Automatic Techniques. Future Internet, 2(3), 238-258.
[8] Paulheim, H., Hertling, S., and Ritze, D. (2013).
Towards Evaluating Interactive Ontology Matching
Tools. The semantic web: semantics and big data,
Lecture Notes in Computer Science LNCS. pp.31-45.
[9] Sarasua, C., Simperl, E., and Noy, N. F. (2012).
CROWDMAP: Crowdsourcing Ontology Alignment
with Microtasks. International Semantic Web Conference
ISWC. pp.525-541. Springer.
[10] Zhang, S., and Bodenreider, O. (2007). Lessons Learned
from Cross-Validating Alignments between Large
Anatomical Ontologies. In Proceedings of 12th World
Congress on Medical Informatics. Brisbane, Australia.
[11] Vargas-vera, M., and Nagy, M. (2010). Towards
Intelligent Ontology Alignment Systems for Question
Answering: Challenges and Roadblocks. Journal of
Emerging Technologies in Web Intelligence, 2(3), 244-257.
[12] Amrouch, S., and Mostefai, S. (2012). Ontology
Interoperability Techniques, The State of the Art.
Journal of Information Organization , 2(1), 20-27.
[13] Choi, N., Song, I., and Han, H. (2006). A Survey on
Ontology Mapping. SIGMOD Record, 35 (3), 34-41.
[14] Hu, W., and Qu, Y. (2008). Falcon-AO: A practical
ontology matching system. Web Semantics: Science,
Services and Agents on the World Wide Web, 6(3), 237-239.
[15] Hu, W., Qu, Y., and Cheng, G. (2008). Matching large
ontologies: A divide-and-conquer approach. Data and
Knowledge Engineering , 67(1), 140-160.
[16] Qu, Y., Hu, W., and Cheng, G. (2006). Constructing
virtual documents for ontology matching. In Proceedings
of the 15th International World Wide Web Conference.
pp. 23–31.
[17] Hu, W., Jian, N., Qu, Y., and Wang, Y. (2005). GMO: A
Graph Matching for Ontologies. In Proceedings of K-CAP Workshop on Integrating Ontologies, pp.41–48.
IJCATM : www.ijcaonline.org