DOI: 10.1145/3477314.3507301
research-article

Generalized graph pattern discovery in linked data with data properties and a domain ontology

Published: 06 May 2022

Abstract

Nowadays, in many practical situations, analytical tasks need to be performed on complex heterogeneous data, often described by a domain ontology (DO). Such cases abound in life science fields such as agro-informatics, where observations and measurements on animals and plants are logged for subsequent mining. The data is naturally structured as one or more graphs, is unlabelled, and has missing values, hence it lends itself well to pattern mining. In our own precision farming project, aimed at decision support for dairy cow management, we mine milk production data for knowledge. In one task, we look for contrast patterns that explain the relative impact of independent production factors. To that end, we defined ontologically-generalized graph patterns (OGPs), a variety of generalized graph patterns in which vertices and edges are labelled by DO classes and properties, respectively. We also designed a mining methodology that reconciles OWL DOs, abstraction from RDF graphs, and literals in data. To address the well-known cost-related limitations of graph mining, exacerbated here by class/property specializations and data properties, we split the mining task into (1) mining of generic object property topology patterns and (2) label refinement. These steps focus on two sorts of OGPs, called topologies and class stars, respectively, which, after being mined separately, get (3) assembled into fully-fledged OGPs.
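
To make the notion of an ontologically generalized label concrete, the following is a minimal sketch, not the method of the paper. It assumes a hypothetical toy hierarchy of classes and properties (all names such as `DairyCow` or `eatsDaily` are invented for illustration), restricts each label to a single parent although OWL permits several, and handles only single-edge patterns. A pattern edge is taken to match a data edge when each of its three labels equals or generalizes the corresponding data label, and support is the number of graphs containing at least one matching edge.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy hierarchies: label -> parent (None marks a root).
# Assumption: one parent per label; real OWL ontologies allow several.
SUBCLASS_OF: Dict[str, Optional[str]] = {
    "Animal": None,
    "Cow": "Animal",
    "DairyCow": "Cow",
    "Feed": None,
    "Silage": "Feed",
}
SUBPROPERTY_OF: Dict[str, Optional[str]] = {
    "consumes": None,
    "eatsDaily": "consumes",
}

# An edge is (subject class, property, object class).
Edge = Tuple[str, str, str]


def ancestors(label: str, parent: Dict[str, Optional[str]]) -> Set[str]:
    """Return the label itself together with all of its generalizations."""
    result: Set[str] = set()
    current: Optional[str] = label
    while current is not None:
        result.add(current)
        current = parent[current]
    return result


def generalizes(pattern: Edge, data: Edge) -> bool:
    """True if every pattern label equals or generalizes the data label."""
    p_subj, p_prop, p_obj = pattern
    d_subj, d_prop, d_obj = data
    return (
        p_subj in ancestors(d_subj, SUBCLASS_OF)
        and p_prop in ancestors(d_prop, SUBPROPERTY_OF)
        and p_obj in ancestors(d_obj, SUBCLASS_OF)
    )


def support(pattern: Edge, graphs: List[List[Edge]]) -> int:
    """Count the graphs that contain at least one edge matched by the pattern."""
    return sum(any(generalizes(pattern, e) for e in g) for g in graphs)


# Three tiny data graphs, each with a single edge.
graphs: List[List[Edge]] = [
    [("DairyCow", "eatsDaily", "Silage")],
    [("Cow", "consumes", "Feed")],
    [("Animal", "consumes", "Silage")],
]

general = support(("Animal", "consumes", "Feed"), graphs)
middle = support(("Cow", "consumes", "Feed"), graphs)
specific = support(("DairyCow", "eatsDaily", "Silage"), graphs)
print(general, middle, specific)
```

In this toy setting, support can only shrink as labels are specialized, which is the kind of monotonicity that makes refining labels separately from topology plausible; how the paper actually exploits it is not stated in the abstract.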

Cited By

  • (2023) Frequent Generalized Subgraph Mining via Graph Edit Distances. Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 477-483. DOI: 10.1007/978-3-031-23633-4_32. Online publication date: 31-Jan-2023

Published In

SAC '22: Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing
April 2022
2099 pages
ISBN: 9781450387132
DOI: 10.1145/3477314

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. generalized patterns
  2. graph data
  3. ontologies
  4. pattern mining

Qualifiers

  • Research-article

Conference

SAC '22
Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
