Learning and Deducing Temporal Orders

Authors:

Muhammad Asif Ali

Proceedings of the VLDB Endowment, Volume 16, Issue 8

Pages 1944 - 1957

https://doi.org/10.14778/3594512.3594524

Published: 01 April 2023

Abstract

This paper studies how to determine temporal orders on attribute values in a set of tuples that pertain to the same entity, in the absence of complete timestamps. We propose a creator-critic framework to learn and deduce temporal orders by combining deep learning and rule-based deduction, referred to as GATE (Get the lATEst). The creator of GATE trains a ranking model via deep learning, to learn temporal orders and rank attribute values based on correlations among the attributes. The critic then validates the temporal orders learned and deduces more ranked pairs by chasing the data with currency constraints; it also provides augmented training data as feedback for the creator to improve the ranking in the next round. The process proceeds until the temporal order obtained becomes stable. Using real-life and synthetic datasets, we show that GATE is able to determine temporal orders with F-measure above 80%, improving deep learning by 7.8% and rule-based methods by 34.4%.
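The creator-critic loop described above can be illustrated with a small toy sketch. It is not the GATE implementation: the deep ranking model is replaced by a simple stand-in that trusts a numeric attribute only when it agrees with the pairs confirmed so far, and the single currency constraint (marital status advances from single to married to divorced), the `kids` attribute, and all function names are assumptions made for illustration.

```python
from itertools import permutations

# Hypothetical currency constraint: marital status only advances in this order.
STATUS_ORDER = {"single": 0, "married": 1, "divorced": 2}

def transitive_closure(pairs):
    """Deduce (a, d) whenever (a, b) and (b, d) are already known."""
    closed = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closed):
            for c, d in list(closed):
                if b == c and a != d and (a, d) not in closed:
                    closed.add((a, d))
                    changed = True
    return closed

def creator(tuples, training_pairs, attrs=("kids",)):
    """Stand-in for the learned ranker (assumption, not deep learning):
    trust an attribute only if it agrees with the training pairs,
    then propose 'older -> newer' tuple pairs ranked by that attribute."""
    proposed = set()
    for attr in attrs:
        agrees = any(tuples[a][attr] < tuples[b][attr] for a, b in training_pairs)
        conflicts = any(tuples[a][attr] > tuples[b][attr] for a, b in training_pairs)
        if agrees and not conflicts:
            proposed |= {(a, b) for a, b in permutations(tuples, 2)
                         if tuples[a][attr] < tuples[b][attr]}
    return proposed

def critic(tuples, proposed, confirmed):
    """Chase with the constraint, then keep proposals that cause no conflict."""
    result = set(confirmed)
    for a, b in permutations(tuples, 2):
        if STATUS_ORDER[tuples[a]["status"]] < STATUS_ORDER[tuples[b]["status"]]:
            result.add((a, b))
    result = transitive_closure(result)
    for a, b in sorted(proposed):
        if (b, a) not in result:  # reject pairs that contradict deduced order
            result = transitive_closure(result | {(a, b)})
    return result

def deduce_order(tuples, max_rounds=10):
    """Alternate creator and critic until the set of ranked pairs is stable."""
    confirmed = set()
    for _ in range(max_rounds):
        proposed = creator(tuples, confirmed)          # rank with current feedback
        updated = critic(tuples, proposed, confirmed)  # validate and deduce
        if updated == confirmed:
            break
        confirmed = updated  # critic output becomes the next training data
    return confirmed

# Four tuples for one entity, with no timestamps (made-up data).
TUPLES = {
    "t1": {"status": "single", "kids": 0},
    "t2": {"status": "married", "kids": 1},
    "t3": {"status": "married", "kids": 2},
    "t4": {"status": "divorced", "kids": 2},
}

if __name__ == "__main__":
    for older, newer in sorted(deduce_order(TUPLES)):
        print(older, "precedes", newer)
```

In this toy run the constraint alone cannot order `t2` and `t3`, since both are married. Once the critic's deduced pairs are fed back, the stand-in creator finds that `kids` agrees with them and proposes the missing pair, which the critic accepts because it contradicts nothing already deduced.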

Published In

Proceedings of the VLDB Endowment Volume 16, Issue 8

April 2023

257 pages

ISSN: 2150-8097

Editors: Georgia Koutrika (Athena Research Center), Jun Yang (Duke University)

Publisher

VLDB Endowment

Badges

Artifacts Available / v1.1
