- research-article, August 2023
MedLink: De-Identified Patient Health Record Linkage
KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, August 2023, Pages 2672–2682, https://doi.org/10.1145/3580305.3599427
A comprehensive patient health history is essential for patient care and healthcare research. However, due to the distributed nature of healthcare services, patient health records are often scattered across multiple systems. Existing record linkage ...
- research-article, August 2021
Capture–Recapture Techniques for Transport Survey Estimate Adjustment Using Permanently Installed Highway-Sensors
Jonas Klingwort, Bart Buelens, Rainer Schnell, Adam Eck, Ana Lucía Córdova Cazar, Mario Callegaro, Paul Biemer
Social Science Computer Review (SSCR), Volume 39, Issue 4, Aug 2021, Pages 527–542, https://doi.org/10.1177/0894439319874684
In this article, survey, sensor, and administrative data are combined to correct for survey point estimate bias due to underreporting. The response to the Dutch Road Freight Transport Survey is linked to records from a road sensor network consisting of ...
- research-article, July 2021
High-Value Token-Blocking: Efficient Blocking Method for Record Linkage
ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 16, Issue 2, Article No.: 24, Pages 1–17, https://doi.org/10.1145/3450527
Data integration is an important component of Big Data analytics. One of the key challenges in data integration is record linkage, that is, matching records that represent the same real-world entity. Because of computational costs, methods referred to as ...
- letter, May 2021
Heritage connector: A machine learning framework for building linked open data from museum collections
Abstract: As with almost all data, museum collection catalogues are largely unstructured, variable in consistency and overwhelmingly composed of thin records. The form of these catalogues means that the potential for new forms of research, access and ...
- research-article, April 2021
Neural Networks for Entity Matching: A Survey
ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 15, Issue 3, Article No.: 52, Pages 1–37, https://doi.org/10.1145/3442200
Entity matching is the problem of identifying which records refer to the same real-world entity. It has been actively researched for decades, and a variety of different approaches have been developed. Even today, it remains a challenging problem, and ...
- research-article, January 2021
Siamese Neural Network for Unstructured Data Linkage
iiWAS '20: Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services, November 2020, Pages 417–425, https://doi.org/10.1145/3428757.3429106
Data integration is one of the key problems in the era of Big Data analytics. The key challenge of data integration is the identification of records representing the same entities (e.g. person). This task is referred to as Record Linkage. It is uncommon ...
- research-article, October 2020
An Overview of Phonetic Encoding Algorithms
Automation and Remote Control (ARCO), Volume 81, Issue 10, Oct 2020, Pages 1896–1910, https://doi.org/10.1134/S0005117920100082
Abstract: This paper presents an overview of the phonetic encoding algorithms designed to determine the similarity of words in sound (pronunciation). Phonetic encoding algorithms are divided into the algorithms for comparing words and the algorithms for ...
- research-article, June 2020
Data Preparation for Duplicate Detection
Journal of Data and Information Quality (JDIQ), Volume 12, Issue 3, Article No.: 15, Pages 1–24, https://doi.org/10.1145/3377878
Data errors represent a major issue in most application workflows. Before any important task can take place, a certain data quality has to be guaranteed by eliminating a number of different errors that may appear in data. Typically, most of these errors ...
- research-article, December 2019
Secured technique for healthcare record linkage
NSysS '19: Proceedings of the 6th International Conference on Networking, Systems and Security, December 2019, Pages 30–36, https://doi.org/10.1145/3362966.3362972
Nowadays, a large amount of health data is electronically accessible, available, and processable. In developed countries, health data can be integrated using a social security number or national health id. However, in developing countries such as ...
- research-article, November 2019
A Software Complex for Integration of Attribute Data of Information Objects
Automatic Documentation and Mathematical Linguistics (SPADML), Volume 53, Issue 6, Nov 2019, Pages 295–302, https://doi.org/10.3103/S0005105519060037
Abstract: This paper presents a software complex that provides the interaction of software systems to solve the problem of integrating the attribute data of information objects, which is based on the developed and tested mathematical and programming ...
- research-article, July 2019
Adversarial Matching of Dark Net Market Vendor Accounts
KDD '19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, July 2019, Pages 1871–1880, https://doi.org/10.1145/3292500.3330763
Many datasets feature seemingly disparate entries that actually refer to the same entity. Reconciling these entries, or "matching," is challenging, especially in situations where there are errors in the data. In certain contexts, the situation is even ...
- research-article, April 2019
Hybrid Private Record Linkage: Separating Differentially Private Synopses from Matching Records
ACM Transactions on Privacy and Security (TOPS), Volume 22, Issue 3, Article No.: 15, Pages 1–36, https://doi.org/10.1145/3318462
Private record linkage protocols allow multiple parties to exchange matching records, which refer to the same entities or have similar values, while keeping the non-matching ones secret. Conventional protocols are based on computationally expensive ...
- research-article, January 2019
A data cleaning method for heterogeneous attribute fusion and record linkage
International Journal of Computational Science and Engineering (IJCSE), Volume 19, Issue 3, 2019, Pages 311–324, https://doi.org/10.1504/ijcse.2019.101341
In big data era, massive heterogeneous data are generated from various data sources, the cleaning of dirty data is critical for reliable data analysis. Existing rule-based methods are generally developed in single data source environment, issues like ...
- research-article, September 2018
Experience: Enhancing Address Matching with Geocoding and Similarity Measure Selection
Journal of Data and Information Quality (JDIQ), Volume 10, Issue 2, Article No.: 8, Pages 1–16, https://doi.org/10.1145/3232852
Given a query record, record matching is the problem of finding database records that represent the same real-world object. In the easiest scenario, a database record is completely identical to the query. However, in most cases, problems do arise, for ...
- research-article, April 2018
Leveraging Social Media Signals for Record Linkage
WWW '18: Proceedings of the 2018 World Wide Web Conference, April 2018, Pages 1195–1204, https://doi.org/10.1145/3178876.3186018
Many data-intensive applications collect (structured) data from a variety of sources. A key task in this process is record linkage, which is the problem of determining the records from these sources that refer to the same real-world entities. ...
- research-article, November 2017
Building a Dossier on the Cheap: Integrating Distributed Personal Data Resources Under Cost Constraints
CIKM '17: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, November 2017, Pages 1549–1558, https://doi.org/10.1145/3132847.3132951
A wide variety of personal data is routinely collected by numerous organizations that, in turn, share and sell their collections for analytic investigations (e.g., market research). To preserve privacy, certain identifiers are often redacted, perturbed ...
- research-article, September 2017
Using Wavelets for Matching Records Privately
PCI '17: Proceedings of the 21st Pan-Hellenic Conference on Informatics, September 2017, Article No.: 29, Pages 1–6, https://doi.org/10.1145/3139367.3139371
This paper presents a wavelet-based methodology for performing privacy preserving record linkage. The proposed methodology is introduced in a bottom-up approach, starting from simple text matching and extending to actual record linkage. The discrete ...
- research-article, April 2017
AncestryAI: A Tool for Exploring Computationally Inferred Family Trees
WWW '17 Companion: Proceedings of the 26th International Conference on World Wide Web Companion, April 2017, Pages 257–261, https://doi.org/10.1145/3041021.3054728
Many people are excited to discover their ancestors and thus decide to take up genealogy. However, the process of finding the ancestors is often very laborious since it involves comparing a large number of historical birth records and trying to manually ...
- research-article, October 2016
Attribute-based Crowd Entity Resolution
CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, October 2016, Pages 549–558, https://doi.org/10.1145/2983323.2983831
We study the problem of using the crowd to perform entity resolution (ER) on a set of records. For many types of records, especially those involving images, such a task can be difficult for machines, but relatively easy for humans. Typical crowd-based ...