DOI: 10.1145/3372422.3372441
research-article

Chinese Address Standardization Based on seq2seq Model

Published: 07 February 2020

Abstract

Address information is important in many industries, especially express delivery. However, an address is usually unstructured text that can be written in many different ways, so addresses are used inconsistently in practice; for example, people often abbreviate addresses or write them according to their own habits. As a result, important information is frequently missing or wrong, which reduces efficiency in most industries. In this paper, we propose an address standardization method based on the seq2seq model to address these challenges. We use a Gated Recurrent Unit (GRU) to learn the intrinsic links within a Chinese address and an attention mechanism to determine the weight of the different parts of the address. To our knowledge, this is the first method that can complete missing administrative address information and correct erroneous address information without using additional resources such as a standard address database or a geological element table. It is also the first use of deep learning and an attention mechanism for the Chinese address standardization task. Experiments verify that our method achieves high accuracy in industry.
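
To make the role of attention concrete, the following is a minimal sketch of how attention weights over the parts of an input address could be computed for a single decoding step. It is not the authors' implementation: the abstract does not state the scoring function, so dot-product scoring is assumed here, and the names `attention_weights`, `context_vector`, and the toy vectors are hypothetical stand-ins for GRU hidden states.

```python
import math


def attention_weights(decoder_state, encoder_states):
    """Softmax over dot-product scores (scoring function is an assumption)."""
    scores = [
        sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states
    ]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def context_vector(weights, encoder_states):
    """Weighted sum of encoder states, one value per hidden dimension."""
    dim = len(encoder_states[0])
    return [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]


# Toy stand-ins for encoder hidden states, one per input address token.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Toy stand-in for the decoder hidden state at the current output step.
decoder_state = [1.0, 1.0]

weights = attention_weights(decoder_state, encoder_states)
context = context_vector(weights, encoder_states)
```

In this toy case the third input position scores highest against the decoder state, so it receives the largest weight and contributes most to the context vector used to predict the next output token.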

Cited By

  • (2022) Deep Learning Based Improvement in Overseas Manufacturer Address Quality Using Administrative District Data. Applied Sciences 12:21 (11129). DOI: 10.3390/app122111129. Online publication date: 2-Nov-2022

    Published In

    CIIS '19: Proceedings of the 2019 2nd International Conference on Computational Intelligence and Intelligent Systems
    November 2019
    200 pages
    ISBN:9781450372596
    DOI:10.1145/3372422

    In-Cooperation

    • Queensland University of Technology
    • City University of Hong Kong

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Address Standardization
    2. Attention
    3. Chinese Address
    4. Deep Learning

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    CIIS 2019
