DOI: 10.1145/2736277.2741137
research-article

Frankenplace: Interactive Thematic Mapping for Ad Hoc Exploratory Search

Published: 18 May 2015

    Abstract

    Ad hoc keyword search engines built using modern information retrieval methods handle fine-grained queries well. However, they perform poorly at facilitating spatial and spatially-embedded thematic exploration of the results, despite the fact that many queries, e.g., "civil war," refer to different documents and topics in different places. This is not for lack of data: geographic information, such as place names, events, and coordinates, is common in unstructured document collections on the web. The associations between geographic and thematic contents in these documents can provide a rich groundwork to organize information for exploratory research. In this paper we describe the architecture of an interactive thematic map search engine, Frankenplace, designed to facilitate document exploration at the intersection of theme and place. The map interface enables a user to zoom the geographic context of their query in and out, and to quickly explore thousands of search results in a meaningful way. Because the system combines topic models with geographically contextualized search results, users can also discover related topics based on geographic context. Frankenplace utilizes a novel indexing method called geoboost for boosting terms associated with cells on a discrete global grid. The resulting index factors in the geographic scale of the place or feature mentioned in related text, the relative textual scope of the place reference, and the overall importance of the containing document in the document network. The system currently indexes over 5 million documents from the web, including the English Wikipedia and online travel blog entries. We demonstrate that Frankenplace can support four distinct types of exploratory search tasks while being adaptive to scale and location of interest.
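The abstract names three factors behind the geoboost index (geographic scale, textual scope, and document importance) but does not give a formula. The following is only an illustrative sketch of how such a grid-cell term index might be assembled. It is not the authors' method: the plain latitude/longitude grid stands in for a true discrete global grid, and the `boost` formula, its direction (smaller features score higher), and every field name are assumptions made for the example.

```python
import math
from collections import defaultdict


def cell_id(lat, lon, cell_deg=1.0):
    """Map a coordinate to a (row, col) cell on a plain lat/lon grid.

    Assumption: this simple grid is a stand-in for a discrete global grid.
    """
    row = int(math.floor((lat + 90.0) / cell_deg))
    col = int(math.floor((lon + 180.0) / cell_deg))
    return (row, col)


def boost(feature_area_km2, scope_fraction, doc_importance):
    """Hypothetical boost combining the three factors named in the abstract.

    Assumptions: smaller features get a larger scale factor; scope_fraction
    is the share of the text tied to the place reference (0..1);
    doc_importance is a precomputed link-based score.
    """
    scale_factor = 1.0 / (1.0 + math.log10(1.0 + feature_area_km2))
    return scale_factor * scope_fraction * doc_importance


def build_index(mentions):
    """Accumulate boosted term weights per grid cell."""
    index = defaultdict(lambda: defaultdict(float))
    for m in mentions:
        cell = cell_id(m["lat"], m["lon"])
        weight = boost(m["area_km2"], m["scope"], m["importance"])
        for term in m["terms"]:
            index[cell][term] += weight
    return index


# Two hypothetical place mentions that fall in the same one-degree cell.
mentions = [
    {"lat": 0.5, "lon": 0.5, "area_km2": 0.0, "scope": 1.0,
     "importance": 1.0, "terms": ["war"]},
    {"lat": 0.7, "lon": 0.2, "area_km2": 0.0, "scope": 0.5,
     "importance": 1.0, "terms": ["war", "battle"]},
]
index = build_index(mentions)
```

A query could then be scored per cell by summing the stored weights of its terms, which would give the kind of per-cell intensity a thematic map layer needs.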

    Published In

    WWW '15: Proceedings of the 24th International Conference on World Wide Web
    May 2015
    1460 pages
    ISBN: 9781450334693

    Sponsors

    • IW3C2: International World Wide Web Conference Committee

    Publisher

    International World Wide Web Conferences Steering Committee

    Republic and Canton of Geneva, Switzerland

    Author Tags

    1. exploratory search
    2. geographic search
    3. information retrieval
    4. information visualization
    5. interactive search
    6. visual analytics

    Acceptance Rates

    WWW '15 Paper Acceptance Rate 131 of 929 submissions, 14%;
    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Cited By

    • (2023) MixMap: a user-driven approach to place-based semantic similarity. Cartography and Geographic Information Science, 1-16. DOI: 10.1080/15230406.2023.2176930. Online publication date: 2-Mar-2023
    • (2022) "I Think I Discovered a Military Base in the Middle of the Ocean": Null Island, the Most Real of Fictional Places. IEEE Access, 10, 84147-84165. DOI: 10.1109/ACCESS.2022.3197222. Online publication date: 2022
    • (2021) Implicit, Formal, and Powerful Semantics in Geoinformation. ISPRS International Journal of Geo-Information, 10:5, 330. DOI: 10.3390/ijgi10050330. Online publication date: 13-May-2021
    • (2020) Event Geoparser with Pseudo-Location Entity Identification and Numerical Argument Extraction Implementation and Evaluation in Indonesian News Domain. ISPRS International Journal of Geo-Information, 9:12, 712. DOI: 10.3390/ijgi9120712. Online publication date: 28-Nov-2020
    • (2020) A Fuzzy Spatial Region Extraction Model for Object's Vague Location Description from Observer Perspective. ISPRS International Journal of Geo-Information, 9:12, 703. DOI: 10.3390/ijgi9120703. Online publication date: 25-Nov-2020
    • (2020) A Review of Geospatial Semantic Information Modeling and Elicitation Approaches. ISPRS International Journal of Geo-Information, 9:3, 146. DOI: 10.3390/ijgi9030146. Online publication date: 1-Mar-2020
    • (2020) Harvesting Big Geospatial Data from Natural Language Texts. Handbook of Big Geospatial Data, 487-507. DOI: 10.1007/978-3-030-55462-0_19. Online publication date: 17-Dec-2020
    • (2019) Toward Universal Spatialization Through Wikipedia-Based Semantic Enhancement. ACM Transactions on Interactive Intelligent Systems, 9:2-3, 1-29. DOI: 10.1145/3213769. Online publication date: 9-Apr-2019
    • (2018) The thematic modelling of subtext. Multimedia Tools and Applications, 77:21, 28281-28308. DOI: 10.5555/3287850.3287874. Online publication date: 1-Nov-2018
    • (2018) From spatial representation to processes, relational networks, and thematic roles in geographic information retrieval. Proceedings of the 12th Workshop on Geographic Information Retrieval, 1-2. DOI: 10.1145/3281354.3281355. Online publication date: 6-Nov-2018
