
Information Access Based on Associative Calculation

  • Conference paper
SOFSEM 2000: Theory and Practice of Informatics (SOFSEM 2000)

Abstract

Statistical measures of similarity have been widely used in textual information retrieval for many decades. They are the basis for improving the effectiveness of IR systems, including retrieval, clustering, and summarization. We have developed an information retrieval system, DualNAVI, which provides users with rich interaction in both document space and word space. We show that associative calculation for measuring similarity among documents or words is the computational basis of this effective information access with DualNAVI. New approaches to document clustering (Hierarchical Bayesian Clustering) and to measuring term representativeness (Baseline method) are also discussed. Both have a sound mathematical basis and depend essentially on associative calculation.
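To make the idea of measuring similarity in both document space and word space concrete, the following is a minimal sketch, not the DualNAVI implementation. It assumes raw term frequencies and cosine similarity, which the abstract does not specify; the toy corpus and the helper names (`tf_vectors`, `transpose`, `most_similar`) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def tf_vectors(docs):
    """Map each document id to a term-frequency vector (assumed weighting)."""
    return {doc_id: Counter(text.lower().split()) for doc_id, text in docs.items()}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def transpose(vectors):
    """Turn document->term weights into term->document weights."""
    out = {}
    for doc_id, vec in vectors.items():
        for term, w in vec.items():
            out.setdefault(term, {})[doc_id] = w
    return out

def most_similar(key, vectors, k=3):
    """Rank the other keys in `vectors` by cosine similarity to `key`."""
    scores = [(other, cosine(vectors[key], vec))
              for other, vec in vectors.items() if other != key]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical toy corpus, for illustration only.
docs = {
    "d1": "document clustering for information retrieval",
    "d2": "hierarchical clustering of document collections",
    "d3": "term weighting for automatic indexing",
}

doc_vecs = tf_vectors(docs)       # document space: documents as term vectors
word_vecs = transpose(doc_vecs)   # word space: words as document vectors

print(most_similar("d1", doc_vecs))           # documents closest to d1
print(most_similar("clustering", word_vecs))  # words closest to "clustering"
```

The same similarity function serves both directions because the word-space vectors are simply the transpose of the document-space vectors. A real system would likely replace raw term frequencies with a normalized weighting scheme.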




Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Takano, A. et al. (2000). Information Access Based on Associative Calculation. In: Hlaváč, V., Jeffery, K.G., Wiedermann, J. (eds) SOFSEM 2000: Theory and Practice of Informatics. SOFSEM 2000. Lecture Notes in Computer Science, vol 1963. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44411-4_12


  • DOI: https://doi.org/10.1007/3-540-44411-4_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41348-6

  • Online ISBN: 978-3-540-44411-4

  • eBook Packages: Springer Book Archive
