Information Access Based on Associative Calculation

Takano, Akihiko; Niwa, Yoshiki; Nishioka, Shingo; Iwayama, Makoto; Hisamitsu, Toru; Imaichi, Osamu; Sakurai, Hirofumi

doi:10.1007/3-540-44411-4_12


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1963)

Included in the following conference series:

International Conference on Current Trends in Theory and Practice of Computer Science


Abstract

Statistical measures of similarity have been widely used in textual information retrieval for many decades. They are the basis for improving the effectiveness of IR systems, including retrieval, clustering, and summarization. We have developed an information retrieval system, DualNAVI, which provides users with rich interaction in both document space and word space. We show that associative calculation for measuring similarity among documents or words is the computational basis of this effective information access with DualNAVI. New approaches to document clustering (Hierarchical Bayesian Clustering) and to measuring term representativeness (the Baseline method) are also discussed. Both have a sound mathematical basis and depend essentially on associative calculation.
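The abstract does not define how associative calculation is carried out, so the following is only a generic sketch of the idea of measuring similarity in both document space and word space. It assumes plain term-frequency vectors and cosine similarity, which may differ from the measures DualNAVI actually uses; the sample documents and the names `cosine`, `doc_vectors`, and `word_vectors` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dict-like counters."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy collection (not from the paper).
docs = [
    "document clustering groups similar documents",
    "hierarchical clustering of documents",
    "term weighting for retrieval",
]

# Document space: each document is a vector indexed by word.
doc_vectors = [Counter(d.split()) for d in docs]

# Word space: each word is a vector indexed by document number
# (the transpose of the document-by-word matrix).
word_vectors = defaultdict(Counter)
for i, vec in enumerate(doc_vectors):
    for word, count in vec.items():
        word_vectors[word][i] = count

# Similarity between two documents, and between two words.
doc_sim = cosine(doc_vectors[0], doc_vectors[1])
word_sim = cosine(word_vectors["clustering"], word_vectors["documents"])
```

Under these assumptions the same routine serves both directions: documents are compared through the words they share, and words are compared through the documents they co-occur in.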



Author information

Authors and Affiliations

Central Research Laboratory, Hitachi, Ltd., 350-0395, Hatoyama, Saitama, Japan
Akihiko Takano, Yoshiki Niwa, Shingo Nishioka, Makoto Iwayama, Toru Hisamitsu, Osamu Imaichi & Hirofumi Sakurai


Editor information

Editors and Affiliations

Department of Cybernetics, Czech Technical University, Karlovo nám. 13, 121 35, Prague, Czech Republic
Václav Hlaváč
Information Technology Department, CLRC RAL, Chilton, Didcot, Oxfordshire, UK
Keith G. Jeffery
Institute of Computer Science, Academy of Sciences of the Czech Republic, Pod vodárenskou věží 2, 182 07, Prague, Czech Republic
Jiří Wiedermann

About this paper

Cite this paper

Takano, A. et al. (2000). Information Access Based on Associative Calculation. In: Hlaváč, V., Jeffery, K.G., Wiedermann, J. (eds) SOFSEM 2000: Theory and Practice of Informatics. SOFSEM 2000. Lecture Notes in Computer Science, vol 1963. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44411-4_12

DOI: https://doi.org/10.1007/3-540-44411-4_12
Published: 22 January 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41348-6
Online ISBN: 978-3-540-44411-4
eBook Packages: Springer Book Archive
