Abstract
With vast amounts of text available in electronic form, such as news and social media, automatic multi-document summarization can help extract the most important information. We present and evaluate a novel method for automatic extractive multi-document summarization. The method is purely combinatorial and is based on bicliques in the bipartite word-sentence occurrence graph. It is particularly suited to collections of very short, independently written texts (often single sentences) with many repeated phrases, such as customer reviews of products. The method can run in time subquadratic in the number of documents, which matters when it is applied to large document collections.
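To make the central object concrete, the following is a minimal illustrative sketch of a word-sentence occurrence graph and of bicliques in it (a set of sentences that all contain a common set of words). It is not the authors' algorithm: the tokenizer, the function names, and the `min_words` threshold are assumptions made for this example, and the pairwise enumeration shown here is quadratic in the number of sentences, so it does not reflect the subquadratic running time mentioned in the abstract.

```python
from itertools import combinations


def tokenize(sentence):
    # Assumption: lowercase whitespace tokenization with basic punctuation stripped.
    return {w.strip(".,!?;:").lower() for w in sentence.split()} - {""}


def build_occurrence_graph(sentences):
    # Bipartite graph stored as one word set per sentence:
    # sentence i is adjacent to word w exactly when w occurs in sentence i.
    return [tokenize(s) for s in sentences]


def bicliques_from_pairs(word_sets, min_words=2):
    """Map each shared word set W (from some sentence pair) to all sentences containing W.

    Each (sentences, W) entry is a maximal biclique: W is exactly the set of
    words common to the two generating sentences, and every sentence that
    contains all of W is included.
    """
    found = {}
    for i, j in combinations(range(len(word_sets)), 2):
        shared = frozenset(word_sets[i] & word_sets[j])
        if len(shared) < min_words or shared in found:
            continue
        found[shared] = frozenset(
            k for k, words in enumerate(word_sets) if shared <= words
        )
    return found


if __name__ == "__main__":
    reviews = [
        "Great battery life",
        "The battery life is great",
        "Screen is too dim",
        "Battery life is great but screen is dim",
    ]
    graph = build_occurrence_graph(reviews)
    for words, sentence_ids in bicliques_from_pairs(graph).items():
        print(sorted(words), "->", sorted(sentence_ids))
```

In this toy setting, a word set shared by many short reviews points to a frequently repeated phrase, which is the kind of redundancy the abstract says the method exploits; how bicliques are then scored and turned into a summary is left to the paper itself.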
Acknowledgments
This work has been supported by Grant IIS11-0089 from the Swedish Foundation for Strategic Research (SSF), for the project “Data-driven secure business intelligence”. We thank our former master’s students Emma Bogren and Johan Toft for drawing our attention to similarity joins, and the members of our Algorithms group and collaborators at the companies Recorded Future and Findwise for many discussions.
Copyright information
© 2016 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Muhammad, A.S., Damaschke, P., Mogren, O. (2016). Summarizing Online User Reviews Using Bicliques. In: Freivalds, R., Engels, G., Catania, B. (eds) SOFSEM 2016: Theory and Practice of Computer Science. SOFSEM 2016. Lecture Notes in Computer Science, vol 9587. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-49192-8_46
DOI: https://doi.org/10.1007/978-3-662-49192-8_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-49191-1
Online ISBN: 978-3-662-49192-8
eBook Packages: Computer Science, Computer Science (R0)