Abstract
With the increasing importance of text analytics across disciplines such as science, business, and social media analytics, it has become important to extract actionable insights from text in a timely manner. Insights from text analytics are conventionally presented to the analyst as visualizations and dashboards. While these dashboards are intended to be set up once and then observed passively, most real-world use cases require constant tweaking of them in order to adapt to new data analysis settings. Current systems supporting such analysis have grown from simplistic chains of aggregations to complex pipelines with a range of implicit (or latent) and explicit parametric knobs. Re-executing such pipelines can be computationally expensive, and the increased query-response time at each step may significantly delay the analysis task. Enabling the analyst to interactively tweak and explore this parameter space gives the analyst a better hold on the data and insights. We propose a novel interactive framework that allows social media analysts to tweak text mining dashboards not just during their development stage, but also during the analytics process itself. Our framework leverages opportunities unique to text pipelines to ensure fast response times, allowing for a smooth, rich, and usable exploration of the entire analytics space.
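To make the cost argument concrete, the following is a minimal sketch of one generic way to avoid full re-execution when a single knob changes: caching the output of each pipeline stage, keyed by the knobs that stage depends on. It is an illustration only and is not taken from the chapter's framework; the toy corpus `DOCS`, the stage names, and the knobs `min_len` and `k` are all hypothetical.

```python
from collections import Counter
from functools import lru_cache

# Hypothetical toy corpus; a real dashboard would read from a social media feed.
DOCS = (
    "great phone great battery",
    "battery died after a day",
    "love the phone camera",
)

CALLS = Counter()  # records how often each stage actually executes


@lru_cache(maxsize=None)
def tokenize():
    """Stage 1: has no knobs, so it runs once and is reused afterwards."""
    CALLS["tokenize"] += 1
    return tuple(tuple(doc.split()) for doc in DOCS)


@lru_cache(maxsize=None)
def term_counts(min_len):
    """Stage 2: depends on one knob, the minimum token length."""
    CALLS["term_counts"] += 1
    counts = Counter(t for doc in tokenize() for t in doc if len(t) >= min_len)
    # Sort by descending count, then alphabetically, for a stable ranking.
    return tuple(sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])))


def top_terms(min_len, k):
    """Stage 3: a cheap view over cached counts; changing k reuses stage 2."""
    return term_counts(min_len)[:k]
```

In this sketch, calling `top_terms(4, 2)` and then `top_terms(4, 3)` runs the counting stage only once, while changing `min_len` recomputes the counts but still reuses the cached tokenization.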
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Nandi, A. et al. (2015). Interactive Tweaking of Text Analytics Dashboards. In: Chu, W., Kikuchi, S., Bhalla, S. (eds) Databases in Networked Information Systems. DNIS 2015. Lecture Notes in Computer Science, vol 8999. Springer, Cham. https://doi.org/10.1007/978-3-319-16313-0_9
DOI: https://doi.org/10.1007/978-3-319-16313-0_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16312-3
Online ISBN: 978-3-319-16313-0
eBook Packages: Computer Science (R0)