Submodular Streaming in All its Glory: Tight Approximation, Minimum Memory and Low Adaptive Complexity

Kazemi, Ehsan; Mitrovic, Marko; Zadimoghaddam, Morteza; Lattanzi, Silvio; Karbasi, Amin

arXiv:1905.00948 (cs)

[Submitted on 2 May 2019 (v1), last revised 13 May 2019 (this version, v2)]

Authors: Ehsan Kazemi, Marko Mitrovic, Morteza Zadimoghaddam, Silvio Lattanzi, Amin Karbasi

Abstract: Streaming algorithms are generally judged by the quality of their solution, memory footprint, and computational complexity. In this paper, we study the problem of maximizing a monotone submodular function in the streaming setting with a cardinality constraint $k$. We first propose Sieve-Streaming++, which requires just one pass over the data, keeps only $O(k)$ elements and achieves the tight $(1/2)$-approximation guarantee. The best previously known streaming algorithms either achieve a suboptimal $(1/4)$-approximation with $\Theta(k)$ memory or the optimal $(1/2)$-approximation with $O(k\log k)$ memory. Next, we show that by buffering a small fraction of the stream and applying a careful filtering procedure, one can heavily reduce the number of adaptive computational rounds, thus substantially lowering the computational complexity of Sieve-Streaming++. We then generalize our results to the more challenging multi-source streaming setting. We show how one can achieve the tight $(1/2)$-approximation guarantee with $O(k)$ shared memory while minimizing not only the required rounds of computations but also the total number of communicated bits. Finally, we demonstrate the efficiency of our algorithms on real-world data summarization tasks for multi-source streams of tweets and of YouTube videos.
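
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of a single-pass threshold-sieve scheme in the spirit of Sieve-Streaming++, not the authors' pseudocode. It assumes a geometric grid of thresholds $(1+\varepsilon)^i$, a running lower bound on the optimum taken from the best candidate set found so far, and pruning of thresholds that fall below that bound. The names `sieve_streaming_pp`, `coverage`, `DEMO_SETS` and the parameter `eps` are hypothetical and chosen for this example.

```python
import math
from typing import Callable, Dict, Hashable, Iterable, List

Item = Hashable
SetFn = Callable[[List[Item]], float]


def sieve_streaming_pp(stream: Iterable[Item], f: SetFn, k: int,
                       eps: float = 0.1) -> List[Item]:
    """One-pass threshold sieve (sketch; details are assumptions).

    f is assumed to be monotone submodular and non-negative.
    """
    delta = 0.0  # largest singleton value seen so far
    lb = 0.0     # value of the best candidate set so far (lower bound on OPT)
    sieves: Dict[int, List[Item]] = {}  # exponent i -> set for threshold (1+eps)^i

    for e in stream:
        delta = max(delta, f([e]))
        if delta <= 0:
            continue  # nothing useful seen yet
        # Assumed pruning rule: thresholds well below max(lb, delta) / (2k)
        # can no longer lead to the best solution, so they are dropped.
        tau_min = max(lb, delta) / (2 * k)
        lo = math.ceil(math.log(tau_min / (1 + eps), 1 + eps))
        hi = math.floor(math.log(delta, 1 + eps))
        sieves = {i: s for i, s in sieves.items() if i >= lo}

        for i in range(lo, hi + 1):
            s = sieves.setdefault(i, [])
            tau = (1 + eps) ** i
            # Keep e only if the set has room and the marginal gain reaches tau.
            if len(s) < k and f(s + [e]) - f(s) >= tau:
                s.append(e)
                lb = max(lb, f(s))

    return max(sieves.values(), key=f, default=[])


# Hypothetical toy objective: set coverage, which is monotone submodular.
DEMO_SETS = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5, 6, 7, 8}, "d": {1}}


def coverage(items: List[Item]) -> float:
    return float(len(set().union(*(DEMO_SETS[x] for x in items))))


if __name__ == "__main__":
    summary = sieve_streaming_pp(["a", "b", "c", "d"], coverage, k=2)
    print(summary, coverage(summary))
```

Because thresholds below the running lower bound are discarded as better candidate sets appear, the number of live sets shrinks as the solution improves. This is the intuition behind the reduced memory footprint, under the assumptions stated above.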

Comments:	Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019
Subjects:	Machine Learning (cs.LG); Data Structures and Algorithms (cs.DS); Machine Learning (stat.ML)
Cite as:	arXiv:1905.00948 [cs.LG]
	(or arXiv:1905.00948v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1905.00948

Submission history

From: Ehsan Kazemi
[v1] Thu, 2 May 2019 19:58:57 UTC (1,345 KB)
[v2] Mon, 13 May 2019 16:49:53 UTC (1,345 KB)
