DOI: 10.1007/978-3-642-11957-6_27

A universal calculus for stream processing languages

Published: 20 March 2010

Abstract

    Stream processing applications such as algorithmic trading, MPEG processing, and web content analysis are ubiquitous and essential to business and entertainment. Language designers have developed numerous domain-specific languages that are both tailored to the needs of their applications, and optimized for performance on their particular target platforms. Unfortunately, the goals of generality and performance are frequently at odds, and prior work on the formal semantics of stream processing languages does not capture the details necessary for reasoning about implementations. This paper presents Brooklet, a core calculus for stream processing that allows us to reason about how to map languages to platforms and how to optimize stream programs. We translate from three representative languages, CQL, StreamIt, and Sawzall, to Brooklet, and show that the translations are correct. We formalize three popular and vital optimizations, data-parallel computation, operator fusion, and operator re-ordering, and show under which conditions they are correct. Language designers can use Brooklet to specify exactly how new features or languages behave. Language implementors can use Brooklet to show exactly under which circumstances new optimizations are correct. In ongoing work, we are developing an intermediate language for streaming that is based on Brooklet. We are implementing our intermediate language on System S, IBM's high-performance streaming middleware.
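
As a rough illustration of what it can mean for operator fusion to be correct, the sketch below models a stream program as operators connected by FIFO queues and checks that fusing two adjacent operators leaves the output unchanged. This is a toy model, not Brooklet's formal syntax or semantics: the names `Operator`, `run_pipeline`, and `fuse` are hypothetical, and the operators are assumed to be stateless and deterministic, which is a simplifying assumption made here rather than a condition taken from the paper.

```python
from collections import deque
from typing import Callable, Iterable, List

# Hypothetical toy model: an operator maps one input item to a list of
# output items. Operators are assumed stateless and deterministic.
Operator = Callable[[int], List[int]]


def run_pipeline(ops: List[Operator], source: Iterable[int]) -> List[int]:
    """Run a linear chain of operators, with one FIFO queue per edge."""
    queue = deque(source)
    for op in ops:
        out: deque = deque()
        while queue:
            # Each operator drains its input queue in arrival order.
            out.extend(op(queue.popleft()))
        queue = out
    return list(queue)


def fuse(first: Operator, second: Operator) -> Operator:
    """Combine two adjacent operators, removing the queue between them."""
    def fused(item: int) -> List[int]:
        return [y for x in first(item) for y in second(x)]
    return fused


def double(x: int) -> List[int]:
    return [2 * x]


def keep_multiples_of_four(x: int) -> List[int]:
    return [x] if x % 4 == 0 else []


data = [1, 2, 3, 4]
unfused = run_pipeline([double, keep_multiples_of_four], data)
fused = run_pipeline([fuse(double, keep_multiples_of_four)], data)

# In this toy setting, fusion changes the structure of the graph
# but not the observable output.
assert unfused == fused
```

In this sketch the equality holds because neither operator keeps state between items. An operator that reads or writes shared state would need a more careful argument, which is the kind of reasoning a calculus with explicit queues and state is meant to support.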




    Published In

    ESOP'10: Proceedings of the 19th European conference on Programming Languages and Systems
    March 2010
    629 pages
    ISBN: 3642119565
    Editor: Andrew D. Gordon

    Publisher

    Springer-Verlag, Berlin, Heidelberg


    Cited By

    • (2024) An Overview of Continuous Querying in (Modern) Data Systems. Companion of the 2024 International Conference on Management of Data, pp. 605-612. DOI: 10.1145/3626246.3654679. Online publication date: 9-Jun-2024
    • (2022) The essence of online data processing. Proceedings of the ACM on Programming Languages 6:OOPSLA2, pp. 899-928. DOI: 10.1145/3563320. Online publication date: 31-Oct-2022
    • (2022) Stream processing with dependency-guided synchronization. Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 1-16. DOI: 10.1145/3503221.3508413. Online publication date: 2-Apr-2022
    • (2018) Stream Processing Languages in the Big Data Era. ACM SIGMOD Record 47:2, pp. 29-40. DOI: 10.1145/3299887.3299892. Online publication date: 11-Dec-2018
    • (2018) Versatile event correlation with algebraic effects. Proceedings of the ACM on Programming Languages 2:ICFP, pp. 1-31. DOI: 10.1145/3236762. Online publication date: 30-Jul-2018
    • (2017) SPL. ACM Transactions on Programming Languages and Systems 39:1, pp. 1-39. DOI: 10.1145/3039207. Online publication date: 6-Mar-2017
    • (2016) River. Software—Practice & Experience 46:7, pp. 891-929. DOI: 10.1002/spe.2338. Online publication date: 1-Jul-2016
    • (2014) Rate types for stream programs. ACM SIGPLAN Notices 49:10, pp. 213-232. DOI: 10.1145/2714064.2660225. Online publication date: 15-Oct-2014
    • (2014) Rate types for stream programs. Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, pp. 213-232. DOI: 10.1145/2660193.2660225. Online publication date: 15-Oct-2014
    • (2014) A catalog of stream processing optimizations. ACM Computing Surveys 46:4, pp. 1-34. DOI: 10.1145/2528412. Online publication date: 1-Mar-2014
