Gorgon: Accelerating machine learning from relational data
2020 ACM/IEEE 47th Annual International Symposium on Computer …, 2020•ieeexplore.ieee.org
Accelerator deployment in data centers remains limited despite domain-specific architectures' promise of higher performance. Rapidly changing applications and high NRE cost make deploying fixed-function accelerators at scale untenable. More flexible than DSAs, FPGAs are gaining traction but remain hampered by cumbersome programming models, long synthesis times, and slow clocks. Coarse-grained reconfigurable architectures (CGRAs) are a compelling alternative and offer efficiency while retaining programmability: by providing general-purpose hardware and communication patterns, a single CGRA targets multiple application domains. One emerging application is in-database machine learning: a high-performance, low-friction interface for analytics on large databases. We co-locate database and machine learning processing in a unified reconfigurable data analytics accelerator, Gorgon, which flexibly shares resources between DB and ML without compromising performance or incurring excessive overheads in either domain. We distill and integrate database parallel patterns into an existing ML-focused CGRA, increasing area by less than 4% while outperforming a multicore software baseline by 1500x. We also explore the performance impact of unifying DB and ML in a single accelerator, showing up to 4x speedup over split accelerators.