Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to main content

Showing 1–3 of 3 results for author: Dhanotia, A

.
  1. arXiv:2303.08396  [pdf, other

    cs.DC cs.AR

    Workload Behavior Driven Memory Subsystem Design for Hyperscale

    Authors: Suyash Mahar, Hao Wang, Wei Shu, Abhishek Dhanotia

    Abstract: Hyperscalars run services across a large fleet of servers, serving billions of users worldwide. These services, however, behave differently than commonly available benchmark suites, resulting in server architectures that are not optimized for cloud workloads. With datacenters becoming a primary server processor market, optimizing server processors for cloud workloads by better understanding their… ▽ More

    Submitted 2 May, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

  2. TPP: Transparent Page Placement for CXL-Enabled Tiered-Memory

    Authors: Hasan Al Maruf, Hao Wang, Abhishek Dhanotia, Johannes Weiner, Niket Agarwal, Pallab Bhattacharya, Chris Petersen, Mosharaf Chowdhury, Shobhit Kanaujia, Prakash Chauhan

    Abstract: The increasing demand for memory in hyperscale applications has led to memory becoming a large portion of the overall datacenter spend. The emergence of coherent interfaces like CXL enables main memory expansion and offers an efficient solution to this problem. In such systems, the main memory can constitute different memory technologies with varied characteristics. In this paper, we characterize… ▽ More

    Submitted 28 May, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

  3. arXiv:2107.04140  [pdf, other

    cs.AR

    First-Generation Inference Accelerator Deployment at Facebook

    Authors: Michael Anderson, Benny Chen, Stephen Chen, Summer Deng, Jordan Fix, Michael Gschwind, Aravind Kalaiah, Changkyu Kim, Jaewon Lee, Jason Liang, Haixin Liu, Yinghai Lu, Jack Montgomery, Arun Moorthy, Satish Nadathur, Sam Naghshineh, Avinash Nayak, Jongsoo Park, Chris Petersen, Martin Schatz, Narayanan Sundaram, Bangsheng Tang, Peter Tang, Amy Yang, Jiecao Yu , et al. (90 additional authors not shown)

    Abstract: In this paper, we provide a deep dive into the deployment of inference accelerators at Facebook. Many of our ML workloads have unique characteristics, such as sparse memory accesses, large model sizes, as well as high compute, memory and network bandwidth requirements. We co-designed a high-performance, energy-efficient inference accelerator platform based on these requirements. We describe the in… ▽ More

    Submitted 4 August, 2021; v1 submitted 8 July, 2021; originally announced July 2021.