Author: Li, Cunlu : Search

short-paper

Xraytest: An X-ray Test system for finding faults of RDMA-NIC Design and Implementation

NAIC '24: Proceedings of the 2024 SIGCOMM Workshop on Networks for AI Computing, Pages 7–8, https://doi.org/10.1145/3672198.3673802

This paper presents a test system that can find faults in RDMA-NIC design and implementation, much as an X-ray does in a medical examination, and that can help close the gap between self-developed RDMA NICs and commercial RDMA NICs such as Mellanox. Our work, Xraytest, ...

research-article

DRLAR: A deep reinforcement learning-based adaptive routing framework for network-on-chips

Computer Networks: The International Journal of Computer and Telecommunications Networking (CNTW), Volume 246, Issue C, https://doi.org/10.1016/j.comnet.2024.110419

Abstract

Adaptive routing plays a pivotal role in the overall performance of Network-on-Chips (NoCs). However, with many-core architectures supporting complex and constantly changing traffic patterns for emerging applications, this aspect presents ...

article

A survey of machine learning for Network-on-Chips

Journal of Parallel and Distributed Computing (JPDC), Volume 186, Issue C, https://doi.org/10.1016/j.jpdc.2023.104778

Abstract

The popularity of Machine Learning (ML) has extended to numerous disciplines, including the domain of Network-on-chips (NoCs), leading to a consequential impact. Recent works have explored ML models' applicability for NoCs design, optimization, ...

Highlights

Introduction of applying common Machine Learning (ML) techniques for Network-on-Chips (NoCs).
A comprehensive survey of ML for NoCs from performance prediction and NoCs design perspective.
ML-based for NoCs performance modeling, ...

research-article

A Deterministic Embedded End-System Tightly Coupled With TSN Schedule

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCADICS), Volume 42, Issue 11, Pages 3707–3719, https://doi.org/10.1109/TCAD.2023.3248500

Distributed real-time systems (DRTSs) composed of many embedded end-systems have been widely adopted in the industrial fields. Time-sensitive networking (TSN), as a promising communication infrastructure for DRTS, has shown great potential in industry and ...

Article

DeTAR: A Decision Tree-Based Adaptive Routing in Networks-on-Chip

Euro-Par 2023: Parallel Processing, Pages 352–366, https://doi.org/10.1007/978-3-031-39698-4_24

Abstract

The deployment of heuristic algorithms is extensively utilized in the routing policy of Networks-on-Chip (NoCs). However, the escalating complexity and heterogeneity of multi-core architectures present a formidable task for human-designed ...

poster

Poster Abstract: A Network-on-Chip Router Architecture for Industrial Internet-of-Thing Gateways

IPSN '23: Proceedings of the 22nd International Conference on Information Processing in Sensor Networks, Pages 302–303, https://doi.org/10.1145/3583120.3589818

More processors are being integrated into Industrial Internet-of-Thing gateways to run a growing number of emerging applications. Network-on-chip (NoC) offers a scalable, high-throughput, and energy-efficient communication infrastructure. However, existing NoC ...

research-article

Revisiting network congestion avoidance through adaptive packet-chaining reservation

Computer Networks: The International Journal of Computer and Telecommunications Networking (CNTW), Volume 212, Issue C, https://doi.org/10.1016/j.comnet.2022.109008

Abstract

Endpoint congestion is a bottleneck in high-performance computing (HPC) networks, which severely impacts system performance, especially for latency-sensitive applications. When long messages (or flows) have a far larger duration ...

Highlights

PCRP selects the packet chaining as the efficient granularity of the reservation.

research-article

Open Access

MUA-Router: Maximizing the Utility-of-Allocation for On-chip Pipelining Routers

ACM Transactions on Architecture and Code Optimization (TACO), Volume 19, Issue 3, Article No.: 33, Pages 1–23, https://doi.org/10.1145/3519027

As an important pipeline stage in the router of Network-on-Chips, switch allocation assigns output ports to input ports and allows flits to transit through the switch without conflicts. Previous work designed efficient switch allocation strategies by ...

Article

Evaluation of Topology-Aware All-Reduce Algorithm for Dragonfly Networks

Network and Parallel Computing, Pages 243–255, https://doi.org/10.1007/978-3-030-93571-9_19

Abstract

Dragonfly is a popular topology for current and future high-speed interconnection networks. The concept of gathering topology information to accelerate collective operations is a very hot research field. All-reduce operations are often used in the ...

research-article

Open Access

CIB-HIER: Centralized Input Buffer Design in Hierarchical High-radix Routers

ACM Transactions on Architecture and Code Optimization (TACO), Volume 18, Issue 4, Article No.: 50, Pages 1–21, https://doi.org/10.1145/3468062

Hierarchical organization is widely used in high-radix routers to enable efficient scaling to higher switch port count. A general-purpose hierarchical router must be symmetrically designed with the same input buffer depth, resulting in a large amount of ...

research-article

Network Congestion Avoidance through Packet-chaining Reservation

ICPP '19: Proceedings of the 48th International Conference on Parallel Processing, Article No.: 58, Pages 1–10, https://doi.org/10.1145/3337821.3337874

Endpoint congestion is a bottleneck in high-performance computing (HPC) networks and severely impacts system performance, especially for latency-sensitive applications. For long messages (or flows) whose duration is far larger than the round-trip time (...

research-article

DeepHiR: improving high-radix router throughput with deep hybrid memory buffer microarchitecture

ICS '19: Proceedings of the ACM International Conference on Supercomputing, Pages 403–413, https://doi.org/10.1145/3330345.3330381

Hierarchical high-radix router microarchitecture consisting of small SRAM-based intermediate buffers has been used in the interconnection networks of large-scale supercomputers. While hierarchical organization enables efficient scaling to higher switch port ...

article

HARE: History-Aware Adaptive Routing Algorithm for Endpoint Congestion in Networks-on-Chip

International Journal of Parallel Programming (IJPP), Volume 47, Issue 3, Pages 433–450, https://doi.org/10.1007/s10766-018-0614-6

Endpoint congestion is one of the most challenging issues when designing low latency and high bandwidth on-chip interconnection networks. Tree saturation and head-of-line blocking caused by the endpoint congestion seriously decrease system throughput ...

research-article

RoB-Router : A Reorder Buffer Enabled Low Latency Network-on-Chip Router

IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 29, Issue 9, Pages 2090–2104, https://doi.org/10.1109/TPDS.2018.2817552

Traditional input-queued routers in network-on-chips (NoCs) only have a small number of virtual channels (VCs) and packets in a VC are organized in a fixed order. Such design is susceptible to head-of-line (HoL) blocking as only the packet at the head of ...

research-article

Exploiting contention and congestion aware switch allocation in network-on-chips

ACM TURC '17: Proceedings of the ACM Turing 50th Celebration Conference - China, Article No.: 41, Pages 1–10, https://doi.org/10.1145/3063955.3063997

The network-on-chip plays an important role in improving the performance of chip multiprocessor systems. As the complexity of the network increases, congestion has become the major performance bottleneck and seriously influences the performance ...

research-article

Galaxyfly: A Novel Family of Flexible-Radix Low-Diameter Topologies for Large-Scales Interconnection Networks

ICS '16: Proceedings of the 2016 International Conference on Supercomputing, Article No.: 24, Pages 1–12, https://doi.org/10.1145/2925426.2926275

The interconnection network plays an essential role in the architecture of large-scale high-performance computing (HPC) systems. In this paper, we construct a novel family of low-diameter topologies, Galaxyfly, using techniques of algebraic graphs over ...
