Keyword: Parallelism : Search

research-article

Free

JUST ACCEPTED

SPARTA: High-Level Synthesis of Parallel Multi-Threaded Accelerators

ACM Transactions on Reconfigurable Technology and Systems (TRETS), Just Accepted https://doi.org/10.1145/3677035

This paper presents a methodology for the Synthesis of PARallel multi-Threaded Accelerators (SPARTA) from OpenMP annotated C/C++ specifications. SPARTA extends an open-source HLS tool, enabling the generation of accelerators that provide latency tolerance ...

research-article

A new family of fourth-order energy-preserving integrators

Yuto Miyatake

Numerical Algorithms (SPNA), Volume 96, Issue 3, Jul 2024, Pages 1269–1293, https://doi.org/10.1007/s11075-024-01824-w

Abstract

For Hamiltonian systems with non-canonical structure matrices, a new family of fourth-order energy-preserving integrators is presented. The integrators take a form of a combination of Runge–Kutta methods and continuous-stage Runge–Kutta methods ...

research-article

Open Access

WiseGraph: Optimizing GNN with Joint Workload Partition of Graph and Operations

EuroSys '24: Proceedings of the Nineteenth European Conference on Computer Systems, April 2024, Pages 1–17, https://doi.org/10.1145/3627703.3650063

Graph Neural Network (GNN) has emerged as an important workload for learning on graphs. With the size of graph data and the complexity of GNN model architectures increasing, developing an efficient GNN system grows more important. As GNN has heavy neural ...

research-article

ScaleCache: A Scalable Page Cache for Multiple Solid-State Drives

EuroSys '24: Proceedings of the Nineteenth European Conference on Computer Systems, April 2024, Pages 641–656, https://doi.org/10.1145/3627703.3629588

This paper presents a scalable page cache called ScaleCache for improving SSD scalability. Specifically, we first propose a concurrent data structure of page cache based on XArray (ccXArray) to enable access and update the page cache concurrently. Second, ...

research-article

Open Access

An Energy-Efficient Parallelism Scheme for Deep Neural Network Training And Inferencing on Heterogeneous Cloud Resources

ICIIT '24: Proceedings of the 2024 9th International Conference on Intelligent Information Technology, February 2024, Pages 493–498, https://doi.org/10.1145/3654522.3654596

The emergence of Large Language Models (LLM) and generative AI has led to an explosive increase in computational demands across cloud computing data centers. The growing number of parameters in deep learning models results in significant power consumption ...

Article

HPX with Spack and Singularity Containers: Evaluating Overheads for HPX/Kokkos Using an Astrophysics Application

Asynchronous Many-Task Systems and Applications, Feb 2024, Pages 173–184, https://doi.org/10.1007/978-3-031-61763-8_17

Abstract

Cloud computing for high performance computing resources is an emerging topic. This service is of interest to researchers who care about reproducible computing, for software packages with complex installations, and for companies or researchers who ...

research-article

Open Access

Explicit Effects and Effect Constraints in ReML

Martin Elsman

Proceedings of the ACM on Programming Languages (PACMPL), Volume 8, Issue POPL, Article No.: 79, Pages 2370–2394, https://doi.org/10.1145/3632921

An important aspect of building robust systems that execute on dedicated hardware and perhaps in constrained environments is to control and manage the effects performed by program code.

We present ReML, a higher-order statically-typed functional ...

research-article

Open Access

Discovering Parallelisms in Python Programs

ESEC/FSE 2023: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, November 2023, Pages 832–844, https://doi.org/10.1145/3611643.3616259

Parallelization is a promising way to improve the performance of Python programs. Unfortunately, developers may miss parallelization possibilities, because they usually do not concentrate on parallelization. Many approaches have been proposed to ...

research-article

Open Access

CPU and GPU Parallelism of the A* Algorithm on solving N-Puzzle problems

PCI '23: Proceedings of the 27th Pan-Hellenic Conference on Progress in Computing and Informatics, November 2023, Pages 21–25, https://doi.org/10.1145/3635059.3635063

This paper discusses the implementation of parallelism on the A* algorithm, using both the central processing unit and the graphics processing unit, in order to increase its efficiency in terms of the necessary time to solve Sliding Puzzle problems. The ...

Article

A Pipelined AES and SM4 Hardware Implementation for Multi-tasking Virtualized Environments

Algorithms and Architectures for Parallel Processing, Oct 2023, Pages 275–291, https://doi.org/10.1007/978-981-97-0801-7_16

Abstract

Virtualization techniques are becoming increasingly prevalent and are driving trends in hardware development to offer parallelization support for multi-tasking. Existing works on hardware designs of the Advanced Encryption Standard (AES) and SM4 ...

research-article

Parallel Execution of Transactions Based on Dynamic and Self-Verifiable Conflict Analysis

LADC '23: Proceedings of the 12th Latin-American Symposium on Dependable and Secure Computing, October 2023, Pages 110–119, https://doi.org/10.1145/3615366.3615425

In most blockchains, miners execute transactions sequentially, while validators reproduce the execution to validate its results. Although simple, this approach does not exploit modern multi-core resources efficiently, thus limiting performance and ...

research-article

Parallel approaches to extract multi-level high utility itemsets from hierarchical transaction databases

Knowledge-Based Systems (KNBS), Volume 276, Issue C, Sep 2023, https://doi.org/10.1016/j.knosys.2023.110733

Abstract

In the field of data mining, high utility itemset mining (HUIM) is a relevant mining task, with the aim of analyzing customer transaction databases. HUIM consists of exploiting the set of items that are often purchased together and ...

Highlights

Parallelism is applied at many parts of the algorithm to improve mining performance.

Article

Construction of Locality-Aware Algorithms to Optimize Performance of Stencil Codes on Heterogeneous Hardware

Supercomputing, Sep 2023, Pages 147–161, https://doi.org/10.1007/978-3-031-49435-2_11

Abstract

Recently, an increase in code performance has been obtained mainly through parallelism. For codes that implement stencil schemes, parallel processing requires data-intensive exchange. When parallel threads need to communicate, memory bandwidth ...

Article

Scalable Random Forest with Data-Parallel Computing

Euro-Par 2023: Parallel Processing, Aug 2023, Pages 397–410, https://doi.org/10.1007/978-3-031-39698-4_27

Abstract

In the last years, there has been a significant increment in the quantity of data available and computational resources. This leads scientific and industry communities to pursue more accurate and efficient Machine Learning (ML) models. Random ...

research-article

Conjugate Gradients Acceleration of Coordinate Descent for Linear Systems

Dan Gordon

Journal of Scientific Computing (JSCI), Volume 96, Issue 3, Sep 2023, https://doi.org/10.1007/s10915-023-02307-1

Abstract

This paper introduces a conjugate gradients (CG) acceleration of the coordinate descent algorithm (CD) for linear systems. It is shown that the Kaczmarz algorithm (KACZ) can simulate CD exactly, so CD can be accelerated by CG similarly to the CG ...

research-article

FPGA Design of Transposed Convolutions for Deep Learning Using High-Level Synthesis

Journal of Signal Processing Systems (JSPS), Volume 95, Issue 10, Oct 2023, Pages 1245–1263, https://doi.org/10.1007/s11265-023-01883-7

Abstract

Deep Learning (DL) is pervasive across a wide variety of domains. Convolutional Neural Networks (CNNs) are often used for image processing DL applications. Modern CNN models are growing to meet the needs of more sophisticated tasks, e.g. using ...

research-article

Parallel multi-GPU implementation of fast decoupled power flow solver with hybrid architecture

Cluster Computing (KLU-CLUS), Volume 27, Issue 1, Feb 2024, Pages 1125–1136, https://doi.org/10.1007/s10586-023-04064-0

Abstract

Achieving high solution efficiency on conventional sequential computation architecture is a challenging task due to penetration of multiple renewable energy sources (RESs). This challenge has become the bottleneck for the application in ...

research-article

Open Access

Parallelism in a Region Inference Context

Proceedings of the ACM on Programming Languages (PACMPL), Volume 7, Issue PLDI, Article No.: 142, Pages 884–906, https://doi.org/10.1145/3591256

Region inference is a type-based program analysis that takes a non-annotated program as input and constructs a program that explicitly manages memory allocation and deallocation by dividing the heap into a stack of regions, each of which can grow and ...

research-article

Area-latency efficient floating point adder using interleaved alignment and normalization

Microprocessors & Microsystems (MSYS), Volume 99, Issue C, Jun 2023, https://doi.org/10.1016/j.micpro.2023.104842

Highlights

Bidirectional barrel shifter replaces the two barrel shifters in conventional FP adder.

Abstract

The barrel shifter is an indispensable floating-point (FP) adder circuit. It performs the alignment on the mantissa of the smallest FP number and also normalizes the added mantissa in a conventional FP adder. Alignment and ...

Article

Constraint Propagation on GPU: A Case Study for the Cumulative Constraint

Integration of Constraint Programming, Artificial Intelligence, and Operations Research, May 2023, Pages 336–353, https://doi.org/10.1007/978-3-031-33271-5_22

Abstract

The Cumulative constraint is one of the most important global constraints, as it naturally arises in a variety of problems related to scheduling with limited resources. Devising fast propagation algorithms that run at every node of the search tree ...
