Keyword: Multi-threading : Search

research-article

A fast high resolution distributed hydrological model for forecasting, climate scenarios and digital twin applications using wflow_sbm

Environmental Modelling & Software (ENMS), Volume 179, Issue C, https://doi.org/10.1016/j.envsoft.2024.106099

Abstract

We investigated improvements to further speed up the multi-threaded scaling of the distributed hydrological model wflow_sbm. To gain insight into the speed improvements for operational applications, we connected the improved code to ECMWF’s Fields ...

Highlights

Improved multi-threaded scaling of the distributed hydrological model wflow_sbm.
Two to eleven times faster run times on high-performance computing clusters.
Distributed hydrological model application for large-scale hydrological ...

Article

Rapid Prototyping of Complex Micro-architectures Through High-Level Synthesis

Applied Reconfigurable Computing. Architectures, Tools, and Applications, Pages 19–34, https://doi.org/10.1007/978-3-031-42921-7_2

Abstract

Register-Transfer Level (RTL) design has been a traditional approach in hardware design for several decades. However, with the growing complexity of designs and the need for fast time-to-market, the design and verification process at the RTL level ...

research-article

A novel ILU preconditioning method with a block structure suitable for SIMD vectorization

Journal of Computational and Applied Mathematics (JCAM), Volume 419, Issue C, https://doi.org/10.1016/j.cam.2022.114687

Abstract

Incomplete LU (ILU) preconditioning is typically used when an iterative solver is applied on an asymmetric system of linear equations. A fill-in selection policy significantly affects the ILU preconditioned iterative solver. In this ...

research-article

Studying error propagation on application data structure and hardware

The Journal of Supercomputing (JSCO), Volume 78, Issue 17, Pages 18691–18724, https://doi.org/10.1007/s11227-022-04625-x

Abstract

As technology scales, transistors become smaller and aggressive power optimization techniques combined with high operation frequencies and performance-enhancing microarchitectural techniques are employed to achieve increasingly higher performance ...

research-article

A novel hybrid multi-thread metaheuristic approach for fake news detection in social media

Gungor Yildirim

Applied Intelligence (KLU-APIN), Volume 53, Issue 9, Pages 11182–11202, https://doi.org/10.1007/s10489-022-03972-9

Abstract

In fake news detection, intelligent optimization seems to be a more effective and explainable solution methodology than the black-box methods that have been extensively used in the literature. This study takes the optimization-based method one ...

research-article

Fuzzing with automatically controlled interleavings to detect concurrency bugs

Journal of Systems and Software (JSSO), Volume 191, Issue C, https://doi.org/10.1016/j.jss.2022.111379

Abstract

Concurrency vulnerabilities are an irresistible threat to security, and detecting them is challenging. Triggering the concurrency vulnerabilities requires a specific thread interleaving and a bug-inducing input. Existing methods have ...

Highlights

Grey-box fuzzing to detect concurrency vulnerabilities.
Adjusting the thread ...

research-article

Parallelizing filter-and-verification based exact set similarity joins on multicores

Information Systems (ISYS), Volume 108, Issue C, https://doi.org/10.1016/j.is.2021.101912

Abstract

Set similarity join (SSJ) is a well studied problem with many algorithms proposed to speed up its performance. However, its scalability and performance are rarely discussed in modern multicore environments. Existing algorithms assume a ...

Highlights

Multi-threading has not yet been considered to speed up set similarity joins.
...

research-article

Real-time edge computing on multi-processes and multi-threading architectures for deep learning applications

Shih Hsiung Lee

Microprocessors & Microsystems (MSYS), Volume 92, Issue C, https://doi.org/10.1016/j.micpro.2022.104554

Abstract

As the computing power of embedded system hardware devices continues to grow, more and more deep learning models have been gradually transplanted into edge devices. Accordingly, a variety of application scenarios have been developed ...

Article

QR Factorization Using Malleable BLAS on Multicore Processors

High Performance Computing. ISC High Performance 2022 International Workshops, Pages 176–189, https://doi.org/10.1007/978-3-031-23220-6_12

Abstract

We demonstrate that significant performance benefits can be obtained via the exploitation of malleability in a framework designed to implement portable and high-performance BLAS-like kernels. For this purpose, we integrate thread-level ...

research-article

Parallel best-first search algorithms for planning problems on multi-core processors

The Journal of Supercomputing (JSCO), Volume 78, Issue 3, Pages 3122–3151, https://doi.org/10.1007/s11227-021-03986-z

Abstract

The multiplication of computing cores in modern processor units permits revisiting the design of classical algorithms to improve computational performance in complex application domains. Artificial Intelligence planning is one of those ...

research-article

Cronista: A multi-database automated provenance collection system for runtime-models

Information and Software Technology (INST), Volume 141, Issue C, https://doi.org/10.1016/j.infsof.2021.106694

Abstract

Context: Decision making by software systems that face uncertainty needs tracing to support understandability, as accountability is crucial. While logging has been essential to support explanations and understandability of ...

Highlights

Cronista automatically logs the provenance of changes to a system’s runtime model.

research-article

Parallel and distributed association rule mining in life science: A novel parallel algorithm to mine genomics data

Information Sciences: an International Journal (ISCI), Volume 575, Issue C, Pages 747–761, https://doi.org/10.1016/j.ins.2018.07.055

Highlights

A parallel algorithm for Association rule mining.
Association rule mining of ...

Abstract

Association rule mining (ARM) is largely employed in several scientific areas and application domains, and many different algorithms for learning association rules from databases have been introduced. Despite the presence of many ...

research-article

A mathematical framework for design discovery from multi-threaded applications using neural sequence solvers

Innovations in Systems and Software Engineering (SPISSE), Volume 17, Issue 3, Pages 289–307, https://doi.org/10.1007/s11334-021-00393-8

Abstract

Comprehending existing multi-threaded applications effectively is a challenge without proper assistance. Research has been proposed to mine programs to extract aspects of high-level design but not much to reverse-engineer the concurrent design ...

Article

Drill Pipe Counting Method Based on Local Dense Optical Flow Estimation

Image and Graphics, Pages 443–454, https://doi.org/10.1007/978-3-030-87355-4_37

Abstract

To solve the problem of drill pipe counting in coal mines, we propose a drill pipe counting method based on local dense optical flow (LDOF) estimation. Compared to the general tracking method, we provide a new perspective for resolving the problem ...

Article

Locality: The 3rd Wall and the Need for Innovation in Parallel Architectures

Architecture of Computing Systems, Pages 3–18, https://doi.org/10.1007/978-3-030-81682-7_1

Abstract

In the past we have seen two major “walls” (memory and power) whose vanquishing required significant advances in architecture. This paper discusses evidence of a third wall dealing with data locality, which is prevalent in data intensive ...

research-article

An intelligent memory caching architecture for data-intensive multimedia applications

Multimedia Tools and Applications (MTAA), Volume 80, Issue 11, Pages 16743–16761, https://doi.org/10.1007/s11042-020-08805-w

Abstract

With the rapid developments in cloud computing and mobile networks, multimedia content can be accessed conveniently. Recently, some novel intelligent caching-based approaches have been proposed to improve the memory architectures for multimedia ...

research-article

Multiple-tasks on multiple-devices (MTMD): exploiting concurrency in heterogeneous managed runtimes

VEE 2021: Proceedings of the 17th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, Pages 125–138, https://doi.org/10.1145/3453933.3454019

Modern commodity devices are nowadays equipped with a plethora of heterogeneous devices serving different purposes. Being able to exploit such heterogeneous hardware accelerators to their full potential is of paramount importance in the pursuit of ...

research-article

Programming parallel dense matrix factorizations with look-ahead and OpenMP

Cluster Computing (KLU-CLUS), Volume 23, Issue 1, Pages 359–375, https://doi.org/10.1007/s10586-019-02927-z

Abstract

We investigate a parallelization strategy for dense matrix factorization (DMF) algorithms, using OpenMP, that departs from the legacy (or conventional) solution, which simply extracts concurrency from a multi-threaded version of basic linear ...

research-article

Combining sequentialization-based verification of multi-threaded C programs with symbolic Partial Order Reduction

International Journal on Software Tools for Technology Transfer (STTT), Volume 21, Issue 5, Pages 545–565, https://doi.org/10.1007/s10009-019-00507-5

Abstract

Sequentialization has been shown to be an effective symbolic verification technique for safety properties in multi-threaded C programs using POSIX threads. The tool Lazy-CSeq, which applies a lazy sequentialization scheme, demonstrated its ...

research-article

AggrePlay: efficient record and replay of multi-threaded programs

ESEC/FSE 2019: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Pages 567–577, https://doi.org/10.1145/3338906.3338959

Deterministic replay presents challenges and often results in high memory and runtime overheads. Previous studies deterministically reproduce program outputs often only after several replay iterations or may produce a non-deterministic sequence of ...


