Issue Information
No abstract is available for this article.
Accelerating unstructured finite volume computations on field-programmable gate arrays
In the paper, a field-programmable gate array (FPGA)-based framework is described to efficiently accelerate unstructured finite volume computations where the same mathematical expression has to be evaluated at every point of the mesh. The irregular ...
XpressSpace: a programming framework for coupling partitioned global address space simulation codes
Complex coupled multiphysics simulations are playing increasingly important roles in scientific and engineering applications such as fusion, combustion, and climate modeling. At the same time, extreme scales, increased levels of concurrency, and the ...
Visual exploration of data by using multidimensional scaling on multicore CPU, GPU, and MPI cluster
Visual and interactive data exploration requires fast and reliable tools for embedding of an original data space in 3(2)-dimensional Euclidean space. Multidimensional scaling (MDS) is a good candidate. However, owing to at least O(M^2) memory and time ...
Parallel spherical harmonic transforms on heterogeneous architectures (graphics processing units/multi-core CPUs)
Spherical harmonic transforms (SHT) are at the heart of many scientific and practical applications ranging from climate modelling to cosmological observations. In many of these areas, new cutting-edge science goals have been recently proposed requiring ...
PGAS-FMM: Implementing a distributed fast multipole method using the X10 programming language
The fast multipole method (FMM) is a complex, multi-stage algorithm over a distributed tree data structure, with multiple levels of parallelism and inherent data locality. X10 is a modern partitioned global address space language with support for ...
PU text classification enhanced by term frequency-inverse document frequency-improved weighting
Term frequency-inverse document frequency (TF-IDF), one of the most popular feature (also called term or word) weighting methods used to describe documents in the vector space model and the applications related to text mining and information retrieval, can ...
The remote sensing image enhancement based on nonsubsampled contourlet transform and unsharp masking
To restrain the pseudo-Gibbs phenomenon, low contrast, and blurring in the process of image enhancement, a new method based on the nonsubsampled contourlet transform and unsharp masking is proposed in this paper. The proposed method utilizes ...
CPU-GPU hybrid parallel strategy for cosmological simulations
Gadget is a simulation application for N-body and smoothed particle hydrodynamics problems in cosmology, and it is widely applied in solving a series of cosmological problems. N-body focuses on the motion of the interaction of N particles, and smoothed ...
MapReduce delay scheduling with deadline constraint
The MapReduce programming paradigm has been widely applied to solve large-scale data-intensive problems. Intensive studies of MapReduce scheduling have been carried out to improve MapReduce system performance. Delay scheduling is a common way to achieve ...
Register spilling via transformed interference equations for PAC DSP architecture
Digital signal processors (DSPs) with very long instruction word (VLIW) data-path architectures are increasingly being deployed on embedded devices for multimedia processing applications. To reduce the power consumption and design cost of VLIW DSP ...
Decision tree building on multi-core using FastFlow
The whole computer hardware industry embraced the multi-core. The extreme optimisation of sequential algorithms is then no longer sufficient to squeeze the real machine power, which can be only exploited via thread-level parallelism. Decision tree ...
Efficient parallel implementation of three-point Viterbi decoding algorithm on CPU, GPU, and FPGA
In wireless communication, the Viterbi decoding algorithm (VDA) is one of the most popular channel decoding algorithms, which is widely used in WLAN, WiMAX, or 3G communications. However, the throughput of the Viterbi decoder is constrained by the convolutional ...
A scalable architecture for concurrent online auctions
Online auction systems are characterised by a number of functional and performance management requirements, caused by the potentially very large numbers of distributed concurrent bidders, as well as by the auction rules. Such systems are typically ...