Parallel Processing
Examples
1. The large-scale structure of the Universe
Contemporary research in astrophysics has deep and important connections to particle
physics. Observations of large structures in the universe led physicists to the
discovery of dark matter and dark energy, and understanding these new forms
of matter will change our view of the universe on all scales, including the particle
scale and the human scale. Theoretical developments in astrophysics must be tested
against vast amounts of data collected by instruments, such as the Hubble Space
Telescope, as well as against the results of supercomputer simulation experiments,
like the Millennium Run [5]. These data sets are available in public databases and are
being mined by scientists to gain intuition and to make new discoveries, but the
researchers are limited by the technological means available to access the data. In
order to analyze astrophysical data, researchers write scripts that perform database
queries, transfer the resulting data sets to their local computers and store them as flat
files. Even this limited access has already produced important discoveries. For example,
a new log-power density spectrum was recently discovered through such analysis of the
data in the Millennium Run database [6]. This is the most efficient quantitative
description of the distribution of the density of matter in the Universe obtained so far.
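The query-and-download workflow described here can be sketched in a few lines. The following Python is a hedged illustration only: it uses a local SQLite file as a stand-in for the remote archive, and the table, column names, and snapshot number are hypothetical, not the actual schema or access interface of the Millennium Run database.

    # Sketch of the workflow above: query a catalogue, store the result as a flat file.
    # The "halos" table and its columns are hypothetical stand-ins for a real archive.
    import csv
    import sqlite3

    def export_density_sample(db_path, out_path, snapshot):
        """Run a query against the local stand-in catalogue and dump a CSV flat file."""
        conn = sqlite3.connect(db_path)
        cur = conn.execute(
            "SELECT x, y, z, density FROM halos WHERE snapnum = ?", (snapshot,))
        with open(out_path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["x", "y", "z", "density"])
            writer.writerows(cur)
        conn.close()

    if __name__ == "__main__":
        # Build a tiny example catalogue so the sketch runs end to end.
        conn = sqlite3.connect("catalogue.db")
        conn.execute("CREATE TABLE IF NOT EXISTS halos (x, y, z, density, snapnum)")
        conn.executemany("INSERT INTO halos VALUES (?, ?, ?, ?, ?)",
                         [(0.1, 0.2, 0.3, 5.0, 63), (1.0, 2.0, 3.0, 7.5, 63)])
        conn.commit()
        conn.close()
        export_density_sample("catalogue.db", "density_snapshot63.csv", snapshot=63)

The resulting flat file is what researchers would then analyze locally, which is exactly the access pattern the text describes as limiting.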
2. Computational modeling of the cochlea
The human cochlea is a remarkable, highly nonlinear transducer that extracts vital
information from sound pressure and converts it into neuronal impulses that are sent
to the auditory cortex. The cochlea’s accuracy, amplitude range and frequency range
are orders of magnitude better than those of man-made transducers. Understanding its
function has tremendous medical and engineering significance. The two most
fundamental questions of cochlear research are to provide a mathematical description
of the transform computed by the cochlea and to explain the biological mechanisms
that compute this transform. Presently there is no adequate answer to either of these
two questions. Signal processing in the cochlea is carried out by a collection of
coupled biological processes occurring on length scales ranging from one
centimeter down to a fraction of a nanometer. A comprehensive model describing
the coupling of the dynamics of the biological processes occurring on multiple scales
is needed in order to achieve system level understanding of cochlear signal
processing. A model of cochlear macro-mechanics was constructed in 1999–2002 by
Givelberg and Bunn [18], who used supercomputers to generate very large data sets,
containing results of simulation experiments. These results were stored as flat files
which were subsequently analyzed by the authors on workstations using specially
developed software. A set of web pages devoted to this research [19] is widely and
frequently accessed; however, the data were never exposed to the wider community for
analysis, since no tools to ingest simulation output into a database existed when the
cochlea model was developed.
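As a hypothetical illustration of the missing ingestion step, the following Python sketch loads flat-file simulation output into a relational table. The column layout (time, position, displacement) and all names are assumptions made for the example, not the actual output format of the Givelberg-Bunn simulations.

    # Sketch of ingesting flat-file simulation output into a queryable database,
    # the step noted above as unavailable when the cochlea model was developed.
    import csv
    import sqlite3

    def ingest_run(csv_path, db_path, run_id):
        """Load one simulation run from a flat CSV file into a database table."""
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE IF NOT EXISTS membrane_state "
                     "(run_id TEXT, t REAL, x REAL, displacement REAL)")
        with open(csv_path, newline="") as f:
            rows = ((run_id, float(t), float(x), float(d)) for t, x, d in csv.reader(f))
            conn.executemany("INSERT INTO membrane_state VALUES (?, ?, ?, ?)", rows)
        conn.commit()
        conn.close()

    if __name__ == "__main__":
        # Write a tiny example flat file so the sketch runs end to end.
        with open("run_001.csv", "w", newline="") as f:
            csv.writer(f).writerows([(0.0, 0.5, 1.2e-9), (0.001, 0.5, 1.4e-9)])
        ingest_run("run_001.csv", "cochlea.db", "run_001")

Once ingested in this way, results could be queried and shared rather than re-read from flat files with special-purpose software.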
Characteristics
Several common characteristics of data-intensive computing systems distinguish them
from other forms of computing:
(1) The principle of collocation of the data and the programs or algorithms used to
perform the computation. To achieve high performance in data-intensive computing,
it is important to minimize the movement of data.[19] This characteristic allows
processing algorithms to execute on the nodes where the data resides, reducing system
overhead and increasing performance.[20] Newer technologies such as InfiniBand
allow data to be stored in a separate repository and provide performance comparable
to collocated data (see the sketch after this list).
(4) The inherent scalability of the underlying hardware and software architecture.
Data-intensive computing systems can typically be scaled in a linear fashion to
accommodate virtually any amount of data, or to meet time-critical performance
requirements, simply by adding processing nodes. The number of nodes and
processing tasks assigned to a specific application can be variable or fixed depending
on the hardware, software, communications, and distributed file system architecture.
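Characteristics (1) and (4) can be illustrated together with a short sketch. This is a hypothetical Python illustration, not the API of any particular data-intensive platform: the node names, partition layout, and per-partition work are assumptions made for the example, and "adding nodes" is modeled simply as registering more workers.

    # Sketch: tasks run on the node that already holds each data partition
    # (collocation, characteristic 1), and capacity grows by registering more
    # nodes and re-balancing partitions (scalability, characteristic 4).
    from collections import defaultdict

    def place_partitions(partitions, nodes):
        """Assign partitions to nodes round-robin (stand-in for a distributed file system)."""
        return {p: nodes[i % len(nodes)] for i, p in enumerate(partitions)}

    def schedule(placement):
        """Group per-partition tasks by the node that already stores the data."""
        plan = defaultdict(list)
        for partition, node in placement.items():
            plan[node].append(partition)
        return plan

    def run_local(partition, data):
        """Stand-in for node-local processing; only this small result crosses the network."""
        return sum(data[partition])

    if __name__ == "__main__":
        data = {f"part-{i}": list(range(i * 100, (i + 1) * 100)) for i in range(8)}
        for nodes in (["node-a", "node-b"], ["node-a", "node-b", "node-c", "node-d"]):
            placement = place_partitions(sorted(data), nodes)
            plan = schedule(placement)
            total = sum(run_local(p, data) for parts in plan.values() for p in parts)
            # With more nodes, each node handles fewer partitions; the answer is unchanged.
            print(len(nodes), "nodes ->", {n: len(ps) for n, ps in plan.items()}, "total =", total)

The design point is that only small per-partition results move over the network, and doubling the number of registered nodes roughly halves the partitions each node must process.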
The importance of big data doesn’t revolve around how much data you have, but what
you do with it. You can take data from any source and analyze it to find answers that
enable cost reductions, time savings, new product development, optimized
offerings, and smarter decision making. When you combine big data with high-
powered analytics, you can accomplish business-related tasks such as:
Generating coupons at the point of sale based on the customer’s buying habits.
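The coupon example can be made concrete with a small, purely illustrative sketch: pick a coupon from a shopper's most frequently purchased category. The transaction fields, categories, and discount are hypothetical assumptions, not any particular retailer's system.

    # Illustrative sketch of the coupon example: derive a coupon from buying habits.
    from collections import Counter

    def coupon_for(purchase_history):
        """Return a coupon based on the customer's most frequently bought category."""
        if not purchase_history:
            return None
        categories = Counter(item["category"] for item in purchase_history)
        top_category, _ = categories.most_common(1)[0]
        return f"10% off {top_category}"

    if __name__ == "__main__":
        history = [{"category": "coffee"}, {"category": "coffee"}, {"category": "bread"}]
        print(coupon_for(history))  # -> "10% off coffee"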