Project Report (Parallel SM - NW) PDF
Alignment Algorithms
CSE4001 – Parallel and Distributed Computing
By
SHASHWAT NEGI
(17BCE0322)
YASH BHOJWANI
(17BCE0614)
RACHANA REDDY
(17BCE0278)
March 2019
DECLARATION
Place : Vellore
Date : 5/11/19
CONTENTS
Acknowledgement
Abstract
Table of Contents
List of Figures
List of Tables
Abbreviations
1 INTRODUCTION
1.1 Objective
1.2 Motivation
2 LITERATURE SURVEY
3 TECHNICAL SPECIFICATION
5 PROPOSED SYSTEM
7 CONCLUSION
8 REFERENCES
APPENDIX
List of Figures
List of Tables
List of Abbreviations
S-W Smith-Waterman
N-W Needleman-Wunsch
OMP Open Multi-Processing
FPGA Field Programmable Gate Array
MPI Message Passing Interface
DNA Deoxyribonucleic Acid
ABSTRACT
Parallel and distributed computing is the future of technology. Many products and their
fundamental concepts are being shifted to a parallel computing model. Serial computing is
easy to implement and use, but it is often not efficient enough for industry-level purposes.
For this reason, a growing number of companies provide and use cloud solutions that are
built on parallel and distributed computing. For instance, Amazon's AWS and Google's
Google Cloud Platform are becoming centers for development, whether in the field of
web development or in the field of data analytics. To keep up with this fast-paced
improvement in technology, one must become familiar with this domain, which is the purpose
of this project. The gene sequencing problem is one of the major issues for researchers, who
need optimized system models that process data efficiently without introducing overheads in
terms of memory and time. Bioinformatics and computational biology form a recent
multidisciplinary field: bioinformatics draws on many areas of computer science, while
computational biology harnesses computational approaches and technologies to answer
biological questions.
The libraries used in this project are: <stdio.h>, <stdlib.h>, <math.h>, <omp.h>, <time.h>.
We mainly learn how to detect similar sequences in proteins and nucleic acids, and how
to detect them in parallel using multiple threads.
We use this as the inspiration for our project, to create something on a smaller scale but
with a large scope.
1. INTRODUCTION
1.1 OBJECTIVE
1.2 MOTIVATION
As time passes, the world is becoming more and more oriented towards parallel computing.
Many of the tasks that were once carried out sequentially are now carried out in parallel
so as to use resources more efficiently and get faster results. Genomics is an emerging field
that constantly presents new challenges to researchers in both its biological and its
computational aspects. With the Smith-Waterman algorithm it is possible to process and
understand nucleic acid and protein sequences. Sequence comparison is an essential
operation: it detects similar or identical parts between two sequences, called the query
sequence and the reference sequence. Global and local alignment are the most prevalent kinds
of sequence alignment. In global alignment, we find the best match over the entire length of
both sequences. Local alignment algorithms, on the other hand, try to match parts of the
sequences and not the entirety of them. Local alignment can be faster than global alignment
in practice, since it does not need to align the entire sequences. In our project, we
implement the Smith-Waterman (local sequence alignment) algorithm for randomly generated
nucleotide sequences in a serial and a parallel manner for comparison and analysis. As
expected, the parallel implementation should produce the same result as the serial
implementation but in less time.
2. LITERATURE SURVEY
In this paper, Ernst and Zaid present an OpenCL-based FPGA Smith-Waterman implementation
that employs two key techniques to greatly improve the utilization of its underlying systolic array
architecture. By eliminating centralized control and using Query Buffers, an arbitrary number of
alignments can be in flight at the same time, resulting in utilization close to the theoretical
maximum. In parallel computer architectures, a systolic array is a homogeneous network
of tightly coupled data processing units called cells or nodes. Field-Programmable Gate Arrays
(FPGAs), with their flexible and reprogrammable substrate, are a natural fit for a computationally
intensive algorithm such as the Smith-Waterman algorithm. The Query Buffer contains, for each
Processing Element, a separate queue with the query symbols for the alignments it needs to process.
Whenever a Processing Element encounters a new read token, it checks the query length
to verify whether it is active during this alignment; if so, it reads the next query symbol from its
queue. Only the Input Parser and Output Parser communicate with the on-board DDR memory.
In this paper, a new approach is introduced to reduce the complexity of the Smith-Waterman
algorithm for FPGA implementation. The technique targets fast comparison of two DNA
sequences and is implemented in Verilog on Xilinx ISE 7.1. The paper reports that a Smith-Waterman
algorithm based on the divide and conquer technique gives better performance than the existing
technique. The Smith-Waterman algorithm is a computationally complex algorithm for DNA
sequence comparison, and the divide and conquer technique helps to reduce the complexity of the
main structure. In previous works, the main structure was broken up into a few modules called
sub-functions, on a cluster basis; however, the result was less sensitive. In this technique, the
functions involved in the Smith-Waterman algorithm are broken up into a few sub-functions:
prediction of the sequence score, the previous result score, and optimization of the sequencing.
In this paper, Hsien-Yu Liao, Meng-Lai Yin and Yi Cheng present a parallel implementation
methodology for the Smith-Waterman algorithm. The power of parallelization lies in the massive
number of comparisons. When an unknown sequence is compared with different existing sequences,
each comparison is independent of the others and can be performed separately. This observation
highlights the potential for massive parallelism in this particular application. Using this method,
the authors show that high efficiency can be achieved.
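The independence of the comparisons described above can be sketched in C with OpenMP. This is
only an illustration, not the method of the surveyed paper: the names sw_score and score_all are
hypothetical, the scores (+5, -3, -4) are the ones this report uses later, and each comparison
keeps only two rows of the matrix because only the score is needed.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed scoring values, borrowed from this report's own setup. */
enum { SC_MATCH = 5, SC_MISMATCH = -3, SC_GAP = -4 };

static int max2(int a, int b) { return a > b ? a : b; }

/* Best local alignment score of q against t, using two rolling rows.
   Returns -1 if memory could not be allocated. */
int sw_score(const char *q, const char *t)
{
    size_t m = strlen(q), n = strlen(t);
    int *prev = calloc(n + 1, sizeof *prev);
    int *curr = calloc(n + 1, sizeof *curr);
    int best = 0;

    if (!prev || !curr) { free(prev); free(curr); return -1; }
    for (size_t i = 1; i <= m; i++) {
        curr[0] = 0;
        for (size_t j = 1; j <= n; j++) {
            int diag = prev[j - 1] + (q[i - 1] == t[j - 1] ? SC_MATCH : SC_MISMATCH);
            int up   = prev[j] + SC_GAP;
            int left = curr[j - 1] + SC_GAP;
            curr[j] = max2(0, max2(diag, max2(up, left)));
            best = max2(best, curr[j]);
        }
        int *tmp = prev; prev = curr; curr = tmp;   /* reuse the two rows */
    }
    free(prev);
    free(curr);
    return best;
}

/* Scores one query against `count` database sequences.
   No iteration depends on another, so the loop can be shared among threads. */
void score_all(const char *query, const char *const *db, int count, int *scores)
{
    #pragma omp parallel for schedule(dynamic)
    for (int k = 0; k < count; k++)
        scores[k] = sw_score(query, db[k]);
}
```

Compiled without OpenMP support, the pragma is ignored and the loop simply runs serially with the
same result.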
Parallel Processing Cell Score Design of Linear Gap Penalty Smith-Waterman Algorithm
In this paper, an optimized computational processing element for the linear Smith-Waterman
algorithm cell score, based on a parallel computational approach, is introduced. Two-bit optimized
comparator blocks were used to compare the DNA sequence characters, while the computation block
was used to complete the computation while reducing the number of components involved. The
complete architecture was designed and developed in Altera Quartus II version 13.0 and targeted to
an Altera Cyclone IV EP4CE115 Field Programmable Gate Array (FPGA). Recognizing the importance
of the logic functionality of the design, a modular design approach was adopted. Simulation of the
individual modules and of the complete architecture was used to validate the design against the
expected result. Finally, the results of the study indicate that the optimized computational
processing element design is feasible to implement as the processing element in DNA sequence
alignment.
In this paper, a hybrid parallel model is introduced that combines both shared and distributed
memory architectures to improve the performance of the Smith-Waterman (SW) algorithm. The hybrid
model uses both MPI and OpenMP as programming techniques for the different memory architectures.
The improved implementation executes a parallel version of the SW algorithm with a row-wise
computation of the alignment matrix, which mainly optimizes memory usage, using different
parallel programming models and their corresponding implementations.
In this paper, Aruk, Ustek, and Kursun give a comparative simulation of three software-based
methods in terms of their runtimes and their errors in the estimated position of the start of the
deletion in the query sequences. They find that BinaryPartialAlign has the lowest error and very
high speed. Partial alignment aims at finding the best split of a query sequence into a former and
a latter part such that their matching scores with (different parts of) the reference sequence are
mutually maximized. The paper gives a more detailed comparison of the classical SW,
IncrementalPartialAlign, and BinaryPartialAlign. BinaryPartialAlign has better EDPE than the other
two and is faster than IncrementalPartialAlign, making it the best option among the three.
In this paper, a new architecture is presented for implementing the matrix fill-up stage of the
SW algorithm. The matrix fill-up stage is one of the most time-consuming operations in the SW
algorithm. The conventional implementation of this phase takes four cycles to compute each
element of the matrix; the newly proposed design reduces this latency to three cycles. However,
the proposed design takes one overhead cycle at the start of processing to ensure 100% accuracy.
With the newly proposed architecture, the SW algorithm achieves up to 25% speedup.
An Efficient and High Performance Linear Recursive Variable Expansion Implementation of
This paper presents an efficient and high performance linear recursive variable expansion (RVE)
implementation of the Smith-Waterman (S-W) algorithm and compares it with a traditional linear
systolic array implementation. The linear RVE implementation is efficient in terms of hardware
utilization (both slices and IOBs) and high performance in terms of time consumption (latency).
The results demonstrate that the linear RVE implementation is up to 2.33 times faster than a
traditional linear systolic array implementation, at the cost of utilizing 2 times more resources.
3. TECHNICAL SPECIFICATION
Package Requirements:
<stdio.h> The C programming language provides many standard library functions for file
input and output. These functions are declared in stdio.h.
<stdlib.h> stdlib.h is the header of the general purpose standard library of the C programming
language, which includes functions for memory allocation, process control, conversions
and others. It is compatible with C++, where it is known as cstdlib. The name "stdlib" stands
for "standard library".
<math.h> The math.h header defines various mathematical functions and one macro. The basic
functions available in this library take double as an argument and return double as the result.
<omp.h> The header of the OpenMP runtime library, which supports shared-memory
multiprocessing programming in C.
<time.h> In the C programming language, time.h (used as ctime in C++) is a header file defined
in the C Standard Library that contains time and date function declarations to provide
standardized access to time/date manipulation and formatting.
Smith-Waterman Algorithm
The Smith-Waterman algorithm calculates the local alignment of two sequences. It is guaranteed to
find the best possible local alignment with respect to the specified scoring system, which includes
a substitution matrix and a gap-scoring method. Scores account for matches, mismatches
(substitutions) and gaps. To measure the similarity between two sequences, a score is calculated
as follows:
Given an alignment between sequences S0 and S1, the following values must be assigned, for each
column:
Procedure:
ma = (+5), for a match
mi = (-3), for a mismatch
G = (-4), for a gap
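The scoring values above can be put to work in a short serial sketch of the matrix fill step. It
assumes the standard Smith-Waterman recurrence with a linear gap penalty, where each cell is the
maximum of zero and the three neighbouring candidates; the function name sw_fill and its
parameters are hypothetical and are not taken from the project code.

```c
#include <stddef.h>
#include <string.h>

#define MA 5      /* score for a match (value from the report) */
#define MI (-3)   /* score for a mismatch */
#define G  (-4)   /* score for a gap */

static int max_of(int a, int b) { return a > b ? a : b; }

/* Fills the (m+1) x (n+1) similarity matrix H (row-major, allocated by the
   caller) for sequences s0 (length m) and s1 (length n).
   Returns the best local score; *bi and *bj receive the cell where it first
   occurs, which is where a traceback would start. */
int sw_fill(const char *s0, const char *s1, int *H, size_t *bi, size_t *bj)
{
    size_t m = strlen(s0), n = strlen(s1), w = n + 1;
    int best = 0;
    *bi = *bj = 0;

    for (size_t j = 0; j <= n; j++) H[j] = 0;              /* first row */
    for (size_t i = 1; i <= m; i++) {
        H[i * w] = 0;                                       /* first column */
        for (size_t j = 1; j <= n; j++) {
            int diag = H[(i - 1) * w + (j - 1)] + (s0[i - 1] == s1[j - 1] ? MA : MI);
            int up   = H[(i - 1) * w + j] + G;
            int left = H[i * w + (j - 1)] + G;
            int h = max_of(0, max_of(diag, max_of(up, left)));
            H[i * w + j] = h;
            if (h > best) { best = h; *bi = i; *bj = j; }
        }
    }
    return best;
}
```

For S0 = "ACGT" and S1 = "ACT", this yields a best score of 11: three matches (+15) and one
gap (-4).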
Example:
Needleman-Wunsch Algorithm
To show that the proposed algorithm performs better, we compared it with the
Needleman-Wunsch algorithm. The Needleman-Wunsch algorithm is an algorithm used
in bioinformatics to align protein or nucleotide sequences. It was one of the first applications
of dynamic programming to the comparison of biological sequences. The algorithm was developed
by Saul B. Needleman and Christian D. Wunsch and published in 1970. [1] The algorithm
essentially divides a large problem into a series of smaller problems, and it uses the solutions
to the smaller problems to find an optimal solution to the larger problem. [2] It is also
sometimes referred to as the optimal matching algorithm and the global alignment technique.
This method aligns the pair of sequences from end to end, so the entire length of each sequence
is taken into account. An optimal score is calculated from a matrix in which each cell holds
the maximum similarity obtainable using the match, mismatch and gap penalty values. The
optimal alignment is obtained by tracing back through the matrix.
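The matrix fill and trace back described above can be sketched in C as follows. This is a minimal
serial illustration, not the project's NWomp.c: the name nw_align is hypothetical, and the scores
(+5, -3, -4) are assumed to be the same as those used for Smith-Waterman in this report.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Assumed scoring values (same as the Smith-Waterman section). */
enum { NW_MATCH = 5, NW_MISMATCH = -3, NW_GAP = -4 };

/* Globally aligns a and b. Returns the optimal score and writes the gapped,
   NUL-terminated sequences to out_a and out_b, which must each have room for
   strlen(a) + strlen(b) + 1 chars. Returns INT_MIN if allocation fails. */
int nw_align(const char *a, const char *b, char *out_a, char *out_b)
{
    size_t m = strlen(a), n = strlen(b), w = n + 1;
    int *F = malloc((m + 1) * w * sizeof *F);
    if (!F) return INT_MIN;

    /* Unlike local alignment, leading gaps are penalised. */
    for (size_t i = 0; i <= m; i++) F[i * w] = (int)i * NW_GAP;
    for (size_t j = 0; j <= n; j++) F[j] = (int)j * NW_GAP;

    for (size_t i = 1; i <= m; i++)
        for (size_t j = 1; j <= n; j++) {
            int diag = F[(i - 1) * w + j - 1] + (a[i - 1] == b[j - 1] ? NW_MATCH : NW_MISMATCH);
            int up   = F[(i - 1) * w + j] + NW_GAP;
            int left = F[i * w + j - 1] + NW_GAP;
            int best = diag > up ? diag : up;
            F[i * w + j] = best > left ? best : left;
        }
    int score = F[m * w + n];

    /* Trace back from the bottom-right corner; output is built reversed. */
    size_t i = m, j = n, k = 0;
    while (i > 0 || j > 0) {
        if (i > 0 && j > 0 && F[i * w + j] ==
                F[(i - 1) * w + j - 1] + (a[i - 1] == b[j - 1] ? NW_MATCH : NW_MISMATCH)) {
            out_a[k] = a[--i]; out_b[k] = b[--j];      /* match or mismatch */
        } else if (i > 0 && F[i * w + j] == F[(i - 1) * w + j] + NW_GAP) {
            out_a[k] = a[--i]; out_b[k] = '-';         /* gap in b */
        } else {
            out_a[k] = '-';    out_b[k] = b[--j];      /* gap in a */
        }
        k++;
    }
    out_a[k] = out_b[k] = '\0';
    for (size_t l = 0; l < k / 2; l++) {               /* undo the reversal */
        char t = out_a[l]; out_a[l] = out_a[k - 1 - l]; out_a[k - 1 - l] = t;
        t = out_b[l]; out_b[l] = out_b[k - 1 - l]; out_b[k - 1 - l] = t;
    }
    free(F);
    return score;
}
```

Aligning "ACGT" with "ACT" under these assumptions gives a score of 11 and the alignment
ACGT / AC-T.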
One application of the algorithm is finding sequence alignments of DNA or protein sequences. It
is also a space-efficient way to calculate the longest common subsequence between two sets
of data such as with the common diff tool.
Existing Tool:
There are a few existing tools that have a parallel implementation of the Smith-Waterman
algorithm, but the most prominent one is Clustal W [EMBOSS Water]. Link:
https://www.ebi.ac.uk/Tools/psa/emboss_water/
EMBOSS Water uses the Smith-Waterman algorithm (modified for speed enhancements) to
calculate the local alignment of two sequences.
The alignment can be performed for protein, DNA or RNA sequences.
4. DESIGN
5. PROPOSED SYSTEM
The problem at hand was tackled with a modular approach. Eight functions were constructed,
which are explained as follows:
nElement – This function calculates the number of elements to be processed by the
Smith-Waterman algorithm. Three conditions are checked to find out whether the
number of elements in the diagonal is increasing, decreasing or stable.
calcFirstDiagElement – This function calculates the position of the
maximum-scored value in the matrix. This value needs to be found because the
algorithm starts the backtracking that recovers the alignment path from this
particular point.
similarityScore – This function finds the optimal score for a cell based on three
conditions, which calculate the new values from the left, upper and diagonal
elements. In every iteration, the maximum of these values is updated and inserted
into the similarity and predecessor matrices.
Generate – This function generates the two sequences A and B that are
locally aligned with each other. A fixed random seed is used to ensure that the
output is reproducible.
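The role of the diagonal in the functions above can be illustrated with a small sketch. Cells on
the same anti-diagonal of the similarity matrix do not depend on each other, so they can be
computed by different threads. The names diag_len and sw_wavefront are hypothetical and only
loosely modelled on nElement and similarityScore; the indexing scheme (diagonal d holds the cells
with i + j = d) is an assumption, not the project's actual code.

```c
#include <string.h>

/* Number of interior cells on anti-diagonal d (cells with i + j == d,
   1 <= i <= m, 1 <= j <= n). The count grows, stays stable, then shrinks. */
long diag_len(long d, long m, long n)
{
    long lo = d - n > 1 ? d - n : 1;      /* first row on this diagonal */
    long hi = d - 1 < m ? d - 1 : m;      /* last row on this diagonal */
    return hi >= lo ? hi - lo + 1 : 0;
}

/* Fills H ((m+1) x (n+1), row-major, caller-allocated) diagonal by diagonal
   and returns the best local score. Scores +5 / -3 / -4 as in the report. */
int sw_wavefront(const char *a, const char *b, int *H)
{
    long m = (long)strlen(a), n = (long)strlen(b), w = n + 1;
    int best = 0;

    for (long j = 0; j <= n; j++) H[j] = 0;
    for (long i = 0; i <= m; i++) H[i * w] = 0;

    for (long d = 2; d <= m + n; d++) {            /* diagonals in order */
        long lo = d - n > 1 ? d - n : 1;
        long len = diag_len(d, m, n);
        /* Cells of one diagonal only read the two previous diagonals. */
        #pragma omp parallel for reduction(max : best)
        for (long t = 0; t < len; t++) {
            long i = lo + t, j = d - i;
            int h = H[(i - 1) * w + (j - 1)] + (a[i - 1] == b[j - 1] ? 5 : -3);
            int up = H[(i - 1) * w + j] - 4;
            int left = H[i * w + (j - 1)] - 4;
            if (up > h) h = up;
            if (left > h) h = left;
            if (h < 0) h = 0;
            H[i * w + j] = h;
            if (h > best) best = h;
        }
    }
    return best;
}
```

For a 4 x 3 matrix the diagonal lengths are 1, 2, 3, 3, 2, 1, which shows the increasing, stable
and decreasing phases mentioned for nElement.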
6. Working
Both programs have been parallelized using the sections and for directives only. The number
of threads has been set to the same value for both.
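As an illustration of how these two directives can be combined, the sketch below generates two
sequences in separate sections and then uses a parallel for loop over them. It is not the
project's code: all names are hypothetical, the thread count of 2 is taken from the conclusion of
this report, and a small per-section generator replaces rand() so that each section has its own
state and the output stays reproducible.

```c
#include <stddef.h>

#define NUM_THREADS 2   /* assumed: the same thread count for every region */

/* Tiny linear congruential generator; each caller owns its state. */
static unsigned lcg_next(unsigned *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 16;
}

/* Writes len random nucleotides plus a terminating NUL into seq. */
void fill_random_dna(char *seq, size_t len, unsigned seed)
{
    static const char bases[4] = { 'A', 'C', 'G', 'T' };
    for (size_t i = 0; i < len; i++)
        seq[i] = bases[lcg_next(&seed) & 3u];
    seq[len] = '\0';
}

/* sections directive: the two sequences are generated independently. */
void generate_pair(char *a, size_t m, char *b, size_t n)
{
    #pragma omp parallel sections num_threads(NUM_THREADS)
    {
        #pragma omp section
        fill_random_dna(a, m, 1u);      /* fixed seeds keep runs reproducible */
        #pragma omp section
        fill_random_dna(b, n, 2u);
    }
}

/* for directive: counts positions where the two sequences agree. */
size_t count_matches(const char *a, const char *b, size_t len)
{
    long total = 0;
    #pragma omp parallel for num_threads(NUM_THREADS) reduction(+ : total)
    for (long i = 0; i < (long)len; i++)
        if (a[i] == b[i]) total++;
    return (size_t)total;
}
```

Using the same NUM_THREADS constant in every parallel region is one simple way to keep the thread
count equal across both programs.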
Figure 4b) Executing SW
Executing NWomp.c (Needleman-Wunsch Algorithm )
GTACGCAAACGGGT--G
G-GCG-AGACTACTACG
7. RESULTS AND DISCUSSION
Figure 6 Graph
8. CONCLUSION
As we can see in the above graph, for short sequences the Smith-Waterman
and Needleman-Wunsch implementations tend to produce their output in almost the
same amount of time. However, as the length (the number of characters) increases,
the execution time of the Needleman-Wunsch implementation rises sharply,
forming a steep graph.
The Smith-Waterman implementation, in contrast, remains stable without a large rise
in execution time because of the parallel execution of the task with two threads,
which makes it faster than the other algorithm. Hence, the proposed system will be
useful when dealing with large amounts of data and can be used for practical purposes.
9. REFERENCES
4. Parallel Processing Cell Score Design of Linear Gap Penalty Smith-Waterman Algorithm