vbind is a user-friendly tool for RNA sequencing. In particular, it can be used to compute and visualize the bindings between a pool of sRNA nucleotides and a genome sequence.
In what follows, we describe the two parts of our software tool. The first part computes matchings between a gene and a nucleotide pool. The second part provides tools to visualize the solution to the matching problem.
- Key ingredients: the gene and the pool.
  - gene: The gene sequence should be a string of characters from the alphabet {A, T, G, C}. We require the gene sequence to be specified in a text file without any line breaks.
  - pool: The pool is a collection of several sRNA nucleotides, each represented as a string of characters from the alphabet {A, T, G, C}. Ideally, we require a text file in which each line is a nucleotide sequence. However, since a pool is typically derived from a gene bank, we accommodate some other formats for specifying the pool. In particular, we accept a text file containing the nucleotides of the pool interleaved with other information that is irrelevant for sequencing. Hence, the pool should be specified by a text file in which every `k`-th line, for some `k` > 0, is a nucleotide sequence.
- Other settings:
  - lines to skip: the number of lines to skip (denoted above by `k`) in the text file describing the pool before reading a valid nucleotide sequence.
  - tolerance: the maximum number of mismatches allowed.
  - topology of the gene: an integer that takes the value 0 for linear matching and 1 for circular matching.
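To make the roles of the tolerance and topology settings concrete, here is a minimal sketch of position-by-position matching. It is an illustration under stated assumptions, not vbind's actual implementation: the function names `count_mismatches` and `find_matches` are hypothetical, mismatches are counted as a plain Hamming distance, and any strand or complementarity handling the tool may perform is not modeled.

```python
def count_mismatches(window, read):
    """Number of positions at which two equal-length strings differ."""
    return sum(a != b for a, b in zip(window, read))


def find_matches(gene, read, tolerance=0, circular=False):
    """Return 0-based start positions where `read` aligns to `gene`
    with at most `tolerance` mismatches (hypothetical helper)."""
    n, m = len(gene), len(read)
    if m == 0 or m > n:
        return []
    if circular:
        # Append the first m-1 characters so that windows starting near
        # the end of the gene can wrap around to its beginning.
        extended = gene + gene[:m - 1]
        starts = range(n)
    else:
        extended = gene
        starts = range(n - m + 1)
    return [s for s in starts
            if count_mismatches(extended[s:s + m], read) <= tolerance]
```

With the gene `ATGCA` and the read `CAA`, linear matching with zero tolerance finds nothing, whereas circular matching finds the window that starts at position 3 and wraps around the end of the gene.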
A problem instance is specified as: `<gene> <pool> <lines to skip> <tolerance> <topology> <cores>`.
For example, `gene.txt pool.txt 4 1 1 1` is a complete input specification. It describes the problem of computing bindings between the sequence in `gene.txt` and those in `pool.txt`, in the circular topology, while allowing at most one mismatch. Furthermore, every fourth line in `pool.txt` is a valid sRNA nucleotide. The final field sets `<cores>` to 1.
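The instance format can be illustrated with a short parsing sketch. The names `Instance`, `parse_instance` and `select_pool_sequences` are hypothetical and are not part of vbind. The sketch also assumes that "every `k`-th line" means the lines at 1-based positions `k`, `2k`, `3k`, and so on; the actual offset used by the tool may differ.

```python
from typing import NamedTuple


class Instance(NamedTuple):
    """One line of an input file (hypothetical container)."""
    gene_file: str
    pool_file: str
    lines_to_skip: int
    tolerance: int
    circular: bool
    cores: int


def parse_instance(line):
    """Split a whitespace-separated instance line into typed fields."""
    gene, pool, skip, tol, topo, cores = line.split()
    # Topology 1 means circular matching, 0 means linear matching.
    return Instance(gene, pool, int(skip), int(tol), int(topo) == 1, int(cores))


def select_pool_sequences(lines, k):
    """Keep the lines at 1-based positions k, 2k, 3k, ...
    This reading of 'every k-th line' is an assumption."""
    return [ln.strip() for i, ln in enumerate(lines, start=1) if i % k == 0]
```

For the example above, `parse_instance("gene.txt pool.txt 4 1 1 1")` yields an instance with four lines to skip, a tolerance of one mismatch, circular topology and one core.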
All such instances of the matching problem can be gathered in a text file, placed in vbind/data/input.
To solve all instances of the matching problem listed in a text file `example.txt`, run the following command:
`./vbind.sh example.txt`
To solve only a particular instance, specify its line number `x`:
`./vbind.sh example.txt x`
Matching results computed in part one can be visualized using the plots in our software tool. In addition to plotting, we offer the capability of normalizing to reads per million. The post-processing steps can be executed with `./analyze.sh`.
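As a rough illustration of the reads-per-million normalization, the sketch below scales raw match counts by the total number of reads. The function name `reads_per_million` is hypothetical, and which total vbind uses as the denominator is not stated here; the sketch assumes it is the total number of reads in the pool.

```python
def reads_per_million(counts, total_reads):
    """Scale raw per-position counts to reads per million (RPM).

    `total_reads` is assumed to be the total number of reads in the pool.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    scale = 1_000_000 / total_reads
    return [c * scale for c in counts]
```

For instance, a position matched by 5 reads out of a pool of 2,000,000 reads has a normalized value of 2.5 reads per million.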
vbind is developed entirely in Python 3 and is meant to be used with the Python 3 interpreter. The following packages are required for it to function.
Package | Description |
---|---|
numpy | Linear algebra |
scipy | Linear algebra |
tqdm | Progress bar |
multiprocessing | Parallel execution |
datetime | Date and Time |
This software is licensed under the BSD 3-Clause License.
Please contact the following developers for any additional help or requests for customization.
- Pavithran Iyer: pavithran.iyer@uwaterloo.ca
- Charith Adkar: Charith.Adkar@USherbrooke.ca
- Jean-Pierre Perreault: Jean-Pierre.Perreault@USherbrooke.ca
- Teruo Sano: sano@hirosaki-u.ac.jp