Figure 3: Nanoronfler. A middle layer of soybean lecithin, a fatty material, lies between the core and the outer shell, which consists of a polymer called polyethylene glycol (PEG) that protects the particles as they travel through the bloodstream.
Figure 4 Fabrication of nanoronfler
The drug can only be released when
it detaches from the PLA polymer chain,
which occurs gradually through a reaction called
ester hydrolysis. The longer the polymer
chain, the longer this process takes; thus
the timing of the drug's release can be
changed by altering the chain length of the
polymer.
4. INJECTION OF NANORONFLER
There are two techniques to inject the
nanoronfler
In vivo
Ex vivo
In Vivo
In the in vivo technique, the
nanoronfler were injected and left to circulate
for 1 hour. Four times as many nanoronfler
were found in the injured left carotid artery as in
the uninjured right one, showing that the
nanoronfler were effective in targeting the
damaged tissue in the carotid artery. To check
that systemic delivery would be possible, the
nanoronfler were injected again,
this time via another vein. Similarly positive
findings were obtained, with twice as many
nanoronfler found in the injured carotid artery
as in the uninjured one.
Ex Vivo
The ex vivo technique used balloon-
injured aortas to test the effectiveness of the
nanoronfler. The nanoronfler were incubated
with the abdominal aortas for 5 minutes under
constant pressure and then washed out, to
ensure that any nanoronfler that had not attached
to the arterial wall did not show up
on the fluorescent imaging. Twice as many
nanoronfler attached to the
arterial wall as the scrambled-peptide and
non-targeted particles did. Similarly, when the
experiment was repeated with uninjured aortic tissue,
four times as many nanoronfler attached
to the injured tissue as to the
uninjured tissue.
5. TACTICS OF NANORONFLER
The nanoronfler latch onto injured
arterial walls because they are decorated
with peptides taken from bacteriophages,
viruses that infect bacteria. Research
found that one such peptide latches onto
the collagen that makes up the basement
membrane of arteries. This makes the
peptide useful for preferentially binding to
the basement membrane of the artery wall,
which becomes exposed whenever the artery
is injured, such as during an angioplasty,
where the inflated balloon squeezes against
the arterial wall, pulling off the top layer of cells.
The nanoburr's stickiness means
these tiny hybrid polymeric particles are
much more likely to hit the treatment
target than nanoparticles lacking the
protein hooks. In the current study, carried out
both in arterial cell cultures in a dish and in
the carotid arteries of living rats, the burred
nanoparticles were between two and four
times as likely to adhere to injured arterial
tissue as non-burred varieties.
6. WORKING OF NANORONFLER
A process that keeps the nanoronfler
circulating in the blood for a
long time is necessary for cardiovascular
disease patients, because as soon as the particles
enter the patient's body, the body's natural
defenses quickly mount attacks
against them, treating them as foreign particles.
To prevent this, the nanoronfler are
sheathed in soy lecithin, a fatty substance,
and then coated with
polyethylene glycol (PEG); because PEG is
an inert, hydrophilic substance, it is able to
evade much of the body's defenses.
7. THE REASON FOR NANOPARTICLES
AS A DRUG DELIVERY SYSTEM
1. Particle size and surface characteristics of
nanoparticles can be easily manipulated to
achieve both passive and active drug
targeting after parenteral administration.
2. They control and sustain release of the
drug during transportation and at the site
of localization, altering organ distribution of
the drug and its subsequent clearance
so as to increase drug
therapeutic efficacy and reduce side
effects.
3. Controlled release and particle degradation
characteristics can be readily modulated by
the choice of matrix constituents.
Drug loading is relatively high and
drugs can be incorporated into the system
without any chemical reaction; this is an
important factor for preserving the drug
activity.
4. Site-specific targeting can be achieved by
attaching targeting ligands to the surface of
particles or by use of magnetic guidance.
5. The system can be used for various routes
of administration including oral, nasal,
parenteral, and intra-ocular.
8. DISCUSSIONS
In the future, the hope is to use nanoronfler
alongside stents or in lieu of them to
treat damage located in areas not well suited
to stents, such as near a fork in the artery
(bifurcation lesions), diffuse lesions, larger
arteries, and also already-stented arteries
which may have more than one lesion. There
are also conditions which do not allow for
drug-eluting stent placement, such as in
patients with renal failure, diabetes and
hypertension, or patients who cannot take
the dual drug regimen of clopidogrel
(an antiplatelet agent used to inhibit blood clots
in coronary artery disease) and aspirin that is
required with this treatment.
The nanoronfler are spheres
60 nanometres across, more than 100 times
smaller than a red blood cell. At its core,
each particle carries a drug designed to combat
narrowing of blood vessels, bound to a
chain-like polymer molecule. The time it
takes for the drug to be released is controlled
by varying the length of the polymer:
the longer the chain, the longer the
duration of the release, which occurs
through a reaction called ester hydrolysis
whereby the drug becomes detached from
the polymer. Controlling drug release is
already promising to be a very important
development; as of today, drug release has
lasted a maximum of 1 day. The hope is to
be able to control it more easily, and with
the current speed of development, the wait
promises to be short.
The advantage of this is that, because the
particles can deliver drugs over a longer
period of time and can be injected
intravenously, patients would not have to
endure repeated, surgically invasive
injections directly into the area that requires
treatment. This would not only increase
patient comfort and satisfaction but also reduce
the time spent in hospital for recovery,
benefiting everyone involved.
9. FUTURE ENHANCEMENTS
Testing is now under way in rats over a
two-week period to determine the most
effective dose for treating damaged vascular
tissue. There are hopes that the particles may
also prove useful in delivering drugs to
tumours and could have broad applications
across other diseases, including cancer. This
highlights the truly promising prospects for the
nanoronfler and the changes they could
bring to medicine.
Again, this would not only directly
improve the treatment of the patient but also
indirectly benefit research and other areas
through this greater control of treatment.
10. CONCLUSION
Overall, we believe that
nanotechnology is of vital importance to
future developments in medicine, because
roughly one third of deaths every year
are due to heart disease. In this paper we have
discussed how nanotechnology is currently
used in the treatment of heart disease using
nanoronfler. This method is not only used
for treating heart disease but also for treating
cancer and inflammatory diseases. We have
also looked at ongoing developments in the
field of nanotechnology used in medicine.
Nanoparticles coated
with a thin layer of gold are currently in
development. These would allow doctors to
inject a patient with the nanoparticles;
once they have reached the target area
of the body, they can be heated
using infrared radiation, and the heat will
irreparably damage the foreign body,
thereby treating the patient.
Abstract
Multiple-path source routing protocols allow a data source
node to distribute the total traffic among available paths. In this
article, we consider the problem of jamming-aware source routing
in which the source node performs traffic allocation based on
empirical jamming statistics at individual network nodes. We
formulate this traffic allocation as a lossy network flow
optimization problem using portfolio selection theory from
financial statistics. We show that in multi-source networks,
this centralized optimization problem can be solved using a
distributed algorithm based on decomposition in network utility
maximization (NUM). We demonstrate the network's ability to
estimate the impact of jamming and incorporate these estimates
into the traffic allocation problem. Finally, we simulate the
achievable throughput using our proposed traffic allocation
method in several scenarios.
Index Terms: Jamming, multiple-path routing, portfolio selection theory, optimization, network utility maximization
I. INTRODUCTION
Jamming point-to-point transmissions in a wireless mesh
network [1] or underwater acoustic network [2] can have
debilitating effects on data transport through the network. The
effects of jamming at the physical layer resonate through
the protocol stack, providing an effective denial-of-service
(DoS) attack [3] on end-to-end data communication. The
simplest methods to defend a network against jamming attacks
comprise physical layer solutions such as spread-spectrum
or beamforming, forcing the jammers to expend a greater
resource to reach the same goal. However, recent work has
demonstrated that intelligent jammers can incorporate cross-
layer protocol information into jamming attacks, reducing
resource expenditure by several orders of magnitude by tar-
geting certain link layer and MAC implementations [4]-[6]
as well as link layer error detection and correction protocols
[7]. Hence, more sophisticated anti-jamming methods and
defensive measures must be incorporated into higher-layer protocols, for example channel surfing [8] or routing around jammed regions of the network [6].

This work was supported in part by the following grants: ARO PECASE W911NF-05-1-0491; ONR N00014-07-1-0600; ARL CTA DAAD19-01-2-0011; and ARO MURI W911NF-07-1-0287. This document was prepared through collaborative participation in the Communications and Networks Consortium sponsored by the US Army Research Laboratory under the Collaborative Technology Alliance Program, DAAD19-01-2-0011. The US Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation thereon. The views and conclusions contained in this document are those of the author and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the US Government. P. Tague, S. Nabar, J. A. Ritcey, and R. Poovendran are with the Network Security Lab (NSL), Electrical Engineering Department, University of Washington, Seattle, Washington. P. Tague is currently with Carnegie Mellon University, Silicon Valley Campus, Moffett Field, California.
The majority of anti-jamming techniques make use of
diversity. For example, anti-jamming protocols may employ
multiple frequency bands, different MAC channels, or multiple
routing paths. Such diversity techniques help to curb the effects
of the jamming attack by requiring the jammer to act on
multiple resources simultaneously. In this paper, we consider
the anti-jamming diversity based on the use of multiple routing
paths. Using multiple-path variants of source routing protocols
such as Dynamic Source Routing (DSR) [9] or Ad-Hoc On-
Demand Distance Vector (AODV) [10], for example the MP-
DSR protocol [11], each source node can request several
routing paths to the destination node for concurrent use. To
make effective use of this routing diversity, however, each
source node must be able to make an intelligent allocation
of traffic across the available paths while considering the
potential effect of jamming on the resulting data throughput.
In order to characterize the effect of jamming on throughput,
each source must collect information on the impact of the
jamming attack in various parts of the network. However,
the extent of jamming at each network node depends on a
number of unknown parameters, including the strategy used
by the individual jammers and the relative location of the
jammers with respect to each transmitter-receiver pair. Hence,
the impact of jamming is probabilistic from the perspective of
the network, and the characterization of the jamming impact
is further complicated by the fact that the jammers' strategies
may be dynamic and the jammers themselves may be mobile.
In order to capture the non-deterministic and dynamic
effects of the jamming attack, we model the packet error rate
at each network node as a random process. At a given time, the
randomness in the packet error rate is due to the uncertainty
in the jamming parameters, while the time-variability in the
packet error rate is due to the jamming dynamics and mobility.
Since the effect of jamming at each node is probabilistic,
the end-to-end throughput achieved by each source-destination
pair will also be non-deterministic and, hence, must be studied
using a stochastic framework.
Fig. 3. The estimation process for a single link: the estimate x̂_ij(t) is updated every T seconds, the estimation variance σ²_ij(t) is computed only every T_s seconds, and both values are relayed to the relevant source nodes every T_s seconds.

However, since the packet success rate over link (s, z) has historically been more steady, it may be a more reliable option. Hence, the source s can choose to fill p_3 to its capacity and partition the remaining 100 pkts/s equally over p_1 and p_2. This solution takes into account the historic variability in the packet success rates due to jamming mobility. In the following section, we build on this example, providing a set of parameters to be estimated by network nodes and methods for the sources to aggregate this information and characterize the available paths on the basis of expected throughput.

B. Estimating Local Packet Success Rates

We let x_ij(t) denote the packet success rate over link (i, j) ∈ E at time t, noting that x_ij(t) can be computed analytically as a function of the transmitted signal power of node i, the signal power of the jammers, their relative distances from node j, and the path loss behavior of the wireless medium. In reality, however, the locations of mobile jammers are often unknown, and, hence, the use of such an analytical model is not applicable. Due to the uncertainty in the jamming impact, we model the packet success rate x_ij(t) as a random process and allow the network nodes to collect empirical data in order to characterize the process. We suppose that each node j maintains an estimate x̂_ij(t) of the packet success rate x_ij(t) as well as a variance parameter σ²_ij(t) to characterize the estimate uncertainty and process variability⁴.

We propose the use of a recursive update mechanism allowing each node j to periodically update the estimate x̂_ij(t) as a function of time. As illustrated in Figure 3, we suppose that each node j updates the estimate x̂_ij(t) after each update period of T seconds and relays the estimate to each relevant source node s after each update relay period of T_s ≫ T seconds. The shorter update period of T seconds allows each node j to characterize the variation in x_ij(t) over the update relay period of T_s seconds, a key factor in σ²_ij(t).

We propose the use of the observed packet delivery ratio (PDR) to compute the estimate x̂_ij(t). While the PDR incorporates additional factors such as congestion, it has been shown by extensive experimentation [8] that such factors do not affect the PDR in a similar manner. Furthermore, we propose to average the empirical PDR values over time to smooth out the relatively short-term variations due to noise or fading. During the update period represented by the time interval [t − T, t], each node j can record the number r_ij([t − T, t]) of packets received over link (i, j) and the number v_ij([t − T, t]) ≤ r_ij([t − T, t]) of valid packets which pass an error detection check⁵. The PDR over link (i, j) for the update period [t − T, t], denoted PDR_ij([t − T, t]), is thus equal to the ratio

  PDR_ij([t − T, t]) = v_ij([t − T, t]) / r_ij([t − T, t]).   (1)

This PDR can be used to update the estimate x̂_ij(t) at the end of the update period. In order to prevent significant variation in the estimate x̂_ij(t) and to include memory of the jamming attack history, we suggest using an exponentially weighted moving average (EWMA) [16] to update the estimate x̂_ij(t) as a function of the previous estimate x̂_ij(t − T) as

  x̂_ij(t) = α·x̂_ij(t − T) + (1 − α)·PDR_ij([t − T, t]),   (2)

where α ∈ [0, 1] is a constant weight indicating the relative preference between current and historic samples.

We use a similar EWMA process to update the variance σ²_ij(t) at the end of each update relay period of T_s seconds. Since this variance is intended to capture the variation in the packet success rate over the last T_s seconds, we consider the sample variance V_ij([t − T_s, t]) of the set of packet delivery ratios computed using (1) during the interval [t − T_s, t] as

  V_ij([t − T_s, t]) = Var{ PDR_ij([t − kT − T, t − kT]) : k = 0, ..., ⌈T_s/T⌉ − 1 }.   (3)

The estimation variance σ²_ij(t) is thus defined as a function of the previous variance σ²_ij(t − T_s) as

  σ²_ij(t) = β·σ²_ij(t − T_s) + (1 − β)·V_ij([t − T_s, t]),   (4)

where β ∈ [0, 1] is a constant weight similar to α in (2).

The EWMA method is widely used in sequential estimation processes, including estimation of the round-trip time (RTT) in TCP [17]. We note that the parameters α in (2) and β in (4) allow for design of the degree of historical content included in the parameter estimate updates, and these parameters can themselves be functions α(t) and β(t) of time. For example, decreasing the parameter α allows the mean x̂_ij(t) to change more rapidly with the PDR due to jammer mobility, and decreasing the parameter β allows the variance σ²_ij(t) to give more preference to variation in the most recent update relay period over historical variations. We further note that the update period T and the update relay period T_s between subsequent updates of the parameter estimates have significant influence on the quality of the estimate. In particular, if the update relay period T_s is too large, the relayed estimates x̂_ij(t) and σ²_ij(t) will be outdated before the subsequent update at time t + T_s. Furthermore, if the update period T at each node is too large, the dynamics of the jamming attack may be averaged out over the large number of samples r_ij([t − T, t]). The update periods T and T_s must thus be short enough to capture the dynamics of the jamming attack. However, decreasing the update relay period T_s between successive updates to the source node necessarily increases the communication overhead of the network. Hence, there exists a trade-off between performance and overhead in the choice of the update period T and the update relay period T_s. We note that the design of the update relay period T_s depends on assumed path-loss and jammer mobility models; the application-specific tuning of T_s is not addressed further herein.

Using the above formulation, each time a new routing path is requested or an existing routing path is updated, the nodes along the path will include the estimates x̂_ij(t) and σ²_ij(t) as part of the reply message. In what follows, we show how the source node s uses these estimates to compute the end-to-end packet success rates over each path.

⁴ At a time instant t, the estimate x̂_ij(t) and estimation variance σ²_ij(t) define a random variable describing the current view of the packet success rate. This random variable can be appropriately modeled as a beta random variable [15], though the results of this article do not require such an assumption.

⁵ In the case of jamming attacks which prevent the receiving node j from detecting transmissions by node i, additional header information can be periodically exchanged between nodes i and j to convey the total number of transmissions, yielding the same overall effect.
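To make the estimator above concrete, here is a minimal sketch of the per-link bookkeeping a node j might perform, following (1)-(4); the class and method names, the default weights, and the use of Python's statistics module are illustrative choices for the sketch, not part of the paper.

```python
# Minimal sketch of the per-link estimator described above (eqs. (1)-(4)).
# Class and method names are illustrative, not from the paper.
from statistics import pvariance


class LinkEstimator:
    def __init__(self, alpha=0.8, beta=0.8, x_init=1.0, var_init=0.0):
        self.alpha = alpha          # EWMA weight for the mean, eq. (2)
        self.beta = beta            # EWMA weight for the variance, eq. (4)
        self.x_hat = x_init         # current estimate of the packet success rate
        self.var = var_init         # current estimation variance
        self.pdr_history = []       # PDRs observed since the last relay update

    def end_of_update_period(self, received, valid):
        """Called every T seconds with the counters r_ij and v_ij, eq. (1)."""
        pdr = valid / received if received > 0 else 0.0
        self.pdr_history.append(pdr)
        # EWMA update of the mean, eq. (2)
        self.x_hat = self.alpha * self.x_hat + (1.0 - self.alpha) * pdr

    def end_of_relay_period(self):
        """Called every T_s seconds; returns (x_hat, var) to relay to the source."""
        sample_var = pvariance(self.pdr_history) if len(self.pdr_history) > 1 else 0.0
        # EWMA update of the variance, eq. (4), using the sample variance of eq. (3)
        self.var = self.beta * self.var + (1.0 - self.beta) * sample_var
        self.pdr_history.clear()
        return self.x_hat, self.var
```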
C. Estimating End-to-End Packet Success Rates

Given the packet success rate estimates x̂_ij(t) and σ²_ij(t) for the links (i, j) in a routing path p_sℓ, the source s needs to estimate the effective end-to-end packet success rate to determine the optimal traffic allocation. Assuming the total time required to transport packets from each source s to the corresponding destination d_s is negligible compared to the update relay period T_s, we drop the time index and address the end-to-end packet success rates in terms of the estimates x̂_ij and σ²_ij. The end-to-end packet success rate y_sℓ for path p_sℓ can be expressed as the product

  y_sℓ = ∏_{(i,j)∈p_sℓ} x_ij,   (5)

which is itself a random variable⁶ due to the randomness in each x_ij. We let μ_sℓ denote the expected value of y_sℓ and ρ_sℓm denote the covariance of y_sℓ and y_sm for paths p_sℓ, p_sm ∈ P_s. Due to the computational burden associated with in-network inference of correlation between estimated random variables, we let the source node s assume the packet success rates x_ij are mutually independent, even though they are likely correlated. We maintain this independence assumption throughout this work, yielding a feasible approximation to the complex reality of correlated random variables, and the case of in-network inference of the relevant correlation is left as future work. Under this independence assumption, the mean μ_sℓ of y_sℓ given in (5) is equal to the product of estimates x̂_ij as

  μ_sℓ = ∏_{(i,j)∈p_sℓ} x̂_ij,   (6)

and the covariance ρ_sℓm = E[y_sℓ·y_sm] − E[y_sℓ]·E[y_sm] is similarly given by

  ρ_sℓm = ∏_{(i,j)∈p_sℓ∩p_sm} (x̂²_ij + σ²_ij) · ∏_{(i,j)∈p_sℓ⊕p_sm} x̂_ij − μ_sℓ·μ_sm.   (7)

In (7), ⊕ denotes the exclusive-OR set operator such that an element is in A ⊕ B if it is in either A or B but not both. The covariance formula in (7) reflects the fact that the end-to-end packet success rates y_sℓ and y_sm of paths p_sℓ and p_sm with shared links are correlated even when the rates x_ij are independent. We note that the variance σ²_sℓ of the end-to-end rate y_sℓ can be computed using (7) with ℓ = m.

Let μ_s denote the L_s × 1 vector of estimated end-to-end packet success rates μ_sℓ computed using (6), and let Σ_s denote the L_s × L_s covariance matrix with (ℓ, m) entry ρ_sℓm computed using (7). The estimate pair (μ_s, Σ_s) provides the sufficient statistical characterization of the end-to-end packet success rates for source s to allocate traffic to the paths in P_s. Furthermore, the off-diagonal elements in Σ_s denote the extent of mutual overlap between the paths in P_s.

⁶ If the x_ij are modeled as beta random variables, the product y_sℓ is well-approximated by a beta random variable [18].
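The following short sketch illustrates (6) and (7) under the stated independence assumption; the dictionary-based representation of per-link estimates and the example numbers are assumptions made for the illustration only.

```python
# Sketch of the end-to-end statistics in eqs. (6) and (7), assuming per-link
# estimates are given as {(i, j): (x_hat, variance)}. Illustrative only.
from math import prod


def path_mean(path, est):
    """Eq. (6): product of the per-link estimates along the path."""
    return prod(est[link][0] for link in path)


def path_covariance(path_a, path_b, est):
    """Eq. (7): covariance of the end-to-end success rates of two paths."""
    shared = set(path_a) & set(path_b)
    exclusive = set(path_a) ^ set(path_b)          # exclusive-OR set operator
    e_product = (prod(est[l][0] ** 2 + est[l][1] for l in shared)
                 * prod(est[l][0] for l in exclusive))
    return e_product - path_mean(path_a, est) * path_mean(path_b, est)


# Example with two paths sharing the link (0, 1); the numbers are made up.
est = {(0, 1): (0.9, 0.01), (1, 2): (0.8, 0.04), (2, 3): (0.95, 0.0),
       (1, 4): (0.7, 0.02), (4, 3): (0.85, 0.01)}
p1 = [(0, 1), (1, 2), (2, 3)]
p2 = [(0, 1), (1, 4), (4, 3)]
print(path_mean(p1, est), path_covariance(p1, p2, est))
```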
IV. OPTIMAL JAMMING-AWARE TRAFFIC ALLOCATION

In this section, we present an optimization framework for jamming-aware traffic allocation to multiple routing paths in P_s for each source node s ∈ S. We develop a set of constraints imposed on traffic allocation solutions and then formulate a utility function for optimal traffic allocation by mapping the problem to that of portfolio selection in finance. Letting φ_sℓ denote the traffic rate allocated to path p_sℓ by the source node s, the problem of interest is thus for each source s to determine the optimal L_s × 1 rate allocation vector φ_s subject to network flow capacity constraints using the available statistics μ_s and Σ_s of the end-to-end packet success rates under jamming.

A. Traffic Allocation Constraints

In order to define a set of constraints for the multiple-path traffic allocation problem, we must consider the source data rate constraints, the link capacity constraints, and the reduction of traffic flow due to jamming at intermediate nodes. The traffic rate allocation vector φ_s is trivially constrained to the non-negative orthant, i.e. φ_s ≥ 0, as traffic rates are non-negative. Assuming data generation at source s is limited to a maximum data rate R_s, the rate allocation vector is also constrained as 1^T φ_s ≤ R_s. These constraints define the convex space Φ_s of feasible allocation vectors φ_s characterizing rate allocation solutions for source s.

Due to jamming at nodes along the path, the traffic rate is potentially reduced at each receiving node as packets are lost. Hence, while the initial rate of φ_sℓ is allocated to the path p_sℓ, the residual traffic rate forwarded by node i along the path may be less than φ_sℓ. Letting p^(i)_sℓ denote the sub-path of p_sℓ from source s to the intermediate node i, the residual traffic rate forwarded by node i is given by y^(i)_sℓ·φ_sℓ, where y^(i)_sℓ is computed using (5) with p_sℓ replaced by the sub-path p^(i)_sℓ. The capacity constraint on the total traffic traversing a link (i, j) thus imposes the stochastic constraint

  ∑_{s∈S} ∑_{ℓ : (i,j)∈p_sℓ} y^(i)_sℓ·φ_sℓ ≤ c_ij   (8)

on the feasible allocation vectors φ_s. To compensate for the randomness in the capacity constraint in (8), we replace the residual packet success rate y^(i)_sℓ with a function of its expected value and variance. The mean μ^(i)_sℓ and variance (σ^(i)_sℓ)² of y^(i)_sℓ can be computed using (6) and (7), respectively, with p_sℓ replaced by the sub-path p^(i)_sℓ. We thus replace y^(i)_sℓ in (8) with the statistic μ^(i)_sℓ + δ·σ^(i)_sℓ, where δ > 0 is a constant which can be tuned based on tolerance to delay resulting from capacity violations⁷. We let W_s denote the |E| × L_s weighted link-path incidence matrix for source s with rows indexed by links (i, j) and columns indexed by paths p_sℓ. The element w((i, j), p_sℓ) in row (i, j) and column p_sℓ of W_s is thus given by

  w((i, j), p_sℓ) = min{1, μ^(i)_sℓ + δ·σ^(i)_sℓ}  if (i, j) ∈ p_sℓ,  and 0 otherwise.   (9)

Letting c denote the |E| × 1 vector of link capacities c_ij for (i, j) ∈ E, the link capacity constraint in (8) including expected packet loss due to jamming can be expressed by the vector inequality

  ∑_{s∈S} W_s·φ_s ≤ c,   (10)

which is a linear constraint in the variables φ_s. We note that this statistical constraint formulation generalizes the standard network flow capacity constraint corresponding to the case of x_ij = 1 for all (i, j) ∈ E, in which the incidence matrix W_s is deterministic and binary.

B. Optimal Traffic Allocation Using Portfolio Selection Theory

In order to determine the optimal allocation of traffic to the paths in P_s, each source s chooses a utility function U_s(φ_s) that evaluates the total data rate, or throughput, successfully delivered to the destination node d_s. In defining our utility function U_s(φ_s), we present an analogy between traffic allocation to routing paths and allocation of funds to correlated assets in finance.

In Markowitz's portfolio selection theory [12], [13], an investor is interested in allocating funds to a set of financial assets that have uncertain future performance. The expected performance of each investment at the time of the initial allocation is expressed in terms of return and risk. The return on the asset corresponds to the value of the asset and measures the growth of the investment. The risk of the asset corresponds to the variance in the value of the asset and measures the degree of variation or uncertainty in the investment's growth.

We describe the desired analogy by mapping this allocation of funds to financial assets to the allocation of traffic to routing paths. We relate the expected investment return on the financial portfolio to the estimated end-to-end success rates μ_s and the investment risk of the portfolio to the estimated success rate covariance matrix Σ_s. We note that the correlation between related assets in the financial portfolio corresponds to the correlation between non-disjoint routing paths. The analogy between financial portfolio selection and the allocation of traffic to routing paths is summarized below.

⁷ The case of δ = 0 corresponds to the average-case constraint on the feasible allocation vectors φ_s.
Portfolio Selection              Traffic Allocation
Funds to be invested             Source data rate R_s
Financial assets                 Routing paths P_s
Expected asset return            Expected packet success rate μ_sℓ
Investment portfolio             Traffic allocation φ_s
Portfolio return                 Mean throughput μ_s^T φ_s
Portfolio risk                   Estimation variance φ_s^T Σ_s φ_s

As in Markowitz's theory, we define a constant risk-aversion factor k_s > 0 for source s ∈ S to indicate the preference for source s to allocate resources to less risky paths with lower throughput variance. This risk-aversion constant weighs the trade-off between expected throughput and estimation variance. We note that each source s can choose a different risk-aversion factor, and a source may vary the risk-aversion factor k_s with time or for different types of data. For a given traffic rate allocation vector φ_s, the expected total throughput for source s is equal to the vector inner product μ_s^T φ_s. The corresponding variance in the throughput for source s due to the uncertainty in the estimate μ_s is equal to the quadratic term φ_s^T Σ_s φ_s. Based on the above analogy making use of portfolio selection theory, we define the utility function U_s(φ_s) for source s as the weighted sum

  U_s(φ_s) = μ_s^T φ_s − k_s·φ_s^T Σ_s φ_s.   (11)

Setting the risk-aversion factor k_s to zero indicates that the source s is willing to put up with any amount of uncertainty in the estimate μ_s of the end-to-end success rates to maximize the expected throughput. The role of the risk-aversion factor is thus to impose a penalty on the objective function proportional to the uncertainty in the estimation process, potentially narrowing the gap between expected throughput and achieved throughput. The cases of k_s = 0 and k_s > 0 are compared in detail in Section V.

Combining the utility function in (11) with the set of constraints defined in Section IV-A yields the following jamming-aware traffic allocation optimization problem, which aims to find the globally optimal traffic allocation over the set S of sources.

Optimal Jamming-Aware Traffic Allocation:

  {φ_s} = arg max  ∑_{s∈S} ( μ_s^T φ_s − k_s·φ_s^T Σ_s φ_s )
  s.t.  ∑_{s∈S} W_s·φ_s ≤ c,
        1^T φ_s ≤ R_s  for all s ∈ S,
        φ_s ≥ 0  for all s ∈ S.   (12)

Since the use of centralized protocols for source routing may be undesirable due to excessive communication overhead in large-scale wireless networks, we seek a distributed formulation for the optimal traffic allocation problem in (12).

C. Optimal Distributed Traffic Allocation Using NUM

In the distributed formulation of the allocation algorithm, each source s determines its own traffic allocation φ_s, ideally with minimal message passing between sources. By inspection, we see that the optimal jamming-aware flow allocation problem in (12) is similar to the network utility maximization (NUM) formulation of the basic maximum network flow problem [14]. We thus develop a distributed traffic allocation algorithm using Lagrangian dual decomposition techniques [14] for NUM.

The dual decomposition technique is derived by decoupling the capacity constraint in (10) and introducing the link prices λ_ij corresponding to each link (i, j). Letting λ denote the |E| × 1 vector of link prices λ_ij, the Lagrangian L(φ, λ) of the optimization problem in (12) is given by

  L(φ, λ) = ∑_{s∈S} ( μ_s^T φ_s − k_s·φ_s^T Σ_s φ_s ) + λ^T ( c − ∑_{s∈S} W_s·φ_s ).   (13)

The distributed optimization problem is solved iteratively using the Lagrangian dual method as follows. For a given set of link prices λ^n at iteration n, each source s solves the local optimization problem

  φ_{s,n} = arg max_{φ_s ∈ Φ_s}  μ_s^T φ_s − k_s·φ_s^T Σ_s φ_s − (λ^n)^T W_s·φ_s.   (14)

The link prices λ^(n+1) are then updated using a gradient descent iteration as

  λ^(n+1) = [ λ^n − a·( c − ∑_{s∈S} W_s·φ_{s,n} ) ]^+,   (15)

where a > 0 is a constant step size and (v)^+ = max(0, v) is the element-wise projection into the non-negative orthant. In order to perform the local update in (15), sources must exchange information about the result of the local optimization step. Since updating the link prices depends only on the expected link usage, sources must only exchange the |E| × 1 link usage vectors u_{s,n} = W_s·φ_{s,n} to ensure that the link prices are consistently updated across all sources. The iterative optimization step can be repeated until the allocation vectors φ_s converge for all sources s ∈ S, i.e., when ‖φ_{s,n} − φ_{s,n−1}‖ ≤ ε for all s ∈ S with a given ε > 0. The above approach yields the following distributed algorithm for optimal jamming-aware flow allocation.

Distributed Jamming-Aware Traffic Allocation
Initialize n = 1 with initial link prices λ^1.
1. Each source s independently computes
     φ_{s,n} = arg max_{φ_s ∈ Φ_s} ( μ_s − W_s^T λ^n )^T φ_s − k_s·φ_s^T Σ_s φ_s.
2. Sources exchange the link usage vectors u_{s,n} = W_s·φ_{s,n}.
3. Each source locally updates the link prices as
     λ^(n+1) = [ λ^n − a·( c − ∑_{s∈S} u_{s,n} ) ]^+.
4. If ‖φ_{s,n} − φ_{s,n−1}‖ > ε for any s ∈ S, increment n and go to step 1.

Given the centralized optimization problem in (12) and the above distributed formulation for jamming-aware traffic allocation, a set of sources with estimated parameters μ_s and Σ_s can proactively compensate for the presence of jamming on network traffic flow.
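As an illustration of the iteration in (14)-(15), the sketch below runs the price update and the per-source local maximization on a small made-up two-source network; the projected-gradient inner solver, the network data, and all parameter values are assumptions for the sketch, not the authors' implementation.

```python
# Sketch of the distributed jamming-aware allocation in (14)-(15).
# The two-source example network, the parameter values, and the simple
# projected-gradient inner solver are illustrative assumptions only.
import numpy as np

W = {                      # weighted link-path incidence matrices W_s (|E| x L_s)
    "s1": np.array([[0.90, 0.00], [0.80, 0.00], [0.00, 0.85],
                    [0.00, 0.70], [0.00, 0.00], [0.00, 0.00]]),
    "s2": np.array([[0.00, 0.00], [0.75, 0.00], [0.00, 0.00],
                    [0.00, 0.80], [0.90, 0.00], [0.00, 0.85]]),
}
mu = {"s1": np.array([0.72, 0.60]), "s2": np.array([0.68, 0.55])}   # means, eq. (6)
Sigma = {"s1": np.diag([0.02, 0.03]), "s2": np.diag([0.025, 0.02])} # covariances, eq. (7)
R = {"s1": 100.0, "s2": 80.0}       # maximum source data rates R_s
k = {"s1": 0.5, "s2": 0.5}          # risk-aversion factors k_s
c = np.full(6, 25.0)                # link capacities
a = 0.01                            # price step size in eq. (15)


def local_allocation(s, prices, iters=200, step=5.0):
    """Approximately solve the local problem (14) by projected gradient ascent."""
    phi = np.zeros_like(mu[s])
    lin = mu[s] - W[s].T @ prices                   # linear part of the objective
    for _ in range(iters):
        grad = lin - 2.0 * k[s] * (Sigma[s] @ phi)
        phi = np.maximum(phi + step * grad, 0.0)    # clip to phi >= 0
        if phi.sum() > R[s]:                        # crude handling of 1^T phi <= R_s
            phi *= R[s] / phi.sum()
    return phi


prices = np.zeros(6)                                # initial link prices
for n in range(300):
    phi = {s: local_allocation(s, prices) for s in mu}
    usage = sum(W[s] @ phi[s] for s in mu)          # exchanged link usage vectors
    prices = np.maximum(prices - a * (c - usage), 0.0)   # price update, eq. (15)

print({s: np.round(phi[s], 2) for s in mu}, np.round(prices, 3))
```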
D. Computational Complexity
We note that both the centralized optimization problem in
(12) and the local optimization step in the distributed algo-
rithm are quadratic programming optimization problems with
linear constraints [13]. The computational time required for
solving these problems using numerical methods for quadratic
programming is a polynomial function of the number of
optimization variables and the number of constraints.
In the centralized problem, there are ∑_{s∈S} |P_s| optimization variables, corresponding to the number of paths available to each of the sources. The number of constraints in the centralized problem is equal to the total number of links |∪_{s∈S} E_s|, corresponding to the number of link capacity constraints. In the distributed algorithm, each source iteratively solves a local optimization problem, leading to |S| decoupled optimization problems. Each of these problems has |P_s| optimization variables and |E_s| constraints. Hence, as the number
| constraints. Hence, as the number
of sources in the network increases, the distributed algorithm
may be advantageous in terms of total computation time. In
what follows, we provide a detailed performance evaluation
of the methods proposed in this article.
VI. CONCLUSION
In this article, we studied the problem of traffic allocation in
multiple-path routing algorithms in the presence of jammers
whose effect can only be characterized statistically. We have
presented methods for each network node to probabilistically
characterize the local impact of a dynamic jamming attack
and for data sources to incorporate this information into
the routing algorithm. We formulated multiple-path traffic
allocation in multi-source networks as a lossy network flow
optimization problem using an objective function based on
portfolio selection theory from finance. We showed that this
centralized optimization problem can be solved using a dis-
tributed algorithm based on decomposition in network utility
maximization (NUM). We presented simulation results to
illustrate the impact of jamming dynamics and mobility on
network throughput and to demonstrate the efficacy of our
traffic allocation algorithm. We have thus shown that multiple-path source routing, when combined with jamming-aware allocation of traffic over the available paths, can effectively compensate for the effects of jamming on network throughput.
REFERENCES
[1] I. F. Akyildiz, X. Wang, and W. Wang, "Wireless mesh networks: A survey," Computer Networks, vol. 47, no. 4, pp. 445-487, Mar. 2005.
[2] E. M. Sozer, M. Stojanovic, and J. G. Proakis, "Underwater acoustic networks," IEEE Journal of Oceanic Engineering, vol. 25, no. 1, pp. 72-83, Jan. 2000.
[3] R. Anderson, Security Engineering: A Guide to Building Dependable Distributed Systems. John Wiley & Sons, Inc., 2001.
[4] J. Bellardo and S. Savage, "802.11 denial-of-service attacks: Real vulnerabilities and practical solutions," in Proc. USENIX Security Symposium, Washington, DC, Aug. 2003, pp. 15-28.
[5] D. J. Thuente and M. Acharya, "Intelligent jamming in wireless networks with applications to 802.11b and other networks," in Proc. 25th IEEE Communications Society Military Communications Conference (MILCOM'06), Washington, DC, Oct. 2006, pp. 1-7.
[6] A. D. Wood and J. A. Stankovic, "Denial of service in sensor networks," IEEE Computer, vol. 35, no. 10, pp. 54-62, Oct. 2002.
[7] G. Lin and G. Noubir, "On link layer denial of service in data wireless LANs," Wireless Communications and Mobile Computing, vol. 5, no. 3, pp. 273-284, May 2005.
[8] W. Xu, K. Ma, W. Trappe, and Y. Zhang, "Jamming sensor networks: Attack and defense strategies," IEEE Network, vol. 20, no. 3, pp. 41-47, May/Jun. 2006.
[9] D. B. Johnson, D. A. Maltz, and J. Broch, DSR: The Dynamic Source Routing Protocol for Multihop Wireless Ad Hoc Networks. Addison-Wesley, 2001, ch. 5, pp. 139-172.
[10] E. M. Royer and C. E. Perkins, "Ad hoc on-demand distance vector routing," in Proc. 2nd IEEE Workshop on Mobile Computing Systems and Applications (WMCSA'99), New Orleans, LA, USA, Feb. 1999, pp. 90-100.
[11] R. Leung, J. Liu, E. Poon, A.-L. C. Chan, and B. Li, "MP-DSR: A QoS-aware multi-path dynamic source routing protocol for wireless ad-hoc networks," in Proc. 26th Annual IEEE Conference on Local Computer Networks (LCN'01), Tampa, FL, USA, Nov. 2001, pp. 132-141.
[12] H. Markowitz, "Portfolio selection," The Journal of Finance, vol. 7, no. 1, pp. 77-92, Mar. 1952.
[13] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, 2004.
PATH FINDING MOBILE ROBOT INCORPORATING TWO WHEEL
DIFFERENTIAL DRIVE AND BOOLEAN LOGIC FOR
OBSTACLE AVOIDANCE
S.Venkatesan,
AP/CSE
Gojan School of Business &
Technology.
selvamvenkatesan@gmail.com
A.C.Arun
IV CSE
Gojan School of Business &
Technology.
acarun90@gmail.com
R.Rajesh Manikandan
IV CSE
Gojan School of Business &
Technology.
rajesh.rajesh269@gmail.com
Abstract
Mobile robots have the capability to move
around in their environment and are not
fixed to one physical location. The robot
senses its surroundings with the aid of
various electronic sensors, while mechanical
actuators move it around. In
this research work we focus on
building an obstacle-avoiding robot using
Boolean logic by deducing a truth table.
The design consists of two main sections:
electronic analysis of the various robot
sensors, and the Boolean logic used to interface
the sensors with the robot's actuators. The
prototype is built using infrared sensors with
a comparator circuit and DC motors. In this
paper it is shown that obstacles are detected
using IR-phototransistor sensors, while the motors
act as actuators in turning the robot to its next
position. This system can be further
position. This system can be further
enhanced by providing an external
monitoring control to the robot.
Keywords: Robot, Obstacle Avoidance,
Boolean logic.
Introduction
Obstacle avoidance is one of the most
critical factors in the design of autonomous
vehicles such as mobile robots. One of the
major challenges in designing intelligent
vehicles capable of autonomous travel on
highways is reliable obstacle avoidance
system. Obstacle avoidance system may be
divided into two parts, obstacle detection
(mechanism, hardware, sensors) and
avoidance control.
The traditional artificial intelligence
approach to building a control system for an
autonomous robot is to break the task into a
number of subsystems. These subsystems
typically include perception, world
modeling, planning, task execution and
motor control. The subsystems can be
thought of as a series of vertical slices with
sensor inputs on the left and actuator outputs
on the right. The disadvantage of this
approach, however, is that all of these
subsystems must work correctly for the
robot to function at all. To overcome this we
provide an external monitoring control.
Differential Drive
Differential drive is a method of controlling
a robot with only two motorized wheels.
What makes this algorithm important for a
robot builder is that it is also the simplest
control method for a robot. The term
'differential' means that the robot's turning speed
is determined by the speed difference
between the two wheels, one on either side of
the robot. For example, keep the left wheel
still, and rotate the right wheel forward, and
the robot will turn left. If you are clever with
it, or use PID control, you can get interesting
curved paths just by varying the speeds of
both wheels over time. Don't want to turn?
As long as both wheels go at the same
speed, the robot does not turn; it only moves
forward or in reverse.
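To make the relation between wheel speeds and turning concrete, the short sketch below uses the standard two-wheel (unicycle) kinematic model; the wheel base and speeds are assumed values, and the model is a generic textbook approximation rather than something specified in this paper.

```python
# Standard differential-drive (unicycle) kinematics, for illustration only.
# v_l, v_r: left/right wheel speeds (m/s); L: distance between the wheels (m).
import math


def body_velocities(v_l, v_r, L=0.15):
    v = (v_l + v_r) / 2.0          # forward speed of the robot centre
    omega = (v_r - v_l) / L        # turning rate; zero when both wheels match
    return v, omega


def step_pose(x, y, theta, v_l, v_r, dt=0.1, L=0.15):
    """Advance the robot pose by one time step."""
    v, omega = body_velocities(v_l, v_r, L)
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + omega * dt)


# Equal speeds -> straight line; right wheel faster -> the robot turns left.
print(body_velocities(0.2, 0.2))   # (0.2, 0.0)
print(body_velocities(0.0, 0.2))   # (0.1, ~1.33 rad/s, i.e. a left turn)
```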
The differential drive algorithm is useful for
light chasing robots. This locomotion is the
most basic of all types, and is highly
recommended for beginners. Mechanical
construction, as well as the control
algorithm, cannot get any simpler than this.
Operating Logic
Pseudocode:
1. Read the sensor inputs.
2. Make a decision based on the sensor reading.
3. Perform one of the following actions:
   - To drive straight, both wheels move forward at the same speed.
   - To drive in reverse, both wheels move backward at the same speed.
   - To turn left, the left wheel moves in reverse and the right wheel moves forward.
   - To turn right, the right wheel moves in reverse and the left wheel moves forward.
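A minimal sketch of this decision step is shown below, assuming two digital obstacle flags (left and right) derived from the comparator outputs; the particular mapping from sensor states to actions is an illustrative choice, not necessarily the prototype's exact truth table.

```python
# Sketch of the obstacle-avoidance decision logic, assuming two digital
# sensor flags from the comparator circuit (True = obstacle detected).
# The chosen mapping is illustrative, not the prototype's exact truth table.

def decide(left_obstacle: bool, right_obstacle: bool) -> str:
    if not left_obstacle and not right_obstacle:
        return "forward"        # both wheels forward at the same speed
    if left_obstacle and not right_obstacle:
        return "turn_right"     # right wheel reverses, left wheel forward
    if right_obstacle and not left_obstacle:
        return "turn_left"      # left wheel reverses, right wheel forward
    return "reverse"            # both wheels move backward

WHEEL_COMMANDS = {              # (left wheel, right wheel) directions
    "forward":    (+1, +1),
    "reverse":    (-1, -1),
    "turn_left":  (-1, +1),
    "turn_right": (+1, -1),
}

for sensors in [(False, False), (True, False), (False, True), (True, True)]:
    action = decide(*sensors)
    print(sensors, "->", action, WHEEL_COMMANDS[action])
```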
Circuit Design
Sensor circuit
The Infrared emitter detector circuit is used
for a robot with basic object or obstacle
detection. Infrared emitter detector pair
sensors are fairly easy to implement,
although they involve some level
of testing and calibration to get right. They
can be used for obstacle detection, motion
detection, transmitters, encoders, and color
detection (such as for line following).
Infrared Emitter Detector Basic Circuit
R1 is to prevent the emitter
(clear) LED from melting itself. Look at the
emitter spec sheet to find maximum power.
Make sure you choose an R1 value so
that Vcc^2/R1 < Power_spec. Or just use R1
= 120 ohms if you are lazy and trust me.
R2 should be larger than the maximum
resistance of the detector. Measure the
resistance of the detector (black) when it is
pointing into a dark area and choose an R2 value larger than this measured resistance.
Control Logic Design
The outputs of the sensor circuit are taken and
a truth table is built, where M+ and M- are the
terminals of the motor. Based on this truth table,
the logic expressions for M+ and M- are deduced
using any truth table minimization technique.
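For illustration only, the sketch below assumes a hypothetical two-sensor truth table (consistent with the action mapping sketched earlier) and shows how the columns for one motor's terminals collapse to simple Boolean expressions; the actual table depends on the sensor and driver wiring used in the prototype.

```python
# Hypothetical truth table for the right motor's terminals M+ and M-,
# derived from the assumed action mapping (L, R = left/right obstacle flags).
# The wiring and encoding are assumed for illustration only.
from itertools import product

def right_motor(L, R):
    # forward (M+ = 1) when driving straight or turning left,
    # reverse (M- = 1) when turning right or backing up
    m_plus = (not L and not R) or (R and not L)
    m_minus = (L and not R) or (L and R)
    return int(m_plus), int(m_minus)

for L, R in product([0, 1], repeat=2):
    print(L, R, right_motor(bool(L), bool(R)))

# Minimizing the two columns (e.g., with a Karnaugh map) collapses them to
#   M+ = NOT L      and      M- = L
# i.e., the right motor reverses exactly when the left sensor sees an obstacle.
```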
Conclusion
We have completed and tested the final
circuit and analyzed the results. The robot
performs well as long as it is not operated in
direct sunlight, which interferes with the
infrared sensors. This can be enhanced by using
ultrasonic sensors. In addition, external
monitoring assistance can be provided to avoid
deadlock in unfriendly environments.
File sharing in Unstructured Peer-to-Peer
Network Using Sampling Technique
Ms. P. Preethi Rebecca, Asst.Professor / CSE , St. Peters University, Chennai.
M.ARUNA M.E (CSE) , St. Peters University, Chennai
AbstractThis paper presents a detailed examination of how
the dynamic and heterogeneous nature of real-world peer-to-peer
systems can introduce bias into the selection of representative
samples of peer properties (e.g., degree, link bandwidth, number
of files shared). We propose the Metropolized Random Walk with
Backtracking (MRWB) as a viable and promising technique for
collecting nearly unbiased samples and conduct an extensive
simulation study to demonstrate that our technique works well
for a wide variety of commonly-encountered peer-to-peer net-
work conditions. We have implemented the MRWB algorithm
for selecting peer addresses uniformly at random into a tool
called ion-sampler. Using the Gnutella network, we empirically
show that ion-sampler yields more accurate samples than tools
that rely on commonly-used sampling techniques and results in
dramatic improvements in efficiency and scalability compared to
performing a full crawl.
Index TermsPeer-to-peer, sampling.
I. INTRODUCTION
The popularity and wide-spread use of peer-to-peer sys-
tems has motivated numerous empirical studies aimed at
providing a better understanding of the properties of deployed
peer-to-peer systems. However, due to the large scale and highly
dynamic nature of many of these systems, directly measuring
the quantities of interest on every peer is prohibitively expen-
sive. Sampling is a natural approach for learning about these
systems using light-weight data collection, but commonly-used
sampling techniques for measuring peer-to-peer systems tend
to introduce considerable bias for two reasons. First, the dy-
namic nature of peers can bias results towards short-lived peers,
much as naively sampling flows in a router can lead to bias to-
wards short-lived flows. Second, the heterogeneous nature of the
overlay topology can lead to bias towards high-degree peers.
In this paper, we are concerned with the basic objective of
devising an unbiased sampling method, i.e., one which selects
any of the present peers with equal probability. The addresses of the resulting peers may then be used as input to another measurement tool to collect data on particular peer properties (e.g., degree, link bandwidth, number of files shared). The focus of our work is on unstructured P2P systems, where peers select neighbors through a predominantly random process. Most popular P2P systems in use today belong to this unstructured category. For structured P2P systems such as Chord [1] and CAN [2], knowledge of the structure significantly facilitates unbiased sampling as we discuss in Section VII.

Manuscript received March 25, 2007; revised January 23, 2008; approved by IEEE/ACM TRANSACTIONS ON NETWORKING Editor L. Massoulie. First published October 03, 2008; current version published April 15, 2009. This material is based upon work supported in part by the National Science Foundation (NSF) under Grant Nets-NBD-0627202 and an unrestricted gift from Cisco Systems. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF or Cisco. An earlier version of this paper appeared in the Proceedings of the ACM SIGCOMM Internet Measurement Conference 2006. D. Stutzbach is with Stutzbach Enterprises, LLC, Dallas, TX 75206 USA (e-mail: daniel@stutzbachenterprises.com; http://stutzbachenterprises.com). R. Rejaie is with the Department of Computer Science, University of Oregon, Eugene, OR 97403-1202 USA (e-mail: reza@cs.uoregon.edu). N. Duffield, S. Sen, and W. Willinger are with AT&T Labs-Research, Florham Park, NJ 07932 USA (e-mail: duffield@research.att.com; sen@research.att.com; walter@research.att.com). Digital Object Identifier 10.1109/TNET.2008.2001730
Achieving the basic objective of selecting any of the peers
present with equal probability is non-trivial when the structure
of the peer-to-peer system changes during the measurements.
First-generation measurement studies of P2P systems typically
relied on ad-hoc sampling techniques (e.g., [3], [4]) and pro-
vided valuable information concerning basic system behavior.
However, lacking any critical assessment of the quality of these
sampling techniques, the measurements resulting from these
studies may be biased and consequently our understanding of
P2P systems may be incorrect or misleading. The main contri-
butions of this paper are (i) a detailed examination of the ways
that the topological and temporal qualities of peer-to-peer sys-
tems (e.g., Churn [5]) can introduce bias, (ii) an in-depth ex-
ploration of the applicability of a sampling technique called the
Metropolized Random Walk with Backtracking (MRWB), repre-
senting a variation of the Metropolis-Hastings method [6]-[8],
and (iii) an implementation of the MRWB algorithm into a tool
called ion-sampler. While sampling techniques based on the
original Metropolis-Hastings method have been considered ear-
lier (e.g., see Awan et al. [9] and Bar-Yossef and Gurevich [10]),
we show that in the context of unstructured P2P systems, our
modification of the basic Metropolis-Hastings method results
in nearly unbiased samples under a wide variety of commonly
encountered peer-to-peer network conditions.
The proposed MRWB algorithm assumes that the P2P
system provides some mechanism to query a peer for a list of
its neighbors, a capability provided by most widely deployed
P2P systems. Our evaluation of the ion-sampler tool shows
that the MRWB algorithm yields more accurate samples than
previously considered sampling techniques. We quantify the
observed differences, explore underlying causes, address the
tools efficiency and scalability, and discuss the implications on
accurate inference of P2P properties and high-fidelity modeling
of P2P systems. While our focus is on P2P networks, many of
our results apply to any large, dynamic, undirected graph where
nodes may be queried for a list of their neighbors.
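To make the sampling primitive concrete, below is a minimal sketch of a Metropolis-Hastings random walk with a simple backtracking rule for departed peers, in the spirit of MRWB; the get_neighbors() function stands in for the neighbor-query mechanism assumed above, and the walk length and retry policy are illustrative choices rather than the authors' exact algorithm.

```python
# Minimal sketch of a Metropolis-Hastings random walk with backtracking.
# get_neighbors(peer) stands in for the neighbor-query primitive; it should
# return the peer's current neighbor list, or raise if the peer has departed.
import random


def mh_sample(start, get_neighbors, walk_length=1000):
    """Return one (approximately uniformly sampled) peer after a fixed-length walk."""
    history = [start]                     # visited peers, kept for backtracking
    current = start
    for _ in range(walk_length):
        try:
            neighbors = get_neighbors(current)
        except Exception:
            # the current peer departed: backtrack to the previous peer
            if len(history) > 1:
                history.pop()
            current = history[-1]
            continue
        if not neighbors:
            continue
        candidate = random.choice(neighbors)
        try:
            cand_degree = len(get_neighbors(candidate))
        except Exception:
            continue                      # candidate departed: treat as rejected
        # Metropolis-Hastings acceptance with a uniform target distribution:
        # accept with probability min(1, deg(current) / deg(candidate)),
        # which prevents over-representation of high-degree peers.
        if cand_degree > 0 and random.random() < min(1.0, len(neighbors) / cand_degree):
            current = candidate
            history.append(current)
    return current
```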
After discussing related work and alternative sampling tech-
niques in Section II, we build on our earlier formulation in [11]
and focus on sampling techniques that select a set of peers uni-
formly at random from all the peers present in the overlay and
then gather data about the desired properties from those peers.
While it is relatively straightforward to choose peers uniformly
at random in a static and known environment, it poses consid-
erable problems in a highly dynamic setting like P2P systems,
which can easily lead to significant measurement bias for two
reasons.
The first cause of sampling bias derives from the temporal
dynamics of these systems, whereby new peers can arrive and
existing peers can depart at any time. Locating a set of peers and
measuring their properties takes time, and during that time the
peer constituency is likely to change. In Section III, we show
how this often leads to bias towards short-lived peers and ex-
plain how to overcome this difficulty.
The second significant cause of bias relates to the connectivity
structure of P2P systems. As a sampling program explores a
given topological structure, each traversed link is more likely to
lead to a high-degree peer than a low-degree peer, significantly
biasing peer selection. We describe and evaluate different tech-
niques for traversing static overlays to select peers in Section IV
and find that the Metropolized Random Walk (MRW) collects
unbiased samples.
In Section V, we adapt MRW for dynamic overlays by adding
backtracking and demonstrate its viability and effectiveness
when the causes for both temporal and topological bias are
present. We show via simulations that the MRWB technique
works well and produces nearly unbiased samples under a
variety of circumstances commonly encountered in actual P2P
systems.
Finally, in Section VI we describe the implementation of the
ion-sampler tool based on the proposed MRWB algorithm and
empirically evaluate its accuracy and efficiency through com-
parison with complete snapshots of Gnutella taken with Cruiser
[12], as well as with results obtained from previously used,
more ad-hoc, sampling techniques. Section VII discusses some
important questions such as how many samples to collect and
outlines a practical solution to obtaining unbiased samples for
structured P2P systems. Section VIII concludes the paper by
summarizing our findings and plans for future work.
II. RELATED WORK
A. Graph Sampling
The phrase graph sampling means different things in dif-
ferent contexts. For example, sampling from a class of graphs
has been well studied in the graph theory literature [13], [14],
where the main objective is to prove that for a class of graphs
sharing some property (e.g., same node degree distribution), a
given random algorithm is capable of generating all graphs in
the class. Cooper et al. [15] used this approach to show that
their algorithm for overlay construction generates graphs with
good properties. Our objective is quite different; instead of sam-
pling a graph from a class of graphs our concern is sampling
peers (i.e., vertices) from a largely unknown and dynamically
changing graph. Others have used sampling to extract informa-
tion about graphs (e.g., selecting representative subgraphs from
a large, intractable graph) while maintaining properties of the
original structure [16]-[18]. Sampling is also frequently used
as a component of efficient, randomized algorithms [19]. How-
ever, these studies assume complete knowledge of the graphs in
question. Our problem is quite different in that we do not know
the graphs in advance.
A closely related problem to ours is sampling Internet routers
by running traceroute from a few hosts to many destinations for
the purpose of discovering the Internet's router-level topology.
Using simulation [20] and analysis [21], research has shown
that traceroute measurements can result in measurement bias
in the sense that the obtained samples support the inference of
power law-type degree distributions irrespective of the true na-
ture of the underlying degree distribution. A common feature
of our work and the study of the traceroute technique [20], [21]
is that both efforts require an evaluation of sampling techniques
without complete knowledge of the true nature of the underlying
connectivity structure. However, exploring the router topology
and P2P topologies differ in their basic operations for graph-ex-
ploration. In the case of traceroute, the basic operation is "What
is the path to this destination?" In P2P networks, the basic oper-
ation is "What are the neighbors of this peer?" In addition, the
Internet's router-level topology changes at a much slower rate
than the overlay topology of P2P networks.
Another closely related problem is selecting Web pages uni-
formly at random from the set of all Web pages [22], [23]. Web
pages naturally form a graph, with hyper-links forming edges
between pages. Unlike unstructured peer-to-peer networks, the
Web graph is directed and only outgoing links are easily dis-
covered. Much of the work on sampling Web pages therefore
focuses on estimating the number of incoming links, to facili-
tate degree correction. Unlike peers in peer-to-peer systems, not
much is known about the temporal stability of Web pages, and
temporal causes of sampling bias have received little attention
in past measurement studies of the Web.
B. Random Walk-Based Sampling of Graphs
A popular technique for exploring connectivity structures
consists of performing random walks on graphs. Several
properties of random walks on graphs have been extensively
studied analytically [24], such as the access time, cover time,
and mixing time. While these properties have many useful
applications, they are, in general, only well-defined for static
graphs. To our knowledge the application of random walks
as a method of selecting nodes uniformly at random from a
dynamically changing graph has not been studied. A number
of papers [25]-[28] have made use of random walks as a basis
for searching unstructured P2P networks. However, searching
simply requires locating a certain piece of data anywhere along
the walk, and is not particularly concerned if some nodes are
preferred over others. Some studies [27], [28] additionally use
random walks as a component of their overlay-construction
algorithm.
Two papers that are closely related to our random walk-based
sampling approach are by Awan et al. [9] and Bar-Yossef and
Gurevich [10]. While the former also address the problem of
gathering uniform samples from peer-to-peer networks, the
latter are concerned with uniform sampling from a search
engine's index. Both works examine several random walk tech-
niques, including the Metropolis-Hastings method, but assume
an underlying graph structure that is not dynamically changing.
In addition to evaluating their techniques empirically for static
power-law graphs, the approach proposed by Awan et al. [9]
also requires special underlying support from the peer-to-peer
application. In contrast, we implement the Metropolis-Hast-
ings method in such a way that it relies only on the ability
to discover a peer's neighbors, a simple primitive operation
commonly found in existing peer-to-peer networks. Moreover,
we introduce backtracking to cope with departed peers and con-
duct a much more extensive evaluation of the proposed MRWB
method. Specifically, we generalize our formulation reported in
[11] by evaluating MRWB over dynamically changing graphs
with a variety of topological properties. We also perform
empirical validations over an actual P2P network.
C. Sampling in Hidden Populations
The problem of obtaining accurate estimates of the number of
peers in an unstructured P2P network that have a certain prop-
erty can also be viewed as a problem in studying the sizes of
hidden populations. Following Salganik [29], a population is
called hidden if there is no central directory of all population
members, such that samples may only be gathered through re-
ferrals from existing samples. This situation often arises when
public acknowledgment of membership has repercussions (e.g.,
injection drug users [30]), but also arises if the target population
is difficult to distinguish from the population as a whole (e.g.,
jazz musicians [29]). Peers in P2P networks are hidden because
there is no central repository we can query for a list of all peers.
Peers must be discovered by querying other peers for a list of
neighbors.
Proposed methods in the social and statistical sciences for
studying hidden populations include snowball sampling [31],
key informant sampling [32], and targeted sampling [33]. While
these methods gather an adequate number of samples, they are
notoriously biased. More recently, Heckathorn [30] (see also
[29], [34]) proposed respondent-driven sampling, a snowball-
type method for sampling and estimation in hidden populations.
Respondent-driven sampling first uses the sample to make infer-
ences about the underlying network structure. In a second step,
these network-related estimates are used to derive the propor-
tions of the various subpopulations of interest. Salganik et al.
[29], [34] show that under quite general assumptions, respon-
dent-driven sampling yields estimates for the sizes of subpop-
ulations that are asymptotically unbiased, no matter how the
seeds were chosen.
Unfortunately, respondent-driven sampling has only been
studied in the context where the social network is static and
does not change with time. To the best of our knowledge, the
accuracy of respondent-driven sampling in situations where the
underlying network structure is changing dynamically (e.g.,
unstructured P2P systems) has not been considered in the
existing sampling literature.
D. Dynamic Graphs
While graph theory has been largely concerned with studying
and discovering properties of static connectivity structures,
many real-world networks evolve over time, for example via
node/edge addition and/or deletion. In fact, many large-scale
networks that arise in the context of the Internet (e.g., WWW,
P2P systems) are extremely dynamic and create havoc for
graph algorithms that have been designed with static or only
very slowly changing network structures in mind. Furthermore,
the development of mathematical models for evolving graphs
is still at an early stage and is largely concerned with genera-
tive models that are capable of reproducing certain observed
properties of evolving graphs. For example, recent work by
Leskovec et al. [35] focuses on empirically observed properties
such as densification (i.e., networks become denser over time)
and shrinking diameter (i.e., as networks grow, their diameter
decreases) and on new graph generators that account for these
properties. However, the graphs they examine are not P2P
networks and their properties are by and large inconsistent with
the design and usage of measured P2P networks (e.g., see [5]).
Hence, the dynamic graph models proposed in [35] are not
appropriate for our purpose, and neither are the evolving graph
models specifically designed to describe the Web graph (e.g.,
see [36] and references therein).
III. SAMPLING WITH DYNAMICS
We develop a formal and general model of a P2P system as
follows. If we take an instantaneous snapshot of the system at
time $t$, we can view the overlay as a graph $G_t = (V_t, E_t)$ with the
peers as vertices and connections between the peers as edges.
Extending this notion, we incorporate the dynamic aspect by
viewing the system as an infinite set of time-indexed graphs,
$\{G_t : t \geq 0\}$. The most common approach for sampling
from this set of graphs is to define a measurement window,
$[t_0, t_0 + \Delta]$, and select peers uniformly at random from the
set of peers who are present at any time during the window:
$V_{t_0,\Delta} = \bigcup_{t=t_0}^{t_0+\Delta} V_t$. Thus, it does not distinguish between
occurrences of the same peer at different times.
This approach is appropriate if peer session lengths are expo-
nentially distributed (i.e., memoryless). However, existing mea-
surement studies [3], [5], [37], [38] show session lengths are
heavily skewed, with many peers being present for just a short
time (a few minutes) while other peers remain in the system for
a very long time (i.e., longer than $\Delta$). As a consequence, as $\Delta$
increases, the set $V_{t_0,\Delta}$ includes an increasingly large frac-
tion of short-lived peers.
A simple example may be illustrative. Suppose we wish to
observe the number of files shared by peers. In this example
system, half the peers are up all the time and have many files,
while the other peers remain for around 1 minute and are imme-
diately replaced by new short-lived peers who have few files.
The technique used by most studies would observe the system
for a long time and incorrectly conclude that most of the
peers in the system have very few files. Moreover, their results
will depend on how long they observe the system. The longer the
measurement window, the larger the fraction of observed peers
with few files.
One fundamental problem of this approach is that it focuses
on sampling peers instead of peer properties. It selects each
sampled vertex at most once. However, the property at the vertex
may change with time. Our goal should not be to select a vertex
$v \in V_{t_0,\Delta}$, but rather to sample the property at $v \in V_t$ at a par-
ticular instant $t$. Thus, we distinguish between occurrences of
the same peer at different times: samples $(v, t)$ and $(v, t')$ gath-
ered at distinct times $t \neq t'$ are viewed as distinct, even when
they come from the same peer. The key difference is that it
must be possible to sample from the same peer more than once,
at different points in time. Using the formulation $(v, t)$,
$v \in V_t$, $t \in [t_0, t_0 + \Delta]$, the sampling technique will not be biased by
the dynamics of peer behavior, because the sample set is decou-
pled from peer session lengths. To our knowledge, no prior P2P
measurement studies relying on sampling make this distinction.
Returning to our simple example, our approach will correctly
select long-lived peers half the time and short-lived peers half
the time. When the samples are examined, they will show that
half of the peers in the system at any given moment have many
files while half of the peers have few files, which is exactly cor-
rect.
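A small numeric sketch of this toy example (the 500/500 split and one-minute replacement rate come from the example above; the window length and variable names are our own illustrative choices):

```python
# Toy model of the example above: at any instant the system holds 500
# long-lived peers (many files) and 500 short-lived peers (few files)
# that are replaced roughly every minute.
LONG, SLOTS, WINDOW_MIN = 500, 500, 60  # 60-minute measurement window (assumed)

# Window-based sampling: every distinct peer seen during the window is an
# equally likely sample, so short-lived peers dominate the sample set.
distinct_short = SLOTS * WINDOW_MIN          # one new peer per slot per minute
distinct_long = LONG
frac_few_files_window = distinct_short / (distinct_short + distinct_long)

# Instant-based sampling of (peer, time) pairs: at any instant half the
# population is long-lived, so the estimate matches the true 50/50 split.
frac_few_files_instant = SLOTS / (SLOTS + LONG)

print(f"window-based estimate of few-file peers:  {frac_few_files_window:.2%}")
print(f"instant-based estimate of few-file peers: {frac_few_files_instant:.2%}")
```

With these numbers, the window-based approach reports that roughly 98% of peers share few files, while the instant-based approach correctly reports 50%.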
If the measurement window $\Delta$ is sufficiently small, such
that the distribution of the property under consideration does
not change significantly during the measurement window, then
we may relax the constraint of choosing $t$ uniformly at random
from $[t_0, t_0 + \Delta]$.
We still have the significant problem of selecting a peer uni-
formly at random from those present at a particular time. We
begin to address this problem in Section IV.
IV. SAMPLING FROM STATIC GRAPHS
We now turn our attention to topological causes of bias.
Towards this end, we momentarily set aside the temporal issues
by assuming a static, unchanging graph. The selection process
begins with knowledge of one peer (vertex) and progressively
queries peers for a list of neighbors. The goal is to select peers
uniformly at random. In any graph-exploration problem, we
have a set of visited peers (vertices) and a front of unexplored
neighboring peers. There are two ways in which algorithms
differ: (i) how to choose the next peer to explore, and (ii) which
subset of the explored peers to select as samples. Prior studies
use simple breadth-first or depth-first approaches to explore the
graph and select all explored peers. These approaches suffer
from several problems:
- The discovered peers are correlated by their neighbor relationship.
- Peers with higher degree are more likely to be selected.
- Because they never visit the same peer twice, they will introduce bias when used in a dynamic setting as described in Section III.
1) Random Walks: A better candidate solution is the random
walk, which has been extensively studied in the graph theory lit-
erature (for an excellent survey see [24]). We briefly summarize
the key terminology and results relevant to sampling. The tran-
sition matrix $P(x, y)$ describes the probability of transitioning
to peer $y$ if the walk is currently at peer $x$:

$$P(x, y) = \begin{cases} \dfrac{1}{\mathrm{degree}(x)} & \text{if } y \text{ is a neighbor of } x,\\ 0 & \text{otherwise.} \end{cases}$$

If the vector $\mathbf{v}$ describes the probability of currently being at
each peer, then the vector $\mathbf{v}' = \mathbf{v}P$ describes the probability
after taking one additional step. Likewise, $\mathbf{v}P^{r}$ describes the
probability after taking $r$ steps. As long as the graph is con-
nected and not bipartite, the probability of being at any partic-
ular node, $x$, converges to a stationary distribution:

$$\pi(x) = \lim_{r \to \infty} (\mathbf{v}P^{r})(x) = \frac{\mathrm{degree}(x)}{2 \cdot |E|}.$$
In other words, if we select a peer as a sample every $r$ steps, for
sufficiently large $r$, we have the following good properties:
- The information stored in the starting vector, $\mathbf{v}$, is lost through the repeated selection of random neighbors. Therefore, there is no correlation between selected peers. Alternately, we may start many walks in parallel. In either case, after $r$ steps, the selection is independent of the origin.
- While the stationary distribution, $\pi(x)$, is biased towards peers with high degree, the bias is precisely known, allowing us to correct it.
- Random walks may visit the same peer twice, which lends itself better to a dynamic setting as described in Section III.
In practice, $r$ need not be exceptionally large. For graphs
where the edges have a strong random component (e.g., small-
world graphs such as peer-to-peer networks), it is sufficient that
the number of steps exceed the log of the population size, i.e.,
$r = O(\log |V|)$.
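The degree-proportional stationary distribution can be checked numerically on a small static graph. The sketch below is our own illustration (not the paper's simulator): it runs a long ordinary random walk and compares visit frequencies against degree(x)/(2|E|).

```python
import random
from collections import Counter

# Small undirected, connected, non-bipartite graph as an adjacency list.
graph = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3, 4], 3: [0, 2], 4: [2]}

def random_walk(graph, start, steps):
    """Ordinary random walk: move to a uniformly chosen neighbor each step."""
    visits = Counter()
    node = start
    for _ in range(steps):
        node = random.choice(graph[node])
        visits[node] += 1
    return visits

steps = 200_000
visits = random_walk(graph, start=0, steps=steps)
two_E = sum(len(neighbors) for neighbors in graph.values())  # 2*|E|

for node in sorted(graph):
    observed = visits[node] / steps
    expected = len(graph[node]) / two_E      # pi(x) = degree(x) / (2|E|)
    print(f"node {node}: observed {observed:.3f}, expected {expected:.3f}")
```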
2) Adjusting for Degree Bias: To correct for the bias towards
high-degree peers, we make use of the Metropolis-Hastings
method for Markov chains. Random walks on a graph are a spe-
cial case of Markov chains. In a regular random walk, the tran-
sition matrix $P(x, y)$ leads to the stationary distribution $\pi(x)$,
as described above. We would like to choose a new transition
matrix, $Q(x, y)$, to produce a different stationary distribution,
$\mu(x)$. Specifically, we desire $\mu(x)$ to be the uniform distribu-
tion so that all peers are equally likely to be at the end of the
walk. Metropolis-Hastings [6]-[8] provides us with the desired
$Q(x, y)$:

$$Q(x, y) = \begin{cases} P(x, y) \cdot \min\!\left(\dfrac{\mu(y)\,P(y, x)}{\mu(x)\,P(x, y)}, 1\right) & \text{if } x \neq y,\\ 1 - \sum_{z \neq x} Q(x, z) & \text{if } x = y. \end{cases}$$

Equivalently, to take a step from peer $x$, select a neighbor $y$
of $x$ as normal (i.e., with probability $P(x, y)$). Then, with
probability $\min\!\left(\frac{\mu(y) P(y, x)}{\mu(x) P(x, y)}, 1\right)$, accept the move. Otherwise,
remain at $x$.
To collect uniform samples, we have $\mu(x) = \mu(y)$, so the move-
acceptance probability becomes:

$$\min\!\left(\frac{\mu(y)\,P(y, x)}{\mu(x)\,P(x, y)}, 1\right) = \min\!\left(\frac{\mathrm{degree}(x)}{\mathrm{degree}(y)}, 1\right).$$
Therefore, our algorithm for selecting the next step from some
peer $x$ is as follows:
- Select a neighbor $y$ of $x$ uniformly at random.
- Query $y$ for a list of its neighbors, to determine its degree.
- Generate a random value, $p$, uniformly between 0 and 1.
- If $p \leq \mathrm{degree}(x)/\mathrm{degree}(y)$, $y$ is the next step.
- Otherwise, remain at $x$ as the next step.
We call this the Metropolized Random Walk (MRW). Quali-
tatively, the effect is to suppress the rate of transition to peers of
higher degree, resulting in selecting each peer with equal prob-
ability.
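A minimal sketch of one MRW step, following the algorithm above; query_neighbors is a placeholder for whatever neighbor-discovery primitive the overlay provides, and the function names are ours:

```python
import random

def mrw_step(current, query_neighbors):
    """One Metropolized Random Walk step.

    query_neighbors(peer) must return the peer's neighbor list; it stands in
    for the protocol-specific "list your neighbors" query described above.
    """
    neighbors = query_neighbors(current)
    candidate = random.choice(neighbors)                 # pick y with prob 1/degree(x)
    accept_prob = min(len(neighbors) / len(query_neighbors(candidate)), 1.0)
    if random.random() <= accept_prob:                   # accept with degree(x)/degree(y)
        return candidate
    return current                                       # otherwise remain at x

def mrw_sample(start, query_neighbors, hops=25):
    """Take `hops` MRW steps and return the final peer as one sample."""
    peer = start
    for _ in range(hops):
        peer = mrw_step(peer, query_neighbors)
    return peer
```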
TABLE I. Kolmogorov-Smirnov test statistic for techniques over static graphs; values above the 5%-level critical value lie in the rejection region.

Fig. 1. Bias of different sampling techniques; the figures show how many peers (y-axis) were selected a given number of times (x-axis).
3) Evaluation: Although [6] provides a proof of correctness
for the Metropolis-Hastings method, to ensure the correctness
of our implementation we conduct evaluations through simu-
lation over static graphs. This additionally provides the oppor-
tunity to compare MRW with conventional techniques such as
Breadth-First Search (BFS) or naive random walks (RW) with
no adjustments for degree bias.
To evaluate a technique, we use it to collect a large number of
sample vertices from a graph, then perform a goodness-of-fit test
against the uniform distribution. For Breadth-First Search, we
simulate typical usage by running it to gather a batch of 1,000
peers. When one batch of samples is collected, the process is
reset and begins anew at a different starting point. To ensure
robustness with respect to different kinds of connectivity struc-
tures, we examine each technique over several types of graphs
as follows:
- Erdős-Rényi: The simplest variety of random graphs.
- Watts-Strogatz: Small-world graphs with high clustering and low path lengths.
- Barabási-Albert: Graphs with extreme degree distributions, also known as power-law or scale-free graphs.
- Gnutella: Snapshots of the Gnutella ultrapeer topology, captured in our earlier work [39].
To make the results more comparable, the number of vertices
and edges in each graph are approximately the same. (The
Erdős-Rényi graphs are generated with an edge probability chosen
so that the expected number of edges is close to that of the other
graphs, though the exact value may vary slightly; the Watts-Strogatz
construction constrains the number of edges, so its edge count
differs slightly.) Table I presents the results of the goodness-of-fit
tests after collecting the samples, showing that Metropolis-Hastings
appears to generate uniform samples over each type of graph,
while the other techniques fail to do so by a wide margin.
Fig. 1 explores the results visually, by plotting the number
of times each peer is selected. If we select $k \cdot |V|$ samples, the
typical node should be selected $k$ times, with other nodes being
selected close to $k$ times, approximately following a normal dis-
tribution (based on the normal approximation of a binomial
distribution). We used $k = 1000$. We also
include an Oracle technique, which selects peers uniformly
at random using global information. The Metropolis-Hastings
results are virtually identical to the Oracle, while the other
techniques select many peers much more or much less than $k$
times. In the Gnutella, Watts-Strogatz, and Barabási-Albert
graphs, Breadth-First Search exhibits a few vertices that are
selected a very large number of times. The (not-adjusted)
Random Walk (RW) method similarly selected a few vertices
an exceptionally large number of times in the Gnutella and
Barabási-Albert models. The Oracle and MRW, by contrast, did
not select any vertex more than around 1,300 times.
In summary, the Metropolis-Hastings method selects peers
uniformly at random from a static graph. Section V exam-
ines the additional complexities when selecting from a dynamic
graph, introduces appropriate modifications, and evaluates the
algorithm's performance.
V. SAMPLING FROM DYNAMIC GRAPHS
Section III set aside topological issues and examined the dy-
namic aspects of sampling. Section IV set aside temporal issues
and examined the topological aspects of sampling. This section
examines the unique problems that arise when both temporal
and topological difficulties are present.
Our hypothesis is that a Metropolis-Hastings random walk
will yield approximately unbiased samples even in a dynamic
environment. Simulation results testing this hypothesis appear later
in this section and empirical tests appear in Section VI. The funda-
mental assumption of Metropolis-Hastings is that the frequency
of visiting a peer is proportional to the peer's degree. This as-
sumption will be approximately correct if peer relationships
change only slightly during the walk. On one extreme, if the
entire walk completes before any graph changes occur, then the
problem reduces to the static case. If a single edge is removed
mid-walk, the probability of selecting the two affected peers
is not significantly affected, unless those peers have very few
edges. If many edges are added and removed during a random
walk, but the degree of each peer does not change significantly,
we would also expect that the probability of selecting each peer
will not change significantly. In peer-to-peer systems, each peer
actively tries to maintain a number of connections within a cer-
tain range, so we have reason to believe that the degree of each
peer will be relatively stable in practice. On the other hand, it is
quite possible that in a highly dynamic environment, or for cer-
tain degree distributions, the assumptions of Metropolis-Hast-
ings are grossly violated and it fails to gather approximately un-
biased samples.
The fundamental question we attempt to answer in this sec-
tion is: "Under what conditions does the Metropolis-Hastings
random walk fail to gather approximately unbiased samples?"
If there is any bias in the samples, the bias will be tied to
some property that interacts with the walk. Put another way,
if there were no properties that interacted with the walk, then
the walking process would behave as it would on a static graph, for
which we have a proof from graph theory. Therefore, we are
only worried about properties which cause the walk to behave
differently. We identify the following three fundamental prop-
erties that interact with the walk:
- Degree: the number of neighbors of each peer. The Metropolis-Hastings method is a modification of a regular random walk in order to correct for degree bias as described in Section IV. It assumes a fixed relationship between degree and the probability of visiting a peer. If the Metropolis-Hastings assumptions are invalid, the degree correction may not operate correctly, introducing a bias correlated with degree.
- Session lengths: how long peers remain in the system. Section III showed how sampling may result in a bias based on session length. If the walk is more likely to select either short-lived or long-lived peers, there will be a bias correlated with session length.
- Query latency: how long it takes the sampler to query a peer for a list of its neighbors. In a static environment the only notion of time is the number of steps taken by the walk. In a dynamic environment, each step requires querying a peer, and some peers will respond more quickly than others. This could lead to a bias correlated with the query latency. In our simulations, we model the query latency as twice the round-trip time between the sampling node and the peer being queried (one-half RTT each for the SYN, the SYN-ACK, the ACK plus the request, and the reply).
For other peer properties, sampling bias can only arise if the
desired property is correlated with one of these fundamental properties
and that fundamental property exhibits bias. For example, when
sampling the number of files shared by each peer, there may be
sampling bias if the number of files is correlated with session
length and sampling is biased with respect to session length.
One could also imagine the number of files being correlated
with query latency (which is very loosely related to the peer's
bandwidth). However, sampling the number of shared files
cannot be biased independently, as it does not interact with
the walk. To show that sampling is unbiased for any property,
it is sufficient to show that it is unbiased for the fundamental
properties that interact with the sampling technique.
A. Coping With Departing Peers
Departing peers introduce an additional practical considera-
tion. The walk may try to query a peer that is no longer present, a
case where the behavior of the ordinary random walk algorithm
is undefined. We employ a simple adaptation to mimic an ordi-
nary random walk on a static graph as closely as possible, by
maintaining a stack of visited peers. When the walk chooses a
new peer to query, we push the peer's address on the stack. If the
query times out, we pop the address off the stack, and choose a
new neighbor of the peer that is now on top of the stack. If all of
a peer's neighbors time out, we re-query that peer to get a fresh
list of its neighbors. If the re-query also times out, we pop that
peer from the stack as well, and so on. If the stack underflows,
we consider the walk a failure. We do not count timed-out peers
as a hop for the purposes of measuring the length of the walk.
We call this adaptation of the MRW sampling technique the
Metropolized Random Walk with Backtracking (MRWB) method
for sampling from dynamic graphs. Note that when applied in a
static environment, this method reduces to MRW.
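A simplified sketch of the backtracking idea, assuming a query_neighbors function that returns None on timeout; the names are ours and the sketch omits the "all neighbors timed out, re-query the peer" refinement described above:

```python
import random

def mrwb_sample(start, query_neighbors, hops=25):
    """Metropolized Random Walk with Backtracking (simplified sketch).

    query_neighbors(peer) returns a neighbor list, or None on timeout.
    Timed-out peers do not count as hops; the walk backtracks via a stack.
    """
    stack = [start]
    taken = 0
    while taken < hops:
        if not stack:
            raise RuntimeError("walk failed: stack underflow")
        current = stack[-1]
        neighbors = query_neighbors(current)
        if neighbors is None:          # current peer departed: backtrack
            stack.pop()
            continue
        candidate = random.choice(neighbors)
        cand_neighbors = query_neighbors(candidate)
        if cand_neighbors is None:     # candidate departed: pick again, no hop counted
            continue
        # Metropolis-Hastings acceptance to remove degree bias.
        if random.random() <= min(len(neighbors) / len(cand_neighbors), 1.0):
            stack.append(candidate)
        taken += 1                     # both "move" and "stay" count as a hop
    return stack[-1]
```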
B. Evaluation Methodology
In the static case, we can rely on graph theory to prove the
accuracy of the MRW technique. Unfortunately, graph theory
is not well-suited to the problem of dynamically changing
graphs. Therefore, we rely on simulation rather than analysis.
We have developed a session-level dynamic overlay simulator
that models peer arrivals, departures, latencies, and neighbor
connections. We now describe our simulation environment.
The latencies between peers are modeled using values from
the King data set [40]. Peers learn about one another using one of
several peer discovery mechanisms described below. Peers have
a target minimum number of connections (i.e., degree) that they
attempt to maintain at all times. Whenever they have fewer con-
nections, they open additional connections. We assume connec-
tions are TCP and require a 3-way handshake before the connec-
tion is fully established, and that peers will time out an attempted
connection to a departed peer after 10 seconds. A new peer gen-
erates its session length from one of several different session
length distributions described below and departs when the ses-
sion length expires. New peers arrive according to a Poisson
process, where we select the mean peer arrival rate based on the
session length distribution to achieve a target population size of
100,000 peers.
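The Poisson arrival rate follows from the target population and the mean session length (Little's law); a small sketch with an assumed example session length, since the fitted values are not reproduced here:

```python
# Little's law: steady-state population N = arrival_rate * mean_session_length,
# so the simulator can pick the Poisson arrival rate from the target population.
TARGET_POPULATION = 100_000        # peers (target from the text)
mean_session_sec = 30 * 60         # example: 30-minute mean session (assumed value)

arrival_rate = TARGET_POPULATION / mean_session_sec   # peers per second
print(f"arrival rate: {arrival_rate:.1f} peers/second")
```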
To query a peer for a list of neighbors, the sampling node
must set up a TCP connection, submit its query, and receive a
response. The query times out if no response is received after 10
seconds; this value was selected based on our experiments in
developing a crawler for the Gnutella network in [12]. We run the
simulator for a warm-up period to reach steady-state conditions
before performing any random walks.
Fig. 2. Comparison of sampled and expected distributions. They are visually indistinguishable.
Fig. 3. Distribution of time needed to complete a random walk (simulated).
Our goal is to discover if random walks started under identical
conditions will select a peer uniformly at random. To evaluate
this, we start 100,000 concurrent random walks from a single lo-
cation. Although started at the same time, the walks will not all
complete at the same time, since not every hop takes the same
amount of time due to differences in latencies and the occasional
timeout. We chose to use 100,000 walks as
We chose to use 100,000 walks as
we believe this is a much larger number of samples than most
researchers will use in practice. If there is no discernible bias
with 100,000 samples, we can conclude that the tool is unbiased
for the purposes of gathering fewer samples (i.e., we cannot get
more accuracy by using less precision). Fig. 3 shows the dis-
tribution of how long walks take to complete in one simulation
using 50 hops per walk, illustrating that most walks take 10-20
seconds to complete. In the simulator the walks do not interact
or interfere with one another in any way. Each walk ends and
collects an independent sample.
As an expected distribution, we capture a perfect snapshot
(i.e., using an oracle) at the median walk-completion time, i.e.,
when 50% of the walks have completed.
C. Evaluation of a Base Case
Because the potential number of simulation parameters is un-
bounded, we need a systematic method to intelligently explore
the most interesting portion of this parameter space. Towards
this end, we begin with a base case of parameters as a starting
point and examine the behavior of MRWB under those condi-
tions. In Sections V-D and V-E, we vary the parameters and ex-
plore how the amount of bias varies as a function of each of the
parameters. As a base case, we use the configuration summarized in Table II.
Fig. 2 presents the sampled and expected distributions for the
three fundamental properties: degree, session length, and query
latency. The fact that the sampled and expected distributions are
visually indistinguishable demonstrates that the samples are not
significantly biased in the base case.
TABLE II. Base case configuration.
To efficiently examine other cases, we introduce a sum-
mary statistic to quickly capture the difference between the
sampled and expected distributions, and to provide more rigor
than a purely visual inspection. For this purpose, we use the
Kolmogorov-Smirnov (KS) statistic, $D$, formally defined as
follows. Where $F_s(x)$ is the sampled cumulative distribution
function and $F_e(x)$ is the expected cumulative distribution
function from the perfect snapshot, the KS statistic is

$$D = \max_x \left| F_s(x) - F_e(x) \right|.$$

In other words, if we plot the sampled and expected CDFs, $D$
is the maximum vertical distance between them and has a pos-
sible range of $[0, 1]$. For Fig. 2(a)-(c), the values of $D$ were
0.0019, 0.0023, and 0.0037, respectively. For comparison, at the
5% significance level, the critical value of $D$ is 0.0061 for the two-sample KS
statistic with 100,000 data points each. However, in practice we
do not expect most researchers to gather hundreds of thousands
of samples. After all, the initial motivation for sampling is to
gather reasonably accurate data at relatively low cost. As a rough
rule of thumb, a value of $D \geq 0.1$ is quite bad, corresponding to at
least a 10 percentage point difference on a CDF. A value of $D \leq 0.01$
is excellent for most purposes when studying a peer property,
corresponding to no more than a 1 percentage point difference
on a CDF.
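A small sketch of computing the two-sample KS statistic D directly from raw samples (equivalent to the maximum vertical CDF distance defined above; the variable names are ours):

```python
import bisect

def ks_statistic(sampled, expected):
    """Two-sample Kolmogorov-Smirnov statistic: max vertical CDF distance."""
    s_sorted, e_sorted = sorted(sampled), sorted(expected)
    xs = sorted(set(sampled) | set(expected))

    def cdf(sorted_values, x):
        # Fraction of values <= x (empirical CDF).
        return bisect.bisect_right(sorted_values, x) / len(sorted_values)

    return max(abs(cdf(s_sorted, x) - cdf(e_sorted, x)) for x in xs)

# Example: compare a sampled degree list against a snapshot's degree list.
print(ks_statistic([1, 2, 2, 3, 5], [1, 2, 3, 3, 4, 5]))
```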
D. Exploring Different Dynamics
In this section, we examine how the amount of bias changes as
we vary the type and rate of dynamics in the system. We examine
different settings of the simulation parameters that affect dy-
namics, while continuing to use the topological characteristics
from our base case (Table II). We would expect that as the rate
of peer dynamics increases, the sampling error also increases.
The key question is: "How fast can the churn rate be before it
causes significant error, and is that likely to occur in practice?"
In this subsection, we present the results of simulations with
a wide variety of rates using three different models for session
length, as follows (a short sampling sketch follows the list):
- Exponential: The exponential distribution is a one-parameter distribution (rate $\lambda$) that features sessions relatively close together in length. It has been used in many prior simulation and analysis studies of peer-to-peer systems [41]-[43].
- Pareto: The Pareto (or power-law) distribution is a two-parameter distribution (shape $\alpha$, location $x_m$) that features many short sessions coupled with a few very long sessions. Some prior measurement studies of peer-to-peer systems have suggested that session lengths follow a Pareto distribution [44]-[46]. One difficulty with this model is that $x_m$ is a lower bound on the session length, and fits of $x_m$ to empirical data are often unreasonably high (i.e., placing a lower bound significantly higher than the median session length reported by other measurement studies). In their insightful analytical study of churn in peer-to-peer systems, Leonard, Rai, and Loguinov [47] instead suggest using a shifted Pareto distribution (shape $\alpha$, scale $\beta$). We use this shifted Pareto distribution, holding the shape fixed and varying the scale parameter $\beta$. We examine two different shape values: one with $\alpha \leq 2$ (infinite variance) and one with $\alpha > 2$ (finite variance).
- Weibull: Our own empirical observations [5] suggest the Weibull distribution (shape $k$, scale $\lambda$) provides a good model of peer session lengths, representing a compromise between the exponential and Pareto distributions. We fix the shape parameter $k$ (based on our empirical data) and vary the scale parameter $\lambda$.
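The sketch below draws session lengths from the three models via inverse-transform sampling; the parameter values are illustrative placeholders, not the fitted values used in the paper:

```python
import math
import random

def exponential_session(median_sec):
    """Exponential: rate chosen so the median equals median_sec."""
    rate = math.log(2) / median_sec
    return random.expovariate(rate)

def shifted_pareto_session(shape, scale):
    """Shifted (Lomax-style) Pareto via inverse-transform sampling."""
    u = random.random()
    return scale * ((1.0 - u) ** (-1.0 / shape) - 1.0)

def weibull_session(shape, scale):
    """Weibull session length (scale first per random.weibullvariate)."""
    return random.weibullvariate(scale, shape)

# Illustrative draws (all parameters below are assumptions, not fitted values).
print(exponential_session(median_sec=600))
print(shifted_pareto_session(shape=2.1, scale=600))
print(weibull_session(shape=0.6, scale=600))
```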
Fig. 4. Sampling error of the three fundamental properties as a function of session-length distribution; exceptionally heavy churn (median session length under roughly one minute) introduces error into the sampling process.

Fig. 4 presents the amount of sampling error $D$ as a func-
tion of median session length, for the three fundamental proper-
ties, with a logarithmic x-axis scale. The figure shows that error
is low over a wide range of session lengths but be-
gins to become significant when the median session length drops
below 2 minutes, and grows large when the median drops
below 30 seconds. The type of distribution varies the threshold
slightly, but overall does not appear to have a significant im-
pact. To investigate whether the critical threshold is a function
of the length of the walk, we ran some simulations using walks
of 10,000 hops (which take around one simulated hour to com-
plete). Despite the long duration of these walks, they remained
unbiased, with low $D$ values for each of the three
fundamental properties. This suggests that the accuracy of
MRWB is not adversely affected by a long walk.
While the median session length reported by measurement
studies varies considerably (see [42] for a summary), none re-
port a median below 1 minute and two studies report a median
session length of one hour [3], [4]. In summary, these results
demonstrate that MRWB can gracefully tolerate peer dynamics.
In particular, it performs well over the rate of churn reported in
real systems.
E. Exploring Different Topologies
In this section, we examine different settings of the simulation
parameters that directly affect topological structure, while using
the dynamic characteristics from our base case (Table II). The
Metropolis-Hastings method makes use of the ratio between the
degrees of neighboring peers. If this ratio fluctuates dramatically
while the walk is conducted, it may introduce significant bias.
If peers often have only a few connections, any change in their
degree will result in a large percentage-wise change. One key
question is therefore: "Does a low target degree lead to sampling
bias, and, if so, when is significant bias introduced?"
The degree of peers is controlled by three factors. First, each
peer has a peer discovery mechanism that enables it to learn
the addresses of potential neighbors. The peer discovery mech-
anism will influence the structure of the topology and, if per-
forming poorly, will limit the ability of peers to establish con-
nections. Second, peers have a target degree which they actively
try to maintain. If they have fewer neighbors than the target, they
open additional connections until they have reached the target.
If necessary, they make use of the peer discovery mechanism
to locate additional potential neighbors. Finally, peers have a
maximum degree, which limits the number of neighbors they are
willing to accept. If they are at the maximum and another peer
contacts them, they refuse the connection. Each of these three
factors influences the graph structure, and therefore may affect
the walk.
We model four different types of peer discovery mechanisms,
based on those found in real systems:
- Random Oracle: This is the simplest and most idealistic approach. Peers learn about one another by contacting a rendezvous point that has perfect global knowledge of the system and returns a random set of peers for them to connect to.
- FIFO: In this scheme, inspired by the GWebCaches of Gnutella [48], peers contact a rendezvous point which returns a list of the last $m$ peers that contacted the rendezvous, where $m$ is the maximum peer degree.
- Soft State: Inspired by the approach of BitTorrent's trackers, peers contact a rendezvous point that has imperfect global knowledge of the system. In addition to contacting the rendezvous point to learn about more peers, every peer periodically (every half hour) contacts the
rendezvous point to refresh its state. If a peer fails to make
contact for 45 minutes, the rendezvous point removes it
from the list of known peers.
- History: Many P2P applications connect to the network using addresses they learned during a previous session [49]. A large fraction of these addresses will time out, but typically enough of the peers will still be active to avoid the need to contact a centralized rendezvous point. As tracking the re-appearance of peers greatly complicates our simulator (as well as greatly increasing its memory requirements), we use a coarse model of the History mechanism. We assume that 90% of connection attempts automatically time out. The 10% that are given valid addresses are skewed towards peers that have been present for a long time (more than one hour) and represent regular users who might have been present during the peer's last session. While this might be overly pessimistic, it reveals the behavior of MRWB under harsh conditions.
Fig. 5. Sampling error of the three fundamental properties as a function of the number of connections each peer actively attempts to maintain; a very low target degree (two or fewer) introduces significant sampling error.

Fig. 5 presents the amount of sampling error $D$ for the
three fundamental properties as a function of the target degree,
for each of the peer discovery methods, holding the maximum
peer degree fixed at 30 neighbors. It shows that sampling
is not significantly biased in any of the three fundamental
properties as long as peers attempt to maintain at least three
connections. Widely deployed peer-to-peer systems typically
maintain dozens of neighbors. Moreover, maintaining fewer
than three neighbors per peer almost certainly leads to network
fragmentation, and is therefore not a reasonable operating point
for peer-to-peer systems.
The results for the different peer-discovery mechanisms were
similar to one another, except for a small amount of bias ob-
served when using the History mechanism as the target degree
approaches the maximum degree (30). To investigate this issue,
Fig. 6 presents the sampled and expected degree distributions
when using the History mechanism with a target degree of 30.
The difference between the sampled and expected distributions
is due to the 2.4% of peers with a degree of zero. These iso-
lated peers arise in this scenario because the History mechanism
has a high failure rate (returning addresses primarily of departed
peers), and when a valid address is found, it frequently points
to a peer that is already at its connection limit. The zero-degree
peers are visible in the snapshot (which uses an oracle to obtain
global information), but not to the sampler (since peers with a
degree of zero have no neighbors and can never be reached).
We do not regard omitting disconnected peers as a serious lim-
itation.
Fig. 6. Comparison of degree distributions using the History mechanism with a target degree of 30. Sampling cannot capture the unconnected peers (degree = 0), causing the sampling error observed in Fig. 5.
Having explored the effects of lowering the degree, we now
explore the effects of increasing it. In Fig. 7, we examine sam-
pling error as a function of the maximum degree, with the target
degree always set to 15 less than the maximum. There is little
error for any setting of the maximum degree.
In summary, the proposed MRWB technique for sampling
from dynamic graphs appears unbiased for a range of different
topologies with reasonable degree distributions, operates
correctly for a number of different mechanisms for peer
discovery, and is largely insensitive to a wide range of peer
dynamics, with the churn rates reported for real systems safely
within this range.
VI. EMPIRICAL RESULTS
In addition to the simulator version, we have implemented
the MRWB algorithm for sampling from real peer-to-peer net-
works into a tool called ion-sampler. Sections VI-A through VI-E
briefly describe the implementation and usage of ion-sampler
and present empirical experiments to validate its accuracy.
A. Ion-Sampler
The ion-sampler tool uses a modular design that accepts
plug-ins for new peer-to-peer systems; in fact, it uses the same
plug-in architecture as our earlier, heavier-weight tool, Cruiser,
which exhaustively crawls peer-to-peer systems to capture
topology snapshots. A plug-in can be written
for any peer-to-peer system that allows querying a peer for a list
of its neighbors. The ion-sampler tool hands IP-address:port
pairs to the plug-in, which later returns a list of neighbors or sig-
nals that a timeout occurred. The ion-sampler tool is respon-
sible for managing the walks. It outputs the samples to stan-
dard output, where they may be easily read by another tool that
collects the actual measurements.
Fig. 7. Sampling error of the three fundamental properties as a function of the maximum number of connections each peer will accept; each peer actively attempts to maintain 15 fewer connections than the maximum.

For example, ion-sampler could be used with existing measurement tools for measuring
bandwidth to estimate the distribution of access link bandwidth
in a peer-to-peer system. Fig. 12 shows an example of using
ion-sampler to sample peers from Gnutella.
B. Empirical Validation
Empirical validation is challenging due to the absence of
high-quality reference data to compare against. In our earlier
work [12], [39], we developed a peer-to-peer crawler called
Cruiser that captures the complete overlay topology through
exhaustive exploration. We can use these topology snapshots as
a point of reference for the degree distribution. Unfortunately,
we do not have reliably accurate empirical reference data for
session lengths or query latency.
By capturing every peer, Cruiser is immune to sampling diffi-
culties. However, because the network changes as Cruiser oper-
ates, its snapshots are slightly distorted [12]. In particular, peers
arriving near the start of the crawl are likely to have found addi-
tional neighbors by the time Cruiser contacts them. Therefore,
we intuitively expect a slight upward bias in Cruiser's observed
degree distribution. For this reason, we would not expect a per-
fect match between Cruiser and sampling, but if the sampling is
unbiased we still expect them to be very close. We can view the
CCDF version of the degree distribution captured by Cruiser as
a close upper bound on the true degree distribution.
Fig. 8 presents a comparison of the degree distribution of
reachable ultrapeers in Gnutella, as seen by Cruiser and by the
sampling tool (capturing approximately 1,000 samples). It also
includes the results of a short crawl, a sampling technique
commonly used in earlier studies (e.g., [3]); a "short crawl" here is
a progressive exploration of a portion of the graph, such as by
breadth-first or depth-first search, where in this case we randomly
select the next peer to explore. We interleaved running these
measurement tools to minimize the change in the system between
measurements of different tools, in order to make their results
comparable.
Examining Fig. 8, we see that the full crawl and sampling
distributions are quite similar. The sampling tool finds slightly
more peers with lower degree, compared to the full crawl, in ac-
cordance with our expectations described above. We examined
several such pairs of crawling and sampling data and found the
same pattern in each pair. By comparison, the short crawl ex-
hibits a substantial bias towards high-degree peers relative to
both the full crawl and sampling.

Fig. 8. Comparison of degree distributions observed from sampling versus exhaustively crawling all peers.

We computed the KS statistic
between each pair of datasets, presented in Table III. Since
the full crawl is a close upper bound of the true degree dis-
tribution, and since sampling's distribution is lower, the error
in the sampling distribution relative to the true distribution is
$D \leq 0.043$. On the other hand, because the short crawl data ex-
ceeds the full crawl distribution, its error relative to the true dis-
tribution is $D \geq 0.120$. In other words, the true $D$ for the sam-
pling data is at most 0.043, while the true $D$ for the short crawl data is
at least 0.120. It is possible that sampling with MRWB
produces more accurate results than a full crawl (which suffers
from distortion), but this is difficult to prove conclusively.
C. Efficiency
Having demonstrated the validity of the MRWB technique,
we now turn our attention to its efficiency. Performing the walks
requires $s \cdot r$ queries, where $s$ is the desired number of samples
and $r$ is the length of each walk in hops. If $r$ is too low, significant
bias may be introduced. If $r$ is too high, it should not introduce
bias, but it is less efficient. From graph theory, we expect to require
$r = O(\log |V|)$ for an ordinary random walk.
To empirically explore the selection of $r$ for Gnutella, we
conducted many sets of sampling experiments using different
values of $r$, with full crawls interspersed between the sampling
experiments. For each sampling experiment, we compute the KS
statistic, $D$, between the sampled degree distribution and that
captured by the most recent crawl. Fig. 9 presents the mean and
standard deviation of $D$ as a function of $r$ across different exper-
iments. The figure shows that low values of $r$ can lead to
enormous bias. The amount of bias decreases rapidly
with $r$, and low bias is observed for longer walks. However, in a
single experiment with a longer walk length, we observed an
anomalously high $D$, while all other experiments at that length
showed much lower values.
TABLE III. KS statistic (D) between pairs of empirical datasets.
Fig. 10. Difference between sampled results and a crawl as a function of walk
length, after the change suggested in Section VI-C. Each experiment was re-
peated several times. Error bars show the sample standard deviation.
Fig. 9. Difference between sampled results and a crawl as a function of walk
length. Each experiment was repeated several times. Error bars show the sample
standard deviation.
Investigating the anomalous dataset, we found that a single peer
had been selected 309 out of 999 times.
Further examining the trace of this walk, we found that the
walk happened to start at a peer with only a single neighbor.
In such a case, the walk gets stuck at that peer due to the way
Metropolis-Hastings transitions from the current peer $x$ to a new
peer $y$ with probability only $\min(\mathrm{degree}(x)/\mathrm{degree}(y), 1)$, which is
small when the current peer has very few neighbors. When this
"stuck" event occurs late in the walk, it is just part of the normal
re-weighting that corrects for a regular random walk's bias towards
high-degree peers. However, when it occurs during the first step of
the walk, a large fraction of the walks will end at the unusual
low-degree peer, resulting in an anomalous set of selections where
the same peer is chosen many times.
Fig. 11. Runtime of ion-sampler as a function of walk length when collecting 1,000 samples.

One way to address this problem is to increase the walk length.
However, this reduces the efficiency of the walk. More impor-
tantly, we typically do not accurately know the maximum de-
gree; while increasing $r$ decreases the probability of an
anomalous event, it does not preclude it. Therefore, we suggest
the following heuristic to prevent such problems from occur-
ring. During the first few steps of the walk, always transition
to the next peer as in a regular random walk; after the first few
steps, use the MetropolisHastings method for deciding whether
to transition to the next peer or remain at the current one. This
modification eliminates the correlations induced by sharing a
single starting location, while keeping the walk length relatively
short. We repeated the experiment after making this change; the
results are shown in Fig. 10. The observed error in the revised
implementation is low for , with low variance. In other
words, the samples are consistently accurate for .
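A sketch of the suggested modification: force acceptance for the first few hops, then fall back to the Metropolis-Hastings acceptance rule. The warmup_hops name is ours for "the first few steps", and query_neighbors is the same placeholder primitive as in the earlier sketches:

```python
import random

def mrw_walk_with_warmup(start, query_neighbors, hops=25, warmup_hops=3):
    """MRW walk that always moves during the first few hops.

    Forcing acceptance early prevents many concurrent walks from getting
    stuck at a low-degree starting peer; afterwards the usual
    Metropolis-Hastings acceptance restores the uniform stationary
    distribution.
    """
    peer = start
    for hop in range(hops):
        neighbors = query_neighbors(peer)
        candidate = random.choice(neighbors)
        if hop < warmup_hops:
            peer = candidate                        # regular random-walk step
            continue
        accept = min(len(neighbors) / len(query_neighbors(candidate)), 1.0)
        if random.random() <= accept:
            peer = candidate
    return peer
```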
In light of these considerations, we conservatively regard a
choice of $r = 25$ as a safe walk length for Gnutella. Choosing
$r = 25$, we can collect 1,000 samples by querying 25,000
peers, over an order of magnitude in savings compared with per-
forming a full crawl, which must contact more than 400,000 peers.
Fig. 12. Example usage of the ion-sampler tool. We specify that we want to use the Gnutella plug-in, each walk should take 25 hops, and we would like 10 samples. The tool then prints out 10 IP-address:port pairs. We have changed the first octet of each result to 10 for privacy reasons.
D. Execution Time
We examined execution time as a function of the number of
hops, $r$, and present the results in Fig. 11. With $r = 25$ hops,
the execution time is around 10 minutes. In our initial imple-
mentation of ion-sampler, a small fraction of walks would get
stuck in a corner of the network, repeatedly trying to contact
a set of departed peers. While the walks eventually recover, this
corner case significantly and needlessly delayed the overall exe-
cution time. We added a small cache that remembers the addresses
of unresponsive peers to address this issue.
For comparison, Cruiser takes around 13 minutes to capture
the entire topology. This begs the question: if ion-sampler
does an order of magnitude less work, why is its
running time only slightly better? While ion-sampler contacts
significantly fewer peers, walks are sequential in nature,
which limits the amount of parallelism that ion-sampler can
exploit. Cruiser, on the other hand, can query peers almost
entirely in parallel,
but it must still do $O(|V|)$ work, where $|V|$ is the population size. In
other words, if a peer-to-peer network doubles in size, Cruiser
will take twice as long to capture it. Alternately, we can keep
Cruiser's execution time approximately constant by doubling the
amount of hardware and bandwidth we provision for Cruiser's
use. The ion-sampler tool requires only $O(s \cdot r)$ work, where the
walk length $r$ grows only logarithmically with the population size,
so there is little change in its behavior as the network grows.
While longer execution time has a negative impact on the ac-
curacy of Cruiser's results, ion-sampler's results are not sig-
nificantly impacted by the time required to perform the walk
(as demonstrated in Section V-D, where we simulate walks of
10,000 hops).
E. Summary
In summary, these empirical results support the conclusion
that a Metropolized Random Walk with Backtracking is an ap-
propriate method of collecting measurements from peer-to-peer
systems, and demonstrate that it is significantly more accurate
than other common sampling techniques. They also illustrate the
dramatic improvement in efficiency and scalability of MRWB
compared to performing a full crawl. As network size increases,
the cost of a full crawl grows linearly and takes longer to com-
plete, introducing greater distortion into the captured snapshots.
For MRWB, the cost increases logarithmically, and no addi-
tional bias is introduced.
VII. DISCUSSION
A. How Many Samples are Required?
An important consideration when collecting samples is to
know how many samples are needed for statistically significant
results. This is principally a property of the distribution being
sampled. Consider the problem of estimating the underlying fre-
quency $p$ of an event, e.g., that the peer degree takes a particular
value. Given $n$ unbiased samples, an unbiased estimate of $p$ is
$\hat{p} = m/n$, where $m$ is the number of samples for which the
event occurs. $\hat{p}$ has root mean square (RMS) relative error

$$\delta = \frac{\sqrt{\operatorname{Var}(\hat{p})}}{p} = \sqrt{\frac{1 - p}{n p}}.$$
From this expression, we derive the following observations:
- Estimation error does not depend on the population size; in particular, the estimation properties of unbiased sampling scale independently of the size of the system under study.
- The above expression can be inverted to derive the number of samples required to estimate an outcome of frequency $p$ up to a relative error $\delta$. A simple bound is $n \geq 1/(\delta^2 p)$.
- Unsurprisingly, smaller-frequency outcomes have a larger relative error. For example, gathering 1,000 unbiased samples gives us very little useful information about events which only occur one time in 10,000; the associated $\delta$ value is approximately 3: the likely error dominates the value to be estimated. This motivates using biased sampling in circumstances that we discuss in Section VII-B. A brief numeric illustration follows this list.
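A quick numeric check of the error formula and the sample-size bound (the symbols p, n, and delta follow the definitions above; the example parameters are our own):

```python
import math

def rms_relative_error(p, n):
    """RMS relative error of the estimate m/n for an event of frequency p."""
    return math.sqrt((1.0 - p) / (n * p))

def samples_needed(p, delta):
    """Simple sufficient bound on the number of unbiased samples."""
    return math.ceil(1.0 / (delta ** 2 * p))

print(rms_relative_error(p=1e-4, n=1000))   # ~3.2: the error swamps the estimate
print(samples_needed(p=0.01, delta=0.1))    # 10,000 samples for 10% error on a 1%-frequency event
```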
The presence of sampling bias complicates the picture. If an
event with underlying frequency $p$ is actually sampled with fre-
quency $p'$, then the RMS relative error acquires an additional
term $|p' - p|/p$ which does not reduce as the number of sam-
ples grows. In other words, when sampling from a biased
distribution, increasing the number of samples only increases
the accuracy with which we estimate the biased distribution.
B. Unbiased Versus Biased Sampling
At the beginning of this paper, we set the goal of collecting
unbiased samples. However, there are circumstances where
unbiased samples are inefficient. For example, while unbiased
samples provide accurate information about the body of a
distribution, they provide very little information about the tails:
the pitfall of estimating rare events we discussed in the previous
subsection.
In circumstances such as studying infrequent events, it may
be desirable to gather samples with a known sampling bias, i.e.,
with non-uniform sampling probabilities. By deliberately intro-
ducing a sampling bias towards the area of interest, more rele-
vant samples can be gathered. During analysis of the data, each
sample is weighted inversely to the probability that it is sampled.
This yields unbiased estimates of the quantities of interest, even
though the selection of the samples is biased. This approach is
known as importance sampling [50].
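A sketch of the re-weighting step: each sample is weighted by the inverse of its (known) sampling probability, yielding an unbiased estimate from deliberately biased samples. The function and parameter names below are ours, not part of the tool:

```python
def weighted_fraction(samples, sampling_prob, predicate):
    """Estimate the fraction of peers satisfying `predicate`.

    samples       : list of sampled peers (possibly drawn with known bias)
    sampling_prob : function giving each peer's relative selection probability
    predicate     : property of interest (e.g., "shares more than 1000 files")
    """
    weights = [1.0 / sampling_prob(peer) for peer in samples]
    hits = sum(w for peer, w in zip(samples, weights) if predicate(peer))
    return hits / sum(weights)
```

This self-normalized weighting undoes the deliberate bias during analysis, which is exactly the idea behind importance sampling described above.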
A known bias can be introduced by choosing an appropriate
definition of the target distribution $\mu(x)$ in the Metropolis-Hastings
equations presented in Section IV and altering the walk accordingly. Because
the desired type of known bias depends on the focus of the re-
search, we cannot exhaustively demonstrate through simulation
that Metropolis-Hastings will operate correctly in a dynamic
environment for any $\mu(x)$. Our results show that it works well
in the common case where unbiased samples are desired (i.e.,
$\mu(x) = \mu(y)$ for all $x$ and $y$).
C. Sampling From Structured Systems
Throughout this paper, we have assumed an unstructured
peer-to-peer network. Structured systems (also known as Dis-
tributed Hash Tables or DHTs) should work just as well with
random walks, provided links are still bidirectional. However,
the structure of these systems often allows a more efficient
technique.
In a typical DHT scheme, each peer has a randomly generated
identifier. Peers form an overlay that actively maintains certain
properties such that messages are efficiently routed to the peer
closest to a target identifier. The exact properties and the defi-
nition of closest vary, but the theme remains the same. In these
systems, to select a peer at random, we may simply generate an
identifier uniformly at random and find the peer closest to the
identifier. Because peer identifiers are generated uniformly at
random, we know they are uncorrelated with any other prop-
erty. This technique is simple and effective, as long as there is
little variation in the amount of identifier space that each peer is
responsible for. We made use of this sampling technique in our
study of the widely-deployed Kad DHT [51].
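A sketch of the DHT sampling idea for a Kademlia-style 160-bit identifier space; lookup_closest stands in for the DHT's routing lookup and is not the API of any particular implementation:

```python
import random

ID_BITS = 160  # Kademlia-style identifier space (assumed width)

def sample_dht_peer(lookup_closest):
    """Select a peer roughly uniformly at random from a DHT.

    lookup_closest(target_id) must route through the overlay and return the
    peer whose identifier is closest to target_id.  Because peer identifiers
    are generated uniformly at random, the returned peer is uncorrelated with
    any other peer property, up to variation in identifier-space ownership.
    """
    target = random.getrandbits(ID_BITS)   # uniformly random point in the ID space
    return lookup_closest(target)
```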
VIII. CONCLUSIONS AND FUTURE WORK
This paper explores the problem of sampling representative
peer properties in large and dynamic unstructured P2P systems.
We show that the topological and temporal properties of P2P
systems can lead to significant bias in collected samples.
To collect unbiased samples, we present the Metropolized
Random Walk with Backtracking (MRWB), a modification of
the Metropolis-Hastings technique, which we developed into
the ion-sampler tool. Using both simulation and empirical
evaluation, we show that MRWB can collect approximately
unbiased samples of peer properties over a wide range of
realistic peer dynamics and topological structures.
We are pursuing this work in the following directions. First,
we are exploring improving sampling efficiency for uncommon
events (such as in the tail of distributions) by introducing
known bias, as discussed in Section VII-B. Second, we are
studying the behavior of MRWB under flash-crowd scenarios,
where not only the properties of individual peers are changing,
but the distribution of those properties is also rapidly evolving.
Finally, we are developing additional plug-ins for ion-sampler
and using it in conjunction with other measurement tools to
accurately characterize several properties of widely-deployed
P2P systems.
ACKNOWLEDGMENT
The authors would like to thank A. Rasti and J. Capehart for
their invaluable efforts in developing the dynamic overlay sim-
ulator, and V. Lo for her valuable feedback on this paper.
REFERENCES
[1] I. Stoica, R. Morris, D. Liben-Nowell, D. R. Karger, M. F. Kaashoek, F. Dabek, and H. Balakrishnan, "Chord: A scalable peer-to-peer lookup protocol for Internet applications," IEEE/ACM Trans. Networking, vol. 11, no. 1, pp. 17-32, Feb. 2002.
[2] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker, "A scalable content-addressable network," presented at the ACM SIGCOMM 2001, San Diego, CA.
[3] S. Saroiu, P. K. Gummadi, and S. D. Gribble, "Measuring and analyzing the characteristics of Napster and Gnutella hosts," Multimedia Syst. J., vol. 9, no. 2, pp. 170-184, Aug. 2003.
[4] R. Bhagwan, S. Savage, and G. Voelker, "Understanding availability," presented at the 2003 Int. Workshop on Peer-to-Peer Systems, Berkeley, CA.
[5] D. Stutzbach and R. Rejaie, "Understanding churn in peer-to-peer networks," presented at the 2006 Internet Measurement Conf., Rio de Janeiro, Brazil.
[6] S. Chib and E. Greenberg, "Understanding the Metropolis-Hastings algorithm," The American Statistician, vol. 49, no. 4, pp. 327-335, Nov. 1995.
[7] W. Hastings, "Monte Carlo sampling methods using Markov chains and their applications," Biometrika, vol. 57, pp. 97-109, 1970.
[8] N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller, and E. Teller, "Equations of state calculations by fast computing machines," J. Chem. Phys., vol. 21, pp. 1087-1092, 1953.
[9] A. Awan, R. A. Ferreira, S. Jagannathan, and A. Grama, "Distributed uniform sampling in unstructured peer-to-peer networks," presented at the 2006 Hawaii Int. Conf. System Sciences, Kauai, HI, Jan. 2006.
[10] Z. Bar-Yossef and M. Gurevich, "Random sampling from a search engine's index," presented at the 2006 WWW Conf., Edinburgh, Scotland.
[11] D. Stutzbach, R. Rejaie, N. Duffield, S. Sen, and W. Willinger, "Sampling techniques for large, dynamic graphs," presented at the 2006 Global Internet Symp., Barcelona, Spain, Apr. 2006.
[12] D. Stutzbach and R. Rejaie, "Capturing accurate snapshots of the Gnutella network," in Proc. 2005 Global Internet Symp., Miami, FL, Mar. 2005, pp. 127-132.
[13] B. Bollobás, "A probabilistic proof of an asymptotic formula for the number of labelled regular graphs," Eur. J. Combinator., vol. 1, pp. 311-316, 1980.
[14] M. Jerrum and A. Sinclair, "Fast uniform generation of regular graphs," Theoret. Comput. Sci., vol. 73, pp. 91-100, 1990.
[15] C. Cooper, M. Dyer, and C. Greenhill, "Sampling regular graphs and a peer-to-peer network," in Proc. Symp. Discrete Algorithms, 2005, pp. 980-988.
[16] V. Krishnamurthy, J. Sun, M. Faloutsos, and S. Tauro, "Sampling Internet topologies: How small can we go?," in Proc. 2003 Int. Conf. Internet Computing, Las Vegas, NV, Jun. 2003, pp. 577-580.
[17] V. Krishnamurthy, M. Faloutsos, M. Chrobak, L. Lao, and J.-H. C. G. Percus, "Reducing large Internet topologies for faster simulations," presented at the 2005 IFIP Networking Conf., Waterloo, Ontario, Canada, May 2005.
[18] M. P. H. Stumpf, C. Wiuf, and R. M. May, "Subnets of scale-free networks are not scale-free: Sampling properties of networks," Proc. National Academy of Sciences, vol. 102, no. 12, pp. 4221-4224, Mar. 2005.
[19] A. A. Tsay, W. S. Lovejoy, and D. R. Karger, "Random sampling in cut, flow, and network design problems," Math. Oper. Res., vol. 24, no. 2, pp. 383-413, Feb. 1999.
[20] A. Lakhina, J. W. Byers, M. Crovella, and P. Xie, "Sampling biases in IP topology measurements," presented at the IEEE INFOCOM 2003, San Francisco, CA.
[21] D. Achlioptas, A. Clauset, D. Kempe, and C. Moore, "On the bias of traceroute sampling; or, power-law degree distributions in regular graphs," presented at the 2005 Symp. Theory of Computing, Baltimore, MD, May 2005.
RANDOM STRING GENERATION USING BIOMETRIC
AUTHENTICATION SCHEME FOR PROTECTING AND SECURING
DATA IN DISTRIBUTED SYSTEMS
Asst Prof B. SHANMUGHA SUNDARAM¹, S. MADHUMATHY², R. MONISHA³
Gojan School of Business & Technology
ssmcame@gmail.com¹, madhumathy17@gmail.com², monisha179@gmail.com³
Abstract- Remote authentication is the most commonly used method to determine the identity of a remote client. Three-factor authentication provides smart-card-based authentication together with biometrics after successful entry of a valid password. It can still fail if these authentication factors are compromised (e.g., an attacker has successfully obtained the password and the data in the smart card). A generic and secure framework is proposed to upgrade three-factor authentication: a random string is generated, after validation using human characteristics, to log in successfully using a smart-card-based password authentication protocol and a cryptographic algorithm.
Index Terms- Authentication, distributed systems, security, privacy, password, biometrics.
1 INTRODUCTION
IN a distributed system, various resources are
distributed in the form of network services provided
and managed by servers. Remote authentication is the
most commonly used method to determine the
identity of a remote client. In general, there are three
authentication factors:
1. Something the client knows: password.
2. Something the client is: biometric characteristics
(e.g., fingerprint, voiceprint, and iris scan)
3. Something that the client gets: Random string.
Most early authentication mechanisms are solely
based on password. While such protocols are
relatively easy to implement, passwords (and human
generated passwords in particular) have many
vulnerabilities. As an example, human generated and
memorable passwords are usually short strings of
characters and (sometimes) poorly selected. By
exploiting these vulnerabilities, simple dictionary
attacks can crack passwords in a short time. Due to
these concerns, hardware authentication tokens are
introduced to strengthen the security in user
authentication, and smart-card-based password
authentication has become one of the most common
authentication mechanisms. Smart-card-based password authentication provides two-factor authentication, namely a successful login requires the client to have a valid smart card and a correct
password. While it provides stronger security
guarantees than password authentication, it could also
fail if both authentication factors are compromised
(e.g., an attacker has successfully obtained the
password and the data in the smart card). In this case, a third authentication factor can alleviate the problem and further improve the system's assurance. Another
authentication mechanism is biometric
authentication, where users are identified by their
measurable human characteristics, such as
fingerprint, voiceprint, and iris scan. Biometric
characteristics are believed to be a reliable
authentication factor since they provide a potential
source of high-entropy information and cannot be
easily lost or forgotten. Despite these merits,
biometric authentication has some imperfect features.
Unlike password, biometric characteristics cannot be
easily changed or revoked. Some biometric
characteristics (e.g., fingerprint) can be easily
obtained without the awareness of the owner. This
motivates the three-factor authentication, which
incorporates the advantages of the authentication
based on password, biometrics and generation of
random string.
1.1 MOTIVATION
The motivation of this paper is to investigate a
systematic approach for the design of secure three-
factor authentication with the protection of user
privacy. Three-factor authentication is introduced to
incorporate the advantages of the authentication
based on password, biometrics and generation of
random string. A well designed three-factor
authentication protocol can greatly improve the
information assurance in distributed systems.
1.1.1 SECURITY ISSUES
Most existing three factor authentication protocols
are flawed and cannot meet security requirements in
their applications. Even worse, some improvements
of those flawed protocols are not secure either. The
research history of three-factor authentication can be
summarized in the following sequence.
NEW PROTOCOLS! BROKEN! IMPROVED
PROTOCOLS! BROKEN AGAIN!
1.1.2 PRIVACY ISSUES
Along with the improved security features, three-
factor authentication also raises another subtle issue,
namely how to protect the biometric data. Not only is
this the privacy information of the owner, it is also
closely related to the security in the authentication.
As biometrics cannot be easily changed, the breached
biometric information (either on the server side or the
client side) will make the biometric authentication
totally meaningless. However, this issue has received
less attention than it deserves from protocol
designers. We believe it is worthwhile, both in theory
and in practice, to investigate a generic framework
for three-factor authentication, which can preserve
the security and the privacy in distributed systems.
1.2 CONTRIBUTIONS
The main contribution of this paper is a generic
framework for three-factor authentication in
distributed systems. The proposed framework has
several merits as follows: First, we demonstrate how
to incorporate biometrics in the existing
authentication based on password. Our framework is
generic rather than instantiated in the sense that it
does not have any additional requirements on the
underlying smart-card-based password
authentication. Not only will this simplify the design
and analysis of three-factor authentication protocols,
but also it will contribute a secure and generic
upgrade from two-factor authentication to three-
factor authentication possessing the practice-friendly
properties of the underlying two-factor authentication
system. Second, authentication protocols in our framework can provide true three-factor authentication, namely a successful authentication requires the password, biometric characteristics and the generated random string. In addition, our framework can be easily adapted to allow the server to decide which authentication factors are required in user authentication (instead of all three authentication factors). Last, in the proposed framework clients' biometric characteristics are kept secret from servers.
This not only protects user privacy but also prevents
a single-point failure (e.g., a breached server) from
undermining the authentication level of other
services. Furthermore, the verification of all
authentication factors is performed by the server. In
particular, our framework does not rely on any
trusted devices to verify the authentication factors,
which also suits distributed systems, where devices cannot be fully trusted.
1.3 RELATED WORK
Several authentication protocols have been proposed
to integrate biometric authentication with password
authentication and/or smart-card authentication. Lee
et al. designed an authentication system which does
not need a password table to authenticate registered
users. Instead, smart card and fingerprint are required
in the authentication; however, Lee et al.'s scheme is insecure under a conspiring attack. Lin and Lai showed that Lee et al.'s scheme is vulnerable to a masquerade attack; namely, a legitimate user (i.e., a user who has registered on the system) is able to make a successful login on behalf of other users. An improved authentication protocol was given by Lin and Lai to fix that flaw. The new protocol, however, has several other security vulnerabilities. First, Lin-Lai's scheme only provides client authentication rather than mutual authentication, which makes it susceptible to the server spoofing attack. Second, the password changing phase in Lin-Lai's scheme is not secure, as the smart card cannot check the correctness of old passwords.
Third, Lin-Lai's scheme is insecure under impersonation attacks according to the analysis given by Yoon and Yoo, who also proposed a new scheme. However, that scheme was in turn broken and improved by Lee and Kwon. Kim et al. proposed two ID-based password authentication schemes where users are authenticated by smart cards, passwords, and fingerprints. However, Scott showed that a passive eavesdropper (without access to any smart card, password or fingerprint) can successfully login to the server on behalf of any claimed identity after passively eavesdropping only one legitimate login. Bhargav-Spantzel et al. proposed a privacy-preserving multifactor authentication protocol with biometrics. The authentication server in their protocol does not have the biometric information of registered clients. However, the biometric authentication is implemented using zero-knowledge proofs, which requires the server to maintain a database storing all users' commitments and uses costly modular exponentiations in the finite group.
2 PRELIMINARIES
This section reviews the definitions of smart-card-
based password authentication, three-factor
authentication, and fuzzy extractor.
2.1 Smart-Card-Based Password Authentication
Definition 1. A smart-card-based password
authentication protocol (hereinafter referred to as
SCPAP) consists of four phases.
2-Factor-Initialization: The server (denoted by S)
generates two system parameters PK and SK. PK is
published in the system, and SK is kept secret by S.
2-Factor-Reg: The client (denoted by C), with an initial password PW, registers on the system by running this interactive protocol with S. The output of this protocol is a smart card SC.
2-Factor-Login-Auth: This is another interactive protocol between the client and the server, which enables the client to login successfully using PW and SC. The output of this protocol is 1 (if the authentication is successful) or 0 (otherwise).
2-Factor-Password-Changing: This protocol enables
a client to change his/her password after a successful
authentication (i.e., 2-Factor-Login-Auth outputs
1). The data in the smart card will be updated
accordingly.
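To make Definition 1 concrete, the schematic sketch below expresses the four phases as a Python interface. The class and method names are illustrative only; they are not part of the definition or of any concrete protocol.

# A schematic sketch of the four SCPAP phases as an interface.
from abc import ABC, abstractmethod

class SCPAP(ABC):
    """Smart-card-based password authentication protocol (Definition 1)."""

    @abstractmethod
    def initialization(self):
        """2-Factor-Initialization: the server generates (PK, SK); PK is public."""

    @abstractmethod
    def register(self, client_id, password):
        """2-Factor-Reg: interactive registration; returns smart-card data SC."""

    @abstractmethod
    def login_auth(self, password, smart_card):
        """2-Factor-Login-Auth: returns 1 on success, 0 otherwise."""

    @abstractmethod
    def change_password(self, old_password, new_password, smart_card):
        """2-Factor-Password-Changing: allowed only after a successful login;
        updates the data stored in the smart card."""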
The attacker on SCPAP can be classified from two
aspects: the behavior of the attacker and the
information compromised by the attacker.
As an interactive protocol, SCPAP may face passive
attackers and active attackers.
Passive attacker. A passive attacker can obtain
messages transmitted between the client and the
server. However, it cannot interact with the client or
the server.
Active attacker. An active attacker has the full
control of the communication channel. In addition to
message eaves-dropping, the attacker can arbitrarily
inject, modify, and delete messages in the
communication between the client and the server.
On the other hand, SCPAP is a two-factor
authentication protocol, namely a successful login
requires a valid smart card and a correct password.
According to the compromised secret, an attacker can
be further classified into the following two types.
Attacker with smart card. This type of attacker has
the smart card, and can read and modify the data in
the smart card. Notice that there are techniques to
restrict access to both reading and modifying data in
the smart card. Nevertheless, from the security point
of view, authentication protocols will be more robust
if they are secure against attackers with the ability to
do that.
Attacker with password. The attacker is assumed to
have the password of the client but is not given the
smart card.
Definition 2 (Secure SCPAP). The basic security
requirement of SCPAP is that it should be secure
against a passive attacker with smart card and a
passive attacker with password. It is certainly more
desirable that SCPAP is secure against an active
attacker with smart card and an active attacker with
password.
2.2 THREE-FACTOR AUTHENTICATION
Three-factor authentication is very similar to smart-
card-based password authentication, with the only
difference that it requires biometric characteristics as
an additional authentication factor.
Definition 3 (Three-Factor Authentication). A three-
factor authentication protocol involves a client C and
a server S, and consists of five phases.
3-Factor-Initialization: S generates two system parameters PK and SK. PK is published in the system, and SK is kept secret by S. An execution of this algorithm is denoted by 3-Factor-Initialization -> (PK, SK).
3-Factor-Reg: A client C, with an initial password PW and biometric characteristics BioData, registers on the system by running this interactive protocol with the server S. The output of this protocol is a smart card SC, which is given to C.
3-Factor-Login-Auth: This is another interactive
protocol between the client C and the server S, which
enables the client
to login successfully using PW, SC, and BioData.
The output of this protocol is 1 (if the
authentication is successful) or 0 (otherwise).
3-Factor-Password-Changing: This protocol
enables a client to change his/her password after a
successful authentication. The data in the smart card
will be updated accordingly.
3-Factor-Biometrics-Changing: An analogue of password-changing is biometrics-changing, namely the client can change his/her biometrics used in the authentication, e.g., using a different finger or using the iris instead of a finger.
While biometrics-changing is not supported by
previous three-factor authentication protocols, we
believe it provides the client with more flexibility in
the authentication.
Security requirements. A three-factor authentication
protocol can also face passive attackers and active
attackers as defined in SCPAP (Section. 2.1). A
passive (an active) attacker can be further classified
into the following three types.
Type I attacker has the smart card and the biometric
characteristics of the client. It is not given the
password of that client.
Type II attacker has the password and the biometric
characteristics. It is not allowed to obtain the data in
the smart card.
Type III attacker has the smart card and the password
of the client. It is not given the biometric
characteristics of that client. Notice that such an
attacker is free to mount any attacks on the
(unknown) biometrics, including biometrics faking
and attacks on the metadata (related to the
biometrics) stored in the smart card.
Definition 4 (Secure Three-Factor Authentication).
For a three-factor authentication protocol, the basic
security requirement is that it should be secure
against passive type I, type II, and type III attackers.
It is certainly more desirable that a three-factor
authentication protocol is secure against active type I,
type II, and type III attackers.
2.3 FUZZY EXTRACTOR
A fuzzy extractor extracts a nearly random string R
from its biometric input w in an error-tolerant way. If
the input changes but remains close, the extracted R
remains the same. To assist in recovering R from a biometric input w', a fuzzy extractor outputs an auxiliary string P. However, R remains uniformly random even given P. The fuzzy extractor is formally defined as below.
Definition 5 (Fuzzy Extractor). A fuzzy extractor is given by two procedures (Gen, Rep). Gen is a probabilistic generation procedure which, on (biometric) input w ∈ M, outputs an extracted string R ∈ {0,1}^l together with the auxiliary string P.
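As a toy illustration of the Gen/Rep interface in Definition 5, the sketch below uses the code-offset idea with a simple repetition code. It shows the error-tolerant recovery of R, but it is not a secure, production-quality fuzzy extractor, and all parameters are illustrative assumptions.

# Toy Gen/Rep sketch: code-offset construction with a repetition code.
import secrets
import hashlib

REP = 5  # each secret bit repeated REP times; tolerates < REP/2 flips per block

def _encode(bits):            # repetition-code encoder
    return [b for b in bits for _ in range(REP)]

def _decode(bits):            # majority-vote decoder
    return [int(sum(bits[i:i + REP]) > REP // 2) for i in range(0, len(bits), REP)]

def gen(w):
    """Gen: from biometric bits w, output extracted string R and helper P."""
    k = len(w) // REP
    s = [secrets.randbelow(2) for _ in range(k)]          # random secret bits
    p = [wi ^ ci for wi, ci in zip(w, _encode(s))]        # helper data P = w XOR C(s)
    r = hashlib.sha256(bytes(s)).hexdigest()              # extracted string R
    return r, p

def rep(w_prime, p):
    """Rep: recover R from a close reading w' and the helper string P."""
    s = _decode([wi ^ pi for wi, pi in zip(w_prime, p)])
    return hashlib.sha256(bytes(s)).hexdigest()

# A reading with a few flipped bits still reproduces the same R.
w = [secrets.randbelow(2) for _ in range(100)]
r, p = gen(w)
w_noisy = list(w); w_noisy[3] ^= 1; w_noisy[42] ^= 1
assert rep(w_noisy, p) == r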
Figure 1.3 is a mapping of the multidimensional temporal data into an intuitive analytical construct that we call a temporal cluster graph. These graphs contain important information about the proportion of common transaction types within each time period, relationships and similarities between common transaction types across time periods, and trends in common transaction types over time.
Figure 1.2 Partitioning the data into clusters
Figure 1.3 Temporal cluster graph
Figure 1. Reducing multiattribute temporal data complexity by partitioning data into time periods and producing a temporal cluster graph.
In summary, the main contribution of this paper is the development of a novel and useful approach for visualization and analysis of multiattribute transactional data based on a new temporal cluster graph construct, as well as the implementation of this approach as the Cluster-based Temporal Representation of Event Data (C-TREND) system.
The rest of the paper is organized as follows. It provides an overview of related work in the temporal data mining and visualization research streams. It introduces the temporal cluster graph construct and describes the technique for mapping multiattribute temporal data to these graphs. It discusses the algorithmic implementation of the proposed technique as the C-TREND system and includes performance analyses, which present an evaluation of the technique using real-world data on wireless networking technology certifications, a discussion of possible applications, trend metrics and limitations associated with the proposed technique, and a brief discussion of future work. The conclusions are provided in terms of graphs.
2. KEY INGREDIENTS OF THE PROPOSED
SYSTEM
Here I propose a new data analysis and visualization technique for representing trends in multiattribute temporal data using a clustering-based approach. I introduce Cluster-based Temporal Representation of EveNt Data (C-TREND), a system that implements the temporal cluster graph construct, which maps multiattribute temporal data to a two-dimensional directed graph that identifies trends in dominant data types over time, and I discuss its algorithmic implementation and performance. In our project I use a dendrogram data structure for storing and extracting cluster solutions generated by hierarchical clustering algorithms; calculations are made using a tree structure.
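The sketch below is a minimal illustration of the temporal cluster graph construct described above: records are partitioned by time period, each partition is clustered, and clusters in adjacent periods whose centers are close are connected by weighted edges. K-means is used here purely for brevity; the implementation described below relies on hierarchical clustering and dendrograms, and the field names, k value and distance threshold are illustrative assumptions.

# A minimal temporal-cluster-graph sketch (not the C-TREND implementation).
import numpy as np
from sklearn.cluster import KMeans

def temporal_cluster_graph(records, periods, k=3, link_threshold=1.0):
    """records: array of shape (n, d); periods: array of period labels (n,).
    Returns (nodes, edges): nodes keyed by (period, cluster) with center and
    size; edges link clusters in consecutive periods whose centers are close."""
    nodes, edges = {}, []
    ordered = sorted(set(periods))
    for t in ordered:
        X = records[periods == t]
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
        for c in range(k):
            members = X[km.labels_ == c]
            nodes[(t, c)] = {"center": km.cluster_centers_[c], "size": len(members)}
    for t_prev, t_next in zip(ordered, ordered[1:]):
        for a in range(k):
            for b in range(k):
                d = np.linalg.norm(nodes[(t_prev, a)]["center"] - nodes[(t_next, b)]["center"])
                if d <= link_threshold:           # edge weight = center distance
                    edges.append(((t_prev, a), (t_next, b), d))
    return nodes, edges

# Toy usage with random two-attribute transactions over three periods.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
p = np.repeat([1, 2, 3], 100)
nodes, edges = temporal_cluster_graph(X, p, k=3, link_threshold=1.5)
print(len(nodes), "nodes,", len(edges), "edges")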
DATAFLOW DIAGRAM:
Figure: Dataflow diagram of C-TREND
C-TREND is the system implementation of the temporal cluster-graph-based trend identification and visualization technique; it provides an end user with the ability to generate graphs from data and adjust the graph parameters. C-TREND consists of two main phases, the first of which is offline preprocessing of the data.
DATA PREPROCESSING
Data Clustering
C-TREND utilizes optimized dendrogram
data structures for storing and extracting cluster
solutions generated by hierarchical clustering
algorithms (see Figure 2.3 below for a dendrogram example). While C-TREND can be extended to support partition-based clustering methods (e.g., k-means), the implementation described here relies on hierarchical clustering and its dendrograms.
2.1 Partition Of Dataset
Separating the data present in the dataset according to the company name and storing the values in separate tables.
Given input: result set (a collection of unrelated data)
Expected output: clustered data (relevant data separated and grouped)
2.2 Dendrogram Sorting
Apply the dendrogram extract algorithm to sort the values present in the partition. DENDRO_EXTRACT starts at the root of the dendrogram and traverses it by splitting the highest-numbered node (where the nodes are numbered according to how close they are to the root, as numbered in Figure 2.3) in the current set of clusters until k clusters are included in the set.
Dendrogram Data Structure
A dendrogram data structure allows for
quick extraction of any specific clustering solution for
each data partition when the user changes partition
zoom level ki. To obtain a specific clustering solution
from the data structure for data partition Di, C-
TREND uses the DENDRO_EXTRACT algorithm
(Algorithm 1), which takes the desired number of
clusters in the solution ki as an input and returns the
set CurrCl containing the clusters corresponding to
the ki-sized solution. Cluster attributes such as
center and size are then accessible from the
corresponding dendrogram data structure by
referencing the clusters in CurrCl.
Algorithm 1: DENDRO_EXTRACT
INPUT: ki - desired number of clusters
       i  - data partition indicator
begin
    if ki <= N then
        CurrCl = { DendrogramRoot_i }
        while |CurrCl| < ki
            MaxCl  = DendrogramRoot_i - |CurrCl| + 1
            CurrCl = (CurrCl \ {MaxCl}) U {MaxCl.Left} U {MaxCl.Right}
        return CurrCl
    else request new ki
end
MaxCl represents the highest element in the current
cluster set CurrCl. It is easy to see that because of
the specific dendrogram structure, it is always the
case that
MaxCl = DendrogramRoot_i - |CurrCl| + 1.
Furthermore, the dendrogram data array maintains
the successive levels of the hierarchical solution in
order; therefore, replacing MaxCl by its children
MaxCl.Left (left child) and MaxCl.Right (right child) is
sufficient for identifying the next solution level in the
dendrogram.
DENDRO_EXTRACT is linear in time complexity
O(ki), which provides for the real-time extraction of
cluster solutions.
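For readers who prefer runnable code, the sketch below reproduces the DENDRO_EXTRACT idea on top of a SciPy linkage matrix. It is an illustrative reimplementation, not the authors' C-TREND code, and the toy data and parameter names are assumptions.

# Illustrative DENDRO_EXTRACT on a SciPy linkage matrix.
import numpy as np
from scipy.cluster.hierarchy import linkage

def dendro_extract(Z, k):
    """Return the k-cluster solution as a list of leaf-index lists.
    Z is a SciPy linkage matrix for n observations; internal node n+i is
    created by merge i, so the root has the highest id. Splitting the
    highest-numbered node in the current set mirrors the paper's
    MaxCl = DendrogramRoot_i - |CurrCl| + 1 rule."""
    n = Z.shape[0] + 1                       # number of observations (leaves)
    if not (1 <= k <= n):
        raise ValueError("request new k")    # analogue of 'else request new ki'
    root = n + Z.shape[0] - 1
    current = {root}
    while len(current) < k:
        max_cl = max(current)                # highest-numbered (latest) merge
        left, right = int(Z[max_cl - n, 0]), int(Z[max_cl - n, 1])
        current = (current - {max_cl}) | {left, right}

    def leaves(node):                        # collect leaf indices under a node
        if node < n:
            return [node]
        l, r = int(Z[node - n, 0]), int(Z[node - n, 1])
        return leaves(l) + leaves(r)

    return [leaves(c) for c in sorted(current)]

# Example: the 3-cluster solution for a toy data partition.
X = np.random.rand(10, 4)
Z = linkage(X, method="ward")
print(dendro_extract(Z, 3))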
Given input: clustered column values
Expected output: sorted sequence for the given column
Using the above algorithm we sort all the values present in each column. Here we calculate the time for each table using the current time of the system; the calculated time is displayed in a separate table.
2.3 Calculate And Display The Time Of Sorting:
The last step in preprocessing is the
generation of the node list, which contains all
possible nodes and their sizes, and the edge list,
which contains all possible edges and their weights,
for the entire data set.
Creating these lists in the preprocessing
phase allows for more effective (real-time)
visualization updates of the C-TREND graphs. Each
data partition possesses an array-based
dendrogram data structure containing all its possible
clustering solutions. The node list is simply an
aggregate list of all dendrogram data structures
indexed for optimal node lookup. It should be noted
that the results reported in Table 1 were calculated
holding the number of attributes in the data constant
at 10. Since this process requires the calculation of
a distance metric for each edge, the time it takes to
generate the edge list should increase linearly with
the number of attributes in the data.
Table 1: Edge List Creation Times

Data Partitions (t) | Max Clusters (N) | # of Possible Edges | Edge List Creation Time (sec)
10                  | 10               | 3249                | 0.006
100                 | 10               | 35739               | 0.062
10                  | 100              | 356409              | 0.630
1000                | 10               | 360639              | 0.650
100                 | 100              | 3920499             | 6.891
10                  | 500              | 8982009             | 15.760
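The possible-edge counts in Table 1 appear to follow a simple pattern: a dendrogram over N clusters contains 2N - 1 nodes (N leaves plus N - 1 internal merges), and every node pair in adjacent partitions is a candidate edge, giving (t - 1)(2N - 1)^2 possible edges; for the first row, 9 x 19^2 = 3249. A quick check of this reading of the table:

# Check that the possible-edge counts in Table 1 equal (t-1)*(2N-1)**2.
rows = [(10, 10, 3249), (100, 10, 35739), (10, 100, 356409),
        (1000, 10, 360639), (100, 100, 3920499), (10, 500, 8982009)]
for t, N, edges in rows:
    assert (t - 1) * (2 * N - 1) ** 2 == edges
print("all Table 1 edge counts match (t-1)*(2N-1)^2")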
To demonstrate that this is indeed the case,
Figure 4.5 contains a plot of the increase in edge list
generation time as the number of attributes is being
increased from 10 to 100, holding N and t constant.
[Block diagram of the proposed system: an 8051 microcontroller with power supply and relay drivers/relays driving the robot, interfaced with a temperature sensor and signal conditioning circuit, a UL sensor, a PIR sensor, a gas sensor and a wireless camera; on the receiving side, a second 8051 microcontroller with power supply, LCD, keypad, alarm and wireless video receiver.]
TEMPERATURE SENSOR
The LM35 series are precision integrated-circuit temperature sensors whose output voltage is linearly proportional to the Celsius (Centigrade) temperature. The LM35 thus has an advantage over linear temperature sensors calibrated in Kelvin, as the user is not required to subtract a large constant voltage from its output to obtain convenient Centigrade scaling.
GAS SENSORS (MQ5)
Gas sensors interact with a gas to initiate the measurement of its concentration. The gas sensor then provides output to a gas instrument to display the measurements. Common gases measured by gas sensors include ammonia, aerosols, arsine, bromine, carbon dioxide, carbon monoxide, chlorine, chlorine dioxide, diborane, dust, fluorine, germane, halocarbons or refrigerants, hydrocarbons, hydrogen, hydrogen chloride, hydrogen cyanide, hydrogen fluoride, hydrogen selenide, hydrogen sulfide, mercury vapor, nitrogen dioxide, nitrogen oxides, nitric oxide, organic solvents, oxygen, ozone, phosphine, silane, sulfur dioxide, and water vapor.
WIRELESS TRANSMISSION
ZIGBEE
What is ZIGBEE???
ZigBee is an ad-hoc networking technology for LR-WPANs, based on the IEEE 802.15.4 standard, which defines the PHY and MAC layers for ZigBee. It is intended for the 2.45 GHz, 868 MHz and 915 MHz bands and is low in cost, complexity and power consumption compared to competing technologies. It is intended to network inexpensive devices; data rates reach 250 kbps for the 2.45 GHz band, 40 kbps at 915 MHz and 20 kbps at 868 MHz. ZigBee is an established set of specifications for wireless personal area networking (WPAN), i.e., digital radio connections between computers and related devices. This kind of network eliminates the use of physical data buses like USB and Ethernet cables. The devices could include telephones, hand-held digital assistants, sensors and controls located within a few meters of each other. ZigBee is one of the global standards of communication protocol formulated by the relevant task force under the IEEE 802.15 working group. The fourth in the series, WPAN Low Rate/ZigBee, is the newest and provides specifications for devices that have low data rates, consume very low power and are thus characterized by long battery life. Other standards like Bluetooth and IrDA address high data rate applications such as voice, video and LAN communications.
Figure 4. XBee Pro10
DC MOTOR
A DC motor is an electric motor that
runs on direct current (DC) electricity. A
DC motor consists of a stator, an
armature, a rotor and a commutator with
brushes. Opposite polarity between the two magnetic fields inside the motor causes it to turn. DC motors are the
simplest type of motor and are used in
household appliances, such as electric
razors.
Figure 3. DC MOTOR
ATMEL MICROCONTROLLER
The AT89C51 is a low-power, high-
performance CMOS 8-bit Microcomputer
with 4K bytes of Flash programmable
and erasable read only memory
(PEROM). The device is manufactured
using Atmel's high-density nonvolatile
memory technology and is compatible
with the industry-standard MCS-51
instruction set and pin out.
Figure 5. Microcontroller
The on-chip Flash allows the program memory to be reprogrammed in-system or by a conventional nonvolatile memory programmer. By combining a versatile 8-bit CPU with Flash on a monolithic chip, the Atmel AT89C51 is a powerful microcomputer which provides a highly flexible and cost-effective solution to many embedded control applications.
Advantages
The functionality of this ZigBee-enabled robot can be enhanced in future to address various critical data collection needs as follows:
1. The robot can be enhanced with GSM and GPS technology for long-distance communication.
2. The robot can also be used in rescue operations where human interaction or involvement is difficult.
3. The robot can be used in mines to identify inert gases and alcoholic gases to avoid environmental hazards.
4. The robot can be used to store images during image transmission, and this can be enhanced by implementing rotating HD cameras.
CONCLUSION
This paper describes a robot designed with integrated intelligence for network setup and message routing. Using ZigBee technology, the information, instructions and commands are transferred; here a ZigBee transceiver is used for data transmission and reception. Freedom of the robot's movement angle is achieved in our software as well as hardware. Future enhancement of our system, without major modifications, is specified.
Mining the Shirt sizes for Indian men by Clustered Classification
M. Martin Jeyasingh¹, Kumaravel Appavoo²
¹Associate Professor, National Institute of Fashion Technology, Chennai.
²Dean & Professor, Bharath Institute of Higher Education and Research, Chennai-73, Tamil Nadu, India.
¹mmjsingh@rediffmail.com, ²drkumaravel@gmail.com
Abstract-- In garment production engineering, the sizing system plays an important role in the manufacturing of clothing. This research work intends to introduce a robust approach that can be used for developing sizing systems with data mining techniques applied to Indian anthropometric data. Using a new two-stage data mining procedure, the shirt size types of Indian men are determined. The approach comprises two phases: first, cluster analysis; then the cases are sorted according to the cluster results to extract the most significant classification algorithms for shirt size, based on the results of the cluster and classification analysis. A sizing system is developed for Indian men aged between 25 and 66 years, based on the chest size determined in the data mining procedure.
In a sizing system, the definition of the size label is a critical issue, as it determines, as an interface for customers, how quickly the right garment size can be located for further consideration. In this paper, we have obtained classifications of men's shirt attributes based on clustering techniques.
Keywords-- Data Mining, Clustering, Classifiers, IBK KNN, Logitboost, Clothing industry, Anthropometric data.
I. INTRODUCTION
Garment sizing systems were originally
based on those developed by tailors in the late 18th
century. Professional dressmakers and craftsmen developed various sizing methods over the years.
They used unique techniques for measuring and fitting
their customers. In the 1920s, the demand for the mass
production of garments created the need for a standard
sizing system. Many researchers started working on
developing sizing system by the different methods and
data collecting approaches. It has been shown that garment manufacturing is the highest value-added industry in the textile manufacturing cycle [1]. Mass production by machines in this industry has replaced manual manufacturing, so the planning and quality control of production and inventory are very important for manufacturers. Moreover, this type of manufacturing must conform to certain standards and specifications. Furthermore, each country has its own
standard sizing systems for manufacturers to follow
and fit in with the figure types of the local population.
A sizing system classifies a specific
population into homogeneous subgroups based on
some key body dimensions [2]. Persons of the same
subgroup have the garment size. Standard sizing
systems can correctly predict manufacturing quantity
and proportion of production, resulting more accurate
production planning and control of materials [3, 4].
The standard unique techniques for measuring and
fitting their sizing systems have been used as a
communication tool among manufacturers, retailers
and consumers.
It can provide manufacturers with size
specification, design development, pattern grading
and market analysis. Manufacturers, basing their
judgments on the information, can produce different
type of garments with the various allowances for
specific market segmentation. Thus, establishing
standard sizing systems are necessary and important.
Many researchers worked on developing the sizing
system are necessary and important by many
approaches. They found very extensive data were
made by using anthropometric data [2].People have
changed in body shape over time. Workman [5]
demonstrated that the problem of ageing contributes
to the observed changes in body shape and size, more
than any other single factor, such as improved diet
and longer life expectancy [6]. Sizing concerns will
grow as the number of ageing consumers is expected
to double by the year 2030. This presents a marketing
challenge for the clothing industry since poor sizing is
the number one reason for returns and markdowns,
resulting in substantial losses. Therefore, sizing
systems have to be updated from time to time in order
to ensure the correct fit of ready-to-wear apparel.
Many countries have been undertaking sizing surveys
in recent years. Since sizing practices vary from
country to country, in 1968 Sweden originated the
first official approach to the International
Organization for Standardization (ISO) on the subject
of sizing of clothing, it being in the interest of the
general public that an international system be created.
After lengthy discussions and many proposals,
members of technical committee TC133 submitted
documents relating to secondary body dimensions,
their definitions and methods of measuring. This
eventually resulted in the publication of ISO 8559
Garment Construction and Anthropometric Surveys -
Body Dimension, which is currently used as an
international standard for all types of size survey [7].
Figure type plays a decisive role in a sizing
system and contributes to the topic of fit. So to find a
sizing system, different body types are first divided
from population, based on dimensions, such as height
or ratios between body measurements. A set of size
categories is developed, each containing a range of
sizes from small to large. The size range is generally
evenly distributed from the smallest to the largest size
in the most countries. For men's wear, the body length
and drop value are the two main measurements
characterizing the definition of figure type. The Bureau of Indian Standards (BIS) identified three body heights, short (166 cm), regular (174 cm) and tall (182 cm) [20], recommended the use of the difference in figure types as the classification of ready-to-wear garments, and developed a set of procedures to formulate standard sizes for all figure types. In early times, the
classification of figure types was based on body
weight and stature. Later on, anthropometric
dimensions were applied for classification. This type
of sizing system has the advantages of easy grading
and size labeling. But, the disadvantage is that the
structural constraints in the linear system may result
in a loose fit. Thus, some optimization methods have
been proposed to generate a better fit sizing system,
such as an integer programming approach [10] and a
nonlinear programming approach [11]. For the
development of sizing systems using optimization
methods, the structure of the sizing systems tends to
affect the predefined constraints and objectives.
Tryfos [10] indicated that the probability of purchase
depended on the distance between the sizing system
of a garment and the real size of an individual. In
order to optimize the number of sizes so as to
minimize the distance, an integer programming
approach was applied to choose the optimal sizes.
Later on, McCulloch, et al. [11] constructed a sizing
system by using a nonlinear optimization approach to
maximize the quality of fit. Recently, Gupta, et al.
[12] used a linear programming approach to classify
the size groups. Using the optimization method has
the advantages of generating a sizing system with an
optimal fit, but the irregular distribution of the optimal
sizes may increase the complexity in grading and the
cost of production. On the other hand, in recent years,
data mining has been widely used in area of science
and engineering. The application domain is quite
broad and plausible in bioinformatics, genetics,
medicine, education, electrical power engineering,
marketing, production, human resource management,
risk prediction, biomedical technology and health
insurance. In the field of sizing system in clothing
science, data mining techniques such as neural
networks [13], cluster analysis [14], the decision tree
approach [15] and two stage cluster analysis [16] have
been used. Clustering is the classification of objects
into different groups, or more precisely, the
partitioning of a data set into subsets (clusters), so that
the data in each subset (ideally) share some common
trait. Cluster analysis was used as an exploratory data
analysis tool for classification. In the clothing a
cluster which is typically grouped by the similarity of
its members shirt sizes can grouped by the K-means
cluster analysis method to classify the upper garment
sizing system. The pitfall of these methods is that it
requires one to pre-assign the number of clusters to
initialize the algorithm and it is usually subjectively
determined by experts. To overcome these
disadvantages, a two stage-based data mining
procedure include cluster analysis and classification
algorithms, is proposed here to eliminate the
requirement of subjective judgment and to improve
the effectiveness of size classification[8].
II.DATA MINING TECHNIQUES
A. Data Preparation
After the definition of the industry problem, the first stage of data mining, data preparation, is carried out to increase the efficiency and ensure the accuracy of the analysis through the processing and transformation of the data. Before starting to mine the data, they must be examined and all missing data and outliers must be handled. By examining the data before the application
of a multivariate technique, the researcher gains
several critical insights into the characteristics of the
data. In this research work, we used an anthropometric database which was collected from BIS and from clothing industrialists. Anthropometric data of 620 Indian men aged from 25 to 66 years were obtained from the database. The data mining process is shown in Fig. 1.
Fig. 1. Data mining process (raw data, data cleaning, data transformation, cluster analysis, classification analysis, validation, accuracy, prediction)
B. Cluster Analysis
The first step of the data mining approach was X-Means clustering in the cluster analysis. X-Means is K-Means extended by an improve-structure part; in this part of the algorithm an attempt is made to split each center within its region, and the decision between the children of each center and the center itself is made by comparing the BIC values of the two structures. With the difference between the age and the other
attributes, we determined the cluster numbers. In the cluster analysis, the K-means method was implemented to determine the final cluster categorization.
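A minimal sketch of this clustering stage is shown below. Synthetic values stand in for the 620-record anthropometric dataset, the column names follow Table 1, and six clusters are used simply to mirror the six size labels, so none of this reflects the authors' exact Weka/X-Means configuration.

# Illustrative clustering stage on synthetic anthropometric-style data.
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

FEATURES = ["Age", "Back length", "Front length", "Shoulder length",
            "Chest girth", "Waist length", "Hip", "Sleeve length",
            "Arm depth", "Cuff length", "Cuff width"]

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(620, len(FEATURES))), columns=FEATURES)

# Six clusters to mirror the six size labels (XS, S, M, L, XL, XXL);
# the X-Means step that chooses the cluster count is omitted here.
X = StandardScaler().fit_transform(df[FEATURES])
km = KMeans(n_clusters=6, n_init=10, random_state=0).fit(X)
df["cluster"] = km.labels_
print(df["cluster"].value_counts())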
C.Classification Techniques
C.1. K-nearest neighbour
K-nearest neighbour algorithm (K-nn) is a
supervised learning algorithm that has been used in
many applications in the field of data mining,
statistical pattern recognition, image processing and
many others. K-nn is a method for classifying objects
based on closest training examples in the feature
space. The k-neighborhood parameter is determined in
the initialization stage of K-nn. The k samples which
are closest to new sample are found among the
training data. The class of the new sample is
determined according to the closest k-samples by
using majority voting [9]. Distance measurements like
Euclidean, Hamming and Manhattan are used to
calculate the distances of the samples to each other.
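The tiny, self-contained sketch below illustrates the K-nn rule just described: find the k closest training samples under a chosen distance metric and take a majority vote among their classes. The toy measurements and labels are invented purely for illustration.

# From-scratch K-nn with majority voting (illustrative only).
from collections import Counter
import math

def knn_predict(train_X, train_y, x, k=5, metric="euclidean"):
    def dist(a, b):
        if metric == "manhattan":
            return sum(abs(ai - bi) for ai, bi in zip(a, b))
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    # take the k samples closest to x and vote on their class labels
    neighbours = sorted(zip(train_X, train_y), key=lambda s: dist(s[0], x))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]

# Toy usage: two chest-girth/waist-length style measurements per sample.
train_X = [(96, 80), (98, 82), (104, 88), (106, 90), (112, 96), (114, 98)]
train_y = ["M", "M", "L", "L", "XL", "XL"]
print(knn_predict(train_X, train_y, (105, 89), k=5))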
C.2. RandomTree
This classifier constructs a tree that considers K randomly chosen attributes at each node and performs no pruning. It also has an option to allow estimation of class probabilities based on a hold-out set (backfitting). The number of randomly chosen attributes is set by KValue; whether to allow unclassified instances, the maximum depth of the tree, the minimum total weight of the instances in a leaf, and the random number seed used for selecting attributes can also be parameterised. numFolds determines the amount of data used for backfitting: one fold is used for backfitting and the rest for growing the tree (default 0, no backfitting).
C.3. Logitboost
This classifier performs additive logistic regression. It carries out classification using a regression scheme as the base learner, can handle multi-class problems, and can do efficient internal cross-validation to determine the appropriate number of iterations. The classifier may output additional information to the console; the threshold on improvement in likelihood, the number of iterations to be performed, the number of runs for internal cross-validation, and the weight threshold for weight pruning (reduce to 90 to speed up learning) can be parameterised. numFolds specifies the number of folds for internal cross-validation (default 0 means no cross-validation is performed).
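Since the experiments below are run in Weka, the following rough sketch is only a scikit-learn analogue of the comparison, run on synthetic data: KNeighborsClassifier for K-nn with 5 neighbours, a randomized decision tree as a loose stand-in for Weka's RandomTree, and gradient boosting as a loose stand-in for LogitBoost, all scored with 10-fold cross-validation.

# Rough scikit-learn analogue of the 10-fold cross-validated comparison.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=620, n_features=11, n_informative=6,
                           n_classes=6, random_state=0)

models = {
    "K-nn (k=5)": KNeighborsClassifier(n_neighbors=5),
    "Random tree": DecisionTreeClassifier(splitter="random", random_state=0),
    "Boosted (LogitBoost-like)": GradientBoostingClassifier(random_state=0),
}

# 10-fold cross-validation, as in the evaluation strategy described below.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10)
    print(f"{name}: mean accuracy = {scores.mean():.4f}")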
III. METHODOLOGY AND DATA DESCRIPTION
A. Description of Dataset
Data processing: the data types are nominal (text) or numeric, and missing data have been filled with meaningful assumptions in the database. The database specification, with description and table structure, is shown in Table 1.
TABLE 1. SPECIFICATION OF DATABASE
B. Data source
For these experiments we chose a BIS-based dataset which has a total of 620 records. The dataset records are presently available measurements authorized by anthropometric experts. The records were then preprocessed, after which the cluster and classification techniques of data mining were applied and their prediction accuracy compared to find the best performance.
C. The Application of Data Mining
Data mining could be used to uncover
patterns. The increasing power of computer
technology has increased data collection and storage.
Automatic data processing has been aided by
computer science, such as neural networks, clustering,
genetic algorithms, decision trees, Digital printing and
support vector machines. Data mining is the process
of applying these methods to the prediction of
uncovering hidden patterns [18]. It has been used for
many years by businesses, scientists to sift through
volumes of data. The application of data mining in
fashion product development for production detect
and forecasting analysis by using classification and
clustering methods as shown in Fig. 2.
Field No. | Field Name      | Description                 | Data Type
1         | Age             | To refer the age            | Numeric
2         | Back length     | To refer the back length    | Numeric
3         | Front length    | To refer the front length   | Numeric
4         | Shoulder length | To refer the shoulder length| Numeric
5         | Chest girth     | To refer the chest girth    | Numeric
6         | Waist length    | To refer the waist length   | Numeric
7         | Hip             | To refer the hip            | Numeric
8         | Sleeve length   | To refer the sleeve length  | Numeric
9         | Arm depth       | To refer the arm depth      | Numeric
10        | Cuff length     | To refer the cuff length    | Numeric
11        | Cuff width      | To refer the cuff width     | Numeric
12        | Label           | To refer the size labels    | Nominal (Text)
Fig.2 . Application of data mining in Fashion Industry
IV. EXPERIMENTAL RESULTS
A. Distribution of Classes
This dataset has different characteristics such as the number of attributes, the number of classes, the number of records and the percentage of class occurrences. Like the test dataset, the 620 different records are broadly categorized into six groups of shirt sizes: XS, S, M, L, XL and XXL. The distribution of classes in the actual training data used for classifier evaluation, together with the occurrences, is given in Table II. The percentages of the size categories are shown as a pie chart in Fig. 4, and the clustered size categories in Fig. 5.

Fig. 4. Percentage of size categories

TABLE II. DISTRIBUTION OF CLASSES IN THE ACTUAL TRAINING SET
Class Category | No. of Records | Percentage of Class Occurrences (%)
XS             | 86             | 14
S              | 83             | 13
M              | 115            | 18
L              | 121            | 20
XL             | 93             | 15
XXL            | 122            | 20
Total          | 620            | 100
B. Experimental Outcomes
To estimate the performance of the proposed method, we compared the results generated by the clustered attributes with the results generated by the original sets of attributes for the chosen dataset. In the experiments, the data mining software Weka 3.6.4, which is implemented in Java, was run on Windows 7 on an Intel Core2 Quad 2.83 GHz processor with 2 GB of memory. The dataset was applied and then evaluated for accuracy using a 10-fold cross-validation strategy [19]. The predicted result values of the various classifiers, with prediction accuracy, are given in Table III. The dataset with original features and the clustered form of the dataset were classified with the algorithms K-nn [17] with 5 neighbours, RandomTree and Logitboost without pruning, and both of the obtained classification results were compared. In each phase of a cross-validation, one of the yet unprocessed sets was tested, while the union of all remaining sets was used as the training set for classification by the algorithms. The classifiers with their prediction accuracy and the difference are given in Table III. The performance of the classifiers is shown in Fig. 6, and the difference between the original and clustered classification accuracy rates is shown in Fig. 7.
TABLE III. CLASSIFIERS WITH PREDICTION ACCURACY
Reported prediction accuracies (%): 99.1935, 99.5161, 96.9355, 97.0968, 99.0323, 99.8387, 98.871, 98.0645

Fig. 5. Percentage of size categories after clustering
An Astonishing Solution for Global Warming

We all know that forests are the treasures of our earth. But now, mankind himself is destroying these treasures. By cutting trees, not only is rainfall reduced, but the temperature also rises enormously, which results in global warming. This causes harm to all mankind. Thus scientists are calling on us to protect forests and save the earth, and research is going on regarding the issue. In our paper we propose an astonishing solution to save our earth from global warming.
Global warming is defined as the increase of the average temperature on Earth. As the Earth gets hotter, disasters like hurricanes, droughts and floods become more frequent. Over the last 100 years, the average temperature of the air near the Earth's surface has risen a little less than 1 Celsius (0.74 +/- 0.18 C, or 1.3 +/- 0.32 Fahrenheit). Does that not seem all that much? Scientists say it is nevertheless responsible for the conspicuous increase in storms and raging forest fires we have seen in the last ten years. Their data show that an increase of one degree Celsius makes the Earth warmer now than it has been for at least a thousand years. Out of the 20 warmest years on record, 19 have occurred since 1980. The three hottest years ever observed have all occurred in the last eight years.
MAIN CAUSES FOR GLOBAL WARMING:
Carbon dioxide, water vapour, nitrous oxide, methane and ozone are some of the natural gases causing global warming.
HEALTH AND ENVIRONMENTAL
EFFECTS:
Greenhouse gas emissions could cause a
1.8 to 6.3 Fahrenheit rise in temperature
during the next century, if atmospheric
levels are not reduced.
Produce extreme weather events, such as
droughts and floods.
Threaten coastal resources and wetlands
by raising sea level.
Increase the risk of certain diseases by
producing new breeding sites for pests and
pathogens.
Agricultural regions and woodlands are
also susceptible to changes in climate that
could result in increased insect populations
and plant disease.
The degradation of natural ecosystems
could lead to reduced biological
diversity.
WHAT GLOBAL WARMING EFFECTS
ARE EXPECTED FOR THE FUTURE?
To predict the future global warming effects,
several greenhouse gas emission scenarios were
developed and fed into computer models.
They project for the next century that, without
specific policy changes
Global mean temperature should
increase by between 1.4 and
5.8C (2.5 to 10F).
The Northern Hemisphere cover
should decrease further, but the
Antarctic ice sheet should
increase.
The sea level should rise by
between 9 and 88 cm (3.5" to
35").
Other changes should occur,
including an increase in some
extreme weather events.
After 2100, human induced global warming
effects are projected to persist for many centuries.
The sea level should continue rising for thousands
of years after the climate has been stabilized. We
have weather up to 40 degree Celsius now.
CARBON DIOXIDE - Ninety-three percent of all emissions; generating power by burning carbon-based fossil fuels like natural gas, oil, and coal, and decomposition, accounting for about one quarter of all global emissions.
METHANE - Twenty times more effective in trapping heat in our atmosphere; up to 25 times as potent as carbon dioxide; agricultural activities, landfills.
NITROUS OXIDE - Agricultural soil management, animal manure management, sewage treatment, mobile and stationary combustion of fossil fuel, adipic acid production, and nitric acid production.
OZONE - Automobile exhaust and industrial processes.
HYDROFLUORO COMPOUNDS (HFCs) - Industrial processes such as foam production, refrigeration, dry cleaning, chemical manufacturing, and semiconductor manufacturing.
PERFLUORINATED COMPOUNDS (PFCs) - Smelting of aluminium.
IMPACTS OF RISE IN MAJOR GREENHOUSE GAS CO2:
In air, the carbon dioxide concentration should be approximately 330 ppm (parts per million). But according to environmental researchers, the carbon dioxide content will increase as follows:
2025 405 to 469 ppm
2050 445 to 640 ppm
2100 540 to 970 ppm
We have temperatures up to 40 degrees Celsius now. It is expected that the temperature in Tamil Nadu will increase as follows:
In 2025 0.4 to 1.1 degree Celsius
In 2050 0.8 to 2.6 degree Celsius
In 2100 1.4 to 5.8 degree Celsius
SOLUTION WE PROPOSE:
We all know that forests are the treasures of our earth. But man started to destroy forests, and the scientists are calling on us to save them. We all know that forests help to protect the earth from global warming. By cutting trees, not only is rainfall reduced, the temperature also rises enormously, which causes harm to all mankind. Research is going on all the time to save mankind from global warming. Now, it has been found that robot trees will help to tackle the problem of global warming. In the air, the carbon dioxide content should be 330 ppm (parts per million); day by day it is increasing, which results in global warming.
WHAT IS ROBOT TREE???
The scientists are trying to make robots to perform various activities to reduce the physical and mental work of human beings. The combination of nature and robots is called Robotany. The scientists Jill Coffin, John Taylor and Daniel Bauen are researching the robot tree. The robot tree does not look like an ordinary tree, but the structures of the stem, roots and leaves are present in it. Does the robot tree help to solve the problem of global warming?
I have read in a magazine recently
that the experiment done by the researchers at
Madurai Kamaraj University on robot tree is
successful. Hats off to them. It is really happy
news. We have studied in history that the kings of
olden days had planted trees on both sides of the
road. In the same way we hope that all the roads
will have robot trees on both sides in future to
prevent global warming and save the earth. It is
said that one robot tree is equal to 1000 natural
trees. Each robot tree looks more like a giant fly
swatter so as to remain as guards of mankind
Klaus Lackner, a professor of
Geophysics at Columbia University, is working
on an interesting concept: "synthetic trees".
The idea is to reproduce the process of
photosynthesis to capture and store massive
amounts of CO2 gas. Nearly 90,000 tons of
carbon dioxide a year, roughly the amount emitted
annually by 15,000 cars, could be captured by the
structure. Paired with a windmill, the carbon-
capture tree would generate about 3 megawatts of
power, Lackner calculates, making the operation
self-sufficient in energy.
HOW DOES A ROBOT TREE FUNCTION?
Just imagine a normal tree: it has roots, a stem and leaves. In the same way, the robot tree also has a root, stem, branches and leaves. Plastic poles are fixed in the stem part, and solar plates fixed between them act as leaves. Small holes are made in the big poles and small poles are fixed into them; these absorb carbon dioxide from the air. Inside the big poles there is calcium hydroxide liquid, and the absorbed carbon dioxide is dissolved in it.
The solar plates produce current and pass it through the stem, which separates carbon and oxygen. Oxygen, hydrogen and vapour come out, while the carbon reacts with water and becomes carbonic acid.
A computer-generated image of Lackner's synthetic trees.
Synthetic trees don't exactly look like your average tree with green leaves and roots. Although the design is not finalized, Lackner predicts that the device would look more like a post with venetian blinds strung across it: a box-shaped extractor raised about 1,000 feet tall, adorned with scaffolding lined with liquid sodium hydroxide (commonly known as lye). When exposed, sodium hydroxide absorbs CO2. So, as air flows through the venetian-blind leaves of the tree, the sodium hydroxide binds the CO2, sifting out cleaner air, about 70-90% less concentrated in CO2, on the other side. Lackner estimates that an area of sodium hydroxide about the size of a large TV screen (a 20-inch diagonal) and a meter in depth could absorb 20 tons of CO2 a year. Paired with a windmill, a carbon-capture tree could generate about 3 megawatts of power.
Bandpass filter:
After amplification the signals are passed through a bandpass filter. The bandpass filter effectively lays a band over the incoming signal: every frequency outside this band is removed from the signal. Using this filter, for instance, the mu-rhythm can be extracted from the input by discarding every frequency below 10 Hz and above 12 Hz, leaving the desired band. The bandpass filter can be implemented using an FIR (Finite Impulse Response) filter algorithm; a linear-phase FIR filter does not distort the signal.
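As an illustration, here is a minimal sketch (not from the text) of extracting the mu-rhythm with a linear-phase FIR bandpass filter using SciPy; the sampling rate and filter length are assumptions.

```python
import numpy as np
from scipy.signal import firwin, filtfilt

fs = 256.0                 # assumed EEG sampling rate in Hz
numtaps = 129              # odd number of taps -> exactly linear phase
taps = firwin(numtaps, [10.0, 12.0], pass_zero=False, fs=fs)

def extract_mu(eeg_channel):
    """Return the 10-12 Hz (mu) component of a single EEG channel."""
    return filtfilt(taps, [1.0], eeg_channel)

# Example: two seconds of a noisy 11 Hz oscillation.
t = np.arange(0, 2, 1 / fs)
raw = np.sin(2 * np.pi * 11 * t) + 0.5 * np.random.randn(t.size)
mu = extract_mu(raw)
```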
Artifacts:
EEG signals are never perfect and are always contaminated with artifacts. Artifacts are undesirable disturbances in the signal. They range from bioelectrical potentials produced by movement of body parts such as the eyes, tongue, arms or heart, and fluctuations in skin resistance (sweating), to sources outside the body such as interference from nearby electrical equipment or the varying impedance of the electrodes.
Figure: Example of eye-blink artifacts
All these issues influence the EEG data and should ideally be completely removed. A possible way to remove them is detection and recognition, which is not a trivial task. Recognition of, for instance, limb movement can be facilitated by using EMG: whenever EMG activity is recorded, the EEG readings should be discarded (artifact rejection). Other artifact sources such as eye movement (measured by electrooculography, EOG) and heart activity (measured by electrocardiography, ECG) can also be removed. However, most artifacts are continuously present and are typically stronger than the EEG itself. If it is known when and where they occur, they can be compensated for.
Artifact removal :
To increase the effectiveness of BCI systems
it is necessary to find methods of increasing
the signal-to-noise ratio (SNR) of the
observed EEG signals.
The noise, or artefact, sources
include: line noise from the power grid, eye
blinks, eye movements, heartbeat, breathing,
and other muscle activity.
Improving technology can decrease externally generated artifacts, such as line noise, but biological artifact signals must be removed after the recording process.
The maximum signal fraction (MSF)
transformation is an alternative to the two
most common techniques: principal
component analysis (PCA) and independent
component analysis (ICA). A signal
separation method based on canonical
correlation analysis (CCA) is also used. The
simplest approach is to discard a fixed
length segment, perhaps one second, from
the time an artefact is detected. Discarding
segments of EEG data with artifacts can
greatly decrease the amount of data
available for analysis. No quantitative
evaluation was done on the removal but it
was visually observed that the artifacts were
extracted into a small number of
components that would allow their removal.
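A minimal sketch (assumed, not from the text) of the simplest strategy mentioned above: discarding a one-second segment from the moment an artifact is detected. The amplitude threshold and sampling rate are illustrative assumptions.

```python
import numpy as np

fs = 256                   # assumed sampling rate in Hz
threshold = 100.0          # assumed amplitude threshold (e.g. microvolts)

def reject_artifact_segments(eeg):
    """Return a boolean mask: True for samples to keep, False for the
    one-second segment following each detected artifact."""
    keep = np.ones(eeg.size, dtype=bool)
    for i in np.flatnonzero(np.abs(eeg) > threshold):
        keep[i:i + fs] = False
    return keep

# Usage: cleaned = eeg[reject_artifact_segments(eeg)]
```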
In online filtering systems, artefact
recognition is important for achieving their
automatic removal. One approach to
recognition of noise components is based on
measuring structure in the signal.
Whenever artifacts are detected the
affected portion of the signal can be
rejected. This can be a valid pre-processing
step and does not have to be a problem.
However the problem with deleting a
specific piece of data is that it can result in
strange anomalies where the two pieces are
connected. Secondly, EEG data in general is
relatively scarce. For that reason a better
approach is to remove the artefact from the
EEG data; this goes one step further than artifact rejection. For practical purposes in an online system, it is undesirable to throw away every signal that is affected by an artifact. Retrieving the signal with 100% correctness is impossible: it is simply unknown what the data would have looked like without, for instance, the eye blink. For offline systems this is less critical, since it does not matter if some commands are lost. In the online case, however, the user demands that every command issued to the system is recognized and executed. The user doesn't want to keep trying endlessly for a good trial.
Rejection is therefore not desirable. The goal is to keep a continuous signal.
Ideally the artifact must be removed and the
desired signal preserved. This can be done
using filters or higher-order statistical
elimination and separation, like for instance
independent component analysis.
Independent component analysis:
ICA was originally developed for blind
source separation whose goal is to recover
mutually independent but unknown source
signals from their linear mixtures without
knowing the mixing coefficients. ICA can
be seen as an extension to Principal
Component Analysis and Factor Analysis.
Algorithm:
The weight vector w is updated with a fixed-point (FastICA-style) iteration
    w_new = E{ x g(w^T x) } - E{ g'(w^T x) } w,   followed by normalization w_new = w_new / ||w_new||,
where g(u) = u exp(-u^2 / 2), x is the observed (whitened) data and w is a weight vector of the unmixing matrix that performs ICA. Note that convergence means that the old and new values of w point in the same direction, i.e., their dot product is (almost) equal to 1.
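For illustration, a minimal sketch of a one-unit FastICA-style fixed-point iteration with the nonlinearity g(u) = u exp(-u^2/2); it assumes the data have already been centered and whitened.

```python
import numpy as np

def g(u):
    return u * np.exp(-u ** 2 / 2.0)

def g_prime(u):
    return (1.0 - u ** 2) * np.exp(-u ** 2 / 2.0)

def fastica_one_unit(X, max_iter=200, tol=1e-6):
    """Estimate one unmixing weight vector w from whitened data X of shape
    (n_channels, n_samples)."""
    rng = np.random.default_rng(0)
    w = rng.standard_normal(X.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(max_iter):
        wx = w @ X                                    # projections, one per sample
        w_new = (X * g(wx)).mean(axis=1) - g_prime(wx).mean() * w
        w_new /= np.linalg.norm(w_new)
        if abs(w_new @ w) > 1.0 - tol:                # old and new w point the same way
            return w_new
        w = w_new
    return w

# Usage: w = fastica_one_unit(whitened_eeg); source = w @ whitened_eeg
```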
Artifact removal using ICA and GA:
The steps of the proposed method are as follows: first, the ICA algorithm extracts the independent components (ICs) of each trial; then a GA selects the best and most relevant ICs among the whole set of ICs. The proposed approach to using GAs for artifact removal involves encoding a set of d ICs as a binary string, in which a 0 indicates that the corresponding IC is to be omitted and a 1 that it is to be included. This coding scheme represents the presence or absence of a particular IC in the IC space, and the length of the chromosome equals the dimension of the IC space. The selected ICs are then used as input data for the classifiers. The paper uses the fitness function shown below to combine the two terms:
Fitness = classification error + λ × (number of active genes),
where the error is the classification error obtained using the selected ICs, and the number of active genes is the number of ICs selected. The weight λ lies in (0, 1); a higher λ results in fewer selected features. In this paper λ = 0.01 is chosen.
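A minimal sketch of this fitness evaluation; the nearest-class-mean classifier used here is only a stand-in for whatever classifier the selected ICs are actually fed to.

```python
import numpy as np

LAM = 0.01                     # weighting between the two terms (from the text)

def classification_error(X, y):
    """Stand-in classifier: nearest class mean on the selected ICs.
    X has shape (n_selected_ics, n_trials), y has shape (n_trials,)."""
    F = X.T
    classes = np.unique(y)
    means = np.stack([F[y == c].mean(axis=0) for c in classes])
    dist = np.linalg.norm(F[:, None, :] - means[None, :, :], axis=2)
    pred = classes[dist.argmin(axis=1)]
    return float((pred != y).mean())

def fitness(chromosome, ics, labels):
    """chromosome: binary vector of length d (one gene per IC)."""
    active = chromosome.astype(bool)
    if not active.any():
        return float("inf")                      # selecting nothing is useless
    return classification_error(ics[active], labels) + LAM * active.sum()

# Example: score a random chromosome on toy IC activations for 40 trials.
rng = np.random.default_rng(0)
ics = rng.standard_normal((10, 40))              # 10 ICs, 40 trials
labels = rng.integers(0, 2, size=40)
print(fitness(rng.integers(0, 2, size=10), ics, labels))
```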
Translation Algorithm:
The translation algorithm is the main
element of any BCI system. The function of
the translation algorithm is to convert the
EEG input to device control output. It can be
split in to feature extraction and feature
classification.
Feature Extraction
Feature extraction is the process of selecting appropriate features from the input data. These features should characterize the data as well as possible. The goal of feature extraction is to select those features which have similar values for objects of one category and differing values for other categories. Reducing overlapping features helps to avoid poor generalization and high computational complexity. The goal is to select a few features from the total feature vector which still describe the data well (a small band-power sketch using the FFT follows the list below). Feature extraction can be performed using:
Time-frequency analysis, using Fast
Fourier Transform (FFT)
Autoregressive modelling (AR)
Common spatial patterns (CSP)
Linear Discrimination (LD)
Genetic algorithm (GA)
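As referenced above, a minimal band-power sketch using the FFT; the sampling rate and band edges are illustrative assumptions.

```python
import numpy as np

fs = 256.0                     # assumed sampling rate in Hz
band = (10.0, 12.0)            # mu band, as used earlier in the text

def band_power_features(epoch):
    """epoch: (n_channels, n_samples) array; returns one feature per channel."""
    n = epoch.shape[1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    power = np.abs(np.fft.rfft(epoch, axis=1)) ** 2
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return power[:, in_band].mean(axis=1)

# Example: features for a 2-channel, one-second epoch of random data.
print(band_power_features(np.random.randn(2, int(fs))))
```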
Feature classification:
The next step in the translation algorithm is the classification of the acquired features. When presented with new data, the BCI should be able to tell which brain state it represents; therefore it must recognize patterns in the data provided by the features. Classification results in device control: acquiring the commands from the user. It can be achieved in various ways (a small sketch using one of them follows the list):
Linear Vector Quantization (LVQ)
Neural network (NN)
Support Vector Machines (SVM)
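As referenced above, a minimal sketch of feature classification with a Support Vector Machine; it assumes scikit-learn is available and uses synthetic features and labels.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 8))                               # 200 trials, 8 features
y = (X[:, 0] + 0.3 * rng.standard_normal(200) > 0).astype(int)  # two brain states

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```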
Application
In the application domain the issue of self- versus evoked control is important. Careful thought must be given to how this control is realized, and special care should be given to real-time control of devices that interact with the physical environment. Synchronous control of a wheelchair does not seem very practical, although the intermediate approach using an on/off switch offers possibilities. The current maximum information transfer rate of about 25 bits per minute strongly limits the application range and the applicability for mass use. Moreover, the dimensionality of the commands is today mostly one-dimensional; it depends on the number of different brain states that can be recognized per time unit. For good mouse control, for instance, two-dimensional control is a necessity.
Figure : Example of BCI spelling program.
CONCLUSION
Like other communication and control
systems, BCIs have inputs, outputs, and
translation algorithms that convert the
former to the latter. BCI operation depends
on the interaction of two adaptive
controllers, the users brain, which produces
the input (i.e., the electrophysiological
activity measured by the BCI system) and
the system itself, which translates that
activity into output (i.e., specific commands
that act on the external world). Successful
BCI operation requires that the user acquire
and maintain a new skill, a skill that consists
not of muscle control but rather of control of
EEG or single-unit activity.
Information Sharing Across Databases Using Anonymous Connection
B.Janani, Lecturer
Department of Computer
Science & Engineering
Gojan School of Business &
Technology
jananib@gojaneducation.com
R.Jeyavani
IV CSE
Gojan School of Business &
Technology
jeyavani.r@gmail.com
S.T.Ramya Raja Rajeshwari
IV CSE
Gojan School of Business &
Technology
ramyst.rajeshwari@gmail.com
Abstract: Suppose Alice owns a k-anonymous database and needs to determine whether her database, when inserted with a tuple owned by Bob, is still k-anonymous. Also, suppose that access to the database is strictly controlled, because for example the data are used for certain experiments that need to be kept confidential. Clearly, allowing Alice to directly read the contents of the tuple breaks the privacy of Bob (e.g., a patient's medical record); on the other hand, the confidentiality of the database managed by Alice is violated once Bob has access to the contents of the database. Thus, the problem is to check whether the database inserted with the tuple is still k-anonymous, without letting Alice and Bob know the contents of the tuple and the database, respectively. In this paper, we propose two protocols solving this problem on suppression-based and generalization-based k-anonymous and confidential databases. The protocols rely on well-known cryptographic assumptions, and we provide theoretical analyses to prove their soundness and experimental results to illustrate their efficiency.
Index Terms: Privacy, anonymity, data management, secure computation.
Introduction
It is today well understood that databases
represent an important asset for many
applications and thus their security is
crucial. Data confidentiality is particularly
relevant because of the value, often not only
monetary, that data have. For example,
medical data collected by following the
history of patients over several years may
represent an invaluable asset that needs to be
adequately protected. Such a requirement
has motivated a large variety of approaches
aiming at better protecting data
confidentiality and data ownership. Relevant
approaches include query processing
techniques for encrypted data and data
watermarking techniques. Data
confidentiality is not, however, the only
requirement that needs to be addressed.
Today there is an increased concern for
privacy. The availability of huge numbers of
databases recording a large variety of
information about individuals makes it
possible to discover information about
specific individuals by simply correlating all
the available databases. Although
confidentiality and privacy are often used as
synonyms, they are different concepts: data
confidentiality is about the difficulty (or
impossibility) by an unauthorized user to
learn anything about data stored in the
database. Usually, confidentiality is
achieved by enforcing an access policy, or
possibly by using some cryptographic tools.
Privacy relates to what data can be safely
disclosed without leaking sensitive
information regarding the legitimate owner
[5]. Thus, if one asks whether confidentiality
is still required once data have been
anonymized, the reply is yes if the
anonymous data have a business value for
the party owning them or the unauthorized
disclosure of such anonymous data may
damage the party owning the data or other
parties. (Note that in the context of this paper, the terms anonymized and anonymization mean that identifying
information is removed from the original
data to protect personal or private
information. There are many ways to
perform data anonymization. We only focus
on the k-anonymization approach [28],
[32].) To better understand the difference
between confidentiality and anonymity,
consider the case of a medical facility
connected with a research institution.
Suppose that all patients treated at the
facility are asked before leaving the facility
to donate their personal health care records
and medical histories (under the condition
that each patient's privacy is protected) to
the research institution, which collects the
records in a research database. To guarantee
the maximum privacy to each patient, the
medical facility only sends to the research
database an anonymized version of the
patient record. Once this anonymized record
is stored in the research database, the
nonanonymized version of the record is
removed from the system of the medical
facility. Thus, the research database used by
the researchers is anonymous. Suppose that
certain data concerning patients are related
to the use of a drug over a period of four
years and certain side effects have been
observed and recorded by the researchers
in the research database. It is clear that these
data (even if anonymized) need to be kept
confidential and accessible only to the few
researchers of the institution working on this
project, until further evidence is found about
the drug. If these anonymous data were to be
disclosed, privacy of the patients would not
be at risk; however the company
manufacturing the drug may be adversely
affected. Recently, techniques addressing
the problem of privacy via data
anonymization have been developed, thus
making it more difficult to link sensitive
information to specific individuals. One
well-known technique is k-anonymization
[28], [32]. Such technique protects privacy
by modifying the data so that the probability
of linking a given data value, for example a
given disease, to a specific individual is very
small. So far, the problems of data
confidentiality and anonymization have been
considered separately. However, a relevant
problem arises when data stored in a
confidential, anonymity-preserving database
need to be updated. The operation of
updating such a database, e.g., by inserting
a tuple containing information about a given
individual, introduces two problems
concerning both the anonymity and
confidentiality of the data stored in the
database and the privacy of the individual to
whom the data to be inserted are related: 1)
Is the updated database still privacy-preserving? and 2) Does the database
owner need to know the data to be inserted?
Clearly, the two problems are related in the
sense that they can be combined into the
following problem: can the database owner
decide if the updated database still preserves
privacy of individuals without directly
knowing the new data to be inserted? The
answer we give in this work is affirmative.
It is important to note that assuring that a
database maintains the privacy of
individuals to whom data are referred is
often of interest not only to these
individuals, but also to the organization
owning the database. Because of current
regulations, like HIPAA [19], organizations collecting data about individuals are under the obligation of assuring individual privacy. It is thus in their interest to check that the data entered in their databases do not violate privacy, and to perform such a verification without seeing any sensitive data of an individual.
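For illustration, a minimal (non-private) sketch of the property under discussion: checking whether a table is k-anonymous with respect to a set of quasi-identifier attributes. The column names and data below are illustrative.

```python
from collections import Counter

def is_k_anonymous(rows, qi_attributes, k):
    """True iff every combination of quasi-identifier values occurs in at
    least k tuples of the table."""
    groups = Counter(tuple(row[a] for a in qi_attributes) for row in rows)
    return all(count >= k for count in groups.values())

# Toy table with QI = (zip_code, age_range); the disease column is sensitive.
table = [
    {"zip_code": "600*", "age_range": "20-30", "disease": "flu"},
    {"zip_code": "600*", "age_range": "20-30", "disease": "cold"},
    {"zip_code": "601*", "age_range": "30-40", "disease": "asthma"},
    {"zip_code": "601*", "age_range": "30-40", "disease": "flu"},
]
print(is_k_anonymous(table, ["zip_code", "age_range"], k=2))   # True
```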
Problem Statement
Fig. 1 captures the main participating parties
in our application domain. We assume that
the information concerning a single patient
(or data provider) is stored in a single tuple,
and DB is kept confidentially at the server.
The users in Fig. 1 can be treated as medical
researchers who have the access to DB.
Since DB is anonymous, the data providers' privacy is protected from these researchers.
(Note that to follow the traditional
convention, in Section 4 and later sections,
we use Bob and Alice to represent the data
provider and the server, respectively.) As
mentioned before, since DB contains
privacy-sensitive data, one main concern is
to protect the privacy of patients. Such task
is guaranteed through the use of
anonymization. Intuitively, if the database
DB is anonymous, it is not possible to infer
the patients identities from the information
contained in DB. This is achieved by
blending information about patients. See
Section 3 for a precise definition. Suppose
now that a new patient has to be treated.
Obviously, this means that the database has
to be updated in order to store the tuple t
containing the medical data of this patient.
The modification of the anonymous
database DB can be naively performed as
follows: the party who is managing the
database or the server simply checks
whether the updated database DB ∪ {t} is
still anonymous. Under this approach, the
entire tuple t has to be revealed to the party
managing the database server, thus violating
the privacy of the patient. Another
possibility would be to make available the
entire database to the patient so that the
patient can verify by himself/herself if the
insertion of his/her data violates his/her own
privacy. This approach however, requires
making available the entire database to the
patient thus violating data confidentiality. In
order to devise a suitable solution, several
problems need to be addressed: Problem 1:
without revealing the contents of t and DB,
how to preserve data integrity by
establishing the anonymity of DB ∪ {t}?
Problem 2: once such anonymity is
established, how to perform this update?
Problem 3: what can be done if database
anonymity is not preserved? Finally,
problem 4: what is the initial content of the
database, when no data about users has been
inserted yet? In this paper, we propose two
protocols solving Problem 1, which is the
central problem addressed by our paper.
However, because the other problems are
crucial from a more practical point of view,
we discuss them as well in Section 7. Note
that to assure a higher level of anonymity to
the party inserting the data, we require that
the communication between this party and
the database occurs through an anonymous
connection, as provided by protocols like
Crowds [27] or Onion routing [26]. This is
necessary since traffic analysis can
potentially reveal sensitive information
based on users' IP addresses. In addition,
sensitive information about the party
inserting the data may be leaked from the
access control policies adopted by the
anonymous database system, in that an
important requirement is that only
authorized parties, for example patients,
should be able to enter data in the database.
Therefore, the question is how to enforce
authorization without requiring the parties
inserting the data to disclose their identities.
An approach that can be used is based on
techniques for user anonymous
authentication and credential verification
[20]. The above discussion illustrates that
the problem of anonymous updates to
confidential databases is complex and
requires the combination of several
techniques, some of which are proposed for
the first time in this paper. Fig. 1
summarizes the various phases of a
comprehensive approach to the problem of
anonymous updates to confidential
databases, while Table 1 summarizes the
required techniques and identifies the role of
our techniques in such approach.
Proposed Solutions
All protocols we propose to solve Problem 1
rely on the fact that the anonymity of DB is
not affected by inserting t if the information
contained in t, properly anonymized, is
already contained in DB. Then, Problem 1 is
equivalent to privately checking whether
there is a match between (a properly
anonymized version of) t and (at least) one
tuple contained in DB. The first protocol is
aimed at suppression-based anonymous
databases, and it allows the owner of DB to
properly anonymize the tuple t, without
gaining any useful knowledge on its
contents and without having to send to its
owner newly generated data. To achieve
such goal, the parties secure their messages
by encrypting them. In order to perform the
privacy-preserving verification of the
database anonymity upon the insertion, the
parties use a commutative and homomorphic
encryption scheme. The second protocol
(see Section 5) is aimed at generalization-
based anonymous databases, and it relies on
a secure set intersection protocol, such as the
one found in [3], to support privacy-
preserving updates on a generalization-based
k-anonymous DB. The paper is organized as
follows: Section 2 reviews related work on
anonymity and privacy in data management.
Section 3 introduces notions about
anonymity and privacy that we need in order
to define our protocols and prove their
correctness and security. The protocols are
defined, respectively, in Section 4 and
Section 5 with proofs of correctness and
security. Section 6 analyzes the complexity
of the proposed protocols and presents experimental results that we obtained by running such protocols on real-
life data. Section 7 concludes the paper and
outlines future work.
Private Update for Suppression-Based Anonymous Databases
In this section, we assume that the database
is anonymized using a suppression-based
method. Note that our protocols are not
required to further improve the privacy of
users other than that provided by the fact
that the updated database is still k-
anonymous. Suppose that Alice owns a k-
anonymous table T over the QI attribute.
Alice has to decide whether T ∪ {t}, where t is a tuple owned by Bob, is still k-anonymous, without directly knowing the values in t (assuming t and T have the same schema). This problem amounts to deciding whether t matches any tuple in T on the nonsuppressed QI attributes. If this is the case, then t, properly anonymized, can be inserted into T; otherwise, the insertion of t into T is rejected. A trivial solution requires
as a first step that Alice send Bob the names of the suppressed attributes for every tuple in the witness set {w1, . . . , ww} of T. In this way, Bob knows which values are to be suppressed from his tuple. After Bob computes the anonymized (suppressed) versions ti of tuple t, 1 ≤ i ≤ w, he and Alice
can start a protocol (e.g., the Intersection
Size Protocol in [3]) for privately testing the
equality of ti and wi. As a drawback, Bob
gains knowledge about the suppressed
attributes of Alice. A solution that addresses this drawback is based on the following protocol. Assume Alice and Bob agree on a commutative and product-homomorphic encryption scheme E and on the quasi-identifier QI = {A1, . . . , Au}; further, they agree on a coding c as well. Since the other non-QI attributes do not play any role in the computation, without loss of generality, let wi' = <v'1, . . . , v's> be the tuple containing only the s nonsuppressed QI attributes of witness wi, and let t = <v1, . . . , vu>. Protocol 4.1 allows Alice to compute an anonymized version of t without letting her know the contents of t and, at the same time, without letting Bob know which attributes of the tuples in T are suppressed. The protocol works as follows: at Step 1, Alice sends Bob an encrypted version of wi', containing only the s nonsuppressed QI attributes. At Step 2, Bob encrypts the information received from Alice and sends it back to her, along with an encrypted version of each value in his tuple t. At Steps 3-4, Alice checks whether the nonsuppressed QI attributes of wi' are equal to those of t.
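For illustration only (this is not the paper's Protocol 4.1), a minimal sketch of why a commutative encryption scheme enables private equality testing: with Pohlig-Hellman-style modular exponentiation, a value encrypted first by Alice and then by Bob equals the same value encrypted in the opposite order, so equality of QI values can be checked on doubly encrypted data. The prime, keys and values below are illustrative.

```python
import hashlib

P = 2 ** 127 - 1             # a prime modulus; real deployments use far larger, safe primes

def encode(value):
    """Map an attribute value into the multiplicative group modulo P."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest(), "big") % P

def encrypt(m, key):
    return pow(m, key, P)    # E_key(m) = m^key mod P, commutative in the key

# Alice's and Bob's private exponents (should be chosen coprime to P - 1).
alice_key, bob_key = 65537, 2000003

v_alice = encode("zip=600042")     # a nonsuppressed QI value from a witness
v_bob = encode("zip=600042")       # the corresponding value from Bob's tuple t

double_a = encrypt(encrypt(v_alice, alice_key), bob_key)
double_b = encrypt(encrypt(v_bob, bob_key), alice_key)
print(double_a == double_b)        # True iff the underlying values match
```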
Conclusion/Future work
In this paper, we have presented two secure
protocols for privately checking whether a
k-anonymous database retains its anonymity
once a new tuple is being inserted to it.
Since the proposed protocols ensure the
updated database remains k-anonymous, the
results returned from a user's (or a medical researcher's) query are also k-anonymous. Thus, the patient's or the data provider's privacy cannot be violated by any query.
As long as the database is updated properly
using the proposed protocols, the user
queries under our application domain are
always privacy-preserving.
References
[1] G. Aggarwal, T. Feder, K. Kenthapadi, R. Motwani, R. Panigrahy, D. Thomas, and A. Zhu, "Anonymizing Tables," Proc. Int'l Conf. Database Theory (ICDT), 2005.
[2] E. Bertino and R. Sandhu, "Database Security: Concepts, Approaches and Challenges," IEEE Trans. Dependable and Secure Computing, vol. 2, no. 1, pp. 2-19, Jan.-Mar. 2005.
[3] J.W. Byun, T. Li, E. Bertino, N. Li, and Y. Sohn, "Privacy-Preserving Incremental Data Dissemination," J. Computer Security, vol. 17, no. 1, pp. 43-68, 2009.
[4] S. Chawla, C. Dwork, F. McSherry, A. Smith, and H. Wee, "Towards Privacy in Public Databases," Proc. Theory of Cryptography Conf. (TCC), 2005.
[5] B.C.M. Fung, K. Wang, A.W.C. Fu, and J. Pei, "Anonymity for Continuous Data Publishing," Proc. Extending Database Technology Conf. (EDBT), 2008.
[6] Y. Han, J. Pei, B. Jiang, Y. Tao, and Y. Jia, "Continuous Privacy Preserving Publishing of Data Streams," Proc. Extending Database Technology Conf. (EDBT), 2008.
[7] J. Li, B.C. Ooi, and W. Wang, "Anonymizing Streaming Data for Privacy Protection," Proc. IEEE Int'l Conf. Database Eng. (ICDE), 2008.
[8] A. Trombetta and E. Bertino, "Private Updates to Anonymous Databases," Proc. Int'l Conf. Data Eng. (ICDE), 2006.
[9] K. Wang and B. Fung, "Anonymizing Sequential Releases," Proc. ACM Knowledge Discovery and Data Mining Conf. (KDD), 2006.
Preventing Node and Link Failure in
IP Network Recovery Using MRC
N. Gowrishankar, S. Aravindh, S. Jai Krishnan (Senior Lecturer, Co-Guide)
UG Students and Faculty, Gojan School of Business and Technology, Chennai-52
gowthamshankar26@gmail.com, aravindhgojan@gmail.com
Abstract
As the Internet takes an increasingly
central role in our communications
infrastructure, the slow convergence of routing
protocols after a network failure becomes a
growing problem. To assure fast recovery from
link and node failures in IP networks, we
present a new recovery scheme called Multiple
Routing Configurations (MRC). Our proposed
scheme guarantees recovery in all single
failure scenarios, using a single mechanism to
handle both link and node failures, and
without knowing the root cause of the failure.
MRC is strictly connectionless, and assumes
only destination based hop-by-hop forwarding.
MRC is based on keeping additional routing
information in the routers, and allows packet
forwarding to continue on an alternative
output link immediately after the detection of a
failure. It can be implemented with only minor
changes to existing solutions. In this paper we
present MRC, and analyze its performance
with respect to scalability, backup path lengths,
and load distribution after a failure. We also
show how an estimate of the traffic demands in
the network can be used to improve the
distribution of the recovered traffic, and thus
reduce the chances of congestion when MRC is
used.
Index Terms -Availability, computer network
reliability, communication system fault
tolerance, communication system routing, and
protection.
I. INTRODUCTION
This network-wide IP re-convergence is a time
consuming process, and a link or node failure is
typically followed by a period of routing
instability. Much effort has been devoted to
optimizing the different steps of the convergence
of IP routing, i.e., detection, dissemination of
information and shortest path calculation. The
IGP convergence process is slow because it is
reactive and global. It reacts to a failure after it
has happened, and it involves all the routers in
the domain. In this paper we present a new
scheme for handling link and node failures in IP
networks. Multiple Routing Configurations
(MRC) is a proactive and local protection
mechanism that allows recovery in the range of
milliseconds. MRC allows packet forwarding to
continue over preconfigured alternative next-
hops immediately after the detection of the
failure.
Using MRC as a first line of defense
against network failures, the normal IP
convergence process can be put on hold. MRC
guarantees recovery from any single link or
node failure, which constitutes a large majority
of the failures experienced in a network. MRC
makes no assumptions with respect to the root
cause of failure, e.g., whether the packet
forwarding is disrupted due to a failed link or a
failed router.
The main idea of MRC is to use the
network graph and the associated link weights to
produce a small set of backup network
configurations. The link weights in these backup
configurations are manipulated so that for each
link and node failure, and regardless of whether
it is a link or node failure, the node that detects
the failure can safely forward the incoming
packets towards the destination on an alternate
link. MRC assumes that the network uses
shortest path routing and destination based hop-
by-hop forwarding.
This gives great flexibility with respect
to how the recovered traffic is routed. The
backup configuration used after a failure is
selected based on the failure instance, and thus
we can choose link weights in the backup
configurations that are well suited for only a
subset of failure instances. The rest of this paper
is organized as follows.
II. MRC OVERVIEW
MRC is based on building a small set of
backup routing configurations that are used to
route recovered traffic on alternate paths after a
failure. The backup configurations differ from
the normal routing configuration in that link
weights are set so as to avoid routing traffic in
certain parts of the network. We observe that if
all links attached to a node are given sufficiently
high link weights, traffic will never be routed
through that node. The failure of that node will
then only affect traffic that is sourced at or
destined for the node itself. Similarly, to exclude
a link (or a group of links) from taking part in
the routing, we give it infinite weight. The link
can then fail without any consequences for the
traffic. Our MRC approach is threefold. First,
we create a set of backup configurations, so that
every network component is excluded from
packet forwarding in one configuration. Second,
for each configuration, a standard routing
algorithm like OSPF is used to calculate
configuration specific shortest paths and create
forwarding tables in each router, based on the
configurations.
The use of a standard routing algorithm
guarantees loop-free forwarding within one
configuration. Finally, we design a forwarding
process that takes advantage of the backup
configurations to provide fast recovery from a
component failure. In our approach, we
construct the backup configurations so that for
all links and nodes in the network, there is a
configuration where that link or node is not used
to forward traffic. Thus, for any single link or
node failure, there will exist a configuration that
will route the traffic to its destination on a path
that avoids the failed element. Also, the backup
configurations must be constructed so that all
nodes are reachable in all configurations. Using a standard shortest
path calculation, each router creates a set of
configuration-specific forwarding tables.
When a router detects that a neighbor
can no longer be reached through one of its
interfaces, it does not immediately inform the
rest of the network about the connectivity
failure. Instead, packets that would normally be
forwarded over the failed interface are marked
as belonging to a backup configuration, and
forwarded on an alternative interface towards its
destination. The packets must be marked with a
configuration identifier, so the routers along the
path know which configuration to use.
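A minimal sketch (with assumed data structures, not the paper's implementation) of the local recovery decision at a detecting node:

```python
# Assumed per-router state (illustrative names):
#   normal_next_hop[dst]          next hop for dst in the normal configuration
#   backup_config_of[neighbor]    id of the backup configuration isolating that neighbor
#   backup_next_hop[(cfg, dst)]   next hop for dst in backup configuration cfg

def forward(packet, dst, failed_neighbors,
            normal_next_hop, backup_config_of, backup_next_hop):
    nh = normal_next_hop[dst]
    if nh not in failed_neighbors:
        return nh                                   # failure-free case: normal routing
    if packet.get("cfg") is not None:
        return None                                 # already recovered once: discard packet
    cfg = backup_config_of[nh]                      # configuration isolating the failed element
    packet["cfg"] = cfg                             # mark packet with the configuration identifier
    return backup_next_hop[(cfg, dst)]
```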
It is important to stress that MRC does
not affect the failure free original routing, i.e.,
when there is no failure, all packets are
forwarded according to the original
configuration, where all link weights are normal.
Upon detection of a failure, only traffic reaching
the failure will switch configuration. All other
traffic is forwarded according to the original
configuration as normal. If a failure lasts for
more than a specified time interval, a normal re-
convergence will be triggered. MRC does not
interfere with this convergence process, or make
it longer than normal. However, MRC gives
continuous packet forwarding during the
convergence, and hence makes it easier to use
mechanisms that prevent micro-loops during
convergence, at the cost of longer convergence
times.
III. GENERATING BACKUP
CONFIGURATIONS
A. Configurations Structure
Definition: A configuration Ci is an ordered pair (G, wi) of the graph G and a function wi that assigns an integer weight to each link. We distinguish between the normal configuration C0 and the backup configurations. In the normal configuration C0, all links have their normal weights w0. We assume that C0 is given with finite integer weights; MRC is agnostic to the setting of these weights. In the backup configurations, selected links and nodes must not carry any transit traffic. Still, traffic must be able to depart from and reach all operative nodes. These traffic regulations are imposed by assigning high weights to some links in the backup configurations: restricted links are given a high but finite weight, while isolated links are given infinite weight.
The purpose of the restricted links is to isolate a node from routing in a specific backup configuration. In many topologies, more than a single node can be isolated simultaneously. Restricted and isolated links are always given the same weight in both directions. However, MRC treats links as unidirectional, and makes no assumptions with respect to symmetric link weights for the links that are not restricted or isolated.
Definition: The backbone of a configuration consists of all non-isolated nodes and all links that are neither isolated nor restricted.
Definition: A backbone is connected if and only if every node in it can reach every other node using only links of the backbone.
B. Algorithm
The number and internal structure of backup
configurations in a complete set for a given
topology may vary depending on the
construction model. If more configurations are
created,
Fewer links and nodes need to be isolated per
configuration, giving a richer (more connected)
backbone in each configuration. On the other
hand, if fewer configurations are constructed,
the state requirement for the backup routing
information storage is reduced. However,
calculating the minimum number of
configurations for a given topology graph is
computationally demanding.
1) Description: Algorithm 1 loops through all nodes in the topology and tries to isolate them one at a time. A link is isolated in the same iteration as one of its attached nodes. The algorithm terminates when either all nodes and links in the network are isolated in exactly one configuration, or a node that cannot be isolated is encountered.
a) Main loop: Initially, the backup configurations are created as copies of the normal configuration. A queue of nodes and a queue of links are initiated. The node queue contains all nodes in an arbitrary sequence; the link queue is initially empty, but all links in the network will have to pass through it. A dequeue operation returns the first item in a queue, removing it from the queue.
b) Isolating links: Along with a node, as many as possible of its attached links are isolated. The algorithm runs through the links attached to the node; it can be shown that it is an invariant of the algorithm that all links in the link queue are attached to the current node. If the neighbor node has not been isolated in any configuration, we isolate the link along with the current node if there exists another attached link that is not isolated with it. If the link cannot be isolated together with the node, we leave it for the neighbor node to isolate later.
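A simplified sketch of the idea behind Algorithm 1, ignoring restricted links and link-weight assignment: each node is isolated in exactly one configuration, and an isolation step is only accepted if the remaining backbone of that configuration stays connected. The networkx library is assumed to be available.

```python
import networkx as nx

def generate_backup_configs(G, n_configs):
    """Return one set of isolated nodes per configuration, or None if some
    node cannot be isolated with this number of configurations."""
    configs = [set() for _ in range(n_configs)]
    for idx, node in enumerate(G.nodes):
        placed = False
        order = list(range(idx % n_configs, n_configs)) + list(range(idx % n_configs))
        for c in order:
            candidate = configs[c] | {node}
            backbone = G.subgraph(set(G.nodes) - candidate)     # non-isolated part
            if len(backbone) > 0 and nx.is_connected(backbone):
                configs[c].add(node)      # node (and its attached links) isolated in c
                placed = True
                break
        if not placed:
            return None                   # retry with more configurations
    return configs

# Example: a ring of six nodes can be covered with three configurations.
print(generate_backup_configs(nx.cycle_graph(6), 3))
```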
2) Output: We show that successful execution
of Algorithm 1 results in a complete set of valid
backup configurations.
Proposition 3.3: If Algorithm 1 terminates successfully, the produced backup configurations are valid.
Proof: Links are only given restricted or infinite weights in the process of isolating one of their attached nodes.
Proposition 3.4: If Algorithm 1 terminates successfully, the backup configuration set is complete and all configurations are valid.
Proof: Initially, all links in all configurations have their original link weights. Each time a new node and its connected links are isolated in a configuration, we verify that the backbone of that configuration remains connected. When the links are isolated, it is checked that the node still has at least one neighbor that is not isolated in the same configuration. When isolating a node, we also isolate as many as possible of its connected links (Algorithm 1). If the algorithm terminates with success, all nodes and links are isolated in exactly one configuration; thus, the configuration set is complete.
3) Termination: The algorithm runs through all
nodes trying to make them isolated in one of the
backup configurations and will always terminate
with or without success. If a node cannot be
isolated in any of the configurations, the
algorithm terminates without success. However,
the algorithm is designed so that any bi-connected topology will result in a successful
termination, if the number of configurations
allowed is sufficiently high.
Proposition 3.5: Given a bi-connected graph, there will exist a sufficiently large number of backup configurations so that Algorithm 1 will terminate successfully.
Proof: Assume the number of configurations equals the number of nodes. The algorithm will then create one backup configuration per node, isolating one node in each backup configuration; in bi-connected topologies this can always be done. Along with a node, all attached links except one can be isolated. By forcing the node at the other end of the remaining link to be the next node processed, and that link to be first among its links, this node and link will be isolated in the next configuration. This can be repeated until we have configurations so that every node and link is isolated. This holds also for the last node processed, since its last link will always lead to a node that is already isolated in another configuration. Since all links and nodes can be isolated, the algorithm will terminate successfully.
4) Complexity: The complexity of the proposed algorithm is determined by the loops and by the complexity of the connectivity test. This test performs a procedure similar to determining whether a node is an articulation point in a graph, which is bounded in the worst case by the size of the graph. Additionally, for each node we run through all adjacent links, whose number has an upper bound in the maximum node degree.
IV. LOCAL FORWARDING
PROCESS
When a packet reaches a point of failure, the
node adjacent to the failure, called the detecting
node, is responsible for finding a backup
configuration where the failed component is
isolated. The detecting node marks the packet as
belonging to this configuration, and forwards the
packet. From the packet marking, all transit
routers identify the packet with the selected
backup configuration, and forward it to the
egress node avoiding the failed component.
Fig. 1: Packet forwarding state diagram.
When the detecting node cannot reach a neighbor, we can distinguish between two cases. If the destination is not the unreachable neighbor itself, forwarding can be done in the configuration where that neighbor, and the link to it, are isolated and will be avoided. In the other case, the challenge is to provide recovery for the failure of the link when the neighbor node itself is operative; our strategy is then to forward the packet using a path to that node that does not contain the failed link. Furthermore, packets that have changed configuration before (their configuration ID differs from the one used in the normal configuration) and still meet a failed component on their forwarding path must be discarded.
A. Implementation Issues
The forwarding process can be implemented in the routing equipment as presented above, requiring the detecting node to know the backup configuration for each of its neighbors. The node would then perform at most two additional next-hop look-ups in the case of a failure. However, all nodes in the network have full knowledge of the structure of all backup configurations. Hence, a node can determine in advance the correct backup configuration to use if the normal next hop for a destination has failed. This way the forwarding decision at the point of failure can be simplified, at the cost of storing the identifier of the correct backup configuration to use for each destination and failing neighbor.
V. PERFORMANCE EVALUATION
MRC requires the routers to store additional
routing configurations. The amount of state
required in the routers is related to the number
of such backup configurations.
A. Evaluation Setup
For each topology, we measure the
minimum number of backup configurations
needed by our algorithm to isolate every node
and link in the network. Recall from Section III-
B that our algorithm for creating backup
configurations only takes the network topology
as input, and is not influenced by the link
weights.
Loop-Free Alternates (LFA): LFA is a cheaper fast-reroute technique that exploits the fact that, for many destinations, there exists an alternate next-hop that will not lead to a forwarding loop. If such alternate paths exist for all traffic that is routed through a node, we can rely on LFA instead of protecting the node using MRC.
They define a piecewise linear cost function Φ that depends on the load on each of the links in the network; Φ is convex and resembles an exponentially growing function.
Fig. 2: The COST239 network.
B. Number of Backup Configurations
The table also shows how many nodes are covered by LFAs, and the number of configurations needed when MRC is used in combination with LFAs. Since some nodes and links are completely covered by LFAs, MRC needs to isolate fewer components, and hence the number of configurations decreases for some topologies. This modest number of backup configurations shows that our method is implementable without requiring a prohibitively high amount of state information.
Fig. 3: The number of backup configurations required for a wide range of BRITE-generated topologies. As an example, the bar named wax-2-16 denotes that the Waxman model is used with a links-to-node ratio of 2 and 16 nodes.
C. Backup Path Lengths
The numbers are based on 100 different synthetic Waxman topologies with 32 nodes and 64 links. All the topologies have unit-weight links, in order to focus more on the topological characteristics than on a specific link weight configuration. Algorithm 1 yields richer backup configurations as their number increases.
Fig. 4: Backup path lengths in the case of a node failure.
Fig. 5: Average backup path lengths in the case of a node failure as a function of the number of backup configurations.
D. Load on Individual Links
We consider the maximum load on all links, which are indexed from the least loaded to the most loaded in the failure-free case.
Fig. 6: Load on all unidirectional links in the failure-free case, after IGP re-convergence, and when MRC is used to recover traffic. Each individual link's worst-case scenario is shown.
The results indicate that the restricted routing in the backup topologies results in a worst-case load distribution that is comparable to what is achieved after a complete IGP rerouting process. However, we see that for some link failures, MRC gives a somewhat higher maximum link utilization in this network.
VI. RECOVERY LOAD DISTRIBUTION
MRC recovery is local, and the
recovered traffic is routed in a backup
configuration from the point of failure to the
egress node. If MRC is used for fast recovery,
the load distribution in the network during the
failure depends on three factors:
(a) The link weight assignment used in the
normal configuration,
(b) The structure of the backup configurations,
i.e., which links and nodes are isolated in each ,
(c) The link weight assignments used in the
backbones of the backup configurations.
The link weights in the normal configuration (a) are important since MRC uses the backup configurations only for the traffic affected by the failure, and all non-affected traffic is distributed according to them. The backup configuration structure (b) dictates which links can be used in the recovery paths for each failure. The backup configuration link weight assignments (c) determine which among the available backup paths are actually used.
A Cloud Computing Solution for Product Authentication using QR Code
G. Bhagath, B. Harishkumar
Department of Computer Science and Engineering,
Gojan School of Business and Technology
bhagath.gopinath@gmail.com, harish_storm26@yahoo.com

Abstract
Product Authentication is one of the fundamental
procedures to ensure the standard and quality of
any product in the market. Counterfeit products
are often offered to consumers as being authentic.
Counterfeit consumer goods such as electronics,
music, apparel, and Counterfeit medications have
been sold as being legitimate. Efforts to control
the supply chain and educate consumers to
evaluate the packaging and labeling help ensure
that authentic products are sold and used.
However, educating the consumer to evaluate the product is a challenging task. Our work makes the task simple with the help of a camera-enabled mobile phone supported with a QR (Quick Response) Code reader. We propose a model whereby the application in the mobile phone decodes the captured coded image and sends it through the Cloud Data Management Interface for authentication. The system then forwards the message to the product manufacturer's data center or any central database, and the response received from the cloud enables the consumer to decide on the product's authenticity. The authentication system is offered as a pay-per-use model, thereby making it a Security as a Service (SaaS) architecture. The system is being implemented with a mobile toolkit in the cloud environment provided by the simulator CloudSim 1.0.
Index Terms: Cloud Computing, QR codes, Authentication, 2D Codes, Security as a Service.
I. INTRODUCTION
Authentication is one of the most important processes for any consumer: to identify whether the product we buy comes from an authentic manufacturer or from some fictitious company, and also to ensure that the product is well within the limit of its expiry. In recent times there are a lot of duplicate and expired products in the market; the duplication of products has penetrated many categories, from basic provisions to more important pharmaceuticals. Consumers cannot judge whether a product is original or duplicate on their own by checking the manufacturing date and the expiry date. The lack of awareness about a product's authenticity was well exposed in a recent issue where consumers faced a problem with the duplication of medicines: it was found that many expired medicines had been recycled and sold in the market as new ones [1]. This problem occurred mainly because of the absence of a proper authentication system to find out whether the product is an original one.
Thus, to prevent this from happening again in the pharmaceutical field or with any other consumer product, a proper and effective authentication system must be implemented, one which prevents shopkeepers or stockists from modifying any of the records regarding the originality of the product. The present authentication systems dealing with product identification and authentication are the barcode and the hologram. Barcodes are the most common form of identity establishment technique, where a series of black vertical lines of various widths associated with numbers is printed on every product. Being an age-old technique, it is quite easily duplicated. The second and more efficient technique is the hologram. Holograms are photographic images that are three-dimensional and appear to have
depth. Holograms work by creating an image composed of two superimposed two-dimensional pictures of the same object seen from different reference points. The hologram is printed onto a set of ultra-thin curved silver plates, which are made to diffract light, and these thin silver plates are pasted onto the product to establish its authenticity. However, hologram stickers are a bit expensive because of their cost of manufacturing, and hence authenticating low-priced consumer goods with them would not be a feasible solution. The drawbacks of the above techniques are that, at one extreme, the barcode can be easily duplicated, and at the other extreme, hologram stickers are too sophisticated for a normal consumer to identify the intricate details and come to a conclusion about the originality of the product.
The drawbacks of the barcode and the 3-D hologram technique have led to the evolution of a new technique called the QR (Quick Response) code. It is a plain matrix code designed with the intent of decoding it at very high speed, created as a step up from the bar code. A QR Code contains data in both vertical and horizontal directions, whereas a bar code has only one direction of data, usually the vertical one. A QR Code can correspondingly hold more information and is easily read by scanning equipment; because it can hold potentially twice the amount of data of a bar code, it can increase the effectiveness of such scanning. Further, QR Code can handle alphanumeric characters, symbols, binary data and other kinds of code. QR Code also has an error-correction capability, whereby the data can be recovered even if the symbol has been partially damaged. All of these features make QR Code far superior to the bar code.
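For illustration, a minimal sketch of generating and decoding such a product QR code using the third-party Python libraries qrcode and pyzbar (both assumed to be installed); the product code string is hypothetical.

```python
import qrcode
from PIL import Image
from pyzbar.pyzbar import decode

product_code = "GOJAN-PROD-0001-9F3A"        # hypothetical unique product code

# Manufacturer side: generate the image that is printed on the product cover.
qrcode.make(product_code).save("product_qr.png")

# Consumer side: the mobile application decodes the captured image.
result = decode(Image.open("product_qr.png"))
print(result[0].data.decode())               # -> GOJAN-PROD-0001-9F3A
```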
Every product we buy will have a QR code printed on its cover, unique to that product, and this code is used in our authentication system. The application reads the code printed on the external cover of the product and decodes it to obtain the data stored in the code. The data is then encrypted to add more security and sent to the central web server, which is in the cloud, through SMS (Short Messaging Service). The data can also be sent through WAP (Wireless Access Protocol) or MMS (Multimedia Messaging Service), but the cost factor is lower for SMS compared with WAP and MMS.
The central server collects the data and checks it against the manufacturer's server for the product's code. The code is looked up with a searching algorithm; if it is found, the corresponding record in the manufacturer's database is marked as bought and a reply is sent to the central web server stating that the product is original. If a match is not found, the manufacturer's server returns a message stating that the product is a duplicate. The web server then conveys the message to the user.
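A minimal sketch of the server-side check described above; the in-memory dictionary and product codes are hypothetical stand-ins for the manufacturer's data center.

```python
manufacturer_db = {
    "GOJAN-PROD-0001-9F3A": {"bought": False},    # hypothetical product records
    "GOJAN-PROD-0002-C7B1": {"bought": False},
}

def authenticate(product_code):
    record = manufacturer_db.get(product_code)
    if record is None:
        return "DUPLICATE: code not found in manufacturer's database"
    if record["bought"]:
        return "WARNING: code already marked as bought (possible reuse)"
    record["bought"] = True                       # first successful authentication
    return "ORIGINAL: product authenticated"

# Reply that would be sent back to the consumer via SMS.
print(authenticate("GOJAN-PROD-0001-9F3A"))
```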
II. RELATED WORK
QR codes are now used in web pages to access a page directly from the mobile phone without entering the URL, simply by capturing the QR code with the camera attached to the phone. QR codes can also be used on business cards: a QR code encoded with data about the person is created and printed on the person's business card. If any of his friends wants to add the details to the mobile phone contact list, the QR code is simply captured with the camera, and the reader software on the phone decodes the data in the image and stores the various details of the person in the mobile's phone book.
In the present situation two authentication systems are mainly used, and there are disadvantages with both the barcode and the hologram. The problem with the barcode is that anyone can read the numbers of a barcode and modify them to make a duplicate product look original, while holograms do not contain any hidden data; a hologram is just an image with a three-dimensional effect. A duplicator can create a new hologram which looks similar to the original one, making his duplicate product appear original. Moreover, holograms cannot be used in large quantities, as the cost of printing them is high.
Now, when buying a product, consumers have only two options for deciding its originality: either they must trust the present authentication system and believe that the product is original, or they have to decide based on the shopkeeper's assurance. But neither of these options will work all the time, as the present authentication systems (barcode and holograms) can also be duplicated to look like an original, and the shopkeeper, too, might say that the product is original, as he has to sell his product. So there is no proper authentication to identify the originality of the product.
III. PRODUCT AUTHENTICATION USING QR
CODE
Authentication of consumer products can thus be done with QR codes. The QR code printed on the cover of the product is captured as an image through the camera attached to the mobile phone. The image is then opened with the QR code reading application to extract the data from the code, which is sent to the central web server as an SMS. The web server is connected to the cloud through the internet; on receiving the SMS, it forwards the data to the corresponding manufacturer's server in the cloud. The manufacturer's server uses a searching algorithm to look for the data in its database. If the data is found, a reply is sent to the central server stating that the product is original; if the corresponding record is not found, the manufacturer's server sends a message to the central server stating that the product is a duplicate. The web server, on receiving the message from the manufacturer's server, sends a message to the user stating the status of the product, and the user can then decide whether to buy the product.
The QR code used in our model is better than the present barcode and holograms because QR codes are not in human-readable form: no one can alter them to make a duplicate look original, and they can only be read by QR code readers. In this model the verification is done by the user, with no involvement of the shopkeeper in the authentication process. The user sends the captured image, and the final result is received only by the user with the help of the QR code reader, which reads the data printed in the form of a QR code. Our model uses a QR code reader application written in J2ME (Java 2 Micro Edition). J2ME allows the reader to work on all Java-enabled mobile phones irrespective of screen size, so the model works with any mobile phone, with the small constraint that the phone must have a camera attached to it.
In our model the computing technology used to connect the mobile devices with the central web server is cloud computing, which allows users in various locations to access the web server to check a product's originality. Cloud computing gives easy access to all the remote sites connected to the internet. The central server forwards the reply from the manufacturer's server to the user who submitted a QR code to check the originality of a product. The central server can deliver the result in two ways: it can send an SMS with the details of the product's originality, or it can send a voice message to the user. The choice of reply is based on the user's selection when registering with the web server.
IV. MOBILE DEPLOYMENT MODEL
The mobile phone is the key device in our proposal, since the user needs a device to send the data and receive the reply from the web server. By the end of 2009 about 4 billion people were using mobile phones, and by 2013 that number was projected to grow to 6 billion, far more than the number of personal computer users, which shows that nearly everyone has a mobile phone. The same device can therefore be used for our process rather than buying a new device for the authentication process. Mobile service providers have also reduced the charge for SMS, which lowers the cost of data transfer when using a mobile phone. The most important advantage of using the mobile phone in this model is that the user can send the data and get the reply without anybody's help or intervention, so privacy is maintained, and the transfer is also faster than MMS. The data is sent from the mobile device to the central web server through SMS, as it is more economical than other data transfer modes such as MMS and WAP.
V. CLOUD COMPUTING FOR AUTHENTICATION
Cloud computing has now entered the mobile world as mobile cloud computing. Cloud computing provides general applications online that can be accessed through a web browser, while the software and data reside on the server; the client can use those applications and data without complete knowledge of the underlying infrastructure. Cloud computing has five essential characteristics:
On-demand self-service,
Broad network access,
Resource pooling,
Rapid elasticity, and
Measured service.
Figure 1.2 shows the structure of the cloud used in cloud computing.
Fig 1.2
Put simply, it is a client-server architecture in which the clients request a service rather than a server. In general, cloud computing users do not hold their data locally; all the data is placed in the cloud and the user can access it through a computer or a mobile device. In our model cloud computing is chosen because the manufacturers' servers are located in various places and hold a huge amount of product data. With normal computing we would need to load the data from the server and check it for the required record on the client machine. With cloud computing we can directly access the data present in the manufacturer's server; this reduces the access time and speeds up the process. In our model the manufacturers' servers and our central server are located in the cloud, and the user can access the central server from any location in the country and get the authentication information. The central web server in the cloud locates the corresponding manufacturer's server and sends the data to it. As all the servers are in the cloud, the searching process is simple.
VI. IMPLEMENTATION AND RESULTS
The steps involved in authenticating a product are as follows. First the QR code is captured with the camera attached to the mobile device, and the captured image is decoded with the encode() function to extract the data. The extracted data is then sent to the central server in the cloud through SMS with the help of the sendEncoded() function. On receiving the data from the mobile, the central server searches the respective manufacturer's server and checks for the record. The reply is sent back to the central server, which forwards it to the mobile device with the help of the sendReply() function. Fig. 1.3 shows the process as a sequence.
The pseudocode for the proposed model is shown below.
// Module for decoding the captured QR code image
encode()
{
    if (image == QRcode)
    {
        decode the image and extract the data;
    }
    else
        return "Encode Failed";
}

// Module to send the extracted data to the central server
sendEncoded()
{
    store the extracted data in a buffer;
    send the buffer through SMS;
    if (sending == success)
        return "sending success";
    else
        return "sending failed";
}

// Module to send the result from the server to the mobile
sendReply()
{
    if (record == found)
        return "Original";
    else
        return "Duplicate";
}
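For illustration only, the sendEncoded() step could be realized on a Java-enabled phone with the Wireless Messaging API (JSR 120) roughly as sketched below. The server number and the idea of sending the decoded text directly as the SMS payload are assumptions, not details given in the paper.

import javax.microedition.io.Connector;
import javax.wireless.messaging.MessageConnection;
import javax.wireless.messaging.TextMessage;

/** Rough J2ME sketch of sendEncoded(): push the decoded QR data to the central server via SMS. */
public class SmsSender {

    /** Hypothetical number of the central web server's SMS gateway. */
    private static final String SERVER_ADDRESS = "sms://+10000000000";

    public static boolean sendEncoded(String qrData) {
        MessageConnection conn = null;
        try {
            conn = (MessageConnection) Connector.open(SERVER_ADDRESS);
            TextMessage msg = (TextMessage) conn.newMessage(MessageConnection.TEXT_MESSAGE);
            msg.setPayloadText(qrData);   // buffer the extracted data as the SMS payload
            conn.send(msg);
            return true;                  // corresponds to "sending success"
        } catch (Exception e) {
            return false;                 // corresponds to "sending failed"
        } finally {
            if (conn != null) {
                try { conn.close(); } catch (Exception ignored) { }
            }
        }
    }
}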
HIGHWAY TRANSPORTATION USING ROBOTICS
N.Banupriya
III ECE
banubhuvi@gmail.com
C.S.Revathi
III ECE
revathi.aru66@gmail.com
ABSTRACT:
Highway travel is the lifeblood of modern industrial nations. The larger roads are sorely overburdened: around the major cities, heavy usage slows most peak-hour travel on freeways to less than 60 kilometers per hour. In all, excessive traffic causes more than five billion hours of delay every year; it wastes countless gallons of fuel and needlessly multiplies exhaust emissions. The main goal of this project is to make the experience of driving less burdensome and accident-free, especially on long trips. This can be achieved by making the highway itself part of the driving experience and integrating roadside technologies that would allow the overburdened highway system to be used more efficiently. The automobiles will have automatic throttle, braking, and steering control. The system to host these cars consists of roadside sensors that obtain information about current traffic conditions and relay it to receivers in the automobiles on the road. The automobiles can be grouped together at highway speeds of 65-70 MPH, no more than a few feet apart, which makes better use of the available roadways. In this manner, the traffic systems and the automobiles work together to bring passengers safely and quickly to their destinations.
INTRODUCTION:
People now take for granted automotive systems like emission control and fuel injection. In fact, many people do not realize how many systems inside their automobiles are already monitored and controlled by computers. Fuel delivery, ignition, emission, air-conditioning, and automatic transmission systems are examples of systems used daily in a car that are computer controlled or assisted.
Now, in the information age, people have come to rely on other driver assistance technologies, such as mobile phones and in-vehicle navigation systems. The goal of these technologies is to make the experience of driving less burdensome, especially on long trips.
Even when cars were still young, futurists began thinking about vehicles that could drive themselves, without human help. During the last six decades interest in automated vehicles has risen and fallen several times. Now, at the start of the new century, it is worth taking a fresh look at this concept and asking how automation might change transportation and the quality of our lives.
Consider some of the implications of cars that could drive themselves:
We might eliminate the more than ninety percent of traffic crashes that are caused by human errors such as misjudgments and inattention.
We might reduce antisocial driving behavior such as road rage, rubbernecking delays, and unsafe speeds, thereby significantly reducing the stress of driving.
The entire population, including the young, the old, and the infirm, might enjoy a higher level of mobility without requiring advanced driving skills.
The luxury of being chauffeured to your destination might be enjoyed by the general populace, not just the wealthiest individuals, so we might all do whatever we like, at work or leisure, while traveling in safety.
Fuel consumption and polluting emissions might be reduced by smoothing traffic flow and running vehicles close enough to each other to benefit from aerodynamic drafting.
Traffic-management decisions might be based on firm knowledge of vehicle responses to instructions, rather than on guesses about the choices that drivers might make.
The capacity of a freeway lane might be doubled or tripled, making it possible to accommodate growing demands for travel without major new construction, or, equivalently, today's level of congestion might be reduced, enabling travelers to save a lot of time.
IS IT FEASIBLE?
Automating the process of driving
is a complex endeavor. Advancements in
information technology of the past
decade have contributed greatly, and
research specifically devoted to the
design of automated highway systems
has made many specific contributions.
This progress makes it possible for us to
formulate operational concepts and
prove out the technologies that can
implement them.
AN AUTOMATED DRIVE:
We can now readily visualize your
trip on an automated highway system:
Imagine leaving work at the end of the
day and needing to drive only as far as
the nearest on-ramp to the local
automated highway. At the on-ramp, you
press a button on your dashboard to
select the off-ramp closest to your home
and then relax as your car's electronic
systems, in cooperation with roadside
electronics and similar systems on other
cars, guide your car smoothly, safely,
and effortlessly toward your destination.
En route, you save time by maintaining
full speed even at rush-hour traffic
volumes. At the end of the off-ramp you
resume normal control and drive the
remaining distance to your home, better
rested and less stressed than if you had
driven the entire way. The same
capability can also be used over longer
distances, e.g. for family vacations that
leave everybody, including the "driver,"
relaxed and well-rested even after a
lengthy trip in adverse weather.
Although many different technical
developments are necessary to turn this
image into reality, none requires exotic
technologies, and all can be based on
systems and components that are already
being actively developed in the
international motor vehicle industry.
These could be viewed as replacements
for the diverse functions that drivers
perform every day: observing the road,
observing the preceding vehicles,
steering, accelerating, braking, and
deciding when and where to change
course.
OBSERVING THE ROAD :
Cheap permanent magnets are
buried at four-foot intervals along the
lane centerline and detected by
magnetometers mounted under the
vehicle's bumpers. The magnetic-field
measurements are decoded to determine
the lateral position and height of each
bumper at accuracies of less than a
centimeter. In addition, the magnets'
orientations (either North Pole or South
Pole up) represent a binary code (either
0 or 1), and indicate precise milepost
locations along the road, as well as road
geometry features such as curvature and
grade. The software in the vehicle's
control computer uses this information
to determine the absolute position of the
vehicle, as well as to anticipate
upcoming changes in the roadway.
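As a rough illustration of this idea (not the researchers' actual decoding scheme), the sketch below accumulates one bit per magnet from the sign of the measured vertical field; the threshold convention and the 16-bit code-word length are assumptions made for the example.

/**
 * Simplified sketch of reading the reference magnets as a binary code.
 * Each magnet's vertical field polarity (north or south pole up) contributes
 * one bit; a fixed number of consecutive magnets is assumed to form one
 * code word carrying milepost and road-geometry information.
 */
public class MagnetCodeReader {

    private static final int BITS_PER_CODEWORD = 16; // illustrative assumption
    private int bitsCollected = 0;
    private int codeword = 0;

    /**
     * Called once per magnet passed. A positive vertical field reading is
     * treated as north-pole-up (bit 1), a negative one as south-pole-up (bit 0).
     * Returns the completed code word, or -1 while a word is still being built.
     */
    public int onMagnetPassed(double verticalFieldMicrotesla) {
        int bit = verticalFieldMicrotesla > 0 ? 1 : 0;
        codeword = (codeword << 1) | bit;
        bitsCollected++;
        if (bitsCollected == BITS_PER_CODEWORD) {
            int completed = codeword;
            bitsCollected = 0;
            codeword = 0;
            return completed;   // decoded upstream into milepost, curvature, grade, etc.
        }
        return -1;
    }
}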
Other researchers have used
computer vision systems to observe the
road. These are vulnerable to weather
problems and provide less accurate
measurements, but they do not require
special roadway installations, other than
well-maintained lane markings.
Both automated highway lanes
and intelligent vehicles will require
special sensors, controllers, and
communications devices to coordinate
traffic flow.
OBSERVING PRECEDING
VEHICLES :
The distances and closing rates to
preceding vehicles can be measured by
millimeter-wave radar or a laser
rangefinder. Both technologies have
already been implemented in
commercially available adaptive cruise
control systems in Japan and Europe.
The laser systems are currently less
expensive, but the radar systems are
more effective at detecting dirty vehicles
and operating in adverse weather
conditions. As production volumes
increase and unit costs decrease, the
radars are likely to find increasing
favour.
STEERING, ACCELERATING AND
BRAKING :
The equivalents of these driver
muscle functions are electromechanical
actuators installed in the automated
vehicle. They receive electronic
commands from the onboard control
computer and then apply the appropriate
steering angle, throttle angle, and brake
pressure by means of small electric
motors. Early versions of these actuators
are already being introduced into
production vehicles, where they receive
their commands directly from the
driver's inputs to the steering wheel and
pedals. These decisions are being made
for reasons largely unrelated to
automation. Rather they are associated
with reduced energy consumption,
simplification of vehicle design,
enhanced ease of vehicle assembly,
improved ability to adjust performance
to match driver preferences, and cost
savings compared to traditional direct
mechanical control devices.
DECIDING WHEN AND WHERE
TO CHANGE COURSE:
Computers in the vehicles and
those at the roadside have different
functions. Roadside computers are better
suited for traffic management, setting the
target speed for each segment and lane
of roadway, and allocating vehicles to
different lanes of a multilane automated
facility. The aim is to maintain balanced
flow among the lanes and to avoid
obstacles or incidents that might block a
lane. The vehicle's onboard computers
are better suited to handling decisions
about exactly when and where to change
lanes to avoid interference with other
vehicles.
NEW FUNCTIONS :
Some additional functions have no
direct counterpart in today's driving.
Most important, wireless communication
technology makes it possible for each
automated vehicle's computer to talk
continuously to its counterparts in
adjoining vehicles. This capability
enables vehicles to follow each other
with high accuracy and safety, even at
very close spacing, and to negotiate
cooperative maneuvers such as lane
changes to increase system efficiency
and safety. Any failure on a vehicle can
be instantly known to its neighbors, so
that they can respond appropriately to
avoid possible collisions.
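To make the cooperative-following idea concrete, the sketch below shows a hypothetical status message and a very simple gap-keeping rule. The message fields, gains, and spacing value are illustrative assumptions and are not taken from the demonstration systems described here.

/** Illustrative status message a platoon vehicle might broadcast to its neighbours. */
class VehicleStatus {
    final double speedMps;         // current speed in metres per second
    final double accelerationMps2; // current acceleration
    final boolean faultDetected;   // set if any on-board failure is detected

    VehicleStatus(double speedMps, double accelerationMps2, boolean faultDetected) {
        this.speedMps = speedMps;
        this.accelerationMps2 = accelerationMps2;
        this.faultDetected = faultDetected;
    }
}

/** Very simple constant-spacing follower; a sketch, not the Demo '97 controller. */
class GapKeepingController {
    private static final double DESIRED_GAP_M = 2.0; // "a few feet apart"
    private static final double GAP_GAIN = 0.5;      // illustrative gains
    private static final double SPEED_GAIN = 0.8;

    /** Returns a commanded acceleration from the measured gap and the leader's broadcast. */
    double commandedAcceleration(double measuredGapM, double ownSpeedMps, VehicleStatus leader) {
        if (leader.faultDetected) {
            return -3.0; // brake and fall back if the leader reports a failure
        }
        double gapError = measuredGapM - DESIRED_GAP_M;
        double speedError = leader.speedMps - ownSpeedMps;
        return GAP_GAIN * gapError + SPEED_GAIN * speedError + leader.accelerationMps2;
    }
}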
In addition, there should be
electronic "check-in" and "check-out"
stations at the entry and exit points of the
automated lane, somewhat analogous to
the toll booths on closed toll roads,
where you get a ticket at the entrance
and then pay a toll at the exit, based on
how far you traveled on the road. At
check-in stations, wireless
communication between vehicles and
roadside would verify that the vehicle is
in proper operating condition prior to its
entry to the automated lane. Similarly,
the check-out system would seek
assurance of the driver's readiness to
resume control at the exit. The traffic
management system for an automated
highway would also have broader scope
than today's traffic management systems,
because it would select an optimal route
for every vehicle in the system,
continuously balancing travel demand
with system capacity, and directing
vehicles to follow those routes precisely.
Most of these functions have
already been implemented and tested in
experimental vehicles. All except for
check-in, check-out, and traffic
management were implemented in the
platoon-scenario demonstration vehicles
of Demo '97. A single 166 MHz Pentium
computer (obsolete by standards of
today's normal desktop PCs) handled all
the necessary in-vehicle computations
for vehicle sensing, control, and
communications.
REMAINING TECHNICAL
CHALLENGES:
The key technical challenges that
remain to be mastered involve software
safety, fault detection, and malfunction
management. The state of the art of
software design is not yet sufficiently
advanced to support the development of
software that can be guaranteed to
perform correctly in safety-critical
applications as complex as road-vehicle
automation. Excellent performance of
automated vehicle control systems (high
accuracy with superb ride comfort) has
been proved under normal operating
conditions, in the absence of failures.
Elementary fault detection and
malfunction management systems have
already been implemented to address the
most frequently encountered fault
conditions, for use by well-trained test
drivers. However, commercially viable
implementations will need to address all
realistic failure scenarios and provide
safe responses even when the driver is a
completely untrained member of the
general public. Significant efforts are
still needed to develop system hardware
and software designs that can satisfy
these requirements.
NONTECHNICAL CHALLENGES :
The nontechnical challenges
involve issues of liability, costs, and
perceptions. Automated control of
vehicles shifts liability for most crashes
from the individual driver (and his or her
insurance company) to the designer,
developer and vendor of the vehicle and
roadway control systems. Provided the
system is indeed safer than today's
driver-vehicle-highway system, overall
liability exposure should be reduced. But
its costs will be shifted from automobile
insurance premiums to the purchase or
lease price of the automated vehicle and
toll for use of the automated highway
facility.
All new technologies tend to be
costly when they first become available
in small quantities, then their costs
decline as production volumes increase
and the technologies mature. We should
expect vehicle automation technologies
to follow the same pattern. They may
initially be economically viable only for
heavy vehicles (transit buses,
commercial trucks) and high-end
passenger cars. However, it should not
take long for the costs to become
affordable to a wide range of vehicle
owners and operators, especially with
many of the enabling technologies
already being commercialized for
volume production today.
The largest impediment to
introduction of electronic chauffeuring
may turn out to be the general perception
that it's more difficult and expensive to
implement than it really is. If political
and industrial decision makers perceive
automated driving to be too futuristic,
they will not pay it the attention it
deserves and will not invest their
resources toward accelerating its
deployment. The perception could thus
become a self-fulfilling prophecy.
It is important to recognize that
automated vehicles are already carrying
millions of passengers every day. Most
major airports have automated people
movers that transfer passengers among
terminal buildings. Urban transit lines in
Paris, London, Vancouver, Lyon, and
Lille, among others, are operating with
completely automated, driverless
vehicles; some have been doing so for
more than a decade. Modern commercial
aircraft operate on autopilot for much of
the time, and they also land under
automatic control at suitably equipped
airports on a regular basis.
Given all of this experience in
implementing safety-critical automated
transportation systems, it is not such a
large leap to develop road vehicles that
can operate under automatic control on
their own segregated and protected
lanes. That should be a realistic goal for
the next decade. The transportation
system will thus gain substantial benefits
from the revolution in information
technology.
HARDWARE PLATFORM FOR
WORKING MODEL:
GENERAL BLOCK DIAGRAM
[Figure: general block diagram. The 8052-based microcontroller connects through its ports and ADC inputs to the line detector and obstacle detector, and drives the steering servo and DC motor via PWM, direction, and enable lines.]
INFRARED PROXIMITY
DETECTOR:
The IR proximity detector uses the same technology found in a TV remote control device. The detector sends out modulated infrared light and looks for reflected light coming back. When enough light is received back to trigger the detector circuit, the circuit produces a high level on the output line. The light takes the form of a continuous string of bursts of modulated square waves, and the bursts alternate between the left and right LEDs. A microcontroller generates the bursts and correlates the receiver output with each burst. The IRPD we have used makes use of a Panasonic PNA4602M IR sensor coupled with two IR LEDs to detect obstacles. The Panasonic module contains integrated amplifiers, filters, and a limiter. The detector responds to a modulated carrier, which helps eliminate background noise associated with sunlight and certain lighting fixtures. The LEDs are modulated by an adjustable free-running oscillator, and the sensitivity of the sensor is controlled by altering the drive current to the LEDs. The microcontroller alternately enables the LEDs and checks for a reflection. Two enable lines are provided from the host microcontroller, one for enabling the left IR LED and the second for enabling the right IR LED. A third, analog output from the IRPD kit is connected to an analog-to-digital converter.
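The polling logic described above can be pictured with the hypothetical sketch below. The hardware access is abstracted behind an invented HardwareIo interface and the ADC threshold is an assumption, so this is only an illustration of the alternating-LED scheme, not the actual firmware.

/** Hypothetical hardware abstraction for the IRPD pins and ADC. */
interface HardwareIo {
    void setLeftLed(boolean on);
    void setRightLed(boolean on);
    int readDetectorAdc();          // higher value = stronger reflection (assumption)
    void delayMillis(int ms);
}

/** Sketch of the alternating-burst polling: enable one LED, sample the detector, repeat. */
class IrProximityDetector {
    private static final int REFLECTION_THRESHOLD = 512; // illustrative ADC threshold
    private final HardwareIo io;

    IrProximityDetector(HardwareIo io) { this.io = io; }

    /** Returns "LEFT", "RIGHT", "BOTH" or "NONE" depending on where a reflection is seen. */
    String poll() {
        boolean left = burstAndCheck(true);
        boolean right = burstAndCheck(false);
        if (left && right) return "BOTH";
        if (left) return "LEFT";
        if (right) return "RIGHT";
        return "NONE";
    }

    private boolean burstAndCheck(boolean leftSide) {
        if (leftSide) io.setLeftLed(true); else io.setRightLed(true);
        io.delayMillis(1);                       // let the modulated burst go out
        int reading = io.readDetectorAdc();
        if (leftSide) io.setLeftLed(false); else io.setRightLed(false);
        return reading > REFLECTION_THRESHOLD;
    }
}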
LINE DETECTOR:
The line detector is an infrared reflective sensor that can be attached to the front of the car to follow a white line on a black background, or vice versa. There are three reflective sensors, each made from one infrared LED and photodetector pair directed at the surface below the vehicle. Each of the sensors looks for reflected IR light. When a sensor is positioned over a dark or black surface its output is low; when it is moved over a light or white surface its output is high. The microcontroller accepts these signals and steers the robot accordingly; a sketch of this decision logic is given below. The line detector works effectively when the line thickness is within the specified range; the track can be white tape on a black background or black tape on a white background. The sensors can be at a maximum height of 0.5 inches above the ground.
The three IR detector pairs are depicted on the right of the circuit diagram. The base of each of the transistors is passed through an inverter. The lines from the inverter are passed to the microcontroller and to LEDs indicating the position of the line detector on the road.
As the light emitted from the IR LED is reflected from the road back to the transistor, current starts flowing through the emitter, pulling the base low. The base is connected to the inverter, which causes its output line to go high. Since the output lines are also connected to the LEDs, the corresponding LED glows when the particular output line is high.
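A minimal sketch of this decision logic follows; the mapping from sensor states to steering commands is the obvious one implied by the description above and is not taken from the authors' implementation.

/** Sketch of the three-sensor line-following decision table. */
class LineFollower {

    enum Command { STRAIGHT, STEER_LEFT, STEER_RIGHT, STOP }

    /** Inputs are true when the corresponding sensor sees the line (dark surface). */
    static Command decide(boolean leftOnLine, boolean centreOnLine, boolean rightOnLine) {
        if (centreOnLine && !leftOnLine && !rightOnLine) return Command.STRAIGHT;
        if (leftOnLine)  return Command.STEER_LEFT;   // line drifted toward the left sensor
        if (rightOnLine) return Command.STEER_RIGHT;  // line drifted toward the right sensor
        return Command.STOP;                          // line lost
    }
}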
STEERING SERVO:
A servo comprises a control circuit, a set of gears, a potentiometer, and a motor. The potentiometer is connected to the motor via the gear set. A control signal gives the motor a position to rotate to and the motor starts to turn. The potentiometer rotates with the motor, and as it does so its resistance changes. The control circuit monitors this resistance; as soon as it reaches the appropriate value the motor stops and the servo is in the correct position. A servo is a classic example of a closed-loop feedback system. The potentiometer is coupled to the output gear, and its resistance is proportional to the position of the servo's output shaft (0 to 180 degrees).
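As an illustration, and assuming the common hobby-servo convention of a 1.0 to 2.0 ms control pulse repeated every 20 ms for 0 to 180 degrees (the actual servo used may differ), the commanded steering angle maps to a pulse width as sketched below.

/** Sketch of angle-to-pulse-width mapping under the assumed hobby-servo convention. */
class SteeringServo {
    static final double FRAME_MS = 20.0;   // the pulse is repeated once per 20 ms frame

    /** Converts an angle in degrees (0-180) to a pulse width in milliseconds. */
    static double pulseWidthMs(double angleDegrees) {
        double clamped = Math.max(0.0, Math.min(180.0, angleDegrees));
        return 1.0 + (clamped / 180.0);    // 0 deg -> 1.0 ms, 180 deg -> 2.0 ms
    }

    public static void main(String[] args) {
        System.out.println(pulseWidthMs(90));   // 1.5 ms -> servo centred
    }
}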
CONCLUSION:
The National Highway Traffic Safety Administration has ongoing research on collision avoidance and driver/vehicle interfaces. AHS was a strong public/private partnership with the goal of building a prototype system. There are many things that can be done in the vehicle, but doing some of them on the roadway would be more efficient and possibly cheaper. Preliminary estimates show that rear-end, lane-change, and roadway-departure crash-avoidance systems have the potential to reduce motor-vehicle crashes by one-sixth, or about 1.2 million crashes a year. Such systems may take the form of warning drivers, recommending control actions, or introducing temporary or partial automated control in hazardous situations. Although the AHS described in this paper is functional, there is much room for improvement. More research is needed to determine whether any dependencies exist that influence the velocity of a vehicle maintaining proper following distance while following a path. Assuming such a system is ever perfected, one would imagine it would tend to turn the great tradition of the free-ranging car into something approaching mass transit. After all, when your individual car becomes part of a massive circulatory system, with the destination pre-selected and all spontaneity removed, what makes your travel any different from a transit trip? Only that you choose the starting time personally.
A Comprehensive Stream Based Web Services Security Processing System
R.Chitra, Lecturer
Department of Computer
Science & Engineering
Gojan School Of Business
Technology
chitrar05@gmail.com
B.Abinaya
IV CSE
Gojan School Of Business
Technology
abinsb01@gmail.com
M.Geethanjali
IV CSE
Gojan School Of Business
Technology
anjalimthkmr@gmail.com
Abstract - With SOAP-based web services leaving the stadium of being an explorative set of new technologies and entering the stage of mature and fundamental building blocks for service-driven business processes, and in some cases even for mission-critical systems, the demand for nonfunctional requirements, including efficiency as well as security and dependability, commonly increases rapidly. Although web services are capable of coupling heterogeneous information systems in a flexible and cost-efficient way, their processing efficiency and robustness against certain attacks do not fulfill industry-strength requirements. In this paper, a comprehensive stream-based WS-Security processing system is introduced, which enables more efficient processing in service computing and increases the robustness against different types of Denial-of-Service (DoS) attacks. The introduced engine is capable of processing all standard-conforming applications of WS-Security in a streaming manner. It can handle, e.g., any order, number, and nesting degree of signature and encryption operations, closing the gap toward more efficient and dependable web services.
Index Terms - Web services, SOAP, WS-Security, streaming processing, DoS robustness, efficient processing.
INTRODUCTION
ENTERPRISES are faced with greatly
changing requirements influencing the way
businesses are created and operated. They
have become more pervasive with a mobile
workforce, outsourced data centers, different
engagements with customers, and
distributed sites. Information and
communication technology (ICT) is
therefore becoming a more and more critical
factor for business. ICT moves from a
business supporter to a business enabler and
has to be partly considered as a business
process on its own. In order to achieve the
required agility of the enterprise and its ICT,
the concept of Service-Oriented
Architectures [1] is increasingly used. The
most common technology for implementing
SOA-based systems is the SOAP-based web
services [2]. Some applications like
Software-as-a-Service (SaaS) [3], [4] or
Cloud Computing [5] are inconceivable
without web services. There are a number of
reasons for their high popularity. SOAP-
based web services enable flexible software
system integration, especially in
heterogeneous environments, and are a
driving technology for interorganization
business processes. Additionally, the large
amount of increasingly mature
specifications, the strong industry support,
and the large number of web service
frameworks for nearly all programming
languages have boosted its acceptance and
usage.
Since SOAP is an XML-based protocol, it
inherits a lot of the advantages of the text-
based XML such as message extensibility,
human readability, and utilization of
standard XML processing components. On
the other hand, of course, SOAP also inherits all of XML's issues. The main problems raised by critics since the start of web services are the verbosity of transmitted messages and the high resource requirements for processing [6]. These issues are further increased when using SOAP security [7] through the need to handle larger messages and perform cryptographic operations. These issues pose performance challenges which need to be
addressed and solved to obtain the efficiency
and scalability required by large (cross-
domain) information systems. These
problems are especially severe, e.g., in
mobile environments with limited
computing resources and low data rate
network connections [8], or for high-volume
web service transactions comprising a large
number of service invocations per second.
Further, high resource consumption is not
only an economic or convenience factor, it
also increases the vulnerability to resource
exhaustion Denial-of-Service (DoS) attacks.
To overcome the performance issues,
streaming XML processing provides
promising benefits in terms of memory
consumption and processing time. The
streaming approach is not new, but has not
found widespread adoption yet. The reasons for this are manifold. The main issue
surely is the missing random access to
elements inside the XML document which
makes programming difficult. Therefore, a
current trend is using stream-based methods
for simple message preprocessing steps
(e.g., schema validation) and tree-based
processing inside the application. WS-
Security processing is double edged in this
sense. On one hand, high resource
consumption and the ability to detect
malicious messages makes security
processing an ideal candidate for streaming
methods. On the other hand, it requires
rather complex operations on the SOAP
message.
Thus, to date there exists no comprehensive
stream-based WS-Security engine. This
paper presents how a secured SOAP
message as defined in WS-Security can be
completely processed in streaming manner.
It can handle, e.g., any order, number, and
nesting degree of signature and encryption
operations. Thus, the system presented
provides the missing link to a fully streamed
SOAP processing, which makes it possible to leverage
the performance gains of streaming
processing as well as to implement services
with an increased robustness against Denial-
of-Service attacks.
STREAMING WS-SECURITY
PROCESSING
In this section, the algorithms for processing
WS-Security enriched SOAP messages in a
streaming manner are presented and
discussed. To understand the algorithms and
the problems solved by them, first of all, an
introduction to the WS-Security elements is
given.
WS-Security
In contrast to most classic communication
protocols, web services do not rely on
transport-oriented security means (like
TLS/SSL [25]) but on message-oriented
security. The most important specification
addressing this topic is WS-Security [26],
defining how to provide integrity,
confidentiality, and authentication for SOAP
messages. Basically, WS-Security defines a
SOAP header (wsse:Security) that carries
the WS-Security extensions. Additionally, it
defines how existing XML security
standards like XML Signature [27] and
XML Encryption [28] are applied to SOAP
messages. For processing a WS-Security
enriched SOAP message at the server side,
the following steps must be performed (not
necessarily in this order):
processing the WS-Security header,
verifying signed blocks, and
decrypting encrypted blocks.
This implies that not only processing the
individual parts must be considered but also
the references between the WS-Security components. This is especially important in the context of stream-based processing, since arbitrary navigation between message parts is not possible in this processing model.
Fig. 2 shows an example of a WS-Security
secured SOAP message containing these
references. Security tokens contain identity
information and cryptographic material
(typically an X.509 certificate) and are used
inside signatures and encrypted keys and are
backward referenced from those. Encrypted
key elements contain a (symmetric)
cryptographic key, which is asymmetrically
encrypted using the public key of the
recipient. This symmetric key is used for
encrypting message parts (at the client side)
and also for decrypting the encrypted blocks
(at the server side).
Encrypted keys must occur inside the
message before the corresponding encrypted
blocks. Finally, XML signatures have the
following structure:
<ds:Signature>
  <ds:SignedInfo>
    <ds:CanonicalizationMethod/>
    <ds:SignatureMethod/>
    <ds:Reference @URI>
      <ds:Transforms>...</ds:Transforms>
      <ds:DigestMethod/>
      <ds:DigestValue>...</ds:DigestValue>
    </ds:Reference>
  </ds:SignedInfo>
  <ds:SignatureValue>...</ds:SignatureValue>
  <ds:KeyInfo>...</ds:KeyInfo>?
</ds:Signature>
The signature holds, in addition to specifying the cryptographic algorithms, a ds:Reference element for every signed block, the cryptographic signature value of the ds:SignedInfo element, and a reference to the key necessary for validating the signature. A ds:Reference element itself contains a reference to the signed block, optionally some transformations, and the cryptographic hash value of the signed block. References to signed blocks can be either backward or forward references. This has to be taken into account by the processing algorithm.
There are several possibilities for realizing the reference. However, only references according to the XPointer specification [29] are recommended (see WS-I Basic Security Profile [23]). Thus, in the following, we assume that the referenced element contains an attribute of the form Id="myIdentification" and is referenced using the URI "#myIdentification" inside the ds:Reference element.
Architecture
Fig. 3 shows the architecture of the system
for stream-based processing of WS-Security
enriched SOAP messages called CheckWay
[30]. It operates on SAX events created by a
SAX parser and contains four types of Event
Handlers. Instances of these Event Handler
types are instantiated on-demand
and linked together in an Event Handler
chain operating on the stream of XML
events [31]. The first handler is responsible
for processing the WS-Security header. As
the header has a fixed defined position
inside the SOAP message, the handler can
be statically inserted inside the handler
processing chain. For signed and encrypted
blocks, however, this is different. These may
occur at nearly arbitrary positions inside the
SOAP message, and can even be nested
inside each other. Thus, the Dispatcher
handler is responsible for detecting signed
and encrypted blocks and inserting a
respective handler into the processing chain
(at which position will be discussed below).
While detecting encrypted blocks is trivial (they start with the element xenc:EncryptedData), detecting signed
blocks is more difficult as those elements
are not explicitly marked. For forward
references, the signature elements are
(regarding the document order) before the
signed block. Therefore, forward referenced
signed blocks can be detected by comparing
the ID attribute of that element with the list
of references from the signature elements
processed before. For backward references,
there is no possibility for a definite decision
if an element is signed or not. The following
solution for this problem has been
developed. Every element before the end of
SOAP header (only there backward
references are possible) that contains an ID
attribute is regarded as potentially signed
and therefore the signed block processing
is started. At theend of such a block, the ID
and the result of the signed block processing
(i.e., the digest of this block) are stored.
When processing a signature, the included
references are compared to the IDs stored
from the potentially signed blocks and the
stored digest is verified by comparison to
the one inside the signature element.
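As an illustration of this detection step (not the CheckWay implementation itself), the sketch below uses the standard SAX API: IDs referenced by already-processed ds:Reference elements are remembered, and any later element carrying a matching Id attribute triggers signed-block processing. The attribute name and handler structure are simplifying assumptions.

import java.util.HashSet;
import java.util.Set;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Illustrative SAX handler detecting forward-referenced signed blocks. */
public class SignedBlockDetector extends DefaultHandler {

    /** IDs referenced by signatures processed earlier in the stream. */
    private final Set<String> pendingReferences = new HashSet<String>();

    /** Called while processing a ds:Reference element in the WS-Security header. */
    public void addReference(String uri) {
        if (uri.startsWith("#")) {
            pendingReferences.add(uri.substring(1));   // "#myIdentification" -> "myIdentification"
        }
    }

    @Override
    public void startElement(String ns, String localName, String qName, Attributes atts) {
        String id = atts.getValue("Id");   // typically a wsu:Id attribute in real messages
        if (id != null && pendingReferences.contains(id)) {
            startSignedBlockProcessing(id);   // forward reference: start digesting here
        }
    }

    private void startSignedBlockProcessing(String id) {
        // canonicalize and hash the subtree rooted at this element (omitted in this sketch)
    }
}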
Encrypted Key Processing Automaton
Fig. 5 shows the automaton for processing
an xenc:EncryptedKey element contained in
the WS-Security header.
The processing starts with reading the encryption algorithm alg (1). Inside the ds:KeyInfo element (2), a hint to the key pair (key_priv, key_pub) is given (see above). The key key_priv is used for initializing the decryption algorithm in the function initDecryption(alg) (4). The function decrypt(char) then decrypts the content of the xenc:CipherData element using this algorithm in conjunction with key_priv. The result is the (symmetric) key key, which is used later to decrypt encrypted content. The references stated inside the xenc:ReferenceList claim the usage of the current key for those encrypted blocks. Thus, the storeKey(...) function adds the pair (ref, key) to EncKey (7) to enable the decryption of the appropriate encrypted block (see below). Additionally, the pair (ref, Enc) is added to the end of the list of security references Ref.
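A minimal sketch of the cryptographic part of this step, written against the standard Java crypto API, is shown below. The concrete algorithm choices (RSA PKCS#1 v1.5 key transport and AES-CBC content decryption) and the helper names are assumptions for illustration, not the system's actual configuration.

import java.security.PrivateKey;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Sketch of the decryption performed while processing an xenc:EncryptedKey. */
public class EncryptedKeyProcessor {

    /** Recovers the symmetric key from the xenc:CipherData using the recipient's private key. */
    public static SecretKey decryptKey(byte[] cipherValue, PrivateKey keyPriv) throws Exception {
        Cipher rsa = Cipher.getInstance("RSA/ECB/PKCS1Padding");  // plays the role of initDecryption(alg)
        rsa.init(Cipher.DECRYPT_MODE, keyPriv);
        byte[] keyBytes = rsa.doFinal(cipherValue);
        return new SecretKeySpec(keyBytes, "AES");
    }

    /** Later, the stored key decrypts an encrypted block referenced from xenc:ReferenceList. */
    public static byte[] decryptBlock(byte[] cipherText, byte[] iv, SecretKey key) throws Exception {
        Cipher aes = Cipher.getInstance("AES/CBC/PKCS5Padding");
        aes.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        return aes.doFinal(cipherText);
    }
}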
Signature Processing Automaton
Fig. 6 shows the automaton for processing a
ds:Signature element from the WS-Security
header. For verifying the signature value, the
ds:SignedInfo block must be canonicalized
and hashed. Thus, at the beginning of that
element, the canonicalization and hashing is
started by the function startHashing (1).
The canonicalization algorithms for the ds:SignedInfo block are read (2); the WS-I Basic Security Profile includes only Exclusive C14N [33] as canonicalization algorithm. The signature algorithm (e.g., RSA with SHA-1) is read (3). The reference ref is read from the URI attribute of the element ds:Reference (4). The transformation algorithms are read and the set of transformations is stored in t (6, 7). The hashing algorithm for the signed block is read. The digest value is read (10) and the function checkDigest(char, t, ref) is executed. If there exists a D with (ref, D) in CompletedDigest, then ref is a backward reference and thus the referenced block has already been processed; its stored digest D is compared with the digest value just read from the signature.
CONCLUSION
The paper introduces a comprehensive
framework for event-based WS-Security
processing. Although the streaming
processing of XML is known and
understood for almost as long as the
existence of the XML standard itself, the
exploitation of this processing model in the
presence of XML security is not. Due to the
lack of algorithms for event-based processing
of WS-Security, most SOAP frameworks
include the option for streaming-based
processing only for unsecured SOAP
messages. As soon as these messages are
secured by WS-Security mechanisms, the
security processing is performed relying on
the DOM tree of the secured message, hence
losing all advantages of the event-based
processing model. With this contribution,
the implementation of SOAP message
processing can be realized in a streaming
manner including the processing of security
means, resulting in significant
improvements compared to traditional and
currently mainly deployed tree-based
approaches. The main advantages of the
streaming model include an increased
efficiency in terms of resource consumption
and an enhanced robustness against different
kinds of DoS attacks. This paper introduces
the concepts and algorithms for a
comprehensive stream-based WS-Security
component. By implementing configurable
chains of stream processing components, a
streaming WS-Security validator has been
developed, which verifies and decrypts
messages with signed and/or encrypted parts
against a security policy.
The solution can handle any order, number,
and nesting degree of signature and
encryption operations filling the gap in the
stream-based processing chain toward more
efficient and dependable web services.
REFERENCES
[1] T. Erl, Service-Oriented Architecture: Concepts, Technology, and Design. Prentice Hall, 2005.
[2] G. Alonso, F. Casati, H. Kuno, and V. Machiraju, Web Services. Springer, 2004.
[3] M.P. Papazoglou, "Service-Oriented Computing: Concepts, Characteristics and Directions," Proc. Int'l Conf. Web Information Systems Eng., p. 3, 2003.
[4] M. Turner, D. Budgen, and P. Brereton, "Turning Software into a Service," Computer, vol. 36, no. 10, pp. 38-44, 2003.
[5] R. Buyya, C.S. Yeo, and S. Venugopal, "Market-Oriented Cloud Computing: Vision, Hype, and Reality for Delivering IT Services as Computing Utilities," Proc. 10th IEEE Int'l Conf. High Performance Computing and Comm., pp. 5-13, 2008.
[6] M. Govindaraju, A. Slominski, K. Chiu, P. Liu, R. van Engelen, and M.J. Lewis, "Toward Characterizing the Performance of SOAP Toolkits," Proc. Fifth IEEE/ACM Int'l Workshop Grid Computing (GRID '04), pp. 365-372, 2004.
[7] H. Liu, S. Pallickara, and G. Fox, "Performance of Web Services Security," Proc. 13th Ann. Mardi Gras Conf., Feb. 2005.
Study on Web service Implementation in eclipse using apache CXF on JBoss
Platform; Towards Service Oriented Architecture Principles
M.Sanjay (Author)
Dep. Of Computer Science Engineering,
Bharath University,
Chennai 73 , India
S.Sivasubramanian M.Tech (Ph.D)
Asst. Professor, Dept. of CSE
Bharath University
Chennai 73 , India
Sivamdu2010@gmail.com
Abstract - Web service with SOA architecture is one of the most widely used terms in the IT industry. SOA is a powerful distributed computing model that divides business processes into services with the intention of promoting reusability. It converts business processes into loosely coupled services in order to integrate enterprise applications, reduce the IT burden, and increase the ROI. Because the SOA platform is popular and widely used in distributed systems, yet setting up the environment and integrating it remains a challenge, this study has been carried out to set up the web service environment and create services using the SOA architecture. SOA principles and open software tools are used in this study.
Keywords - Service Oriented Architecture (SOA); Web services; JBoss; CXF
INTRODUCTION
Service Oriented Architecture (SOA) and web services are emerging technologies used in the IT industry. This study discusses Service Oriented Architecture, web services with SOA, and the creation of a web service using CXF. Service Oriented Architecture is an architectural model that enhances agility and cost effectiveness and reduces the IT burden. SOA supports service-oriented computing. A service is a unit of solution logic: it serves as an individual component and helps achieve the strategic goals of an organization through reuse. In SOA, services are created, executed, and evaluated. Services are classified into business services, application services, and infrastructure services, and they are aggregated to realize business processes. These services are exposed as web services so that they can be accessed from anywhere. The meta-information about a service is documented as a WSDL definition, which is essentially an XML schema. In web services, the service functions are referred to as service operations. Such a web service can be designed using the open-source tool Apache CXF. CXF grew out of the Celtix and XFire communities; it offers a much friendlier developer experience and is easy to integrate with the Spring framework.
SERVICE ENTITIES
Services are aggregated and exposed through a single, coarser-grained interface. This uses late binding, because the consumer does not know the location of the service until runtime; the consumer learns the service details by referring to the registry at runtime. The service consumer is an application that sends a request to the service with the input specified by the web service contract. The service provider is the service that accepts the request and executes it. The service provider publishes the service contract in the registry so that consumers can access it. After execution, the service provider sends the response to the service consumer according to the contract output. The service registry is the place where all the services are registered, so that a consumer can check which services are available in an organization; it promotes reusability. Tools are provided to enable services to be modeled, created, and stored, and programmers are given access to find the services and are alerted if anything changes in the registry. The service contract specifies the request format and the response format along with preconditions and postconditions. The amount of time the service takes to execute a method is also specified, as a quality of service. The service lease defines the number of years the consumer may use the service; once the lease is over, the consumer must request a new lease from the service registry.
Orchestration is the linking of services to realize business processes. It allows processes to be declared as a business flow, and it provides the ability to define and model processes and analyze them to understand the impact of any changes. It also supports monitoring and management capabilities at the process level.
The life cycle of SOA is: gather the requirements, construct, test, integrate the people, processes, and information, and manage the application services. The elements of SOA are the service, the provider, the requester, and the directory. In this model the service provider publishes the service, based on the WSDL contract, in the service registry, and the service consumer discovers the service and invokes it at the endpoint URL with a SOAP request message. Messaging enables services to communicate and interact across multiple platforms; it is connection independent and has intelligent routing capability.
<?xml version="1.0" encoding="UTF-8"?>
<wsdl:definitions name="CXFWSDL"
    targetNamespace="http://www.example.org/CXFWSDL/"
    xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="http://www.example.org/CXFWSDL/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/">
  <wsdl:types>
    <xsd:schema targetNamespace="http://www.example.org/CXFWSDL/">
      <xsd:element name="NewOperation">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="in" type="xsd:string"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
      <xsd:element name="NewOperationResponse">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="out" type="xsd:string"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:schema>
WebServices
A web service is a software program identified by a Uniform Resource Locator. Its interfaces and bindings are defined using XML, and its definition can be discovered by other systems, which then interact with the web service as specified in that definition using SOAP. The characteristics of web services are platform, location, implementation, and format independence. A web service involves three things: discovery, which finds where the service is; description, which states how the service should be used; and messaging, which handles the communication. A web service uses UDDI for discovery, WSDL for description, and SOAP for messaging. The Simple Object Access Protocol is a specification describing how to provide parameters to a service and how to receive the response from the service; the information is specified using an XML schema. The Web Services Description Language is a standard way to specify a service: the operations of the service are provided in the WSDL, along with the arguments needed for each operation to execute, and the address with the port number is specified to locate the service. UDDI (Universal Description, Discovery and Integration) is a platform framework for describing services, discovering businesses, and integrating business services. It stores information about the services and is effectively a directory of web services described by WSDL. To implement the web service concept, the environment needs to be set up. Once the environment is ready, the services can be written and published to the server. The services can be developed using a top-down or bottom-up approach; in our study the top-down approach is followed. Once the service is published, it can be accessed from different systems. In the market, different models are used to create web services. One of them is CXF, and in our study we use CXF.
WEB SERVICE ENVIRONMENT CONFIGURATION
For creating the web service the environment needs to be set up. The following software is used for setting up the environment:
Application Server: JBoss 5.1.0 GA
IDE: Eclipse JEE Helios SR2 (Win32)
Webservice Model: apache-cxf-2.3.4
Server with CXF Integration: jbossws-cxf-3.4.0.GA
Build tool: apache-ant-1.8.2
Testing tool: SOAP UI 3.1.6
STEP 1: USER ENVIRONMENT VARIABLE SETTINGS
Go to Windows -> Properties -> Advanced -> Environment Variables and set the following environment variables:
CXF_HOME
ANT_HOME
JAVA_HOME
STEP 2: CONFIGURING JBOSSWS-CXF-3.4.0.GA
Rename ant.properties.example to ant.properties in the jbossws-cxf-3.4.0.GA folder and edit line 6:
jboss510.home = path where JBoss is installed
Run ant with the following command:
ant deploy-jboss510
Check that the build completes successfully.
STEP 3: CONFIGURING THE ECLIPSE IDE
Set up the CXF path: go to Window -> Preferences -> Web Services -> CXF 2.x Preferences and give the root directory of CXF.
STEP 4: CONFIGURING THE JBOSS SERVER
Go to Window -> Preferences -> Web Services -> Server and Runtime.
Select the JBoss v5.0 server runtime and Apache CXF 2.x as the web service runtime.
WEB SERVICE CREATION
The precondition is that the JBoss server should be up and running.
Go to File -> New -> Dynamic web project and select Apache CXF 2.x.
Copy the WSDL under the WebContent folder.
Right click on the WSDL -> New -> Other -> Web Service, then click Next -> Next -> Finish.
Rename the beans.xml created under project -> WebContent -> WEB-INF to beans-delta.xml.
Change the value of the contextConfigLocation parameter to WEB-INF/beans-delta.xml in web.xml under project -> WebContent -> WEB-INF.
An .ear file will be created and deployed to the JBoss server.
To check whether the .ear has been deployed properly, go to the InterfaceImpl.java class, take the WSDL path, and open it in Internet Explorer; it should display the WSDL.
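For orientation, an implementation class for the NewOperation operation of the sample CXFWSDL might look roughly like the sketch below; the class name and the placeholder body are illustrative, not the artifacts actually generated by the tooling.

import javax.jws.WebService;

/**
 * Illustrative implementation class for the NewOperation operation of the
 * sample CXFWSDL shown earlier. With the endpoint interface and WSDL location
 * adjusted to the generated artifacts, CXF exposes this class as the service.
 */
@WebService(
        serviceName = "CXFWSDL",
        targetNamespace = "http://www.example.org/CXFWSDL/")
public class CXFWSDLImpl {

    /** Maps to NewOperation: takes the "in" string and returns the "out" string. */
    public String newOperation(String in) {
        return "Hello " + in;   // placeholder business logic
    }
}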
WEBSERVICE ACCESS
Go to SOAP UI and create a new project by selecting the above WSDL.
Right click and create a new request.
Replace the ? placeholders with proper values.
Run the request against the service endpoint address.
Check the server log for the status.
7. Conclusion
This study has covered setting up the web service environment, developing the code, and deploying the WAR file in the application server. Much remains to be done in the web service configuration, such as ESB configuration and JMS configuration.
Future work
Future work for this case study includes configuring the ESB environment and implementing ESB in JBoss to achieve routing and transformation along with security, as well as the JMS messaging concept.
Implementation of Cryonics in Nanotechnology
K.nandhini
III ECE
University college of engineering Arni
Knandhini68@gmail.com
A.Thenmathi
III ECE
University college of engineering Arni
tmathi1992@gmail.com
Abstract:
Today technology plays a vital role in every aspect of life. Rising standards of technology in many fields have taken man to high esteem. But the presently available technologies are unable to interact with atoms, such minute particles; hence nanotechnology has been developed. Nanotechnology is a technology that works with atoms with a view to creating a desired product, and it has wide applications in all fields. An important application is cryonics. Cryonics is an attempt at raising the dead, making them alive again: first the body is preserved, and then, using molecular machines based on nanotechnology, the patient could be revived by repairing the damaged cells.
In this technical paper we discuss cryonics, how the process of cryonics proceeds, why nanotechnology is being used, and the molecular machines that have the capability of repairing damaged cells. Cryonics is therefore an area in which most of the work is still to be done in the future.
Introduction:
Today technology plays a vital role in every aspect of life. Rising standards of technology in many fields, particularly in medicine, have taken man to high esteem. Nanotechnology is a new technology that is knocking at the door. This technology works with atoms with a view to creating a desired product. The term nanotechnology is a combination of two terms, "nano" and "technology". The term nano is derived from the Greek word nanos, which means dwarf; thus nanotechnology is dwarf technology. A nanometer is one billionth of a metre.
Our President A.P.J. Abdul Kalam, being a scientist, noted that nanotechnology would give us an opportunity, if we take appropriate and timely action, to become one of the important technological nations in the world.
The main application of nanotechnology considered here is cryonics. Cryonics is an attempt at raising the dead. Cryonics is not a widespread medical practice and is viewed with skepticism by most scientists and doctors today.
History:
The first mention of nanotechnology occurred in a talk given by Richard Feynman in 1959, entitled "There's Plenty of Room at the Bottom". Historically, cryonics began in 1962 with the publication of The Prospect of Immortality by Robert Ettinger, a founder and the first president of the Cryonics Institute. During the 1980s the extent of the damage caused by the freezing process became much clearer and better known, and the emphasis of the movement began to shift to the capabilities of nanotechnology. The Alcor Life Extension Foundation currently preserves about 70 human bodies and heads in Scottsdale, Arizona, and the Cryonics Institute has about the same number of cryonic patients at its Clinton Township, Michigan facility. There are no cryonics services provided outside the U.S.A., though there are support groups in Europe, Canada, Australia, and the U.K.
Four Generations :
Mihail Roco of the U.S.
National Nanotechnology Initiative has
described four generations of
nanotechnology development. The
current era, as Roco depicts it, is that
of passive nanostructures, materials
designed to perform one task. The
second phase, which we are just
entering, introduces active
nanostructures for multitasking; for
example, actuators, drug delivery
devices, and sensors. The third
generation is expected to begin
emerging around 2010 and will feature
nanosystems with thousands of
interacting components. A few years
after that, the first integrated
nanosystems, functioning much like a
mammalian cell with hierarchical
systems within systems, are expected
to be developed.
Cryonics:
The word "cryonics" refers to the practice of freezing a dead body in the hope of someday reviving it. Cryonics is the practice of cooling people immediately after death to the point where molecular physical decay essentially stops, in the expectation that scientific and medical procedures currently being developed will be able to revive them and restore them to good health later. A patient held in such a state is said to be in "cryonic suspension". Cryonics preserves humans and pets (who have recently become legally dead) until the cryopreservation damage can be reversed and the cause of the fatal disease can be cured (including the disease known as aging). Although cryonics is viewed with skepticism by many, there is a notable representation of scientists among cryonicists. Support for cryonics is based on controversial projections of future technologies and of their ability to enable molecular-level repair of tissues and organs.
Cryonics patient prepares for the future:
The following outlines how an Alcor patient's body is frozen and stored until medical technology can repair the body and revive the patient, or grow a new body for the patient.
Patient declared legally dead:
On the way to Alcor in Arizona, blood circulation is maintained and the patient is injected with medicine to minimize problems with frozen tissue. Cooling of the body is begun. (If the body needs to be flown, the blood is replaced with organ preservatives.)
At Alcor, the body is cooled to about 5 degrees Celsius. The chest is opened and the blood is replaced with a solution (glycerol, water and other chemicals) that enters the tissues, pushing out water to reduce ice formation. In 2 to 4 hours, 60% or more of the body's water is replaced by glycerol.
Freezing the body:
The patient is placed in cold silicone oil, chilling the body to -79 degrees Celsius. It is then moved to an aluminum pod and slowly cooled over 5 days in liquid nitrogen to -196 degrees Celsius (-320 degrees Fahrenheit), then stored.
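As a rough sanity check on the figures above, and only as a sketch (it assumes a linear cooling ramp from -79 degrees Celsius to -196 degrees Celsius over the stated five days, which the source does not actually state), the implied average cooling rate is about one degree per hour:

    # Average cooling rate implied by the figures quoted above.
    # Assumes a linear ramp, which is an assumption, not part of the protocol text.
    start_temp_c = -79.0      # after the silicone-oil bath
    end_temp_c = -196.0       # liquid-nitrogen temperature
    duration_hours = 5 * 24   # "slowly cooled over 5 days"

    rate = (start_temp_c - end_temp_c) / duration_hours
    print(f"Average cooling rate: {rate:.2f} degrees C per hour")  # about 0.98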
Actual process starts:
After preserving the body for some days, the surgery is started. As part of it, chemicals such as glycerol and other advanced chemicals are applied to activate the cells of the body. By doing so, about 0.2% of the cells in the body are activated. After that, the body is preserved for future applications. Cryonicists strongly believe that future medicines of the 21st century will be able to rapidly multiply those cells, which will help to bring the dead person back.
Storage vessel:
Stainless-steel vats are formed into a large thermos-bottle-like container. A vat for up to four bodies weighs about a ton and stands 9 feet tall.
Transtime "recommends" that people
provide a minimum of $150,000 for
whole-body suspension. Part of this
sum pays for the initial costs of the
suspension. The balance is placed in a
trust fund, with the income used to pay
the continued cost of maintaining you
in suspension. Transtime can do
neurosuspensions but does not promote
the option. Transtime also charges a
yearly fee of $96 for membership, with
the price halved to $48 for other family
members.
The Cryonics Institute in Clinton
Township, Michigan, charges $28,000
for a full-body suspension, along with
a one-time payment of $1,250. The
Cryonics Institute does not do
neurosuspension.
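Purely as an illustration of the figures quoted above (the 20-year membership horizon is an assumption added here, not something either provider states), a few lines of Python compare the totals:

    # Comparison of the cryonics costs quoted above (illustrative arithmetic only).
    transtime_whole_body = 150_000   # recommended minimum funding
    transtime_yearly_fee = 96        # membership fee per year
    ci_whole_body = 28_000           # Cryonics Institute full-body suspension
    ci_one_time = 1_250              # one-time payment

    years = 20  # assumed membership horizon, for illustration
    transtime_total = transtime_whole_body + transtime_yearly_fee * years
    ci_total = ci_whole_body + ci_one_time

    print(f"Transtime, funding plus {years} years of membership: ${transtime_total:,}")
    print(f"Cryonics Institute, suspension plus one-time payment: ${ci_total:,}")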
About 90 people in the United States are already in suspension, with hundreds more signed on for the service. Probably the most famous cryopreserved patient is Ted Williams. A cryopreserved person is sometimes whimsically called a corpsicle (a portmanteau of "corpse" and "popsicle"), a term first used by science fiction author Larry Niven, who credits its coinage to someone else.
Obstacles to success:
Revival process:
Critics have often quipped
that it is easier to revive a corpse than a
cryonically frozen body. Many
cryonicists might actually agree with
this, provided that the "corpse" were
fresh, but they would argue that such a
"corpse" may actually be biologically
alive, under optimal conditions. A
declaration of legal death does not
mean that life has suddenly ended;
death is a gradual process, not a
sudden event. Rather, legal death is a
declaration by medical personnel that
there is nothing more they can do to
save the patient. But if the body is
clearly biologically dead, having been
sitting at room temperature for a period
of time, or having been traditionally
embalmed, then cryonicists would hold
that such a body is far less revivable
than a cryonically preserved patient,
because any process of resuscitation
will depend on the quality of the
structural and molecular preservation
of the brain.
Financial Issues:
Cryopreservation arrangements can be expensive, currently ranging from $28,000 at the Cryonics Institute to $150,000 at Alcor and the American Cryonics Society. A further drawback of current vitrification practice is driven by cost: because the most cost-effective means of storing a cryopreserved person is in liquid nitrogen, fracturing of the brain occurs as a result of thermal stresses that develop when cooling from -130 C to -196 C (the temperature of liquid nitrogen). Even so, proponents argue that cryonics is actually quite affordable for the vast majority of people in the industrialized world, particularly those who make arrangements while still young.
Court Rules Against Keeping Frozen Bodies:
The Conseil d'Etat, France's highest administrative court, ruled that cryonics, stopping physical decay after death in the hope of future revival, is illegal. The court said relatives have two choices over what to do with dead bodies: burial or cremation. It said relatives can scatter ashes after cremation, but they have to bury bodies in a cemetery, or in a tomb on private property after gaining special permission.
Why is only nanotechnology used in cryonics?
Biological molecules and systems have a number of attributes that make them highly suitable for nanotechnology applications. Remote control of DNA has proved that electronics can interact with biology, and the gap between electronics and biology is now closing.
The key to cryonics' eventual success is nanotechnology, the manipulation of materials on an atomic or molecular scale, according to most technologists who are interested in cryonic suspension. Current medical science does not have the tools to fix damage that occurs at the cellular and molecular level, and damage to these systems is the cause of the vast majority of fatal illnesses. Nanotechnology is the ultimate miniaturization that can be achieved: a nanometer is equivalent to the width of about six bonded carbon atoms, and a DNA molecule is about 2.5 nm wide. Cryonics ultimately deals with cells, and the structures within cells that must be repaired are on the order of nanometers. At present there is no other technology that works at such a minute scale; only nanotechnology has the ability to deal with cells at this level. Normally fatal accidents could be walked away from, thanks to a range of safety devices possible only with nanotechnology.
Viruses, prions, parasites and bacteria continue to mutate and produce new diseases that our natural immune system may, or may not, be able to handle. In theory, a nano cell sentinel could make our body immune to any present or future infectious disease.
Fracturing is a special concern for the new vitrification protocol brought online by Alcor for neuro patients. If advanced nanotechnology is available for patient recovery, then fracturing probably causes little information loss. Fracturing does, however, commit a cryopatient to the need for molecular repair at cryogenic temperature, a highly specialized and advanced form of nanotechnology, whereas unfractured patients may be able to benefit sooner from simpler forms of nanotechnology developed for more mainstream medical applications. Damage caused by freezing and fracturing is thought to be potentially repairable in the future using nanotechnology, which will enable manipulation of matter at the molecular level.
How is nanotechnology used in cryonics?
Molecular machines could revive patients by repairing damaged cells, but to make those cell-repair machines we first need to build a molecular assembler.
It is quite possible to adequately model the behaviour of molecular machines that satisfy two constraints:
1. They are built from parts that are so stable that small errors in the empirical force fields do not affect the shape or stability of the parts.
2. The synthesis of the parts is done using positionally controlled reactions, where the actual chemical reactions involve a relatively small number of atoms.
Drexler's assembler can be designed within these constraints.
Modeling an assembler using current methods:
The fundamental purpose of an assembler is to position atoms. Robotic arms and other positioning devices are basically mechanical in nature, and will allow us to position molecular parts during the assembly process. Molecular mechanics provides an excellent tool for modeling the behaviour of such devices. The second requirement is the ability to make and break bonds at specific sites. While molecular mechanics provides an excellent tool for telling us where the tip of the assembler arm is located, current force fields are not adequate to model the specific chemical reactions that must then take place at the tip/workpiece interface involved in building an atomically precise part. For this purpose, higher-order ab initio calculations are sufficient.
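To make the phrase "molecular mechanics" concrete, here is a minimal sketch in Python (an illustration only, using a generic Lennard-Jones 12-6 potential with made-up parameters, not any published force field or any tool cited by the authors). It computes the non-bonded energy of a small cluster of atoms from their positions, which is essentially what a force-field model does for the parts of a molecular machine:

    import math

    # Toy molecular-mechanics calculation: total Lennard-Jones pair energy.
    # EPSILON (well depth) and SIGMA (atom diameter) are illustrative values only.
    EPSILON = 0.01   # arbitrary energy units
    SIGMA = 0.34     # nm, a roughly carbon-sized diameter

    def lj_pair_energy(r):
        """Lennard-Jones 12-6 energy for one pair of atoms at separation r (nm)."""
        s6 = (SIGMA / r) ** 6
        return 4.0 * EPSILON * (s6 * s6 - s6)

    def total_energy(atoms):
        """Sum pair energies over all distinct pairs of atom coordinates."""
        energy = 0.0
        for i in range(len(atoms)):
            for j in range(i + 1, len(atoms)):
                dx = atoms[i][0] - atoms[j][0]
                dy = atoms[i][1] - atoms[j][1]
                dz = atoms[i][2] - atoms[j][2]
                energy += lj_pair_energy(math.sqrt(dx * dx + dy * dy + dz * dz))
        return energy

    # A made-up square cluster of four atoms, 0.38 nm apart.
    cluster = [(0.0, 0.0, 0.0), (0.38, 0.0, 0.0),
               (0.0, 0.38, 0.0), (0.38, 0.38, 0.0)]
    print(f"Total non-bonded energy: {total_energy(cluster):.5f}")

A real force field adds bonded terms (bond stretching, angle bending, torsions), but the principle is the same: positions go in, energies and forces come out, with no quantum-chemical treatment of the electrons. That is why such a model can track where an assembler arm is, but not what happens when bonds form at the tip.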
The methods of computational chemistry available today allow us to model a wide range of molecular machines with sufficient accuracy, in many cases, to determine how well they will work.
Computational nanotechnology includes not only the tools and techniques required to model proposed molecular machines; it must also include the tools required to specify such machines. Molecular machine proposals that would require millions or even billions of atoms have been made; the total atom count of an assembler might be roughly a billion atoms. While commercially available molecular modeling packages provide facilities to specify arbitrary structures, it is usually necessary to point and click for each atom involved. This is obviously unattractive for a device as complex as an assembler, with its roughly one billion atoms.
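The "point and click" complaint above is, at heart, a programming problem: a structure with a regular pattern can be generated procedurally instead of atom by atom. The following is a minimal sketch of that idea, not any actual molecular compiler or CAD tool; it assumes the standard diamond-cubic geometry and a lattice constant of roughly 0.357 nm for diamond:

    # Procedural specification of a small diamond-lattice carbon block.
    # Generating coordinates from a loop stands in for the "molecular compiler"
    # idea: the design is a program, not a hand-clicked model.

    A = 0.357  # nm, approximate lattice constant of diamond

    # Diamond cubic = face-centred cubic lattice + a two-atom basis.
    FCC_SITES = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)]
    BASIS = [(0.0, 0.0, 0.0), (0.25, 0.25, 0.25)]

    def diamond_block(nx, ny, nz):
        """Return carbon coordinates (nm) for an nx by ny by nz block of unit cells."""
        atoms = []
        for i in range(nx):
            for j in range(ny):
                for k in range(nz):
                    for fx, fy, fz in FCC_SITES:
                        for bx, by, bz in BASIS:
                            atoms.append(((i + fx + bx) * A,
                                          (j + fy + by) * A,
                                          (k + fz + bz) * A))
        return atoms

    block = diamond_block(3, 3, 3)
    print(f"{len(block)} carbon atoms in a 3 x 3 x 3 block")  # 8 atoms per cell -> 216

Scaling the same loops to a billion atoms changes only the three numbers passed in; this is the sense in which the tools needed to specify such machines are a matter of ordinary computer science.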
The software required to design and model complex molecular machines is either already available or can be readily developed over the next few years. The molecular compiler and other molecular CAD tools needed for this work can be implemented using generally understood techniques and methods from computer science. Using this approach, it will be possible to substantially reduce the development time for complex molecular machines, including Drexler's assemblers.
FUTURE ENHANCEMENTS:
1. With the knowledge of cryonics, cryonicists are preserving the brains of humans. We know that each person alive today was once a single cell, and that a complete human being can grow from one in the natural course of development. Cryonicists therefore believe that genetic programming of a single cell on the surface of a preserved brain could begin a process of growth and development that eventually appends a complete young adult body to that brain.
Conclusion:
With the implementation of cryonics, it may become possible to bring people back to life. However, cryonics is an area in which most of the work is still to be done in the future; so far, mainly the concept has been proposed. For this reason, scientists are not making long-term promises about the future of cryonics.