MASTER'S THESIS
(2012-2013)
QUANTIFYING THE LINK BETWEEN MESOSCOPIC DOUBLY FRACTAL CONNECTED
NEOCORTICAL NETWORK AND BRAIN EFFICIENCY
Author:
Gabriel O. ADENEYE
University of Belgrade
Faculty of Physics
Mentor:
Prof. Dr. Vladimir MILJKOVIĆ
University of Belgrade
Faculty of Physics
October, 2013
To the Immortal, Invincible, the Only Wise God
and
To learning and unlearning
ACKNOWLEDGEMENT
Immeasurable praise to my Lord and King, Jesus Christ. Your faithfulness has been my
sustenance throughout my time in Belgrade. I am blessed to have you!
Special thanks to my supervisor, Prof. Dr. Vladimir Miljković, for his patient guidance and
support.
Worthy of mention is the help I was afforded by Prof. Milan Knežević. Thank you for being open
and engaging. You helped me get clarity about my research interest.
To great friends! Thank you for the moral and emotional support.
Neša and Goca, I am grateful for your gentle and understanding disposition. Thanks for being a
pillar of strength.
Pelumi, my darling fiancée and friend, thanks for being there! It means the world knowing you
are there!
TABLE OF CONTENTS

Dedication
Acknowledgement
Table of Contents
CHAPTER 1: INTRODUCTION
CHAPTER 2: NEUROANATOMICAL BACKGROUND
2.1 Neurons
2.1.1 Electrophysiology of the Neuron
2.1.1.1 Membrane Potential and Nernst Equation
2.1.1.2 Hyperpolarization and Depolarization
2.1.1.3 Action Potential
2.1.1.4 Ionic Currents and Conductances
2.2 The Synapse
2.2.1 Synaptic Transmission
2.2.2 Synaptic Weight
2.3 The Neocortex
2.3.1 Cortical Columns
CHAPTER 3: NETWORK THEORY: AN OVERVIEW
3.1 Brief Historical Background
3.2 Networks in the Real World
3.2.1 Social Networks
3.2.2 Information Networks
3.2.3 Technological Networks
3.2.4 Biological Networks
3.3 Network Properties
3.3.1 Definitions and Notations
3.3.2 Node Degrees, Degree Distribution and Correlations
3.3.3 Shortest Path Lengths, Diameter and Betweenness
3.3.4 Clustering
3.3.5 Graph Spectra
3.3.6 Graph Laplacian
3.4 Network Models
3.4.1 Random Graphs
3.4.2 Small World Networks
3.4.3 Scale-Free Networks
3.5 Weighted Networks
3.5.1 Node Strength, Strength Distribution and Correlation
3.5.2 Weighted Clustering
CHAPTER 4: SELECTIVE OVERVIEW OF NEURAL DYNAMICS
4.1 Neural Models
4.1.1 Physiological Models
4.1.1.1 Hodgkin-Huxley Model
4.1.1.2 FitzHugh-Nagumo Model
4.1.1.3 Leaky Integrate-And-Fire Model
4.1.2 Abstract Models
4.1.2.1 Izhikevich Model
4.1.2.2 Rulkov Model
4.2 Neural Networks
4.2.1 Network Topology
4.2.1.1 Feed-Forward Neural Networks
4.2.1.2 Recurrent Neural Networks
4.2.2 Reservoir Computing
4.2.2.1 Echo State Networks
4.2.2.2 Liquid State Networks
4.2.2.3 Highlight of the Major Differences between ESNs and LSMs
4.2.3 Learning
4.2.3.1 Supervised/Associative Learning
4.2.3.2 Unsupervised Learning/Self-Organization
4.2.4 Other Characteristics of Neural Networks
CHAPTER 5: MODEL AND IMPLEMENTATION
5.1 Microscopic Scale: Intracolumnar
5.1.1 The CSIM Software: Features and Availability
5.1.2 Neuronal Model
5.2 Mesoscopic Scale: Intercolumnar
5.2.1 Neuronal Model
5.2.2 Speed of Information Transfer (SIT) and Synchronization
CHAPTER 6: RESULTS AND DISCUSSIONS
6.1 Results
6.2 Discussions
CHAPTER 7: CONCLUSION
REFERENCES
CHAPTER 1
INTRODUCTION
Thought leaders through the ages have been in awe of the brain. Fascination with this three-pound
organ has provoked rivers of literary ink, philosophical speculation and scientific investigation. Much
has been claimed about the capabilities of the brain, but over the years many of these claims have been
debunked by discoveries that further astound the community of brain enthusiasts across disciplines. The
plasticity, adaptability, noise immunity and cognitive properties of the brain (to mention a few) are well
documented, yet even in the light of these, conclusions from investigative work are drawn delicately. Such
is the enigma called the brain.
Today much has changed: technological advancement, novel imaging techniques and the diversity of
approaches to scientific investigation have made the brain less enigmatic, although it is still far from fully
understood. This new-found understanding of the brain has opened up new possibilities and also new
questions. The possibilities and gains are diverse across disciplines, with medical, technological, defense
and psychological implications among others, and new questions also abound. Brain research is
approached from different angles, at different levels, by different disciplines, yet these efforts are
complementary.
Our research approaches the brain as a dynamical complex system, with a unique topological structure.
We investigate the computational prowess of the neocortex, taking the cortical columns as units in a
doubly fractal network. An attempt at exploring dynamical consequences of the topological setup is of
primary interest here and not a delicate replication of topology. We employ computation as our basis of
evaluating the performance of our setup, using the principle of synchronization as a gauge. The modeling
power of Rulkov's discrete map neuron model is used to represent columnar spiking, and a coupled
map paradigm captures the network.
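Since the Rulkov map carries the dynamics in this work, a brief sketch may help orient the reader. The snippet below implements the common chaotic form of the map from the literature; the exact variant and parameter values used in Chapter 5 may differ, so treat alpha, mu and sigma here as illustrative defaults rather than the thesis settings.

```python
def rulkov_step(x, y, alpha=4.3, mu=0.001, sigma=0.1):
    """One iteration of the chaotic Rulkov map.

    x is the fast variable (a caricature of the membrane potential),
    y the slow variable that gates the bursts; mu << 1 separates the
    two time scales.
    """
    x_next = alpha / (1.0 + x * x) + y
    y_next = y - mu * (x + 1.0) + mu * sigma
    return x_next, y_next

# Iterating from an arbitrary state yields spiking-bursting traces:
x, y = -1.0, -3.0
trace = []
for _ in range(20000):
    x, y = rulkov_step(x, y)
    trace.append(x)
```

Plotting `trace` shows bursts of spikes separated by silent phases, the kind of behavior used later to represent columnar spiking.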
Chapter 2 gives an anatomical overview of concepts important to our approach. The neuron, its
structure, and the dynamics of action potentials are laid out. We also look at the synapse, which is
the connection point between neurons, and present the case for the columns being functional units of
the neocortex in a computational sense.
Chapter 3 takes a look at network theory. Small-world, scale-free and random networks are
presented, and we also examine weighted networks, which more faithfully typify the brain.
Chapter 4 is a selective overview of neural network dynamics. The different approaches to neural
network study are summarized, and the various types of neuron models and their respective advantages are
touched upon. We also look at the platforms upon which investigations are made. The chapter concludes
with an outline of neural network properties.
Chapter 5 outlines our research methodology, identifies our chosen neural model and the reasons for it,
and shows the basis of the computational assessment.
Chapter 6 details the results of our simulations and their discussion.
Chapter 7 concludes the work, presenting questions which should encourage further investigation.
CHAPTER 2
NEUROANATOMICAL BACKGROUND
The brain has long been admired for its astonishing capabilities. This roughly three-pound organ is
the most complex biological organ there is. It may seem perplexing that our understanding of
this unique entity is far from complete, yet it is what makes us human. It gives us the aptitude for art,
music, literature, science, moral judgment and rational thought. It is accountable for every individual's
personality, memories, movement and perception of the environment. Making sense of the mind-boggling
complexity of the brain is a daunting task. Carl Sagan, in "Cosmos", tried to capture this
when he said: "The brain is a very big place in a very small space." It is human nature to be curious and
to attempt to understand the environment and the events around us. Thus, it is only natural for the brain to
attract enormous attention. The science of the study of the brain is called neuroscience. The brain
contains highly specialized cells called neurons, about 100 billion of them. There are also support
cells in the brain, called glia or neuroglia, and there are more than 10 times as many of these as there are
neurons. These glial cells provide structural support.
From the neuronal level we can go down to cell biophysics and to the molecular biology of gene
regulation. From the neuronal level we can go up to neuronal circuits, to cortical structures, to the
whole brain, and finally to the behavior of the organism. For the purposes of this thesis, however, we look
briefly into neurons and then scale up to the basic neuronal circuitry in the cortical structure.
2.1 NEURONS
Neurons are the basic processing units, and the brain consists of billions of these highly specialized cells
connected together. The points of connection between neurons are called synapses, a term coined in
1897 by the British scientist Charles Sherrington. Neurons are remarkable among the cells of the body in
their ability to propagate signals rapidly over large distances. They do this by generating characteristic
electrical pulses called action potentials, or more simply spikes, which can travel down nerve fibers [1].
Neurons consist of three functional components: the cell body, the dendrites and the axon.
Figure 2.1: Complete diagram of a typical myelinated vertebrate neuron cell. (1)
The cell body or soma of a typical cortical neuron ranges in diameter from about 10 to 50 μm. It
contains the nucleus and it is isolated by the cell membrane. The dendrites are the input pathways. The
multiplicity of the dendrites and their elaborate branching ensures that a typical neuron can receive
thousands of signals from other neurons. The axon of a neuron acts as the output device. A neuron
usually has one axon, which grows off from the cell body. The start of the protrusion is called the axon
hillock, and the end may split into several branches. The main purpose of the axon is the propagation of
electrical signals away from the cell body.
Axons from single neurons can traverse large fractions of the brain or, in some cases, of the entire body,
although most of the branches or synaptic terminals of the axon connect to neurons in the immediate
neighborhood. In the mouse brain, it has been estimated that cortical neurons typically send out a total
of about 40 mm of axon and have approximately 4 mm of total dendritic cable in their branched
(1) This image has been released into the public domain by its author, LadyofHats.
http://en.wikipedia.org/wiki/File:Complete_neuron_cell_diagram_en.svg
dendritic trees. The axon makes an average of 180 synaptic connections with other neurons per mm of
length while the dendritic tree receives, on average, 2 synaptic inputs per μm [1]. Most axons are
covered with a protective sheath of myelin, a substance made of fats and protein, which insulates the
axon. Myelinated axons conduct neuronal signals faster than do unmyelinated axons.
The junction between neurons is called a synapse. The neuron which sends out a signal is called the
pre-synaptic neuron, and the receiving one is called the post-synaptic neuron. Electrical activity is one of
the important features that set neurons apart from other types of cells. Neurons don't function in
isolation; they are organized into circuits that process specific kinds of information, and they depend on
their connectivity with other neurons to carry out even the simplest of functions. Typical neurons in the
human brain are connected to on the order of 10,000 other neurons, with some types of neurons having
more than 200,000 connections, and the computational ability of the brain is made possible in no small
way by these extensive interconnectivities [2]. There are two types of neurons, based on whether they
encourage a spiking response or not: excitatory and inhibitory neurons. As already stated, a neuron makes
synaptic connections with thousands of other neurons, which implies that it gets inputs (electrical
impulses) via its dendrites from multiple sources and then responds (makes a decision upon integration
of the inputs) accordingly, either by firing a spike or not.
Neuronal responses are all or nothing, which means that the spikes produced during firing are of
constant amplitude, though not constant frequency. A neuronal response to inputs from its synaptic
connections can take the form of an intermittent train of spikes spaced by rest or relaxation periods; such
a train of spikes is called bursting. The distinguishing factor in a spiking response to integrated inputs is
encoded in the spiking pattern, i.e. the frequency and sequence. The response of the neuron is then sent
down its axon to thousands of other neurons. The circuitry of the cerebral cortex is that which is of
interest in this study and will be elaborated on later. We now examine, in summary, the
electrophysiology of the neuron, which enables it to perform its function.
2.1.1 ELECTROPHYSIOLOGY OF A NEURON
Thousands of spikes arrive at the soma of a typical spiking neuron, but it doesn't spike in response to
them all. How then does a neuron decide whether to respond by generating an action potential or not?
The answer lies in the structure of the soma. Figuratively, the soma is the "central processing unit" that
performs an important nonlinear processing step. If the total input exceeds a certain threshold, then an
output signal is generated. The output signal is taken over by the "output device", the axon, which
delivers the signal to other neurons [3]. When a neuron generates a signal, it is said to have fired.
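This threshold picture of the soma can be caricatured in a few lines of code. The function below only illustrates the decision rule just described (weighted inputs summed against a threshold); the weights and threshold are arbitrary illustration values, and real somatic integration is of course also nonlinear in time.

```python
def soma_output(inputs, weights, threshold=1.0):
    """Caricature of somatic integration: sum the weighted inputs and
    fire (return 1) only if the total exceeds the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0

# Three active synapses: two excitatory (positive weights), one
# inhibitory (negative weight). The total 0.8 + 0.6 - 0.3 = 1.1
# exceeds the threshold, so this neuron fires.
fired = soma_output(inputs=[1, 1, 1], weights=[0.8, 0.6, -0.3])  # -> 1
```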
2.1.1.1 Membrane Potential and Nernst Equation
Electrical activity is sustained and propagated via ionic currents through the cell membrane of
the soma. The important factor is the electrochemical gradient between the interior of a neuron and
the surrounding extracellular medium. The cell membrane contains ionic channels that permit a
two-way movement of ions. The predominant ions in any ionic current are Sodium (Na+), Potassium
(K+), Calcium (Ca2+) and Chloride (Cl-). Ionic channels regulate the flow of these ions by opening and
closing in response to voltage fluctuations and to internal and external signals.
The concentrations of these ions in the interior of the cell differ from those of the surrounding medium,
and these electrochemical gradients are the driving force of neural activity. The medium surrounding the
neuron has a high concentration of Na+ and Cl- and also a relatively high concentration of Ca2+. The
interior, however, has a high concentration of K+. Ions diffuse down their concentration gradients in an
attempt to counter the imbalance. K+ in the cell body diffuses out of the soma, thus producing an
outward current that leaves the cell with a net negative charge. The negative and positive charges
accumulate on the opposite sides of the cell membrane, thereby creating what is called the trans-membrane
potential or membrane voltage. This potential slows down further diffusion of K+, as it is attracted to
the negative interior and repelled by the positive exterior. A point is reached where the positive exterior
balances the negative interior, i.e. the concentration gradient and the electric potential exert equal and
opposite forces that counterbalance each other, and the net cross-membrane current is zero. The value
of such an equilibrium potential varies for different ionic species and is given by the Nernst equation:
    E_ion = (R T / z F) ln([Ion]_out / [Ion]_in)

where [Ion]_in and [Ion]_out are the concentrations of the ion inside and outside the cell, respectively;
R is the universal gas constant (8,315 mJ/(K·mol)); T is the temperature in degrees Kelvin; F is
Faraday's constant (96,480 coulombs/mol); and z is the valence of the ion.
2.1.1.2 Hyperpolarization and Depolarization
Under resting conditions, the potential inside the cell membrane is about -70 mV relative to
that of the surrounding bath (which is conventionally taken to be 0 mV), and thus the cell is said to be
polarized [1]. Ionic pumps located on the cell membrane maintain this membrane potential difference.
When positive ions flow out of the cell, or negative ions flow into the cell through open channels, the
membrane potential becomes more negative, thus causing the cell to be more polarized; this is called
hyperpolarization. Conversely, current flow into the cell changes the membrane potential, driving it
to be less negative or even positive; the cell is then said to be depolarized. When depolarization
persists above a certain critical potential called the threshold, a positive feedback process is
initiated and a spike is generated. Thus action potentials are caused by depolarization of the cell
membrane beyond threshold. This depolarization can be brought about by a variety of ionic currents. It
should be noted that depolarizing the membrane will not lead to an action potential until it crosses
the threshold; therefore action potentials are said to be "all or none". The generated nerve impulse is
then transmitted down the axon. The insulation provided by myelin maintains the ionic charge over long
distances. Nerve impulses are regenerated at specific points along the myelin sheath; these points are
called the nodes of Ranvier.
2.1.1.3 Action Potential
As previously stated, the Nernst potential for each ion is the potential that opposes its passage
across the cell membrane. An attempt is made here to show the relationship between the Nernst
potentials of these ions and the membrane potential, and how their interplay results in an action
potential.
2.1.1.4 Ionic Currents and Conductances
The membrane of the soma is selectively permeable to ionic currents. It has specialized proteins which
act as pumps for specific ions. If we take a characteristic neuron with a potassium-sodium pump, a
potassium channel and a sodium channel, the pump works continually to maintain the concentration
gradients. Using the Nernst equation at body temperature (37 °C), E_K is about -80 mV and E_Na is
about 62 mV. If both channels are closed, then the membrane potential V would be 0. Now assuming
we open the potassium channels while those of sodium stay closed, K+ would flow out of the cell down
the concentration gradient, thus driving the interior of the cell to a negatively charged state, and at
equilibrium V = E_K = -80 mV.
If we represent the K+ current that drove V from 0 to -80 mV as I_K, and taking that the number of open
channels is proportional to an electric conductance g_K, then we know that I_K would flow as long as V
differs from E_K. The driving force on K+ is thus the difference between V and E_K, i.e. (V - E_K);
thus:

    I_K = g_K (V - E_K)

More generally,

    I = g (V - E)                                                    (3)

It is noteworthy that as soon as V = E_K, I_K stops flowing, although the potassium channels are still
open. Now, calculating from (3) above, we see that the driving force on Na+ is quite high, as the
membrane potential is quite negative compared to the sodium equilibrium potential; but despite this,
there is no inflow of Na+ because the sodium channels are not open. If we open the sodium
channels, the membrane becomes more permeable to Na+, and due to the strong driving force Na+
would flow into the cell, thus depolarizing it. Taking the membrane to be more permeable to sodium
than to potassium, Na+ would depolarize the membrane until V approaches E_Na, 62 mV.
Figure 2.2: Ion flow regulation by special cell membrane protein channels (2)

Therefore, by the switching of the permeability of the membrane from K+ to Na+, the membrane
potential was rapidly reversed. Thus, if the depolarization persists above the threshold, an action
potential is generated, which corresponds to saying that a change of phase (a bifurcation) has occurred.
Now, if the sodium channels were to suddenly close while those of potassium remain open, the
membrane suddenly becomes more permeable to K+; thus it flows again out of the cell, driving it
towards a polarized state until V = E_K again.

(2) This image has been released into the public domain by its author, LadyofHats.
http://en.wikipedia.org/wiki/File:Complete_neuron_cell_diagram_en.svg
Taking into account all the other ionic currents (Na+, Ca2+ and Cl-), and following Kirchhoff's law,
which stipulates that the total current I flowing across the membrane be equal to the sum of the
membrane's capacitive current C dV/dt and all the ionic currents, we have:

    I = C dV/dt + I_K + I_Na + I_Ca + I_Cl                           (4)

where dV/dt is the derivative of the membrane voltage with respect to time; the capacitive term arises
because it takes time to charge the membrane.
Rewriting each of the currents in the form of (3) and then making the capacitive current the subject of
the relation, we have a dynamical equation of the form [4]:

    C dV/dt = I - g_K (V - E_K) - g_Na (V - E_Na) - g_Ca (V - E_Ca) - g_Cl (V - E_Cl)
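To make the dynamical equation concrete, here is a minimal forward-Euler integration of it. The conductances and reversal potentials are fixed, arbitrary illustration values (real conductances are voltage- and time-dependent, as the Hodgkin-Huxley model of Chapter 4 describes), so the only claim of this sketch is the structure of the equation itself.

```python
C = 1.0  # membrane capacitance (arbitrary units)

# (conductance, reversal potential in mV): illustrative constants only.
channels = {
    "K":  (0.40, -80.0),
    "Na": (0.04,  62.0),
    "Cl": (0.02, -68.0),
}

def dV_dt(V, I_ext=0.0):
    """Right-hand side of C dV/dt = I - sum over ions of g*(V - E)."""
    ionic = sum(g * (V - E) for g, E in channels.values())
    return (I_ext - ionic) / C

def simulate(V0=0.0, I_ext=0.0, dt=0.01, steps=20000):
    """Forward-Euler integration of the membrane equation."""
    V = V0
    for _ in range(steps):
        V += dt * dV_dt(V, I_ext)
    return V

# With no injected current the voltage relaxes to the conductance-weighted
# mean of the reversal potentials, the resting potential of this toy membrane.
V_rest = simulate()
```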
2.2 THE SYNAPSE
Neurons interact via synapses. The site where the axon of the presynaptic neuron makes contact with
the dendrites of the postsynaptic neuron is termed the synapse. A synapse can be either chemical or
electrical.
Figure 2.3: Diagram of the synapse showing the on-site activities during signal transfer
At a chemical synapse, the axonal terminals are not directly in contact with the dendrites of the target
postsynaptic neuron; they come very close to them, leaving a very tiny gap known as the synaptic cleft.
The synaptic cleft, however, is not simply a space to be traversed; rather, it is the site of extracellular
proteins that influence the diffusion, binding, and degradation of molecules secreted by the presynaptic
terminal [5].
2.2.1 Synaptic Transmission
When an action potential arrives at the axonal terminals, it triggers small presynaptic vesicles in the cell.
These vesicles contain chemicals called neurotransmitters, which are released into the synaptic
cleft. As soon as these neurotransmitters reach the postsynaptic neuron, they are detected by
specialized receptors in the membrane, causing ion channels to open. Depending on the nature of the
change in potential that the resulting inflow of ions brings about in the postsynaptic neuron, a
synapse can have either an excitatory (depolarizing) effect or an inhibitory (typically hyperpolarizing)
effect on the postsynaptic neuron. When the effect is depolarizing, the potential induced in the
postsynaptic neuron is called an excitatory postsynaptic potential (EPSP); otherwise, it is called an
inhibitory postsynaptic potential (IPSP). The nature of the change will then be read out at the axon
hillock as either "all or none". Thus, we have two types of synapses depending on their influence on the
postsynaptic neuron, i.e. whether they encourage firing or discourage it. They are called excitatory and
inhibitory synapses.
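A common textbook idealization of an EPSP or IPSP is an instantaneous jump that decays exponentially. The sketch below uses that idealization; the amplitudes and the 5 ms time constant are arbitrary illustration values, not the kinetics of any particular receptor.

```python
import math

def psp(t_ms, weight, tau_ms=5.0):
    """Postsynaptic potential t_ms milliseconds after a presynaptic
    spike, modelled as a jump of size `weight` that decays with time
    constant tau_ms. weight > 0 gives an EPSP (depolarizing),
    weight < 0 an IPSP (hyperpolarizing)."""
    if t_ms < 0.0:
        return 0.0  # no effect before the spike arrives
    return weight * math.exp(-t_ms / tau_ms)

epsp_peak = psp(0.0, 0.5)    # peak of an excitatory PSP
ipsp_peak = psp(0.0, -0.2)   # peak of an inhibitory PSP
```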
2.2.2 Synaptic Weight
Connections between neurons can be reinforced or weakened, depending on the frequency of signal
transmission via those connections. The strength of a connection is called its weight. Synaptic
weights bear the signatures of previous interactions, indicating that their strength is a
reminiscence or "memory" of the history of activity. Connections that are rarely used slowly decay,
making it less likely that signals will be transmitted via them, while paths with a high
frequency of activity are strengthened, thus increasing the probability of signals being
transmitted through them.
The ability of synaptic connection sites to modify their strength according to the frequency of use is
called synaptic plasticity. Since the 1970s, a large body of experimental results on synaptic plasticity has
been accumulated. Many of these experiments are inspired by Hebb's postulate, which describes how
the connection between two neurons should be modified [3]. He specified that if neuronal activity
patterns correspond to behavior, then the stabilization of specific patterns implies the learning of
specific types of behaviors [6].
The plastic property of synaptic connections is quite crucial: scientists have been able to confirm
experimentally that it determines to a great extent the manner in which the brain processes the pool
of information it takes in through the senses, and it is as such associated with learning and memory [7,8].
In neural network simulations, this property is mimicked by a variable called the connection weight,
which is used in learning processes.
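The two effects described above, strengthening with correlated use and slow decay with disuse, can be written as a one-line Hebbian-style update on a connection weight. The learning and decay rates are arbitrary illustration values; this is a sketch of the principle, not the specific plasticity rule used later in the thesis.

```python
def hebbian_update(w, pre, post, lr=0.01, decay=0.001):
    """One update step on a single connection weight w.

    Strengthen the weight when pre- and postsynaptic activity coincide
    (Hebb's postulate), and let every weight decay slowly so that
    rarely used connections fade away."""
    return w + lr * pre * post - decay * w

w_used = hebbian_update(0.5, pre=1.0, post=1.0)  # coincident activity: grows
w_idle = hebbian_update(0.5, pre=0.0, post=0.0)  # no activity: decays
```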
2.3 THE NEOCORTEX
The neocortex is the portion of the brain responsible for language, perception, imagination,
mathematics, art, music, planning, and all other aspects necessary for an intelligent system [100]. It has
been and still is the focus of intense research for decades. It holds virtually all our skills, memories,
knowledge and experiences. Approximately 75-85% of the neurons in the neocortex are pyramidal
cells (pyramid-shaped), characterized by a broad base at the bottom, and an apex that points upwards
to the cortical surface. The neurotransmitter of pyramidal neurons is glutamate, which is excitatory.
Most of the axons in the neocortex connect pyramidal neurons with other pyramidal neurons. A large
pyramidal neuron may have 20,000 synapses (the average neocortical neuron has 6,000). Non-pyramidal
neurons in the neocortex are referred to collectively as interneurons. Most of these interneurons
(smooth stellate, basket cells, chandelier cells and double bouquet cells) use the inhibitory
neurotransmitter gamma-amino butyric acid (GABA). The different types of smooth interneurons are
characterized by their axonal ramifications, particular synaptic connectivity, and the expression of a
variety of cotransmitters, neuroactive peptides, or calcium-binding proteins [9]. The other common
interneuron is the spiny stellate cell, which is excitatory. The average cortical neuron is idle 99.8% of the
time. Anatomically, a cortical column consists of six layers. It extends from the surface of the cortex
(layer I) down to the base of the cortex (layer VI) [10].
Figure 2.4: Layers in the Neocortex
Lamina I - Molecular layer: This superficial layer consists of terminal branches of dendrites or axons. It
consists of almost no neuron cell bodies.
Lamina II - External granular layer: This layer contains small pyramidal cells and interneurons.
Lamina III - External pyramidal layer: This layer contains typical pyramidal cells (PC) whose projections
are primarily associational or commissural. Communication between other cortical areas originates here
and in lamina II.
Lamina IV - Internal granular layer: This layer is dominated by closely arranged spiny stellate cells (SSCs)
and is a primary target of thalamic afferents. It tends to be the thickest in primary sensory cortex and it
is virtually missing in the motor cortex. Its thickness is such that it is further divided into 4A, 4B, and 4C.
Cell density is also very high in the primary visual cortex.
Lamina V - Internal pyramidal layer: This layer contains larger pyramidal cells that are often the long-projecting neurons, intermingled with interneurons.
Lamina VI - Multiform layer: This layer consists of pyramidal neurons and neurons with spindle-shaped bodies. Most cortical outputs leading to the thalamus originate here.
Usually, neurons traverse multiple cortical layers but are said to belong to the layer in which their cell body is situated. This, however, fails to capture the locations of their axons or dendrites. It is not unusual for two neurons to make a synaptic connection in a layer in which their cell bodies are not located.
PCs and SSCs are excitatory neurons. PCs are situated in layers II to VI and SSCs are situated within layer IV of the primary sensory cortex areas. They both have spiny dendrites and are thus called spiny cells. SSCs lack the vertically oriented apical dendrite of the PCs, so they do not get input from the superficial layers and only establish local intra-cortical connections.
The basic unit of the mature neocortex is the minicolumn, a narrow chain of neurons extending vertically across the cellular layers II–VI, perpendicular to the pial surface [11]. In the 1950s Vernon Mountcastle provided physiological evidence for the columnar organization of the
somatic sensory cortex of the cat and monkey [12]. Columnar organization allows for intermittently
recursive mapping, so that two or more variables can be mapped to the single x–y dimension of the
cortical surface [13, 14, 15]. An important distinction is that the columnar organization is functional by
definition, and reflects the local connectivity of the neocortex. Connections up and down within the
thickness of the cortex are much denser than connections that spread from side to side. The neocortex
is hierarchically structured [16]. This implies that there is an abstract sense of above and below in the
hierarchy. There are afferent paths from sensory inputs to the lower cortical regions (lower levels of the
hierarchy) and from the lower cortical regions, information flows to the higher ones [17].
The neocortex contains millions of these cortical columns and it owes its structural uniformity to the
seemingly identical nature of these functional components [11]. Because of this property, the regions of
the cortex that handle auditory inputs appear very similar to the regions that handle visual and other
inputs. This uniformity suggests that even though different regions specialize in different tasks, they
employ the same underlying algorithm [17].
2.4.1 Cortical Columns
Half a century ago, Mountcastle et al. (1955) made an observation while recording from cat somatosensory cortex. They noted that all cells in a given vertical electrode penetration responded either to superficial (skin, hair) or deep (joint, fascia) stimulation. It appeared that for a common receptive field location (e.g., the cat's foreleg), cells were segregated into domains representing different sensory modalities.
These cortical columns are the basic functional unit of the neocortex [18]. They vary between 300 and 600 μm in transverse diameter, and do not differ significantly in size between brains that vary in size over three orders of magnitude [19]. They are made up of minicolumns, which are bound together by many short-range horizontal connections to form cortical columns [11]. The main function of individual minicolumns is to enhance contrast. Connectivity between modules binds different experiences into a cognitive whole.
Cortical columns make horizontal connections with adjacent columns and also with columns far across
the cortex. They monitor activities of nearby columns within the same network with these horizontal
connections and can modify the synaptic connections of their neurons to identify features from the data
that are not being detected from the other columns [20]. The horizontal connections have been
hypothesized to be responsible for dimensionality reduction, which plays a role during the learning
process [17].
Cortical columns are vertical arrangements (not necessarily of circular cross-section) of neurons that have similar response properties. They have distinctive patterns of circuitry. The vast majority of the wiring is local, that is, between neurons within the same column, and a minority of the wiring establishes connections between columns [21]. The former type constitutes intracolumnar wiring and the latter intercolumnar wiring.
As pointed out by Silberberg et al [22], cortical microcircuits show stereotypy; referring to a repeating
pattern of structural and/or functional features in terms of cell types, cell arrangements and patterns of
synaptic connectivity.
Attempts are being made to model systems that mimic the computational efficiency of the neocortex, owing to its many impressive properties such as attention and plasticity. It remains an open question to what extent neuronal physics and cortical architecture could account for the exquisite computational abilities of the human brain, and how to derive from them templates of efficient computation [23].
The brain is intrinsically a dynamic system, in which the traffic between regions, during behavior or even at rest, continuously creates and reshapes complex functional networks of correlated dynamics. Like every other complex network, it is a pool of nodes and links with nontrivial topological properties [24, 25]. It is the nature of complex systems that details at a lower level do not necessarily explain the whole at a higher level. As such, it is a rational pursuit to strip down, as it were, this biological processing unit to its basic functional components and then study these components. In this study, as in some previous studies, e.g., refs [23, 17], computer models of the cortical columns, a typical input/output information processing unit, correspond to the nodes of the network, and the wiring (both intra- and inter-columnar), governed by some connection probability law, corresponds to its links.
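The columns-as-nodes modeling idea can be sketched in a few lines of Python. The distance-dependent rule below is a hypothetical connection-probability law chosen purely for illustration (the parameter names `p_intra` and `decay` are assumptions, not the thesis's actual law):

```python
import math
import random

random.seed(1)

def build_column_network(n_columns, p_intra=0.8, decay=1.0):
    """Columns indexed along a line; the probability of a link between two
    columns decays exponentially with their separation (illustrative law)."""
    edges = set()
    for i in range(n_columns):
        for j in range(i + 1, n_columns):
            p = p_intra * math.exp(-decay * (j - i))
            if random.random() < p:
                edges.add((i, j))  # undirected edge between columns i and j
    return edges

edges = build_column_network(50)
print(len(edges))  # number of links realized under this probability law
```

Nearby columns are wired densely (intracolumnar-like links) while distant pairs are only occasionally connected, mirroring the intra-/inter-columnar distinction above.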
CHAPTER 3
NETWORK THEORY
Network theory is an area of applied mathematics and part of graph theory. It concerns itself with the study of graphs as a representation of either symmetric relations or, more generally, asymmetric relations between discrete objects. A network thus is a set of items, which we will call nodes, which have connections between them, called edges (see Figure 3.1).
Networks are all around us, and we are ourselves, as individuals, the units of a network of social relationships of different kinds and, as biological systems, the delicate result of a network of biochemical reactions [26]. A network could be abstract, as in the texture of collaboration between individuals or the interactions between friends or family members. It could be concrete, as in a transportation network, a pipeline network, a power grid, etc.
Figure 3.1: A simple network showing sample connectivity
3.1 BRIEF HISTORICAL BACKGROUND
The study of networks, in the form of mathematical graph theory, is one of the fundamental pillars of discrete mathematics. Euler's celebrated solution of the Königsberg bridge problem is often cited as the first true proof in the theory of networks, and during the twentieth century graph theory developed into a substantial body of knowledge. The field has since blossomed into a very interesting and dense one: from the social network studies of the early 1920s, which focused on the interactions typified in social entities, from communication between members of a group to trade among nations, to the most recent catalyst, the work of Watts and Strogatz published in 1998, one can clearly see the evolution of approaches in network theory. Also noteworthy are the varied backgrounds of the different people who have contributed to the development of the field. The work of Watts and Strogatz, which appeared in Nature, and that of Barabási and Albert on scale-free networks, which appeared in Science a year later, played a key role in the birth of a novel movement of interest and research in the study of complex networks [26].
Quoting from the work of Albert and Barabási (see ref [27]), the dramatic advances that have been witnessed in the past few years have been prompted by several parallel developments. First, the computerization of data acquisition in all fields led to the emergence of large databases on the topology of various real networks. Second, the increased computing power allowed us to investigate networks containing millions of nodes, exploring questions that could not be addressed before. Third, the slow but noticeable breakdown of boundaries between disciplines offered researchers access to diverse databases, allowing them to uncover the generic properties of complex networks. Finally, there is an increasingly voiced need to move beyond reductionist approaches and try to understand the behavior of the system as a whole.
3.2 NETWORKS IN THE REAL WORLD
As stated earlier, we are surrounded by networks of various types. Recent work on the mathematics of networks has been driven largely by observations of the properties of actual networks and attempts to model them. Broadly speaking, these networks can be categorized into four types, which are briefly discussed below.
3.2.1 Social networks
A social network is a set of people or groups of people with some pattern of contacts or interactions between them [28]. Patterns of friendship, sexual contacts within a select group, and inter-marriages between families, to mention a few, are all examples of social networks. An important contribution to social network analysis came from Jacob Moreno, who introduced sociograms in the 1930s. The famous "six degrees of separation" concept, which is a manifestation of small worlds, was uncovered by the social psychologist Stanley Milgram [29].
3.2.2 Information Networks
Information networks are also called knowledge networks. One of the most popular examples is the network of citations between papers. Most learned articles cite previous work by others on related topics. These citations form a network in which the vertices are articles and a directed edge from article A to article B indicates that A cites B [30]. The World Wide Web is another example of this kind of network. It represents the largest network for which topological information is currently available [27]. The nodes of the network are the documents (web pages) and the edges are the hyperlinks (URLs) that point from one document to another. Other examples include peer-to-peer networks and citations between US patents.
3.2.3 Technological Networks
These are man-made networks which are typically designed for the distribution of some sort of
commodity between locations of interest. A classic example is the London rail network which links key
parts of the city in an efficient manner to facilitate the movement of people from one point to the other.
Other examples include, but are not limited to, the power grid, the network of airports with linked flights, road networks, and gas pipe networks.
Figure 3.2: The World-Wide Web
Figure 3.3: Citation distribution from 783,000 papers [31]
3.2.4 Biological Networks
A number of biological systems can be successfully represented by networks. Perhaps the classic
example of a biological network is the network of metabolic pathways, which is a representation of
metabolic substrates and products with directed edges joining them if a known metabolic reaction exists
that acts on a given substrate and produces a given product. Another example is the protein-protein
interaction. Metabolic reactions are catalyzed and regulated by enzymes. Most enzymes are proteins,
which are biological polymers (chain of simple units, called monomers) constructed from 20 different
kinds of amino acids. The function of a protein is determined by a three-dimensional folding structure
attributed by the sequence of amino acids. Proteins have modular units called domains, which have
compactly folded globular structures. Domains are often connected with open lengths of polypeptide
chains. Furthermore, individual proteins often serve as subunits for the formation of larger molecules,
called protein assemblies or protein complexes. Proteins are synthesized by ribosomes. The sequence of
amino acids for a specific protein is determined exactly by the information encoded in the matching
gene [26].
Neural networks are among the most researched networks today. This is the study of the web-like network of the brain, which is examined at different levels and scales in order to understand its structural and functional make-up. Some of the research foci are: understanding the neural basis of diseases associated with the nervous system, neural coding, and neural information processing. Neural networks, which are the central theme of this thesis, will be elaborated on later. Other examples include gene regulatory networks, the cell cycle, disease networks (studies offer network-based explanations for complex disorders: a phenotype correlating with the malfunction of a particular functional module), and food-chain networks.
3.3 NETWORK THEORY PROPERTIES
3.3.1 Definitions and Notations
A graph G = (N, L) consists of two sets N and L, such that N is non-empty and L is a set of unordered (ordered) pairs of elements of N. The elements of N are the nodes (or vertices, or points) of the graph G, while the elements of L are its links (or edges, or lines). The numbers of elements in N and L are denoted by N and K, respectively. A graph is thus indicated as G(N, L), or simply G(N, K). A node is usually referred to by its order i in the set N. A graph G' = (N', L') is a subgraph of G = (N, L) if N' ⊆ N and L' ⊆ L. If G' contains all links of G that join two nodes in N', then G' is said to be the subgraph induced by N' and is denoted G' = G[N']. A subgraph is said to be maximal with respect to a given property if it cannot be extended without losing that property. Of particular relevance for some of the definitions given in the following subsections is the subgraph of the neighbors of a given node i, denoted as G_i. G_i is defined as the subgraph induced by N_i, the set of nodes adjacent to i.
The following are definitions of some terminologies.
Directed/undirected: An edge is directed if it runs in only one direction (such as a one-way road between two points), and undirected if it runs in both directions. In an undirected graph, each of the links is defined by a couple of nodes i and j, and is denoted as (i, j) or l_ij. The link is said to be incident in nodes i and j, or to join the two nodes; the two nodes i and j are referred to as the end-nodes of link (i, j). Two nodes joined by a link are referred to as adjacent or neighboring. In a directed graph, the order of the two nodes is important: l_ij stands for a link from i to j, and l_ij ≠ l_ji.
Figure 3.4: Directed and Undirected Graphs showing number of links for a select node.
Degree: The number of edges connected to a node. Note that the degree is not necessarily
equal to the number of nodes adjacent to a node, since there may be more than one edge
between any two nodes. A directed graph has both an in-degree and an out-degree for each
node, which are the numbers of in-coming and out-going edges respectively.
Component: The component to which a node belongs is the set of nodes that can be reached from it by paths running along edges of the graph. In a directed graph a node has both an in-component and an out-component, which are, respectively, the sets of nodes from which the node can be reached and which can be reached from it.
Geodesic path: A geodesic path is the shortest path through the network from one node to
another. Note that there may be and often is more than one geodesic path between two nodes.
Diameter: The diameter of a network is the length (in number of edges) of the longest geodesic
path between any two vertices.
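These basic notions can be made concrete with a small stdlib-only sketch (the example graph and node labels are arbitrary illustrative choices): an undirected graph stored as an adjacency dict, with degree and geodesic path length computed by breadth-first search.

```python
from collections import deque

# Undirected graph as an adjacency dict (node -> set of neighbours).
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
    "E": set(),  # isolated node: a component of its own
}

def degree(g, node):
    """Number of edges incident with the node."""
    return len(g[node])

def geodesic_length(g, src, dst):
    """Shortest-path length via breadth-first search; None if unreachable."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        if u == dst:
            return dist[u]
        for v in g[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return None

print(degree(graph, "C"))                # 3
print(geodesic_length(graph, "A", "D"))  # 2 (A-C-D)
print(geodesic_length(graph, "A", "E"))  # None: E is in another component
```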
3.3.2 Node degree, degree distribution and correlations
The degree or connectivity k_i of a node i is the number of edges incident with the node, and is defined in terms of the adjacency matrix A as:

\[ k_i = \sum_{j \in N} a_{ij} \]

If the graph is directed, the degree of the node has two components: the number of outgoing links k_i^out = \sum_j a_{ij} (known as the out-degree of the node), and the number of ingoing links k_i^in = \sum_j a_{ji} (known as the in-degree of the node). The total degree is therefore defined as k_i = k_i^out + k_i^in. A list of the node degrees of a graph is called the degree sequence.
The most basic topological characterization of a graph G can be obtained in terms of the degree distribution P(k), which is defined as the probability that a node chosen uniformly at random has degree k or, equivalently, as the fraction of nodes in the graph having degree k; the variable k assumes non-negative integer values. For directed networks, one needs to consider two distributions, P(k^in) and P(k^out). Information on how the degree is distributed among the nodes of an undirected network can be obtained either by a plot of P(k) or by the calculation of the moments of the distribution. The n-th moment of P(k) is defined as:

\[ \langle k^n \rangle = \sum_k k^n P(k) \]

The first moment ⟨k⟩ is the mean degree of G. The second moment measures the fluctuations of the connectivity distribution.
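The degree sequence, degree distribution, and moments can be computed directly from an adjacency matrix; a minimal sketch (the 4-node matrix is an arbitrary example):

```python
from collections import Counter

# Adjacency matrix of a small undirected example graph.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]

degrees = [sum(row) for row in A]                    # k_i = sum_j a_ij
N = len(degrees)
P = {k: c / N for k, c in Counter(degrees).items()}  # degree distribution P(k)

def moment(P, n):
    """n-th moment of the degree distribution: <k^n> = sum_k k^n P(k)."""
    return sum((k ** n) * p for k, p in P.items())

print(degrees)       # [2, 3, 2, 1]
print(moment(P, 1))  # mean degree <k> = 2.0
print(moment(P, 2))  # second moment <k^2> = 4.5
```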
3.3.3 Shortest path lengths, diameter and betweenness
Shortest paths play an important role in the transport and communication within a network. It is useful to represent all the shortest path lengths of a graph G as a matrix D in which the entry d_ij is the length of the geodesic from node i to node j. The maximum value of d_ij is called the diameter of the graph, and will be indicated in the following as Diam(G). A measure of the typical separation between two nodes in the graph is given by the average shortest path length, also known as the characteristic path length, defined as the mean of geodesic lengths over all couples of nodes:

\[ L = \frac{1}{N(N-1)} \sum_{i \neq j} d_{ij} \]

The communication of two non-adjacent nodes, say j and k, depends on the nodes belonging to the paths connecting j and k. Consequently, a measure of the relevance of a given node can be obtained by counting the number of geodesics going through it, and defining the so-called node betweenness. Together with the degree and the closeness of a node (defined as the inverse of the average distance from all other nodes), the betweenness is one of the standard measures of node centrality. More precisely, the betweenness b_i of a node i, sometimes referred to as the load, is defined as:

\[ b_i = \sum_{\substack{j \neq k \\ j \neq i \neq k}} \frac{n_{jk}(i)}{n_{jk}} \]

where n_jk is the number of shortest paths connecting j and k, while n_jk(i) is the number of shortest paths connecting j and k and passing through i.
The concept of betweenness can be extended also to the edges. The edge betweenness is defined as the number of shortest paths between pairs of nodes that run through that edge [32].
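Characteristic path length and node betweenness can both be obtained from breadth-first searches that also count shortest paths; a stdlib-only sketch on an illustrative path graph 0-1-2-3 (the graph and the pair-summation convention, unordered pairs j < k, are assumptions for the example):

```python
from collections import deque

def bfs_dist_sigma(g, s):
    """Distances and shortest-path counts (sigma) from source s."""
    dist, sigma = {s: 0}, {s: 1}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in g[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                q.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma

def betweenness(g, i):
    """b_i = sum over pairs j < k (j, k != i) of n_jk(i) / n_jk."""
    nodes = list(g)
    info = {s: bfs_dist_sigma(g, s) for s in nodes}
    b = 0.0
    for a in range(len(nodes)):
        for c in range(a + 1, len(nodes)):
            j, k = nodes[a], nodes[c]
            if i in (j, k):
                continue
            dist_j, sig_j = info[j]
            if k not in dist_j:
                continue  # j and k are in different components
            # i lies on a geodesic from j to k iff the distances add up
            if i in dist_j and k in info[i][0] and \
                    dist_j[i] + info[i][0][k] == dist_j[k]:
                b += sig_j[i] * info[i][1][k] / sig_j[k]
    return b

def char_path_length(g):
    """Mean geodesic length over all ordered pairs of nodes."""
    nodes = list(g)
    total = 0
    for s in nodes:
        dist, _ = bfs_dist_sigma(g, s)
        total += sum(d for v, d in dist.items() if v != s)
    n = len(nodes)
    return total / (n * (n - 1))

path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(betweenness(path, 1))      # 2.0: geodesics (0,2) and (0,3) pass through 1
print(char_path_length(path))    # 20/12 = 1.666...
```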
3.3.4 Clustering
In many networks it is found that if node A is connected to node B and node B to node C, then there is a heightened probability that node A will also be connected to node C. In terms of network topology, clustering (also called transitivity) means the presence of a heightened number of triangles in the network: sets of three nodes each of which is connected to the others. Transitivity can be quantified as

\[ T = \frac{3 \times \text{number of triangles in } G}{\text{number of connected triples of nodes in } G} \]

The factor 3 in the numerator compensates for the fact that each complete triangle of three nodes contributes three connected triples, one centered on each of the three nodes, and ensures that 0 ≤ T ≤ 1, with T = 1 for the complete graph K_N.
Another way is to define a clustering coefficient, as introduced by Watts and Strogatz (see ref [36]). A local clustering coefficient c_i of a node i is first introduced to express how likely it is that two neighbors j and m of node i are also connected, and can be defined by:

\[ c_i = \frac{\sum_{j,m} a_{ij} a_{jm} a_{mi}}{k_i (k_i - 1)} \]

The clustering coefficient of the graph is then given by the average of c_i over all the nodes of G:

\[ C = \langle c \rangle = \frac{1}{N} \sum_{i \in N} c_i \]

By definition, 0 ≤ c_i ≤ 1 and 0 ≤ C ≤ 1.
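The local clustering coefficient counts how many of the possible links among a node's neighbors actually exist; a minimal sketch on an illustrative graph (a triangle with one pendant node):

```python
def local_clustering(g, i):
    """c_i: fraction of possible links among the neighbours of i that exist."""
    nbrs = list(g[i])
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention: undefined c_i taken as zero
    links = sum(1 for a in range(k) for b in range(a + 1, k)
                if nbrs[b] in g[nbrs[a]])
    return 2.0 * links / (k * (k - 1))

def clustering(g):
    """Graph clustering coefficient C: average of c_i over all nodes."""
    return sum(local_clustering(g, i) for i in g) / len(g)

# Triangle 0-1-2 fully connected, node 3 attached only to node 2.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(local_clustering(g, 0))  # 1.0: both neighbours of 0 are linked
print(clustering(g))           # (1 + 1 + 1/3 + 0) / 4 = 7/12
```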
3.3.5 Graph Spectra
Any graph G with N nodes can be represented by its adjacency matrix A, with N × N elements a_ij, whose value is a_ij = a_ji = 1 if nodes i and j are connected, and 0 otherwise. The spectrum of graph G is the set of eigenvalues of its adjacency matrix A. A graph with N nodes has N eigenvalues λ_j (j = 1, ..., N), and it is useful to define its spectral density as

\[ \rho(\lambda) = \frac{1}{N} \sum_{j=1}^{N} \delta(\lambda - \lambda_j) \]

which approaches a continuous function as N → ∞.
The eigenvalues and associated eigenvectors of a graph are intimately related to important topological features such as the diameter, the number of cycles, and the connectivity properties of the graph.
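One piece of the spectrum, the largest eigenvalue of the adjacency matrix, can be estimated without any linear-algebra library via power iteration; this is a generic numerical sketch (not a method used in the thesis), shown on the complete graph K4, whose spectrum is {3, -1, -1, -1}:

```python
import math

def power_iteration(A, steps=200):
    """Largest-magnitude eigenvalue of a symmetric matrix by power iteration."""
    n = len(A)
    v = [1.0] * n
    lam = 0.0
    for _ in range(steps):
        # Multiply v by A, then renormalize.
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate.
        lam = sum(v[i] * sum(A[i][j] * v[j] for j in range(n))
                  for i in range(n))
    return lam

# Adjacency matrix of K4 (all-ones off the diagonal).
K4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
print(round(power_iteration(K4), 6))  # 3.0
```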
3.3.6 Graph Laplacian
One defines the graph Laplacian L̂ via

\[ (\hat{L})_{ij} = k_i \delta_{ij} - a_{ij} \]

where the (L̂)_ij are the elements of the graph Laplacian matrix, the a_ij are the elements of the adjacency matrix, and k_i = \sum_j a_ij is the degree of node i.
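Building the Laplacian from an adjacency matrix is a one-liner per entry: degree on the diagonal, minus the adjacency elsewhere. A minimal sketch on the path graph 0-1-2 (an arbitrary example); by construction every row sums to zero:

```python
def laplacian(A):
    """Graph Laplacian: L_ij = k_i * delta_ij - a_ij, with k_i the degree."""
    n = len(A)
    return [[(sum(A[i]) if i == j else 0) - A[i][j] for j in range(n)]
            for i in range(n)]

# Adjacency matrix of the path graph 0-1-2.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
L = laplacian(A)
print(L)                                # [[1, -1, 0], [-1, 2, -1], [0, -1, 1]]
print(all(sum(row) == 0 for row in L))  # True: rows of a Laplacian sum to zero
```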
3.4 NETWORK MODELS
3.4.1 Random Graphs
The theory of random graphs was introduced by Paul Erdős and Alfréd Rényi, who discovered that probabilistic methods were often very useful in tackling problems in graph theory. In their first article, Erdős and Rényi proposed a model to generate random graphs with N nodes and K links, which we will henceforth call Erdős–Rényi (ER) random graphs and denote as G(N, K). Starting with N disconnected nodes, ER random graphs are generated by connecting couples of randomly selected nodes, prohibiting multiple connections, until the number of edges equals K [8]. We emphasize that a given graph is only one outcome of the many possible realizations, an element of the statistical ensemble of all possible combinations of connections. For the complete description of G(N, K) one would need to describe the entire statistical ensemble of possible realizations, that is, in the matricial representation, the ensemble of adjacency matrices [34]. An alternative model for ER random graphs consists in connecting each couple of nodes with a probability 0 < p < 1. This procedure defines a different ensemble, denoted as G(N, p), containing graphs with different numbers of links: graphs with K links will appear in the ensemble with a probability p^K (1−p)^{N(N−1)/2 − K}.
Figure 3.4: Illustration of a random graph construction with a probability of connection p. (a) Initially the 20 nodes are isolated. (b, c) Pairs of nodes are connected with probability p; as p grows, more of the possible edges are realized.
ER random graphs are the best studied among graph models. The structural properties of ER random graphs vary as a function of p, showing, in particular, a dramatic change at a critical probability p_c ≃ 1/N, corresponding to a critical average degree
⟨k⟩_c = 1. Erdős and Rényi proved that [33, 35]:
• if ⟨k⟩ < 1, then almost surely, i.e. with probability tending to one as N tends to infinity, the graph has no component of size greater than O(ln N), and no component has more than one cycle;
• if ⟨k⟩ = 1, then almost surely the largest component has size O(N^{2/3});
• if ⟨k⟩ > 1, the graph has a component of O(N) nodes with a number O(N) of cycles, and no other component has more than O(ln N) nodes and more than one cycle.
The transition at ⟨k⟩ = 1 has the typical features of a second-order phase transition. In particular, if one considers as order parameter the size of the largest component, the transition falls in the same universality class as that of the mean field percolation transitions. The probability that a node i has k edges is the binomial distribution given as:

\[ P(k_i = k) = \binom{N-1}{k} p^k (1-p)^{N-1-k} \]
where p^k is the probability for the existence of k edges, (1−p)^{N−1−k} is the probability for the absence of the remaining N−1−k edges, and \binom{N-1}{k} is the number of different ways of selecting the end points of the k edges. Since all the nodes in a random graph are statistically equivalent, each of them has the same distribution, and the probability that a node chosen uniformly at random has degree k has the same form as P(k_i = k). For large N, and fixed mean degree ⟨k⟩, the degree distribution is well approximated by a Poisson distribution:

\[ P(k) = e^{-\langle k \rangle} \frac{\langle k \rangle^k}{k!} \]

For this reason, ER graphs are sometimes called Poisson random graphs. ER random graphs are, by definition, uncorrelated graphs, since the edges are connected to nodes regardless of their degree. Consequently, the conditional degree distribution P(k'|k) and the average nearest-neighbor degree are independent of k.
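The G(N, p) construction and the Poisson approximation are easy to check numerically; a stdlib-only sketch (the values N = 1000, p = 0.006, and the probed degree k = 6 are arbitrary illustrative choices):

```python
import math
import random

random.seed(42)

def er_graph(n, p):
    """G(n, p): connect each pair of nodes independently with probability p."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if random.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

n, p = 1000, 0.006
g = er_graph(n, p)
mean_k = sum(len(v) for v in g.values()) / n
print(round(mean_k, 3))  # close to the expected mean degree p(N-1) = 5.994

# Fraction of nodes with degree k versus the Poisson prediction.
k = 6
frac = sum(1 for v in g.values() if len(v) == k) / n
poisson = math.exp(-mean_k) * mean_k ** k / math.factorial(k)
print(round(frac, 3), round(poisson, 3))  # the two should be close
```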
Concerning the properties of connectedness, when p ≥ ln N / N, almost any graph in the ensemble G(N, p) is totally connected [33], and the diameter varies in a small range of values around ln N / ln⟨k⟩ [35]. The average shortest path length L has the same behavior as a function of N as the diameter, L ∼ ln N / ln⟨k⟩. The clustering coefficient of G(N, p) is equal to C = p = ⟨k⟩/N, since p is the probability of having a link between any two nodes in the graph and, consequently, there will be p k_i(k_i − 1)/2 edges among the neighbors of a node with degree k_i, out of a maximum possible number of k_i(k_i − 1)/2 [36]. Hence, ER random graphs have a vanishing C in the limit of large system size. For large N and fixed p, the bulk of the spectral density of ER random graphs converges to the distribution [27]:

\[ \rho(\lambda) = \begin{cases} \dfrac{\sqrt{4Np(1-p) - \lambda^2}}{2\pi Np(1-p)} & \text{if } |\lambda| < 2\sqrt{Np(1-p)} \\[2mm] 0 & \text{otherwise} \end{cases} \]

This is in agreement with the prediction of Wigner's semicircle law for symmetric uncorrelated random matrices [37]. The largest eigenvalue (λ_N) is isolated from the bulk of the spectrum, and it increases with the network size as pN. For pN constant, the spectral density deviates from the semicircle law, and its odd moments are equal to zero, indicating that the only way to return back to the original node is by traversing each edge an even number of times.
3.4.2 Small World Networks
A small-world network refers to an ensemble of networks in which the mean geodesic (i.e., shortest-path) distance between nodes increases sufficiently slowly as a function of the number of nodes in the network. The best known family of small-world networks was formulated by Duncan Watts and Steven Strogatz in a seminal 1998 paper [36] that has helped network science become a medium of expression for numerous physicists, mathematicians, computer scientists, and many others. In fact, the term "small-world networks" (or the "small-world model") is often used to mean Watts–Strogatz (WS) networks or variants thereof. Watts and Strogatz [36] (see the figure below) studied a simple model that can be tuned through this middle ground: a regular lattice where the original links are replaced by random ones with some probability p.
Figure 3.5: The random rewiring procedure of the Watts–Strogatz model, which interpolates between a regular ring lattice and a random network without altering the number of nodes or edges, starting with N = 20 nodes each connected to its four nearest neighbors. For p = 0 the original ring is unchanged; as p increases the network becomes increasingly disordered until for p = 1 all edges are rewired randomly [36].
They found that the slightest bit of rewiring transforms the network into a "small world", with short paths between any two nodes, just as in the giant component of a random graph. Yet the network is much more highly clustered than a random graph, in the sense that if A is linked to B and B is linked to C, there is a greatly increased probability that A will also be linked to C. The random graph produced from the rewiring process has the constraint that each node retains a minimum connectivity.
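The WS rewiring procedure and its hallmark behavior, that path length collapses while clustering survives, can be checked with a small stdlib-only simulation. This is a sketch of the construction, not the thesis's own code, and the parameters (400 nodes, 4 nearest neighbors, p = 0.1) are arbitrary illustrative choices:

```python
import random
from collections import deque

random.seed(0)

def ws_graph(n, k, p):
    """Ring of n nodes linked to k nearest neighbours; each forward edge's
    far end is rewired to a random node with probability p."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, k // 2 + 1):
            adj[i].add((i + d) % n)
            adj[(i + d) % n].add(i)
    for i in range(n):
        for d in range(1, k // 2 + 1):
            if random.random() < p:
                j = (i + d) % n
                new = random.randrange(n)
                while new == i or new in adj[i]:  # no self-loops or duplicates
                    new = random.randrange(n)
                adj[i].discard(j)
                adj[j].discard(i)
                adj[i].add(new)
                adj[new].add(i)
    return adj

def avg_path_length(adj):
    """Mean BFS distance over all ordered pairs of nodes."""
    n = len(adj)
    total = 0
    for s in adj:
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        total += sum(dist.values())
    return total / (n * (n - 1))

def avg_clustering(adj):
    """Average local clustering coefficient."""
    total = 0.0
    for i, nbrs in adj.items():
        nb = list(nbrs)
        if len(nb) < 2:
            continue
        links = sum(1 for a in range(len(nb)) for b in range(a + 1, len(nb))
                    if nb[b] in adj[nb[a]])
        total += 2.0 * links / (len(nb) * (len(nb) - 1))
    return total / len(adj)

ring = ws_graph(400, 4, 0.0)   # regular lattice: high L, high C
sw = ws_graph(400, 4, 0.1)     # slight rewiring: L collapses, C stays high
print(avg_path_length(sw) < avg_path_length(ring))
print(avg_clustering(sw) > 0.5 * avg_clustering(ring))
```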
The richness of the WS model has stimulated an intense activity aimed at understanding the network's properties as a function of the rewiring probability p and the network size N [38]. As observed in [36], the small-world property results from the immediate drop in L(p) as soon as p is slightly larger than zero. This is because the rewiring of links creates long-range edges (shortcuts) that connect otherwise distant nodes. The effect of the rewiring procedure is highly nonlinear on L: it not only affects the nearest-neighbor structure, but also opens new shortest paths to the next-nearest neighbors and so on. Conversely, an edge redirected from a clustered neighborhood to another node has, at most, a linear effect on C. That is, the transition from a linear to a logarithmic behavior in L(p) is faster than the one associated with the clustering coefficient C(p). This leads to the appearance of a region of small (but non-zero) values of p, where one has both small path lengths and high clustering.
The change in L(p) soon stimulated numerical and analytical work [39], aimed at inspecting whether the transition to the small-world regime takes place at a finite value of p, or if there is a crossover phenomenon at any finite value of N with the transition occurring at p = 0. This latter scenario turned out to be the case. To see this, we follow the arguments by Barrat and Weigt [39], and Newman and Watts [38]. We assume that p is kept fixed and we inspect the dependency of L on the system size N. For small system sizes, the number of shortcuts is less than 1 on average and the scaling of L is linear with the system size. However, for larger values of N, the average number of shortcuts eventually becomes greater than one and L starts scaling as ln N. Similar to the correlation length behavior in conventional statistical physics, the transition occurs at some intermediate system size N*. Additionally, close to the transition point, L should obey the finite-size scaling relation [13, 14]:

\[ L(N, p) \sim \frac{N}{K}\, F(pKN) \]

where F(u) is a universal scaling function that obeys F(u) ∼ const if u ≪ 1 and F(u) ∼ ln(u)/u if u ≫ 1. If one supposes that N* ∼ p^{−τ} as p → 0, this scaling form implies τ = 1, i.e. N* ∼ 1/p: the crossover size diverges as p → 0, so that for any finite p a sufficiently large network is in the small-world regime.
Barrat and Weigt have obtained a simple formula that fits well the dependency of C(p) observed in the numerical simulations of the WS model [39]. The formula is based on the fact that for p = 0 one has C(0) = 3(K − 2)/(4(K − 1)). Then, because with probability (1 − p) edges are not rewired, two neighbors that were linked together at p = 0 will remain connected with probability (1 − p)^3, up to corrections of order 1/N. From here, we get:

\[ \tilde{C}(p) = C(0)\,(1-p)^{3} \]

where C̃(p) is redefined as the ratio between the average number of edges between the neighbors of a vertex and the average number of possible links between the neighbors of a vertex. As for the degree distribution, when p = 0 it is a delta function positioned at K, while for p = 1 it is similar to that of an ER network. For intermediate p, the degree distribution is given by [39]:

\[ P(k) = \sum_{n=0}^{\min(k-K/2,\,K/2)} \binom{K/2}{n} (1-p)^{n}\, p^{K/2-n}\, \frac{(pK/2)^{k-K/2-n}}{(k-K/2-n)!}\, e^{-pK/2} \]

for k ≥ K/2, and is equal to zero for k smaller than K/2.
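As a numerical illustration of the Barrat–Weigt approximation (a sketch assuming a ring lattice in which each node has K neighbors, with K = 6 and the p values chosen arbitrarily), the unrewired clustering C(0) = 3(K−2)/(4(K−1)) and its (1−p)³ decay can be tabulated directly:

```python
def c_ring(K):
    """Clustering of the unrewired ring lattice: C(0) = 3(K-2) / (4(K-1))."""
    return 3 * (K - 2) / (4 * (K - 1))

def c_ws(K, p):
    """Barrat-Weigt approximation for the rewired lattice: C(0) * (1-p)^3."""
    return c_ring(K) * (1 - p) ** 3

K = 6
print(c_ring(K))              # 0.6: high clustering before any rewiring
print(round(c_ws(K, 0.1), 4)) # 0.4374: mild rewiring barely dents C
print(c_ws(K, 1.0))           # 0.0: fully rewired, ER-like vanishing C
```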
3.4.3 Scale-Free Networks
A scale-free network is a network whose degree distribution follows a power law, at least asymptotically. That is, the fraction P(k) of nodes in the network having k connections to other nodes goes for large values of k as

\[ P(k) \sim k^{-\gamma} \]

where γ is a parameter whose value is typically in the range 2 < γ < 3, although occasionally it may lie outside these bounds.
Empirical results [40, 41, 42, 43] demonstrate that many large networks are scale free, that is, their
degree distribution follows a power law for large k. Furthermore, even for those networks for which P(k) has an exponential tail, the degree distribution significantly deviates from a Poisson distribution. It has been shown that random-graph theory and the WS model cannot reproduce this feature [27]. While
it is straightforward to construct random graphs that have a power-law degree distribution, these
constructions only postpone an important question: what is the mechanism responsible for the
emergence of scale-free networks? Answering this question will require a shift from modeling network
topology to modeling the network assembly and evolution. While at this point these two approaches do
not appear to be particularly distinct, there is a fundamental difference between the modeling approach
taken in random graphs and the small-world models, and the one required to reproduce the power-law
degree distribution. While the goal of the former models is to construct a graph with correct topological
features, the modeling of scale-free networks will put the emphasis on capturing the network dynamics.
By mimicking the dynamical mechanisms that assembled the network, one will be able to reproduce the
topological properties of the system as we see them today. Dynamics takes the driving role, topology
being only a byproduct of this modeling philosophy.
Recent interest in scale-free networks started in 1999 with work by Albert-László Barabási and
colleagues at the University of Notre Dame who mapped the topology of a portion of the World Wide
Web, finding that some nodes had many more connections than others and that the network as a whole
had a power-law distribution of the number of links connecting to a node. After finding that a few other
networks, including some social and biological networks, also had heavy-tailed degree distributions,
Barabási and collaborators coined the term "scale-free network" to describe the class of networks that
exhibit a power-law degree distribution. They argued that the scale-free nature of real networks is rooted in two mechanisms shared by many real networks: growth and preferential attachment.
Growth and preferential attachment are the two basic ingredients captured by the Barabási-Albert (BA) model. An undirected graph is constructed as follows. Starting with m₀ isolated nodes, at each time step t a new node j with m ≤ m₀ links is added to the network. The probability Π that a link will connect j to an existing node i is linearly proportional to the actual degree k_i of i:

Π(k_i) = k_i / Σ_l k_l

Because every new node has m links, the network at time t will have m₀ + t nodes and mt links, corresponding to an average degree ⟨k⟩ ≈ 2m for large times.
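A short sketch of this growth rule in pure Python (the repeated-nodes list is a standard sampling trick, and the fully connected seed core of m + 1 nodes is our own bootstrapping choice, not part of the original prescription):

```python
import random

def barabasi_albert(n, m, rng):
    """Grow a BA graph to n nodes; each new node attaches m links to
    existing nodes with probability proportional to their degree."""
    edges, repeated = [], []     # node i appears in `repeated` k_i times
    for i in range(m + 1):       # small complete core so degrees are nonzero
        for j in range(i):
            edges.append((i, j))
            repeated += [i, j]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:                # uniform draw from `repeated`
            targets.add(rng.choice(repeated))  # == preferential attachment
        for t in targets:
            edges.append((new, t))
            repeated += [new, t]
    return edges

rng = random.Random(7)
n, m = 2000, 3
edges = barabasi_albert(n, m, rng)
avg_degree = 2.0 * len(edges) / n              # tends to 2m for large t
degree = {}
for a, b in edges:
    degree[a] = degree.get(a, 0) + 1
    degree[b] = degree.get(b, 0) + 1
```

Here avg_degree comes out at ≈ 6 for m = 3, while the largest hubs collect far more links than the average, the signature of the heavy tail.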
Derek de Solla Price developed a model in 1976 [44] that shares some similarities with the BA model. His
idea was that the rate at which a paper gets new citations should be proportional to the number that it
already has. This is easy to justify in a qualitative way. The probability that one comes across a particular
paper whilst reading the literature will presumably increase with the number of other papers that cite it,
and hence the probability that you cite it yourself in a paper that you write will increase similarly. The
same argument can be applied to other networks also, such as the Web. It would suffice to note here
that the model has two main differences with respect to the BA model: it builds a directed graph, and
the number of edges added with each new node is not a fixed quantity.
Given below are various analytical approaches that have been used to address the dynamical properties of scale-free networks.

Continuum approach, introduced by Barabási and Albert [45] and Barabási et al. [40]: it calculates the time dependence of the degree k_i of a given node i. The solution of the dynamical equation which k_i satisfies, with the initial condition that every node i has k_i(t_i) = m at its introduction, is given as

k_i(t) = m (t / t_i)^(1/2)

Master-equation approach, introduced by Dorogovtsev et al. [46]: it studies the probability p(k, t_i, t) that at time t a node i introduced at time t_i has degree k. The equation governing p(k, t_i, t) for the BA model has the form

p(k, t_i, t + 1) = ((k − 1)/(2t)) p(k − 1, t_i, t) + (1 − k/(2t)) p(k, t_i, t)

Rate-equation approach, introduced by Krapivsky et al. [47]: it focuses on the average number N_k(t) of nodes with k edges at time t. When a new node enters the network in the scale-free model, N_k(t) changes as

dN_k/dt = m [(k − 1) N_{k−1}(t) − k N_k(t)] / Σ_k k N_k(t) + δ_{k,m}
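For the continuum approach, the quoted solution and the resulting degree exponent follow in a few lines; the standard derivation is reproduced here for completeness:

```latex
\begin{align*}
\frac{\partial k_i}{\partial t} &= m\,\Pi(k_i)
  = m\,\frac{k_i}{\sum_{l} k_l} = \frac{k_i}{2t},
  \qquad \text{since } \textstyle\sum_{l} k_l = 2mt,\\
\Rightarrow\quad k_i(t) &= m\left(\frac{t}{t_i}\right)^{1/2},
  \qquad k_i(t_i) = m,\\
P\big(k_i(t) < k\big) &= P\!\left(t_i > \frac{m^2 t}{k^2}\right)
  = 1 - \frac{m^2 t}{k^2 (t + m_0)},\\
P(k) &= \frac{\partial P\big(k_i(t) < k\big)}{\partial k}
  = \frac{2 m^2 t}{k^3 (t + m_0)}
  \;\longrightarrow\; 2m^2 k^{-3} \quad (t \to \infty),
\end{align*}
```

so the BA model yields γ = 3 independently of m.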
3.5 WEIGHTED NETWORK
So far, the graphs we have considered have been such that the edges have a binary nature: they are either present or not. However, many real networks exhibit a large heterogeneity in the capacity and the intensity of the connections. Granovetter (1973) (see ref [48]) argued that the strength of a social tie is a function of its duration, emotional intensity, intimacy, and exchange of services. For
non-social networks, the strength often reflects the function performed by the ties, e.g. carbon flow
(mg/m²/day) between species in food webs [49] or the number of synapses and gap junctions in neural
networks [36]. These systems can be better described in terms of weighted networks, i.e. networks in
which each link carries a numerical value measuring the strength of the connection.
Mathematically, a weighted graph consists of a set N of nodes, a set L of links (or edges, or lines), and a set W of weights that are real numbers attached to the links. The numbers of elements in N and L are denoted by N and K, respectively. In matricial representation, a weighted graph is usually described by the so-called weight matrix W, whose entry w_ij is the weight of the link connecting node i to node j, with w_ij = 0 if the nodes i and j are not connected (weights are assumed to be symmetric, w_ij = w_ji, unless explicitly mentioned).
Following is a short list of useful quantities that generalize and complement concepts already introduced for unweighted networks, and that combine weighted and topological observables.
3.5.1 Node strength, strength distribution and correlations
A more significant measure of network properties in terms of the actual weights is obtained by extending the definition of node degree to the vertex strength, defined as [51]

s_i = Σ_{j ∈ N(i)} w_ij

This quantity measures the strength of nodes in terms of the total weight of their connections.
Figure 3.6: Network of co-occurrence of words in Reuters newswire stories. The widths of the edges indicate their weights and the colors of the vertices indicate the communities found by the algorithm described in the text [50].
The strength distribution R(s) measures the probability that a vertex has strength s and, together with the degree distribution P(k), provides useful information on a weighted network. The weighted average nearest-neighbors degree of a node i can be defined as

k^w_nn,i = (1/s_i) Σ_{j ∈ N(i)} w_ij k_j

This is the local weighted average of the nearest-neighbor degree, according to the normalized weight of the connecting edges, w_ij / s_i.
3.5.2 Weighted Clustering Coefficient
This is given by [51]

c^w(i) = (1/(s_i (k_i − 1))) Σ_{j,h} ((w_ij + w_ih)/2) a_ij a_ih a_jh

In the general case, the weighted clustering coefficient considers both the number of closed triangles in the neighborhood of node i and their total relative weight with respect to the vertex strength. The factor s_i (k_i − 1) is a normalization factor and ensures that 0 ≤ c^w(i) ≤ 1. The quantities C^w and C^w(k) are respectively the weighted clustering coefficient averaged over all nodes in the graph and over all nodes with degree k. In the case of a large randomized network (lack of correlations) it is easy to see that C^w = C and C^w(k) = C(k). In real weighted networks, however, we can face two opposite cases. If C^w > C, we are in the presence of a network in which the interconnected triples are more likely formed by the edges with larger weights. On the contrary, C^w < C signals a network in which the topological clustering is generated by edges with low weight.
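These weighted quantities are straightforward to compute from a weight matrix. A small sketch (the function names and the toy matrix are our own illustrations) for the strength and the weighted clustering coefficient of a node:

```python
def strength(w, i):
    """s_i = sum_j w_ij, the total weight of node i's connections."""
    return float(sum(w[i]))

def weighted_clustering(w, i):
    """c_w(i) = 1/(s_i (k_i - 1)) * sum over ordered neighbour pairs (j, h)
    of (w_ij + w_ih)/2, counting only pairs that are themselves connected."""
    nbrs = [j for j, wij in enumerate(w[i]) if wij > 0 and j != i]
    k, s = len(nbrs), strength(w, i)
    if k < 2:
        return 0.0
    tot = sum((w[i][j] + w[i][h]) / 2.0
              for j in nbrs for h in nbrs
              if j != h and w[j][h] > 0)
    return tot / (s * (k - 1))

# a triangle with one heavy edge attached to node 0
w = [[0, 4, 1],
     [4, 0, 1],
     [1, 1, 0]]
```

For a fully connected triangle every neighbour pair closes, so c^w(i) = 1 regardless of the weights; removing the edge (1, 2) would drive c^w(0) to zero, since no triple around node 0 would be interconnected.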
CHAPTER 4
SELECTIVE OVERVIEW OF NEURAL DYNAMICS
Neurons, the basic elements of nervous systems, are highly structured and complex cells. Their
elementary constituents and processes have been studied for more than a century, and, although many
important details remain to be pinned down, enough understanding has been gained to build
mathematical models that closely agree with their observed behavior [52].
Figure 4.1: A general depiction of the neuron
The methodology used in constructing mathematical neural models varies as widely as the questions to
be answered. Some models implement potentiation to understand biological learning mechanisms.
Other models may intend to reproduce anatomical data acquired from living tissue or reproduce
electrophysiological recordings measured in vitro. Models may be inaccurate or highly accurate, small or
large, from a single neuron to an entire sub-organ. The methods of spatio-temporal approximation,
numerical analysis, and synaptic connectivity all may differ. However, the diversity of these models does
not displace their commonality. In this section, we will introduce some important neural models under
some basic classification and then we will briefly attempt an overview of some neural network models.
4.1 NEURAL MODELS
4.1.1 Physiological Models
These are detailed neuron models which account for numerous ion channels, various types of synapses and the specific spatial geometry of an individual neuron, and are designed to accurately describe and predict biological processes.
4.1.1.1 Hodgkin-Huxley Model
Alan Hodgkin and Andrew Huxley (1952) developed the first quantitative model of the initiation and
propagation of an electrical signal (the action potential) along a squid giant axon. The figure is a
schematic diagram of the resulting model from their work.
Figure 4.2: Basic components of Hodgkin-Huxley model.
The passive electrical properties of the cell membrane are described by a capacitance C and a resistor R. The nonlinear properties are attributed to voltage-gated ion channels for sodium (Na) and potassium (K). The model comprises three major currents. They are:

a voltage-gated persistent K⁺ current with four activation gates (resulting in the term n⁴ in the equation below, where n is the activation variable for K⁺);

a voltage-gated transient Na⁺ current with three activation gates and one inactivation gate (the term m³h below);

an Ohmic leak current, I_L, which is carried mostly by Cl⁻ ions.

The complete set of space-clamped Hodgkin-Huxley equations is

C V̇ = I − ḡ_K n⁴ (V − E_K) − ḡ_Na m³ h (V − E_Na) − g_L (V − E_L)    (4.1)

Each of the channels is characterized by its conductance. The leak channel is described by a voltage-independent conductance g_L = 1/R; the conductance of the other ion channels is voltage- and time-dependent. If all channels are open, they transmit currents with maximum conductances ḡ_K or ḡ_Na, respectively.

Figure 4.3: Tonic spiking activity generated in the Hodgkin-Huxley model
Normally, however, some of the channels are blocked. The probability that a channel is open is described by the additional variables m, n, and h. The parameters E_Na, E_K, and E_L are the reversal potentials. The gating variables are given by the following equations:

ṁ = α_m(V) (1 − m) − β_m(V) m
ṅ = α_n(V) (1 − n) − β_n(V) n
ḣ = α_h(V) (1 − h) − β_h(V) h

Figure 4.4: Equilibrium function (A) and time constant (B) for the three variables m, n, and h in the Hodgkin-Huxley model. The resting potential is at V = 0.

The parameters as provided in the original Hodgkin and Huxley paper [53] correspond to the membrane potential shifted by approximately 65 mV, so that the resting potential is at V = 0. The functions α(V) and β(V) describe the transition rates between open and closed states of the channels.
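A direct Euler integration of equations (4.1) reproduces the tonic spiking of figure 4.3. The sketch below uses the original squid-axon parameters (resting potential at 0, conductances in mS/cm², time in ms); the step-current amplitude and onset time are illustrative choices of ours:

```python
import math

def f(u):                      # u / (e^u - 1), finite at u = 0
    return 1.0 if abs(u) < 1e-9 else u / (math.exp(u) - 1.0)

def hh_trace(I_amp=10.0, T=50.0, dt=0.01):
    """Euler-integrate the space-clamped HH equations; return the V trace."""
    C, gNa, gK, gL = 1.0, 120.0, 36.0, 0.3
    ENa, EK, EL = 115.0, -12.0, 10.613
    V = 0.0

    def rates(V):
        am, bm = 1.0 * f((25 - V) / 10.0), 4.0 * math.exp(-V / 18.0)
        an, bn = 0.1 * f((10 - V) / 10.0), 0.125 * math.exp(-V / 80.0)
        ah, bh = 0.07 * math.exp(-V / 20.0), 1.0 / (math.exp((30 - V) / 10.0) + 1)
        return am, bm, an, bn, ah, bh

    # gating variables start at their resting steady-state values
    am, bm, an, bn, ah, bh = rates(V)
    m, n, h = am / (am + bm), an / (an + bn), ah / (ah + bh)
    trace = []
    for k in range(int(T / dt)):
        I = I_amp if k * dt > 5.0 else 0.0     # current step at t = 5 ms
        am, bm, an, bn, ah, bh = rates(V)
        I_ion = (gNa * m**3 * h * (V - ENa)
                 + gK * n**4 * (V - EK)
                 + gL * (V - EL))
        V += dt * (I - I_ion) / C
        m += dt * (am * (1 - m) - bm * m)
        n += dt * (an * (1 - n) - bn * n)
        h += dt * (ah * (1 - h) - bh * h)
        trace.append(V)
    return trace
```

With a sustained 10 µA/cm² step the trace shows repetitive action potentials rising roughly 100 mV above rest, the tonic spiking regime of figure 4.3.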
4.1.1.2 The FitzHugh-Nagumo Model
A simplified model of spiking is justified by the observation that both V and m evolve on a similar time scale during an action potential, while n as well as h change on much slower time scales.

Figure 4.5: Circuit diagram of the tunnel-diode nerve model of Nagumo et al. (1962) [54]
The motivation for the FitzHugh-Nagumo model was to isolate conceptually the essentially mathematical properties of excitation and propagation from the electrochemical properties of sodium and potassium ion flow. The model consists of

a voltage-like variable V having a cubic nonlinearity that allows regenerative self-excitation via a positive feedback, and

a recovery variable W having a linear dynamics that provides a slower negative feedback.

FitzHugh (1961) [55] and, independently, Nagumo et al. (1962) [54] derived the following two equations:

V̇ = V − V³/3 − W + I

Ẇ = ε (V + a − b W)

where I is the magnitude of the stimulus current. The parameters a and b are dimensionless and positive. The amplitude of ε, corresponding to the inverse of a time constant, determines how fast W changes relative to V.

Figure 4.6: Phase portrait and physiological state diagram of FitzHugh-Nagumo model (modified from FitzHugh 1961) [56].
The phase portrait of the FitzHugh-Nagumo model in figure 4.6 depicts

the V-nullcline, which is the N-shaped curve obtained from the condition V̇ = 0, where the sign of V̇ passes through zero;

the W-nullcline, which is a straight line obtained from the condition Ẇ = 0, where the sign of Ẇ passes through zero; and

some typical trajectories starting from various initial conditions.

The intersection of the nullclines is an equilibrium (because V̇ = Ẇ = 0 there), which may be unstable if it is on the middle branch of the V-nullcline, i.e., when I is strong enough. In this case, the model exhibits periodic (tonic spiking) activity.
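The two regimes can be seen in a few lines of simulation. This sketch uses the classic parameter set a = 0.7, b = 0.8, ε = 0.08; the stimulus values and initial condition are illustrative choices of ours:

```python
def fhn_trace(I, a=0.7, b=0.8, eps=0.08, dt=0.05, steps=40000):
    """Euler-integrate V' = V - V^3/3 - W + I, W' = eps (V + a - b W)."""
    V, W, vs = -1.0, -0.5, []
    for _ in range(steps):
        dV = V - V**3 / 3.0 - W + I
        dW = eps * (V + a - b * W)
        V += dt * dV
        W += dt * dW
        vs.append(V)
    return vs

v_rest = fhn_trace(0.0)      # equilibrium on the left branch: quiescence
v_spiking = fhn_trace(0.5)   # equilibrium on the middle branch: tonic spiking
```

With I = 0 the trajectory settles onto the stable rest state on the left branch of the V-nullcline; with I = 0.5 the equilibrium sits on the middle branch, is unstable, and the model produces sustained relaxation oscillations.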
4.1.1.3 Leaky Integrate-and-Fire Model
The basic integrate-and-fire model was proposed by Lapicque in 1907 [64], long before the mechanisms
that generate action potentials were understood. Despite its age and simplicity, the integrate-and-fire
model is still an extremely useful description of neuronal activity. By avoiding a biophysical description
of the action potential, integrate-and-fire models are left with the simpler task of modeling only
subthreshold membrane potential dynamics. This can be done with various levels of rigor [1].
The basic circuit of an integrate-and-fire model consists of a capacitor C in parallel with a resistor R, driven by a current I(t) (illustrated in Figure 4.7).
Figure 4.7: Schematic diagram of the integrate-and-fire model. The basic circuit is the module inside the dashed circle on the right-hand side. A current I(t) charges the RC circuit. The voltage u(t) across the capacitance (points) is compared to a threshold ϑ. If u(t) = ϑ at time t_i, an output pulse δ(t − t_i) is generated. Left part: A presynaptic spike δ(t − t_j) is low-pass filtered at the synapse and generates an input current pulse α(t − t_j). (Adapted from Gerstner and Kistler, Spiking Neuron Models, 2002) [3].
In its simplest form, a neuron is modeled as a leaky integrator of its input I(t):

τ_m du/dt = −u(t) + R I(t)

where u(t) represents the membrane potential at time t, τ_m is the membrane time constant and R is the membrane resistance. This equation describes a simple resistor-capacitor (RC) circuit, where the leakage term is due to the resistor and the integration of I(t) is due to the capacitor that is in parallel to the resistor. The spiking events are not explicitly modeled in the LIF model. Instead, when the membrane potential u(t) reaches a certain threshold ϑ (spiking threshold), it is instantaneously reset to a lower value u_r (reset potential) and the leaky integration process described by the above equation starts anew with the initial value u_r. To add just a little bit of realism to the dynamics of the LIF model, it is possible to add an absolute refractory period Δ_abs immediately after u(t) hits ϑ. During the absolute refractory period, u(t) might be clamped to u_r, and the leaky integration process is re-initiated following a delay of Δ_abs after the spike [4].
In integrate-and-fire models the form of an action potential is not described explicitly. Spikes are formal events characterized by a firing time t^(f). The firing time t^(f) is defined by a threshold criterion:

t^(f):  u(t^(f)) = ϑ

Figure 4.8: Stimulation of a LIF neuron by a constant input current: the time course of the membrane potential u(t) (left); f-I curve for a LIF neuron with (dashed line) and without (solid line) an absolute refractory period (right). The red line indicates the particular I value used in the simulation on the left [57].
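For constant input, the threshold-and-reset dynamics admit a closed-form interspike interval, T = Δ_abs + τ_m ln(RI/(RI − ϑ)), which makes a handy check for a simulation. A minimal sketch (all parameter values are illustrative, in arbitrary units):

```python
import math

def lif_rate(I, R=1.0, tau=10.0, theta=1.0, u_reset=0.0, d_abs=0.0,
             dt=0.01, T=1000.0):
    """Euler-integrate tau du/dt = -u + R I with threshold/reset and an
    optional absolute refractory period; return the mean firing rate."""
    u, spikes, refr_until = u_reset, 0, -1.0
    for k in range(int(T / dt)):
        t = k * dt
        if t < refr_until:          # clamp during the refractory period
            u = u_reset
            continue
        u += dt * (-u + R * I) / tau
        if u >= theta:              # spike: reset and start refractoriness
            spikes += 1
            u = u_reset
            refr_until = t + d_abs
    return spikes / T

# analytic rate for R = 1, tau = 10, theta = 1 (same defaults as above)
analytic = lambda I, d=0.0: 1.0 / (d + 10.0 * math.log(I / (I - 1.0)))
```

For I = 2 (so RI = 2ϑ) the analytic interval is τ_m ln 2 ≈ 6.93, i.e. a rate of about 0.144 spikes per unit time; adding Δ_abs = 5 stretches the interval to about 11.93, exactly the shift between the two f-I curves of figure 4.8.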
4.1.2 Abstract Models
These are models that produce the same input-output (I/O) behavior as a physiological neuron model
but achieve this by replacing the mechanistic expressions of an H-H-like model with an alternate set of
dynamical equations. These equations sacrifice representation of the neuron's internal details in favor of
a more computationally efficient set of dynamical equations that aims only at reproducing the neuron's
input-output behavior with acceptable accuracy. As a model, it is focused on the signal processing
function of the neuron rather than the physiological understanding of its mechanisms.
4.1.2.1 The Izhikevich Model
Using the principles of nonlinear dynamics on the H-H type neuronal models, a system of two
differential equations was arrived at by Izhikevich. The model has been shown to be biologically
plausible and yet computationally efficient making it the toast of many a theorist. The spiking patterns
of the cortical and thalamo-cortical neurons have been obtained with this model [58, 4].
The Izhikevich model is a two-dimensional system of ordinary differential equations given as

v̇ = 0.04 v² + 5v + 140 − u + I

u̇ = a (b v − u)

with the auxiliary after-spike reset: if v ≥ 30 mV, then v ← c and u ← u + d,

where v is a variable taken to represent the membrane voltage and I is an input variable taken to represent either a test current stimulus or synaptic inputs to the neuron. Variable u is called the membrane recovery variable, which accounts for the activation of K⁺ ionic currents and inactivation of Na⁺ ionic currents, and it provides negative feedback to v. a, b, c, and d are dimensionless parameters and are explained below [58]:

The parameter a describes the time scale of the recovery variable u.

The parameter b describes the sensitivity of the recovery variable u to the subthreshold fluctuations of the membrane potential v.

The parameter c describes the after-spike reset value of the membrane potential v caused by the fast high-threshold K⁺ conductances.

The parameter d describes the after-spike reset of the recovery variable u caused by slow high-threshold Na⁺ and K⁺ conductances.
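A sketch of the model for the regular-spiking (RS) parameter set a = 0.02, b = 0.2, c = −65, d = 8; the input amplitude, duration, and time step are illustrative choices of ours:

```python
def izhikevich_spikes(I=10.0, a=0.02, b=0.2, c=-65.0, d=8.0,
                      T=500.0, dt=0.25):
    """v' = 0.04 v^2 + 5v + 140 - u + I,  u' = a (b v - u);
    when v >= 30 mV: v <- c, u <- u + d.  Returns spike times (ms)."""
    v, u, spikes = -65.0, b * (-65.0), []
    for k in range(int(T / dt)):
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + I)
        u += dt * a * (b * v - u)
        if v >= 30.0:
            spikes.append(k * dt)
            v, u = c, u + d
    return spikes

spike_times = izhikevich_spikes()
```

With a constant 10-unit input the RS cell fires tonically; the accumulation of u after each reset (u ← u + d) produces the spike-frequency adaptation visible in the RS inset of figure 4.9.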
Figure 4.9: Known types of neurons correspond to different values of the parameters a, b, c, d in the model described by equations (1), (2). RS, IB, and CH are cortical excitatory neurons. FS and LTS are cortical inhibitory interneurons. Each inset shows a voltage response of the model neuron to a step of dc-current (bottom). This figure is reproduced with permission from www.izhikevich.com. Electronic versions of the figure and reproduction permissions are freely available at www.izhikevich.com.
4.1.2.2 Rulkov Model
This is another abstract model but it differs from the Izhikevich modeling schema in that it employs
maps as a representation of the dynamics neurons, and hence called a map-based model. It is developed
in difference equation form from the beginning although it shares the similarity of being a product of
nonlinear dynamics with Izhikevich model. Like Izhikevich, Rulkov has shown that the model can
reproduce various types of spiking and spike-bursting activity. The model is given as [59]
where
is the fast dynamical variable and in this case it corresponds to (but does not equal) the
membrane potential of the neuron. Variable
correspondent. Slow time evolution of
. Parameters
and
neuron. Input variables
is the slow dynamical variable and has no physiological
is achieved by using small values of the parameter
control the dynamics and they are set mimic the behavior of a particular type of
and
incorporate the action of synaptic inputs
and also the action
of some intrinsic currents that are not explicitly captured by the model. The nonlinear function is given
by
{
where
⁄
.
Different choices for
, and
define different firing patterns made to correspond to the signaling
types for the particular class of neuron being modeled [60].
47 | P a g e
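Iterating the map is essentially free computationally, which is the point of the map-based approach. A sketch with no external inputs (β_n = σ_n = 0); the parameter values and initial condition below are illustrative choices in the spike-bursting regime:

```python
def rulkov(alpha=4.5, sigma=0.14, mu=0.001, n_iter=20000, x0=-1.0, y0=-2.9):
    """Iterate the Rulkov map with beta_n = sigma_n = 0; return the x trace."""
    x, y, xs = x0, y0, []
    for _ in range(n_iter):
        if x <= 0.0:                       # lower branch: subthreshold / upstroke
            x_next = alpha / (1.0 - x) + y
        elif x < alpha + y:                # spike top
            x_next = alpha + y
        else:                              # reset after the spike
            x_next = -1.0
        y += -mu * (x + 1.0) + mu * sigma  # slow variable drifts with x
        x = x_next
        xs.append(x)
    return xs
```

At these values the slow variable drifts the fast subsystem back and forth across its fold, producing bursts of spikes separated by silent phases; each spike ends with the exact reset x = −1 imposed by the third branch of f_α.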
Figure 4.10: Waveforms of x_n illustrating the role of the parameter β^e in the generation of firing responses of the map model (4.7). A rectangular pulse of an external depolarizing current of duration 870 iterations and amplitude A, 2A (200%) and 4A (400%) was applied to excite the activity. (A) shows the case β^e = 0.0 and (B) β^e = 0.133. The actual values of x_n from (4.7) are shown with small black circles for the cases of 100% and 200% of the pulse amplitude. The iterations of x_n in the top plots are connected with lines (cases of 200% and 400%) [60].
4.2 NEURAL NETWORKS (NNs)
According to the DARPA Neural Network Study (1988, AFCEA International Press, p. 60):

"... a neural network is a system composed of many simple processing elements operating in parallel whose function is determined by network structure, connection strengths, and the processing performed at computing elements or nodes."
The neuron in vivo exists in an extraordinarily complex environment in the central nervous system. Most
neurons receive synaptic input connections from thousands of other neurons, and in turn project
outputs to hundreds or thousands of other neurons. Synapses are distributed over complex dendritic
arbors, across the cell body, and even along the axon. Furthermore, a presynaptic cell frequently will
make multiple synaptic connections to the same target neuron. The properties of each neuron, the
scheme of connection between them, and the topology of the network, interact in complex ways to
shape the dynamics of the full, high-dimensional system, and modelling this system is a major challenge
for computational neuroscience. Some modelling studies strive for a detailed investigation of the signal
processing in individual neurons. A typical procedure is the exact reconstruction of a living cell and a
subsequent compartmentalization (in particular, of the dendritic tree), whereby each compartment is
assigned parameters according to the morphology and the physiology of the corresponding part of the
cell. A complete model may thus consist of thousands of compartments. Software packages like GENESIS
[61] or NEURON [62] support such an approach.
The first neural network models go back to the 1940s. Around this time, two mathematicians, McCulloch
and Pitts (1943) suggested the description of a neuron as a logical threshold element with two possible
states. Such a threshold element has L input channels (afferent axons) and one output channel (efferent
axon). However, the theory of McCulloch and Pitts failed in two important respects. Firstly, it did not
explain how the necessary interconnections between neurons could be formed, in particular, how this
might occur through learning. Secondly, such networks depended on error-free functioning of all their
components and did not display the (often quite impressive) error tolerance of biological neural
networks. The psychologist Hebb (1949) suggested an answer to the first question [63]. According to his
suggestion, the connection between two neurons is plastic and changes in proportion to the activity correlation between the presynaptic and the postsynaptic cell. This phenomenon is called plasticity, and it is believed that this property of the synapse is basic to learning.
4.2.1 Network Topology
Neural network architecture defines a network's structure, including the number of layers (and their features, namely whether they are hidden or not), the number of input and output nodes, the wiring of the layers, the direction of signals, etc.
4.2.1.1 Feed-forward Neural Networks
A feed-forward neural network is a biologically inspired classification algorithm. It consists of a (possibly
large) number of simple neuron-like processing units, organized in layers. The units in each layer are
connected with the units in the previous layer. There are never any backward connections, and
connections never skip a layer. Typically, the layers are fully connected, meaning that all units at one
layer are connected with all units at the next layer. So, this means that all input units are connected to
all the units in the layer of hidden units, and all the units in the hidden layer are connected to all the
output units. These connections are not all equal; each connection may have a different strength
or weight. The weights on these connections encode the knowledge of a network. Often the units in a
neural network are also called nodes. Data enters at the inputs and passes through the network, layer
by layer, until it arrives at the outputs. There is no feedback between layers. This is why they are
called feed-forward neural networks. Feed-forward neural networks have been used extensively to solve many kinds of problems, being applied in a wide range of areas covering subjects such as prediction of temporal series, structure prediction of proteins, and speech recognition. Depending on the number of layers, there are two types of feed-forward neural networks. They are discussed below.
Single layered: The figure below is a one-hidden-layer FF network with inputs x_1, ..., x_n and output ŷ. Each arrow in the figure symbolizes a parameter in the network. The network is divided into layers. The input layer consists of just the inputs to the network. Then follows a hidden layer, consisting of any number of neurons, or hidden units, placed in parallel. Each neuron performs a weighted summation of the inputs, which then passes through a nonlinear activation function σ, also called the neuron function. Mathematically, the functionality of a hidden neuron is described by

h_j = σ( Σ_i w_ji x_i + b_j )

where the weights w_ji are symbolized by the arrows feeding into the neuron.

Figure 4.11: A feed-forward network with one hidden layer and one output.
The network output is formed by another weighted summation of the outputs of the neurons in the
hidden layer. This summation on the output is called the output layer. Generally, the number of output
neurons equals the number of outputs of the approximation problem.
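The computation just described, weighted sums through a hidden layer followed by an output summation, fits in a few lines (the weights and biases below are illustrative placeholders, not trained values):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W_hid, b_hid, w_out, b_out):
    """One-hidden-layer feed-forward pass: each hidden unit computes
    sigmoid(sum_i w_ji x_i + b_j); the output is a weighted sum of the h_j."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W_hid, b_hid)]
    return sum(w * hj for w, hj in zip(w_out, h)) + b_out
```

With zero biases and all-ones weights, an input of [0, 0] drives every hidden unit to sigmoid(0) = 0.5, so a two-unit hidden layer with unit output weights returns exactly 1.0, a hand-checkable sanity case.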
Multilayered: This consists of multiple layers of computational units, usually interconnected in a feed-forward way. Each neuron in one layer has directed connections to the neurons of the subsequent layer. A typical example is shown in figure 4.12.

An excellent model implementation of the FF network architecture is the multilayer perceptron (MLP). MLPs are general-purpose, flexible, nonlinear models that, given enough hidden neurons and enough data, can approximate virtually any function to any desired degree of accuracy. In other words, MLPs are universal approximators [65]. In the MLP, the net input to the hidden layer of neurons is a linear combination of the inputs as specified by the weights. Another example of a FF network is the radial basis function (RBF) network [66]. The RBF network differs from the MLP in that the hidden neurons compute radial functions of the input, which are similar to kernel functions in kernel regression [67]. In many applications the units of these networks apply a sigmoid function as an activation function.
Figure 4.12: A multilayer Feed-Forward Network with 5 input channels and hidden layers.
The dynamics of a typical FF network can be represented by the equation

τ dv/dt = −v + F(W u)

where W is the weight matrix of the synapses, v is the output (a vector if the output is multiple, or a scalar for single-neuron output), F is a nonlinear function, τ is the time constant that describes how quickly convergence occurs, and u is the input signal (a vector if the input is multiple, or a scalar for single-neuron input).
4.2.1.2 Recurrent Neural Networks (RNNs)
These are feed-forward networks with at least a loop or a feedback connection. This class of neural
network possesses connections between units form a directed cycle. This creates an internal state of the
network which allows it to exhibit dynamic temporal behavior. Unlike feed-forward neural networks,
RNNs can use their internal memory to process arbitrary sequences of inputs. Mantas & Herbert (see ref
[68 , o
the u i ue topology of the ‘NNs stated that: The hara teristi feature of ‘NNs that
distinguishes them from the more widely used feed-forward neural networks is that the connection
topology possesses cycles. The existence of cycles has a profound impact:
• An RNN may develop a self-sustained temporal activation dynamics along its recurrent connection
pathways, even in the absence of input. Mathematically, this renders an RNN a dynamical system, while
feed-forward networks are functions.
• If driven by an input signal, an RNN preserves in its internal state a nonlinear transformation of the
input history — in other words, it has a dynamical memory, and is able to process temporal context
i for atio
RNNs also represent a more plausible approach for biologically based computational models, as all real neural networks studied so far present recurrent connections. There are various models of RNNs, such as the Hopfield networks [68], Boltzmann machines [69], Deep Belief Networks [70] and the Long Short-Term Memory (LSTM) network [71]. Two further new approaches to modelling RNNs were independently proposed in the early period of the 21st century by Wolfgang Maass et al. and Herbert Jaeger: the Liquid State Machines [72] by the former and Echo State Networks [73] by the latter. These last two are categorized under the umbrella name of Reservoir Computing. We will briefly examine these two.
4.2.2 Reservoir Computing
The idea to use the rich dynamics of neural systems which can be observed in cortical circuits rather
than to restrict them resulted in the LSM model by Maass et al. [72] and the ESN by Jaeger [73]. These
two models had been designed independently, with different application types and different parameter
regimes in mind.
Figure 4.13: Structure of a RNN in the framework of reservoir computing; only dotted synaptic connectivities are
trained.
A recurrent neural network is randomly created and remains unchanged during training. This RNN is called the reservoir. It is passively excited by the input signal and maintains in its state a nonlinear transformation of the input history. The desired output signal is generated as a linear combination of the neurons' signals from the input-excited reservoir [68]. The main benefit is that the training is performed only at the readout stage and the reservoir is fixed. Reservoir computing makes a conceptual and computational separation between a dynamic reservoir (an RNN acting as a nonlinear temporal expansion function with fixed weights) and a recurrence-free (usually linear) readout that produces the desired output from the expansion. In linguistics, reservoirs have been used to classify spoken words and digits [74] and to generate grammatical structure [75], written-word sequences [76], and even musical sequences [77].
4.2.2.1 Echo State Networks (ESNs)
The approach is based on the observation that if a random RNN possesses certain algebraic properties,
training only a linear readout from it is often sufficient to achieve excellent performance in practical
applications [73]. This approach was proposed for machine learning and nonlinear signal processing.
Superior performance of echo state networks for various engineering applications has also been
suggested [78].
ESNs standardly use simple sigmoid neurons. The basic discrete-time, sigmoid-unit echo state network with N_x reservoir units, K inputs and L outputs is governed by the state update equation [79]

x(n + 1) = f( W x(n) + W_in u(n + 1) + W_fb y(n) )

where x(n) is the N_x-dimensional reservoir state, f is a sigmoid function (usually the logistic sigmoid or the tanh function), W is the N_x × N_x reservoir weight matrix, W_in is the N_x × K input weight matrix, u(n) is the K-dimensional input signal, W_fb is the N_x × L output feedback matrix, and y(n) is the L-dimensional output signal. In tasks where no output feedback is required, W_fb is nulled. The extended system state z(n) at time n is the concatenation of the reservoir and input states. The output is obtained from the extended system state by [75]

y(n) = g( W_out z(n) )

where g is an output activation function (typically the identity or a sigmoid) and W_out is an L × (N_x + K)-dimensional matrix of output weights.

An important element for ESNs to work is that the reservoir should have the echo state property [76]. This condition in essence states that the effect of a previous state x(n) and a previous input u(n) on a future state x(n + k) should vanish gradually as time passes (i.e., as k → ∞), and not persist or even get amplified.
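The echo state property is easy to demonstrate numerically: run two copies of the same reservoir from different initial states under an identical input sequence and watch the state difference shrink. A pure-Python sketch with deliberately small, hence contractive, reservoir weights (all sizes and scalings are illustrative):

```python
import math
import random

def esn_step(x, u, W, W_in):
    """x(n+1) = tanh(W x(n) + W_in u(n+1)); single input, no output feedback."""
    return [math.tanh(sum(wij * xj for wij, xj in zip(row, x)) + wi * u)
            for row, wi in zip(W, W_in)]

rng = random.Random(0)
N = 20
# entries in [-0.04, 0.04] keep every row 1-norm below 0.8 => contraction
W = [[rng.uniform(-0.04, 0.04) for _ in range(N)] for _ in range(N)]
W_in = [rng.uniform(-0.5, 0.5) for _ in range(N)]

x1 = [rng.uniform(-1.0, 1.0) for _ in range(N)]  # two different initial states
x2 = [rng.uniform(-1.0, 1.0) for _ in range(N)]
d0 = max(abs(a - b) for a, b in zip(x1, x2))
for u in (rng.uniform(-1.0, 1.0) for _ in range(60)):  # same input sequence
    x1, x2 = esn_step(x1, u, W, W_in), esn_step(x2, u, W, W_in)
d60 = max(abs(a - b) for a, b in zip(x1, x2))
```

Because tanh is 1-Lipschitz and each row of W has 1-norm below 0.8, the gap between the two trajectories contracts at least geometrically; after 60 steps the states are numerically indistinguishable, which is exactly the forgetting of initial conditions that the echo state property demands.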
4.2.2.2 Liquid State Machines (LSMs)
LSMs were developed from a computational neuroscience background, aiming at elucidating principal
computational properties of neural microcircuits. Maass et al. considered reservoirs of spiking neurons,
i.e. neuron models whose activity is described by a set of dynamic differential equations rather than a
static input-output function. Thus LSMs use more sophisticated and biologically realistic models of
spiking integrate-and-fire neurons and dynamic synaptic connection models in the reservoir. The
connectivity among the neurons often follows topological and metric constraints that are biologically
motivated. In the LSM literature, the reservoir is often referred to as the liquid, following an intuitive
metaphor of the excited states as ripples on the surface of a pool of water. Inputs to LSMs also usually
consist of spike trains. In their readouts LSMs originally used multilayer feed-forward neural networks
(of either spiking or sigmoid neurons), or linear readouts similar to ESNs [72]. Additional mechanisms for
averaging spike trains to get real-valued outputs are often employed [23].
Figure 4.14: Functional scheme of a LSM
Formally, the liquid of neurons is a mapping L^M from the time-dependent inputs u(.), lying in a subset U of the input space, onto a k-dimensional liquid state x^M(t):

    x^M(t) = (L^M u)(t).
The second operation one needs to define is the readout function f^M, mapping the liquid state into an output at every time t:

    y(t) = f^M( x^M(t) ).

All in all, the liquid state machine is an operator, mapping time-varying functions onto one or many functions of time. Readout maps are generally chosen memory-less because the liquid state x^M(t) should contain all the information about past inputs u(s), with s <= t, that is required to construct the output y(t) at time t. Therefore, the readout function does not have to map previous liquid states.
For an LSM to be viable, it must satisfy two conditions:
A separation property, measuring the distance between the different liquid states caused by different input sequences.
An approximation property, measuring the capability of the readout to produce a desired output from the liquid state.
4.2.2.3 Highlight of Major Differences between ESNs and LSMs
LSM research focuses on modeling dynamical and representational phenomena in biological neural
networks [72, 80], whereas ESN research is aimed more at engineering applications [68, 73, 78]. The
"liquid" network in LSMs is typically made from biologically more adequate, spiking neuron models,
whereas ESNs "reservoirs" are typically made up from simple sigmoid units [73]. LSM research considers
a variety of readout mechanisms, including trained feed-forward networks [23], whereas ESNs typically
make do with a single layer of readout units [81].
4.2.3 Learning
In the formal theory of neural networks the weight
of a connection from neuron
to
is
considered as a parameter that can be adjusted so as to optimize the performance of a network for a
given task. The process of parameter adaptation is called learning and the procedure for adjusting the
weights is referred to as a learning rule [3]. Depe di g o the a aila ility of a tea her duri g trai i g,
there are two paradigm of learning. They are; Supervised and Unsupervised Learning. There is another
class called the Hybrid Learning. It is a combination of both learning paradigms in response to the
demand of some desired applications. The hybrid paradigm is often used in reservoir computing where
the output is subjected to supervised training but the reservoir itself is in the unsupervised training
regime.
56 | P a g e
4.2.3.1 Supervised learning or Associative learning
This is a paradigm in which the network is trained by providing it with input and matching output
patterns. These input-output pairs can be provided by an external teacher, or by the system which
contains the network (self-supervised). Neural et orks are fitted to the data y lear i g algorith s
during a training process. These learning algorithms are characterized by the usage of a given output
that is compared to the predicted output and by the adaptation of all parameters according to this
comparison. The parameters of a neural network are its weights. There are a number of learning rules
under this paradigm, they are:
Back-propagation Rule: The output values are compared with the target to compute the value of some predefined error function. The error is then fed back through the network; using this information, the learning algorithm adjusts the weights of each connection in order to reduce the value of the error function. After repeating this process for a sufficiently large number of training cycles, the network will usually converge. The weight update is given as

    dw_ij = eta * delta_j * x_i,

where x_i is the output value from unit i, w_ij is the weight connecting unit i to unit j, delta_j is the back-propagated error at unit j, and eta is the learning rate.
Correlation Learning Law: Here the change in the weight vector is given by

    dw = eta * d * x.

This is a special case of Hebbian learning with the output signal replaced by the desired output signal d. The difference is that the correlation learning rule is supervised, since it uses the desired output value to adjust the weights. In the implementation of the learning law, the weights are initialized to small random values close to zero.
Perceptron Law: This learning rule is an example of supervised training, in which the learning rule is provided with a set of examples of proper network behavior. As each input is applied to the network, the network output is compared to the target. The learning rule then adjusts the weights and biases of the network in order to move the network output closer to the target. The change to the weight vector is given by

    dw = eta * e * x,    where e = t - o.

Here, t is the target output, o is the perceptron output, e is the error in output and eta is the learning rate.
The Delta Rule: This overcomes the shortcoming of the perceptron training rule not being guaranteed to converge if the patterns are not linearly separable. Convergence can more or less be guaranteed by using more layers of processing units in between the input and output layers. The Delta rule is based on the gradient descent search and is given as

    dw_i = eta * SUM_d (t_d - o_d) x_id,

where the sum runs over the training examples d.
Least Mean Square Law: The learning rule adjusts the weights based on the error (e = t - o). Once the error is calculated, the weights are adjusted by a small amount, dw = eta * e * x, in the direction of the input x. This has the effect of adjusting the weights to reduce the output error. The implementation of L.M.S. is very simple. Initially, the weight vector is initialized with small random weights. The main repetition then randomly selects a test, calculates the output of the neuron, and then calculates the error. Using the error, the formula of the learning rule is applied to each weight in the vector. L.M.S. is a special case of the Delta rule.
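The perceptron and LMS updates described above can be sketched compactly; the noisy linear toy task below is our own illustration, not an experiment from the thesis:

```python
import numpy as np

# Sketch of the perceptron law (threshold output) and the LMS law
# (linear output); both adjust weights by eta * error * input.
rng = np.random.default_rng(1)

def perceptron_update(w, x, t, eta=0.1):
    """Perceptron law: dw = eta * e * x with e = t - o, o a threshold output."""
    o = 1.0 if w @ x > 0 else 0.0
    return w + eta * (t - o) * x

def lms_update(w, x, t, eta=0.05):
    """LMS law: dw = eta * e * x with e = t - w.x (linear output)."""
    return w + eta * (t - w @ x) * x

# Train a perceptron on a linearly separable toy problem: t = [x1 + x2 > 0]
w = rng.normal(0, 0.01, 3)                       # small random weights (incl. bias)
for _ in range(200):
    x = np.append(rng.uniform(-1, 1, 2), 1.0)    # input with bias component
    t = 1.0 if x[0] + x[1] > 0 else 0.0
    w = perceptron_update(w, x, t)

# Accuracy on fresh samples
X = np.hstack([rng.uniform(-1, 1, (500, 2)), np.ones((500, 1))])
targets = (X[:, 0] + X[:, 1] > 0).astype(float)
accuracy = np.mean((X @ w > 0).astype(float) == targets)
print("perceptron accuracy:", accuracy)
```

Note that both rules only differ in how the output o is computed; the weight change has the same error-times-input form in each case.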
4.2.3.2 Unsupervised learning or Self-organization
Here, an (output) unit is trained to respond to clusters of patterns within the input. In this paradigm the system is supposed to discover statistically salient features of the input population. Unlike the supervised learning paradigm, there is no a priori set of categories into which the patterns are to be classified; rather, the system must develop its own representation of the input stimuli.
58 | P a g e
Hebb's Law: It states that the weight increment is proportional to the product of the input data and the resulting output signal of the unit. This law requires weight initialization to small random values prior to learning. The change in the weight vector is given by

    dw = eta * x * y,

where the jth component of dw is given by dw_j = eta * x_j * y, with y = f(w . x).
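A short sketch of Hebb's law in action (our own illustration, with a renormalization step added to keep the weights bounded, which plain Hebbian learning does not do by itself): driven by inputs correlated along one direction, the weight vector aligns with that direction.

```python
import numpy as np

# Hebb's law dw_j = eta * x_j * y with y = w . x, on 2-D inputs
# correlated along (1, 1). The input distribution and renormalization
# are our own choices for the demonstration.
rng = np.random.default_rng(2)

w = rng.normal(0, 0.01, 2)            # small random initial weights
eta = 0.01
for _ in range(1000):
    s = rng.normal()
    x = s * np.array([1.0, 1.0]) + rng.normal(0, 0.1, 2)  # correlated input
    y = w @ x                          # unit output
    w = w + eta * x * y                # Hebbian increment
    w = w / np.linalg.norm(w)          # renormalize to prevent blow-up

alignment = abs(w @ np.array([1.0, 1.0])) / np.sqrt(2)
print("alignment with principal input direction:", alignment)
```

This illustrates the self-organizing character of the rule: nothing tells the unit which direction matters; the correlation structure of the input alone shapes the weights.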
4.2.4 Other Characteristics of Neural Networks
Based on their unique architecture, neural networks exhibits very impressive features which makes
them a powerful computational tool. Some of these properties are highlighted below.
Non l inearity, the answer from the computational neuron can be linear or not. A neural
network formed by the interconnection of o ‐li ear euro s, is in itself o ‐li ear, a trait
which is distributed to the entire network. No linearity is important over all in the cases
where the task to develop presents a behavior removed from linearity, which is presented in
most of real situations.
Adaptive learning, the NN is capable of determine the relationship between the different
examples which are presented to it, or to identify the kind to which belong, without
requiring a previous model.
Self–organization, this property allows the NN to distribute the knowledge in the entire
network structure; there is no element with specific stored information.
Fault tolerance, This characteristics is shown in two senses: The first is related to the
samples shown to the network, in which case it answers correctly even when the examples
exhibit variability or noise; the second, appears when in any of the elements of the network
occurs a failure, which does not makes improbable its functioning due to the way in which it
stores information.
CHAPTER 5
MODEL AND IMPLEMENTATION
Despite several decades of research there is still more to learn about the interplay between the functional and physiological columns. This research work is carried out in accordance with the wide-spread conception of the columns being the essential location for cortical computation. Investigations are carried out on two levels: microscopic (intracolumnar scale) and mesoscopic (intercolumnar scale).
5.1 Microscopic Scale: Intracolumnar
Biological data collected by Roerig et al. [1] are used as parameters in our experiment. The biological layer implemented in the model is built into the implementation software CSIM, which will be elaborated upon moderately in this chapter. The network is composed of three layers (2/3, 4 and 5/6), each of them containing a population of both excitatory and inhibitory neurons, in a ratio of 80% excitatory to 20% inhibitory. The computational effect of the wiring is measured within the framework of a reservoir computing neural network, in this case the Liquid State Machine, which is what CSIM implements. LSMs have been briefly described in the preceding chapter.
61 | P a g e
5.1.1 The CSIM Software: Features and Availability
The software and the appropriate documentation for the liquid state machine can be found on its own
webpage, which is located under the URL www.lsm.tugraz.at/. CSIM is developed under the GPL
(www.gnu.org/licences/gpl.html). Hence, other research groups are invited to contribute their models
and research efforts to this project. The software consists of three parts:
• CSIM – the Neural Circuit Simulator
• Circuit tool – MATLAB scripts, tools and documentation for simple generation of neural microcircuits
• Learning tool – A package containing MATLAB scripts and HTML documentation for quantitative
analysis of the computational power of neural microcircuits
CSIM is the main part of the LSM since it contains the main simulator for neural microcircuits. Most
experiments can be done with only this package. CSIM is a tool for simulating arbitrary heterogeneous
networks composed of different models of neurons and synapses. The simulator itself is written strictly
in the object oriented programming language C++ with a MEX interface to MATLAB. MEX is a standard
interface offered by MATLAB for external applications to communicate and exchange data with
MATLAB, normally used to call often used functions that are precompiled for faster execution (since
MATLAB usually works as an interpreter language, i.e. it interprets and compiles commands one by one).
CSIM is intended to simulate networks containing up to 10,000 neurons and up to the order of a few million synapses. The actual size depends of course on the amount of RAM available on the machine where MATLAB is run. It also contains support for MATLAB's parallel virtual machines. The main features and advantages that the authors incorporated in CSIM are described in the user manual available from the LSM webpage, from which the following is quoted:
Different levels of modeling: Different neuron models: leaky-integrate-and-fire neurons, compartmental
based neurons, sigmoidal neurons. Different synapse models: static synapses and a certain model of
dynamic synapses are available for spiking as well as for sigmoidal neurons. Spike time-dependent
synaptic plasticity is also implemented.
Easy to use MATLAB interface: Since CSIM is incorporated into MATLAB it is not necessary to learn any
other script language to set up the simulation. This is all done with MATLAB scripts. Furthermore the
results of a simulation are directly returned as MATLAB arrays and hence any plotting and analysis tools
available in MATLAB can easily be applied.
Object oriented design: We adopted an object oriented design for CSIM which is similar to the approaches taken in GENESIS and NEURON. That is, there are objects (e.g. a LIF-Neuron object implements the standard leaky-integrate-and-fire model) which are interconnected by means of some signal channels. The creation of objects, the connection of objects and the setting of parameters of the objects is controlled at the level of MATLAB scripts, whereas the actual simulation is done in the fast C++ core. See Figure 3.4 for a part of CSIM's class hierarchy.
Fast C++ core: Since CSIM is implemented in C++ and is not as general as GENESIS or NEURON,
simulations are performed quite fast. We also implemented some ideas from event driven simulators like
SpikeNet which result in an average speedup factor of 3 (assuming an average firing rate of the neurons
of 20 Hz and short synaptic time constants) compared to a standard fixed time step simulation scheme.
Runs on Windows and Unix (Linux): CSIM is developed on Linux (SuSE 8.0 with gcc 2.95.3) but it is
known also to run under other Linux distributions like Mandrake 8.0 and RedHat 7.2 as well as Windows
98 (we have no experience with Windows XP yet, but it should also run there) and should in principle run
on any platform for which MATLAB is available.
External Interface: There is an external interface which allows CSIM to communicate with external
programs. In this way one can for example control the miniature robot Khepera with CSIM. This feature is
not available in the Windows version.
5.1.2 Neuronal Model
The neuronal model chosen for this level of investigation is a Hodgkin-Huxley type model, based on CSIM's CbNeuron model equipped with the typical H-H channels for action potential generation. The membrane voltage V_m is governed by

    C_m dV_m/dt = (V_resting - V_m)/R_m + SUM_c g_c (E_c - V_m) + SUM_s I_s + SUM_s g_s (E_s - V_m) + I_inject,

where
    C_m — membrane capacity (Farad)
    V_resting — reversal potential of the leak current (Volts)
    R_m — membrane resistance (Ohm)
    n_c — total number of channels (active + synaptic)
    g_c — current conductance of channel c (Siemens)
    E_c — reversal potential of channel c (Volts)
    n_s — total number of current supplying synapses
    I_s — current supplied by synapse s (Ampere)
    n_g — total number of conductance based synapses
    g_s — conductance supplied by synapse s (Siemens)
    E_s — reversal potential of synapse s (Volts)
    I_inject — injected current (Ampere)
Detailed explanations can be found in the CSIM manual.
The network realization is based on the connection probabilities only in this work. The probability for a connection from a neuron a to a neuron b is governed by

    P(a, b) = C * exp[ -( D(a, b) / lambda )^2 ],

where D(a, b) is the Euclidean distance between the positions of neurons a and b in the network. The parameter lambda varies the number and the typical length of the connections and as such is used to establish different network structures; it serves as the control parameter. It varies thus:
    lambda -> 0: unconnectedness;
    lambda small: local next-neighbor connectivity;
    lambda -> infinity: global connectivity.
C establishes the connectivity among excitatory (E) and inhibitory (I) neurons, established by means of one pooled synapse. Our choice of the values of C for the EE, EI, IE and II connection types reflects the typical biological connectivity. If a connection is made, the synaptic weights are drawn from a uniform distribution and multiplied by the corresponding weight factors.
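This Gaussian distance-dependent wiring rule is easy to make concrete. The following sketch (our own, in Python rather than the CSIM/MATLAB circuit tools; grid size, C and lambda values are illustrative, not the thesis parameters) draws an adjacency matrix from the rule:

```python
import numpy as np

# Build a random network from the wiring rule P(a, b) = C * exp(-(D/lambda)^2).
rng = np.random.default_rng(3)

def build_network(positions, C=0.3, lam=2.0):
    """Return a boolean adjacency matrix drawn from the Gaussian wiring rule."""
    n = len(positions)
    D = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=2)
    p = C * np.exp(-(D / lam) ** 2)      # connection probability per pair
    np.fill_diagonal(p, 0.0)             # no self-connections
    return rng.random((n, n)) < p

# Neurons on a 5 x 5 x 2 grid, a layout in the spirit of small LSM microcircuits
grid = np.array([[i, j, k] for i in range(5) for j in range(5) for k in range(2)],
                dtype=float)
A = build_network(grid)
print("mean out-degree:", A.sum(axis=1).mean())
```

Small lambda makes connections overwhelmingly local; raising lambda lets the probability approach C for all pairs, recovering the global-connectivity limit described above.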
5.2 Mesoscopic Scale: Intercolumnar
On the intracolumnar (microscopic) level we used H-H neurons, which are in ordinary differential equation (ODE) form, but on the intercolumnar level we used discrete map-based neurons. They offer obvious advantages in terms of simplicity and computational efficiency irrespective of the size of the simulated network. A coupled map lattice of chaotic maps affords the flexibility required for communication.
5.2.1 Neuronal Model
Rulkov [59], with his map-based models, has already proven that they are suitably designed to capture the dynamical mechanism underlying the generation of various patterns of spiking activity representing columnar response, without the need for an increase in the number of equations, which makes this approach highly desirable. We utilize the Rulkov map-based neuron model [59], which includes a spike-afterhyperpolarization modification. The model is briefly stated below.

The fast sub-system x and the slow dynamical variable y are given by

    x_{n+1} = f(x_n, y_n + beta_n),
    y_{n+1} = y_n - mu (x_n + 1) + mu sigma,

where the nonlinear function f is given as

    f(x_n, y) = { alpha/(1 - x_n) + y,   x_n <= 0
                { alpha + y,             0 < x_n < alpha + y and x_{n-1} <= 0
                { -1,                    otherwise.

The hyperpolarizing current enters through the modification term; one parameter controls the duration and another the amplitude of the hyperpolarization. The constant sigma defines the resting potential. The fast sub-system x represents the spike-train, where exactly one maximum value attained corresponds to one spike; the external driving current enters the map through beta_n.
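A minimal iteration of a Rulkov-type map can be sketched as follows (our own illustration; the parameter values are typical ones from the map-based modeling literature, not necessarily the thesis settings, and the afterhyperpolarization modification is omitted for brevity):

```python
import numpy as np

# Piecewise Rulkov map: fast variable x (membrane-like), slow variable y.
def rulkov_step(x, x_prev, y, alpha=4.5, mu=0.001, sigma=0.1, I=0.0):
    """One iteration of the two-dimensional Rulkov map with external drive I."""
    u = y + I
    if x <= 0.0:
        x_new = alpha / (1.0 - x) + u        # subthreshold branch
    elif x < alpha + u and x_prev <= 0.0:
        x_new = alpha + u                    # spike peak
    else:
        x_new = -1.0                         # reset after the spike
    y_new = y - mu * (x + 1.0) + mu * sigma  # slow variable drifts the map
    return x_new, y_new

# Iterate from a hyperpolarized initial condition; spikes appear as maxima
x, x_prev, y = -1.0, -1.0, -3.5
trace = []
for _ in range(5000):
    x_new, y = rulkov_step(x, x_prev, y)
    x_prev, x = x, x_new
    trace.append(x)
print("max x:", max(trace), "min x:", min(trace))
```

Because each neuron is only a two-dimensional map, thousands of such units can be iterated per time step at negligible cost, which is exactly the advantage cited above for the intercolumnar scale.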
The fixed map parameters are then complemented by appropriately chosen values for the coupling parameters. The probability of connectivity between two lattice points i and j at distance d is governed by a power law,

    p(d) ~ d^(-a),

which specifies the connectivity matrix. Carefully choosing the exponent a and the cutoff M gives a range of network structures that can be investigated. By varying a, the system can be changed from a globally coupled network (a -> 0) into a nearest-neighbor coupled network (a -> infinity). For intermediate a, the network is coupled to the nearest neighbor with probability 1 and to all other nodes with probability p(d) up to the cutoff M. This cutoff together with the topology determines the average number of connected nodes k.
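Assuming our reading of the wiring rule above (nearest neighbor with probability 1, power-law decay up to a cutoff; the symbols a and M and all numbers below are illustrative), a realization on a ring lattice can be generated as:

```python
import numpy as np

# Power-law distance-dependent wiring on a ring of N sites.
rng = np.random.default_rng(4)

def build_chain(N=200, a=1.5, M=50):
    """Adjacency matrix: d = 1 always connected, d <= M with probability d**(-a)."""
    A = np.zeros((N, N), dtype=bool)
    for i in range(N):
        for j in range(i + 1, N):
            d = min(j - i, N - (j - i))    # distance on the ring
            if d == 1:
                p = 1.0                     # nearest neighbor: deterministic
            elif d <= M:
                p = d ** (-a)               # power-law decay up to the cutoff
            else:
                p = 0.0
            if rng.random() < p:
                A[i, j] = A[j, i] = True
    return A

A = build_chain()
k_avg = A.sum(axis=1).mean()
print("average degree k:", k_avg)
```

Sweeping the exponent a between small and large values reproduces the transition from near-global to near-nearest-neighbor coupling described in the text, with the average degree k as the resulting summary statistic.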
5.2.2 Speed of Information Transfer (SIT) and Synchronization
For clarity and presentation purposes, we briefly examine synchronization in chaotic maps as presented in ref. [82], adding some missing steps to the sequence of equations presented in the work. Consider a finite connected graph G with nodes i, writing i ~ j when i and j are neighbors, that is, connected by an edge, and denote the number of neighbors of i by n_i. On G, we have a dynamical system with discrete time t, with the state x_i(t) of node i evolving according to

    x_i(t+1) = f(x_i(t)) + (eps / n_i) SUM_{j ~ i} [ f(x_j(t - tau)) - f(x_i(t)) ],

where f is a differentiable function mapping some finite interval to itself, eps in [0, 1] is the coupling strength, and tau is the transmission delay between nodes. A synchronized solution is one where the states of the nodes are identical, x_i(t) = X(t) for all i. Thus X satisfies

    X(t+1) = (1 - eps) f(X(t)) + eps f(X(t - tau)).

In order to examine the stability of the synchronized solution, we consider perturbations of X of the form

    x_i(t) = X(t) + delta_k(t) v_i^(k),

for some small delta_k, where the v^(k) are the orthonormal eigenmodes of the graph Laplacian, with corresponding eigenvalues lambda_k (see Section 3.3.6). The solution X is stable against such a perturbation when delta_k(t) -> 0 for all k >= 2, that is, for the non-constant eigenmodes. Expanding about X yields

    delta_k(t+1) = (1 - eps) f'(X(t)) delta_k(t) + eps (1 - lambda_k) f'(X(t - tau)) delta_k(t - tau).

If we say tau = 0, that is, no delay, then we have

    delta_k(t+1) = [ (1 - eps) + eps (1 - lambda_k) ] f'(X(t)) delta_k(t).

Factorizing,

    delta_k(t+1) = (1 - eps lambda_k) f'(X(t)) delta_k(t).

The sufficient local stability condition is |delta_k(T)| / |delta_k(0)| < 1 for large T. Expanding within the limit, we have, as in the case without delay,

    |delta_k(T)| / |delta_k(0)| = PROD_{t=0}^{T-1} |1 - eps lambda_k| |f'(X(t))|.

Rewriting the above as a sum of logarithmic terms,

    ln( |delta_k(T)| / |delta_k(0)| ) = { ln|1 - eps lambda_k| + ln|f'(X(0))| } + { ln|1 - eps lambda_k| + ln|f'(X(1))| } + ...

Generalizing, we obtain

    ln( |delta_k(T)| / |delta_k(0)| ) = T ln|1 - eps lambda_k| + SUM_{t=0}^{T-1} ln|f'(X(t))|.

Multiplying through by 1/T and letting T -> infinity, the stability condition becomes

    ln|1 - eps lambda_k| + L < 0,

where

    L = lim_{T -> infinity} (1/T) SUM_{t=0}^{T-1} ln|f'(X(t))|

is the Lyapunov exponent of f. The ability to generate a coherent state is expressed by the cells' ability to synchronize in a generalized sense. For synchronization, a minimal number of connections is required to impinge on a given site. The condition ln|1 - eps lambda_k| + L < 0, required for all non-constant eigenmodes k, is a satisfactory condition for dynamical synchronization to continue to exist.
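This stability criterion is straightforward to check numerically. The sketch below (our own illustration, using the fully chaotic logistic map as site map and a ring versus a complete graph; none of this is a thesis experiment) evaluates the condition for every non-constant Laplacian eigenmode:

```python
import numpy as np

# Check the synchronization condition ln|1 - eps*lambda_k| + L < 0
# for diffusively coupled logistic maps on two example graphs.

def lyapunov_logistic(x0=0.3, T=10000):
    """Lyapunov exponent of f(x) = 4x(1-x) as the time average of ln|f'(x)|."""
    x, acc = x0, 0.0
    for _ in range(T):
        acc += np.log(max(abs(4.0 - 8.0 * x), 1e-12))  # floor avoids log(0)
        x = 4.0 * x * (1.0 - x)
    return acc / T

def laplacian_eigenvalues(A):
    """Eigenvalues of the normalized graph Laplacian I - D^-1 A, sorted."""
    deg = A.sum(axis=1)
    L = np.eye(len(A)) - A / deg[:, None]
    return np.sort(np.linalg.eigvals(L).real)

def synchronizes(A, eps, lyap):
    """The condition must hold for every non-constant eigenmode (k >= 2)."""
    eigs = laplacian_eigenvalues(A)
    return all(np.log(abs(1.0 - eps * lk)) + lyap < 0.0 for lk in eigs[1:])

N, eps = 10, 0.8
ring = np.zeros((N, N))
for i in range(N):
    ring[i, (i + 1) % N] = ring[i, (i - 1) % N] = 1.0
complete = np.ones((N, N)) - np.eye(N)

lyap = lyapunov_logistic()   # close to ln 2 for the logistic map at r = 4
stable_ring = synchronizes(ring, eps, lyap)
stable_complete = synchronizes(complete, eps, lyap)
print("Lyapunov:", lyap, "| ring:", stable_ring, "| complete:", stable_complete)
```

For these parameters the ring fails the condition while the complete graph satisfies it, which mirrors the point made above that a minimal number of connections per site is needed for synchronization.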
We use a system of diffusive coupling for the interaction of the local chaotic site maps to model the evolution of the site states. Considering the number of connections k_i to the site indexed by i, the evolution equation becomes

    x_i(t+1) = (1 - eps) f(x_i(t)) + (eps / k_i) SUM_j f(x_j(t)),

where j labels the connected sites, t denotes discrete time, f is a chaotic map describing the firing of the column, and eps is the coupling strength.
The average speed by which information is propagated through a coupled map network (speed of information transfer, SIT) is the parameter of assessment of computation we used at this scale, following the framework developed in ref. [82]. A small perturbation is applied to the oscillators of a columnar cluster, and the information propagation is followed using the difference to a replica system without perturbation. The information propagation velocity v can directly be measured from the perturbation at the leftmost and the rightmost oscillator, which is an indication of the propagation of the perturbation through the network. The information propagation can be understood as the result of two independent contributions: the chaotic instability of the map leads to an average exponential growth of the initial perturbation delta_0 at a site 0, whereas the diffusive coupling results in a Gaussian spreading. The combined perturbative effects at site j are then given by the equation

    |delta_j(t)| ~ delta_0 e^{L t} e^{-j^2 / (4 D t)} / sqrt(4 pi D t),

where D denotes the diffusion coefficient, L is the Lyapunov exponent of the site map, and delta_0 is the perturbation strength. The velocity v of the travelling wave front is then determined at the borderline of damped and un-damped perturbations; setting j = v t and the exponent to zero,

    L - v^2 / (4 D) = 0,    i.e.    v = 2 sqrt(D L).

For a given site map, SIT is therefore determined by sqrt(D), and v depends on L according to v ~ sqrt(L). The diffusion coefficient can be determined from the network topology, using a Markov chain. The network of connected columns is represented by a Markov chain of size N, and the interaction is represented by a matrix of size N x N. At each end of the chain, an absorbing state is put, which is reflected in the adjacency matrix. The time taken for a perturbation to diffuse from the center of the chain to either end is then calculated as in [83]. The diffusion coefficient can then be estimated from the central entry of the resulting passage-time vector via the one-dimensional diffusion relation D ~ x^2 / (2t).
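The replica procedure itself can be made concrete in a few lines. The sketch below (our own illustration: a chain of diffusively coupled logistic maps with arbitrary length, coupling, and detection threshold) perturbs the center site and reads the SIT off the slope of the front position over time:

```python
import numpy as np

# Replica method for the speed of information transfer (SIT):
# evolve a chain and a once-perturbed copy, track where they differ.

def step(x, eps=0.3):
    """One update of a diffusively coupled logistic-map chain."""
    f = 4.0 * x * (1.0 - x)
    out = (1.0 - eps) * f
    out[1:-1] += eps * 0.5 * (f[:-2] + f[2:])  # mean of the two neighbors
    out[0] += eps * f[1]                        # boundary sites: one neighbor
    out[-1] += eps * f[-2]
    return out

rng = np.random.default_rng(5)
N, T = 201, 60
x = rng.uniform(0.1, 0.9, N)
for _ in range(100):                  # transient: relax onto the attractor
    x = step(x)

replica = x.copy()
replica[N // 2] += 1e-8               # tiny perturbation at the center site

front = np.zeros(T)
for t in range(T):
    x, replica = step(x), step(replica)
    hit = np.flatnonzero(np.abs(x - replica) > 1e-6)
    if hit.size:
        front[t] = hit.max() - N // 2   # rightmost site the perturbation reached

v = np.polyfit(np.arange(T), front, 1)[0]   # SIT ~ slope of the front position
print("estimated SIT:", v)
```

With nearest-neighbor coupling the front can advance at most one site per step, so the estimate is bounded by 1; long-range connections, as in the topologies compared in the next chapter, are what can push the effective propagation speed up.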
CHAPTER 6
RESULTS AND DISCUSSIONS
6.1 RESULTS
We focused on the information transfer across the cortical network and on the total wiring length needed for obtaining a spatio-temporally coherent computation. As earlier stated, coupled chaotic maps were used to model our network. Neuronal interaction is modeled by chaotic site maps communicating by means of diffusive coupling (Eqn. 5.24). We employed strong coupling for the recorded data; we also experimented with other coupling regimes to see the effect, but that would be the subject of another research effort.
When we measured the speed of information transport through the network for the doubly fractal, single fractal, and random coupling topologies, we indeed found a strong dependence on the wiring topology. The doubly fractal network displays a lot of short-range connections and fewer long-range connections. It shows a consistent and steady increase in SIT values in most of the population regimes; in others, there exists some fluctuation in SIT values with respect to changes in the number of connections. These fluctuations don't appear to have an identifiable pattern, and it remains uncertain why they occur or whether they carry any significance. The fractal and random networks show an almost linear increase as the number of connections increases. The random network performed best in terms of SIT values, and this was more evident when we simulated the larger network sizes. The random network contained an evidently greater number of long-range connections than the doubly fractal topology, and this might be the reason behind its superior SIT values.

Figure 6.1: Main network densities compared (p: connection probabilities, d: distance, M: cutoff).
As detailed in section 5.2.2, the diffusion coefficient obtained by means of Eqn. 5.27 is substituted into Eqn. 5.26 together with the SIT value of the corresponding network type and size, and the Lyapunov exponent is thus obtained. With these parameters, along with the minimum non-zero eigenvalue of the graph Laplacian obtained as stipulated by Eqn. 3.9, the condition for synchronization stated in Eqn. 5.23 is then checked. We then observed the total wiring length (TWL hereafter) of the network when it fulfills the synchronization condition. In measuring the TWL, we found that the doubly fractal network performed far better than the other networks: synchrony is achieved at a far smaller TWL than in the fractal and random topologies. We consider this the more important measure, as synchronization is what signifies computation.

Below, we present the results of our numerical experiment. Figure 6.2 shows tables of SIT values for each topology over a range of k. The data points are averages over 100 realizations. The specific value of the exponents in each case is also stated.
[Tables of SIT values for the Random, Fractal, and Doubly Fractal networks]
Figure 6.2: Values obtained from simulations. It shows the speed of information transfer (SIT) values obtained with fixed exponents, for network realizations and varying network sizes.
Figure 6.2: Results for fixed exponents with varying network sizes. Speed of information transfer (SIT) vs. number of connections k for the Random, Fractal, and Doubly Fractal networks; N = 200.
Figure 6.3: Results for fixed exponents with varying network sizes. SIT vs. k; N = 400.
Figure 6.4: Results for fixed exponents with varying network sizes. SIT vs. k; N = 1000.
Figure 6.5: Results for fixed exponents with varying network sizes. SIT vs. k; N = 2000.
6.2 DISCUSSION
There is an abundance of short-range connections and few long-ranged connections. These few longranged have been reported to have a substantial enhancing effect on the SIT [23]. The columnar
structure could thus be suggested to have emerged as a facilitating structure for such a performance.
Our results do not confirm the superiority of the SIT achieved in the doubly fractal topology to those of
the fractal and random topologies as reported in an earlier work (see ref. [23]). However, and more
importantly, we have been able to verify the fact that doubly fractal networks synchronize at a much
shorted TWL when compared to the fractal and random networks (Figure 6.7). The superior speed
exhibited by the random network can be attributed to the abundance of long-range connections, which
incidentally also accounts for the high wiring cost. The figure 6.6 below from ref. [83] typifies this
scenario.
Figure 6.6: Network topology and spatial embedding.
The complex nature of brain networks gives them the ability to sit between the extremes shown in the figure above. Since synchronization indicates computation, it is safe to say that computation in this topology is achieved in a remarkably economic fashion. This property was found to be consistent for the range of k we investigated. Bullmore and Sporns, in their work "The economy of brain network organization", put it this way: "Brain networks can therefore be said to negotiate an economical trade-off between minimizing physical connection cost and maximizing topological value."
Figure 6.7: Corresponding relative TWL for the Fractal (k = 20), Random (k = 10), and Doubly fractal (k = 12) networks.
Stoop et al., in their work on intracolumnar wiring [23], made some very interesting findings. They employed recognition rate as a basis for evaluating the network, and two popular time series classification tasks were used as real-world benchmarks. Single Arabic Digit speech recognition is based on time series of 13 Mel Frequency Cepstral Coefficients for 10 classes of digits spoken by 88 subjects. Australian Sign Language (Auslan) recognition was based on time series of 22 parameters for 95 signs, recorded from a native signer using digital gloves and position tracker equipment. They investigated the influence of cortical organizational structures on two levels of architectural sophistication: a simpler excitatory-inhibitory (EI) network and the more detailed layered excitatory-inhibitory network topology (LEI). Two general observations were clear from Fig. 6.8: whereas the particular neuron models (and the underlying circuit parameters) are of secondary influence (blue vs. red curves), the integration readout (right panels) has a clear advantage over instantaneous readout (left panels). The results obtained for the EI-network demonstrated that the connectivity does not enhance the computational power of the network. The plot also confirms that local connectivity plays no distinguishable role among the possible connectivities.
FIG. 6.8: Recognition rate R for a) Arabic Digit, b) Auslan Sign recognition, using leaky integrate-and-fire (blue curves) or Izhikevich (red curves) neurons in the networks. Each data point is the average over 20 realizations. Left column: memoryless ('ml') readout; right column: integration ('int') readout. Networks: I) EI network, recognition rate dependence on connectivity (control networks: dashed curves), and on the ratio I of input receiving neurons at local connectivity. Ocher: Izhikevich neurons with no connections. II) LEI networks, recognition rate dependence on the rewiring probability: layered vs. homogeneous control network.
Having no recurrent connections at all among the reservoir neurons did not hamper the recognition rate, suggesting that extremely little computation is owed to synaptic interaction. It may have been suspected that the low recognition rates from memoryless readout arise because, in the classical Liquid State network paradigm, the input signal is applied to all neurons, which constantly overwrites memory that otherwise would be retained in hidden neurons. To exclude this possibility, they examined in Fig. 6.8's second row the role of the hidden neurons, by measuring R for networks having a fraction I of input signal receiving neurons. The desired value of I was achieved by removing the input signals from a corresponding number of reservoir neurons. The connectivity was set to local next neighbors, except for one test using Izhikevich neurons without connections (ocher line). If hidden neurons were beneficial, they, again, should have perceived a maximum of R for some optimal value of I. In the Arabic Digit task with memoryless readout they did not observe a dependence on the number of actually used neurons (beyond the point where there are on average 13.5 input receiving neurons, at an input dimensionality of 13). The similarity of the results suggests that the nonlinear interaction among the input receiving neurons does not significantly enhance performance. The EI network with biology-motivated wiring structure thus does not perform significantly better than the control network. The results obtained for the LEI networks, reflecting in more detail the columnar layering structures (see Fig. 6.8 II), corroborate the observations made for the simpler model: a significant dependence of R on the rewiring probability was not observed.
CHAPTER 7
CONCLUSION
In conclusion, we verified that the doubly fractal topology achieves computation far more economically than random and singly fractal topologies. We can therefore say that the unique topological structure of the neocortex plays a strong role in its computational power. We propose that this line of research be continued with a network size comparable to the columnar population of the neocortex (there are about half a million columns in the neocortex4), so that the capabilities of this topology can be observed at full scale. We believe that a large-scale simulation with more elaborate analysis holds the promise of a clearer verdict on the computational advantage conferred by the doubly fractal intercolumnar wiring of the neocortex.
Future research within this framework should also investigate the adaptability of the neocortical network topology: what would happen if a major hub were removed, and how would this affect the performance of the network? A physiological model (like Eqn. 5.1) could also be used within this framework and the results compared to see whether they tally.
4
http://bluebrain.epfl.ch/
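The hub-removal question posed above can be made concrete with a small numerical experiment: measure a network's global efficiency (the mean inverse shortest-path length, a standard economy measure in network neuroscience [84]), delete the most highly connected node, and measure again. The ring-plus-hub graph below is a purely illustrative toy, not the thesis network.

```python
from collections import deque
from itertools import combinations

def shortest_paths(adj, src):
    """BFS distances from src in an undirected graph given as an adjacency dict."""
    dist = {src: 0}
    q = deque([src])
    while q:
        v = q.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                q.append(w)
    return dist

def global_efficiency(adj):
    """Average of 1/d(i,j) over all node pairs (disconnected pairs contribute 0)."""
    nodes = list(adj)
    total = 0.0
    for i, j in combinations(nodes, 2):
        d = shortest_paths(adj, i).get(j)
        if d:
            total += 1.0 / d
    n = len(nodes)
    return 2.0 * total / (n * (n - 1))

# Toy network: a ring of 8 nodes plus a hub connected to every ring node.
n = 8
adj = {i: {(i - 1) % n, (i + 1) % n, 'hub'} for i in range(n)}
adj['hub'] = set(range(n))

e_before = global_efficiency(adj)

# Remove the hub and all of its links, then re-measure efficiency.
del adj['hub']
for nbrs in adj.values():
    nbrs.discard('hub')
e_after = global_efficiency(adj)

assert e_after < e_before  # losing the hub degrades global efficiency
```

The same before/after comparison, applied to the doubly fractal network with its highest-degree column removed, would quantify the robustness question raised in this chapter.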
81 | P a g e
REFERENCES
[1] Dayan, P. and Abbott, L. F. Theoretical Neuroscience. Cambridge, MA: MIT Press (2001).
[2] Microsoft ® Encarta ® 2009. © 1993-2008 Microsoft Corporation. All rights reserved
[3] Gerstner W. and Kistler W. M, Spiking Neuron Models. Cambridge University Press (2002).
[4] Izhikevich E.M. Dynamical Systems in Neuroscience. Cambridge, MA: MIT Press (2007).
[5] Purves D. Neuroscience. Sinauer Associates, Inc., Sunderland, MA (2004).
[6] Hebb, D.O. Organization of behavior. New York: John Wiley & Sons (1949).
[7] Martin SJ, Morris R. New life in an old idea: the synaptic plasticity and memory hypothesis revisited.
Hippocampus 12(5): (2002) 609–636
[8] Rioult-Pedotti MS, Friedman D, Hess G, Donoghue JP. Strengthening of horizontal cortical
connections following skill learning. Nat Neurosci. 1(3): (1998) 230–234
[9] Somogyi P, Tamas G, Lujan R, Buhl EH. Salient features of synaptic organization in
the cerebral cortex. Brain Res Brain Res Rev 26(2–3): (1998) 113–135.
[10] Roth G. and Dicke U. Evolution of the brain and intelligence. TRENDS in Cognitive Sciences, 5, (2005)
250-257.
[11] Mountcastle, V. An organizing principle for cerebral function: the unit module and the distributed
system. The Mindful Brain (ed. G. M. Edelman & V. B. Mountcastle), pp. 7–50 (1978). Massachusetts:
MIT Press.
[12] Mountcastle, V.B. Modality and topographic properties of single neurons of cat's somatic sensory
cortex. J. Neurophysiol. 20, (1957) 408–434.
[13] Buxhoeveden D., Casanova M.F., The minicolumn hypothesis in neuroscience, Brain, 2002, 125,
935–951
[14] Mountcastle, V.B.. The columnar organization of the neocortex. Brain 120, (1997), 701–722.
[15] Mountcastle, V.B. (1998) Perceptual Neuroscience: The Cerebral Cortex, Harvard University Press
[16] Calvin W. Cortical columns, modules, and Hebbian cell assemblies. In Michael A. Arbib, editor, The
Handbook of Brain Theory and Neural Networks, (1998) pages 269-272. MIT Press, Cambridge, MA.
[17] Hashmi A. G. and Lipasti M. H. (2009). IEEE Symposium on Computational Intelligence for
Multimedia Signal and Vision Processing, CIMSVP '09.
[18] Rakic P. Specification of cerebral cortical areas, Science, 241, (1988) 170–176
[19] Bugbee, N. M. & Goldman-Rakic, P. S.. Columnar organization of cortico-cortical projections in
squirrel and rhesus Monkeys: similarity of column width in species differing in cortical volume. J. Comp.
Neurol. 220, (1983) 355–364.
83 | P a g e
[20] Grossberg S. How does the cerebral cortex work? Learning, attention and grouping by laminar
circuits of visual cortex. Spatial Vision, 12, (1999) 163-185.
[21] Geoffrey J. Goodhill, Miguel A. Carreira-Perpinan, Cortical Columns, Encyclopedia of Cognitive
Science, Macmillan Publishers Ltd., 2002. http://cns.georgetown.edu/~miguel/papers/ecs02.html
[22] Silberberg G, Gupta A, Markram H. Stereotypy in neocortical microcircuits. TRENDS in
Neurosciences, 25, (5): (2002) 227-230
[23] Ralph Stoop, Victor Saase, Clemens Wagner, Britta Stoop, and Ruedi Stoop. Beyond Scale-Free
Small-World Networks: Cortical Columns for Quick Brains. Phys. Rev. Lett. 110, (2013) 108105.
[24] V. M. Eguiluz, D. R. Chialvo, G. A. Cecchi, M. Baliki, and A. V. Apkarian. Scale-free brain functional
networks. Phys. Rev. Lett. 94, (2005) 018102.
[25] L. Otis, Müller's Lab. Oxford University Press, Oxford, (2007).
[26] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, D.-U. Hwang. Complex networks: structure and
dynamics. Physics Reports 424, (2006) 175–308.
[27] R. Albert, A.-L. Barabasi. Statistical mechanics of complex networks, Rev. Mod. Phys. 74, (2002) 47–97.
[28] Scott, J., Social Network Analysis: A Handbook, Sage Publications, London, 2nd ed. (2000).
[29] Milgram, S., The small world problem, Psychology Today 2, 60–67 (1967).
[30] M. E. J. Newman. The structure and function of complex networks, SIAM Review 45, 167-256 (2003)
[31] Redner, S., How popular is your paper? An empirical study of the citation distribution, Eur.
Phys. J. B 4, (1998) 131-134.
[32] M.E.J. Newman, M. Girvan. Finding and evaluating community structure in networks, Phys. Rev. E
69, (2004) 026113.
[33] Erdős, P. and Rényi, A. On random graphs, Publ. Math. 6, (1959) 290–297.
[34] Z. Burda, J. Jurkiewicz, A. Krzywicki, Physica A 344 (2004) 56.
[35] B. Bollobas, Random Graphs, Academic Press, London, 1985.
[36] Watts, D. J., Strogatz, S. H. Collective dynamics of 'small-world' networks. Nature 393, (1998)
440–442.
[37] M. Mehta, Random Matrices, Academic Press, New York, (1995).
[38] M. E. J. Newman, D. J. Watts, Renormalization group analysis of the small-world network model,
Physics Letters A 263, (1999) 341.
[39] A. Barrat, M. Weigt. On the properties of small-world network models, Eur. Phys. J. B 13 (2000) 547.
[40] Albert, R., H. Jeong, and A.-L. Barabási, Nature (London) 401, (1999) 130.
84 | P a g e
[41] Barthelemy, M., and L. A. N. Amaral, Phys. Rev. Lett. 82, (1999) 3180; 82, 5180(E).
[42] Faloutsos, M., P. Faloutsos, and C. Faloutsos, , Comput. Commun. Rev. 29, (1999) 251.
[43] Newman, M. E. J., S. H. Strogatz, and D. J. Watts, 2001, Phys. Rev. E 64, 026118.
[44] D.J.de S. Price, J. Amer. Soc. Inform. Sci. 27 (1976) 292.
[45] A.-L. Barabasi, R. Albert, Science 286 (1999) 509.
[46] S.N. Dorogovtsev, J.F.F. Mendes, A.N. Samukhin, Phys. Rev. Lett. 85 (2000) 4633
[47] P.L. Krapivsky, S. Redner, F. Leyvraz, Phys. Rev. Lett. 85 (2000) 4629.
[48] Granovetter, M. The strength of weak ties. American Journal of Sociology 78, (1973) 1360-1380.
[49] Luczkowich, J. J., Borgatti, S. P., Johnson, J. C., Everett, M. G. Defining and measuring trophic role
similarity in food webs using regular equivalence. Journal of Theoretical Biology 220, (2003) 303–321.
[50] M. E. J. Newman, 2004. Analysis of weighted networks. Phy. Rev. E 70, 056131
[51] A. Barrat, M. Barthelemy, R. Pastor-Satorras, A. Vespignani, Proc. Natl. Acad. Sci. USA 101 (2004)
3747.
[52] Ibarz B., Casado J.M., Sanjuán M.A.F. Map-based models in neuronal dynamics, Physics
Reports 501 (2011) 1–74.
[53] A.L. Hodgkin and A.F. Huxley, "A quantitative description of membrane current and its application
to conduction of excitation in nerve," J. Physiology, 117: (1952) 500-544.
[54] Nagumo J., Arimoto S., and Yoshizawa S. An active pulse transmission line simulating nerve axon.
Proc IRE. 50: (1962) 2061–2070.
[55] FitzHugh R. Impulses and physiological states in theoretical models of nerve membrane. Biophysical
J. 1: (1961) 445-466
[56] FitzHugh R., Izhikevich E. FitzHugh-Nagumo model. Scholarpedia 1(9): (2006) 1349.
[57] Goodman, D. & Brette, R. Brian: a simulator for spiking neural networks in Python. Frontiers in
Neuroinformatics, 2: (2008) 5. doi: 10.3389/neuro.11.005.2008.
[58] Eugene M. Izhikevich. Simple model of spiking neurons. IEEE Transactions on Neural Networks,
VOL. 14, (2003) NO. 6.
[59] Rulkov NF. Modeling of spiking-bursting neural behavior using two-dimensional map. Phys. Rev. E
Stat. Nonlin. Soft Matter Phys. 65: (2002) 041922.
[60] N.F. Rulkov, I. Timofeev, M. Bazhenov. Oscillations In Large-Scale Cortical Networks: Map-Based
Model Journal of Computational Neuroscience 17, (2004) 203–223
[61] J. M. Bower and D. Beeman. The Book of GENESIS: Exploring Realistic Neural Models with the
GEneral NEural SImulation System. Telos, Santa Clara CA, (1998).
85 | P a g e
[62] M. L. Hines and N. T. Carnevale. The NEURON simulation environment. Neural Comp., 9: (1997)
1179–1209.
[63] D. O. Hebb. The Organization of Behavior. Wiley, New York, (1949).
[64] Lapicque, L. Recherches quantitatives sur l'excitation électrique des nerfs traitée comme une
polarisation. J. Physiol. Pathol. Gen. 9: (1907) 620–635.
[65] White, H., Artificial Neural Networks: Approximation and Learning Theory, Oxford, UK: Blackwell
(1992).
[66] Wasserman, P.D., Advanced Methods in Neural Computing, New York: Van Nostrand Reinhold
(1993).
[67] Hardle, W. Applied Nonparametric Regression, Cambridge, UK: Cambridge University Press (1990).
[68] Mantas Lukosevicius and Herbert Jaeger. Reservoir computing approaches to recurrent neural
network training. Computer Science Review, 3(3): (2009) 127–149. ISSN 1574-0137. doi:
10.1016/j.cosrev.2009.03.005.
[69] John J. Hopfield. Hopfield network. Scholarpedia, 2(5): (2007) 1977.
[70] Geoffrey E. Hinton. Boltzmann machine. Scholarpedia, 2(5): (2007) 1668.
[71] Hochreiter, Sepp; and Schmidhuber, Jürgen; Long Short-Term Memory, Neural Computation, 9(8):
(1997) 1735–1780,
[72] Wolfgang Maass, Thomas Natschlager, and Henry Markram. Real-time computing without stable
states: a new framework for neural computation based on perturbations. Neural Computation, 14(11):
(2002) 2531–2560.
[73] Herbert Jaeger. The "echo state" approach to analysing and training recurrent neural networks.
Technical Report GMD Report 148, German National Research Center for Information Technology,
(2001).
[74] Verstraeten, D.: Reservoir Computing: Computation With Dynamical Systems, Electronics and
Information Systems. Thesis: Gent, Ghent University, (2009)
[75] Tong, M.H., Bickett, A.D., Christiansen, E.M., Cottrell, G.W. Learning Grammatical Structure with
Echo State Networks. Neural Networks 20(3), (2007) 424–432.
[76] Jaeger, H. Short Term Memory in Echo State Networks. GMD Report 152, German National
Research Institute for Computer Science, (2001)
[77] Eck, D. Generating Music Sequences with an Echo State Network. Neural Information Processing
Systems 2006 Workshop on Echo State Networks and Liquid State Machines (2006)
86 | P a g e
[78] H. Jaeger and H. Haas. Harnessing nonlinearity: predicting chaotic systems and saving energy in
wireless communication. Science, 304: (2004) 78–80.
[79] Jaeger, H., "Echo state network", Scholarpedia, vol. 2, no. 9, (2007) pp. 2330.
[80] Joshi, P., Maass, W.: Movement Generation with Circuits of Spiking Neurons. Neural Computation
17(8), (2005) 1715–1738.
[81] H. Jaeger: Tutorial on training recurrent neural networks, covering BPPT, RTRL, EKF and the "echo
state network" approach. GMD Report 159, German National Research Center for Information
Technology, 2002 (48 pp.)
[82] Cencini M, Torcini A. Linear and nonlinear information flow in spatially extended systems. Physical
Review E, vol. 63, (2001) pp. 056201-1-056201-13, DOI: 10.1103.
[83] J.G. Kemeny and J.L. Snell, Finite Markov Chains. Van Nostrand Reinhold Company, New York,
(1960).
[84] Bullmore E., Sporns O. The economy of brain network organization. Nature Reviews Neuroscience
(2012), www.nature.com/reviews/neuro
87 | P a g e