(IJCNS) International Journal of Computer and Network Security, Vol. 2, No. 8, August 2010

Students’ Perspectives on their Preferences in the Networked Learning Environment

Sha Li
Alabama A & M University, School of Education
4900 Meridian St., Normal, AL 35762, USA
sha.li@aamu.edu

Abstract: The creation of learning resources on the Internet provides learners with rich learning information in a variety of digital formats. But the new learning paradigm generates new issues. Many instructors have designed learning resources online but found that their students have not used them as effectively as expected. There is a need for us to understand the learners’ preferences and their needs in using online learning resources to help us teach and design instruction more effectively in the online-resource aided learning environment. This study uses the course FED 529 Computer-Based Instructional Technology as a case to explore the learners’ preferences in the use of the online learning resources the class provides. This study uses both quantitative and qualitative methods. Nine student preferences were found in the students’ use of the course’s online learning resources. The findings could help us design better online learning resources to meet the learners’ preferences and needs, motivate student involvement, and enhance the instructional effectiveness in the Internet assisted learning environment.

Keywords: Instructional Design, Online Learning Resources, networked learning, distance education.

1. Introduction

Internet-based learning resources play an important role in learning. Teachers are incorporating, planning and designing effective resource-based online learning environments to enhance learners’ academic development [2, 16]. Digital resource-based learning promotes the learners’ acquisition of effective information skills and engenders high quality outcomes [8, 23]. Project-based learning under a resource-assisted learning environment cultivates the learners’ hands-on skills as well as their thinking skills such as problem solving, reasoning, and critical thinking through information handling and creative experiential work [36, 41]. It encourages active participation and information skill development through self-directed learning and reflection as opposed to teacher-directed instruction. It also allows for the active construction of personal understanding through a variety of activities [17, 40]. In the resource-based learning process, students learn how to connect information searching with knowledge growth [30] and improve their confidence and attitudes toward academic success and self-image [12, 38]. Ehlers indicates that academic information transparency is an important variable in an online learning environment while providing resources and course information to the learners [13]. Because of the available multimedia effect, the online resources could foster enthusiasm and better meet the needs of different learners. They also inspire self-image through active interaction with information as well as with peers [21, 43]. Peluchette and Rust conducted a study of faculty members’ preferences in using technology facilities for classroom instruction [34]. They found that most faculty in their study preferred the use of the basic technology facilities for daily instruction, such as overhead transparencies, PowerPoint, Blackboard and whiteboard. Very few of the faculty members expressed a preference for teaching courses online. Learners have their own learning preferences in relation to their habits or preferences in taking in and processing information resources [6, 22, 33]. Felder and Brent indicate that the learner’s preference relates to the learning style differences that exist among college students. Tailoring the instructional resource design to meet the different student learning styles could yield better academic outcomes [15]. Kvavik states that the preferences the new generation of students have in their technology-facilitated learning environment are critical issues for us. He found in his study that students showed their highest preference level for the moderate use of technology, and the lowest preference level for the classes that have no technology or the classes that are entirely online. In addition, the students’ motivation for the use of learning technology was very much tied to the requirements of the course curriculum [24]. There are four major types of learning styles that guide the learner’s preferences: (1) visual learners (learn by seeing), (2) auditory learners (learn by hearing), (3) reading/writing learners (learn by processing text), and (4) kinesthetic learners (learn by doing) [35]. Laddaga and his colleagues found in their study of the student preference for audio in computer-assisted instruction that there are strong differences in preference for a visual or auditory presentation mode. They suggest providing both formats of presentation mode for the learners’ options [25].

In this Information Age, all information reaches users in multiple dimensions. Helping students develop their information skills is as important as teaching the subject matter. Integrating Internet-assisted resources into instruction has become fashionable and the associated pedagogical strategies are flourishing. Brown and Liedholm report that students’ cognitive strategies are the motivating factor in choices about the learning materials, and their learning strategies and preferences for learning materials are very diverse [4]. Felder indicates that academic
information comes on a daily basis in both visual and aural formats. Developing students’ skills to take in information in both formats is necessary. He says that if an instructor teaches in a manner that disagrees with the student’s learning preference, the student’s discomfort level may increase enough to interfere with his learning. However, if the instructor teaches exclusively in the student’s learning preference, or learning style, the student may feel comfortable to learn, but his cognitive dexterity might not be developed enough to meet the future challenges [14]. Byrne’s study indicates that learners tend to prefer learning with some online multimedia better than others, depending on their individual learning style [7]. Curtis and Howard used a slide show to provide text-based tutorials on how to assemble computer hard drives for the verbal/sequential learners in their computer science class [14]. They also provided multimedia-based tutorials in the format of pictures, animation and video for the visual/global learners. This allowed students a variety of options to learn in their own preferred manner. Using multimedia programs to improve instruction and motivate learning has become a new issue for teachers. The Internet provides a convenient platform for teachers to exert their expertise to design and teach classes in a resource-based network environment. An understanding of the learners and their preferences in this learning format would help us increase the quality of the instructional design and strategy in the technology-rich learning environment.

Li and his colleagues [26] have created an online learning model, the Online Top-Down Modeling Model, to help enhance the learning effectiveness through the learner-resource interaction on the website of a graduate technology course, FED 529 Computer-Based Instructional Technology, at http://myspace.aamu.edu/users/sha.li. The FED 529 course is a graduate computer literacy course. It is taught in a blended format. The course has a resource-rich website that offers, in particular, previous student project models such as Word projects, PowerPoint projects, Excel projects, graphics design projects, and web page design projects. This strategy advocates retrieving the class resources from the course website to show model projects to students first, and then to demonstrate the new program tools and teach specific skills when each new project is taught. Through this strategy, effective learning occurred, and the students’ motivation and positive attitude toward the use of technology-aided learning resources increased. This article is a follow-up to that previous study on the Online Top-Down Modeling Model. It looks deeper into the learners’ world during their instruction under the Online Top-Down Modeling Model, trying to better understand the learners’ needs and perspectives to add to the literature on the effectiveness of the online learning resources from the learner’s side. This article focuses on the learners’ preferences in using online learning resources while learning under the Online Top-Down Modeling Model.

1.1 Theoretical Framework of the Study

Michael Moore defines the three major types of interaction in an IT-driven distance learning setting: the learner-learner interaction, the learner-teacher interaction, and the learner-content interaction [29]. The learner-content interaction is gaining more interest in empirical studies, which mostly focus on integrating learning resources into learning activities. The learner’s preference in an online learning resource rich environment has aroused research attention. Educators try to understand the learners’ preferred ways to use the available online resources. Linda Jensen describes the interaction with the content as follows:

Interaction with the subject matter is the heart of education…. In order to learn, students must have a meaningful interaction with the content, and the content must be presented in such a way that students will be motivated and inspired to think deeply about it. Since the media used for instruction can greatly affect how students interact with the content, there is a great deal of interest in determining how to maximize the benefits of using individual or combinations of media [18].

Brown and Liedholm conducted a quantitative study on the student preference in using online learning resources based on the learner’s three cognitive styles: 1) visual versus verbal; 2) active versus reflective; and 3) concrete versus abstract. They assert “Generally speaking, students vary in their cognitive or learning styles and therefore would benefit from teaching techniques that appeal to their individual styles” [4]. In their study, they collected data from a business economics course in a traditional class format, which is theory and concept-driven, and based on reading and lecturing plus online learning resources. They found that students valued the streaming video lecture the highest and the classroom lecture the next. They also found that the students’ cognitive styles correlated with the values of different kinds of resources, but students’ GPA was not related to the value of any of the resources. They concluded that the blended course is more effective than the traditional course because it has more options and learning resources to support the students [4]. Ke and Carr-Chellman assert that the solitary learner’s preference and the social learner’s preference might not differ in their perspectives on the learning situation. The solitary learner in an online collaborative learning environment prefers internal interaction, collaboration in an independent manner, and interaction academically rather than socially [20]. The learner’s preference for the online learning resources is an indicator of their interaction effectiveness from the perspective of the learner. Understanding the learner’s preference for using online learning resources would enhance the design and instruction of the Internet resource-based learning activities from the standpoint of the learner-content interaction.

2. Methodology

This study is an action research which adopts the qualitative and quantitative mixed method to collect and analyze the data for findings. The purpose of this study is to understand the learners’ preference and preference-related perspectives in the Online Top-Down Modeling Model environment in the FED 529 Computer-Based Instructional Technology class. The FED 529 course was used as a case to explore the
learners’ preferences in using online learning resources. The qualitative and quantitative mixed research method is regarded as better able to explain the process of an event and to give a more meaningful result [10]. The mixed research method draws from the strength of both qualitative and quantitative approaches, and minimizes the weakness of either in a single research study [19]. It increases the validity and reliability of the findings by allowing examination of the same phenomenon in different ways and promotes better understanding of findings [1]. Sixty-five students from four classes of the FED 529 course and the instructor, Dr. Lee, participated in this study. The data collection covered the fall semester of 2007 and the spring semester of 2008. We used interviews, observation and surveys to collect both qualitative and quantitative data, combined with descriptive statistics and Pearson’s Correlation in analysis. For some of the issues, we used graphs with SPSS to visualize the relation of the variables. We present both quantitative data and qualitative data item by item. In this way, we hope that readers will clearly understand each itemized analysis and perspective holistically.

3. Data Analysis

In the FED 529 Computer-Based Instructional Technology class, the instructor, Dr. Lee, uses a course website to integrate the online learning resources into this blended class. The course website (http://myspace.aamu.edu/users/sha.li) provides, in addition to the course information and syllabus, computer-based assignment project models/samples, FAQ tutorials (including text FAQs and video FAQs), and multimedia online resources such as clip art, photos, sound clips, music background files, sample instructional websites, etc. to enhance learning. Since the class resources are online, students can access them anytime and anywhere. Integrating the course website resources in instruction can provide necessary support to the students who have unmet needs in class because of skill level, background, and possible absences, leaving no one "left behind." The teacher introduces the course website to the students during the first class of the semester and also introduces other useful website links to students, such as Marcopolo, Alex, Intel educational site, United Streaming, Alabama Virtual Library, etc. There is no forceful requirement for the students’ use of the online learning resources or what kind of resources they need to use. Students could have their own different preferences to access or use the online learning resources available to them. This study tries to find their preference tendencies and analyze the learners’ perspectives and attitudes related to their preferences of using online learning resources. There are nine major preferences found in this study.

3.1 Prefer Using the Course Website Resources vs. Resources from Other Websites

In class, we see that students are immersed in learning the new computer skills needed to create multimedia-enhanced projects like PowerPoint presentations, web page design, flyer design, and graphics design. During their learning, students generate different needs to master the knowledge and skills. Besides asking the teacher and classmates for help, students also frequently access the available online resources to find solutions; some of them prefer using more of the course website resources, and others prefer using more resources from other websites. Students gave survey feedback first on their own entrance computer skill level and then on their preference of using online learning resources from the course website vs. from other websites. Table 1 and Table 2 break these into percentages.

Table 1: Self-Assessed Student Skill Level
Question: As compared to the peers in class, I rate my computer skill level as

Choices             N*   Valid %
1. Low level         9   13.9
2. Middle level     40   61.5
3. High level       16   24.6
* N = number

Table 2: Preference for Using Course Website vs. Other Websites
Question: Your preference for using online learning resources is

Choices                                                N   Valid %
1. I used more resources from FED 529 course website  28   43.1
   than from other websites.
2. I used more resources from other websites than      8   12.3
   from FED 529 course website.
3. I used FED 529 course website resources only.       1    1.5
4. I used other website resources only.                2    3.1
5. I used the FED 529 website and other websites      26   40.0
   equally.
6. I used neither website.                             0      0

Table 1 shows that the majority of the students rate themselves at the middle level of technology skill. In Table 2, the highest rate of the students’ preference of using online learning resources focuses on the FED 529 course website (43.1%); the next is for the use of both the FED 529 course website and other websites equally (40.0%). While 12.3% of the students chose to use more resources from websites other than the FED 529 course website, only 4.6% chose to use either the course website only or other websites only. No one chose not to use learning resources from the Internet. If we add the percentages in Choice 1 and Choice 5, which emphasize the preference of using the course resources online, that is a total of 83.1%. The students’ verbal responses to this issue illustrate these trends:
• Our [class] online resources were extremely helpful. They provided the necessary help and tools to complete my assignments…. I learned a lot in a short period of time about computers.
• I think that this class made the use of the online learning resources more understandable and interesting. The website caught my attention. This site made me want to use the internet more.
• [It provides you] the opportunities to work at your own pace and having the resources ready available to meet your project needs.
• Using the online resources could give you the freedom to access the information at your own leisure. The student can also see an array of different kinds of information about a topic all in [one] spot. I really benefited from the course website resources as well as the other online resources during my study. But I prefer using more of the class website, because it provides information and support tied to our projects and assignments.
• The regular Internet resource websites have to be previewed in depth before using. Many sites contain too many advertisements or inappropriate materials for the intended project.
• Whenever I couldn’t remember what was said exactly, I could always online for a demonstration. I will use it for creating projects to enhance my learning.
• I feel that the online learning resources [of this course site] are very helpful. I use them to keep assignments fresh and to learn new things I may have missed in class.
• The online learning resources are well explained. It provides ample time for me to complete my assignments. I can learn and make my projects at my own pace, and experience more in-depth learning and understanding.

The responses indicate that the major benefits of the resources from the course website are that they are very convenient, focus on class content and tasks, allow students ample time to finish projects at their own pace, and let students experience more in-depth understanding and learning. The benefits of the resources from other sites include more variety of resources and a broadened vision on multimedia projects. Forty percent of the student respondents prefer retrieving resources from both the course website and the other websites as well. Three of those students’ verbal responses showed their reasons:
• I like to get resources from both the class website and other websites because I would like to see more materials on how to integrate technology into specific subject areas: math, science, etc.
• I like the resources from both the FED 529 website and the online resources. Searching for information requires time and sometimes patience. Using both sources offers a broader range of choices.
• I used the FED 529 site mainly for technical things. I might use other resources if I would like to get more creative with my sounds, pictures, clipart, or other things.

It is clear that using both the course website and other websites to locate information and resources could combine the strength of both resources, which could help students yield a higher quality learning outcome. The correlation between skill level and the preference for using the course website vs. other websites is r = -.053, p > .05. It is not significant.

3.2 Amount of Multimedia Preferred

The major characteristic of the project resources online is to provide many active multimedia features, such as sound, animation and video. Multimedia features help build interest, motivation, comprehension, retention and imagination [26]. Most students like to follow the online models as a start. They expressed their preference on following the multimedia projects and integrating multimedia features into their projects. Table 3 summarizes their preferences on this issue.

Table 3: Preference on the Degree of Multimedia Use
Question: My preference to integrate multimedia features (like text, graphics, sound effects, music, animation, video, etc.) into my projects after the online models is

Answer Choices                                         N   Valid %
1. I would like to integrate all kinds of multimedia  58   89.2
   formats, such as text, color, graphics, photos,
   sound, animation, video, etc.
2. I would like to consider using two kinds of         5    7.7
   multimedia formats.
3. I would like to consider using only one kind of     1    1.5
   multimedia format.
4. I would not consider using any multimedia           1    1.5
   (except text).

The survey shows the students’ highest preference is for the use of a combination of various formats of multimedia to create projects. Byrne conducted a study on the learner’s preference of educational multimedia with the use of self-directed online learning resources [7]. His study shows that students prefer learning with some type of online multimedia resources, but their preferences vary according to their individual learning style. To understand the students’ multimedia preference related to their multimedia related learning style, we categorized the students’ multimedia related learning style to see if there is any relationship between their learning style and their multimedia preference. The students’ learning styles are shown in Table 4.

Table 4: Students’ Media Related Learning Style
Question: Your learning style regarding the aid of multimedia is

Answer Choices                                         N   Valid %
1. I am a text-driven learner.                         2    3.1
2. I am a graphic-driven learner.                     21   32.3
3. I am a sound-driven learner.                        6    9.2
4. I prefer all of the above media formats in         36   55.4
   learning.

From the table above, we can see that the number 4 choice, the preference on the use of multiple multimedia, is the highest (55.4%); the next is the number 2 choice, graphic-
driven preference (32.3%). After that are the number 3 and number 1 choices. So we can see that even though sound is an important factor in learning resources, graphics are much more preferred. This is similar to the finding of Ross et al. that most students value integrating a variety of multimedia instead of one or two forms of multimedia [37]. The correlation between the students’ media related learning style and their multimedia preference is r = -.203, p > .05. It is not significant. We graph the Distribution of Students’ Multimedia Preference and their Media Related Learning Style as follows:

Figure 1. The Histogram of the Students’ Multimedia Preference and their Media Related Learning Style

From the graph above, we see that the majority of the students’ preferences, no matter what media related learning styles they possess, cluster around Choice 1-All Media, a few with Choice 2-Two Media, fewer with Choice 3-One Media, and nobody chose Choice 4-No Media. So even though the students’ multimedia preferences vary, they are not restricted by their learning styles. It seems that multimedia is a commonly shared learning preference for them to maximize achievement. This finding does not really agree with Byrne, who states that learners tend to prefer learning with some online multimedia better than others, depending on their individual learning style [7]. We examined the students’ verbal feedback on this issue, which follows:
• I like to integrate multimedia features because it makes the final product look professional. It is also very interesting and fun to see many different enhancements you can attach to your project.
• Using more [multimedia] features allows your presentation to be more exciting and relayed to your audience. If a project is too simple or simply put together, it’s usually hard to get your point across without boring others.
• I think when you use multimedia in PowerPoint, for example, it keeps the attention of your audience longer than if you were not to include anything.
• I work with special need children. Some seeing, some not. I would use sound for my blind or low vision class, and graphics for the children who are seeing.
• It [multimedia] gives the students more to see than text only project. It helps you explain or demonstrate an idea. But sometimes it is hard to focus on the information if too many multimedia forms are used.

Dr. Lee also notes that when he began to teach this FED 529 class, he presented students with printout project models, such as flyers, web pages, PowerPoint presentations and graphics design. Since the creation of the course website for FED 529, the model projects have all moved online. This brings all the multimedia features alive when showing model projects to the students either from the web browser or downloaded to show on the computer. Students are motivated by the multimedia functions and interested to learn with multimedia. He could not see very many learning style differences related to the preference of using multimedia in learning to create assigned projects. One possible reason for this might be that the information carried by the visual, sound, animation and video has the closest representation of the real world that relates to everybody’s daily life and personal experiences. The next reason might be that the students’ computer skill level has generally increased as compared to past years. This greatly increases their ability to decode multimedia based information other than text and reduces the dependence on one kind of media such as text to receive information. Thus, their learning styles related to the retrieval of information have changed to accommodate the new information formats to be more effective.

3.3 Online Learning Resources vs. Printout Learning Resources Preferred

The FED 529 class website has many resources to cater to the students’ needs for the content areas, anytime and anywhere. Students could view them online or download them to carry around. Those resources include syllabus, rubrics, project models, FAQ tutorials, multimedia resources (sound clips, clip art, photos, videos, sample websites, etc.) and links to other resources such as the Thinkfinity, United Streaming, Virtual Library, etc. This facilitation supports students during their learning processes. The students’ feedback on their use of the learning resources online vs. traditional printout resources is listed in Table 5.

Table 5: Student Preference for Online Learning Resources vs. Printout Resources
Question: For learning resources formats used in this class, such as project models, FAQs, multimedia resources, etc., which do you prefer?

Answer Choices                                         N   Valid %
1. I prefer online learning resources                 30   46.2
2. I prefer traditional printout resources             1    1.5
3. I prefer both                                      34   52.3
4. I prefer neither                                    0      0

The highest preference is number 3 for both online learning resources and the traditional printout learning resources (52.3%). The second highest preference is number one for the online learning resources (46.2%). The preference for
only the traditional printout resources is extremely low,


(1.5%), and the preference for neither is 0%. If we put In the first question, 61.5% of the students prefer the
choices 1 and 3 together, both of which relate to the use of syllabus online and 38.5% prefer the printout syllabus if
online learning resources, the total number will be 98.5%. there are only two options; in the second question, the
Clearly, the resources online are the most favored by the students who prefer a printout syllabus dropped to 4.7%,
students. The students gave their verbal feedback to further while those who prefer an online syllabus becomes 22.0%,
detail this: and those who prefer both formats is 60.9%, if there are
• Putting the learning resources online gave me three options for them to choose from. That means only
a wide arrange of options to choose from. It 4.7% of the students prefer a printout syllabus only in any
broadened my computer vision and skills. I condition; all others prefer a syllabus either online or in
really need those from the website, such as both formats. We unfold the students’ opinions to see how
schedule, syllabus, models, Sounds, clipart, they explain this in their own perspectives:
FAQs, and blackboard access. • I like it [syllabus] online because I don’t have
• If I forget how to use a program, I would to keep up with the hard copy. I am always on
immediately access the website to get the information I needed. The printout resources are useful, but are dead. The resources online have all the multimedia features embedded, [such as] sounds, animation, vivid color, videos, and changeable features. Resources online are live ones. The printout is dead.
• I used the syllabus each week before class. I then looked at student work examples on the Internet and any teacher’s directions available there to help me prepare for class. When I look at the FAQs, I like to turn on speakers to hear the teacher’s voice, but the printout FAQs don’t have voice. It’s silent text only. But the printout is easier to read. Having an access to both online resources and a copy of the printout in hand is better.
We can see that students favor the online learning resources because resources online keep their computer-based multimedia features “live” on computers and can be reached over the Internet at any time. The advantage of the printout resources is that they are “easy to read.” The comparison of the features of online learning resources and printout learning resources is listed in Table 13 in the Summary and Discussion section.

3.4 The Online Syllabus vs. the Printout Syllabus

To learn about the students’ preferences more specifically, our survey probed their preferences in specific areas. Their preference for the syllabus in the online format and the printout format was solicited in two questions, as shown in Table 6:

Table 6: Student Preference for Online Syllabus and Printout Syllabus

Question: If you have only two formats of syllabus to choose from, which do you prefer?
Answer Choices                          N     Valid %
1. I prefer the online syllabus.        40    61.5
2. I prefer the printout syllabus.      25    38.5

Question: If you have three formats of syllabus to choose from, which do you prefer?
Answer Choices                          N     Valid %
1. I prefer the online syllabus.        22    34.4
2. I prefer the printout syllabus.      3     4.7
3. I prefer both.                       39    60.9

my laptop computer and find it very easy to access it. If it is a hard copy, it is just another piece of paper to keep up with and fill. It is hard to lose it online.
• I like the online syllabus because I can access it at all times from any computer.
• I prefer an online syllabus simply because it never gets lost, and it’s easy to access without long pointless searches, since I’m always on the computer. Anyway, online syllabus works better for me.
• I like it online because I have access to it wherever I was. I occasionally travel, and I could refer to it when doing homework away from home.
Those students who prefer both formats presented their reasons as follows:
• They both are effective. The online syllabus is better because a person can access the syllabus at all times.
• I like both methods of receiving a syllabus. It is always important to have a backup. It is good to have a copy online in case you lose your hardcopy. However, it can be somewhat of a hassle to have to look up assignments online. I think both are good. The online is especially good if you remember to print out a copy.
• I like both. It depends on what you are doing. Sometimes you want it to be online because it is always there. But the paper handouts are easier to read than on the Internet. You don’t have to click, click, and look around for them.
• I prefer syllabus in both areas. I’m a student who doesn’t have a printer. But a hardcopy will help me when I am at home, even though I like the online syllabus.
• In my own background I used to read only printout syllabi. I didn’t know how to type in the course web address to find it online. After one semester, I can locate the syllabus on the Internet. I use both ways to access syllabus now.
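The valid percentages reported in Table 6 can be recomputed from the response counts. The sketch below is only an illustration: it assumes that "Valid %" means each count divided by the number of non-missing responses to that question, and the helper name `valid_percent` is hypothetical, not part of the study's analysis.

```python
def valid_percent(counts):
    """Return each count as a percentage of the valid (non-missing) responses.

    Assumption: "Valid %" = count / total of valid responses * 100,
    rounded to one decimal place as in the tables.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no valid responses")
    return {choice: round(100 * n / total, 1) for choice, n in counts.items()}


# Counts for the two-format question as reported in Table 6.
two_formats = {"online": 40, "printout": 25}
print(valid_percent(two_formats))
```

Applied to the counts of the two-format question (40 and 25), this reproduces the reported 61.5% and 38.5%.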
The students who prefer only a hardcopy of the syllabus gave their opinions as follows:
• I prefer a hardcopy so that I can make notes and/or changes to assignment due dates on it as the semester goes on.
• I like a hard copy because I’m often in a position without either internet or computer. I do not have time to go online. I like getting it the first day of the class.
• Hardcopy! This way, I can access it anytime and anywhere without technical difficulty.
In summary, students find that the advantages of the online syllabus are its convenient access, its ease of use when traveling, and the fact that it cannot be lost. The students who like the syllabus in both the online and printout formats state that having access to both is more beneficial and flexible, because they can take advantage of each. The students who prefer the printout format indicated that a hard copy is good for keeping notes and marking any changes to assignment due dates. It is also useful when there is no access to the Internet or a computer, or when students hope to stay away from technical frustration. A positive aspect of this class is that it provides both syllabus possibilities: the online format and the printout format (students could print out the syllabus in the lab or at home from the Internet).

3.5 Preferring Viewing Project Models Online vs. Printout Project Models

The project models provided online are an important component in the Online Top-Down Modeling Model. To learn how effective the online models are for the students, we probed the students for their feedback as follows:

Table 7: Student Preference for Online Projects vs. Printout Projects

Question: When viewing previous students’ project models/samples, you prefer
Answer Choices                                     N     Valid %
1. Viewing them online.                            35    53.8
2. Viewing them in printout.                       0     0
3. Viewing them in both of the above formats.      30    46.2

The data shows that 53.8% of the students prefer project models online; 46.2% of the students prefer both formats; and 0% of the students prefer project models in printout. The correlation between the student skill level and their model format preference is r = -.163, p > .05. It is not significant. The graph of the relation between the skill level and the model format preference by the percent of the cases is as follows:

Figure 2. The Student Skill Level and Preference for Online Models vs. Printout Models. (Option 2 for the printout format is omitted because it has 0 people selected)

From the above graph (Figure 2), we see that the line slopes of the high level and low level students regarding the preference for the online models and for both online/printout formats go in opposite directions, while the middle level stays horizontal. Students’ verbal responses to this issue further illuminate these preferences:
• Every resource on the FED 529 course website has helped me or motivated me to learn in some sort of way. I am glad I had to take this class because it increased my computer knowledge and gave me examples of work presented by others.
• I like the online models projects. For example, the flyer project model. When I open it, I just see different layout and design. I got an idea what other people did, how they did it, what color they used. It helps me spark my idea to work on my own project out.
• If showing me only the printout model projects, many things I cannot grasp completely. Because I cannot fully understand without animation, music playing, formation of the multimedia components like animation path and sequencing. These features are only functioning when they are playing on computers or on Internet.
• The resources on the FED 529 course website are a wonderful guide. Honestly, I would get lost without many of the examples. I kept them minimized [with Internet browser] while I was working [on my computer].
It is obvious again that the multimedia features are the main attraction in producing good computer-based projects. But the majority of those features function only on computers or on the Internet, not on a paper printout. That is why students prefer them online or in both formats, instead of printout only.

3.6 Preferring Format of Text FAQ vs. Video FAQ

The FAQs are actually tutorials provided on the FED 529 course website for students to learn how to make new projects. There are two kinds: the text FAQ and the video FAQ. The text FAQ is a text tutorial and the video FAQ is a screen-captured video tutorial. It is made with Windows Media Encoder, which is similar to Camtasia screen capture. Windows Media Encoder is freeware provided by Microsoft. The students’ feedback on the use
of the FAQ formats is solicited through two questions. They are shown in Table 8.

Table 8: Students’ Preference for Text FAQ vs. Video FAQ

Question: If you want to use FAQs to learn a new project and you only have two options, which do you prefer?
Answer Choices                      N     Valid %
A. I prefer using text FAQs.        6     40.6
B. I prefer using video FAQs.       39    59.4

Question: If you want to use FAQs to learn a new project and you have three options, which do you prefer?
Answer Choices                      N     Valid %
A. I prefer using text FAQs.        13    20.0
B. I prefer using video FAQs.       16    24.6
C. I prefer both                    36    55.4

In the first question of Table 8, the feedback shows that 59.4% of the students prefer using video FAQs, while 40.6% prefer using text FAQs, if only two options are provided. The second question extends the choice from two options to three. The responses to the second question show that 55.4% of the students prefer using both formats of FAQs, 24.6% prefer using video FAQs, and 20.0% prefer using text FAQs. For the first question, with two options provided, the correlation between skill level and FAQ preference is r = -.165, p > .05, which is not significant. For the second question, with three options provided, r = -.047, p > .05, which is not significant either. We then graphed the two relations by the percent of the cases with SPSS in the following two figures.
If only two options are provided, we can see in the graph that the low technology level students have a sharp tendency to use the video FAQs and not to use text FAQs at all; the middle technology level students have a similar tendency of using more of the video FAQs and less of the text FAQs. Only the high technology level students have a tendency to use more of the text FAQs and less of the video FAQs.

Figure 3. The Relation between the Skill Level and the 2 FAQ Options

If we merge the low level students’ tendency line with the middle level students’ tendency line, it becomes even clearer that the tendency lines of the high technology level students and the lower technology level students go in opposite directions.

Figure 4. The Relation between the Skill Level and the 3 FAQ Options

In Figure 4, we can see that if three options are provided, the high skill level students have the highest level of preference for using text FAQs as compared to the other students, a low level of using video FAQs, and a high level of using both formats of FAQs. The low skill level students have the highest level of preference for using both formats and the lowest level of using text FAQs. The middle level students have a relatively high tendency of using both formats of the FAQs and a relatively low tendency of using the text FAQ.
In class, Dr. Lee also observed some difference in learning style between the low technology level students and the high technology level students. The low level students usually are more dependent on the tutorials if they cannot get the teacher’s on-screen illustration, and they would take more time to view and follow the online FAQ tutorials step by step to learn a new project; the high level students preferred to spend less time viewing the online FAQ tutorials, and only if needed, and they were more independent in producing their own projects. The low level students preferred viewing more video FAQs online, while the higher level students preferred viewing more text FAQs online. Two low level students gave the following explanations:
• I used more of the video FAQ than text FAQ to learn, [because] It takes me less time to understand what we’ll do and how by viewing the projecting processes on a video. I just follow it. I can view it again and again if I still do not quite understanding.
• I didn’t know PowerPoint, [and] Microsoft Word. I haven’t enough experience. But now I know how to do things like web page, and Excel grade book, too. I used resources online a lot. I missed classes, so I had to go in looking at the available resources. Such as, I used video FAQ for [catching up with] webpage design. That works real good for me.
And a higher skill level student gave us another story:
• (The) text FAQ is enough for me. I already have a very good background with computers. I can understand what it says in the text description. I browse text faster than viewing a video.
From here we can see that low level students are more dependent on the visually presented tutorials that guide them on a computer screen throughout the process of
creating a new project, while the higher level students are more independent of the visually presented tutorials to learn a new project. It is easier for them to decode the hands-on learning guidance out of a text tutorial with their rich experience and knowledge. But the instructor states that during an open-note quiz or test, it is different. The high level students have no difficulty creating a project for a quiz, while the low level students preferred downloading a text FAQ to bring into a quiz where referring to notes was allowed; when faced with a time limit, viewing a video FAQ takes longer than viewing a text FAQ. More notably, by that time the low level students have become more familiar with creating the tested projects than when they started to learn them as new projects. Their tutorial preference has changed at this point. This is also supported by one student's comment:
I used more of the video FAQs when I started this class. But I find that the text FAQs equally helped me later on. Because I get more familiar with the terms used in the text FAQ. I can understand them now. So I am using both.
A few students mentioned in the interview that they did not use FAQs to learn their projects because they thought that their technology experience was good enough and the teacher’s in-class instruction presented enough for them to go by. One student said,
I have never used FAQs. I think I am real well on the basics of technology. I have no difficulty to finish project assignments in class. On the other hand, I don’t have to stay on the top of technology. I don’t have to catch up with everything. Finishing my course requirement is what I currently want.
So we can see that this student avoided using FAQs partly because he/she thought that he/she already had a good background, and partly because he/she only expected to meet the basic requirements for the assigned project. Applying higher level skills to a creative project design requires more self-efficacy and creativity. But those were not covered in the FAQ tutorials, which start only from the basic required skills.
To test how dependent or independent the students are of the video FAQ, one day Dr. Lee asked the whole class to learn a new project on their own with the aid of the FAQs only, without the teacher’s on-screen illustration. It turned out that the majority of the students, more or less, resorted to the video FAQs a lot, regardless of their skill levels. So we can conclude that the first visual experience is important for every beginning learner. Those who usually prefer using text FAQs instead of video FAQs to learn a new project have actually gained related visual experiences (including physical experiences) in advance in that specific area. If they have not, they also prefer to gain the first visual experience when they start learning a new project.
The next finding is that Preferences 5 and 6 have interesting results. That is, regarding the online model formats in Preference 5, the high technology skill level students have an extremely high level of preference for using online models and an extremely low level of using both online and printout formats, while the low technology skill level students have a relatively lower level of using online model projects and a relatively higher level of using both online and printout formats. The middle level students have a tendency of using online models and both online and printout models equally. Nobody prefers using printout models only. Regarding the format of online FAQs in Preference 6, the high technology skill level students have a higher tendency of preference for using more text FAQs and fewer video FAQs, while the middle and low level students have a higher tendency of preference for using more video FAQs and fewer text FAQs.

3.7 The Preferred Sequence of Viewing Models vs. Learning New Tools

Learning activity is a sequential process (Bennett, 1999). In the Online Top-Down Modeling Model setting, showing students the model projects is a necessary procedure before teaching a new project. It is beneficial to know which sequence gives students the best learning efficiency. To meet their sequential preference, we asked students whether the teacher should show them the online model first or show them new program tools/features first, like the tools/features in Word, PowerPoint, Excel, Access, Paint, webpage design, etc. The students’ responses are shown below:

Table 9: Student Preference for Sequence of Model and Tool Exposure

Question: When the teacher starts showing me a model project before teaching us about that project, I would like the teacher to show me in the sequence of
Answer Choice    N    Valid %
1. First show me the model project -- then teach me how to use the program tools/features -- then let us work on our project.    50    76.9
2. First teach me how to use the program tools/features -- then show me the model project -- then let us work on our project.    13    20.0
3. First let us work on our project -- then teach me how to use the program tools/features -- then show me the model project.    1    1.5
4. First let us work on our project -- then show me the model project -- then teach me how to use the program tools.    1    1.5

In the survey feedback, 76.9% of the students prefer viewing project models first and then learning the new program tools and working on the project, while 20% of the students prefer viewing the program tools/features first and then watching the models and beginning work on the new project. This verifies that the learning sequence of the Online Top-Down Modeling Model, which provides the students with model projects first and then engages the learners in learning new features and tools when working on the new projects, fits the sequence preference of the majority of the learners. Students voiced the perspectives behind their preferences:
• I want to see the models first because I want to see what I am to be doing. I am a visual learner, so it’s important for me to see it first,
and I get a better understanding. It would be more difficult if you show tools first for me to learn…. It just easier for me to understand when I actually see it being used as opposed to just being told to do it.
• It’s pretty cool to see in result that you can do that…. I just like to say I like to see the models first. It was difficult when you’re telling me, and teaching it. When I see it, you know, it feels more exciting and more interested to start doing.
• I think it’s good to see the models first because you can see exactly [what] the capabilities are. And then when you show the tools, you have an idea in your head already and what to do with those tools.
• I like to see the models first, because it gives you ideas of what to come. I just prefer to visualize before I actually use it. It helps me feel motivated. Those models are excellent. They helped me very much…. If you have to explain it, and show us later, I wouldn’t catch on to the very end. I am a visual learner. It just takes me longer to catch on. And I think if you show me what I could create, it makes more fun, it makes more sense, it makes more exciting, and more memorable….
One low level student told us his preference:
• I am a beginner with computers. I am not familiar with most of the program tools. I would like to see tools first and models second. If I can see tools first, it would be better for me to relate the tools to how the models were created [with them]. But if I see models first, it will be difficult for me to relate the models to each specific tool and how to create the project with those tools.
But another low level student had a different view; he said:
• I am not good at computers at all. For me, I would like to see the models first at the start. The models you showed us from Internet are interesting and well made. It makes me interested to learn computers, and do the project we are supposed to learn. While I am learning to make the project, I will learn the tools also.
This suggests that if the teaching sequence conforms to the learner’s sequence preference, it is more motivating, interesting, and memorable. Otherwise, it can be confusing and can affect the understanding and memory of what has been shown and how to do the task. To meet all learners’ needs in the sequence of showing models, the teacher usually adopts a flexible strategy. For some of the projects, he would show models first if this involves students more effectively. For others, he might show tools/features first, especially when those programs are new to the majority of the learners, leaving options for them to view tools or models on their own after the teacher’s demonstration and taking advantage of the strength of the Online Top-Down Modeling Model.

3.8 Preferring Viewing High Quality Model Projects vs. Lower Quality Model Projects

Exposing students to a learning outcome first is an effective way to start students on a new project-based learning task [32]. There are many previous student model projects on the course website to be displayed to the students when they start to learn to make that kind of project. But which quality of model projects to show online is open to debate. Traditionally, teachers like to show the high quality project models to the students as guidelines and criteria when they start learning a new project. But as observed, some of the students expressed discomfort when they were first exposed to a high quality model project. To address this issue, we posted a question in the survey to solicit feedback from students:

Table 10: Student Preference for Viewing High Quality Model Projects vs. Lower Quality Model Projects

When the teacher displays a new model project to you before you learn to make that project, what quality project do you prefer?
Answer Choices                                            N     Valid %
1. I like the high quality, excellent models.             25    38.5
2. I like both high quality and lower quality models.     39    60.0
3. I like the lower quality models.                       1     1.5

The survey shows that the students’ preference for the quality of the online project models varies. From Table 10, we can see that 60.0% of the students like to view both high quality and lower quality project models on the class website, while 38.5% of the students like to view the model projects of high quality. Only 1.5% (one student) prefers viewing the model projects of lower quality. The correlation between the students’ skill level and the model quality preference is r = .029, p > .05. It is not significant. We graph the distribution of the student computer skill level and their preference for the quality of the project models by the percent of the cases below:

Figure 5. The Histogram of Different Computer Skill Levels related to Model Quality Preferences

From Figure 5, the students of different skill levels are spread across the first choice and the second choice, but not the third one. But relatively more low level students prefer high quality project models, more middle level and high level students prefer both high quality
and lower quality models, and only a small portion of the low level students prefer lower quality models. This graph shows that the majority of the students expect themselves to aim at high quality projects but prefer a broadened vision of the model projects at varied quality levels. There are still a small number of the low level students who prefer viewing the lower quality project models. Students like them probably need a period of transition from being exposed to lower quality models to high quality models, depending on the development of their knowledge and skills. To understand how students feel about the quality of the project models, we solicited verbal responses through interviews, which follow:
• The high quality model projects inspired me. I just got excited to see PowerPoint projects can be so beautiful and impacting. It has so many functions and can do many things. When I see them, I really want to learn those skills. I think my students will like them, too.
About the lower quality projects, student responses varied:
• It kind of helps you know what’s going on. If you look at something you don’t like, you say, “OK, I can do something better. That’s not a good thing to do.” And you feel how you could do it a different way.
• If I see only highly excellent projects, I would get scared. I am a slow learner. I would like to start seeing the lower level projects first. That would be close to my level. I might like higher quality ones in a later time.
Students who prefer both high quality and lower quality project models used their own judgment to evaluate the projects when both higher quality projects and lower quality projects were provided:
• Well, [I] like this one better than that. I don’t think this is eye-catching as the other one. You see its Christmas looking, and this is February. So I don’t think those two things go together.
The lower level students had more concerns and anxiety when they started each new project. They felt pressure when they saw the previous students’ model projects:
• I am intimidated to see those projects. They are excellent. I am afraid I am far behind them [and I] cannot pass this class. My technology is pretty low. A lot of students in this class are not at the same level to use technology. I am concerned.
• I feel I just cannot do so many things at a time. There are a lot to learn. We are overwhelmed by those new features and skills we don’t know.
In conclusion, the learning process is both a growing process and a changing process. Students start learning from different starting points. Both high quality models and lower quality models are beneficial to the students of this class. In the design of the online model resources, we need to provide some lower quality project models along with the higher quality project models to model the hands-on learning process and meet the students’ needs at various levels.

3.9 Preferring Sharing Online Learning Resources vs. Personal Use of Online Learning Resources

As Wolf said regarding the Big Six Skills [44], information literacy skills include using and sharing information to scaffold each other in brainstorming and problem solving. Sharing learning resources/information is a required component in class. Teamwork is assigned to develop students’ skills of collaboration. The instructor encourages students to pool their efforts and information for learning tasks. Students’ responses about this issue are in Table 11:

Table 11: Student Preference for Sharing Online Learning Resources vs. Personal Use of Online Learning Resources

Question: My general preference for sharing online learning resources and information is
Answer Choices    N    Valid %
1. I like to share my resources/information among peers and also like to share other people’s.    62    95.4
2. I don’t like to share my resources/information among my peers and don’t like to share other people’s. I just like to work things out on my own.    2    3.1
3. I only want to get resources/information from other people but don’t like to let other people share mine.    1    1.5
4. I only give other people what I get but don’t like to get resources/information from other people.    0    0

Over 95% of the students prefer the shared use of the resources/information. In class, the students made an effort to discuss, brainstorm, and share information, resources, and experiences while trying to increase the quality of their projects. When they were presenting their final projects for class evaluation, students often asked the presenters how to find resources such as content subjects, sound clips, photos, videos and even narrations. The presenters were not only presenting their final projects, but also introducing their ways of finding those online resources and how to screen resources for their projects. The students offered feedback about their shared use of resources and information:
• One strength of this class is the idea of “class as a family" and team work. I have learned new things from collaboration with others in the class. I have never used MyLabSchool.com before. Jane told our group about that. We retrieved something from that website, and it works great.
• I learned lots on the PowerPoint projects from Shiena and Jennifer. We shared the PowerPoint FAQs and experimented with the animation motion path. I also learned some of the things that the instructor says we might easily miss, such as hiding the sound icon in PowerPoint slide show and sequence the animated objects in animation advanced time line. Without my classmates
reminding me, I might miss some of those important areas.
• The discovery learning was by trial and error. I learned quickly what didn't work and sometimes took the "long way" to get back on track! I really learned a lot from my teammates and others in the class. The discussions and collaborations allowed deeper understanding for subject matter.
• There was the exact amount of discovery, cooperative and self-actuated learning. This was helpful in the experience needed to be successful in class. It tastes delicious! 1. Discovery learning--it's exciting to learn by doing; 2. Cooperative learning -- I learned from my partner in doing our projects; 3. Brainstorming--we brainstormed in ideas before we started our projects; 4. Constructivist learning -- we learned by searching, doing, and creating.
The majority of the students understood the importance of sharing the online resources/information and working together to enhance their learning outcome and project quality. Using people’s resources and wisdom to achieve a common goal is a necessary competency. This class provides the opportunity for students to experience and share information and prepares them for teaching in the Information Age. The last survey question, in Table 12, checks for students’ attitude change toward using the online learning resources.

Table 12: Students’ Response about their Attitude Change

Question: After this FED 529 class, your attitude toward using online learning resources is
Answer Choices                                         N     Valid %
1. My attitude has changed from bad to good.           14    21.5
2. My attitude has changed from good to better.        43    66.2
3. My attitude remains the same.                       7     10.8
4. My attitude has changed to worse.                   1     1.5

This survey response shows that 87.7% of the students’ attitudes (including Choice 1 and Choice 2) have changed either from bad to good or from good to better; 10.8% remain the same; while only 1.5% (one person) reported that his/her attitude had changed to worse. The 10.8% of the people who chose Choice 3 are probably those who already had a good or bad attitude that did not change this semester. Since this survey is anonymous, we did not have a chance to ask those who chose Choice 3 and Choice 4 to identify the exact problems for analysis. But overall, the students’ attitude toward using online learning resources is in a “gaining” process instead of a “failing” process. Meanwhile, students voiced their high preference for the online learning resources because of their “live” multimedia features as opposed to the “dead and silent” printout resources. Table 13 is a comparison of the tallied benefit points of multimedia features of online learning resources vs. printout learning resources in this class.
that the online learning resources have a much higher advantage level over the printout learning resources. This partly accounts for the reason why more students prefer using online learning resources. The positive side is that more and more students are accustomed to the use of computers, their technology barrier is diminishing while they build up more experience in retrieving resources online, and their multimedia awareness is increasing through the learning process.

4. Discussion and Conclusion

This study has yielded some impressive findings regarding the learners’ preferences and perspectives in using online learning resources in the Online Top-Down Modeling Model environment. For example, more of the students prefer the online model projects to contain both high quality and lower quality examples to meet the students’ varied needs. Most teachers might think that the model projects are the expectation and criteria for the learners’ outcome, so they usually pick the highest quality projects as project models for students. They might be gratified with their well-developed course website but overlook the fact that the students’ backgrounds are not the same, and their starting points are different. Their needs and self-expectation toward learning a new project are different. According to Bloom’s Taxonomy, the learning process is a gradual development, from lower level to higher level [5]. In developing the online learning resources, we still need to follow those guidelines, making the online learning resources an attractive component to involve learners instead of scaring the learners and increasing their learning anxiety at the starting point [11]. The next interesting finding is that the majority of the students like to have a syllabus as both an Internet and printout resource. Since we provided the course resources online, whenever students ask the teacher for course materials, the answer is often “They’re on the website.” The teacher might think that since the learning resources are already online, available anytime and anywhere, the learners’ needs are met. But the students’ responses suggest that the students’ needs might not be gratified by only one format of resources, possibly because some of them are not familiar with the Internet, or don’t have convenient access to the website. Access to a hardcopy of the syllabus is still necessary, even though students are placed in an online resource-ready environment. It reflects the students’ preferences during their switch between the two formats of learning resources, online learning resources vs. printout resources. Even though learning occurs in the information-rich, Internet-assisted learning environment, traditional resources are still a necessary component for learners. The findings of this study could inform teachers who try to design online learning resources to enhance the effectiveness of instruction in traditional and non-traditional classes and who expect successful outcomes among learners at various levels.
The brief outline of this article was shown as a research brief in the Quarterly Review of Distance Education 10(3).
The total effective feature point ratio between the online
learning resources vs. printout resources is 198:56, and the
average point ratio is 14:4. From this comparison, it is clear

References

[1] Anonymous, “Benefits of a Mixed-Method Approach to Data Collection,” http://www.k12coordinator.org/onlinece/onlineevents/presentingdata/id91.htm
[2] C. Armatas, D. Holt, M. Rice, “Impacts of an Online-Supported, Resource-Based Learning Environment: Does One Size Fit All?” Distance Education, 24(2), 141-158, 2003.
[3] C.I. Bennett, Comprehensive Multicultural Education: Theory and Practice, (5th Ed.), Pearson Education, Inc., New York, 1999.
[4] B.W. Brown, C.E. Liedholm, “Student Preference in Using Online Learning Resources,” Social Science Computer Review, 22(4), 479-492, 2004.
[5] B.S. Bloom, Taxonomy of Educational Objectives, Handbook I: The Cognitive Domain, David McKay Inc., New York, 1956.
[6] D.M. Buss, H. Greiling, “Adaptive Individual Differences,” Journal of Personality, 67, 209-243, 1999.
[7] D. Byrne, “A Study of Individual Learning Styles and Educational Multimedia Preferences: an Experiment Using Self-Directed Online Learning Resources,” http://www.computing.dcu.ie/~mfarren/denice.PDF
[8] B. Cleaver, “Thinking about Information: Skills for Lifelong Learning,” School Library Media Quarterly, 16(1), 29-31, 1987.
[9] R. Collins, “Students Speak, Teachers Hear: Evaluating The Use of ICT in Curriculum Delivery,” http://www.det.wa.edu.au/connectedlearning/presenters_concurrent_presenters.html
[10] J.W. Creswell, Research Design: Qualitative, Quantitative, and Mixed Methods Approaches (2nd Ed.), Sage Publications, Thousand Oaks, CA, 2003.
[11] D.L. Coutu, “Edgar Schein: The Anxiety of Learning - The Darker Side of Organizational Learning,” http://hbswk.hbs.edu/archive/2888.html
[12] P. Cull, “Resource-Based Learning: a Strategy for Rejuvenating Canadian History at the Intermediate School Level,” ERIC No. ED 343 829, 1991.
[13] U.D. Ehlers, “Quality in E-Learning from a Learner’s Perspective,” http://www.eurodl.org/materials/contrib/2004/Online_Master_COPs.html
[14] R. M. Felder, “Matters of Style,” http://www.ncsu.edu/felder-public/Papers/LS-Prism.htm
[15] R. M. Felder, R. Brent, “Understanding Student Differences,” Journal of Engineering Education, 94(1), 57-72, 2005.
[16] J.R. Hill, M.J. Hannafin, “The Resurgence of Resource-Based Learning,” Educational Technology, Research and Development, 49(3), 37-52, 2001.
[17] A. Irving, Study and Information Skills across the Curriculum, Heinemann, Portsmouth, NH, 1985.
[18] L. Jensen, “Interaction in Distance Education,” http://seamonkey.ed.asu.edu/~mcisaac/disted/week2/7focuslj.html
[19] R. B. Johnson, A.J. Onwuegbuzie, “Mixed Methods Research: a Research Paradigm Whose Time Has Come,” Educational Researcher, 33(7), 14-26, 2004.
[20] F. Ke, A. Carr-Chellman, “Solitary Learner in Online Collaborative Learning: a Disappointing Experience?” Quarterly Review of Distance Education, 7(3), 249-265, 2006.
[21] D.F. Kohl, L.A. Wilson, “Effectiveness of Course-Integrated Bibliographic Instruction in Improving Coursework,” Reference Quarterly, 26(2), 206-11, 1986.
[22] D.A. Kolb, Experiential Learning: Experience as the Source of Learning and Development, Prentice-Hall, Englewood Cliffs, NJ, 1984.
[23] C.C. Kuhlthau, Seeking Meaning: A Process Approach to Library and Information Services, Ablex Publishing, Norwood, NJ, 1993.
[24] R.B. Kvavik, “Convenience, Communications, and Control: How Students Use Technology. Educating the Net Generation,” http://www.educause.edu/books/educatingthenetgen/5989
[25] R. Laddaga, A. Levine, P. Suppes, “Studies of Student Preference for Computer-Assisted Instruction with Audio,” http://suppes-corpus.stanford.edu/display_article.html?articleid=225
[26] S. Li, D. Liu, “The Online Top-Down Modeling Model,” Quarterly Review of Distance Education, 6(4), 343-359, 2005.
[27] R.E. Mayer, “Multimedia Learning: Are We Asking the Right Questions?” Educational Psychologist, 32, 1-19, 1997.
[28] R.E. Mayer, R. Moreno, “A Cognitive Theory of Multimedia Learning: Implication for Design Principles,” http://www.unm.edu/~moreno/PDFS/chi.pdf
[29] M.G. Moore, Three Types of Interaction. In M. R. Moore & G. C. Clark (Eds.), Readings in Principles of Distance Education, Pennsylvania State University, University Park, PA, 1989.
[30] P.A. Moore, “Information Problem-Solving: a Wider View of Library Skills,” Journal of Contemporary Psychology, 20(1), 1-31, 1995.
[31] G.R. Morrison, S.M. Ross, J.E. Kemp, Designing Effective Instruction, (4th Ed.), John Wiley & Sons, Inc., Danvers, MA, 2004b.
[32] G. Morrison, F. Clark, D.L. Lowther, Integrating Computer Technology into the Classroom (3rd Ed.), Prentice Hall, Upper Saddle River, NJ, 2004a.
[33] J. Neill, “Personality & Individual Differences,” http://www.wilderdom.com/personality/index.html
[34] J.V. Peluchette, K.A. Rust, “Technology Use in the Classroom: Preferences of Management Faculty Members,” Journal of Education for Business, 80(4), 200-205, 2005.
[35] M.J. Reid, “The Learning Style Preferences of ESL Students,” TESOL Quarterly, 21(1), 87-110, 1987.
[36] L.B. Resnick, Education and Learning to Think, National Academy Press, Washington D.C., 1987.
[37] S.M. Ross, L. Smith, M. Alberg, D.L. Lowther, Using Classroom Observations as a Research and Formative Evaluation Tool in Educational Reform: the School Observation Measure. In S. Hilberg & H. Waxman (Eds.), New Directions for Observational Research in Culturally and Linguistically Diverse Classrooms (p. 144-173). Center for Research on Education, Diversity & Excellence, Santa Cruz, CA, 2004.
[38] I. Schon, K.D. Hopkins, J. Everett, B.R. Hopkins, “A
Special Motivational Intervention Program and Junior
High School Students’ Library Use and Attitudes”
Journal of Experimental Education, 53, 97-101, 1985.
[39] S.E. Smaldino, D.L. Lowther, J.D. Russell,
Instructional Technology and Media for Learning (9th
ed.). Prentice Hall, Upper Saddle River, NJ, 2008.
[40] B.K. Stripling, “Learning-Centered Libraries:
Implications from Research,” School Library Media
Quarterly, (23)3, 163-170, 1995.
[41] R. Todd, C. McNicholas, “Integrated Skills Instruction:
Does It Make a Difference?” School Library Media
Quarterly, 23(2), 133-138, 1994/1995.
[42] Wikipedia Foundation. “ Learning styles: Models and
Theories” http://en.wikipedia.org/wiki/Learning_styles
[43] S.S. Wilbert, A Study of Competency-Based Instruction
to Determine Its Viability as a Technique for Teaching
Basic Library Skills to A Selected Sample of Seventh
Grade Students, Ph.D. Dissertation, Wayne State
University, Detroit, 1976.
[44] S. Wolf, “The Big Six Information Skills as a Metacognitive Scaffold: A Case Study,” School Library Media Research, 2003. http://www.ala.org/ala/aasl/aaslpubsandjournals/slmrb/slmrcontents/volume62003/bigsixinformation.cfm

Author Profile

Sha Li received his doctoral degree in educational technology from Oklahoma State University in 2001. His research interests are in E-learning in the networked environment, distance education, multimedia production, and instructional design with technology. He is also an instructional design facilitator for the local public school systems.

An Adoptive Algorithm for Mining Time-Interval Sequential Patterns
Hao-En Chueh1 and Yo-Hsien Lin2

1 Department of Information Management, Yuanpei University,
No.306, Yuanpei Street, Hsinchu 30015, Taiwan, R.O.C.
hechueh@mail.ypu.edu.tw

2 Department of Information Management, Yuanpei University,
No.306, Yuanpei Street, Hsinchu 30015, Taiwan, R.O.C.
yohsien@mail.ypu.edu.tw
Abstract: An adoptive algorithm for mining time-interval sequential patterns is presented in this paper. A time-interval sequential pattern is a sequential pattern with time-intervals between successive itemsets. Most proposed algorithms use some predefined non-overlapping time partitions to find the time-intervals between successive itemsets, but a predefined set of non-overlapping time partitions cannot be suitable for every pair of successive itemsets. Therefore, in this paper, clustering analysis is used first to generate suitable time-intervals for frequently occurring pairs of successive itemsets. Next, the generated time-intervals are used to extend the typical sequential pattern mining algorithms to discover the time-interval sequential patterns. Finally, an operator to obtain the time-interval of a subsequence of a time-interval sequential pattern is also presented.

Keywords: Adoptive algorithm, Time-Interval, Sequential Pattern, Clustering Analysis

1. Introduction
Data mining is usually defined as the procedure of discovering hidden, useful, previously unknown information from large databases. The common data mining techniques include classification, clustering analysis, association rules mining, sequential patterns mining and so on. Sequential patterns mining, introduced by Agrawal and Srikant (1995), is the task of finding frequently occurring patterns related to time or other sequences from a given sequence database [1]. It is widely used in the field of retail business to assist in making various marketing decisions [3, 5, 7]. An example of a sequential pattern is “A customer who bought a digital camera will buy a battery and a memory card later”.
Up to now, many algorithms have been proposed [1, 4, 6, 9] for mining sequential patterns. However, most of these algorithms only focus on the order of the itemsets and ignore the time-intervals between itemsets. In the business field, a sequential pattern which includes the time-intervals between successive itemsets is more valuable than a sequential pattern without any time information. An example of a sequential pattern with time-intervals between successive itemsets is “A customer who bought a digital camera will return to buy a battery and a memory card within one week”. Clearly, the time-intervals between itemsets can offer the retail business more useful information to sell the appropriate products to their customers at the right time. Therefore, some recent studies have begun to propose algorithms for discovering sequential patterns with time-intervals between successive itemsets; this kind of pattern is called a time-interval sequential pattern [2].
To discover the time-interval sequential patterns, many studies adopt some predefined non-overlapping time partitions, and assume that the time-intervals between successive itemsets of the frequent sequential patterns can fit into one of the predefined time partitions.
However, a predefined set of non-overlapping time partitions cannot be suitable for every pair of successive itemsets. Therefore, generating the suitable time partitions for every pair of successive itemsets directly from the real sequence datasets is more reasonable. Accordingly, in this paper, we present an adoptive algorithm to discover the time-interval sequential patterns without using predefined time partitions. This algorithm uses clustering analysis to automatically obtain suitable time-intervals between frequently occurring pairs of successive itemsets, and then uses these time-intervals to extend typical sequential pattern mining algorithms to discover the time-interval sequential patterns.
The rest of this paper is organized as follows. Some studies related to time-interval sequential patterns are reviewed in section 2. The proposed time-interval sequential patterns mining algorithm is presented in section 3. An example is displayed in section 4. The conclusion is given in section 5.

2. Time-Interval Sequential Patterns
Sequential patterns mining is defined as the task of discovering frequently occurring ordered patterns from the given sequence database. A sequence is an ordered list of itemsets. Let $I = \{i_1, i_2, \ldots, i_m\}$ be a set of items, and let $S = \langle s_1, s_2, \ldots, s_k \rangle$ be a sequence, where each $s_i \subseteq I$ is called an itemset. The length of a sequence is the number of itemsets in the sequence, and a sequence containing $k$ itemsets is called a $k$-sequence. The support of a sequence $S$, denoted by $supp(S)$, is the percentage of records in the sequence database that contain $S$. If

$supp(S)$ is greater than or equal to a predefined threshold, called the minimal support, then sequence $S$ is regarded as a frequent sequence and called a sequential pattern.
Many algorithms have been proposed to discover sequential patterns [1, 4, 6, 9], and most of them only focus on the frequently occurring order of the itemsets and ignore the time-intervals between itemsets. The time-intervals between successive itemsets, in fact, can offer useful information for businesses to sell the appropriate products to their customers at the right time. Due to the value of the time-intervals between successive itemsets, many algorithms for mining various sequential patterns with time-intervals between successive itemsets have been proposed [2, 8, 10, 11].
Srikant et al. [10] utilize three predefined restrictions, the maximum interval (max-interval), the minimum interval (min-interval), and the time window size (window-size), to find sequential patterns related to time-intervals. The discovered sequential pattern is like $((A, B), (C, D))$, where $(A, B)$ and $(C, D)$ are two subsequences of $((A, B), (C, D))$. The max-interval and min-interval are respectively used to indicate the maximal time-interval and the minimal time-interval within a subsequence. The window-size is used to indicate the time-interval among subsequences. Assume that the max-interval is set to 10 hours, the min-interval is set to 3 hours, and the window-size is set to 24 hours, then the time-interval between $A$ and $B$ lies in $[2, 10]$, the time-interval between $C$ and $D$ also lies in $[2, 10]$, and the time-interval between $(A, B)$ and $(C, D)$ lies in $[0, 24]$.
Mannila et al. [8] use a predefined window width (win) to find frequent episodes in sequences of events, and the discovered episode is like $(A, B, C)$. Assume that win is set to 3 days, then the episode $(A, B, C)$ means that, in 3 days, $A$ occurs first, $B$ follows, and $C$ happens finally.
Wu et al. [11] also utilize a window ($d$) to find sequential patterns like $(A, B, C)$, such that, in the sequential pattern $(A, B, C)$, the time-interval between adjacent events is within $d$. Assume $d$ is set to 5 hours, then the discovered pattern $(A, B, C)$ means that $A$ occurs first, $B$ follows, and $C$ happens finally; the time-interval between $A$ and $B$, and the time-interval between $B$ and $C$, are both within 5 hours.
Chen et al. [2] use a predefined set of non-overlapping time partitions to discover potential time-interval sequential patterns, and the discovered pattern is like $(A, I_0, B, I_2, C)$, where $I_0$ and $I_2$ belong to the non-overlapping set of time partitions. Assume that $I_0$ denotes the time-interval $t$ satisfying $0 \le t \le 1$ day and $I_2$ denotes the time-interval $t$ satisfying $3 < t \le 7$ days; then the pattern $(A, I_0, B, I_2, C)$ means that $A$, $B$ and $C$ happen in this order, the time-interval between $A$ and $B$ is within 1 day, and the time-interval between $B$ and $C$ lies between 3 days and 7 days.
These approaches can discover the sequential patterns with time-intervals between successive itemsets by using one or more predefined time partitions, but sequential patterns whose time-intervals between successive itemsets lie outside the predefined time ranges cannot be found. To solve this problem, an adoptive algorithm for mining time-interval sequential patterns without using any predefined time partitions is presented in this work. The main concept of the proposed algorithm is to generate the suitable time-intervals directly from the real sequence dataset. The algorithm first adopts clustering analysis to automatically generate the suitable time-intervals for frequently occurring pairs of successive itemsets, and then uses these time-intervals to extend typical algorithms to discover sequential patterns with time-intervals between successive itemsets. Details of the proposed algorithm are introduced in the next section.

3. Adoptive Time-Interval Sequential Patterns Mining Algorithm
The proposed algorithm for mining time-interval sequential patterns is introduced as follows. First, some notations are defined in advance.
$I = \{i_1, i_2, \ldots, i_m\}$: The set of items.
$S_i = \langle s_1, s_2, \ldots, s_n \rangle$: A sequence, where each $s_k \subseteq I$.
$D = \{S_1, S_2, \ldots, S_k\}$: The sequences dataset.
$supp(S_i)$: The support of the sequence $S_i$.
min-supp: The minimal support threshold.
$CS_k$: The candidate set of frequent k-sequences.
$FS_k$: The set of frequent k-sequences.
$CTIS_k$: The candidate set of frequent time-interval k-sequences.
$FTIS_k$: The set of frequent time-interval k-sequences.

3.1 The proposed algorithm
Step 1: Produce $FS_1$, the set of frequent 1-sequences. Each item $s_i \in I$ is taken as a candidate frequent 1-sequence. A candidate frequent 1-sequence whose support is greater than or equal to min-supp is a frequent 1-sequence, and $FS_1$ denotes the set of all frequent 1-sequences.
Step 2: Produce $CS_2$, the candidate set of frequent 2-sequences. From any two frequent 1-sequences of $FS_1$, say $s_1$ and $s_2$, where $s_1, s_2 \in FS_1$ and $s_1 \ne s_2$, generate 2 candidate frequent 2-sequences belonging to $CS_2$, say $\langle s_1, s_2 \rangle$ and $\langle s_2, s_1 \rangle$.

Step 3: Produce $FS_2$, the set of frequent 2-sequences. A candidate frequent 2-sequence whose support is greater than or equal to min-supp is a frequent 2-sequence, and the set of all frequent 2-sequences is $FS_2$.
Step 4: Find the frequent time-intervals for each 2-sequence of $FS_2$. For any frequent 2-sequence of $FS_2$, say $\langle s_p, s_q \rangle$, all the time-intervals between $s_p$ and $s_q$ that appear in $D$ are listed in increasing order; then the following clustering analysis based steps, Step 4(a), Step 4(b) and Step 4(c), are used to obtain the frequent time-intervals.
Step 4(a): Let $T(1, z) = [t_1, t_2, \ldots, t_z]$ be the increasingly ordered list of the time-intervals of $\langle s_p, s_q \rangle$. Let $T\langle s_p, s_q \rangle = \{T(1, z)\}$ be the set of time-intervals of $\langle s_p, s_q \rangle$. The first step is to find the maximal difference between two adjacent time-intervals of $T\langle s_p, s_q \rangle$, and then divide $T\langle s_p, s_q \rangle$ into 2 subsets according to the maximal difference. Assume that the difference between $t_i$ and $t_{i+1}$ is maximal, then $T(1, z)$ is divided into $T(1, i)$ and $T(i+1, z)$, where $T(1, i) = [t_1, \ldots, t_i]$ and $T(i+1, z) = [t_{i+1}, \ldots, t_z]$.
Step 4(b): Calculate the support of $\langle s_p, s_q \rangle$ restricted to each time-interval subset. If the support of $\langle s_p, s_q \rangle$ with time-intervals in $T(1, i)$ is greater than or equal to min-supp, then $T(1, i)$ is a frequent time-interval of $\langle s_p, s_q \rangle$ and is reserved; otherwise $T(1, i)$ is deleted. Similarly, if the support of $\langle s_p, s_q \rangle$ with time-intervals in $T(i+1, z)$ is greater than or equal to min-supp, then $T(i+1, z)$ is also a frequent time-interval and is reserved; otherwise $T(i+1, z)$ is deleted. The reserved subsets of time-intervals then replace the original set of time-intervals $T\langle s_p, s_q \rangle$. If no subset is reserved, then the original set of time-intervals is called non-dividable. If all differences between two adjacent time-intervals in the original set of time-intervals are equal, then the original set of time-intervals is called non-dividable as well.
Step 4(c): Repeat Step 4(a) and Step 4(b) until all subsets of time-intervals in $T\langle s_p, s_q \rangle$ are non-dividable.
Step 5: Produce $FTIS_2$, the set of frequent time-interval 2-sequences. Each 2-sequence of $FS_2$ is extended by all its frequent time-intervals to generate $FTIS_2$. If $T\langle s_p, s_q \rangle = \{T^1, T^2, \ldots, T^R\}$ is the set of frequent time-intervals of $\langle s_p, s_q \rangle$, then $\langle s_p, T^i, s_q \rangle$, $i = 1, \ldots, R$, is a frequent time-interval 2-sequence.
Step 6: Produce $CTIS_k$, $k \ge 3$, the candidate set of frequent time-interval k-sequences. For any two frequent time-interval (k-1)-sequences $S_1$ and $S_2$, where $S_1 = \langle s_{1,1}, T_{1,1}, s_{1,2}, \ldots, s_{1,k-2}, T_{1,k-2}, s_{1,k-1} \rangle$ and $S_2 = \langle s_{2,1}, T_{2,1}, s_{2,2}, \ldots, s_{2,k-2}, T_{2,k-2}, s_{2,k-1} \rangle$ belong to $FTIS_{k-1}$, if $s_{1,2} = s_{2,1}, s_{1,3} = s_{2,2}, \ldots, s_{1,k-1} = s_{2,k-2}$ and $T_{1,2} = T_{2,1}, T_{1,3} = T_{2,2}, \ldots, T_{1,k-2} = T_{2,k-3}$, then we can generate a candidate time-interval k-sequence $S_{12} = \langle s_{1,1}, T_{1,1}, s_{1,2}, \ldots, s_{1,k-2}, T_{1,k-2}, s_{1,k-1}, T_{2,k-2}, s_{2,k-1} \rangle$.
Step 7: Produce $FTIS_k$, $k \ge 3$, the set of frequent time-interval k-sequences. A candidate time-interval k-sequence whose support is greater than or equal to min-supp is a frequent time-interval k-sequence, and the set of frequent time-interval k-sequences is $FTIS_k$.
Step 8: Repeat Step 6 and Step 7 until no next $CTIS_k$ can be generated.

3.2 Time-intervals of a subsequence
In this subsection, an operator to obtain the time-interval of a subsequence of a frequent time-interval sequential pattern is introduced. Let $S = \langle s_1, T_{1,2}, s_2, \ldots, s_{k-1}, T_{k-1,k}, s_k \rangle$ be a frequent time-interval k-sequence, and let $S' = \langle s_i, T_{i,i+1}, s_{i+1}, \ldots, s_{i+j-1}, T_{i+j-1,i+j}, s_{i+j} \rangle$ be a subsequence of $S$. Assume that $T_{i,i+1} = [a_i, b_i]$, $T_{i+1,i+2} = [a_{i+1}, b_{i+1}]$, \ldots, $T_{i+j-1,i+j} = [a_{i+j-1}, b_{i+j-1}]$; then the time-interval between $s_i$ and $s_{i+j}$ is equal to $[a_i + a_{i+1} + \cdots + a_{i+j-1},\ b_i + b_{i+1} + \cdots + b_{i+j-1}]$.
By using the above steps, a simple example is displayed in the next section.

4. Example
In this section, we use the example sequence database shown in Table 1 to discover the time-interval sequential patterns. In Table 1, Id denotes the record number of a sequence, and each sequence is represented as $\langle (s_1, t_1), (s_2, t_2), \ldots, (s_n, t_n) \rangle$, where $s_i$ denotes an itemset and $t_i$ denotes the time stamp at which $s_i$ occurs; here, min-supp is set to 0.3.

Table 1: A sample sequence database
Id    Sequence
01    $(s_5, 8), (s_4, 15), (s_6, 20)$
02    $(s_1, 2), (s_3, 7), (s_2, 11), (s_6, 18)$
03    $(s_2, 3), (s_1, 4), (s_3, 7), (s_6, 16), (s_7, 19)$
04    $(s_1, 2), (s_2, 8), (s_6, 10), (s_7, 15)$
05    $(s_5, 4), (s_6, 16), (s_1, 20), (s_3, 24)$
06    $(s_7, 7), (s_1, 13), (s_5, 18), (s_2, 25), (s_6, 28)$
07    $(s_5, 4), (s_1, 8), (s_3, 12), (s_6, 16), (s_7, 20)$
08    $(s_1, 3), (s_5, 6), (s_2, 9), (s_4, 18), (s_6, 21)$
09    $(s_2, 5), (s_1, 10), (s_3, 15), (s_6, 20), (s_7, 25)$
10    $(s_6, 2), (s_7, 8), (s_5, 12), (s_2, 17)$

First, we need to calculate the supports of all itemsets to produce $FS_1$. Supports of all itemsets are shown in Table 2.
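
As a minimal sketch of how Steps 1 to 4 and the operator of subsection 3.2 might be applied to the sample database of Table 1, the Python fragment below counts supports, collects the time-intervals of each frequent pair, and splits them at the largest gap. It is an illustrative reading of the steps, not the authors' implementation. The names `DB`, `frequent`, `intervals`, `split` and `span` are hypothetical, itemset $s_k$ is encoded as the integer k, gaps are measured between distinct interval values while support counts records, and taking the first largest gap on a tie is an assumption, since the text does not specify a tie-break.

```python
from collections import Counter
from itertools import permutations

# Table 1, transcribed as (item, timestamp) pairs; the integer k stands for s_k.
DB = [
    [(5, 8), (4, 15), (6, 20)],
    [(1, 2), (3, 7), (2, 11), (6, 18)],
    [(2, 3), (1, 4), (3, 7), (6, 16), (7, 19)],
    [(1, 2), (2, 8), (6, 10), (7, 15)],
    [(5, 4), (6, 16), (1, 20), (3, 24)],
    [(7, 7), (1, 13), (5, 18), (2, 25), (6, 28)],
    [(5, 4), (1, 8), (3, 12), (6, 16), (7, 20)],
    [(1, 3), (5, 6), (2, 9), (4, 18), (6, 21)],
    [(2, 5), (1, 10), (3, 15), (6, 20), (7, 25)],
    [(6, 2), (7, 8), (5, 12), (2, 17)],
]
MIN_SUPP = 0.3
N = len(DB)


def frequent(count):
    """True if count / N reaches min-supp (small tolerance for float rounding)."""
    return count + 1e-9 >= MIN_SUPP * N


# Step 1: count the records containing each item and keep the frequent ones.
item_counts = Counter(item for seq in DB for item in {i for i, _ in seq})
FS1 = sorted(i for i, c in item_counts.items() if frequent(c))


def intervals(p, q):
    """Sorted time-intervals between p and q over records where p precedes q."""
    out = []
    for seq in DB:
        t = dict(seq)  # item -> timestamp; each item occurs once per record here
        if p in t and q in t and t[p] < t[q]:
            out.append(t[q] - t[p])
    return sorted(out)


# Steps 2-3: ordered pairs of frequent items, filtered by support.
pair_intervals = {(p, q): intervals(p, q) for p, q in permutations(FS1, 2)}
FS2 = {pq: ts for pq, ts in pair_intervals.items() if frequent(len(ts))}


def split(ts):
    """Step 4: recursively cut a sorted interval list at its largest gap."""
    values = sorted(set(ts))
    gaps = [b - a for a, b in zip(values, values[1:])]
    if len(set(gaps)) <= 1:               # no gap, or all gaps equal: non-dividable
        return [(values[0], values[-1])]
    cut = gaps.index(max(gaps)) + 1       # first largest gap (assumed tie-break)
    parts = [[t for t in ts if t <= values[cut - 1]],
             [t for t in ts if t >= values[cut]]]
    kept = [p for p in parts if frequent(len(p))]
    if not kept:                          # neither side is frequent: non-dividable
        return [(values[0], values[-1])]
    return [iv for p in kept for iv in split(p)]


def span(ivs):
    """Subsection 3.2: add up lower bounds and upper bounds of adjacent intervals."""
    return (sum(a for a, _ in ivs), sum(b for _, b in ivs))
```

Under these assumptions, `FS1` contains the six items with support of at least 0.3, `split(FS2[(2, 6)])` gives the two intervals listed for $\langle s_2, s_6 \rangle$ in Table 5, and `span([(3, 5), (4, 11), (3, 6)])` adds the bounds of three consecutive intervals as described in subsection 3.2.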

Here, we can obtain $FS_1 = \{s_1, s_2, s_3, s_5, s_6, s_7\}$.

Table 2: Supports of itemsets
Itemset    Support
$s_1$      0.8
$s_2$      0.7
$s_3$      0.5
$s_4$      0.2
$s_5$      0.6
$s_6$      1
$s_7$      0.6

Next, $CS_2$ is generated by joining $FS_1 \times FS_1$; the supports of the sequences in $CS_2$ are calculated and shown in Table 3. Therefore, we obtain $FS_2 = \{\langle s_1, s_2 \rangle, \langle s_1, s_3 \rangle, \langle s_1, s_6 \rangle, \langle s_1, s_7 \rangle, \langle s_2, s_6 \rangle, \langle s_2, s_7 \rangle, \langle s_3, s_6 \rangle, \langle s_3, s_7 \rangle, \langle s_5, s_2 \rangle, \langle s_5, s_6 \rangle, \langle s_6, s_7 \rangle\}$.
For each frequent 2-sequence of $FS_2$, all its time-intervals are recorded and listed in increasing order (Table 4).

Table 3: Supports of sequences in $CS_2$
$CS_2$                        Support    $CS_2$                        Support
$\langle s_1, s_2 \rangle$    0.4        $\langle s_5, s_1 \rangle$    0.2
$\langle s_1, s_3 \rangle$    0.5        $\langle s_5, s_2 \rangle$    0.3
$\langle s_1, s_5 \rangle$    0.2        $\langle s_5, s_3 \rangle$    0.2
$\langle s_1, s_6 \rangle$    0.7        $\langle s_5, s_6 \rangle$    0.5
$\langle s_1, s_7 \rangle$    0.4        $\langle s_5, s_7 \rangle$    0.1
$\langle s_2, s_1 \rangle$    0.2        $\langle s_6, s_1 \rangle$    0.1
$\langle s_2, s_3 \rangle$    0.2        $\langle s_6, s_2 \rangle$    0.1
$\langle s_2, s_5 \rangle$    0.0        $\langle s_6, s_3 \rangle$    0.1
$\langle s_2, s_6 \rangle$    0.6        $\langle s_6, s_5 \rangle$    0.1
$\langle s_2, s_7 \rangle$    0.3        $\langle s_6, s_7 \rangle$    0.5
$\langle s_3, s_1 \rangle$    0.0        $\langle s_7, s_1 \rangle$    0.1
$\langle s_3, s_2 \rangle$    0.1        $\langle s_7, s_2 \rangle$    0.2
$\langle s_3, s_5 \rangle$    0.0        $\langle s_7, s_3 \rangle$    0.0
$\langle s_3, s_6 \rangle$    0.4        $\langle s_7, s_5 \rangle$    0.2
$\langle s_3, s_7 \rangle$    0.3        $\langle s_7, s_6 \rangle$    0.1

Table 4: Time-intervals of the sequences in $FS_2$
$FS_2$                        Time-intervals
$\langle s_1, s_2 \rangle$    $T\langle s_1, s_2 \rangle = \{6, 9, 12\}$
$\langle s_1, s_3 \rangle$    $T\langle s_1, s_3 \rangle = \{3, 4, 5\}$
$\langle s_1, s_6 \rangle$    $T\langle s_1, s_6 \rangle = \{8, 10, 12, 16, 18\}$
$\langle s_1, s_7 \rangle$    $T\langle s_1, s_7 \rangle = \{12, 13, 15\}$
$\langle s_2, s_6 \rangle$    $T\langle s_2, s_6 \rangle = \{2, 3, 7, 12, 13, 15\}$
$\langle s_2, s_7 \rangle$    $T\langle s_2, s_7 \rangle = \{7, 16, 20\}$
$\langle s_3, s_6 \rangle$    $T\langle s_3, s_6 \rangle = \{4, 5, 9, 11\}$
$\langle s_3, s_7 \rangle$    $T\langle s_3, s_7 \rangle = \{8, 10, 12\}$
$\langle s_5, s_2 \rangle$    $T\langle s_5, s_2 \rangle = \{3, 5, 7\}$
$\langle s_5, s_6 \rangle$    $T\langle s_5, s_6 \rangle = \{10, 12, 15\}$
$\langle s_6, s_7 \rangle$    $T\langle s_6, s_7 \rangle = \{3, 4, 5, 6\}$

According to Step 4 described in section 3, the set of all suitable time-intervals for each sequence of $FS_2$ is obtained as in Table 5.
Next, each 2-sequence of $FS_2$ is extended by all its suitable time-intervals to form $FTIS_2$.
$FTIS_2 = \{\langle s_1, T^1_{1,2}, s_2 \rangle, \langle s_1, T^1_{1,3}, s_3 \rangle, \langle s_1, T^1_{1,6}, s_6 \rangle, \langle s_1, T^2_{1,6}, s_6 \rangle, \langle s_1, T^1_{1,7}, s_7 \rangle, \langle s_2, T^1_{2,6}, s_6 \rangle, \langle s_2, T^2_{2,6}, s_6 \rangle, \langle s_2, T^1_{2,7}, s_7 \rangle, \langle s_3, T^1_{3,6}, s_6 \rangle, \langle s_3, T^1_{3,7}, s_7 \rangle, \langle s_5, T^1_{5,2}, s_2 \rangle, \langle s_5, T^1_{5,6}, s_6 \rangle, \langle s_6, T^1_{6,7}, s_7 \rangle\}$.

Table 5: Suitable time-intervals of the sequences in $FS_2$
$FS_2$                        Time-intervals
$\langle s_1, s_2 \rangle$    $T\langle s_1, s_2 \rangle = \{T^1_{1,2} = [6, 12]\}$
$\langle s_1, s_3 \rangle$    $T\langle s_1, s_3 \rangle = \{T^1_{1,3} = [3, 5]\}$
$\langle s_1, s_6 \rangle$    $T\langle s_1, s_6 \rangle = \{T^1_{1,6} = [8, 12],\ T^2_{1,6} = [16, 18]\}$
$\langle s_1, s_7 \rangle$    $T\langle s_1, s_7 \rangle = \{T^1_{1,7} = [12, 15]\}$
$\langle s_2, s_6 \rangle$    $T\langle s_2, s_6 \rangle = \{T^1_{2,6} = [2, 7],\ T^2_{2,6} = [12, 15]\}$
$\langle s_2, s_7 \rangle$    $T\langle s_2, s_7 \rangle = \{T^1_{2,7} = [16, 20]\}$
$\langle s_3, s_6 \rangle$    $T\langle s_3, s_6 \rangle = \{T^1_{3,6} = [4, 11]\}$
$\langle s_3, s_7 \rangle$    $T\langle s_3, s_7 \rangle = \{T^1_{3,7} = [8, 12]\}$
$\langle s_5, s_2 \rangle$    $T\langle s_5, s_2 \rangle = \{T^1_{5,2} = [3, 7]\}$
$\langle s_5, s_6 \rangle$    $T\langle s_5, s_6 \rangle = \{T^1_{5,6} = [10, 12]\}$
$\langle s_6, s_7 \rangle$    $T\langle s_6, s_7 \rangle = \{T^1_{6,7} = [3, 6]\}$

$CTIS_3$, the candidate set of frequent time-interval 3-sequences, is generated by joining $FTIS_2 \times FTIS_2$. Supports of the sequences of $CTIS_3$ are calculated and shown in Table 6. A candidate frequent time-interval 3-sequence whose support is greater than or equal to min-supp is called a frequent time-interval 3-sequence. Therefore, we can obtain the set of all the frequent time-interval 3-sequences, $FTIS_3 = \{\langle s_1, T^1_{1,2}, s_2, T^1_{2,6}, s_6 \rangle, \langle s_1, T^1_{1,3}, s_3, T^1_{3,6}, s_6 \rangle, \langle s_1, T^1_{1,3}, s_3, T^1_{3,7}, s_7 \rangle, \langle s_1, T^1_{1,6}, s_6, T^1_{6,7}, s_7 \rangle, \langle s_3, T^1_{3,6}, s_6, T^1_{6,7}, s_7 \rangle\}$.

Table 6: Supports of sequences in $CTIS_3$
$CTIS_3$                                                  Support
$\langle s_1, T^1_{1,2}, s_2, T^1_{2,6}, s_6 \rangle$     0.3
$\langle s_1, T^1_{1,2}, s_2, T^2_{2,6}, s_6 \rangle$     0.1
$\langle s_1, T^1_{1,2}, s_2, T^1_{2,7}, s_7 \rangle$     0.1
$\langle s_1, T^1_{1,3}, s_3, T^1_{3,6}, s_6 \rangle$     0.4
$\langle s_1, T^1_{1,3}, s_3, T^1_{3,7}, s_7 \rangle$     0.3
$\langle s_1, T^1_{1,6}, s_6, T^1_{6,7}, s_7 \rangle$     0.4
$\langle s_2, T^1_{2,6}, s_6, T^1_{6,7}, s_7 \rangle$     0.1
$\langle s_2, T^2_{2,6}, s_6, T^1_{6,7}, s_7 \rangle$     0.2
$\langle s_3, T^1_{3,6}, s_6, T^1_{6,7}, s_7 \rangle$     0.3
$\langle s_5, T^1_{5,2}, s_2, T^1_{2,7}, s_7 \rangle$     0
$\langle s_5, T^1_{5,6}, s_6, T^1_{6,7}, s_7 \rangle$     0.1

The candidate set of frequent time-interval 4-sequences, $CTIS_4$, is generated by joining $FTIS_3 \times FTIS_3$. Here, only one sequence, $\langle s_1, T^1_{1,3}, s_3, T^1_{3,6}, s_6, T^1_{6,7}, s_7 \rangle$, is generated. The support of the sequence $\langle s_1, T^1_{1,3}, s_3, T^1_{3,6}, s_6, T^1_{6,7}, s_7 \rangle$ is 0.3, thus it is also a frequent time-interval 4-sequence, and we obtain $FTIS_4 = \{\langle s_1, T^1_{1,3}, s_3, T^1_{3,6}, s_6, T^1_{6,7}, s_7 \rangle\}$. Because no next $CTIS_5$ can be generated, the algorithm stops here. In addition, the time-interval of any subsequence of $\langle s_1, T^1_{1,3}, s_3, T^1_{3,6}, s_6, T^1_{6,7}, s_7 \rangle$ can be obtained by using the operator introduced in subsection 3.2.
From the above example, we can clearly see that the suitable time-intervals for every pair of successive itemsets are different and overlap; therefore, it is more reasonable to generate the suitable time-intervals directly from the real sequence data for every pair of successive itemsets when mining time-interval sequential patterns.

5. Conclusion
In this paper, we present an adoptive algorithm for mining time-interval sequential patterns. A sequential pattern with the time-intervals between successive itemsets is more valuable than a traditional sequential pattern without any time information. Most proposed algorithms reveal the time-intervals between itemsets by using some predefined non-overlapping time partitions, but this approach may not be suitable for every pair of successive itemsets. To solve this problem, the proposed algorithm uses clustering analysis to automatically generate the suitable time-intervals between frequently occurring pairs of successive itemsets, and then uses these generated time-intervals to extend typical algorithms to discover the time-interval sequential patterns without predefining any time partitions. In addition, a useful operator for computing the time-interval of a subsequence of a frequent time-interval sequential pattern is also introduced in this paper. From the result of the example, we can conclude that because the time-intervals between successive itemsets are quite different and overlap, it is more reasonable to generate the suitable time-intervals directly from the real sequence data when mining time-interval sequential patterns.

References

[1] R. Agrawal, R. Srikant, “Mining sequential patterns,”
[2] Y. L. Chen, M. C. Chiang, M. T. Ko, “Discovering time-interval sequential patterns in sequence databases,” Expert Systems with Applications, 25(3), pp. 343-354, 2003.
[3] M. S. Chen, J. Han, P. S. Yu, “Data mining: An overview from a database perspective,” IEEE Transactions on Knowledge and Data Engineering, 8(6), pp. 866-883, 1996.
[4] M. S. Chen, J. S. Park, P. S. Yu, “Efficient data mining for path traversal patterns,” IEEE Transactions on Knowledge and Data Engineering, 10(2), pp. 209-221, 1998.
[5] M. H. Dunham, Data mining, Introductory and Advanced Topics, Pearson Education Inc., 2003.
[6] J. Han, G. Dong, Y. Yin, “Efficient mining of partial periodic patterns in time series database,” In Proceedings of the 1999 International Conference on Data Engineering, pp. 106-115, 1999.
[7] J. Han, M. Kamber, Data mining: Concepts and Techniques, Academic Press, 2001.
[8] H. Mannila, H. Toivonen, A. Inkeri Verkamo, “Discovery of frequent episodes in event sequences,” Data Mining and Knowledge Discovery, 1(3), pp. 259-289, 1997.
[9] J. Pei, J. Han, H. Pinto, Q. Chen, U. Dayal, M.-C. Hsu, “PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth,” In Proceedings of 2001 International Conference on Data Engineering, pp. 215-224, 2001.
[10] R. Srikant, R. Agrawal, “Mining sequential patterns: Generalizations and performance improvements,” In Proceedings of the 5th International Conference on Extending Database Technology, pp. 3-17, 1996.
[11] P. H. Wu, W. C. Peng, M. S. Chen, “Mining sequential alarm patterns in a telecommunication database,” In Proceedings of Workshop on Databases in Telecommunications (VLDB 2001), pp. 37-51, 2001.

Authors Profile

Hao-En Chueh received the Ph.D. in Computer Science and Information Engineering from Tamkang University, Taiwan, in 2007. He is an Assistant Professor of Information Management at Yuanpei University, Hsinchu, Taiwan. His research interests include data mining, fuzzy set theory, probability theory, statistics, database system and its applications.

Yo-Hsien Lin received the Ph.D. in information management from the National YunLin University of Science and Technology, Taiwan, in 2008. He is an Assistant Professor of Information Management at the Yuanpei University, Hsinchu, Taiwan. His research interests include bio-inspired systems, neural networks, evolutionary computation,
In Proceedings of the International Conference on Data evolvable hardware, intelligence system
Engineering, pp. 3-14, 1995. chip, biocomputing, pattern recognition,
and medical information management.
(IJCNS) International Journal of Computer and Network Security, 21
Vol. 2, No. 8, August 2010

Adaptive Dual Threshold Multi-Class Scheduling for Packet Switch
A. A. Abdul Rahman¹, K. Seman², K. Saadan³ and A. Azreen⁴

¹ Telekom Research & Development, System Technology Unit, TM Innovation Center, Lingkaran Teknokrat Timur, 63000 Cyberjaya, Selangor, Malaysia
abd_aziz@tmrnd.com.my
²,³ Universiti Sains Islam Malaysia, Faculty of Science & Technology, Bandar Baru Nilai, 71800 Nilai, Negeri Sembilan, Malaysia
² drkzaman@usim.edu.my
³ kamarudin@usim.edu.my
⁴ Universiti Putra Malaysia, Multimedia Department, Faculty of Computer Science and Information Technology, 43400 Serdang, Selangor
azreen@fsktm.upm.edu.my

Abstract: Multimedia applications such as video conferencing, VoIP and data streaming require specified QoS to guarantee their performance. Multi-class switches have been introduced to handle different QoS requirements. In this research, a new way of handling multi-class traffic is presented. The analysis is done on an N × N switch with two traffic classes: high priority for delay sensitive cells (class 1) and low priority for loss sensitive cells (class 0). In order to avoid the starvation problem and to improve the total mean delay in the loss sensitive class, a novel approach has been introduced in the scheduling technique. The controller in the scheduler adjusts the threshold value adaptively based on the mean queue length and traffic load condition. By adjusting these parameters adaptively, the best possible mean delay and throughput for class 0 can be achieved without degrading the QoS requirement for class 1. The proposed method has been simulated to show the performance of the adaptive threshold as compared to Priority Queue (PQ) and Weighted Fair Queue (WFQ) in terms of total mean delay and throughput. The results show that the proposed architecture achieves better performance than PQ and WFQ.

Keywords: multi-class switch, Quality of Service (QoS), adaptive threshold, switching.

1. Introduction

In modern communication networks, the desire to carry multiple traffic services in a single stream has created problems, especially in achieving the desired QoS requirement for each service. The multi-class switch [1]-[5], [8]-[11] is used to classify the multiple traffic streams based on QoS requirements. The use of priority provides the means to give different types of traffic different classes of service [4]. This requires new scheduling algorithms for packet transmission.
The scheduling algorithm plays a key role in obtaining high performance in a multi-class switch. Unfortunately, most of the existing scheduling algorithms [1],[3]-[5] only strive to maximize the QoS level in the delay sensitive class for each arriving cell without considering adaptability, which may result in poor QoS for the loss sensitive class when the system is under heavy load. A scheduling algorithm for a single priority class may not perform well when it is applied to multi-class scheduling [3]. A scheduling algorithm that is normally used with two traffic classes considers only one parameter, either the priority packet setting or the probability of serving loss sensitive classes.
A priority buffer in a multi-class switch gives more priority to delay sensitive packets such as video, voice and online games than to loss sensitive packets. In other words, by using the priority buffer, real-time applications are served first while non-real-time applications are queued in the buffer waiting to be served. Many studies have been done to reduce the waiting time for the loss sensitive cells with less consideration of the degradation of performance for the delay sensitive cells [1]-[4].
In [4], the performance of a packet switch with two priorities has been evaluated using heuristic adjustment. It was then improved by using the approximation technique of the flow conservation rule [1]. Both techniques use a priority buffer without any threshold control on the loss sensitive class. This leads to the starvation problem in the loss sensitive class under high traffic load.
In [3], a reservation based scheduling approach has been proposed to handle the QoS requirements for the delay sensitive class and the loss sensitive class. This method uses an input-output queue that requires internal speedup, which increases the complexity.
In Weighted Fair Queue (WFQ), traffic classes are served based on the fixed weight assigned to the related queue [10, 11]. The weight is determined according to the QoS parameters, such as service rate or delay. This technique is not suitable under high traffic load because of the fixed weight that is assigned to the queue.
Designing a high speed packet switch with classes creates a few problems, such as starvation in the loss sensitive class (class 0) and packets dropped due to long waiting time in class 0. This research is expected to minimize the waiting time in class 0 without affecting QoS requirements for delay sensitive cells in class 1.
To achieve the stated objective, a separate buffer for each individual class is used to accommodate different traffic
classes. This eliminates the head-of-line (HoL) blocking effect between different classes. For different ports, Virtual Output Queues (VOQ) [6], [7] are used in order to eliminate HoL blocking between different destination ports.
In this paper, we propose an adaptive hybrid scheduling method that combines two threshold settings and is able to adjust the priority level of scheduling according to the mean queue length of the delay sensitive class. In addition, the scheduling technique is non-preemptive, which is more efficient and less complex than preemptive approaches because it reduces the overhead needed for switching among cells.

2. Proposed Model
The proposed multi-class switch and the scheduling technique used in controlling the flow of cells in the switch are described below.

2.1 System model
The proposed multi-class switch architecture with N ports serving C classes of traffic (class 0 to class C-1) is shown in Figure 1. A priority switch is used to forward delay sensitive packets (class C-1) faster than loss sensitive packets (class 0).

Figure 1. Multi-class switch architecture

The delay requirement for class j cells is defined by Dj, where j = 0, 1, 2, …, C-1 and D0 > D1 > D2 > … > DC-1. In other words, the delay requirement for Class 1 is more stringent than that for Class 0 in the system shown in Figure 1. Thus, cells that queue in Class 0 have the lowest priority and cells that queue in Class C-1 have the highest priority. These delay requirements are set based on the QoS requirements for different types of applications.
A time slot is used to represent the time of one cell arrival at the input port or one cell departure at the output port. The period of time slot Ts, where s = 0, 1, 2, …, is set to be equal to the time to process a single cell when the server is idle. The class j cells arrive at the input port in every time slot according to a Bernoulli distribution with mean λj. Each cell is classified based on its delay requirement. In this architecture the class of the cell is stored in the header. An arriving cell of Class j (λj) is queued in a First-In-First-Out (FIFO) buffer while waiting to be served.

Figure 2. Head of Line Scheduler.

At each time slot, the switch attempts to serve the cells at the Head of Line (HoL) of each input queue as shown in Figure 2. When cells from different classes are waiting at HoL, the HoL scheduler (HoL Sch) selects the cell with high priority to be served. The losing cells in the contention must wait in the queue. The number of queued cells increases when new cells arrive at the queue.
The threshold setting is introduced in order to give some privileges to cells in the lower priority class. The threshold parameters used in this architecture are the number of queued cells, Nbj, and the probability of serving low priority cells, PTSCj; j = 0, 1, 2, …, C-1. The Nbj parameter is chosen because of the limited buffer size available in practical designs. Adjusting the Nbj parameter is necessary to reduce packet loss due to a full buffer. This parameter is adjusted based on the size of the buffer used to store cells in Class j. PTSCj is the probability of serving Class j when the Nbj threshold is met. The PTSCj parameter is chosen in order to control the variation of delay between high priority and low priority cells based on the high priority QoS requirements. This is necessary in order to achieve better performance for Class j cells. When both threshold values are met, the switch selects the cell from the class whose threshold is triggered, even in the presence of higher priority cells.

2.2 Simulation model
A simulation model is developed to simulate the performance of the proposed switch under the dual threshold setting. In this simulation, the architecture uses a 16 x 16 switch with two classes for every input port. Class 0 is the low priority buffer for non-real-time data, and Class 1 is the high priority buffer for real-time data.
The input-queued multi-class switch architecture uses a separate buffer for each class. Arriving cells are stored in different FIFOs based on their classes. The HoL scheduler chooses one cell from the HoL of the class FIFOs at every port to be forwarded to the switch fabric. The cells contend with each other to gain access for departure.
The proposed switch operates in time-slotted transmission to process each cell. Each time slot consists of three phases: arrival, scheduler and departure.
In the arrival phase, the incoming packets are segmented into fixed-size packets called cells and are aligned for synchronization. The maximum number of cells, Pmax, generated in one time slot depends on the traffic load, λ, and the number of ports, N, used. The relationship is shown in (1). The traffic load is the total of λHi and λLi.
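As a rough illustration of the arrival phase, the sketch below draws Bernoulli arrivals for each class at every input port in one time slot and queues them in per-class FIFOs. The names `Cell`, `make_fifos`, `arrival_phase`, `lam_high` and `lam_low` are illustrative assumptions, as is the choice to allow at most one arrival per class per port in a slot; the paper's own simulator is not shown in the text.

```python
import random
from collections import deque, namedtuple

# Hypothetical cell record: destination port, class bit (1 = delay sensitive),
# and the time slot in which the cell arrived.
Cell = namedtuple("Cell", ["dest", "cls", "slot"])


def make_fifos(n_ports):
    """Create one empty FIFO per input port."""
    return [deque() for _ in range(n_ports)]


def arrival_phase(fifo_high, fifo_low, lam_high, lam_low, n_ports, slot, rng=random):
    """Run one arrival phase with Bernoulli arrivals per class and per port.

    Assumption: each class receives at most one cell per port per slot,
    and destinations are drawn uniformly over the output ports.
    """
    for port in range(n_ports):
        # Class 1 (delay sensitive) arrives with probability lam_high.
        if rng.random() < lam_high:
            fifo_high[port].append(Cell(rng.randrange(n_ports), 1, slot))
        # Class 0 (loss sensitive) arrives with probability lam_low.
        if rng.random() < lam_low:
            fifo_low[port].append(Cell(rng.randrange(n_ports), 0, slot))
```

Keeping the arrival slot in each cell makes it straightforward to compute the waiting time later, in the departure phase, as the departure slot minus the arrival slot.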
(1)

The cells are generated randomly and uniformly for all destination ports. For uniform traffic, the maximum arrival rate at any queue is always less than 1/N of the traffic load. Figure 3 shows the address packet generation in the arrival process.

FOR (port = 1 to N) DO
{
    IF (random number < traffic load) THEN
        Destination address = random number * N;
    ELSE
        No destination address;
    ENDIF
}

Figure 3. Address Packet Generation in Arrival Process

Figure 4 shows the packet format for the input buffer multi-class switch. There are 20 bits in this cell. The first 8 bits are the header, which contains the destination address (7 bits) and the class (1 bit). The other 12 bits are for data.

Figure 4. Cell format

In the class segment, bit ‘1’ indicates that the cell belongs to the delay sensitive class and bit ‘0’ indicates the loss sensitive class. This bit is set based on the type of packet received and the QoS requirement for its application. Each generated cell is classified either to the delay sensitive class or to the loss sensitive class based on the traffic type. A high priority cell is tagged with 1 and a low priority cell with 0. Then, the cell is sent to a FIFO to wait to be served. The Class 1 cell is sent to FIFO_H1 and the Class 0 cell to FIFO_L1. The HoL cells for Class 1 and Class 0 must wait until they are served. In general, the HoL scheduler chooses the cell in Class 1 since it holds the high priority cells. When both threshold values are reached, the HoL scheduler chooses the Class 0 cell even in the presence of Class 1 cells.
The pseudo code for the input buffer with HoL output scheduling is shown in Figure 5. When the HoL for class 1 is not empty and the threshold setting is met for both parameters, the HoL output chooses the HoL packet from class 0 instead of class 1.

If (HoL class1 not empty)
    If (Nb0 > TN) && (PTSC0 > TP)
        HoL_out = HoL class0;
    else
        HoL_out = HoL class1;
else
    HoL_out = HoL class0;

Figure 5. Pseudo code for HoL scheduler

After the HoL cell is selected, it competes with cells from the other input ports. The scheduler uses a round robin policy with priority to select the HoL cell for departure.
In the departure phase, delays for Class 1 and Class 0 cells are calculated to measure the switch performance.

2.2.1 HoL scheduler with fixed thresholds value
The HoL scheduler is used to select the HoL cell from the delay sensitive class (FIFOH) and/or the loss sensitive class (FIFOL). Figure 6 shows the HoL scheduler design for a two-class switch. Mux1 transfers a cell from FIFOH if the dual threshold setting (Nb0 and PTSC0) does not meet its limit and there are cells in FIFOH. A cell from FIFOL is transferred by Mux1 if there is no cell in FIFOH or when both threshold settings reach their limits. Mux2 transfers a cell from FIFOL when the destination address for the high class (Addr H) is different from the destination address for the low class (Addr L). Mux2 eliminates the HoL blocking effect for the loss sensitive class and increases the switch performance.

Figure 6. HoL scheduler design

Figure 7 illustrates the cell flow at the HoL scheduler under three situations, which are based on the threshold condition and the destination addresses of class 1 and class 0.

[Figure 7 shows three cases. Different destination addresses: both the Class 1 and Class 0 cells pass (√). Same destination address with Nb & PTSC0 = 0: the Class 1 cell passes (√) and the Class 0 cell waits (×). Same destination address with Nb & PTSC0 = 1: the Class 1 cell waits (×) and the Class 0 cell passes (√).]

Figure 7. Example of success and failure of cell flow at HoL Scheduler
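The 20-bit cell layout of Figure 4 and the selection rule of Figure 5 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the field order inside the header (address first, then class bit), the function names, and the handling of empty FIFOs are illustrative choices that the text does not specify.

```python
ADDR_BITS, CLASS_BITS, DATA_BITS = 7, 1, 12  # 20-bit cell, as in Figure 4


def pack_cell(dest, cls, data):
    """Pack a cell as [7-bit address | 1-bit class | 12-bit data].

    Assumption: the address precedes the class bit inside the 8-bit header.
    """
    if not 0 <= dest < (1 << ADDR_BITS):
        raise ValueError("destination address must fit in 7 bits")
    if cls not in (0, 1):
        raise ValueError("class bit must be 0 or 1")
    if not 0 <= data < (1 << DATA_BITS):
        raise ValueError("data must fit in 12 bits")
    return (dest << (CLASS_BITS + DATA_BITS)) | (cls << DATA_BITS) | data


def unpack_cell(cell):
    """Split a 20-bit cell back into (dest, cls, data)."""
    data = cell & ((1 << DATA_BITS) - 1)
    cls = (cell >> DATA_BITS) & 1
    dest = cell >> (CLASS_BITS + DATA_BITS)
    return dest, cls, data


def select_hol(fifo_high, fifo_low, nb0, ptsc0, tn, tp):
    """Return the FIFO whose head-of-line cell is served, following Figure 5.

    Returns None when both FIFOs are empty, a case Figure 5 does not cover.
    """
    if fifo_high:
        # Both thresholds exceeded: serve class 0 although class 1 is waiting.
        # The extra check that class 0 is non-empty is an added safeguard.
        if fifo_low and nb0 > tn and ptsc0 > tp:
            return fifo_low
        return fifo_high
    return fifo_low if fifo_low else None
```

Returning the FIFO rather than the cell leaves the dequeue to the caller, so the same function can be used to peek at the decision before contention with other input ports is resolved.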
2.2.2 HoL scheduler with adaptive thresholds value
In order to improve the mean delay of the delay sensitive class traffic under high traffic load, a HoL scheduler with adaptive threshold values is introduced. The idea is to give absolute priority to the delay sensitive class under high traffic load so that high priority cells can be transferred efficiently and meet their QoS requirements. Meanwhile, when the traffic load is low or moderate, some level of priority is given to the loss sensitive class. Figure 8 shows the architecture of the HoL scheduler with adaptive threshold.

[Figure 8 shows the Class 0 (λ0) and Class 1 (λ1) queues feeding HoL Sch 1 and a 16 x 16 nonblocking switch fabric, with a thresholds value controller.]

Figure 8. Architecture of HoL scheduler with adaptive threshold.

The controller is used to set the threshold value based on the traffic load condition. The controller uses the class 1 HoL waiting time to determine the traffic condition. Under a uniform traffic condition the best possible setting of the threshold value PTSC0 is based on the average of the minimum difference in waiting time (DW) at HoL between class 1 (WB1) and class 0 (WB0), and between QoSclass1 and WB1. The WB0 and WB1 values are normalized based on the QoS value for class 1. The relationship is shown in (2) and (3). Meanwhile, the Nb0 value is based on the average occupied buffer in class 0 (FIFO0,i) with the number of ports in the switch (N) at the border of the classified traffic load (low, moderate or high). The relationship is shown in (4).

(2)

(3)

(4)

Equations (2), (3) and (4) are shown graphically in Figure 9, where the graphs of waiting time under uniform traffic without any threshold setting for class 1 and class 0, denoted WB1 and WB0 respectively, are used to obtain the optimum threshold setting.

Figure 9. Example of the optimal threshold setting under uniform traffic.

Under a uniform traffic condition, the traffic load classification is defined as in Table 1 for VoIP and video conferencing.

Table 1: Traffic load classification for VoIP and video conferencing.

Traffic load    Waiting time (time slot)
LOW             < 20
MODERATE        20 - 50
HIGH            > 50

Figure 10 shows the pseudo code for the controller threshold setting. Different levels of serving probability are given based on the percentage of the QoS requirement of class 1.

If (waiting time < 20% of QoS)
    Equal probability of serving class 1 and class 0 by not setting any thresholds or priority for both classes.
Else if (20% of QoS < waiting time < 50% of QoS)
    Increase probability of serving class1 to 75% by increasing the threshold level setting of serving class0.
Else if (waiting time > 50% of QoS)
    Increase probability of class1 to 100% by giving absolute priority to class1.

Figure 10. Pseudo code for controller threshold setting.
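The controller rule of Figure 10 can be expressed as a small function that maps the class 1 HoL waiting time to a class 1 serving probability. This is a sketch under assumptions: the function name and arguments are hypothetical, "equal probability" is read as 0.5, and the boundary values (exactly 20% and 50% of the QoS delay), which Figure 10 leaves open, are assigned to the moderate band.

```python
def class1_serving_probability(waiting_time, qos_delay):
    """Map the class 1 HoL waiting time to the class 1 serving probability.

    waiting_time and qos_delay are in time slots; qos_delay is the class 1
    delay requirement. Boundary handling is an assumption, see the lead-in.
    """
    if qos_delay <= 0:
        raise ValueError("qos_delay must be positive")
    ratio = waiting_time / qos_delay
    if ratio < 0.20:
        return 0.50  # low load: no threshold or priority, both classes equal
    if ratio <= 0.50:
        return 0.75  # moderate load: class 0 threshold raised
    return 1.00      # high load: absolute priority to class 1
```

For example, with a class 1 delay requirement of 100 time slots, a waiting time of 30 slots falls in the moderate band and yields 0.75, while 80 slots yields absolute priority for class 1.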


A hardware design is developed to evaluate the performance of the proposed architecture. Figure 11 shows the hardware timing simulation of incoming and outgoing cells in the multi-class switch. It can be seen that there are new incoming cells at every time slot. After the cells have been processed, only the successful cell is allowed to depart. The rest of the cells must wait for their turns. The output cell timing is used to calculate the total mean delay for Class 1 and Class 0.

Figure 11. Timing simulation of multi-class switch

3. Performance Results

In order to compare switching capabilities, the following performance metrics are considered:
• Throughput: a normalized value of the cells delivered correctly to their destination.
• Total Mean delay: the average end-to-end delay of cells, including the waiting time and serving time.
• Cell arrival ratio: the ratio of class 0 cells over class 1 cells. This measures the efficiency of the switching technique as the input of class 0 increases.

[Plot: throughput versus load. Curves: (i) Class 1 (PQ), (ii) Class 1 (WFQ), (iii) Class 1 (adaptive threshold), (iv) Class 0 (adaptive threshold), (v) Class 0 (WFQ), (vi) Class 0 (PQ).]

Figure 12. Throughput in multi-class switch with equal distribution of λ1 and λ0.

[Plot: total mean delay (time slot, logarithmic axis) versus load. Curves: (i) Class 0 (PQ), (ii) Class 0 (adaptive threshold), (iii) Class 0 (WFQ), (iv) Class 1 (WFQ), (v) Class 1 (adaptive threshold), (vi) Class 1 (PQ).]

Figure 13. Total mean delay in multi-class switch with equal distribution of λ1 and λ0.

[Plot: total mean delay (time slot, logarithmic axis) versus ratio (r). Curves: (i) Class 0 (PQ), (ii) Class 0 (adaptive threshold), (iii) Class 0 (WFQ), (iv) Class 1 (WFQ), (v) Class 1 (adaptive threshold), (vi) Class 1 (PQ).]

Figure 14. Ratio versus Mean delay in multi-class switch with λ1 fixed to 0.5

These graphs show the performance of the multi-class switch in terms of throughput and total mean delay. Figure 12 shows the throughput of cells in the switch under three environments for both classes.
a) Priority buffer.
b) WFQ.
c) Adaptive threshold.

As shown in Figure 12, throughput with adaptive threshold for class 0 increases at load 0.8 and then decreases as the load increases to 1. The class 1 throughput with adaptive threshold remains close to the class 1 throughput in PQ, but the class 1 throughput in WFQ tends to drop as the load increases to 1. Figure 13 shows the total mean delay with equal distribution of λ1 and λ0. The mean delay for class 1 with adaptive threshold is better than the mean delay in WFQ at high traffic load. Figure 14 shows the mean delay for the ratio distribution of λ1 and λ0. The mean delay for class 1 with adaptive threshold is lower than that of class 1 in WFQ as the load increases. This effect is because of the adaptive controller threshold setting, which decreases the class 0 serving properties as the load increases.

4. Conclusions

In this paper, the multi-class switch with adaptive dual threshold is proposed to optimize the performance of class 0 traffic without affecting the QoS requirement for class 1. The simulation results show that the mean delay and throughput using adaptive threshold are better than with WFQ. In adaptive threshold, the PTSC0 and Nb parameters are adjusted automatically based on the condition of the traffic load. By adjusting these parameters adaptively, the optimum mean delay and throughput for class 0 can be achieved without degrading the QoS requirement for class 1.

References
[1] Choi, J. S. and C. K. Un, "Delay Performance of an Input Queueing Packet Switch with Two Priority Classes". Communications, IEE Proceedings- Vol. 145 (3): pp. 141-144, 1998.
[2] Warde, W. and P. A. Ivey, "Input Queueing Multicast ATM Packet Switch with Two Priority Classes Using a
Priority Scheme with a Window Policy". Electronics Letters Vol. 32 (20): pp. 1854-1855, 1996.
[3] Pao, D. C. W. and S. P. Lam, "Cell Scheduling for ATM Switch with Two Priority Classes". ATM Workshop Proceedings, IEEE: pp. 86-90, 1998.
[4] Chen, J. S. C. & R. Guerin, "Performance Study of an Input Queueing Packet Switch with Two Priority Classes". Communications, IEEE Transactions on Vol. 39 (1): pp. 117-126, 1991.
[5] A.A. Abdul Rahman, K. Seman and K. Saadan, “Multi-class Scheduling Technique using Dual Threshold,” APSITT, Sarawak, Malaysia, 2010.
[6] N. McKeown, V. Anantharam, and J. Walrand, “Achieving 100% throughput in an input-queued switch,” in Proc. IEEE INFOCOM ‘96, San Francisco, CA, pp. 296–302, 1996.
[7] A. Mekkittikul and N. McKeown, “A practical scheduling algorithm for achieving 100% throughput in input-queued switches,” in Proc. INFOCOM ‘98, San Francisco, CA, vol. 2, pp. 792–799, 1998.
[8] Lemin, L., H. Caijun & L. Pu. "Maximum Throughput of an Input Queueing Packet Switch with Two Priority Classes". Communications, IEEE Transactions on Vol. 42 (12): pp. 3095-3097, 1994.
[9] Lim, Y. & J. E. Kobza. "Analysis of a Delay-Dependent Priority Discipline in an Integrated Multiclass Traffic Fast Packet Switch". Communications, IEEE Transactions on Vol. 38 (5): pp. 659-665, 1990.
[10] Al-Sawaai, A., I. Awan & R. Fretwell. "Analysis of the Weighted Fair Queuing System with Two Classes of Customers with Finite Buffer". Advanced Information Networking and Applications Workshops, 2009. WAINA '09. International Conference on: pp. 218-223, 2009.
[11] Al-Sawaai, A., I. U. Awan & R. Fretwell. "Performance of Weighted Fair Queuing System with Multi-Class Jobs". Advanced Information Networking and Applications (AINA), 24th IEEE International Conference on: pp. 50-57, 2010.

Authors Profile

A. A. Abdul Rahman received his Bachelor of Engineering (Electrical – Electronics) and Master of Engineering (Electrical) from Universiti Teknologi Malaysia, Johor in 2002 and 2004. He is currently pursuing the PhD degree at Universiti Sains Islam Malaysia (USIM). He is also an Associate Senior Researcher in Telekom Research and Development. His research interests are in hardware system design, high speed switching, networking and software engineering.

K. Seman obtained B.Elec.Eng (2nd Class Upper) from Universiti Teknologi Malaysia (UTM) in 1985, MSc in Telematics from Essex University, UK in 1986, and PhD in Electrical Engineering (Communication Networks) from Strathclyde University, UK in 1994. He served as an academician at the Faculty of Electrical Engineering, UTM from 1985 till 2002. He was promoted to full professor in Telecommunication Engineering in 2000. From 2003 till 2005, he worked at Telekom Malaysia R&D on numerous network research projects. In Dec 2005 he joined Universiti Sains Islam Malaysia as Professor in Network Technology and Security. His research interests are network performance modeling and analysis, cryptography, and switching technology.

K. Saadan is a Senior Fellow (Computer Science) in the Information Security and Assurance Programme in the Faculty of Science, Universiti Sains Islam Malaysia. Currently he is the Director of the Centre for Information Technology in USIM. He holds a Bachelor of Science degree in Mathematics, Master of Science in Computer Science and PhD in Systems Science Management. His areas of research interest are Intelligent Decision Support Systems, Software Quality Assurance and Knowledge Management. His expertise is in Software Engineering and Software Quality Assurance. So far he has published more than 35 papers and technical reports in the area of Computer Science and Information Technology. In the last fifteen years he has been actively involved in various systems development activities and research, and in ICT project planning and management.

A. Azman received his B.IT from Universiti Multimedia in 1999 and PhD in Information Retrieval from University of Glasgow in 2007. From 1999 till 2002, he worked as System Engineer at ON Semiconductor (M) Sdn. Bhd. From 2008 till 2009, he worked as lecturer in Universiti Sains Islam Malaysia (USIM). In May 2009, he joined Universiti Putra Malaysia as Senior Lecturer. His research interests are in information retrieval, relevance feedback learning, data mining and knowledge discovery.

Low Budget Honeynet Creation and Implementation for NIDS and NIPS
Aathira K. S¹, Hiran V. Nath², Thulasi N. Kutty³, Gireesh Kumar T⁴

¹ TIFAC CORE in Cyber Security Centre, Amrita Vishwa Vidyapeetham, Coimbatore, India
aathiramanikutty@gmail.com
² TIFAC CORE in Cyber Security Centre, Amrita Vishwa Vidyapeetham, Coimbatore, India
hiranvnath@gmail.com
³ TIFAC CORE in Cyber Security Centre, Amrita Vishwa Vidyapeetham, Coimbatore, India
thulasi.nk@gmail.com
⁴ TIFAC CORE in Cyber Security Centre, Amrita Vishwa Vidyapeetham, Coimbatore, India
gireeshkumart@gmail.com

Abstract: This paper describes Honeynet, a system for automated generation of attack signatures for network intrusion detection and prevention systems. A honeypot is a security resource whose value lies in being probed, attacked or compromised. We examine different kinds of honeypots, honeypot concepts, and approaches to their implementation. Our system applies pattern detection techniques and protocol based classification on the traffic captured on a honeypot system. Software such as Sun VirtualBox and VMware was used for this purpose so that it was not necessary to buy a large number of high-end systems to implement this setup, which reduced the cost to a great extent. While running Honeynet in a WAN environment, the system successfully created precise traffic signatures and updated the firewall, work that otherwise would have required the skills and time of a security officer.

Keywords: IDS, IPS, Honeypot, Honeynet, Snort.

1. Introduction
A honeypot is tough to define because it is a new and changing technology, and it can be involved in different aspects of security such as prevention, detection, and information gathering. It is unique in that it is a more general technology, not a solution, and does not solve a specific security problem. Instead, a honeypot is a highly flexible tool with applications in such areas as network forensics, vulnerability analysis and intrusion detection. A honeypot is a security resource whose value lies in being probed, attacked, or compromised. Currently, the creation of NIDS signatures is a tedious manual process that requires detailed knowledge of the traffic characteristics of any phenomenon that is supposed to be detected by a new signature. Simplistic signatures tend to generate large numbers of false positives; overly specific ones cause false negatives. To address these issues, we present Honeynet, a system that generates signatures for malicious network traffic automatically. Our system applies pattern detection techniques and protocol based classification on the traffic captured on honeypots. Honeypots are computer resources set up for the purpose of monitoring and logging activities of entities that probe, attack or compromise them. Honeypots are closely monitored network decoys serving several purposes. They can distract attackers from more valuable machines on a network; they can provide early warning about new attack and exploitation trends; and they allow in-depth examination of adversaries during and after exploitation of a honeypot. Honeypots are a technology whose value depends on the "bad guys" interacting with it. All honeypots work on the same concept: nobody should be using or interacting with them, therefore any transactions or interactions with a honeypot are, by definition, unauthorized. “Honeynet” is a term that is frequently used where honeypots are concerned. A honeynet is simply a network that contains many honeypots, where the traffic to each honeypot is controlled using a honeywall. More precisely, it is a high-interaction honeypot that is designed to capture extensive information on threats and provides real systems, applications, and services for attackers to interact with.
A Redhat Linux machine is used for routing packets between the honeypots and the actual servers, because buying an actual router is expensive. A firewall was built on the Linux machine to filter packets at the gateway so that its rules could be updated and there was no need to buy additional firewalls. VMware software, which provides virtual machines, is used so that there is no need to buy large server-class machines. Even the resources of a single machine can be shared by many hosts that are used as honeypots.
This paper is organized as follows: In Section 2 we examine different types of honeypots and the honeywall. In Section 3 we provide an overview of the system architecture. Section 4 presents the implementation. Section 5 shows our findings. We then conclude and provide our opinion on the future of honeypots in Section 6.

Figure 1. Honeypot Setup using Virtualization

2. Types of Honeypots
Honeypots can be classified based on their purpose (production, research) and level of interaction (low, medium, and high). We examine each type in more detail below.
2.1. Purpose of Honeypots

2.1.1. Research Honeypot
A research honeypot is designed to gain information about the blackhat community and does not add any direct value to an organization [4]. Research honeypots are used to gather intelligence on the general threats organizations may face, allowing the organization to better protect against those threats. Their primary function is to study the way in which attackers progress and establish their lines of attack, which helps in understanding their motives, behavior and organization. Research honeypots are complex to both deploy and maintain, and they capture extensive amounts of data. They can be very time intensive. Very little is contributed by a research honeypot to the direct security of an organization, although the lessons learned from one can be applied to improve attack prevention, detection, or response. They are typically used by organizations such as universities, governments, the military or large corporations interested in learning more about threats. Research honeypots add tremendous value to research by providing a platform to study cyberthreats. Attackers can be watched in action and recorded step by step as they attack and compromise the system. This intelligence gathering is one of the most unique and exciting characteristics of honeypots [7]. It is also a beneficial tool in the development of analysis and forensic skills. Sometimes research honeypots can even be instrumental in discovering new worms.

2.1.2. Production Honeypot
A production honeypot is what most people think of when discussing honeypots. A production honeypot is one used within an organization’s environment to protect the organization and help mitigate risk [4]. It has value because it provides immediate security to a site’s production resources. Since production honeypots require less functionality than a research honeypot, they are typically easier to build and deploy. Although they identify attack patterns, they give less information about the attackers than research honeypots. You may learn which systems the attackers are coming from and what exploits are being launched, but perhaps not who they are, how they are organized, or what tools they are using. Production honeypots tend to mirror the production network of the company (or specific services), inviting attackers to interact with them in order to expose current vulnerabilities of the network. Uncovering these vulnerabilities and alerting administrators of attacks can provide early warning of attacks and help reduce the risk of intrusion [3]. The data provided by the honeypot can be used to build better defenses and countermeasures against future threats.
It should be pointed out that as a prevention mechanism, production honeypots have minimal value. Best practices should be implemented involving the use of firewalls and IDSs, and the locking down and patching of systems. The most common attacks are done using scripts and automated tools. Honeypots may not work well against these, since such attacks focus on many targets of opportunity, not a single system. Their main benefit is in the area of detection. Because of their simplicity, honeypots address the main challenges of IDSs: they produce minimal false positives and false negatives. There are several situations where an IDS may not issue an alert: the attack is too recent for the vendor to have released a signature, the rule matching it was disabled because it caused too many false positives, or the IDS is seeing too much traffic and is dropping packets. False positives occur when an untuned IDS alerts far too often on normal network traffic. These alerts soon get ignored or the rules triggering them are modified, but then real attacks may be missed. In addition, there is a serious problem with the volume of data to analyze with IDSs, which cannot cope with the network traffic on a large system. Honeypots address these challenges: since honeypots have no production activity, all the traffic sent to a honeypot is almost certainly unauthorized, meaning no false positives, no false negatives and no large data sets to analyze. Also, once an attack has been detected, the machine can be pulled offline and thorough forensics performed, something that is often difficult if not impossible with a production system. In general, commercial organizations derive the most direct benefit from production honeypots. These categorizations of honeypots are simply a guideline to identify their purpose; the distinction is not absolute. Sometimes the same honeypot may be either a production or a research honeypot. It is not so much how it is built but how it is used [6].
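
The observation above, that any traffic reaching a honeypot is almost certainly unauthorized, can be illustrated with a minimal Python sketch. The addresses, the record layout `(source, destination, port)` and the function name `flag_honeypot_traffic` are assumptions made for this illustration only; they are not taken from the system described in this paper.

```python
from collections import Counter

# Hypothetical honeypot addresses (assumption for this sketch): hosts with no
# production role, so nobody has a legitimate reason to contact them.
HONEYPOT_ADDRESSES = {"10.0.0.50", "10.0.0.51"}


def flag_honeypot_traffic(connections, honeypots=HONEYPOT_ADDRESSES):
    """Count, per source IP, the connections whose destination is a honeypot.

    No signature matching is involved: the destination alone decides.
    """
    hits = Counter()
    for src, dst, _dst_port in connections:
        if dst in honeypots:
            hits[src] += 1
    return hits


# Illustrative connection log: (source IP, destination IP, destination port).
connections = [
    ("192.0.2.10", "10.0.0.5", 80),       # ordinary traffic to a production host
    ("198.51.100.7", "10.0.0.50", 22),    # touches a honeypot
    ("198.51.100.7", "10.0.0.51", 445),   # touches another honeypot
    ("203.0.113.9", "10.0.0.50", 23),     # touches a honeypot
]

suspects = flag_honeypot_traffic(connections)
for source, count in sorted(suspects.items()):
    print(source, count)
```

Because the decision depends only on the destination, the amount of data to review is limited to the few records that touch a honeypot, which is the property the text attributes to production honeypots.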

Figure 2. Honeypots in a production environment

2.2. Level of Interaction
In addition to being either production or research honeypots,

honeypots can also be categorized based on the level of involvement allowed between the intruder and the system. These categories are: low-interaction, medium-interaction and high-interaction. What you want to do with your honeypot will determine the level of interaction that is right for you.
2.2.1. Low-interaction Honeypots
A low-interaction honeypot simulates only services that cannot be exploited to gain total access to the honeypot [5]. On a low-interaction honeypot, there is no operating system for the attacker to interact with [2] (p. 19). Low-interaction honeypots can be compared to a passive IDS, since they do not modify network traffic in any way and do not interact with the attacker. Although this minimizes the risk associated with honeypots, it also makes low-interaction honeypots very limited. However, they can still be used to analyze spammers and can also be used as active countermeasures against worms [5]. Low-interaction honeypots are easy to deploy and maintain. An example of a low-interaction honeypot is Honeyd, an open-source daemon that is able to simulate large network structures on a single network host [3, 13]. Honeyd works by imitating computers on the unused IP addresses of a network, and provides the attacker with only a façade to attack. Another example of a low-interaction honeypot is Specter, which is developed and sold by NetSec. Specter has functionality like an enterprise version of BackOfficer Friendly (BOF) and only affects the application layer.
2.2.2. Medium-Interaction Honeypots
Medium-interaction honeypots are slightly more sophisticated than low-interaction honeypots, but less sophisticated than high-interaction honeypots [8]. Like low-interaction honeypots they do not have an operating system installed, but the simulated services are technically more complicated. Although the probability that the attacker finds a security vulnerability increases, it is still unlikely that the system will be compromised [2] (p. 20). Medium-interaction honeypots provide the attacker with a better illusion of an operating system, since there is more for the attacker to interact with. More complex attacks can therefore be logged and analyzed. Some examples of medium-interaction honeypots include mwcollect, nepenthes and honeytrap. Mwcollect and nepenthes can be used to collect autonomously spreading malware. These daemons can log automated attacks and extract information on how to obtain the malware binaries, so that they can automatically download the malware. Honeytrap dynamically creates port listeners based on TCP connection attempts extracted from a network interface stream, which allows the handling of some unknown attacks.
2.2.3. High-interaction honeypots
These are the most advanced honeypots. They are the most complex and time-consuming to design, and involve the highest amount of risk because they involve an actual operating system [2] (pp. 20-21). The goal of a high-interaction honeypot is to provide the attacker with a real operating system to interact with, where nothing is simulated or restricted [8]. The possibilities for collecting large amounts of information are therefore greater with this type of honeypot, as all actions can be logged and analyzed. Because the attacker has more resources at his disposal, a high-interaction honeypot should be constantly monitored to ensure that it does not become a danger or a security hole [2]. A honeynet is an example of a high-interaction honeypot, and it is typically used for research purposes.
2.3. Free and commercial honeypot solutions
2.3.1. Nepenthes Honeypot
The Nepenthes Honeypot VM is provided by SPARSA as part of its ongoing viral research project. It is freely available for download and use as a VM from http://www.sparsa.org/node/23
2.3.1.1. Features
Nepenthes Ampullaria acts like a honeypot: it feigns vulnerability to viruses, worms and intrusions and downloads them into hexdumps which can be reversed. A collection of 30,000 attacks grows each day that the Nepenthes machine is online, gathering data on what is in the wild to submit to anti-virus companies. Currently SPARSA operates a centralized Nepenthes server out of their office in RIT's CIMS building. A virtual machine running Nepenthes is available for download and use by the public at the URL above. Running these virtual machines requires VMware Player, VMware Server, or VMware Workstation. VMware Player and VMware Server are free to all; RIT also has a site license for VMware Workstation. The virtual machines are to be used either on their own unprotected box serving VMware images, or placed in the DMZ of a firewalled environment. This gives the best opportunity to catch attacks and exploits in the wild. It is SPARSA's goal to set up a centralized submission and analysis cluster with help from the community. All VMs submit to the SPARSA server, where results will be analyzed and submitted to major AV companies and the Norman Sandbox. A copy is also kept for the local user to examine. Using this tactic, the Security Practices and Research Student Association hopes to analyze viruses and malware in the wild by allowing everyone to participate in collection and analysis. Future versions will pare down the known malware on the VM clients so that only unknown malware is submitted to the server.
2.3.2. BackOfficer Friendly:
A free win32 based honeypot solution by NFR Security (a separate Unix port is available but has restricted functionality). It is able to emulate single services such as telnet, ftp and smtp, and to log connection attempts in a rudimentary way.
2.3.3. Deception toolkit (DTK):
A free and programmable solution intended to make it appear to attackers as if the system running DTK has a large number of widely known vulnerabilities (http://www.all.net/dtk/dtk.html).
2.3.4. HOACD:
This is a ready-to-run honeyd+OpenBSD+arpd on a bootable CD (http://www.honeynet.org.br/tools)
2.3.5. Honeyd
The designers of Honeyd expect adversaries to interact with honeypots only at the network level. Instead of simulating

every aspect of an operating system, they decided to simulate only its network stack. The main drawback of this approach is that an adversary never gains access to a complete system even if he compromises a simulated service. On the other hand, we are still able to capture connection and compromise attempts. For that reason, Honeyd is a low-interaction virtual honeypot that simulates TCP and UDP services. Honeyd must be able to handle virtual honeypots on multiple IP addresses simultaneously. This allows us to populate the network with a number of virtual honeypots that can simulate different operating systems and services. Furthermore, Honeyd must be able to simulate different network topologies.

Figure 3. Honeyd receives traffic for its virtual honeypots via a router or Proxy ARP. For each honeypot, Honeyd can simulate the network stack behavior of a different operating system.

2.3.5.1. Architecture
When the Honeyd daemon receives a packet for one of the virtual honeypots, it is processed by a central packet dispatcher. The dispatcher checks the length of the IP packet and verifies its checksum. The daemon knows only three protocols: ICMP, TCP and UDP. Packets for other protocols are discarded.

Figure 4. Overview of Honeyd’s architecture. Incoming packets are dispatched to the correct protocol handler. For TCP and UDP, the configured services receive new data and send responses if necessary. All outgoing packets are modified by the personality engine to mimic the behavior of the configured network stack.

2.3.6. HYW – Honeyweb
An in-depth simulation of an IIS 6.0 webserver that enables you to use your own web content (a good choice for capturing worms).
2.3.7. Mantrap / Decoy Server (commercial)
Symantec Decoy Server sensors deliver holistic detection and response, and provide detailed information through a system of data collection modules.
2.3.8. Specter
SPECTER offers common Internet services such as SMTP, FTP, POP3, HTTP and TELNET. These appear normal to attackers but are in fact traps in which they leave traces without knowing that they are connected to a decoy system. The decoy does none of the things it appears to do; instead it logs everything and notifies the appropriate people.
2.4. Installing your own honeypot
Depending on the type of technology used, there are different things to consider when installing and deploying a honeypot.
2.4.1. Low-interaction honeypot:
Make sure an attacker can’t access the underlying operating system (especially when using plugins). If possible, make use of the honeypot’s features to emulate a more realistic environment (e.g. traffic shaping). Make sure to use the latest versions available.
2.4.2. Medium-interaction honeypot:
Make sure an attacker can’t escape the jailed environment. Be aware of SUID or SGID files.
2.4.3. High-interaction honeypot:
Use advanced network techniques to control the honeypot (e.g. firewalls, intrusion detection systems) and make sure it can’t be used to harm third parties (e.g. legal issues of an open relay). If possible, poison the honeypot. Use software that actually has vulnerabilities, or your honeypot might never be exploited successfully. Use tripwire or AIDE to get a snapshot of the system.
2.5. Virtual Honeywall
The honeywall is implemented in a virtual machine which has three network cards. One is for the connection to the firewall, the second for the internal network, and the third for remote management of the honeywall itself.
2.6. Honeynet
A honeynet is made by networking the honeypots described above; the traffic to each honeypot is controlled with the help of the honeywall.

3. System Architecture

Figure 5. Small Model of our Architecture

Here the traffic enters the network through the Linux firewall. All of the traffic then goes through the honeywall. From there, the honeywall redirects the traffic of each section to the corresponding honeywall. This is done because a very large number of connections may arrive at the web server; some of these may be attacks and some may be normal. The redirected traffic which reaches the honeywall is examined separately, and for traffic found to be malicious the source IP address is sent to the firewall, so that the rules stored in the firewall can be updated. The outer firewall is initially set up on a Redhat Linux machine, and its IPTables rules are updated via network messages. Groups of honeypots were created inside a single machine using virtualisation techniques. Software such as Sun VirtualBox and VMware was used for this purpose, so that it was not necessary to buy a large number of high-end systems to implement this setup, and the cost was thereby reduced to a great extent.

4. Implementation

Figure 6. Communication Architecture (diagram labels: Internet, Honeywall, Honeypot; No Restrictions; Connections Limited; Packet Scrubbed)

Figure 7. Implemented Routing of packets (diagram labels: IPTABLES FIREWALL with INPUT, FORWARD and OUTPUT chains)

We use two types of attack detection techniques: threshold based classification and protocol based classification. The first one monitors the traffic and checks whether it exceeds the defined threshold for a particular source IP address. The occurrence of each source IP in a flow is determined, together with the total number of unique destinations and unique ports accessed by that IP. The ratio of the number of destination IPs to the number of destination ports is then computed (IP/Port ratio). This IP/Port ratio is compared with the threshold value, and if it is far greater than or far less than the threshold value, the source IP is tagged as suspicious. The same process is then repeated over multiple time intervals, and if the same source IP shows a similar behaviour, it is confirmed as a suspicious IP. The IP address is then added to the iptables rules of the Linux firewall so that further communication from the same IP address is blocked.
In the second module, the monitored traffic is classified based on protocol. Some of these features were implemented with the help of honeyd. Here the whole traffic is classified based on protocols and logged for further analysis. Using the dynamically generated signatures, the snort database is updated in real time.

5. Findings
By implementing this system, we obtained a cost effective intrusion detection and prevention system. Since the signatures in the snort database are updated in real time, it also works as a prevention system. Since the VMware virtual machines run on a single piece of hardware, we were able to set up different honeypots with different levels of interaction, so that some are good at capturing network attacks and others can be used for preventing the spread of worms in the network.

6. Conclusions and Future Outlook
In this paper we have provided a brief overview of what honeynets are and how they are useful in NIDS and NIPS. We have discussed the different types of honeypots, the honeywall, and how to combine them and set up a honeynet. We also looked at factors that should be considered when implementing a honeypot. Here we have used it along with a firewall and an IPS module which updates the rules in the firewall. VMware virtual machine software is used so that there is no need to buy large server-class machines. The resources of a single machine can be shared by many virtual hosts which serve as honeypots, thereby making the setup cost effective.
We are planning to use the honeypots or honeynet for vulnerability analysis in a network, to find both host based and network based vulnerabilities. These honeynets could also be used to detect the spread of a worm in a network and to prevent it from spreading to the entire network by creating and updating the signature automatically.

References
[1] I. Mokube, M. Adams, “White paper: Honeypots: Concepts, Approaches, and Challenges,” ACMSE 2007, March 23-24, 2007, Winston-Salem, North Carolina, USA.
[2] R. Baumann, C. Plattner, “White Paper: Honeypots,” Swiss Federal Institute of Technology, Zurich, 2002.
[3] K. Gubbels, “Hands in the Honeypot,” GIAC Security Essentials Certification (GSEC), 2002.
[4] Karthik S, Samudrala B, Yang AT, “Design of Network Security Projects Using Honeypots,” Journal of

Computing Sciences in Colleges, 20(4), pp. 282-293, 2005.
[5] N. Provos, “Honeypot Background,” http://www.honeyd.org/background.php.
[6] L. Spitzner, “Honeypots: Tracking Hackers,” Addison-Wesley Pearson Education, Boston, MA, 2002.
[7] L. Spitzner, “The Value of Honeypots, Part One: Definitions and Values of Honeypots,” Security Focus, 2001.
[8] R.E. Sutton Jr., DTEC 6873, “Section 01: How to Build and Use a Honeypot.”
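
As a supplementary illustration of the threshold based classification described in Section 4, the following Python sketch computes the IP/Port ratio per source IP, tags sources whose ratio is far above or far below the threshold, and confirms those that repeat the behaviour across time intervals. The paper gives no numeric values, so the threshold `1.0`, the multiplicative `factor` that stands in for "far greater or far less", the `min_intervals` parameter and all function names are assumptions made for this sketch. The generated `iptables` command is shown only as a string and is likewise an assumed form of the blocking rule.

```python
from collections import defaultdict


def ip_port_ratios(flows):
    """Map each source IP to (#unique destination IPs) / (#unique destination ports)."""
    dests, ports = defaultdict(set), defaultdict(set)
    for src, dst, dst_port in flows:
        dests[src].add(dst)
        ports[src].add(dst_port)
    return {src: len(dests[src]) / len(ports[src]) for src in dests}


def suspicious_sources(flows, threshold=1.0, factor=4.0):
    """Sources whose ratio is far greater or far less than the threshold.

    'Far' is modelled here as a multiplicative factor (an assumption).
    """
    return {
        src
        for src, ratio in ip_port_ratios(flows).items()
        if ratio >= threshold * factor or ratio <= threshold / factor
    }


def confirmed_sources(intervals, min_intervals=2, threshold=1.0, factor=4.0):
    """Sources tagged as suspicious in at least `min_intervals` time intervals."""
    counts = defaultdict(int)
    for flows in intervals:
        for src in suspicious_sources(flows, threshold, factor):
            counts[src] += 1
    return {src for src, n in counts.items() if n >= min_intervals}


def block_rule(ip):
    """Build (but do not execute) an iptables command that drops traffic from `ip`."""
    return f"iptables -A INPUT -s {ip} -j DROP"


# Interval 1: a host sweep (8 hosts, 1 port), a port scan (1 host, 8 ports)
# and one ordinary client.
interval_1 = (
    [("198.51.100.7", f"10.0.0.{i}", 22) for i in range(1, 9)]
    + [("203.0.113.9", "10.0.0.5", port) for port in range(20, 28)]
    + [("192.0.2.10", "10.0.0.5", 80), ("192.0.2.10", "10.0.0.6", 443)]
)
# Interval 2: only the host sweep repeats.
interval_2 = (
    [("198.51.100.7", f"10.0.1.{i}", 22) for i in range(1, 9)]
    + [("192.0.2.10", "10.0.0.5", 80)]
)

confirmed = confirmed_sources([interval_1, interval_2])
for ip in sorted(confirmed):
    print(block_rule(ip))
```

In this example the host sweep has a ratio of 8 and the port scan a ratio of 0.125, so both are tagged in the first interval, but only the source that repeats its behaviour in the second interval is confirmed and turned into a blocking rule.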

Authors Profile

Aathira K S received her B.Tech in Computer Science and Engineering from Kerala University. She is currently pursuing an M.Tech. in Cyber Security at Amrita School of Engineering, Coimbatore. Her research interests are Intrusion Detection and Prevention Systems and Malware detection.

Hiran V Nath received his B.Tech in Information Technology from Kerala University. During 2007-2009, he worked in VSSC/ISRO, Govt of India, on a contract basis through Hi-Rel Fabs, Trivandrum. He is currently pursuing an M.Tech. in Cyber Security at Amrita School of Engineering, Coimbatore. His research interests are Intrusion Detection and Prevention Systems and Malware detection.

Thulasi N. Kutty received her B.Tech in Computer Science and Engineering from Kerala University. She is currently pursuing an M.Tech. in Cyber Security at Amrita School of Engineering, Coimbatore. Her research interests are Intrusion Detection and Prevention Systems and Malware detection.

Gireesh Kumar T received his B.Tech degree in Mechanical Engineering from N.S.S. College, Palghat, Kerala in 1998. He attained his M.Tech degree in Computer and Information Science from Cochin University of Science and Technology, Cochin, Kerala in 2002. He is currently pursuing a PhD in Artificial Intelligence at Anna University, Chennai. He was a Senior Lecturer with the Department of Computer Science and Engineering at VLB Janakiammal College of Engineering, Coimbatore, Tamilnadu from 2004 to 2008. He is now an Assistant Professor (Sr. Grade) with Centre for Cyber at Amrita Vishwa Vidyapeetham, Ettimadai, Tamilnadu. His research interests are in the fields of Artificial Intelligence, Machine Learning and Algorithms. He has about 20 publications to his credit.

Spatial Cluster Coverage of Sensor Networks


G.N. Purohit, Megha Sharma
1 Department of Mathematics, AIM & ACT, Banasthali University, Banasthali-304022
gn_purohitjaipur@yahoo.co.in
2 Department of Computer Science, AIM & ACT, Banasthali University, Banasthali-304022
edify44@yahoo.com

Abstract: The availability of Wireless Sensor Networks (WSNs) offers the opportunity to approach practical problems in a different way, sensing the environment and collecting data dynamically. Since sensor nodes are deployed in a large region, the objective is to achieve complete coverage of the region, that is, every location in the region lies in the observation field of at least one sensor node. However, the initial placement of sensors may not achieve this goal for various reasons. In this paper, we study the coverage phase transition in sensor networks. Given a three dimensional Region of Interest (R.O.I), we study the transition from small fragmented regions to a single large covered region.

Keywords: critical density, covered components, percolation, spherical shell.

1. Introduction
A sensor is a device which has the capability to perceive the environment where it is established or the phenomenon that justified its deployment. It must also be able to transmit the perceived data. There has been a growing interest in studying and building systems of mobile sensor networks. It is envisaged that in the near future, very large scale networks consisting of both mobile and static nodes will be deployed for various applications, ranging from environment monitoring to emergency search-and-rescue operations [12], [3]. Issues such as location, deployment and tracking are fundamental, because many applications such as battlefield surveillance, environmental monitoring and biological detection rely on them.
In this paper, we address one of the fundamental problems of WSNs, namely coverage. Coverage is a metric of the quality of service that reflects how well a target field is monitored under the base station. The coverage problem can be studied under different objectives and constraints imposed by the applications, such as worst-case coverage [6], deterministic coverage [6], [1] or stochastic coverage [2], [6], [1], [4], [9]. The coverage provided by sensor networks is crucial to their effectiveness [8]. We wish to track the transition of the Region of Interest (R.O.I) from partially covered to fully covered. As more and more sensors are continuously deployed, the degree of sensing coverage provided by a WSN and/or the size of the covered region increases. We compute the probability of the change from small covered fragments to a single large connected component, and we study this phase transition in network coverage through percolation theory.
The remainder of this paper is organized as follows. Section 2 defines the percolation approach for coverage. In Section 3 the coverage problem at critical percolation is solved. Section 4 concludes the paper.

2. Coverage in a Three-Dimensional region at critical percolation
We propose a probabilistic approach to compute the covered volume fraction at critical percolation for the phase transition of coverage. We solve the coverage problem using a Poisson distribution for computing the covered volume fraction. We consider a set of homogeneous sensing spheres whose centers represent the locations of sensors and are randomly distributed in three dimensional space according to a Poisson distribution of density $\lambda$. In the percolation based approach, we find the critical density $\lambda_{cd}$. This density $\lambda_{cd}$ represents the density of sensors for the first minimum required coverage, such that percolation surely occurs when $\lambda > \lambda_{cd}$.
2.1 Models and Terminology
In this section we describe the percolation model and introduce some relevant terminology. We first give some definitions related to the model.
2.1.1 Definitions
Def 1. (Sensing Range). The sensing range $S_i(r)$ of a sensor $S_i$ is a sphere of radius $r$ centered at $\xi_i$ and defined by
$$S_i(r) = \{\xi_j \in \mathbb{R}^3 : \|\xi_i - \xi_j\| \le r\},$$
where $\|\xi_i - \xi_j\|$ stands for the Euclidean distance between $\xi_i$ and $\xi_j$.
Def 2. (Covered Volume Fraction). The covered volume fraction of a Poisson Boolean model $(X_\lambda, \{S_i(r) : i \ge 1\})$, given by
$$V(r) = 1 - \exp(-v\lambda),$$
is the mean fraction of volume covered by the sensing spheres $S_i(r)$, for $i \ge 1$, in a region of unit volume, where $v = \frac{4}{3}\pi r^3$ is the volume of the sensing sphere and $\lambda$ is the density of the Poisson point process $X_\lambda$.
Def 3. (Collaborating Sensor). Two sensors $S_i$ and $S_j$ are said to be collaborating if and only if the Euclidean distance between the centres of their spheres satisfies $\|\xi_i - \xi_j\| \le 2r$. The collaborating set of a sensor $S_i$, denoted by $Col(S_i)$, includes all of the sensors it can collaborate with, i.e.
$$Col(S_i) = \{S_j : \|\xi_i - \xi_j\| \le 2r\}$$
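
Definitions 1 to 3 can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the function names, the example sensor positions and the example density are hypothetical, and a sensor is excluded from its own collaborating set for readability, a choice the definition above does not spell out.

```python
import math


def sphere_volume(r):
    """Volume v = (4/3) * pi * r^3 of a sensing sphere of radius r."""
    return 4.0 / 3.0 * math.pi * r ** 3


def covered_volume_fraction(r, lam):
    """Def 2: V(r) = 1 - exp(-v * lambda) for density lam of the Poisson process."""
    return 1.0 - math.exp(-sphere_volume(r) * lam)


def in_sensing_range(center, point, r):
    """Def 1: a point lies in S_i(r) if its distance to the sensor center is <= r."""
    return math.dist(center, point) <= r


def collaborating_set(i, centers, r):
    """Def 3: indices j != i whose centers lie within distance 2r of sensor i."""
    return {
        j
        for j, c in enumerate(centers)
        if j != i and math.dist(centers[i], c) <= 2 * r
    }


# Hypothetical sensor locations in R^3 and a common sensing radius.
centers = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (5.0, 0.0, 0.0)]
r = 1.0

print(collaborating_set(0, centers, r))   # sensor 1 overlaps sensor 0
print(collaborating_set(2, centers, r))   # sensor 2 is isolated
print(covered_volume_fraction(r, 0.1))    # fraction covered at density 0.1
```

The covered volume fraction grows monotonically from 0 toward 1 as the density increases, which is the quantity tracked at critical percolation in the sections that follow.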
Def 4. (Collaboration Path). A collaboration path between two sensors $S_i$ and $S_j$ is a sequence of sensors $S_i, S_{i+1}, \ldots, S_{j-1}, S_j$ such that any pair of sensors $S_l$ and $S_{l+1}$, for $i \le l \le j-1$, are collaborating.

Let $X_\lambda = \{\xi_i : i \ge 1\}$ be a three dimensional homogeneous Poisson point process of density $\lambda$, where $\xi_i$ represents the location of the sensor $S_i$.
Def 5. (Spatial Poisson Point Process). Let $X_\lambda(V)$ be a random variable representing the number of points in a volume (region) $V$. The probability that there are $k$ points inside $V$ is computed as
$$P(X_\lambda(V) = k) = \frac{\lambda^k \|V\|^k}{k!} \exp(-\lambda \|V\|) \tag{1}$$
for all $k \ge 0$, where $\|V\|$ is the volume of region $V$.

2.2 Percolation Model
A percolation model can be viewed as an ensemble of points distributed in space, where some pairs are adjacent [B]. We consider a Boolean model which is defined by two components:
(i) Point process $X_\lambda$, and
(ii) Connection function $h$

The set $X_\lambda = \{\xi_i : i \ge 1\}$ is a homogeneous Poisson point process of density $\lambda$ in the three dimensional Euclidean space $\mathbb{R}^3$, where the elements of $X_\lambda$ are the locations of the sensors used to cover a field.
The connection function $h$ is defined such that two points $\xi_i$ and $\xi_j$ are adjacent, independently of all other points, with probability $h(\|\xi_i - \xi_j\|)$ given by
$$h(\|\xi_i - \xi_j\|) = \begin{cases} 1 & \text{if } \|\xi_i - \xi_j\| \le d \\ 0 & \text{otherwise} \end{cases}$$
where $\|\xi_i - \xi_j\|$ is the Euclidean distance between $\xi_i$ and $\xi_j$.
We consider a continuum percolation model that consists of homogeneous spheres whose centers (representing the locations of the spheres) are randomly distributed in $\mathbb{R}^3$ according to a spatial Poisson point process of density $\lambda$. In percolation theory we are interested in the critical density $\lambda_{cd}$ at which an infinite cluster of overlapping spheres first appears. The density $\lambda_{cd}$ is the critical value of the density $\lambda$ such that there exists no infinite cluster of overlapping spheres almost surely when $\lambda < \lambda_{cd}$, but there is an infinite cluster of overlapping spheres almost surely when $\lambda > \lambda_{cd}$, and we say percolation occurs.

3. Phase Transition from Scattered Sensing Clusters to Single Percolation Coverage
In this problem we are interested in finding the probability of the first appearance of an infinite (or single large) coverage component that spans the entire network.
Def 6. (Covered k-Component). A set of sensing spheres $\{S_i(r) : 1 \le i \le m\}$ is said to be a covered component if and only if it is maximal and there exists a collaboration path between any pair of sensors $S_j$ and $S_l$, for all $1 \le j, l \le m$ and $j \ne l$. A covered k-component, denoted by $C_k$, is a covered component having $k$ sensing spheres.
Def 7. (Critical Covered Volume Fraction). The critical covered volume fraction of $(X_\lambda, \{S_i(r) : i \ge 1\})$, computed as $V_c = 1 - \exp(-v\lambda_c)$, is the fraction of volume covered at critical percolation, where $\lambda_c = \lambda_{cd}$ is the associated critical density of $X_\lambda$.
Here, we compute the probability of the occurrence of the first minimum required coverage, which appears at the critical density $\lambda_{cd}$. We assume that $\lambda$ is not constant, as the sensors are deployed randomly. We want to compute the critical density $\lambda_{cd}$ at critical percolation, such that when $\lambda > \lambda_{cd}$ the Boolean model $(X_\lambda, \{S_i(r) : i \ge 1\})$ is said to be percolating.

3.3 Approximation of the Shape of Covered Components

Figure 1. (a) Diagrammatic representation of overlapping spheres. (b) Shape of a covered component

The centers of all covered k-components, represented by $\xi_k$, also form a Poisson process with density $\lambda(k)$, i.e., the covered components are randomly and independently distributed according to a Poisson process with a density of $\lambda(k)$ centers per unit volume. We assume that the geometric form that encloses a covered k-component is a spherical shell. Let $R_k$ be the radius of this spherical shell, denoted by $S_p(R_k, k)$, and assume there is no other sensing sphere overlapping with the boundary of the spherical shell. Thus the concentric spherical band of thickness $r$, denoted by $CC_b(r)$, which surrounds the spherical shell should not include any other sensing sphere. Hence the annulus between radii $R_k$ and $R_k + r$ around the center $\xi(k)$ must be empty.
Let $P(k)$ be the probability that the concentric spherical shell encloses only one covered k-component. This probability is given by

\[ P(k) = \mathrm{Prob}\big[\, S_p(R_k, k) \mid CC_b(r) \text{ is empty} \,\big], \]
where CC_b(r) is the concentric spherical shell, S_p(R_k, k) is the sphere which encloses a covered k-component, and R_k is the radius of the sphere.

To ensure that the sphere encloses only one covered k-component, the annulus between radii R_k and R_k + r around the center ξ(k) must be empty:
\[ P(k) = \frac{\mathrm{Prob}\big[ S_p(R_k, k) \text{ and } CC_b(r) \text{ empty} \big]}{\mathrm{Prob}\big[ CC_b(r) \text{ empty} \big]} \tag{2} \]
where Prob[S_p(R_k, k) and CC_b(r) empty] is the probability that the spherical shell of thickness R_k + r encloses only one covered k-component. Thus,
\[ \mathrm{Prob}\big[ S_p(R_k, k) \text{ and } CC_b(r) \text{ empty} \big] = \mathrm{Prob}\big[ S_p(R_k + r, k) \big]. \]
From (1),
\[ \mathrm{Prob}\big[ S_p(R_k + r, k) \big] = \frac{\left( \lambda \tfrac{4}{3}\pi (R_k + r)^3 \right)^k}{k!} \exp\!\left( -\lambda \tfrac{4}{3}\pi (R_k + r)^3 \right) \]
and
\[ \mathrm{Prob}\big[ CC_b(r) \text{ empty} \big] = \exp\!\left( -\lambda \tfrac{4}{3}\pi \left( (R_k + r)^3 - R_k^3 \right) \right). \]
Therefore,
\[ p(k) = \frac{\left( \lambda \tfrac{4}{3}\pi (R_k + r)^3 \right)^k}{k!} \exp\!\left( -\lambda \tfrac{4}{3}\pi R_k^3 \right). \tag{3} \]

3.4 Density at Critical Percolation for covered component
The average distance between sensors is the average of the minimum distance between all sensing spheres, each from one covered component. Two covered components can be merged together into a single one if and only if there is a pair of sensing spheres such that the distance between their centers is at most equal to 2r.
At critical percolation, the average distance between two neighboring covered components is given by:
\[ d^{1}_{avg} = \frac{1}{2\sqrt{\lambda_c(k)}} \tag{4} \]
where λ_c(k) is the density of C_k at critical percolation.
Also the average distance between two neighboring covered components is given by
\[ d^{2}_{avg} = \frac{\mathrm{erf}\!\left( 2\sqrt{\lambda_c \pi}\, r \right) - 4\sqrt{\lambda_c}\, r\, e^{-4\lambda_c \pi r^2}}{2\sqrt{\lambda_c}} \tag{5} \]
where erf(x) is the error function [9],
\[ \mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x \exp\!\left( -t^2 \right) dt. \]
From (4) and (5) we get
\[ \lambda_c(k) = \frac{\lambda_c}{\left( \mathrm{erf}\!\left( 2\sqrt{\lambda_c \pi}\, r \right) - 4\sqrt{\lambda_c}\, r\, e^{-4\lambda_c \pi r^2} \right)^2}. \tag{6} \]

3.5 Radius at Critical Percolation of Covered Components
Critical radius of a covered component is a particular value of the radius R_k of the spherical region enclosing a covered component that surely guarantees the formation of a special class of covered k-components. Regardless of the number of sensing spheres of radius r located in a sphere of radius 2r, these sensing spheres should definitely form a covered k-component (i.e., R_k = 2r, the critical radius, ensures a covered k-component).
At critical percolation, the density of covered k-components, which are enclosed in spheres whose radii (critical radius) are equal to 2r, is given as
\[ \lambda = \frac{N}{\tfrac{4}{3}\pi R^3} \tag{7} \]
where N is the number of sensing spheres randomly deployed in a spherical region of radius R with density λ.
The mean number of covered k-components is
\[ \omega_k = \lambda(k)\, \tfrac{4}{3}\pi R^3, \qquad \lambda(k) = \frac{\omega_k}{\tfrac{4}{3}\pi R^3}. \]
From (7),
\[ \lambda(k) = \lambda\, \frac{\omega_k}{N}. \tag{8} \]
The ratio ω_k / N can be approximated by the probability P[rad(C_k) = 2r]. Hence,
\[ P[\mathrm{rad}(C_k) = 2r] = \frac{\omega_k}{N}. \tag{9} \]
From (8) and (9),
\[ \lambda(k) = \lambda\, P[\mathrm{rad}(C_k) = 2r]. \tag{10} \]
Substituting R_k = 2r in (3),
\[ \lambda_c(k) = \frac{\lambda_c \left( 36 \lambda_c \pi r^3 \right)^k}{k!} \exp\!\left( -\tfrac{32}{3} \lambda_c \pi r^3 \right). \tag{11} \]

3.6 Identification of critical percolation
We generate an equation that identifies critical percolation for a set of covered k-components.

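From (6) and (11),
\[ g_1(V_c(r), k) = \frac{\left( \mathrm{erf}\!\left( 2\sqrt{\lambda_c \pi}\, r \right) - 4\sqrt{\lambda_c}\, r\, e^{-4\lambda_c \pi r^2} \right)^2 (27\mu)^k\, e^{-8\mu}}{k!} - 1 = 0 \tag{12} \]
where g_1(V_c(r), k) is the equation for percolation.
Since we are interested in computing the covered volume fraction at critical percolation,
\[ V_c(r) = 1 - e^{-\lambda_c \frac{4}{3}\pi r^3}. \]
Hence,
\[ \lambda_c \tfrac{4}{3}\pi r^3 = -\log\!\left( 1 - V_c(r) \right) = \mu. \]
Therefore, substituting V_c(r) in (12),
\[ g_1(V_c(r), k) = \frac{\left( \mathrm{erf}\!\left( 2\sqrt{\tfrac{3\mu}{4r}} \right) - 4\sqrt{\tfrac{3\mu}{4\pi r}}\, \exp\!\left( -\tfrac{3\mu}{r} \right) \right)^2 (27\mu)^k \exp(-8\mu)}{k!} - 1 = 0. \tag{13} \]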
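As a hedged check on how the percolation condition arises, the step from (6) and (11) to (12) can be written out. Here E is shorthand (introduced only for this sketch) for the bracketed erf expression in the numerator of (5), and μ is the quantity (4/3)λ_c π r³ defined in the text.

```latex
% Sketch only: E abbreviates the numerator of (5); \mu = \tfrac{4}{3}\lambda_c \pi r^3.
\begin{align*}
\frac{\lambda_c}{E^2}
  &= \frac{\lambda_c\,(36\lambda_c\pi r^3)^k}{k!}
     \exp\!\Big(-\tfrac{32}{3}\lambda_c\pi r^3\Big)
  && \text{equating (6) and (11)} \\
36\lambda_c\pi r^3 &= 27\mu, \qquad \tfrac{32}{3}\lambda_c\pi r^3 = 8\mu
  && \text{since } \mu = \tfrac{4}{3}\lambda_c\pi r^3 \\
\Longrightarrow\quad
\frac{E^2\,(27\mu)^k\,e^{-8\mu}}{k!} - 1 &= 0 .
\end{align*}
```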

3.7 Numerical Results

The function g1(Vc(r), k) described in equation (13), which varies with Vc(r), is represented in Figs. 2(a), 2(b) and 2(c) for k = 2, k = 3 and k = 4, respectively. We observe that the function g1(Vc(r), k) does not attain the value 0 for k = 2, Fig. 2(a). However, it attains the value 0 for k = 3 and for k = 4, Figs. 2(b) and 2(c); thus percolation occurs first at k = 3. Since we are interested in the covered volume fraction, the percolation will be complete when the value of Vc(r) is greater than or equal to 0. Analyzing the results, we obtain that critical percolation occurs at Vc(r) = 0.23 and Vc(r) = 0.26 for k = 3 and k = 4, respectively.

Figure 2. Critical percolation at k=3 and k=4 for Vc(r) = 0.23 and Vc(r) = 0.26, respectively
4. Conclusion
In this paper we have discussed the problem of phase transition from scattered sensing clusters to single percolation coverage in WSNs using a probabilistic approach. We determined when an infinite covered component could take place for the first time. To achieve this objective, we have used the covered volume fraction metric for calculating the critical percolation.

References
[1] B. Liu and D. Towsley, “A Study of the Coverage of Large-scale Sensor Networks,” In Proceedings of MASS ’04, 2004.
[2] C. Huang and Y. Tseng, “The Coverage Problem in a Wireless Sensor Network,” In Proceedings of WSNA ’03, pp. 115-121, 2003.
[3] D. Estrin, D. Culler, K. Pister, and G. S. Sukhatme, “Connecting the physical world with pervasive networks,” IEEE Pervasive Computing, vol. 1, no. 1, pp. 59-69, 2002.
[4] D. Miorandi and E. Altman, “Coverage and Connectivity of Ad Hoc Networks in Presence of Channel Randomness,” In Proceedings of the IEEE INFOCOM 05, pp. 491–502, March 2005.
[5] F. Koushanfar, S. Meguerdichian, M. Potkonjak, and M. Srivastava, “Coverage Problems in Wireless Ad-Hoc Sensor Networks,” In Proceedings of the IEEE INFOCOM 01, pp. 1380–1387, 2001.
[6] G. Vendhan S. and S. V. Manisekaran, “A Survey on Hybrid Mobile Sensor Networks,” International Journal of Recent Trends in Engineering, Vol. 1, No. 2, 2009.
[7] G. Xing, X. Wang, Y. Zhang, C. Lu, R. Pless and C. Gill, “Integrated Coverage and Connectivity Configuration for Energy Conservation in Sensor Networks,” ACM Transactions on Sensor Networks, Vol. 1, No. 1, pp. 36–72, 2005.
[8] http://mathworld.wolfram.com/Erf.html, 2008.
[9] J.W. Essam, “Percolation theory,” Reports on Progress in Physics, vol. 43, pp. 833-912, 1980.
[10] S. Poduri and G. S. Sukhatme, “Constrained Coverage for Mobile Sensor Networks,” In IEEE International Conference on Robotics and Automation, pp. 165-172, 2004.

Authors Profile

Prof. G. N. Purohit is a Professor in the Department of Mathematics & Statistics at Banasthali University (Rajasthan). Before joining Banasthali University, he was Professor and Head of the Department of Mathematics, University of Rajasthan, Jaipur. He had been Chief-editor of a journal. His present interest is in O.R., Discrete Mathematics and Communication networks. He has published around 40 research papers in various journals.

Megha Sharma received the B.C.A and M.C.A degrees from I.G.N.O.U in 2004 and 2008, respectively. She is currently working towards a Ph.D degree in Computer Science at the Banasthali University of Rajasthan. Her research interests include wireless sensor networks with a focus on the coverage of wireless sensor networks.

A New Approach to Measure Quality of Image Encryption
Alireza Jolfaei¹ and Abdolrasoul Mirghadri²
¹ Faculty and Research Center of Communication and Information Technology, IHU, Tehran, Iran
Jolfaei@yahoo.com
² Faculty and Research Center of Communication and Information Technology, IHU, Tehran, Iran
Amrghdri@ihu.ac.ir

Abstract: Image encryption techniques are applied widely in the digital world today to assure information security. Although more and more encryption algorithms appear, they lack a method to evaluate the encryption quality. Visual inspection is not enough for judging the quality of encrypted images. So, we propose three classes of measurements based on the pixel’s position changing, value changing and both value-position changing. In order to evaluate the efficiency of the methods, measurements were applied on three different chaotic image encryption algorithms based on the baker’s map, Arnold cat map and standard map. Experimental results indicate the performance of the measurement techniques in terms of producing results that are consistent with the judgment by visual inspection.

Keywords: encryption quality, chaotic image encryption, baker’s map, Arnold cat map, standard map

1. Introduction
Nowadays, along with the development of digital technologies and telecommunication networks, there is a substantial increase in the demand for private and secure movement of highly confidential imagery data over public channels. The concern for protection of information is increasing at an alarming rate. It is important to protect the confidentiality of imagery data from unauthorized access. Security breaches may affect users’ privacy and reputation. So, data encryption is widely used to ensure security in open networks such as the internet.
A digital image is a massive two-dimensional data set. The smallest unit of an image is a pixel. In a digital image, each pixel represents a different level of color intensity. According to the capacity of human visual perception in distinguishing different levels of intensity, the entire range of intensity is divided into 256 levels. Thus, the level of intensity in each pixel has a value between 0 and 255. This range is represented by a byte (8 bits). Therefore, each pixel is equal to one byte. For example, a gray scale image with size of 256×256 pixels is approximately 65 KB. So, even an image with a small size has a large data volume. Due to the large data size and real time requirement, it is not reasonable to use conventional encryption methods. Thus, a major recent trend is to minimize the computational requirements for secure multimedia distribution.
During the last two decades, chaotic dynamical systems have attracted the attention of cryptographers due to their definable and pseudo-random behavior. In consequence of increased interest in this field, a large number of chaos based image encryption schemes have been proposed [1, 2, 3]. Designing good image encryption schemes has become a focal research topic since the early 1990s. So far a number of image encryption quality measures have been proposed [4, 5, 6]. However, most of the previous studies on image encryption were based on visual inspection to judge the effectiveness of the encryption techniques. Unfortunately, there are no classified measures to justify and compare the effectiveness of proposed schemes. However, in [7], Elkamchouchi and Makar presented quantitative measures of the encryption quality based on maximum deviation and correlation. Afterwards, they proposed an improved version of the maximum deviation measure and named it the irregular deviation measurement. In this paper, we present new classified tests for encryption quality measurement, implement these tests on three common encryption schemes based on baker’s map, Arnold cat map and standard map, and compare the results.
This paper is organized as follows. In the next section three image encryption schemes based on chaotic maps are briefly overviewed. In Section 3, the new classified measures of encryption quality are introduced. Experimental results for the presented encryption schemes are reported in Section 4. Finally, some conclusions are given in Section 5.

2. Chaotic Image Encryption Algorithm
The increasing interest in utilizing chaotic dynamics in various cryptographic applications has ignited tremendous demand for chaos generators with complex dynamics but simple designs. The mixing property of chaotic maps is of particular interest for cryptographic designs. Due to the differences in formulations, the nature of the generated chaotic maps may not be the same and hence their characteristics are different. Among chaotic maps, the 2D baker’s map, Arnold cat map and standard map attract much attention. These prevalent maps are described as follows.

2.1 Baker’s Map
The baker’s map, invented by Eberhard Hopf in 1937, is an intuitively accessible, two-dimensional chaos-generating

discrete dynamical system [8]. This is a simple example of a map similar to a horseshoe, although it is a discontinuous map [9]. Consider the map F for the half-open square [0,1)×[0,1) onto itself where
\[ F(x, y) = \left( \sigma(x),\, g(a, x, y) \right) \tag{1} \]
\[ \sigma(x) = 2x \bmod 1, \qquad 0 \le \sigma(x) < 1 \tag{2} \]
\[ g(a, x, y) = \begin{cases} \tfrac{1}{2}\, a y, & 0 \le x < \tfrac{1}{2} \\ \tfrac{1}{2} (a y + 1), & \tfrac{1}{2} \le x < 1 \end{cases} \bmod 1, \qquad 0 \le g < 1. \tag{3} \]
We show F(S) in Fig. 1, where S is the unit square. The geometrical nature of the map is equivalent to a horizontal stretching and vertical contraction, followed by a vertical cutting and stacking. This resembles the preparation of dough, so F is often called the baker’s transformation.
Since an image is defined as a 2D matrix with finite pixels, a correspondingly discretized form of the baker’s map needs to be derived. In fact, a discretized map is required to assign a pixel to another pixel in a bijective manner. In [10], Pichler and Scharinger suggested an approach for the discretized generalized baker’s map as follows:
\[ B_{(n_1, \ldots, n_k)}(x, y) = \left( \frac{N}{n_i} (x - N_i) + y \bmod \frac{N}{n_i},\ \frac{n_i}{N} \left( y - y \bmod \frac{N}{n_i} \right) + N_i \right). \tag{4} \]
Considering an N×N square image, B_(n1,...,nk) denotes the discretized generalized baker’s map with N_i ≤ x < N_i + n_{i+1} and 0 ≤ y < N. The sequence of k integers, n_1, …, n_k, is chosen such that each integer n_i divides N and n_1 + … + n_i = N_i.

Figure 1. Baker’s map: (a) geometrical nature of the baker’s map, (b) area contraction by the map F.

2.2 Arnold Cat Map
The Arnold cat map is a discrete system that stretches and folds its trajectories in phase space. Vladimir Arnold discovered the ACM in the 1960s and he used the image of a cat while working on it [11]. Assume that the dimension of the original grey scale image is N×N. The Arnold cat map is described as follows:
\[ \begin{pmatrix} x_{n+1} \\ y_{n+1} \end{pmatrix} = A \begin{pmatrix} x_n \\ y_n \end{pmatrix} \bmod N = \begin{pmatrix} 1 & p \\ q & pq + 1 \end{pmatrix} \begin{pmatrix} x_n \\ y_n \end{pmatrix} \bmod N, \tag{5} \]
where p and q are positive integers and det(A) = 1, which makes the map area-preserving. The point (x_{n+1}, y_{n+1}) is the new position of the original pixel position (x_n, y_n) when the Arnold cat map is performed once. The period T of the Arnold cat map depends on the parameters p, q and the size N of the original image. After iterating this map m times, we have
\[ \begin{pmatrix} x_{n+m} \\ y_{n+m} \end{pmatrix} = A^m \begin{pmatrix} x_n \\ y_n \end{pmatrix} \bmod N. \tag{6} \]
An interesting property of the ACM is related to the Poincaré Recurrence Theorem [12]. The Poincaré Recurrence Theorem states that certain systems will, after a sufficiently long time, return to a state very close to the initial state. This means that after a certain number of iterations the ACM will return to its original state.

2.3 Standard Map
The 2D standard map illustrates the motion of a simple mechanical system called the kicked rotator [13]. This map is an area-preserving chaotic map from [0, 2π)×[0, 2π) onto itself and is described by
\[ \begin{cases} x_{n+1} = (x_n + y_n) \bmod 2\pi \\ y_{n+1} = (y_n + k \sin x_{n+1}) \bmod 2\pi, \end{cases} \tag{7} \]
where x_n, y_n ∈ [0, 2π), and the constant k > 0 is the control parameter. In order to map image pixels to one another in a bijective manner, the discretized version of the standard map is required. In [14], Fridrich stated the criterion for continuous map discretization. So, the discretized standard map is attained by substituting X = xN/2π, Y = yN/2π, K = kN/2π, which maps from [0, 2π)×[0, 2π) to N×N. The discretized map is as follows
\[ \begin{cases} x_{n+1} = (x_n + y_n) \bmod N, \\ y_{n+1} = \left( y_n + K \sin\!\left( \dfrac{2\pi x_{n+1}}{N} \right) \right) \bmod N. \end{cases} \tag{8} \]
This map reduces the computational complexity by operating in the integer domain. So, it is more suitable for real-time data encryption.

3. Measurement of Encryption Quality
Image encryption quality measures are figures of merit used for the evaluation of image encryption techniques. We classify these measures into three categories: methods based on the pixel’s position changing, methods based on the pixel’s value changing and methods based on both pixel’s value and position changing. We present these measures as follows.

3.1 Measurement Based on the Position Changing
Here, we propose a method to justify the confusion property of a chaotic map. That is to test the average distance change (ADC) among indices of close pixels in the plain-image and indices of relocated pixels in the cipher-image. If an H×W image is permuted by a chaotic map, then for the four neighbor pixels in the plain-image {(i–1, j), (i+1, j), (i, j–1), (i, j+1): (i = 1, 2, …, H–2), (j = 1, 2, …, W–2)}, the average distance change is defined as
\[ ADC(i, j) = \tfrac{1}{4} \big[ D((i'-1, j'), (i-1, j)) + D((i'+1, j'), (i+1, j)) + D((i', j'-1), (i, j-1)) + D((i', j'+1), (i, j+1)) \big], \tag{9} \]
\[ D((i', j'), (i, j)) = \sqrt{(i' - i)^2 + (j' - j)^2}, \tag{10} \]
where (i', j') is the location of the pixel permuted from the one at (i, j). Thus, the average distance change in the whole image is
\[ ADC = \frac{1}{(H-2)(W-2)} \sum_{i=1}^{H-2} \sum_{j=1}^{W-2} ADC(i, j). \tag{11} \]
As seen from Eq. (11), the average distance change is always bigger than 0, unless the permuted image is the same as the original one. The bigger the ADC, the more confused the original image. The ADC depends on the iteration time.

3.2 Measurement Based on the Value Changing
Plain-image pixel values change after image encryption as compared to their original values before encryption. Such change may be irregular. This means that the higher the change in pixel values, the more effective will be the image encryption and hence the encryption quality. So the encryption quality may be expressed in terms of the total changes in pixel values between the plain-image and the cipher-image. Ahmed et al. proposed a measure for encryption quality that is expressed as the deviation between the original and encrypted image [4]. This method is determined as follows:
Let P, C denote the original image (plain-image) and the encrypted image (cipher-image) respectively, each of size W×H pixels with L grey levels. P(x, y), C(x, y) ∈ {0, ..., L−1} are the grey levels of the images P, C at position (x, y), 0 ≤ x ≤ W–1, 0 ≤ y ≤ H–1. We will define H_L(P) as the number of occurrences of each grey level L in the original image (plain-image), and H_L(C) as the number of occurrences of each grey level L in the encrypted image (cipher-image). The encryption quality represents the average number of changes to each grey level L and it can be expressed mathematically as:
\[ EQ = \frac{\sum_{L=0}^{255} \left| H_L(C) - H_L(P) \right|}{256}. \tag{12} \]
Another measurement is proposed by Luo et al. by computing the relative error [5], which for an image of H×W is defined as
\[ ARE = \frac{1}{HW} \sum_{i=0}^{H-1} \sum_{j=0}^{W-1} \frac{\left| P(i, j) - C(i, j) \right|}{\left| P(i, j) \right|}, \tag{13} \]
which gives the average relative error of a pixel.

3.3 Measurement Based on the Value and Position Changing
In [6], Yu et al. presented a new method of evaluating scrambling degree through judging the relativity of close pixels. Their method for evaluating scrambling degree for an image of H×W is as follows:
\[ S = \frac{\sum_{i=0}^{H-1} \sum_{j=0}^{W-1} R_{ij}}{255^2 \times H \times W}, \tag{14} \]
\[ R_{ij} = F_1(i, j) + F_2(i, j) + F_3(i, j) + F_4(i, j), \tag{15} \]
\[ \begin{aligned} F_1(i, j) &= \left| [C(i-1, j) - C(i, j)]^2 - [P(i-1, j) - P(i, j)]^2 \right| \\ F_2(i, j) &= \left| [C(i+1, j) - C(i, j)]^2 - [P(i+1, j) - P(i, j)]^2 \right| \\ F_3(i, j) &= \left| [C(i, j-1) - C(i, j)]^2 - [P(i, j-1) - P(i, j)]^2 \right| \\ F_4(i, j) &= \left| [C(i, j+1) - C(i, j)]^2 - [P(i, j+1) - P(i, j)]^2 \right|, \end{aligned} \tag{16} \]
where S is the scrambling degree, S ∈ (0,1), R_ij is the relativity of each pixel and its close pixels, F_k(i, j), k ∈ {1, 2, 3, 4}, is the relativity of each pixel and the 4 pixels around it, C(i, j) is the pixel of the cipher-image and P(i, j) is the pixel of the plain-image. A problem exists in Yu et al.'s measurement method that occurs at the edge of the image. The pixels that fall off the image matrix are not defined. So, there is a problem in computing F_k(i, j) for pixels on the border of the image (i = 0 or i = H−1, j = 0 or j = W−1). Yu et al. probably performed zero padding for the pixels that fall off the image to calculate F_k(i, j). We redefine the space of the scrambling degree function to (H–2)×(W–2) to omit zero padding and improve the scrambling degree measurement as follows:
\[ S = \frac{\sum_{i=1}^{H-2} \sum_{j=1}^{W-2} R_{ij}}{255^2 \times (H-2) \times (W-2)}. \tag{17} \]
This method evaluates not only the change of each pixel’s position, but also the change of adjacent pixels’ values.

4. The Analysis of Simulation Experiment
In order to further confirm the feasibility and validity of the presented measures, we select the classical 256×256 Lena image with 256 gray levels as the original image and adopt the discretized generalized baker’s map, Arnold cat map and discretized standard map as the encryption algorithms.
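One of the three permutations adopted for the experiment, the discretized standard map of Eq. (8), can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the toy 8×8 image and the choice to round the sine term to the nearest integer (so the result stays on the pixel grid) are all assumptions made here.

```python
import math

def standard_map_step(x, y, n, k):
    """One step of the discretized standard map on an n x n grid.

    Assumption: the sine term is rounded to the nearest integer so that
    the new coordinates remain valid pixel indices.
    """
    x_new = (x + y) % n
    y_new = (y + round(k * math.sin(2 * math.pi * x_new / n))) % n
    return x_new, y_new

def permute_image(img, k, rounds=1):
    """Relocate every pixel of a square image along the map, `rounds` times."""
    n = len(img)
    for _ in range(rounds):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                xn, yn = standard_map_step(x, y, n, k)
                out[xn][yn] = img[x][y]
        img = out
    return img

# Toy 8x8 "image" whose pixel values are all distinct (hypothetical data).
toy = [[8 * r + c for c in range(8)] for r in range(8)]
scrambled = permute_image(toy, k=5, rounds=3)
```

Because y can be recovered from (x_new, y_new) and then x from x_new and y, each step is a bijection on the grid whatever the rounding, so the scrambled image contains exactly the original pixel values in new positions.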

Fig. 2 shows the original image. Fig. 3 shows the results of applying the generalized discretized baker map with the sequence of 9 divisors of 256: (8, 8, 8, 64, 128, 16, 16, 4, 4), once, 9 and 20 times, respectively. By comparing the original and the encrypted images in Figs. 2 and 3, there is no visual information observed in the encrypted images, and the encrypted images are visually indistinguishable. The cipher-images of Figs. 3(b) and 3(c) are almost the same and seem to have uniform pixel distribution. Uniformity caused by an encryption function is justified by the chi-square test. For more information about uniform distribution, we recommend taking a look at [15]. Fig. 4 demonstrates the results of applying the ACM once, 9 and 20 times, respectively. The fluctuations in the cipher image are visually inspected along with the iteration times. The cipher-image iterated 9 times is almost indistinguishable from the one iterated 20 times. Fig. 5 depicts the results of applying the discretized standard map with k = 2010 as the control parameter once, 9 and 20 times, respectively. The cipher-images of Fig. 5 are visually more uniform than the cipher-images of Fig. 4. Figs. 3, 4 and 5 lead to the conclusion that there is a small fluctuation in the cipher-image along with the iteration times. However, there are some differences in the cipher-images and it is difficult to judge the quality by visual inspection. Also, it is observed that the discretized generalized baker’s map and the discretized standard map generate more uniform cipher-images in comparison with ACM.

Figure 2. Original image.

Figure 3. The test image after applying the discretized generalized baker’s map: (a) once, (b) 9 times and (c) 20 times.

Figure 4. The test image after applying the ACM: (a) once, (b) 9 times and (c) 20 times.

Figure 5. The test image after applying the discretized standard map: (a) once, (b) 9 times and (c) 20 times.

According to the property of chaotic maps, the confusion property depends on the iteration time. We tested the confusion property of the baker’s map, Arnold cat map and standard map by computing the average distance change. Table 1 shows the average distance change of the chaotic maps for one, 9 and 20 iterations. The resulting curves are shown in Fig. 6, which shows the relationship between the ADC and the number of iterations. The chaotic algorithms are iterated 192 times for this test. As seen from the figure, the curves fluctuate as the number of iterations increases. However, the curve of the baker’s map oscillates much more than those of the ACM and the standard map. Moreover, by comparing Figs. 6(a), 6(b) and 6(c), it is seen that the ACM permutation period is smaller than that of the discretized generalized baker’s map and the discretized standard map. The ADC curve of ACM is symmetric for 192 iterations. By contrast, there is no symmetry in the ADC curves of the baker and standard maps within 192 iterations. After 192 iterations of the ACM, the pixels of the test image return to their original locations. This return to the original image makes it proportionately easier for attackers to decipher the message through a simple brute force attack.
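The recurrence of the ACM described above can be checked numerically with a short sketch of Eq. (5). The paper does not state the values of p and q used in the experiment, so p = q = 1 is an assumption here, and the function names are illustrative only. The period is found as the smallest m for which A^m is the identity modulo N, which is exactly the condition for every pixel to be back at its starting position.

```python
def cat_map_step(x, y, n, p=1, q=1):
    """One application of Eq. (5) with A = [[1, p], [q, p*q + 1]], modulo n."""
    return (x + p * y) % n, (q * x + (p * q + 1) * y) % n

def cat_map_period(n, p=1, q=1):
    """Smallest m >= 1 such that A^m is the identity matrix modulo n."""
    a = (1, p, q, p * q + 1)                 # 2x2 matrix in row-major order
    identity = (1 % n, 0, 0, 1 % n)
    m, cur = 1, tuple(v % n for v in a)
    while cur != identity:
        cur = (
            (cur[0] * a[0] + cur[1] * a[2]) % n,
            (cur[0] * a[1] + cur[1] * a[3]) % n,
            (cur[2] * a[0] + cur[3] * a[2]) % n,
            (cur[2] * a[1] + cur[3] * a[3]) % n,
        )
        m += 1
    return m

# p = q = 1 is an assumed parameter choice, not taken from the paper.
period = cat_map_period(256)
```

Under this assumed parameter choice the computed period for N = 256 is 192, which agrees with the iteration count at which the text reports the test image returning to its original form. Other values of p and q would generally give a different period.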

Figure 6. Test result of average distance change (ADC) for the discretized generalized baker’s map, Arnold cat map and discretized standard map. Here, Figs. (a), (b) and (c) correspond to baker, Arnold and standard map, respectively. In each figure, the curves show the relationship between ADC and iteration time.

Table 1: ADC measurement

Chaotic Map                            Number of iterations
                                       1          9          20
Discretized generalized baker’s map    100.5775   131.4606   126.9832
ACM                                    138.5058   132.9395   132.9348
Discretized standard map               136.5355   132.5094   132.8694
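Values of the kind reported in Table 1 follow from Eqs. (9) to (11). The sketch below is one possible reading, not the authors' code: each term of Eq. (9) is interpreted as the Euclidean distance between a neighbour's original position and its relocated position, and the permutation is passed in as a function from old to new indices. The two toy permutations are hypothetical.

```python
import math

def adc(perm, h, w):
    """Average distance change over the interior pixels, Eqs. (9)-(11).

    `perm(i, j)` returns the new index (i', j') of the pixel at (i, j).
    Assumption: each term of Eq. (9) is the displacement of one of the
    four neighbours of (i, j).
    """
    def moved(i, j):
        ni, nj = perm(i, j)
        return math.hypot(ni - i, nj - j)

    total = 0.0
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            total += (moved(i - 1, j) + moved(i + 1, j)
                      + moved(i, j - 1) + moved(i, j + 1)) / 4.0
    return total / ((h - 2) * (w - 2))

def identity(i, j):
    """Permutation that leaves every pixel in place."""
    return i, j

def shift_rows(i, j):
    """Hypothetical toy permutation: cyclic shift of rows by 3 on a 16x16 grid."""
    return (i + 3) % 16, j

adc_identity = adc(identity, 16, 16)
adc_shift = adc(shift_rows, 16, 16)
```

As the text notes, the measure is zero only when nothing moves. Any of the chaotic maps can be plugged in as `perm` to trace ADC against the number of iterations.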

Test results based on the image value changing for one, 9 and 20 iterations are listed in Table 2. Table 2 illustrates that Ahmed et al.’s proposed method is not efficient for computing the encryption quality of permutation algorithms realized by chaotic maps. This method is based on computing the average changes in the number of occurrences of each grey level. The chaotic maps under study do not change the number of occurrences of each grey level. So, the result of Ahmed et al.’s method of encryption quality is zero. From the listed data in Table 2 we can see that the average relative error between pixels of the plain-image and cipher-image changes as the number of iterations increases. It is not easy to compare the results by simply observing them in the table. So, for a better comparison, we computed the average relative error of the chaotic maps for 192 iterations and depicted the results in Fig. 7. By comparing Figs. 7(a), 7(b) and 7(c), it is seen that the curves of the baker’s and standard maps contain large sharp rises followed by sharp declines, as opposed to ACM’s curve, which is steadier. Also, the baker’s map oscillation range is larger than the standard map’s oscillation range. As the iteration time increases, the change in pixel values realized by the discretized standard map varies less than the changes caused by the discretized generalized baker’s map.

Table 2: Measurement based on the value changing

Chaotic Map                            Ahmed et al.   Luo et al. (number of iterations)
                                                      1        9        20
Discretized generalized baker map      0              0.1456   0.1471   0.1437
ACM                                    0              0.1382   0.1469   0.1469
Discretized standard map               0              0.1526   0.1458   0.1456

Figure 7. Test result of average relative error between pixels of plain-image and cipher-image for the discretized generalized baker’s map, Arnold cat map and the discretized standard map. Here, Figs. (a), (b) and (c) correspond to baker, Arnold and standard map, respectively. In each figure, the curves show the relationship between average relative error and iteration time.

The improved scrambling degree was computed by applying equation (17) to the test image and its corresponding cipher-images. The measurement of scrambling degree for one, 9 and 20 iterations is shown in Table 3. Table 3 illustrates that there is a fluctuation in the scrambling degree of the chaotic maps under study. It is not easy to compare the results by simply observing them in the table. So, for a better comparison, we have computed the scrambling degree of each chaotic map for 192 iterations and depicted the results in Fig. 8. The highest and the lowest degree are distinguished from the figure. Fig. 8 indicates that the baker’s map scrambling degree curve has the least deviation among the curves. Also, the scrambling degree curve of the standard map is smoother than ACM’s. The higher the scrambling degree, the better the encryption security.

Table 3: Measurement based on scrambling degree

Chaotic Map                            Number of iterations
                                       1        9        20
Discretized generalized baker’s map    0.0535   0.2731   0.2770
ACM                                    0.0508   0.2845   0.2794
Discretized standard map               0.2476   0.2786   0.2773
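The zero column for Ahmed et al. in Table 2 and the scrambling degree values of Table 3 can be illustrated with a small sketch of Eqs. (12) and (17). The function names and the 4×4 test data are assumptions made for illustration; the transpose stands in for any permutation-only cipher.

```python
def histogram(img, levels=256):
    """Count occurrences of each grey level in a 2D list of integers."""
    counts = [0] * levels
    for row in img:
        for v in row:
            counts[v] += 1
    return counts

def eq_measure(plain, cipher, levels=256):
    """Eq. (12): mean absolute change in grey-level occurrence counts."""
    hp, hc = histogram(plain, levels), histogram(cipher, levels)
    return sum(abs(c - p) for p, c in zip(hp, hc)) / levels

def scrambling_degree(plain, cipher):
    """Eq. (17): improved scrambling degree over interior pixels only."""
    h, w = len(plain), len(plain[0])
    total = 0
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            # The four neighbours correspond to F1..F4 of Eq. (16).
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                dc = cipher[i + di][j + dj] - cipher[i][j]
                dp = plain[i + di][j + dj] - plain[i][j]
                total += abs(dc * dc - dp * dp)
    return total / (255 ** 2 * (h - 2) * (w - 2))

# Hypothetical 4x4 test image and a pure permutation of it (its transpose).
plain = [[10, 20, 30, 40], [50, 60, 70, 80],
         [90, 100, 110, 120], [130, 140, 150, 160]]
cipher = [list(col) for col in zip(*plain)]

eq_value = eq_measure(plain, cipher)
s_value = scrambling_degree(plain, cipher)
s_same = scrambling_degree(plain, plain)
```

A permutation leaves the grey-level histogram untouched, so `eq_value` is zero regardless of how thoroughly the pixels are shuffled, while the scrambling degree still responds to the rearrangement.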

Figure 8. Test result of improved scrambling degree for the discretized generalized baker’s map, Arnold cat map and the discretized standard map. Here, Figs. (a), (b) and (c) correspond to baker, Arnold and standard map, respectively. In each figure, the curves show the relationship between scrambling degree and iteration time.

5. Conclusion
A new classification in the field of image encryption’s quality measurement is introduced in this paper. The new approach is based on the pixel’s position changing, value changing and both value and position changing. We applied these measures on the ciphers based on discretized generalized baker’s map, ACM and discretized standard map. Experimental simulations showed the performance of the developed quality measurement techniques in terms of producing results that are consistent with the judgment by visual inspection. We showed that Ahmed et al.’s method of quality measurement is inefficient for the permutation only image ciphers. According to the test results of ADC, relative error and improved scrambling degree, encryption schemes based on discretized generalized baker’s map and discretized standard map generate more uniform cipher-images compared to ACM. Besides, resulting curves demonstrate that ACM permutation period is smaller than discretized baker and standard map. Moreover, by comparing these curves, we can find the iteration time that maximum encryption quality occurs. Finally, we suggest using a combination of the three classified quality measurement techniques when judging a certain encryption algorithm.

Acknowledgments
This research was supported by the Iran Telecommunication Research Center (ITRC) under Grant no. 18885/500.

References
[1] A. Akhshani, S. Behnia, A. Akhavan, H. Abu Hassan, and Z. Hassan, “A Novel Scheme for Image Encryption Based on 2D Piecewise Chaotic Maps,” Optics Communications 283, pp. 3259–3266, 2010.
[2] A. Jolfaei and A. Mirghadri, “An Applied Imagery Encryption Algorithm Based on Shuffling and Baker's Map,” Proceedings of the 2010 International Conference on Artificial Intelligence and Pattern Recognition (AIPR-10), Florida, USA, pp. 279–285, 2010.
[3] A. Jolfaei and A. Mirghadri, “A Novel Image Encryption
Scheme Using Pixel Shuffler and A5/1,” Proceedings of The Abdolrasoul Mirghadri received the
2010 International Conference on Artificial Intelligence and B.Sc., M.Sc. and PHD degrees in
Computational Intelligence (AICI10), Sanya, China, 2010. Mathematical Statistics, from the faculty of
[4] H.H. Ahmed, H.M. Kalash, and O.S. Farag Allah, Science, Shiraz University in 1986, 1989
“Encryption Quality Analysis of RC5 Block Cipher Algorithm and 2001, respectively. He is an assistant
for Digital Images,” Journal of Optical Engineering, vol. 45, professor at the faculty and research center
2006. of communication and information
[5] R.C. Luo, L.Y. Chung, and C.H. Lien, “A Novel Symmetric technology, IHU, Tehran, Iran since 1989.
Cryptography Based on the Hybrid Haar Wavelets Encoder His research interest includes: Cryptography, Statistics and
and Chaotic Masking Scheme,” IEEE Transactions on Stochastic Processes. He is a member of ISC, ISS and IMS.
Industrial Electronics, vol. 49, no. 4, 2002.
[6] X.Y. Yu, J. Zhang, H.E. Ren, S. Li, and X.D. Zhang, “A New
Measurement Method of Iimage Encryption,” Journal of
Physics: Conference Series, vol. 48, pp. 408–411, 2006.
[7] H.M. Elkamchouchi and M.A. Makar, “Measuring Encryption
Quality for Bitmap Images Encrypted With Rijndael and
KAMKAR Block Ciphers,” Proceedings of The Twenty
Second National Radio Science Conference (NRSC 2005),
Cairo, Egypt, pp. 111–118, 2005.
[8] F. Han, X. Yu and S. Han, “Improved Baker Map for Image
Encryption,” proceedings of the first International Symposium
on Systems and Control in Aerospace and Astronautics
(ISSCAA), pp. 1276–1279, 2006.
[9] A.J. Lichtenberg and M.A. Lieberman, Regular and Chaotic
Dynamics, New York: Springer, 1992.
[10] F. Pichler and J. Scharinger, “Ciphering by Bernoulli shifts in
finite Abelian groups,” in Contributions to General Algebra,
Proc. Linz-Conference, pp. 465–476, 1994.
[11] G. Peterson, “Arnold’s Cat Map,” Fall 1997, http:
online.redwoods.cc.ca.us/instruct/darnold/maw/catmap.htm
[12] W.H. Steeb, Y. Hardy, and R. Stoop, The Nonlinear
Workbook, 3rd edition, World Scientific Publishing Co. Pte.
Ltd, ISBN: 981-256-278-8, 2005.
[13] E. Ott, Chaos in Dynamical Systems, Cambridge University
Press, New York, 2002.
[14] J. Fridrich, “Symmetric Ciphers Based on Two-Dimensional
Chaotic Maps,” Int J Bifurcat Chaos, vol. 8, no. 6, pp. 1259–
1284, 1998.
[15] P. L'ecuyer and R. Simard, “TestU01: A C Library for
Empirical Testing of Random Number Generators,” ACM
Transactions on Mathematical Software, vol. 33, no. 4,
Article 22, 2007.

Authors Profile

Alireza Jolfaei received the Bachelor’s degree in Biomedical Engineering in the field of Bio-electric with the honor degree from Islamic Azad University, Science and Research branch, Tehran, Iran in 2007 and the Master’s degree in Telecommunication in the field of Cryptography with the honor degree from IHU, Tehran, Iran in 2010. He was a chosen student in the first meeting of honor students of Islamic Azad University, Science and Research Branch in 2005. Currently, he is a teaching assistant at the faculty and research center of communication and information technology, IHU, Tehran, Iran. His research interests include Cryptography, Information Systems Security, Network Security, Image Processing and Electrophysiology.

Improved User-Centric ID Management Model for Privacy Protection in Cloud Computing

Moonyoung Hwang1, Jin Kwak2

1 Dept. of Information Security Engineering, Soonchunhyang University,
Asan si, Chungcheongnam-do, Korea
Myhwang@sch.ac.kr
2 Dept. of Information Security Engineering, Soonchunhyang University,
Asan si, Chungcheongnam-do, Korea

Abstract: The development of the Internet has given rise to many different Internet services, including cloud computing, which has recently attracted a lot of attention. Users of cloud computing services must give personal information to get services. However, users can still experience privacy infringement because they have no direct control over the exchange of personal information between service providers.

Keywords: Security, Privacy Protection, Cloud Computing, ID management

1. Introduction

Due to the rapid growth in popularity of new computing environments, cloud computing [1] has become an important research issue. Cloud computing is Internet-based computing whereby shared resources, software, and information are provided to computers and other devices on demand, similar to the function of the electricity grid.
Almost every cloud computing system uses ID Federation for ID management. ID Federation [2] provides secure access to user data and Single Sign-On (SSO), functioning as both access control and ID creation and management. However, business partners can only exchange Federated ID information with each other by prior consultation. Because all rights are transferred to service providers, users cannot control their own information. Therefore, users need a new model that can give them control of their own information and prevent privacy infringement [3] or the piracy of a user’s data.
In this paper, we propose a user-centric ID management model that provides the security and rights to control a person’s own information in a cloud computing environment.

2. Related work

2.1 Summary of Cloud Computing

Cloud service providers build a virtual resource pool from diffuse physical infrastructure and efficiently divide up virtual resources according to a user’s workload in a cloud computing environment. Users request cloud services through a catalog, and a service provider’s system administration module supplies the necessary resources through a virtual server network. In the cloud computing system, users can authenticate and use the services they need, but do not know detailed information about these services.[4]
The structure of cloud computing is depicted in Figure 1.

Figure 1. Structure of cloud computing

To use cloud services, a user must supply credentials to a service provider each time. Therefore most users of cloud services must manage many different IDs in order to connect to many different cloud services.
The general procedure of cloud services is depicted in Figure 2.

Figure 2. Cloud service procedure

2.2 ID Management

2.2.1 SAML Protocol
SAML is an eXtensible Markup Language (XML) framework developed by the Organization for the Advancement of Structured Information Standards (OASIS). It was designed so that transaction partners on disparate systems can securely exchange certification information, privilege grants, and profile information. This system offers Single Sign-On between enterprises and does not depend on the underlying security infrastructure. SAML is derived from Security Services Markup Language (S2ML) and Authorization XML (AuthXML).
These days SAML 2.0 has been selected for many ID management systems and access control solutions. Google is using SAML 2.0 to authenticate customers in Google Apps, and NTT developed SASSO, in which users are individual ID offerers who can achieve SSO and take advantage of the certification functions of their mobile phone with a PC using SAML 2.0. SAML offers the following functions in different environments.

- Single Sign-on
SSO is a connection technology that does all certifications in a system. With this technology a user logs in once and gains access to all systems without being prompted to log in again for each independent software system.

- Identity Federation
SAML 2.0 can connect an existing ID of a user from a service provider (SP) to an identity provider (IDP). This method can connect a user's name or attribute information, or it can connect by creating a pseudonym that consists of random numbers for privacy protection.

- Single Logout
Logging out once ends the end user's certification sessions at both the IDP and the SPs that were established through the SSO function.

- Securing Web Service
SAML 2.0 assertions are used to protect web service messages, as defined in the SAML Token Profile.

2.2.2 CardSpace
Information Cards are personal digital identities that people can use online. Visually, each Information Card has a card-shaped picture and a name associated with it that enables people to organize their digital identities and to easily select the one that they want to use for any given interaction. It has an IDP’s position and actual user information. In other words, CardSpace does not play the IDP role of actually issuing a user’s ID information; it acts as an ID meta-system for the IDP.[5] First, if a user requests services from a service provider, the service provider delivers a logon page that has special tags that can run CardSpace in the user's web browser. The user’s browser confirms the user ID information required by the service provider through the tag information and displays the information included in the ID card. The user selects a suitable ID card on the screen, and user information is requested from the IDP, which is the relevant ID offerer according to the selected card information. If the IDP passes user information to CardSpace, CardSpace passes this information on to the service provider.
CardSpace offers high security because it runs in a system environment that is separate from the general user environment, and it has the advantage of reducing phishing attacks since the function that displays information about ID cards includes IDP information.

2.2.3 OpenID
OpenID is a way to log into all web sites using one ID. In other words, it embodies the concept of SSO technology. Internet users do not need to depend on one service provider to manage their own ID information and can log in to any service with an ID that is a type of web address. Since they do not need to input their name and personal address information continuously, there is no danger of losing a user’s ID information. Therefore, a user manages one account only.[6]
OpenID, a user-centric ID management technology, authenticates a user using an IDP; therefore it authenticates the user with a URL only, without additional information. OpenID has the following characteristics relative to general ID management technology. First, it is not a centralized system but a distributed processing system. Everybody involved in OpenID can become an IDP, and does not need permission or registration from any central authority. Furthermore, users can select the IDP that they wish to use, and in case of a change in IDP, a user can keep their own ID. Second, the service area is expanded, because OpenID can be used at any web site that supports OpenID. Third, OpenID achieves user certification online using existing web browsers, without requesting additional ID information.

2.3 ID management in cloud computing
The ID management systems in cloud computing are depicted in Figure 3. Cloud service providers construct a relationship of mutual trust through prior consultations and provide a service by Federated ID.[6]
Each cloud service provider takes charge of the creation of IDs and stores personal information like an independent service provider.
Users can use web service providers that have constructed a relationship of mutual trust by agreeing to a mutual exchange of information, without any special certification formality. In other words, a registered user of one web service provider can use other web services that share the relationship of mutual trust.
Therefore it is called the Circle of Trust (CoT).
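The pseudonym option of Identity Federation described in Section 2.2.1 can be illustrated with a minimal sketch. It assumes a hypothetical in-memory `IdentityProvider` class (the class, its methods, and the service names are illustrative, not part of SAML 2.0 or of any library) that hands each service provider a different random identifier for the same user, so that providers cannot correlate accounts by identifier alone.

```python
import secrets


class IdentityProvider:
    """Hypothetical IDP that gives every service provider its own pseudonym."""

    def __init__(self):
        # (user_id, sp_name) -> random pseudonym
        self._pseudonyms = {}

    def pseudonym_for(self, user_id, sp_name):
        """Return a stable random pseudonym for this user at this provider."""
        key = (user_id, sp_name)
        if key not in self._pseudonyms:
            # 16 random bytes, hex encoded; carries no personal information.
            self._pseudonyms[key] = secrets.token_hex(16)
        return self._pseudonyms[key]

    def resolve(self, sp_name, pseudonym):
        """Map a pseudonym presented by a provider back to the local user."""
        for (user_id, name), value in self._pseudonyms.items():
            if name == sp_name and value == pseudonym:
                return user_id
        return None


idp = IdentityProvider()
mail_id = idp.pseudonym_for("alice", "mail-service")
storage_id = idp.pseudonym_for("alice", "storage-service")
print(mail_id != storage_id)                   # different identifier per provider
print(idp.resolve("mail-service", mail_id))    # alice
```

Only the IDP can link the two pseudonyms to the same user, which is the privacy property the text attributes to this federation method.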

Figure 3. Federated ID in cloud computing

2.4 Problem Analysis

2.4.1 Absence of right to control own information
Service providers almost always require more personal information than is necessary when offering services to users. But a user must give all the information to a service provider even if they do not want to. Furthermore, once users enter their private information into the system, they can no longer control their own personal information, because of the characteristics of the cloud computing environment.
Therefore users do not know where their personal information is stored, which can lead to an invasion of privacy or the theft of user information without the user noticing.

2.4.2 Centralized ID information
Cloud service providers such as Google, Microsoft, Amazon and Facebook provide cloud services to general users. Therefore, the amount of personal information about a user that each service provider collects is increasing. This means that privacy infringement outside of a user’s control is possible because of the centralized storage system.

3. Proposed model

3.1 Terminology

Table 1. Terminology
Composition | Explanation
User | Someone who uses cloud computing services using UCIDP and controls the UCIDP
User-Centric ID Provider (UCIDP) | Manages authentication information and personal information between users and cloud service providers
Certificate Authority | Authorizes the UCIDP
Personal ID information | Unique information for user authentication such as social security number, PIN
Authentication ID Information | Required information from user for cloud service provider such as an id or password
Common ID information | Additional information for cloud services such as address, age, e-mail, or phone number

3.2 Composition and concept

In this subsection, we explain the composition and concept. The proposed model consists of a UCIDP that provides an ID management service, a user who controls the UCIDP, and a cloud service provider. We assume that there exists a certificate authority (CA) that is responsible for issuing certificates to vouch for the UCIDP.
The composition of the proposed model is depicted in Figure 4.

Figure 4. Concept of proposed model

3.3 Service process of the proposed model
In this subsection, we explain the simple service process of the proposed model, which is also depicted in Figure 5.

Figure 5. The service process of the proposed model

Step 1: user selects UCIDP in the UCIDP list and creates an ID
Step 2: user requests cloud computing services
Step 3: cloud service provider asks user for an ID and PW
Step 4: user transmits ID and PW to cloud service provider
Step 5: cloud service provider requests personal information for the service
Step 6: user confirms and transmits required information
Step 7: cloud service provider offers service
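The seven-step service process of Section 3.3 can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation: the classes `UCIDP` and `CloudServiceProvider`, the function `request_service`, and all attribute names are hypothetical, and credentials are compared in plain text purely to keep the sketch short.

```python
from dataclasses import dataclass, field


@dataclass
class UCIDP:
    """Hypothetical user-centric ID provider holding one user's ID information."""
    credentials: dict = field(default_factory=dict)   # user -> (id, pw)
    common_info: dict = field(default_factory=dict)   # user -> {attribute: value}

    def create_id(self, user, login_id, password, info):
        # Step 1: the user selects this UCIDP and creates an ID.
        self.credentials[user] = (login_id, password)
        self.common_info[user] = dict(info)

    def release(self, user, requested, approved):
        # Step 6: only attributes the user has approved are transmitted.
        info = self.common_info[user]
        return {k: info[k] for k in requested if k in approved and k in info}


@dataclass
class CloudServiceProvider:
    required: tuple                                   # attributes needed (Step 5)
    accounts: dict = field(default_factory=dict)      # login_id -> pw

    def authenticate(self, login_id, password):
        # Steps 3-4: the provider asks for, and checks, the ID and PW.
        return self.accounts.get(login_id) == password

    def offer_service(self, attributes):
        # Step 7: service is offered once the required information has arrived.
        return all(name in attributes for name in self.required)


def request_service(ucidp, provider, user, approved):
    """Steps 2-7 for one user; returns True if the service is offered."""
    login_id, password = ucidp.credentials[user]
    if not provider.authenticate(login_id, password):
        return False
    attributes = ucidp.release(user, provider.required, approved)
    return provider.offer_service(attributes)


ucidp = UCIDP()
ucidp.create_id("alice", "alice01", "s3cret",
                {"e-mail": "a@example.org", "age": 30, "address": "Asan"})
sp = CloudServiceProvider(required=("e-mail",), accounts={"alice01": "s3cret"})
print(request_service(ucidp, sp, "alice", approved={"e-mail"}))  # True
print(request_service(ucidp, sp, "alice", approved=set()))       # False
```

The `approved` argument models the confirmation in Step 6: the provider receives nothing beyond what the user explicitly agrees to release.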

3.4 Management function of ID information

The proposed model provides ID management of all ID information. The functions offered are as follows.

(a) Issuance of ID

Figure 6. The process of ID issuance

Step 1: user selects UCIDP in the UCIDP list and requests an ID from the UCIDP
Step 2: UCIDP requests the user’s personal ID information
Step 3: user transmits a Personal ID to UCIDP
Step 4: UCIDP issues ID to user

(b) Federated ID

Step 1: UCIDP creates intermediate information for Federated ID from user's certificate information
Step 2: UCIDP transmits the created intermediate information to each service provider
Step 3: Service provider verifies the information and stores it after verification ends
Step 4: Service provider transmits verification sequence to UCIDP
Step 5: All processes for Federated ID are ended

(c) SSO (Single Sign-On)

Step 1: User authenticates to UCIDP
Step 2: UCIDP delivers certification confirmation information to user
Step 3: UCIDP delivers IDP certification confirmation information to each service provider
Step 4: Service provider verifies certification confirmation information and publishes service provider's certification information
Step 5: User can use the SSO function with the certification information from the service provider

(d) Change of common ID information

Figure 7. The process of change common ID information

Step 1: user changes a common ID in UCIDP
Step 2: UCIDP requests that the common ID change be reflected at the cloud service provider
Step 3: Cloud service provider requests authentication ID information from UCIDP
Step 4: UCIDP transmits authentication ID information to cloud service provider
Step 5: Cloud service provider confirms the authentication ID information and, if it is correct, reflects the changed common ID information.
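As an illustration of function (d), change of common ID information, the following sketch propagates a change from the UCIDP to every registered cloud service provider and applies it only where the authentication ID information matches. The classes, method names, and the use of a plain string as authentication ID information are assumptions made for this example; the paper does not specify data formats.

```python
class ServiceProvider:
    """Hypothetical cloud service provider that stores common ID information."""

    def __init__(self, auth_records):
        self.auth_records = dict(auth_records)  # user -> authentication ID information
        self.common = {}                        # user -> common ID information

    def apply_change(self, user, auth_info, new_common):
        # Steps 3-5: confirm the authentication ID information, then reflect the change.
        if self.auth_records.get(user) != auth_info:
            return False
        self.common[user] = dict(new_common)
        return True


class UCIDP:
    """Hypothetical user-centric ID provider controlled by the user."""

    def __init__(self):
        self.auth = {}        # user -> authentication ID information
        self.common = {}      # user -> common ID information
        self.providers = []   # cloud service providers to keep in sync

    def change_common_id(self, user, **changes):
        # Step 1: the user changes common ID information at the UCIDP.
        self.common.setdefault(user, {}).update(changes)
        # Steps 2-4: ask each provider to reflect it, sending authentication ID information.
        return [sp.apply_change(user, self.auth.get(user), self.common[user])
                for sp in self.providers]


idp = UCIDP()
idp.auth["alice"] = "token-123"
trusted = ServiceProvider({"alice": "token-123"})
mismatched = ServiceProvider({"alice": "other-token"})
idp.providers = [trusted, mismatched]
print(idp.change_common_id("alice", phone="010-0000-0000"))  # [True, False]
```

The user edits the information once, at the UCIDP, and each provider decides independently whether to accept the update.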


4. Comparison

The proposed model differs from existing ID management systems in the areas of ID information management and control. In the proposed model, users can choose the UCIDP

and control personal ID information, authentication ID information, and common ID information. Furthermore, the user has the authority to offer, alter, or discard his or her own ID information.
The comparison of the proposed model with other systems is depicted in Table 2.

Table 2. Comparison with other systems
 | SAML 2.0 [8] | OpenID [6][9] | CardSpace | UCIDP
Certification Method | Agreement between IDP and SP | User chooses the IDP and SP | User chooses the IDP and SP | Existing Model integration
ID federation | o | x | o | o
ID information offer | x | x | o | o
Change of ID information | x | x | x | o
SSO | o | o | o | o

5. Conclusion

The appearance of the cloud computing environment has become an issue for all users who use services through a network environment. However, users cannot control their own personal ID information, authentication ID information, or common ID information. This problem can result in the infringement of user privacy. Therefore we have proposed a new user-centric ID management model. This model offers another ID management system and lets users control their information naturally.

6. References

[1] G.H. Nam, “trend of cloud computing technology”, ETRI, 2009
[2] Y.S. Cho, S.H. Jin, P.J. Moon, K.I. Chung, “Internet ID management System based ID Federation”, the institute of electronics engineers of korea, Vol. 43, No. 7, pp. 104-113, 2006
[3] Salmon, J., “Clouded in uncertainty – the legal pitfalls of cloud computing”, Computing magazine, September 24, 2008.
[4] Rich Maggiani, "Cloud Computing Is Changing How We Communicate", Professional Communication Conference, pp. 1-4, 2009.
[5] Alrodhan W.A., Mitchell C.J., “Addressing privacy issues in CardSpace”, Information Assurance and Security, 2007. IAS 2007. Third International Symposium on, pp. 285-291, 2007.
[6] H.K. Oh, S.H. Jin, “The Security Limitations of SSO in OpenID”, Advanced Communication Technology, 2008. ICACT 2008. 10th International Conference on, pp. 1608-1611, 2008
[7] Juniper Networks, "Identity Federation in a hybrid cloud computing environment solution guide", Juniper Networks, pp. 1-6, 2009
[8] Y.S. Cho, S.H. Jin, “Practical use and investigation of OASIS SAML (Security Assertion Markup Language) v2.0”, korea multimedia society, Vol. 10, No. 1, pp. 59-70, 2006.
[9] http://en.wikipedia.org/wiki/OpenID

Authors Profile

Moonyoung Hwang received the B.S. degree from the Department of Information Security Engineering, Soonchunhyang University, Asan, Korea in 2008. He is now an M.S. course student in the Department of Information Security Engineering, Soonchunhyang University, Korea.

Jin Kwak received the BE, ME and PhD degrees from Sungkyun-Kwan University, Seoul, Korea in 2000, 2003, and 2006 respectively. He joined Kyushu University in Japan as a visiting scholar at the Graduate School of Information Science and Electrical Engineering. After that, he joined MIC (Ministry of Information and Communication, Korea) as a Deputy Director. He is now a professor and Dean of the Department of Information Security Engineering, and also Director of the SCH BIT Business Incubation Center, Soonchunhyang University, Korea. His main research areas are Cryptology and Information security applications, including Cloud computing security, Multimedia security, Embedded System security, and IT product evaluation (CC). He is a member of the KIISC, KSII, KKITS, and KDAS.

Stock Market Forecasting using Artificial Neural Network and Statistical Technique: A Comparison Report

K S Vaisla1, Dr. Ashutosh Kumar Bhatt2 and Dr. Shishir Kumar3

1 Deptt. of Computer Science & Engineering,
VCT Kumaun Engineering College,
Dwarahat, District –Almora (Uttarakhand), INDIA,
Email - vaislaks@rediffmail.com
2 Deptt. Of Computer Science,
Birla Institute of Applied Sciences,
Bhimtal, Post- Bhimtal, Distt-Nainital (Uttarakhand), INDIA
E-mail - ashutoshbhatt123@rediffmail.com
3 Head, Department of Computer Science & Engineering,
Jaypee University of Engineering & Technology,
Raghogarh , District – Guna (MP), India,
Email - dr.shishir@yahoo.com

Abstract: Stock price prediction is one of the hot areas in neural network applications. This paper presents the ability of Neural Networks to forecast stock market prices. Predicting the stock market is very difficult since it depends on several known and unknown factors. In recent years, one of the techniques that has been widely used in this area is the artificial neural network. The power of a neural network is its ability to model a nonlinear process without a priori knowledge about the nature of the process. In this paper, Neural Networks and Statistical techniques are employed to model and forecast the stock market prices, and then the results of these two models are compared. The forecasting ability of these two models is assessed using MAPE, MSE and RMSE. The results show that Neural Networks, when trained with sufficient data and proper inputs, can predict the stock market prices very well. Statistical techniques, though well built, lose forecasting ability as the series become complex. Therefore, Neural Networks can be used as an alternative technique for forecasting the stock market prices.
Keywords: Foreign Investors Inflow (FII), Wholesale Price Index (WPI), Money Supply Broad Money (MSBM), Money Supply Narrow Money (MSNM), Exchange Rate (ER).

1. Introduction
Artificial neural network models are based on the neural structure of the brain. The brain learns from experience and so do artificial neural networks. Previous research has shown that artificial neural networks are suitable for pattern recognition and pattern classification tasks due to their nonlinear nonparametric adaptive-learning properties. As a useful analytical tool, ANN is nowadays widely applied in analyzing the business data stored in databases or data warehouses. Customer behavior pattern identification and stock price prediction are both hot areas of neural network research and application. One critical step in neural network application is network training. Generally, data in a company's database or data warehouse is selected and refined to form training data sets.
Artificial Neural Networks are widely used in various branches of engineering and science, and their ability to approximate complex and nonlinear equations makes them a useful tool in econometric analysis. A number of statistical models and Neural Network models have been developed for forecasting the stock market.

The study of financial data is of great importance to researchers and to the business world because of the volatile nature of the series. Statistical tools like Multiple Regression Techniques [1] and Time Series Analysis are very well built methodologies used for forecasting the series, but as the series become complex their forecasting ability is reduced [2]. Regression models have been traditionally used to model the changes in the stock markets. Multiple regression analysis is the process of finding the least squares prediction equation, testing the adequacy of the model, and conducting tests about estimating the values of the model parameters [3]. However, these models can predict linear patterns only. The stock market returns change in a nonlinear pattern, such that neural networks are more appropriate to model these changes.

Neural Networks have become popular in the world of forecasting because of their non-parametric approach [4], as well as their ability to learn the behavior of the series when properly trained. Many studies [5, 6] have been made to compare Neural Networks with statistical tools. Neural Networks have been successfully applied to loan evaluation, signature recognition, time series forecasting and many other difficult pattern recognition problems [2, 4, 5, 6 and 7]. If stock market return fluctuations are affected by their recent historic behavior, neural networks which can model such temporal stock market changes can prove to be better

predictors; the changes in a stock market can then be learned better using networks which employ a feedback mechanism to cause sequence learning.

2. Literature Review
In general, the approaches to predicting the stock market can be classified into two classes, fundamental analysis and technical analysis. Fundamental analysis is based on macroeconomic data and the basic financial status of companies, like money supply, interest rate, inflationary rates, dividend yields, earnings yield, cash flow yield, book to market ratio, price-earnings ratio, and lagged returns [8, 9]. Technical analysis is based on the rationale that history will repeat itself and that the correlation between price and volume reveals market behavior. Prediction is made by exploiting implications hidden in past trading activities and by analyzing patterns and trends shown in price and volume charts [10]. Using neural networks to predict financial markets has been an active research area in both methods since the late 1980's [11, 12, 13, 14, 15]. Most of these published works are targeted at US stock markets and other international financial markets. In this article our prediction is made by exploiting implications hidden in past trading activities and by analyzing patterns and trends shown in monthly stock price and Industrial Production, Wholesale Price Index, Exchange Rate, Net Investment by FIIs, Export, Import, Money Supply Narrow Money, and Money Supply Broad Money.

Training a Neural Network
To experiment with neural networks, we used NeuralWare NeuralWorks Predict [16], which provides the tools to implement and test various configurations of neural networks and learning algorithms.

3. Objective of Study
The objective of this study is to model the Stock Prices data using the Statistical Technique and the Neural Networks, and then to compare the results of these two techniques.

4. Neural Networks
An Artificial Neural Network is an artificial representation of the human brain that tries to simulate its learning process. To train a network and measure how well it performs, an objective function must be defined. A commonly used performance criterion function is the sum of squares error function.

$$E = \frac{1}{2}\sum_{p=1}^{P}\sum_{i=1}^{N}\left(t_{pi} - y_{pi}\right)^{2} \qquad (1)$$

where p indexes the patterns in the training set (P in total), y_p is the output vector (based on the hidden layer output), and t_p is the training target. The inner sum runs over the N output nodes; t_pi and y_pi are, respectively, the target and actual network output for the ith output unit on the pth pattern. The network learns the problem at hand by adjusting weights. The process of adjusting the weights to make the Neural Network learn the relationship between the inputs and the targets is known as learning or training. There are several methods of finding the weights, of which the gradient descent method is the most common.

5. Statistical Technique
Multiple Regression Analysis is a Multivariate Statistical technique used to examine the relationship between a single dependent variable and a set of independent variables. The objective of the multiple regression analysis is to use independent variables whose values are known to predict the single dependent variable.

6. Data and Methodology

6.1 Data Set Used
The data is obtained from the RBI site (www.rbi.org.in), NSE site [17], and SEBI site (www.sebi.gov.in). The NIFTY data (closing Nifty Index), Industrial Production, Wholesale Price Index, Exchange Rate, Net Investment by FIIs, Export, Import, Money Supply Narrow Money, and Money Supply Broad Money cover April 1994 to March 2007. All of the above data are taken on a monthly basis. The stock market can display varying characteristics for Industrial Production, Wholesale Price Index, Exchange Rate, Net Investment by FIIs, Export, Import, and Money Supply, so it is necessary to develop a model for predicting the monthly stock return of NIFTY. The data for the study comprises the monthly stock returns of NIFTY, monthly Industrial Production, monthly Wholesale Price Index, monthly Exchange Rate, monthly Net Investment by FIIs, monthly Export & Import, and monthly Money Supply from April 1994 to March 2007, creating a series of 156 observations which were collected from the Reserve Bank of India website (www.rbi.org.in), NSE site (www.nseindia.com), and SEBI site (www.sebi.gov.in).

To build the Neural Network forecasting models, monthly data (156 observations) is used for the measurement of forecasting accuracy. An important first step in the analysis of the data is to determine if the series is stationary, as all other calculations of invariants presume stationarity in both the linear and nonlinear cases. A time series is said to be stationary if there is no systematic change in mean (no trend), no systematic change in variance, and if strictly periodic variations have been removed. To detect nonstationarity, the study uses a stationarity test called the unit root test (Augmented Dickey-Fuller Test and Phillips-Perron Test). The null hypothesis tested here is “the series is non-stationary”. If the absolute value of the statistic is greater than the critical value, then the null hypothesis is rejected and hence the series is stationary.
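As an illustration of the sum of squares error of Equation (1) in Section 4 and of weight adjustment by gradient descent, the sketch below evaluates E for a set of patterns and performs one update. It assumes a single linear output unit and a fixed learning rate, chosen only for brevity; the study itself used NeuralWorks Predict, whose internal update rule is not reproduced here, and the function names are hypothetical.

```python
def sum_squares_error(targets, outputs):
    """E = 1/2 * sum over patterns p and output units i of (t_pi - y_pi)^2."""
    return 0.5 * sum((t - y) ** 2
                     for t_p, y_p in zip(targets, outputs)
                     for t, y in zip(t_p, y_p))


def predict(weights, x):
    """Output of one linear unit (an assumed, simplified model)."""
    return sum(w * xi for w, xi in zip(weights, x))


def gradient_step(weights, inputs, targets, rate=0.1):
    """One batch gradient-descent update of the weights of a linear unit."""
    grad = [0.0] * len(weights)
    for x, t in zip(inputs, targets):
        y = predict(weights, x)
        for j, xj in enumerate(x):
            grad[j] += -(t - y) * xj          # dE/dw_j for this pattern
    return [w - rate * g for w, g in zip(weights, grad)]


inputs = [[1.0], [2.0]]
targets = [2.0, 4.0]
w = [0.0]
before = sum_squares_error([[t] for t in targets],
                           [[predict(w, x)] for x in inputs])
w = gradient_step(w, inputs, targets)
after = sum_squares_error([[t] for t in targets],
                          [[predict(w, x)] for x in inputs])
print(before, after)  # the error falls after one update
```

Repeating the update until E stops decreasing is the training process the text refers to; a learning rate that is too large can make E grow instead of shrink.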

Figure 1. Monthly stock closing for period April 1994 to March 2007
Figure 2. Monthly Closing alternate series
Figure 3. Monthly Industrial production for period April 1994 to March 2007
Figure 4. Monthly Industrial Production alternate series
Figure 5. Monthly Wholesale Price Index for period April 1994 to March 2007
Figure 6. Wholesale Price Index alternate series
Figure 7. Monthly Exchange Rate for period April 1994 to March 2007
Figure 8. Exchange Rate alternate series
Figure 9. Monthly Export for period April 1994 to March 2007
Figure 10. Export alternate series
Figure 11. Monthly Import for period April 1994 to March 2007
Figure 12. Import alternate series
Figure 13. Monthly Money Supply Narrow Money for period April 1994 to March 2007
Figure 14. Money Supply Narrow Money alternate series
Figure 15. Monthly Money Supply Broad Money for period April 1994 to March 2007
Figure 16. Money Supply Broad Money alternate series

7. Design Methodology
Dependent Variable: Closing Nifty
It is difficult to design a Neural Network Model for a
Method: Least Squares
particular forecasting problem. Modeling issues must be
Sample (adjusted): 3 156
considered carefully because it affects the performance of an
Included observations: 154 after adjusting endpoints
ANN. One critical factor is to determine the appropriate
architecture, that is, the number of layers, number of nodes
Variable Coefficient Std. Error t-Statistic Prob.
in each layer. Other network design decisions include the
C 0.010347 0.006730 1.537532 0.1263
selection of activation functions of the hidden and output
nodes, the training algorithm, and performance measures. Exchange Rate(-1) -0.426087 0.364542 -1.168830 0.2444

The design stage involves in this study to determine the Export(-1) -0.042306 0.064830 -0.652570 0.5151
input nodes and output nodes, selecting the performance Foreign Investors Inflow(- 6.25E-06 2.06E-06 3.039389 0.0028
metrics etc. 1)
The number of input nodes corresponds to the number of Import(-1) -0.044457 0.061156 -0.726940 0.4684
variables in the input vector used to forecast future values. Industrial Production(-1) 0.269663 0.154961 1.740202 0.0839
However, there is currently no systematic way to determine this number. Too few or too many input nodes can affect either the learning or the prediction capability of the network. For this study the output is the forecasted monthly stock return. Monthly closing NIFTY is taken as the dependent variable (output), and Industrial Production, Wholesale Price Index, Exchange Rate, Net Investment by FIIs, Export, Import, Money Supply Narrow Money and Money Supply Broad Money are taken as independent variables (inputs).

Figure 17. Neural Network Model.

Here noisy data has been taken for Neural Network forecasting and the data transformation level is moderate. The variable selection for the network is comprehensive. The Adaptive Gradient learning rule is applied here, and the output layer function is taken as sigmoid for the forecasting.

Table 1: Using Neural Network

                 No. of Obs.   MAE       MSE       RMSE
Neural Network   156           0.05007   0.00419   0.06475

Modelling & Forecasting using Multiple Regression Technique:-
Closing Nifty is taken as the dependent variable and Industrial Production, Wholesale Price Index, Exchange Rate, Net Investment by FIIs, Export, Import and Money Supply are taken as independent variables. The variables (closing Nifty, Industrial Production, Wholesale Price Index, Exchange Rate, Export, Import, Money Supply) were transformed using the natural log to achieve normality and linearity.

Money Supply broad money(-1)     0.119117   0.286181    0.416230   0.6779
Money Supply narrow money(-1)   -0.017499   0.333083   -0.052538   0.9582
Wholesale Price Index(-1)       -0.833971   0.963889   -0.865215   0.3884
R-squared            0.121900    F-statistic          2.516168
Adjusted R-squared   0.073454    Prob (F-statistic)   0.013663

From the above observation, only Foreign Investors Inflow is found significant. It means that if we increase the Foreign Investors Inflow by 1%, the stock return will increase by 6.25E-06. The Prob (F-statistic) of 0.013663 indicates that the regression as a whole is significant at the 5% level but not at the 1% level. Here the Adjusted R-squared is 0.073454, which means the model explains only about 7.3% of the variation in the stock return; the remaining 92.7% is unexplained. This model is chosen for forecasting the next one year of Nifty closing values. The following table gives the forecasting results using the Regression Model.

Table 2: Using Regression

No. of Obs.   % Error (MAPE)   MAE   MSE   RMSE
Regression   156   0.017077   0.000552   0.023492

Comparison of the Models:-
The stock market monthly closing values were forecasted using Neural Networks and the Regression technique. The comparison between the two models is done on the basis of the MAPE, MSE and RMSE values obtained for the forecasted values of the two models. An accuracy measure is often defined in terms of the forecasting error, which is the difference between the actual (desired) and predicted value. The ultimate and most important measure of performance is the prediction accuracy. The following table gives the comparison:-

Table 3: Comparison of Models

                 No. of Obs.   MAE       MSE       RMSE
Regression       156           0.066     0.007     0.082
Neural Network   156           0.05007   0.00419   0.06475
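The error measures reported in Tables 1 to 3 can be computed from paired actual and forecast values. The following is a minimal sketch in Python, not the authors' implementation; the function name `forecast_errors` and the sample numbers are hypothetical and only illustrate the standard definitions of MAE, MSE and RMSE.

```python
import math


def forecast_errors(actual, forecast):
    """Return (MAE, MSE, RMSE) for two equal-length sequences of numbers."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    n = len(actual)
    errors = [a - f for a, f in zip(actual, forecast)]
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # root mean squared error
    return mae, mse, rmse


# Hypothetical monthly returns, for illustration only (not data from the paper).
actual_returns = [0.020, -0.010, 0.035, 0.005]
forecast_returns = [0.015, -0.020, 0.030, 0.010]
mae, mse, rmse = forecast_errors(actual_returns, forecast_returns)
print(f"MAE={mae:.5f} MSE={mse:.6f} RMSE={rmse:.5f}")
```

Because RMSE is the square root of MSE, the two columns of a results table can be cross-checked against each other; for example, the Neural Network row gives sqrt(0.00419), which is approximately 0.0647.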

Where the formulae for the statistics are:
$$\mathrm{MAE} = \frac{1}{n}\sum \left|\mathrm{Actual} - \mathrm{Forecast}\right|$$
$$\mathrm{MSE} = \frac{1}{n}\sum \left(\mathrm{Actual} - \mathrm{Forecast}\right)^{2}$$
$$\mathrm{RMSE} = \sqrt{\mathrm{MSE}}$$
From the above table, Neural Networks perform better than the Statistical technique in forecasting monthly closing Nifty values.
The following figures show the MAE, MSE and RMSE calculated for the forecast period using the above two forecasting techniques.

Figure 18. MAE of Neural Network and Statistical Method

Figure 19. MSE of Neural Network and Statistical Method

Figure 20. RMSE of Neural Network and Statistical Method

The experiments illustrate a varying degree of predictability of the monthly stock returns, for example based on the values of MSE, MAE, RMSE and other statistics. The above comparison of the Neural Network and Statistical models clearly shows that the Neural Network prediction is better than the Statistical technique.

8. Conclusion
In this paper, two techniques for modeling and forecasting stock market prices have been shown: Neural Network and Statistical Technique. The forecasting ability of the models is assessed on the basis of MSE, MAE and RMSE. This clearly depicts the fact that Neural Networks outperform the Statistical technique in forecasting stock market prices. The effectiveness of a neural network can be measured using the hit rate, which may be a better standard for determining the quality of a forecast than traditional measures like RMSE, SSE and MAE. The field of neural networks is very diverse and opportunities for future research exist in many aspects, including data preprocessing and representation, architecture selection, and application. The logical next step for the research is to further improve the performance of NNs for this application, perhaps through better training methods, better architecture selection, or better input.

References
[1] E. M. Azoff, “Neural network time series forecasting of financial market,” John Wiley & Sons Ltd. 1994.
[2] C.M. Bishop, “Neural Networks for Pattern Recognition,” Oxford University press. 1995.
[3] Fama, F. Eugene & French, R. Kenneth, “Dividend yields and expected stock returns,” Journal of Financial Economics, Elsevier, vol. 22(1), pp. 3-25. 1988.
[4] Hair, Anderson, Tatham, Black, “Multivariate Data Analysis,” Pearson Education press. 1998.
[5] Kalyani Dacha, “Causal Modeling of Stock Market Prices using Neural Networks and Multiple Regression: A Comparison Report,” Finance India, Vol. xxi, No. 3, pp. 923-930. 2007.
[6] Lakonishok et al., “The Journal of Finance,” Volume 49, Issue 5, pp. 1541-1578. Dec. 1994.
[7] Mendenhall and Beaver, “Introduction to Probability and Statistics,” Ninth Edition, International Thomson Publishing, 1994.
[8] National Stock Exchange (NSE), Available: www.nse-india.com.
[9] NeuralWare, NeuralWorks Predict, Available: http://www.neuralware.com.
[10] H.P. Pan, “A joint review of technical and quantitative analysis of the financial markets towards a unified science of intelligent finance,” Proc. 2003 Hawaii International Conference on Statistics and Related Fields, June 5-9, Hawaii, USA, 2003.
[11] R. Sharda and R. Patil, “Neural Networks as forecasting experts: an empirical test,” Proceedings of the 1990 International Joint Conference on Neural Networks, Vol-I, pp. 491-494, Washington DC, USA. 1990.
[12] Smirlock Michael and Starks, T. Laura, “A Further Examination of Stock Price Changes and Transactions Volume,” Journal of Financial Research 8, pp. 217-225. 1985.
[13] G.S. Swales and Y. Yoon, “Applying artificial neural networks to investment analysis,” Financial Analysts Journal, 48(5). 1997.
[14] Tang, Almeida and Fishwick, “Time series forecasting using neural networks vs. Box-Jenkins methodology,” Simulation, pp. 303-310. November 1991.
[15] P.C. Verhoef, P.N. Spring, J.C. Hoekstra, “The commercial use of segmentation and predictive modeling techniques for database marketing in the Netherlands,” Decision Support Systems, 34(4), pp. 471-481. 2003.
[16] Wong B.K., Bodnovich T.A., Selvi, Y., “A bibliography of neural networks business application research: 1988-September 1994,” Expert Systems, 12(3), pp. 253-262. 1995.
[17] Yao J.T., Tan C.L., “Time dependent directional profit model for financial time series forecasting,” In proceeding of the IJCNN, Como, Italy, Vol. 5, pp. 291-296, 2000.

Authors Profile

K. S. Vaisla received the Bachelor of Science (B. Sc.) and Master of Computer Applications (MCA) degrees from University of Rajasthan, Jaipur in 1994 and 1998, respectively. Presently working as Associate Professor (Computer Science & Engineering) in Kumaon Engineering College (A Govt. Autonomous College), Dwarahat (Almora) – Uttarakhand. Fields of research interest are ICT impact on G2C of e-Governance, Data Warehouse and Mining, Complex / Compound Object Mining, and IBIR. Has authored many research papers in international and national journals and conferences in the field of computer science, and also many books with reputed publishing houses.

Dr. Ashutosh Kumar Bhatt holds a Ph.D. in Computer Science from Kumaun University, Nainital (Uttarakhand). He received the MCA in 2003. Presently he is working as Assistant Professor in the Dept. of Computer Science at Birla Institute of Applied Sciences, Bhimtal, Nainital (Uttarakhand). His areas of interest include Artificial Neural Networks, JAVA Programming and Visual Basic. He has a number of research publications in national journals and conference proceedings. He is running a project entitled “Automated Analysis for Quality Assessment of Apples using Artificial Neural Network” under the Scheme for Young Scientists and Professional (SYSP), Govt. of India, Department of Science and Technology (DST), New Delhi, for 3 years.

Dr. Shishir Kumar is currently working as Head of the Dept. of Computer Science & Engineering, Jaypee University of Engineering & Technology, Guna, India. He completed his PhD in the area of Computer Science in 2005. He has around 12 years of teaching experience. His areas of interest are Image Processing and Network Security.

The Effect of Image Compression on Face Recognition Algorithms
Qeethara Kadhim Al-Shayea1, Muzhir Shaban Al-Ani2 and Muna Suliman Abu Teamah3
1 Al-Zaytoonah University, Department of Management Information Systems, Amman-Jordan, 11733
kit_alshayeh@yahoo.com
2 Amman Arab University, Department of Computer Science, Amman-Jordan, 11953
muzhir@gmail.com
3 Amman Arab University, Department of Computer Science, Amman-Jordan, 11953
teamah@yahoo.com

Abstract: Face recognition has become an important field owing to the revolution in technology and computer vision. This paper concentrates on the recognition rate of face recognition algorithms. The algorithms examined are: Principal Component Analysis, Two Dimensional Principal Component Analysis in Column Direction, Two Dimensional Principal Component Analysis in Row Direction and Two Dimensional Two Directional Principal Component Analysis. All these algorithms are implemented in two environments: a training environment and a recognition environment. Then a comparison between these four algorithms with respect to recognition rate is carried out. The proposed algorithm is implemented via the Discrete Wavelet Transform (DWT), which minimizes the image size. A complexity reduction is achieved by optimizing the number of operations needed. This optimization not only increases the recognition rate but also reduces the execution time. A recognition rate improvement of 4% to 5% is achieved by introducing DWT through PCA algorithms.

Keywords: Face Recognition, Recognition Rate, Image Compression, Principal Component Analysis and Wavelet Transform.

1. Introduction
Face recognition has become more popular with the fast growth in information and communication technology. It has been introduced in many applications, especially in access control and information security, and can be applied in building access, internet access, medical records, car license plate number, identification number, face recognition and surveillance systems and so on [1].
A face recognition system can be organized into two major areas: the first is face detection, in which the algorithms focus on finding the face in an image, and the second is face recognition, in which the algorithms focus on recognizing the face. Recently, face recognition has become a more important area and many algorithms have been developed to implement it; researchers have also spent more effort on enhancing the performance of these algorithms in terms of quality and recognition rate. This work aims to compare the recognition rate of the four indicated algorithms, and also to study the effects of using DWT as a compression approach with these algorithms.

2. Literature Reviews
There is much research related to face recognition algorithms, such as:
Wang [1] presented a structural two dimensional principal component analysis for image recognition, which is a subspace learning method that identifies the structural information for discrimination.
Norouzi, Ahmadabadi and Araabi [2] presented a new method for handling occlusion in face recognition. In this method the faces are partitioned into blocks and a sequential recognition structure is developed, which increases the correct classification rate.
Sevcenco and Lu [3] presented an enhanced principal component analysis algorithm for improving the rate of face recognition, in which the histogram is modified to match a Gaussian shaped tonal distribution in the face images.
Choudhury [4] proposed a 3D human face recognition method in which the face correlation task is completed by carrying out cross correlation between the signature functions of the faces and analyzing the correlation peaks, where a high correlation peak signifies true class recognition and a low or no peak signifies false class rejection.
Wang, Ding, Ding and Fang [5] developed a 2D face fitting assisted 3D face reconstruction algorithm that aims at recognizing faces of different poses when each face class has only one frontal training sample. This algorithm, the so-called Random Forest Embedded Active Shape Model, embeds random forest learning into the framework of the active shape model.
Sokolov et al. [6] constructed a face recognition system using preliminary training based on sample images of objects and non-objects, where the images are represented by separate points in a multidimensional space of features.
Bourlai, Kittler and Messer [7] investigated different optimization strategies by considering both image compression and image resolution, and demonstrated that both the system performance and the speed of access can be improved by the jointly optimized parameter setting and the level of probe compression.
Ebrahimpour et al. [8] proposed face models that were processed according to the human vision pathology. Three

different sets of anthropometric points were used for asymmetry estimations and the resulting asymmetry measure was calculated by averaging these estimations.
Grgic, Delac and Grgic [9] described a database of static images of human faces and proposed a test protocol. A simple baseline principal component analysis face recognition algorithm was tested following the proposed protocol, which proved their robustness and efficiency.
Zhan et al. [10] presented an evolutionary classifier fusion method, inspired by biological evolution, to optimize the performance of a face recognition system. Different illumination environments are modeled as multiple contexts using unsupervised learning, and then the optimized classifiers are searched.
Saleh et al. [11] proposed a shift invariant pattern recognition mechanism using a feature-sharing hypercolumn model. In this work a shared map is constructed to improve the recognition rate and to reduce the memory requirements of the hypercolumn model.
Guoxing et al. [12] noted that the local binary pattern (LPB) has been proved to be successful for face recognition. A new LPB-based multi-scale illumination preprocessing method was proposed, which performs better than the existing LPB-based methods.
Sing, Thakur, Basu, Nasipuri and Kundu [13] proposed a self adaptive radial basis function neural network based method for high speed recognition of human faces. It has been seen that the variations between the images of a person under varying pose, facial expressions, illumination, etc. were quite high.
Kumar et al. [14] presented a PCA memetic algorithm approach for feature selection in face recognition. PCA has been extended by a memetic algorithm, where the former was used for feature extraction and reduction and the latter was exploited for feature selection.
Banerjee et al. [15] studied the performance of a frequency domain correlation filter for face recognition by considering only the phase part of the face images, in which the dimensions of the phase spectra were reduced by using frequency domain principal component analysis.
Zhang and Qiao [16] presented a new gradient Gabor filter to extract multi-scale and multi-orientation features to represent and classify faces. An efficient kernel Fisher analysis method was proposed to find multiple subspaces based on both gradient Gabor magnitude and phase features, which is a local kernel mapping method to capture the structure information in faces.
Huang, Yi and Pu [17] proposed a new mean shifting incremental PCA method based on the autocorrelation matrix, which required lower computational time and storage capacity owing to the two transformation design.
Chougdali, Jedra and Zahid [18] proposed kernel relevance weighted discriminant analysis for face recognition, which has several interesting characteristics. Two novel kernel functions were introduced to improve the performance of the proposed algorithm.

3. Face Recognition Algorithms
Face recognition algorithms are implemented in many forms to achieve highly efficient recognition. They are affected by many factors such as recognition rate, execution time and recognition accuracy. In general, face recognition algorithms have two major phases: the training phase and the recognition phase.

3.1 Training Phase
The training phase is the first phase, in which the database of images of known people is used; at least one image per known person is available in the database. In this phase the features of each known face image are extracted and stored in the database. There are three main steps in the training phase, as shown in Figure 1: calculating the projection matrix from the training images, extracting the features from the images, and finally storing these features to be used in the recognition phase. The first step concerns the calculation of the projection matrix, in which the face recognition algorithm uses the training images to calculate the projection matrix. For example, in Principal Component Analysis (PCA) the projection matrix consists of eigenvectors of the covariance matrix, where the covariance matrix is derived from the training images. The second step deals with the extraction of the features from the images. In PCA each image is multiplied by the projection matrix to formulate the eigenfaces. The eigenfaces represent the principal components of the images, or in other words the features of the image. Finally, the last step in the training phase is to store the features that are extracted from the training images [1], [3].

[Figure 1 diagram: Face images as input of training phase → Calculate projection matrix (projection matrix) → Calculate feature extraction (feature vector) → Store feature vector in database]

Figure 1. Training phase of general face recognition algorithm.

3.2 Recognition Phase
The recognition phase is the second phase, in which each new unknown face image is analyzed to obtain its features, and then a comparison between its features and the stored features from the training stage is performed to identify the unknown face image. Many algorithms for face recognition have both a training stage and a recognition stage. This phase consists of two main steps, as shown in Figure 2: extracting the features of the unknown image and comparing them with the stored features. The projection matrix calculated in the training phase is used to extract the features of the unknown image. So, to extract the features from the unknown image with the PCA algorithm, the elements of the unknown image are multiplied by the projection matrix derived in the training stage. Once the features of the unknown image are extracted, a comparison is performed between the extracted features and the stored features from the training stage. There are many methods to do such a comparison, such as the Euclidean distance measure. The distance between the features of the unknown image and each set of stored features is computed, and then the minimum distance, corresponding to the closest face features, is selected as the matched face [17].
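The recognition phase described in Section 3.2 (project the unknown image with the projection matrix, then pick the stored feature vector at minimum Euclidean distance) can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the authors' code: the function names, the labels `person_a` and `person_b`, and the tiny projection matrix are hypothetical, and the matrix is simply given rather than computed from eigenvectors of a covariance matrix.

```python
import math


def project(image_vector, projection_matrix):
    """Multiply a flattened image (length d) by a d x k projection matrix."""
    k = len(projection_matrix[0])
    return [
        sum(image_vector[i] * projection_matrix[i][j] for i in range(len(image_vector)))
        for j in range(k)
    ]


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(unknown_image, projection_matrix, stored_features):
    """Return (label, distance) of the stored feature vector closest to the unknown image."""
    features = project(unknown_image, projection_matrix)
    best_label = min(
        stored_features,
        key=lambda label: euclidean(features, stored_features[label]),
    )
    return best_label, euclidean(features, stored_features[best_label])


# Toy data (hypothetical): 4-pixel "images" reduced to 2 features.
toy_projection = [[1, 0], [0, 1], [1, 0], [0, 1]]
stored = {
    "person_a": project([1, 2, 3, 4], toy_projection),
    "person_b": project([9, 8, 7, 6], toy_projection),
}
label, distance = recognize([1, 2, 3, 5], toy_projection, stored)
print(label, distance)
```

A practical system would usually also apply a distance threshold, so that an image far from every stored feature vector is reported as unknown instead of being forced onto the nearest match.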

[Figure 2 diagram: Face images as input of recognition phase → Calculate feature extraction (feature vector) → Comparing the result with the stored features (decision making) → Identifying the unknown face images]

Figure 2. Recognition stage of general face recognition algorithm.

4. Proposed System
The proposed system is based on the discrete wavelet transform (DWT) algorithm to minimize image size; the PCA algorithm is then applied for face recognition. DWT is used to decompose the original image into four sub images, each of which contains some of the information of the original image. The most important components of the original image are stored in one part, the low low (LL) sub image, which represents identical features of the original image. This will increase the recognition rate even though it uses images of smaller size. Second level DWT is implemented to achieve more reduction in image size, i.e. more reduction in the comparison operations required. In addition, this improvement in recognition rate is accompanied by some reduction in image resolution, as shown in Figure 3.

Figure 3. Decomposing an image using DWT (second level).

For each new image, the feature extraction process is applied and the extracted features are then compared with the features stored from the training stage. Two possibilities appear: either the extracted features match one of the stored images, which leads to a matching decision, or the extracted features do not match any of the stored images, which leads to a mismatching decision. In the latter case the features of the new image are added as new features to the database, as shown in Figure 4.

[Figure 4 diagram: Trained images and input of new image → Feature extraction → Is it matched? Yes: identify the matching feature vector. No: add the new image to the database]

Figure 4. Flow chart of the face recognition algorithm.

5. Results and Discussion
Different algorithms are implemented to present the recognition rate of the PCA algorithms before and after applying DWT. Applying DWT indicates that the recognition rate increases and also that the size of the feature matrix is reduced. DWT can be implemented via different types of DWT filters such as Haar, Mallat, Daubechies and other types.
Figure 5-a shows the comparison between the recognition rate of PCA, PCA using first level DWT and PCA using second level DWT. The results show that the recognition rate increases by 1.4% using first level DWT and by 4.2% using second level DWT.
Figure 5-b shows the comparison between the recognition rate of 2D-PCA-C, 2D-PCA-C using first level DWT and 2D-PCA-C using second level DWT. The results show that the recognition rate increases by 1.6% using first level DWT and by 4.8% using second level DWT.
Figure 5-c shows the comparison between the recognition rate of 2D-PCA-R, 2D-PCA-R using first level DWT and 2D-PCA-R using second level DWT. The results show that the recognition rate increases by 1.0% using first level DWT and by 4.2% using second level DWT.
Figure 5-d shows the comparison between the recognition rate of (2D)2-PCA, (2D)2-PCA using first level DWT and (2D)2-PCA using second level DWT. The results show that the recognition rate increases by 1.8% using first level DWT and by 4.6% using second level DWT.

[Figure 5 (a): Recognition rate of PCA before and after applying DWT. x-axis: number of images per person (1 to 5); y-axis: percentage of the recognition rate; curves: before applying DWT, after applying 1st level DWT, after applying 2nd level DWT]

[Figure 5 (b): Recognition rate of 2D-PCA-C before and after applying DWT. Same axes and curves as panel (a)]
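To make the first and second level DWT used above more concrete, the following minimal Python sketch computes only the LL (approximation) sub image of a Haar decomposition by averaging each 2x2 block, which is a scaled form of the Haar LL coefficients. It is an illustration under assumptions, not the authors' implementation: the function names are hypothetical, the three detail sub images (LH, HL, HH) are not computed, and the paper does not state which wavelet filter was used in the experiments.

```python
def haar_ll(image):
    """One decomposition level: average each 2x2 block (scaled Haar LL sub image)."""
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def ll_subband(image, levels):
    """Apply haar_ll repeatedly; each level halves both image dimensions."""
    for _ in range(levels):
        image = haar_ll(image)
    return image


# Toy 4x4 "image" (hypothetical pixel values).
toy_image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
first_level = ll_subband(toy_image, 1)    # 2x2 approximation
second_level = ll_subband(toy_image, 2)   # 1x1 approximation
print(first_level, second_level)
```

Each level reduces the number of pixels by a factor of four, which is why the feature matrix and the number of comparison operations shrink when the LL sub image is passed to PCA in place of the full image.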



[Figure 5 (c): Recognition rate of 2D-PCA-R before and after applying DWT. x-axis: number of images per person (1 to 5); y-axis: percentage of the recognition rate; curves: before applying DWT, after applying 1st level DWT, after applying 2nd level DWT]

[Figure 5 (d): Recognition rate of (2D)2-PCA before and after applying DWT. Same axes and curves as panel (c)]

Figure 5. Comparison of recognition rate for different algorithms.

6. Conclusion
This work examines four face recognition algorithms: Principal Component Analysis, Two Dimensional Principal Component Analysis in Column Direction, Two Dimensional Principal Component Analysis in Row Direction and Two Dimensional Two Directional Principal Component Analysis. An algorithm is proposed to reduce the execution time of the face recognition algorithms and improve the face recognition rate. The execution time of an algorithm depends mainly on the number of operations needed to recognize the indicated face image.
Introducing DWT through PCA algorithms improved the recognition rate by up to 5%. We can also conclude that applying the 2nd level DWT has no significant adverse effect.

References
[1] H. Wang, "Structural Two Dimensional Principal Component Analysis for Image Recognition", Machine Vision and Applications, Jan. 2010.
[2] E. Norouzi, M. N. Ahmadabadi and B. N. Araabi, "Attention Control with Reinforcement Learning for Face Recognition under Partial Occlusion", Machine Vision and Applications, Jan. 2010.
[3] A. Sevcenco and W. Lu, "Perfect Histogram Matching PCA for Face Recognition", Multidimensional System and Signal Processing, 14 Jan. 2010.
[4] T. Choudhury, "Three Dimensional Human Face Recognition", Optical Society of India, Vol. 38, No. 1, pp. 16-21, 2009.
[5] L. Wang, L. Ding, X. Ding and C. Fang, "2D Face Fitting-Assisted 3D Face Reconstruction for Pose-Robust Face Recognition", Soft Computing, 8 Nov. 2009.
[6] M. Sokolov et al., "Face Recognition Using Lateral Inhibition Function Features", Optical Memory and Neural Networks, Vol. 18, No. 1, pp. 1-5, 2009.
[7] T. Bourlai, J. Kittler and K. Messer, "On Design and Optimization of Face Verification Systems that are Smart-Card Based", Machine Vision and Applications, 10 Feb. 2009.
[8] Ebrahimpour et al., "Applying Computer Stereovision Algorithms to Study of Correlation between Face Asymmetry and Human Vision Pathology", Pattern Recognition and Image Analysis, Vol. 19, No. 4, pp. 679-686, 2009.
[9] M. Grgic, K. Delac and S. Grgic, "SCface – Surveillance Cameras Face Database", Multimedia Tools and Applications, 30 Oct. 2009.
[10] Y. Zhan et al., "Evolutionary Fusion of Multi-Classifier System for Efficient Face Recognition", International Journal of Control, Automation, and Systems, Vol. 7, No. 1, pp. 33-40, 2009.
[11] A. Saleh et al., "Feature Map Sharing Hypercolumn Model for Shift Invariant Face Recognition", Artificial Life Robotics, 5-9 Feb. 2009.
[12] Guoxing et al., "An LPB-Based Multi-Scale Illumination Processing Method for Face Recognition", Journal of Electronics (China), Vol. 26, No. 4, Jul. 2009.
[13] J. K. Sing, S. Thakur, D. K. Basu, M. Nasipuri and M. Kundu, "High Speed Face Recognition Using Self Adaptive Radial Basis Function Neural Networks", Neural Computing and Applications, 24 Feb. 2009.
[14] Kumar et al., "Feature Selection for Face Recognition: A Memetic Algorithm Approach", Journal of Zhejiang University Science A, Vol. 10, No. 8, pp. 1140-1152, 2009.
[15] Banerjee et al., "Illumination and Noise Tolerant Phase Recognition Based on Eigen Phase Correlation Filter Modified by Mexican Hat Wavelet", Journal of Optical Society, Vol. 38, No. 3, pp. 160-168, 2009.
[16] B. Zhang and Y. Qiao, "Face Recognition Based on Gradient Gabor Feature and Efficient Kernel Fisher Analysis", Journal of Neural Computing and Applications, 4 Nov. 2009.
[17] D. Huang, Z. Yi and X. Pu, "A New Incremental PCA Algorithm with Application to Visual Learning and Recognition", Neural Process Letter, Vol. 30, pp. 171-185, 2009.
[18] K. Chougdali, M. Jedra and N. Zahid, "Kernel Relevance Weighted Discriminant Analysis for Face Recognition", Pattern Analysis Applications, 9 Apr. 2009.

Authors Profile

Qeethara Kadhim Abdul Rahman Al-Shayea received her Ph. D. in Computer Science from the Computer Science Department, University of Technology, Iraq, in 2005. She received her M.Sc. degree in Computer Science from the Computer Science Department, University of Technology, Iraq, in 2000. She received her High Diploma degree in Information Security from the Computer Science Department, University of Technology, Iraq, in 1997. She received her B. Sc. degree in Computer Science from the Computer Science Department, University of Technology, Iraq, in 1992. From September 2001 to 2006 she was with the Computer Science Department, University of Technology, Iraq, as assistant professor. In September 2006 she joined the Department of Management Information Systems, Faculty of Economics & Administrative Sciences, Al-Zaytoonah University of Jordan, as assistant professor. She is interested in artificial intelligence, image processing, computer vision, coding theory and information security.

Muzhir Shaban Al-Ani received his Ph. D. in Computer & Communication Engineering Technology from ETSII, Valladolid University, Spain, in 1994. He was Assistant of Dean at Al-Anbar Technical Institute (1985), Head of the Electrical Department at Al-Anbar Technical Institute, Iraq (1985-1988), Head of the Computer and Software Engineering Department at Al-Mustansyria University, Iraq (1997-2001), and Dean of the Computer Science (CS) & Information System (IS) faculty at University of Technology, Iraq (2001-2003). On 15 September 2003 he joined the Electrical and Computer Engineering Department, College of Engineering, Applied Science University, Amman, Jordan, as Associate Professor. On 15 September 2005 he joined the Management Information System Department, Amman Arab University, Amman, Jordan, as Associate Professor, and on 15 September 2008 he joined the computer science department at the same university.
Muna Suliman Abu-Teamah has received M.Sc in Computer
Science, Amman Arab University. (Oct 1998 to Oct 1999) Lab
Assistance in Zarqa Private University. (Oct 1999 to May 2000)
Technical & Administrator in Zarqa Private University.(Sep 2000
to Jan 2007) Teacher in Um Al-Drda'a School. (Dec 2002 to
March 2003) Teacher in Al-Dwali Center. (Dec 2008 pld assistent
in UNRWA HQA Amman. (Jan 1998 to Jan 2010) Teacher in Al
Huda-Wl-Nour Center.

An Exploration of Ad-hoc Network in a Real World builds in a Laboratory Environment
Nitiket N Mhala1 and N K Choudhari2
1 Associate Professor, Head, Department of Electronics Engg., BDCOE, Sevagram, India
nitiket_m@rediffmail.com
2 Principal, Bhagwati Chadurvedi COE, Nagpur, India
drnitinchoudhari@gmail.com

Abstract: A mobile ad-hoc network is a collection of mobile nodes forming an ad-hoc network without the assistance of any centralized structures. These networks introduce a new art of network establishment and are well suited to environments where either the infrastructure is lost or deploying an infrastructure is not cost effective. The paper focuses briefly on the whole life cycle of ad-hoc networks. We discuss the open problems related to ad-hoc networks. The contribution of this paper is the exploration of an ad-hoc network in a small laboratory environment based on the IEEE 802.11 standardized medium access protocol under Linux at a relatively very low cost.
Keyword: Ad-hoc network, PCMCIA, MAC, ICMP, ARP, Linux

1. Introduction
One of the most vibrant and active “new” fields today is that of ad hoc networks. Significant research in this area has been ongoing for nearly 30 years, also under the names of packet radio or multihop networks. Within the past few years, though, the field has seen a rapid expansion of visibility and work due to the proliferation of inexpensive, widely available wireless devices and the network community’s interest in mobile computing. An ad hoc network is a (possibly mobile) collection of communication devices (nodes) that wish to communicate, but have no fixed infrastructure available and no pre-determined organization of available links. Individual nodes are responsible for dynamically discovering which other nodes they can directly communicate with. A key assumption is that not all nodes can directly communicate with each other, so nodes are required to relay packets on behalf of other nodes in order to deliver data across the network. A significant feature of ad hoc networks is that rapid changes in connectivity and link characteristics are introduced due to node mobility and power control practices. Ad hoc networks can be built around any wireless technology, including infrared and radio frequency. Ad hoc networks are suited for use in situations where infrastructure is either not available, not trusted, or should not be relied on in times of emergency. A few examples include: military soldiers in the field, sensors scattered throughout a city for biological detection, an infrastructureless network of notebook computers in a conference or campus setting, the forestry or lumber industry, rare animal tracking, space exploration, undersea operation and temporary offices such as campaign headquarters.

2. Related Background
The whole life-cycle of ad-hoc networks could be categorized into the first, second, and third generation ad-hoc network systems. Present ad-hoc network systems are considered the third generation. The first generation goes back to 1972. At the time, they were called PRNET (Packet Radio Networks). In conjunction with ALOHA and CSMA approaches for medium access control and a kind of distance-vector routing, PRNET were used on a trial basis to provide different networking capabilities in a combat environment. The second generation of ad-hoc networks emerged in the 1980s, when the ad-hoc network systems were further enhanced and implemented as a part of the SURAN (Survivable Adaptive Radio Networks) program [1]. This provided a packet-switched network to the mobile battlefield in an environment without infrastructure. This program proved to be beneficial in improving the radios' performance by making them smaller, cheaper, and resilient to electronic attacks. In the 1990s, the concept of commercial ad-hoc networks arrived with notebook computers and other viable communications equipment. At the same time, the idea of a collection of mobile nodes was proposed at several research conferences [2,3]. The IEEE 802.11 subcommittee had adopted the term "ad-hoc networks" and the research community had started to look into the possibility of deploying ad-hoc networks in other areas of application. Meanwhile, work was going on to advance the previously built ad-hoc networks. GloMo (Global Mobile Information Systems) and the NTDR (Near-term Digital Radio) are some of the results of these efforts. GloMo was designed to provide an office environment with Ethernet-type multimedia connectivity anywhere and anytime in handheld devices. NTDR is the only "real" non-prototypical ad-hoc network that is in use today. It uses clustering and link-state routing, and is self-organized into a two-tier ad-hoc network. Development of different channel access approaches now in the CSMA/CA and TDMA molds, and several other routing and topology control mechanisms were some of the other inventions of that time. Later on in mid-
62 (IJCNS) International Journal of Computer and Network Security,
Vol. 2, No. 8, August 2010

1990s, within the Internet Engineering Task Force (IETF), 3. Ad hoc network in a our Laboratory
the Mobile Ad-Hoc Networking working group was formed Environment
to standardize routing protocols for ad-hoc networks. The
development of routing within the working group and the Our Approach is based on IEEE802.11 standardized
larger community resulted in the invention of reactive and medium access protocol based on collision avoidance and
proactive routing protocols [4] .Soon after, the IEEE 802.11 tolerated hidden terminals usable for building mobile adhoc
subcommittee standardized a medium access protocol that network using notebooks and 802.11 PCMCIA cards. The
was based on collision avoidance and tolerated hidden Basic purpose is to constitute an adhoc network under Linux
terminals, making it usable for building mobile ad-hoc in a laboratory environment for the academic research
networks prototypes out of notebooks and 802.11 PCMCIA purpose.
cards. HYPERLAN and Bluetooth were some other ad-hoc
network standards that addressed and benefited ad-hoc
networking.
Open Problems
Adhoc networks designed for military, scalability is one of
the most important open problems. Scalability in adhoc
network can be broadly defined as whether the network is
able to provide an acceptable level of service to packets
even in the presence of large number of nodes in the
network. As in wired network, this capability is closely
related as how quickly network protocol control overhead
increases a function of increase in the number of nodes and
link changes. In proactive networks the scalability is often
accomplished by introducing routing and or location
hierarchy in the network [5],or by limiting the scope of
control updates to location close to the changes [6,7].In
reactive adhoc networks, dynamically limiting the scope of
route request and attempting local repairs to broken routes
are often used. Since adhoc networks do not assume the
availability of fixed infrastructure, it follows that individual
nodes may have to rely on portable limited power source.
The idea of energy –efficiency therefore becomes an
important problem in an adhoc network. Most existing Figure 1. Logical Implementation of Ad hoc Network on
solutions for saving energy in an adhoc network resolve each Node
around the reduction of power used by radio transceiver. At The physical layer must adapt to rapid changes in link
the MAC level and above, this is often done by relatively characteristics. The Multiple Access control (MAC) layer
sending the receiver into sleep mode or by using a needs to minimize collisions, allow fair access and semi
transmitter with variable output power and selecting routes reliably transports data over the short wireless links in the
that require many short hops, instead of few longer hops [8]. presence of rapid changes and hidden or exposed terminals.
The ability of fixed, wireless networks to satisfy quality of The network layer needs to determine and distribute
service (QoS) requirement is another open problem.Adhoc information used to calculate paths in a way that maintains
network further complicates the known QoS challenges in efficiency when links change often and bandwidth is at
wire line networks with RF channel characteristics that premium. It also needs to integrate smoothly with
often change unpredictly, along with the difficulty of traditional, non adhoc-aware internet works and perform
sharing the channel medium with many neighbours, each functions such as auto configuration in this changing
with its own set of potentially changing QoS requirement. environment. The Transport layer must be able to handle
Reflecting the multilayer nature of adhoc network, there are delay and packet loss statistics that are very different than
numerous attempts to improve the QoS problems from the wired networks. Finally; applications need to be designed to
service contracts [9] to the MAC layer. Similarly the handle frequent connection and disconnection with peer
security issue in adhoc networks [10].Since nodes uses the applications as well as widely varying delay and packet loss
shared medium in a potentially insecure environment; they characteristics
are susceptible to Deniol of Service (DoS) attacks that are
harder to track down than in wired network. Finally, a
3.1 Challenges in a Laboratory Environment
problem that overarches all these others is the lack of well
defined and widely accepted models for RF path attenuation, Testing adhoc network in a laboratory environment presents
mobility and traffic. These tightly interrelated models are a number of challenges. The most obvious challenge is
needed for quantityfying and comparing adhoc system being able to test the effects of node mobility [11] on the
performance to a common baseline. adhoc routing protocols and adhoc applications. Moreover,
configuring individual nodes, installing patches, monitoring log files, updating software and debugging beta releases of experimental software distributions on even a modest-sized ad hoc network can be very time consuming. Recreating realistic environmental conditions and signal transmission characteristics using off-the-shelf computing nodes and wireless cards in a laboratory setting is also very difficult.

3.2 Hardware and Software requirements

3.2.1 Laptops/Desktops
At least three nodes with Intel Pentium or higher
Server: Linux system (ix86) with one wired and one wireless interface
Clients: Any Linux system with one wired and one wireless interface

3.2.2 Wireless LAN Card
An ad hoc network needs wireless LAN cards (IEEE 802.11), which should be configured in ad hoc mode.
Our choice is Netgear [12]:
NETGEAR WG511T 108 Mbps Wireless PC Card
NETGEAR WG311T 108 Mbps Wireless PCI Card
(with Atheros 5212 chipset)

3.2.3 Madwifi Driver (madwifi-0.9.4)
A Linux kernel driver for Atheros-based wireless LAN devices. The driver supports the ad hoc mode of operation. The three important modules we need are:
(a) ath_pci
Supports PCI, MiniPCI and Cardbus devices.
(b) ath_hal
Contains the Atheros Hardware Access Layer (HAL).
(c) wlan
Contains the 802.11 state machines, protocol support and other device-independent support needed by any 802.11 device.

3.2.4 Operating System
Linux with kernel 2.6
Our choice is Fedora Core 7, Linux kernel version 2.6.21 [13]

4. Formation of Ad hoc network

4.1 Installation of Wireless Cards
After physical installation of the wireless cards, they are configured as below to create the wireless interface ath0 under Linux on each node, each with a different IP address.

1) vi /etc/modprobe.conf
Add the line:
alias ath0 ath_pci

2) vi /etc/sysconfig/network-scripts/ifcfg-ath0
Add:
DEVICE=ath0
BOOTPROTO=static
WIRELESS=yes
RATE=54Mb/s
MODE=Ad-hoc
ESSID=Prit
IPV6INIT=No
ONBOOT=Yes
USERCTL=No
PEERDNS=No
CHANNEL=1
IPADDR=192.168.0.96
NETMASK=255.255.0.0

3) ifconfig ath0 up
4) ifup ath0
5) vi /etc/resolv.conf
Add:
nameserver 192.168.1.1

4.2 Installation of madwifi driver [14] in order to activate wireless interface
cd madwifi-0.9.4
make
make install
/sbin/modprobe wlan
/sbin/modprobe ath_hal
/sbin/modprobe ath_pci

4.3 Creation of actual Ad hoc mode in real field
We set up the ad hoc network for four nodes physically available in the laboratory, on the wireless interface ath0 of each node.

Node A 192.168.0.91 MAC address (00:14:6c:8d:2b:a8)
Node B 192.168.0.96 MAC address (00:18:48:71:5e:17)
Node C 192.168.0.99 MAC address (00:18:4d:9c:4c:d9)
Node D 192.168.0.92 MAC address (00:18:4d:71:5d:f4)

If the ath0 wireless interface already exists, we have to destroy it by issuing the following command:

wlanconfig ath0 destroy

In order to create an interface (called ath0) in ad hoc mode, the following command is issued on each node:

wlanconfig ath0 create wlandev wifi0 wlanmode adhoc

Connectivity with each node is tested with a simple ping command. The ping statistics confirmed that Node A, Node B and Node C communicate with each other, but not with Node D.
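The ping-based connectivity check of Section 4.3 can be scripted so that it is repeatable on every node. The following is a minimal sketch, not the authors' procedure: it assumes the Linux iputils `ping` (with its `-c` count and `-W` timeout options) is on the path, and the helper names `ping_command`, `is_reachable` and `reachability_report` are hypothetical. The node addresses are the ones listed in Section 4.3.

```python
import subprocess

# Node addresses as listed in Section 4.3 of the paper.
NODES = {
    "A": "192.168.0.91",
    "B": "192.168.0.96",
    "C": "192.168.0.99",
    "D": "192.168.0.92",
}


def ping_command(ip, count=3, timeout_s=2):
    """Build an iputils ping command line (hypothetical helper).

    -c limits the number of echo requests, -W is the per-reply timeout
    in seconds.
    """
    return ["ping", "-c", str(count), "-W", str(timeout_s), ip]


def is_reachable(ip, run=subprocess.run):
    """Return True when ping exits with status 0 (a reply was received).

    `run` is injectable so the function can be exercised without a network.
    """
    result = run(ping_command(ip),
                 stdout=subprocess.DEVNULL,
                 stderr=subprocess.DEVNULL)
    return result.returncode == 0


def reachability_report(nodes, probe=is_reachable):
    """Map each node name to True/False depending on whether it answered."""
    return {name: probe(ip) for name, ip in nodes.items()}
```

Calling `reachability_report(NODES)` on one of the laboratory machines would return a dictionary of node names to booleans; for the situation reported above, the entry for node D would be expected to be False.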
Figure 2. Capturing of packets on ath0

Figure 3. Capturing of packets on ath0

The above figures illustrate the capturing [15] of packets on the interface ath0. Node C (192.168.0.99) communicates with Node B (192.168.0.96) using the ICMP protocol. Packets 31 and 49 are observed as malformed packets. Packets 40 and 41 illustrate the Address Resolution Protocol (ARP): they attempt full-duplex communication between the two MAC addresses. The same is the case for packets 55 and 57. Similarly, Node A (192.168.0.91) and Node B (192.168.0.96) are connected using the ICMP protocol.

Figure 4. Identification of AODV protocol

Figure 5. Flow graph
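The AODV identification in Figure 4 rests on the UDP port number: AODV control messages are carried over UDP port 654, and the first payload byte gives the message type (RFC 3561). The sketch below shows how a captured UDP datagram could be classified on that basis. It is an illustration under assumptions, not the analyzer's actual dissector: the function name `classify_udp` is hypothetical, and a port match alone does not prove that the payload is well-formed AODV.

```python
import struct

AODV_PORT = 654  # UDP port assigned to AODV (RFC 3561)

# Message type values defined in RFC 3561.
AODV_TYPES = {1: "RREQ", 2: "RREP", 3: "RERR", 4: "RREP-ACK"}


def classify_udp(datagram):
    """Classify a raw UDP datagram (8-byte header followed by payload).

    Returns the AODV message name when either port is 654,
    "AODV (unknown type)" when the type byte is missing or unrecognised,
    and None when neither port is 654.
    """
    if len(datagram) < 8:
        raise ValueError("a UDP header needs at least 8 bytes")
    # UDP header: source port, destination port, length, checksum.
    src_port, dst_port, _length, _checksum = struct.unpack("!HHHH", datagram[:8])
    if AODV_PORT not in (src_port, dst_port):
        return None
    payload = datagram[8:]
    if not payload:
        return "AODV (unknown type)"
    return AODV_TYPES.get(payload[0], "AODV (unknown type)")
```

Applied to the broadcast datagrams in the capture, the type byte would show whether a given packet is a route request or a route reply.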
The graph analysis indicates that packets flood from the source Node A (192.168.0.91) only as broadcasts to the destination (255.255.255.255) using the UDP protocol on port 654, signifying an AODV route reply from the destination port. Here, we are not applying any external AODV routing daemon, either in user space or in kernel space.

5. Conclusion

The dynamic nature of an ad hoc network is very interesting. The strength of a connection can change rapidly over time or even disappear completely. Nodes can appear, disappear and re-appear as time goes on, and all the while the network connection should work between the nodes that are part of it. As one can easily imagine, the situation in ad hoc networks with respect to ensuring connectivity and robustness is much more demanding than in the wired case. Researchers traditionally use simulations because they easily allow for a large number of nodes and reproducible environmental conditions. In simulation, the developer controls the whole system, which is in effect only a single component. Our submission, on the other hand, is an implementation in the real world, which needs to interoperate with a large, complex system. Some components of this system are operating systems, network interfaces and suitable wireless drivers. This paper sheds light on the challenges to be faced in a laboratory environment. We practically explored the creation of an ad hoc network on the ix86 architecture. Linux can run even on 80386 machines (our minimum requirement is a Pentium II), so we can gather old PCs that were intended to be thrown away, add a PCMCIA wireless card to each of them, and set up an ad hoc network in a laboratory at a very low cost, suitable for academic researchers.

References

[1] J. Freebersyser and B. Leiner, “A DoD Perspective on Mobile Ad Hoc Networks,” Ad Hoc Networking, Ed. C. E. Perkins, Addison-Wesley, 2001, pp. 29-51.
[2] C. E. Perkins and P. Bhagwat, “Highly Dynamic Destination-Sequenced Distance-Vector Routing (DSDV) for Mobile Computers,” Proc. ACM SIGCOMM ’94, Oct. 1994.
[3] D. B. Johnson, “Routing in Ad Hoc Networks of Mobile Hosts,” Proc. ACM Mobicom ’94, Dec. 1994.
[4] E. Royer and C. K. Toh, “A Review of Current Routing Protocols for Ad Hoc Mobile Wireless Networks,” IEEE Pers. Commun., Vol. 6, no. 4, Apr. 1999, pp. 46-55.
[5] R. Ramanathan and M. Steenstrup, “Hierarchically-organised Multi-hop Mobile Wireless Networks for Quality of Service Support,” Baltzer Mobile Networks & Applications, 1998.
[6] C. Santivanez, R. Ramanathan and I. Stavrakakis, “Making Link State Routing Scale for Ad Hoc Networks,” Proc. ACM MobiHoc 2001, Long Beach, CA.
[7] A. Iwata, C. C. Chiang, G. Pei, M. Gerla and T. W. Chen, “Scalable Routing Strategies for Ad Hoc Wireless Networks,” IEEE JSAC, Vol. 17, no. 8, Aug. 1999, pp. 1369-79.
[8] S. Singh, M. Woo and C. S. Raghavendra, “Power Aware Routing in Mobile Ad Hoc Networks,” Proc. ACM Mobicom 1998.
[9] S. B. Lee, G. S. Ahn and A. T. Campbell, “Improving UDP & TCP Performance in Mobile Ad Hoc Networks with INSIGNIA,” IEEE Communications Mag., Vol. 39, no. 6, June 2001.
[10] L. Zhou and Z. J. Haas, “Securing Ad Hoc Networks,” IEEE Network, Vol. 13, no. 6, Nov.-Dec. 1999, pp. 24-30.
[11] Nitiket N. Mhala and N. K. Choudhari, “An Envision of Low Cost Mobile Ad-hoc Network Test Bed in a Lab Environment Emulating an Actual MANET,” IJCNC, Vol. 2, No. 3, May 2010, pp. 52-63.
[12] Wireless network cards, available at http://www.netgear.co.uk/wireless_networkcard_wg511t.php and http://uk.farnell.com/netgear/wg311t/card-pci-w-wn-108mbps-netgear/.
[13] Fedora Core 7 (FC7) Linux, available at http://mirrors.fedoraproject.org/publiclist/fedora7/i386/
[14] Madwifi-0.9.4 drivers, available for download at http://linux.softpedia.com/progdownload/madwifi-download-12546.html
[15] Ethereal GUI network protocol analyzer using tcpdump’s capture format, available at http://www.ethereal.com.

Authors Profile

Mr. Nitiket N. Mhala is a PhD student and also works as an Associate Professor in the Department of Electronic Engineering, Sevagram, India. He received his ME degree from RM Institute of Research and Technology, Badnera, Amravati University and his BE degree from Govt. College of Engineering, Amravati, Amravati University. He has published a book entitled PC Architecture and Maintenance and many research papers at the international and national level. He is a member of the Institute of Electronics and Telecommunication Engineers (IETE). His areas of interest span data communication, computer networks and wireless ad hoc networks.

Dr. N. K. Choudhari is a Professor. He completed his Ph.D. degree in Electronics Engineering at J.M.I., New Delhi, and received his M.Tech in Electronics Engineering from Visvesvaraya Regional Engineering College, Nagpur. He received his BE in Power Electronics from B.D.C.O.E., Sevagram. Presently he is Principal at Smt. Bhagwati Chaturvedi COE, Nagpur, India. He is guiding a few research scholars pursuing the Ph.D. degree at RTM Nagpur University, Nagpur, India. He has worked as a member of different advisory committees and was a member of the Board of Studies, Electronics Engg., of Nagpur University, Nagpur, India.
The Electronic Intermediary and the Effect on this For Developing E-Commerce

Sahar Kosari1, Mohammadreza Banan2, Hadi Fanaee Tork3 and Ali Broumandnia4

1 Payam-E-Noor University of Tehran, Department of Management,
Tehran, Iran
s_kosari@ymail.com

2 Young Researchers Club, Islamic Azad University Qazvin Branch,
Qazvin, Iran
engineerbanan@gmail.com

3 Researches & Development Department, Kasbarg, Information Technology Company
Mashad, Iran
info@fanaee.com

4 Islamic Azad University-South Tehran
Tehran, Iran
broumandnia@gmail.com

Abstract: In export marketing, an electronic intermediary serves as a business-to-business (B2B) electronic marketplace, which allows trading parties to achieve cost-efficient international trade. Previous studies have not paid enough attention to the electronic intermediary. Instead, they have emphasized direct Internet-based exchange, which was expected to decrease transaction costs. This research suggests an electronic intermediary as a hybrid exporting channel, combining a traditional intermediary and a direct Internet-based exchange. A direct Internet-based exchange is considered the most efficient exporting channel for reducing cost, but it involves high risk. A traditional intermediary may be an effective exporting channel for reducing risk, but it brings high costs such as commissions and agent fees. This research suggests that an electronic intermediary is an alternative that balances profit and risk. This research examines determinants and effects of electronic intermediary use in export marketing. The results indicate that electronic intermediary use is influenced by some IT and marketing determinants. Electronic intermediary use also has a positive impact on export performance. In particular, highly entrepreneurial or less bureaucratic exporters may use an electronic intermediary more effectively in export marketing.

Keywords: electronic intermediary, performance, E-commerce, customer, IT, industries, B2B, payment.

1. Introduction

This research investigates the use of an electronic intermediary in international commerce. Specifically, this research aims to provide a comprehensive understanding of an electronic intermediary in light of the growth of electronic commerce in export marketing. An electronic intermediary is a typical form of electronic commerce. An electronic export intermediary is an electronic marketplace of sources in which qualified members simply post requests to buy or sell, and its sales representatives search around the globe for companies to supply or purchase the posted products. The popularity of electronic commerce allows market intermediaries to connect manufacturers and customers around the world, taking advantage of a computer network’s capacity to reduce transaction costs [1].

2. E-commerce & E-Intermediary

The emergence of an electronic intermediary is an outcome of the development of electronic commerce [2]. Considering its characteristics, an electronic intermediary is an appropriate market intermediary for small and medium exporters who lack the necessary knowledge and resources to engage in international commerce. This research aims at a better understanding of an electronic intermediary in export marketing by investigating determinants and effects of electronic intermediary use.

The Internet is often considered to be fundamentally changing the business paradigm [3]. In market transactions, the Internet has also become an important medium [4]. A typical by-product of the Internet’s development is electronic commerce, defined as “any transaction completed over a computer-mediated network that transfers ownership of, or rights to use, goods or services” [5]. Electronic commerce is a way of doing real-time business transactions via telecommunication networks when the customer and the merchant are in different geographical places. Also, electronic commerce is a commercial transaction with business partners, including buyers or sellers, over the net [6]. According to [7], electronic commerce includes the support of markets, inter-firm coordination, and information exchange via electronic means. Electronic commerce is also a broad concept that includes virtual browsing of goods on sale, selection of goods to buy, and payment methods. The Internet allows customers and manufacturers in different geographical places to conduct real-time market transactions. Furthermore, the Internet has motivated firms to participate in electronic commerce, because it can reduce costs in market transactions. Previous research on electronic commerce argued that a direct
exchange via the Internet might lower the transaction costs incurred [8], [9].

In international commerce, an exporter in one country can theoretically trade directly with customers in other countries via its online catalogs or Internet exchange at a much lower transaction cost than incumbent distributors could match [8]. If this were the case, market intermediaries who connect manufacturers and customers would have disappeared. In electronic international commerce, however, there are still various market intermediaries.

2.1 Information Technology (IT)

Information technology (IT) adoption is critical to the growth of an economy [10]. Although IT adoption has been researched by academics for more than a decade and many theories attempt to explain IT adoption in different domains, there are still several critical components related to IT adoption that have not yet been thoroughly investigated.

First, among studies that focus on technology adoption, only a small percentage is devoted to the adoption and use of electronic commerce (e-commerce) in small- and medium-sized enterprises (SMEs). The contribution of SMEs is extremely important to the economy and rapid growth of developing countries. In addition, small businesses differ from large businesses in terms of IT adoption patterns [11]. For example, SMEs often find technology difficult to implement due to resource constraints.

Second, there is a need to validate existing theories in different contexts. The majority of IT adoption research focuses on the technologically developed world, mostly because the majority of research and academic institutions are located in developed countries.

3. Effects of Electronic Intermediary use on Performance

Figure 1 illustrates the effects of electronic intermediary use on export performance. An export intermediary is a specialist firm that functions as the export department of several manufacturers in noncompetitive lines, acting as a transaction channel for exporters [2]. Despite scarce theoretical and empirical research, export intermediaries have played a major role in export marketing [12]. An electronic intermediary is an alternative export-oriented market intermediary in electronic international commerce [1].

An electronic intermediary allows exporters to enhance their access to decision-making information by exploiting the use of contemporary technology [13]. An electronic intermediary has various roles, benefits, and even costs. Roles identified by the literature include providing market information, connecting exporters and foreign customers, and serving as an electronic marketplace. Also, an electronic intermediary provides many benefits, such as accelerating the internationalization of SMEs, making market transactions efficient, and overcoming time and geographical barriers [14]. However, lack of credibility, unestablished payment systems, and language and cultural barriers are presented in the literature as costs of electronic intermediary use [18].

Figure 1. Effects of electronic intermediary use on export performance

4. Roles of Electronic Intermediaries

4.1 Connecting between Exporters and Foreign Customers

Electronic intermediaries play an important role in connecting exporters and foreign customers effectively. An exporter in one country could trade efficiently with customers in other countries via an electronic intermediary at much lower transaction costs [8]. Since they participate in transactions with different customers, different suppliers, and potentially in different industries, electronic intermediaries can analyze consumer preferences across products, suppliers, and industries [4].

4.2 Providing Market Information

An electronic intermediary assists exporters in identifying and taking full advantage of business opportunities [18]. Through their global networks, and drawing on their experience in carrying out international trade transactions, electronic intermediaries are able to gather and analyze information quickly and accurately [18].

Figure 2. Connecting between customers and suppliers

Moreover, electronic intermediaries provide updates on business trends, market conditions, and individual commodities and products [13]. Electronic intermediaries also provide advice on legal matters and local business customs to assist exporters in realizing the potential of their products [1].

4.3 Serving as an Electronic Marketplace

An electronic intermediary provides an electronic marketplace of sources. An electronic intermediary serves as
a B2B electronic marketplace in which qualified members can post requests to buy or sell [15]. An electronic intermediary effectively collects many demands from buyers and many products from sellers via the Internet [19]. The choice of channel type belongs to exporters. Exporters can trade directly with foreign customers who are introduced by the electronic intermediary. Also, exporters who have insufficient resources and knowledge regarding direct foreign exchange can trade indirectly with foreign customers via the electronic intermediary. Furthermore, exporters can introduce or identify the uniqueness of their services and products, provide detailed product specifications, and make use of a forum for advertising and marketing new or existing products in the electronic marketplace [19].

5. Benefit of Electronic Intermediaries

Finding the ideal buyer or supplier in the electronic marketplace can be extremely time consuming and costly [18]. Moreover, finding relevant sources in the “virtual jungle” is a hard challenge for inexperienced users [16]. Small and medium exporters usually do not have sufficient resources or experience regarding foreign markets [12]. Small and medium exporters also face significant uncertainty in electronic international commerce.

E-commerce arguably has the potential to add higher value to businesses and consumers in developing countries than in developed countries. Yet most enterprises based in developing countries have failed to reap the benefits offered by modern information and communications technologies. A fundamental motive for using an electronic intermediary is to reduce transaction costs, which follows from transaction cost analysis. An electronic intermediary allows exporters to meet customers’ needs, increase customers’ accessibility, and provide a variety of their products or services, which correspond to the roles of a marketing strategy [9]. An electronic intermediary may make market transactions easier and more efficient, which may decrease transaction costs. Also, using an electronic intermediary may serve as a marketing strategy for exporters to penetrate the global market. Firm resources are usually strengths that firms can use to conceive of and implement their marketing strategies [17]. Therefore, transaction cost analysis and the resource-based view may be appropriate for explaining theoretically the electronic intermediary in export marketing.

5.1 Accelerating the Internationalization of SMEs

The electronic intermediary originated and developed from electronic commerce, which is the fastest growing facet of the Internet. An electronic intermediary is thus closely associated with the Internet. The Internet’s provision of low-cost and efficient interconnectivity has had a dramatic influence on the way in which business is being conducted [14]. The Internet offers Small and Medium-Sized Enterprises (SMEs) a level playing field in relation to their larger competitors [3].

5.2 Reducing Transaction Costs

Reduced transaction costs from easier and more efficient market transactions may be the typical benefit of an electronic intermediary [8]. In export marketing, an electronic intermediary’s functions that benefit buyers include assistance in search and evaluation, need assessment and product matching, risk reduction, and product distribution/delivery [1]. Buyers execute transactions based on electronic information without inspecting products, and thus face the risk of uncertain product quality [1]. An electronic intermediary’s functions that benefit exporters include creating and disseminating product information and creating product awareness, influencing consumer purchases, providing customer information, reducing exposure to risk, and reducing costs of distribution through transaction scale economies [9].

6. Cost of Electronic Intermediaries

6.1 Lack of Credibility

Credibility is a very important factor in channel working relationships. In international commerce, credibility is especially important. In an electronic international relationship, face-to-face communication is rare, which may induce a lack of credibility. As a consequence, exporters may be exposed to the opportunistic behaviors of foreign participants when using an electronic intermediary [4]. It is difficult for exporters to monitor or safeguard against opportunistic behaviors of foreign customers; therefore, considerable cost may be incurred to prevent the problem.

6.2 Unestablished Payment System

Payment is another concern in electronic intermediary use in export marketing. In the trading world, there are several types of payment terms, such as cash in advance, letter of credit (L/C), drafts, and open accounts [18]. Among them, L/C is the most frequently used. In general, a traditional export intermediary offers full service to assist buyers and sellers with this payment issue. Opening an L/C through the Internet is already possible by connecting electronic intermediaries to banks that offer exporting services. Nevertheless, use of L/C is still limited and rare in an electronic intermediary due to cultural, practical, and technical limitations [1].

6.3 Other Costs

Exporters pay some commissions, such as transaction fees or membership fees, for exporting via an electronic intermediary. As a result, exporters may lose part of their profit margins by using an electronic intermediary [4]. Also, language and cultural barriers can further contribute to the cost of using an electronic intermediary in export marketing, because exporters usually bargain without assistance with unfamiliar foreign customers from different countries and cultures.

7. Conclusion

The advent of the Internet has generated significant interest in electronic commerce. Development of electronic commerce is expected to bring changes in the economics of marketing and distribution channels by creating a new
generation of market intermediary called an electronic intermediary. This study highlights how an electronic intermediary will allow small and medium exporters to effectively participate in the global market. An electronic intermediary is expected to bring significant changes to the economics of marketing channels and the structure of distribution in export marketing.

References

[14] J. T. Goldsby and J. A. Eckert, “Electronic Transportation Marketplaces: A Transaction Cost Perspective”, Industrial Marketing Management, 32, pp. 187-198, 2003.
[15] M. G. Martinsons, “Electronic Commerce in China: Emerging Success Stories”, Information and Management, 39(7), pp. 571-579, 2000.
[16] B. Ancel, “Using the Internet: Exploring International Markets”, International Trade Forum, pp. 1-14, 1999.
[17] A. S. Bharadwaj, “A Resource-Based Perspective on Information Technology Capability and Firm Performance: An Empirical Investigation”, MIS Quarterly, pp. 169-196, 2000.
[18] H. Lee and D. Danusutedjo, “Export Electronic Intermediaries”, American University: Washington D.C.
[19] D. Chrusciel, “The Internet Intermediary: Gateway to Internet Commerce Opportunities”, Journal of Internet Banking and Commerce, 2000.
[1] H. Theodore Clark and H. Geun Lee, “Electronic Intermediaries: Trust Building and Market Differentiation”, Proceedings of the 32nd Hawaii International Conference on System Sciences, Http://Www.computer.Org/Proceedings, 1999.
[2] H. Trabold, “Export Intermediation: An Empirical Test of Peng and Ilinitch”, Journal of International Business Studies, 33(2), pp. 327-344, 2002.
[3] V. Kanti Prasad, K. Ramamurthy, and G. M. Naidu, “The Influence of Internet-Marketing Integration on Marketing Competencies and Export Performance”, Journal of International Marketing, 9(4), pp. 82-110, 2001.
[4] J. P. Bailey and Y. Bakos, “An Exploratory Study of the Emerging Role of Electronic Intermediaries”, International Journal of Electronic Commerce, pp. 7-20, 1997.
[5] B. K. Atrostic, John Gates, and Ron Jarmin, “Measuring the Electronic Economy: Current Status and Next Steps”, U.S. Census Bureau, http://www.census.gov/eos/www/papers/G.pdf, 2000.
[6] B. Mahadevan, “Business Models for Internet-Based E-commerce: An Anatomy”, California Management Review, 42(4), pp. 55-69, 2000.
[7] J. W. Palmer, “Electronic Commerce: Enhancing Performance in Specialty Retailing”, Electronic Markets, pp. 6-8, May 1995.
[8] D. Narayandas, M. Caravella, and J. Deighton, “The
Impact of Internet Exchange on Business-To-
Business Distribution”, Journal of the Academy of
Marketing Science, pp. 500-505, 2002.
[9] M. B. Sarkar, B. Butler, and C. SteinIeld,
“Intermediaries and cybermediaries: A continuing
Role for Mediating Players in the Electronic
Marketplace”, Journal of Computer-Mediated
Communication, 1995.
[10] K. E. Kendall, J. E. Kendall, M. O. Kah, “Formulating
information and communication technology (ICT)
policy through discourse: how internet discussions
shape policies on ICTs for developing countries”,
12(1), pp. 25-43, Dev 2006.
[11] J. lee, J. Runge, “Adoption of information technology
in small business: testing drivers of adoption for
entrepreneurs”, The Journal of Computer Information
Systems, 42(1), pp. 44-57, 2001.
[12] M. W. Peng, and A. Y. Ilinitch, “Export Intermediary
Firms: A note on Export Development Research”,
Journal of International Business Studies, pp. 609-
620, 1998.
[13] D. Chrusciel, And F. Zahedi, “Seller-Based Vs. Buyer-
Based Internet Intermediaries: A Research Design”,
AMCIS 1999 Proceedings, Paper 86, August 1999.
Pairwise Key Establishment in Hierarchical WSN Cluster Network

G.N.Purohit 1, Asmita Singh Rawat 2

1 Department of Mathematics, Banasthali University
AIM & ACT, Banasthali-304022, INDIA
gn_purohitjaipur@yahoo.co.in

2 Department of Computer Science,
AIM & ACT, Banasthali University, Banasthali-304022, INDIA
singh.asmita27@yahoo.com

Abstract: Key establishment in sensor networks is a challenging problem because asymmetric key cryptosystems are unsuitable for use in resource-constrained sensor nodes and also because the nodes could be physically compromised by an adversary. We present a mechanism for key establishment using the framework of pre-distributing a random set of keys to each node. We consider a hierarchical network model of a sensor network consisting of a three-tier system having a base station, cluster heads and sensing nodes. There is a large number of sensor nodes scattered in different clusters. All sensor nodes in a particular cluster are identical; however, in different clusters there may be nodes of different strength. Each cluster has a cluster head of much greater strength. The base station contains a large pool of keys, and nodes randomly select key chains for themselves, which are recorded in the base station. In this paper we calculate the probabilities of connectivity between two nodes in the same cluster, between two nodes in different clusters, between nodes.

Keywords: Wireless Sensor Network, Key pre-distribution, Secured Connectivity.

1. Introduction
Recent advances in wireless communications and electronics have enabled the development of low cost, low power, multi-functional sensor nodes that are small in size and communicate untethered over short distances. These tiny sensor nodes, which consist of sensing, data processing and communicating components, leverage the idea of sensor networks. Thus, sensor networks give a significant improvement over traditional sensors. Large scale sensor networks are composed of a large number of low powered sensor devices. According to [1], the number of sensor nodes deployed to study a phenomenon may be on the order of hundreds or thousands.

Within a network, sensors communicate among themselves to exchange data and routing information. Because of the wireless nature of communication among sensors, these networks are vulnerable to various active and passive attacks on the communication protocols and devices. This demands secure communication among sensors. Due to inherent storage constraints, it is infeasible for sensor devices to store a shared key value for every other sensor in the network. Moreover, because of the lack of post-deployment geographic configuration information of sensors, keys cannot be selectively stored in sensor devices. Although a simple solution would be to use a common key between every pair of sensors to overcome the storage constraints.

Random key pre-distribution (RKP) schemes [2], [3], [4], [5] have been proposed to provide flexibility for the designers of sensor networks to tailor the network deployment to the available storage and the security requirements. The RKP schemes propose to randomly select a small number of keys from a fixed key pool for each sensor. Sensors then share keys with each other with a probability proportional to the number of keys stored in each sensor, and using this scheme one can achieve a known probability of connectivity within a network.

There are instances, as per the requirements of the landscape, in which sensor nodes segregate themselves into exclusive neighborhoods, and these neighborhoods are separated from each other for a number of reasons. For example, there may be signal-blocking terrain like hills, buildings, or walls between clusters. Each cluster contains a certain number of nodes and one strong node of much higher strength working as cluster head. In the present paper we consider a sensor distribution of this nature. The sensor nodes are deployed in different clusters along with a cluster head in each cluster.

This paper is organized as follows: Section 2 includes a brief description of related work. Section 3 describes the model. Connection probabilities are calculated in Section 4, and numerical evaluation and verification of results are included in Section 5.

2. Related Work
Security services, such as authentication and confidentiality, are critical to secure the communication between sensors in hostile environments. For these security services, key management is a fundamental building block. To solve the problem, Eschenauer and Gligor [5] first proposed a random key predistribution scheme, which lets each sensor node randomly pick a set of keys from a key pool P before deployment such that two sensor nodes share a common key with a certain probability after deployment. Since this original work, several variations of this scheme have been suggested to strengthen this method. Du et al. [7], Liu et al. [9] and Zhu et al. [14] extended this scheme to further strengthen the security or improve the efficiency. Du et al. [7] and Liu et al. [9] provide a random key pre-distribution scheme using deployment knowledge which reduces memory size significantly. Since the RKP schemes necessitate only a limited number of keys to be preinstalled in sensors, a sensor may not share keys with all of
its neighbour nodes. In this case a Pairwise Key Establishment (PKE) scheme is required to set up shared keys with the required fraction of neighbour nodes. Traynor et al. [13] proposed a random key distribution scheme based on the . Instead of a homogeneous composition of nodes, this kind of network now consists of a mix of nodes with different capabilities and missions. Patrik et al. [13] established pairwise keys in heterogeneous sensor networks. They demonstrated that a probabilistic unbalanced distribution of keys throughout the network that leverages the existence of a small percentage of more capable sensor nodes can not only provide an equal level of security but also reduce the consequences of node compromise.

3. The Model
In most recent studies the sensor network is considered either as a grid or as a very large random graph arrangement such that all neighbors within the transmission radius of a given node can communicate. In the case of random key pre-deployment in such networks, the communication between adjacent nodes (within communication range) is therefore limited only by key matching. However, this model is not always realistic, for many reasons. The sensor nodes are deployed randomly by air dropping or other means on a landscape that segregates nodes into different exclusive neighborhoods. There may be signal-blocking barriers in the landscape, including hills, walls, and high-rise buildings. Sometimes it is necessary to deploy the sensor nodes in different clusters, e.g. on battlefields, controlled by a common base station. We consider a similar scenario in this paper.

3.1. Model Setup
In our model we consider three different clusters C1, C2, C3 (there can be any number of clusters) of nodes controlled by a single base station. The schematic diagram of the model is given in Fig. 1. Each cluster contains nodes of identical hardware; however, nodes in different clusters may have different sensing strength and different hardware.

Figure 1. The schematic diagram of the Models

Sensor nodes are organized in a hierarchical structure. They are grouped into a number of clusters, each cluster containing a strong sensor node having large sensing, data gathering and communicating strength. This particular node is called the cluster header and plays the following roles:

• Collecting and analysing the data from the nodes in its cluster and communicating it to the base station.
• Having secured communication with every other cluster header.

Member nodes in a cluster are connected with the cluster header via a one-hop or multi-hop link, and these member nodes perform sensing and forward the data to the cluster head. After gathering or aggregating localized sensing information from its cluster member nodes, the cluster header sends packets to the base station. The nodes in a cluster adopt the following protocol for communicating among themselves. Two nodes that lie within each other's sensing range and share a common key can communicate directly. In order to communicate securely when a particular node i does not directly share an encryption key with another node j, the message is routed via multiple hops in the following manner:

1. To securely communicate with node j, node i first encrypts the message using the encryption key it shares with the node l that is closest to the destination and with which it (node i) has a direct connection, and sends the encrypted message to node l.
2. Node l then decrypts the message and checks if node j is its direct contact. If it is, then node l encrypts the message using the encryption key it shares with node j and sends the message to node j directly. However, if j is not one of l's direct contacts, then node l locates the next node, m, that is closest to node j among its direct contacts, encrypts the message with the encryption key it shares with node m, and sends the encrypted message to it.
3. Node m repeats step 2, and so on until the message reaches node j.

However, since the cluster has a limited number of nodes, we have a threshold of 3 hops, i.e. every node can have a link with the cluster header within 3 hops.

3.2. Keys Distribution in the Network
There are three (related to the number of clusters) large key pools, each of size P keys, in the base station. Each cluster header receives m_i keys and each node in a cluster C_i receives k_i keys, i = 1, 2, 3, … (m_i >> k_i). The information about distributed keys lies with the base station. We further assume that nodes in cluster C1 can communicate if they share at least one common key. Since clusters C2 and C3 can be compromised by a hacker, the nodes in C2 and C3 can communicate with each other if they share at least 2 and 4 keys, respectively. The same is true for any node in a cluster to communicate with its respective cluster header.

In the next section we calculate probabilities for having communication to their respective header directly or indirectly by the encrypted path with multi hops limited to 3. It is assumed that the headers are securely connected to each other, having multiple common keys, and also with the base station.

4. Mathematical Formulation
In this section, we calculate the probabilities for the hierarchical sensor network.
The base station, which is the key distribution centre, holds a large key pool of size 3P, a pool of size P for each cluster, with random symmetric keys. Each cluster header
C_i (i = 1, 2, 3) draws a key pool of size m_i from the key pool at the base station. Each node in cluster C_i also draws a key chain of size k_i (k_i << m_i) from the key pool meant for cluster C_i and maintained at the base station.

Now we can calculate the different probabilities for sharing a common key (at least one in cluster C_1, at least 2 in cluster C_2 and at least 4 in cluster C_3) between two nodes, between a node and its corresponding cluster head, and between two nodes in different clusters. Since the nodes in each cluster are of different hardware and have different protocols for communicating within the cluster, we consider the communication probability for each cluster separately.

4.1 Cluster C_1

• The probability that two nodes in cluster C_1 share at least one common key.

We are given a key pool of size P and each sensor node in C_1 is loaded with k_1 keys. The probability that two nodes share at least one common key is

1 - (probability that two nodes share no keys)

Following Eschenauer et al. [5], we can calculate the probability of sharing a key between two nodes. The number of possible ways of selecting k_1 keys for a node (say n_1) from the pool is

$$\frac{P!}{k_1!\,(P-k_1)!}$$

Similarly, the number of possible ways of selecting k_1 keys for another node (say n_2) from the pool, using only the P - k_1 keys not held by n_1, is

$$\frac{(P-k_1)!}{k_1!\,(P-2k_1)!}$$

The probability that no key is shared between these two rings is the ratio of the number of rings with no match to the total number of possible rings:

$$\frac{(P-k_1)!}{k_1!\,(P-2k_1)!} \div \frac{P!}{k_1!\,(P-k_1)!} = \frac{\left((P-k_1)!\right)^2}{P!\,(P-2k_1)!}$$

Therefore the probability that at least one key is shared between two nodes (at one hop distance) is

$$p_{11} = 1 - \frac{\left((P-k_1)!\right)^2}{P!\,(P-2k_1)!} \qquad (1)$$

Fig. 2 illustrates the probability of sharing at least one common key for connectivity (Eq. 1) for various values of P.

• Probability that two nodes are connected at one hop

$$p_{11} = 1 - \left(\frac{\left((P-k_1)!\right)^2}{P!\,(P-2k_1)!}\right)$$

• Probability that two nodes are connected at two hops

$$p_{12} = 1 - \left(\frac{\left((P-k_1)!\right)^2}{P!\,(P-2k_1)!}\right)^{2} \qquad (2)$$

• Probability that two nodes are connected at three hops

$$p_{13} = 1 - \left(\frac{\left((P-k_1)!\right)^2}{P!\,(P-2k_1)!}\right)^{3} \qquad (3)$$

From the above equations one can calculate the probability at different hops for the value of n. The probabilities at two hops and at three hops can be calculated from Eqs. (2) and (3).

Figure 3. Probability at different hops (x-axis: number of hops h = 1, 2, 3; y-axis: probability; curves for k = 15, 25, 35)

The network connectivity probabilities for 1-hop path key establishment are plotted in Fig. 4 for various values. It is clear from the figure that one can achieve significantly better connectivity after executing this phase even if the network is initially disconnected with high probability.

• Probability that two nodes share exactly one key in common.

The probability that at least one key is shared between two nodes (at one hop distance) is, as in Eq. (1),

$$p_{11} = 1 - \frac{\left((P-k_1)!\right)^2}{P!\,(P-2k_1)!}$$
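As a numerical check on Eq. (1), the following minimal Python sketch evaluates the one-hop key-sharing probability in two ways: by transcribing the factorial form directly, and by using the equivalent ratio of binomial coefficients C(P - k, k) / C(P, k). The function names and the sample values of P and k are illustrative assumptions, not values taken from the paper. The h-hop function only transcribes Eqs. (2) and (3) as printed above.

```python
from fractions import Fraction
from math import comb, factorial


def no_shared_key(P: int, k: int) -> Fraction:
    """Probability that two random k-key rings from a pool of P keys are disjoint.

    Uses C(P-k, k) / C(P, k), which equals ((P-k)!)^2 / (P! (P-2k)!).
    """
    if 2 * k > P:
        return Fraction(0)  # two rings of size k must overlap when 2k > P
    return Fraction(comb(P - k, k), comb(P, k))


def no_shared_key_factorial(P: int, k: int) -> Fraction:
    """Direct transcription of the factorial form; requires 2k <= P."""
    return Fraction(factorial(P - k) ** 2, factorial(P) * factorial(P - 2 * k))


def p_one_hop(P: int, k: int) -> Fraction:
    """Eq. (1): probability that two nodes share at least one key."""
    return 1 - no_shared_key(P, k)


def p_h_hops(P: int, k: int, h: int) -> Fraction:
    """Eqs. (2)-(3) as printed: 1 - (no-shared-key probability) ** h."""
    return 1 - no_shared_key(P, k) ** h


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    P, k = 1000, 15
    assert no_shared_key(P, k) == no_shared_key_factorial(P, k)
    for h in (1, 2, 3):
        print(f"h = {h}: {float(p_h_hops(P, k, h)):.4f}")
```

Exact rational arithmetic via `Fraction` avoids the overflow and rounding problems that appear when the large factorials in Eq. (1) are evaluated in floating point.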
Figure 2. Illustrates probability for connectivity

• Probability that cluster header and node share a common key.
Figure 4. Probability between the nodes and cluster header (x-axis: key pool size P, from 2000 to 100000; y-axis: probability; curves for m = 125, k = 15; m = 150, k = 20; m = 175, k = 25; m = 200, k = 30)

Let p_1h be the probability that a sensor node and the cluster header share at least one common key in their respective key rings. The probability has been calculated for m=250 and k=15 and the value of pool P and key ring size for connectivity. The number of possible key ring assignments for the node is

$$\frac{P!}{k_1!\,(P-k_1)!}$$

The number of possible key ring assignments for the cluster header is

$$\frac{(P-m_1)!}{m_1!\,(P-m_1-k_1)!}$$

The probability that a node and the cluster header are connected follows from these counts. The probability that no key is shared between a node and the cluster head is

$$\frac{(P-m_1)!}{m_1!\,(P-m_1-k_1)!} \div \frac{P!}{k_1!\,(P-k_1)!} = \frac{(P-k_1)!\,(P-m_1)!\,k_1!}{P!\,(P-m_1-k_1)!\,m_1!} \qquad [\text{where } m_i \gg k_i]$$

Hence, the probability that a node in C_1 and the cluster head share at least one common key is

$$p_{1h} = 1 - \left(\frac{(P-k_1)!\,(P-m_1)!\,k_1!}{P!\,(P-m_1-k_1)!\,m_1!}\right) \qquad (4)$$

The probability that a node and the cluster head share common keys can be calculated with the Chan et al. [3] connectivity equation.

4.2. Cluster C_2

• Probability that two nodes share exactly a key in common.

The probability that at least one key is shared between two nodes is

$$p_{22} = 1 - \frac{\left((P-k_2)!\right)^2}{P!\,(P-2k_2)!} \qquad (5)$$

• Probability of sharing a common key between cluster header and node.

$$p_{2h} = 1 - \left(\frac{(P-k_2)!\,(P-m_2)!\,k_2!}{P!\,(P-m_2-k_2)!\,m_2!}\right) \qquad (6)$$

• Probability that two nodes are connected at one hop

$$p_{21} = 1 - \left(\frac{\left((P-k_2)!\right)^2}{P!\,(P-2k_2)!}\right) \qquad (7)$$

• Probability that two nodes are connected at two hops

$$p_{22} = 1 - \left(\frac{\left((P-k_2)!\right)^2}{P!\,(P-2k_2)!}\right)^{2} \qquad (8)$$

• Probability that two nodes are connected at three hops

$$p_{23} = 1 - \left(\frac{\left((P-k_2)!\right)^2}{P!\,(P-2k_2)!}\right)^{3} \qquad (9)$$

So, we can calculate probabilities for different hops from the above equations.

• Probability that two nodes share exactly two keys in common.

With the Chan et al. [3] equation, we can calculate the probability that two nodes have i keys in common. There are $\binom{P}{i}$ ways to pick the i common keys, and (P - i) is the number of keys remaining in the key pool after the i are picked. The numbers of ways in which a key ring of size k and one of size m can be chosen from a pool P are $\binom{P}{k}$ and $\binom{P}{m}$ respectively, so the total number of ways for both nodes to pick their rings is the product of the two. Thus the equation is

$$p(i) = \frac{\binom{P}{i}\binom{P-i}{(m-i)+(k-i)}\binom{(m-i)+(k-i)}{m-i}}{\binom{P}{m}\binom{P}{k}} \qquad (10)$$

Thus the probability of sharing at least two common keys can be calculated from the following equation.

$$p_{21} = 1 - \left[\,p(0) + p(1)\,\right]$$

4.3 Cluster C_3

• Probability of sharing a common key between cluster header and node.

$$p_{3h} = 1 - \left(\frac{(P-k_3)!\,(P-m_3)!\,k_3!}{P!\,(P-m_3-k_3)!\,m_3!}\right) \qquad (11)$$

• Probability that two nodes share exactly three keys in common.

For the cluster C_3 the probability value decreases as the distance increases. In the following table we have calculated probabilities at different hop values. With the Chan et al. [3] Eq. 10 we have plotted the graph for key values (1, 2, 4). Fig. 5 illustrates probabilities for the key values.

Figure 5. Illustrates probabilities for the key values
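The counting argument behind Eq. (10), and the "at least q keys" thresholds that Section 3.2 assigns to the clusters (1 for C_1, 2 for C_2, 4 for C_3), can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names and the sample values of P, m and k are hypothetical, and the code follows the binomial form of Eq. (10) rather than any implementation by the authors.

```python
from fractions import Fraction
from math import comb


def p_exactly(P: int, m: int, k: int, i: int) -> Fraction:
    """Eq. (10): probability that rings of sizes m and k, drawn from a pool
    of P keys, share exactly i keys."""
    if i < 0 or i > min(m, k) or (m - i) + (k - i) > P - i:
        return Fraction(0)  # impossible overlap for these sizes
    rest = (m - i) + (k - i)  # keys held by exactly one of the two rings
    ways = comb(P, i) * comb(P - i, rest) * comb(rest, m - i)
    return Fraction(ways, comb(P, m) * comb(P, k))


def p_at_least(P: int, m: int, k: int, q: int) -> Fraction:
    """Probability of sharing at least q keys: 1 - [p(0) + ... + p(q-1)]."""
    return 1 - sum((p_exactly(P, m, k, i) for i in range(q)), Fraction(0))


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    P, m, k = 1000, 150, 20
    for q in (1, 2, 4):  # thresholds for clusters C_1, C_2, C_3
        print(f"at least {q} shared key(s): {float(p_at_least(P, m, k, q)):.4f}")
```

A useful sanity check on Eq. (10) is that p(0) + p(1) + … + p(min(m, k)) equals exactly 1, since every pair of rings shares some number of keys in that range.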
Fig. 5 shows the probability, from the above Eq. (10), that two nodes containing key rings of differing sizes share exactly i keys. We left out all other probabilities because the value vanishes as the distance increases. Thus in the cluster C_3 the probability value is very low as compared to the clusters C_2 and C_1.

Table 1: Probability of connectivity of a node with its cluster header, calculated for the key ring size and the number of common keys between the nodes and cluster headers. Probability is calculated at one-hop, two-hop and three-hop distances for the key values.

Node                                                 Cluster C_1   Cluster C_2   Cluster C_3
Node directly connected to header                    0.9887        0.8948        0.7998
Node connected by one intermediate node              0.9809        0.8372        0.6885
Node connected by two intermediate nodes (2-hop)     0.7928        0.6821        0.4763
Node connected by three intermediate nodes (3-hop)   0.6099        0.4499        0.3098

4.4. Probability of connectivity of a node in one cluster with a node in another cluster

Any node in a cluster can have a connection with any other node in another cluster. The probability of connectivity is obtained as described below. We introduce some notation for the purpose only.

n_0 : originating node.
n_d : destination node.
C_0 : originating cluster.
C_d : destination cluster.
p_o : probability of connectivity between n_0 and C_0.
p_d : probability of connectivity between C_d and n_d.

$$p(n_0 \text{ is connected to } n_d) = p(n_0 \text{ connected to } C_0)\cdot p(C_0 \text{ connected to } C_d)\cdot p(C_d \text{ connected to } n_d) = p_o \cdot 1 \cdot p_d$$

since C_0 and C_d have secured connectivity with probability 1.

5. Connectivity discussion
Let p denote the probability of two neighboring nodes sharing at least one key. To achieve a high connectivity, we need to increase P (pool).
Fig. 2 illustrates the probability of connectivity for the above function for various values of (P, k) under our proposed scheme, the key pre-distribution scheme. One can see that as the pool size increases the probability values increase (for pool size 10,000, probability is 0.9989 for different keys). As the size of the pool becomes larger, the number of keys required increases. The proposed scheme offers a much better resilience property while requiring a much smaller key ring size when compared with Eschenauer and Gligor’s.

Similarly, Fig. 3 shows that the probability of key sharing among nodes and cluster header increases with a very small increase in the number of preloaded keys in nodes. If preloaded keys are increased from 20 to 50, the key sharing probability increases from 0.5 to 0.8 approximately, for a key ring size of 120.
The probability between the nodes and the cluster header is calculated for various values of (P, k, m). Keys are drawn from the pool at different levels. In Fig. 3 we illustrate the probability between the nodes and the cluster header for sharing a common key.

According to the proposed scheme, there are several nodes and cluster headers. As discussed in Section 4, the sensor nodes in the clusters are classified into one-hop neighbors, 2-hop neighbors and 3-hop neighbors depending on how they share keys with the cluster headers. The probability of being a one-hop neighbor in the cluster is given in Eq. 2. To be a 2-hop neighbor, a node should share at least one key with a one-hop neighbor. The probability of two nodes being able to establish a secured link is p = 0.3329. Thus we conclude that nodes and a cluster header within range can communicate via 1-hop, 2-hop and 3-hop paths, and for other values we consider that the probability vanishes.
The probability is high when at least one common key is shared between node and cluster header. The probability decreases as the number of required keys increases. The probability is lower for sharing at least two common keys between node and cluster header, and much lower for sharing at least four common keys.

References

[1] S. A. Camtepe and B. Yener, “Key Distribution Mechanisms for Wireless Sensor Networks: a Survey”, Technical Report TR-05-07, Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, USA, March 2005.
[2] Y. Cheng and D. P. Agrawal, “Efficient pairwise key establishment and management in static wireless sensor networks”, in Second IEEE International Conference on Mobile Ad hoc and Sensor Systems, 2005.
[3] H. Chan, A. Perrig, and D. Song, “Random key predistribution schemes for sensor networks”, in IEEE Symposium on Security and Privacy, Berkeley, California, May 11-14 2003, pp. 197-213.
[4] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, “A survey on sensor networks,” IEEE
Communications Magazine, vol. 40, pp. 102–114, August 2002.
[5] L. Eschenauer and V. D. Gligor, “A key-management scheme for distributed sensor networks”, in Proceedings of the 9th ACM Conference on Computer and Communications Security, Washington, DC, USA, November 18-22 2002, pp. 41-47.
[6] D. Huang, M. Mehta, D. Medhi, and H. Lein, “Location-aware key management scheme for wireless sensor networks,” in Proceedings of ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN ’04), October 2004, pp. 29–42.
[7] M. Mehta, D. Huang, and L. Harn, “RINK-RKP: A scheme for key pre distribution and shared-key discovery in sensor networks,” in Proceedings of 24th IEEE International Performance Computing and Communications Conference, 2005.
[8] X. Du, Y. Xiao, M. Guizani, and H.-H. Chen, “An effective key management scheme for heterogeneous sensor networks”, Ad Hoc Networks, 5(1):24–34, 2007.
[9] W. R. Heinzelman, A. Chandrakasan, and H. Balakrishnan, “Energy-efficient communication protocol for wireless microsensor networks”, in IEEE Hawaii Int. Conf. on System Sciences, pages 4–7, 2000.
[10] K. Lu, Y. Qian, and J. Hu, “A framework for distributed key management schemes in heterogeneous wireless sensor networks”, in IEEE International Performance Computing and Communications Conference, pages 513–519, 2006.
[11] S. Zhu, S. Xu, S. Setia, and S. Jajodia, “Establishing pairwise keys for secure communication in ad hoc for wireless microsensor networks”, in IEEE Hawaii Int. Conf. on System Sciences, pages 4–7, 2000.
[12] L. B. Oliveira, H. C. Wong, M. Bern, R. Dahab, and A. A. F. Loureiro, “SecLEACH: A random key distribution solution for securing clustered sensor networks”, in 5th IEEE International Symposium on Network Computing and Applications, pages 145–154, 2006.
[13] K. Ren, K. Zeng, and W. Lou, “A new approach for random key pre-distribution in largescale wireless sensor networks”, Wireless Communication and Mobile Computing, 6(3):307–318, 2006.
[14] P. Traynor, R. Kumar, H. Bin Saad, G. Cao, and T. La Porta, “Establishing pair-wise keys in heterogeneous sensor networks”, in INFOCOM 2006, 25th IEEE International Conference on Computer Communications, Proceedings, pp. 1–12, 2006.
[15] W. Du, J. Deng, Y. S. Han, and P. K. Varshney, “A pairwise key predistribution scheme for wireless sensor networks,” in Proceedings of the 10th ACM Conference on Computer and Communications Security (CCS), Washington, DC, USA, October 27-31 2003, pp. 42–51.
[16] D. Liu and P. Ning, “Establishing pairwise keys in distributed sensor networks,” in Proceedings of the 10th ACM Conference on Computer and Communications Security (CCS), Washington, DC, USA, October 27-31 2003, pp. 52–61.
[17] D. Liu, P. Ning, and R. Li, “Establishing Pairwise Keys in Distributed Sensor Networks”, ACM Transactions on Information and System Security, 8(1):41–77, 2005.

Authors Profile

Prof. G. N. Purohit is a Professor in the Department of Mathematics & Statistics at Banasthali University (Rajasthan). Before joining Banasthali University, he was Professor and Head of the Department of Mathematics, University of Rajasthan, Jaipur. He had been Chief-editor of a research journal and regular reviewer of many journals. His present interest is in O.R., Discrete Mathematics and Communication networks. He has published around 40 research papers in various journals.

Asmita Singh Rawat received the BSc degree from University Of Lucknow and the M.C.A degree from U.P Technical University in 2006 and 2009, respectively. She is currently working towards a PhD degree in Computer Science at the Banasthali University of Rajasthan. Her research interests include wireless sensor network security with a focus on elliptic curve cryptography.
Performance Evaluation of MANET Routing Protocols Under Black Hole Attack

M.Umaparvathi 1 and Dr. Dharmishtan K Varughese 2

1 SNS College of Engineering, Coimbatore, India
parvathicbe@yahoo.co.in

2 Professor, Karpagam College of Engineering, Coimbatore, India

Abstract:- Mobile Ad hoc Networks (MANETs) are open to a wide range of attacks due to their unique characteristics like dynamic topology, shared medium, absence of infrastructure, multi-hop scenario and resource constraints. In such a network, each mobile node operates not only as a host but also as a router, forwarding packets for other nodes that may not be within direct wireless transmission range of each other. Thus, nodes must discover and maintain routes to other nodes. Data packets sent by a source node may reach the destination node via a number of intermediate nodes. In the absence of a security mechanism, it is easy for an intermediate node to insert, intercept or modify the messages, thus attacking the normal operation of MANET routing. One such attack is the black hole attack. Black hole is a type of routing attack where a malicious node advertises itself as having the shortest path to all nodes in the environment by sending a fake route reply. By doing this, the malicious node can attract the traffic from the source nodes, and then all the packets are dropped. This paper analyzes the performance of Ad hoc On-demand Distance Vector (AODV) and its multipath variant Ad hoc On-demand Multi-path Distance Vector (AOMDV) routing protocols under black hole attack. Their performance was evaluated through simulations using the network simulator NS-2. The performance of these two protocols was analyzed and compared based on packet delivery ratio (%), throughput (kbps), average end-to-end delay (ms), and average jitter (ms).

Keywords: MANET, Black hole attack, AODV, AOMDV

1. Introduction
Mobile ad hoc networks consist of a collection of wireless mobile nodes which dynamically exchange data among themselves without the reliance on a fixed base station or a wired backbone network. These nodes generally have a limited transmission range, so each node seeks the assistance of its neighboring nodes in forwarding packets. Hence the nodes in an ad hoc network can act as both routers and hosts, and a node may forward packets between other nodes as well as run user applications. MANETs have

In a mobile ad hoc network, all the nodes co-operate with each other to forward the packets in the network and hence, each node is effectively a router. Several routing protocols have been proposed for ad hoc networks. The protocols AODV and AOMDV are on-demand routing protocols, which discover routes as needed. Due to the inherent characteristics of dynamic topology and lack of centralized management, a MANET is vulnerable to various kinds of attacks [1]. One such attack is the black hole attack. In this attack, a malicious node sends a forged Route REPly (RREP) packet to a source node that initiates the route discovery in order to pretend to be a destination node. Use of a reply from an intermediate node rather than the destination reduces the route establishment time and also the control traffic in the network. This, however, leads to vulnerabilities such as black holes [2]. Sequence numbers used in RREP messages serve as time stamps and allow nodes to compare how fresh their information on the other node is. When a node sends any type of routing control message, RREQ, RREP etc., it increases its own sequence number. A higher sequence number is assumed to carry more accurate information, and whichever node sends the highest sequence number, its information is considered most up to date and the route is established over this node by the other nodes.

This paper analyses the effect of black hole attack on the reactive routing protocol AODV and its variant AOMDV via simulation. The paper is organized as follows: Section 2 describes the background of the protocol AODV, Section 3 describes the multipath on-demand routing protocol AOMDV, and Section 4 discusses the characteristics of black hole attack. Section 5 analyses the effects of black hole attack in the two routing protocols AODV and AOMDV through simulations, followed by conclusions in Section 6.

2. AODV Routing Protocol
Ad-hoc On-Demand Distance Vector (AODV) [3] is a
potential use in a wide variety of disparate situations. Such reactive routing protocol in which the network generates
situations include moving battle field communications to routes at the start of communication. AODV uses traditional
disposable sensors which are dropped from high altitudes routing tables. This means that for each destination exist
and dispersed on the ground for hazardous materials one entry in routing table and uses sequence number, that
detection. Civilian applications include simple scenarios this number ensure the freshness of routes and guarantee the
such as people at a conference in a hotel where their laptops loop-free routing. It uses control messages such as Route
comprise a temporary MANET to more complicated Request (RREQ), and Route Reply (RREP) for establishing
scenarios such as highly mobile vehicles on the highway a path from the source to the destination. When the source
which form an ad hoc network in order to provide vehicular node wants to make a connection with the destination node,
traffic management. it broadcasts an RREQ message. This RREQ message is
propagated from the source and received by the neighbors of the source node. These nodes then broadcast the RREQ message to their neighbors.

This process goes on until the packet is received by the destination node or by an intermediate node that has a fresh enough route to the destination. Fresh enough means that the intermediate node has a valid route to the destination established earlier than a time period set as a threshold. Use of a reply from an intermediate node rather than the destination reduces the route establishment time and also the control traffic in the network. This, however, leads to vulnerabilities such as black holes [2]. Sequence numbers used in RREP messages serve as time stamps and allow nodes to compare how fresh their information on the other node is. When a node sends any type of routing control message (RREQ, RREP, etc.), it increases its own sequence number. A higher sequence number is assumed to indicate more accurate information: whichever node sends the highest sequence number, its information is considered most up to date and the route is established over this node by the other nodes.

3. Overview of AOMDV

The main idea in AOMDV [5] is to compute multiple paths during route discovery. It is designed primarily for highly dynamic ad hoc networks where link failures and route breaks occur frequently. When a single-path on-demand routing protocol such as AODV is used in such networks, a new route discovery is needed in response to every route break. Each route discovery is associated with high overhead and latency. This inefficiency can be avoided by having multiple redundant paths available. A new route discovery is then needed only when all paths to the destination break. To keep track of multiple routes, the routing entry for each destination contains a list of the next hops along with the corresponding hop counts. All the next hops have the same sequence number. For each destination, a node maintains the advertised hop count, which is defined as the maximum hop count for all the paths. This is the hop count used for sending route advertisements of the destination. Each duplicate route advertisement received by a node defines an alternate path to the destination. To ensure loop freedom, a node only accepts an alternate path to the destination if it has a lower hop count than the advertised hop count for that destination.

AOMDV can be used to find node-disjoint or link-disjoint routes. To find node-disjoint routes, each node does not immediately reject duplicate RREQs. Each RREQ arriving via a different neighbor of the source defines a node-disjoint path. This is because nodes cannot broadcast duplicate RREQs, so any two RREQs arriving at an intermediate node via different neighbors of the source could not have traversed the same node. In an attempt to get multiple link-disjoint routes, the destination replies to duplicate RREQs regardless of their first hop. To ensure link-disjointness in the first hop of the RREP, the destination only replies to RREQs arriving via unique neighbors. After the first hop, the RREPs follow the reverse paths, which are node-disjoint and thus link-disjoint. The trajectories of the RREPs may intersect at an intermediate node, but each takes a different reverse path to the source to ensure link-disjointness.

The performance study of AOMDV relative to AODV under a wide range of mobility and traffic scenarios reveals that AOMDV offers a significant reduction in delay, often more than a factor of two. It also provides a reduction in the routing load and the end-to-end delay.

4. Black Hole Attack

In a black hole attack, a malicious node injects false route replies to the route requests it receives, advertising itself as having the shortest path to a destination [6]. These fake replies can be fabricated to divert network traffic through the malicious node for eavesdropping, or simply to attract all traffic to it in order to perform a denial of service attack by dropping the received packets.

In AODV, the sequence number is used to determine the freshness of the routing information contained in a message from the originating node. When generating an RREP message, a destination node compares its current sequence number with the sequence number in the RREQ packet plus one, and then selects the larger one as the RREP's sequence number. Upon receiving a number of RREPs, the source node selects the one with the greatest sequence number in order to construct a route. But in the presence of a black hole [8], when a source node broadcasts the RREQ message for any destination, the black hole node immediately responds with an RREP message that includes the highest sequence number, and this message is perceived as if it is coming from the destination or from a node which has a fresh enough route to the destination. The source assumes that the destination is behind the black hole and discards the other RREP packets coming from the other nodes. The source then starts to send its packets to the black hole, trusting that these packets will reach the destination. Thus the black hole attracts all the packets from the source and, instead of forwarding them to the destination, simply discards them [9], so the packets never reach the destination.

5. Simulation Methodology

The performance of the AOMDV and AODV routing protocols in the presence of black holes was evaluated using the NS-2 simulator. The simulations were carried out under a wide range of mobility and traffic scenarios. The goal is to study how AOMDV performs relative to AODV, particularly in terms of end-to-end delay, jitter, throughput and packet delivery ratio.

5.1. Network Simulator

All simulations were carried out using the NS-2.34 network simulator, a discrete event driven simulator developed at UC Berkeley [4] as a part of the VINT project. The goal of NS-2 is to support research and education in networking. It is suitable for designing new protocols, comparing different protocols and evaluating traffic. NS-2 is developed as a collaborative environment and is distributed as open source software. The propagation model used in this simulation study is the two-ray ground reflection model. The simulation also includes an accurate model of the IEEE 802.11 Distributed Coordination Function (DCF) wireless MAC protocol.
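The route selection rule described in Section 4, and the way a forged reply exploits it, can be illustrated with a minimal Python sketch. This is not the NS-2 implementation used in the study: the `RouteReply` type, the `select_route` helper, the node names and the sequence numbers are all hypothetical, and breaking ties by lower hop count is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteReply:
    sender: str      # node that generated the RREP (hypothetical names)
    dest_seq: int    # destination sequence number carried in the RREP
    hop_count: int   # advertised number of hops to the destination


def select_route(replies):
    """Pick the RREP with the greatest sequence number.

    Ties are broken by the smaller hop count (a simplifying assumption).
    """
    return max(replies, key=lambda r: (r.dest_seq, -r.hop_count))


# Honest replies: one from the destination D, one from an intermediate node.
honest = [RouteReply("D", 42, 4), RouteReply("N7", 40, 3)]

# Forged reply from a black hole: an arbitrarily high sequence number.
forged = RouteReply("BH", 1_000_000, 1)

chosen_without_attack = select_route(honest)
chosen_with_attack = select_route(honest + [forged])
```

Without the forged reply the source selects the route through `D`; once the black hole answers with an inflated sequence number, its reply wins regardless of what the honest nodes advertise.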
Here the black hole attack takes place after the attacking node receives an RREQ for the destination node that it is going to impersonate. To succeed in the black hole attack, the attacker must generate its RREP with a sequence number greater than the sequence number of the destination [6]. Upon receiving the RREQ, the attacker sets the sequence number of the RREP to a very high number, so that the attacker node can always attract all the data packets from the source and then drop them [7].

For the performance analysis of the network, a regular well-behaved AODV network [AODV] was used as a reference. Then black holes were introduced into the network. Simulations were carried out for the MANET with one or more black holes. Then, using the same set of scenarios, the simulation was carried out with the variant protocol AOMDV. The simulation parameters are tabulated in Table 1.

Table 1: Simulation Parameters

Parameter                   Value
Simulator                   NS-2 (ver 2.34)
Simulation time             500 sec
Number of mobile nodes      50
Topology                    1000 m x 1000 m
Transmission range          250 m
Routing protocol            AODV & AOMDV
Maximum bandwidth           1 Mbps
Traffic                     Constant Bit Rate
Maximum speed               5 m/s
Source destination pairs    22

A sample screen shot of a scenario of 50 mobile nodes with five black holes is shown in Fig. 1.

Figure 1. Sample simulation scenario with 5 black holes

The following figures show the performance comparison of the two routing protocols AODV and AOMDV based on the routing parameters packet delivery ratio, average throughput, average delay and average jitter.

Figure 2. Comparison of Packet Delivery ratio (PDR in % versus number of black holes, 0 to 5, for AODV and AOMDV)

Figure 3. Comparison of Throughput (throughput versus number of black holes, 0 to 5, for AODV and AOMDV)

Figure 4. Comparison of End-to-end Delay (end-to-end delay versus number of black holes, 0 to 5, for AODV and AOMDV)

Figure 5. Comparison of Average Jitter (jitter versus number of black holes, 0 to 5, for AODV and AOMDV)
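The four metrics compared in the figures can be computed from per-packet send and receive times. The paper does not spell out its formulas, so the sketch below uses common definitions as assumptions: PDR is delivered over sent packets, throughput is delivered bits over the simulation time, delay is the mean receive-minus-send time, and jitter is the mean absolute difference between the delays of consecutive delivered packets. The function name and input layout are hypothetical and do not correspond to an NS-2 trace format.

```python
from statistics import mean


def summarize(sent_times, recv_times, packet_bytes, duration_s):
    """Return (PDR %, throughput kbps, avg delay ms, avg jitter ms).

    sent_times and recv_times map a packet id to a timestamp in seconds;
    packets missing from recv_times are treated as dropped.
    """
    if not sent_times:
        raise ValueError("no packets were sent")
    delivered = sorted(pid for pid in recv_times if pid in sent_times)
    delays = [recv_times[pid] - sent_times[pid] for pid in delivered]
    pdr = 100.0 * len(delivered) / len(sent_times)
    throughput_kbps = len(delivered) * packet_bytes * 8 / duration_s / 1000
    avg_delay_ms = 1000 * mean(delays) if delays else 0.0
    # Jitter: variation between the delays of consecutive delivered packets.
    diffs = [abs(b - a) for a, b in zip(delays, delays[1:])]
    avg_jitter_ms = 1000 * mean(diffs) if diffs else 0.0
    return pdr, throughput_kbps, avg_delay_ms, avg_jitter_ms


# Toy example: packet 3 is dropped, as a black hole would cause.
sent = {1: 0.0, 2: 1.0, 3: 2.0, 4: 3.0}
recv = {1: 0.1, 2: 1.3, 4: 3.2}
pdr, throughput, delay, jitter = summarize(sent, recv, packet_bytes=500, duration_s=4)
```

In this toy run three of four packets arrive, so every additional dropped packet lowers both the delivery ratio and the throughput, which is the effect the black hole nodes have in the simulations.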


The performance study of AOMDV relative to AODV under a wide range of mobility and traffic scenarios reveals that AOMDV offers better throughput, better packet delivery ratio, reduced jitter and a significant reduction in delay, even in the presence of black hole nodes.

6. Conclusion

This paper analyses the effect of black holes in AODV and AOMDV networks. For this purpose, a MANET running the AODV and AOMDV routing protocols with black holes was implemented in NS-2. Using fifteen different scenarios, each with 50 nodes and with five different speeds, the parameters packet delivery ratio, throughput, end-to-end delay and jitter were measured. The simulations show that, in the presence of black holes, packet loss is higher in AODV than in AOMDV. AOMDV also produced more throughput and less end-to-end delay and jitter than AODV. In general, AOMDV offers routing performance that is more resilient to black hole attack than that of AODV in a variety of mobility and traffic conditions. Thus, it is better to base defense mechanisms against the black hole attack in MANETs on AOMDV than on AODV.

References

[1] Y.C. Hu and A. Perrig, “A survey of secure wireless ad hoc routing,” IEEE Security & Privacy Magazine, vol. 2, no. 3, pp. 28-39, May/June 2004.
[2] Y.A. Huang and W. Lee, “Attack analysis and detection for ad hoc routing protocols,” in Proceedings of the 7th International Symposium on Recent Advances in Intrusion Detection (RAID’04), pp. 125-145, French Riviera, Sept. 2004.
[3] C.E. Perkins, E. Belding-Royer and S.R. Das, “Ad hoc on-demand distance vector (AODV) routing,” RFC 3561, July 2003. http://www.ietf.org/rfc/rfc3561.txt
[4] The Network Simulator, NS-2. Available from www.isi.edu/nsnam/ns.
[5] Mahesh K. Marina and Samir R. Das, “On-Demand Multipath Distance Vector Routing in Ad Hoc Networks,” in Proceedings of the 9th IEEE International Conference on Network Protocols, 11-14 November 2001, pp. 14-23.
[6] Shideh Saraeian, Fazllolah Adibniya, Mohammed GhasemZadeh and SeyedAzim Abtahi, “Performance Evaluation of AODV Protocol under DDoS Attacks in MANET,” in Proceedings of World Academy of Science, Engineering and Technology, Volume 33, September 2008, ISSN 2070-3740.
[7] Dokurer, S.; Ert, Y.M.; Acar, C.E., “Performance analysis of ad hoc networks under black hole attacks,” in Proceedings of IEEE SoutheastCon 2007, 22-25 March 2007, pp. 148-153, DOI 10.1109/SECON.2007.342872.
[8] Mohammad Al-Shurman, Seong-Moo Yoo and Seungjin Park, “Black hole Attack in Mobile Ad Hoc Networks,” Proceedings of the 42nd Annual Southeast Regional Conference (ACM-SE 42), April 2004, pp. 96-97.
[9] Deng, H., Li, W., Agrawal, D., “Routing Security in Wireless Ad Hoc Networks,” IEEE Communication Magazine, October 2002, pp. 70-75.

Authors Profile

Ms. Umaparvathi completed her B.E. (ECE) from Madras University in the year 1995. She completed her M.Tech (Communication Systems) from NIT, Trichirapalli in the year 2005. Currently she is doing her Ph.D at Anna University of Technology, Coimbatore. Her research interests are wireless networks, information security and digital signal processing.

Dr. Dharmishtan K Varughese completed his B.Sc. (Engg.) from College of Engineering, Trivandrum in the year 1972. He completed his M.Sc. (Engg.) from College of Engineering, Trivandrum in the year 1981. He completed his Ph.D from Indian Institute of Science, Bangalore in the year 1988. He worked as Senior Joint Director from 2003 to 2007. Currently he is working as a Professor at Karpagam College of Engineering, Coimbatore. His research interests are microstrip antennas, microwave theory, information theory and optical fiber communication.
A Survey and Comparison of Various Routing Protocols of Wireless Sensor Network (WSN) and a Proposed New TTDD Protocol Based on LEACH

Md. Habibe Azam1, Abdullah-Al-Nahid2, Md. Abdul Alim3, Md. Ziaul Amin4

1 Khulna University, School of Science, Engineering and Technology
Electronics and Communication Engineering Discipline
Bangladesh
hakki.ece06@gmail.com

2 Khulna University, School of Science, Engineering and Technology
Lecturer, Electronics and Communication Engineering Discipline
Bangladesh
nahidku@yahoo.com

3 Khulna University, School of Science, Engineering and Technology
Assistant Professor, Electronics and Communication Engineering Discipline
Bangladesh
alim.ece@gmail.com

4 Khulna University, School of Science, Engineering and Technology
Lecturer, Electronics and Communication Engineering Discipline
Bangladesh
ziaulece@yahoo.com

Abstract: In a wireless sensor network, the lifetime of a sensor node depends on its battery. An energy efficient routing protocol can increase the lifetime of the network by minimizing the energy consumption of each sensor node. Several energy efficient protocols have been developed for this purpose. Among those, we survey TTDD, LEACH, PEGASIS, SPIN and TEEN on the basis of some important evaluation metrics. In addition, we propose a new Two-Tier Data Dissemination (TTDD) protocol based on LEACH.

Keywords: WSN, Cluster, Protocol, TTDD, LEACH.

1. Introduction

A wireless sensor network (WSN) [1], [6] consists of small devices called sensor nodes, distributed autonomously to monitor physical or environmental conditions at different locations. These sensor nodes sense data in the environment surrounding them and transmit the sensed data to the sink or the base station. Transmitting sensed data to the base station affects the power usage of a sensor node. Typically, a WSN contains a large number of sensor nodes, and these sensor nodes can communicate either with each other or directly with the base station. For this reason energy plays a vital role in a WSN, and keeping the energy consumption of each node as low as possible is an important goal that must be considered when designing a routing protocol for a WSN.

Many routing protocols have been developed for this purpose. In this paper, we survey some selected protocols and compile a comparative list of those protocols, which will help in developing new energy efficient routing protocols. Moreover, our proposed new TTDD protocol will extend the lifetime of the sensing node.

The remainder of this paper is organized as follows. In Section 2, we briefly discuss the selected protocols that we surveyed and compared. Section 3 presents the comparative list. We introduce our proposed new TTDD protocol in Section 4 and its advantages in Section 5. Finally, concluding remarks are given in Section 6.

2. Selected Protocols

2.1 TTDD
The Two-Tier Data Dissemination (TTDD) approach is used to address the multiple mobile sink problem. The TTDD design uses a grid structure so that only sensors located at grid points need to acquire the forwarding information, such as queries and data [2]. When a node senses an event, the source node proactively forms a grid structure throughout the sensor field and sets up the forwarding information at the sensors closest to the grid points. After this grid structure is formed, a query from a sink traverses two tiers to reach a source. The lower tier is within the local grid square of the sink's current location, and the higher tier is made up of all the dissemination nodes at grid points from source to sink. The sink floods its query within a cell. Fig. 1 shows the total procedure.

TTDD's design assumes that sensor nodes are both stationary and location-aware. Because the sensors are static, TTDD can use simple greedy geographical forwarding to construct and maintain the grid structure with low overhead, and because they are location-aware, TTDD can tag the sensing data [3], [4], [5].
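The grid construction described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the TTDD authors' implementation: the field is taken to be a rectangle with its origin at (0, 0), the cell size `cell` is a free parameter, and the function names, sensor ids and coordinates are hypothetical.

```python
import math


def grid_points(source, cell, width, height):
    """Grid crossings anchored at the source location that lie inside the field."""
    sx, sy = source
    i_min, i_max = math.ceil((0 - sx) / cell), math.floor((width - sx) / cell)
    j_min, j_max = math.ceil((0 - sy) / cell), math.floor((height - sy) / cell)
    return [(sx + i * cell, sy + j * cell)
            for i in range(i_min, i_max + 1)
            for j in range(j_min, j_max + 1)]


def dissemination_nodes(points, sensors):
    """Map each grid point to the id of the sensor closest to it."""
    return {p: min(sensors, key=lambda s: math.dist(p, sensors[s]))
            for p in points}


# Toy field of 200 m x 200 m with a source at (50, 50) and 100 m cells.
sensors = {"a": (40, 60), "b": (160, 40), "c": (55, 140), "d": (140, 160)}
points = grid_points((50, 50), cell=100, width=200, height=200)
nodes = dissemination_nodes(points, sensors)
```

Only the sensors selected in `nodes` would need to hold forwarding state, which is the point of the grid: a sink's local flood only has to reach the nearest of these dissemination nodes.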
Figure 1. TTDD protocol.

When a sink moves more than a cell size away from its previous location, it performs another local flooding of its data query, which will reach a new dissemination node. Along its way toward the source, this query will stop at a dissemination node that is already receiving data from the source. This dissemination node then forwards data downstream and finally to the sink. In this way, even when sinks move continuously, higher-tier data forwarding changes incrementally and the sinks can receive data without interruption. Thus TTDD can effectively scale to a large number of sources and sinks.

2.2 LEACH
Low Energy Adaptive Clustering Hierarchy (LEACH) is the first hierarchical cluster-based routing protocol for wireless sensor networks. In LEACH the nodes are partitioned into clusters, and in each cluster there is a dedicated node with extra privileges called the Cluster Head (CH). The CH creates and maintains a TDMA (time division multiple access) schedule for the other nodes (cluster members) of that cluster. The CHs aggregate and compress the sensing data and send it to the Base Station (BS) [7]. This extends the lifetime of the majority of the nodes, as shown in Fig. 2.

Figure 2. LEACH protocol (diagram labels: Base Station, Cluster-head, Cluster, Cluster member).

This protocol is divided into rounds [6]; each round consists of two phases.
Set-up Phase
(1) Advertisement Phase
(2) Cluster Set-up Phase
Steady-state Phase
(1) Schedule Creation
(2) Data Transmission

Set-up Phase
CH selection considers two factors: first, the desired percentage of cluster heads in the network, and second, the history of each node having served as CH. The decision is made by each node n based on a random number (between 0 and 1) that it generates. If the generated random number is less than a threshold value T(n), the node becomes a CH for that round. The threshold value T(n) is calculated from equation (1) as

\[ T(n) = \begin{cases} \dfrac{P}{1 - P \left( r \bmod \frac{1}{P} \right)} & \text{if } n \in G \\ 0 & \text{otherwise} \end{cases} \tag{1} \]

where P is the desired percentage of cluster heads, r is the number of the current round and G is the set of nodes that have not been cluster heads in the last 1/P rounds. Nodes that have been cluster heads cannot become cluster heads again for 1/P rounds. Thereafter, each node has a 1/P probability of becoming a cluster head in each round. In the following advertisement phase, the CHs inform their neighborhood with an advertisement packet that they have become CHs. Non-CH nodes pick the advertisement packet with the strongest received signal strength.

In the next phase, cluster set-up, the member nodes inform the CH that they have become members of its cluster by sending a "join packet" containing their IDs using CSMA. After the cluster set-up sub-phase, the CH knows the number of member nodes and their IDs. Based on all messages received within the cluster, the CH creates a TDMA schedule, picks a CSMA code randomly, and broadcasts the TDMA table to the cluster members. After that the steady-state phase begins.

Steady-state phase
Nodes send their data to the CH during their allocated TDMA slot. This transmission uses a minimal amount of energy (chosen based on the received strength of the CH advertisement). The radio of each non-CH node can be turned off until the node's allocated TDMA slot, thus minimizing energy dissipation in these nodes. When all the data has been received, the CH aggregates the data and sends it to the Base Station (BS).
LEACH is able to perform local aggregation of data in each cluster to reduce the amount of data that is transmitted to the BS.

2.3 PEGASIS
Power Efficient Gathering in Sensor Information System (PEGASIS) is an energy efficient protocol. Its efficiency is guaranteed by two characteristics [8]: only one node communicates with the base station at a time, and the rest of the nodes communicate locally only with their neighbours. Each node communicates only with its closest neighbour by adjusting its signal power. Using signal strength, each node measures the distance to neighbouring nodes in order to locate the closest ones. After chain formation, PEGASIS elects a leader from the chain in every round based on residual energy. The leader collects data from the neighbours and transmits it to the base station. For this reason, the average energy spent by each node per round is reduced. Unlike LEACH, PEGASIS avoids cluster formation and uses only
one leader in a chain to transmit to the BS instead of multiple CHs. This approach reduces the overhead and lowers the bandwidth requirements from the BS. Fig. 3 shows that only one leader node forwards the data to the BS.

Figure 3. PEGASIS protocol.

2.4 SPIN
Sensor Protocol for Information via Negotiation (SPIN) [7] is one of the first data-centric dissemination protocols for wireless networks. The target scenario is a network where one, several, or possibly all nodes have data that should be disseminated to the entire network.
Negotiation replaces the simple sending of data in a flooding protocol with a three-step process. First, a node that has obtained new data, either by local measurements or from some other node, advertises the name of this data to its neighbours. Second, the receiver of the advertisement compares it with its local knowledge, and if the advertised data is as yet unknown, the receiver can request the actual data. If the advertisement describes already known data (for example, because it has been received via another path or another node has already reported data about the same area), the advertisement is simply ignored. Third, only once a request for data is received is the actual data transmitted. Fig. 4 shows the working procedure of the SPIN protocol.

Figure 4. SPIN protocol.

2.5 TEEN
Threshold sensitive Energy Efficient sensor Network protocol (TEEN) [9] is targeted at reactive networks. In this scheme, at every cluster change time, in addition to the attributes, the cluster head broadcasts the following to its members.
Hard Threshold (HT): This is a threshold value for the sensed attribute. It is the absolute value of the attribute beyond which the node sensing this value must switch on its transmitter and report to its CH.
Soft Threshold (ST): This is a small change in the value of the sensed attribute which triggers the node to switch on its transmitter and transmit.
The nodes sense their environment continuously. The first time a parameter from the attribute set reaches its hard threshold value, the node switches on its transmitter and sends the sensed data. The sensed value is stored in an internal variable in the node, called the sensed value (SV). The nodes will next transmit data in the current cluster period only when both of the following conditions are true.
1. The current value of the sensed attribute is greater than the hard threshold.
2. The current value of the sensed attribute differs from SV by an amount equal to or greater than the soft threshold.
Whenever a node transmits data, SV is set equal to the current value of the sensed attribute. Thus, the hard threshold tries to reduce the number of transmissions by allowing the nodes to transmit only when the sensed attribute is in the range of interest. The soft threshold further reduces the number of transmissions by eliminating all the transmissions which might otherwise have occurred when there is little or no change in the sensed attribute once the hard threshold has been crossed.

3. Comparison

In this section we present a comparison of the above protocols based on various evaluation metrics [10].

Table 1. Comparison among the protocols

Routing Protocol   Power Usage   Data Aggregation   Scalability   Query Based   Overhead
TTDD               Limited       No                 Limited       Yes           Low
LEACH              High          Yes                Good          No            High
PEGASIS            Max           No                 Good          No            Low
SPIN               Limited       Yes                Limited       Yes           Low
TEEN               High          Yes                Good          No            High

4. Proposed new TTDD

Main Features
Our proposed routing protocol includes the following features:
• Sensor nodes are homogeneous and energy constrained.
• Sensor nodes are stationary; the BS is mobile and located near the sensing area.
• Each node periodically senses its nearby environment and would like to send its data to the base station.
• A server is used for building a location database of sensor nodes.
• At first the total area is divided into a grid when a node senses any event, and then a cluster is formed with that node as CH.
• Data fusion or aggregation is used to reduce the number of messages in the network. Assume that
combining n packets of size k results in one packet


of size k instead of size nk. 5. Advantage
• Using TDMA, cluster sends their data to the CH.
Advantages of the proposed protocol
The routing process can be organized into two phases, • Lifetime of sensing node is greater than TTDD.
grid construction phase and cluster construction phase. • Node consumes less energy than TTDD by aggregating
the sensing data.
Grid Construction Phase • Data quality is batter than TTDD.
The sensing node builds a grid structure throughout the
6. Conclusion
sensor field. The grid size as R×R, where R is a sensor
node’s radio range. All sensor nodes in a grid are within
their radio range. It sets up the forwarding information at Every protocol has some advantages and disadvantages but
the sensors closest to grid points. The sink floods its query if we classify protocol according to their application and
within a cell. When the nearest dissemination node for the design those protocols only for specific purpose, then it will
requested data receives the query, it forwards the query to its be energy efficient otherwise not.
upstream dissemination node toward the source as like as
TTDD. References
This query forwarding process provides the information of [1] J. M. Kahn, R. H. Katz, and K. S. J. Pister, "Next
the path to the sink, to enable data from the source to Century challenges: Mobile networking for smart
traverse the query but in the reverse order. Location of all dust.”
grid point through which the data is disseminated for the [2] Haiyun Luo, Fan Ye, Jerry Cheng, Songwu Lu, Lixia
first time are stored in server. Zhang, “TTDD: A Two-tier Data Dissemination Model
for Large-scale Wireless Sensor Networks", UCLA Computer Science Department, Los Angeles, CA 90095-1596.
[3] S. Basagni, "Distributed clustering for Ad Hoc Networks", International Symposium on Parallel Architectures, Algorithms and Networks (I-SPAN'99).
[4] J. Hightower and G. Borriello, "Location Systems for Ubiquitous Computing", IEEE Computer Magazine, 34(8):57-66, 2001.
[5] A. Ward, A. Jones, and A. Hopper, "A New Location Technique for the Active Office", IEEE Personal Communications, 4(5):42-47, 1997.
[6] Wendi Beth Heinzelman, "Application-specific protocol architectures for wireless networks", Massachusetts Institute of Technology, June 2000.
[7] Mark A. Perillo and Wendi B. Heinzelman, "Wireless Sensor Network Protocols".
[8] Laiali Almzaydeh, Eman Abdelfattah, Manal Al-zoor and Amer Al-Rahayfeh, "Performance evaluation of routing protocols in wireless sensor networks", International Journal of Computer Science and Information Technology, Volume 2, Number 2, April 2010.
[9] Arati Manjeshwar and Dharma P. Agrawal, "TEEN: A routing protocol for enhanced efficiency in wireless networks", University of Cincinnati, Cincinnati, OH 45221-0030.
[10] P.T.V. Bhuvaneswari and V. Vaidehi, "Enhancement techniques incorporated in LEACH - a survey", Indian Journal of Science and Technology, Vol. 2, No. 5 (May 2009), ISSN: 0974-6846.

Cluster Construction Phase
After the grid construction phase, the server receives the location of the first-time data dissemination grid point. The node that first creates the grid structure then forms a cluster containing a cluster head (CH), whose role is considerably more energy intensive than that of the other nodes. For this reason, nodes rotate roles between CH and ordinary sensor throughout the lifetime of the network. At the beginning of each round every node chooses a random number. If this random number is less than the calculated threshold, the node becomes a CH; otherwise it does not (according to LEACH). Once a node becomes a CH, it cannot become a CH again for a certain number of rounds. The threshold value depends on the percentage of nodes wanted as CH and the number of rounds elapsed. Fig. 5 represents the total protocol.

Figure 5. Proposed new TTDD based on LEACH.
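
The cluster head election described in the Cluster Construction Phase can be sketched as follows. This is a minimal illustration, assuming the standard LEACH threshold T = P / (1 - P (r mod 1/P)), where P is the desired fraction of cluster heads and r is the round number. The function names, the `last_ch_round` bookkeeping dictionary and the eligibility rule are hypothetical choices made for this sketch and are not taken from the paper.

```python
import random


def leach_threshold(p: float, round_no: int) -> float:
    """LEACH threshold for a node that is still eligible in this epoch.

    p is the desired fraction of cluster heads; one epoch lasts 1/p rounds.
    """
    period = round(1 / p)  # rounds per epoch, e.g. 20 when p = 0.05
    return p / (1 - p * (round_no % period))


def elect_cluster_heads(node_ids, p, round_no, last_ch_round, rng=random):
    """Return the nodes that become cluster heads (CH) in this round.

    last_ch_round maps node id -> round in which it last served as CH
    (hypothetical bookkeeping, updated in place).
    """
    period = round(1 / p)
    heads = []
    for node in node_ids:
        served = last_ch_round.get(node)
        # A node that was CH less than one epoch ago sits this round out.
        if served is not None and round_no - served < period:
            continue
        # Each eligible node draws a random number and compares it
        # with the threshold, as described in the text.
        if rng.random() < leach_threshold(p, round_no):
            heads.append(node)
            last_ch_round[node] = round_no
    return heads
```

The threshold grows as the epoch progresses, so nodes that have not yet served become increasingly likely to be elected, which is what spreads the energy-intensive CH role across the network.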



Mathematical Models on Interaction between Computer Virus
and Antivirus Software inside a Computer System
Bimal Kumar Mishra1 and Gholam Mursalin Ansari2

1 Department of Applied Mathematics
Birla Institute of Technology, Mesra, Ranchi, India - 835 215
Email: drbimalmishra@gmail.com
2 Department of Computer Science
University Polytechnic, Birla Institute of Technology, Mesra, Ranchi, India - 835 215
Email: rajasofti@gmail.com

Abstract: In this paper an attempt is made to develop mathematical models of the interaction between computer viruses and antivirus software inside a computer system. The basic reproductive ratio in the absence and in the presence of the immune system is found, and the criterion for the spread of the computer virus is analyzed in Models 1 and 2. An analysis is also made of the immune response needed to clear the infection. The effect of new or updated antivirus software on viruses that are suppressed (quarantined) or not completely removed by the lower version of the installed antivirus software is studied in Model 3, where it is shown that the number of infected files falls exponentially when new or updated antivirus software is run. Reactivation of computer viruses in the latent class is formulated mathematically and the basic reproductive ratio is obtained in Model 4. A mathematical model is also developed in Model 5 to understand the recent attack of the malicious objects Backdoor.Haxdoor.S and Trojan.Schoeberl.E and their removal by the newly available tool FixSchoeb-Haxdoor.

Keywords: Prey-predator model; Computer virus; antivirus software; quarantine; latency time; self-replication.

1. Introduction
A year or two ago, most malware was spread via e-mail attachments, which resulted in mass outbreaks like Bagle, Mydoom and Warezov. Nowadays sending .EXE attachments in e-mail doesn't work so well for the criminals because almost every company and organization is filtering out such risky attachments from their e-mail traffic.
The criminals' new preferred way of spreading malware is by drive-by downloads on the Web. These attacks often still start with an e-mail spam run, but the attachment in the e-mail has been replaced by a web link, which takes you to the malicious web site. So instead of getting infected over SMTP, you get infected over HTTP. It is important to be aware of this shift from SMTP to HTTP infections, which can be exploited by the criminals in many ways. It is predicted that the total number of viruses and Trojans will pass the one million mark by the end of 2008 [12].
Transmission of malicious objects in a computer network is epidemic in nature. A malicious object is code that infects computer systems. There are different kinds of malicious code, such as worms, viruses and Trojans, which differ according to the way they attack computer systems and the malicious actions they perform. Some erase hard disks, some clog the network, and others sneak into computer systems to steal confidential and valuable information.
A virus, worm or Trojan horse can (like HIV) be latent, only to become active after a certain period. This is called a 'logic bomb'. These three classes of computer malware can also have hundreds of variants or several slightly modified versions, paralleling microbial diversity [2, 9].
The study of computer malware may help to control infectious disease emergence. Of the two main approaches to automating the detection of malicious executables, behavioral and content-based, a knowledge-based approach is more appropriate, because it uses the knowledge acquired from the disassembly of executables to extract useful features such as common instruction sequences and DLL calls [11].
Conventional antivirus systems are knowledge-based, so if the system doesn't recognize a piece of code as malware, it won't block it. If you let in a virus or a piece of malware, it can run amok.
The vast majority of computer viruses have been designed specifically for IBM-based PCs running the DOS and Windows operating systems. The malicious code (a machine language program) may spread in any or all of the following ways:
• The spreading medium may be a malicious attachment to an email.
• The malware medium may be a USB pen drive, a floppy disk, a CD or any secondary medium commonly used by computer professionals.
An acute epidemic occurs due to infectious malcode designed to actively spread from host to host over a network. When the user executes an infected program, the virus may take control of the computer and infect additional files. After the virus has completed its mischief, it transfers control to the host program and allows it to function

normally. This type of virus is called a "parasitic" computer virus, since it does not kill its host; instead, the host acts.
When this malicious code tries to enter a protected (secured) system equipped with an Intrusion Detection System (IDS), the IDS analyzes the unknown binary code to determine whether it is malicious. An IDS, enabled with signature analysis and an add-on security alarm, is deployed to monitor the network and host system activities [5, 6].
IDSs are supported by a knowledge-based evaluation system that focuses on genuinely threatening alerts and assists in post-attack forensics. The job of such knowledge-based systems is to filter out false positives and rank the severity of attacks. The knowledge base stores all well-known exploits and system vulnerability information together with the corresponding security solutions. It tunes the IDS with the known signatures and sends the proper action to the Artificial Immune System (AIS). The AIS attempts to classify network traffic as either self (normal or uninfected file) or non-self (malicious or infected file) and provides proactive protection via negative selection [7].
All the above information, along with vulnerability knowledge, is stored in an information asset database or knowledge base. The intelligent host, with proper anti-malicious software installed on it, then characterizes these vulnerability identifications based on the evaluation process or actions. The immune system dynamically looks up the security reference in the knowledge base. If the referred signature is found to be unknown or a high-priority alert, an associated action is fired on the target system on the demand of its expert system engine. With insight into the virus signature, the immune system disinfects the infected files after verifying the occurrence of the attack; otherwise it issues an isolated alert and quarantines the infected data into its blind spots. By correlating these alerts, the quarantined data is kept under a latency period. During this period the antivirus update is incorporated and, finally, the data kept under latency is recovered to its original normal form. Figure 1 describes a generic conceptual framework of malware transmission through various sources and its interaction with the Intrusion Detection System.

Figure 1. Virus attack cyber defense analysis

Mishra et al. [1, 2, 9] have developed various epidemic models on the transmission of malicious objects in computer networks according to the spreading behavior and nature of the malicious objects. Predicting virus outbreaks is extremely difficult because of the human nature of the attacks but, more importantly, detecting outbreaks early with a low probability of false alarms seems quite difficult. By developing models it is possible to characterize essential properties of the attacks [1].

2. Basic Terminologies
i. Computer virus: a program that can "infect" other programs by modifying them to include a possibly evolved version of itself. With this infection property, a virus can spread to the transitive closure of information flow, corrupting the integrity of information as it spreads. Additionally, most computer viruses have a destructive payload that is activated under certain conditions [1]. A self-replicating virus may be defined as "a software program capable of reproducing itself and usually capable of causing great harm to files or other programs on the same computer"; a true virus cannot spread to another computer without human assistance.
ii. Antivirus (or "anti-virus") software: a class of program that searches your hard drive and floppy disks for any known or potential viruses. It is also known as a "virus scanner". As new viruses are discovered by the antivirus vendor, their binary patterns are added to a signature database that is downloaded periodically to the user's antivirus program via the web.
iii. Quarantine: to move an undesired file, such as a virus-infected file or spyware, to a folder that is not easily accessible by regular file management utilities. The quarantine option is available in antivirus software so that companies can keep a record of which users have been infected and where the file came from, and possibly send the virus to the antivirus vendor for inspection. Spyware blockers quarantine files so that they can be restored.

3. Development of the model
An instruction on its own does absolutely nothing; it is the set of instructions (a program), deliberately written by software personnel to harm the computer system, that is called a virus and that plays an active role in attacking the files in the computer node. Some viruses are self-replicating, and some enter a latent class and reactivate after a certain duration. When a system is attacked by a virus, antivirus software is run to immunize the system. During this process some of the infected files are fully recovered, whereas others are quarantined (or suppressed), possibly because of the lower version of the antivirus software installed. In this situation a higher version or new antivirus software is run to obtain a full recovery. We try to develop mathematical models for these situations [8].
Assumptions:
1. Virus is replicated by the infected files.
2. Viruses die at a specific rate b. Death of a virus equivalently means the complete recovery of

infected files from the virus when antivirus software is run in the computer node for a specific session.
3. Uninfected files are constantly being produced or developed by the users at a rate c.
4. Uninfected files die at a constant rate d (natural death). Death of a file equivalently means that the file becomes irrelevant (garbage) after a certain interval of time.
5. Infected files die at a specific rate f = e + d, where d is the natural death rate and e is the death rate of the file due to infection from the virus (files get damaged and cannot be recovered after the run of antivirus software).
6. Death of antivirus software equivalently means that the present version of the software is incapable of identifying the attack of new viruses.

3.8 Model 1: Primary phase of an Infection
Viruses get entry to the computer node via various means (emails, infected disks etc.) and hijack various files (command files, executable files, kernel.dll, etc.) in the node for their own replication. A virus then leaves a specific file and the process is repeated. Viruses may be of different natures and, according to their mode of propagation, target different file types of the attacked computer for this purpose.
As per the assumptions, the model is described by the system

$$
\begin{aligned}
\frac{dV}{d\tau} &= aY - bV \\
\frac{dX}{d\tau} &= c - dX - \beta XV \\
\frac{dY}{d\tau} &= \beta XV - fY
\end{aligned}
\tag{1}
$$

The relationship between the computer virus and the uninfected file is analogous to the relationship between predator and prey as given in the classical work of Lotka-Volterra [3, 4]. Let X be the number of uninfected files (prey) and V be the number of computer viruses (predators) [8]. Then,
{Rate of change of X} = {net rate of growth of X without predation} - {rate of loss of X due to predation}, and
{Rate of change of V} = {net rate of growth of V due to predation} - {net rate of loss of V without prey}.
Let R0 be the basic reproductive ratio for the computer virus, defined as the expected number of viruses that one virus gives rise to in an uninfected file population. A virus gives rise to infected files at a rate βX for a time 1/b, and each infected file gives rise to viruses (self-replication) at a rate a for a time 1/f. Since X = c/d for an uninfected population,

$$
R_0 = \frac{\beta c a}{d b f}
\tag{2}
$$

The criterion for the spread of the computer virus is R0 > 1.
We non-dimensionalise the system (1) by defining

$$
x = \frac{d}{c}X, \quad y = \frac{d}{c}Y, \quad v = \frac{bf}{ac}V, \quad t = d\tau
\tag{3}
$$

The non-dimensionalisation for X arises from its steady state in the absence of infection, that for Y is chosen to be the same, and that for V arises from its steady state value. We choose one of the time scales τ to non-dimensionalise with d. The system (1) thus becomes

$$
\begin{aligned}
\varepsilon \frac{dv}{dt} &= \alpha y - v \\
\frac{dx}{dt} &= 1 - x - R_0 xv \\
\frac{dy}{dt} &= R_0 xv - \alpha y
\end{aligned}
\tag{4}
$$

where

$$
\varepsilon = \frac{d}{b}, \quad \alpha = \frac{f}{d}
\tag{5}
$$

For typical parameter values ε << 1.
The steady states of the non-dimensionalised system (4) are S0 = (0, 1, 0), the uninfected steady state, and S* = (v*, x*, y*), where

$$
v^* = 1 - \frac{1}{R_0}, \quad x^* = \frac{1}{R_0}, \quad y^* = \frac{1}{\alpha}\left(1 - \frac{1}{R_0}\right)
\tag{6}
$$

For R0 > 1, the normal situation, (v(t), x(t), y(t)) → (v*, x*, y*) as t → ∞. The susceptible population X (uninfected files) is reduced by the attack until each virus is expected to give rise to exactly one new virus, R0 x* = 1.
This we assume as the primary phase of an infection.

3.9 Model II: Secondary Phase of Infection (Effect of Immune system)
We assume an immune response in the computer system due to antivirus software Z, which is run at a constant rate g, with h being the death rate of antivirus software (meaning that the antivirus software is incapable of identifying the attack of new viruses). The antivirus software cleans the infected files at a rate γYZ. There is an analogy here with Z, the antivirus software, as predators and Y, the infected files, as prey. We take a linear functional response of Z to Y.
Our system thus becomes

$$
\begin{aligned}
\frac{dV}{d\tau} &= aY - bV \\
\frac{dX}{d\tau} &= c - dX - \beta XV \\
\frac{dY}{d\tau} &= \beta XV - fY - \gamma YZ \\
\frac{dZ}{d\tau} &= g - hZ
\end{aligned}
\tag{7}
$$

The non-dimensionalisation of the system is carried out as in Model 1, with z = hZ/g in addition, and we get

$$
\begin{aligned}
\varepsilon \frac{dv}{dt} &= \alpha y - v \\
\frac{dx}{dt} &= 1 - x - R_0 xv \\
\frac{dy}{dt} &= R_0 xv - \alpha y - \kappa yz \\
\frac{dz}{dt} &= \lambda (1 - z)
\end{aligned}
\tag{8}
$$

where

$$
\lambda = \frac{h}{d}, \quad \kappa = \frac{\gamma g}{d h}
\tag{9}
$$

The steady states of the non-dimensionalised system (8) are S0 = (0, 1, 0, 1), the uninfected steady state, and S* = (v*, x*, y*, z*), where

$$
v^* = \frac{\alpha}{\alpha + \kappa}\left(1 - \frac{1}{R_0'}\right), \quad
x^* = \frac{1}{R_0'}, \quad
y^* = \frac{1}{\alpha + \kappa}\left(1 - \frac{1}{R_0'}\right), \quad
z^* = 1
\tag{10}
$$

Let R0' be the basic reproductive ratio in the presence of the immune system, defined by

$$
R_0' = R_0 \frac{\alpha}{\alpha + \kappa}
\tag{11}
$$

Then we observe that if the infection persists then R0' x* = 1, and the infection persists as long as R0' > 1.
In order for the immune response to clear the infection we need the immune response parameter κ to satisfy

$$
\kappa > \alpha (R_0 - 1)
\tag{12}
$$

3.10 Model III: Effect of new antivirus software on such viruses which are suppressed (quarantine)
We assume a case where the viruses are not completely cleaned (they are quarantined) from the infected files when the installed antivirus software is run on the computer node. For the complete recovery of infected files from viruses, an updated version of the antivirus software has to be run. We further assume that such updated antivirus software is available and is 100% efficient. This antivirus software switches β to zero, and thus the equations for the subsequent dynamics of the infected files and free virus from equation (1) are expressed as

$$
\begin{aligned}
\frac{dV}{d\tau} &= aY - bV \\
\frac{dX}{d\tau} &= c - dX \\
\frac{dY}{d\tau} &= -fY
\end{aligned}
\tag{13}
$$

We further assume that the half-life of the virus is much less than that of the virus-producing files. Then,

$$
Y = Y_0 e^{-f\tau}, \qquad
V = \frac{V_0 \left(b e^{-f\tau} - f e^{-b\tau}\right)}{b - f}
\tag{14}
$$

From equation (14) we are able to say that the number of infected files falls exponentially. The behavior of V follows from the assumption on half-lives, so that f << b, that is, the amount of free virus falls exponentially after a shoulder phase.

3.11 Model IV: Reactivation of computer virus after they are in latent class
When computer viruses attack the computer node, some of them enter a latent class on infection. While in this class they do not produce new viruses, but may later be reactivated to do so. Only the files in the productive infected class Y1 produce viruses, and files in the latent infected class Y2 leave for Y1 at a per capita rate δ. Thus our system becomes:

$$
\begin{aligned}
\frac{dV}{d\tau} &= aY_1 - bV \\
\frac{dX}{d\tau} &= c - dX - \beta XV \\
\frac{dY_1}{d\tau} &= q_1 \beta XV - f_1 Y_1 + \delta Y_2 \\
\frac{dY_2}{d\tau} &= q_2 \beta XV - f_2 Y_2 - \delta Y_2
\end{aligned}
\tag{15}
$$

Infected files in class Y2 produce viruses in class Y1 at a rate δ for a time 1/(δ + f2). Thus, adding the contribution of both classes, the reproductive ratio R0 is expressed as

$$
R_0 = \frac{\beta c}{d b}\left(q_1 + q_2 \frac{\delta}{\delta + f_2}\right)\frac{a}{f_1}
\tag{16}
$$

3.12 Model V: Recent Attack by malicious object Backdoor.Haxdoor.S and Trojan.Schoeberl.E and its Mathematical approach
On January 9, 2007, the malicious objects Backdoor.Haxdoor.S and Trojan.Schoeberl.E, of type Trojan horse and with an infection length of 56,058 bytes, affected Windows 2000, Windows 95, Windows 98, Windows Me, Windows NT, Windows Server 2003 and Windows XP. Backdoor.Haxdoor.S is a Trojan horse program that opens a back door on the compromised computer and allows a remote attacker to have unauthorized access. It also logs keystrokes, steals passwords, and drops rootkits that run in safe mode.
It has been reported that the Trojan has been spammed through email as an email attachment. The tool FixSchoeb-Haxdoor.exe is designed to remove the infections of Backdoor.Haxdoor.S and Trojan.Schoeberl.E [10].
The FixSchoeb-Haxdoor.exe tool, meant to remove the deadly Backdoor.Haxdoor.S and Trojan.Schoeberl.E, prevents infected files from producing infectious virus. We assume that W denotes the uninfectious viruses, which start to be

produced from the infected files Y after the tool FixSchoeb-Haxdoor.exe is run. Infectious viruses are still present, and die as before, but are no longer produced. Under this assumption the system can be modeled as

$$
\begin{aligned}
\frac{dV}{d\tau} &= -bV \\
\frac{dX}{d\tau} &= c - dX \\
\frac{dY}{d\tau} &= -fY \\
\frac{dW}{d\tau} &= aY - bW
\end{aligned}
\tag{17}
$$

We assume that the uninfected file population X remains roughly constant for a given time-scale, that is, X = X* = bf/(aβ), and that f << b. System (17) becomes a linear system which is integrated to give

$$
\begin{aligned}
V &= V_0 e^{-b\tau} \\
Y &= Y_0 \frac{f e^{-b\tau} - b e^{-f\tau}}{f - b}, \quad \text{when } f \ll b \\
W &= W_0 \frac{b}{b - f}\left(\frac{b}{b - f}\left(e^{-f\tau} - e^{-b\tau}\right) - f\tau e^{-b\tau}\right)
\end{aligned}
\tag{18}
$$

From (18) it is clear that the total amount V + W of free virus falls exponentially after a shoulder phase.

4. Discussion and Conclusion
The threshold parameter obtained in (2) for the primary phase of infection gives the criterion for the spread of the computer virus, that is, R0 > 1. The susceptible population X (uninfected files) is reduced by the attack until each virus is expected to give rise to exactly one new virus, R0 x* = 1. The basic reproductive ratio in the presence of the immune system is defined by (11), and in order for the immune response to clear the infection we need the immune response parameter κ to satisfy κ > α(R0 − 1). For the viruses which are quarantined by the installed antivirus software, we assume that updated antivirus software is available and is 100% efficient. When this updated antivirus software is run, equation (14) shows that the number of infected files falls exponentially. The behavior of V follows from the assumption on half-lives, so that f << b, that is, the amount of free virus falls exponentially after a shoulder phase. A discussion is also made of those viruses which enter a latent class on infection; in this class they do not produce new viruses, but may later be reactivated to do so. Infected files in class Y2 produce viruses in class Y1 at a rate δ for a time 1/(δ + f2), and the reproductive ratio is also obtained.

Nomenclature
V: number of viruses in the computer
X: number of uninfected target files
Y: number of infected files
a: Replicating factor
b: Death rate of a virus
c: Birth of uninfected files by users
d: Natural death of an uninfected file
e: Death rate of infected files
f = e + d
β: Infectious contact rate, i.e., the rate of infection per susceptible and per infective
R0: Threshold parameter
Z: Response of antivirus software, which immunizes the system
g: Rate at which antivirus software is run, which is constant
h: Death rate of antivirus software
γYZ: Rate at which antivirus software cleans the infected files
κ: Immune response parameter
Y1: productive infected class
Y2: latent infected class
q1: Probability of entering productive infected class
q2: Probability of entering latent infected class

References
[1] Bimal Kumar Mishra, D.K. Saini, SEIRS epidemic model with delay for transmission of malicious objects in computer network, Applied Mathematics and Computation, 188 (2007) 1476-1482
[2] Bimal Kumar Mishra, Dinesh Saini, Mathematical models on computer viruses, Applied Mathematics and Computation, 187 (2007) 929-936
[3] Lotka, A. J., Elements of Physical Biology, Williams and Wilkins, Baltimore, 1925; Reissued as Elements of Mathematical Biology, Dover, New York, 1956.
[4] Volterra, V., Variazioni e fluttuazioni del numero d'individui in specie animali conviventi, Mem. Acad. Sci. Lincei, 1926, 2:31-113
[5] Jones, A.K. and Sielken, R.S., Computer System Intrusion detection: a survey, Technical report, Computer Science Department, University of Virginia, 2000
[6] Yu, J., Reddy, R., Selliah, S., Reddy, S., Bharadwaj, V. and Kankanahalli S., TRINETR: An Architecture for Collaborative Intrusion Detection and Knowledge-Based Alert Evaluation, In Advanced Engineering Informatics Journal, Special Issue on Collaborative Environments for Design and Manufacturing. Editor: Weiming Shen. Volume 19, Issue 2, April 2005. Elsevier Science, 93-101
[7] Jinqiao Yu, Y.V. Ramana Reddy, Sentil Selliah, Srinivas Kankanahalli, Sumitra Reddy and Vijayanand Bhardwaj, A Collaborative Architecture for Intrusion Detection Systems with Intelligent Agents and Knowledge based alert Evaluation, In the Proceedings of IEEE 8th International Conference on Computer Supported Cooperative Work in Design, 2004, 2: 271-276
[8] Nicholas F. Britton, Essential Mathematical Biology, Springer-Verlag, London, 2003
[9] Bimal Kumar Mishra, Navnit Jha, Fixed period of temporary immunity after run of anti-malicious objects software on computer nodes, Applied Mathematics and Computation, 190 (2007) 1207-1212
[10] http://www.symantec.com/smb/security_response/writeup.jsp?docid=2007-011109-2557-99
[11] Masud, Mohammad M., Khan, Latifur and Thuraisingham, Bhavani, A Knowledge-based Approach to detect new Malicious Executables. In the proceedings of the Second Secure Knowledge Management Workshop (SKM) 2006, Brooklyn, NY, USA
[12] http://www.f-secure.com/f-secure/pressroom/news/fsnews_20080331_1_eng.html, March 31, 2008
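
The threshold quantities of Models 1 to 4 in Section 3 can be checked numerically with a short script. The sketch below evaluates equations (2), (6), (11), (12), (14) and (16) and compares the closed form (14) with a forward-Euler integration of system (13). The parameter values are hypothetical and chosen only for illustration, the function names are not from the paper, and the comparison with (14) assumes that the virus population starts at its quasi-steady value, aY0 = bV0, which the paper does not state explicitly.

```python
import math


def basic_reproductive_ratio(a, b, c, d, f, beta):
    """Equation (2): R0 = beta*c*a / (d*b*f)."""
    return beta * c * a / (d * b * f)


def model1_steady_state(r0, alpha):
    """Equation (6): infected steady state (v*, x*, y*) of system (4)."""
    v_star = 1.0 - 1.0 / r0
    return v_star, 1.0 / r0, v_star / alpha


def model1_rhs(v, x, y, r0, alpha):
    """Right-hand sides of system (4); the first entry is eps * dv/dt."""
    return (alpha * y - v, 1.0 - x - r0 * x * v, r0 * x * v - alpha * y)


def model2_reproductive_ratio(r0, alpha, kappa):
    """Equation (11): reproductive ratio in the presence of antivirus."""
    return r0 * alpha / (alpha + kappa)


def immune_response_clears(r0, alpha, kappa):
    """Condition (12): kappa > alpha * (R0 - 1)."""
    return kappa > alpha * (r0 - 1.0)


def free_virus(v0, b, f, tau):
    """Equation (14): free virus after the updated antivirus is run."""
    return v0 * (b * math.exp(-f * tau) - f * math.exp(-b * tau)) / (b - f)


def integrate_model3(v0, y0, a, b, f, tau_end, steps=20000):
    """Forward Euler for the V and Y equations of system (13)."""
    h = tau_end / steps
    v, y = v0, y0
    for _ in range(steps):
        v, y = v + h * (a * y - b * v), y + h * (-f * y)
    return v, y


def model4_reproductive_ratio(a, b, c, d, beta, q1, q2, delta, f1, f2):
    """Equation (16): reproductive ratio with a latent class."""
    return beta * c / (d * b) * (q1 + q2 * delta / (delta + f2)) * a / f1


# Hypothetical parameter values, for illustration only.
A, B, C, D, F, BETA = 5.0, 2.0, 10.0, 0.1, 0.5, 0.004
R0 = basic_reproductive_ratio(A, B, C, D, F, BETA)
ALPHA = F / D  # equation (5)

if __name__ == "__main__":
    print("R0 =", R0)
    print("steady state (v*, x*, y*) =", model1_steady_state(R0, ALPHA))
    print("kappa = 6 clears infection:", immune_response_clears(R0, ALPHA, 6.0))
    print("V(3) closed form:", free_virus(1.0, B, F, 3.0))
    print("V(3) Euler:", integrate_model3(1.0, B / A, A, B, F, 3.0)[0])
```

Substituting the steady state from `model1_steady_state` into `model1_rhs` should give residuals close to zero, and setting q1 = 1, q2 = 0 and f1 = f in `model4_reproductive_ratio` should recover the Model 1 value of R0, which is a quick consistency check between (2) and (16).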

Authors Profile

Bimal Kumar Mishra is a faculty member in the Department of Applied Mathematics, Birla Institute of Technology, Mesra, Ranchi, India - 835215. He received his Master's degree in Operational Research from the University of Delhi, Delhi, and also holds a Master's degree in Mathematics. He earned his Ph.D. degree from Vinoba Bhave University, Hazaribag, Jharkhand, India and his D.Sc. degree from Berhampur University, Berhampur, Orissa, India. His research area is population dynamics and the flow of blood in the human body. He is presently working in the area of mathematical models and simulation of cyber attack and defense.

Gholam Mursalin Ansari is a faculty member of University Polytechnic, BIT Mesra, Ranchi. He received his MCA degree from BIT Mesra, Ranchi, where he is pursuing his PhD degree; his research topic is "Cyber attack and defense".

A Low Power High Gain Folded-Cascode CMOS Op-Amp with Slew Rate
Enhancement Circuit for 100mW 10-bit 50MS/s High Speed ADC
Designed in 0.18um CMOS Technology
Bhanu Pratap Singh Dohare1, D.S. Ajnar2 and P.K. Jain3

1 Electronics & Instrumentation Engineering Department,
S.G.S.I.T.S. 23, Park Road, Indore, M.P. India-452003
bhanuecvlsi@gmail.com
2 Electronics & Instrumentation Engineering Department
S.G.S.I.T.S. 23, Park Road, Indore, M.P. India-452003
dajnar@sgsits.com
3 Electronics & Instrumentation Engineering Department
S.G.S.I.T.S. 23, Park Road, Indore, M.P. India-452003
pramod22_in@yahoo.com

Abstract: This work describes the design and simulation of a high-speed, high-gain, low-power fully differential op-amp with a DC open-loop gain of 110 dB, a phase margin of 72 degrees and a unity-gain bandwidth of 822 MHz. The input-referred noise is about 8 nV/√Hz at 10 MHz. The folded-cascode op-amp has a positive slew rate of 35 V/ns and a negative slew rate of 28 V/ns. The settling time is 3.5 ns and the op-amp power consumption is 2.8 mW with a supply voltage of +1.2/-1.2 V. The design has been implemented in 0.18um UMC mixed-signal CMOS technology using Cadence. The op-amp is designed for the sample-and-hold stage of a 100mW 10-bit, 50MS/s high-speed ADC. With speed optimization the 0.488% settling time is 3.5 ns. The gain-boosting technique, which is suitable for low-supply-voltage applications, is used to achieve high gain. Common mode feedback (CMFB) is used to stabilize the designed op-amp against temperature. Three fully differential folded-cascode op-amps are used in this design, one as the main op-amp and the others for gain boosting. The two auxiliary fully differential folded-cascode op-amps use continuous-time CMFB and serve as gain-boosting stages that increase the open-loop gain of the main op-amp. A slew rate enhancement circuit is introduced to improve the non-symmetric slew rate of the output stages.

Keywords: Gain-Boosting, slew rate enhancement, CMFB.

1. Introduction
In high-performance analog integrated circuits, such as switched-capacitor filters, delta-sigma modulators and pipeline A/D converters, op-amps with very high DC gain and high unity-gain frequency are needed to meet both the accuracy and the fast-settling requirements of the systems. In pipelined analog-to-digital (A/D) converters, high-speed and high-accuracy operational amplifiers (op-amps) are essential. The speed and accuracy criteria are determined by the settling behavior of the op-amps. Fast settling mainly depends on the unity-gain frequency, while high settling accuracy is due to high DC gain. The realization of high-speed and high-accuracy op-amps has proven to be a very challenging task. Optimizing the circuit design for both requirements leads to conflicting demands [1]. A single-stage folded-cascode topology is a popular approach in designing high-speed op-amps. Besides a large unity-gain frequency, it offers a large output swing. However, it is limited in providing the high DC gain required for high settling accuracy. In 1990, Bult and Geelen proposed the folded-cascode op-amp with the gain-boosting technique [3]. This technique helps to increase the op-amp DC gain without sacrificing the output swing of a regular cascode structure [3]. Pushing up the doublet can raise stability problems [5], [6]. Based on that, this paper presents a simple but robust optimization design method; a sample fully differential gain-boosted folded-cascode op-amp was also designed in a 0.18um mixed-signal CMOS process with a 1.2V power supply. The purpose of this paper is to discuss design considerations when using a gain-boosted cascode op-amp in the sample-and-hold (SHA) stage of a 100mW 10-bit 50MS/s pipeline A/D converter. The rest of this paper is organized as follows. In section 2, the circuit implementation in a 0.18um CMOS process is presented. The gain-boosting technique is explained in section 4, and the circuit frequency behavior is analyzed in section 5. The simulation results are given and discussed in section 6. Finally, the conclusions are drawn in section 7.

2. Design of Gain Boosted Folded Cascode Op-Amp
In this section, the implementations of the main op-amp and the gain enhancement stages are discussed. A general method of designing a pipeline A/D converter for minimum power consumption was applied at the system level. This results in a set of specifications for each stage in the pipelined A/D converter. The selected system architecture has a SHA stage followed by eight 1.5-bit residue gain stages and a 2-bit flash stage. The op-amp has to meet the specifications for

the SHA as shown in Table 1. Since regular cascode device parameters related to channel length modulation
can not meet these specifications, gain boost cascode respectively for NMOS and PMOS devices. Taking the
topology has been chosen to meet both the high gain and complementarily between the transistors M4 and M6 into
high bandwidth requirements. Fully differential folded- account:
cascode op-amps have been adopted in this design, one for The gain expression becomes:
main op amp, and the others for auxiliary op amps. The
complete implementation is shown in Figure.2.Because the
gain-boosted op amp will be used in a closed-loop The unity gain frequency of the OTA is given by the
configuration, in order to minimize the virtual ground expression:
parasitic that reduces feedback factor, a NMOS differential Table 2: Design parameters and specifications
pair is chosen as input stage in the main op-amp. As for the Specifications Values
two auxiliary op amps, there is not any difference except f(MHz) 340
their input stages. The auxiliary op amp A2 is shown in ID(μA) 30
Figure.3 and Al is not shown again for its similarity to A2. Channel length(μm) 0.18
The ideal effect of the auxiliary op amp is to increase the AV(dB) 82
output impedance of the main op amp by auxiliary times so
CL(Pf) 0.1
as to improve the dc gain of the main op amp by the same
Vdd +1.2/-1.2
times. At the same time, the dominant pole of the main op
amp is pushed down by auxiliary times, where auxiliary is Parameters Values
the dc gain of the auxiliary op amp. As long as the unit-gain gm9,10/ID(V-1) 8
bandwidth of the auxiliary op amp is designed to be larger ID(W/L)9,10(μA) 0.86
than the -3dB bandwidth of the main op-amp the high- g,m4/ID(V-1) 6
frequency performance of the main op amp will be ID(W/L)4(μA) 1.65
unchanged, i.e. the gain-boosted op amp has the same high- W9,10(μm) 35
frequency performance as that of the main op amp. In fact, W1,2,3,4(μm) 18
the gain-boosting technique can potentially raise two W5,6,7,8,11,12(μm) 6
significant problems for the time-domain performance of the
gain-boosted op amp, i.e. doublet and instability.
Table 1: Op-amp specification for SHA stage

Parameters                  Specifications
Stage capacitor (Cf)        1.2 pF
Load capacitor              1.9 pF
Feedback factor             0.9
Settling time               3.5 ns
DC gain                     72 dB
Gain bandwidth (GBW)        326 MHz
Phase margin (PM)           70 degrees
Input transistor current    0.72 mA

3. Optimum Technology OTA Architecture
Several fundamental issues exist when selecting an optimal architecture for the operational transconductance amplifier (OTA). The choice made here aims at both large gain and large bandwidth. The folded cascode OTA is shown in Figure 3 [2 - 4]. The name “folded cascode” comes from folding down the n-channel cascode active loads of a diff-pair and changing the MOSFETs to p-channel devices. This OTA, like all OTAs, has good PSRR compared to the operational amplifier. The folded cascode OTA has a differential stage consisting of PMOS transistors M9 and M10, which are intended to charge the Wilson mirror. MOSFETs M11 and M12 provide the DC bias voltages to transistors M5, M6, M7 and M8. The open-loop voltage gain is given by:

where gm9, gm4 and gm6 are respectively the transconductances of transistors M9, M4 and M6, and ID is the bias current flowing in MOSFETs M4, M6 and M9. Likewise, CL is the capacitance at the output node. λN and λP are the

4. Gain Boosting Technique
Figure 1 illustrates a gain-boosted cascode topology in which transistor M1 is the input device, M2 the cascode device and M3 the gain-boost device. M3 drives the gate of M2 and forces the voltages at nodes X and Y to be equal. As a result, voltage variations at the drain of M2 affect the voltage at node X to a lesser extent, because the gain-boost device regulates this voltage [3].

Figure 1. Gain boosting cascode topology

The addition of the gain-boost device with open-loop gain Afb provides a small-signal output resistance approximately Afb times larger than that of a regular cascode [4]. Through this technique, the output resistance and gain can be increased by the gain of the gain-boost device without adding more cascode devices. However, the transient response of such an op amp is degraded by the presence of a pole-zero doublet [5]. This doublet appears as a slow exponential term in the step response of the op amp, thus degrading the total settling time drastically; it is discussed further in the analysis section.
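The slow-settling effect of a pole-zero doublet described above can be illustrated with a small numerical sketch. It assumes a hypothetical open-loop model A(s) = (wu/s)(1 + s/wz)/(1 + s/wp) placed in unity feedback; this model, the function names and all frequencies are assumptions made for illustration and are not the transfer function or values derived in this paper. The step response is obtained from the partial-fraction expansion of the resulting second-order closed-loop system.

```python
import cmath
import math


def step_response(t, wu, wz, wp):
    """Unit-step response of A(s)/(1 + A(s)) for the assumed model
    A(s) = (wu/s) * (1 + s/wz) / (1 + s/wp).

    The closed-loop denominator is a*s^2 + b*s + c with the coefficients below.
    (Assumes the two closed-loop poles are distinct.)
    """
    a, b, c = 1.0 / wp, 1.0 + wu / wz, wu
    root = cmath.sqrt(b * b - 4.0 * a * c)
    y = 1.0  # final value, since the dc closed-loop gain is 1
    # At each pole, the derivative of the denominator equals +root or -root.
    for pole, d_prime in (((-b + root) / (2 * a), root),
                          ((-b - root) / (2 * a), -root)):
        numerator = wu * (1.0 + pole / wz)
        residue = numerator / (pole * d_prime)  # residue of H(s)/s at this pole
        y += (residue * cmath.exp(pole * t)).real
    return y


def settling_time(wu, wz, wp, tol=1e-3, t_max=100e-9, n=20000):
    """First grid time after which |y(t) - 1| stays within tol (None if never)."""
    dt = t_max / n
    last_bad = -1
    for k in range(n + 1):
        if abs(step_response(k * dt, wu, wz, wp) - 1.0) > tol:
            last_bad = k
    if last_bad == n:
        return None
    return (last_bad + 1) * dt


two_pi = 2.0 * math.pi
wu = two_pi * 300e6                                          # hypothetical unity-gain frequency
t_clean = settling_time(wu, two_pi * 10e6, two_pi * 10e6)    # zero cancels pole exactly
t_doublet = settling_time(wu, two_pi * 10e6, two_pi * 9e6)   # 10 % pole-zero mismatch
print(f"no doublet: {t_clean * 1e9:.2f} ns, with doublet: {t_doublet * 1e9:.2f} ns")
```

With these assumed numbers, the 0.1 % settling time grows from a few nanoseconds to a few tens of nanoseconds once the pole and zero no longer cancel, because the residual slow exponential decays with a time constant close to 1/wz rather than 1/wu.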

Figure 2. Fully differential gain-boosted folded-cascode op amp with CMFB

Figure 3. Fully differential folded-cascode amplifier A2 with CMFB

5. Settling Response Analysis
To understand the effect of pole-zero doublets on slow settling behavior, the transfer function of the gain-boosting technique is derived using the small-signal model shown in Figure 4.

Figure 4. Small signal model

In Figure 4, the capacitors C1 through C3 are the equivalent parasitic capacitances of the MOS transistors at nodes X and Y, while CL is the load capacitance at the output node. To simplify the analysis, the parasitic drain-to-gate capacitor C4 of M2 is split into its Miller equivalents at node X and at the output node. These Miller capacitances are included in the value of the parasitic capacitor C2 at node X and in the value of the capacitor CL at the output node.

Figure 5. Pole and zero locations

A CMFB circuit is indispensable in a fully differential operational amplifier. The conventional dynamic SC-CMFB circuit, shown in Figure 2, is adopted in the main op amp, because it saves static power consumption and its common-mode voltage sense circuit does not limit the output swing of the op amp. However, the capacitors in the SC-CMFB should be carefully selected so that they neither overload the main op amp nor are affected by the charge injection of the switches. Although the SC-CMFB circuit has the advantages described above, it is not appropriate for the two auxiliary op amps. On the one hand, the load capacitances of the two auxiliary op amps are small; as a result, the capacitors in the SC-CMFB will be smaller than these load capacitances, and the charge injection of the switches will decrease the accuracy of the circuit. On the other hand, the output of each auxiliary op amp does not need a high swing. Therefore, two continuous-time CMFB circuits are used. The CMFB circuit for A2 is shown in Figure 3. The CMFB circuit of A1 is not shown because it is similar to that of A2.

6. The Simulation Results
With the design process described above, a single-stage fully differential gain-boosted folded-cascode op amp was designed and implemented in the UMC 0.18 μm mixed-signal process with a 1.2 V power supply. The step response is simulated in the closed-loop configuration shown in Fig. 5. Here, both the input capacitor C1 and the feedback capacitor Cf are 1 pF, while the load capacitor CL is 4 pF. Cp represents the parasitic capacitance at the input of the op amp, which is 0.185 pF.

Figure 6. Frequency Response (bode plot)

Figure 7. Slew rate performance
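For the closed-loop test bench described in Section 6, the feedback factor can be estimated from the quoted capacitor values. The sketch below assumes a standard capacitive-feedback arrangement in which C1, Cf and Cp all meet at the op amp input node, so that the feedback factor is Cf / (C1 + Cf + Cp); the paper does not state this formula, and the function names are hypothetical. The static gain error estimate 1/(beta * A0) likewise assumes a large loop gain and uses the 110 dB simulated DC gain reported in Table 3.

```python
def feedback_factor(c_in, c_fb, c_par):
    """Capacitive-divider feedback factor beta = Cf / (C1 + Cf + Cp) (assumed topology)."""
    return c_fb / (c_in + c_fb + c_par)


def static_gain_error(beta, dc_gain_db):
    """Approximate relative closed-loop gain error 1 / (beta * A0) for large loop gain."""
    a0 = 10.0 ** (dc_gain_db / 20.0)
    return 1.0 / (beta * a0)


C1 = 1e-12       # input capacitor, from the text
CF = 1e-12       # feedback capacitor, from the text
CP = 0.185e-12   # parasitic input capacitance, from the text

beta = feedback_factor(C1, CF, CP)
error = static_gain_error(beta, dc_gain_db=110.0)  # simulated DC gain from Table 3
print(f"beta = {beta:.3f}, static gain error = {error:.2e}")
```

Under these assumptions the parasitic capacitance Cp lowers the feedback factor below the ideal value of 0.5 that C1 = Cf alone would give, which is why the input parasitic is tracked explicitly in the test bench.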

Figure 8. Relationship between settling time and Cc

Figure 9. Differential output DC swing versus input voltage (vin+ only) under different power supplies.

Table 3: Simulated Performance

Parameters                  Simulated results
DC gain                     110 dB
Unity gain frequency        821 MHz
Phase margin                70 degrees
Power dissipation           7.8 mW
Settling time               3.7 ns
Slew rate                   35 V/ns, 28 V/ns
Differential output swing   2 Vp-p

7. Conclusions
A single-stage folded cascode gain-boosted CMOS OTA has been designed and simulated using 0.18 μm CMOS technology. In this design, a single transistor was applied as the gain-boost device. Care has been taken in selecting the current values in both the cascode device and the gain-boost device to ensure good settling time performance while maintaining the gain and bandwidth of the op amp. The designed op amp fulfills the stringent specifications of the SHA stage of a pipelined A/D converter with minimal additional power consumed.

References

[1] K. Bult and G. Geelen, "A fast-settling CMOS op amp for SC circuits with 90-dB DC gain", IEEE Journal of Solid-State Circuits, Vol. 25, No. 6, Dec. 1990, p. 1379-1384.
[2] B. Y. Kamath, R. G. Meyer and P. R. Gray, "Relationship Between Frequency Response and Settling Time of Operational Amplifiers", IEEE Journal of Solid-State Circuits, Vol. SC-9, No. 6, Dec. 1974, p. 347-352.
[3] Mrinal Das, "Improved Design Criteria of Gain-Boosted CMOS OTA with High-Speed Optimizations", IEEE Trans. on Circuits and Systems II, Vol. 49, No. 3, March 2002, p. 204-207.
[4] K. Bult and G. Geelen, "The CMOS gain-boosting technique", Analog Integrated Circuits and Signal Processing, Vol. 1, No. 2, Oct. 1991, p. 119-135.
[5] European Industry Association (EICTA) MBRAI-02-16 v1.0 (2004-01): “Mobile and Portable DVB-T Radio Access Interface Specification”, 2004.
[6] P. Bogner, “A 28mW 10b 80MS/s pipelined ADC in 0.13μm CMOS”, Proc. ISCAS’04, vol. 1, pp. 17-20, 2004.

Authors Profile

Bhanu Pratap Singh Dohare received the B.E. degree in Electronics and Communication Engineering from R.G.P.V. Bhopal in 2008 and the M.Tech in Microelectronics and VLSI Design from S.G.S.I.T.S. Indore, India in 2010. He is currently working on analog filter design and analysis.

D.S.Ajnar received the B.E. degree in Electronics and Communication Engineering from D.A.V.V. University, India in 1993 and the M.E. degree in Digital Techniques & Instrumentation Engineering from Rajiv Gandhi Technical University Bhopal, India in 2000. He has been working in the teaching and research profession since 1995. He is now working as Reader in the Department of Electronics & Instru. Engineering of S.G.S.I.T.S. Indore, India. His field of research interest is the design of analog filters and current conveyors.

P.K.Jain received the B.E. degree in Electronics and Communication Engineering from D.A.V.V. University, India in 1987 and the M.E. degree in Digital Techniques & Instrumentation Engineering from Rajiv Gandhi Technical University Bhopal, India in 1993. He has been working in the teaching and research profession since 1988. He is now working as Reader in the Department of Electronics & Instru. Engineering of S.G.S.I.T.S. Indore, India. His field of research interest is analog circuit design.
