Autonomic Fault Management based on Cognitive Control Loops

Sung-Su Kim (1), Sin-seok Seo (1), Joon-Myung Kang (2), and James Won-Ki Hong (1,3)
(1) Dept. of Computer Science and Engineering, POSTECH, {kiss, sesise, jwkhong}@postech.ac.kr
(2) Department of Electrical and Computer Engineering, University of Toronto, joonmyung.kang@utoronto.ca
(3) Division of IT Convergence and Engineering, POSTECH

Abstract: This paper presents an efficient fault management approach based on cognitive control loops to support autonomic network management for the Future Internet. The cognitive control loops determine the urgency of network alarms, process urgent alarms more quickly, and then infer the root causes of problems through learning and reasoning. We show that the approach reduces the number of alarms through correlation and determines alarm priorities using a policy-based ontology model.

Index Terms: Autonomic Fault Management, Cognitive Management, Alarm Correlation, Association Rule Mining

I. INTRODUCTION
The Internet is a very successful modern technology. Despite that success, its design has fundamental architectural and business problems. So far, incremental patches have been added to address them. However, incremental patching cannot solve inherent problems such as the shortage of IP addresses, security, and management. Two approaches to designing the Future Internet have been proposed: revolutionary and evolutionary [1] [2]. In either approach, management of the Future Internet is an important topic. However, we do not yet have a clear picture of the Future Internet, and many emerging technologies are being investigated for it. For example, Content Centric Networking (CCN) is one of the hot issues, and network virtualization and autonomic networking are expected to be key technologies for the Future Internet [3]. Even if a new Internet architecture replaces the current one, the basic paradigm of network management will not change.
The paradigm is to understand the current status of the network and take appropriate actions. To understand the network status, we need to monitor network devices, links, and servers. Network administrators are overwhelmed by the volume of network events and alarms: enterprise networks generate millions of network alarms per day. In cloud computing or virtualized network environments, there will be even more network events and alarms to analyze, because alarms related to virtualized resources will be generated in addition to those of physical entities.

Existing rule-based and case-based alarm correlation approaches need manually defined rules and cases, and they assume that the managed network is stable. However, a dependency between alarms may be missing, and manual modification is necessary whenever the managed network changes. For example, if the topology of the managed network changes, the rules related to the topology must be changed manually. Therefore, the dependency model needs to be updated through learning.

Alarms contain information about the serious status of network resources such as links, routers, and switches. However, this fragmentary information does not reveal the impact of a given problem. Serious and urgent alarms need to be detected and processed more quickly than normal alarms.

We propose an efficient fault management approach based on a cognitive control loop, which is part of the new FOCALE model. The cognitive control loop determines the priorities of network alarms, processes alarms with three different control loops, and then infers the root causes of problems through learning and reasoning. To evaluate our approach, we synthetically generate alarms, then correlate and analyze them to find root causes. In addition, we propose an ontology for determining the priorities of alarms. Urgent cases are treated immediately with specified actions. Otherwise, the possible sets of actions are examined and the most appropriate one is selected.
In our experiment, 16 different alarms are reduced to four clusters by using learned rules and our clustering algorithm. This reduces the effort and time required of higher-level network managers.

The organization of this paper is as follows. Section 2 covers related work on the FOCALE autonomic architecture [4] for the Future Internet and on alarm correlation. Section 3 presents the concept of a cognitive control loop. Section 4 describes our approach for processing network alarms in detail. Section 5 presents a case study to validate our concept and algorithm. Finally, Section 6 presents conclusions and future work.

978-1-4673-0269-2/12/$31.00 ©2012 IEEE

II. RELATED WORK
In this section, we present the FOCALE autonomic architecture and existing alarm correlation approaches.

A. FOCALE
FOCALE [5] is an autonomic networking architecture. The acronym FOCALE stands for Foundation – Observation – Compare – Act – Learn – rEason, which describes its novel control loops. Note that other autonomic approaches, such as ... Because the control loop is responsible for at least two fundamentally different operations, monitoring and (re)configuration, the semantics of the control structure are overloaded: these two operations have nothing in common. Indeed, a fault received from one managed entity might not have anything to do with the root cause of the problem; hence, the (re)configuration loop may affect different entities. FOCALE uses the DEN-ng information model [7] and the DENON-ng ontologies [8] to translate disparate sensed data into a common networking lingua franca. The DEN-ng information model is currently being standardized in the Autonomic Communications Forum (ACF); its previous versions have already been standardized in the TeleManagement Forum and in the ITU-T.
DEN-ng is used to represent the static characteristics and behaviors of entities; DENON-ng is then used to augment this model with consensual meaning and definitions so that vendor-specific concepts can be mapped into a common terminology. This enables facts extracted from sensor input data to be reasoned about using ontology-based inferencing.

B. Alarm Correlation Approaches
There are four alarm correlation approaches: rule-based alarm correlation [9], codebook-based alarm correlation [10], case-based alarm correlation, and mining-based alarm correlation [11]. The rule-based, codebook-based, and case-based approaches are highly dependent on the expert knowledge of skilled operators. In particular, it is not easy for them to reflect dynamically changing network conditions, such as those in wireless or overlay environments, because the rules or dependency models are made manually on the assumption that the network is mostly stable. Mining-based alarm correlation is able to detect cause and effect relationships between alarms [11, 12]. However, its long processing time makes it hard to detect relationships within a short period. Our method uses both the rule-based and the mining-based approaches: efficiency comes from the rule-based approach, and dynamically changing relationships are detected by the mining-based approach.

III. COGNITIVE CONTROL LOOPS
FOCALE [5] control loops are self-governing, in that the system senses changes in itself and its environment and determines the effect of the changes on the currently active set of business policies. As shown in Fig. 1, the FOCALE control loops [13] operate as follows. Sensor data is retrieved from the managed resource (e.g., a router) and fed to a model-based translation process, which translates vendor- and device-specific data into a normalized form in XML using the DEN-ng information model and ontologies as reference data. This is then analyzed to determine the current state of the managed entity. The current state is compared to the desired state.

Fig. 1 Simplified Version of the FOCALE Autonomic Architecture

To strengthen self-awareness, the new FOCALE cognition model employs a model of human intelligence built using simple processes, which interact according to three layers, called reactive, deliberative, and reflective [14, 15]. The new FOCALE cognition model employs cognitive processes as shown in Fig. 2.

Fig. 2 FOCALE Cognitive Model

Since all processes use the Finite State Machine (FSM) and reasoner, the system can recognize when an event or a set of events has been encountered before. Such results are stored in short-term memory. This reactive mechanism enables much of the computationally intensive portions of the control loop to be bypassed, producing two “shortcuts” labeled “high priority” and “urgent”. The deliberative process is embodied in the set of bold arrows, which take the Observe-Normalize-Compare-Plan-Decide-Act path. This uses long-term memory to store how goals are met on a context-specific basis. The reflective process examines the different conclusions made by the set of deliberative processes being used, and tries to predict the best set of actions that will maximize the goals being addressed by the system. This process uses semantic analysis to understand why a particular context was entered and why a context change occurred, to help predict how to change contexts more easily and efficiently in the future. These results are also stored in long-term memory, so that the system can better understand contextual changes and its reasoning, which aids debugging.

IV. AUTONOMIC FAULT MANAGEMENT
In this section, we describe how alarms are processed in a cognitive control loop. Cognitive control loops are able to adapt to a changing environment through reasoning and learning.
In addition, alarms are classified by their urgency so that important problems are solved more quickly.

2012 IEEE/IFIP 4th Workshop on Management of the Future Internet (ManFI)

A. Multiple Control Flows based on the Priorities of Alarms
The cognitive control loops process network events and alarms. At the same time, relationships between alarms are learned in order to adapt to changing environmental conditions. As shown in Fig. 3, multiple control loops are available depending on the priority of an alarm.

Fig. 3 Multiple Control Flows based on a Priority

These control flows are mapped onto the new FOCALE cognition model, as shown in Fig. 3. In the observe phase, data is retrieved from the managed resource (e.g., by SNMP polling or traps). Vendor-specific data is translated into a normalized form based on the DEN-ng information model. Network alarms are filtered and correlated in order to find root cause alarms efficiently. In this phase, a dependency model is used to correlate alarms. At the same time, the normalized data is fed to the learning phase. Changing environmental conditions are captured by learning; in particular, relationships between alarms are detected to update the dependency model.

After alarms are correlated in the normalize phase, the priority of each alarm is determined by classifying it. An alarm is classified as urgent if it causes serious performance degradation of network resources or services. Alarm priorities are determined based on a policy. If an alarm is urgent, a set of actions is sent to the network devices without passing through the plan and decide phases. This is the difference from the previous version of the FOCALE control loops. If the alarm has high priority, the plan phase is skipped so that action can be taken immediately. For a low priority alarm, the plan phase takes a high-level behavioral specification from humans and controls the system behavior in such a way as to satisfy the specification. That is, the plan phase computes all the possible sets of actions that change the current state to the desired state. The decide phase chooses the set of actions that maximizes a goal. Finally, the act phase sends commands for the chosen actions to the target network devices. Model-based translation converts device-neutral actions into device-specific commands.

Network alarms are correlated in the normalize phase of the new FOCALE control loops. First, alarm information is extracted into the form shown in Fig. 4. Typically, a single failure affects other services and devices. Therefore, if a single failure occurs somewhere in the network, many alarms related to the failure are generated. Moreover, once a fault occurs, many identical alarms are generated to report the fault until it is fixed. Those identical alarms are generalized as shown in Fig. 4, which reduces the number of alarms. Generalized alarms are then correlated to reduce the number of alarms further and to find root cause alarms.

Fig. 4 Example of Alarm Generalization

Fig. 5 An Example of a Dependency Model

Alarm correlation depends on a basic dependency model, to which association rules detected in the learning process are added. As mentioned above, the learning process learns relationships between alarms. Fig. 5 shows a basic dependency model, which is manually defined. It is based on the TCP/IP model: a lower layer problem affects the higher layers. For example, if a server link is down, the IP layer is also unavailable. At first, the manually defined dependency model is used. Additional rules learned from association rule mining are then added to the basic dependency model to adapt to changing environmental conditions.

Algorithm 1 describes how to build a set of clusters based on the association rules. Initially, alarms are generalized and grouped by alarm ID, so each distinct alarm is a cluster by itself in the initial phase. Then, all the association rules are examined one by one and the corresponding clusters are merged.
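The grouping and merging steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Java implementation: the names `cluster_alarms` and `root_candidates`, the dictionary layout of a cluster, and the choice to skip rules that mention unobserved alarms are all assumptions made for the example. The root-cause heuristic (an alarm that appears as a cause but never as an effect inside its cluster) is likewise only one plausible reading of how root causes are inferred from the rules.

```python
from collections import defaultdict


def cluster_alarms(alarms, rules):
    """Group raw alarms by ID, then merge groups linked by association rules.

    alarms: iterable of alarm IDs, possibly repeated (e.g. "A3", "A4", "A4").
    rules:  iterable of (cause_id, effect_id) pairs, read as a_j -> a_k.
    Returns a list of clusters; each cluster is a dict holding the member
    alarm IDs, their occurrence counts, and the rules merged into it.
    """
    # Generalization: identical alarms collapse into one entry with a count.
    counts = defaultdict(int)
    for alarm_id in alarms:
        counts[alarm_id] += 1

    # Initially every distinct alarm ID is a cluster by itself.
    cluster_of = {a: {"alarms": {a}, "rules": []} for a in counts}

    for cause, effect in rules:
        # Assumption of this sketch: ignore rules about unobserved alarms.
        if cause not in cluster_of or effect not in cluster_of:
            continue
        target, source = cluster_of[cause], cluster_of[effect]
        if source is not target:
            # Merge the effect's cluster into the cause's cluster.
            target["alarms"] |= source["alarms"]
            target["rules"] += source["rules"]
            for member in source["alarms"]:
                cluster_of[member] = target
        target["rules"].append((cause, effect))

    # Collect each distinct cluster once, in first-seen order.
    seen, clusters = set(), []
    for alarm_id in counts:
        cluster = cluster_of[alarm_id]
        if id(cluster) not in seen:
            seen.add(id(cluster))
            cluster["counts"] = {a: counts[a] for a in sorted(cluster["alarms"])}
            clusters.append(cluster)
    return clusters


def root_candidates(cluster):
    """Hypothetical heuristic: alarms that cause others but are caused by none."""
    causes = {cause for cause, _ in cluster["rules"]}
    effects = {effect for _, effect in cluster["rules"]}
    return sorted(causes - effects)
```

For instance, given the alarms A3, A4 (twice), and A6 together with the rules A3→A4 and A3→A6, the sketch returns a single cluster containing all three alarms, with A3 as its only root-cause candidate.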
After merging, each cluster contains both alarms and relationship information, so root cause alarms can be analyzed easily. Root causes can be inferred from the relationships between alarms belonging to the same cluster.

Algorithm 1. Alarm Clustering
Input:  a set E of alarms (a1, a2, …)
        a set R of association rules (r1, r2, …)
Output: a set C of clusters (c1, c2, …)
    C = group the set E by identical alarm ID
    n = count(R)
    for i = 1 to n
        rule ri is of the form aj → ak
        find cluster cl that includes alarm aj
        find cluster cm that includes alarm ak
        merge cl and cm into cl
        put the association rule ri into cl

B. Association Rule Mining
Various machine learning techniques can be used for inference. For efficient alarm correlation, it is extremely important to find relationships between alarms. In this paper, association rule mining is used to find cause and effect relationships between alarms.

Table 1. Alarm transaction data sets
TID   Transaction item sets
1     A1, A2, A4, A5
2     A1, A4, A5
3     A2, A3, A4, A5
4     A1, A2, A4
5     A1, A3, A5

The transaction database is built from the alarm data of the managed network after the pretreatment shown in Table 1. Each transaction in the database has a unique transaction ID and contains a subset of the items. A rule is defined as an implication of the form X → Y, where X, Y ⊆ I and X ∩ Y = ∅ (I is the set of all items). The Apriori association rule algorithm has two basic steps: the first finds all frequent item sets in a data set by applying min_sup (the minimum support threshold); the second generates association rules based on the frequent item sets. For an item set X, the support sup(X) is defined in Equation (1) as the proportion of the transactions in the data set that contain X. In Table 1, the support of {A1} is 4/5 (80%).
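The support figure quoted for {A1} can be reproduced directly from Table 1. The sketch below is a brute-force illustration, not the Apriori candidate pruning or the Weka implementation used in the paper; the function names are hypothetical, and values are returned as exact fractions rather than percentages. It also computes the confidence of a rule with the standard ratio sup(X ∪ Y) / sup(X).

```python
from fractions import Fraction
from itertools import combinations

# Transactions copied from Table 1 (one item set per TID).
TRANSACTIONS = [
    {"A1", "A2", "A4", "A5"},
    {"A1", "A4", "A5"},
    {"A2", "A3", "A4", "A5"},
    {"A1", "A2", "A4"},
    {"A1", "A3", "A5"},
]


def support(itemset, transactions=TRANSACTIONS):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))


def confidence(x, y, transactions=TRANSACTIONS):
    """Confidence of the rule X -> Y, i.e. sup(X | Y) / sup(X)."""
    return support(set(x) | set(y), transactions) / support(x, transactions)


def frequent_itemsets(size, min_sup, transactions=TRANSACTIONS):
    """Brute-force version of step one: item sets of `size` with support >= min_sup."""
    items = sorted(set().union(*transactions))
    return [c for c in combinations(items, size)
            if support(c, transactions) >= min_sup]
```

With the Table 1 data, `support({"A1"})` evaluates to 4/5 and `confidence({"A1"}, {"A4"})` to 3/4. Enumerating every combination is only practical for a handful of items; Apriori avoids this by extending only item sets that are already frequent.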
We assume that the default value of min_sup is 10%. The support of {A1} is greater than min_sup, so rules related to A1 should be found.

$$\mathrm{sup}(X) = \frac{\mathrm{count}(X)}{\text{total number of transactions}} \times 100\ (\%) \tag{1}$$

The confidence of the rule X → Y is defined in Equation (2). In Table 1, conf(A1 → A4) is sup(A1 ∪ A4) / sup(A1) = 75%. Frequent item sets and the minimum confidence constraint are used to form rules.

$$\mathrm{conf}(X \rightarrow Y) = \frac{\mathrm{sup}(X \cup Y)}{\mathrm{sup}(X)} \times 100\ (\%) \tag{2}$$

C. Determination of Alarm Priorities
One of the most important features of the cognitive control loops is that alarms are handled differently depending on their priorities. Urgent alarms can be processed faster than normal alarms. Classifying urgent alarms depends on the goals and policy of a network. We defined an ontology based on the DEN-ng information model to provide effective semantic representations of network elements, alarms, and their priorities. Fig. 6 describes the concepts of network elements and alarms used to determine their state and priorities. An element is a network resource that has its own state, such as CPU utilization or link throughput. An element provides services and reports its state to a network administrator through notifications. A notification can be an alarm or an event. An alarm has a destination, a source, and a type, as described in Fig. 4. Alarms are classified into three classes: urgent, high priority, and low priority. A service has three classes: gold, silver, and bronze. A gold service is the most important service.

Fig. 6 Ontological Model for Alarms and Priority

The three alarm classes are defined based on the policy of a managed network. For example, a network administrator can specify that an alarm is classified as urgent if it leads to a Service Level Agreement (SLA) violation of a gold service. We use the Semantic Web Rule Language (SWRL) [18] to express conditional rules in the ontology. We assume that alarms related to a gold service are urgent and that a gold service is provided by the server WS2 in Fig. 7, so alarms related to WS2 are classified as urgent. For example, “WS2 HTTP unavailable”, “WS2 IP down”, and “WS2 port down” are urgent alarms that need to be fixed as soon as possible. The following SWRL rules classify alarms based on this assumption. They mark an alarm as urgent if it concerns an element that provides a gold service.

• Alarm(?a) ∧ hasAlarmDest(?a, ?dest) ∧ Element(?dest) ∧ providesService(?dest, ?s) ∧ GoldService(?s) → UrgentAlarm(?a)
• Alarm(?a) ∧ hasAlarmSrc(?a, ?dest) ∧ Element(?dest) ∧ providesService(?dest, ?s) ∧ GoldService(?s) → UrgentAlarm(?a)

V. EVALUATION AND RESULT
In this section, we describe the evaluation of our proposed approach and its results. We implemented the alarm correlation algorithm in Java and used the Weka [16] library for association rule mining. We generated synthetic alarm data sets for the experiment, which correlates alarms and finds root causes. Fig. 7 shows the experimental topology, composed of 22 nodes with four critical alarms. “IP down”, “Link down”, “Port down”, and “Router down” occur randomly at designated nodes as shown in Fig. 8. The default route from R6 to R1 is through R3-R0-R1. If the link between R3 and R0 is down, N4 cannot connect to WS1 and FS2. We generated two synthetic alarm data sets, one for training and one for validation. For example, the link of router R0 is down from second 5 to second 15, and the port of WS1 is down from second 20 to second 30. If “WS1 port down” occurs, N1 and N8 generate the alarm “WS1 HTTP unavailable”. We assumed that N1 and N8 periodically poll the states of all the servers in the network.

Fig. 7 Experimental Topology

Fig. 8 Synthetically Generated Alarms from Each Device (bar chart: number of alarms per device, grouped into application, transport, network, and data link layers)

Based on the generated synthetic alarm data set, the cause and effect relationships are detected. Table 2 shows some of the alarms, their alarm IDs, and the detected association rules. A3→A4 and A3→A6 mean that the services on WS1 and FS2 become unavailable if the R0-R3 link goes down. Based on these rules, when the A3, A4, and A6 alarms are generated, A3 is identified as the root cause alarm. There are thousands of alarms in enterprise networks [17], so a large number of rules can be detected.

Table 2. Alarms and Association Rules Detected
Alarm ID   Alarm
A3         R0-R3 link down
A4         N4→WS1 HTTP unavailable
A5         N1→WS1 HTTP unavailable
A6         N4→FS2 FTP unavailable
A7         N4→WS2 HTTP unavailable
A8         N1→FS2 FTP unavailable
A9         FS2 IP down
Rules detected: A3→A4, A3→A6, A8→A6, A9→A6, A3→A7

Then, all the critical alarms described in Fig. 8 are generated simultaneously. The type and number of generated alarms are shown in Fig. 8. The node N1 generates “WS2 HTTP unavailable” and “FS2 FTP unavailable” alarms. However, within a given time window, N1 generates multiple alarms if the fault is not fixed during that window. Those alarms include redundant and similar alarms. For example, N1 generates five “WS2 HTTP unavailable” and five “FS2 FTP unavailable” alarms. Those identical alarms are generalized as explained in Section 4. Therefore, a higher-level manager receives a reduced number of alarms. In our experiment, the generalization process reduces 100 alarms to 22 alarms. However, alarms generated by nodes N6 and N8 are not received because of the failure of R4.

Fig. 9 shows the output of our clustering algorithm. Even if the total number of alarms is still large, we can easily find a root cause alarm in each cluster.
For example, cluster 1 consists of the “R0 link down”, “WS1 HTTP unavailable”, and “FS2 FTP unavailable” alarms. Based on the association rules in Table 2, we can infer that the root cause alarm is “R0 link down”. Cluster 4 consists of the “WS1 port down” and “WS1 HTTP unavailable” alarms. Therefore, network administrators can focus only on the root cause alarms. Fig. 9 shows the number of clusters and the alarms in each cluster. The root cause alarm of cluster 1 is “R0-R3 link down”, and the cluster includes the other alarms caused by it, such as “N4→WS1 HTTP unavailable”, “N4→WS2 HTTP unavailable”, and “N4→FS2 FTP unavailable”. The root cause alarm of cluster 2 is “R4 router down”; “R4 neighbor loss” and “R4 IP down” are related alarms. The root cause alarms of clusters 3 and 4 are “FS2 IP down” and “WS1 port down”, respectively. In total, 14 different alarms are reduced to four clusters.

Based on the ontological model shown in Fig. 6, we wrote a SWRL rule for determining the priority of an alarm. The alarms in the four clusters are examined and classified by the SWRL rule shown in Section 4 to determine their priorities. The alarm “N4→WS2 HTTP unavailable” is classified as urgent because WS2 provides a gold service, which has the highest priority among the service classes. This means that the problem is significant and should be solved immediately. Analysis of alarm cluster 1 shows that the root cause of the alarm is “R0-R3 link down”. Therefore, a set of commands for recovering the R0-R3 link is sent to R0 and R3 without passing through the plan and decide phases. The ontology and SWRL rules enable us to analyze the services affected by an alarm as well as the state of network devices. Based on this analysis, we can determine whether the alarm is urgent or not.
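The priority decision and the resulting choice of control-loop phases can be illustrated with a small Python sketch. It mirrors the gold-service rule from Section 4 and the shortcuts described there (urgent alarms skip plan and decide, high priority alarms skip plan), but everything else is assumed for the example: the `Alarm` class, the `SERVICE_CLASS` table, and in particular the mapping of silver services to high priority, which the paper leaves to the network policy. Only WS2 providing a gold service comes from the text.

```python
from dataclasses import dataclass

# Phases run for each priority, following the shortcuts described in Section 4.
PHASES_BY_PRIORITY = {
    "urgent": ["observe", "normalize", "compare", "act"],
    "high": ["observe", "normalize", "compare", "decide", "act"],
    "low": ["observe", "normalize", "compare", "plan", "decide", "act"],
}


@dataclass(frozen=True)
class Alarm:
    source: str       # element that reported the alarm
    destination: str  # element the alarm is about
    kind: str         # e.g. "HTTP unavailable"


# Hypothetical policy table: element name -> class of the service it provides.
# Only the WS2 entry is taken from the paper; the others are placeholders.
SERVICE_CLASS = {"WS2": "gold", "WS1": "silver", "FS2": "bronze"}


def classify(alarm, service_class=SERVICE_CLASS):
    """Return "urgent", "high", or "low" for an alarm under the given policy."""
    classes = {service_class.get(alarm.source),
               service_class.get(alarm.destination)}
    if "gold" in classes:
        # Same condition as the two SWRL rules: source or destination
        # element provides a gold service.
        return "urgent"
    if "silver" in classes:
        # Assumption of this sketch, not stated in the paper.
        return "high"
    return "low"


def phases_for(alarm, service_class=SERVICE_CLASS):
    """Control-loop phases the alarm passes through, given its priority."""
    return PHASES_BY_PRIORITY[classify(alarm, service_class)]
```

Under this policy table, the alarm from N4 about WS2 is classified as urgent and goes straight from compare to act, while an alarm about FS2 takes the full plan and decide path.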
Determining urgency in this way enables us to solve an urgent problem quickly. If the cognitive control loop is not used, alarms are processed one by one. Even an urgent alarm would be treated the same as normal alarms, so critical alarms could not be examined while others were being processed. Handling urgent alarms first is the strength of a cognitive control loop.

Fig. 9 Clustered Set of Alarms (bar chart: number of alarms in clusters 1 to 4, grouped into application, transport, network, and data link layers)

VI. CONCLUSIONS
In this paper, we have proposed an efficient fault management approach based on cognitive control loops. We have presented a case study that validates our concept using synthetically generated alarm data sets. At first, a manually defined dependency model is used. Missing and changing dependencies are then detected by the learning phase of the control loops. An ontology and SWRL rules are used to represent the relationships among network resources, services, and alarm priorities. The evaluation shows that our approach reduced the number of alarms, processed the alarms in different orders according to their priority, and found root causes easily by using association rules. Our future work is to evaluate the performance of the control loops. We will show that our approach can process urgent alarms more quickly than the existing control loops.

ACKNOWLEDGMENTS
This research was supported by the WCU (World Class University) program through the National Research Foundation of Korea funded by the Ministry of Education, Science and Technology (R31-2010-000-10100-0) and by the KCC (Korea Communications Commission), Korea, under the “Novel Study on Highly Manageable Network and Service Architecture for New Generation” support program supervised by the KCA (Korea Communications Agency) (KCA-2011-10921-05003).

REFERENCES
[1] A. Feldmann, “Internet clean-slate design: what and why?”, ACM SIGCOMM Computer Communication Review, Vol. 37, Issue 3, Jul. 2007, pp. 59-64.
[2] M. Blumenthal and D. Clark, “Rethinking the design of the Internet: The end to end arguments vs. the brave new world”, ACM Transactions on Internet Technology, Vol. 1, No. 1, Aug. 2001, pp. 70-109.
[3] V. Jacobson, D.K. Smetters, J.D. Thornton, M.F. Plass, N.H. Briggs, and R.L. Braynard, “Networking Named Content”, in CoNEXT '09, Rome, Italy, Dec. 2009.
[4] J. Strassner, “Autonomic Networking – Theory and Practice”, 20th Network Operations and Management Symposium (NOMS) 2008 Tutorial, Brazil, April 7, 2008.
[5] J. Strassner, N. Agoulmine, and E. Lehtihet, “FOCALE – A Novel Autonomic Networking Architecture”, ITSSA Journal, Vol. 3, No. 1, May 2007, pp. 64-79.
[6] IBM, “An Architectural Blueprint for Autonomic Computing, v7”, http://www-03.ibm.com/autonomic/pdfs/AC%20Blueprint%20White%20Paper%20V7.pdf.
[7] J. Strassner, “Introduction to DEN-ng”, Tutorial for FP7 PanLab II Project, 2009.
[8] M. Serrano, J. Serrat, J. Strassner, and M. Ó Foghlú, “Management and Context Integration Based on Ontologies, Behind the Interoperability in Autonomic Communications”, extended journal publication of the SIWN International Conference on Complex Open Distributed Systems, Chengdu, China, Vol. 1, No. 4, Jul. 2007.
[9] D. Banerjee, V. R. Madduri, and M. Srivatsa, “A Framework for Distributed Monitoring and Root Cause Analysis for Large IP Networks”, 28th IEEE International Symposium on Reliable Distributed Systems, September 2009, pp. 246-255.
[10] White Paper, “Automating Root-Cause Analysis: EMC Ionix Codebook Correlation Technology vs. Rules-based Analysis”, Nov. 2009.
[11] O. Jukic and M. Kunstic, “Logical Inventory Database Integration into Network Problems Frequency Detection Process”, ConTEL 2009, Jun. 2009, pp. 361-365.
[12] R. Vaarandi, “A Data Clustering Algorithm for Mining Patterns from Event Logs”, IP Operations and Management (IPOM 2003), Oct. 2003, pp. 119-126.
[13] J. Strassner, J.W.K. Hong, and S. van der Meer, “The Design of an Autonomic Element for Managing Emerging Networks and Services”, International Conference on Ultra Modern Telecommunications (ICUMT 2009), October 12-14, 2009, St. Petersburg, Russia.
[14] M. Minsky, “The Society of Mind”, Simon and Schuster, New York, ISBN 0671657135, 1988.
[15] J. Famaey, S. Latre, J. Strassner, and F. De Turck, “A Hierarchical Approach to Autonomic Network Management”, Network Operations and Management Symposium Workshops, 2010 IEEE/IFIP, Osaka, Japan, April 19, 2010.
[16] I. H. Witten and E. Frank, “Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations”, Morgan Kaufmann Publishers, ISBN 1-55860-552-5.
[17] X. Chen, Y. Mao, Z. M. Mao, and J. Van Der Merwe, “KnowOps: Towards an Embedded Knowledge Base for Network Management and Operations”, in Proceedings of the 11th USENIX Conference on Hot Topics in Management of Internet, Cloud, and Enterprise Networks and Services (Hot-ICE'11), Berkeley, CA, USA, pp. 7-7.
[18] “SWRL: A Semantic Web Rule Language Combining OWL and RuleML”, W3C Member Submission, 21 May 2004.