Fuzzy Cognitive Maps for the Simulation of Individual Adaptive Behaviors

C. Buche, P. Chevaillier, A. Nédélec, M. Parenthoën and J. Tisseau
UEB - ENIB - LISyC
CERV - 25 rue Claude Chappe, F-29280 Plouzané, FRANCE
[buche, chevaillier, nedelec, parenthoen, tisseau]@enib.fr

Abstract

This paper focuses on the simulation of behavior for autonomous entities in virtual environments. The behavior of these entities must determine their responses not only to external stimuli, but also with regard to internal states. We propose to describe such behavior using fuzzy cognitive maps (FCMs), in which these internal states can be explicitly represented. This article presents the use of fuzzy cognitive maps as a tool to specify and control the behavior of individual agents. First, we describe how fuzzy cognitive maps can be used to model behavior. We then present a learning algorithm allowing the adaptation of FCMs through observation.

Keywords: behavioral simulation, cognitive map, agent modeling, learning by imitation

1 Introduction

To date, the majority of work on virtual reality has been related to the multi-sensory immersion of human users within virtual worlds. These increasingly realistic universes will continue to lack credibility as long as they are not populated with autonomous entities, i.e. entities having the ability to perceive and act in the environment according to their own decision-making capacities. This overall autonomy takes three forms: 1) sensorimotor autonomy, where each entity is equipped with sensors and effectors enabling it to be informed about and act upon its environment; 2) autonomy of execution, where the execution controller of each entity is independent from the controllers of other entities; and 3) autonomy of decision, where each entity makes decisions in accordance with its own experience of the environment (past experience, intentions, emotional state, perceptions, etc.).

For simulation purposes, decision making by autonomous entities is defined according not only to external stimuli, but also to internal states. In this article, we show that such behaviors can be described using fuzzy cognitive maps (FCMs) in which these internal states are explicitly represented. The strengths of FCMs lie in the fact that they can be used to graphically represent specific behavior in the form of a semantic graph, and that their evaluation at run-time is fast enough to meet the requirements of real-time simulations, as is the case for connectionist architectures.

FCMs are the outcome of research by psychologists. In 1948, Tolman introduced the key concept of "cognitive maps" to describe complex topological memorizing behavior in rats [1]. In the seventies, Axelrod described "cognitive maps" as directed, interconnected, bilevel-valued graphs, and used them in decision theory applied to the political-economics field [2]. In 1986, Kosko extended the graphs of Axelrod to the fuzzy domain, and they thus became FCMs [3]. In 1994, Dickerson and Kosko proposed the use of FCMs for the overall modeling of virtual worlds [4]. Recently, FCMs have been successfully used to describe and model complex dynamic systems [5], both for medical diagnosis [6] and in decision-making [7]. In all these studies, FCMs have been used to control a global system. Here, we propose to decentralize FCMs onto each agent, in order to model the agents' decisions within a virtual world. This article proposes the use of FCMs as a tool to model the reactive and adaptive behavior of agents improvising in free interaction.
The paper is organized as follows. First, we highlight the fact that FCMs are particularly well adapted to specifying and controlling agents' decisions. We present the uses of FCMs in reactive behavior, illustrating these uses with a real-life example involving different types of agent: a shepherd, dogs and a herd of sheep. Second, we describe the ability provided for agents to adapt their representation of other actors' behavior using FCMs, which leads to their predictions becoming more significant. This means that we give an agent the ability to learn through imitation. The agent is then able to modify its own behavior to mimic a behavior observed either in another actor or in an avatar controlled by a human operator. By observing the imitated model, the agent must adapt its representation of the model's behavior. The mechanism used to control the imitated behavior model is independent of learning; thus, imitated models can be driven by any decision-making mechanism. We apply this algorithm to the example given above (a sheepdog herding sheep). The learning mechanism allows the dog to adapt an FCM prey prototype to a given sheep in real time.

2 Reactive behavior with Fuzzy Cognitive Maps

FCM presentation

FCMs are influence graphs (see Figure 1). Nodes are named by concepts forming the set of concepts C = {C_1, ..., C_n}. Edges (C_i, C_j) are oriented and represent the causal links between concepts (how concept C_i causes concept C_j). Edges are elements of the set A ⊂ C × C. The edges' weights are gathered in a link matrix L = (L_ij) ∈ M_n(R). If (C_i, C_j) ∉ A then L_ij = 0; otherwise, an excitation link (respectively an inhibition link) from concept C_i to concept C_j gives L_ij > 0 (respectively L_ij < 0). FCM concept activations take their values from an activation degree set V = {0, 1}, {−1, 0, 1}, or an interval. At moment t ∈ N, each concept C_i is associated with two types of activation: an internal activation degree a_i(t) ∈ V and an external forced activation value fa_i(t) ∈ R, with a(0) = 0, where 0 is the null vector of R^n. FCMs are dynamic systems. The dynamics obey a recurrence relationship involving the product of the link matrix with the internal activation vector, and fuzzy logical operators between this result and the external forced activation vector.

Formalization of the FCM dynamic

Until the end of this article, δ denotes one of the numbers 0 or 1, and V one of the sets {0, 1}, {−1, 0, 1} or [−δ, 1]. Given n ∈ N*, t_0 ∈ N, ρ ∈ R*_+, and a_0 ∈ R, a fuzzy cognitive map F is a sextuplet ⟨C, A, L, A, fa, R⟩ where:

1. C = {C_1, ..., C_n} is the set of n concepts forming the nodes of a graph.

2. A ⊂ C × C is the set of edges (C_i, C_j) directed from C_i to C_j.

3. L : C × C → R, (C_i, C_j) ↦ L_ij is the function associating a weight L_ij to a pair of concepts (C_i, C_j), with L_ij = 0 if (C_i, C_j) ∉ A, and L_ij equal to the weight of the edge directed from C_i to C_j if (C_i, C_j) ∈ A. L(C × C) = (L_ij) ∈ R^(n×n) is a matrix of M_n(R). It is the link matrix of the map F which, to simplify, we note L unless otherwise indicated.

4. A : C → V^N, C_i ↦ a_i is a function that maps each concept C_i to the sequence of its activation degrees, such that for t ∈ N, a_i(t) ∈ V is its activation degree at the moment t. We note a(t) = [(a_i(t))_{i∈[[1,n]]}]^T the vector of activations at the moment t.
5. fa ∈ (R^n)^N is a sequence of vectors of forced activations such that, for i ∈ [[1, n]] and t ≥ t_0, fa_i(t) is the forced activation of concept C_i at the moment t.

6. (R) is a recurrence relationship on t ≥ t_0 between a_i(t+1), a_i(t) and fa_i(t) for i ∈ [[1, n]], indicating the dynamics of the map F:

(R): ∀i ∈ [[1, n]], ∀t ≥ t_0,
    a_i(t_0) = 0,
    a_i(t+1) = σ[ g_i( fa_i(t), Σ_{j∈[[1,n]]} L_ji a_j(t) ) ]    (1)

where g_i : R² → R is an operator combining two variables, for example g_i(x, y) = min(x, y), or max(x, y), or α_i x + β_i y, ..., and where σ : R → V is a function from R to the set of activation degrees V, normalizing the activations as follows (see Figure 2):

(a) In continuous mode (called fuzzy mode), V = [−δ, 1] and σ is the sigmoid function σ_(δ,a_0,ρ) : a ↦ (1+δ)/(1 + e^(−ρ(a−a_0))) − δ, centered at (a_0, (1−δ)/2), with slope ρ(1+δ)/4 at a_0 and with limits at ±∞ of 1 and −δ respectively. The larger ρ is, the less linear the transformation will be. In practice, δ ∈ {0, 1}: 0 stands for a bivalued discrete logic or a fuzzy logic with values in [0, 1], while 1 corresponds to a trivalued discrete logic or a fuzzy logic with values in [−1, 1].

(b) In bimodal mode, V = {0, 1} and σ : a ↦ 0 if σ_(0,0.5,ρ)(a) ≤ 0.5, and 1 if σ_(0,0.5,ρ)(a) > 0.5.

(c) In ternary mode, V = {−1, 0, 1} and σ : a ↦ −1 if σ_(1,0,ρ)(a) ≤ −0.5, 0 if −0.5 < σ_(1,0,ρ)(a) ≤ 0.5, and 1 if σ_(1,0,ρ)(a) > 0.5.

The asymptotic behavior (t → +∞) of an FCM with a constant sequence of externally-forced activation vectors may be a fixed point, a limit cycle, or even a strange attractor if the map is sufficiently complex [3].
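To make the recurrence (1) concrete, the following minimal sketch (in Python; our own illustration, not code from the original work) implements one possible FCM update loop in fuzzy mode. It assumes the simple combination operator g_i(x, y) = x + y; the class and parameter names are illustrative choices, not part of the formalization above.

```python
import numpy as np

class FCM:
    """Minimal fuzzy cognitive map in continuous ("fuzzy") mode, V = [-delta, 1]."""

    def __init__(self, links, delta=0, a0=0.5, rho=5.0):
        self.L = np.asarray(links, dtype=float)  # link matrix: L[i, j] is the edge C_i -> C_j
        self.n = self.L.shape[0]
        self.delta, self.a0, self.rho = delta, a0, rho
        self.a = np.zeros(self.n)                # internal activations, a(t0) = 0

    def sigma(self, x):
        # sigmoid sigma_(delta, a0, rho), with limits 1 and -delta at +/- infinity
        return (1 + self.delta) / (1 + np.exp(-self.rho * (x - self.a0))) - self.delta

    def step(self, fa):
        # recurrence (1) with g_i(x, y) = x + y:
        # a_i(t+1) = sigma( fa_i(t) + sum_j L_ji * a_j(t) )
        self.a = self.sigma(np.asarray(fa, dtype=float) + self.L.T @ self.a)
        return self.a

# usage: two concepts, C_0 exciting C_1; force the activation of C_0 only
fcm = FCM([[0, +1],
           [0,  0]])
for _ in range(5):
    print(fcm.step(fa=[0.9, 0.0]))  # activations converge towards a fixed point
```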
FCM for which behavior?

It is difficult to describe the entire behavior of a complex system with a precise mathematical model [8]. It is therefore more useful to represent it graphically, showing the causal relationships between the concepts involved. FCMs can thereby avoid many of the knowledge-extraction problems which are usually present in rule-based systems [9]. FCMs are capable of forward chaining only, i.e. they can answer the question "What would happen if...?", but not "Why...?", due to the non-linear nature of FCM dynamics. FCMs help predict the system's evolution (behavioral simulation) and can be equipped with Hebbian learning capabilities, as proposed by Dickerson and Kosko [4]. The fundamental difference between an FCM and a neural network (NN) is that, in an FCM, all nodes have strong semantics defined by the concept's model, whereas internal nodes in NN graphs have weak semantics which are only defined by mathematical relationships. As regards learning capacity, during the training phase activation vectors must be given for all the concepts of an FCM, whereas for NNs, activation vectors are only required for the peripheral neurons.

Constructing FCM

FCMs use symbolic representations which are able to incorporate experts' knowledge [10, 11, 12]. Human experts have drawn up FCMs for planning or making decisions in the field of international relations and political developments [13], to model intruder detection systems [14] and to demonstrate the impact of drug addiction [15]. First of all, the experts determine the concepts that best describe the system. They know which factors are crucial for modeling the system (characteristic, state, variable, etc.) and provide a concept for each one. Next, they identify those elements of the system which influence other elements and, for the corresponding concepts, determine the positive or negative effect of one concept on the others. Finally, not all causes affect a factor with the same intensity: some have a greater effect, and others a lesser one. The experts must therefore assign effect intensities. In order to simplify matters, they might separate the relationships between factors into groups, for example high intensity (L_ij = 6), moderate intensity (L_ij = 3), and low intensity (L_ij = 1).

Within the framework of studies to design an autopilot for racing yachts, we proposed to model the cognitive activity of action selection using FCMs [16]. In such FCMs, concepts are action strategies or affordances. Originally introduced by Gibson in ecological psychology, an affordance is defined as an "action possibility" latent in the environment, objectively measurable and independent of the individual's ability to recognize it, but always in relation to the actor [17]. In collaboration with an ergonomist, an expert lists a set of required affordances related to the activity to be modeled. The affordance approach needs a model which explains how an individual selects one affordance out of several; this is where we use FCMs. Rather than considering activation levels or the internal inhibition of action graphs based on releasers [18], we again worked from the notions of attractors and repulsors in external environments, as suggested by research on affordance selection [19, 20, 21]. From this point of view, an affordance is not necessarily an actual invariant of the environment; rather, it is a hypothesis made by the agent based on its immediate environment, which is associated with an action strategy. One part of the expert's knowledge is translated into inhibition/excitation relations between affordances: for instance, obstacles could inhibit pathways, gateways and landmarks, while gateways inhibit each other. This gives the link matrix (L_ij). The other part of the expert's knowledge concerns the affordance perception value. As proposed in experimental psychology [22], for each affordance concept the expert proposes a formula for this perception value, which we use as the external activation fa_i of the affordance concept. The FCM dynamics then unfold and the activations a_i converge towards their attractor. The selected affordance i is the one with the greatest a_i(t) while a(t) follows the path of the attractor (most often a fixed point or a limit cycle). Such a virtual agent acts according to the expert description and thereby increases its credibility [23].

We have also used FCMs to model emotional states. In collaboration with psychologists, we have described the Fuzzy Emotional Map (FEM) model, first presented in [24]. Each emotion is modeled as an FCM. In this model we defined sensitive concepts (emotion input and state of mind), one output concept to determine an emotional intensity, and four internal concepts to represent perception, feeling, sensitivity and the construction of the emotional output. The only features that require modification between different types of FEM are the influence weights between the concepts of the map. Each weight is defined by a particular personality trait (e.g. anxiety or relaxation), and is used according to a specific kind of emotion.

FCM for modeling reactive agents' decision making

Principle

FCMs can specify and control the behavior of autonomous entities (agents). Agents have sensors and effectors, and make independent decisions with respect to their behavior. The FCMs working in relation with these agents have perceptive, motor and internal concepts.
The agents' decision-making is replaced by the FCM dynamics in this way:

• the agents' sensors define the FCM perceptive concept activations through fuzzification (fuzzification consists in converting external values into FCM concept activations; it is a function from R^n to V);

• the defuzzification of the FCM motor concept activations drives the agents' effectors (defuzzification consists in converting FCM concept activations into external values; it is a function from V to R^n).

Fuzzification and defuzzification are obtained using the principles of fuzzy logic [25, 26], where a specific concept is represented by a fuzzy subset, and its degree of activation represents the degree to which it belongs to this subset [27] (calculated using the membership function of the fuzzy set).

As an example, we aim to model an agent capable of perceiving its distance from an enemy. The agent decides whether or not to escape, depending on this perceived distance. The closer the enemy is to the agent, the more frightened the agent will be, and vice versa. The more frightened the agent, the more quickly it will try to flee. We model this escape behavior using the FCM in Figure 3a. This FCM has 4 concepts and 3 links: "enemy close", "enemy far", "fear" and "escape", with stimulating links (+1) from "enemy close" to "fear" and from "fear" to "escape", and an inhibiting link (−1) from "enemy far" to "fear". We chose fuzzy mode (V = [0, 1], δ = 0, ρ = 5, a_0 = 0.5), not forced (fa = 0). The sensitive concepts "enemy close" and "enemy far" are activated by the fuzzification of the sensor for the distance to the enemy (Figure 3c), while the defuzzification of "escape" gives the agent an escape speed (Figure 3d).

Sensation must be distinguished from perception, in that sensation results from the sensors alone, whereas perception is sensation influenced by an internal state. FCMs make it possible to model perception, thanks to links between central concepts and sensitive concepts. For example, let us add 3 links to the previous escape FCM (Figure 3b). An initial self-stimulation of "fear" (a link of weight γ ≥ 0 from "fear" to "fear") simulates the effect of "stress": the more afraid the agent is, the more afraid it will feel. A second stimulating link (λ ≥ 0) goes from "fear" to "enemy close", while a final inhibiting link (−λ ≤ 0) from "fear" to "enemy far" simulates the phenomenon of being "fearful", i.e. when the agent is afraid, it tends to perceive its enemy as being closer than it actually is. The agent thus becomes perceptive according to its degree of fearfulness λ and stress γ (see Figure 4).
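As an illustration, the perceptive escape behavior of Figure 3b can be assembled from the FCM class sketched earlier. The snippet below is our own hedged example, not code from the paper: the fuzzification thresholds (roughly matching the 20 and 80 distance marks of Figure 3c) and the use of the forced activations fa to carry the fuzzified sensor values are assumptions made for the sketch.

```python
import numpy as np

# concept indices: 0 = enemy close, 1 = enemy far, 2 = fear, 3 = escape
lam, gamma = 0.5, 0.6  # fearfulness and stress (values used for the sheep of Figure 9)
L = np.array([
    # close  far    fear    escape
    [0,      0,     +1,     0],    # "enemy close" excites "fear"
    [0,      0,     -1,     0],    # "enemy far" inhibits "fear"
    [+lam,   -lam,  +gamma, +1],   # "fear" feeds back on the sensations, itself, and "escape"
    [0,      0,     0,      0],
])
fcm = FCM(L, delta=0, a0=0.5, rho=5.0)  # FCM class from the previous sketch

def fuzzify_distance(d, close=20.0, far=80.0):
    """Hypothetical piecewise-linear memberships for 'enemy close' / 'enemy far'."""
    close_deg = float(np.clip((far - d) / (far - close), 0.0, 1.0))
    return close_deg, 1.0 - close_deg

for d in (90.0, 40.0, 10.0):             # the enemy approaches
    c, f = fuzzify_distance(d)
    a = fcm.step(fa=[c, f, 0.0, 0.0])    # sensor input injected as forced activations
    speed = a[3]                         # linear defuzzification of "escape" into a speed
    print(f"distance={d:5.1f}  escape speed={speed:.2f}")
```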
Application

This section illustrates the use of FCMs to simulate the behavior of believable agents. In this example, FCMs characterize believable agent roles in an interactive fiction: a story taking place in a mountain pasture. "Once upon a time there was a shepherd, his dog and his herd of sheep..." This example has already been used as a metaphor for complex collective behaviors within a group of mobile robots (RoboShepherd [28]), as an example of a watchdog robot for real geese (Sheepdog Robot [29]), and as an example of improvisation scenes (Black Sheep [30]).

The shepherd moves around in the pasture and can talk to his dog and give it information. He wants to round up his sheep in a given area. In this simulation, the shepherd is an avatar driven by a human actor who makes all his decisions; thus, no FCM is attached to this actor. By default, he remains seated.

Each sheep can distinguish an enemy (a dog or a human) from another sheep and from edible grass. It can evaluate the distance and the relative direction (left or right) of an agent in its field of vision. It is able to identify the closest enemy. It can turn left or right and run without exceeding a certain maximum speed. It has an energy reserve that it regenerates by eating and exhausts by running. By default, it moves in a straight line and ends up wearing itself out. We want the sheep to eat grass (random response), to be afraid of dogs and humans when they are too close, and, in keeping with their gregarious nature, to stay close to other sheep. We therefore chose a main FCM containing sensory concepts (enemy close, enemy far, high energy, low energy), motor concepts (eat, socialize, flee, run) and internal concepts (satisfaction, fear). This FCM calculates the moving speed through defuzzification of the concept "run", and the direction of movement by defuzzification of the three concepts "eat", "socialize" and "flee". Each activation corresponds to a weighting on the relative direction to be followed: to go towards the grass, to join another sheep, or to flee from an enemy, respectively.

The dog is able to identify humans, sheep, the specific herding area within the pasture and the guard point. It distinguishes its shepherd from other humans and knows how to spot the sheep that is farthest away from the area among a group of sheep. It knows how to turn to the left and to the right and run up to a maximum speed. Its default behavior consists in running after the sheep, which quickly scatters them (see Figure 5a). First, the shepherd wants the dog to obey the order "stay", which will lead the sheep to socialize. This is done by giving the dog a sensory FCM for the shepherd's message, which inhibits the dog's desire to run (see Figure 5b). The dog's behavior is driven by the FCM, and the dog keeps still when asked to do so (message "stop"). The dog has an FCM based on the concepts associated with the herding area, for example whether a sheep is inside or outside the area. These concepts also make it possible for the dog to bring a sheep back (Figure 6c-e) and keep the herd in the area by staying at the guard point, in other words, on the perimeter of the herding area and opposite the shepherd. It is remarkable that the virtual sheepdog's path in this simulation forms an S shape (Figure 6c), a strategy that can be observed in real sheepdogs rounding up sheep.

3 Adaptive behavior with Fuzzy Cognitive Maps

To obtain believable behavioral simulations, agents of the same type must have slightly different behaviors, and these individual behaviors must evolve over time. The actual behavior of a given agent is the result of its adaptation to the situations it has encountered. As each agent has its own past, evolution induces individual variability amongst agents' behaviors. When interacting with other agents, an agent has to adapt its behavior according to the way the behavior of its protagonists develops. For example, in a prey-predator interaction, the co-evolution of the two species has been observed in many cases and is known as the "Red Queen Effect" [31].

Imitation from prototypic behavior

The idea is to provide the agent with the ability to adapt its representation of other actors' behavior. This learning is done by comparing the simulation model with the observation of reality [28].
We propose learning based on imitation [32, 33] of observed behaviors to modify predefined prototypes. Four main types of approach to learning by imitation can be distinguished: logical, connectionist, probabilistic and prototypical approaches.

1. Logical approach. Learning consists in generating a set of rules based on logic [34, 35], where the sensorimotor observation describes the example (see XSTL logic [36]). This approach is difficult to adapt to our FCM-based perceptual behavior modeling, as it would require the weighting of the edges to be linked with such sensorimotor rules.

2. Connectionist approach. Actions are correlated with sensations by an adaptive neural network, possibly inhibited by a mechanism of exception recognition (see the PerAc architecture [37]); compared with FCMs, modeling with a neural network provides a statistically satisfactory, but not semantically explicit, behavior.

3. Probabilistic approach. Many internal variables are used [38], but they do not model emotions and, as they do not reflect our concept of perception, these internal variables are not modified through sensory feedback effects.

4. Prototypical approach. Learning from prototypes creates an animation by finding primitives generating the imitated movement [39, 40].

An FCM is an explanatory model suited to behavior specification. Thus, an expert is able to develop a library of prototypic behaviors. This library represents the agent's behavioral culture [40]. For example, an animal's library is made up of the prototypic behaviors of both predator and prey. Our agents therefore have a library of prototypical behaviors but, unlike Mataric or Voyles, our primitives lie at the level of movement decisions, not within the movements themselves.

Principle

We consider that an agent has sensors allowing it to perceive its environment, as well as effectors it uses to perform actions. Any given agent also has a library of prototypic behaviors specified by FCMs. In parallel to the virtual world, an agent also has an imaginary world, in which it can simulate its own behavior as well as the behavior of other actors. This imaginary world corresponds to an approximate representation of the environment from the agent's point of view, along with the representation of other actors' behavior. Agents use prototypic behaviors in order to simulate other actors' behavior. They imagine their own behavior by simulating their own decisional mechanisms, and imagine the behavior of the other actors using prototypic FCMs. They can use their imaginary worlds to choose one strategy amongst several possibilities, not through logical reasoning but rather by behavioral simulation. Thus, they are able to predict evolutions within the environment.

Learning

In this section, we present a method for adapting prototypic behavior through imitation in real time. Agents observe their environment (i.e. other agents), which allows them to simulate the behavior of other entities in their imaginary worlds with prototypic FCMs. The idea is to provide a more relevant simulation by adapting the prototypic FCMs through imitation. The modification of prototypic FCMs reduces the difference between predictions in the imaginary world and reality [28]. We assume that agents have sensors to deduce the information needed by the prototypic FCMs. This means estimating the model's sensor and effector values, fuzzifying the estimated sensor values, and comparing the result of the defuzzified motor concept activations with the model's effector values.
The learning mechanism consists in retrieving the simulation results in the imaginary world, comparing them to what happened in the virtual world, and thereby adapting the prototypic FCM. To remain consistent with the knowledge-based modeling of behaviors, which leads the designer to make both the concepts and the links of the FCM explicit, learning solely consists in adapting the weights of the causal links between the concepts of the prototypic FCM. The learning algorithm therefore modifies neither the structure of the FCM's influence graph, nor the fuzzification of the sensors, nor the defuzzification of the motor concepts. The modification of the causal connections between concepts can be controlled by the expert. In particular, the expert can verify the FCM's structure, impose signs for some links and define interval values for some links. This is what we call "meta-knowledge about learning".

Why not modify the FCM structure? FCMs have the ability to visually represent behavioral expertise by means of a semantic graph. The concepts, the causal links between them, and these links' signs are assigned semantic descriptions. In our case, learning does not alter the structure of the influence graph, so that the behavioral coherence as seen by a human observer is preserved [41, 42, 43]. Nor does it alter the fuzzification of the sensors or the defuzzification of the motor concepts, which remain unique for each agent.

Algorithm

Kosko [44] proposed two different Hebbian learning methods [45]: one based on the correlations between activations [26], and the other on the correlation of their variations (differential Hebbian learning) [4]. Differential learning modifies only the links associated with correlated variations in the concepts' activations, while non-differential correlation learning runs the risk of inappropriately modifying all the links. Kosko's differential learning is based on the knowledge of a limit cycle which takes all concepts into consideration, and which is provided by an expert. However, we cannot use such a limit cycle because only estimated model sensors and effectors can be observed, and the FCM which generated them is unavailable. In addition, Kosko's differential learning makes the assumption that external activations are constant; however, the virtual world is a dynamic system and external activations evolve over time. It is therefore necessary to adapt Kosko's differential Hebbian learning to simulate realistic behaviors in a dynamic virtual environment.

The adaptation algorithm that we propose is a four-stage iterative cycle (see Figure 7, and the sketch after this list):

1. Model estimation. The agent estimates model-sensors and model-effectors through observation. We make the assumption that these features are available.

2. Simulating prototypic behavior. Sensors are fuzzified into external perceptive concept activations. The FCM's dynamics are calculated, and image-effectors are obtained by defuzzifying the inner motor-concept activations.

3. Calculating reconsiderations. Image-effectors and model-effectors are compared, generating a set of desired pseudo-activations. These pseudo-activations are obtained by going back up the graph from motor concepts towards perceptive concepts, without modifying links and by using meta-knowledge about learning.

4. Updating causal links. FCM causal links are updated by applying discrete differential Hebbian learning to the sequence corresponding to the transition from the FCM activations to the desired pseudo-activations.
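The skeleton below (our own hedged sketch; every callable passed in is a hypothetical placeholder) shows how the four stages could be chained into a real-time loop around the FCM class introduced earlier.

```python
def imitation_cycle(prototype_fcm, observe_model, fuzzify, defuzzify,
                    reconsider, hebbian_update):
    """One iteration of the four-stage adaptation cycle (all callables hypothetical)."""
    # 1. Model estimation: observe the actor-model's sensors and effectors.
    model_sensors, model_effectors = observe_model()

    # 2. Simulating prototypic behavior: fuzzify the sensors, run the FCM
    #    dynamics, then defuzzify the motor concepts into image-effectors.
    fa = fuzzify(model_sensors)
    activations = prototype_fcm.step(fa)
    image_effectors = defuzzify(activations)

    # 3. Calculating reconsiderations: compare effectors and walk back up the
    #    influence graph to propose desired pseudo-activations, applying the
    #    expert's meta-knowledge about learning.
    pseudo_activations = reconsider(prototype_fcm, activations,
                                    image_effectors, model_effectors)

    # 4. Updating causal links: discrete differential Hebbian learning from the
    #    current activations towards the desired pseudo-activations.
    hebbian_update(prototype_fcm, activations, pseudo_activations)
```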
In the following sections we shall examine these four stages in more detail.

Observation

During the first stage, the agent measures the features of the actor-model which are required to estimate the model-sensors and model-effectors. For example, the agent "dog" estimates the distance between the sheep and a predator by the difference in their positions at the instant t, and it estimates the sheep's speed by the difference in the sheep's positions at the instants t and t − 1.

Prediction

The second stage simply corresponds to the classic use of FCMs for controlling a virtual actor. It determines the image-actor's FCM activations at t + δt ≈ t in the imaginary world, according to the model-sensor estimation and the FCM dynamics with N iterations:

a(t + (I/N)δt) = S[ G( f(t), L^T · a(t + ((I−1)/N)δt) ) ],  for I = 1, ..., N; δt ≪ 1    (2)

N equals the length of the longest acyclic path plus the length of the longest cycle in the influence graph, in order to make sure that sensor information is spread to all the nodes. Here n is the number of FCM concepts, f = (f_i)_{i∈[[1,n]]} the external activations coming from sensor fuzzification, a = (a_i)_{i∈[[1,n]]} the internal activations, L = (L_ij)_{(i,j)∈[[1,n]]²} the link matrix, G : (R²)^n → R^n a combination operator, and S a standardization function transforming each coordinate by the sigmoid function σ_(δ,a_0,ρ)(x) = (1+δ)/(1 + e^(−ρ(x−a_0))) − δ, with parameters (δ, ρ, a_0) ∈ {0, 1} × R*_+ × R. The defuzzification of the FCM motor concepts at t + δt ≈ t provides the image-effectors. To simplify, we note a the resulting inner activations a(t + δt) in the following paragraphs.
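A minimal sketch of this prediction stage, assuming the FCM class from the earlier snippet and a precomputed iteration count N (the names are ours):

```python
def predict(prototype_fcm, f, n_iterations):
    """Stage 2: spread the fuzzified sensor values f through the map, as in (2).

    n_iterations should be at least the length of the longest acyclic path plus
    the length of the longest cycle of the influence graph, so that the sensor
    information reaches every node before the motor concepts are defuzzified.
    """
    for _ in range(n_iterations):
        activations = prototype_fcm.step(f)  # a <- S(G(f, L^T a))
    return activations
```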
Reconsideration

The third stage recursively generates sets of pseudo-activations (P_i)_{i∈[[1,n]]} representing the desired orientation for the FCM dynamics. This is done by moving back up the influence graph from motor concepts towards perceptive concepts, proposing pseudo-activation values according to meta-knowledge about learning, and bringing image-effectors closer to the estimates of model-effectors. We did not use the gradient backpropagation method [46] because FCMs are cyclical processes and their topology is not organized in layers (they contain recurrent links). Furthermore, the gradient backpropagation method does not preserve graph semantics, and we wanted to be able to apply specific meta-knowledge to specific nodes. We shall now look more closely at the recursive process.

Initialisation (m = 0). We enter the FCM from the effectors. A set I_0 represents the indices of the concepts defuzzified onto image-effectors. For each i ∈ I_0, we apply learning meta-knowledge: two potential pseudo-activations p_i^± = σ(a_0 ± 2α_i/ρ) simulate an active/inactive concept C_i, with α_i ≥ 1 representing the radicality of the choice. Including the value a_i, there are three possible pseudo-activations p_i = a_i, p_i^+ or p_i^− for each C_i. The 3^Card(I_0) combinations are defuzzified and compared to the model-effector estimates. The best combination (p_i^{0,{}})_{i∈I_0} is retained (the 0 refers to the defuzzification and {} is a set of future labels): ∀i ∈ I_0, P_i = {p_i^{0,{}}}. The other pseudo-activation sets (P_i)_{i∈[[1,n]]\I_0} are empty. We propose this discrete reconsideration rather than a gradient-scaled calculation [47, 48]; such a discrete choice facilitates the agent's decision-making process.

Progression from m to m + 1. Let I_m ⊂ [[1, n]] be the index set of concepts whose desired pseudo-activation set is not empty. For i ∈ I_m, note a_i (respectively f_i) the internal (respectively external) activation of concept C_i, P_i = {p_i^{k_1,{...}}, ..., p_i^{k_L,{...}}} its desired pseudo-activation set, of cardinal L, and J ⊂ [[1, n]] the index set of the concepts which are causes of the concept C_i (i.e. L_ji ≠ 0) and such that the edge from C_j to C_i has not yet been studied: ∀λ ∈ [[1, L]], j ≠ k_λ. We calculate the pseudo-activations P_j for j ∈ J as follows:

• For each j ∈ J, we apply learning meta-knowledge: two potential pseudo-activations, p_j^+ and p_j^−, are calculated using formula (3) so that their influence on a_i causes a clear choice between an active C_i and an inactive C_i. This accounts for the external activations, with α_j ≥ 1 representing the radicality of the choice:

p_j^± = ( a_0 ± 2α_j/ρ − f_i − Σ_{l≠j} L_li a_l ) / L_ji    (3)

Note that a_0 and ρ are the FCM sigmoid function parameters (see Figure 2); α is a parameter of the learning algorithm.

• Then we randomly select a λ ∈ [[1, L]]. This gives p_i^{k_λ,{...}} ∈ P_i, and we choose, among the 3^Card(J) possible combinations p_j = a_j, p_j^+ or p_j^− for j ∈ J, the one which gives a C_i activation σ[G_i(f_i, Σ_j L_ji p_j)] nearest to p_i^{k_λ,{...}}; this combination is noted (p_j^{i,{...,k_λ}})_{j∈J}.

• Thus we obtain a new set of concept indices whose pseudo-activation set is not empty: I_{m+1} = I_m ∪ J, with P_j = P_j ∪ {p_j^{i,{...,k_λ}}} for j ∈ J.

Termination. If for each i ∈ I_m the corresponding set J is empty, then every edge belonging to the paths arriving at (C_i)_{i∈I_0} has been studied.

We use a discrete method proposing three pseudo-activations. The discrete method chosen allows us, on the one hand, to limit the calculations and, on the other, to represent a radical choice. Our argument is that, in order to learn, the proposed modifications need to correspond to radical choices rather than minor alterations.
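To illustrate formula (3), the small helper below (a hypothetical sketch consistent with the definitions above, and a possible piece of the `reconsider` step of the earlier skeleton) computes the two candidate pseudo-activations p_j^± for a cause C_j of a concept C_i:

```python
def candidate_pseudo_activations(L, a, f_i, i, j, a0=0.5, rho=5.0, alpha=1.0):
    """Formula (3): candidate pseudo-activations p_j^+ and p_j^- for cause C_j of C_i.

    They are chosen so that the influence of C_j pushes C_i clearly towards an
    active (p_j^+) or inactive (p_j^-) state; alpha >= 1 sets the radicality of
    the choice, while a0 and rho are the FCM sigmoid parameters.
    """
    # contribution of all the other causes of C_i, at the current activations
    others = sum(L[l][i] * a[l] for l in range(len(a)) if l != j)
    p_plus = (a0 + 2 * alpha / rho - f_i - others) / L[j][i]
    p_minus = (a0 - 2 * alpha / rho - f_i - others) / L[j][i]
    return p_plus, p_minus
```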
Update

The fourth and final stage modifies the matrix of the FCM's links in such a way that its dynamics are directed towards a resulting behavior similar to that of the actor-model. We use discrete Hebbian learning to pass from the internal activations a to the reconsiderations p calculated during the previous stage. Unlike Kosko, who used a limit cycle and a learning rate decreasing towards zero over time to ensure convergence (see [4], page 186), we only make the map learn the passage from the internal activations a to the reconsiderations p, in order to modify the links without creating cycles, while maintaining a constant learning rate r(t) = R. A cycle would teach not only the passage from a to p, but also from p to a, which is inappropriate. Our learning rate is constant so that the agent conserves its adaptive nature. Theoretically, there is nothing to prevent the learning rate from being modified over time; this could be achieved by making it follow a sequence decreasing towards zero whose associated series diverges, as, for example, r(t > t_0) = R/(t − t_0). This would ensure that the weights of the FCM's edges converge, but adaptability would decrease over time.

Formally, noting A ⊂ [[1, n]]² the edge set of the FCM, β ∈ ]0, 1 + δ[ a sensitivity level and s : R → {−1, 0, 1} the discrete function with s(x) = −1, 0 or 1 according to whether x ≤ −β, −β < x < β or x ≥ β respectively, the learning algorithm corresponds to the following equations:

∀(i, j) ∈ A, if ∃k ∈ [[0, n]] such that p_j^{k,{...,i,...}} ∈ P_j, then with this k:
    Δ_i = s(p_i^{j,{...}} − a_i),  Δ_j = s(p_j^{k,{...,i,...}} − a_j),
    L_ij(t+1) = L_ij(t) + R(Δ_i Δ_j − L_ij(t))   if Δ_i ≠ 0,
    L_ij(t+1) = L_ij(t)                          if Δ_i = 0;    (4)
otherwise (i.e. if the edge (C_i, C_j) does not belong to a path leading to the effectors): L_ij(t+1) = L_ij(t).

It must be noted that we preserve coherence in the modification of the links as specified in the initial prototype provided by the expert: link emergence, link suppression, and modification of a link's sign are therefore forbidden. Furthermore, some links can be kept within bounds B_ij = [L_ij^min, L_ij^max] so that the modified behavior remains consistent with the expert's initial description: if L_ij(t+1) < L_ij^min then L_ij(t+1) = L_ij^min, and if L_ij(t+1) > L_ij^max then L_ij(t+1) = L_ij^max.
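The update rule (4) with the expert's constraints could look like the following sketch, which could fill the `hebbian_update` slot of the earlier skeleton. It is our own illustration: the `pseudo` and `bounds` structures are hypothetical simplifications (one desired pseudo-activation per concept, rather than the full labeled sets P_i), and the sign-preservation test stands in for the expert's meta-knowledge.

```python
import numpy as np

def hebbian_update(L, a, pseudo, R=0.1, beta=0.5, bounds=None):
    """Discrete differential Hebbian update of the link matrix L, as in (4).

    pseudo[i] holds the desired pseudo-activation of C_i (None if P_i is empty);
    bounds[i][j] is an optional (L_min, L_max) interval imposed by the expert.
    """
    s = lambda x: -1 if x <= -beta else (1 if x >= beta else 0)  # threshold s
    n = L.shape[0]
    for i in range(n):
        for j in range(n):
            if L[i, j] == 0 or pseudo[i] is None or pseudo[j] is None:
                continue  # no edge, or edge not on a path to the effectors
            d_i, d_j = s(pseudo[i] - a[i]), s(pseudo[j] - a[j])
            if d_i == 0:
                continue  # only correlated variations modify the link
            new = L[i, j] + R * (d_i * d_j - L[i, j])
            if new * L[i, j] <= 0:
                continue  # forbid sign changes and link suppression
            if bounds is not None and bounds[i][j] is not None:
                lo, hi = bounds[i][j]
                new = min(max(new, lo), hi)  # keep within the expert's interval
            L[i, j] = new
```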
Complexity

The complexity of this algorithm is a polynomial function of the number n of FCM concepts, and is in fact O(n). For an expert, a concept's causes are always very limited in number (seldom more than seven), so the number of edges arriving at each concept is bounded by a constant M (M ≈ 7) and Card(J) ≤ M. 3^Card(J) is thus bounded in practice, whatever the number of concepts in the FCM. The same applies to the calculation of the FCM dynamics, which has a complexity of O(n) although it might seem to be O(n²), owing to the great number of zeros in the link matrix: the number of non-null links in a column is no more than M, whatever n might be. This algorithm can thus be implemented for use in real time.

Application

Based on the sheepdog environment described above, we implemented three applications. First, the dog learns a way of rounding up sheep by imitating a human operator or another dog; in this case, the prototypic FCM used is the dog's own FCM. Second, the dog's prey prototype adapts to a given sheep in real time; this application is described in this section. Third, fearful sheep learn to let themselves be surrounded by other sheep: the sheep remain frightened but no longer flee when they come upon a dog. Freezing the links related to fear means that the sheep's behavior can be adapted whilst at the same time preserving a fearful "personality".

To simulate sheep behavior, the dog uses prototypic FCMs of prey from its behavioral library. The dog represents each sheep's behavior through a prototypic "prey" FCM in its imaginary world, with each sheep being associated with its own prototype. The dog can therefore simulate the sheep's behavior in order to make predictions and to test different strategies. The prototypes adapt to each sheep through imitation. One FCM controls the prototype's speed and another controls its direction. Comparisons between the results in the imaginary and virtual worlds are used to adapt the prototypic FCMs in real time through learning. Figure 8 illustrates the modification through imitation of a prototype's speed and the representation of one sheep's speed using the imaginary world. We chose this set of learning periods to ensure that the process would converge.

In order to imitate, the dog first observes the sheep. It adapts the prototypic prey behavior allowing it to simulate the sheep's behavior in its imaginary world. By observing the sheep, it estimates the information necessary for the fuzzification of the prototype (phase 1: observation). The estimated sensor values are fuzzified, activating the concepts "enemy close" and "enemy far". The prototype dynamics occur and, by defuzzifying the activation of the motor concept "escape envy", we obtain the image-effector (phase 2: prediction). This corresponds to the dog's representation of the prey's speed. The image-effector from the prototype is compared to an estimation of the sheep's effectors, and this comparison is used to calculate a set of pseudo-activations associating the desired modifications to the FCM's links (phase 3: reconsideration).

In Figure 9, we compare the simulation of the sheep's behavior obtained from the prototype in the imaginary world ("Prey image") with the sheep's behavior in the virtual world ("Sheep model"), both before and after learning, while the dog performs the same trajectory ("Dog"). The modeled sheep is controlled by the map in Figure 3b (with λ = 0.5 and γ = 0.6). The human operator decides on the training period from start to finish. The dog's acquired imitation experience illustrated in Figure 9 represents around 100 cycles, during which the dog approaches and moves away from the sheep twice. If learning were to become permanent, parameter λ in the "prey" prototype would be reduced to as little as 0.3 when the sheep remains at a distance from the dog (over 1,000 cycles), but would quickly (in fewer than 10 cycles) go back to a value of around 0.7 as soon as the dog begins to move towards the sheep. This demonstrates the need for a constant learning rate: adaptability remains extremely responsive no matter what the duration of the learning.

We then went on to generate a herd of 30 sheep with different "personalities" (different values of λ and γ in the sheep's FCMs). We assigned the dog a "prey" prototype for each sheep and ran as many parallel learning processes as there were sheep in the herd. We obtained significant predictions, with each prey prototype adapting quickly to each sheep (relative stability of the coefficients after 1,500 cycles, the time required for the dog to approach each sheep at least once). However, such simultaneous learning would not be possible for larger herds: with more than 300 sheep, the dog no longer has the time to learn in real time. (Our models were implemented using the oRis language [49] and run on a basic Linux PC with a 400 MHz CPU and 128 MB of RAM.)

4 Conclusion

In order to be believable, autonomous behaviors must be driven not only by external stimuli, but also by internal emotions such as fear, satisfaction, love or hate. We described these behaviors using fuzzy cognitive maps (FCMs) in which these internal states are explicitly represented. Instead of the conventional use [4], we embedded FCMs into each agent; the FCM defines the decision-making period of the agent's lifecycle. The agents implemented with FCMs are not only sensitive, but perceptive: their behavior depends on their internal states retroacting on the sensors. We described the use of FCMs as a tool for modeling the behavior of virtual actors improvising in free interaction. We highlighted specific modeling features which can prove particularly advantageous, such as the flexibility concerning system design and control, the comprehensible structure and operation, the adaptability to problem-specific prerequisites, and the capability for abstractive representation and fuzzy reasoning.
Our agents possess a behavioral library made up of prototypic FCMs. While acting in the virtual world, the prototypic FCMs allow an agent to simulate the behavior of other actors in its imaginary world. These FCMs simulate different strategies, allowing the agent to make predictions. We use FCMs to predict behavior not by formal reasoning, as was done with conceptual graphs for human experts when reasoning on beliefs, distributed decision and the organization of interacting agents from a global standpoint [50], but by behavioral simulation. We presented a learning algorithm allowing a prototypic FCM to adapt through observation. Our algorithm changes the weights of the FCM connections. It does not, however, modify the structure, the fuzzification of the sensors, or the defuzzification of the motor concepts. The applications depict a sheepdog using prototypic FCMs to predict the behavior of a herd of sheep.

The following points are the major limits of our proposal. Currently, the choice of prototype (for internal simulations and for learning) is made by the designer of the simulation. Moreover, the learning periods are not chosen by the agent; they are imposed by the designer. Transferring these competencies to the level of the agents will increase their autonomy, while automating the entire process. Consequently, future research will aim to implement a process that selects a prototype from the library through observation of the model behavior to be imitated. Furthermore, the learning period will be selected automatically. We are also working on adapting the fuzzy transformations associated with fuzzification and defuzzification.

References

[1] E.C. Tolman. Cognitive Maps in Rats and Men. Psychological Review, 55(4):189-208, 1948.

[2] R. Axelrod. Structure of Decision. Princeton University Press, USA, 1976.

[3] B. Kosko. Fuzzy Cognitive Maps. International Journal of Man-Machine Studies, 24:65-75, 1986.

[4] J.A. Dickerson and B. Kosko. Virtual Worlds as Fuzzy Cognitive Maps. Presence, 3(2):173-189, 1994.

[5] D.E. Koulouriotis, I.E. Diakoulakis, D.M. Emiris, E.N. Antonidakis, and I.A. Kaliakatsos. Efficiently Modeling and Controlling Complex Dynamic Systems using Evolutionary Fuzzy Cognitive Maps. International Journal of Computational Cognition, 1(2):41-65, 2003.

[6] C. Stylios, V. Georgopoulos, G. Malandraki, and S. Chouliara. Fuzzy Cognitive Map Architectures for Medical Decision Support Systems. Applied Soft Computing, 8(3):1243-1251, 2008.

[7] L. Rodriguez-Repiso, R. Setchi, and J. Salmeron. Modelling IT Projects Success with Fuzzy Cognitive Maps. Expert Systems with Applications, 32(2):543-559, 2007.

[8] V. Hafner. Learning Places in Newly Explored Environments. In Meyer, Berthoz, Floreano, Roitblat and Wilson (Eds.), SAB2000 Proceedings Supplement Book, Publication of the International Society for Adaptive Behavior, 2000.

[9] C. Stylios, V. Georgopoulos, and P. Groumpos. The Use of Fuzzy Cognitive Maps in Modeling Systems. In 5th IEEE Mediterranean Conference on Control and Systems, Paphos, 1997. Paper 67 (CD-ROM).

[10] M. Hagiwara. Extended Fuzzy Cognitive Maps. In IEEE International Conference on Fuzzy Systems, 1992.

[11] C. Stylios and P. Groumpos. Fuzzy Cognitive Maps in Modeling Supervisory Control Systems. Journal of Intelligent & Fuzzy Systems, 8:83-98, 2000.

[12] J.L. Salmeron. Augmented Fuzzy Cognitive Maps for Modelling LMS Critical Success Factors. Knowledge-Based Systems, 22(4):275-278, May 2009.

[13] R. Taber. Knowledge Processing with Fuzzy Cognitive Maps. Expert Systems with Applications, 2:83-87, 1991.
[14] A. Siraj, S. Bridges, and R. Vaughn. Fuzzy Cognitive Maps for Decision Support in an Intelligent Intrusion Detection System. In International Fuzzy Systems Association / North American Fuzzy Information Processing Society (IFSA/NAFIPS) Conference on Soft Computing, 2001.

[15] G. Calais. Fuzzy Cognitive Maps Theory: Implications for Interdisciplinary Reading: National Implications. FOCUS On Colleges, Universities, and Schools, 2, 2008.

[16] M. Parenthoën, J. Tisseau, and T. Morineau. Believable Decision for Virtual Actors. In IEEE International Conference on Systems, Man and Cybernetics (IEEE-SMC), volume 3, page MP2R3, Tunisia, 2002.

[17] J.J. Gibson. The Ecological Approach to Visual Perception. Lawrence Erlbaum Associates, London, 1979.

[18] D.A. Norman and T. Shallice. Attention to Action: Willed and Automatic Control of Behavior. Consciousness and Self-regulation, 4:1-18, 1986.

[19] K. Lewin. Principles of Topological Psychology. McGraw-Hill, New York, 1936.

[20] E.S. Reed. The Intention to Use a Specific Affordance: A Conceptual Framework for Psychology. In R.H. Wozniak and K.W. Fisher, editors, Development in Context: Acting and Thinking in Specific Environments, pages 45-76. Lawrence Erlbaum Associates, New York.

[21] S. Lahlou. Human Activity Modeling for Systems Design: A Trans-disciplinary and Empirical Approach. Pages 512-521, 2007.

[22] T.A. Stoffregen, K.M. Gorday, Y-Y. Sheng, and S.B. Flynn. Perceiving Affordances for Another Person's Actions. Journal of Experimental Psychology: Human Perception and Performance, 25:120-136, 1999.

[23] M. Mateas. An Oz-centric Review of Interactive Drama and Believable Agents. Technical Report CMU-CS-97-156, Carnegie Mellon University, Pittsburgh, PA, USA, June 1997.

[24] A. Nédélec, D. Follut, C. Septseault, and G. Rozec. Emotions, Personality and Social Interactions Modelling in a Multiagent Environment. In Proceedings of CASA 2005, Hong Kong (China), pages 103-108, October 2005.

[25] F. Wenstop. Deductive Verbal Models of Organizations. International Journal of Man-Machine Studies, 8:293-311, 1976.

[26] B. Kosko. Neural Networks and Fuzzy Systems: A Dynamical Systems Approach to Machine Intelligence. Prentice-Hall, Englewood Cliffs, 1992.

[27] M. Sugeno and M. Nishida. Fuzzy Control of a Model Car. Fuzzy Sets and Systems, 16:103-113, 1985.

[28] A.C. Schultz, J.J. Grefenstette, and W.L. Adams. RoboShepherd: Learning a Complex Behavior. Technical Report AIC-96-017, Naval Research Laboratory, Washington D.C., USA, 1996.

[29] R. Vaughan, N. Sumpter, J. Henderson, A. Frost, and S. Cameron. Experiments in Automatic Flock Control. Journal of Robotics and Autonomous Systems, 31:109-117, 2000.

[30] M. Klesen, J. Szatkowski, and N. Lehmann. The Black Sheep: Interactive Improvisation in a 3D Virtual World. In I3, pages 77-80, 2000.

[31] L. Van Valen. A New Evolutionary Law. Evolutionary Theory, 1(1):1-30, 1973.

[32] A.N. Meltzoff. Understanding the Intentions of Others: Re-enactment of Intended Acts by 18-month-old Children. Developmental Psychology, 31:838-850, 1995.

[33] V. Gallese. The Inner Sense of Action: Agency and Motor Representations. Journal of Consciousness Studies, 7(10):23-40, 2000.

[34] D.A. Isla. The Virtual Hippocampus: Spatial Common Sense for Synthetic Creatures. PhD thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 2001.

[35] Q. Yu and D. Terzopoulos. A Decision Network Framework for the Behavioral Animation of Virtual Humans. In SCA '07: Proceedings of the 2007 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 119-128. Eurographics Association, 2007.
[36] A. Del Bimbo and E. Vicario. Specification by Example of Virtual Agents Behavior. IEEE Transactions on Visualization and Computer Graphics, 1(4):350-360, 1995.

[37] P. Gaussier, S. Moga, M. Quoy, and J.P. Banquet. From Perception-Action Loops to Imitation Process: A Bottom-up Approach of Learning by Imitation. Applied Artificial Intelligence, 12:701-727, 1998.

[38] R. Le Hy, A. Arrigoni, P. Bessière, and O. Lebeltel. Teaching Bayesian Behaviours to Video Game Characters. Robotics and Autonomous Systems, 47:177-185, 2004.

[39] R. Voyles, J. Morrow, and P. Khosla. Towards Gesture-based Programming: Shape from Motion Primordial Learning of Sensorimotor Primitives. Robotics and Autonomous Systems, 22(3-4):361-375, 1997.

[40] M.J. Mataric. Visuo-motor Primitives as a Basis for Learning by Imitation: Linking Perception to Action and Biology to Robotics. In K. Dautenhahn and C. Nehaniv, editors, Imitation in Animals and Artifacts, pages 392-422. MIT Press, 2002.

[41] E.I. Papageorgiou, C. Stylios, and P.P. Groumpos. Active Hebbian Learning Algorithm to Train Fuzzy Cognitive Maps. International Journal of Approximate Reasoning, 37(3):219-249, 2004.

[42] E.I. Papageorgiou and P.P. Groumpos. A New Hybrid Method Using Evolutionary Algorithms to Train Fuzzy Cognitive Maps. Applied Soft Computing, 5(4):409-431, 2005.

[43] E.I. Papageorgiou, C. Stylios, and P.P. Groumpos. Unsupervised Learning Techniques for Fine-tuning Fuzzy Cognitive Map Causal Links. International Journal of Human-Computer Studies, 64(8):727-743, 2006.

[44] B. Kosko. Hidden Patterns in Combined and Adaptive Knowledge Networks. International Journal of Approximate Reasoning, 2:337-393, 1988.

[45] D.O. Hebb. The Organization of Behaviour. John Wiley and Sons, New York, USA, 1949.

[46] D. Rumelhart and J. McClelland. Parallel Distributed Processing: Explorations in the Microstructure of Cognition: Foundations. MIT Press, August 1986.

[47] M. Mozer. A Focused Backpropagation Algorithm for Temporal Pattern Recognition. In Y. Chauvin and D. Rumelhart, editors, Backpropagation: Theory, Architectures and Applications, pages 137-169. Lawrence Erlbaum Associates, 1995.

[48] R. Williams and D. Zipser. Gradient-based Learning Algorithms for Recurrent Networks and their Computational Complexity. In Y. Chauvin and D. Rumelhart, editors, Backpropagation: Theory, Architectures and Applications, pages 433-486. Lawrence Erlbaum Associates, 1995.

[49] P. Chevaillier, F. Harrouet, P. Reignier, and J. Tisseau. Virtual Reality and Multi-agent Systems for Manufacturing System Interactive Prototyping. International Journal of Design and Innovation Research, January 2000.

[50] B. Chaib-draa. Causal Maps: Theory, Implementation, and Practical Applications in Multiagent Environments. IEEE Transactions on Knowledge and Data Engineering, 14(6):1201-1217, 2002.
[The figures themselves could not be recovered from the extracted text; their captions follow.]

Figure 1: An FCM as an influence graph. The FCM shown is made up of 4 concepts and 7 edges. At a moment t, each concept C_i has an activation degree a_i(t); the figure gives the activation vector a(t) ∈ V^4 and the link matrix L ∈ M_4(R). A zero in the link matrix (L_ij = 0) indicates the absence of an edge from concept C_i to concept C_j, and a non-zero diagonal element (L_ii ≠ 0) corresponds to an edge from concept C_i to itself.

Figure 2: Cognitive maps' standardizing functions. The panels show the fuzzy-mode sigmoid σ_(δ,a_0,ρ), the bimodal function built on σ_(0,0.5,5), and the ternary function built on σ_(1,0,5).

Figure 3: FCM for an agent's escape behavior. The sensitive concepts surrounded by dashes are activated by the fuzzification of sensors; the motor concepts in thick black lines activate the effector by defuzzification. In (a), the concept C_1 = "enemy close" excites C_3 = "fear" whereas C_2 = "enemy far" inhibits it, and "fear" excites C_4 = "escape"; a purely sensitive FCM is hereby defined. In (b), the FCM is perceptive: "fear" can be self-maintained (memory) and can even influence feelings (perception). In (c), the fuzzification of the distance to an enemy gives the two sensitive concepts "enemy close" and "enemy far". In (d), the defuzzification of "escape" governs the speed of escape in a linear manner.

Figure 4: Escape speed decided by the FCM of Figure 3b. The perception of the distance to an enemy can be influenced by fear: depending on both the proximity of an enemy and fear, the dynamics of the FCM decide upon a speed, obtained here at its 3rd cycle. In (a), λ = γ = 0: the agent is purely sensitive and its perception of the distance to the enemy does not depend on fear. In (b), λ = γ = 0.6: the agent is perceptive and its perception of the distance to an enemy is modified by fear.

Figure 5: Sheepdog and sheep playing the roles given by their FCMs. In (a), the shepherd is motionless. A circular zone represents the area where the sheepdog must gather the herd. A guard point is located in the zone, diagonally opposite the shepherd. The sheepdog must return to this point when all the sheep are inside the designated area. Initially, the behavior of a dog without an FCM is to run after the sheep, which then quickly disperse outside of the herding area. In (b), an elementary FCM carries out the actions of a dog obeying the shepherd's order to "stay", by inhibiting the desire to run.
Figure 6: Sheepdog and sheep carrying out the roles given by their respective FCMs. For this simulation, the sheep's desire to gather is inhibited. In (c), the film of the sheepdog bringing back three sheep is paused. The dog's FCMs are represented in (d) and (e). In (d), the dog's main FCM represents the role of bringing a sheep back to the area and maintaining a position in relation to the shepherd when all the sheep are in the desired zone. In (e), an FCM decides the angle of incidence towards the sheep to bring it back into the zone: the dog goes towards the sheep, but approaches it from the opposite direction.

Figure 7: The agent, the actor-model and the actor-image. The "dog" agent and the "sheep" actor-model evolve within the virtual world. The dog possesses its own imaginary world in which it can simulate prototypic behaviors from a library of behaviors containing the "prey" prototype. In its imaginary world, the dog tries to imitate the sheep with an actor-image using the prey prototype. The imitation is conducted in real time according to the events occurring in the virtual world, by comparing the observed effectors of the sheep in the virtual world (as estimated through observation) with the effectors predicted by the prey prototype in the imaginary world. If necessary, discrete reconsideration takes place at the level of the prey's internal activations in order to reduce this difference. The prey prototype is then updated via differential Hebbian learning.

Figure 8: An FCM of a perceptive prey from the library of prototypic FCMs, which adapts itself by learning. The panels show the "prey" FCM before and after learning; the weights of the links between "enemy close", "enemy far", "fear" and "escape envy" have been modified by the learning algorithm.

Figure 9: More pertinent predictions can be obtained from the imaginary world by using imitation learning. The panels compare the trajectories of the dog, the sheep model and the prey image in the pasture, before and after learning.