
Can imitation explain dialect origins?

Imitation is one of the central processes underlying learning. Although the mechanisms of imitation at the individual level have received considerable attention, the population effects of imitative behavior have scarcely been investigated. In this paper I address the problem of self-organization at the population level emerging from imitative behavior between individuals. The model considered is a modification of that developed by Durrett and Levin [Durrett, R., Levin, S.A., 2005. Can stable social groups be maintained by homophilous imitation alone? J. Econ. Behav. Organ. 57, 267–286] in their investigation of the coexistence of social groups. I modified the previous model so that it can describe not only human societies but also animal populations with simpler cultures. In contrast with other studies, I assume neither payoffs related to imitative behavior nor the existence of social rank. Individuals are assumed to be of equal rank and to accept opinions of others in proportion to their similarity (homophilous imitation). The symmetrical structure of interactions induces random drift and the development of stable self-organized social groups in both homogeneous and spatially distributed societies. This type of self-organization may be widely distributed in natural systems where imitative behavior takes place. In particular, it can be involved in the origins of dialects and ring species.

Ecological Modelling 220 (2009) 2624–2639
Journal homepage: www.elsevier.com/locate/ecolmodel

Nikolay Strigul ∗
Department of Mathematical Sciences, Stevens Institute of Technology, Castle Point on Hudson, Hoboken, NJ 07030, USA
∗ Tel.: +1 201 952 4260; fax: +1 201 216 8321. E-mail address: nstrigul@stevens.edu.

Article history: Received 15 January 2009; received in revised form 3 July 2009; accepted 7 July 2009; available online 6 August 2009.
Keywords: Dialect origin; Homophilous imitations; Imitation behavior; Individual-based model; Population-level model

0304-3800/$ – see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.ecolmodel.2009.07.005

1. Introduction

Imitation is one of the most basic types of behavior in vertebrates (including humans). In general, imitation can be defined as reproduction of an act after perception of a similar act by another individual. Historically, imitation has been considered a driving force for social evolution (Jahoda, 2002). According to recent studies, imitation plays a crucial role in learning and cultural evolution (Miklosi, 1999; Heyes, 2001; Castro and Toro, 2004). Imitative behavior has been intensely investigated for several decades, mostly on the individual level and from the perspective of the cognitive sciences (Adolphs, 2003; Heyes, 2003; Jarvis, 2004). The consequences of individual imitation at the population (society) level are not well understood.

Imitation is often associated with payoffs that the imitator expects to gain after copying someone’s behavior (Smith, 1982; Axelrod, 1997; Laland, 2004; Galef and Laland, 2005). In this case, a game theoretical framework is the most suitable tool for investigating the outcome of individual imitation at the societal level (Conlisk et al., 2000; Sigmund et al., 2001; Offerman et al., 2002). However, imitative behavior need not be associated with payoffs in human societies (Heyes, 2001) and animal populations (Whitehead et al., 2004; Jarvis, 2004; Nowicki and Searcy, 2004). An imitator often neither understands the objective of accepted behavior (Heyes, 2001, 2003), nor expects any advantages from copying behavior. At the same time, behavior is usually motivated explicitly or implicitly and an imitator can obtain a payoff even without expecting it.
Despite the fact that imitative behavior without tradeoffs is probably common in nature, population models of imitation without rewards and punishments have received limited attention (see Durrett and Levin, 2005).

This study addresses the consequences of individual imitation at the population level. In particular, the problem of dialect origin is considered. Dialect can be defined as a variety of signals, such as birdsongs (Thorpe, 1961), the dance language of bees (von Frisch, 1962) or human languages (Francis, 1983), used by a group of individuals that is smaller than the whole population. Dialects can be considered as population-level patterns emerging as the result of interactions between individuals. Questions addressed include: What are dialect origins? What kind of self-organization pattern, at the population level, can emerge from individual imitations? Can self-organization emerging from imitations explain the origin of dialects?

I begin from the simple individual-based model operating in discrete time and assuming no spatial structure (homogeneously mixed population) and no group structure (Section 2.1). I only assume that individuals more easily imitate the behavior of the individuals who are similar to them (homophilous imitation), but there are no payoffs. In Section 2.2, I introduce into the model two distinct social groups (parties): individuals adopt opinions of others with a probability equal to their similarity if they belong to the same party, or with the same probability, but reduced by a factor, if individuals are from opposite parties. This model, operating in discrete time and without spatial structure, already demonstrates some self-organization patterns which can be associated with the dialect origin. For these individual-based models I derive continuous-time mean-field approximations, which are analytically tractable systems of ordinary differential equations (Appendix C).
Finally, I study a spatially distributed individual-based model (Section 3.2), in which individuals can interact only with their immediate neighbors. The discussion considers possible roles of self-organization patterns emerging from imitation behavior in origins of birdsong and human language dialects.

1.1. List of symbols

• N: population size.
• k: number of opinions.
• P: probability of opinion change.
• α: a real number from [0,1] indicating by what factor the probability of opinion change between individuals from different parties is smaller than the same probability for individuals from the same party.
• a: an indicator of whether the focal individuals are from the same party or from different parties.
• i: number of similar opinions between two focal individuals.
• p: number of opinions of the focal individual that are similar to its party line.

2. Model development

The model presented below is based on the two-party model developed by Durrett and Levin (2005) in addressing the problem of understanding cooperation. Conditions leading to the coexistence of two social groups were established by considering a family of homogeneous and spatially distributed individual-based models. A stable two-party society structure arose even from relatively simple interaction rules, but only with significant polarization of social groups. Only the addition of introspective changes in the model provided the development of a stable two-party structure without significant polarization (Durrett and Levin, 2005).

The assumptions in this paper are significantly modified from this initial model (Durrett and Levin, 2005) in order to address the problem of dialect origin and to describe not only human societies, but also animal populations. In particular, in the initial model individuals can evaluate their opinions about different issues as “right” or “wrong” depending on their party affiliation. The probability of accepting a new opinion “to” the party line is always higher than the probability of accepting the opinion “from” the party line. Such an assumption mainly characterizes human societies with explicit social structures (McPherson et al., 2001). This assumption is omitted in the model presented. Instead, I assume that probabilities to accept opinions are independent of the party line.

In the initial model (Durrett and Levin, 2005), individuals of the same party are not distinct in terms of interactions. Any individual accepts a new opinion from any other individual of his party with the same probability and there is only one probability of accepting an opinion of the individuals of the opposite party. In the model presented, I assume that any individual accepts another opinion with a probability equal to his similarity with the selected individual. Therefore, in this model, individual variations of opinions within each party are important.

It should be noted that some of the assumptions introduced are inspired by the research of Axelrod (1997). In particular, the rule of opinion change in the model with variable α (Section 2.2.2; see also discussion in Section 4.2) is similar to one of the assumptions of the model of the dissemination of culture (Axelrod, 1997, p. 155). I also keep the original terms of the Durrett–Levin model. I consider two focal individuals with human names (Fred and Ethel), and the two distinct social groups present in the population I will call parties. Opinions represent different variations of certain signals. In particular, “opinion” is similar to “isogloss” with respect to the human language dialects. While these words are not the best terms for description of the dialects, they seem to be general enough to avoid confusion.

2.1. One party model

Following Durrett and Levin (2005), I consider an individual as a string of k binary bits and a homogeneously mixed population with N individuals in discrete time. At each time step, individuals make consecutive actions to reconsider one of their binary bits.
Unlike the initial model (Durrett and Levin, 2005), I introduce another probability for the opinion change (Appendix A). To describe the algorithm, let us take a look at a random focal individual (call him Fred). When Fred decides to update his jth opinion, he randomly chooses an individual from the population (call her Ethel). If Ethel’s jth bit agrees with Fred’s, nothing happens. If Ethel’s jth bit is different from Fred’s and i of her k bits agree with Fred’s, then Fred adopts her opinion with probability P = (i + 1)/k, designated as the probabilistic criterion of opinion change for the one party model. This probability is determined by how many of Fred’s opinions will be the same as Ethel’s opinions if Fred adopts her jth opinion. In other words, when Fred decides to update his jth opinion, he considers not his current similarity with Ethel but his future similarity with her (Appendix A).

2.2. Bipartisan model

2.2.1. Description of the model

As in the previous model (Durrett and Levin, 2005), I represent individuals as strings with k + 1 binary bits, where the 0th bit identifies party affiliation and the other k bits are independent opinions. Party affiliations are denoted as 1 and 0. However, interaction rules are significantly different compared to the initial model. Again, let us take one random individual, Fred, and describe the rules of that individual’s behavior. At each time step, Fred makes three consecutive actions:

1. Select an opinion to reconsider. Fred decides to update his beliefs and picks one of the opinion bits at random; let it be the jth bit. He never chooses to update the 0th bit representing the party affiliation.
2. Reconsider the opinion in interaction with another individual. Fred randomly chooses an individual from the population (Ethel). If their jth opinions are the same, no change occurs.
If their jth opinions are different, then the probability of Fred changing his opinion is (see also Appendix A):

\[ P = a(1-\alpha)\frac{i+1}{k} + \alpha\frac{i+1}{k}, \qquad (1) \]

where α is a number from the interval [0,1], i is the number of similar opinions between Fred and Ethel, and a = 1 if Fred and Ethel are from the same party and a = 0 if they are from different parties.
3. Reconsider his party affiliation. This is an optional operation. In particular, if “party” denotes different sexes or species, then change of party affiliation is impossible. In other cases, Fred reconsiders his party affiliation after each attempt to reconsider an opinion, regardless of the result.

Probabilistic and deterministic rules for the party change are based on the conformity of the individual to his party, p/k (where p is the number of Fred’s opinions that are similar to the party line), describing how much Fred agrees with his party. The probabilistic rule is considered the basic rule since it reflects the natural stochasticity of behavioral actions. Simulations with the deterministic rule are also discussed, as they give some useful insights into the model outcomes.

Probabilistic rule for a party affiliation change. Fred changes his party affiliation with a probability equal to his disagreement with the party, 1 − p/k, after any attempt to update his opinion.

Deterministic rule for a party affiliation change. Fred changes party affiliation if he disagrees with his party on more than some fixed number of opinions. A threshold level for party change, b, is a number from the [0,1] interval. The rule for change of the party affiliation is: if (1 − p/k) ≥ b then Fred changes his party affiliation, and if (1 − p/k) < b then Fred does not change his party affiliation.
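The three actions above can be sketched in code. This is a minimal illustration and not the author's C++ program: the list representation [party, opinion 1, ..., opinion k], the function names, and the assumption that the party line of party b is the all-b opinion string (inferred from the statement that {1, 0, 0, 0} and {0, 1, 1, 1} individuals always switch party) are all choices made here for the example.

```python
import random

def opinion_change_probability(same_party, alpha, i, k):
    """Eq. (1): P = a(1 - alpha)(i + 1)/k + alpha(i + 1)/k, with a = 1 for the same party."""
    a = 1 if same_party else 0
    return a * (1 - alpha) * (i + 1) / k + alpha * (i + 1) / k

def update_step(fred, ethel, alpha, rng, change_party=True):
    """One time step for Fred; individuals are lists [party, o_1, ..., o_k] of 0/1 bits."""
    k = len(fred) - 1
    j = rng.randrange(1, k + 1)  # action 1: pick an opinion bit, never bit 0
    if fred[j] != ethel[j]:      # action 2: reconsider it against Ethel
        i = sum(f == e for f, e in zip(fred[1:], ethel[1:]))  # current number of shared opinions
        if rng.random() < opinion_change_probability(fred[0] == ethel[0], alpha, i, k):
            fred[j] = ethel[j]
    if change_party:             # action 3 (optional): probabilistic rule, P = 1 - p/k
        p = sum(o == fred[0] for o in fred[1:])  # assumption: party line of party b is all b
        if rng.random() < 1 - p / k:
            fred[0] = 1 - fred[0]
    return fred
```

With same_party fixed to True the probability reduces to (i + 1)/k, the criterion of the one party model. A deterministic variant would replace the last random draw with the comparison (1 − p/k) ≥ b.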
2.2.2. Modification of the model with variable α

This model modification considers the assumption that α is a variable depending on the current status of the focal individual. In particular, if α = 1 − p/k, then the probability to change opinion, if Fred and Ethel are from different parties, becomes:

\[ P = \left(\text{how much Fred disagrees with his party}\right) \times \left(\text{how much Fred's and Ethel's opinions will be similar if Fred changes his } j\text{th opinion}\right), \qquad (2) \]

that is, P = (1 − p/k)(i + 1)/k. This probability has some intuitive properties. The probability of adopting an opinion from an opposite party individual is proportional to the degree of Fred’s disagreement with his party. The probability to change Fred’s jth opinion is always higher if Fred and Ethel belong to the same party. If Fred completely agrees with his party, then 1 − p/k = 0, hence P = 0 and, therefore, nobody from the opposite party can convince Fred to change his opinion. Alternatively, if an individual from the opposite party agrees with Fred in a significant fraction of opinions, it will increase the probability that Fred adopts the selected opinion.

3. Results

For the non-spatial models the results of simulation are presented in Section 3.1; their continuous-time mean-field approximations are considered in Appendix C. The spatially distributed model is considered in Section 3.2. In the simulations, the number of opinions is fixed, k = 3. Therefore, each party consists of eight different social groups. The common notation for an individual is {x, x, x, x}, where x can be 0 or 1 and the first element denotes the party affiliation.

Fig. 1. Simulation of the bipartisan model with the probabilistic rule for party change (20,000 individuals, 200,000 time steps, α = 0.5). Dynamics of different groups (a) {0, x, x, x} and (b) {1, x, x, x}. (c) Dynamics of the diversity criterion.

No polarization of society is observed in the simulations of the one-party model.
This is consistent with the analytical results (see Appendix C). Also, it is similar to the one-party model with different probability of opinion change studied by Durrett and Levin (2005).

Simulations of homogeneously mixed populations were conducted using an original program in C++. Spatially distributed models were simulated using Mathematica software (Wolfram Research Inc.).

3.1. Bipartisan model: homogeneously mixed populations

A bipartisan model can be considered as a Markov stochastic process, where the transition function defined at the individual level includes two independent components, opinion and party affiliation changes. Simulations of the models with different rules have been similarly organized. At the initial stage, a population consisting of 20,000 individuals is homogeneously distributed among the existing groups. This homogeneous distribution of individuals is considered as having the highest level of disorder. In non-spatial simulations the “diversity” criterion (see Appendix B) is employed to characterize the emergence of the self-organized patterns, where individuals accumulate in several groups.

3.1.1. Models with constant α

3.1.1.1. Probabilistic rule of party change. In this model, the initial homogeneous distribution of individuals is unstable (Figs. 1 and 2). An immediate observation is that the two groups {1, 0, 0, 0} and {0, 1, 1, 1} are minute because an individual with {1, 0, 0, 0} or {0, 1, 1, 1} changes his party affiliation with probability 1 (Fig. 1a and b). Through iterations individuals accumulate in several groups and then other groups are diminished (Fig. 1a and b). One possible stable group structure develops and remains following numerous iterations, though the number of individuals in the surviving groups fluctuates. A necessary condition for a stable structure is that at least one opinion (and, typically, two opinions) should be the same for all individuals.
For example, in Fig. 1 two opinions are the same for all individuals: the first opinion is 1 and the third opinion is 0. There are many possible stationary group structures, but they are similar due to symmetry:

1. There is only one surviving group {1, 1, 1, 1} or {0, 0, 0, 0} (in fact, this structure would be similar to the second type, if the groups {1, 0, 0, 0} and {0, 1, 1, 1} were not exceptional).
2. Individuals constantly move between two groups. For example {0, 0, 1, 1} and {1, 0, 1, 1} is a stable structure, where individuals can change only their party affiliation.
3. Stable structures consisting of four or more groups. For example, the population presented in Fig. 1 consists of only four groups of individuals: {0, 1, 0, 0}, {1, 1, 0, 0}, {0, 1, 1, 0}, and {1, 1, 1, 0} after 200,000 time steps.

Table 1. Average distribution of individuals after 20,000 time steps in the homogeneous model with probabilistic rule for party change (10,000 runs, 20,000 individuals, α = 0.5). Columns give the average, standard deviation (SD), minimum and maximum number of individuals in each group.

Group    Average     SD         Min    Max
OOOO     2723.97     2750.50    0      18,617
IOOO     0.01        0.10       0      1
OIOO     1611.72     1695.97    0      12,408
IIOO     805.66      847.43     0      6,053
OOIO     1610.09     1698.45    0      11,612
IOIO     804.94      849.70     0      6,052
OIIO     800.79      835.87     0      6,251
IIIO     1602.74     1672.72    0      12,317
OOOI     1626.55     1695.84    0      11,673
IOOI     812.88      846.95     0      5,912
OIOI     795.17      817.38     0      6,442
IIOI     1589.06     1632.64    0      13,029
OOII     819.59      845.89     0      6,304
IOII     1639.62     1692.35    0      12,591
OIII     0.01        0.11       0      2
IIII     2757.19     2767.92    0      19,871
O...     9987.90     3119.26    0      19,542
I...     10012.10    3119.26    0      19,952

The second component of the random process, the change of party affiliation, connects groups of individuals (from both parties) with similar opinion combinations.
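A small calculation illustrates how party changes couple the two groups that share an opinion combination. The sketch below is not taken from the paper: it assumes, as inferred from the statement that {1, 0, 0, 0} and {0, 1, 1, 1} individuals always switch, that the party line of party b is the all-b opinion string, and it treats party flips for one fixed opinion combination as a two-state Markov chain, ignoring opinion changes. The function names are hypothetical.

```python
from fractions import Fraction

def party_change_probability(party, opinions):
    """Probabilistic rule: leave the party with probability 1 - p/k.

    Assumption: the party line of party b is the all-b opinion string."""
    k = len(opinions)
    p = sum(1 for o in opinions if o == party)  # opinions agreeing with the party line
    return Fraction(k - p, k)

def coupled_pair_split(opinions):
    """Stationary shares (party 0, party 1) for one fixed opinion combination.

    Balance condition of the two-state chain: share0 * q0 = share1 * q1."""
    q0 = party_change_probability(0, opinions)  # probability of leaving party 0
    q1 = party_change_probability(1, opinions)  # probability of leaving party 1
    return q1 / (q0 + q1), q0 / (q0 + q1)

if __name__ == "__main__":
    for combo in [(0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 1, 1)]:
        print(combo, coupled_pair_split(combo))
```

For the opinion combination (0, 1, 1) the shares are 1/3 for party 0 and 2/3 for party 1, in line with the roughly 1:2 ratio of the OOII and IOII averages in Table 1; for (0, 0, 0) the whole share goes to party 0, matching the near-empty IOOO group.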
The probabilistic rule for party change in the model with 2 parties and 3 opinions reveals 4 possible cases for party change, with probabilities 0, 1/3, 2/3, and 1 when an individual has 0, 1, 2, and 3 opinions different from his party’s line, respectively. Social groups can be apportioned by pairs demonstrating similar dynamics (Fig. 1): {0, 1, 0, 0} and {1, 1, 0, 0}, {0, 0, 1, 0} and {1, 0, 1, 0}, {0, 1, 1, 0} and {1, 1, 1, 0}, {0, 0, 0, 1} and {1, 0, 0, 1}, {0, 1, 0, 1} and {1, 1, 0, 1}, and {0, 0, 1, 1} and {1, 0, 1, 1}. The correlation coefficient between groups in each pair is close to 1 in any particular realization of the random process. When numerous process simulations are averaged, similar correlations are also observed, and individuals in the coupled pairs are distributed proportionately to the probability of party affiliation change (Table 1). For example, individuals from the {1, 0, 1, 1} and {0, 0, 1, 1} groups can change their party affiliations with probabilities 1/3 and 2/3, respectively, and the average numbers of individuals in these groups are 1639.62 and 819.59 (Table 1), close to the predicted 2:1 ratio. Cluster analysis (Fig. 3) of the multiple run data shows that statistical linkages exist only between the groups of individuals connected by party affiliation changes.

Fig. 2. Dynamics of the average diversity depending on α and its approximation by the first order exponential decay function y = y0 + a e^(−bt) (average over 100 runs of the model, simulation parameters as in Fig. 1; α and corresponding b values are 0.0001 (b = 82 × 10^−7), 0.5 (b = 13 × 10^−6) and 1 (b = 18 × 10^−6); for all fits r² > 0.98 and y0 = 0.2).

Fig. 3. Cluster analysis of different groups of individuals. The distance measure is 1 − r, where r is the Pearson correlation coefficient, and analysis of variance is used to evaluate the linkage distances between clusters. The data consist of 10,000 runs of the individual-based model with parameters as in Fig. 1.

While the homogeneous distribution of individuals is unstable in a single run of the model and aggregation of individuals occurs relatively quickly (Fig. 1), on average different groups are equally represented (Table 1). Therefore, in every particular realization of the random process individuals accumulate in some surviving groups, which are taken at random, and the process is equally likely to converge to any of the possible stationary states.

The diversity criterion (Figs. 1c and 2), which characterizes the dynamics of aggregating individuals, decays exponentially on average (Fig. 2). Parameter α, which determines outcomes of interactions of individuals from opposite parties, affects the average dynamics (Fig. 2). When α is larger than approximately 1/2, the process of aggregation is faster than in cases where α is smaller than 1/2 (Fig. 2). When the number of iterations is large enough, the random process is close to an asymptotic state. Computer simulations showed that 200,000 iterations of the random process provide a good approximation of the stationary state, including the value of the diversity criterion. The average diversity at 200,000 time steps is approximately equal to 0.3 when α is relatively small (the following values were tested: 1/3, 1/5, 0.1, 0.01, 0.001) and is about 0.2 if α ≥ 1/2 (1/2, 2/3, 1, 20, 1000).

3.1.1.2. Deterministic rules of party change. In this model modification individuals follow the deterministic rules in updating their party affiliation, where the threshold level for party change determines the change of party affiliation by an individual.
When three distinct opinions are considered, k = 3, there are three threshold levels (1/3, 2/3, 1), and an individual changes his party affiliation when he has at least 1, 2, or 3 opinions different from his party line, respectively. Deterministic rules restrict possible diversity of opinion combinations in society. In simulations of the model with the threshold level equal to 1, individuals change party if and only if all 3 of their opinions are different from their party’s. This rule automatically excludes the two groups of individuals {1, 0, 0, 0} and {0, 1, 1, 1}. Similarly, these two groups are excluded when the probabilistic rule is employed (Fig. 1a and b). In simulations with the threshold level 2/3, individuals with 2 and 3 opinions different from their party’s are not present in the society. There exist only 4 groups for each party ({1, 1, 1, 1}, {1, 0, 1, 1}, {1, 1, 0, 1}, {1, 1, 1, 0} and {0, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}) and a random drift occurs among these groups. A very intensive exchange of individuals between parties occurs when the threshold level is equal to 1/3. An individual moves into another party if he disagrees with his party on any point, but then he has 2 different opinions from the new party line and he returns to his initial party on the next time step. Only individuals from the {0, 0, 0, 0} and {1, 1, 1, 1} groups do not change their party affiliation. Despite the differences between the models with the deterministic and probabilistic rules, the emerging self-organization patterns and stationary states are similar.

Fig. 4. Simulation of the bipartisan model with the probabilistic rule for party change and variable α (20,000 individuals, 1000 time steps). Dynamics of different groups of (a) {0, x, x, x} and (b) {1, x, x, x}.

3.1.1.3. Model with no party change.
In many real-life situations individuals cannot change their party affiliation, and the only changes that occur are the opinion changes. For example, in bird populations, parties can be males or females, or males of two sympatric species (see Section 4). This model modification, where the two parties exist but individuals do not change their party affiliations, reveals substantially the same self-organization patterns as the model with the probabilistic rule for party change. In particular, similar stable group structures emerge in simulations. However, some minor differences occur. For instance, there were stable structures consisting of coupled groups {1, 1, 1, 1}, {0, 1, 1, 1} and {1, 0, 0, 0}, {0, 0, 0, 0}. Therefore, the mere existence of the second party leads to the polarization of society.

3.1.2. Model with variable α

Individual-based models with variable α predict significantly different society-level patterns. In the model with variable α there exist two groups, {0, 0, 0, 0} and {1, 1, 1, 1}, where individuals cannot be convinced to change their opinions by any individual from the opposite party. This causes an asymmetric structure of transition probabilities for opinion change for individuals from different groups. While individuals from all other groups have 8 opportunities to change their given opinion, the individuals from the {0, 0, 0, 0} and {1, 1, 1, 1} groups have only 4 opportunities. As a result, individuals accumulate in the {0, 0, 0, 0} and {1, 1, 1, 1} groups. The random process converges faster to a stationary state where individuals belong only to {0, 0, 0, 0} or {1, 1, 1, 1} (Fig. 4). Therefore, the assumptions that (1) α is a variable and (2) the probability to change opinion in interactions with individuals from the opposite party is proportional to the similarity of an individual to his own party, lead to the development of a stable society consisting of two distinct groups.
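The special status of the {0, 0, 0, 0} and {1, 1, 1, 1} groups can be checked numerically. The sketch below simply evaluates the cross-party probability of Section 2.2.2, P = (1 − p/k)(i + 1)/k, for k = 3; the function name is hypothetical and exact fractions are used only to keep the output readable.

```python
from fractions import Fraction

def cross_party_probability(p, i, k):
    """Eq. (1) with a = 0 and alpha = 1 - p/k: P = (1 - p/k) * (i + 1) / k.

    p: opinions of the focal individual that match its party line.
    i: opinions currently shared with the individual from the other party."""
    return (1 - Fraction(p, k)) * Fraction(i + 1, k)

if __name__ == "__main__":
    k = 3
    for p in range(k + 1):
        # i can be at most k - 1 because the reconsidered opinion differs
        row = [str(cross_party_probability(p, i, k)) for i in range(k)]
        print(f"p = {p}: " + ", ".join(row))
```

The row for p = 3 is identically zero, so a full conformist never adopts an opinion from the opposite party, whatever the similarity i.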
Individuals from these two groups have no interactions with individuals from another group. Similarly, the spatially distributed version of the model with variable α has a stable stationary state with only two types of spatial clusters. The first cluster consists of {1, 1, 1, 1} individuals and the second of {0, 0, 0, 0} individuals. These patterns of spatial organization are different from spatial structures emerging in simulations of the model with constant α described below.

3.2. Spatially distributed model

In this section, interactions of individuals on a rectangular lattice on a torus are examined for a model with the probabilistic rule of party change and constant α. Individuals are located on the lattice sites. Each individual has 8 neighbors (the Moore neighborhood) and at each time step an individual selects at random one of his 8 neighbors to reconsider his opinion. In simulations, the initial distribution of social groups on the lattice is taken at random. This random spatial distribution provides practically the same number of individuals in all 16 groups, with the diversity criterion close to 1.

Fig. 5. Simulation of the spatially distributed bipartisan model with the probabilistic rule for party change (2304 individuals, 8000 time steps, the Moore neighborhood, α = 0.5). (a) Changes in society during the simulations (1, 2000, 3000, 5000, 7000, 8000 time steps). The 16 groups of individuals have different colors. (b) Dynamics of different groups of individuals. (c) Cluster structure of the society (8000 time steps). (d) Dynamics of the diversity criterion.

At the initial stage there are no organized spatial structures of individuals, such as clusters (Fig. 5a). Individuals with similar opinion combinations tend to aggregate in spatial clusters in the course of this random process (Fig. 5a).
The behavior of social groups with the same opinion combinations, but with different party affiliations, is correlated and these groups demonstrate similar dynamics (Fig. 5b). This behavior is analogous to that of the homogeneously mixed society (Fig. 1). These coupled groups (see Section 3.1.1.1) are also located closely on the lattice (Fig. 5c). In general, spatial clusters consist of individuals with similar opinion combinations and from both parties (Fig. 5c). One of the coupled groups is dominant in each cluster and the second group is less numerous. On average, the numbers of individuals in the coupled social groups are determined by the ratio 2/3:1/3 of probabilities for party affiliation changes, similar to the model with the homogeneously mixed population.

As the cluster system develops, the opinion change process concentrates on the cluster borders. Only individuals located on the cluster borders can change their opinions. Inside each cluster individuals can change only their party affiliations, which are governed by the probabilistic rule. In the simulations, both parties fluctuate, but coexist (Fig. 5b), and the diversity criterion slightly decreases (Fig. 5d). Although individuals are aggregated in spatial clusters, they are not concentrated in one or several groups of opinion combinations (Fig. 5), in contrast with the homogeneously mixed population. Some clusters are represented by stable group combinations similar to the stable group structures of the first and second types described in Section 3.1.1.1. Therefore, spatial structures are locally similar to the homogeneously mixed model.

The lattice size should play a significant role in the society dynamics. If the population size is too small then all individuals can eventually belong to one cluster and the diversity criterion should be small. This conclusion is confirmed in simulations (Fig. 6), and the diversity criterion, on average, depends logarithmically on the population size.
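The neighbor selection used in this section can be sketched as follows. This is a minimal illustration rather than the Mathematica code used for the simulations; the 48 × 48 grid is an assumption chosen only because it holds the 2304 individuals mentioned in the Fig. 5 caption, and the function names are hypothetical.

```python
import random

# The 8 relative positions of the Moore neighborhood (all adjacent cells, diagonals included)
MOORE_OFFSETS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]

def moore_neighbors(row, col, n_rows, n_cols):
    """Return the 8 neighbors of (row, col) on a lattice wrapped into a torus."""
    return [((row + dr) % n_rows, (col + dc) % n_cols) for dr, dc in MOORE_OFFSETS]

def pick_neighbor(row, col, n_rows, n_cols, rng=random):
    """Choose uniformly at random the neighbor whose opinion will be consulted."""
    return rng.choice(moore_neighbors(row, col, n_rows, n_cols))

if __name__ == "__main__":
    # Corner site of an assumed 48 x 48 lattice: neighbors wrap to the opposite edges
    print(moore_neighbors(0, 0, 48, 48))
```

Because indices wrap modulo the lattice dimensions, every site has exactly 8 neighbors and there are no boundary effects.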
Average dynamics of the random process are approximated by exponential functions (Fig. 7). On average, cluster size grows exponentially (Fig. 7a) and at the same time diversity decreases exponentially (Fig. 7b), similar to the average dynamics observed in the homogeneous model (Fig. 2). The average cluster size increases linearly (Fig. 8a) and the diversity decreases (Fig. 8b) as a function of α. This reflects the fact that, when α is large, individuals from one party change their opinions more easily when in contact with individuals from the opposite party.

Fig. 6. Average diversity depending on the population size in the spatially distributed model of two parties (100 runs of the model for each population size, 10,000 time steps, α = 0.5).

Fig. 7. Average dynamics of the spatially distributed model (100 runs, 1024 individuals, α = 0.5). (a) Average cluster size and (b) average diversity.

Overall, the model predicts rapid development of cluster structures in a spatially distributed society. Each local spatial cluster is similar, in a certain sense, to the mean-field model. Cluster boundaries continuously fluctuate because of inter-group interactions. However, in general all possible social groups coexist in the society, provided that the lattice size is large enough that random fluctuations of the cluster sizes do not cause cluster extinctions.

4. Discussion

This model, as well as the Durrett-Levin model, shows that a society or a population governed by homophilous imitation rules demonstrates self-organization patterns. Self-organization emerges when there are two different social groups. Individuals from one group are more likely to imitate the behavior of individuals from the same group than that of individuals from the opposite group. Examples are easy to find in animal populations, in particular resident and non-resident individuals, males and females, etc. In human societies any distinct social, national and family groups can be considered as “parties”. The model predicts that there are several equally possible social structures emerging in the random process initially starting from disorder. The spontaneous order of self-organized systems is considered to be one of the most important additions to natural selection in the development of evolutionary patterns (Kauffman, 1993, 1995). Related ideas of random drift are also widely employed in neutral theories (Millstein, 2002; Leigh, 2007). In general, self-organization interacts with selection in a complex way (Kauffman, 1995). This model has been constructed to be general and simple, but it can readily be modified to take into account demographic processes inside the population, natural selection and different social norms. The following examples demonstrate that certain social structures may evolve similarly to what the model predicts. The discussion focuses on the two better investigated examples of dialects, birdsong and human language dialects.

It can be pointed out that degrees of proximity between individuals may also influence other types of population ecology models operating with averaged population-level parameters, such as the intrinsic growth rate or the environment carrying capacity. Some mathematical models of population evolution can be adapted to better take into account possible effects of individual proximity. For instance, using the famous Beverton–Holt model of population dynamics (McCarthy, 1997; Hui, 2006; De la Sen, 2007), it has recently been shown (De la Sen, 2007) that the intrinsic growth rate of a species, associated with its reproduction capability, cannot be independent of the “environment carrying capacity” (associated with how favorable the habitat is to the species).
The assumption that those parameters are mutually independent leads to a poor description of the real process in some circumstances, for instance, if there are very few individuals inside a habitat (De la Sen, 2007).

Fig. 8. Average characteristics of the spatially distributed model depending on α (1024 individuals, 10,000 time steps, 100 replications). (a) Average cluster size depending on α value. (b) Average diversity depending on α value.

4.1. Birdsong dialects

Birdsongs have been studied for several centuries. Development of song at the individual level is one of the major topics of interest, and it is quite amazing that many critical facts in this area were discovered more than 200 years ago (Barrington, 1773; Wickler, 1982). The study of birdsong variations at the population level and of their ecological and evolutionary significance is another research direction, initiated in the 17th century (Thielcke, 1988). Nowadays we better understand how individual birds develop their songs from the perspective of cognitive sciences (Baker and Cunningham, 1985; Jarvis, 2004; Nowicki and Searcy, 2004; Podos and Warren, 2007), as well as what birdsong patterns exist at the population level (Catchpole and Slater, 2008; Slater, 2003). There exists an immense number of experimental and empirical studies on these topics. However, the connections between the individual and the population levels are not yet well established. In particular, the origins of microgeographic song variations, such as local song dialects, are not well understood, unlike macrogeographic song variations emerging on the macroscopic spatial scale (Baker and Cunningham, 1985).

The males and, more rarely, females of most passerine birds actively perform songs. Songs serve different functions, for example to attract or stimulate a female or to establish and defend an individual territory (Malchevsky, 1959).
Another crucial song function is to prevent interspecific hybridization (Grant and Grant, 1996, 1997a). Possible song variations are restricted by syrinx characteristics. However, males of many related songbird species can reproduce similar sounds (Thorpe, 1961). At the same time, many bird species are morphologically and genetically able to hybridize (Grant and Grant, 1996). Therefore, to prevent interspecific hybridization, every female identifies the song type of its own species and the male performs such a song clearly. These tasks are critical for the maintenance of species identity and are executed by cultural inheritance through imitative vocal learning (Marler, 1997; Nelson et al., 2001; Jarvis, 2004). Usually nestlings imprint the species-specific song patterns in a short critical period while they stay in the nest (Grant and Grant, 1996). However, in many cases this initial song imprinting does not lead to a fully developed song. For example, in numerous experiments where young birds were isolated after the nestling period, they could perform only a subsong that was not a fully developed species song (Barrington, 1773; Marler, 1997). Young birds can perform a full song only in the spring following dispersal (Thielcke and Krome, 1989). The crucial process in the final stage of song development is the imitation of a mature bird of the same species living in the neighborhood (Nelson et al., 2001; Ellers and Slabbekoorn, 2003). Adult birds are often conservative and breed in the same local areas for many years (Zimin, 1988; Artemyev, 2008). Therefore, some individual variations of the adult birdsong can be imitated by young birds and are transmitted by imitation from one generation to the next in a local area (Lemon, 1975). This imitation is homophilous because young birds tend to copy mature birds of their own species.
Local dialects are very well known in most songbird species from all geographical regions and can be preserved for many years in local areas (Krebs and Kroodsma, 1980; Catchpole and Slater, 2008). Dialect areas are not separated from the rest of the bird population, and dialects often grade into one another (Mundinger, 1982; Briskie, 1999; Slabbekoorn and Smith, 2002; Baker and Cunningham, 1985). In the earlier studies the variations of birdsong were recorded using alphabets or musical notes (Barrington, 1773; Lucanus, 1907; Thielcke, 1988; Malchevsky, 1958, 1959), and after the introduction of the sound spectrograph it became possible to analyze birdsong patterns rigorously (Thorpe, 1954; Thielcke, 1960). Typically song variations can be described by means of discrete variables, which can be considered as “opinions” in the model presented. For instance, five discrete variables were selected to characterize the song variation of Darwin’s finches (Grant and Grant, 1996): the duration and the maximum frequency of the first and the second units and the interval between them. Several Palaearctic, North American and tropical birds were found to be convenient models for song variation studies, as they have relatively simple songs and occupy large areas. For example, vocalizations of the Chaffinch (Fringilla coelebs), the Cardinal (Cardinalis cardinalis) and the two treecreepers (Certhia brachydactyla and Certhia familiaris) have been intensively studied (Thorpe, 1954; Lemon, 1975; Thielcke and Wüstenberg, 1985; Baker and Jenkins, 1987; Martens and Geduldig, 1988; Thielcke and Krome, 1989). The Common Treecreeper is remarkable for its splitting of parental responsibilities, which allows it to produce two broods per summer in northern areas (Strigul, 2001), and for its zone of sympatry with the Short-toed Treecreeper (Thielcke, 1960, 1986).

To summarize, some important patterns of songbird dialects in relation to the model presented are:

1. Geographical areas that maintain a local song dialect need not be isolated from the main population area.
2. Dialects are transmitted from one generation to the next by learning mechanisms close to homophilous imitation.
3. Birds that perform dialect song variations are not close relatives; they may have been born in different parts of the species area.

In field studies it was observed that birds performing a particular song dialect usually occupy some area that can be considered as a spatial cluster (Malchevsky and Pukinsky, 1983; Catchpole and Slater, 2008). Apparently, there is a significant similarity between the predictions of the spatially distributed model and the spatial distributions of bird dialects. In both cases spatial clusters have no fixed boundaries and change randomly in the course of time. In the model the population is assumed to have a constant size, which can be attributed to the number of individual territories suitable for reproduction. Similar assumptions are also often employed in population genetics. Birdsong dialects may exist much longer than the average life span of small passerine birds (Thielcke, 1992) and, to include demographic processes implicitly, an “individual” in the model can be thought of as a sequence of individual birds that replace each other in a simple replacement process.

The theory presented does not involve selection or geographical isolation. However, an initial disorder is assumed. This could be the case, for instance, when a territory inside the species area is being recolonized after a major disturbance. Major disturbances such as forest cuts or hurricanes are a traditional starting point for forest succession models (Strigul et al., 2008). During forest development such an area is colonized by particular bird species only at certain succession stages (Malchevsky and Pukinsky, 1983; Zimin, 1988).
Settlers come from various places and carry numerous song variations, providing an initial disorder similar to the one required in the model. Some local song dialects likely emerge by the fixation of song copying mistakes (which can be considered as new opinions in the model) in a local neighborhood (Thielcke, 1992); however, most local song dialects are new combinations of already existing song variations (Lemon, 1975). In contrast, macrogeographic song variations, where geographical isolation is involved, are sometimes also called dialects, but they can develop in substantially different ways (Mayr, 1963; Baker and Cunningham, 1985; Podos and Warren, 2007). For example, a founder effect may play an important role in the development of dialects in geographically isolated areas (Mayr, 1963; Thielcke and Wüstenberg, 1985; Baker and Jenkins, 1987). In conclusion, local song dialects can emerge from self-organization caused by individual imitations only.

Song dialect origins may be connected to the mechanisms of ring species emergence. Ring species emerge as a result of gradual differentiation in a circular population distribution. When the ends of the distribution overlap, two coexisting and reproductively isolated species emerge (Mayr, 1963; Irwin, 2000; Irwin et al., 2001a,b; Slabbekoorn and Smith, 2002). It is accepted that mating signals play the crucial role in reproductive isolation and speciation (Grant and Grant, 1996, 1997a,b). In a recently investigated Siberian ring species, the Greenish Warbler (Phylloscopus trochiloides), reproductive isolation is closely associated with song variations (Irwin, 2000; Irwin et al., 2001a,b). The Greenish Warbler song variations were characterized by five parameters: song length, maximum frequency, minimum frequency, number of song units, and number of song types. Some ecological parameters (such as forest density and population density) as well as song patterns showed latitudinal changes. Irwin (2000, p. 999) suggested four alternative hypotheses of what can cause signal divergence leading to reproductive isolation: (1) ecological differences, (2) sexual selection without ecological differences, (3) ecological differences which affect the balance between sexual and natural selection, and (4) selection for species recognition. Statistical analysis of the data supported hypotheses 1 and 3, and the latter was selected as the more logical proposal (Irwin, 2000). These two hypotheses assume that ecological factors can affect song variation directly (through the acoustic environment, the first hypothesis) or indirectly (through sexual selection, the third hypothesis). However, the mechanisms of such effects are not really understood, and no direct evidence was found (Irwin, 2000). Selection may affect the development of local dialects by restricting some combinations of song parameters that are not efficient in the given ecological conditions. If this is the case, then reproductive isolation in the ring species may be the result of an interplay between gradual changes of the adaptive landscape and self-organization. Therefore, hypothesis 3 can be modified in order to take into account the self-organization mechanism of dialect development predicted by the model. In this case, natural selection, acting along the ecological latitudinal gradient, modifies the structure of song variables (or, equivalently, opinions), and the self-organization patterns at the ends of the population ring are qualitatively different. Therefore, individuals in the overlap zone from different ends of the ring are not likely to imitate each other, which leads to reproductive isolation.

4.2. Human language dialects

Dialect patterns observed in human languages are very similar to those in birdsongs. Copying and imitative learning are the main mechanisms of individual language development.
At the same time, at the population level, language is not a fixed and homogeneous structure but rather a dynamic system changing in both time and space. Different kinds of language variations are widely distributed in human societies. Language variations may involve all parts of the language: the definitions of words, phonology, grammar, and also semantics (Hock and Joseph, 1996; Bichakjian, 2002; Cangelosi and Parisi, 2002). Such complex differentiation makes it very difficult to describe either the actual dialects or the differences between them, their borders, and dynamical patterns. Further difficulties emerge when scholars attempt to rank dialects according to their similarity and to establish a rigorous distance between them. Even when the differences among dialects are well described, as for example in isoglosses, the results of dialect distribution analysis are not consistent, because there usually are no distinct borders and changes in different isoglosses are not correlated (Kretzschmar, 1992; Livingstone, 2002). This also makes it difficult to apply the developed model explicitly to any specific case study. Therefore, I will consider only qualitative similarities between the simulated patterns and observed dialect patterns in different societies.

Language variations are observed on different levels (Francis, 1983, p. 42): (1) within the performance of an individual speaker (stylistic variations), (2) between individuals (idiolectal variations), and (3) between groups of individuals (dialectal variations). This classification can be used to model the origin of dialectal language variations as a result of self-organization of idiolectal variations based on stylistic variations. The prevailing view on dialects is summarized by W.N. Francis: “Any language spoken by more than a handful of people exhibits this tendency to split into dialects, which may differ from one another along many dimensions of language content, structure, and function: vocabulary, pronunciation, grammar, usage, social function, artistic and literary expression.” (Francis, 1983, p. 1). This observation is in agreement with my study: dialects should emerge in any society that is large enough.

Dialectal variations might be separated into several general cases (Francis, 1983): (1) geographic, (2) social, (3) ethnic, (4) sexual, and (5) age variations. Geographic variations are very common and may be observed in practically any language. There are well investigated examples of geographical dialects in Britain (Petyt, 1980, ch. 3), the United States (Kurath, 1998), France, Belgium, Germany (Francis, 1983), and other countries. Detailed case analysis shows that dialects usually change gradually and do not have strict borders; however, in some cases explicit borders arise as a result of geographical or political separations (Francis, 1983). The geographical type of language variation is well described by the spatially distributed model I have considered. The model predicts the existence of spatial clusters in which all individuals have similar opinions, with variable boundaries along which interactions and changes occur. This scheme is highly suitable for observed spatial dialect patterns, which can involve (as opinions) any structural elements of the language.

Social, ethnic, sexual, and age variations account for typical language variations among people who live in the same area and, therefore, a mean-field model should be applied.
Where language variations involve at least two different social groups, in which people are more likely to be influenced by individuals from their own group than by individuals from the opposite group, the resulting patterns are examples of the outcomes of homophilous imitation. Social dialects present language differences among different social classes. Even in a society that officially prohibits social inequalities and restricts the development of classes, such structures usually emerge as a result of natural associations as individuals gather to share similar duties, responsibilities, privileges, and constraints (Francis, 1983; McPherson et al., 2001). Racial or ethnic dialects result from the gathering together of individuals according to religious or national similarities. Similarly, sex- and age-related dialects emerge as people aggregate according to those types of classifications. I suggest that in many cases such dialects emerged in a homogeneous local community as a result of self-organization according to the model with variable α (Section 3.1.2). In such cases, individuals aggregate in strictly distinct groups and no individual can be convinced by someone from the opposite party to accept his opinion. Only under such a strong rule can completely different dialects be present in a homogeneous society in which interactions are random and unlimited. However, sometimes language variations may also be related to local geographical isolation. For example, it is possible to consider ethnic enclaves in the United States: many local communities are heavily populated by particular ethnic groups (African-American, Italian, Chinese, Russian, etc.). Ethnic language variations are thus often considered to occur in a homogeneous community when a local geographical aggregation is actually being observed.
Therefore, such cases should be simulated with the spatially distributed models using a constant or variable α, depending on the level of social separation.

Another spatially distributed model was employed by Robert Axelrod in order to explain the dissemination of culture and, in particular, the origins of dialect (Axelrod, 1997, pp. 149–177). In this model individuals change their opinions (features) with a probability equal to their cultural similarity, much as in the model with variable α. However, unlike that model, individual interactions in the Axelrod model also involve associated traits. Although the model structure, interaction rules, and population spatial distribution of the Axelrod model and of the spatially distributed model with variable α (Section 3.1.2) are different, the general predictions of both models are similar. Both models predict the complete polarization of society into groups of individuals that have nothing in common and therefore cannot interact. Such social groups can occupy a region and stay there forever, because individuals cannot be influenced to change their opinions by the completely different individuals in their neighborhood. Obviously, these models cannot comprehensively describe the development of geographical dialects, because they do not predict gradual changes or the stable spatial coexistence of different dialects. Still, these spatial models can be useful in some cases when strong disruptive selection is involved, for example, when extreme social or ethnic dialects have a spatial distribution. By contrast, the model with constant α predicts spatial patterns that are similar to those of geographical dialects.

One of the most significant differences between dialects in human societies and those in bird populations is the existence of official policies. In most human societies, one dialect is designated as a standard dialect or standard language.
Usually, the people using this dialect will tend to belong to the higher social classes and live in the country’s capital (Petyt, 1980). This kind of organization immediately leads to social inequality, as individuals of lower social rank or individuals who live in peripheral areas of the country can be immediately recognized by their dialect; moreover, this recognition is often associated with a social disadvantage for them.

It is sometimes difficult to separate different dialects from different languages (Petyt, 1980). This is a common problem with both birdsong and human dialects, and it is closely related to the problems of ring speciation and of speciation in general. The general criterion is mutual intelligibility. However, many practical difficulties may arise because geographical dialects often adhere to no explicit borders (Francis, 1983; Irwin et al., 2001a,b; Livingstone, 2002).

In summary, self-organization can play a significant role in dialect development. It appears that the spatially distributed model with constant α predicts patterns very similar to the geographical birdsong and language dialect patterns. I suggest that such self-organization may be a possible mechanism for dialect origin and that this process interplays in a complex way with selection and official policies. In particular, in some cases when disruptive or divergent selection is involved, such as the case of some extreme ethnic or social dialects, the model with variable α is justified, and two distinct social groups develop with very limited interactions between individuals.

5. Conclusions

Imitation is one of the basic types of behavior and it is widely employed in learning and cultural transmission. In this paper I considered how social patterns at the population level can emerge from individual-level patterns via imitative behavior.
The relatively simple model demonstrates that self-organization patterns similar to dialects can emerge from individual imitations via drift. The minimal condition is the existence of two distinct social groups (parties). Also, the process should start from a high level of disorder, which means that there are numerous variations of selected patterns at the individual level. These conditions apparently are not restrictive, and therefore this self-organization mechanism could be widely distributed in both human societies and animal populations. The model does not involve any payoffs related to the imitative behavior and does not include selection. Therefore, population-level patterns may result only from individual imitations, with no selection involved. However, more complicated models can be considered to investigate further how selection and social norms can interplay with this mechanism of dialect origin. The examples considered demonstrate that this mechanism is probably involved in the development of birdsong and language dialects. Also, it appears that the self-organization involved in speciation mechanisms is related to that which occurs in dialect development, which can lead to the reproductive isolation of bird species and, likewise, to the origin of new human languages.

Acknowledgements

I am grateful to Simon Levin for his advice and critical comments, and to Catherine Galdun, David Vaccari and Timothy Ryan for useful suggestions. I also would like to acknowledge anonymous reviewers who made very useful comments.

Appendix A. Probabilities of opinion change

A.1. One party model

Fred can use two simple criteria, P and P′, to determine the probability of adopting Ethel’s opinion. P = (i + 1)/k is determined by how many of Fred’s and Ethel’s opinions will be similar if Fred changes his jth opinion (this probability is adopted in this paper). P′ = i/(k − 1) means that Fred will adopt Ethel’s opinion with a probability determined by how many of Fred’s and Ethel’s opinions are already similar, given that the jth opinion is different. The difference between P and P′ is:

(P − P′) = (k − 1 − i)/(k(k − 1)) = 1/k − i/(k(k − 1))

The following simple observations explain why P is chosen in the model introduced in Section 2.1:

Observation A.1.

(1) If i = k − 1 then (P − P′) = 0 and P = P′ = 1. This means that when all of Ethel’s other k − 1 opinions are similar to Fred’s opinions, Fred will change his opinion with probability 1 in both cases. In this case both criteria are equal.

(2) (P − P′) lies in the interval [0, 1/k]. Let us consider (P − P′) as a function of i (i changes from 0 to k − 1) when k is a constant. It is a linear function that decreases as i increases. This function (P − P′)(i) has a maximum of 1/k when i = 0 and a minimum of 0 when i = k − 1. This means that the difference between the two criteria is greatest when Fred and Ethel have a small number of similar opinions. In the case where they do not have any similar opinions, P′ = 0 and Fred has no chance to be convinced under that criterion, but under P he still has a chance to adopt Ethel’s opinion, since P = 1/k. This is the main difference between the criteria and the reason to use P.

(3) If i is a constant then lim_{k→∞} (P − P′) = 0. This means that the difference between the two criteria decreases (at the rate 1/k) as the number of opinions increases. When k is large enough, Fred’s decisions will be very similar whichever of the criteria he uses. His decisions can be qualitatively different only in the case where i = 0.

The following proposition outlines some obvious properties of P:

Proposition A.1. P = (i + 1)/k has the following properties:

1. P is a rational number from the interval (0, 1].
2. P is equal to 0 only in the limit case lim_{k→∞} P = 0 for a constant i. This means that Fred always has a chance to adopt Ethel’s opinion, even if Ethel completely disagrees with him on all opinions (in this case P = 1/k).
3. P is equal to 1 if and only if Fred and Ethel differ only in the jth opinion, and all other k − 1 opinions are the same.

A.2. Bipartisan model

Observation A.2. The probabilistic criterion to change opinion in the two-party model, P = a(1 − α)(i + 1)/k + α(i + 1)/k, introduced in Section 2.2.1, depends on the party affiliation of the individuals. There are two cases:

1. If both individuals (Fred and Ethel) belong to the same party, then a = 1 and

$$P = \frac{i+1}{k}. \tag{A.1}$$

This probability is the same as the probability that is used in the one-party model.

2. If Fred and Ethel are from different parties, then a = 0 and the probability that Fred changes his jth opinion is

$$P = \alpha\,\frac{i+1}{k}. \tag{A.2}$$

The probability to change opinion if individuals are from different parties is α times the probability to change opinion if individuals are from the same party.

3. The values of P always belong to the interval [0, 1] for realistic values of i, i.e. when i ≤ k − 1.

Appendix B. The “Diversity” criterion

The presented two-party model describes how individuals are distributed in several groups that represent different combinations of opinions. At the initial state, all possible groups have practically the same number of individuals (homogeneous distribution). This homogeneous distribution represents the maximally disordered state of the system, where all other states are equally possible. Therefore entropy at this state is at a maximum (Landsberg, 2002; Davison and Shiner, 2003). One of the most typical outcomes of the considered processes in the mean-field model is the concentration of individuals in some groups and the disappearance of the other groups. This process leads to one of the possible stationary states of the system.
Therefore, disorder and entropy decrease and self-organization emerges (Shiner, 1996). If all individuals are concentrated in one group, then entropy has its minimum and the system is completely ordered. This state can be considered as the opposite extreme state to the homogeneous distribution in a sense of order of the system. It is convenient to characterize this process with one parameter which decreases monotonically from 1 to 0 when distribution changes from the homogeneous to an opposite extreme state. The considered population is of finite size, and to construct such a parameter it is possible to normalize a central moment of the distribution, or to apply some non-linear transformation such as the Shannon entropy. These criteria are introduced as follows. 1. Normalized square root difference between homogeneous distribution and the given distribution, i.e., the second moment of the given distribution. The square root difference between the distributions is determined by the formula:    xi − q N q 2 , where N is the number of individuals in population, q is the number of groups of individuals and {x1 , . . . , xq } is the given distribution of individual between groups. If the given distribution is the same as the homogeneous distribution, this parameter is equal to 0. In the extreme case, when all individuals are concentrated in one group, this parameter equals: d1 =  xi ln xi q (qN − 1/q) ln(qN − 1/q) + (q − 1)(N/q) ln(N/q) . Criteria d and d1 are closely correlated (r = 0.93, p = 0.04) in course of the random process and, therefore, either one of them can be used to characterize the disorder of the given random process. However, the criterion d is computationally more simple. d is called the “diversity criterion” or “diversity” and is used throughout the paper to characterize the considered random process. The diversity criterion can be used as a measure of entropy and disorder only for the mean-field model (Section 3.1. 
Selforganization in the spatially distributed model is a local spatial process that involves some number of individuals in a neighborhood. Therefore, a distribution of individuals among different groups and the relateds diversity criteria are not comprehensive characteristics of disorder and self-organization for the spatialdistributed model (see Section 3.2). Appendix C. Continuous approximations of the individual-based models The individual-based model developed in Section 2 is a discrete stochastic process, namely a discrete Markov chain. Under the following assumptions this model can be approximated by an analytically tractable system of ordinary differential equations: 1. Number of individuals is sufficiently large (can be considered, infinite). 2. Individuals interact in continuous time. 3. Interactions among individuals lead to immediate results. Such an approximation can be used to investigate structure of stationary states in terms of densities of social groups and their stability. It is common in studying non-linear dynamical systems to use linearizations and to obtain qualitative results of local meaning (for example, local stability of stationary states). In this case, however, the global behavior of the model was completely investigated when time tends to go to infinity. C.1. One-party model  qN − 1 2 q  N 2 + (q − 1) q The normalized square root difference is subtracted from 1 to obtain the criterion d, having the maximum if a given distribution is the homogeneous distribution: d=1−  (xi − (N/q)) x = {0, 0}, y = {0, 1}, z = {1, 0}, v = {1, 1}. 2 q 2 (qN − 1/q) + (q − 1)(N/q) In the model of one party proposed in Section 2.1 individuals are represented as strings of two elements, 0 or 1. For simplicity the case of two distinct independent opinions is considered. In this case the society consists of 4 different social groups, which are represented by their densities: 2 (B.1) 2. 
Normalized Shannon entropy The Shannon entropy was suggested as a measure of entropy and the normalized Shannon entropy was used as a direct measure of disorder in similar problems (Landsberg, 2002; Davison and Shiner, 2003). The Shannon entropy is determined by the formula: Individuals from each group have the same number of opportunities to change their opinions and therefore move into another group with probability P. The model describing the dynamics of social groups is represented by the following system of ordinary differential equations: x′ (t) = y(t)z(t) − x(t)v(t), y′ (t) = −y(t)z(t) + x(t)v(t), z ′ (t) = −y(t)z(t) + x(t)v(t), (C.1) v′ (t) = y(t)z(t) − x(t)v(t).  q xi ln xi . Proposition C.1. All stationary states of system (C.1) located on a line in the {x, y, z, v} space. 2638 N. Strigul / Ecological Modelling 220 (2009) 2624–2639 To address the global behavior of system (C.1), consider a new variable U(t) = y(t)z(t) − x(t)v(t). The differential equation for U(t) is x(t) + v(t) + y(t) + z(t) + x1 (t) + y1 (t) + z1 (t) + v1 (t) ≤ C, ′ U (t) = −SU(t), where S is a constant equal to the total number of individuals x(t) + y(t) + z(t) + v(t) > 0. It is clear that U(t) decays exponentially to 0 as time increases. Also, the solution must be constrained to a line, which is an intersection of the three hyperplanes x(t) + z(t) = C1 , x(t) + y(t) = C2 , x(t) − v(t) = C3 , (C.2) where the constants, C1 , C2 and C3 are determined by the initial conditions. Therefore U(t) must converge to the intersection of these hyperplanes and U = 0. These analytical results are in agreement with computer simulations of the individual-based model having finite number of individuals. All possible groups of individuals are presented in the long-term simulations of the one-party society, in which no significant polarization occurs. Only stochastic oscillations around stationary states determined by the relations (C.2) are observed. 
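The behavior described for the one-party system (C.1) can be checked numerically. The following is a minimal sketch, not the simulation code used in the paper: the fourth-order Runge-Kutta integrator, the step size, and the initial densities are all assumptions chosen for illustration, and the helper names (`rhs`, `rk4_step`, `simulate`, `u_value`) are hypothetical. It integrates (C.1) and compares U(t) = y z − x v with the exponential decay U(0) exp(−S t), while also reporting the three conserved combinations from (C.2).

```python
import math


def rhs(state):
    """Right-hand side of system (C.1) for the densities (x, y, z, v)."""
    x, y, z, v = state
    u = y * z - x * v  # U(t) = y z - x v
    return (u, -u, -u, u)


def rk4_step(state, dt):
    """One classical Runge-Kutta step (the integrator is an assumption)."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, dt / 2))
    k3 = rhs(shifted(state, k2, dt / 2))
    k4 = rhs(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state, t_end, dt=0.01):
    """Integrate (C.1) from t = 0 to t = t_end with a fixed step."""
    for _ in range(round(t_end / dt)):
        state = rk4_step(state, dt)
    return state


def u_value(state):
    """U = y z - x v for a given state."""
    x, y, z, v = state
    return y * z - x * v


# Hypothetical initial densities; their sum S equals 1.
initial = (0.1, 0.4, 0.3, 0.2)
t_end = 5.0
final = simulate(initial, t_end)

total = sum(initial)  # S, conserved by (C.1)
predicted_u = u_value(initial) * math.exp(-total * t_end)

x, y, z, v = final
print("U(t_end) numerical:", u_value(final))
print("U(0) * exp(-S t)  :", predicted_u)
print("invariants (C.2)  :", x + z, x + y, x - v)
```

Under these assumptions the numerical U(t) should follow the exponential law closely, and the combinations x + z, x + y and x − v should stay at their initial values, so the trajectory settles on the line described by (C.2).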
C.2. Bipartisan model

In this case each individual is represented by a string of three elements, each of which is 1 or 0. The first element determines the party affiliation and the other two determine opinions. Therefore, there exist 8 different groups of individuals:

x = {0, 0, 0}, y = {0, 0, 1}, z = {0, 1, 0}, v = {0, 1, 1},
x1 = {1, 0, 0}, y1 = {1, 0, 1}, z1 = {1, 1, 0}, v1 = {1, 1, 1}.

Interactions between individuals are determined by the model described in Section 2.2.1. In the limit case the model can be presented as a system of ordinary differential equations. The general system has a quite complex structure and cannot be solved algebraically. A simpler special case is considered below, where individuals can change their party affiliation but cannot change their opinions in contacts with individuals from the opposite party. The system of ordinary differential equations for this model is:

$$
\begin{align}
x'(t) &= y(t)z(t) - x(t)v(t) + x_1(t), \tag{C.3}\\
y'(t) &= -y(t)z(t) + x(t)v(t) + y_1(t)/2 - y(t)/2, \tag{C.4}\\
z'(t) &= -y(t)z(t) + x(t)v(t) + z_1(t)/2 - z(t)/2, \tag{C.5}\\
v'(t) &= y(t)z(t) - x(t)v(t) - v(t), \tag{C.6}\\
x_1'(t) &= y_1(t)z_1(t) - x_1(t)v_1(t) - x_1(t), \tag{C.7}\\
y_1'(t) &= -y_1(t)z_1(t) + x_1(t)v_1(t) + y(t)/2 - y_1(t)/2, \tag{C.8}\\
z_1'(t) &= -y_1(t)z_1(t) + x_1(t)v_1(t) - z_1(t)/2 + z(t)/2, \tag{C.9}\\
v_1'(t) &= y_1(t)z_1(t) - x_1(t)v_1(t) + v(t). \tag{C.10}
\end{align}
$$

Proposition C.2. The system (C.3)–(C.10) approaches one of two possible stationary states as time tends to infinity:

$$y(t) = y_1(t) \ge 0,\quad v_1(t) \ge 0,\quad x(t) \ge 0,\quad v(t) = z(t) = z_1(t) = x_1(t) = 0, \tag{C.11}$$

$$z(t) = z_1(t) \ge 0,\quad v_1(t) \ge 0,\quad x(t) \ge 0,\quad v(t) = y(t) = y_1(t) = x_1(t) = 0. \tag{C.12}$$

All variables in the system (C.3)–(C.10) are non-negative and bounded:

$$x(t) \ge 0,\ v(t) \ge 0,\ y(t) \ge 0,\ z(t) \ge 0,\ x_1(t) \ge 0,\ y_1(t) \ge 0,\ z_1(t) \ge 0,\ v_1(t) \ge 0,$$

$$x(t) + v(t) + y(t) + z(t) + x_1(t) + y_1(t) + z_1(t) + v_1(t) \le C,$$

where C is a constant determined by the initial conditions. Assume C = 1; then

$$x(t),\ v(t),\ y(t),\ z(t),\ x_1(t),\ y_1(t),\ z_1(t),\ v_1(t) \le 1.$$

Now subtract (C.6) from (C.3):

$$x'(t) - v'(t) = x_1(t) + v(t). \tag{C.13}$$

The right hand side of (C.13) is non-negative and bounded. Now I will show that $\lim_{t\to\infty}(x_1(t) + v(t)) = 0$. First assume the contrary, $\lim_{t\to\infty}(x_1(t) + v(t)) > 0$; then

$$\lim_{t\to\infty}\,(x(t) - v(t)) = \lim_{t\to\infty}\int_0^t (x_1(s) + v(s))\,ds \to \infty,$$

which contradicts the fact that x(t) is bounded. Therefore

$$\lim_{t\to\infty} x_1(t) = 0,\quad \lim_{t\to\infty} v(t) = 0. \tag{C.14}$$

Now it follows from (C.6), (C.7), and (C.14) that

$$\lim_{t\to\infty} y(t)z(t) = 0,\quad \lim_{t\to\infty} y_1(t)z_1(t) = 0. \tag{C.15}$$

Let us consider (C.4): the first two terms of the right hand side vanish as time tends to infinity. The other two terms should tend to be equal, since $\lim_{t\to\infty} y'(t) = 0$ and also because y(t) is bounded:

$$\lim_{t\to\infty} y_1(t) = \lim_{t\to\infty} y(t).$$

Similarly, from (C.5) it follows that

$$\lim_{t\to\infty} z_1(t) = \lim_{t\to\infty} z(t). \tag{C.16}$$

These two states, (C.11) and (C.12), determine all possible stationary points of the system (C.3)–(C.10), and the system globally converges to these stationary states. The structure of these stationary states is similar to the structure of the stationary states that were observed in computer simulations (see Section 3.1.1).

References

Adolphs, R., 2003. Cognitive neuroscience of human social behaviour. Nat. Rev. Neurosci. 4, 165–178.
Artemyev, A.V., 2008. Population Ecology of the Pied Flycatcher in the Northern Part of the Range. Nauka, Moscow.
Axelrod, R.M., 1997. The Complexity of Cooperation: Agent-based Models of Competition and Collaboration. Princeton University Press, Princeton.
Baker, M.C., Cunningham, M.A., 1985. The biology of bird song dialects. Behav. Brain Sci. 8, 85–133.
Baker, A.J., Jenkins, P.F., 1987. Founder effect and cultural evolution of songs in an isolated population of chaffinches, Fringilla coelebs, in the Chatham Islands. Anim. Behav.
35, 1793–1803.
Barrington, D., 1773. Experiments and observations on the singing of birds (1683–1775). Philos. Trans. 63, 249–291.
Bichakjian, B.H., 2002. Language in a Darwinian Perspective. Peter Lang Publishing, Frankfurt.
Briskie, J.V., 1999. Song variation and the structure of local song dialects in the polygynandrous Smith’s Longspur. Can. J. Zool. 77, 1587–1594.
Cangelosi, A., Parisi, D., 2002. Computer Simulation: A New Scientific Approach to the Study of Language Evolution. In: Cangelosi, A., Parisi, D. (Eds.), Simulating the Evolution of Language. Springer Verlag, London, pp. 3–28.
Castro, L., Toro, M.A., 2004. The evolution of culture: from primate social learning to human culture. Proc. Natl. Acad. Sci. U.S.A. 101, 10235–10240.
Catchpole, C.K., Slater, P.J.B., 2008. Bird Song: Biological Themes and Variations. Cambridge University Press, New York.
Conlisk, J., Gong, J.C., Tong, C.H., 2000. Imitation and the dynamics of norms. Math. Soc. Sci. 40, 197–213.
Davison, M., Shiner, J.S., 2003. Many entropies, many disorders. Open Syst. Inf. Dyn. 10, 281–296.
De la Sen, M., 2007. The environment carrying capacity is not independent of the intrinsic growth rate for subcritical spawning stock biomass in the Beverton–Holt equation. Ecol. Modell. 204, 271–273.
Durrett, R., Levin, S.A., 2005. Can stable social groups be maintained by homophilous imitation alone? J. Econ. Behav. Organ. 57, 267–286.
Ellers, J., Slabbekoorn, H., 2003. Song divergence and male dispersal among bird populations: a spatially explicit model testing the role of vocal learning. Anim. Behav. 65, 671–681.
Francis, W.N., 1983. Dialectology. Longman, London.
Galef Jr., B.G., Laland, K.N., 2005. Social learning in animals: empirical studies and theoretical models. Bioscience 55, 489–499.
Grant, B.R., Grant, P.R., 1996. Cultural inheritance of song and its role in the evolution of Darwin’s finches. Evolution 50, 2471–2487.
Grant, P.R., Grant, B.R., 1997a. Hybridization, sexual imprinting, and mate choice. Am. Nat. 149, 1–28.
Grant, P.R., Grant, B.R., 1997b. Mating patterns of Darwin’s finch hybrids determined by song and morphology. Biol. J. Linn. Soc. Lond. 60, 317–343.
Heyes, C., 2001. Causes and consequences of imitation. Trends Cogn. Sci. 5, 253–261.
Heyes, C., 2003. Four routes of cognitive evolution. Psychol. Rev. 110, 713–727.
Hock, H.H., Joseph, B.D., 1996. Language History, Language Change, and Language Relationship: An Introduction to Historical and Comparative Linguistics. Mouton de Gruyter, Berlin.
Hui, C., 2006. Carrying capacity, population equilibrium, and environment’s maximal load. Ecol. Modell. 192, 317–320.
Irwin, D.E., 2000. Song variation in an avian ring species. Evolution 54, 998–1010.
Irwin, D.E., Bensch, S., Price, T.D., 2001a. Speciation in a ring. Nature 409, 333–337.
Irwin, D.E., Irwin, J.H., Price, T.D., 2001b. Ring species as bridges between microevolution and speciation. Genetica 112, 223–243.
Jahoda, G., 2002. The ghosts in the meme machine. Hist. Hum. Sci. 15, 55–68.
Jarvis, E.D., 2004. Learned birdsong and the neurobiology of human language. Ann. N. Y. Acad. Sci. 1016, 749–777.
Kauffman, S.A., 1993. The Origins of Order: Self-Organization and Selection in Evolution. Oxford University Press, New York.
Kauffman, S.A., 1995. At Home in the Universe: The Search for the Laws of Self-Organization and Complexity. Oxford University Press, New York.
Krebs, J.R., Kroodsma, D.E., 1980. Repertoires and geographical variation in bird song. Adv. Study Behav. 11, 134–177.
Kretzschmar Jr., W.A., 1992. Isoglosses and predictive modeling. Am. Speech 67, 227–249.
Kurath, H., 1998. The sociocultural background of dialect areas in American English. In: Linn, M.D. (Ed.), Handbook of Dialects and Language Variation, 2nd ed. Academic Press, San Diego, pp. 105–122.
Laland, K.N., 2004. Social learning strategies. Learn. Behav. 32, 4–14.
Landsberg, P.T., 2002.
Fragmentations, mergings and order: aspects of entropy. Physica A 305, 32–40.
Leigh Jr., E.G., 2007. Neutral theory: a historical perspective. Evol. Biol. 20, 2075–2091.
Lemon, R.E., 1975. How birds develop song dialects. Condor 77, 385–406.
Livingstone, D., 2002. The Evolution of Dialect Diversity. In: Cangelosi, A., Parisi, D. (Eds.), Simulating the Evolution of Language. Springer Verlag, London, pp. 99–118.
Lucanus, H.V., 1907. Lokale Gesangserscheinungen und Vogeldialekte; ihre Ursachen und Entstehungen. Orn. Monatsber. 15, 109–122.
Malchevsky, A.S., 1958. Local songs and geographical variability of song in birds. Vestn. Lenin. U. Biol. 9, 100–119.
Malchevsky, A.S., 1959. The Nesting Life of Songbirds. The Breeding and Postembryonal Development of the Forest Passerines of the European part of the USSR. Nauka, Leningrad.
Malchevsky, A.S., Pukinsky, J.B., 1983. Birds of Leningrad District and Neighbouring Territories. v. 2. Passerines. Leningrad University Press, Leningrad.
Marler, P., 1997. Three models of song learning: evidence from behavior. J. Neurobiol. 33, 501–516.
Martens, J., Geduldig, G., 1988. Akustische Barrieren beim Waldbaumläufer (Certhia familiaris)? J. Ornithol. 129, 417–432.
Mayr, E., 1963. Animal Species and Evolution. Harvard University Press, Cambridge.
McCarthy, M.A., 1997. The Allee effect, finding mates and theoretical models. Ecol. Modell. 103, 99–102.
McPherson, M., Smith-Lovin, L., Cook, J.M., 2001. Birds of a feather: homophily in social networks. Annu. Rev. Sociol. 27, 415–444.
Miklosi, A., 1999. The ethological analysis of imitation. Biol. Rev. 74, 347–374.
Millstein, R.L., 2002. Are random drift and natural selection conceptually distinct? Biol. Philos. 17, 33–53.
Mundinger, P.C., 1982. Microgeographic and macrogeographic variation in the acquired vocalizations of birds. In: Kroodsma, D.E., Miller, E.H. (Eds.), Acoustic Communication in Birds. v. 2. Academic Press, New York, pp. 147–208.
Nelson, D.A., Khanna, H., Marler, P., 2001. Learning by instruction or selection: Implications for patterns of geographic variation in bird song. Behaviour 138, 1137–1160.
Nowicki, S., Searcy, W.A., 2004. Song function and the evolution of female preferences: why birds sing, why brains matter. Ann. N. Y. Acad. Sci. 1016, 704–723.
Offerman, T., Potters, J., Sonnemans, J., 2002. Imitation and belief learning in an oligopoly experiment. Rev. Econ. Stud. 69, 973–997.
Petyt, K.M., 1980. The Study of Dialect: An Introduction to Dialectology. Westview Press, Boulder, Colorado.
Podos, J., Warren, P.S., 2007. The evolution of geographic variation in birdsong. Adv. Study Behav. 37, 403–458.
Shiner, J.S., 1996. Self-organization, entropy, and order in growing systems. In: Schweitzer, F. (Ed.), Self-organization of Complex Structures: From Individual to Collective Dynamics. Gordon and Breach, London, pp. 21–35.
Sigmund, K., Hauert, C., Nowak, M.A., 2001. Reward and punishment. Proc. Natl. Acad. Sci. U.S.A. 98, 10757–10762.
Slabbekoorn, H., Smith, T.B., 2002. Bird song, ecology and speciation. Philos. Trans. R. Soc. Lond., B: Biol. Sci. 357, 493–503.
Slater, P.J.B., 2003. Fifty years of bird song research: a case study in animal behaviour. Anim. Behav. 65, 633–639.
Smith, M.J., 1982. Evolution and the Theory of Games. Cambridge University Press, Cambridge.
Strigul, N.S., 2001. Observations of the Common Treecreeper (Certhia familiaris) pairs, successfully raising two broods per summer. Russ. J. Ornithol. 157, 742–752.
Strigul, N.S., Pristinski, D., Purves, D., Dushoff, J., Pacala, S.W., 2008. Scaling from trees to forests: tractable macroscopic equations for forest dynamics. Ecol. Monogr. 78, 523–545.
Thielcke, G., 1960. Mischgesang der Baumläufer Certhia brachydactyla und C. familiaris. J. Ornithol. 101, 286–290.
Thielcke, G., Wüstenberg, K., 1985. Experiments on the origin of dialects in the short-toed tree creeper (Certhia brachydactyla). Behav. Ecol. Sociobiol. 16, 195–201.
Thielcke, G., 1986. Waldbaumläufer (Certhia familiaris) singen bei Sympatrie mit dem Gartenbaumläufer (C. brachydactyla) nicht kontrastreicher. J. Ornithol. 127, 43–49.
Thielcke, G., 1988. Neue Befunde bestätigen Baron Pernau’s (1660–1731) Angaben über Lautäußerungen des Buchfinken (Fringilla coelebs). J. Ornithol. 129, 55–70.
Thielcke, G., Krome, M., 1989. Experimente über sensible Phasen und Gesangsvariabilität beim Buchfinken (Fringilla coelebs). J. Ornithol. 130, 435–453.
Thielcke, G., 1992. Stabilität und Änderungen von Dialekten und Dialektgrenzen beim Gartenbaumläufer (Certhia brachydactyla). J. Ornithol. 133, 43–59.
Thorpe, W.H., 1954. The process of song learning in the chaffinch as studied by means of the sound spectrograph. Nature 173, 465–469.
Thorpe, W.H., 1961. Bird-Song: The Biology of Vocal Communication and Expression in Birds. Cambridge University Press, Cambridge.
von Frisch, K., 1962. Dialects in the language of the bees. Sci. Am. 207, 79–87.
Whitehead, H., Rendell, L., Osborne, R.W., Würsig, B., 2004. Culture and conservation of non-humans with reference to whales and dolphins: review and new directions. Biol. Conserv. 120, 427–437.
Wickler, W., 1982. Immanuel Kant and the song of the house sparrow. Auk 99, 590–591.
Zimin, V.B., 1988. Ecology of passerines in the Northwest of the USSR. Nauka, Leningrad.