ACI
Americas Headquarters
Cisco Systems, Inc.
170 West Tasman Drive
San Jose, CA 95134-1706
USA
http://www.cisco.com
Tel: 408 526-4000
800 553-NETS (6387)
Fax: 408 527-0883
THE SPECIFICATIONS AND INFORMATION REGARDING THE PRODUCTS IN THIS MANUAL ARE SUBJECT TO CHANGE WITHOUT NOTICE. ALL STATEMENTS,
INFORMATION, AND RECOMMENDATIONS IN THIS MANUAL ARE BELIEVED TO BE ACCURATE BUT ARE PRESENTED WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED. USERS MUST TAKE FULL RESPONSIBILITY FOR THEIR APPLICATION OF ANY PRODUCTS.
THE SOFTWARE LICENSE AND LIMITED WARRANTY FOR THE ACCOMPANYING PRODUCT ARE SET FORTH IN THE INFORMATION PACKET THAT SHIPPED WITH
THE PRODUCT AND ARE INCORPORATED HEREIN BY THIS REFERENCE. IF YOU ARE UNABLE TO LOCATE THE SOFTWARE LICENSE OR LIMITED WARRANTY,
CONTACT YOUR CISCO REPRESENTATIVE FOR A COPY.
The Cisco implementation of TCP header compression is an adaptation of a program developed by the University of California, Berkeley (UCB) as part of UCB's public domain version
of the UNIX operating system. All rights reserved. Copyright © 1981, Regents of the University of California.
NOTWITHSTANDING ANY OTHER WARRANTY HEREIN, ALL DOCUMENT FILES AND SOFTWARE OF THESE SUPPLIERS ARE PROVIDED "AS IS" WITH ALL FAULTS.
CISCO AND THE ABOVE-NAMED SUPPLIERS DISCLAIM ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING, WITHOUT LIMITATION, THOSE OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OR ARISING FROM A COURSE OF DEALING, USAGE, OR TRADE PRACTICE.
IN NO EVENT SHALL CISCO OR ITS SUPPLIERS BE LIABLE FOR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, OR INCIDENTAL DAMAGES, INCLUDING, WITHOUT
LIMITATION, LOST PROFITS OR LOSS OR DAMAGE TO DATA ARISING OUT OF THE USE OR INABILITY TO USE THIS MANUAL, EVEN IF CISCO OR ITS SUPPLIERS
HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
Any Internet Protocol (IP) addresses and phone numbers used in this document are not intended to be actual addresses and phone numbers. Any examples, command display output, network
topology diagrams, and other figures included in the document are shown for illustrative purposes only. Any use of actual IP addresses or phone numbers in illustrative content is unintentional
and coincidental.
Cisco and the Cisco logo are trademarks or registered trademarks of Cisco and/or its affiliates in the U.S. and other countries. To view a list of Cisco trademarks, go to this URL:
http://www.cisco.com/go/trademarks. Third-party trademarks mentioned are the property of their respective owners. The use of the word partner does not imply a partnership
relationship between Cisco and any other company. (1110R)
CHAPTER 8 ACI Transit Routing, Route Peering, and EIGRP Support 173
ACI Transit Routing 173
Transit Routing Use Cases 174
ACI Fabric Route Peering 178
Route Redistribution 179
Route Peering by Protocol 180
Transit Route Control 183
Default Policy Behavior 185
EIGRP Protocol Support 185
EIGRP L3extOut Configuration 187
EIGRP Interface Profile 188
Audience
This guide is intended primarily for data center administrators with responsibilities and expertise in one or
more of the following:
• Virtual machine installation and administration
• Server administration
• Switch and network administration
Document Conventions
Command descriptions use the following conventions:
Convention              Description
bold                    Bold text indicates the commands and keywords that you enter literally as shown.
Italic                  Italic text indicates arguments for which the user supplies the values.
[x | y]                 Square brackets enclosing keywords or arguments separated by a vertical bar indicate an optional choice.
variable                Indicates a variable for which you supply values, in context where italics cannot be used.
string                  A nonquoted set of characters. Do not use quotation marks around the string or the string will include the quotation marks.
screen font             Terminal sessions and information the switch displays are in screen font.
boldface screen font    Information you must enter is in boldface screen font.
italic screen font      Arguments for which you supply values are in italic screen font.
Note Means reader take note. Notes contain helpful suggestions or references to material not covered in the
manual.
Caution Means reader be careful. In this situation, you might do something that could result in equipment damage
or loss of data.
Related Documentation
Application Policy Infrastructure Controller (APIC) Documentation
The following companion guides provide documentation for APIC:
• Cisco APIC Getting Started Guide
• Cisco APIC Basic Configuration Guide
• Cisco ACI Fundamentals
• Cisco APIC Layer 2 Networking Configuration Guide
• Cisco APIC Layer 3 Networking Configuration Guide
• Cisco APIC NX-OS Style Command-Line Interface Configuration Guide
• Cisco APIC REST API Configuration Guide
• Cisco APIC Layer 4 to Layer 7 Services Deployment Guide
• Cisco ACI Virtualization Guide
• Cisco Application Centric Infrastructure Best Practices Guide
Documentation Feedback
To provide technical feedback on this document, or to report an error or omission, please send your comments
to apic-docfeedback@cisco.com. We appreciate your feedback.
Maximum MTU Increased
    Description: To enable setting the MTU used in communicating with the external network to 9216, the maximum MTU has been increased from 9000 to 9216 bytes.
    Where documented: Networking and Management Connectivity
Remote Leaf Switches
    Description: With an ACI fabric deployed, you can extend ACI services and APIC management to remote datacenters with Cisco ACI leaf switches that have no local spine switch or APIC attached.
    Where documented: Remote Leaf Switches in Network and Management Connectivity
Graceful Insertion and Removal (GIR) Mode
    Description: The Graceful Insertion and Removal (GIR) mode or maintenance mode allows you to isolate a switch from the network with minimum service disruption.
    Where documented: Fabric Provisioning
Local User Authentication using OTP
    Description: OTP is a one-time password that is valid for only one session. Once OTP is enabled, the APIC generates a random, human-readable OTP key: 16 binary octets encoded in base32.
    Where documented: User Access, Authentication, and Accounting
802.1Q Tunnel Enhancements
    Description: Now you can configure ports on core-switches for use in Dot1q Tunnels for multiple customers. You can also define access VLANs to distinguish between customers consuming the corePorts. You can also disable MAC learning on Dot1q Tunnels.
    Where documented: Fabric Provisioning
Table 5: New Features and Changed Behavior in Cisco ACI, Release 2.2(2e)
Table 6: New Features and Changed Behavior in Cisco APIC 2.2(1n) Release
802.1Q Tunnels
    Description: You can now configure 802.1Q tunnels to enable point-to-multi-point tunneling of Ethernet frames in the fabric, with Quality of Service (QoS) priority settings.
    Where documented: 802.1Q Tunnels in Network and Management Connectivity
APIC Cluster Cold Standby
    Description: Support is added to operate the APICs in a cluster in an Active/Standby mode. In an APIC cluster, the designated active APICs share the load and the designated standby APICs can act as a replacement for any of the APICs in an active cluster.
    Where documented: APIC Cluster Management in Fabric Provisioning
Contract Preferred Groups
    Description: Support is added for contract preferred groups that enable greater control of communication between EPGs in a VRF. If most of the EPGs in the VRF should have open communication, but a few should only have limited communication with the other EPGs, you can configure a combination of a contract preferred group and contracts with filters to control communication precisely.
    Where documented: Contracts in ACI Policy Model
FCoE Supported over FEX
    Description: You can now configure FCoE over FEX ports.
    Where documented: Supporting Fibre Channel over Ethernet Traffic on the ACI Fabric in Fabric Provisioning
CDP supported in policies on interfaces to FEX devices
    Description: In this release, support is added for CDP on interfaces to FEX devices.
    Where documented: Fabric Provisioning
Table 7: New Features and Changed Behavior in Cisco APIC 2.1(1h) Release
Table 8: New Features and Changed Behavior in Cisco APIC 2.0(2f) release
Install Tetration Analytics
    Description: Cisco Tetration Analytics agent installation is added.
    Where documented: About Cisco Tetration Analytics Agent Installation, on page 265
Route Target Filtering
    Description: Route Target Filtering is added, to optimize BGP routing tables by filtering the routes that are stored on them.
    Where documented: Route Target filtering, on page 160
Multipod QoS
    Description: Support for Preserving CoS and DSCP settings is added for Multipod topologies.
    Where documented: Preserving QoS Priority Settings in a Multipod Fabric, on page 96
Table 9: New Features and Changed Behavior in Cisco APIC 2.0(1m) release
Copy Services
    Description: Unlike SPAN that duplicates all of the traffic, the Cisco Application Centric Infrastructure (ACI) copy services feature enables selectively copying portions of the traffic between endpoint groups, according to the specifications of the contract.
    Where documented: About Copy Services, on page 46
Layer 3 EVPN Services Over Fabric WAN
    Description: The Layer 3 EVPN services over fabric WAN feature enables much more efficient and scalable ACI fabric WAN connectivity. It uses EVPN over OSPF for WAN routers that are connected to spine switches.
    Where documented: Cisco ACI GOLF, on page 158
EPG Deployment through AEP
    Description: Attached entity profiles can be associated directly with application EPGs, which deploys the associated application EPGs to all those ports associated with the attached entity profile.
    Where documented: Attachable Entity Profile, on page 36
Fibre Channel over Ethernet (FCoE)
    Description: Fibre Channel over Ethernet (FCoE) support.
    Where documented: Supporting Fibre Channel over Ethernet Traffic on the ACI Fabric, on page 70
Configuration Zone Supported Policies
    Description: An updated list of policies is supported for configuration zones.
    Where documented: Configuration Zone Supported Policies, on page 297
Table 10: New Features and Changed Behavior in Cisco APIC 1.3(x) and switch 11.3(x) release
Bug fixes
    Description: Updates to tagged EPG topic
    Where documented: Native 802.1p and Tagged EPGs on Interfaces, on page 33
Table 11: New Features and Changed Behavior in Cisco APIC Release 1.2(2x)
Data plane policing
    Description: Use data plane policing (DPP) to manage bandwidth consumption on ACI fabric access interfaces.
    Where documented: Data Plane Policing, on page 88
DSCP marking
    Description: Previously, DSCP marking could only be set on a L3Out but now can be set on the following: Contract; Subject; In Term; Out Term.
    Where documented: Preserving 802.1P Class of Service Settings, on page 95
IPv6 support for management interfaces
    Description: Unrestricted IPv6 support for all ACI fabric and APIC interfaces; IPv4, or IPv6, or dual stack configurations are supported. The requirement to allow only IPv4 addresses on management interfaces no longer applies.
    Where documented: IPv6 Support, on page 133
BGP dynamic neighbors, route dampening, weight attribute, remove-private-as; OSPF name lookup, prefix suppression, and type 7 translation
    Description: Expanded support for BGP and OSPF options.
    Where documented: Route Peering by Protocol, on page 180
Table 12: New Features and Changed Behavior in Cisco APIC Release 1.2(1x)
Support for Public Subnets under EPG
    Description: An EPG that provides a shared service must have its subnet configured under that EPG (not under a bridge domain), and its scope must be set to advertised externally, and shared between VRFs.
    Where documented: Bridge Domains and Subnets, on page 38
Shared Layer 3 Out
    Description: A shared Layer 3 Out configuration provides routed connectivity to external networks as a shared service. An l3extInstP EPG provides routed connectivity to external networks. It can be provisioned as a shared service in any tenant (user, common, infra, or mgmt.).
    Where documented: Shared Layer 3 Out, on page 150
Bug fix
    Description: Improved explanations of the subnet route export and route import configuration options.
    Where documented: Route Import and Export, Route Summarization, and Route Community Match, on page 146
Stats on Layer 3 routed interfaces for Billing
    Description: The APIC can be configured to collect byte count and packet count billing statistics from a port configured for routed connectivity to external networks (an l3extInstP EPG) as a shared service.
    Where documented: Routed Connectivity to External Networks as a Shared Service Billing and Statistics, on page 192
Static route with weights
    Description: Static route preference within the ACI fabric is carried in MP-BGP using cost extended community.
    Where documented: Static Route Preference, on page 145
Common pervasive gateway for IPv4 and secondary IP address for IPv4
    Description: Multiple ACI fabrics can be configured with an IPv4 common gateway on a per bridge domain basis.
    Where documented: Common Pervasive Gateway, on page 138
Fabric secure mode
    Description: Fabric secure mode prevents parties with physical access to the fabric equipment from adding a switch or APIC controller to the fabric without manual authorization by an administrator.
    Where documented: Fabric Secure Mode
CoS (802.1p)
    Description: The ACI fabric enables preserving 802.1p class of service (CoS) within the fabric. Enable the fabric global QoS policy dot1p-preserve option to guarantee that the 802.1p value in packets which enter and transit the ACI fabric is preserved.
    Where documented: Preserving 802.1P Class of Service Settings, on page 95
Table 13: New Features and Changed Behavior in Cisco APIC Release 1.1(2x)
Table 14: New Features and Changed Behavior in Cisco APIC Release 1.1(1x)
Transit routing
    Description: The ACI fabric supports transit routing, including the necessary EIGRP, eBGP, and OSPF protocol support, which enables border routers to perform bidirectional redistribution with other routing domains.
    Where documented: ACI Transit Routing, on page 173
Host vPC FEX
    Description: The ACI fabric supports Cisco Fabric Extender (FEX) server-side virtual port channels (VPC), also known as FEX straight-through VPC.
    Where documented: FEX Virtual Port Channels, on page 68
Per bridge domain multicast/broadcast packet control
    Description: An administrator can control the behavior of these packets per bridge domain.
    Where documented: Bridge Domains and Subnets, on page 38
Per port VLAN
    Description: Allows configuration of the same VLAN ID across different EPGs (on different bridge domains) on different ports on the same leaf switch. An administrator can now configure the same VLAN ID on every port on the same switch.
    Where documented: Per Port VLAN, on page 34
Loop detection
    Description: The ACI fabric can now detect loops in Layer 2 network segments that are connected to leaf switch access ports.
    Where documented: Loop Detection, on page 125
Various updates and bug fixes
    Description: Added vzAny introduction.
    Where documented: What vzAny Is, on page 47
    Description: Accounting.
    Where documented: Accounting, on page 191
    Description: Default policies.
    Where documented: Default Policies, on page 49
    Description: Contract scope.
    Where documented: Contracts, on page 42
    Description: Networking domains.
    Where documented: Networking Domains, on page 140
    Description: VMM domain concepts updated and procedures moved to new expanded ACI Virtualization Guide.
    Where documented: Cisco ACI VM Networking Support for Virtual Machine Managers, on page 203
Table 15: New Features and Changed Behavior in Cisco APIC Release 1.0(3x)
Update to the Endpoint Retention topic
    Description: Clarifies behavior of Bridge Domain flooding that updated the location of endpoints within an EPG subnet that spans multiple leaf switches within the BD.
    Where documented: Endpoint Retention, on page 115
Storm Control
    Description: Implements Layer 2 storm control.
    Where documented: About Traffic Storm Control, on page 112
AAA VMM Domain tags
    Description: VMM domains can be tagged as security domains so that they become visible to the users contained in the security domain.
    Where documented: User Access: Roles, Privileges, and Security Domains, on page 190
Atomic counters endpoint to IP address option
    Description: Enables selecting either the target MAC address or IP address.
    Where documented: Atomic Counters, on page 275
Delete VMM domain guidelines
    Description: Identifies recommended workflow sequence.
    Where documented: See the Guidelines for Deleting VMM Domains topic in the Virtual Machine Manager Domains Chapter.
Custom RBAC Rules
    Description: Identifies use case scenarios and guidelines for developing custom RBAC rules.
    Where documented: Custom RBAC Rules. See Sample RBAC Rules in Configuring Security in Cisco APIC REST API Configuration Guide
Health Score calculations
    Description: Identifies how system, pod, tenant, and MO level health scores are calculated.
    Where documented: Health Scores, on page 277
Multinode SPAN ERSPAN guidelines and header types
    Description: Identifies ERSPAN header types and guidelines for using ERSPAN.
    Where documented: Multinode SPAN, on page 282
EPG untagged and tagged VLAN headers
    Description: Provides guidelines and limitations for using untagged EPG VLANs.
    Where documented: Endpoint Groups, on page 29
Bridge Domain legacy mode
    Description: Provides guidelines for configuring legacy mode bridge domains.
    Where documented: Bridge Domains and Subnets, on page 38
Updates to AAA LDAP and TACACS+ configurations with examples
    Description: Adds AAA LDAP and TACACS+ configuration examples.
    Where documented: LDAP/Active Directory Authentication, on page 200; TACACS+ Authentication, on page 199
Update to the DHCP Relay topic
    Description: Provides guidelines regarding the requirement to configure a single bridge domain with a single subnet when establishing a relation with a DHCP relay.
    Where documented: DHCP Relay, on page 127
Various text edits to improve readability and a correction to a misspelled word in an image
    Description: Readability improvements and additional details in several topics.
    Where documented: See the Fundamentals, Provisioning, and Networking chapters.
The APIC manages the scalable ACI multitenant fabric. The APIC provides a unified point of automation
and management, policy programming, application deployment, and health monitoring for the fabric. The
APIC, which is implemented as a replicated synchronized clustered controller, optimizes performance, supports
any application anywhere, and provides unified operation of the physical and virtual infrastructure. The APIC
enables network administrators to easily define the optimal network for applications. Data center operators
can clearly see how applications consume network resources, easily isolate and troubleshoot application and
infrastructure problems, and monitor and profile resource usage patterns.
APIC fabric management functions do not operate in the data path of the fabric. The following figure shows
an overview of the leaf/spine ACI fabric.
The ACI fabric provides consistent low-latency forwarding across high-bandwidth links (40 Gbps, with a
100-Gbps future capability). Traffic with the source and destination on the same leaf switch is handled locally,
and all other traffic travels from the ingress leaf to the egress leaf through a spine switch. Although this
architecture appears as two hops from a physical perspective, it is actually a single Layer 3 hop because the
fabric operates as a single Layer 3 switch.
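The forwarding behavior described above can be pictured with a small Python model. This is only an illustrative sketch: the switch names and the functions fabric_path and physical_hops are hypothetical, and the code does not reflect how the fabric actually computes forwarding.

```python
def fabric_path(ingress_leaf, egress_leaf, spine):
    """Return the switches a packet crosses inside the fabric (illustrative model)."""
    if ingress_leaf == egress_leaf:
        # Source and destination sit on the same leaf: handled locally.
        return [ingress_leaf]
    # All other traffic travels ingress leaf -> spine -> egress leaf.
    return [ingress_leaf, spine, egress_leaf]


def physical_hops(path):
    """Count the switch-to-switch links crossed along the path."""
    return len(path) - 1


# The fabric operates as a single Layer 3 switch, so from a routing
# perspective every path through it counts as one Layer 3 hop.
LAYER3_HOPS = 1

local = fabric_path("leaf-101", "leaf-101", "spine-201")
remote = fabric_path("leaf-101", "leaf-102", "spine-201")
print(local, "physical hops:", physical_hops(local), "Layer 3 hops:", LAYER3_HOPS)
print(remote, "physical hops:", physical_hops(remote), "Layer 3 hops:", LAYER3_HOPS)
```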
The ACI fabric object-oriented operating system (OS) runs on each Cisco Nexus 9000 Series node. It enables
programming of objects for each configurable element of the system.
The ACI fabric OS renders policies from the APIC into a concrete model that runs in the physical infrastructure.
The concrete model is analogous to compiled software; it is the form of the model that the switch operating
system can execute. The figure below shows the relationship of the logical model to the concrete model and
the switch OS.
All the switch nodes contain a complete copy of the concrete model. When an administrator creates a policy
in the APIC that represents a configuration, the APIC updates the logical model. The APIC then performs the
intermediate step of creating a fully elaborated policy that it pushes into all the switch nodes where the concrete
model is updated.
Note The Cisco Nexus 9000 Series switches can only execute the concrete model. Each switch has a copy of
the concrete model. If the APIC goes offline, the fabric keeps functioning but modifications to the fabric
policies are not possible.
The APIC is responsible for fabric activation, switch firmware management, network policy configuration,
and instantiation. While the APIC acts as the centralized policy and network management engine for the
fabric, it is completely removed from the data path, including the forwarding topology. Therefore, the fabric
can still forward traffic even when communication with the APIC is lost.
The Cisco Nexus 9000 Series switches offer modular and fixed 1-, 10-, and 40-Gigabit Ethernet switch
configurations that operate in either Cisco NX-OS stand-alone mode for compatibility and consistency with
the current Cisco Nexus switches or in ACI mode to take full advantage of the APIC's application policy-driven
services and infrastructure automation features.
The Representational State Transfer (REST) architecture is a key development method that supports cloud
computing. The ACI API is REST-based. The World Wide Web represents the largest implementation of a
system that conforms to the REST architectural style.
Cloud computing differs from conventional computing in scale and approach. Conventional environments
include software and maintenance requirements with their associated skill sets that consume substantial
operating expenses. Cloud applications use system designs that are supported by a very large scale infrastructure
that is deployed along a rapidly declining cost curve. In this infrastructure type, the system administrator,
development teams, and network professionals collaborate to provide a much higher valued contribution.
In conventional settings, network access for compute resources and endpoints is managed through virtual
LANs (VLANs) or rigid overlays, such as Multiprotocol Label Switching (MPLS), that force traffic through
rigidly defined network services, such as load balancers and firewalls. The APIC is designed for
programmability and centralized management. By abstracting the network, the ACI fabric enables operators
to dynamically provision resources in the network instead of in a static fashion. The result is that the time to
deployment (time to market) can be reduced from months or weeks to minutes. Changes to the configuration
of virtual or physical switches, adapters, policies, and other hardware and software components can be made
in minutes with API calls.
The transformation from conventional practices to cloud computing methods increases the demand for flexible
and scalable services from data centers. These changes call for a large pool of highly skilled personnel to
enable this transformation. A key feature of the APIC is its REST-based web API. The APIC REST API accepts
and returns HTTP or HTTPS messages that contain JavaScript Object Notation (JSON) or Extensible Markup
Language (XML) documents.
Today, many web developers use RESTful methods. Adopting web APIs across the network enables enterprises
to easily open up and combine services with other internal or external providers. This process transforms the
network from a complex mixture of static resources to a dynamic exchange of services on offer.
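To make the JSON and XML message formats concrete, the following Python sketch builds an authentication request body in both encodings without sending it. The aaaLogin endpoint path, the aaaUser payload shape, the host name, and the credentials are not described in this section; treat them as assumptions to verify against the Cisco APIC REST API Configuration Guide.

```python
import json
from xml.sax.saxutils import quoteattr


def build_login_request(host, username, password, fmt="json"):
    """Return (url, body) for a login request; nothing is sent over the network.

    The /api/aaaLogin path and the aaaUser payload are assumptions here.
    """
    if fmt not in ("json", "xml"):
        raise ValueError("fmt must be 'json' or 'xml'")
    url = "https://{}/api/aaaLogin.{}".format(host, fmt)
    if fmt == "json":
        body = json.dumps(
            {"aaaUser": {"attributes": {"name": username, "pwd": password}}}
        )
    else:
        # quoteattr escapes the value and adds the surrounding quotes.
        body = "<aaaUser name={} pwd={}/>".format(
            quoteattr(username), quoteattr(password)
        )
    return url, body


# Hypothetical host and credentials, for illustration only.
for fmt in ("json", "xml"):
    url, body = build_login_request("apic.example.com", "admin", "secret", fmt)
    print(url)
    print(body)
```

The same document is expressed in either encoding; the file extension on the URL selects which one the client is using.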
Managed object manipulation in the model relieves engineers from the task of administering isolated, individual
component configurations. These characteristics enable automation and flexible workload provisioning that
can locate any workload anywhere in the infrastructure. Network-attached services can be easily deployed,
and the APIC provides an automation framework to manage the life cycle of those network-attached services.
Logical Constructs
The policy model manages the entire fabric, including the infrastructure, authentication, security, services,
applications, and diagnostics. Logical constructs in the policy model define how the fabric meets the needs
of any of the functions of the fabric. The following figure provides an overview of the ACI policy model
logical constructs.
Fabric-wide or tenant administrators create predefined policies that contain application or shared resource
requirements. These policies automate the provisioning of applications, network-attached services, security
policies, and tenant subnets, which puts administrators in the position of approaching the resource pool in
terms of applications rather than infrastructure building blocks. The application needs to drive the networking
behavior, not the other way around.
Each node in the management information tree (MIT) represents a managed object (MO) or group of objects. MOs are
abstractions of fabric resources. An MO can represent a concrete object, such as a switch or adapter, or a logical
object, such as an application profile, endpoint group, or fault. The following figure provides an overview of the MIT.
The hierarchical structure starts with the policy universe at the top (Root) and contains parent and child nodes.
Each node in the tree is an MO and each object in the fabric has a unique distinguished name (DN) that
describes the object and locates its place in the tree.
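As a sketch of how a distinguished name locates an object in the tree, the following Python code splits a DN into its path components and lists the DN of every ancestor. The example DNs and the tn-, ap-, and epg- naming prefixes are assumptions chosen for illustration; the helper names split_dn and ancestors are hypothetical.

```python
def split_dn(dn):
    """Split a DN on '/' while keeping bracketed segments such as [eth1/1] intact."""
    parts, current, depth = [], [], 0
    for ch in dn:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "/" and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def ancestors(dn):
    """Return the DN of each node from the root down to the object itself."""
    names = split_dn(dn)
    return ["/".join(names[:i]) for i in range(1, len(names) + 1)]


# Illustrative DNs; the object names are made up for this example.
print(split_dn("uni/tn-ExampleCorp/ap-OnlineStore/epg-web"))
print(split_dn("topology/pod-1/node-101/sys/phys-[eth1/1]"))
print(ancestors("uni/tn-ExampleCorp/ap-OnlineStore"))
```

Each element of the ancestors list is itself a valid DN, which is what lets a DN both describe an object and fix its position under its parents.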
The following managed objects contain the policies that govern the operation of the system:
• APIC controllers comprise a replicated synchronized clustered controller that provides management,
policy programming, application deployment, and health monitoring for the multitenant fabric.
• A tenant is a container for policies that enable an administrator to exercise domain-based access control.
The system provides the following four kinds of tenants:
◦User tenants are defined by the administrator according to the needs of users. They contain policies
that govern the operation of resources such as applications, databases, web servers, network-attached
storage, virtual machines, and so on.
◦The common tenant is provided by the system but can be configured by the fabric administrator.
It contains policies that govern the operation of resources accessible to all tenants, such as firewalls,
load balancers, Layer 4 to Layer 7 services, intrusion detection appliances, and so on.
◦The infrastructure tenant is provided by the system but can be configured by the fabric administrator.
It contains policies that govern the operation of infrastructure resources such as the fabric VXLAN
overlay. It also enables a fabric provider to selectively deploy resources to one or more user tenants.
Infrastructure tenant policies are configurable by the fabric administrator.
◦The management tenant is provided by the system but can be configured by the fabric administrator.
It contains policies that govern the operation of fabric management functions used for in-band and
out-of-band configuration of fabric nodes. The management tenant contains a private out-of-band
address space for APIC/fabric internal communications; this address space is outside the fabric data
path and provides access through the management port of the switches. The management tenant enables
discovery and automation of communications with virtual machine controllers.
• Access policies govern the operation of switch access ports that provide connectivity to resources such
as storage, compute, Layer 2 and Layer 3 (bridged and routed) connectivity, virtual machine hypervisors,
Layer 4 to Layer 7 devices, and so on. If a tenant requires interface configurations other than those
provided in the default link, Cisco Discovery Protocol (CDP), Link Layer Discovery Protocol (LLDP),
Link Aggregation Control Protocol (LACP), or Spanning Tree, an administrator must configure access
policies to enable such configurations on the access ports of the leaf switches.
• Fabric policies govern the operation of the switch fabric ports, including such functions as Network
Time Protocol (NTP) server synchronization, Intermediate System-to-Intermediate System Protocol
(IS-IS), Border Gateway Protocol (BGP) route reflectors, Domain Name System (DNS) and so on. The
fabric MO contains objects such as power supplies, fans, chassis, and so on.
• Virtual Machine (VM) domains group VM controllers with similar networking policy requirements.
VM controllers can share VLAN or Virtual Extensible Local Area Network (VXLAN) space and
application endpoint groups (EPGs). The APIC communicates with the VM controller to publish network
configurations such as port groups that are then applied to the virtual workloads.
• The Layer 4 to Layer 7 service integration life cycle automation framework enables the system to dynamically
respond when a service comes online or goes offline. Policies provide service device package and
inventory management functions.
• Access, authentication, and accounting (AAA) policies govern user privileges, roles, and security domains
of the Cisco ACI fabric.
The hierarchical policy model fits well with the REST API. When invoked, the API reads from or
writes to objects in the MIT. URLs map directly into distinguished names that identify objects in the MIT.
Any data in the MIT can be described as a self-contained structured tree text document encoded in XML or
JSON.
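The mapping between URLs and distinguished names can be sketched in Python as follows. The /api/mo/ path prefix and the .json or .xml suffix are assumptions based on common APIC REST usage rather than on this section, and the host name and functions mo_url and dn_from_url are hypothetical.

```python
from urllib.parse import urlparse

MO_PREFIX = "/api/mo/"          # assumed URL prefix for managed-object access
FORMATS = ("json", "xml")


def mo_url(host, dn, fmt="json"):
    """Build the URL that addresses the object with the given DN."""
    if fmt not in FORMATS:
        raise ValueError("fmt must be 'json' or 'xml'")
    return "https://{}{}{}.{}".format(host, MO_PREFIX, dn, fmt)


def dn_from_url(url):
    """Recover the DN from a managed-object URL."""
    path = urlparse(url).path
    if not path.startswith(MO_PREFIX):
        raise ValueError("not a managed-object URL")
    # The encoding suffix follows the last dot; the DN is everything before it.
    dn, _, ext = path[len(MO_PREFIX):].rpartition(".")
    if ext not in FORMATS or not dn:
        raise ValueError("URL must end in .json or .xml")
    return dn


url = mo_url("apic.example.com", "uni/tn-ExampleCorp")
print(url)
print(dn_from_url(url))
```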
Tenants
A tenant (fvTenant) is a logical container for application policies that enable an administrator to exercise
domain-based access control. A tenant represents a unit of isolation from a policy perspective, but it does not
represent a private network. Tenants can represent a customer in a service provider setting, an organization
or domain in an enterprise setting, or just a convenient grouping of policies. The following figure provides
an overview of the tenant portion of the management information tree (MIT).
Figure 6: Tenants
Tenants can be isolated from one another or can share resources. The primary elements that the tenant contains
are filters, contracts, outside networks, bridge domains, Virtual Routing and Forwarding (VRF) instances,
and application profiles that contain endpoint groups (EPGs). Entities in the tenant inherit its policies. VRFs
are also known as contexts; each VRF can be associated with multiple bridge domains.
Note In the APIC GUI under the tenant navigation path, a VRF (context) is called a private network.
Tenants are logical containers for application policies. The fabric can contain multiple tenants. You must
configure a tenant before you can deploy any Layer 4 to Layer 7 services. The ACI fabric supports IPv4, IPv6,
and dual-stack configurations for tenant networking.
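The containment described above (a tenant holding VRFs, bridge domains, and application profiles that contain EPGs) can be expressed as a nested JSON document. The following Python sketch builds such a document. The class names fvTenant, fvCtx, fvAp, and fvAEPg appear in this guide; fvBD, fvRsCtx, the tnFvCtxName attribute, and all object names are assumptions used for illustration, and the helper mo is hypothetical.

```python
import json


def mo(class_name, children=(), **attributes):
    """Build one managed-object node as {class: {attributes, children}}."""
    node = {class_name: {"attributes": dict(attributes)}}
    if children:
        node[class_name]["children"] = list(children)
    return node


tenant = mo(
    "fvTenant",
    name="ExampleCorp",
    children=[
        # A VRF (context) owned by the tenant.
        mo("fvCtx", name="prod-vrf"),
        # A bridge domain associated with that VRF (assumed relation class).
        mo("fvBD", name="web-bd",
           children=[mo("fvRsCtx", tnFvCtxName="prod-vrf")]),
        # An application profile containing two EPGs.
        mo("fvAp", name="OnlineStore", children=[
            mo("fvAEPg", name="web"),
            mo("fvAEPg", name="db"),
        ]),
    ],
)

payload = json.dumps(tenant, indent=2)
print(payload)
```

Because entities in the tenant inherit its policies, nesting the child objects under fvTenant mirrors both the tree position and the policy scope of each object.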
VRFs
A Virtual Routing and Forwarding (VRF) object (fvCtx) or context is a tenant network (called a private
network in the APIC GUI). A tenant can have multiple VRFs. A VRF is a unique Layer 3 forwarding and
application policy domain. The following figure shows the location of VRFs in the management information
tree (MIT) and their relation to other objects in the tenant.
Figure 7: VRFs
A VRF defines a Layer 3 address domain. One or more bridge domains are associated with a VRF. All of the
endpoints within the Layer 3 domain must have unique IP addresses because it is possible to forward packets
directly between these devices if the policy allows it. A tenant can contain multiple VRFs. After an administrator
creates a logical device, the administrator can create a VRF for the logical device, which provides a selection
criteria policy for a device cluster. A logical device can be selected based on a contract name, a graph name,
or the function node name inside the graph.
Note In the APIC GUI, a VRF (fvCtx) is also called a "Context" or "Private Network."
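The requirement that endpoint IP addresses be unique within a VRF, while the same address may be reused in a different VRF, can be checked with a few lines of Python. This is a minimal sketch over a hypothetical inventory of (VRF, endpoint, address) tuples; the function find_duplicate_ips and the sample data are not part of any Cisco tool.

```python
import ipaddress
from collections import defaultdict


def find_duplicate_ips(endpoints):
    """Return {(vrf, ip): [endpoint names]} for addresses used more than once in a VRF."""
    seen = defaultdict(list)
    for vrf, name, ip in endpoints:
        # Normalizing through ipaddress makes equivalent spellings compare equal.
        seen[(vrf, ipaddress.ip_address(ip))].append(name)
    return {key: names for key, names in seen.items() if len(names) > 1}


# Hypothetical inventory: the same address in two VRFs is fine,
# but two endpoints sharing it inside one VRF is a conflict.
inventory = [
    ("prod-vrf", "web-1", "10.0.0.5"),
    ("prod-vrf", "web-2", "10.0.0.5"),
    ("dev-vrf", "test-1", "10.0.0.5"),
    ("prod-vrf", "db-1", "2001:db8::10"),
]

for (vrf, ip), names in find_duplicate_ips(inventory).items():
    print("conflict in", vrf, "for", ip, "between", ", ".join(names))
```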
Application Profiles
An application profile (fvAp) defines the policies, services and relationships between endpoint groups (EPGs).
The following figure shows the location of application profiles in the management information tree (MIT)
and their relation to other objects in the tenant.
Application profiles contain one or more EPGs. Modern applications contain multiple components. For
example, an e-commerce application could require a web server, a database server, data located in a storage
area network, and access to outside resources that enable financial transactions. The application profile contains
as many (or as few) EPGs as necessary that are logically related to providing the capabilities of an application.
EPGs can be organized according to one of the following:
• The application they provide, such as a DNS server or SAP application (see Tenant Policy Example in
Cisco APIC REST API Configuration Guide).
• The function they provide (such as infrastructure)
• Where they are in the structure of the data center (such as DMZ)
• Whatever organizing principle that a fabric or tenant administrator chooses to use
Endpoint Groups
The endpoint group (EPG) is the most important object in the policy model. The following figure shows where
application EPGs are located in the management information tree (MIT) and their relation to other objects in
the tenant.
An EPG is a managed object that is a named logical entity that contains a collection of endpoints. Endpoints
are devices that are connected to the network directly or indirectly. They have an address (identity), a location,
attributes (such as version or patch level), and can be physical or virtual. Knowing the address of an endpoint
also enables access to all its other identity details. EPGs are fully decoupled from the physical and logical
topology. Endpoint examples include servers, virtual machines, network-attached storage, or clients on the
Internet. Endpoint membership in an EPG can be dynamic or static.
The ACI fabric can contain the following types of EPGs:
• Application endpoint group (fvAEPg)
• Layer 2 external outside network instance endpoint group (l2extInstP)
• Layer 3 external outside network instance endpoint group (l3extInstP)
• Management endpoint groups for out-of-band (mgmtOoB) or in-band (mgmtInB) access
EPGs contain endpoints that have common policy requirements such as security, virtual machine mobility
(VMM), QoS, or Layer 4 to Layer 7 services. Rather than being configured and managed individually,
endpoints are placed in an EPG and are managed as a group.
Policies apply to EPGs, never to individual endpoints. An EPG can be statically configured by an administrator
in the APIC, or dynamically configured by an automated system such as vCenter or OpenStack.
Note When an EPG uses a static binding path, the encapsulation VLAN associated with this EPG must be part
of a static VLAN pool. For IPv4/IPv6 dual-stack configurations, the IP address property is contained in
the fvStIp child property of the fvStCEp MO. Multiple fvStIp objects supporting IPv4 and IPv6 addresses
can be added under one fvStCEp object. When upgrading ACI from IPv4-only firmware to versions of
firmware that support IPv6, the existing IP property is copied to an fvStIp MO.
Regardless of how an EPG is configured, EPG policies are applied to the endpoints they contain.
WAN router connectivity to the fabric is an example of a configuration that uses a static EPG. To configure
WAN router connectivity to the fabric, an administrator configures an l3extInstP EPG that includes any
endpoints within an associated WAN subnet. The fabric learns of the EPG endpoints through a discovery
process as the endpoints progress through their connectivity life cycle. Upon learning of the endpoint, the
fabric applies the l3extInstP EPG policies accordingly. For example, when a WAN connected client initiates
a TCP session with a server within an application (fvAEPg) EPG, the l3extInstP EPG applies its policies to
that client endpoint before the communication with the fvAEPg EPG web server begins. When the client-server
TCP session ends and communication between the client and server terminates, that endpoint no longer exists
in the fabric.
Note If a leaf switch is configured for static binding (leaf switches) under an EPG, the following restrictions
apply:
• The static binding cannot be overridden with a static path.
• Interfaces in that switch cannot be used for routed external network (L3out) configurations.
• Interfaces in that switch cannot be assigned IP addresses.
Virtual machine management connectivity to VMware vCenter is an example of a configuration that uses a
dynamic EPG. Once the virtual machine management domain is configured in the fabric, vCenter triggers the
dynamic configuration of EPGs that enable virtual machine endpoints to start up, move, and shut down as
needed.
IP-Based EPGs
Although encapsulation-based EPGs are commonly used, IP-based EPGs are suitable in networks where there
is a need for large numbers of EPGs that cannot be supported by Longest Prefix Match (LPM) classification.
IP-based EPGs do not require allocating a network/mask range for each EPG, unlike LPM classification. Also,
a unique bridge domain is not required for each IP-based EPG. The configuration steps for an IP-based EPG
are like those for configuring a virtual IP-based EPG that is used in the Cisco AVS vCenter configuration.
Observe the following guidelines and limitations of IP-based EPGs:
• IP-based EPGs are supported starting with the APIC 1.1(2x) and ACI switch 11.1(2x) releases on the
following Cisco Nexus N9K switches:
◦Switches with "E" on the end of the switch name, for example, N9K-C9372PX-E.
◦Switches with "EX" on the end of the switch name, for example, N9K-93108TC-EX.
The APIC raises a fault when you attempt to deploy IP-based EPGs on older switches that do not support
them.
• IP-based EPGs can be configured for specific IP addresses or subnets, but not IP address ranges.
• IP-based EPGs are not supported in the following scenarios:
◦In combination with static EP configurations.
◦External, infrastructure tenant (infra) configurations will not be blocked, but they do not take effect,
because there is no Layer 3 learning in this case.
◦In Layer 2-only bridge domains, IP-based EPG does not take effect, because there is no routed
traffic in this case. If proxy ARP is enabled on Layer 3 bridge domains, the traffic is routed even
if endpoints are in the same subnet, so IP-based EPG works in this case.
◦Configurations with a prefix that is used both for shared services and an IP-based EPG.
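The guideline that IP-based EPGs accept specific IP addresses or subnets, but not address ranges, can be sketched as a simple pre-check. This is an illustrative helper built on the Python standard library ipaddress module, not APIC validation logic; the function name is hypothetical.

```python
import ipaddress


def is_valid_ip_epg_criterion(value):
    """Return True for a single IP address or a subnet in prefix notation.

    Anything else, such as a dash-separated address range, is rejected.
    """
    try:
        ipaddress.ip_address(value)  # single host address, IPv4 or IPv6
        return True
    except ValueError:
        pass
    try:
        ipaddress.ip_network(value)  # subnet such as 10.0.0.0/24
        return True
    except ValueError:
        return False


for candidate in ("10.0.0.5", "10.0.0.0/24", "10.0.0.1-10.0.0.20"):
    print(candidate, is_valid_ip_epg_criterion(candidate))
```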
(FEXs). Access policies enable an administrator to configure port channels and virtual port channels, protocols
such as LLDP, CDP, or LACP, and features such as monitoring or diagnostics.
In the policy model, EPGs are tightly coupled with VLANs. For traffic to flow, an EPG must be deployed on
a leaf port with a VLAN in a physical, VMM, L2out, L3out, or Fibre Channel domain. For more information,
see Networking Domains, on page 140.
In the policy model, the domain profile associated to the EPG contains the VLAN instance profile. The domain
profile contains both the VLAN instance profile (VLAN pool) and the Attachable Access Entity Profile
(AEP), which are associated directly with application EPGs. The AEP deploys the associated application
EPGs to all the ports to which it is attached, and automates the task of assigning VLANs. While a large data
center could easily have thousands of active virtual machines provisioned on hundreds of VLANs, the ACI
fabric can automatically assign VLAN IDs from VLAN pools. This saves a tremendous amount of time,
compared with trunking down VLANs in a traditional data center.
VLAN Guidelines
Use the following guidelines to configure the VLANs where EPG traffic will flow.
• Multiple domains can share a VLAN pool, but a single domain can only use one VLAN pool.
• To deploy multiple EPGs with same VLAN encapsulation on a single leaf switch, see Per Port VLAN,
on page 34.
There are some differences in traffic handling, depending on the switch, when a leaf switch port is associated
with a single EPG that is configured as Access (802.1p) or Access (Untagged) modes.
Generation 1 Switches
• If the port is configured in Access (802.1p) mode:
◦On egress, if the access VLAN is the only VLAN deployed on the port, then traffic will be untagged.
◦On egress, if the port has other (tagged) VLANs deployed along with an untagged EPG, then traffic
from that EPG is zero tagged.
◦On egress, for all FEX ports, traffic is untagged, irrespective of one or more VLAN tags configured
on the port.
◦The port accepts ingress traffic that is untagged, tagged, or in 802.1p mode.
Generation 2 Switches
Generation 2 switches, or later, do not distinguish between the Access (Untagged) and Access (802.1p)
modes. When EPGs are deployed on Generation 2 ports configured with either Untagged or 802.1p mode:
• On egress, traffic is always untagged on a node where this is deployed.
• The port accepts ingress traffic that is untagged, tagged, or in 802.1p mode.
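The egress behavior listed above can be summarized in a small decision function. This is a minimal sketch that covers only the cases stated in this section: Access (802.1p) mode on Generation 1 switches, and either access mode on Generation 2 switches. The function name and return strings are hypothetical.

```python
def egress_tagging(generation, other_tagged_vlans=False, fex_port=False):
    """Return how egress traffic of an access-mode EPG leaves the port.

    generation: 1 or 2 (treat later generations as 2).
    other_tagged_vlans: True if tagged VLANs are deployed on the same port.
    fex_port: True if the port is a FEX port.
    """
    if generation not in (1, 2):
        raise ValueError("generation must be 1 or 2")
    if generation == 2:
        # Generation 2 does not distinguish Untagged from 802.1p mode.
        return "untagged"
    if fex_port:
        # Generation 1 FEX ports always send untagged traffic.
        return "untagged"
    # Generation 1, Access (802.1p): VLAN 0 tag only when other tagged
    # VLANs share the port.
    return "zero-tagged" if other_tagged_vlans else "untagged"


print(egress_tagging(1, other_tagged_vlans=True))
print(egress_tagging(2, other_tagged_vlans=True))
```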
EPG 1 on port 1, with VLAN mode: | EPG 1 on different ports, the following VLAN modes are allowed:
Trunk                            | Trunk or 802.1p
Untagged                         | Untagged

EPG 1 on port 1, with VLAN mode: | EPG 1 on port 2, the following modes are allowed: | EPG 2 on port 1, the following modes are allowed:
Untagged                         | Untagged                                          | Trunk
Note Certain older network interface cards (NICs) that send traffic on the native VLAN untagged drop return
traffic that is tagged as VLAN 0. This is normally only a problem on interfaces configured as trunk ports.
However, if an Attachable Entity Profile (AEP) for an access port is configured to carry the infra VLAN,
then it is treated as a trunk port, even though it is configured as an access port. In these circumstances,
packets sent on the native VLAN from the switch with Network Flow Engine (NFE) cards will be tagged
as VLAN 0, and older switch NICs may drop them. Options to address this issue include:
• Removing the infra VLAN from the AEP.
• Configuring "port local scope" on the port. This enables per-port VLAN definition and allows the
switch equipped with NFE to send packets on the native VLAN, untagged.
To enable deploying multiple EPGs using the same encapsulation number, on a single leaf switch, use the
following guidelines:
• EPGs must be associated with different bridge domains.
• EPGs must be deployed on different ports.
• Both the port and EPG must be associated with the same domain that is associated with a VLAN pool
that contains the VLAN number.
• Ports must be configured with portLocal VLAN scope.
For example, with Per Port VLAN for the EPGs deployed on ports 3 and 9 in the diagram above, both using
VLAN-5, port 3 and EPG1 are associated with Dom1 (pool 1) and port 9 and EPG2 are associated with Dom2
(pool 2).
Traffic coming from port 3 is associated with EPG1, and traffic coming from port 9 is associated with EPG2.
This does not apply to ports configured for Layer 3 external outside connectivity.
Note Avoid adding more than one domain to the AEP that is used to deploy the EPG on the ports, to avoid the
risk of traffic forwarding issues.
Only ports that have the vlanScope set to portLocal allow allocation of separate (Port, VLAN) translation
entries in both ingress and egress directions. For a given port with the vlanScope set to portGlobal (the
default), each VLAN used by an EPG must be unique on a given leaf switch.
Note Per Port VLAN is not supported on interfaces configured with Multiple Spanning Tree (MST), which
requires VLAN IDs to be unique on a single leaf switch, and the VLAN scope to be global.
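The Per Port VLAN guidelines lend themselves to a pre-deployment check. The sketch below compares two EPG deployments on one leaf switch and lists which guidelines they violate. The Deployment record and function name are hypothetical, and the domain and VLAN pool association rule is deliberately left out to keep the example short.

```python
from collections import namedtuple

# One EPG deployment on a leaf switch port (hypothetical record layout).
Deployment = namedtuple("Deployment", "epg bridge_domain port vlan vlan_scope")


def vlan_reuse_violations(a, b):
    """List the Per Port VLAN guidelines that two deployments violate."""
    if a.vlan != b.vlan:
        return []  # different encapsulations never conflict here
    problems = []
    if a.bridge_domain == b.bridge_domain:
        problems.append("EPGs must be associated with different bridge domains")
    if a.port == b.port:
        problems.append("EPGs must be deployed on different ports")
    if a.vlan_scope != "portLocal" or b.vlan_scope != "portLocal":
        problems.append("ports must be configured with portLocal VLAN scope")
    return problems


epg1 = Deployment("EPG1", "BD1", "eth1/3", 5, "portLocal")
epg2 = Deployment("EPG2", "BD2", "eth1/9", 5, "portLocal")
print(vlan_reuse_violations(epg1, epg2))
```

With the default portGlobal scope on either port, the same pair is reported as a violation, which matches the rule that each VLAN must then be unique on the leaf switch.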
Reusing VLAN Numbers Previously Used for EPGs on the Same Leaf Switch
If you have previously configured VLANs for EPGs that are deployed on a leaf switch port, and you want to
reuse the same VLAN numbers for different EPGs on different ports on the same leaf switch, use a process,
such as the following example, to set them up without disruption:
In this example, EPGs were previously deployed on a port associated with a domain including a VLAN pool
with a range of 9-100. You want to configure EPGs using VLAN encapsulations from 9-20.
1 Configure a new VLAN pool on a different port (with a range of, for example, 9-20).
2 Configure a new physical domain that includes leaf ports that are connected to firewalls.
3 Associate the physical domain to the VLAN pool you configured in step 1.
4 Configure the VLAN Scope as portLocal for the leaf port.
5 Associate the new EPGs (used by the firewall in this example) to the physical domain you created in step
2.
6 Deploy the EPGs on the leaf ports.
When an EPG is deployed on a VPC, it must be associated with the same domain (with the same VLAN pool)
that is assigned to the leaf switch ports on the two legs of the VPC.
In this diagram, EPG A is deployed on a VPC that is deployed on ports on Leaf switch 1 and Leaf switch 2.
The two leaf switch ports and the EPG are all associated with the same domain, containing the same VLAN
pool.
Note When creating a VPC domain between two leaf switches, both switches must be in the same switch
generation, one of the following:
• Generation 1 - Cisco Nexus N9K switches without "EX" or "FX" on the end of the switch name; for
example, N9K-9312TX
• Generation 2 - Cisco Nexus N9K switches with "EX" or "FX" on the end of the switch name; for
example, N9K-93108TC-EX
Switches such as these two are not compatible VPC peers. Instead, use switches of the same generation.
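The naming rule in this note can be expressed as a short compatibility check. This sketch applies only the suffix rule stated here ("EX" or "FX" at the end of the model name); it makes no attempt to handle other model suffixes, and the function names are hypothetical.

```python
def switch_generation(model):
    """Return 2 if the model name ends in EX or FX, otherwise 1."""
    return 2 if model.upper().endswith(("EX", "FX")) else 1


def compatible_vpc_peers(model_a, model_b):
    """VPC peers must belong to the same switch generation."""
    return switch_generation(model_a) == switch_generation(model_b)


print(compatible_vpc_peers("N9K-9312TX", "N9K-93108TC-EX"))
```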
An Attachable Entity Profile (AEP) represents a group of external entities with similar infrastructure policy
requirements. The infrastructure policies consist of physical interface policies that configure various protocol
options, such as Cisco Discovery Protocol (CDP), Link Layer Discovery Protocol (LLDP), or Link Aggregation
Control Protocol (LACP).
An AEP is required to deploy VLAN pools on leaf switches. Encapsulation blocks (and associated VLANs)
are reusable across leaf switches. An AEP implicitly provides the scope of the VLAN pool to the physical
infrastructure.
The following AEP requirements and dependencies must be accounted for in various configuration scenarios,
including network connectivity, VMM domains, and multipod configuration:
• The AEP defines the range of allowed VLANs but it does not provision them. No traffic flows unless
an EPG is deployed on the port. Without defining a VLAN pool in an AEP, a VLAN is not enabled on
the leaf port even if an EPG is provisioned.
• A particular VLAN is provisioned or enabled on the leaf port based on EPG events, either static
binding on a leaf port or VM events from external controllers such as VMware vCenter or
Microsoft System Center Virtual Machine Manager (SCVMM).
• Attachable entity profiles can be associated directly with application EPGs, which deploys the associated
application EPGs to all the ports associated with the attachable entity profile. The AEP has a configurable
generic function (infraGeneric), which contains a relation to an EPG (infraRsFuncToEpg) that is deployed
on all interfaces that are part of the selectors that are associated with the attachable entity profile.
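The first two requirements combine into a simple condition: a VLAN is enabled on a leaf port only when the AEP's VLAN pool allows it and an EPG deployment actually uses it. The following sketch states that condition directly; the function and parameter names are hypothetical.

```python
def vlan_enabled_on_port(vlan, aep_vlan_ranges, epg_vlans_on_port):
    """Return True if the VLAN is both allowed by the AEP and provisioned.

    aep_vlan_ranges: iterable of inclusive (low, high) encapsulation blocks.
    epg_vlans_on_port: set of VLANs used by EPGs deployed on the port.
    """
    allowed = any(low <= vlan <= high for low, high in aep_vlan_ranges)
    provisioned = vlan in epg_vlans_on_port
    return allowed and provisioned


pool = [(9, 100)]
print(vlan_enabled_on_port(20, pool, {20}))    # allowed and provisioned
print(vlan_enabled_on_port(20, pool, set()))   # allowed, but no EPG deployed
print(vlan_enabled_on_port(200, pool, {200}))  # EPG deployed, VLAN not in pool
```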
A virtual machine manager (VMM) domain automatically derives physical interface policies from the interface
policy groups of an AEP.
An override policy at the AEP can be used to specify a different physical interface policy for a VMM domain.
This policy is useful in scenarios where a VM controller is connected to the leaf switch through an intermediate
Layer 2 node, and a different policy is desired at the leaf switch and VM controller physical ports. For example,
you can configure LACP between a leaf switch and a Layer 2 node. At the same time, you can disable LACP
between the VM controller and the Layer 2 switch by disabling LACP under the AEP override policy.
A bridge domain (BD) must be linked to a VRF (also known as a context or private network). It must have at least one subnet
(fvSubnet) associated with it. The BD defines the unique Layer 2 MAC address space and a Layer 2 flood
domain if such flooding is enabled. While a VRF defines a unique IP address space, that address space can
consist of multiple subnets. Those subnets are defined in one or more BDs that reference the corresponding
VRF.
The options for a subnet under a BD or under an EPG are as follows:
• Public—the subnet can be exported to a routed connection.
• Private—the subnet applies only within its tenant.
• Shared—the subnet can be shared with and exported to multiple VRFs in the same tenant or across
tenants as part of a shared service. An example of a shared service is a routed connection to an EPG
present in another VRF in a different tenant. This enables traffic to pass in both directions across VRFs.
An EPG that provides a shared service must have its subnet configured under that EPG (not under a
BD), and its scope must be set to advertised externally, and shared between VRFs.
Note Shared subnets must be unique across the VRFs involved in the communication. When
a subnet under an EPG provides a Layer 3 external network shared service, such a subnet
must be globally unique within the entire ACI fabric.
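The subnet options can be illustrated with a small payload builder. The fvSubnet class is named in this guide; the attribute names ip and scope, and the comma-separated form of the scope value, are assumptions that should be checked against the APIC object model reference for your release. The gateway address is hypothetical.

```python
VALID_SCOPES = {"public", "private", "shared"}


def subnet_payload(gateway_ip, scopes):
    """Build an fvSubnet dict with a validated, comma-separated scope."""
    unknown = set(scopes) - VALID_SCOPES
    if unknown:
        raise ValueError(f"unknown subnet scope(s): {sorted(unknown)}")
    return {
        "fvSubnet": {
            "attributes": {
                "ip": gateway_ip,
                # Assumed format: scope flags joined by commas.
                "scope": ",".join(sorted(scopes)),
            }
        }
    }


# Subnet for an EPG that provides a shared service: advertised externally
# and shared between VRFs.
print(subnet_payload("10.1.1.1/24", ["public", "shared"]))
```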
Note Bridge domain flood settings do not apply to the following protocols, because they are always flooded
in a bridge domain: OSPF/OSPFv3, BGP, EIGRP, CDP, LACP, LLDP, ISIS, IGMP, PIM, ST-BPDU, ARP/GARP,
RARP, ND.
Bridge domains can span multiple switches. A bridge domain can contain multiple subnets, but a subnet is
contained within a single bridge domain. If the bridge domain (fvBD) limitIPLearnToSubnets property is
set to yes, endpoint learning will occur in the bridge domain only if the IP address is within any of the
configured subnets for the bridge domain or within an EPG subnet when the EPG is a shared service provider.
Subnets can span multiple EPGs; one or more EPGs can be associated with one bridge domain or subnet. In
hardware proxy mode, ARP traffic is forwarded to an endpoint in a different bridge domain when that endpoint
has been learned as part of the Layer 3 lookup operation.
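The effect of limiting IP learning to the configured subnets can be sketched as a membership test. This is an illustration of the rule, not of the switch implementation; the function name is hypothetical, and the subnet list stands for the bridge domain subnets plus any shared-service provider EPG subnets.

```python
import ipaddress


def should_learn_ip(endpoint_ip, subnets, limit_to_subnets=True):
    """Return True if the endpoint IP address would be learned.

    subnets: gateway-style prefixes such as "10.1.1.1/24".
    """
    if not limit_to_subnets:
        return True
    address = ipaddress.ip_address(endpoint_ip)
    # strict=False accepts prefixes written with host bits set.
    return any(
        address in ipaddress.ip_network(subnet, strict=False)
        for subnet in subnets
    )


bd_subnets = ["10.1.1.1/24", "2001:db8:1::1/64"]
print(should_learn_ip("10.1.1.50", bd_subnets))
print(should_learn_ip("192.168.7.9", bd_subnets))
```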
Note Bridge domain legacy mode allows only one VLAN per bridge domain. When bridge domain legacy mode
is specified, bridge domain encapsulation is used for all EPGs that reference the bridge domain; EPG
encapsulation, if defined, is ignored. Unicast routing does not apply for bridge domain legacy mode. A
leaf switch can be configured with multiple bridge domains that operate in a mixture of legacy or normal
modes. However, once a bridge domain is configured, its mode cannot be switched.
Caution Changing from unknown unicast flooding mode to hw-proxy mode is disruptive to the traffic in the bridge
domain.
If IP routing is enabled in the bridge domain, the mapping database learns the IP address of the endpoints in
addition to the MAC address.
The Layer 3 Configurations tab of the bridge domain panel allows the administrator to configure the following
parameters:
• Unicast Routing: If this setting is enabled and a subnet address is configured, the fabric provides the
default gateway function and routes the traffic. Enabling unicast routing also instructs the mapping
database to learn the endpoint IP-to-VTEP mapping for this bridge domain. The IP learning is not
dependent upon having a subnet configured under the bridge domain.
• Subnet Address: This option configures the SVI IP addresses (default gateway) for the bridge domain.
• Limit IP Learning to Subnet: This option is similar to a unicast reverse-forwarding-path check. If this
option is selected, the fabric will not learn IP addresses from a subnet other than the one configured on
the bridge domain.
Caution Enabling Limit IP Learning to Subnet is disruptive to the traffic in the bridge domain.
When IP learning is disabled, you must enable the Global Subnet Prefix check option under System > System
Settings > Fabric Wide Setting > Enforce Subnet Check. For details, see the online help.
Contracts
In addition to EPGs, contracts (vzBrCP) are key objects in the policy model. EPGs can only communicate
with other EPGs according to contract rules. The following figure shows the location of contracts in the
management information tree (MIT) and their relation to other objects in the tenant.
An administrator uses a contract to select the type(s) of traffic that can pass between EPGs, including the
protocols and ports allowed. If there is no contract, inter-EPG communication is disabled by default. There
is no contract required for intra-EPG communication; intra-EPG communication is always implicitly allowed.
You can also configure contract preferred groups that enable greater control of communication between EPGs
in a VRF. If most of the EPGs in the VRF should have open communication, but a few should only have
limited communication with the other EPGs, you can configure a combination of a contract preferred group
and contracts with filters to control communication precisely.
Contracts govern the following types of endpoint group communications:
• Between ACI fabric application EPGs (fvAEPg), both intra-tenant and inter-tenant
Note In the case of a shared service mode, a contract is required for inter-tenant
communication. A contract is used to specify static routes across VRFs, even though
the tenant VRF does not enforce a policy.
• Between ACI fabric application EPGs and Layer 2 external outside network instance EPGs (l2extInstP)
• Between ACI fabric application EPGs and Layer 3 external outside network instance EPGs (l3extInstP)
• Between ACI fabric out-of-band (mgmtOoB) or in-band (mgmtInB) management EPGs
Contracts govern the communication between EPGs that are labeled providers, consumers, or both. EPG
providers expose contracts with which a would-be consumer EPG must comply. The relationship between an
EPG and a contract can be either a provider or consumer. When an EPG provides a contract, communication
with that EPG can be initiated from other EPGs as long as the communication complies with the provided
contract. When an EPG consumes a contract, the endpoints in the consuming EPG may initiate communication
with any endpoint in an EPG that is providing that contract.
Note An EPG can both provide and consume the same contract. An EPG can also provide and consume multiple
contracts simultaneously.
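The provider and consumer roles can be modeled in a few lines. The sketch below is a simplified illustration of the rules in this section: intra-EPG traffic is always allowed, and inter-EPG traffic is allowed only when the initiating EPG consumes a contract that the target EPG provides and the traffic matches that contract. It ignores labels, subjects, and filter direction; all names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Contract:
    name: str
    allowed: set                                  # {(protocol, port), ...}
    providers: set = field(default_factory=set)   # EPG names
    consumers: set = field(default_factory=set)   # EPG names


def may_initiate(src_epg, dst_epg, protocol, port, contracts):
    """Return True if src_epg may open this communication to dst_epg."""
    if src_epg == dst_epg:
        return True  # intra-EPG communication is implicitly allowed
    return any(
        src_epg in c.consumers
        and dst_epg in c.providers
        and (protocol, port) in c.allowed
        for c in contracts
    )


web = Contract("web", {("tcp", 443)}, providers={"web-epg"}, consumers={"client-epg"})
print(may_initiate("client-epg", "web-epg", "tcp", 443, [web]))
print(may_initiate("web-epg", "client-epg", "tcp", 443, [web]))
```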
In the diagram above, EPG A is configured to inherit Provided-Contract 1 and 2 and Consumed-Contract 3
from EPG B (contract master for EPG A).
Use the following guidelines when configuring contract inheritance:
• Contract inheritance can be configured for application, microsegmented (uSeg), external L2Out EPGs,
and external L3Out EPGs. The relationships must be between EPGs of the same type.
• Both provided and consumed contracts are inherited from the contract master when the relationship is
established.
• Contract masters and the EPGs inheriting contracts must be within the same tenant.
• Changes to the masters’ contracts are propagated to all the inheritors. If a new contract is added to the
master, it is also added to the inheritors.
• An EPG can inherit contracts from multiple contract masters.
• Contract inheritance is only supported to a single level (cannot be chained) and a contract master cannot
inherit contracts.
• Contract subject label and EPG label inheritance is supported. When EPG A inherits a contract from
EPG B, if different subject labels are configured under EPG A and EPG B, APIC only uses the subject
label configured under EPG B and not a collection of labels from both EPGs.
• Whether an EPG is directly associated to a contract or inherits a contract, it consumes entries in TCAM.
So contract scale guidelines still apply. For more information, see the Verified Scalability Guide for
your release.
• vzAny security contracts and taboo contracts are not supported.
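The inheritance guidelines can be sketched as a function that computes the effective contracts of an EPG. It models three of the rules above: both provided and consumed contracts are inherited, an EPG can have several contract masters, and inheritance is limited to a single level. The data layout and names are hypothetical.

```python
def effective_contracts(epg, own, masters):
    """Return the provided and consumed contracts that apply to an EPG.

    own: {epg: {"provided": set, "consumed": set}}
    masters: {epg: [contract master EPG, ...]}
    """
    result = {
        "provided": set(own[epg]["provided"]),
        "consumed": set(own[epg]["consumed"]),
    }
    for master in masters.get(epg, []):
        if masters.get(master):
            # Single level only: a contract master cannot inherit contracts.
            raise ValueError(f"{master} is a contract master and cannot inherit")
        result["provided"] |= own[master]["provided"]
        result["consumed"] |= own[master]["consumed"]
    return result


own = {
    "EPG-A": {"provided": set(), "consumed": set()},
    "EPG-B": {"provided": {"contract-1", "contract-2"}, "consumed": {"contract-3"}},
}
print(effective_contracts("EPG-A", own, {"EPG-A": ["EPG-B"]}))
```

Because the result is computed from the master's current contracts, a contract added to the master later appears for every inheritor the next time the function runs.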
For information about configuring Contract Inheritance and viewing inherited and standalone contracts, see
Cisco APIC Basic Configuration Guide.
Contracts can contain multiple communication rules and multiple EPGs can both consume and provide multiple
contracts. Labels control which rules apply when communicating between a specific pair of EPGs. A policy
designer can compactly represent complex communication policies and re-use these policies across multiple
instances of an application. For example, the sample policy in the Cisco Application Centric Infrastructure
Fundamentals "Contract Scope Examples" chapter shows how the same contract uses labels, subjects, and
filters to differentiate how communications occur among different EPGs that require HTTP or HTTPS.
Labels, subjects, aliases and filters define EPG communications according to the following options:
• Labels are managed objects with only one property: a name. Labels enable classifying which objects
can and cannot communicate with one another. Label matching is done first. If the labels do not match,
no other contract or filter information is processed. The label match attribute can be one of these values:
at least one (the default), all, none, or exactly one. The Cisco Application Centric Infrastructure
Fundamentals "Label Matching" chapter shows simple examples of all the label match types and their
results.
Note Labels can be applied to a variety of provider and consumer managed objects, including
EPGs, contracts, bridge domains, DHCP relay policies, and DNS policies. Labels do
not apply across object types; a label on an application EPG has no relevance to a label
on a bridge domain.
Labels determine which EPG consumers and EPG providers can communicate with one another. Label
matching determines which subjects of a contract are used with a given EPG provider or EPG consumer
of that contract.
The two types of labels are as follows:
◦Subject labels that are applied to EPGs. Subject label matching enables EPGs to choose a subset
of the subjects in a contract.
◦Provider/consumer labels that are applied to EPGs. Provider/consumer label matching enables
consumer EPGs to choose their provider EPGs and vice versa.
• Aliases are alternative names you can apply to objects, which can be changed, unlike the name.
• Filters are Layer 2 to Layer 4 fields, TCP/IP header fields such as Layer 3 protocol type, Layer 4 ports,
and so forth. According to its related contract, an EPG provider dictates the protocols and ports in both
the in and out directions. Contract subjects contain associations to the filters (and their directions) that
are applied between EPGs that produce and consume the contract.
Note When a contract filter match type is All, best practice is to use the VRF unenforced
mode. Under certain circumstances, failure to follow these guidelines results in the
contract not allowing traffic among EPGs in the VRF.
• Subjects are contained in contracts. One or more subjects within a contract use filters to specify the type
of traffic that can be communicated and how it occurs. For example, for HTTPS messages, the subject
specifies the direction and the filters that specify the IP address type (for example, IPv4), the HTTP
protocol, and the ports allowed. Subjects determine if filters are unidirectional or bidirectional. A
unidirectional filter applies in one direction only: it defines either in or out communications, but
not both. A bidirectional filter applies the same rule in both directions: it defines both in and out
communications.
Microsegmentation
Microsegmentation associates endpoints from multiple EPGs into a microsegmented EPG according to virtual
machine attributes, IP address, or MAC address. Virtual machine attributes include: VNic domain name, VM
identifier, VM name, hypervisor identifier, VMM domain, datacenter, operating system, or custom attribute.
Some advantages of microsegmentation include the following:
• Stateless white list network access security with line rate enforcement.
• Per-microsegment granularity of security automation through dynamic Layer 4 - Layer 7 service insertion
and chaining.
• Hypervisor agnostic microsegmentation in a broad range of virtual switch environments.
• ACI policies that easily move problematic VMs into a quarantine security zone.
• When combined with intra-EPG isolation for bare metal and VM endpoints, microsegmentation can
provide policy driven automated complete endpoint isolation within application tiers.
For any EPG, the ACI fabric ingress leaf switch classifies packets into an EPG according to the policies
associated with the ingress port. Microsegmented EPGs apply policies to individual virtual or physical endpoints
that are derived based on the VM attribute, MAC address, or IP address specified in the microsegmented EPG
policy.
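The classification described above can be sketched as a lookup that moves an endpoint from its base EPG into a microsegmented EPG when one of its attributes matches. This is a simplified illustration: it assumes that any single matching attribute is sufficient and that the first matching microsegmented EPG wins, neither of which is stated in this section. All names are hypothetical.

```python
def classify(endpoint, base_epg, useg_epgs):
    """Return the EPG an endpoint belongs to.

    endpoint: dict of attributes, for example {"ip": ..., "vm_name": ...}.
    useg_epgs: list of (epg_name, {attribute: required_value}) pairs.
    """
    for epg_name, criteria in useg_epgs:
        # Assumption: one matching attribute is enough to classify.
        if any(endpoint.get(attr) == value for attr, value in criteria.items()):
            return epg_name
    return base_epg  # no attribute matched; keep port-based classification


useg = [
    ("quarantine", {"vm_name": "infected-vm"}),
    ("db-useg", {"ip": "10.1.1.20"}),
]
print(classify({"ip": "10.1.1.20", "vm_name": "db01"}, "app-epg", useg))
print(classify({"ip": "10.1.1.30", "vm_name": "web01"}, "app-epg", useg))
```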
Note If an EPG is configured with intra-EPG endpoint isolation enforced, these restrictions apply:
• All Layer 2 endpoint communication across an isolation enforced EPG is dropped within a bridge
domain.
• All Layer 3 endpoint communication across an isolation enforced EPG is dropped within the same
subnet.
• Preserving QoS CoS priority settings is not supported when traffic is flowing from an EPG with
isolation enforced to an EPG without isolation enforced.
What vzAny Is
The vzAny managed object provides a convenient way of associating all endpoint groups (EPGs) in a Virtual
Routing and Forwarding (VRF) instance to one or more contracts (vzBrCP), instead of creating a separate
contract relation for each EPG.
In the Cisco ACI fabric, EPGs can only communicate with other EPGs according to contract rules. A
relationship between an EPG and a contract specifies whether the EPG provides the communications defined
by the contract rules, consumes them, or both. By dynamically applying contract rules to all EPGs in a VRF,
vzAny automates the process of configuring EPG contract relationships. Whenever a new EPG is added to a
VRF, vzAny contract rules automatically apply. The vzAny one-to-all EPG relationship is the most efficient
way of applying contract rules to all EPGs in a VRF.
Note In the APIC GUI under tenants, a VRF is also known as a private network (a network within a tenant) or
a context.
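To show why vzAny reduces configuration, the following sketch builds a single VRF-level object that attaches contracts to every EPG in the VRF. The classes fvCtx, vzAny, and vzBrCP are named in this guide; the relation class names vzRsAnyToProv and vzRsAnyToCons and the tnVzBrCPName property are not, and should be verified against the APIC object model reference. The VRF and contract names are hypothetical.

```python
def vzany_payload(vrf, provided=(), consumed=()):
    """Build a VRF dict whose vzAny child provides and consumes contracts."""
    relations = [
        {"vzRsAnyToProv": {"attributes": {"tnVzBrCPName": name}}}
        for name in provided
    ]
    relations += [
        {"vzRsAnyToCons": {"attributes": {"tnVzBrCPName": name}}}
        for name in consumed
    ]
    return {
        "fvCtx": {
            "attributes": {"name": vrf},
            "children": [{"vzAny": {"attributes": {}, "children": relations}}],
        }
    }


# One relation on vzAny replaces one relation per EPG in the VRF.
print(vzany_payload("prod-vrf", consumed=["shared-dns"]))
```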
Outside Networks
Outside network policies control connectivity to the outside. A tenant can contain multiple outside network
objects. The following figure shows the location of outside networks in the management information tree
(MIT) and their relation to other objects in the tenant.
Outside network policies specify the relevant Layer 2 (l2extOut) or Layer 3 (l3extOut) properties that control
communications between an outside public or private network and the ACI fabric. External devices, such as
routers that connect to the WAN and enterprise core, or existing Layer 2 switches, connect to the front panel
interface of a leaf switch. The leaf switch that provides such connectivity is known as a border leaf. The border
leaf switch interface that connects to an external device can be configured as either a bridged or routed interface.
In the case of a routed interface, static or dynamic routing can be used. The border leaf switch can also perform
all the functions of a normal leaf switch.
The dotted lines in the following figure show several common MO relations.
For example, the dotted line between the EPG and the bridge domain defines the relation between those two
MOs. In this figure, the EPG (fvAEPg) contains a relationship MO (fvRsBD) that is named with the name of
the target bridge domain MO (fvBD). For example, if production is the bridge domain name
(tnFvBDName=production), then the relation name would be production (fvRsBdName=production).
In the case of policy resolution based on named relations, if a target MO with a matching name is not found
in the current tenant, the ACI fabric tries to resolve in the common tenant. For example, if the user tenant
EPG contained a relationship MO targeted to a bridge domain that did not exist in the tenant, the system tries
to resolve the relationship in the common tenant. If a named relation cannot be resolved in either the current
tenant or the common tenant, the ACI fabric attempts to resolve to a default policy. If a default policy exists
in the current tenant, it is used. If it does not exist, the ACI fabric looks for a default policy in the common
tenant. Bridge domain, VRF, and contract (security policy) named relations do not resolve to a default.
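The resolution order for named relations can be sketched as follows. The function looks for the named policy in the current tenant, then in the common tenant, and then falls back to a policy named default in the same order, except for bridge domain, VRF, and contract relations, which never resolve to a default. The set-of-tuples data model and the function name are hypothetical; monEPGPol is used only as an example class label.

```python
# Classes whose named relations do not fall back to a default policy.
NO_DEFAULT_FALLBACK = {"fvBD", "fvCtx", "vzBrCP"}


def resolve(cls, name, tenant, policies):
    """Return the (tenant, class, name) that a named relation resolves to.

    policies: set of (tenant, class, name) tuples that exist in the fabric.
    Returns None if the relation cannot be resolved.
    """
    for candidate_tenant in (tenant, "common"):
        if (candidate_tenant, cls, name) in policies:
            return (candidate_tenant, cls, name)
    if cls in NO_DEFAULT_FALLBACK:
        return None
    for candidate_tenant in (tenant, "common"):
        if (candidate_tenant, cls, "default") in policies:
            return (candidate_tenant, cls, "default")
    return None


existing = {
    ("common", "fvBD", "production"),
    ("common", "fvBD", "default"),
    ("common", "monEPGPol", "default"),
}
print(resolve("fvBD", "production", "tenant1", existing))
print(resolve("fvBD", "staging", "tenant1", existing))
print(resolve("monEPGPol", "custom", "tenant1", existing))
```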
Default Policies
The ACI fabric includes default policies for many of its core functions. Examples of default policies include
the following:
• Bridge domain (in the common tenant)
• Layer 2 and Layer 3 protocols
• Fabric initialization, device discovery, and cabling detection
• Storm control and flooding
Note This does not apply to a bridge domain or a VRF (private network) in a tenant.
• A configuration does not refer to any policy name: if a default policy exists in the current tenant, it is
used. Otherwise, the default policy in tenant common is used.
Note For bridge domains and VRFs, this only applies if the connectivity instrumentation
policy (fvConnInstrPol) in the common tenant has the appropriate bridge domain or
VRF flag set. This prevents unintended EPGs from being deployed in tenant common
subnets.
Default policies can be modified or deleted. Deleting a default policy can cause a policy resolution process
to complete abnormally.
Note To avoid confusion when implementing configurations that use default policies, document changes made
to default policies. Be sure there are no current or future configurations that rely on a default policy before
deleting a default policy. For example, deleting a default firmware update policy could result in a
problematic future firmware update.
The policy model specifies that an object is using another policy by having a relation managed object (MO)
under that object and that relation MO refers to the target policy by name. If this relation does not explicitly
refer to a policy by name, then the system will try to resolve a policy called default. Bridge domains (BD)
and VRFs (Ctx) are exceptions to this rule.
An endpoint group (EPG) has a relation to a BD (fvRsBd) that has a property called tnFvBDName. If this is not
set (tnFvBDName=""), the connectivity instrumentation policy (fvConnInstrPol) determines the behavior for this
case. This policy applies to all EPG cases (VMM, baremetal, l2ext, l3ext). The instrumentation policy uses
the bdCtrl property to control whether the default BD policy will be used and the ctxCtrl property to control
whether the default VRF (Ctx) policy will be used. The following options are the same for both:
• do not instrument: the leaf switch will not use the default policy.
• Contracts are needed for inter-bridge domain traffic when a private network is unenforced.
• Prefix-based EPGs are not supported. Shared Services are not supported for a Layer 3 external outside
network. Contracts provided or consumed by a Layer 3 external outside network need to be consumed
or provided by EPGs that share the same Layer 3 VRF.
• A shared service is supported only with non-overlapping and non-duplicate subnets. When configuring
subnets for shared services, follow these guidelines:
◦Configure the subnet for a shared service provider under the EPG, not under the bridge domain.
◦Subnets configured under an EPG that share the same VRF must be disjoint and must not overlap.
◦Subnets leaked from one VRF to another must be disjoint and must not overlap.
◦Subnets advertised from multiple consumer networks into a VRF or vice versa must be disjoint
and must not overlap.
Note If two consumers are mistakenly configured with the same subnet, recover from this
condition by removing the subnet configuration for both, then reconfigure the subnets
correctly.
• Do not configure a shared service with AnyToProv in the provider VRF. The APIC rejects this
configuration and raises a fault.
• The private network of a provider cannot be in unenforced mode while providing a shared service.
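The non-overlap guideline for shared-service subnets above can be checked before a configuration is pushed. The following is a minimal sketch using Python's standard ipaddress module; the function name and the idea of running such a pre-check outside the APIC are assumptions for illustration, and the APIC performs its own validation.

```python
import ipaddress
from itertools import combinations


def find_overlaps(subnets):
    """Return pairs of subnets (as CIDR strings) that overlap or duplicate each other."""
    # strict=False accepts gateway-style input such as 10.1.1.1/24.
    nets = [ipaddress.ip_network(s, strict=False) for s in subnets]
    return [
        (str(a), str(b))
        for a, b in combinations(nets, 2)
        if a.overlaps(b)
    ]
```

An empty result means the candidate subnets are disjoint; any returned pair identifies two subnets that would need to be corrected before they are leaked between VRFs.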
Tags
Object tags simplify API operations. In an API operation, an object or group of objects can be referenced by
the tag name instead of by the distinguished name (DN). Tags are child objects of the item they tag; besides
the name, they have no other properties.
Use a tag to assign a descriptive name to a group of objects. The same tag name can be assigned to multiple
objects. Multiple tag names can be assigned to an object. For example, to enable easy searchable access to all
web server EPGs, assign a web server tag to all such EPGs. Web server EPGs throughout the fabric can be
located by referencing the web server tag.
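The many-to-many relationship between tag names and objects can be pictured with a toy in-memory index. This sketch only illustrates the concept; the class name, its methods, and the sample DNs are hypothetical and do not represent the APIC API.

```python
from collections import defaultdict


class TagIndex:
    """Toy index from a tag name to the DNs of the objects that carry it."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def tag(self, dn: str, name: str) -> None:
        # The same tag can be placed on many objects, and one object can carry many tags.
        self._by_tag[name].add(dn)

    def lookup(self, name: str) -> list:
        # Return the tagged DNs in a stable order; unknown tags yield an empty list.
        return sorted(self._by_tag.get(name, set()))
```

Tagging two EPGs in different tenants with the same web server tag and then calling `lookup` with that tag returns both DNs, which mirrors locating web server EPGs throughout the fabric by tag name.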
Fabric Provisioning
Cisco Application Centric Infrastructure (ACI) automation and self-provisioning offer these operational
advantages over the traditional switching infrastructure:
• A clustered logically centralized but physically distributed APIC provides policy, bootstrap, and image
management for the entire fabric.
• The APIC startup topology auto discovery, automated configuration, and infrastructure addressing use
these industry-standard protocols: Intermediate System-to-Intermediate System (IS-IS), Link Layer
Discovery Protocol (LLDP), and Dynamic Host Configuration Protocol (DHCP).
• The APIC provides a simple and automated policy-based provisioning and upgrade process, and automated
image management.
• APIC provides scalable configuration management. Because ACI data centers can be very large,
configuring switches or interfaces individually does not scale well, even using scripts. APIC pod,
controller, switch, module and interface selectors (all, range, specific instances) enable symmetric
configurations across the fabric. To apply a symmetric configuration, an administrator defines switch
profiles that associate interface configurations in a single policy group. The configuration is then rapidly
deployed to all interfaces in that profile without the need to configure them individually.
The Cisco Nexus ACI fabric software is bundled as an ISO image, which can be installed on the Cisco APIC
server through the KVM interface on the Cisco Integrated Management Controller (CIMC). The Cisco Nexus
ACI Software ISO contains the Cisco APIC image, the firmware image for the leaf node, the firmware image
for the spine node, default fabric infrastructure policies, and the protocols required for operation.
The ACI fabric bootstrap sequence begins when the fabric is booted with factory-installed images on all the
switches. The Cisco Nexus 9000 Series switches that run the ACI firmware and APICs use a reserved overlay
for the boot process. This infrastructure space is hard-coded on the switches. The APIC can connect to a leaf
through the default overlay, or it can use a locally significant identifier.
The ACI fabric uses an infrastructure space, which is securely isolated in the fabric and is where all the
topology discovery, fabric management, and infrastructure addressing is performed. ACI fabric management
communication within the fabric takes place in the infrastructure space through internal private IP addresses.
This addressing scheme allows the APIC to communicate with fabric nodes and other Cisco APIC controllers
in the cluster. The APIC discovers the IP address and node information of other Cisco APIC controllers in
the cluster using the Link Layer Discovery Protocol (LLDP)-based discovery process.
The following describes the APIC cluster discovery process:
• Each APIC in the Cisco ACI uses an internal private IP address to communicate with the ACI nodes
and other APICs in the cluster. The APIC discovers the IP address of other APIC controllers in the
cluster through the LLDP-based discovery process.
• APICs maintain an appliance vector (AV), which provides a mapping from an APIC ID to an APIC IP
address and a universally unique identifier (UUID) of the APIC. Initially, each APIC starts with an AV
filled with its local IP address, and all other APIC slots are marked as unknown.
• When a switch reboots, the policy element (PE) on the leaf gets its AV from the APIC. The switch then
advertises this AV to all of its neighbors and reports any discrepancies between its local AV and neighbors'
AVs to all the APICs in its local AV.
Using this process, the APIC learns about the other APIC controllers in the ACI through switches. After
validating these newly discovered APIC controllers in the cluster, the APIC controllers update their local AV
and program the switches with the new AV. Switches then start advertising this new AV. This process continues
until all the switches have the identical AV and all APIC controllers know the IP address of all the other APIC
controllers.
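The appliance vector (AV) exchange described above can be simulated in a few lines. This is a simplified sketch under stated assumptions: each APIC is attached to exactly one leaf, switches merge AVs with their neighbors instead of reporting discrepancies, and the validation step is skipped. The function names and data layout are hypothetical.

```python
def merge(dst: dict, src: dict) -> bool:
    """Copy APIC addresses known in src into unknown slots of dst; report any change."""
    changed = False
    for apic_id, ip in src.items():
        if ip is not None and dst.get(apic_id) is None:
            dst[apic_id] = ip
            changed = True
    return changed


def converge(apic_ips, apic_leaf, links, max_rounds=20):
    """Run AV exchange rounds until nothing changes; return each APIC's final AV.

    apic_ips:  {apic_id: ip}, apic_leaf: {apic_id: attached switch},
    links:     [(switch_a, switch_b), ...]
    """
    # Initially each APIC knows only its own address; other slots are unknown (None).
    apic_av = {a: {b: (ip if a == b else None) for b in apic_ips}
               for a, ip in apic_ips.items()}
    switches = set(apic_leaf.values()) | {s for link in links for s in link}
    switch_av = {s: dict.fromkeys(apic_ips) for s in switches}

    for _ in range(max_rounds):
        changed = False
        for a, sw in apic_leaf.items():      # APIC programs its attached leaf
            changed |= merge(switch_av[sw], apic_av[a])
        for x, y in links:                   # switches advertise the AV to neighbors
            changed |= merge(switch_av[x], switch_av[y])
            changed |= merge(switch_av[y], switch_av[x])
        for a, sw in apic_leaf.items():      # leaf feeds new entries back to its APIC
            changed |= merge(apic_av[a], switch_av[sw])
        if not changed:
            return apic_av
    raise RuntimeError("AV exchange did not settle within max_rounds")
```

With three APICs attached to three leaf switches that share one spine, every APIC ends with the full ID-to-address map after a few rounds, which matches the end state described above: identical AVs everywhere.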
Note Prior to initiating a change to the cluster, always verify its health. When performing planned changes to
the cluster, all controllers in the cluster should be healthy. If one or more of the APIC controllers in the
cluster is not healthy, remedy that situation before proceeding with making changes to the cluster. Also,
ensure that cluster controllers added to the APIC are running the same version of firmware as the other
controllers in the APIC cluster. See the KB: Cisco ACI APIC Cluster Management article for guidelines
that must be followed to ensure that changes to the APIC cluster complete normally.
The ACI fabric is brought up in a cascading manner, starting with the leaf nodes that are directly attached to
the APIC. LLDP and control-plane IS-IS convergence occurs in parallel to this boot process. The ACI fabric
uses LLDP- and DHCP-based fabric discovery to automatically discover the fabric switch nodes, assign the
infrastructure VXLAN tunnel endpoint (VTEP) addresses, and install the firmware on the switches. Prior to
this automated process, a minimal bootstrap configuration must be performed on the Cisco APIC controller.
After the APIC controllers are connected and their IP addresses assigned, the APIC GUI can be accessed by
entering the address of any APIC controller into a web browser. The APIC GUI runs HTML5 and eliminates
the need for Java to be installed locally.
Fabric Inventory
The policy model contains a complete real-time inventory of the fabric, including all nodes and interfaces.
This inventory capability enables automation of provisioning, troubleshooting, auditing, and monitoring.
For Cisco ACI fabric switches, the fabric membership node inventory contains policies that identify the node
ID, serial number, and name. Third-party nodes are recorded as unmanaged fabric nodes. Cisco ACI switches
can be automatically discovered, or their policy information can be imported. The policy model also maintains
fabric member node state information.
Disabled interfaces can be ones blacklisted by an administrator or ones taken down because the APIC detects
anomalies. Examples of link state anomalies include the following:
• A wiring mismatch, such as a spine connected to a spine, a leaf connected to a leaf, a spine connected
to a leaf access port, a spine connected to a non-ACI node, or a leaf fabric port connected to a non-ACI
device.
• A fabric name mismatch. The fabric name is stored in each ACI node. If a node is moved to another
fabric without being reset to its factory default state, it retains the old fabric name.
• A UUID mismatch causes the APIC to disable the node.
Note If an administrator uses the APIC to disable all the leaf nodes on a spine, a spine reboot is required to
recover access to the spine.
Provisioning
The APIC provisioning method automatically brings up the ACI fabric with the appropriate connections. The
following figure shows fabric provisioning.
After Link Layer Discovery Protocol (LLDP) discovery learns all neighboring connections dynamically, these
connections are validated against a loose specification rule such as "LEAF can connect to only SPINE-L1-*"
or "SPINE-L1-* can connect to SPINE-L2-* or LEAF." If a rule mismatch occurs, a fault occurs and the
connection is blocked because a leaf is not allowed to be connected to another leaf, or a spine connected to a
spine. In addition, an alarm is created to indicate that the connection needs attention. The Cisco ACI fabric
administrator can import the names and serial numbers of all the fabric nodes from a text file into the APIC
or allow the fabric to discover the serial numbers automatically and then assign names to the nodes using the
APIC GUI, command-line interface (CLI), or API. The APIC is discoverable via SNMP. It has the following
sysObjectID: ciscoACIController OBJECT IDENTIFIER ::= { ciscoProducts 2238 }
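The wiring validation against loose specification rules can be sketched with shell-style pattern matching. The rule table below copies the two example rules quoted above; the role strings, the symmetric check, and the function names are assumptions for illustration, not the APIC's internal logic.

```python
from fnmatch import fnmatchcase

# Loose specification rules quoted in the text: role pattern -> allowed peer patterns.
RULES = {
    "LEAF": ["SPINE-L1-*"],
    "SPINE-L1-*": ["SPINE-L2-*", "LEAF"],
}


def allowed_peers(role: str) -> list:
    """Collect the peer patterns of every rule whose role pattern matches this role."""
    peers = []
    for pattern, targets in RULES.items():
        if fnmatchcase(role, pattern):
            peers.extend(targets)
    return peers


def link_is_valid(role_a: str, role_b: str) -> bool:
    """Accept a discovered link if either endpoint's rule permits the other endpoint."""
    a_allows_b = any(fnmatchcase(role_b, p) for p in allowed_peers(role_a))
    b_allows_a = any(fnmatchcase(role_a, p) for p in allowed_peers(role_b))
    return a_allows_b or b_allows_a
```

Under these rules a leaf-to-leaf link and a link between two SPINE-L1 nodes are both rejected, which corresponds to the cases where the fabric raises a fault and blocks the connection.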
Note Prior to initiating a change to the cluster, always verify its health. When performing planned changes to
the cluster, all controllers in the cluster should be healthy. If the health status of one or more of the
APIC controllers in the cluster is not "fully fit", remedy that situation before proceeding. Also, ensure that
cluster controllers added to the APIC are running the same version of firmware as the other controllers in the
APIC cluster.
• When an APIC cluster is split into two or more groups, the ID of a node is changed and the changes are
not synchronized across all APICs. This can cause inconsistency in the node IDs between APICs, and
the affected leaf nodes may not appear in the inventory in the APIC GUI. When you split an APIC
cluster, decommission the affected leaf nodes from the APIC and register them again, so that the inconsistency
in the node IDs is resolved and the APICs in the cluster return to a fully fit health state.
• Before configuring the APIC cluster, ensure that all the APICs are running the same firmware version.
Initial clustering of APICs running differing versions is an unsupported operation and may cause problems
within the cluster.
• After switching over a standby APIC to active, if it was the only standby, you must configure a new
standby.
• The following limitations apply to retaining the out-of-band address of a standby APIC after a
failover.
◦The standby (new active) APIC may not retain its out-of-band address if more than one active APIC is
down or unavailable.
◦The standby (new active) APIC may not retain its out-of-band address if it is in a different subnet than
the active APIC. This limitation applies only to APIC release 2.x.
◦The standby (new active) APIC may not retain its IPv6 out-of-band address. This limitation does not
apply starting with APIC release 3.1x.
◦The standby (new active) APIC may not retain its out-of-band address if you have configured a non-static
OOB management IP address policy for the replacement (old active) APIC.
Note If you observe any of these limitations, to retain the standby APIC's out-of-band
address, you must manually change the OOB policy for the replaced APIC after the replace
operation completes successfully.
• We recommend keeping standby APICs in the same pod as the active APICs they may replace.
• There must be three active APICs in order to add a standby APIC.
• The standby APIC does not participate in policy configuration or management.
• No information is replicated to standby controllers, including admin credentials.
Important Notes
• Upgrading or downgrading a switch in maintenance mode is not supported.
• While the switch is in maintenance mode, the Ethernet Port Module stops propagating the interface
related notifications. As a result, if the remote switch is rebooted or the fabric link is flapped during this
time, the fabric link will not come up afterwards unless the switch is manually rebooted (using the
acidiag touch clean command), decommissioned, and recommissioned.
• For multi pod, ISIS metric for redistributed routes should be set to less than 63. To set the ISIS metric
for redistributed routes, choose Fabric > Fabric Policies > Pod Policies > ISIS Policy.
• The existing Graceful Insertion and Removal (GIR) feature supports all Layer 3 traffic diversion. With LACP,
all the Layer 2 traffic is also diverted to the redundant node with nearly zero traffic loss. Once a node goes
into maintenance mode, LACP running on the node immediately informs neighbors that it can no longer be
aggregated as part of a port channel. All traffic is then diverted to the vPC peer node.
The stretched fabric is a single ACI fabric. The sites are one administration domain and one availability zone.
Administrators are able to manage the sites as one entity; configuration changes made on any APIC controller
node are applied to devices across the sites. The stretched ACI fabric preserves live VM migration capability
across the sites. Currently, stretched fabric designs have been validated with three sites.
Default Policies
The initial values of the APIC default policies are taken from the concrete model that is loaded in the
switch. A fabric administrator can modify default policies. A default policy serves multiple purposes:
1 Allows a fabric administrator to override the default values in the model.
2 If an administrator does not provide an explicit policy, the APIC applies the default policy. An administrator
can create a default policy and the APIC uses that unless the administrator provides any explicit policy.
For example, according to actions the administrator does or does not take, the APIC will do the following:
• Because the administrator does not specify the LLDP policy for the selected ports, the APIC applies the
default LLDP interface policy for the ports specified in the port selector.
• If the administrator removes a port from a port selector, the APIC applies the default policies to that
port. In this example, if the administrator removes port 1/15 from the port selector, the port is no longer
part of the port channel and the APIC applies all the default policies to that port.
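The fallback behavior in the two bullets above can be sketched as a lookup with a default. The selector layout, the policy type labels ("lldp", "cdp", "port-channel"), and the policy name "pc-A" are hypothetical names chosen for the example.

```python
DEFAULT = "default"


def effective_policies(port, selector, policy_types=("lldp", "cdp", "port-channel")):
    """Return the policy name in effect for each policy type on one port.

    selector: {"ports": set of port IDs, "policies": {policy type: explicit policy name}}
    """
    if port not in selector["ports"]:
        # A port outside the selector receives the default policy of every type.
        return {t: DEFAULT for t in policy_types}
    # Inside the selector, any type without an explicit policy falls back to default.
    return {t: selector["policies"].get(t, DEFAULT) for t in policy_types}
```

With a selector covering ports 1/13 through 1/15 and only a port channel policy specified, port 1/15 gets that port channel policy plus the default LLDP policy. After 1/15 is removed from the selector, every policy on it reverts to default.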
When the ACI fabric is upgraded, the existing policy default values persist, even if the default value changes
in the newer release. When the node connects to the APIC for the first time, the node registers itself with
APIC which pushes all the default policies to the node. Any change in the default policy is pushed to the node.
administrators to select the pods, switches, and interfaces to which they will apply fabric policies. The following
figure provides an overview of the fabric policy model.
once; configuring one port at a time is not scalable. The following figure shows how the process works for
configuring the ACI fabric.
The following figure shows the result of applying Switch Profile 1 and Switch Profile 2 to the ACI fabric.
This combination of infrastructure and scope enables administrators to manage fabric configuration in a
scalable fashion. These configurations can be implemented using the REST API, the CLI, or the GUI. The
Quick Start Fabric Interface Configuration wizard in the GUI automatically creates the necessary underlying
objects to implement such policies.
• Monitoring and troubleshooting policies specify what to monitor, thresholds, how to handle faults and
logs, and how to perform diagnostics.
Sample XML policies for switch interfaces, port channels, virtual port channels, and changing interface speeds
are provided in the Cisco APIC REST API Configuration Guide.
Note While tenant network policies are configured separately from fabric access policies, tenant policies are
not activated unless the underlying access policies they depend on are in place.
To apply a configuration across a potentially large number of switches, an administrator defines switch profiles
that associate interface configurations in a single policy group. In this way, large numbers of interfaces across
the fabric can be configured at once. Switch profiles can contain symmetric configurations for multiple switches
or unique special purpose configurations. The following figure shows the process for configuring access to
the ACI fabric.
The following figure shows the result of applying Switch Profile 1 and Switch Profile 2 to the ACI fabric.
This combination of infrastructure and scope enables administrators to manage fabric configuration in a
scalable fashion. These configurations can be implemented using the REST API, the CLI, or the GUI. The
Quick Start Interface, PC, VPC Configuration wizard in the GUI automatically creates the necessary underlying
objects to implement such policies.
Note When creating a VPC domain between two leaf switches, both switches must be of the same switch
generation, one of the following:
• Generation 1 - Cisco Nexus N9K switches without "EX" or "FX" on the end of the switch model name; for
example, N9K-9312TX
• Generation 2 - Cisco Nexus N9K switches with "EX" or "FX" on the end of the switch model name;
for example, N9K-93108TC-EX
Switches such as these two are not compatible VPC peers. Instead, use switches of the same generation.
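The generation rule in the note above can be expressed as a simple check on the model name. This sketch applies only the naming rule as stated (an "-EX" or "-FX" suffix means Generation 2) and is an assumption-laden shortcut: it does not consult any hardware database, and model names with other suffixes would need separate handling.

```python
def switch_generation(model: str) -> int:
    """Classify a model by the naming rule in the note: "-EX" or "-FX" suffix is Generation 2."""
    return 2 if model.upper().endswith(("-EX", "-FX")) else 1


def compatible_vpc_peers(model_a: str, model_b: str) -> bool:
    """Two leaf switches can form a VPC domain only if they are of the same generation."""
    return switch_generation(model_a) == switch_generation(model_b)
```

Applied to the two examples in the note, N9K-9312TX is classified as Generation 1 and N9K-93108TC-EX as Generation 2, so the pair is reported as incompatible.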
Note When using GARP as the protocol to notify of IP to MAC binding changes to different interfaces on the
same FEX you must set the bridge domain mode to ARP Flooding and enable EP Move Detection Mode:
GARP-based Detection, on the L3 Configuration page of the bridge domain wizard. This workaround
is only required with Generation 1 switches. With Generation 2 switches or later, this is not an issue.
Note As of release version 2.0(1), FCoE support is limited to N9K-C93180YC-EX and N9K-C93108TC-EX
hardware. With release version 2.2(1), the N9K-C93180LC-EX 40 Gigabit Ethernet (GE) ports can be
used as F or NP ports. However, if they are enabled for FCoE, they cannot be enabled for 40GE port
breakout. FCoE is not supported on breakout ports.
From release version 2.2(2), N9K-C93180YC-FX and N9K-C93108TC-FX hardware supports FCoE.
With release 2.3(1), FCoE support on FEX ports for the hardware N9K-C93180YC-FX and
N9K-C93108TC-FX is available.
As of release version 2.2(x), FCoE is also supported on the following FEX Nexus devices:
• N2K-C2348UPQ-10GE
• N2K-C2348TQ-10GE
• N2K-C2232PP-10GE
• N2K-B22DELL-P
• N2K-B22HP-P
• N2K-B22IBM-P
• N2K-B22DELL-P-FI
Note The VLAN used for FCoE should have vlanScope set to Global. Setting vlanScope to portLocal is not
supported for FCoE. The value is set via the L2 Interface Policy (l2IfPol).
• One or more ACI leaf switches configured through FC SAN policies to function as an NPV backbone
• Selected interfaces on the NPV-configured leaf switches configured to function as F ports.
F ports accommodate FCoE traffic to and from hosts running SAN management or SAN-consuming
applications.
• Selected interfaces on the NPV-configured leaf switches to function as NP ports.
NP ports accommodate FCoE traffic to and from an FCF bridge.
The FCF bridge receives FC traffic from fibre channel links typically connecting SAN storage devices and
encapsulates the FC packets into FCoE frames for transmission over the ACI fabric to the SAN management
or SAN Data-consuming hosts. It receives FCoE traffic and repackages it back to FC for transmission over
the fibre channel network.
Note In the above ACI topology, FCoE traffic support requires direct connections between the hosts and F ports
and direct connections between the FCF device and NP port.
APIC servers enable an operator to configure and monitor the FCoE traffic through the APIC Basic GUI, the
APIC Advanced GUI, the APIC NX-OS style CLI, or through application calls to the APIC REST API.
802.1Q Tunnels
About ACI 802.1Q Tunnels
Figure 30: ACI 802.1Q Tunnels
With Cisco ACI and Cisco APIC Release 2.2(1x) and higher, you can configure 802.1Q tunnels on edge
(tunnel) ports to enable point-to-multi-point tunneling of Ethernet frames in the fabric, with Quality of Service
(QoS) priority settings. A Dot1q Tunnel transports untagged, 802.1Q tagged, and 802.1ad double-tagged
frames as-is across the fabric. Each tunnel carries the traffic from a single customer and is associated with a
single bridge domain. ACI front panel ports can be part of a Dot1q Tunnel. Layer 2 switching is done based
on Destination MAC (DMAC) and regular MAC learning is done in the tunnel. Edge-port Dot1q Tunnels
are supported on second-generation (and later) Cisco Nexus 9000 series switches with "EX" on the end of the
switch model name.
With Cisco ACI and Cisco APIC Release 2.3(x) and higher, you can also configure multiple 802.1Q tunnels
on the same core port to carry double-tagged traffic from multiple customers, each distinguished with an
access encapsulation configured for each 802.1Q tunnel. You can also disable MAC Address Learning on
802.1Q tunnels. Both edge ports and core ports can belong to an 802.1Q tunnel with access encapsulation and
disabled MAC Address Learning. Both edge ports and core ports in Dot1q Tunnels are supported on
third-generation Cisco Nexus 9000 series switches with "FX" on the end of the switch model name.
Terms used in this document may be different in the Cisco Nexus 9000 Series documents.
• If a PC or VPC is the only interface in a Dot1q Tunnel and it is deleted and reconfigured, remove the
association of the PC/VPC to the Dot1q Tunnel and reconfigure it.
• With Cisco APIC Release 2.2(x) the Ethertypes for double-tagged frames must be 0x9100 followed by
0x8100.
However, with Cisco APIC Release 2.3(x) and higher, this limitation no longer applies for edge ports,
on third-generation Cisco Nexus switches with "FX" on the end of the switch model name.
• For core ports, the Ethertypes for double-tagged frames must be 0x8100 followed by 0x8100.
• You can include multiple edge ports and core ports (even across leaf switches) in a Dot1q Tunnel.
• An edge port may only be part of one tunnel, but a core port can belong to multiple Dot1q tunnels.
• With Cisco APIC Release 2.3(x) and higher, regular EPGs can be deployed on core ports that are used
in 802.1Q tunnels.
• L3Outs are not supported on interfaces enabled for Dot1q Tunnels.
• FEX interfaces are not supported as members of a Dot1q Tunnel.
• Interfaces configured as breakout ports do not support 802.1Q tunnels.
• Interface-level statistics are supported for interfaces in Dot1q Tunnels, but statistics at the tunnel level
are not supported.
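The Ethertype requirements for double-tagged frames listed above can be checked by reading the two tag protocol identifiers from the frame header. This is a minimal sketch: the edge-port pair reflects the Release 2.2(x) rule (0x9100 followed by 0x8100), which the text says is relaxed for edge ports in Release 2.3(x) on "FX" switches, and the function and table names are hypothetical.

```python
import struct

# Required (outer, inner) Ethertypes per port role, as listed in the guidelines above.
# The "edge" entry is the Release 2.2(x) requirement.
REQUIRED = {"edge": (0x9100, 0x8100), "core": (0x8100, 0x8100)}


def tag_ethertypes(frame: bytes):
    """Return the (outer, inner) tag Ethertypes of a double-tagged Ethernet frame."""
    if len(frame) < 18:
        raise ValueError("frame too short to carry two VLAN tags")
    # Bytes 0-11 hold the destination and source MAC addresses. The outer tag
    # Ethertype is at offset 12, its tag control field at 14, the inner Ethertype at 16.
    outer, _outer_tci, inner = struct.unpack_from("!HHH", frame, 12)
    return outer, inner


def ethertypes_ok(frame: bytes, port_role: str) -> bool:
    """Check a frame's tag Ethertypes against the requirement for an edge or core port."""
    return tag_ethertypes(frame) == REQUIRED[port_role]
```

A frame tagged 0x9100 then 0x8100 passes the edge-port check and fails the core-port check, while a frame tagged 0x8100 twice does the opposite.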
The 40GE to 10GE dynamic breakout feature is supported on the access facing ports of the following switches:
• N9K-C9332PQ
• N9K-C93180LC-EX
• N9K-C9336C-FX
The 100GE to 25GE breakout feature is supported on the access facing ports of the following switches:
• N9K-C93180LC-EX
• N9K-C9336C-FX2
are processed individually in the fabric and restored to double-tags when egressing the ACI switch. Ingressing
single-tagged and untagged traffic is dropped.
This feature is only supported on Nexus 9300-FX platform switches.
Both the outer and inner tag must be of EtherType 0x8100.
MAC learning and routing are based on the EPG port, sclass, and VRF, not on the access encapsulations.
QoS priority settings are supported, derived from the outer tag on ingress, and rewritten to both tags on egress.
EPGs can simultaneously be associated with other interfaces on a leaf switch, that are configured for
single-tagged VLANs.
Service graphs are supported for provider and consumer EPGs that are mapped to Q-in-Q encapsulated
interfaces. You can insert service graphs, as long as the ingress and egress traffic on the service nodes is in
single-tagged encapsulated frames.
The following features and options are not supported with this feature:
• Per-Port VLAN feature
• FEX connections
• Mixed Mode is not supported. For example, an interface in Q-in-Q encapsulation mode can have a static
path binding to an EPG with double-tagged encapsulation only, not with regular VLAN encapsulation.
• STP and the “Flood in Encapsulation” option
• Untagged and 802.1p mode
• Multi-pod and Multi-Site
• Legacy bridge domain
• L2Out and L3Out connections
• VMM integration
• Changing a port mode from routed to Q-in-Q encapsulation mode is not supported
• Per-vlan MCP is not supported between ports in Q-in-Q encapsulation mode and ports in regular trunk
mode.
• When VPC ports are enabled for Q-in-Q encapsulation mode, VLAN consistency checks are not
performed.
Layer 2 Multicast
About Cisco APIC and IGMP Snooping
IGMP snooping is the process of listening to Internet Group Management Protocol (IGMP) network traffic.
The feature allows a network switch to listen in on the IGMP conversation between hosts and routers and
filter multicasts from links that do not need them, thus controlling which ports receive specific multicast traffic.
Cisco APIC provides support for the full IGMP snooping feature included on a traditional switch such as the
N9000 standalone.
• Policy-based IGMP snooping configuration per bridge domain
APIC enables you to configure a policy in which you enable, disable, or customize the properties of
IGMP Snooping on a per bridge-domain basis. You can then apply that policy to one or multiple bridge
domains.
• Static port group implementation
IGMP static port grouping enables you to pre-provision ports, already statically-assigned to an application
EPG, as the switch ports to receive and process IGMP multicast traffic. This pre-provisioning prevents
the join latency which normally occurs when the IGMP snooping stack learns ports dynamically.
Static group membership can be pre-provisioned only on static ports (also called static-binding ports)
assigned to an application EPG.
• Access group configuration for application EPGs
An “access-group” is used to control what streams can be joined behind a given port.
An access-group configuration can be applied on interfaces that are statically assigned to an application
EPG in order to ensure that the configuration can be applied on ports that will actually belong to that
EPG.
Only Route-map-based access groups are allowed.
Note You can use vzAny to enable protocols such as IGMP Snooping for all the EPGs in a VRF. For more
information about vzAny, see Use vzAny to Automatically Apply Communication Rules to all EPGs in
a VRF.
To use vzAny, navigate to Tenants > tenant-name > Networking > VRFs > vrf-name > EPG Collection
for VRF.
Note We recommend that you do not disable IGMP snooping on bridge domains. If you disable IGMP snooping,
you may see reduced multicast performance because of excessive false flooding within the bridge domain.
IGMP snooping software examines IP multicast traffic within a bridge domain to discover the ports where
interested receivers reside. Using the port information, IGMP snooping can reduce bandwidth consumption
in a multi-access bridge domain environment to avoid flooding the entire bridge domain. By default, IGMP
snooping is enabled on the bridge domain.
This figure shows the IGMP routing functions and IGMP snooping functions both contained on an ACI leaf
switch with connectivity to a host. The IGMP snooping feature snoops the IGMP membership reports and
Leave messages and forwards them only when necessary to the IGMP router function.
IGMP snooping operates upon IGMPv1, IGMPv2, and IGMPv3 control plane packets where Layer 3 control
plane packets are intercepted and influence the Layer 2 forwarding behavior.
IGMP snooping has the following proprietary features:
• Source filtering that allows forwarding of multicast packets based on destination and source IP addresses
• Multicast forwarding based on IP addresses rather than the MAC address
• Multicast forwarding alternately based on the MAC address
Note For more information about IGMP snooping, see RFC 4541.
Virtualization Support
You can define multiple virtual routing and forwarding (VRF) instances for IGMP snooping.
On leaf switches, you can use the show commands with a VRF argument to provide a context for the information
displayed. The default VRF is used if no VRF argument is supplied.
The APIC IGMP Snooping Function, IGMPv1, IGMPv2, and the Fast Leave
Feature
Both IGMPv1 and IGMPv2 support membership report suppression, which means that if two hosts on the
same subnet want to receive multicast data for the same group, the host that receives a membership report from
the other host suppresses sending its report. Membership report suppression occurs for hosts that share a port.
If no more than one host is attached to each switch port, you can configure the fast leave feature in IGMPv2.
The fast leave feature does not send last member query messages to hosts. As soon as APIC receives an IGMP
leave message, the software stops forwarding multicast data to that port.
IGMPv1 does not provide an explicit IGMP leave message, so the APIC IGMP snooping function must rely
on the membership message timeout to indicate that no hosts remain that want to receive multicast data for a
particular group.
Note The IGMP snooping function ignores the configuration of the last member query interval when you enable
the fast leave feature because it does not check for remaining hosts.
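The difference between fast leave and the normal last member query handling can be sketched as a small state model for one multicast group. The class and method names are hypothetical, and the model omits timers and real packet handling.

```python
class SnoopingGroup:
    """Toy forwarding state for one multicast group on a snooping switch."""

    def __init__(self, fast_leave: bool = False):
        self.fast_leave = fast_leave
        self.ports = set()          # ports currently receiving the group
        self.pending_query = set()  # ports waiting on a last member query

    def report(self, port: str) -> None:
        # A membership report (re)adds the port and cancels any pending removal.
        self.ports.add(port)
        self.pending_query.discard(port)

    def leave(self, port: str) -> None:
        if self.fast_leave:
            # Fast leave: stop forwarding at once, no last member query is sent.
            self.ports.discard(port)
        elif port in self.ports:
            # Normal leave: keep forwarding until the last member query times out.
            self.pending_query.add(port)

    def query_timeout(self, port: str) -> None:
        # No host answered the last member query, so the port is removed.
        if port in self.pending_query:
            self.pending_query.discard(port)
            self.ports.discard(port)
```

With fast leave enabled, a leave message removes the port immediately. Without it, the port keeps receiving traffic until the query timeout expires or another host on the port sends a report, which is why fast leave suits ports with no more than one attached host.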
Note The IP address for the querier should not be a broadcast IP address, multicast IP address, or 0 (0.0.0.0).
When an IGMP snooping querier is enabled, it sends out periodic IGMP queries that trigger IGMP report
messages from hosts that want to receive IP multicast traffic. IGMP snooping listens to these IGMP reports
to establish appropriate forwarding.
The IGMP snooping querier performs querier election as described in RFC 2236. Querier election occurs in
the following configurations:
• When there are multiple switch queriers configured with the same subnet on the same VLAN on different
switches.
• When the configured switch querier is in the same subnet as other Layer 3 SVI queriers.
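Querier election under RFC 2236 selects the querier with the lowest IP address on the subnet. The sketch below combines that rule with the address restrictions from the note above; the function names and the idea of passing the subnet explicitly to detect its broadcast address are assumptions for illustration.

```python
import ipaddress


def valid_querier_ip(addr: str, subnet: str) -> bool:
    """Reject querier addresses that are 0.0.0.0, multicast, or a broadcast address."""
    ip = ipaddress.IPv4Address(addr)
    net = ipaddress.ip_network(subnet)
    limited_broadcast = ipaddress.IPv4Address("255.255.255.255")
    return not (
        ip.is_unspecified
        or ip.is_multicast
        or ip == limited_broadcast
        or ip == net.broadcast_address
    )


def elect_querier(candidates, subnet: str) -> str:
    """Return the winning querier: the lowest valid IP address among the candidates."""
    valid = [ipaddress.IPv4Address(c) for c in candidates if valid_querier_ip(c, subnet)]
    if not valid:
        raise ValueError("no valid querier address among the candidates")
    return str(min(valid))
```

Addresses are compared numerically, not as strings, so 10.0.0.2 wins over 10.0.0.10.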
Does not enforce serial number based authorization Enforces serial number authorization.
You can verify the configuration and the conversion of the ports using the show interface brief CLI command.
Note Port profile can be deployed only on the top ports of a Cisco N9K-C93180LC-EX switch, for example,
1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, and 23. When the top port is converted using the port profile, the bottom
ports are hardware disabled. For example, if Eth 1/1 is converted using the port profile, Eth 1/2 is hardware
disabled.
The following example shows the output when an uplink port is converted to a downlink port. Before the
conversion, the keyword routed in the output identifies the port as an uplink port.
After you configure the port profile and reload the switch, the keyword trunk in the output identifies the port
as a downlink port.
Restrictions
Port profiles and breakout ports are not supported on the same ports.
Fast Link Failover policies and port profiles cannot be used together. If a port profile is enabled, Fast
Link Failover cannot be enabled, and vice versa.
Guidelines
Note In the GUI, uplink or downlink port configuration is located under Fabric > Inventory > Topology >
Interface > Add Switches. The mode selected is Configuration.
Note If a decommissioned node has the Port Profile feature deployed on it, the port conversions are not removed
even after the node is decommissioned. You must manually delete the configurations after
decommissioning for the ports to return to the default state. To do this, log on to the switch, run the
setup-clean-config.sh script, and wait for it to finish. Then enter the reload command.
Note When you enable or disable Federal Information Processing Standards (FIPS) on a Cisco ACI fabric, you
must reload each of the switches in the fabric for the change to take effect. The configured scale profile
setting is lost when you issue the first reload after changing the FIPS configuration. The switch remains
operational, but it uses the default scale profile. This issue does not happen on subsequent reloads if the
FIPS configuration has not changed.
FIPS is supported on Cisco NX-OS release 13.1(1) or later.
If you must downgrade the firmware from a release that supports FIPS to a release that does not support
FIPS, you must first disable FIPS on the Cisco ACI fabric and reload all the switches in the fabric for the
FIPS configuration change to take effect.
Note When the maximum uplink port limit is reached and ports 25 and 27 are converted from uplink to downlink
and back to uplink on Cisco 93180LC-EX switches:
On Cisco 93180LC-EX switches, ports 25 and 27 are the native uplink ports. If you use the port profile
to convert ports 25 and 27 to downlink ports, ports 29, 30, 31, and 32 are still available as four native
uplink ports. Because the maximum number of uplink ports is 12, you can convert 8 more downlink ports
to uplink ports. For example, if ports 1, 3, 5, 7, 9, 13, 15, and 17 are converted to uplink ports, they and
the 4 native uplink ports 29, 30, 31, and 32 together reach the maximum uplink port limit on
Cisco 93180LC-EX switches.
When the switch is in this state and if the port profile configuration is deleted on ports 25 and 27, ports
25 and 27 are converted back to uplink ports, but there are already 12 uplink ports on the switch (as
mentioned earlier). To accommodate ports 25 and 27 as uplink ports, 2 random ports from the port range
1, 3, 5, 7, 9, 13, 15, 17 are denied the uplink conversion and this situation cannot be controlled by the
user.
Therefore, it is mandatory to clear all the faults before reloading the leaf node to avoid any unexpected
behavior regarding the port type. It should be noted that if a node is reloaded without clearing the port
profile faults, especially when there is a fault related to limit-exceed, the port might not be in an expected
operational state.
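The arithmetic in this note can be checked with a short Python sketch. The function name and its parameters are hypothetical and exist only for illustration; the limit of 12 uplink ports and the port numbers come from the note above.

```python
MAX_UPLINKS = 12  # maximum uplink port limit stated in the note


def denied_conversions(native, converted, restored, limit=MAX_UPLINKS):
    """Return how many converted ports must lose their uplink role.

    native:    ports that are uplinks by default and still act as uplinks
    converted: downlink ports profiled as uplink ports
    restored:  native uplinks that return after their port profile is deleted
    """
    total = len(set(native) | set(converted) | set(restored))
    return max(0, total - limit)


native = {29, 30, 31, 32}
converted = {1, 3, 5, 7, 9, 13, 15, 17}
print(denied_conversions(native, converted, restored=set()))     # 0
print(denied_conversions(native, converted, restored={25, 27}))  # 2
```

The sketch only counts the excess ports; which two converted ports are denied is not predictable, as the note states.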
The downlink ports are capable of 50-Gigabit operation, but this speed is not currently supported.
Odd Numbered Port (1 to 23)                 Even Numbered Port (2 to 24) below the Odd Numbered Port
40-Gigabit QSFP+ downlink port (default)    40-Gigabit QSFP+ downlink port (default)
Note The breakout feature is available only for downlink ports 1 to 23, unless those ports are
profiled as uplink ports.
• Uplink ports (25, 27, and 29 to 32) individually used, configured, or profiled as follows:
◦40/100-Gigabit QSFP+/QSFP28 uplink port (default)
◦Profiled as 40/100-Gigabit downlink port
Note Port tracking is located under Fabric > Access Policies > Global Policies > Port Tracking.
Each leaf switch can have up to 6 uplink connections to each spine switch. The port tracking policy specifies
the number of uplink connections that trigger the policy, and a delay timer for bringing the leaf switch access
ports back up after the number of specified uplinks is exceeded.
The following example illustrates how a port tracking policy behaves:
• The leaf switches each have 6 active uplink connections to the spine switches.
• The port tracking policy specifies that the threshold of active uplink connections on each leaf switch that
triggers the policy is 2.
• The port tracking policy triggers when the number of active uplink connections from the leaf switch to
the spine switches drops to 2.
• Each leaf switch monitors its uplink connections and triggers the port tracking policy according to the
threshold specified in the policy.
• When the uplink connections come back up, the leaf switch waits for the delay timer to expire before
bringing its access ports back up. This gives the fabric time to reconverge before allowing traffic to
resume on leaf switch access ports. Large fabrics may need the delay timer to be set for a longer time.
Note Use caution when configuring this policy. If the port tracking setting for the number of active spine links
that triggers port tracking is too high, all leaf switch access ports will be brought down.
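The behavior in the example above can be sketched in Python. This is a simplified model, not APIC code: the class names, the time values, and the assumption that the policy triggers when the active uplink count is at or below the threshold are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class PortTrackingPolicy:
    """Hypothetical model of a port tracking policy (not an APIC class)."""
    trigger_threshold: int  # active uplink count that triggers the policy
    restore_delay_s: int    # delay before access ports are brought back up


class LeafSwitch:
    def __init__(self, policy, active_uplinks):
        self.policy = policy
        self.active_uplinks = active_uplinks
        self.access_ports_up = True
        self.restore_at = None  # time at which access ports may come back up

    def update(self, active_uplinks, now_s):
        """Re-evaluate the policy after an uplink change or a timer tick."""
        self.active_uplinks = active_uplinks
        if active_uplinks <= self.policy.trigger_threshold:
            # Too few uplinks: bring access ports down, cancel pending restore.
            self.access_ports_up = False
            self.restore_at = None
        elif not self.access_ports_up:
            if self.restore_at is None:
                # Uplinks recovered: start the delay timer.
                self.restore_at = now_s + self.policy.restore_delay_s
            elif now_s >= self.restore_at:
                self.access_ports_up = True
                self.restore_at = None


leaf = LeafSwitch(PortTrackingPolicy(trigger_threshold=2, restore_delay_s=120), 6)
leaf.update(active_uplinks=3, now_s=0)    # above threshold: ports stay up
leaf.update(active_uplinks=2, now_s=10)   # drops to 2: policy triggers
leaf.update(active_uplinks=6, now_s=20)   # uplinks back: delay timer starts
leaf.update(active_uplinks=6, now_s=139)  # timer has not expired yet
still_down = not leaf.access_ports_up
leaf.update(active_uplinks=6, now_s=140)  # timer expired: ports restored
```

In this model, a threshold of 6 on a leaf switch with 6 uplinks brings the access ports down on the first evaluation, which is the misconfiguration the preceding note warns about.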
Note Fast Link Failover is located under Fabric > Access Policies > Switch Policies > Policies > Fast Link
Failover.
If the limit is not reached, the endpoint is learned and a verification is made to see if the limit is reached
because of this new endpoint. If the limit is reached, and the learn disable action is configured, learning will
be disabled in the hardware on that interface (on the physical interface or on a port channel or vPC). If the
limit is reached and the learn disable action is not configured, the endpoint will be installed in hardware with
a drop action. Such endpoints are aged normally like any other endpoints.
When the limit is reached for the first time, the operational state of the port security policy object is updated
to reflect it. A static rule is defined to raise a fault so that the user is alerted. A syslog is also raised when the
limit is reached.
In the case of a vPC, when the MAC limit is reached, the peer leaf switch is also notified so that learning can be
disabled on the peer. Because the vPC peer can be rebooted at any time, or vPC legs can become nonoperational or
restart, this state is reconciled with the peer so that the vPC peers do not go out of sync. If they get out of sync,
learning can be enabled on one leg and disabled on the other leg.
By default, once the limit is reached and learning is disabled, it will be automatically re-enabled after the
default timeout value of 60 seconds.
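The decision flow described above can be sketched as follows. This is one possible reading of the text, not the switch implementation: the class, method, and return values are hypothetical, and the sketch assumes that learning is disabled as soon as the endpoint that reaches the limit is learned.

```python
class PortSecurityModel:
    """Hypothetical per-interface model; not an APIC or switch API."""

    def __init__(self, mac_limit, learn_disable, timeout_s=60):
        self.mac_limit = mac_limit
        self.learn_disable = learn_disable  # is the learn disable action set?
        self.timeout_s = timeout_s          # default timeout from the text
        self.learned = set()                # endpoints learned normally
        self.drop_entries = set()           # endpoints installed with drop
        self.disabled_until = None          # when learning is re-enabled

    def on_new_mac(self, mac, now_s):
        # Re-enable learning once the timeout has elapsed.
        if self.disabled_until is not None and now_s >= self.disabled_until:
            self.disabled_until = None
        if self.disabled_until is not None:
            return "not-learned"
        if len(self.learned) < self.mac_limit:
            self.learned.add(mac)
            # Verify whether this endpoint caused the limit to be reached.
            if len(self.learned) == self.mac_limit and self.learn_disable:
                self.disabled_until = now_s + self.timeout_s
            return "learned"
        if self.learn_disable:
            self.disabled_until = now_s + self.timeout_s
            return "not-learned"
        # Limit reached without learn disable: install with a drop action.
        self.drop_entries.add(mac)
        return "drop"


strict = PortSecurityModel(mac_limit=2, learn_disable=True)
results = [strict.on_new_mac("aa", 0), strict.on_new_mac("bb", 1),
           strict.on_new_mac("cc", 10)]
```

Fault and syslog generation, and the vPC peer reconciliation described above, are intentionally left out of the sketch.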
Protect Mode
The protect mode prevents further port security violations from occurring. Once the MAC limit exceeds the
maximum configured value on a port, all traffic from excess MAC addresses will be dropped and further
learning is disabled.
FHS features are enabled on a per tenant bridge domain (BD) basis. Because the bridge domain may be deployed
on a single leaf switch or across multiple leaf switches, the FHS threat control and mitigation mechanisms cater
to both single switch and multiple switch scenarios.
About MACsec
MACsec is an IEEE 802.1AE standards-based Layer 2 hop-by-hop encryption that provides data confidentiality
and integrity for media access independent protocols.
MACsec provides MAC-layer encryption over wired networks by using out-of-band methods for encryption
keying. The MACsec Key Agreement (MKA) Protocol provides the required session keys and manages the
required encryption keys.
The 802.1AE encryption with MKA is supported on all types of links, that is, host facing links (links between
network access devices and endpoint devices such as a PC or IP phone), or links connected to other switches
or routers.
MACsec encrypts the entire data except for the Source and Destination MAC addresses of an Ethernet packet.
The user also has the option to skip encryption up to 50 bytes after the source and destination MAC address.
To provide MACsec services over the WAN or Metro Ethernet, service providers offer Layer 2 transparent
services such as E-Line or E-LAN using various transport layer protocols such as Ethernet over Multiprotocol
Label Switching (EoMPLS) and L2TPv3.
The packet body in an EAP-over-LAN (EAPOL) Protocol Data Unit (PDU) is referred to as a MACsec Key
Agreement PDU (MKPDU). When no MKPDU is received from a participant after 3 heartbeats (each heartbeat
is 2 seconds), peers are deleted from the live peer list. For example, if a client disconnects, the participant
on the switch continues to operate MKA until 3 heartbeats have elapsed after the last MKPDU is received
from the client.
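The timing above works out to 3 x 2 = 6 seconds without an MKPDU before a peer is removed. A minimal Python sketch of that check follows; the function name and the dictionary of timestamps are hypothetical, and treating exactly 6 seconds as expired is an assumption.

```python
MKA_HEARTBEAT_S = 2        # each heartbeat is 2 seconds
MKA_MISSED_HEARTBEATS = 3  # peers are deleted after 3 heartbeats


def live_peers(last_mkpdu_s, now_s):
    """Return peers whose last MKPDU arrived within the liveness timeout.

    last_mkpdu_s maps a peer name to the time its last MKPDU was received.
    """
    timeout_s = MKA_HEARTBEAT_S * MKA_MISSED_HEARTBEATS  # 6 seconds
    return sorted(p for p, t in last_mkpdu_s.items() if now_s - t < timeout_s)


peers = {"client-a": 0.0, "client-b": 5.0}
print(live_peers(peers, now_s=5.5))  # ['client-a', 'client-b']
print(live_peers(peers, now_s=6.0))  # ['client-b']
```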
A node can have multiple policies deployed for more than one fabric link. When this happens, the per fabric
interface keychain and policy are given preference on the affected interface. The auto generated keychain and
associated MACsec policy are then given the least preference.
APIC MACsec supports two security modes. The must secure mode allows only encrypted traffic on the
link, while the should secure mode allows both clear and encrypted traffic on the link. Before deploying MACsec
in must secure mode, the keychain must be deployed on the affected links or the links will go down. For
example, a port can turn on MACsec in must secure mode before its peer has received its keychain, resulting
in the link going down. To address this issue, deploy MACsec in should secure mode and, once all the links
are up, change the security mode to must secure.
Note Any MACsec interface configuration change will result in packet drops.
MACsec policy definition consists of configuration specific to keychain definition and configuration related
to feature functionality. The keychain definition and feature functionality definitions are placed in separate
policies. Enabling MACsec per Pod or per interface involves deploying a combination of a keychain policy
and MACsec functionality policy.
Note Using internally generated keychains does not require the user to specify a keychain.
Note Egress data plane policers are not supported on switched virtual interfaces (SVI).
DPP policies can be single-rate, dual-rate, and color-aware. Single-rate policies monitor the committed
information rate (CIR) of traffic. Dual-rate policers monitor both CIR and peak information rate (PIR) of
traffic. In addition, the system monitors associated burst sizes. Three colors, or conditions, are determined by
the policer for each packet depending on the data rate parameters supplied: conform (green), exceed (yellow),
or violate (red).
Typically, DPP policies are applied to physical or virtual Layer 2 connections for virtual or physical devices
such as servers or hypervisors, and on Layer 3 connections for routers. DPP policies applied to leaf switch
access ports are configured in the fabric access (infraInfra) portion of the ACI fabric, and must be configured
by a fabric administrator. DPP policies applied to interfaces on border leaf switch access ports (l3extOut or
l2extOut) are configured in the tenant (fvTenant) portion of the ACI fabric, and can be configured by a tenant
administrator.
Only one action can be configured for each condition. For example, a DPP policy can be configured to conform
to a data rate of 256000 bits per second, with bursts of up to 200 milliseconds. The system applies the conform
action to traffic that falls within this rate, and applies the violate action to traffic that exceeds this rate. Color-aware
policies assume that traffic has been previously marked with a color. This information is then used in the
actions taken by this type of policer.
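The single-rate example above can be illustrated with a token bucket, a common way to model a policer. This is a sketch under stated assumptions, not the switch implementation: the class name is hypothetical, and the burst size is derived here as the number of bytes the CIR delivers in 200 milliseconds (256000 / 8 x 0.2 = 6400 bytes).

```python
class SingleRatePolicer:
    """Hypothetical two-outcome token bucket (conform or violate)."""

    def __init__(self, cir_bps, burst_ms):
        self.rate_bytes_per_s = cir_bps / 8
        # Burst size in bytes: what the CIR delivers during burst_ms.
        self.capacity = self.rate_bytes_per_s * burst_ms / 1000
        self.tokens = self.capacity
        self.last_s = 0.0

    def classify(self, size_bytes, now_s):
        # Refill tokens for the elapsed time, capped at the burst size.
        elapsed = now_s - self.last_s
        self.last_s = now_s
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return "conform"
        return "violate"


policer = SingleRatePolicer(cir_bps=256000, burst_ms=200)
# Five 1500-byte packets arriving at the same instant: the 6400-byte
# bucket covers four of them, so the fifth exceeds the configured rate.
burst = [policer.classify(1500, now_s=0.0) for _ in range(5)]
```

A dual-rate policer would add a second bucket for the PIR and a third outcome (exceed); that case is not shown.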
Scheduler
A schedule allows operations, such as configuration import/export or tech support collection, to occur during
one or more specified windows of time.
A schedule contains a set of time windows (occurrences). These windows can be one time only or can recur
at a specified time and day each week. The options defined in the window, such as the duration or the maximum
number of tasks to be run, determine when a scheduled task executes. For example, if a change cannot be
deployed during a given maintenance window because the maximum duration or number of tasks has been
reached, that deployment is carried over to the next maintenance window.
Each schedule checks periodically to see whether the APIC has entered one or more maintenance windows.
If it has, the schedule executes the deployments that are eligible according to the constraints specified in the
maintenance policy.
A schedule contains one or more occurrences, which determine the maintenance windows associated with
that schedule. An occurrence can be one of the following:
• One-time Window—Defines a schedule that occurs only once. This window continues until the maximum
duration of the window or the maximum number of tasks that can be run in the window has been reached.
• Recurring Window—Defines a repeating schedule. This window continues until the maximum number
of tasks or the end of the day specified in the window has been reached.
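The carry-over behavior described above can be sketched in Python. The function and window names are hypothetical, and the sketch models only the maximum number of tasks per window; the maximum duration constraint is omitted.

```python
from collections import deque


def run_windows(pending, windows):
    """Assign queued tasks to maintenance windows in order.

    pending: task names waiting to be deployed
    windows: list of (window_name, max_tasks) tuples, in time order
    Returns (plan, leftover): tasks run per window, and tasks still queued.
    """
    queue = deque(pending)
    plan = {}
    for name, max_tasks in windows:
        count = min(max_tasks, len(queue))
        # Tasks beyond max_tasks stay queued for the next window.
        plan[name] = [queue.popleft() for _ in range(count)]
    return plan, list(queue)


plan, leftover = run_windows(
    ["export-1", "export-2", "techsupport-1"],
    [("sat-window", 2), ("sun-window", 2)],
)
print(plan)      # {'sat-window': ['export-1', 'export-2'], 'sun-window': ['techsupport-1']}
print(leftover)  # []
```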
After a schedule is configured, it can then be selected and applied to the following export and firmware policies
during their configuration:
• Tech Support Export Policy
• Configuration Export Policy -- Daily AutoBackup
• Firmware Download
Firmware Upgrade
Policies on the APIC manage the following aspects of the firmware upgrade processes:
• What version of firmware to use.
• Downloading firmware images from Cisco to the APIC repository.
• Compatibility enforcement.
• What to upgrade:
◦Switches
◦The APIC
◦The compatibility catalog
Each firmware image includes a compatibility catalog that identifies supported types and switch models. The
APIC maintains a catalog of the firmware images, switch types, and models that are allowed to use that
firmware image. The default setting is to reject a firmware update when it does not conform to the compatibility
catalog.
The APIC, which performs image management, has an image repository for compatibility catalogs, APIC
controller firmware images, and switch images. The administrator can download new firmware images to the
APIC image repository from an external HTTP server or SCP server by creating an image source policy.
Firmware Group policies on the APIC define what firmware version is needed.
Maintenance Group policies define when to upgrade firmware, which nodes to upgrade, and how to handle
failures. In addition, Maintenance Group policies define groups of nodes that can be upgraded together and
assign those maintenance groups to schedules. Node group options include all leaf nodes, all spine nodes, or
sets of nodes that are a portion of the fabric.
The APIC controller firmware upgrade policy always applies to all nodes in the cluster, but the upgrade is
always done one node at a time. The APIC GUI provides real-time status information about firmware upgrades.
Note If a recurring or one-time upgrade schedule is set with a date and time in the past, the scheduler triggers
the upgrade immediately.
The following figure shows the APIC cluster nodes firmware upgrade process.
• The upgrade proceeds one node at a time until all nodes in the cluster are upgraded.
Note Because the APIC is a replicated cluster of nodes, disruption should be minimal. An
administrator should be aware of the system load when considering scheduling APIC
upgrades, and should plan for an upgrade during a maintenance window.
• The ACI fabric, including the APIC, continues to run while the upgrade proceeds.
Note The controllers upgrade in random order. Each APIC controller takes about 10 minutes
to upgrade. Once a controller image is upgraded, it drops from the cluster, and it reboots
with the newer version while the other APIC controllers in the cluster remain operational.
Once the controller reboots, it joins the cluster again. Then the cluster converges, and
the next controller image starts to upgrade. If the cluster does not immediately converge
and is not fully fit, the upgrade will wait until the cluster converges and is fully fit.
During this period, a Waiting for Cluster Convergence message is displayed.
• If a controller node upgrade fails, the upgrade pauses and waits for manual intervention.
The following figure shows how this process works for upgrading all the ACI fabric switch nodes firmware.
• Because the administrator configured the controller update policy with a start time of midnight Saturday,
the APIC begins the upgrade at midnight on Saturday.
• The system checks for compatibility of the existing firmware to upgrade to the new version according
to the compatibility catalog provided with the new firmware image.
• The upgrade proceeds five nodes at a time until all the specified nodes are upgraded.
Note A firmware upgrade causes a switch reboot; the reboot can disrupt the operation of the
switch for several minutes. Schedule firmware upgrades during a maintenance window.
• If a switch node fails to upgrade, the upgrade pauses and waits for manual intervention.
Refer to the Cisco APIC Management, Installation, Upgrade, and Downgrade Guide for detailed step-by-step
instructions for performing firmware upgrades.
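As a rough illustration of the switch upgrade flow above, the following Python sketch upgrades nodes five at a time and pauses at the first batch that contains a failure. The function names and the upgrade_node callback are hypothetical stand-ins; the real process is driven by APIC maintenance policies.

```python
def batches(nodes, size):
    """Split the node list into consecutive groups of at most `size`."""
    return [nodes[i:i + size] for i in range(0, len(nodes), size)]


def run_upgrade(nodes, upgrade_node, batch_size=5):
    """Upgrade nodes batch by batch; pause when any node in a batch fails.

    upgrade_node(node) is a hypothetical callback that returns True on
    success. Returns (upgraded, failed); a non-empty `failed` list means
    the upgrade paused and is waiting for manual intervention.
    """
    upgraded = []
    for batch in batches(nodes, batch_size):
        results = {node: upgrade_node(node) for node in batch}
        upgraded.extend(node for node, ok in results.items() if ok)
        failed = [node for node, ok in results.items() if not ok]
        if failed:
            return upgraded, failed
    return upgraded, []


nodes = list(range(101, 113))  # 12 example node IDs
done, failed = run_upgrade(nodes, upgrade_node=lambda n: n != 107)
# Node 107 fails in the second batch, so nodes 111 and 112 are never attempted.
```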
Configuration Zones
Configuration zones divide the ACI fabric into different zones that can be updated with configuration changes
at different times. This limits the risk of deploying a faulty fabric-wide configuration that might disrupt traffic
or even bring the fabric down. An administrator can deploy a configuration to a non-critical zone, and then
deploy it to critical zones when satisfied that it is suitable.
The following policies specify configuration zone actions:
• infrazone:ZoneP is automatically created upon system upgrade. It cannot be deleted or modified.
• infrazone:Zone contains one or more pod groups (PodGrp) or one or more node groups (NodeGrp).
Note You can only choose PodGrp or NodeGrp; both cannot be chosen.
A node can be part of only one zone (infrazone:Zone). NodeGrp has two properties: name, and
deployment mode. The deployment mode property can be:
◦enabled - Pending updates are sent immediately.
◦disabled - New updates are postponed.
◦triggered - Pending updates are sent immediately, and the deployment mode is automatically
reset to the value it had before the change to triggered.
When a policy on a given set of nodes is created, modified, or deleted, updates are sent to each node where
the policy is deployed. Based on policy class and infrazone configuration, the following happens:
• For policies that do not follow infrazone configuration, the APIC sends updates immediately to all the
fabric nodes.
• For policies that follow infrazone configuration, the update proceeds according to the infrazone
configuration:
◦If a node is part of an infrazone:Zone, the update is sent immediately if the deployment mode of
the zone is set to enabled; otherwise the update is postponed.
◦If a node is not part of an infrazone:Zone, the update is done immediately, which is the ACI fabric
default behavior.
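The three deployment modes can be sketched as a small state model. The Zone class below is hypothetical and only mirrors the rules stated above for policies that follow the infrazone configuration; it is not the infrazone:Zone managed object.

```python
class Zone:
    """Hypothetical model of a configuration zone's deployment mode."""

    def __init__(self, mode="enabled"):
        self.mode = mode   # "enabled" or "disabled"
        self.pending = []  # postponed updates
        self.sent = []     # updates delivered to the zone's nodes

    def submit(self, update):
        """Handle an update for a policy that follows infrazone config."""
        if self.mode == "enabled":
            self.sent.append(update)
        else:
            self.pending.append(update)

    def _flush(self):
        self.sent.extend(self.pending)
        self.pending.clear()

    def set_mode(self, mode):
        if mode == "triggered":
            # Send pending updates, then keep the previous mode.
            self._flush()
            return
        self.mode = mode
        if mode == "enabled":
            self._flush()


zone = Zone(mode="disabled")
zone.submit("update-1")     # postponed
zone.set_mode("triggered")  # update-1 is sent; mode reverts to disabled
zone.submit("update-2")     # postponed again
```

A node that is not in any zone, or a policy that does not follow the infrazone configuration, bypasses this model and is updated immediately.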
Geolocation
Administrators use geolocation policies to map the physical location of ACI fabric nodes in data center
facilities. The following figure shows an example of the geolocation mapping feature.
For example, for fabric deployment in a single room, an administrator would use the default room object, and
then create one or more racks to match the physical location of the switches. For a larger deployment, an
administrator can create one or more site objects. Each site can contain one or more buildings. Each building
has one or more floors. Each floor has one or more rooms, and each room has one or more racks. Finally each
rack can be associated with one or more switches.
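The containment hierarchy described above (site, building, floor, room, rack, switch) can be pictured as a nested structure. The names below are invented for illustration and are not APIC object names.

```python
# Hypothetical layout: site > building > floor > room > rack > switches.
geo = {
    "site-1": {
        "building-A": {
            "floor-1": {
                "room-101": {
                    "rack-7": ["leaf-101", "leaf-102"],
                    "rack-8": ["spine-201"],
                }
            }
        }
    }
}


def locate(tree, switch, path=()):
    """Return the (site, building, floor, room, rack) path of a switch."""
    for name, child in tree.items():
        if isinstance(child, list):
            # A list is the set of switches associated with a rack.
            if switch in child:
                return path + (name,)
        else:
            found = locate(child, switch, path + (name,))
            if found:
                return found
    return None


print(locate(geo, "spine-201"))
# ('site-1', 'building-A', 'floor-1', 'room-101', 'rack-8')
```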
For configuration details, see Routed Connectivity to External Networks in Cisco APIC Layer 3 Configuration
Guide.
multipod traffic is transiting an IPN, use a DSCP policy. For more information, see Preserving QoS Priority
Settings in a Multipod Fabric, on page 96.
Observe the following 802.1P CoS preservation guidelines and limitations:
• The current release can only preserve the 802.1P value within a VLAN header. The DEI bit is not
preserved.
• For VXLAN encapsulated packets, the current release will not preserve the 802.1P CoS value contained
in the outer header.
• 802.1P is not preserved when the following configuration options are enabled:
◦Multipod QoS (using a DSCP policy) is enabled.
◦Contracts are configured that include QoS.
◦Dynamic packet prioritization is enabled.
◦The outgoing interface is on a FEX.
◦Traffic is flowing from an EPG with isolation enforced to an EPG without isolation enforced.
◦A DSCP QoS policy is configured on a VLAN EPG and the packet has an IP header.
DSCP marking can be set at the filter level on the following, in precedence order from the innermost to the
outermost:
◦Contract
◦Subject
◦In Term
◦Out Term
Note When specifying vzAny for a contract, external EPG DSCP values are not honored
because vzAny is a collection of all EPGs in a VRF, and EPG specific configuration
cannot be applied. If EPG specific target DSCP values are required, then the external
EPG should not use vzAny.
Note You can alternatively use CoS Preservation where you want to preserve the QoS priority settings of 802.1P
traffic entering POD 1 and egressing out of POD 2, but you are not concerned with preserving the
CoS/DSCP settings in interpod network (IPN) traffic between the pods. For more information, see
Preserving 802.1P Class of Service Settings, on page 95.
As illustrated in this figure, traffic between pods in a multipod topology passes through an IPN, which may
not be under APIC management. When an 802.1P frame is sent from a spine or leaf switch in POD 1, the
devices in the IPN may not preserve the CoS setting in 802.1P frames. In this situation, when the frame reaches
a POD 2 spine or leaf switch, it has the CoS level assigned by the IPN device, instead of the level assigned
at the source in POD 1. Use a DSCP policy to ensure that the QoS priority levels are preserved in this case.
Configure a DSCP policy to preserve the QoS priority settings in a multipod topology, where there is a need
to do deterministic mapping from CoS to DSCP levels for different traffic types, and you want to prevent the
devices in the IPN from changing the configured levels. With a DSCP policy enabled, APIC converts the CoS
level to a DSCP level, according to the mapping you configure. When a frame is sent from POD 1, its PCP
level is mapped to a DSCP level; when the frame reaches POD 2, the mapped DSCP level is mapped back
to the original PCP CoS level.
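The round trip described above depends on a one-to-one mapping between CoS and DSCP levels. The following Python sketch shows the idea; the mapping values (the class selector code points CS0 to CS7) are illustrative assumptions, not values that APIC configures by default, and the function names are hypothetical.

```python
# Illustrative mapping only: CoS n -> DSCP class selector CSn (n * 8).
COS_TO_DSCP = {cos: cos * 8 for cos in range(8)}

# The reverse lookup is well defined only if no DSCP value is reused.
assert len(set(COS_TO_DSCP.values())) == len(COS_TO_DSCP)
DSCP_TO_COS = {dscp: cos for cos, dscp in COS_TO_DSCP.items()}


def leave_pod(cos):
    """Map the CoS level to a DSCP level before the frame enters the IPN."""
    return COS_TO_DSCP[cos]


def enter_pod(dscp):
    """Map the DSCP level back to the original CoS level in the remote pod."""
    return DSCP_TO_COS[dscp]


dscp = leave_pod(5)          # 40 with the illustrative mapping
restored = enter_pod(dscp)   # 5
```

If two CoS levels were mapped to the same DSCP value, the original level could not be recovered in the remote pod, which is why the sketch checks for uniqueness.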
802.1P CoS translation is not supported when the following configuration options are enabled:
• Contracts are configured that include QoS.
• The outgoing interface is on a FEX.
• Multipod QoS using a DSCP policy is enabled.
• Dynamic packet prioritization is enabled.
• An EPG is configured with intra-EPG endpoint isolation enforced.
• An EPG is configured with allow-microsegmentation enabled.
supports standard bridging and routing semantics without standard location constraints (any IP address
anywhere), and removes flooding requirements for the IP control plane Address Resolution Protocol (ARP)
/ Gratuitous Address Resolution Protocol (GARP). All traffic within the fabric is encapsulated within VXLAN.
As traffic enters the fabric, ACI encapsulates and applies policy to it, forwards it as needed across the fabric
through a spine switch (maximum two-hops), and de-encapsulates it upon exiting the fabric. Within the fabric,
ACI uses Intermediate System-to-Intermediate System Protocol (IS-IS) and Council of Oracle Protocol (COOP)
for all forwarding of endpoint to endpoint communications. This enables all ACI links to be active, provides equal cost
multipath (ECMP) forwarding in the fabric, and allows fast reconvergence. For propagating routing information
between software defined networks within the fabric and routers external to the fabric, ACI uses the
Multiprotocol Border Gateway Protocol (MP-BGP).
VXLAN in ACI
VXLAN is an industry-standard protocol that extends Layer 2 segments over Layer 3 infrastructure to build
Layer 2 overlay logical networks. The ACI infrastructure Layer 2 domains reside in the overlay, with isolated
broadcast and failure bridge domains. This approach allows the data center network to grow without the risk
of creating too large a failure domain.
All traffic in the ACI fabric is normalized as VXLAN packets. At ingress, ACI encapsulates external VLAN,
VXLAN, and NVGRE packets in a VXLAN packet. The following figure shows ACI encapsulation
normalization.
Forwarding in the ACI fabric is not limited to or constrained by the encapsulation type or encapsulation
overlay network. An ACI bridge domain forwarding policy can be defined to provide standard VLAN behavior
where required.
Because every packet in the fabric carries ACI policy attributes, ACI can consistently enforce policy in a fully
distributed manner. ACI decouples application policy EPG identity from forwarding. The following illustration
shows how the ACI VXLAN header identifies application policy within the fabric.
The ACI VXLAN packet contains both Layer 2 MAC address and Layer 3 IP address source and destination
fields, which enables efficient and scalable forwarding within the fabric. The ACI VXLAN packet header
source group field identifies the application policy endpoint group (EPG) to which the packet belongs. The
VXLAN Instance ID (VNID) enables forwarding of the packet through tenant virtual routing and forwarding
(VRF) domains within the fabric. The 24-bit VNID field in the VXLAN header provides an expanded address
space for up to 16 million unique Layer 2 segments in the same network. This expanded address space gives
IT departments and cloud providers greater flexibility as they build large multitenant data centers.
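A 24-bit field yields 2^24 = 16,777,216 possible values, which is the "16 million" figure above. The following Python sketch packs and unpacks a VNID using the standard 8-byte VXLAN header layout from RFC 7348. It does not model the ACI-specific header fields, such as the source group, and the function names are hypothetical.

```python
import struct

VNID_BITS = 24
MAX_VNID = (1 << VNID_BITS) - 1  # 16,777,215


def pack_vxlan_header(vnid):
    """Build a standard 8-byte VXLAN header carrying the given VNID."""
    if not 0 <= vnid <= MAX_VNID:
        raise ValueError("VNID must fit in 24 bits")
    flags = 0x08  # I flag: the VNI field is valid
    # Word 1: flags (8 bits) + reserved (24 bits)
    # Word 2: VNI (24 bits) + reserved (8 bits)
    return struct.pack("!II", flags << 24, vnid << 8)


def unpack_vnid(header):
    """Extract the 24-bit VNID from the first 8 bytes of a VXLAN header."""
    _, word2 = struct.unpack("!II", header[:8])
    return word2 >> 8


header = pack_vxlan_header(5000)
print(len(header), unpack_vnid(header))  # 8 5000
```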
VXLAN enables ACI to deploy Layer 2 virtual networks at scale across the fabric underlay Layer 3
infrastructure. Application endpoint hosts can be flexibly placed in the data center network without concern
for the Layer 3 boundary of the underlay infrastructure, while maintaining Layer 2 adjacency in a VXLAN
overlay network.
The ACI fabric decouples the tenant endpoint address, its identifier, from the location of the endpoint that is
defined by its locator or VXLAN tunnel endpoint (VTEP) address. Forwarding within the fabric is between
VTEPs. The following figure shows decoupled identity and location in ACI.
VXLAN uses VTEP devices to map tenant end devices to VXLAN segments and to perform VXLAN
encapsulation and de-encapsulation. Each VTEP function has two interfaces:
• A switch interface on the local LAN segment to support local endpoint communication through bridging
• An IP interface to the transport IP network
The IP interface has a unique IP address that identifies the VTEP device on the transport IP network known
as the infrastructure VLAN. The VTEP device uses this IP address to encapsulate Ethernet frames and transmit
the encapsulated packets to the transport network through the IP interface. A VTEP device also discovers the
remote VTEPs for its VXLAN segments and learns remote MAC Address-to-VTEP mappings through its IP
interface.
The VTEP in ACI maps the internal tenant MAC or IP address to a location using a distributed mapping
database. After the VTEP completes a lookup, the VTEP sends the original data packet encapsulated in
VXLAN with the destination address of the VTEP on the destination leaf switch. The destination leaf switch
de-encapsulates the packet and sends it to the receiving host. With this model, ACI uses a full mesh, single
hop, loop-free topology without the need to use the spanning-tree protocol to prevent loops.
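The lookup-then-encapsulate step above can be sketched with a dictionary standing in for the distributed mapping database. All addresses and names are invented for illustration, and the handling of an unknown destination is outside the scope of this sketch.

```python
# Hypothetical mapping database: tenant MAC or IP address -> VTEP address.
mapping_db = {
    "00:50:56:aa:bb:01": "10.0.72.64",
    "192.168.10.20": "10.0.72.65",
}


def encapsulate(local_vtep, dest_endpoint, payload, db):
    """Wrap a tenant packet in an outer header addressed to the remote VTEP.

    Returns None when the destination is not in the mapping database.
    """
    remote_vtep = db.get(dest_endpoint)
    if remote_vtep is None:
        return None
    return {
        "outer_src": local_vtep,   # initiating VTEP
        "outer_dst": remote_vtep,  # VTEP on the destination leaf switch
        "inner": payload,          # original data packet, unchanged
    }


packet = encapsulate("10.0.72.66", "192.168.10.20", b"tenant-data", mapping_db)
```

The destination leaf switch would strip the outer header and deliver the inner packet to the receiving host.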
The VXLAN segments are independent of the underlying network topology; conversely, the underlying IP
network between VTEPs is independent of the VXLAN overlay. It routes the encapsulated packets based on
the outer IP address header, which has the initiating VTEP as the source IP address and the terminating VTEP
as the destination IP address.
The following figure shows how routing within the tenant is done.
For each tenant VRF in the fabric, ACI assigns a single L3 VNID. ACI transports traffic across the fabric
according to the L3 VNID. At the egress leaf switch, ACI routes the packet from the L3 VNID to the VNID
of the egress subnet.
Traffic arriving at the fabric ingress that is sent to the ACI fabric default gateway is routed into the Layer 3
VNID. This provides very efficient forwarding in the fabric for traffic routed within the tenant. For example,
with this model, traffic between 2 VMs belonging to the same tenant, on the same physical host, but on
different subnets, only needs to travel to the ingress switch interface before being routed (using the minimal
path cost) to the correct destination.
To distribute external routes within the fabric, ACI route reflectors use multiprotocol BGP (MP-BGP). The
fabric administrator provides the autonomous system (AS) number and specifies the spine switches that
become route reflectors.
Fabric and access policies govern the operation of internal fabric and external access interfaces. The system
automatically creates default fabric and access policies. Fabric administrators (who have access rights to the
entire fabric) can modify the default policies or create new policies according to their requirements. Fabric
and access policies can enable various functions or protocols. Selectors in the APIC enable fabric administrators
to choose the nodes and interfaces to which they will apply policies.
EPG communications require a contract; EPG to EPG communication is not allowed without a contract. The
APIC renders the entire policy model, including contracts and their associated EPGs, into the concrete model
in each switch. Upon ingress, every packet entering the fabric is marked with the required policy details.
Because contracts are required to select what types of traffic can pass between EPGs, contracts enforce security
policies. While contracts satisfy the security requirements handled by access control lists (ACLs) in conventional
network settings, they are a more flexible, manageable, and comprehensive security policy solution.
groups of endpoints are allowed to communicate. These managed objects are not tied to the topology of the
network because they are not applied to a specific interface. They are simply rules that the network must
enforce irrespective of where these groups of endpoints are connected. This topology independence means
that these managed objects can easily be deployed and reused throughout the data center not just as specific
demarcation points.
The ACI fabric security model uses the endpoint grouping construct directly so the idea of allowing groups
of servers to communicate with one another is simple. A single rule can allow an arbitrary number of sources
to communicate with an equally arbitrary number of destinations. This simplification dramatically improves
the scale and maintainability of these rules, which also makes them easier to use throughout the data center.
The contract is constructed in a hierarchical manner. It consists of one or more subjects, each subject contains
one or more filters, and each filter can define one or more protocols.
For example, you may define a filter called HTTP that specifies TCP port 80 and port 8080 and another filter
called HTTPS that specifies TCP port 443. You might then create a contract called webCtrct that has two sets
of subjects. openProv and openCons are the subjects that contain the HTTP filter. secureProv and secureCons
are the subjects that contain the HTTPS filter. This webCtrct contract can be used to allow both secure and
non-secure web traffic between EPGs that provide the web service and EPGs that contain endpoints that want
to consume that service.
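The contract hierarchy in this example can be sketched as plain data structures. This is only an illustrative model, not the APIC object model or its API: the class names, the `permitted_ports` helper, and the field names are assumptions made for the sketch, while the contract, subject, and filter names come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One filter entry (hypothetical shape): a protocol and a destination port."""
    protocol: str
    dst_port: int


@dataclass
class Filter:
    name: str
    entries: list


@dataclass
class Subject:
    name: str
    filters: list


@dataclass
class Contract:
    name: str
    subjects: list

    def permitted_ports(self):
        """Collect every destination port reachable through any subject."""
        return sorted(
            {e.dst_port for s in self.subjects for f in s.filters for e in f.entries}
        )


# Filters and subjects named as in the example text.
http = Filter("HTTP", [Entry("tcp", 80), Entry("tcp", 8080)])
https = Filter("HTTPS", [Entry("tcp", 443)])

web_ctrct = Contract(
    "webCtrct",
    [
        Subject("openProv", [http]),
        Subject("openCons", [http]),
        Subject("secureProv", [https]),
        Subject("secureCons", [https]),
    ],
)
```

Because the filters are shared objects, changing the HTTP filter once changes it for every subject that references it, which mirrors the reuse that the contract model is meant to provide.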
These same constructs also apply for policies that govern virtual machine hypervisors. When an EPG is placed
in a virtual machine manager (VMM) domain, the APIC downloads all of the policies that are associated with
the EPG to the leaf switches with interfaces connecting to the VMM domain. For a full explanation of VMM
domains, see the Virtual Machine Manager Domains chapter of Application Centric Infrastructure
Fundamentals. When this policy is created, the APIC pushes it (pre-populates it) to a VMM domain that
specifies which switches allow connectivity for the endpoints in the EPGs. The VMM domain defines the set
of switches and ports to which endpoints in an EPG can connect. When an endpoint comes on-line, it is
associated with the appropriate EPGs. When it sends a packet, the source EPG and destination EPG are derived
from the packet and the policy defined by the corresponding contract is checked to see if the packet is allowed.
If yes, the packet is forwarded. If no, the packet is dropped.
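The forward-or-drop decision described above can be illustrated with a minimal sketch. It assumes a simplified rule table keyed by (source EPG, destination EPG); the function name, the table layout, and the EPG names are hypothetical and do not reflect how the switch stores policy.

```python
def is_allowed(contracts, src_epg, dst_epg, protocol, dst_port):
    """Return True if a contract between the two EPGs permits this traffic.

    `contracts` maps (source EPG, destination EPG) to a set of
    (protocol, destination port) pairs. A missing key means no contract,
    so the packet is dropped.
    """
    rules = contracts.get((src_epg, dst_epg), set())
    return (protocol, dst_port) in rules


# Hypothetical policy: web clients may reach web servers on TCP 80 and 443.
contracts = {("web-clients", "web-servers"): {("tcp", 80), ("tcp", 443)}}
```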
Contracts consist of one or more subjects. Each subject contains one or more filters. Each filter contains one or more
entries. Each entry is equivalent to a line in an access control list (ACL) that is applied on the leaf switch
to which the endpoint within the endpoint group is attached.
In detail, contracts comprise the following items:
• Name—All contracts that are consumed by a tenant must have different names (including contracts
created under the common tenant or the tenant itself).
• Subjects—A group of filters for a specific application or service.
• Filters—Used to classify traffic based upon layer 2 to layer 4 attributes (such as Ethernet type, protocol
type, TCP flags and ports).
• Actions—Action to be taken on the filtered traffic. The following actions are supported:
◦Permit the traffic (regular contracts, only)
◦Mark the traffic (DSCP/CoS) (regular contracts, only)
◦Redirect the traffic (regular contracts, only, through a service graph)
◦Copy the traffic (regular contracts, only, through a service graph or SPAN)
◦Block the traffic (taboo contracts, only)
◦Log the traffic (taboo contracts and regular contracts)
• Aliases—(Optional) A changeable name for an object. Although the name of an object, once created,
cannot be changed, the Alias is a property that can be changed.
Thus, the contract allows more complex actions than just allow or deny. The contract can specify that traffic
that matches a given subject can be re-directed to a service, can be copied, or can have its QoS level modified.
With pre-population of the access policy in the concrete model, endpoints can move, new ones can come
on-line, and communication can occur even if the APIC is off-line or otherwise inaccessible. The APIC is
removed from being a single point of failure for the network. Upon packet ingress to the ACI fabric, security
policies are enforced by the concrete model running in the switch.
1 A unicast (/32) hit provides the EPG of the destination endpoint and either the local interface or the remote
leaf switch VTEP IP address where the destination endpoint is present.
2 A unicast hit of a subnet prefix (not /32) provides the EPG of the destination subnet prefix and either the
local interface or the remote leaf switch VTEP IP address where the destination subnet prefix is present.
3 A multicast hit provides the local interfaces of local receivers and the outer destination IP address to use
in the VXLAN encapsulation across the fabric and the EPG of the multicast group.
Note Multicast and external router subnets always result in a hit on the ingress leaf switch. Security policy
enforcement occurs as soon as the destination EPG is known by the ingress leaf switch.
A miss result in the forwarding table causes the packet to be sent to the forwarding proxy in the spine switch.
The forwarding proxy then performs a forwarding table lookup. If it is a miss, the packet is dropped. If it is
a hit, the packet is sent to the egress leaf switch that contains the destination endpoint. Because the egress leaf
switch knows the EPG of the destination, it performs the security policy enforcement. The egress leaf switch
must also know the EPG of the packet source. The fabric header enables this process because it carries the
EPG from the ingress leaf switch to the egress leaf switch. The spine switch preserves the original EPG in
the packet when it performs the forwarding proxy function.
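The hit and miss handling described above can be summarized in a short sketch. It assumes that each forwarding table is just a set of known destinations; the function name and return strings are illustrative only.

```python
def enforcement_point(dst, ingress_table, spine_proxy_table):
    """Return where security policy is enforced for a packet sent to `dst`.

    ingress_table:     destinations known to the ingress leaf switch
    spine_proxy_table: destinations known to the spine forwarding proxy
    """
    if dst in ingress_table:
        # Hit on the ingress leaf: the destination EPG is known there.
        return "ingress leaf"
    if dst in spine_proxy_table:
        # Miss on ingress, hit on the proxy: the egress leaf enforces policy,
        # using the source EPG carried in the fabric header.
        return "egress leaf"
    # Miss on the proxy as well: the packet is dropped.
    return "dropped"
```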
On the egress leaf switch, the source IP address, source VTEP, and source EPG information are stored in the
local forwarding table through learning. Because most flows are bidirectional, a return packet populates the
forwarding table on both sides of the flow, which enables the traffic to be ingress filtered in both directions.
The leaf switch always views the group that corresponds to the multicast stream as the destination EPG and
never the source EPG. In the access control matrix shown previously, the row contents are invalid where the
multicast EPG is the source. The traffic is sent to the multicast stream from either the source of the multicast
stream or the destination that wants to join the multicast stream. Because the multicast stream must be in the
forwarding table and there is no hierarchical addressing within the stream, multicast traffic is access controlled
at the ingress fabric edge. As a result, IPv4 multicast is always enforced as ingress filtering.
The receiver of the multicast stream must first join the multicast stream before it receives traffic. When sending
the IGMP Join request, the multicast receiver is actually the source of the IGMP packet. The destination is
defined as the multicast group and the destination EPG is retrieved from the forwarding table. At the ingress
point where the router receives the IGMP Join request, access control is applied. If the Join request is denied,
the receiver does not receive any traffic from that particular multicast stream.
The policy enforcement for multicast EPGs occurs on the ingress by the leaf switch according to contract
rules as described earlier. Also, the multicast group to EPG binding is pushed by the APIC to all leaf switches
that contain the particular tenant (VRF).
Taboos
While the normal processes for ensuring security still apply, the ACI policy model aids in assuring the integrity
of whatever security practices are employed. In the ACI policy model approach, all communications must
conform to these conditions:
• Communication is allowed only based on contracts, which are managed objects in the model. If there
is no contract, inter-EPG communication is disabled by default.
• No direct access to the hardware; all interaction is managed through the policy model.
Taboos are special contract managed objects in the model that the network administrator can use to deny
specific classes of traffic. Taboos can be used to drop traffic matching a pattern (any EPG, a specific EPG,
matching a filter, and so forth). Taboo rules are applied in the hardware before the rules of regular contracts
are applied.
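The evaluation order can be illustrated as follows. This is a sketch under the assumption that a flow is a simple tuple and that rules are exact matches; real taboo and contract rules match on filters, and the names used here are hypothetical.

```python
def evaluate(flow, taboo_rules, contract_rules):
    """Decide the fate of `flow`, checking taboo rules before regular contracts."""
    if flow in taboo_rules:
        return "deny"    # taboo rules are applied first
    if flow in contract_rules:
        return "permit"  # a regular contract allows the traffic
    return "deny"        # no contract: inter-EPG communication is disabled


# Hypothetical flows: (source EPG, destination EPG, protocol, port).
contract_rules = {("app", "db", "tcp", 3306), ("app", "db", "tcp", 23)}
taboo_rules = {("app", "db", "tcp", 23)}
```

In this example the Telnet flow is denied even though a regular contract would permit it, because the taboo match is evaluated first.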
The following figure shows a topology with four FTAGs. Every leaf switch in the fabric is connected to each
FTAG either directly or through transit nodes. One FTAG is rooted on each of the spine nodes.
If a leaf switch has direct connectivity to the spine, it uses the direct path to connect to the FTAG tree. If there
is no direct link, the leaf switch uses transit nodes that are connected to the FTAG tree, as shown in the figure
above. Although the figure shows each spine as the root of one FTAG tree, multiple FTAG tree roots could
be on one spine.
As part of the ACI Fabric bring-up discovery process, the FTAG roots are placed on the spine switches. The
APIC configures each of the spine switches with the FTAGs that the spine anchors. The identity of the roots
and the number of FTAGs is derived from the configuration. The APIC specifies the number of FTAG trees
to be used and the roots for each of those trees. FTAG trees are recalculated every time there is a topology
change in the fabric.
Root placement is configuration driven; FTAG trees are not re-rooted dynamically on run-time events such as a spine
switch failure. Typically, FTAG configurations are static. An FTAG can be re-anchored from one spine to
another when a spine switch is added or removed because the administrator might decide to redistribute the
FTAG across the remaining or expanded set of spine switches.
second allowed on the given port, is compared with the traffic storm control level that you configured. When
the ingress traffic reaches the traffic storm control level that is configured on the port, traffic storm control
drops the traffic until the interval ends. An administrator can configure a monitoring policy to raise a fault
when a storm control threshold is exceeded.
• For port channels and virtual port channels, the storm control values (packets per second or percentage)
apply to all individual members of the port channel. Do not configure storm control on interfaces that
are members of a port channel.
Note On switch hardware starting with the APIC 1.3(x) and switch 11.3(x) release, for port
channel configurations, the traffic suppression on the aggregated port may be up to two
times the configured value. The new hardware ports are internally subdivided into these
two groups: slice-0 and slice-1. To check the slicing map, use the vsh_lc command
show platform internal hal l2 port gpd and look for slice 0 or slice 1 under
the Sl column. If port-channel members fall on both slice-0 and slice-1, allowed storm
control traffic may become twice the configured value because the formula is calculated
based on each slice.
• When configuring by percentage of available bandwidth, a value of 100 means no traffic storm control
and a value of 0.01 suppresses all traffic.
• Due to hardware limitations and the method by which packets of different sizes are counted, the level
percentage is an approximation. Depending on the sizes of the frames that make up the incoming traffic,
the actual enforced level might differ from the configured level by several percentage points.
Packets-per-second (PPS) values are converted to percentage based on 256 bytes.
• Maximum burst is the maximum accumulation of rate that is allowed when no traffic passes. When
traffic starts, all the traffic up to the accumulated rate is allowed in the first interval. In subsequent
intervals, traffic is allowed only up to the configured rate. The maximum supported is 65535 KB. If the
configured rate exceeds this value, it is capped at this value for both PPS and percentage.
• The maximum burst that can be accumulated is 512 MB.
• On an egress leaf switch in optimized multicast flooding (OMF) mode, traffic storm control will not be
applied.
• On an egress leaf switch in non-OMF mode, traffic storm control will be applied.
• On a leaf switch for FEX, traffic storm control is not available on host-facing interfaces.
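Two of the guidelines above involve simple arithmetic that a sketch can make concrete. The text states only that PPS values are converted to a percentage based on 256 bytes and that port-channel suppression may reach twice the configured value when members span both slices. The exact formulas below, including the decision to ignore framing overhead, are assumptions, and the function names are hypothetical.

```python
ASSUMED_PACKET_BYTES = 256  # packet size the PPS conversion is said to be based on


def pps_to_percent(pps, link_bps):
    """Approximate percentage of link bandwidth used by `pps` 256-byte packets.

    Assumption: percentage = (pps * 256 bytes * 8 bits) / link speed * 100.
    """
    return pps * ASSUMED_PACKET_BYTES * 8 / link_bps * 100


def worst_case_allowed(configured_rate, member_slices):
    """Upper bound on allowed storm traffic for a port channel.

    `member_slices` lists the hardware slice (0 or 1) of each member port.
    If members fall on both slices, the bound is twice the configured rate.
    """
    return configured_rate * len(set(member_slices))
```

For example, under these assumptions 100,000 PPS on a 10-Gbps link corresponds to roughly 2 percent of the available bandwidth.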
End Host (PC and VPC) traffic
For Layer 2 frames:
• Source MAC address
• Destination MAC address
For IP traffic:
• Source IP address
• Destination IP address
• Source MAC address
• Destination MAC address
• Layer 4 Source Port
• Layer 4 Destination Port
• Protocol
• System generated random number
Endpoint Retention
Retaining cached endpoint MAC and IP addresses in the switch improves performance. The switch learns
about endpoints as they become active. Local endpoints are on the local switch. Remote endpoints are on
other switches but are cached locally. The leaf switches store location and policy information about endpoints
that are attached directly to them (or through a directly attached Layer 2 switch or Fabric Extender), local
endpoints, and endpoints that are attached to other leaf switches on the fabric (remote endpoints in the
hardware). The switch uses a 32-Kb entry cache for local endpoints and a 64-Kb entry cache for remote
endpoints.
Software that runs on the leaf switch actively manages these tables. For the locally attached endpoints, the
software ages out entries after a retention timer for each entry has expired. Endpoint entries are pruned from
the switch cache as the endpoint activity ceases, the endpoint location moves to another switch, or the life
cycle state changes to offline. The default value for the local retention timer is 15 minutes. Before removing
an inactive entry, the leaf switch sends three ARP requests to the endpoint to see if it really has gone away.
If the switch receives no ARP response, the entry is pruned. For remotely attached endpoints, the switch ages
out the entries after three minutes of inactivity. The remote endpoint is immediately reentered in the table if
it becomes active again.
Note Version 1.3(1g) adds silent host tracking that will be triggered for any virtual and local hosts.
There is no performance penalty for not having the remote endpoint in the table other than policies are
enforced at the remote leaf switch until the endpoint is cached again.
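The local endpoint aging behavior described above (a 15-minute retention timer followed by three ARP probes) can be sketched as a small decision function. The function and parameter names are hypothetical, and the sketch models only the decision, not the timers or the ARP exchange.

```python
LOCAL_AGE_S = 15 * 60  # default local retention timer from the text
ARP_PROBES = 3         # ARP requests sent before an inactive entry is removed


def prune_local(idle_s, probe_answers, local_age_s=LOCAL_AGE_S):
    """Return True if a local endpoint entry should be pruned.

    idle_s:        seconds since the endpoint was last active
    probe_answers: one boolean per ARP probe, True if the endpoint replied
    """
    if idle_s < local_age_s:
        return False  # retention timer has not expired yet
    # Timer expired: keep the entry only if any of the probes was answered.
    return not any(probe_answers[:ARP_PROBES])
```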
When subnets of a bridge domain are configured to be enforced, the endpoint retention policy operates in the
following way:
• New endpoints with IP addresses not contained in the subnets of the bridge domain are not learned.
• Already learned endpoints age out of the endpoint retention cache if the device does not respond for
tracking.
This enforcement process operates in the same way regardless of whether the subnet is defined under a bridge
domain or if the subnet is defined under an EPG.
The endpoint retention timer policy can be modified. Configuring a static endpoint MAC and IP address
enables permanently storing it in the switch cache by setting its retention timer to zero. Setting the retention
timer to zero for an entry means that it will not be removed automatically. Care must be taken when doing
so. If the endpoint moves or its policy changes, the entry must be refreshed manually with the updated
information through the APIC. When the retention timer is nonzero, this information is checked and updated
instantly on each packet without APIC intervention.
A bridge domain (BD) is a flood domain that can span leaf switches, for example leaf switch 1, leaf switch
2, leaf switch 3, leaf switch 4. A BD contains one or more EPGs that can be thought of as VLANs. EPGs can
only be deployed in the leaf switches of the BD, for example leaf switch 1 and leaf switch 2. When packets
arrive at the ingress leaf switch, the ingress leaf switch does not know which leaf switch will be the egress
leaf switch. So, packets are flooded to all leaf switches in the BD (leaf switch 1, leaf switch 2, leaf switch 3,
leaf switch 4 in this example). This flooding occurs when endpoints within the EPG need to be learned. The
leaf switch endpoint retention tables update accordingly. Since the EPG in this example is not related to leaf
switch 3 or leaf switch 4 (the VLAN is not relevant on these switches), those leaf switches drop those packets.
Note Hosts that are not in the bridge domain do not receive these packets.
The endpoint retention policy determines how pruning is done. Use the default policy algorithm for most
operations. Changing the endpoint retention policy can affect system performance. In the case of a switch that
communicates with thousands of endpoints, lowering the aging interval increases the number of cache windows
available to support large numbers of active endpoints. When the endpoint count exceeds 10,000, we recommend
distributing endpoints across multiple switches.
Observe the following guidelines regarding changing the default endpoint retention policy:
• Remote Bounce Interval = (Remote Age * 2) + 30 seconds
◦Recommended default values:
◦Local Age = 900 seconds
◦Remote Age = 300 seconds
◦Bounce Age = 630 seconds
• Upgrade considerations: When upgrading to any ACI version older than release 1.0(1k), ensure that the
default values of the endpoint retention policy (epRetPol) under tenant common are as follows: Bounce
Age = 660 seconds.
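The Remote Bounce Interval formula above can be checked with a one-line helper. The function name is hypothetical; the formula and the 300-second Remote Age come from the guidelines.

```python
def remote_bounce_interval(remote_age_s):
    """Remote Bounce Interval = (Remote Age * 2) + 30 seconds."""
    return remote_age_s * 2 + 30


# With the recommended Remote Age of 300 seconds this gives 630 seconds,
# which matches the recommended Bounce Age.
recommended_bounce_age = remote_bounce_interval(300)
```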
The third case occurs if a server has multiple IP addresses on the same MAC address, such as primary and
secondary IP addresses. It could also occur if the ACI fabric learns a server's MAC and IP addresses on the
fabric, but the server's IP address is subsequently changed. When this occurs, ACI stores and links the MAC
address with both the old and new IP addresses. The old IP address is not removed until the ACI fabric flushes
the endpoint with the base MAC address.
There are two primary types of local endpoint moves in ACI:
• Where the MAC address moves to a different interface
• Where the IP address moves to a different MAC address
When the MAC address moves to a different interface, all IP addresses linked to the MAC address in the
bridge domain move with it. The ACI fabric also tracks moves in which only the IP address moves (and receives
a new MAC address). This might occur, for example, if a virtual server's MAC address is changed and it is
moved to a new ESXi server (port).
If an IP address is seen to exist across multiple MAC addresses within a VRF, this indicates that an IP flap
has occurred (which can be detrimental to fabric forwarding decisions). This is similar to MAC flapping on
two separate interfaces in a legacy network or MAC flaps on a bridge domain.
One scenario that can produce IP flaps is when a server network interface card (NIC) pair is set to
active/active, but the two are not connected in a single logical link (such as a Port-Channel or Virtual
Port-Channel). This type of setup can cause a single IP address, for example a virtual machine’s IP address,
to constantly move between two MAC addresses in the fabric.
To address this type of behavior, we recommend configuring the NIC pair as the two legs of a VPC to achieve
an Active/Active setup. If the server hardware does not support the Active/Active configuration (for example
a blade chassis), then an active/standby type of NIC pair configuration will also prevent the IP flapping from
occurring.
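The IP flap condition described above, one IP address seen on more than one MAC address within a VRF, can be detected from a list of learn events. This is a minimal sketch, assuming events are available as (VRF, IP, MAC) tuples; the function name and event format are hypothetical and are not an ACI interface.

```python
from collections import defaultdict


def find_ip_flaps(learn_events):
    """Return {(vrf, ip): [macs]} for every IP seen on more than one MAC.

    learn_events: iterable of (vrf, ip, mac) tuples, one per learn event.
    """
    seen = defaultdict(set)
    for vrf, ip, mac in learn_events:
        seen[(vrf, ip)].add(mac)
    return {key: sorted(macs) for key, macs in seen.items() if len(macs) > 1}


# Hypothetical events: an active/active NIC pair without a port channel
# makes 10.1.1.10 alternate between two MAC addresses.
events = [
    ("vrf-a", "10.1.1.10", "00:50:56:aa:00:01"),
    ("vrf-a", "10.1.1.10", "00:50:56:aa:00:02"),
    ("vrf-a", "10.1.1.11", "00:50:56:aa:00:03"),
]
```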
Proxy ARP within the Cisco ACI fabric is different from the traditional proxy ARP. As an example of the
communication process, when proxy ARP is enabled on an EPG, if an endpoint A sends an ARP request for
endpoint B and if endpoint B is learned within the fabric, then endpoint A will receive a proxy ARP response
from the bridge domain (BD) MAC. If endpoint A sends an ARP request for endpoint B, and if endpoint B
is not learned within the ACI fabric already, then the fabric will send a proxy ARP request within the BD.
Endpoint B will respond to this proxy ARP request back to the fabric. At this point, the fabric does not send
a proxy ARP response to endpoint A, but endpoint B is learned within the fabric. If endpoint A sends another
ARP request to endpoint B, then the fabric will send a proxy ARP response from the BD MAC.
The following example describes the proxy ARP resolution steps for communication between clients VM1
and VM2:
Device State
VM1 IP = *; MAC = *
VM2 IP = *; MAC = *
Figure 48: VM1 sends an ARP Request with a Broadcast MAC address to VM2
Device State
VM1 IP = VM2 IP; MAC = ?
VM2 IP = *; MAC = *
3 The ACI fabric floods the proxy ARP request within the bridge domain (BD).
Figure 49: ACI Fabric Floods the Proxy ARP Request within the BD
Device State
VM1 IP = VM2 IP; MAC = ?
Device State
VM1 IP = VM2 IP; MAC = ?
5 VM2 is learned.
Device State
VM1 IP = VM2 IP; MAC = ?
Figure 52: VM1 Sends an ARP Request with a Broadcast MAC Address to VM2
Device State
VM1 IP = VM2 IP; MAC = ?
Device State
VM1 IP = VM2 IP; MAC = BD MAC
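The proxy ARP behavior walked through above can be modeled as a tiny state machine. It illustrates only the described sequence (no reply on the first request for an unlearned endpoint, then a reply from the BD MAC once the endpoint is learned). The class, its methods, and the example addresses are hypothetical.

```python
class ProxyArpFabric:
    """Toy model of a fabric with proxy ARP enabled on an EPG."""

    def __init__(self, bd_mac):
        self.bd_mac = bd_mac
        self.learned = set()  # IP addresses of endpoints the fabric has learned

    def arp_request(self, target_ip):
        """Handle an ARP request from an endpoint for `target_ip`.

        Returns the BD MAC if the target is already learned. Otherwise
        returns None: the fabric floods a proxy ARP request within the BD
        and sends no response to the requester yet.
        """
        if target_ip in self.learned:
            return self.bd_mac
        return None

    def arp_reply_from(self, ip):
        """The target answers the flooded proxy ARP request and is learned."""
        self.learned.add(ip)


fabric = ProxyArpFabric(bd_mac="00:22:bd:f8:19:ff")  # example BD MAC value
first = fabric.arp_request("192.0.2.20")    # VM2 not yet learned
fabric.arp_reply_from("192.0.2.20")         # VM2 responds to the fabric
second = fabric.arp_request("192.0.2.20")   # VM1 asks again
```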
Loop Detection
The ACI fabric provides global default loop detection policies that can detect loops in Layer 2 network
segments which are connected to ACI access ports. These global policies are disabled by default but the port
level policies are enabled by default. Enabling the global policies means they are enabled on all access ports,
virtual ports, and virtual port channels unless they are disabled at the individual port level.
The ACI fabric does not participate in the Spanning Tree Protocol (STP). Instead, it implements the mis-cabling
protocol (MCP) to detect loops. MCP works in a complementary manner with STP that is running on external
Layer 2 networks, and handles bridge protocol data unit (BPDU) packets that access ports receive.
Note Interfaces from an external switch running spanning tree and connected to ACI fabric with a VPC can go
to loop_inc status. Flapping the port-channel from the external switch resolves the problem. Enabling
BPDU filter or disabling loopguard on the external switch will prevent the issue.
A fabric administrator provides a key that MCP uses to identify which MCP packets are initiated by the ACI
fabric. The administrator can choose how the MCP policies identify loops and how to act upon the loops:
syslog only, or disable the port.
While endpoint moves such as VM moves are normal, they can be symptomatic of loops if the frequency is
high, and the interval between moves is brief. A separate global default endpoint move loop detection policy
is available but is disabled by default. An administrator can choose how to act upon move detection loops.
Also, an error disabled recovery policy can enable ports that loop detection and BPDU policies disabled after
an interval that the administrator can configure.
By default, MCP runs in native VLAN mode, where the MCP BPDUs that it sends are not VLAN tagged. MCP
can detect loops due to mis-cabling if the packets sent in the native VLAN are received by the fabric, but a
loop in a non-native EPG VLAN is not detected. Starting with release 2.1(1), the APIC
supports sending MCP BPDUs in all VLANs of the configured EPGs, so any loops in those VLANs
are detected. In this MCP configuration mode, MCP PDUs are sent in all EPG VLANs that a physical port
belongs to by adding an 802.1Q header with each EPG VLAN ID to the transmitted PDUs.
DHCP Relay
Although ACI fabric-wide flooding is disabled by default, flooding within a bridge domain is enabled by
default. Because flooding within a bridge domain is enabled by default, clients can connect to DHCP servers
within the same EPG. However, when the DHCP server is in a different EPG or Virtual Routing and Forwarding
(VRF) instance than the clients, DHCP Relay is required. Also, when Layer 2 flooding is disabled, DHCP
Relay is required.
Note When the ACI fabric acts as a DHCP relay, it inserts the DHCP Option 82 (the DHCP Relay Agent
Information Option) in DHCP requests that it proxies on behalf of clients. If a response (DHCP offer)
comes back from a DHCP server without Option 82, it is silently dropped by the fabric. When ACI acts
as a DHCP relay, DHCP servers providing IP addresses to compute nodes attached to the ACI fabric must
support Option 82. Windows 2003 and 2008 do not support option 82 but Windows 2012 does.
The figure below shows the managed objects in the management information tree (MIT) that can contain
DHCP relays: user tenants, the common tenant, the infra tenant, the mgmt tenant, and fabric access.
The DHCP Relay profile contains one or more providers. An EPG contains one or more DHCP servers, and
the relation between the EPG and DHCP Relay specifies the DHCP server IP address. The consumer bridge
domain contains a DHCP label that associates the provider DHCP server with the bridge domain. Label
matching enables the bridge domain to consume the DHCP Relay.
Note The bridge domain DHCP label must match the DHCP Relay name.
The DHCP label object also specifies the owner. The owner can be a tenant or the access infrastructure. If the
owner is a tenant, the ACI fabric first looks within the tenant for a matching DHCP Relay. If there is no match
within a user tenant, the ACI fabric then looks in the common tenant.
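The lookup order for a tenant-owned DHCP label can be sketched as follows. The dictionary layout, function name, and example names are assumptions made for illustration; only the search order (owner tenant first, then the common tenant) comes from the text.

```python
def resolve_dhcp_relay(label_name, owner_tenant, relays_by_tenant):
    """Find the DHCP Relay that matches a tenant-owned bridge domain label.

    relays_by_tenant: {tenant name: {relay name: relay definition}}
    Returns (tenant, relay) for the first match, or None if nothing matches.
    """
    for tenant in (owner_tenant, "common"):
        relay = relays_by_tenant.get(tenant, {}).get(label_name)
        if relay is not None:
            return tenant, relay
    return None


# Hypothetical configuration: one relay in a user tenant, one in common.
relays = {
    "tenant-a": {"relay-a": "192.0.2.10"},
    "common": {"shared-relay": "192.0.2.53"},
}
```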
DHCP Relay operates in the Visible mode as follows: the provider's IP address and subnet are leaked into
the consumer's VRF. When the DHCP Relay is visible, it is exclusive to the consumer's VRF.
While the tenant and access DHCP Relays are configured in a similar way, the following use cases vary
accordingly:
• Common tenant DHCP Relays can be used by any tenant.
• Infra tenant DHCP Relays are exposed selectively by the ACI fabric service provider to other tenants.
• Fabric Access (infraInfra) DHCP Relays can be used by any tenant and allow more granular
configuration of the DHCP servers. In this case, it is possible to provision separate DHCP servers within
the same bridge domain for each leaf switch in the node profile.
DNS
The ACI fabric DNS service is contained in the fabric managed object. The fabric global default DNS profile
can be accessed throughout the fabric. The figure below shows the logical relationships of the DNS-managed
objects within the fabric.
A VRF (context) must contain a dnsLBL object in order to use the global default DNS service. Label matching
enables tenant VRFs to consume the global DNS provider. Because the name of the global DNS profile is
“default,” the VRF label name is "default" (dnsLBL name = default).
The management profile includes the in-band EPG MO that provides access to management functions via the
in-band contract (vzBrCP). The vzBrCP enables fvAEPg, l2extInstP, and l3extInstP EPGs to consume the
in-band EPG. This exposes the fabric management to locally connected devices, as well as devices connected
over Layer 2 bridged external networks, and Layer 3 routed external networks. If the consumer and provider
EPGs are in different tenants, they can use a bridge domain and VRF from the common tenant. Authentication,
access, and audit logging apply to these connections; any user attempting to access management functions
through the in-band EPG must have the appropriate access privileges.
The management profile includes the out-of-band EPG MO that provides access to management functions
via the out-of-band contract (vzOOBBrCP). The vzOOBBrCP enables the external management instance profile
(mgmtExtInstP) EPG to consume the out-of-band EPG. This exposes the fabric node supervisor ports to
locally or remotely connected devices, according to the preference of the service provider. While the bandwidth
of the supervisor ports will be lower than the in-band ports, the supervisor ports can provide direct access to
the fabric nodes when access through the in-band ports is unavailable. Authentication, access, and audit
logging apply to these connections; any user attempting to access management functions through the out-of-band
EPG must have the appropriate access privileges. When an administrator configures an external management
instance profile, it specifies a subnet range for devices that are allowed out-of-band access. Any device not
in this range will not have out-of-band access.
The figure below shows how out-of-band management access can be consolidated through a dedicated switch.
While some service providers choose to restrict out-of-band connectivity to local connections, others can
choose to enable routed or bridged connections from external networks. Also, a service provider can choose
to configure a set of policies that include both in-band and out-of-band management access for local devices
only, or both local and remote devices.
Note Starting with APIC release 1.2(2), when a contract is provided on an out-of-band node management EPG,
the default APIC out-of-band contract source address is the local subnet that is configured on the out-of-band
node management address. Previously, any address was allowed to be the default APIC out-of-band
contract source address.
IPv6 Support
The ACI fabric supports the following IPv6 features for in-band and out-of-band interfaces, tenant addressing,
contracts, shared services, routing, Layer 4 - Layer 7 services, and troubleshooting:
• IPv6 address management, pervasive software virtual interface (SVI) bridge domain subnets, outside
network external interface addresses, and routes for shared services such as load balancers or intrusion
detection.
• Neighbor Discovery using ICMPv6 messages known as router advertisements (RA) and router solicitations
(RS), and Duplicate Address Detection (DAD).
• Stateless Address Autoconfiguration (SLAAC) and DHCPv6.
• Bridge domain forwarding.
• Troubleshooting (see the atomic counters, SPAN, iping6, and traceroute topics in the Troubleshooting
Chapter).
• IPv4 only, IPv6 only, or dual stack configuration of in-band and out-of-band interfaces.
Limitations of the current ACI fabric IPv6 implementation include the following:
• Multicast Listener Discovery (MLD) snooping is not supported.
• For IPv6 management, only static addresses are permitted; dynamic IPv6 pools are not supported for
IPv6 management.
• IPv6 tunnel interfaces (Intra-Site Automatic Tunnel Addressing Protocol, 6to4 and so forth) are not
supported within the fabric; IPv6 tunnel traffic run over the fabric is transparent to the fabric.
ACI fabric interfaces can be configured with link local, global unicast, and multicast IPv6 addresses.
Note While many of the examples provided in this manual use IPv4 addresses, IPv6 addresses could also be
used.
A global unicast address can be routed across the public Internet; it is globally unique within a routing domain.
A Link Local Address (LLA) has link-local scope and is unique on the link (subnet). The LLA cannot be
routed across subnets. These are used by control protocols such as neighbor discovery or OSPF. Multicast
addresses are used by IPv6 control protocols such as Neighbor Discovery to deliver packets to more than one
endpoint. These are not configurable; they are automatically generated by the protocol components.
Note The EUI-64 format can only be used for pervasive bridge domain and Layer 3 interface addresses. It
cannot be used for other IP fields in the fabric such as an external server address or for DHCP relay.
Bridge domain subnets and Layer 3 external interface IP addresses can be IPv6 global addresses with a mask
ranging from /1 to /127. A bridge domain can contain multiple IPv4 and IPv6 subnets. To support IPv4 and
IPv6 addresses on the same L3 external interface, the administrator creates multiple interface profiles. When
an EPG or external EpP gets deployed on the switch, the presence of a manually configured link-local address
for the equivalent bridge domain/L3 Interface or an IPv6 address for the subnet/address field results in the
creation of an ipv6If interface in the switch.
Link-Local Addresses
One Link-Local Address (LLA) can be assigned to an interface. The LLA can be autogenerated or configured
by an administrator. By default, an ACI LLA is autogenerated by the switch in EUI-64 format. An administrator
must configure at least one global address on the interface for an autogenerated LLA to be generated on the
switch. The autogenerated address is saved in the operLlAddr field of the ipv6If MO. For pervasive SVIs
the MAC address used is the same as the configured interface MAC address. For other kinds of interfaces the
switch MAC address is used. An administrator has the option to manually specify a complete 128-bit IPv6
link-local address on an interface in compressed or uncompressed format.
Note The switch hardware tables are limited to one LLA per Virtual Routing and Forwarding (VRF) instance.
Each pervasive bridge domain can have a single IPv6 LLA. This LLA can be set by an administrator, or can
be automatically configured by the switch when one isn't provided. When automatically configured, the switch
forms the LLA in the modified EUI-64 format where the MAC address is encoded in the IPv6 address to form
a unique address. A pervasive bridge domain uses one LLA on all the leaf nodes.
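As a rough illustration of the modified EUI-64 construction described above, the following Python sketch derives a link-local address from a 48-bit MAC address: it inverts the universal/local bit of the first octet, inserts FFFE in the middle, and prepends the FE80::/64 prefix. The function name and the sample MAC address are hypothetical; the switch's own implementation is not shown in this guide.

```python
import ipaddress


def eui64_link_local(mac: str) -> ipaddress.IPv6Address:
    """Build a link-local address from a 48-bit MAC using modified EUI-64."""
    octets = bytearray(int(part, 16) for part in mac.replace("-", ":").split(":"))
    if len(octets) != 6:
        raise ValueError(f"expected a 48-bit MAC address, got {mac!r}")
    octets[0] ^= 0x02  # invert the universal/local bit
    # Interface ID: first three MAC octets, then FF FE, then last three octets.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    prefix = bytes.fromhex("fe80") + bytes(6)  # FE80::/64 as eight bytes
    return ipaddress.IPv6Address(prefix + interface_id)


# Hypothetical MAC address, used only for illustration.
print(eui64_link_local("00:22:bd:f8:19:ff"))  # fe80::222:bdff:fef8:19ff
```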
Follow these guidelines for setting LLAs:
• For external SVI and VPC members, the LLA is unique for every leaf node.
• LLAs can be changed to manual (non-zero manually specified link-local addresses) or auto (by manually
setting the specified link-local address to zero) at any time in the lifecycle of the interface.
• LLAs specified by an administrator must conform to the IPv6 link-local format (FE80::/10).
• The IPv6 interface MO (ipv6If) is created on the switch upon the creation of the first global address
on the interface, or when an administrator manually configures an LLA, whichever happens first.
• An administrator-specified LLA is represented in the llAddr property in the bridge domain and Layer
3 interface objects in the logical model.
• The LLA used by the switch (either from llAddr or autogenerated when llAddr is zero) is represented
in the operLlAddr property in the corresponding ipv6If object.
• Operational LLA-related errors, such as duplicate LLAs, are detected by the switch during the Duplicate
Address Detection (DAD) process and are recorded in the operStQual field in the ipv6If object or raise
faults as appropriate.
• Apart from the llAddr fields, an LLA (FE80::/10) cannot be a valid address in any other IP address field
in the APIC (such as external server addresses or bridge domain subnets) because these addresses cannot be
routed.
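The address rules in these guidelines can be sketched as a pair of pre-checks that an administrator might run before submitting a configuration. This is a minimal sketch using Python's standard ipaddress module; the function names are hypothetical and the checks only mirror the rules stated in this section (zero means auto, manual LLAs must fall in FE80::/10, LLAs are not valid in other address fields, and masks range from /1 to /127). It is not the validation logic of the APIC itself.

```python
import ipaddress

LINK_LOCAL = ipaddress.ip_network("fe80::/10")


def check_ll_addr(value: str) -> str:
    """Classify a candidate llAddr value as 'auto' or 'manual', or raise ValueError."""
    addr = ipaddress.IPv6Address(value)
    if int(addr) == 0:
        return "auto"  # zero asks the switch to autogenerate the LLA
    if addr not in LINK_LOCAL:
        raise ValueError(f"{addr} is outside {LINK_LOCAL}")
    return "manual"


def check_routable_field(value: str) -> ipaddress.IPv6Interface:
    """Check an address/mask for a field other than llAddr, such as a bridge domain subnet."""
    iface = ipaddress.IPv6Interface(value)
    if iface.ip in LINK_LOCAL:
        raise ValueError(f"{iface.ip} is link-local and cannot be routed")
    if not 1 <= iface.network.prefixlen <= 127:
        raise ValueError(f"mask /{iface.network.prefixlen} is outside /1 to /127")
    return iface


print(check_ll_addr("::"))                     # auto
print(check_ll_addr("fe80::1"))                # manual
print(check_routable_field("2001:db8::1/64"))  # 2001:db8::1/64
```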
Static Routes
ACI IPv6 static routes are similar to those supported for IPv4, except for the address and prefix format
differences in the configurations. The following types of static routes are typically handled by the IPv6 static
route module:
• Local routes: Any /128 address configured on an interface leads to a local route that points to the CPU.
• Direct routes: For any configured address on a pervasive BD, the policy element pushes a subnet route
pointing to an IPv4 proxy tunnel destination on the spine. For any configured address on a non-pervasive
Layer 3 external interface, the IPv6 manager module automatically pushes a subnet route pointing to
the CPU.
• Static routes pushed from PE: Used for external connectivity. The next hop IPv6 address for such routes
can be on a directly connected subnet on the external router or a recursive next hop that can be resolved
to a real next hop on a directly connected subnet. Note that the interface model does not allow an interface
as a next hop (though it is supported in the switch). Used to enable shared services across tenants, the
next hop for shared-services static routes is located in the shared services Virtual Routing and Forwarding
(VRF) instance, which is different from the tenant VRF, where the route is installed on the ingress leaf
switches.
Neighbor Discovery
The IPv6 Neighbor Discovery (ND) protocol is responsible for the address auto configuration of nodes,
discovery of other nodes on the link, determining the link-layer addresses of other nodes, duplicate address
detection, finding available routers and DNS servers, address prefix discovery, and maintaining reachability
information about the paths to other active neighbor nodes.
ND-specific Neighbor Solicitation or Neighbor Advertisement (NS or NA) and Router Solicitation or Router
Advertisement (RS or RA) packet types are supported on all ACI fabric Layer 3 interfaces, including physical,
Layer 3 sub interface, and SVI (external and pervasive). Up to APIC release 3.1(1x), RS/RA packets are used
for auto configuration for all Layer 3 interfaces but are only configurable for pervasive SVIs. Starting with
APIC release 3.1(2x), RS/RA packets are used for auto configuration and are configurable on Layer 3 interfaces
including routed interface, Layer 3 sub interface, and SVI (external and pervasive).
ACI bridge domain ND always operates in flood mode; unicast mode is not supported.
The ACI fabric ND support includes the following:
• Interface policies (nd:IfPol) control ND timers and behavior for NS/NA messages.
• ND prefix policies (nd:PfxPol) control RA messages.
• Configuration of IPv6 subnets for ND (fv:Subnet).
• ND interface policies for external networks.
• Configurable ND subnets for external networks, and arbitrary subnet configurations for pervasive bridge
domains are not supported.
• Per Interface
◦Control of ND packets (NS/NA)
◦Neighbor Solicitation Interval
◦Neighbor Solicitation Retry count
◦Control of RA packets
◦Suppress RA
◦Suppress RA MTU
◦RA Interval, RA Interval minimum, Retransmit time
Any configured address is usable for sending and receiving IPv6 traffic only if its DAD state is VALID.
IPv6 addresses are supported for DHCP relay. DHCPv6 relay applies across Virtual Routing and Forwarding
(VRF) instances. DHCP relay over VLAN and VXLAN is also supported. DHCPv4 works in conjunction
with DHCPv6.
fabrics can be a local link, or can be across a routed WAN link. The following figure illustrates the basic
common pervasive gateway topology.
The per-bridge domain common pervasive gateway configuration requirements are as follows:
• The bridge domain MAC (mac) values for each fabric must be unique.
Note The default bridge domain MAC (mac) address values are the same for all ACI fabrics.
The common pervasive gateway requires an administrator to configure the bridge domain
MAC (mac) values to be unique for each ACI fabric.
• The bridge domain virtual MAC (vmac) address and the subnet virtual IP address must be the same
across all ACI fabrics for that bridge domain. Multiple bridge domains can be configured to communicate
across connected ACI fabrics. The virtual MAC address and the virtual IP address can be shared across
bridge domains.
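The two requirements above lend themselves to a simple consistency check across fabrics. The following Python sketch assumes a hypothetical BridgeDomainView record that holds, for one bridge domain, the values configured in each fabric; the record type, field names, and sample values are illustrative only and do not correspond to APIC object names beyond mac and vmac.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BridgeDomainView:
    """One bridge domain as configured in one fabric (hypothetical record)."""
    fabric: str
    mac: str         # bridge domain MAC, must differ per fabric
    vmac: str        # virtual MAC, must match across fabrics
    virtual_ip: str  # subnet virtual IP, must match across fabrics


def check_common_gateway(views):
    """Return a list of problems found for one bridge domain across fabrics."""
    problems = []
    macs = [v.mac.lower() for v in views]
    if len(set(macs)) != len(macs):
        problems.append("bridge domain MAC must be unique per fabric")
    if len({v.vmac.lower() for v in views}) > 1:
        problems.append("virtual MAC must match across fabrics")
    if len({v.virtual_ip for v in views}) > 1:
        problems.append("virtual IP must match across fabrics")
    return problems


# Sample values are made up; both fabrics still use the same default MAC.
views = [
    BridgeDomainView("fabric1", "00:22:BD:F8:19:FF", "00:00:5E:00:01:01", "10.0.0.1"),
    BridgeDomainView("fabric2", "00:22:BD:F8:19:FF", "00:00:5E:00:01:01", "10.0.0.1"),
]
print(check_common_gateway(views))
```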
The routes that are learned through peering are sent to the spine switches. The spine switches act as route
reflectors and distribute the external routes to all of the leaf switches that have interfaces that belong to the
same tenant. These routes are longest prefix match (LPM) summarized addresses and are placed in the leaf
switch's forwarding table with the VTEP IP address of the remote leaf switch where the external router is
connected. WAN routes have no forwarding proxy. If the WAN routes do not fit in the leaf switch's forwarding
table, the traffic is dropped. Because the external router is not the default gateway, packets from the tenant
endpoints (EPs) are sent to the default gateway in the ACI fabric.
Networking Domains
A fabric administrator creates domain policies that configure ports, protocols, VLAN pools, and encapsulation.
These policies can be used exclusively by a single tenant, or shared. Once a fabric administrator configures
domains in the ACI fabric, tenant administrators can associate tenant endpoint groups (EPGs) to domains.
The following networking domain profiles can be configured:
• VMM domain profiles (vmmDomP) are required for virtual machine hypervisor integration.
• Physical domain profiles (physDomP) are typically used for bare metal server attachment and management
access.
• Bridged outside network domain profiles (l2extDomP) are typically used to connect a bridged external
network trunk switch to a leaf switch in the ACI fabric.
• Routed outside network domain profiles (l3extDomP) are used to connect a router to a leaf switch in the
ACI fabric.
• Fibre Channel domain profiles (fcDomP) are used to connect Fibre Channel VLANs and VSANs.
A domain is configured to be associated with a VLAN pool. EPGs are then configured to use the VLANs
associated with a domain.
Note EPG port and VLAN configurations must match those specified in the domain infrastructure configuration
with which the EPG associates. If not, the APIC will raise a fault. When such a fault occurs, verify that
the domain infrastructure configuration matches the EPG port and VLAN configurations.
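When investigating such a fault, one quick check is whether every VLAN used by the EPG falls inside the VLAN pool of the associated domain. The following Python sketch assumes the pool is available as a list of inclusive (start, end) ranges; the function name and the sample values are hypothetical.

```python
def vlans_outside_pool(epg_vlans, pool_ranges):
    """Return the EPG VLAN IDs that no (start, end) range of the domain's pool covers."""
    return sorted(
        vlan
        for vlan in set(epg_vlans)
        if not any(start <= vlan <= end for start, end in pool_ranges)
    )


# Hypothetical EPG VLANs checked against a pool of VLANs 100 through 199.
print(vlans_outside_pool([100, 150, 300], [(100, 199)]))  # [300]
```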
The l2extOut includes the switch-specific configuration and interface-specific configuration. The l2extInstP
EPG exposes the external network to tenant EPGs through a contract. For example, a tenant EPG that contains
a group of network-attached storage devices could communicate through a contract with the l2extInstP EPG
according to the network configuration contained in the Layer 2 external outside network. Only one outside
network can be configured per leaf switch. However, the outside network configuration can easily be reused
for multiple nodes by associating multiple nodes with the L2 external node profile. Multiple nodes that use
the same profile can be configured for fail-over or load balancing.
The ACI fabric is unaware of the presence of the external router and the APIC statically assigns the leaf switch
interface to its EPG.
A Layer 3 external outside network (l3extOut object) includes the routing protocol options (BGP, OSPF, or
both) and the switch-specific configuration and interface-specific configuration. While the l3extOut contains
the routing protocol (for example, OSPF with its related Virtual Routing and Forwarding (VRF) and area ID),
the Layer 3 external interface profile contains the necessary OSPF interface configuration details. Both are
needed to enable OSPF.
Note A bridge domain in a tenant can contain a public subnet that is advertised through an l3extOut provisioned
in the common tenant.
The l3extInstP EPG exposes the external network to tenant EPGs through a contract. For example, a tenant
EPG that contains a group of web servers could communicate through a contract with the l3extInstP EPG
according to the network configuration contained in the l3extOut. The outside network configuration can
easily be reused for multiple nodes by associating the nodes with the L3 external node profile. Multiple nodes
that use the same profile can be configured for fail-over or load balancing. Also, a node can be added to
multiple l3extOuts resulting in VRFs that are associated with the l3extOuts getting deployed on that node.
For scalability information, refer to the current Verified Scalability Guide for Cisco ACI.
Starting with Cisco APIC release 1.1(1j), transit traffic ingressing and egressing the same l3extOut will be
dropped according to the policy, when configured with 0.0.0.0/0 security import subnet. This behavior is true
for dynamic or static routing. To prevent this behavior, you must define more specific subnets.
Starting with Cisco APIC release 1.2(1), ingress-based policy enforcement enables defining policy enforcement
for Layer 3 Outside traffic with regard to egress and ingress directions. The default is ingress. During an
upgrade to release 1.2(1) or higher, existing Layer 3 Outside configurations are set to egress so that the behavior
is consistent with the existing configuration; no special upgrade sequence needs to be planned. After the
upgrade, you change the global property value to ingress. When it has been changed, the system reprograms
the rules and prefix entries. Rules are removed from the egress leaf and installed on the ingress leaf, if not
already present. If not already configured, an Actrl prefix entry is installed on the ingress leaf. Direct server
return (DSR) and attribute EPGs require ingress-based policy enforcement. vzAny and taboo contracts ignore
ingress-based policy enforcement. Transit rules are applied at ingress.
Note Starting with Cisco APIC release 1.2(1x), tenant networking protocol policies for BGP l3extOut
connections can be configured with a maximum prefix limit enabling monitoring and restricting the number
of route prefixes received from a peer. Once the maximum prefix limit has been exceeded, a log entry is
recorded, and further prefixes are rejected. The connection can be restarted if the count drops below the
threshold in a fixed interval, or the connection is shut down. Only one option can be used at a time. The
default setting is a limit of 20,000 prefixes, after which new prefixes are rejected. When the reject option
is deployed, BGP accepts one more prefix beyond the configured limit, before the APIC raises a fault.
Note When you configure Layer 3 Outside (L3Out) connections to external routers, or multipod connections
through an Inter-Pod Network (IPN), it is critical that the MTU be set appropriately on both sides. On
some platforms, such as ACI, Cisco NX-OS, and Cisco IOS, the configurable MTU value takes into
account packet headers (resulting in a max packet size to be set as 9216 bytes for ACI and 9000 for NX-OS
and IOS), whereas other platforms such as IOS-XR configure the MTU value exclusive of packet headers
(resulting in a max packet size of 8986 bytes).
For the appropriate MTU values for each platform, see the relevant configuration guides.
Cisco highly recommends you test the MTU using CLI-based commands. For example, on the Cisco
NX-OS CLI, use a command such as ping 1.1.1.1 df-bit packet-size 9000 source-interface
ethernet 1/1.
The following figure illustrates how the ACI fabric keeps static route preferences intact across leaf switches
so that route selection happens based on this preference.
This figure shows an MP-BGP route coming to leaf switch 3 (L3) from leaf switch 4 (L4) that wins over a local
static route. A static route is installed in the Unicast Routing Information Base (URIB) with the preference
configured by an administrator. On an ACI non-border leaf switch, a static route is installed with leaf switch
4 (L4) as its next hop. When the next hop on L4 is not available, the L3 static route becomes the best route in
the fabric.
Note If a static route in a leaf switch is defined with next hop Null 0, MP-BGP does not advertise that route
to other leaf switches in the fabric.
Route Import and Export, Route Summarization, and Route Community Match
Subnet route export or import configuration options can be specified according to the scope and aggregation
options described below.
For routed subnets, the following scope options are available:
• Export Route Control Subnet—Controls the export route direction.
• Import Route Control Subnet—Controls the import route direction.
Note Import route control is supported for BGP and OSPF, but not EIGRP.
• External Subnets for the External EPG (Security Import Subnet)—Specifies which external subnets have
contracts applied as part of a specific External Network Instance Profile (l3extInstP). For a subnet
under the l3extInstP to be classified as an External EPG, the scope on the subnet should be set to
"import-security". Subnets of this scope determine which IP addresses are associated with the l3extInstP.
Once this is determined, contracts determine with which other EPGs that external subnet is allowed to
communicate. For example, when traffic enters the ACI switch on the Layer 3 External Outside Network
(L3extOut), a lookup occurs to determine which source IP addresses are associated with the l3extInstP.
This action is performed based on Longest Prefix Match (LPM) so that more specific subnets take
precedence over more general subnets.
• Shared Route Control Subnet— In a shared service configuration, only subnets that have this property
enabled will be imported into the consumer EPG Virtual Routing and Forwarding (VRF). It controls the
route direction for shared services between VRFs.
• Shared Security Import Subnet—Applies shared contracts to imported subnets. The default specification
is External Subnets for the External EPG.
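The longest prefix match behavior described for External Subnets for the External EPG can be sketched as follows. This is only an illustration of LPM classification using Python's ipaddress module; the EPG names and the dictionary layout are hypothetical, and the switch performs this lookup in hardware rather than in this way.

```python
import ipaddress


def classify_source(src_ip, epg_subnets):
    """Return the external EPG whose subnet is the longest-prefix match for src_ip, or None."""
    addr = ipaddress.ip_address(src_ip)
    best = None  # (prefix length, EPG name) of the most specific match so far
    for epg, prefixes in epg_subnets.items():
        for prefix in prefixes:
            net = ipaddress.ip_network(prefix)
            if net.version == addr.version and addr in net:
                if best is None or net.prefixlen > best[0]:
                    best = (net.prefixlen, epg)
    return best[1] if best else None


# Hypothetical external EPGs with import-security subnets.
epgs = {"ext-specific": ["10.10.10.0/24"], "ext-default": ["0.0.0.0/0"]}
print(classify_source("10.10.10.5", epgs))  # ext-specific
print(classify_source("172.16.0.9", epgs))  # ext-default
```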
Routed subnets can be aggregated. When aggregation is not set, the subnets are matched exactly. For example,
if 11.1.0.0/16 is the subnet, then the policy will not apply to a 11.1.1.0/24 route, but it will apply only if the
route is 11.1.0.0/16. However, to avoid a tedious and error prone task of defining all the subnets one by one,
a set of subnets can be aggregated into one export, import or shared routes policy. At this time, only 0/0
subnets can be aggregated. When 0/0 is specified with aggregation, all the routes are imported, exported, or
shared with a different VRF, based on the selection option below:
• Aggregate Export—Exports all transit routes of a VRF (0/0 subnets).
• Aggregate Import—Imports all incoming routes of given L3 peers (0/0 subnets).
Note Aggregate import route control is supported for BGP and OSPF, but not for EIGRP.
• Aggregate Shared Routes—If a route is learned in one VRF but needs to be advertised to another VRF,
the routes can be shared by matching the subnet exactly, or can be shared in an aggregate way according
to a subnet mask. For aggregate shared routes, multiple subnet masks can be used to determine which
specific route groups are shared between VRFs. For example, 10.1.0.0/16 and 12.1.0.0/16 can be specified
to aggregate these subnets. Or, 0/0 can be used to share all subnet routes across multiple VRFs.
Note Routes shared between VRFs function correctly on Generation 2 switches (Cisco Nexus N9K switches
with "EX" or "FX" on the end of the switch model name, or later; for example, N9K-93108TC-EX). On
Generation 1 switches, however, there may be dropped packets with this configuration, because the physical
ternary content-addressable memory (TCAM) tables that store routes do not have enough capacity to fully
support route parsing.
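The difference between exact matching and aggregate matching can be shown with a short Python sketch. The function name and its aggregate flag are hypothetical; the sketch only models the matching rule described above and does not enforce which subnets the fabric accepts for aggregation.

```python
import ipaddress


def policy_applies(route, subnet, aggregate=False):
    """Return True if a route control subnet applies to the given route."""
    route_net = ipaddress.ip_network(route)
    subnet_net = ipaddress.ip_network(subnet)
    if not aggregate:
        # Without aggregation the subnet must match the route exactly.
        return route_net == subnet_net
    # With aggregation every route inside the subnet matches.
    return route_net.version == subnet_net.version and route_net.subnet_of(subnet_net)


print(policy_applies("11.1.1.0/24", "11.1.0.0/16"))                # False
print(policy_applies("11.1.0.0/16", "11.1.0.0/16"))                # True
print(policy_applies("11.1.1.0/24", "0.0.0.0/0", aggregate=True))  # True
```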
Route summarization simplifies route tables by replacing many specific addresses with a single address. For
example, 10.1.1.0/24, 10.1.2.0/24, and 10.1.3.0/24 are replaced with 10.1.0.0/16. Route summarization policies
enable routes to be shared efficiently among border leaf switches and their neighbor leaf switches. BGP,
OSPF, or EIGRP route summarization policies are applied to a bridge domain or transit subnet. For OSPF,
inter-area and external route summarization are supported. Summary routes are exported; they are not advertised
within the fabric. In the example above, when a route summarization policy is applied, and an EPG uses the
10.1.0.0/16 subnet, the entire range of 10.1.0.0/16 is shared with all the neighboring leaf switches.
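To make the arithmetic of the example concrete, the following Python sketch checks that a summary covers a set of specific routes and computes the tightest single prefix that would cover them. The function names are hypothetical. Note that 10.1.0.0/16 in the example above is an administrator-chosen summary that is wider than the tightest possible prefix for those three routes.

```python
import ipaddress


def covers_all(summary, routes):
    """Return True if every route falls inside the summary prefix."""
    summary_net = ipaddress.ip_network(summary)
    return all(ipaddress.ip_network(r).subnet_of(summary_net) for r in routes)


def tightest_summary(routes):
    """Return the smallest single prefix that contains every route (same IP version)."""
    nets = [ipaddress.ip_network(r) for r in routes]
    candidate = nets[0]
    # Widen the first route one bit at a time until it contains all routes.
    while not all(n.subnet_of(candidate) for n in nets):
        candidate = candidate.supernet()
    return candidate


routes = ["10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24"]
print(covers_all("10.1.0.0/16", routes))  # True
print(tightest_summary(routes))           # 10.1.0.0/22
```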
Note When two L3extOut policies are configured with OSPF on the same leaf switch, one regular and another
for the backbone, a route summarization policy configured on one L3extOut is applied to both L3extOut
policies because summarization applies to all areas in the VRF.
As illustrated in the figure below, route control profiles derive route maps according to prefix-based and
community-based matching.
The route control profile (rtctrlProfile) specifies what is allowed. The Route Control Context specifies
what to match, and the scope specifies what to set. The subject profile contains the community match
specifications, which can be used by multiple l3extOut instances. The subject profile (SubjP) can contain
multiple community terms each of which contains one or more community factors (communities). This
arrangement enables specifying the following boolean operations:
• Logical OR among multiple community terms
• Logical AND among multiple community factors
For example, a community term called northeast could have multiple communities that each include many
routes. Another community term called southeast could also include many different routes. The administrator
could choose to match one, or the other, or both. A community factor type can be regular or extended. Care
should be taken when using extended type community factors, to ensure there are no overlaps among the
specifications.
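The boolean behavior described above (OR across community terms, AND across the community factors inside a term) can be sketched in a few lines of Python. The function name, the dictionary layout, and the community values are hypothetical and serve only to illustrate the matching logic.

```python
def matches_subject(route_communities, community_terms):
    """Return True if the route satisfies at least one community term.

    A term is satisfied only when the route carries every factor in that term.
    """
    have = set(route_communities)
    return any(
        bool(factors) and set(factors) <= have
        for factors in community_terms.values()
    )


# Hypothetical subject profile with two community terms.
terms = {
    "northeast": ["65000:100", "65000:101"],
    "southeast": ["65000:200"],
}
print(matches_subject(["65000:200"], terms))               # True (southeast)
print(matches_subject(["65000:100"], terms))               # False (northeast needs both)
print(matches_subject(["65000:100", "65000:101"], terms))  # True (northeast)
```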
The scope portion of the route control profile references the attribute profile (rtctrlAttrP) to specify what
set-action to apply, such as preference, next hop, community, and so forth. When routes are learned from an
l3extOut, route attributes can be modified.
The figure above illustrates the case where an l3extOut contains a rtctrlProfile. A rtctrlProfile can
also exist under the tenant. In this case, the l3extOut has an interleak relation policy (L3extRsInterleakPol)
that associates it with the rtctrlProfile under the tenant. This configuration enables reusing the
rtctrlProfile for multiple l3extOut connections. It also enables keeping track of the routes the fabric
learns from OSPF to which it gives BGP attributes (BGP is used within the fabric). A rtctrlProfile defined
under an L3extOut has a higher priority than one defined under the tenant.
The rtctrlProfile has two modes: combinable and global. The default combinable mode combines
pervasive subnets (fvSubnet) and external subnets (l3extSubnet) with the match/set mechanism to render
the route map. The global mode applies to all subnets within the tenant, and overrides other policy attribute
settings. A global rtctrlProfile provides permit-all behavior without defining explicit (0/0) subnets. A
global rtctrlProfile is used with non-prefix based match rules where matching is done using different
subnet attributes such as community, next hop, and so on. Multiple rtctrlProfile policies can be configured
under a tenant.
rtctrlProfile policies enable enhanced default import and default export route control. Layer 3 Outside
networks with aggregated import or export routes can have import/export policies that specify supported
default-export and default-import, and supported 0/0 aggregation policies. To apply a rtctrlProfile policy
on all routes (inbound or outbound), define a global default rtctrlProfile that has no match rules.
Note While multiple l3extOut connections can be configured on one switch, all Layer 3 outside networks
configured on a switch must use the same rtctrlProfile because a switch can have only one route map.
The protocol interleak and redistribute policy controls externally learned route sharing with ACI fabric BGP
routes. Set attributes are supported. Such policies are supported per L3extOut, per node, or per VRF. An
interleak policy applies to routes learned by the routing protocol in the L3extOut. Currently, interleak and
redistribute policies are supported for OSPF v2 and v3. A route control policy rtctrlProfile has to be
defined as global when it is consumed by an interleak policy.
• A shared service is supported only with non-overlapping and non-duplicate subnets. When configuring
subnets for shared services, follow these guidelines:
◦Configure the subnet for a shared service provider under the EPG, not under the bridge domain.
◦Subnets configured under an EPG that share the same VRF must be disjointed and must not overlap.
◦Subnets leaked from one VRF to another must be disjointed and must not overlap.
◦Subnets leaked from multiple consumer networks into a VRF or vice versa must be disjointed and
must not overlap.
Note If two consumers are mistakenly configured with the same subnet, recover from this
condition by removing the subnet configuration for both then reconfigure the subnets
correctly.
• Do not configure a shared service with AnyToProv in the provider VRF. The APIC rejects this
configuration and raises a fault.
• When a contract is configured between in-band and out-of-band EPGs, the following restrictions apply:
◦Both EPGs should be in the same VRF.
◦Filters apply only in the incoming direction.
◦Layer 2 filters are not supported.
◦QoS does not apply to in-band Layer 4 to Layer 7 services.
◦Management statistics are not available.
◦Shared services for CPU-bound traffic are not supported.
Note All switches that will use l3extInstP EPG shared service contracts require the hardware and software
support available starting with the APIC 1.2(1x) and switch 11.2(1x) releases. Refer to the Cisco APIC
Management, Installation, Upgrade, and Downgrade Guide and Release Notes documentation for more
details.
The figure below illustrates the major policy model objects that are configured for a shared l3extInstP EPG.
Take note of the following guidelines and limitations for shared Layer 3 outside network configurations:
• No tenant limitations: Tenants A and B can be any kind of tenant (user, common, infra, mgmt.). The
shared l3extInstP EPG does not have to be in the common tenant.
• Flexible placement of EPGs: EPG A and EPG B in the illustration above are in different tenants. EPG
A and EPG B could use the same bridge domain and VRF, but they are not required to do so. EPG A
and EPG B are in different bridge domains and different VRFs but still share the same l3extInstP EPG.
• A subnet can be private, public, or shared. A subnet that is to be advertised into a consumer or provider
EPG of an L3extOut must be set to shared. A subnet that is to be exported to an L3extOut must be set
to public.
• The shared service contract is exported from the tenant that contains the l3extInstP EPG that provides
shared Layer 3 outside network service. The shared service contract is imported into the tenants that
contain the EPGs that consume the shared service.
• Do not use taboo contracts with a shared L3 out; this configuration is not supported.
• The l3extInstP as a shared service provider is supported, but only with non l3extInstP consumers
(where the L3extOut EPG is the same as the l3extInstP).
• Traffic Disruption (Flap): When an l3extInstP EPG is configured with an external subnet of 0.0.0.0/0 with
the scope property of the l3extInstP subnet set to shared route control (shared-rtctrl), or shared security
(shared-security), the VRF is redeployed with a global pcTag. This disrupts all the external traffic
in that VRF.
• Prefixes for a shared L3extOut must be unique. Multiple shared L3extOut configurations with the
same prefix in the same VRF will not work. Be sure that the external subnets (external prefixes) that are
advertised into a VRF are unique (the same external subnet cannot belong to multiple l3extInstPs). An
L3extOut configuration (for example, named L3Out1) with prefix1 and a second Layer 3 outside
configuration (for example, named L3Out2) also with prefix1 belonging to the same VRF will not work
(because only 1 pcTag is deployed). Different behaviors of L3extOut are possible when configured on
the same leaf switch under the same VRF. The two possible scenarios are as follows:
◦Scenario 1 has an L3extOut with an SVI interface and two subnets (10.10.10.0/24 and 0.0.0.0/0)
defined. If ingress traffic on the Layer 3 outside network has the matching prefix 10.10.10.0/24,
then the ingress traffic uses the External EPG pcTag. If ingress traffic on the Layer 3 Outside
network has the matching default prefix 0.0.0.0/0, then the ingress traffic uses the External Bridge
pcTag.
◦Scenario 2 has an L3extOut using a routed or routed-sub-interface with two subnets (10.10.10.0/24
and 0.0.0.0/0) defined. If ingress traffic on the Layer 3 outside network has the matching prefix
10.10.10.0/24, then the ingress traffic uses the External EPG pcTag. If ingress traffic on the Layer
3 outside network has the matching default prefix 0.0.0.0/0, then the ingress traffic uses the VRF
pcTag.
◦As a result of these described behaviors, the following use cases are possible if the same VRF and
same leaf switch are configured with L3extOut-A and L3extOut-B using an SVI interface:
Case 1 is for L3extOut-A: This External Network EPG has two subnets defined: 10.10.10.0/24 &
0.0.0.0/1. If ingress traffic on L3extOut-A has the matching prefix 10.10.10.0/24, it uses the external
EPG pcTag & contract-A which is associated with L3extOut-A. When egress traffic on L3extOut-A
has no specific match found, but there is a maximum prefix match with 0.0.0.0/1, it uses the External
Bridge Domain (BD) pcTag & contract-A.
Case 2 is for L3extOut-B: This External Network EPG has one subnet defined: 0.0.0.0/0. When
ingress traffic on L3extOut-B has the matching prefix 10.10.10.0/24 (which is defined
under L3extOut-A), it uses the External EPG pcTag of L3extOut-A and the contract-A which is
tied with L3extOut-A. It does not use contract-B which is tied with L3extOut-B.
• Traffic not permitted: Traffic is not permitted when an invalid configuration sets the scope of the external
subnet to shared route control (shared-rtctrl) as a subset of a subnet that is set to shared security
(shared-security). For example, the following configuration is invalid:
◦shared rtctrl: 10.1.1.0/24, 10.1.2.0/24
◦shared security: 10.1.0.0/16
In this case, ingress traffic on a non-border leaf with a destination IP of 10.1.1.1 is dropped, since prefixes
10.1.1.0/24 and 10.1.2.0/24 are installed with a drop rule. Traffic is not permitted. Such traffic can be
enabled by revising the configuration to use the shared-rtctrl prefixes as shared-security prefixes as
well.
• Inadvertent traffic flow: Prevent inadvertent traffic flow by avoiding the following configuration scenarios:
◦Case 1 configuration details:
◦A Layer 3 outside network configuration (for example, named L3extOut-1) with VRF1 is
called provider1.
◦A second Layer 3 outside network configuration (for example, named L3extOut-2) with
VRF2 is called provider2.
◦L3extOut-1 VRF1 advertises a default route to the Internet, 0.0.0.0/0 which enables both
shared-rtctrl and shared-security.
◦L3extOut-2 VRF2 advertises specific subnets to DNS and NTP, 192.0.0.0/8 which enables
shared-rtctrl.
◦L3extOut-2 VRF2 has specific subnet 192.1.0.0/16, which enables shared-security.
◦Variation A: EPG Traffic Goes to Multiple VRFs.
◦Communication between EPG1 and L3extOut-1 is regulated by an allow_all contract.
◦Communication between EPG1 and L3extOut-2 is regulated by an allow_all contract.
Result: Traffic from EPG1 to L3extOut-2 also goes to 192.2.x.x.
◦Variation B: An EPG conforms to the allow_all contract of a second shared Layer 3 outside
network.
◦Communication between EPG1 and L3extOut-1 is regulated by an allow_all contract.
◦Communication between EPG1 and L3extOut-2 is regulated by an allow_icmp contract.
Result: Traffic from EPG1 to L3extOut-2 to 192.2.x.x conforms to the allow_all
contract.
◦192.1.0.0/16 = shared-security
Result: Traffic going from 192.2.x.x also goes through to the EPG.
◦The shared L3extOut VRF has an EPG with pcTag = prov vrf and a contract
set to allow_all
◦The EPG <subnet> = shared.
Result: The traffic coming in on the Layer 3 out can go through the EPG.
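The invalid combination described under "Traffic not permitted" in the guidelines above (a shared-rtctrl prefix that is a subset of a shared-security prefix without being a shared-security prefix itself) can be detected before deployment with a check along these lines. This is a minimal Python sketch; the function name is hypothetical and the logic reflects only the rule as stated in this section.

```python
import ipaddress


def uncovered_rtctrl_prefixes(shared_rtctrl, shared_security):
    """Return shared-rtctrl prefixes that fall inside a shared-security prefix
    but are not configured as shared-security prefixes themselves."""
    security = [ipaddress.ip_network(p) for p in shared_security]
    flagged = []
    for prefix in shared_rtctrl:
        net = ipaddress.ip_network(prefix)
        if net in security:
            continue  # also marked shared-security, which is the documented fix
        if any(net.version == s.version and net.subnet_of(s) for s in security):
            flagged.append(prefix)
    return flagged


# The invalid example from the guideline, followed by the corrected configuration.
print(uncovered_rtctrl_prefixes(["10.1.1.0/24", "10.1.2.0/24"], ["10.1.0.0/16"]))
print(uncovered_rtctrl_prefixes(
    ["10.1.1.0/24", "10.1.2.0/24"],
    ["10.1.0.0/16", "10.1.1.0/24", "10.1.2.0/24"],
))
```

The first call returns both /24 prefixes, and the second returns an empty list.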
Layer 3 Multicast
In the ACI fabric, most unicast and multicast routing operate together on the same border leaf switches, with
the multicast protocol operating over the unicast routing protocols.
In this architecture, only the border leaf switches run the full Protocol Independent Multicast (PIM) protocol.
Non-border leaf switches run PIM in a passive mode on the interfaces. They do not peer with any other PIM
routers. The border leaf switches peer with other PIM routers connected to them over L3 Outs and also with
each other.
The following figure shows the border leaf (BL) switches (BL1 and BL2) connecting to routers (R1 and R2)
in the multicast cloud. Each virtual routing and forwarding (VRF) in the fabric that requires multicast routing
will peer separately with external multicast routers.
is no equivalent for the interface in hardware. The operational state of the fabric interface should follow the
aggFabState published by the intermediate system-to-intermediate system (IS-IS).
Note The user must configure a unique loopback address on each border leaf on each VRF that is enabled for
multicast routing.
Any loopback configured for unicast routing can be reused. This loopback address must be routed from the
external network and will be injected into the fabric MP-BGP (Multiprotocol Border Gateway Protocol) routes
for the VRF. The fabric interface source IP will be set to this loopback as the loopback interface. The following
figure shows the fabric for multicast routing.
Note Layer 3 Out ports and sub-interfaces are supported while external SVIs are not supported. Since external
SVIs are not supported, PIM cannot be enabled in L3-VPC.
• Layer 3 multicast is configured at the VRF level, so the multicast protocols function within the VRF.
Each VRF can have multicast turned on or off independently.
• Once a VRF is enabled for multicast, the individual bridge domains (BDs) and L3 Outs under the enabled
VRF can be enabled for multicast configuration. By default, multicast is disabled in all BDs and Layer
3 Outs.
• Layer 3 multicast is not currently supported on VRFs that are configured with a shared L3 Out.
• Any Source Multicast (ASM) and Source-Specific Multicast (SSM) are supported.
• Bidirectional PIM, Rendezvous Point (RP) within the ACI fabric, and PIM IPv6 are currently not
supported.
• IGMP snooping cannot be disabled on pervasive bridge domains with multicast routing enabled.
• Multicast routers are not supported in pervasive bridge domains.
• The Layer 3 multicast feature is supported on the following -EX model leaf switches:
◦N9K-93180YC-EX
◦N9K-93108TC-EX
◦N9K-93180LC-EX
• For Layer 3 multicast support for multipod, when the ingress leaf switch receives a packet from a source
attached on a bridge domain that is enabled for multicast routing, the ingress leaf switch sends only a
routed VRF copy to the fabric (routed implies that the TTL is decremented by 1, and the source-mac is
rewritten with a pervasive subnet MAC). The egress leaf switch also routes the packet to receivers in
all the relevant bridge domains. Therefore, if a receiver is on the same bridge domain as the source, but
on a different leaf switch than the source, that receiver continues to get a routed copy, even though it is
in the same bridge domain.
For more information about Layer 3 multicast support for multipod, which leverages the existing Layer 2
design, see Adding Pods.
• Starting with Release 3.1(1x), Layer 3 multicast is supported with FEX, including multicast sources or
receivers connected to FEX ports. For details about how to add FEX in your testbed, see Configure a
Fabric Extender with Application Centric Infrastructure at this URL:
https://www.cisco.com/c/en/us/support/docs/cloud-systems-management/application-policy-infrastructure-controller-apic/200529-Configure-a-Fabric-Extender-with-Applica.html
For releases preceding Release 3.1(1x), Layer 3 multicast is not supported with FEX, and multicast
sources or receivers connected to FEX ports are not supported.
All tenant WAN connections use a single session on the spine switches where the WAN routers are connected.
This aggregation of tenant BGP sessions towards the Data Center Interconnect Gateway (DCIG) improves
control plane scale by reducing the number of tenant BGP sessions and the amount of configuration required
for all of them. The network is extended out using Layer 3 subinterfaces configured on spine fabric ports.
Transit routing with shared services using GOLF is not supported.
A Layer 3 external outside network (L3extOut) for GOLF physical connectivity for a spine switch is specified
under the infra tenant, and includes the following:
• LNodeP (l3extInstP is not required within the L3Out in the infra tenant.)
• A provider label for the L3extOut for GOLF in the infra tenant.
• OSPF protocol policies
• BGP protocol policies
All regular tenants use the above-defined physical connectivity. The L3extOut defined in regular tenants
requires the following:
• An l3extInstP (EPG) with subnets and contracts. The scope of the subnet is used to control import/export
route control and security policies. The bridge domain subnet must be set to advertise externally and it
must be in the same VRF as the application EPG and the GOLF L3Out EPG.
• Communication between the application EPG and the GOLF L3Out EPG is governed by explicit contracts
(not Contract Preferred Groups).
• An l3extConsLbl consumer label that must be matched with the same provider label of an L3Out for
GOLF in the infra tenant. Label matching enables application EPGs in other tenants to consume the
LNodeP external L3Out EPG.
• The BGP EVPN session in the matching provider L3extOut in the infra tenant advertises the tenant
routes defined in this L3Out.
• When deploying three GOLF Outs, if only one has a provider/consumer label for GOLF and 0/0 export
aggregation, APIC exports all routes. This is the same behavior as an existing L3extOut on leaf switches
for tenants.
• If there is direct peering between a spine switch and a data center interconnect (DCI) router, the transit
routes from leaf switches to the ASR have the next hop as the PTEP of the leaf switch. In this case,
define a static route on the ASR for the TEP range of that ACI pod. Also, if the DCI is dual-homed to
the same pod, then the precedence (administrative distance) of the static route should be the same as the
route received through the other link.
• The default bgpPeerPfxPol policy restricts routes to 20,000. For ACI WAN Interconnect peers, increase
this as needed.
• Consider a deployment with two L3extOuts on one spine switch: the first has the provider label prov1
and peers with DCI 1, and the second has the provider label prov2 and peers with DCI 2. If the tenant
VRF has a consumer label pointing to either one of the provider labels (prov1 or prov2), the tenant
route is sent out through both DCI 1 and DCI 2.
• When aggregating GOLF OpFlex VRFs, the leaking of routes cannot occur in the ACI fabric or on the
GOLF device between the GOLF OpFlex VRF and any other VRF in the system. An external device
(not the GOLF router) must be used for the VRF leaking.
Multipod
Multipod enables provisioning a more fault tolerant fabric comprised of multiple pods with isolated control
plane protocols. Also, multipod provides more flexibility with regard to the full mesh cabling between leaf
and spine switches. For example, if leaf switches are spread across different floors or different buildings,
multipod enables provisioning multiple pods per floor or building and providing connectivity between pods
through spine switches.
Multipod uses MP-BGP EVPN as the control-plane communication protocol between the ACI spines in
different Pods. WAN routers can be provisioned in the IPN, directly connected to spine switches, or connected
to border leaf switches. Multipod uses a single APIC cluster for all the pods; all the pods act as a single fabric.
Individual APIC controllers are placed across the pods but they are all part of a single APIC cluster.
For control plane isolation, IS-IS and COOP are not extended across pods. Endpoints synchronize across pods
using BGP EVPN over the IPN between the pods. Two spines in each pod are configured to have BGP EVPN
sessions with spines of other pods. The spines connected to the IPN get the endpoints and multicast groups
from COOP within a pod, but they advertise them over the IPN EVPN sessions between the pods. On the
receiving side, BGP gives them back to COOP and COOP syncs them across all the spines in the pod. WAN
routes are exchanged between the pods using BGP VPNv4/VPNv6 address families; they are not exchanged
using the EVPN address family.
There are two modes of setting up the spine switches for communicating across pods as peers and route
reflectors:
• Automatic
◦Automatic mode is a route reflector based mode that does not support a full mesh where all spines
peer with each other. The administrator must post an existing BGP route reflector policy and select
IPN aware (EVPN) route reflectors. All the peer/client settings are automated by the APIC.
◦The administrator does not have an option to choose route reflectors that don’t belong to the fabric
(for example, in the IPN).
• Manual
◦The administrator has the option to configure full mesh where all spines peer with each other
without route reflectors.
◦In manual mode, the administrator must post the already existing BGP peer policy.
• OSPF is deployed on ACI spine switches and IPN switches to provide reachability between pods. Layer
3 subinterfaces are created on spines to connect to IPN switches. OSPF is enabled on these Layer 3
subinterfaces, and per-pod TEP prefixes are advertised over OSPF. One subinterface is created
on each external spine link. Provision many external links on each spine if you expect a large
amount of east-west traffic between pods. Currently, ACI spine switches support up to
64 external links on each spine, and each subinterface can be configured for OSPF. Spine proxy TEP
addresses are advertised in OSPF over all the subinterfaces, leading to a maximum of 64-way ECMP on
the IPN switch for proxy TEP addresses. Similarly, spines receive the proxy TEP addresses of other
pods from IPN switches over OSPF, and a spine can have up to 64-way ECMP for remote pod proxy
TEP addresses. In this way, traffic between pods is spread over all these external links to provide the
desired bandwidth.
• When all the fabric links of a spine switch are down, OSPF advertises the TEP routes with the maximum
metric. This forces the IPN switch to remove the spine switch from ECMP, which prevents the
IPN from forwarding traffic to the down spine switch. Traffic is then received by other spines that have
fabric links that are up.
• In releases before APIC release 2.0(2), multipod is not supported with GOLF. In release 2.0(2), the two
features are supported in the same fabric only over Cisco Nexus N9000K switches without “EX” on the end
of the switch name; for example, N9K-9312TX. Since the 2.1(1) release, the two features can be deployed
together over all the switches used in the multipod and EVPN topologies.
• In a multipod fabric, if a spine in POD1 uses the infra tenant L3extOut-1, the TORs for the other pods
(POD2, POD3) cannot use the same infra L3extOut (L3extOut-1) for Layer 3 EVPN control plane
connectivity. Each pod must use its own spine switch and infra L3extOut, because using a pod as a
transit for the WAN connectivity of other pods is not supported.
• No filtering is done for limiting the routes exchanged across pods. All end-point and WAN routes present
in each pod are exported to other pods.
• Inband management across pods is automatically configured by a self tunnel on every spine.
• The maximum latency supported between pods is 10 msec RTT, which roughly translates to a geographical
distance of up to 500 miles.
Multipod Provisioning
The IPN is not managed by the APIC. It must be preconfigured with the following information:
• Configure the interfaces connected to the spines of all pods. Use VLAN-4 or VLAN-5, an MTU
of 9150, and the correct associated IP addresses. Use VLAN-5 for the multipod interfaces/sub-interfaces
if any of the pods have connections to remote leaf switches.
• Enable OSPF on sub-interfaces with the correct area ID.
Note You can alternatively use CoS Preservation where you want to preserve the QoS priority settings of 802.1P
traffic entering POD 1 and egressing out of POD 2, but you are not concerned with preserving the
CoS/DSCP settings in interpod network (IPN) traffic between the pods. For more information, see
Preserving 802.1P Class of Service Settings, on page 95.
As illustrated in this figure, traffic between pods in a multipod topology passes through an IPN, which may
not be under APIC management. When an 802.1P frame is sent from a spine or leaf switch in POD 1, the
devices in the IPN may not preserve the CoS setting in 802.1P frames. In this situation, when the frame reaches
a POD 2 spine or leaf switch, it has the CoS level assigned by the IPN device, instead of the level assigned
at the source in POD 1. Use a DSCP policy to ensure that the QoS priority levels are preserved in this case.
Configure a DSCP policy to preserve the QoS priority settings in a multipod topology, where there is a need
to do deterministic mapping from CoS to DSCP levels for different traffic types, and you want to prevent the
devices in the IPN from changing the configured levels. With a DSCP policy enabled, APIC converts the CoS
level to a DSCP level, according to the mapping you configure. When a frame is sent from POD 1, its
PCP level is mapped to a DSCP level; when the frame reaches POD 2, the mapped DSCP level is mapped back
to the original PCP CoS level.
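The round trip described above can be sketched as a pair of table lookups. This is a minimal illustration and not APIC's implementation: the values in COS_TO_DSCP and all function names are hypothetical, chosen only so that every CoS level has its own DSCP value and the reverse lookup is unambiguous.

```python
# Hypothetical CoS-to-DSCP mapping; real values come from the DSCP policy you configure.
COS_TO_DSCP = {0: 8, 1: 16, 2: 24, 3: 32, 4: 40, 5: 46, 6: 48, 7: 56}


def invert(mapping):
    """Build the reverse (DSCP-to-CoS) table, rejecting mappings that are not one-to-one."""
    inverse = {}
    for cos, dscp in mapping.items():
        if dscp in inverse:
            raise ValueError(f"DSCP {dscp} is mapped from more than one CoS level")
        inverse[dscp] = cos
    return inverse


DSCP_TO_COS = invert(COS_TO_DSCP)


def leave_pod(cos):
    """Source pod: translate the frame's CoS level into the DSCP value carried across the IPN."""
    return COS_TO_DSCP[cos]


def enter_pod(dscp):
    """Destination pod: translate the DSCP value back into the original CoS level."""
    return DSCP_TO_COS[dscp]


if __name__ == "__main__":
    for cos in range(8):
        assert enter_pod(leave_pod(cos)) == cos
    print("all CoS levels survive the round trip")
```

The one-to-one check matters for the deterministic mapping the text calls for: if two CoS levels shared a DSCP value, the destination pod could not recover the original level.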
The remote leaf switches are added to an existing pod in the fabric. All policies deployed in the main datacenter
are deployed in the remote switches, which behave like local leaf switches belonging to the pod. In this
topology, all unicast traffic is through VXLAN over Layer 3. Layer 2 Broadcast, Unknown Unicast, and
Multicast (BUM) messages are sent using Head End Replication (HER) tunnels without the use of Multicast.
All local traffic on the remote site is switched directly between endpoints, whether physical or virtual. Any
traffic that requires use of the spine switch proxy is forwarded to the main datacenter.
The APIC system discovers the remote leaf switches when they come up. From that time, they can be managed
through APIC, as part of the fabric.
You can configure Remote Leaf in the APIC GUI, either with or without a wizard, or use the REST API or
the NX-OS style CLI.
Full fabric and tenant policies are supported on remote leaf switches in this release, except for the following
features:
• ACI Multi-Site
• Layer 2 Outside Connections (except Static EPGs)
• 802.1Q Tunnels
• Q-in-Q Encapsulation Mapping for EPGs
• Copy services with vzAny contract
• FEX devices connected to remote leaf switches
• FCoE connections on remote leaf switches
• Flood in encapsulation for bridge domains or EPGs
• Fast Link Failover policies
• Managed Service Graph-attached devices at remote locations
• NetFlow
• Traffic Storm Control
• Cloud Sec, MacSec Encryption, and First Hop Security
• PTP
• Layer 3 Multicast routing on remote leaf switches
• PBR Tracking on remote leaf switches
• Openstack and Kubernetes VMM Domains
• Cisco AVS with VLAN and Cisco AVS with VXLAN
• Cisco ACI Virtual Edge with VLAN and ACI Virtual Edge with VXLAN
• Maintenance mode
• Troubleshooting Wizard
• TEP to TEP Atomic Counters between remote leaf switches or remote leaf switches and local leaf
switches
• Transit L3Out across remote locations
The following restrictions also apply:
• Local traffic switching in a remote location works only within a vPC pair of remote leaf switches.
• With service node integration, local traffic forwarding within a remote location is enabled if the
consumer, provider, and services nodes are connected to remote leaf switches in vPC mode.
• In transit L3Outs, traffic forwarding between two remote leaf switches in the same remote datacenter
in the same pod and across pods is not supported. All local traffic on the remote location is switched
directly between endpoints, whether physical or virtual.
• All inter-VRF traffic goes to the spine switch before being forwarded.
• Before decommissioning a remote leaf, you must first delete the vPC.
HSRP
About HSRP
HSRP is a first-hop redundancy protocol (FHRP) that allows a transparent failover of the first-hop IP router.
HSRP provides first-hop routing redundancy for IP hosts on Ethernet networks configured with a default
router IP address. You use HSRP in a group of routers for selecting an active router and a standby router. In
a group of routers, the active router is the router that routes packets, and the standby router is the router that
takes over when the active router fails or when preset conditions are met.
Many host implementations do not support any dynamic router discovery mechanisms but can be configured
with a default router. Running a dynamic router discovery mechanism on every host is not practical for many
reasons, including administrative overhead, processing overhead, and security issues. HSRP provides failover
services to such hosts.
When you use HSRP, you configure the HSRP virtual IP address as the default router of the host (instead of
the IP address of the actual router). The virtual IP address is an IPv4 or IPv6 address that is shared among a
group of routers that run HSRP.
When you configure HSRP on a network segment, you provide a virtual MAC address and a virtual IP address
for the HSRP group. You configure the same virtual address on each HSRP-enabled interface in the group.
You also configure a unique IP address and MAC address on each interface that acts as the real address. HSRP
selects one of these interfaces to be the active router. The active router receives and routes packets destined
for the virtual MAC address of the group.
HSRP detects when the designated active router fails. At that point, a selected standby router assumes control
of the virtual MAC and IP addresses of the HSRP group. HSRP also selects a new standby router at that time.
HSRP uses a priority designator to determine which HSRP-configured interface becomes the default active
router. To configure an interface as the active router, you assign it a priority that is higher than the priority
of all the other HSRP-configured interfaces in the group. The default priority is 100, so if you configure just
one interface with a higher priority, that interface becomes the default active router.
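The priority rule can be sketched as a simple ranking. This is a simplified model rather than the protocol state machine: the HsrpInterface type and elect function are hypothetical names, preemption and timers are not modeled, and breaking priority ties by the higher interface IP address is an assumption that the text above does not state.

```python
import ipaddress
from dataclasses import dataclass

DEFAULT_PRIORITY = 100  # default HSRP priority, as stated in the text


@dataclass(frozen=True)
class HsrpInterface:
    name: str
    ip: str
    priority: int = DEFAULT_PRIORITY


def elect(interfaces):
    """Return (active, standby) for one HSRP group.

    Interfaces are ranked by priority; ties are broken by the higher IP
    address (an assumption of this sketch).
    """
    ranked = sorted(
        interfaces,
        key=lambda i: (i.priority, ipaddress.ip_address(i.ip)),
        reverse=True,
    )
    active = ranked[0]
    standby = ranked[1] if len(ranked) > 1 else None
    return active, standby


if __name__ == "__main__":
    group = [
        HsrpInterface("leaf-a", "10.0.0.2", priority=110),
        HsrpInterface("leaf-b", "10.0.0.3"),
        HsrpInterface("leaf-c", "10.0.0.4"),
    ]
    active, standby = elect(group)
    print("active:", active.name, "standby:", standby.name)
    # If the active router fails, rerun the election over the survivors.
    survivors = [i for i in group if i != active]
    print("after failure:", elect(survivors)[0].name)
```

Raising the priority of a single interface above the default of 100 is enough to make it the active router, which is the case the example exercises.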
Interfaces that run HSRP send and receive multicast User Datagram Protocol (UDP)-based hello messages
to detect a failure and to designate active and standby routers. When the active router fails to send a hello
message within a configurable period of time, the standby router with the highest priority becomes the active
router. The transition of packet forwarding functions between the active and standby router is completely
transparent to all hosts on the network.
You can configure multiple HSRP groups on an interface. The virtual router does not physically exist but
represents the common default router for interfaces that are configured to provide backup to each other. You
do not need to configure the hosts on the LAN with the IP address of the active router. Instead, you configure
them with the IP address of the virtual router (virtual IP address) as their default router. If the active router
fails to send a hello message within the configurable period of time, the standby router takes over, responds
to the virtual addresses, and becomes the active router, assuming the active router duties. From the host
perspective, the virtual router remains the same.
Note Packets received on a routed port destined for the HSRP virtual IP address terminate on the local router,
regardless of whether that router is the active HSRP router or the standby HSRP router. This process
includes ping and Telnet traffic. Packets received on a Layer 2 (VLAN) interface destined for the HSRP
virtual IP address terminate on the active router.
is because Cisco APIC assigns the same MAC address (00:22:BD:F8:19:FF) to every logical interface
under the interface logical profiles.
HSRP Versions
Cisco APIC supports HSRP version 1 by default. You can configure an interface to use HSRP version 2.
HSRP version 2 has the following enhancements to HSRP version 1:
• Expands the group number range. HSRP version 1 supports group numbers from 0 to 255. HSRP version
2 supports group numbers from 0 to 4095.
• Uses the IPv4 multicast address 224.0.0.102 or the IPv6 multicast address FF02::66 to send
hello packets, instead of the multicast address 224.0.0.2 that HSRP version 1 uses.
• Uses the MAC address range from 0000.0C9F.F000 to 0000.0C9F.FFFF for IPv4 and 0005.73A0.0000
through 0005.73A0.0FFF for IPv6 addresses. HSRP version 1 uses the MAC address range
0000.0C07.AC00 to 0000.0C07.ACFF.
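The group ranges and MAC address ranges above line up: 256 version 1 groups fit in the final byte of 0000.0C07.ACxx, and 4096 version 2 groups fit in the final 12 bits of each version 2 range. The following sketch derives a virtual MAC address on the assumption that the group number fills those low-order bits; the function name hsrp_virtual_mac is hypothetical.

```python
# Base addresses taken from the ranges listed above.
V1_BASE = 0x00000C07AC00        # HSRP version 1, groups 0-255
V2_IPV4_BASE = 0x00000C9FF000   # HSRP version 2 (IPv4), groups 0-4095
V2_IPV6_BASE = 0x000573A00000   # HSRP version 2 (IPv6), groups 0-4095


def hsrp_virtual_mac(group, version=1, ipv6=False):
    """Return the virtual MAC for a group in Cisco dotted notation.

    Assumes the group number occupies the low-order bits of the range.
    """
    if version == 1:
        if ipv6:
            raise ValueError("no IPv6 range is listed for HSRP version 1")
        if not 0 <= group <= 255:
            raise ValueError("HSRP version 1 groups are 0-255")
        value = V1_BASE + group
    elif version == 2:
        if not 0 <= group <= 4095:
            raise ValueError("HSRP version 2 groups are 0-4095")
        value = (V2_IPV6_BASE if ipv6 else V2_IPV4_BASE) + group
    else:
        raise ValueError("version must be 1 or 2")
    digits = f"{value:012X}"
    return ".".join(digits[i:i + 4] for i in (0, 4, 8))


if __name__ == "__main__":
    print(hsrp_virtual_mac(10, version=1))
    print(hsrp_virtual_mac(4095, version=2))
    print(hsrp_virtual_mac(4095, version=2, ipv6=True))
```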
Mainframes that require the ACI fabric to be a transit domain for external connectivity through a WAN router
and for east-west traffic within the fabric push host routes to the fabric that are redistributed within the fabric
and towards external interfaces.
The VIP is the external facing IP address for a particular site or service. A VIP is tied to one or more servers
or nodes behind a service node.
In such scenarios, the policies are administered at the demarcation points and ACI policies need not be imposed.
L4-L7 route peering is a special use case of the fabric as a transit where the ACI fabric serves as a transit
OSPF/BGP domain for multiple pods. Route Peering is used to configure OSPF/BGP peering on the L4-L7
service device so that it can exchange routes with the ACI leaf node to which it is connected. A common use
case for route peering is Route Health Injection where the SLB VIP is advertised over OSPF/iBGP to clients
outside the ACI fabric. See L4-L7 Route Peering with Transit Fabric - Configuration Walkthrough for a
configuration walk-through of this scenario.
Figure 81: GOLF L3Outs and a Border Leaf L3Out in a Transit-Routed Configuration
Route Redistribution
Inbound routes from external peers are redistributed into the ACI fabric using MP-BGP, subject to inbound
filtering rules. These can be transit routes or external routes in the case of WAN connectivity. MP-BGP
distributes the routes to all the leaves (including other border leaves) where the tenant is deployed.
Inbound route filtering rules select a subset of routes advertised by the external peers to the fabric on the
l3extOut interfaces. The import filter route-map is generated by using the prefixes in the prefix based EPG.
The import filter list is associated only with MP-BGP to restrict the prefixes distributed into the fabric. Set
actions can also be associated with import route-maps.
In the outbound direction, an administrator has the option to advertise default routes or transit routes and
bridge domain public subnets. If default route advertisement is not enabled, outbound route filtering selectively
advertises routes as configured by the administrator.
Currently, route-maps are created with a prefix-list on a per-tenant basis to indicate the bridge domain public
subnets to be advertised to external routers. In addition, a prefix-list has to be created to allow all transit routes
to be advertised to an external router. The prefix-list for transit routes is configured by an administrator. The
default behavior is to deny all transit route advertisement to an external router.
The following options are available for the route-maps associated with transit routes:
• Permit-all: Allow all transit routes to be redistributed and advertised outside.
• Match prefix-list: Only a subset of transit routes are redistributed and advertised outside.
• Match prefix-list and set action: A set action can be associated with a subset of transit routes to tag routes
with a particular attribute.
The bridge domain public subnets and transit route prefixes can be different prefix-lists but combined into a
single route-map with different sequence numbers. Transit routes and bridge domain public subnets are not
expected to have the same prefixes, so prefix-list matches are mutually exclusive.
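The combination of two prefix-lists in one route-map can be sketched as an ordered list of entries that is evaluated by sequence number, with an implicit deny at the end. This is an illustrative model only: RouteMapEntry and evaluate are hypothetical names, the prefixes are documentation addresses, and prefix-list matching is simplified to exact prefix equality.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RouteMapEntry:
    seq: int
    prefix_list: List[str]         # an empty list models the permit-all option
    set_tag: Optional[int] = None  # optional set action applied on a match


def evaluate(route_map, prefix) -> Tuple[bool, Optional[int]]:
    """Return (advertised, tag) for a prefix; the first matching sequence wins."""
    route = ipaddress.ip_network(prefix)
    for entry in sorted(route_map, key=lambda e: e.seq):
        allowed = {ipaddress.ip_network(p) for p in entry.prefix_list}
        if not entry.prefix_list or route in allowed:
            return True, entry.set_tag
    return False, None  # default behavior: deny anything not matched


# Sequence 10 carries the bridge domain public subnets; sequence 20 carries a
# subset of transit routes together with a set action.
EXPORT_MAP = [
    RouteMapEntry(seq=10, prefix_list=["192.0.2.0/24"]),
    RouteMapEntry(seq=20, prefix_list=["198.51.100.0/24"], set_tag=100),
]

if __name__ == "__main__":
    for candidate in ("192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"):
        print(candidate, evaluate(EXPORT_MAP, candidate))
```

Because the two prefix-lists are not expected to overlap, the order of the sequence numbers does not change the result here; it would matter only if the same prefix appeared in both lists.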
OSPF
Various host types require OSPF to enable connectivity and provide redundancy. These include
mainframes, external pods, and service nodes that use ACI as a layer-3 transit within the fabric and to the
WAN. Such external devices peer with the fabric through a non-border leaf running OSPF. Ideally, the
OSPF areas are configured as a Not-So-Stubby Area (NSSA) or totally stub area to enable them to receive
a default route and not have them participate in full area routing. For existing deployments where the
administrator prefers not to change routing configurations, a stub area configuration is not mandated.
Two fabric leaf switches do not establish OSPF adjacency with each other unless they share the same
external SVI interface.
BGP
External PODs and service nodes can use BGP peering with the fabric. BGP peers are associated with
an l3extOut and multiple BGP peers can be configured per l3extOut. BGP peer reachability can
be through OSPF, EIGRP, connected interfaces, static routes, or loopback. iBGP or eBGP is used for peering
with external routers. BGP route attributes from the external router are preserved since MP-BGP is used
for distributing the external routes in the fabric.
A configuration that contains a match for both transitive and non-transitive BGP extended
communities with the same value is not supported; the APIC rejects this configuration.
OSPF Route Filtering
OSPF can be configured to limit the number of Link-State Advertisements (LSAs) that can be
accepted from the external peer to avoid over consumption of the route table due to a rogue external
router.
Inbound route filtering is supported for Layer 3 external outside tenant networks using OSPF. It is
applied using a route-map in the in-direction, to filter transit routes allowed in the fabric.
In the outbound direction, redistribute static and redistribute BGP are configured at the OSPF domain
level. A route-map is configured to filter the bridge domain public subnets and transit routes. Optionally,
some prefixes in the route-map can also be configured with a set action to add route tags. Inter-area prefixes
are also filtered using the outbound filter list and associating it with an OSPF area.
BGP Route Filtering
Inbound route filtering in BGP is applied using a route-map on a per-peer basis. A route-map is
configured at the peer-af level in the in-direction, to filter the transit routes to be allowed in the fabric.
In the outbound direction, static routes are redistributed into BGP at the dom-af level. Transit
routes from MP-BGP are available to the external BGP peering sessions. A route-map is configured at
the peer-af level in the out-direction to allow only public subnets and selected transit routes outside.
Optionally, a set action to advertise a community value for selected prefixes is configured on the
route-map.
The bridge domain public subnets and transit route prefixes can be different prefix-lists but combined
into a single route-map at the peer-af level with different sequence numbers.
OSPF Name Lookup, Prefix Suppression, and Type 7 Translation
OSPF can be configured to enable name lookup for router IDs and perform prefix suppression.
The OSPF Forwarding Address Suppression in Translated Type-5 LSAs feature causes an NSSA
ABR to translate Type-7 LSAs to Type-5 LSAs, but use 0.0.0.0 as the forwarding address instead of
the address specified in the Type-7 LSA. This feature causes routers that are configured not to advertise
forwarding addresses into the backbone to direct forwarded traffic to the translating NSSA ASBRs.
BGP Dynamic Neighbor Support and Private AS Control
Rather than providing a specific neighbor address, a dynamic neighbor range of addresses can be provided.
Private autonomous system (AS) numbers range from 64512 to 65535; they cannot be advertised to a global
BGP table. Private AS numbers can be removed from the AS path on a per-peer basis. This option can be
used only for eBGP peers, according to the following variations:
• Remove Private AS: Removes private AS numbers if the AS path has only private AS numbers
• Remove All: Removes private AS numbers if the AS path has both private and public AS numbers
• Replace AS: Replaces private AS numbers with the local AS number
Note Remove All and Replace AS can be set only if Remove Private AS is set.
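The three private AS variations can be sketched as a function over an AS path. This is a simplified model built only from the descriptions above and is not APIC's implementation: the function name apply_private_as_control and its flags are hypothetical, and the way Replace AS combines with the other two options (substituting the local AS number wherever a private AS number would otherwise be removed) is an assumption.

```python
PRIVATE_AS_MIN = 64512
PRIVATE_AS_MAX = 65535


def is_private(asn):
    """True if the AS number falls in the private range given in the text."""
    return PRIVATE_AS_MIN <= asn <= PRIVATE_AS_MAX


def apply_private_as_control(as_path, local_as,
                             remove_private=False, remove_all=False, replace_as=False):
    """Return the AS path to advertise to an eBGP peer under the given options."""
    if (remove_all or replace_as) and not remove_private:
        raise ValueError("remove_all and replace_as require remove_private")
    if not remove_private:
        return list(as_path)

    has_public = any(not is_private(asn) for asn in as_path)
    if has_public and not remove_all:
        # Remove Private AS alone acts only on paths made up entirely of private AS numbers.
        return list(as_path)
    if replace_as:
        # Assumption: substitute the local AS instead of dropping the private AS.
        return [local_as if is_private(asn) else asn for asn in as_path]
    return [asn for asn in as_path if not is_private(asn)]


if __name__ == "__main__":
    mixed = [64512, 200, 65000]
    print(apply_private_as_control(mixed, 100, remove_private=True))
    print(apply_private_as_control(mixed, 100, remove_private=True, remove_all=True))
    print(apply_private_as_control(mixed, 100, remove_private=True,
                                   remove_all=True, replace_as=True))
```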
BGP dampening minimizes propagation into the fabric of flapping e-BGP routes received from external routers
connected to border leaf switches (BLs). Frequently flapping routes from external routers are suppressed on
BLs based on configured criteria and prohibited from redistribution to iBGP peers (ACI spine switches).
Suppressed routes are reused after a configured time criterion is met. Each flap penalizes the e-BGP route with a
penalty of 1000. When the flap penalty reaches a defined suppress-limit threshold (the default is 2000), the e-BGP
route is marked as dampened. Dampened routes are not advertised to other BGP peers. The penalty is
halved after every half-life interval (the default is 15 minutes). A dampened route is reused if the
penalty falls below a specified reuse-limit (the default is 750). A dampened route is suppressed for at most a
specified maximum suppress time (maximum of 45 minutes).
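The dampening arithmetic can be worked through with the default values quoted above. This sketch assumes the penalty decays continuously and exponentially between flaps, which halves it over each half-life interval; the constant and function names are hypothetical, and the switch's actual timer granularity is not modeled.

```python
import math

FLAP_PENALTY = 1000          # penalty added per flap
SUPPRESS_LIMIT = 2000        # default suppress-limit
REUSE_LIMIT = 750            # default reuse-limit
HALF_LIFE_MIN = 15.0         # default half-life, in minutes
MAX_SUPPRESS_MIN = 45.0      # maximum suppress time, in minutes


def decayed_penalty(penalty, minutes_elapsed, half_life=HALF_LIFE_MIN):
    """Penalty remaining after a flap-free interval (assumes exponential decay)."""
    return penalty * 0.5 ** (minutes_elapsed / half_life)


def is_dampened(penalty, suppress_limit=SUPPRESS_LIMIT):
    """A route is marked as dampened once its penalty reaches the suppress-limit."""
    return penalty >= suppress_limit


def minutes_until_reuse(penalty, reuse_limit=REUSE_LIMIT,
                        half_life=HALF_LIFE_MIN, max_suppress=MAX_SUPPRESS_MIN):
    """Minutes for the penalty to decay to the reuse-limit, capped at the maximum suppress time."""
    if penalty <= reuse_limit:
        return 0.0
    return min(half_life * math.log2(penalty / reuse_limit), max_suppress)


if __name__ == "__main__":
    penalty = 2 * FLAP_PENALTY  # two flaps in quick succession
    print("dampened:", is_dampened(penalty))
    print("penalty after 15 minutes:", decayed_penalty(penalty, 15))
    print("minutes until reuse:", round(minutes_until_reuse(penalty), 1))
```

With the defaults, two back-to-back flaps are enough to reach the suppress-limit, and the cap matters only for routes whose accumulated penalty would otherwise take longer than 45 minutes to decay to the reuse-limit.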
Use the BGP weight attribute to select a best path. The weight is assigned locally to the router. The value is
only significant to the specific router. The value is not propagated or carried through any of the route updates.
A weight can be from 0 to 65,535. By default, paths that the router originates have a weight of 32,768, and
other paths have a weight of 0. Routes with a higher weight value have preference when there are multiple
routes to the same destination. A weight can be set under the neighbor or under the route map.
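As a minimal illustration of the weight comparison, the following sketch applies the defaults described above (32,768 for locally originated paths, 0 otherwise) and keeps the paths with the highest weight. The Path type and prefer_by_weight function are hypothetical names, and only the weight step is modeled; paths that tie on weight would be compared on later best-path criteria.

```python
from dataclasses import dataclass
from typing import Optional

MAX_WEIGHT = 65535
LOCAL_ORIGIN_WEIGHT = 32768


@dataclass
class Path:
    next_hop: str
    locally_originated: bool = False
    weight: Optional[int] = None  # None means "use the default for this path type"

    def __post_init__(self):
        if self.weight is None:
            self.weight = LOCAL_ORIGIN_WEIGHT if self.locally_originated else 0
        if not 0 <= self.weight <= MAX_WEIGHT:
            raise ValueError("weight must be between 0 and 65535")


def prefer_by_weight(paths):
    """Return the paths to one destination that share the highest weight."""
    best = max(p.weight for p in paths)
    return [p for p in paths if p.weight == best]


if __name__ == "__main__":
    candidates = [
        Path("10.0.0.1"),                            # learned path, default weight 0
        Path("10.0.0.2", weight=500),                # weight set under the neighbor or route map
        Path("0.0.0.0", locally_originated=True),    # default weight 32768
    ]
    print([p.next_hop for p in prefer_by_weight(candidates)])
```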
BGP peering is typically configured to the neighbor’s loopback address. In such cases, loopback reachability
is statically configured or advertised through OSPF, the latter being the more common case. The loopback
interface is configured as a passive interface and added into the OSPF area. There are no redistribution policies
attached to OSPF. The route redistribution implementation is through BGP. Route filtering can be configured
in Layer 3 outside tenant networks that use either BGP or OSPF.
External routes can also be programmed as static routes on the border leaf in the respective tenants. A peering
protocol is not required if the external routes are programmed as static routes on the border leaf. External
static routes are redistributed to other leafs in the fabric through MP-BGP, subject to import filtering. Starting
with release 1.2(1x), static route preference within the ACI fabric is carried in MP-BGP using the cost extended
community. On a Layer 3 outside connection, an MP-BGP route coming from Layer 4 wins over a local static
route. A route is installed in the Unicast Routing Information Base (URIB) with the preference configured
by an administrator. On an ACI non-border leaf switch, a route is installed with Layer 4 as nexthop. When
nexthop on Layer 4 is not available, the Layer 3 static route becomes the best route in fabric.
For l3extOut connections, external endpoints are mapped to an external EPG based on IP prefixes. For each
l3extOut connection, an administrator has the option to create one or more external EPGs based on whether
they require different policy treatments for different groups of external endpoints.
Each external EPG is associated with a class-id. Each prefix in the external EPG is programmed in the hardware
to derive the corresponding class-id. The prefixes are only qualified by the Virtual Routing and Forwarding
(VRF) instance (also known as the context or private network) and not by the l3extOut interface on which
they are deployed.
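The following sketch illustrates the idea that a class-id is derived from a prefix qualified only by the VRF. It is an assumption-laden model, not the hardware lookup: it assumes the most specific matching prefix wins, and the VRF names, prefixes, and class-id values are made up for illustration.

```python
import ipaddress
from typing import Dict, Optional, Tuple

# (VRF name, prefix) -> class-id. The l3extOut is deliberately not part of the key.
PrefixTable = Dict[Tuple[str, ipaddress.IPv4Network], int]

EXAMPLE_TABLE: PrefixTable = {
    ("vrf1", ipaddress.ip_network("0.0.0.0/0")): 100,
    ("vrf1", ipaddress.ip_network("192.0.2.0/24")): 200,
    ("vrf2", ipaddress.ip_network("192.0.2.0/24")): 300,
}


def derive_class_id(table: PrefixTable, vrf: str, ip: str) -> Optional[int]:
    """Return the class-id of the longest matching prefix in the VRF, or None."""
    addr = ipaddress.ip_address(ip)
    best: Optional[Tuple[int, int]] = None  # (prefix length, class-id)
    for (table_vrf, net), class_id in table.items():
        if table_vrf == vrf and addr in net:
            if best is None or net.prefixlen > best[0]:
                best = (net.prefixlen, class_id)
    return None if best is None else best[1]
```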
The union of prefixes from all the l3extOut policies in the same VRF is programmed on all the leaves where
the l3extOut policies are deployed. The source and destination class-ids corresponding to the source and
destination IP address in the packet are derived at the ingress leaf and the policy is applied on the ingress leaf
itself based on the configured contract. If a contract allows traffic between two prefixes on two different
l3extOut interfaces, then packets with any combination of the source and destination IP address (belonging
to the configured prefixes) are allowed between the l3extOut interfaces. If there is no contract between the
EPGs, the traffic is dropped at the ingress leaf.
Since prefixes are programmed on every leaf where l3extOut policies are deployed, the total number of
prefixes supported for prefix-based EPGs is limited to 1K for the fabric.
Overlapping or equal subnets cannot be configured on different l3extOut interfaces in the same VRF. If
overlapping or equal subnets are required, then a single l3extOut is used for transit with appropriate export
prefixes.
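A pre-deployment check for this restriction could look like the sketch below. It is a hypothetical offline helper, not an APIC feature: it takes the subnets planned for each l3extOut in a single VRF and reports any pair on different l3extOuts that overlap or are equal.

```python
import ipaddress
from itertools import combinations
from typing import Dict, List, Tuple


def find_overlaps(subnets_by_l3out: Dict[str, List[str]]) -> List[Tuple[str, str, str, str]]:
    """Return (l3out_a, subnet_a, l3out_b, subnet_b) for overlapping or equal subnets.

    All l3extOuts passed in are assumed to belong to the same VRF.
    """
    entries = [
        (name, ipaddress.ip_network(subnet))
        for name, subnets in subnets_by_l3out.items()
        for subnet in subnets
    ]
    conflicts = []
    for (name_a, net_a), (name_b, net_b) in combinations(entries, 2):
        if name_a == name_b or net_a.version != net_b.version:
            continue
        if net_a.overlaps(net_b):
            conflicts.append((name_a, str(net_a), name_b, str(net_b)))
    return conflicts
```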
• no-advertise
• no-export
• no-peer
BGP
The ACI fabric supports BGP peering with external routers. BGP peers are associated with an l3extOut policy
and multiple BGP peers can be configured per l3extOut. BGP can be enabled at the l3extOut level by defining
the bgpExtP MO under an l3extOut.
Note While the l3extOut policy contains the routing protocol (for example, BGP with its related VRF), the
Layer 3 external interface profile contains the necessary BGP interface configuration details. Both are
needed to enable BGP.
BGP peer reachability can be through OSPF, EIGRP, a connected interface, static routes, or loopback.
iBGP/eBGP can be used for peering with external routers. The BGP route attributes from the external router
will be preserved since MP-BGP is used for distributing the external routes in the fabric.
This is enough to enable IPv4 and/or IPv6 address families for the VRF associated with an l3extOut. The
address family to enable on a switch is determined by the IP address type defined in bgpPeerP policies for
the l3extOut. The policy is optional; if not defined, the default will be used. Policies can be defined for a
tenant and used by a VRF that is referenced by name.
At least one peer policy has to be defined to enable the protocol on the border leaf switch. A peer policy can
be defined in two places:
• Under l3extRsPathL3OutAtt—a physical interface is used as the source interface.
• Under l3extLNodeP—a loop-back interface is used as the source interface.
OSPF
Various host types require OSPF to enable connectivity and provide redundancy. These include mainframe
devices, external pods and service nodes that use the ACI fabric as a Layer 3 transit within the fabric and to
the WAN. Such external devices peer with the fabric through a non-border leaf running OSPF. Configure the
OSPF area as an NSSA (stub) area to enable it to receive a default route and not participate in full-area routing.
Typically, existing routing deployments avoid configuration changes, so a stub area configuration is not
mandated.
OSPF is enabled by configuring an ospfExtP managed object under an l3extOut. OSPF IP address family
version(s) configured on the border leaf are determined by the address family configured for the OSPF interface
IP address.
Note While the l3extOut policy contains the routing protocol (for example, OSPF with its related VRF and
area ID), the Layer 3 external interface profile contains the necessary OSPF interface configuration details.
Both are needed to enable OSPF.
All OSPF VRF level policies are configured by using the fvRsCtxToOspfCtxPol relation. The relation can
be configured per address family. If not configured, default parameters are used.
The OSPF area is configured in the ospfExtP managed object which also exposes IPv6 area properties as
needed.
Supported Features
The following features are supported:
• IPv4 and IPv6 routing
• Virtual routing and forwarding (VRF) and interface controls for each address family
• Redistribution with OSPF across nodes
• Default route leak policy per VRF
• Passive interface and split horizon support
• Route map control for setting tag for exported routes
• Bandwidth and delay configuration options in an EIGRP interface policy
Unsupported Features
The following features are not supported:
• Stub routing
• EIGRP used for BGP connectivity
• Multiple EIGRP L3extOuts on the same node
• Authentication support
• Summary prefix
• Per interface distribute lists for import and export
◦split horizon
◦passive
◦next hop self
• EIGRP Address Family Context Policy (eigrpCtxAfPol)—contains the configuration for a given
address family in a given VRF. An eigrpCtxAfPol is configured under tenant protocol policies and can
be applied to one or more VRFs under the tenant. An eigrpCtxAfPol can be enabled on a VRF through
a relation in the VRF-per-address family. If there is no relation to a given address family, or the specified
eigrpCtxAfPol in the relation does not exist, then the default VRF policy created under the common
tenant is used for that address family.
The following configurations are allowed in the eigrpCtxAfPol:
Note The autonomous system number is a tag used for EIGRP and is not the same as the fabric ASN used
by BGP.
EIGRP cannot be enabled with BGP and OSPF on the same L3extOut.
The following EIGRP transit scenarios are supported:
• EIGRP running in an L3extOut on one node and OSPF running in another L3extOut on a different node.
Note Multiple EIGRP L3extOuts are not supported on the same node in the same Virtual
Routing and Forwarding (VRF).
Note At this time, VRF-level route maps are supported, but interface route maps are not supported.
The default route leak policy on the L3extOut is protocol agnostic in terms of the configuration. Properties
enabled in the default route leak policy are a superset of the individual protocols. Supported configurations
in the default route leak are as follows:
• Scope: VRF is the only scope supported for EIGRP.
• Always: The switch advertises the default route only if present in the routing table or advertises it
regardless.
• Criteria: only or in-addition. With the only option, only the default route is advertised by EIGRP. With
in-addition, the public subnets and transit routes are advertised along with the default route.
The default route leak policy is enabled in the domain per VRF per address family.
By default, the protocol redistribution interleak policies with appropriate route maps are set up for all valid
configurations. The administrator enables transit routing purely by virtue of creating l3extInstP subnets with
scope=export-route control to allow certain routes to be transmitted between two L3extOuts in the same VRF.
Apart from the scope of l3extInstP subnets, there are no special protocol specific configurations for covering
transit cases. Apart from the scope, which is protocol-specific, other parameters of the default route leak policy
are common across all the protocols.
The transit scenario of EIGRP on one L3extOut with OSPF on another L3extOut on a different node is supported.
Observe the following EIGRP guidelines and limitations:
• At this time, multiple EIGRP L3Outs are not supported on the same leaf switch.
• All routes are imported on an L3extOut that uses EIGRP. Import subnet scope is disabled in the GUI if
EIGRP is the protocol on the L3extOut.
Unless the administrator assigns permissions to do so, tenants are restricted from reading fabric configuration,
policies, statistics, faults, or events.
The ACI fabric manages access privileges at the managed object (MO) level. A privilege is an MO that enables
or restricts access to a particular function within the system. For example, fabric-equipment is a privilege bit.
This bit is set by the Application Policy Infrastructure Controller (APIC) on all objects that correspond to
equipment in the physical fabric.
A role is a collection of privilege bits. For example, because an “admin” role is configured with privilege bits
for “fabric-equipment” and “tenant-security,” the “admin” role has access to all objects that correspond to
equipment of the fabric and tenant security.
A security domain is a tag associated with a certain subtree in the ACI MIT object hierarchy. For example,
the default tenant “common” has a domain tag common. Similarly, the special domain tag all includes the
entire MIT object tree. An administrator can assign custom domain tags to the MIT object hierarchy. For
example, an administrator could assign the “solar” domain tag to the tenant named solar. Within the MIT, only
certain objects can be tagged as security domains. For example, a tenant can be tagged as a security domain
but objects within a tenant cannot.
Note Security Domain password strength parameters can be configured by creating Custom Conditions or by
selecting Any Three Conditions that are provided.
Creating a user and assigning a role to that user does not enable access rights. It is necessary to also assign
the user to one or more security domains. By default, the ACI fabric includes two special pre-created domains:
• All—allows access to the entire MIT
• Infra— allows access to fabric infrastructure objects/subtrees, such as fabric access policies
Note For read operations to the managed objects that a user's credentials do not allow, a "DN/Class Not Found"
error is returned, not "DN/Class Unauthorized to read." For write operations to a managed object that a
user's credentials do not allow, an HTTP 401 Unauthorized error is returned. In the GUI, actions that a
user's credentials do not allow are either not presented or grayed out.
A set of predefined managed object classes can be associated with domains. These classes should not have
overlapping containment. Examples of classes that support domain association are as follows:
• Layer 2 and Layer 3 network managed objects
When an object that can be associated with a domain is created, the user must assign domain(s) to the object
within the limits of the user's access rights. Domain assignment can be modified at any time.
If a virtual machine management (VMM) domain is tagged as a security domain, the users contained in the
security domain can access the correspondingly tagged VMM domain. For example, if a tenant named solar
is tagged with the security domain called sun and a VMM domain is also tagged with the security domain
called sun, then users in the solar tenant can access the VMM domain according to their access rights.
Accounting
ACI fabric accounting is handled by these two managed objects (MO) that are processed by the same mechanism
as faults and events:
• The aaaSessionLR MO tracks user account login and logout sessions on the APIC and switches, and
token refresh. The ACI fabric session alert feature stores information such as the following:
◦Username
◦IP address initiating the session
◦Type (telnet, https, REST etc.)
◦Session time and length
◦Token refresh – a user account login event generates a valid active token which is required in order
for the user account to exercise its rights in the ACI fabric.
Note Token expiration is independent of login; a user could log out but the token expires
according to the duration of the timer value it contains.
• The aaaModLR MO tracks the changes users make to objects and when the changes occurred.
• If the AAA server is not pingable, it is marked unavailable and a fault is seen.
Both the aaaSessionLR and aaaModLR event logs are stored in APIC shards. Once the data exceeds the pre-set
storage allocation size, it overwrites records on a first-in first-out basis.
Note In the event of a destructive event such as a disk crash or a fire that destroys an APIC cluster node, the
event logs are lost; event logs are not replicated across the cluster.
The aaaModLR and aaaSessionLR MOs can be queried by class or by distinguished name (DN). A class query
provides all the log records for the whole fabric. All aaaModLR records for the whole fabric are available from
the GUI at the Fabric > Inventory > POD > History > Audit Log section. The APIC GUI History > Audit
Log options enable viewing event logs for a specific object identified in the GUI.
The standard syslog, callhome, REST query, and CLI export mechanisms are fully supported for aaaModLR
and aaaSessionLR MO query data. There is no default policy to export this data.
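The two query styles can be sketched as REST URLs. This is an illustration under assumptions: the host name and the tenant DN are placeholders, and the use of the rsp-subtree-include=audit-logs option to retrieve the audit records of one object is assumed to be available in your APIC release.

```python
from urllib.parse import quote, urlencode


def class_query_url(host: str, class_name: str) -> str:
    """Class query: returns all records of the class for the whole fabric."""
    return f"https://{host}/api/node/class/{class_name}.json"


def object_audit_log_url(host: str, dn: str) -> str:
    """DN query: asks for the audit log records of a single object (assumed option)."""
    query = urlencode({"rsp-subtree-include": "audit-logs"})
    return f"https://{host}/api/node/mo/{quote(dn)}.json?{query}"
```

For example, class_query_url("apic.example.com", "aaaModLR") addresses every audit record in the fabric, while object_audit_log_url("apic.example.com", "uni/tn-solar") narrows the request to one hypothetical tenant.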
There are no pre-configured queries in the APIC that report on aggregations of data across a set of objects or
for the entire system. A fabric administrator can configure export policies that periodically export aaaModLR
and aaaSessionLR query data to a syslog server. Exported data can be archived periodically and used to
generate custom reports from portions of the system or across the entire set of system logs.
Note Modifying the "all" domain to give a user access to resources outside of that user's security domain is bad
practice. Such a user has access to resources that are provisioned for other users.
Note While an RBAC rule exposes an object to a user in a different part of the management information tree,
it is not possible to use the CLI to navigate to such an object by traversing the structure of the tree. However,
as long as the user knows the DN of the object included in the RBAC rule, the user can use the CLI to
locate it via an MO find command.
The APIC also enables administrators to grant access to users configured on externally managed authentication
Lightweight Directory Access Protocol (LDAP), RADIUS, TACACS+, or SAML servers. Users can belong
to different authentication systems and can log in simultaneously to the APIC.
In addition, OTP can be enabled for a local user. OTP is a one-time password that changes every 30 seconds.
Once OTP is enabled, the APIC generates a random, human-readable OTP key: 16 binary octets encoded in base32.
This OTP key is used to generate the OTP for the user.
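The shape of such a key can be illustrated with a short sketch. This is not the APIC's generator: the function name is hypothetical, and stripping the base32 padding is a presentation choice made here for readability.

```python
import base64
import secrets


def generate_otp_key() -> str:
    """Return 16 random octets encoded as base32 text (padding removed)."""
    raw = secrets.token_bytes(16)
    return base64.b32encode(raw).decode("ascii").rstrip("=")
```

Sixteen octets encode to 26 base32 characters drawn from A-Z and 2-7, which is what makes the key readable and easy to type into an authenticator application.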
The following figure shows how the process works for configuring an admin user in the local APIC
authentication database who has full access to the entire ACI fabric.
Note The security domain “all” represents the entire Managed Information Tree (MIT). This domain includes
all policies in the system and all nodes managed by the APIC. Tenant domains contain all the users and
managed objects of a tenant. Tenant administrators should not be granted access to the “all” domain.
The following figure shows the access that the admin user Joe Stratus has to the system.
The user Joe Stratus with read-write “admin” privileges is assigned to the domain “all” which gives him full
access to the entire system.
The following figure shows the access the admin user Jane Cirrus has to the system.
In this example, the Solar tenant administrator has full access to all the objects contained in the Solar tenant
as well as read-only access to the tenant Common. Tenant admin Jane Cirrus has full access to the tenant
Solar, including the ability to create new users in tenant Solar. Tenant users are able to modify configuration
parameters of the ACI fabric that they own and control. They also are able to read statistics and monitor faults
and events for the entities (managed objects) that apply to them such as endpoints, endpoint groups (EPGs)
and application profiles.
In the example above, the user Jane Cirrus was configured on an external RADIUS authentication server. To
configure an AV Pair on an external authentication server, add a Cisco AV Pair to the existing user record.
The Cisco AV Pair specifies the Role-Based Access Control (RBAC) roles and privileges for the user on the
APIC. The RADIUS server then propagates the user privileges to the APIC controller.
In the example above, the configuration for an open radius server (/etc/raddb/users) is as follows:
janecirrus Cleartext-Password := "<password>"
Cisco-avpair = "shell:domains = solar/admin/,common//read-all(16001)"
This example includes the following elements:
• janecirrus is the tenant administrator
• solar is the tenant
• shell:domains= - Required so that ACI reads the string correctly. This must always prepend the shell
string.
• ACI_Security_Domain_1//admin - Grants admin read only access to the tenants in this security domain.
• ACI_Security_Domain_2/admin - Grants admin write access to the tenants in this security domain.
• ACI_Security_Domain_3/read-all - Grants read-all write access to the tenants in this security domain.
Note /'s separate the security domain, write, read sections of the string. |’s separate multiple write or read roles
within the same security domain.
The first av-pair format has no UNIX user ID, while the second one does. Both are correct if all remote users
have the same role and mutual file access is acceptable. If the UNIX user ID is not explicitly specified in the
response from the remote authentication server, then some APIC software releases assign a default ID of
23999 to all users. If the response from the remote authentication server fails to specify a UNIX ID, all users
will share the same ID of 23999 and this can result in users being granted higher or lower privileges than
configured through the RBAC policies on the APIC.
The APIC supports the following regexes:
shell:domains\\s*[=:]\\s*((\\S+?/\\S*?/\\S*?)(,\\S+?/\\S*?/\\S*?){0,31})(\\(\\d+\\))$
shell:domains\\s*[=:]\\s*((\\S+?/\\S*?/\\S*?)(,\\S+?/\\S*?/\\S*?){0,31})$
Examples:
• Example 1: A Cisco AV Pair that contains a single Login domain with only writeRoles:
shell:domains=ACI_Security_Domain_1/Write_Role_1|Write_Role_2/
• Example 2: A Cisco AV Pair that contains a single Login domain with only readRoles:
shell:domains=Security_Domain_1//Read_Role_1|Read_Role_2
Note The "/" character is a separator between writeRoles and readRoles per Login domain and is required even
if only one type of role is to be used.
The Cisco AVpair string is case sensitive. Although a fault may not be seen, using mismatching cases for
the domain name or roles could lead to unexpected privileges being given.
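The AV pair layout described above (security domain, write roles, read roles, and an optional UNIX user ID in parentheses) can be checked offline with a small parser. The sketch below is a hypothetical helper, not APIC code; it follows the domain/writeRoles/readRoles structure and deliberately preserves case, because the string is case sensitive.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class DomainRoles(NamedTuple):
    domain: str
    write_roles: List[str]
    read_roles: List[str]


_PREFIX = re.compile(r"^shell:domains\s*[=:]\s*")
_UNIX_ID = re.compile(r"\((\d+)\)$")


def parse_avpair(value: str) -> Tuple[List[DomainRoles], Optional[int]]:
    """Split a shell:domains AV pair into per-domain roles and an optional UNIX ID."""
    prefix = _PREFIX.match(value)
    if prefix is None:
        raise ValueError("AV pair must start with 'shell:domains'")
    body = value[prefix.end():]

    unix_id = None
    id_match = _UNIX_ID.search(body)
    if id_match:
        unix_id = int(id_match.group(1))
        body = body[:id_match.start()]

    domains = []
    for entry in body.split(","):
        parts = entry.split("/")
        # Both "/" separators are required, even if one role list is empty.
        if len(parts) != 3 or not parts[0]:
            raise ValueError(f"expected domain/writeRoles/readRoles, got {entry!r}")
        name, write, read = parts
        domains.append(DomainRoles(
            name,
            write.split("|") if write else [],
            read.split("|") if read else [],
        ))
    return domains, unix_id
```

Applied to the RADIUS example earlier in this section, the parser yields write role admin for domain solar, read role read-all for domain common, and UNIX ID 16001.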
RADIUS
To configure users on RADIUS servers, the APIC administrator must configure the required attributes
(shell:domains) using the cisco-av-pair attribute. The default user role is network-operator.
The SNMPv3 authentication protocol options are SHA and MD5. The privacy protocol options are AES-128
and DES. If these options are not specified in the cisco-av-pair attribute, MD5 and DES are the default
authentication and privacy protocols.
For example, SNMPv3 authentication and privacy protocol attributes can be specified as follows:
snmpv3:auth=SHA priv=AES-128
Similarly, the list of domains would be as follows:
shell:domains="domainA domainB …"
TACACS+ Authentication
Terminal Access Controller Access-Control System Plus (TACACS+) is another remote AAA protocol that
is supported by Cisco devices. TACACS+ has the following advantages over RADIUS authentication:
• Provides independent AAA facilities. For example, the APIC can authorize access without authenticating.
• Uses TCP to send data between the AAA client and server, enabling reliable transfers with a
connection-oriented protocol.
• Encrypts the entire protocol payload between the switch and the AAA server to ensure higher data
confidentiality. RADIUS encrypts passwords only.
• Uses the av-pairs that are syntactically and configurationally different than RADIUS but the APIC
supports shell:domains.
Note The TACACS+ server and TACACS+ ports must be reachable by ping.
The XML example below configures the ACI fabric to work with a TACACS+ provider at IP address
10.193.208.9.
Note While the examples provided here use IPv4 addresses, IPv6 addresses could also be used.
<aaaTacacsPlusProvider name="10.193.208.9"
key="test123"
authProtocol="pap"/>
Note While the examples provided here use IPv4 addresses, IPv6 addresses could also be used.
<aaaLdapProvider name="10.30.12.128"
rootdn="CN=Manager,DC=ifc,DC=com"
basedn="DC=ifc,DC=com"
SSLValidationLevel="strict"
attribute="CiscoAVPair"
enableSSL="yes"
filter="cn=$userid"
port="636" />
Note For LDAP configurations, use CiscoAVPair as the attribute string. This avoids problems related to the
limitation LDAP servers not allowing overlapping object identifiers (OID); that is, the CiscoAVPair OID
is already in use.
Instead of configuring the Cisco AVPair, you have the option to create LDAP group maps in the APIC.
admin@ifav17-ifc1:~> id
uid=15374(admin) gid=15374(admin) groups=15374(admin)
Login Domains
A login domain defines the authentication domain for a user. Login domains can be set to the Local, LDAP,
RADIUS, or TACACS+ authentication mechanisms. When accessing the system from REST, the CLI, or the
GUI, the APIC enables the user to select the correct authentication domain.
For example, in the REST scenario, the username is prefixed with a string so that the full login username
looks as follows:
apic:<domain>\<username>
If accessing the system from the GUI, the APIC offers a drop-down list of domains for the user to select. If
no apic: domain is specified, the default authentication domain servers are used to look up the username.
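For REST clients, the domain-qualified username can be assembled before the login request is sent. The sketch below builds the JSON body for the aaaLogin request; the helper names and the example domain name are hypothetical, and the request itself (a POST to /api/aaaLogin.json on the APIC) is not performed here.

```python
import json
from typing import Optional


def login_username(username: str, domain: Optional[str] = None) -> str:
    """Prefix the username with apic:<domain>\\ when a login domain is given."""
    if domain is None:
        return username  # the default authentication domain is used
    return f"apic:{domain}\\{username}"


def login_payload(username: str, password: str, domain: Optional[str] = None) -> str:
    """Return the JSON body for an aaaLogin request."""
    attributes = {"name": login_username(username, domain), "pwd": password}
    return json.dumps({"aaaUser": {"attributes": attributes}})
```

Because the separator is a backslash, json.dumps escapes it in the serialized body; the APIC receives the single-backslash form after decoding.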
Starting in ACI version 1.0(2x), the login domain fallback of the APIC defaults to local. If the default
authentication is set to a non-local method and the console authentication method is also set to a non-local
method and both non-local methods do not automatically fall back to local authentication, the APIC can still
be accessed via local authentication.
To access the APIC fallback local authentication, use the following strings:
• From the GUI, use apic:fallback\\username.
Note Do not change the fallback login domain. Doing so could result in being locked out of the system.
About SAML
SAML is an XML-based open standard data format that enables administrators to access a defined set of Cisco
collaboration applications seamlessly after signing into one of those applications. SAML describes the exchange
of security related information between trusted business partners. It is an authentication protocol used by
service providers to authenticate a user. SAML enables exchange of security authentication information
between an Identity Provider (IdP) and a service provider.
SAML SSO uses the SAML 2.0 protocol to offer cross-domain and cross-product single sign-on for Cisco
collaboration solutions. SAML 2.0 enables SSO across Cisco applications and enables federation between
Cisco applications and an IdP. SAML 2.0 allows Cisco administrative users to access secure web domains to
exchange user authentication and authorization data, between an IdP and a Service Provider while maintaining
high security levels. The feature provides secure mechanisms to use common credentials and relevant
information across various applications.
The authorization for SAML SSO Admin access is based on Role-Based Access Control (RBAC) configured
locally on Cisco collaboration applications.
SAML SSO establishes a Circle of Trust (CoT) by exchanging metadata and certificates as part of the
provisioning process between the IdP and the Service Provider. The Service Provider trusts the IdP's user
information to provide access to the various services or applications.
Note Service providers are no longer involved in authentication. SAML 2.0 delegates authentication away from
the service providers and to the IdPs.
The client authenticates against the IdP, and the IdP grants an Assertion to the client. The client presents the
Assertion to the Service Provider. Since there is a CoT established, the Service Provider trusts the Assertion
and grants access to the client.
Enabling SAML SSO results in several advantages:
• It reduces password fatigue by removing the need for entering different user name and password
combinations.
• It transfers the authentication from your system that hosts the applications to a third party system.
Using SAML SSO, you can create a circle of trust between an IdP and a service provider. The service
provider trusts and relies on the IdP to authenticate the users.
• It protects and secures authentication information. It provides encryption functions to protect authentication
information passed between the IdP, service provider, and user. SAML SSO can also hide authentication
messages passed between the IdP and the service provider from any external user.
• It improves productivity because you spend less time re-entering credentials for the same identity.
• It reduces costs as fewer help desk calls are made for password reset, thereby leading to more savings.
• Cisco ACI VM Networking Support for Virtual Machine Managers
• VMM Domain Policy Model
• Virtual Machine Manager Domain Main Components
• Virtual Machine Manager Domains
• VMM Domain VLAN Pool Association
• VMM Domain EPG Association
• Trunk Port Group
• EPG Policy Resolution and Deployment Immediacy
• Guidelines for Deleting VMM Domains
or eliminates manual configuration and manual errors. This enables virtualized data centers to support large
numbers of VMs reliably and cost effectively.
Supported Vendors
Cisco ACI supports virtual machine managers (VMMs) from the following products and vendors:
• Cisco Application Centric Infrastructure Virtual Edge
For information, see the Cisco ACI Virtual Edge documentation on Cisco.com.
• Cisco Application Virtual Switch (AVS)
For information, see the chapter "Cisco ACI with Cisco AVS" in the Cisco ACI Virtualization Guide
and Cisco AVS documentation on Cisco.com.
• Cloud Foundry
Cloud Foundry integration with Cisco ACI is supported beginning with Cisco APIC Release 3.1(2). For
information, see the knowledge base article, Cisco ACI and Cloud Foundry Integration on Cisco.com.
• Kubernetes
For information, see the knowledge base article, Cisco ACI and Kubernetes Integration on Cisco.com.
• Microsoft System Center Virtual Machine Manager (SCVMM)
For information, see the chapters "Cisco ACI with Microsoft SCVMM" and "Cisco ACI with Microsoft
Windows Azure Pack" in the Cisco ACI Virtualization Guide.
• OpenShift
For information, see the OpenShift documentation on Cisco.com.
• OpenStack
For information, see the OpenStack documentation on Cisco.com.
• Red Hat Virtualization (RHV)
For information, see the knowledge base article, Cisco ACI and Red Hat Integration on Cisco.com.
• VMware Virtual Distributed Switch (VDS)
For information, see the chapter "Cisco ACI with VMware VDS Integration" in the Cisco ACI
Virtualization Guide.
See the Cisco ACI Virtualization Compatibility Matrix for the most current list of verified interoperable
products.
Note A single VMM domain can contain multiple instances of VM controllers, but they must
be from the same vendor (for example, from VMware or from Microsoft).
• EPG Association—Endpoint groups regulate connectivity and visibility among the endpoints within
the scope of the VMM domain policy. VMM domain EPGs behave as follows:
◦The APIC pushes these EPGs as port groups into the VM controller.
◦An EPG can span multiple VMM domains, and a VMM domain can contain multiple EPGs.
• Attachable Entity Profile Association—Associates a VMM domain with the physical network
infrastructure. An attachable entity profile (AEP) is a network interface template that enables deploying
VM controller policies on a large set of leaf switch ports. An AEP specifies which switches and ports
are available, and how they are configured.
• VLAN Pool Association—A VLAN pool specifies the VLAN IDs or ranges used for VLAN encapsulation
that the VMM domain consumes.
VMM domains contain VM controllers such as VMware vCenter or Microsoft SCVMM Manager and the
credential(s) required for the ACI API to interact with the VM controller. A VMM domain enables VM
mobility within the domain but not across domains. A single VMM domain can contain multiple instances of
VM controllers but they must be the same kind. For example, a VMM domain can contain many VMware
vCenters managing multiple controllers each running multiple VMs but it may not also contain SCVMM
Managers. A VMM domain inventories controller elements (such as pNICs, vNICs, VM names, and so forth)
and pushes policies into the controller(s), creating port groups, and other necessary elements. The ACI VMM
domain listens for controller events such as VM mobility and responds accordingly.
In the illustration above, end points (EP) of the same color are part of the same end point group. For example,
all the green EPs are in the same EPG even though they are in two different VMM domains.
Refer to the latest Verified Scalability Guide for Cisco ACI document for virtual network and VMM domain
EPG capacity information.
Note Multiple VMM domains can connect to the same leaf switch if they do not have overlapping VLAN pools
on the same port. Similarly, the same VLAN pools can be used across different domains if they do not
use the same port of a leaf switch.
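The rule in this note amounts to a two-part test: two VMM domains conflict only if they share a leaf port and their VLAN pools overlap. The sketch below is a hypothetical planning helper that applies that test; the domain names, port identifiers, and VLAN ranges are illustrative only.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import FrozenSet, List, Sequence, Tuple

VlanRange = Tuple[int, int]  # inclusive (start, end)


@dataclass(frozen=True)
class VmmDomain:
    name: str
    ports: FrozenSet[str]                # leaf ports the domain is deployed on
    vlan_ranges: Tuple[VlanRange, ...]   # VLAN pool as inclusive ranges


def _ranges_overlap(a: VlanRange, b: VlanRange) -> bool:
    return a[0] <= b[1] and b[0] <= a[1]


def find_conflicts(domains: Sequence[VmmDomain]) -> List[Tuple[str, str]]:
    """Return pairs of domains that share a port and have overlapping VLAN pools."""
    conflicts = []
    for a, b in combinations(domains, 2):
        shared_ports = a.ports & b.ports
        pools_overlap = any(
            _ranges_overlap(ra, rb) for ra in a.vlan_ranges for rb in b.vlan_ranges
        )
        if shared_ports and pools_overlap:
            conflicts.append((a.name, b.name))
    return conflicts
```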
Note By default, the APIC dynamically manages allocating a VLAN for an EPG. VMware DVS administrators
have the option to configure a specific VLAN for an EPG. In that case, the VLAN is chosen from a static
allocation block within the pool associated with the VMM domain.
Figure 92: Multiple VMM Domains and Scaling of EPGs in the Fabric
While live migration of VMs within a VMM domain is supported, live migration of VMs across VMM domains
is not supported.
Note When you configure Layer 3 Outside (L3Out) connections to external routers, or multipod connections
through an Inter-Pod Network (IPN), it is critical that the MTU be set appropriately on both sides. On
some platforms, such as ACI, Cisco NX-OS, and Cisco IOS, the configurable MTU value takes into
account packet headers (resulting in a maximum packet size of 9216 bytes for ACI and 9000 bytes for NX-OS
and IOS), whereas other platforms such as IOS-XR configure the MTU value exclusive of packet headers
(resulting in a maximum packet size of 8986 bytes).
For the appropriate MTU values for each platform, see the relevant configuration guides.
Cisco highly recommends you test the MTU using CLI-based commands. For example, on the Cisco
NX-OS CLI, use a command such as ping 1.1.1.1 df-bit packet-size 9000 source-interface
ethernet 1/1.
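The 9000-byte and 8986-byte figures in this note differ by 14 bytes, which matches the size of an untagged Ethernet header. Assuming that is the header overhead in question (the note does not say so explicitly), the conversion between the two conventions is a single subtraction; the function name below is hypothetical.

```python
ETHERNET_HEADER_BYTES = 14  # assumption: untagged Ethernet header (no 802.1Q tag)


def mtu_excluding_headers(mtu_including_headers: int,
                          header_bytes: int = ETHERNET_HEADER_BYTES) -> int:
    """Convert an MTU that counts packet headers to one that excludes them."""
    if mtu_including_headers <= header_bytes:
        raise ValueError("MTU must be larger than the header overhead")
    return mtu_including_headers - header_bytes
```

Treat this only as a way to reason about mismatches; the authoritative values for each platform are in the relevant configuration guides.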
Caution If you install 1 Gigabit Ethernet (GE) or 10GE links between the leaf and spine switches in the fabric,
there is risk of packets being dropped instead of forwarded, because of inadequate bandwidth. To avoid
the risk, use 40GE or 100GE links between the leaf and spine switches.
Note Multiple Spanning Tree (MST) is not supported on interfaces configured with the Per Port VLAN feature
(configuring multiple EPGs on a leaf switch using the same VLAN ID with localPort scope).
Note If you are using Cisco ACI Multi-Site with this Cisco APIC cluster/fabric, look for a cloud icon on the
object names in the navigation bar. This indicates that the information is derived from Multi-Site. It is
recommended to only make changes from the Multi-Site GUI. Please review the Multi-Site documentation
before making changes here.
Note For a Cisco APIC REST API query of event records, the APIC system limits the response to a maximum
of 500,000 event records. If the response is more than 500,000 events, it returns an error. Use filters to
refine your queries. For more information, see Composing Query Filter Expressions.
See the Cisco APIC Layer 4 to Layer 7 Services Deployment Guide or the Cisco ACI Virtualization Guide
for more information.
Resolution Immediacy
• Pre-provision—Specifies that a policy (for example, VLAN, VXLAN binding, contracts, or filters) is
downloaded to a leaf switch even before a VM controller is attached to the virtual switch (for example,
VMware VDS). This pre-provisions the configuration on the switch.
This helps in situations where management traffic for hypervisors or VM controllers also uses the
virtual switch associated with the APIC VMM domain (VMM switch).
Deploying a VMM policy such as a VLAN on an ACI leaf switch requires the APIC to collect CDP/LLDP
information from both the hypervisors (through the VM controller) and the ACI leaf switch. However, if the
VM controller is supposed to use the same VMM policy (VMM switch) to communicate with its hypervisors or
even the APIC, the CDP/LLDP information for the hypervisors can never be collected because the policy
required for VM controller/hypervisor management traffic is not deployed yet.
With pre-provision immediacy, the policy is downloaded to the ACI leaf switch regardless of CDP/LLDP
neighborship, even when no hypervisor host is connected to the VMM switch.
• Immediate—Specifies that EPG policies (including contracts and filters) are downloaded to the associated
leaf switch software upon ESXi host attachment to a DVS. LLDP or OpFlex permissions are used to
resolve the VM controller to leaf node attachments.
The policy is downloaded to the leaf switch when you add a host to the VMM switch. CDP/LLDP neighborship
from the host to the leaf switch is required.
• On Demand—Specifies that a policy (for example, VLAN, VXLAN bindings, contracts, or filters) is
pushed to the leaf node only when an ESXi host is attached to a DVS and a VM is placed in the port
group (EPG).
The policy is downloaded to the leaf switch when a host is added to the VMM switch and a virtual machine
is placed into a port group (EPG). CDP/LLDP neighborship from the host to the leaf switch is required.
With both immediate and on demand, if the host and leaf switch lose LLDP/CDP neighborship, the policies are
removed.
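The three resolution immediacy options can be summarized as a small decision function. The following Python sketch only restates the rules described above; it is not APIC code, and the enum, function, and parameter names are hypothetical.

```python
from enum import Enum


class ResolutionImmediacy(Enum):
    # Labels follow the option names in this section; they are not API values.
    PRE_PROVISION = "pre-provision"
    IMMEDIATE = "immediate"
    ON_DEMAND = "on-demand"


def policy_downloaded_to_leaf(mode, host_on_vmm_switch, vm_in_port_group,
                              neighborship_up):
    """Return True if, per the rules above, the policy is on the leaf switch."""
    if mode is ResolutionImmediacy.PRE_PROVISION:
        # Downloaded regardless of CDP/LLDP neighborship or host attachment.
        return True
    if not neighborship_up:
        # Immediate and on demand both need host-to-leaf CDP/LLDP neighborship.
        return False
    if mode is ResolutionImmediacy.IMMEDIATE:
        return host_on_vmm_switch
    # On demand: host attached and a VM placed in the port group (EPG).
    return host_on_vmm_switch and vm_in_port_group
```

For example, with on demand immediacy, a host that is attached to the VMM switch but has no VM in the port group does not cause the policy to be downloaded.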
Deployment Immediacy
Once the policies are downloaded to the leaf software, deployment immediacy can specify when the policy
is pushed into the hardware policy content-addressable memory (CAM).
• Immediate—Specifies that the policy is programmed in the hardware policy CAM as soon as the policy
is downloaded in the leaf software.
• On demand—Specifies that the policy is programmed in the hardware policy CAM only when the first
packet is received through the data path. This process helps to optimize the hardware space.
Note When you use on demand deployment immediacy with MAC-pinned VPCs, the EPG contracts are not
pushed to the leaf ternary content-addressable memory (TCAM) until the first endpoint is learned in the
EPG on each leaf. This can cause uneven TCAM utilization across VPC peers. (Normally, the contract
would be pushed to both peers.)
1 The VM administrator must detach all the VMs from the port groups (in the case of VMware vCenter) or
VM networks (in the case of SCVMM), created by the APIC.
In the case of Cisco AVS, the VM admin also needs to delete vmk interfaces associated with the Cisco
AVS.
2 The ACI administrator deletes the VMM domain in the APIC. The APIC triggers deletion of the VMware
VDS, Cisco AVS, or SCVMM logical switch and associated objects.
Note The VM administrator should not delete the virtual switch or associated objects (such as port groups or
VM networks); allow the APIC to trigger the virtual switch deletion upon completion of step 2 above.
EPGs could be orphaned in the APIC if the VM administrator deletes the virtual switch from the VM
controller before the VMM domain is deleted in the APIC.
If this sequence is not followed, the VM controller does not delete the virtual switch associated with the APIC
VMM domain. In this scenario, the VM administrator must manually remove the VM and VTEP associations
from the VM controller, then delete the virtual switch(es) previously associated with the APIC VMM domain.
so that traffic flows through the services. Also, the APIC can automatically configure the service according
to the application's requirements. This approach allows organizations to automate service insertion and
eliminate the challenge of managing all of the complex traffic-steering techniques of traditional service
insertion.
After the graph is configured in the APIC, the APIC automatically configures the services according to the
service function requirements that are specified in the service graph. The APIC also automatically configures
the network according to the needs of the service function that is specified in the service graph, which does
not require any change in the service device.
A service graph is represented as two or more tiers of an application with the appropriate service function
inserted between.
A service appliance (device) performs a service function within the graph. One or more service appliances
might be required to render the services required by a graph. One or more service functions can be performed
by a single-service device.
Service graphs and service functions have the following characteristics:
• Traffic sent or received by an endpoint group can be filtered based on a policy, and a subset of the traffic
can be redirected to different edges in the graph.
• Service graph edges are directional.
• Taps (hardware-based packet copy service) can be attached to different points in the service graph.
• Logical functions can be rendered on the appropriate (physical or virtual) device, based on the policy.
• The service graph supports splits and joins of edges, and it does not restrict the administrator to linear
service chains.
• Traffic can be reclassified again in the network after a service appliance emits it.
• Logical service functions can be scaled up or down or can be deployed in a cluster mode or 1:1
active-standby high-availability mode, depending on the requirements.
By using a service graph, you can install a service, such as an ASA firewall, once and deploy it multiple times
in different logical topologies. Each time the graph is deployed, ACI takes care of changing the configuration
on the firewall to enable the forwarding in the new logical topology.
Deploying a service graph requires bridge domains and VRFs, as shown in the following figure:
Note If some of the legs of a service graph are attached to endpoint groups in other tenants, when
you use the Remove Related Objects of Graph Template function in the GUI, the APIC does not remove
contracts that were imported from tenants other than where the service graph is located. The APIC also
does not clean endpoint group contracts that are located in a different tenant than the service graph. You
must manually remove these objects that are in different tenants.
In this use case, you must create two subjects. The first subject permits HTTP traffic, which then gets redirected
to the firewall. After the traffic passes through the firewall, it goes to the Web endpoint. The second subject
permits all traffic, which captures traffic that is not redirected by the first subject. This traffic goes directly
to the Web endpoint.
While these examples illustrate simple deployments, ACI PBR enables scaling up mixtures of both physical
and virtual service appliances for multiple services, such as firewalls and server load balancers.
Observe the following guidelines and limitations when planning PBR service nodes:
• For a Cold Standby active/standby deployment, configure the service nodes with the MAC address of
the active deployment. In a Cold Standby active/standby deployment, when the active node goes down,
the standby node takes over the MAC address of the active node.
• The next-hop service node IP address and virtual MAC address must be provided.
• The PBR bridge domain must have the endpoint dataplane learning disabled.
• Provision service appliances in a separate bridge domain. Starting with Cisco APIC release 3.1(x), it is
not mandatory to provision service appliances in a separate bridge domain.
• When downgrading from the Cisco APIC release 3.1(x) software, an internal check determines whether the
PBR bridge domain uses the same BD as a consumer or a provider. If it does, then the fault is disabled
during the downgrade as such a configuration is not supported in earlier Cisco APIC versions.
• The service appliance, source, and bridge domain can be in the same VRF.
• For N9K-93128TX, N9K-9396PX, N9K-9396TX, N9K-9372PX, and N9K-9372TX switches, the service
appliance must not be in the same leaf switch as either the source or destination endpoint group. For
N9K-C93180YC-EX and N9K-93108TC-EX switches, the service appliance can be in the same leaf
switch as either the source or destination endpoint group.
• The service appliance can only be in a regular bridge domain.
• The contract offered by the service appliance provider endpoint group can be configured to allow-all,
but traffic should be routed by the ACI fabric.
• Starting with Cisco APIC release 3.1(1), if you use the Cisco Nexus 9300-EX and 9300-FX platform
leaf switches, it is not necessary for you to have the endpoint dataplane learning disabled on PBR bridge
domains. During service graph deployment, the endpoint dataplane learning will be automatically disabled
only for the PBR node EPG. If you use non-EX/non-FX platform leaf switches, you must have the endpoint
dataplane learning disabled on PBR bridge domains.
• Supported PBR configurations in the same VRF instance include the following:
<vnsSvcRedirectPol name="LoadBalancer_pool">
<vnsRedirectDest name="lb1" ip="1.1.1.1" mac="00:00:11:22:33:44"/>
<vnsRedirectDest name="lb2" ip="2.2.2.2" mac="00:de:ad:be:ef:01"/>
<vnsRedirectDest name="lb3" ip="3.3.3.3" mac="00:de:ad:be:ef:02"/>
</vnsSvcRedirectPol>
<vnsLIfCtx name="external">
<vnsRsSvcRedirectPol tnVnsSvcRedirectPolName="LoadBalancer_pool"/>
<vnsRsLIfCtxToBD tDn="uni/tn-solar/bd-fwBD"/>
</vnsLIfCtx>
The following commands set the redirect policy under the device selection policy connector:
apic1(config-service)# connector external
apic1(config-connector)# svcredir-pol tenant solar name fw-external
Device script: A Python script that interacts with the device from the APIC. APIC events
are mapped to function calls that are defined in the device script. A device
package can contain multiple device scripts. A device script can interface
with the device by using REST, SSH, or any similar mechanism.
Function profile: Function parameters with default values that are specified by the vendor. You
can configure a function to use these default values.
Device-level configuration parameters: A configuration file that specifies parameters that are required
by a device. This configuration can be shared by one or more graphs using a device.
You can create a device package or it can be provided by a device vendor or Cisco.
The following figure illustrates the interaction of a device package and the APIC:
The functions in a device script are classified into the following categories:
• Device/Infrastructure—For device level configuration and monitoring
• Service Events—For configuring functions, such as a server load balancer or Secure Sockets Layer, on
the device
• Endpoint/Network Events—For handling endpoint and network attach/detach events
The APIC uses the device configuration model that is provided in the device package to pass the appropriate
configuration to the device scripts. The device script handlers interface with the device using its REST or CLI
interface.
Figure 103: How the Device Scripts Interface with a Service Device
The device package enables an administrator to automate the management of the following services:
• Device attachment and detachment
• Endpoint attachment and detachment
• Service graph rendering
• Health monitoring
• Alarms, notifications, and logging
• Counters
For more information about device packages and how to develop a device package, see the Cisco APIC Layer 4
to Layer 7 Device Package Development Guide.
The service graph template uses a specific device that is based on a device selection policy (called a logical
device context) that an administrator defines.
An administrator can set up a maximum of two concrete devices in active-standby mode.
To set up a device cluster, you must perform the following tasks:
1 Connect the concrete devices to the fabric.
2 Assign the management IP address to the device cluster.
3 Register the device cluster with the APIC. The APIC validates the device using the device specifications
from the device package.
Note The APIC does not validate a duplicate IP address that is assigned to two device clusters. The APIC can
provision the wrong device cluster when two device clusters have the same management IP address. If
you have duplicate IP addresses for your device clusters, delete the IP address configuration on one of the
devices and ensure there are no duplicate IP addresses that are provisioned for the management IP address
configuration.
A chassis manager is a physical or virtual "container" of processing resources. A chassis manager supports a
number of virtual service devices that are represented as CDev objects. A chassis manager handles networking,
while CDev handles processing. A chassis manager enables the on-demand creation of virtual processing nodes.
For a virtual device, some parts of a service (specifically the VLANs) must be applied to the chassis rather
than to the virtual machine. To accomplish this, the chassis management IP address and credentials must be
included in callouts.
The following figure illustrates a chassis manager acting as a container of processing resources:
Without a device manager or chassis manager, the model for service devices contains the following key
managed objects:
• MDev—Represents a device type (vendor, model, version).
• LDevVIP—Represents a cluster, a set of identically configured devices for Cold Standby. Contains CMgmt
and CCred for access to the device.
• CDev—Represents a member of a cluster, either physical or virtual. Contains CMgmt and CCred for access
to the device.
• VDev—Represents a context on a cluster, similar to a virtual machine on a server.
The following figure illustrates the model for the key managed objects, with CMgmt (management connectivity)
and CCred (credentials) included:
Figure 106: Managed Object Model Without a Device Manager or Chassis Manager
CMgmt (host + port) and CCred (username + password) allow the script to access the device and cluster.
A device manager and chassis manager add the ability to control the configuration of clusters and devices
from a central management station. The chassis adds a parallel hierarchy to the MDev object and ALDev object
to allow a CDev object to be tagged as belonging to a specific chassis. The following managed objects are
added to the model to support the device and chassis manager concept:
• MDevMgr—Represents a type of device manager. An MDevMgr can manage a set of different MDevs, which
are typically different products from the same vendor.
• DevMgr—Represents a device manager. Access to the manager is provided using the contained CMgmt
and CCred managed objects. Each cluster can be associated with only one DevMgr.
• MChassis—Represents a type of chassis. This managed object is typically included in the package.
• Chassis—Represents a chassis instance. It contains the CMgmt and CCred[Secret] managed objects to
provide connectivity to the chassis.
During service graph template instantiation, VLANs and VXLANs are programmed on concrete interfaces
that are based on their association with logical interfaces.
About Privileges
An administrator can grant privileges to the roles in the APIC. Privileges determine what tasks a role is allowed
to perform. Administrators can grant the following privileges to the administrator roles:
Privilege             Description
nw-svc-connectivity   • Create a management EPG
                      • Create management connectivity to other objects
nw-svc-policy         • Create a service graph
                      • Attach a service graph to an application EPG and a contract
                      • Monitor a service graph
nw-svc-device         • Create a device cluster
                      • Create a concrete device
                      • Create a device context
Note Only an infrastructure administrator can upload a device package to the APIC.
Management Tools
Cisco Application Centric Infrastructure (ACI) tools help fabric administrators, network engineers, and
developers to develop, configure, debug, and automate the deployment of tenants and applications.
• Implemented from the ground up in Python; can switch between the Python interpreter and CLI
• Plugin architecture for extensibility
• Virtual Routing and Forwarding (VRF)-based access to monitoring, operation, and configuration data
• Automation through Python commands or batch scripting
Note The ACI fabric must be configured with an active Network Time Protocol (NTP) policy
to ensure that the system clocks on all devices are correct. Otherwise, a certificate could
be rejected on nodes with out-of-sync time.
REST API
About the REST API
The Application Policy Infrastructure Controller (APIC) REST API is a programmatic interface that uses
REST architecture. The API accepts and returns HTTP (not enabled by default) or HTTPS messages that
contain JavaScript Object Notation (JSON) or Extensible Markup Language (XML) documents. You can use
any programming language to generate the messages and the JSON or XML documents that contain the API
methods or Managed Object (MO) descriptions.
The REST API is the interface into the management information tree (MIT) and allows manipulation of the
object model state. The same REST interface is used by the APIC CLI, GUI, and SDK, so that whenever
information is displayed, it is read through the REST API, and when configuration changes are made, they
are written through the REST API. The REST API also provides an interface through which other information
can be retrieved, including statistics, faults, and audit events. It even provides a means of subscribing to
push-based event notification, so that when a change occurs in the MIT, an event can be sent through a web
socket.
Standard REST methods are supported on the API, including POST, GET, and DELETE operations
through HTTP. The POST and DELETE methods are idempotent, meaning that there is no additional effect
if they are called more than once with the same input parameters. The GET method is nullipotent, meaning
that it can be called zero or more times without making any changes (or that it is a read-only operation).
Payloads to and from the REST interface can be encapsulated through either XML or JSON encoding. In the
case of XML, the encoding operation is simple: the element tag is the name of the package and class, and any
properties of that object are specified as attributes of that element. Containment is defined by creating child
elements.
For JSON, encoding requires definition of certain entities to reflect the tree-based hierarchy; however, the
definition is repeated at all levels of the tree, so it is fairly simple to implement after it is initially understood.
• All objects are described as JSON dictionaries, in which the key is the name of the package and class.
The value is another nested dictionary with two keys: attributes and children.
• The attributes key contains a further nested dictionary describing key-value pairs that define attributes
on the object.
• The children key contains a list that defines all the child objects. The children in this list are dictionaries
containing any nested objects, which are defined as described here.
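The nested structure described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the key names attributes and children; the helper name encode_mo is hypothetical, and the tenant and bridge domain names are sample values only.

```python
import json


def encode_mo(class_name, attributes, children=()):
    """Encode one managed object as {class: {"attributes": ..., "children": [...]}}."""
    return {
        class_name: {
            "attributes": dict(attributes),
            # Each child is a (class_name, attributes[, children]) tuple and is
            # encoded with the same rule, which repeats at every tree level.
            "children": [encode_mo(*child) for child in children],
        }
    }


# Sample values: tenant "solar" containing one bridge domain.
tenant = encode_mo("fvTenant", {"name": "solar"},
                   children=[("fvBD", {"name": "fwBD"})])
print(json.dumps(tenant, indent=2))
```

Because the same definition repeats at every level, one recursive function is enough to encode a subtree of any depth.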
Authentication
REST API username- and password-based authentication uses a special subset of request Uniform Resource
Identifiers (URIs), including aaaLogin, aaaLogout, and aaaRefresh as the DN targets of a POST operation.
Their payloads are simple XML or JSON documents containing the MO representation of an aaaUser
object, with the attributes name and pwd defining the username and password: for example, <aaaUser
name='admin' pwd='password'/>. The response to the POST operation contains an authentication token
as both a Set-Cookie header and an attribute named token on the aaaLogin object in the response, for which
the XPath is /imdata/aaaLogin/@token if the encoding is XML. Subsequent operations on the REST API
can use this token value as a cookie named APIC-cookie to authenticate future requests.
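The login exchange can be illustrated without a live controller. The following Python sketch builds the aaaUser payload, reads the token attribute from an XML response, and forms the cookie header; the function names are hypothetical, and the sample response is a made-up, shortened document (a real response carries more attributes).

```python
import xml.etree.ElementTree as ET


def build_login_payload(username, password):
    """Return the XML body for a POST to aaaLogin."""
    user = ET.Element("aaaUser", {"name": username, "pwd": password})
    return ET.tostring(user, encoding="unicode")


def extract_token(response_xml):
    """Read /imdata/aaaLogin/@token from an aaaLogin response."""
    root = ET.fromstring(response_xml)  # root element is <imdata>
    login = root.find("aaaLogin")
    if login is None or login.get("token") is None:
        raise ValueError("no aaaLogin token in response")
    return login.get("token")


def auth_cookie_header(token):
    """Build the Cookie header for subsequent requests."""
    return {"Cookie": f"APIC-cookie={token}"}


# Made-up response for illustration only.
sample_response = '<imdata><aaaLogin token="EXAMPLE-TOKEN"/></imdata>'
payload = build_login_payload("admin", "password")
headers = auth_cookie_header(extract_token(sample_response))
```

An HTTP client would send payload in the POST to aaaLogin and attach headers to later requests.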
Subscription
The REST API supports the subscription to one or more MOs during your active API session. When any MO
is created, changed, or deleted because of a user- or system-initiated action, an event is generated. If the event
changes the data on any of the active subscribed queries, the APIC will send out a notification to the API
client that created the subscription.
API Inspector
The API Inspector provides a real-time display of REST API commands that the APIC processes to perform
GUI interactions. The figure below shows REST API commands that the API Inspector displays upon navigating
to the main tenant section of the GUI.
See the following figure for an example of how an administrator can use the MIM to research an object in the
MIT.
Every MO in the system can be identified by a unique distinguished name (DN). This approach allows the
object to be referred to globally. In addition to its distinguished name, each object can be referred to by its
relative name (RN). The relative name identifies an object relative to its parent object. Any given object's
distinguished name is derived from its own relative name that is appended to its parent object's distinguished
name.
A DN is a sequence of relative names that uniquely identifies an object:
dn = {rn}/{rn}/{rn}/{rn}
dn = "sys/ch/lcslot-1/lc/leafport-1"
Distinguished names are directly mapped to URLs. Either the relative name or the distinguished name can be
used to access an object, depending on the current location in the MIT.
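The relationship between relative names and distinguished names can be expressed as simple string composition. The following Python sketch uses hypothetical helper names and only joins names; it does not attempt to parse a DN back into its relative names.

```python
def build_dn(relative_names):
    """Join relative names, root first, into a distinguished name."""
    return "/".join(relative_names)


def child_dn(parent_dn, rn):
    """A child's DN is its own RN appended to the parent's DN."""
    return f"{parent_dn}/{rn}"


dn = build_dn(["sys", "ch", "lcslot-1", "lc", "leafport-1"])
print(dn)
```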
Because of the hierarchical nature of the tree and the attribute system used to identify object classes, the tree
can be queried in several ways for obtaining managed object information. Queries can be performed on an
object itself through its distinguished name, on a class of objects such as a switch chassis, or on a tree-level
to discover all members of an object.
Tree-Level Queries
The following figure shows two chassis that are queried at the tree level.
Both queries return the referenced object and its child objects. This approach is useful for discovering the
components of a larger system. In this example, the query discovers the cards and ports of a given switch
chassis.
Class-Level Queries
The following figure shows the second query type: the class-level query.
Class-level queries return all the objects of a given class. This approach is useful for discovering all the objects
of a certain type that are available in the MIT. In this example, the class used is Cards, which returns all the
objects of type Cards.
Object-Level Queries
The third query type is an object-level query. In an object-level query a distinguished name is used to return
a specific object. The figure below shows two object-level queries: one for Node 1 in Chassis 2, and one for
Node 1 in Chassis 1 in Card 1 in Port 2.
For all MIT queries, an administrator can optionally return the entire subtree or a partial subtree. Additionally,
the role-based access control (RBAC) mechanism in the system dictates which objects are returned; only the
objects that the user has rights to view will ever be returned.
Managed-Object Properties
Managed objects in the Cisco ACI contain properties that define the managed object. Properties in a managed
object are divided into chunks that are managed by processes in the operating system. Any object can have
several processes that access it. All these properties together are compiled at runtime and are presented to the
user as a single object. The following figure shows an example of this relationship.
The example object has three processes that write to property chunks that are in the object. The data management
engine (DME), which is the interface between the Cisco APIC (the user) and the object, the port manager,
which handles port configuration, and the spanning tree protocol (STP) all interact with chunks of this object.
The APIC presents the object to the user as a single entity compiled at runtime.
• method: Optional indication of the method being invoked on the object; applies only to HTTP POST
requests
• xml | json: Encoding format
• options: Query options, filters, and arguments
With the capability to address and access an individual object or a class of objects with the REST URL, an
administrator can achieve complete programmatic access to the entire object tree and to the entire system.
The following are REST query examples:
• Find all EPGs and their faults under tenant solar.
http://192.168.10.1:7580/api/mo/uni/tn-solar.xml?query-target=subtree&target-subtree-class=fvAEPg&rsp-subtree-include=faults
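A query URL of this form can be assembled programmatically. The following Python sketch rebuilds the example above from its parts; the function name build_query_url and its underscore-to-hyphen keyword convention are assumptions made for this illustration, not part of the API.

```python
from urllib.parse import urlencode


def build_query_url(base, dn, encoding="xml", **options):
    """Build an object-level (api/mo) query URL for the given DN."""
    # Python keyword names cannot contain "-", so "_" is mapped to "-".
    params = {key.replace("_", "-"): value for key, value in options.items()}
    url = f"{base}/api/mo/{dn}.{encoding}"
    return f"{url}?{urlencode(params)}" if params else url


url = build_query_url(
    "http://192.168.10.1:7580",
    "uni/tn-solar",
    query_target="subtree",
    target_subtree_class="fvAEPg",
    rsp_subtree_include="faults",
)
print(url)
```

Keeping the DN, the encoding, and the options as separate arguments makes it straightforward to reuse the same function for other objects or to switch between XML and JSON.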
Configuration Export/Import
All APIC policies and configuration data can be exported to create backups. This is configurable via an export
policy that allows either scheduled or immediate backups to a remote server. Scheduled backups can be
configured to execute periodic or recurring backup jobs. By default, all policies and tenants are backed up,
but the administrator can optionally specify only a specific subtree of the management information tree.
Backups can be imported into the APIC through an import policy, which allows the system to be restored to
a previous configuration.
One or more shards are located on each APIC appliance. The shard data assignments are based on a
predetermined hash function, and a static shard layout determines the assignment of shards to appliances.
The APIC uses a 16 to 32 character passphrase to generate the AES-256 keys. The APIC GUI displays a hash
of the AES passphrase. This hash can be used to see if the same passphrase was used on two ACI fabrics.
This hash can be copied to a client computer where it can be compared to the passphrase hash of another ACI
fabric to see if they were generated with the same passphrase. The hash cannot be used to reconstruct the
original passphrase or the AES-256 keys.
Observe the following guidelines when working with encrypted configuration files:
• Backward compatibility is supported for importing old ACI configurations into ACI fabrics that use the
AES encryption configuration option.
Note Reverse compatibility is not supported; configurations exported from ACI fabrics that
have enabled AES encryption cannot be imported into older versions of the APIC
software.
• Always enable AES encryption when performing fabric backup configuration exports. Doing so
ensures that all the secure properties of the configuration will be successfully imported when restoring
the fabric.
Note If a fabric backup configuration is exported without AES encryption enabled, none of
the secure properties will be included in the export. Since such an unencrypted backup
would not include any of the secure properties, it is possible that importing such a file
to restore a system could result in the administrator along with all users of the fabric
being locked out of the system.
• The AES passphrase that generates the encryption keys cannot be recovered or read by an ACI
administrator or any other user. The AES passphrase is not stored. The APIC uses the AES passphrase
to generate the AES keys, then discards the passphrase. The AES keys are not exported. The AES keys
cannot be recovered since they are not exported and cannot be retrieved via the REST API.
• The same AES-256 passphrase always generates the same AES-256 keys. Configuration export files
can be imported into other ACI fabrics that use the same AES passphrase.
• For troubleshooting purposes, export a configuration file that does not contain the encrypted data of the
secure properties. Temporarily turning off encryption before performing the configuration export removes
the values of all secure properties from the exported configuration. To import such a configuration file
that has all secure properties removed, use the import merge mode; do not use the import replace mode.
Using the import merge mode will preserve the existing secure properties in the ACI fabric.
• By default, the APIC rejects configuration imports of files that contain fields that cannot be decrypted.
Use caution when turning off this setting. Performing a configuration import inappropriately when this
default setting is turned off could result in all the passwords of the ACI fabric being removed upon the
import of a configuration file that does not match the AES encryption settings of the fabric.
Note Failure to observe this guideline could result in all users, including fabric administrators,
being locked out of the system.
Configuration Export
The following figure shows how the process works for configuring an export policy.
Configuration Import
An administrator can create an import policy that performs the import in one of the following two modes:
• Best-effort—ignores objects within a shard that cannot be imported. If the version of the incoming
configuration is incompatible with the existing system, shards that are incompatible are not imported
while the import proceeds with those that can be imported.
• Atomic—ignores shards that contain objects that cannot be imported while proceeding with shards that
can be imported. If the version of the incoming configuration is incompatible with the existing system,
the import terminates.
• Best-effort Merge—imported configuration is merged with existing configuration but ignores objects
that cannot be imported.
• Atomic Merge—imported configuration is merged with the existing configuration, but ignores shards
that contain objects that cannot be imported.
• Atomic Replace—overwrites existing configuration with imported configuration data. Any objects in
the existing configuration that do not exist in the imported configuration are deleted. Objects are deleted
from the existing configuration that have children in the existing configuration but do not have children
in the incoming imported configuration. For example, if an existing configuration has two tenants, solar
and wind, but the imported backed up configuration was saved before the tenant wind was created, tenant
solar is restored from the backup but tenant wind is deleted.
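The difference between merge and replace can be illustrated with the solar and wind tenants from the example above. The following Python sketch models each configuration as a plain dictionary of top-level objects; the function name is hypothetical, and the assumption that an imported object overwrites an existing object of the same name during a merge is a simplification made for this illustration.

```python
def import_configuration(existing, imported, mode):
    """Return the resulting top-level objects after a 'merge' or 'replace' import."""
    if mode == "replace":
        # Only what is in the backup survives; everything else is deleted.
        return dict(imported)
    if mode == "merge":
        # Assumption: imported objects overwrite same-named ones; the rest are kept.
        return {**existing, **imported}
    raise ValueError(f"unknown mode: {mode}")


existing = {"solar": "current", "wind": "current"}
backup = {"solar": "from-backup"}  # saved before tenant wind was created

print(import_configuration(existing, backup, "replace"))
print(import_configuration(existing, backup, "merge"))
```

With replace, tenant wind disappears because it is absent from the backup; with merge, it is preserved.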
The following figure shows how the process works for configuring an import policy.
• The policy is untriggered (it is available but has not been activated).
Note The maximum number of statistics export policies is approximately equal to the number of tenants. Each
tenant can have multiple statistics export policies and multiple tenants can share the same export policy,
but the total number of policies is limited to approximately the number of tenants.
An administrator can configure policy details such as the transfer protocol, compression algorithm, and
frequency of transfer. Policies can be configured by users who are authenticated using AAA. A security
mechanism for the actual transfer is based on a username and password. Internally, a policy element handles
the triggering of data.
Note For information about faults, events, errors, and system messages, see the Cisco APIC Faults, Events, and
System Messages Management Guide and the Cisco APIC Management Information Model Reference, a
Web-based application.
The APIC maintains a comprehensive, current run-time representation of the administrative and operational
state of the ACI Fabric system in the form of a collection of MOs. The system generates faults, errors, events,
and audit log data according to the run-time state of the system and the policies that the system and user create
to manage these processes.
The APIC GUI enables you to create customized "historical record groups" of fabric switches, to which you
can then assign customized switch policies that specify customized size and retention periods for the audit
logs, event logs, health logs, and fault logs maintained for the switches in those groups.
The APIC GUI also enables you to customize a global controller policy that specifies size and retention periods
for the audit logs, event logs, health logs, and fault logs maintained for the controllers on this fabric.
Faults
Based on the run-time state of the system, the APIC automatically detects anomalies and creates fault objects
to represent them. Fault objects contain various properties that are meant to help users diagnose the issue,
assess its impact and provide a remedy.
For example, if the system detects a problem associated with a port, such as a high parity-error rate, a fault
object is automatically created and placed in the management information tree (MIT) as a child of the port
object. If the same condition is detected multiple times, no additional instances of the fault object are created.
After the condition that triggered the fault is remedied, the fault object is preserved for a period of time
specified in a fault life-cycle policy and is finally deleted. See the following figure.
A life cycle represents the current state of the issue. It starts in the soak time when the issue is first detected,
and it changes to raised and remains in that state if the issue is still present. When the condition is cleared, it
moves to a state called "raised-clearing" in which the condition is still considered as potentially present. Then
it moves to a "clearing time" and finally to "retaining". At this point, the issue is considered to be resolved
and the fault object is retained only to provide the user visibility into recently resolved issues.
Each time that a life-cycle transition occurs, the system automatically creates a fault record object to log it.
Fault records are never modified after they are created and they are deleted only when their number exceeds
the maximum value specified in the fault retention policy.
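The life cycle described above can be sketched as a small state machine. This is a minimal illustration, not the APIC implementation: the event names, the `Fault` class, and the `max_records` limit are hypothetical, the "clearing time" is modeled as a timer event, and evicting the oldest record first is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum


class Lifecycle(Enum):
    SOAKING = "soaking"
    RAISED = "raised"
    RAISED_CLEARING = "raised-clearing"
    RETAINING = "retaining"


# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    (Lifecycle.SOAKING, "soak_timer_expired"): Lifecycle.RAISED,
    (Lifecycle.RAISED, "condition_cleared"): Lifecycle.RAISED_CLEARING,
    (Lifecycle.RAISED_CLEARING, "clearing_timer_expired"): Lifecycle.RETAINING,
}


@dataclass
class Fault:
    code: str
    state: Lifecycle = Lifecycle.SOAKING
    records: list = field(default_factory=list)
    max_records: int = 100  # hypothetical retention limit

    def handle(self, event: str) -> None:
        new_state = TRANSITIONS.get((self.state, event))
        if new_state is None:
            return  # the event does not apply in the current state
        # One immutable record (a tuple) is logged per life-cycle transition.
        self.records.append((self.state.value, new_state.value))
        # Assumption: when the limit is exceeded, the oldest record goes first.
        while len(self.records) > self.max_records:
            self.records.pop(0)
        self.state = new_state


fault = Fault("F0123")
for event in ("soak_timer_expired", "condition_cleared", "clearing_timer_expired"):
    fault.handle(event)
print(fault.state.value, len(fault.records))
```

Storing records as tuples mirrors the rule that fault records are never modified after they are created.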
The severity is an estimate of the impact of the condition on the capability of the system to provide service.
Possible values are warning, minor, major and critical. A fault with a severity equal to warning indicates a
potential issue (including, for example, an incomplete or inconsistent configuration) that is not currently
affecting any deployed service. Minor and major faults indicate that there is potential degradation in the service
being provided. Critical means that a major outage is severely degrading a service or impairing it altogether.
Description contains a human-readable description of the issue that is meant to provide additional information
and help in troubleshooting.
Events
Event records are objects that are created by the system to log the occurrence of a specific condition that might
be of interest to the user. They contain the fully qualified domain name (FQDN) of the affected object, a
timestamp and a description of the condition. Examples include link-state transitions, starting and stopping
of protocols, and detection of new hardware components. Event records are never modified after creation and
are deleted only when their number exceeds the maximum value specified in the event retention policy.
The following figure shows the process for fault and events reporting.
Errors
APIC error messages typically display in the APIC GUI and the APIC CLI. These error messages are specific
to the action that a user is performing or the object that a user is configuring or administering. These messages
can be the following:
• Informational messages that provide assistance and tips about the action being performed
• Warning messages that provide information about system errors related to an object, such as a user
account or service profile, that the user is configuring or administering
• Finite state machine (FSM) status messages that provide information about the status of an FSM stage
Many error messages contain one or more variables. The information that the APIC uses to replace these
variables depends upon the context of the message. Some messages can be generated by more than one type
of error.
Audit Logs
Audit records are objects that are created by the system to log user-initiated actions, such as login/logout and
configuration changes. They contain the name of the user who is performing the action, a timestamp, a
description of the action and, if applicable, the FQDN of the affected object. Audit records are never modified
after creation and are deleted only when their number exceeds the maximum value specified in the audit
retention policy.
Policies define what statistics are gathered, at what intervals, and what actions to take. For example, a policy
could raise a fault on an EPG if a threshold of dropped packets on an ingress VLAN is greater than 1000 per
second.
Statistics data are gathered from a variety of sources, including interfaces, VLANs, EPGs, application profiles,
ACL rules, tenants, or internal APIC processes. Statistics accumulate data in 5-minute, 15-minute, 1-hour,
1-day, 1-week, 1-month, 1-quarter, or 1-year sampling intervals. Shorter duration intervals feed longer intervals.
A variety of statistics properties are available, including last value, cumulative, periodic, rate of change, trend,
maximum, minimum, and average. Collection and retention times are configurable. Policies can specify if the statistics
are to be gathered from the current state of the system or to be accumulated historically or both. For example,
a policy could specify that historical statistics be gathered for 5-minute intervals over a period of 1 hour. The
1 hour is a moving window. Once an hour has elapsed, the incoming 5 minutes of statistics are added, and
the earliest 5 minutes of data are abandoned.
Note The maximum number of 5-minute granularity sample records is limited to 12 samples (one hour of
statistics). All other sample intervals are limited to 1,000 sample records. For example, hourly granularity
statistics can be maintained for up to 41 days. Statistics will not be maintained for longer than these limits.
To gather statistics for longer durations, create an export policy.
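The moving window and the record limits described above can be illustrated with a short sketch. The sample values are made up; only the 12-sample and 1,000-record limits come from the text.

```python
from collections import deque

SAMPLES_PER_HOUR = 12  # twelve 5-minute samples make up the 1-hour window

window = deque(maxlen=SAMPLES_PER_HOUR)
for sample in range(1, 15):  # 14 hypothetical 5-minute samples
    window.append(sample)    # once the window is full, the earliest sample is dropped

oldest, newest = window[0], window[-1]

# Retention implied by the 1,000-record limit for hourly samples.
MAX_RECORDS = 1000
hourly_retention_days = MAX_RECORDS / 24

print(len(window), oldest, newest, int(hourly_retention_days))
```

After 14 samples the window still holds 12 entries, and 1,000 hourly records cover a little more than 41 days, which matches the limit stated in the note.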
• History data
• Current data
The MO names corresponding to these objects start with a two-letter prefix: HD or CD. HD indicates history
data while CD indicates current data. For example, "CDl2IngrBytesAg15min." The MO name is also an
indicator of the time interval for which the data is collected. For example, "CDl2IngrBytesAg15min" indicates
that the MO corresponds to 15-minute intervals.
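As a rough illustration of this naming convention, the following sketch splits an MO name into its prefix, counter name, and interval. The function name and the regular expression are hypothetical, and only the minute-based suffixes that appear in this section are handled.

```python
import re

PREFIXES = {"CD": "current data", "HD": "history data"}

# Two-letter prefix, counter name, then a minute-based interval such as "15min".
NAME_RE = re.compile(r"^(CD|HD)(\w+?)(\d+min)$")


def parse_stats_mo_name(name):
    """Split a statistics MO name into kind, counter, and interval."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized statistics MO name: {name}")
    prefix, counter, interval = match.groups()
    return {"kind": PREFIXES[prefix], "counter": counter, "interval": interval}


print(parse_stats_mo_name("CDl2IngrBytesAg15min"))
```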
A CD object holds currently running data, and the values that the object holds change as time passes. However,
at the end of a given time interval, the data collected in a CD object is copied to an HD object and the CD
object attributes are reset to 0. For example, at the end of a given 15-minute interval, the data in the
CDl2IngrBytesAg15min object is moved to the HDl2IngrBytesAg15min object and the CDl2IngrBytesAg15min
object is reset.
If you closely observe the data in a CD...15min object for more than 15 minutes, you will notice that the value goes
to 0, is incremented twice, and then goes to 0 again. This is because the values are updated every 5
minutes. The third update (at the end of 15 minutes) goes unnoticed because the data is rolled up to the HD object
and the CD object is reset as soon as that update occurs.
CD...15min objects are updated every 5 minutes and CD...5min objects are updated every 10 seconds.
CD...15min objects are rolled up as HD...15min objects and CD...5min are rolled up as HD...5min objects.
The data that any CD object holds is dynamic and for all practical purposes it must be considered to be internal
data. HD data objects can be used for any further analytical purposes and can be considered to be published
or static data.
The HD objects are also rolled up as time passes. For example, three consecutive HD...5min data objects
contribute to one HD...15min object. The length of time that one HD...5min object resides in the system is
decided by the statistic collection policies.
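The roll-up behavior of a CD...15min object can be simulated with a few lines of code. This is a simplified sketch under stated assumptions: the increments are made-up values, and the real objects carry many more attributes than a single counter.

```python
UPDATES_PER_INTERVAL = 3  # a 15-minute interval receives three 5-minute updates


def simulate(increments):
    """Return the CD values seen after each update and the resulting HD values."""
    cd_value = 0
    observed, history = [], []
    for count, delta in enumerate(increments, start=1):
        cd_value += delta
        if count % UPDATES_PER_INTERVAL == 0:
            history.append(cd_value)  # data is copied to the HD object
            cd_value = 0              # the CD object is reset
        observed.append(cd_value)
    return observed, history


observed, history = simulate([100, 120, 80, 90, 110, 95])
print(observed)  # [100, 220, 0, 90, 200, 0]
print(history)   # [300, 295]
```

An observer of the CD object never sees the full 15-minute total, because the third update and the reset happen together. The total is visible only in the HD object.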
The APIC includes the following four classes of default monitoring policies:
• monCommonPol (uni/fabric/moncommon): applies to both fabric and access infrastructure hierarchies
• monFabricPol (uni/fabric/monfab-default): applies to fabric hierarchies
• monInfraPol (uni/infra/moninfra-default): applies to the access infrastructure hierarchy
• monEPGPol (uni/tn-common/monepg-default): applies to tenant hierarchies
In each of the four classes of monitoring policies, the default policy can be overridden by a specific policy.
For example, a monitoring policy applied to the Solar tenant (tn-solar) would override the default one for the
Solar tenant while other tenants would still be monitored by the default policy.
Each of the four objects in the figure below contains monitoring targets.
The Infra monitoring policy contains monInfra targets, the fabric monitoring policy contains monFab targets,
and the tenant monitoring policy contains monEPG targets. Each of the targets represents the corresponding
class of objects in this hierarchy. For example, under the monInfra-default monitoring policy, there is a
target representing FEX fabric-facing ports. The policy details regarding how to monitor these FEX fabric-facing
ports are contained in this target. Only policies applicable to a target are allowed under that target. Note that
not all possible targets are auto-created by default. The administrator can add more targets under a policy if
the target is not there.
The common monitoring policy (monCommonPol ) has global fabric-wide scope and is automatically deployed
on all nodes in the fabric, including the APIC controllers. Any source (such as syslog, callhome, or snmp)
located under the common monitoring policy captures all faults, events, audits and health occurrences. The
single common monitoring policy monitors the whole fabric. The threshold of the severity for syslog and
snmp or urgency for callhome can be configured according to the level of detail that a fabric administrator
determines is appropriate.
Multiple monitoring policies can be used to monitor individual parts of the fabric independently. For example,
a source under the global monitoring policy reflects a global view. Another source under a custom monitoring
policy deployed only to some nodes could closely monitor their power supplies. Or, specific fault or event
occurrences for different tenants could be redirected to notify specific operators.
Sources located under other monitoring policies capture faults, events, and audits within a smaller scope. A
source located directly under a monitoring policy captures all occurrences within the scope (for example,
fabric or infra). A source located under a target captures all occurrences related to that target (for example,
eqpt:Psu for power supply). A source located under a fault/event severity assignment policy captures only
the occurrences that match that particular fault or event as identified by the fault/event code.
When a fault/event/audit is generated, all applicable sources are used. For example, consider the following
configuration:
• Syslog source 4, pointing to syslog group 4 is defined for fault F0123.
• Syslog source 3, pointing to syslog group 3 is defined for target power supply (eqpt:Psu).
• Syslog source 2, pointing to syslog group 2 is defined for scope infra.
• Syslog source 1, pointing to syslog group 1 is defined for the common monitoring policy.
If fault F0123 occurs on an MO of class eqpt:Psu in scope infra, a syslog message is sent to all the destinations
in syslog groups 1-4, assuming the severity of the message is at or above the minimum defined for each source
and destination. While this example illustrates a syslog configuration, callhome and SNMP configurations
would operate in the same way.
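The selection of applicable sources in this example can be sketched as follows. The `SyslogSource` class and the matching function are hypothetical, a value of None is treated as "matches anything", and severity filtering is left out for brevity. The fault code F9999 is a made-up value used only for contrast.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SyslogSource:
    group: int
    scope: Optional[str] = None       # None: no restriction on scope
    target: Optional[str] = None      # None: no restriction on MO class
    fault_code: Optional[str] = None  # None: no restriction on fault code


SOURCES = [
    SyslogSource(group=1),                      # common monitoring policy
    SyslogSource(group=2, scope="infra"),       # scope infra
    SyslogSource(group=3, target="eqpt:Psu"),   # target power supply
    SyslogSource(group=4, fault_code="F0123"),  # fault F0123
]


def matching_groups(scope, mo_class, fault_code):
    """Return the syslog groups of every source that applies to the occurrence."""
    groups = []
    for source in SOURCES:
        if source.scope not in (None, scope):
            continue
        if source.target not in (None, mo_class):
            continue
        if source.fault_code not in (None, fault_code):
            continue
        groups.append(source.group)
    return sorted(groups)


print(matching_groups("infra", "eqpt:Psu", "F0123"))  # [1, 2, 3, 4]
print(matching_groups("infra", "eqpt:Psu", "F9999"))  # [1, 2, 3]
```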
The following figure shows how the process works for configuring a fabric monitoring policy for statistics.
The APIC applies this monitoring policy as shown in the following figure:
Monitoring policies can also be configured for other system operations, such as faults or health scores. The
structure of monitoring policies maps to this hierarchy:
Monitoring Policy
• Statistics Export
• Collection Rules
• Monitoring Targets
  ◦ Statistics Export
  ◦ Collection Rules
  ◦ Statistics
    ◦ Collection Rules
      ◦ Thresholds Rules
    ◦ Statistics Export
The Statistics Export policies option in the following figure defines the format and destination for statistics to be
exported. The output can be exported using FTP, HTTP, or SCP protocols. The format can be JSON or XML.
The user or administrator can also choose to compress the output. Export can be defined under Statistics,
Monitoring Targets, or under the top-level monitoring policy. The higher-level definition of Statistics Export
takes precedence unless there is a defined lower-level policy.
As shown in the figure below, monitoring policies are applied to specific observable objects (such as ports,
cards, EPGs, and tenants) or groups of observable objects by using selectors or relations.
As shown in the figure below, Collection Rules are defined per sampling interval.
They configure whether the collection of statistics should be turned on or off, and when turned on, what the
history retention period should be. Monitoring Targets correspond to observable objects (such as ports and
EPGs).
Collection Rules can be defined under Statistics, Monitoring Targets, or under the top-level Monitoring Policy.
The higher-level definition of Collection Rules takes precedence unless there is a defined lower-level policy.
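The precedence rule can be read as "the most specific definition that exists wins". The following sketch shows that lookup. The level names, the dictionary layout, and the retention values are hypothetical.

```python
# Levels ordered from most specific to least specific.
LEVELS = ("statistics", "monitoring_target", "monitoring_policy")


def effective_rule(rules):
    """Return (level, rule) for the most specific level that defines a rule."""
    for level in LEVELS:
        if level in rules:
            return level, rules[level]
    return None


# Only the top-level policy defines a rule, so it applies.
only_top = {"monitoring_policy": {"admin_state": "enabled", "retention": "1d"}}

# A rule under Statistics overrides the top-level definition.
with_override = {
    "monitoring_policy": {"admin_state": "enabled", "retention": "1d"},
    "statistics": {"admin_state": "enabled", "retention": "1h"},
}

print(effective_rule(only_top))
print(effective_rule(with_override))
```

The same lookup order would apply to Statistics Export, which follows the same precedence rule.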
As shown in the figure below, threshold rules are defined under collection rules and would be applied to the
corresponding sampling-interval that is defined in the parent collection rule.
Tetration Analytics
About Cisco Tetration Analytics Agent Installation
The Cisco Tetration Analytics agent installation is accomplished by downloading the RPM Package Manager
(RPM) file from the Cisco Tetration Analytics cluster and uploading it to the APIC. The Cisco Tetration Analytics
cluster sends a notification to the switch whenever a later version of the Cisco Tetration Analytics agent is
uploaded.
There are two possible scenarios regarding the installation of the image on the switch:
• The Cisco Tetration Analytics image is not installed on the switch: the switch receives a notification
from APIC, downloads and installs the Cisco Tetration Analytics agent image on the container on the
switch.
• The Cisco Tetration Analytics image is installed on the switch and the switch receives a notification
from the APIC. The switch checks if the APIC version is higher than that of the agent image already
installed. If the version is higher, the switch downloads and installs the latest Cisco Tetration Analytics
image on the container on the switch.
The image is installed in persistent memory. On reboot, after receiving controller notification from APIC, the
switch starts the Cisco Tetration Analytics agent irrespective of the image that is available on APIC.
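The two installation scenarios reduce to a simple decision, sketched below. The function names and the dotted version strings are hypothetical. The actual version format used by the agent image is not described here.

```python
def parse_version(version):
    """Turn a dotted version string such as '2.1.3' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def should_install(installed_version, apic_version):
    """Decide whether the switch downloads and installs the image from the APIC."""
    if installed_version is None:
        return True  # scenario 1: no agent image is installed on the switch
    # Scenario 2: install only if the version on the APIC is higher.
    return parse_version(apic_version) > parse_version(installed_version)


print(should_install(None, "2.1.3"))      # True
print(should_install("2.1.3", "2.1.10"))  # True
print(should_install("2.1.3", "2.1.3"))   # False
```

Comparing tuples of integers rather than raw strings avoids ranking "2.1.10" below "2.1.3".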
NetFlow
About NetFlow
The NetFlow technology provides the metering base for a key set of applications, including network traffic
accounting, usage-based network billing, network planning, as well as denial-of-service monitoring, network
monitoring, outbound marketing, and data mining for both service providers and enterprise customers. Cisco
provides a set of NetFlow applications to collect NetFlow export data, perform data volume reduction, perform
post-processing, and provide end-user applications with easy access to NetFlow data. If you have enabled
NetFlow monitoring of the traffic flowing through your datacenters, this feature enables you to perform the
same level of monitoring of the traffic flowing through the Cisco Application Centric Infrastructure (Cisco
ACI) fabric.
Instead of hardware directly exporting the records to a collector, the records are processed in the supervisor
engine and are exported to standard NetFlow collectors in the required format.
For information about configuring NetFlow with virtual machine networking, see the Cisco ACI Virtualization
Guide.
Note NetFlow is only supported on EX switches. See the Cisco NX-OS Release Notes for Cisco Nexus 9000
Series ACI-Mode Switches document for the release that you have installed for a list of the supported EX
switches.
• The ICMP checksum is part of the Layer 4 src port in the flow record, so for ICMP records, many flow
entries will be created if this is not masked. The same is true for other non-TCP/UDP packets.
Stateless apps are inserted in the APIC UI as an IFRAME. In this type of application, app-specific state is
stored on the APIC for that app. The app queries the APIC using its northbound REST APIs and retrieves
information from the APIC. In a stateless app, no state is maintained between two invocations of the app.
Some common examples of stateless apps include the following:
• Data visualization apps that gather data available from querying the APIC and that can present them in
a visual format.
• L4-L7 vendor specific configuration apps.
backend component. This front-end component is inserted in the APIC UI as an IFRAME, in the same way
as a stateless app. If a stateful app is developed without a front-end, then it is installed using REST APIs.
Some common examples of stateful apps include the following:
• Visualization apps that can plot graphs of historical data for a specific time interval.
• Alert apps that can send alerts based on certain events that are not supported natively in the APIC.
• Monitoring apps that can track the APIC’s events, faults, and statistics and analyze them to detect anomalies.
• Apps that sync data between the APIC and a third-party vendor.
Troubleshooting
The ACI fabric provides extensive troubleshooting and monitoring tools as shown in the following figure.
A traceroute that is initiated from the tenant endpoints shows the default gateway as an intermediate hop
that appears at the ingress leaf switch.
Traceroute modes include from endpoint to endpoint, and from leaf to leaf (TEP to TEP). Traceroute discovers
all paths across the fabric, points of exit for external endpoints, and helps to detect if any path is blocked.
Traceroute works with IPv6 source and destination addresses but configuring source and destination addresses
across IPv4 and IPv6 addresses is not allowed. Source (RsTrEpIpSrc) and destination (RsTrEpIpDst) relations
support source and destination of type fvIp. At times, multiple IP addresses are learned from the same endpoint.
The administrator chooses the desired source and destination addresses.
Atomic Counters
Atomic counters detect drops and misrouting in the fabric. The resulting statistics enable quick debugging
and isolation of application connectivity issues. Atomic counters require an active fabric Network Time
Protocol (NTP) policy. Atomic counters work for either IPv6 or IPv4 source and destination addresses but
not across different address families.
For example, an administrator can enable atomic counters on all leaf switches to trace packets from endpoint
1 to endpoint 2. If any leaf switches have nonzero counters, other than the source and destination leaf switches,
an administrator can drill down to those leaf switches.
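The drill-down step in this example amounts to filtering the counters by leaf. A minimal sketch follows. The leaf names and packet counts are made up, and the function is hypothetical rather than an APIC API.

```python
def leaves_to_inspect(counters, source_leaf, destination_leaf):
    """Return leaf switches with nonzero counters, other than source and destination."""
    expected = {source_leaf, destination_leaf}
    return sorted(
        leaf for leaf, packets in counters.items()
        if packets != 0 and leaf not in expected
    )


# Hypothetical per-leaf packet counts for traffic from endpoint 1 to endpoint 2.
counters = {"leaf101": 5000, "leaf102": 4990, "leaf103": 10, "leaf104": 0}

print(leaves_to_inspect(counters, "leaf101", "leaf102"))  # ['leaf103']
```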
In conventional settings, it is nearly impossible to monitor the amount of traffic from a baremetal NIC to a
specific IP address (an endpoint) or to any IP address. Atomic counters allow an administrator to count the
number of packets that are received from a baremetal endpoint without any interference to its data path. In
addition, atomic counters can monitor per-protocol traffic that is sent to and from an endpoint or an application
group.
Leaf-to-leaf (TEP to TEP) atomic counters can provide the following:
• Counts of drops, admits, and excess packets.
• Short-term data collection such as the last 30 seconds, and long-term data collection such as 5 minutes,
15 minutes, or more.
• A breakdown of per-spine traffic is available only when the number of TEPs, leaf or VPC, is less than
64.
• Ongoing monitoring.
Note Leaf-to-leaf (TEP to TEP) atomic counters are cumulative and cannot be cleared. However, because
30-second atomic counters reset at 30-second intervals, they can be used to isolate intermittent or recurring
problems.
Note Use of atomic counters is not supported when the endpoints are in different tenants or in different Virtual
Routing and Forwarding (VRF) instances (also known as contexts or private networks) within the same
tenant. Atomic counters work for IPv6 source and destinations but configuring source and destination IP
addresses across IPv4 and IPv6 addresses is not allowed.
Health Scores
The ACI fabric uses a policy model to combine data into a health score. Health scores can be aggregated for
a variety of areas such as for the system, infrastructure, tenants, applications, or services.
ACI fabric health information is available for the following views of the system:
• System — aggregation of system-wide health, including pod health scores, tenant health scores, system
fault counts by domain and type, and the APIC cluster health state.
• Pod — aggregation of health scores for a pod (a group of spine and leaf switches), and pod-wide fault
counts by domain and type.
• Tenant — aggregation of health scores for a tenant, including performance data for objects such as
applications and EPGs that are specific to a tenant, and tenant-wide fault counts by domain and type.
• Managed Object — health score policies for managed objects (MOs), which includes their dependent
and related MOs. These policies can be customized by an administrator.
The pod health scores are based on the leaf and spine switch health scores as well as the number of endpoints
learned on the leaf switches. The GUI fabric pod dashboard screen also displays pod-wide fault counts by
domain and type.
The system and pod health scores are calculated the same way. The calculation is based on the weighted
average of the leaf health scores divided by the total number of learned endpoints of the leaf switches, times
the spine coefficient, which is derived from the number of spines and their health scores.
The following equation shows how this calculation is done.
For example, an EPG could be using ports of two leaf switches. Each leaf switch would contain a deployed
EPG component. The number of learned endpoints is a weighting factor. Each port could have a different
number of learned endpoints. So the EPG health score would be derived by summing the health score of each
EPG component times its number of learned endpoints on that leaf, divided by the total number of learned
endpoints across the leaf switches the EPG uses.
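The EPG example above is a weighted average and can be sketched directly. The function name and input values are illustrative, and returning None when no endpoints are learned is an assumption.

```python
def epg_health_score(components):
    """Weighted average of per-leaf EPG component health scores.

    components: list of (health_score, learned_endpoints) tuples, one per leaf.
    """
    total_endpoints = sum(endpoints for _, endpoints in components)
    if total_endpoints == 0:
        return None  # assumption: no learned endpoints means no score
    weighted_sum = sum(score * endpoints for score, endpoints in components)
    return weighted_sum / total_endpoints


# Two leaf switches: health 90 with 30 endpoints, health 60 with 10 endpoints.
score = epg_health_score([(90, 30), (60, 10)])
print(score)  # (90*30 + 60*10) / 40 = 82.5
```

The leaf with more learned endpoints pulls the EPG score toward its own health score.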
MO Health Scores
Each managed object (MO) belongs to a health score category. By default, the health score category of an
MO is the same as its MO class name.
Each health score category is assigned an impact level. The five health score impact levels are Maximum,
High, Medium, Low, and None. For example, the default impact level of fabric ports is Maximum and the
default impact level of leaf ports is High. Certain categories of child MOs can be excluded from the health
score calculations of their parent MO by assigning a health score impact of None. These impact levels between
objects are user configurable. However, if the default impact level is None, the administrator cannot override
it.
The impact levels correspond to the following factors:
• Maximum: 100%
• High: 80%
• Medium: 50%
• Low: 20%
• None: 0%
The category health score is calculated using an Lp-Norm formula. The health score penalty equals 100 minus
the health score. The health score penalty represents the overall health score penalties of a set of MOs that
belong to a given category and are children or direct relatives of the MO for which a health score is being
calculated.
The health score category of an MO class can be changed by using a policy. For example, the default health
score category of a leaf port is eqpt:LeafP and the default health score category of fabric ports is eqpt:FabP.
However, a policy that includes both leaf ports and fabric ports can be made to be part of the same category
called ports.
In the following figure, a hardware fault impacts the health score of an application component.
Multinode SPAN
The APIC traffic monitoring policies can apply SPAN policies at the appropriate places to keep track of all the members
of each application group and where they are connected. If any member moves, the APIC automatically pushes
the policy to the new leaf. For example, when an endpoint VMotions to a new leaf, the SPAN configuration
automatically adjusts.
Refer to the IETF Internet Draft at the following URL for descriptions of ERSPAN headers:
https://tools.ietf.org/html/draft-foschiano-erspan-00.
The ACI fabric supports the following two extensions of encapsulated remote SPAN (ERSPAN) formats:
• Access or tenant SPAN—done for leaf switch front panel ports with or without using VLAN as a filter.
The Broadcom Trident 2 ASIC in the leaf switches supports a slightly different version of the ERSPAN
Type 1 format. It differs from the ERSPAN Type 1 format defined in the document referenced above
in that the GRE header is only 4 bytes and there is no sequence field. The GRE header is always encoded
with the following value: 0x000088be. Even though 0x88be indicates ERSPAN Type 2, the remaining 2
bytes of the fields identify this as an ERSPAN Type 1 packet with a GRE header of 4 bytes.
• Fabric SPAN—done in leaf switches by the Northstar ASIC or by the Alpine ASIC in the spine switches.
While these ASICs support ERSPAN Type 2 and 3 formats, the ACI fabric currently only supports
ERSPAN Type 2 for fabric SPAN, as documented in the base-line document referenced above.
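When inspecting captured access or tenant SPAN traffic, the 4-byte GRE header described above can be decoded as in the following sketch. The function name is hypothetical. The position of the sequence-number flag (0x1000) follows the standard GRE header layout, and the sample bytes after the header are placeholder payload.

```python
import struct

ERSPAN_PROTOCOL = 0x88BE
GRE_SEQUENCE_FLAG = 0x1000  # S bit in the first 16 bits of a GRE header


def parse_gre_header(packet):
    """Decode the first 4 bytes of a GRE header (flags/version and protocol type)."""
    if len(packet) < 4:
        raise ValueError("GRE header requires at least 4 bytes")
    flags_version, protocol = struct.unpack("!HH", packet[:4])
    return {
        "flags_version": flags_version,
        "protocol": protocol,
        "has_sequence": bool(flags_version & GRE_SEQUENCE_FLAG),
    }


header = parse_gre_header(bytes.fromhex("000088be") + b"payload")
print(hex(header["protocol"]), header["has_sequence"])  # 0x88be False
```

With all flag bits set to zero, no sequence field follows, which is consistent with the 4-byte header that the text attributes to this ERSPAN Type 1 variant.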
About SNMP
The Cisco Application Centric Infrastructure (ACI) provides extensive SNMPv1, v2, and v3 support, including
Management Information Bases (MIBs) and notifications (traps). The SNMP standard allows any third-party
applications that support the different MIBs to manage and monitor the ACI fabric.
SNMPv3 provides extended security. Each SNMPv3 device can be selectively enabled or disabled for SNMP
service. In addition, each device can be configured with a method of handling SNMPv1 and v2 requests.
For more information about using SNMP, see the Cisco ACI MIB Quick Reference.
About Syslog
During operation, a fault or event in the Cisco Application Centric Infrastructure (ACI) system can trigger
the sending of a system log (syslog) message to the console, to a local file, and to a logging server on another
system. A system log message typically contains a subset of information about the fault or event. A system
log message can also contain audit log and session log entries.
Note For a list of syslog messages that the APIC and the fabric nodes can generate, see
http://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/sw/1-x/syslog/guide/aci_syslog/ACI_SysMsg.html.
Many system log messages are specific to the action that a user is performing or the object that a user is
configuring or administering. These messages can be the following:
• Informational messages, providing assistance and tips about the action being performed
• Warning messages, providing information about system errors related to an object, such as a user account
or service profile, that the user is configuring or administering
In order to receive and monitor system log messages, you must specify a syslog destination, which can be the
console, a local file, or one or more remote hosts running a syslog server. In addition, you can specify the
minimum severity level of messages to be displayed on the console or captured by the file or host. The local
file for receiving syslog messages is /var/log/external/messages.
A syslog source can be any object for which an object monitoring policy can be applied. You can specify the
minimum severity level of messages to be sent, the items to be included in the syslog messages, and the syslog
destination.
You can change the display format of syslog messages to the NX-OS style format.
Additional details about the faults or events that generate these system messages are described in the Cisco
APIC Faults, Events, and System Messages Management Guide, and system log messages are listed in the
Cisco ACI System Messages Reference Guide.
Note Not all system log messages indicate problems with your system. Some messages are purely informational,
while others may help diagnose problems with communications lines, internal hardware, or the system
software.
Note For a list of Troubleshooting Wizard CLI commands, see the Cisco APIC Command-Line Interface User
Guide.
Related Topics
Getting Started with the Troubleshooting Wizard
Topology in the Troubleshooting Wizard
Label Matching
Label matching is used to determine which consumer and provider EPGs can communicate. The contract subjects
of a given provider or consumer of that contract determine whether consumers and providers can communicate.
The match type algorithm is determined by the matchT attribute that can have one of the following values:
• All
• AtLeastOne (default)
• None
• AtmostOne
When both EPG and contract subject labels exist, label matching is done first for EPGs, then for contract
subjects.
When checking for a match of provider labels, vzProvLbl, and consumer labels, vzConsLbl, the matchT is
determined by the provider EPG.
When checking for a match of provider or consumer subject labels, vzProvSubjLbl, vzConsSubjLbl, in EPGs
that have a subject, the matchT is determined by the subject.
The matchT logic is the same for EPG and contract subject labels. The following table shows simple
examples of all the EPG and contract subject provider and consumer match types and their results. In this
table, a [ ] entry indicates no labels (NULL).
All [] [] match
All [] [] no match
AtLeastOne [] [] match
None [] [] match
AtmostOne [] [] match
In this example all four EPGs share the same contract, but two of them are in one Virtual Routing and
Forwarding (VRF) instance (also known as a context or private network) and two of them in the other VRF.
The contract is applied only between EPG1 and EPG2, and then separately between EPG3 and EPG4. The
contract is limited to whatever the scope is, which in this case is the VRF.
The same thing applies if the scope = application profile. If two application profiles have EPGs and if
the scope = application profile, then the contract is enforced on EPGs in their application profiles.
One contract is for web-to-app communication, which has a scope of application profile. The app-to-db
contract has a scope of VRF. The app1 and app2 applications profiles are in the same VRF. Each application
profile contains EPGs.
Because the scope of the contract app-to-db is enforced at the VRF level, and both application profiles belong
to the same VRF, all consumers of the app-to-db contract are allowed to communicate with its provider EPGs.
• EPG-app1-db can communicate bi-directionally with EPG-app1-app
• EPG-app2-db can communicate bi-directionally with EPG-app2-app
• EPG-app1-db can communicate bi-directionally with EPG-app2-app
• EPG-app2-db can communicate bi-directionally with EPG-app1-app
For the next pairs of EPGs, the web-to-app contract with a scope of application-profile allows only the
providers and consumers of the contract to communicate within that application profile.
• EPG-app1-app can communicate with EPG-app1-web
• EPG-app2-app can communicate with EPG-app2-web
Unlike those above, the app and web EPGs cannot communicate outside of their application profiles.
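The effect of contract scope in this example can be reproduced with a small sketch. The `Epg` class, the helper functions, and the VRF name "vrf1" are hypothetical. The EPG names follow the example, and the check ignores filters, subjects, and labels.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Epg:
    name: str
    app_profile: str
    vrf: str


def in_scope(scope, a, b):
    """Return True if two EPGs fall within the same contract scope."""
    if scope == "vrf":
        return a.vrf == b.vrf
    if scope == "application-profile":
        return a.vrf == b.vrf and a.app_profile == b.app_profile
    raise ValueError(f"unsupported scope: {scope}")


def allowed_pairs(scope, side_a, side_b):
    """List the EPG name pairs that the contract allows to communicate."""
    return [(a.name, b.name) for a, b in product(side_a, side_b) if in_scope(scope, a, b)]


VRF = "vrf1"  # hypothetical name; both application profiles are in the same VRF
profiles = ("app1", "app2")
web = [Epg(f"EPG-{p}-web", p, VRF) for p in profiles]
app = [Epg(f"EPG-{p}-app", p, VRF) for p in profiles]
db = [Epg(f"EPG-{p}-db", p, VRF) for p in profiles]

app_to_db = allowed_pairs("vrf", db, app)                    # four pairs
web_to_app = allowed_pairs("application-profile", app, web)  # two pairs
print(app_to_db)
print(web_to_app)
```

With the VRF scope, every db EPG is paired with every app EPG. With the application-profile scope, only EPGs in the same profile are paired.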
Secure Properties
The table below lists the secure properties of managed objects that include a password field property type.
pki:WebTokenData initializationVector
pki:WebTokenData key
pki:CsyncSharedKey key
pki:CertReq pwd
mcp:Inst key
mcp:InstPol key
sysdebug:BackupBehavior pwd
stats:Dest userPasswd
firmware:CcoSource password
firmware:InternalSource password
firmware:OSource password
firmware:Source password
bgp:PeerDef password
bgp:Peer password
bgp:APeerP password
bgp:PeerP password
bfd:AuthP key
comp:UsrAccP pwd
comp:Ctrlr pwd
aaa:LdapProvider key
aaa:LdapProvider monitoringPassword
aaa:UserData pwdHistory
aaa:TacacsPlusProvider monitoringPassword
aaa:AProvider key
aaa:AProvider monitoringPassword
aaa:RadiusProvider key
aaa:RadiusProvider monitoringPassword
aaa:User pwd
aaa:ChangePassword newPassword
aaa:ChangePassword oldPassword
ospf:AuthP key
ospf:IfP authKey
ospf:AIfP authKey
ospf:IfDef authKey
file:RemotePath userPasswd
file:ARemotePath userPasswd
vmm:UsrAccP pwd
snmp:UserSecP authKey
snmp:UserSecP privKey
snmp:UserP authKey
snmp:UserP privKey
snmp:AUserP authKey
snmp:AUserP privKey
vns:VOspfVEncapAsc authKey
vns:SvcPkgSource password
vns:CCredSecret value
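One way to use the table above is to mask these properties before managed-object data is logged or exported. The following is a minimal sketch under stated assumptions: the redact helper, the MASK value, and the dictionary layout are hypothetical, and only a few rows of the table are included.

```python
# A few rows copied from the secure properties table: class -> property names.
SECURE_PROPERTIES = {
    "aaa:User": {"pwd"},
    "aaa:LdapProvider": {"key", "monitoringPassword"},
    "bgp:PeerP": {"password"},
    "snmp:UserP": {"authKey", "privKey"},
}

MASK = "********"  # hypothetical placeholder value


def redact(class_name: str, attributes: dict) -> dict:
    """Return a copy of attributes with any secure properties masked."""
    secure = SECURE_PROPERTIES.get(class_name, set())
    return {k: (MASK if k in secure else v) for k, v in attributes.items()}


user = {"name": "admin", "pwd": "s3cret"}
print(redact("aaa:User", user))
```

Classes that are not in the mapping pass through unchanged, so the mapping would need to cover every row of the table before a helper like this could be relied on.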
fabric:PortBlk
fabric:ProtGEp
fabric:ProtPol
fabric:SFPortS
fabric:SpCardP
fabric:SpCardPGrp
fabric:SpCardS
fabric:SpNodePGrp
fabric:SpPortP
fabric:SpPortPGrp
fc:DomP
fc:FabricPol
fc:IfPol
fc:InstPol
file:RemotePath
fvns:McastAddrInstP
fvns:VlanInstP
fvns:VsanInstP
fvns:VxlanInstP
infra:AccBaseGrp
infra:AccBndlGrp
infra:AccBndlPolGrp
infra:AccBndlSubgrp
infra:AccCardP
infra:AccCardPGrp
infra:AccNodePGrp
infra:AccPortGrp
infra:AccPortP
infra:AttEntityP
infra:CardS
infra:ConnFexBlk
infra:ConnFexS
infra:ConnNodeS
infra:DomP
infra:FexBlk
infra:FexBndlGrp
infra:FexGrp
infra:FexP
infra:FuncP
infra:HConnPortS
infra:HPathS
infra:HPortS
infra:LeafS
infra:NodeBlk
infra:NodeGrp
infra:NodeP
infra:OLeafS
infra:OSpineS
infra:PodBlk
infra:PodGrp
infra:PodP
infra:PodS
infra:PolGrp
infra:PortBlk
infra:PortP
infra:PortS
infra:PortTrackPol
infra:Profile
infra:SHPathS
infra:SHPortS
infra:SpAccGrp
infra:SpAccPortGrp
infra:SpAccPortP
infra:SpineP
infra:SpineS
isis:DomPol
l2ext:DomP
l2:IfPol
l2:InstPol
l2:PortSecurityPol
l3ext:DomP
lacp:IfPol
lacp:LagPol
lldp:IfPol
lldp:InstPol
mcp:IfPol
mcp:InstPol
mgmt:NodeGrp
mgmt:PodGrp
mon:FabricPol
mon:InfraPol
phys:DomP
psu:InstPol
qos:DppPol
snmp:Pol
span:Dest
span:DestGrp
span:SpanProv
span:SrcGrp
span:SrcTargetShadow
span:SrcTargetShadowBD
span:SrcTargetShadowCtx
span:TaskParam
span:VDest
span:VDestGrp
span:VSpanProv
span:VSrcGrp
stormctrl:IfPol
stp:IfPol
stp:InstPol
stp:MstDomPol
stp:MstRegionPol
trig:SchedP
vmm:DomP
vpc:InstPol
vpc:KAPol
ACI Terminology
Cisco ACI Term | Industry Standard Term (Approximation) | Description
Alias | Alias | A changeable name for a given object. While the name of an object, once created, cannot be changed, the Alias is a field that can be changed.
Atomic Counters | Atomic Counters | Atomic counters allow you to gather statistics about traffic between flows. Using atomic counters, you can detect drops and misrouting in the fabric, enabling quick debugging and isolation of application connectivity issues. For example, an administrator can enable atomic counters on all leaf switches to trace packets from endpoint 1 to endpoint 2. If any leaf switches have nonzero counters, other than the source and destination leaf switches, an administrator can drill down to those leaf switches.
Border Leaf Switches | Border Leaf Switches | Border leaf switches in the fabric are connected to the external routed network and carry L3 Out external connections to an external EPG (L3extInst).
Bridge Domain | Bridge Domain | A bridge domain is a set of logical ports that share the same flooding or broadcast characteristics. Like a virtual LAN (VLAN), bridge domains span multiple devices.
Contract | Approximation of Access Control List (ACL) | The rules that specify what and how communication in a network is allowed. In Cisco ACI, contracts specify how communications between EPGs take place. Contract scope can be limited to the EPGs in an application profile, a tenant, a VRF, or the entire fabric.
Endpoint Group (EPG) | Endpoint Group | A logical entity that contains a collection of physical or virtual network endpoints. In Cisco ACI, endpoints are devices connected to the network directly or indirectly. They have an address (identity), a location, attributes (e.g., version, patch level), and can be physical or virtual. Endpoint examples include servers, virtual machines, storage, or clients on the Internet.