Telemetry vs. SNMP: Is one better for network management?
By Terry Slattery
Simple Network Management Protocol, or SNMP, and telemetry operate with quite different mechanisms. When weighing telemetry vs. SNMP, do
those distinctions make one better than the other?
SNMP has been in use for network management since 1990 and is widely supported by both network devices and monitoring platforms. Device
performance data is collected through a polling mechanism and returned to the management platform. There are three versions of SNMP, with
SNMPv3 adding important authentication and encryption features.
SNMP uses a simple protocol that requests data identified by one or more object identifiers (OIDs) in a GetRequest, GetNextRequest or GetBulkRequest
packet. Data is returned in response packets. The OIDs are structured in a management information base (MIB). It is easy to perform ad hoc data
collection as needed. Asynchronous events can be communicated back to the management system via SNMP traps or via syslog. Data is transported
via User Datagram Protocol (UDP), which requires only minimal overhead on both the network device and the management system.
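The request types above can be illustrated with a small in-memory simulation. This is only a sketch, not a real SNMP stack: the MIB dictionary, its values and the helper names (get, get_next, walk) are hypothetical, and no packets are sent. The OIDs themselves are the standard MIB-II system and interfaces objects.

```python
from bisect import bisect_right

# Toy agent data: OIDs as integer tuples mapped to made-up values.
MIB = {
    (1, 3, 6, 1, 2, 1, 1, 1, 0): "Example Router",  # sysDescr.0
    (1, 3, 6, 1, 2, 1, 1, 3, 0): 123456,            # sysUpTime.0
    (1, 3, 6, 1, 2, 1, 1, 5, 0): "edge-1",          # sysName.0
    (1, 3, 6, 1, 2, 1, 2, 1, 0): 2,                 # ifNumber.0
}
# Python compares tuples element by element, which matches
# the lexicographic ordering of OIDs.
SORTED_OIDS = sorted(MIB)


def get(oid):
    """GetRequest: return the value stored at exactly this OID, or None."""
    return MIB.get(oid)


def get_next(oid):
    """GetNextRequest: return the first (oid, value) strictly after `oid`."""
    i = bisect_right(SORTED_OIDS, oid)
    if i == len(SORTED_OIDS):
        return None  # nothing left in this toy MIB
    nxt = SORTED_OIDS[i]
    return nxt, MIB[nxt]


def walk(prefix):
    """Repeat GetNext until the returned OID leaves the subtree `prefix`."""
    results = []
    current = prefix
    while True:
        step = get_next(current)
        if step is None or step[0][:len(prefix)] != prefix:
            break
        results.append(step)
        current = step[0]
    return results
```

Calling walk((1, 3, 6, 1, 2, 1, 1)) returns the three system-group entries and stops when GetNext reaches ifNumber.0, which is how an ad hoc collection of one subtree typically proceeds.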
SNMP's polling architecture also has a downside. The management system needs to create and send data requests to each device, only to repeat the
process a few minutes later. There is also a processing cost. The lexicographical ordering of OIDs in the MIB differs from the way interface performance
data is stored, so the device's CPU has to do more processing to handle the polling requests.
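The ordering mismatch can be seen in a short sketch. It assumes, purely for illustration, that a device keeps one record per interface; the interfaces dictionary and function names are hypothetical. The column numbers are the standard MIB-II ifTable columns for ifInOctets (10) and ifOutOctets (16).

```python
IF_ENTRY = (1, 3, 6, 1, 2, 1, 2, 2, 1)  # ifEntry in MIB-II
COLUMNS = {"ifInOctets": 10, "ifOutOctets": 16}

# Assumed internal storage: one record per interface (row-oriented).
interfaces = {
    1: {"ifInOctets": 1000, "ifOutOctets": 2000},
    2: {"ifInOctets": 3000, "ifOutOctets": 4000},
    3: {"ifInOctets": 5000, "ifOutOctets": 6000},
}


def storage_order():
    """Order in which the assumed device stores counters: row by row."""
    return [(name, idx)
            for idx, row in sorted(interfaces.items())
            for name in row]


def walk_order():
    """Order in which a GetNext walk returns them: sorted by full OID."""
    entries = []
    for idx, row in interfaces.items():
        for name in row:
            oid = IF_ENTRY + (COLUMNS[name], idx)  # column first, then ifIndex
            entries.append((oid, name, idx))
    entries.sort()
    return [(name, idx) for _, name, idx in entries]
```

Because the column number precedes the interface index in the OID, a walk returns ifInOctets for every interface before any ifOutOctets value, so the agent has to jump between interface records rather than read each one once.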
A vendor-independent MIB, named MIB-II, provides a general set of operational variables across a wide range of devices. Vendors can augment
MIB-II with custom MIBs, and some network management systems take advantage of this additional data source.
Streaming network telemetry is a relatively new mechanism that uses a push model to continuously send high-resolution device operational data to a
network management system. It sends data at a higher rate and with lower impact on the network devices than other methods, like SNMP or the
command-line interface (CLI). Data is selected by configuring either a periodic cadence, which can be subsecond, or an event trigger, such as a
threshold breach (e.g., high errors) or a status change (e.g., interface state change).
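The two selection mechanisms, periodic cadence and event trigger, can be sketched as a single decision function. This is a minimal illustration, not any vendor's implementation: the Subscription fields, the should_push name and the order in which triggers are checked are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Subscription:
    interval_s: float     # periodic cadence; may be less than one second
    error_threshold: int  # push at once when the error count exceeds this


def should_push(sub, now, last_push, errors, prev_state, state):
    """Return the reason to push a sample now, or None to stay quiet."""
    if state != prev_state:
        return "state-change"   # event trigger: interface went up or down
    if errors > sub.error_threshold:
        return "threshold"      # event trigger: threshold breach
    if now - last_push >= sub.interval_s:
        return "periodic"       # cadence timer expired
    return None
```

With Subscription(interval_s=0.5, error_threshold=100), a sample taken 0.2 seconds after the last push is suppressed unless the interface state changed or the error count passed 100.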
The data is encoded as XML, JSON or Google Protocol Buffers. Either UDP or TCP transport can be used, frequently in conjunction with gRPC,
Google's remote procedure call framework, which runs over HTTP/2 on TCP and supports encryption. gRPC enables a collector to dynamically request a data stream from a network device. It can be
used to establish new data streams or to poll for data that rarely changes.
Model-driven telemetry, meanwhile, is based on YANG (Yet Another Next Generation) models and simplifies the selection of the data to stream. The
OpenConfig working group is creating standardized models that can be applied across groups of network devices. In addition, Google, through its
gRPC Network Management Interface (gNMI) initiative, is attempting to define a standard that governs how telemetry can be used to retrieve network
state data.
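Model-driven data is selected by path through the YANG tree. The sketch below splits a path written in the common string form used with gNMI and OpenConfig into named elements and their keys. The parse_path helper is hypothetical and deliberately simplified: it assumes key values contain no "/" or "]" characters, which real interface names sometimes do.

```python
import re

# One path element: a name followed by zero or more [key=value] selectors.
_ELEM = re.compile(r"^(?P<name>[^\[\]]+)(?P<keys>(\[[^=\]]+=[^\]]*\])*)$")
_KEY = re.compile(r"\[([^=\]]+)=([^\]]*)\]")


def parse_path(path):
    """Split a path string into a list of (element name, keys dict) pairs."""
    elems = []
    for part in path.strip("/").split("/"):
        m = _ELEM.match(part)
        if m is None:
            raise ValueError(f"bad path element: {part!r}")
        keys = dict(_KEY.findall(m.group("keys")))
        elems.append((m.group("name"), keys))
    return elems
```

For example, parse_path("/interfaces/interface[name=eth0]/state/counters/in-octets") yields five elements, with the second carrying the key name=eth0. The same path applies to any device that implements the model, which is the portability that standardized models aim for.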
The volume of data that can be streamed from even a moderately sized network can be huge, requiring big data storage and processing mechanisms.
Network managers have to determine the cadence or event triggers for streaming each type of data so they don't overwhelm the processing capabilities
of the network management system in question.
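A back-of-the-envelope estimate shows how quickly the cadence drives volume. Every number in this sketch is an assumption chosen for illustration, including the 16 bytes per encoded sample; real sizes depend on the encoding and on the data selected.

```python
SECONDS_PER_DAY = 86_400


def daily_bytes(devices, interfaces_per_device, counters_per_interface,
                bytes_per_sample, interval_s):
    """Estimate bytes streamed per day for a fixed periodic cadence."""
    samples_per_day = SECONDS_PER_DAY / interval_s
    per_cycle = (devices * interfaces_per_device
                 * counters_per_interface * bytes_per_sample)
    return per_cycle * samples_per_day


# Assumed network: 500 devices, 48 interfaces each, 20 counters per interface.
ten_second = daily_bytes(500, 48, 20, 16, interval_s=10)
one_second = daily_bytes(500, 48, 20, 16, interval_s=1)
print(f"10 s cadence: {ten_second / 1e9:.1f} GB/day")
print(f" 1 s cadence: {one_second / 1e9:.1f} GB/day")
```

Under these assumptions, a 10-second cadence produces about 66 GB per day and a 1-second cadence about 664 GB per day, before any storage overhead. Volume scales inversely with the interval, so the cadence chosen for each data type is the main lever.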
SNMP works best for retrieving relatively static data, such as inventory or neighboring devices. Its polling mechanism makes collecting
high-volume, high-resolution performance data a challenge.
Note: Several network management products exist that can collect a full suite of performance variables from more than 1 million interfaces on
thousands of devices every minute from one server. Clearly, a good implementation is critical to good performance.
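Part of the work in a polling collector is turning cumulative counters into rates. The sketch below computes a per-second rate from two successive polls of a counter that wraps at 2^32, as 32-bit SNMP counters do. The counter_rate name is hypothetical, and the calculation assumes at most one wrap and no counter reset between the two polls.

```python
def counter_rate(prev, curr, interval_s, bits=32):
    """Per-second rate from two samples of a wrapping cumulative counter."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    modulus = 1 << bits
    # Modular subtraction absorbs a single wrap between the two polls.
    delta = (curr - prev) % modulus
    return delta / interval_s
```

For instance, counter_rate(1000, 7000, 60) gives 100.0 units per second. If the counter wraps more than once between polls the result is silently too low, which is one reason a long polling interval limits the resolution of performance data on fast interfaces.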
SNMP is useful for networks equipped with significant numbers of older devices that don't support telemetry. It is also good for collecting
nonperformance data, such as routing peers, bridge domain neighbors, Network Time Protocol peers and device inventory information -- i.e., serial
numbers, modules and slot locations. Finally, the protocol's use of UDP eliminates the need to allocate large receive buffers, enabling management
servers to more efficiently allocate internal memory.
Streaming telemetry is better for collecting high-resolution performance data, such as high-speed network interface statistics. It's becoming more
practical as more device and network management vendors begin to support the methodology.
In addition, newer RPC mechanisms make telemetry more efficient than SNMP or CLI in obtaining data from network devices, making telemetry the
obvious choice going forward. Keep in mind, however, that telemetry collectors that rely on TCP connections may use a significant amount of
memory for receive buffers, depending on the implementation. Moreover, the large number of YANG models for each vendor can make it difficult to
analyze streaming data.
For networks that contain a mix of old and new network devices, a combination of SNMP and telemetry will be best. A switch to telemetry is possible
when all network devices within an organization support it.
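A mixed deployment can be sketched as a per-device, per-data-type choice. The policy below is one possible reading of the guidance in this article, not a prescribed rule: it assumes telemetry is used only for performance data on devices that support it, and that SNMP handles everything else. The function names and device names are hypothetical.

```python
def choose_method(supports_telemetry, data_kind):
    """Pick a collection method for one device and one kind of data."""
    if data_kind == "performance" and supports_telemetry:
        return "telemetry"
    return "snmp"  # older devices, and relatively static data


def plan_collection(devices, data_kinds=("performance", "inventory")):
    """Map each device name to its collection method per data kind."""
    return {
        name: {kind: choose_method(supports, kind) for kind in data_kinds}
        for name, supports in devices.items()
    }


def all_support_telemetry(devices):
    """True once every device could stream its performance data."""
    return all(devices.values())
```

Given {"core-1": True, "legacy-1": False}, the plan streams performance data from core-1 and polls legacy-1 with SNMP, and all_support_telemetry returns False until the legacy device is replaced or upgraded.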
However you weigh telemetry vs. SNMP as data collection methods, network management is essentially a big data problem. The
management system needs to process large volumes of data to identify anomalies and alert the network operations team to problems. The OpenConfig
and gNMI initiatives are working to simplify data collection and analysis.
09 Jun 2020
All Rights Reserved, Copyright 2000 - 2020, TechTarget