
Large Language Model-Driven Cross-Domain Orchestration Using Multi-Agent Workflow

Xiaonan Xu1, Haoshuo Chen1, Jesse E. Simsarian1, Roland Ryf1, Nicolas K. Fontaine1,
Mikael Mazur1, Lauren Dallachiesa1, David T. Neilson1
1Nokia Bell Labs, 600 Mountain Ave., Murray Hill, NJ 07974, USA
xuxiaonan2008@gmail.com
Abstract

We showcase an application that leverages multiple agents, powered by large language models and integrated tools, to collaboratively solve complex network operation tasks across various domains. The tasks include real-time topology retrieval, network optimization using physical models, and fiber switching facilitated by a robotic arm.

Index Terms:
Cross-domain orchestration, large language model (LLM), robotic automation

I Introduction

Attention is increasingly focused on integrating AI, including large language models (LLMs), into domain-specific applications [1, 2]. However, the use of LLMs for orchestrating networks across multiple domains has not yet been demonstrated. Cross-domain network orchestration is essential for delivering dynamic, scalable, and high-performance services in today’s vertical networks [3].

In this paper, we demonstrate an orchestration application over a robotic domain and an optical transport network (OTN) domain using multi-agent conversation [4]. This orchestration application consists of multiple interacting LLM-empowered agents that exchange information and collaborate to solve problems beyond the capability of individual agents. Orchestration across multiple domains is achieved through two key features of intelligent agents. First, they can access and utilize a variety of domain-specific management interfaces, tools, and functionalities through function calling. This enables, for example, retrieving real-time network parameters, using models to analyze network Quality of Transmission (QoT), and planning collision-free robot motion [5]. Second, the LLM empowers agents not only to understand and generate human-like text, including code blocks, but also to communicate efficiently with other agents to complete complex tasks. This capability enables, for instance, the suggestion of detailed plans for complex tasks and the selection of the most suitable agent for each specific subtask.
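As an illustration of the first feature, the following is a minimal sketch of function calling, assuming the LLM emits a structured request that names a registered tool and its arguments. The registry, the tool name `get_link_length_km`, and the link lengths are hypothetical and do not come from the demonstrated system.

```python
import json
from typing import Callable, Dict

# Hypothetical tool registry; names and values are illustrative only.
TOOLS: Dict[str, Callable[..., object]] = {}


def tool(fn: Callable[..., object]) -> Callable[..., object]:
    """Register a function so that an agent can call it by name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def get_link_length_km(src: str, dst: str) -> float:
    # Stand-in for a query to a domain controller; the lengths are made up.
    lengths = {("A", "B"): 50.0, ("B", "C"): 80.0}
    return lengths[(src, dst)]


def dispatch(call_json: str) -> object:
    """Run a tool call of the form {"name": ..., "arguments": {...}}."""
    call = json.loads(call_json)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


# A model with function calling would produce a request like this one.
request = '{"name": "get_link_length_km", "arguments": {"src": "A", "dst": "B"}}'
result = dispatch(request)
print(result)
```

Keeping the tools behind a name-based registry means the agent only ever sees tool names and argument schemas, while the domain-specific interface stays inside the registered function.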

Figure 1: Architecture and workflow of cross-domain network automation enabled by LLM-empowered agents. (DT: digital twin)

 

II Application Description

Figure 1 illustrates the orchestration architecture and workflow of network automation, spanning the robotic and OTN domains. In the network domain, we utilize a laboratory optical network comprising six commercial Nokia 1830 Photonic Service Switch (PSS) transport nodes equipped with flexgrid reconfigurable optical add-drop multiplexers (ROADMs), with fiber lengths indicated in the network topology. An OTN controller collects and analyzes real-time network data and employs network digital twins with the QoT estimator GNPy [6] for route planning and optimization. In the robotic domain, a robotic arm with a 2-finger gripper, mounted on a mobile robotic base with Lidar, performs physical tasks in the laboratory [5]. The robot controller manages and controls the robot’s operations and can interact with the robot’s digital twin, such as MoveIt [7], for movement planning. The orchestration architecture can be extended to support more network domains, such as 5G networks, and other emerging technologies, such as augmented/virtual reality (AR/VR) [8].

We assign one chat group to the OTN domain and another to the robotic domain, each containing multiple intelligent agents with specific roles: 1) a manager automatically selects the appropriate agent based on the task request, each agent’s responsibilities, and their responses; 2) a planner suggests a plan to accomplish the request; 3) a writer handles language tasks such as retrieving data and generating code, with access to device controllers and various tools; and 4) an executor runs code contained in input messages and sends out result messages. The agents within each chat group can communicate freely, and different groups are linked through the executors.
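The role structure above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the implementation used in the demonstration: the manager is reduced to a fixed hand-off order instead of an LLM-based selection, the writer returns canned code instead of generating it, and all class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Message:
    sender: str
    content: str


@dataclass
class Agent:
    name: str
    reply: Callable[[List[Message]], str]  # stand-in for an LLM-backed reply


@dataclass
class ChatGroup:
    domain: str
    agents: Dict[str, Agent]
    history: List[Message] = field(default_factory=list)

    def select_next(self) -> Optional[str]:
        # Manager role, simplified to a fixed hand-off order (assumption).
        order = {"admin": "planner", "planner": "writer", "writer": "executor"}
        return order.get(self.history[-1].sender)

    def run(self, request: str) -> str:
        self.history.append(Message("admin", request))
        while (name := self.select_next()) is not None:
            reply = self.agents[name].reply(self.history)
            self.history.append(Message(name, reply))
        return self.history[-1].content  # the executor's result message


def run_code(history: List[Message]) -> str:
    # Executor role: run the code in the latest message and report `result`.
    namespace: dict = {}
    exec(history[-1].content, {}, namespace)
    return str(namespace["result"])


def make_group(domain: str, code: str) -> ChatGroup:
    agents = {
        "planner": Agent("planner", lambda h: f"Plan for: {h[-1].content}"),
        "writer": Agent("writer", lambda h: code),  # canned code, not generated
        "executor": Agent("executor", run_code),
    }
    return ChatGroup(domain, agents)


otn = make_group("OTN", "result = 'path_2'")
robot = make_group("robotic", "result = 'fiber switched'")

otn_result = otn.run("Select the better of two paths")
# The groups are linked through the executors: the OTN executor's result
# becomes the request handled by the robotic group.
robot_result = robot.run(f"OTN domain selected {otn_result}")
print(otn_result, "->", robot_result)
```

In a real deployment the executor would run generated code in a sandboxed environment; the bare `exec` call here is only acceptable because the code strings are fixed in the example.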

Figure 2: Output of multi-agent conversation involving a Planner, Writer, and Executor from the OTN domain (a) and the robotic domain (b), respectively. The Admin inputs the task request. Responses highlighted in red signify the usage of tools such as function calling and code execution. Responses highlighted in purple exemplify the usage of the LLM’s language skills such as data retrieval and simple mathematical reasoning. “[…]” indicates omitted information such as paths for saving files, code output, and descriptions of pre-defined functions. The yellow box signifies omitted code or part of code generation.

We demonstrate the interactive multi-agent conversations in which 1) the network domain evaluates the generalized signal-to-noise ratio (GSNR) of two paths based on data retrieved from the real network, determines the better path, and shares the result with the robotic domain, and 2) the robotic domain listens for and receives the message, sends specific commands to the robot, and executes the physical action of switching the fiber. Figure 2(a) and (b) exemplify the dialogues of the OTN and the robotic chat group, respectively. Upon receiving the request from an administrator, the planner divides the task and suggests a plan with multiple steps. The writer and the executor utilize language skills, pre-defined functions, or code generation to accomplish sub-tasks step by step. The agents can analyze the problem and fix the code if an error is indicated, as shown before step 3 in Figure 2(a). Step 6 in Figure 2(b) shows the writer generating code using basic pre-defined functions to operate the robot, instructing it to unplug the fiber from port A and plug it into port C after receiving the message from the OTN domain. All the agents are configured using GPT-4.
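The two-stage workflow can be illustrated with a small sketch. It does not call GNPy or any robot interface: the per-span GSNR values are made up, the end-to-end GSNR is combined under the common assumption that noise contributions from cascaded spans add in linear units, and the path-to-port mapping and the command names `unplug_fiber` and `plug_fiber` are hypothetical placeholders for the pre-defined robot functions.

```python
import math
from typing import Dict, List


def path_gsnr_db(span_gsnr_db: List[float]) -> float:
    """Combine per-span GSNR values (dB) into an end-to-end GSNR (dB).

    Assumes additive noise across spans: 1/GSNR_total = sum(1/GSNR_i).
    """
    inverse_sum = sum(10 ** (-g / 10) for g in span_gsnr_db)
    return -10 * math.log10(inverse_sum)


def better_path(paths: Dict[str, List[float]]) -> str:
    """Return the name of the path with the highest end-to-end GSNR."""
    return max(paths, key=lambda name: path_gsnr_db(paths[name]))


def fiber_switch_commands(current_port: str, target_port: str) -> List[str]:
    """Build the command sequence the robotic domain would execute."""
    if current_port == target_port:
        return []  # the fiber is already on the right port
    return [f"unplug_fiber('{current_port}')", f"plug_fiber('{target_port}')"]


# Made-up per-span GSNR values in dB; real values would come from the QoT
# estimator fed with data retrieved from the network.
paths = {"path_1": [25.0, 25.0], "path_2": [30.0, 30.0]}
# Hypothetical mapping from each path to the port that feeds it.
port_for_path = {"path_1": "A", "path_2": "C"}

choice = better_path(paths)  # message shared by the OTN domain
commands = fiber_switch_commands("A", port_for_path[choice])
print(choice, commands)
```

Only the path name crosses the domain boundary in this sketch, which mirrors how the OTN domain shares its result and leaves the translation into physical actions to the robotic domain.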

Future work will focus on extending the application to support more network domains and on utilizing open-source and locally fine-tuned LLMs for enhanced data security, domain-specific tuning, and reduced latency.

References

  • [1] Wang, Y., Li, J., Pang, Y., Song, Y., Zhang, L., Zhang, M., and Wang, D., “AlarmGPT: an intelligent operation assistant for optical network alarm analysis using ChatGPT,” in ECOC, 2023, p. Th.A.1.2.
  • [2] Huang, Y., Du, H., Zhang, X., Niyato, D., Kang, J., Xiong, Z., Wang, S., and Huang, T., “Large language models for networking: Applications, enabling techniques, and challenges,” arXiv preprint arXiv:2311.17474, 2023.
  • [3] Pagès, A., Agraz, F., and Spadaro, S., “End-to-end orchestration in support of IIoT applications over optically interconnected TSN domains,” in OFC, 2023, p. Tu3D.2.
  • [4] Wu, Q., Bansal, G., Zhang, J., Wu, Y., Zhang, S., Zhu, E., Li, B., Jiang, L., Zhang, X., and Wang, C., “AutoGen: Enabling next-gen LLM applications via multi-agent conversation framework,” arXiv preprint arXiv:2308.08155, 2023.
  • [5] Xu, X., Chen, H., Scheutzow, M., Simsarian, J. E., Ryf, R., Qua, G., Hande, A., Dinoff, R., Szczerban, M., Mazur, M. et al., “Automated fiber switch with path verification enabled by an AI-powered multi-task mobile robot,” in ECOC, 2023, p. Th.C.2.7.
  • [6] Ferrari, A., Filer, M., Balasubramanian, K., Yin, Y., Le Rouzic, E., Kundrát, J., Grammel, G., Galimberti, G., and Curri, V., “GNPy: an open source application for physical layer aware open optical networks,” Journal of Optical Communications and Networking, vol. 12, no. 6, pp. C31–C40, 2020.
  • [7] https://moveit.github.io/moveit_tutorials/.
  • [8] Chen, H., Xu, X., Simsarian, J. E., Szczerban, M., Harby, R., Ryf, R., Mazur, M., Dallachiesa, L., Fontaine, N. K., Cloonan, J., Sandoz, J., and Neilson, D. T., “Digital twin of a network and operating environment using augmented reality,” in ECOC, 2023, p. Th.A.1.1.