Debugging Performance Addon Operator For Low Latency Pods
Issue
- Other threads are scheduled on cores that are meant to be isolated.
- DPDK loses packets due to unwanted interrupts.
- Pods are not in the "Guaranteed" QoS class.
Resolution
If the Diagnostic Steps have been followed and confirm all of the following:
- Nodes where the pods have been scheduled are:
  * Properly labeled
  * Configured with cpuManagerPolicy set to static
- Pods that require best performance are:
  * Configured with Guaranteed QoS containers (requests equal to limits, with the correct amount of resources)
  * Annotated with both required annotations
  * Scheduled to the correctly labeled nodes
If the pods are still not isolated from disrupting IRQs/threads, we suggest opening a case with
Red Hat support with the following data sets:
- All outputs from the Diagnostic Steps verification section
- Pod name(s) and their configuration yaml files
- General must-gather of the environment
- Performance Addon Operator specific must-gather (example commands for both must-gathers are shown after this list)
- sosreport of the node where the pods are deployed
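For convenience, the general and the Performance Addon Operator specific must-gathers can be collected as sketched below; the image tag v4.6 is an assumption matching the cnf-tests image used later in this article, so adjust it to the cluster version:
Raw
# General cluster must-gather
oc adm must-gather
# Performance Addon Operator specific must-gather (tag is an assumption, match your cluster version)
oc adm must-gather --image=registry.redhat.io/openshift4/performance-addon-operator-must-gather-rhel8:v4.6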
Diagnostic Steps
Cluster Configuration Verification
Based on the official OpenShift documentation, the cluster needs to utilize CPU Manager in order
to "isolate" the CPUs of guaranteed QoS pods.
Example:
Raw
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: cpumanager-enabled
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: cpumanager-enabled
  kubeletConfig:
    cpuManagerPolicy: static
    cpuManagerReconcilePeriod: 5s
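For the KubeletConfig above to take effect, the targeted MachineConfigPool must carry the matching label; once the pool has rolled out, the policy can be verified on the node. A minimal sketch, assuming the worker pool and a placeholder node name:
Raw
# Label the pool so the KubeletConfig selector matches it (pool name "worker" is an assumption)
oc label machineconfigpool worker custom-kubelet=cpumanager-enabled
# After the rollout, confirm the kubelet picked up the static policy
oc debug node/<node-name> -- chroot /host grep -i cpumanager /etc/kubernetes/kubelet.conf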
With OpenShift Container Platform 4, "isolated" really means "available for isolation":
- Interrupts, kernel processes, and OS/systemd processes will always run on the reserved CPUs as
configured in CPU Manager (reservedSystemCPUs).
- Burstable pods will run on the reserved CPUs and on isolated CPUs NOT used by a guaranteed QoS
pod. This is how Kubernetes implements CPU Manager.
- Guaranteed pods' containers will be pinned to a specific set of CPUs from the isolated pool (in
other words, CPUs available for isolation).
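As a quick sanity check of this split, PID 1 (systemd) on a tuned node should be confined to the reserved CPUs; a sketch, to be run on the node after chroot /host as shown later in the Diagnostic Steps:
Raw
# systemd (PID 1) should only be allowed on the reserved CPUs, e.g. 0-1 with the profile below
grep Cpus_allowed_list /proc/1/status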
Here are the steps to ensure the system is configured correctly for IRQ dynamic load balancing.
Consider a node with 6 CPUs targeted by a 'v2' PerformanceProfile. Let's assume the node name is cnf-worker.demo.lab.
Raw
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  name: dynamic-irq-profile
spec:
  cpu:
    isolated: 2-5
    reserved: 0-1
  ...
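When the profile is applied, the operator also creates a matching RuntimeClass named performance-<profile-name>, which the pod below references to activate the annotation-based tuning. It can be listed with:
Raw
# Expect an entry such as performance-dynamic-irq-profile (handler: high-performance)
oc get runtimeclass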
The pod below has guaranteed QoS and requires 2 exclusive CPUs out of the 6 available
on the node.
Raw
apiVersion: v1
kind: Pod
metadata:
  name: dynamic-irq-pod
  annotations:
    irq-load-balancing.crio.io: "disable"
    cpu-quota.crio.io: "disable"
spec:
  containers:
  - name: dynamic-irq-pod
    image: "quay.io/openshift-kni/cnf-tests:4.6"
    command: ["sleep", "10h"]
    resources:
      requests:
        cpu: 2
        memory: "200M"
      limits:
        cpu: 2
        memory: "200M"
  nodeSelector:
    node-role.kubernetes.io/worker-cnf: ""
  runtimeClassName: performance-dynamic-irq-profile
Note: Only disable CPU load balancing when the CPU manager static policy is enabled and for
pods with guaranteed QoS that use whole CPUs. Otherwise, disabling CPU load balancing can
affect the performance of other containers in the cluster. See the Cluster Configuration
Verification section above.
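Before checking the CPU pinning, it is worth confirming that the pod actually landed in the Guaranteed QoS class (requests equal to limits for all resources); the command below should print Guaranteed:
Raw
oc get pod dynamic-irq-pod -o jsonpath='{.status.qosClass}{"\n"}'
Guaranteed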
Raw
oc get pod -o wide
NAME              READY   STATUS    RESTARTS   AGE     IP             NODE                  NOMINATED NODE   READINESS GATES
dynamic-irq-pod   1/1     Running   0          5h33m   10.135.1.140   cnf-worker.demo.lab   <none>           <none>
Raw
oc exec -it dynamic-irq-pod -- /bin/bash -c "grep Cpus_allowed_list /proc/self/status | awk '{print $2}'"
Cpus_allowed_list: 2-3
Raw
oc debug node/cnf-worker.demo.lab
Starting pod/cnf-workerdemolab-debug ...
To use host binaries, run `chroot /host`
sh-4.4#
Raw
sh-4.4# chroot /host
sh-4.4#
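While in the host chroot, the kubelet's CPU Manager checkpoint file can also confirm the exclusive assignment; the pod's container should appear with its own CPU set, and those CPUs should be absent from defaultCpuSet (the exact JSON layout varies by version):
Raw
sh-4.4# cat /var/lib/kubelet/cpu_manager_state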
1. Ensure the default system CPU affinity mask does not include the dynamic-irq-pod
CPUs, in our case CPUs 2 and 3.
Raw
cat /proc/irq/default_smp_affinity
33
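The affinity mask is hexadecimal with one bit per CPU; converting 33 to binary shows that CPUs 0, 1, 4, and 5 are allowed while the pod CPUs 2 and 3 are excluded:
Raw
# 0x33 = 0b110011 -> bits 0,1,4,5 set; bits 2,3 (the pod CPUs) clear
echo 'obase=2; ibase=16; 33' | bc
110011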
2. Ensure the system IRQs are not configured to run on the dynamic-irq-pod CPUs (2 and 3).
Raw
find /proc/irq/ -name smp_affinity_list -exec sh -c 'i="$1"; mask=$(cat $i); file=$(echo $i); echo $file: $mask' _ {} \;
/proc/irq/0/smp_affinity_list: 0-5
/proc/irq/1/smp_affinity_list: 5
/proc/irq/2/smp_affinity_list: 0-5
/proc/irq/3/smp_affinity_list: 0-5
/proc/irq/4/smp_affinity_list: 0
/proc/irq/5/smp_affinity_list: 0-5
/proc/irq/6/smp_affinity_list: 0-5
/proc/irq/7/smp_affinity_list: 0-5
/proc/irq/8/smp_affinity_list: 4
/proc/irq/9/smp_affinity_list: 4
/proc/irq/10/smp_affinity_list: 0-5
/proc/irq/11/smp_affinity_list: 0
/proc/irq/12/smp_affinity_list: 1
/proc/irq/13/smp_affinity_list: 0-5
/proc/irq/14/smp_affinity_list: 1
/proc/irq/15/smp_affinity_list: 0
/proc/irq/24/smp_affinity_list: 1
/proc/irq/25/smp_affinity_list: 1
/proc/irq/26/smp_affinity_list: 1
/proc/irq/27/smp_affinity_list: 5
/proc/irq/28/smp_affinity_list: 1
/proc/irq/29/smp_affinity_list: 0
/proc/irq/30/smp_affinity_list: 0-5
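Instead of eyeballing the whole list, the hexadecimal smp_affinity masks can be tested against a bitmask covering CPUs 2 and 3 (0xC); a minimal bash sketch, run on the node:
Raw
# Print only IRQs whose affinity mask still includes CPU 2 or CPU 3 (0xC = bits 2 and 3)
for f in /proc/irq/*/smp_affinity; do
  m=$(tr -d ',' < "$f" 2>/dev/null)
  [ -n "$m" ] || continue
  # nonzero AND means the IRQ may fire on a pod CPU
  [ $((16#$m & 0xC)) -ne 0 ] && echo "$f: $m"
done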
Note: Some IRQ controllers do not support IRQ re-balancing and will always expose all online
CPUs as the IRQ mask.
They usually run effectively on CPU 0; a hint can be obtained with:
Raw
for i in {0,2,3,5,6,7,10,13,30}; do cat /proc/irq/$i/effective_affinity_list; done
0
0
0
0
0
0
0
1
More information on how CPU Manager behaves with regard to hyper-threading (SMTAlignment) can be
found in Best practices for avoiding noisy neighbor issues using CPU Manager.