Type: Bug
Resolution: Unresolved
Priority: Critical
Fix Version/s: 4.13.z
Affects Version/s: 4.12
Component/s: Networking / ovn-kubernetes
Labels:
- SDN:Backport

Regression:
No
Sprint:
SDN Sprint 254
sprint_count:
1
Blocked:
False
Blocked Reason:

None

Release Note Text:

*Cause*: What actions or circumstances cause this bug to present.
When the ovnkube-node pod starts and many services with external IPs or load balancer IPs exist.
*Consequence*: What happens when the bug presents.
ovnkube-node takes a very long time to start because it programs iptables rules one at a time for all services. Iterating over external IPs or load balancer IPs is especially costly. The startup time depends on the number of services and external IPs/load balancer IPs. We have seen it range from several minutes to almost an hour.
*Fix*: What was done to fix the bug.
Optimizations in the service parsing logic reduce duplicate calls to create iptables rules. iptables rules are now created atomically using iptables-restore rather than with a separate call to iptables for each external IP.
*Result*: Bug doesn’t present anymore.
Startup is much faster; even at high scale it should be on the order of seconds rather than minutes.
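
To illustrate the difference the fix describes, the sketch below contrasts one iptables invocation per external IP with a single iptables-restore payload. It is not the ovn-kubernetes implementation (which is written in Go); the chain name `EXAMPLE-EXTERNALIP`, the rule tuple layout, and both function names are assumptions made only for this example.

```python
from typing import Iterable, List, Tuple

# Hypothetical rule shape: (external IP, protocol, service port, cluster IP, cluster port)
Rule = Tuple[str, str, int, str, int]

CHAIN = "EXAMPLE-EXTERNALIP"  # hypothetical chain name, not the one ovn-kubernetes uses


def rule_spec(rule: Rule) -> List[str]:
    """Return the match/target arguments for one DNAT rule."""
    ext_ip, proto, port, cluster_ip, cluster_port = rule
    return [
        "-d", f"{ext_ip}/32", "-p", proto, "--dport", str(port),
        "-j", "DNAT", "--to-destination", f"{cluster_ip}:{cluster_port}",
    ]


def per_rule_commands(rules: Iterable[Rule]) -> List[List[str]]:
    """Slow pattern: one iptables command (one process launch) per rule."""
    return [["iptables", "-t", "nat", "-A", CHAIN] + rule_spec(r) for r in rules]


def restore_payload(rules: Iterable[Rule]) -> str:
    """Batched pattern: one iptables-restore input that covers every rule."""
    lines = ["*nat", f":{CHAIN} - [0:0]"]
    seen = set()
    for rule in rules:
        if rule in seen:  # drop duplicate rules before they reach iptables
            continue
        seen.add(rule)
        lines.append(" ".join(["-A", CHAIN] + rule_spec(rule)))
    lines.append("COMMIT")  # the whole table section is applied in one commit
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = [
        ("192.0.2.10", "tcp", 80, "10.96.0.5", 8080),
        ("192.0.2.11", "tcp", 80, "10.96.0.5", 8080),
    ]
    print(restore_payload(example), end="")
    # Applying the payload would need root, for example:
    #   subprocess.run(["iptables-restore", "--noflush"],
    #                  input=restore_payload(example), text=True, check=True)
```

With N external IPs, the first pattern launches N processes, while the second hands all rules to a single iptables-restore process, which is the kind of saving the release note attributes to the fix.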

Release Note Type:
Bug Fix
Release Note Status:
In Progress
Target Version:

4.13.z

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

This is a clone of issue ~~OCPBUGS-32426~~. The following is the description of the original issue:
—
On clusters with a large number of services with externalIPs or services of type LoadBalancer, ovnkube-node initialization can take up to 50 minutes.

The problem is that after a node reboot performed by the MCO, the unschedulable taint is removed from the node, so the API schedules pods onto that node, where they get stuck in ContainerCreating. Meanwhile, other nodes continue to go down for reboot, making workloads unavailable (if no PDB exists to protect the workload).

clones

OCPBUGS-33537 [4.14z] slow ovnkube-node initialization on large number of services with externalIps

Closed

depends on

OCPBUGS-33537 [4.14z] slow ovnkube-node initialization on large number of services with externalIps

Closed

is cloned by

OCPBUGS-34273 [4.12z] slow ovnkube-node initialization on large number of services with externalIps

Verified

is depended on by

OCPBUGS-34273 [4.12z] slow ovnkube-node initialization on large number of services with externalIps

Verified

links to

openshift/ovn-kubernetes#2172: [release-4.13] OCPBUGS-33730: Improves service iptables efficiency on start up

RHBA-2024:3494 OpenShift Container Platform 4.13.z bug fix update

(1 links to)

Assignee: Tim Rozet

Reporter: OpenShift Prow Bot

QA Contact: Jean Chen

Votes: 0

Watchers: 7

Created: 2024/05/15 6:10 PM

Updated: 2024/05/30 8:32 AM
