{"id":180,"date":"2024-09-03T12:29:59","date_gmt":"2024-09-03T10:29:59","guid":{"rendered":"https:\/\/blog.openshift.one\/?p=180"},"modified":"2024-09-03T12:39:49","modified_gmt":"2024-09-03T10:39:49","slug":"update-existing-openshift-node-network-configuration","status":"publish","type":"post","link":"https:\/\/blog.openshift.one\/index.php\/2024\/09\/03\/update-existing-openshift-node-network-configuration\/","title":{"rendered":"Update existing OpenShift Node network configuration"},"content":{"rendered":"\n<div style=\"height:5px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>OpenShift can be installed on various platforms (cloud, virtual, bare metal, etc.) using various tools (IPI, UPI, Assisted Installer or Agent-based Installer). Over the last few months I have used the latter the most, so in this post I will focus on a cluster installed with the Agent-based Installer.<\/p>\n\n\n\n<p>The whole installation procedure is out of scope for this post, but you can dig deeper into it starting with this document: <a href=\"https:\/\/docs.openshift.com\/container-platform\/4.16\/installing\/installing_with_agent_based_installer\/preparing-to-install-with-agent-based-installer.html\">https:\/\/docs.openshift.com\/container-platform\/4.16\/installing\/installing_with_agent_based_installer\/preparing-to-install-with-agent-based-installer.html<\/a>.<\/p>\n\n\n\n<p>The Agent-based Installer lets you flexibly configure node networking according to your environment and needs. This can include bonded, tagged networks running on top of SR-IOV devices if required. 
Network configuration at installation time is extensively explained &#8211; with examples &#8211; in the OpenShift documentation: https:\/\/docs.openshift.com\/container-platform\/4.16\/installing\/installing_with_agent_based_installer\/preparing-to-install-with-agent-based-installer.html#agent-install-sample-config-bonds-vlans_preparing-to-install-with-agent-based-installer<\/p>\n\n\n\n<p>However, what is not documented is how to update the configuration after deployment. <br><strong>Please note: This may be subject to SLA constraints, so please treat this post as an example only and do not apply anything to your production environments without first checking with your friends from Red Hat support.<\/strong><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Single NIC configuration<\/h2>\n\n\n\n<p>Let&#8217;s assume you deployed the cluster with a very simple network configuration where everything is configured on a single physical NIC. In this example it is <code>enp1s0<\/code> with 192.168.232.101\/24, as shown in the diagram below.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"546\" src=\"https:\/\/blog.openshift.one\/wp-content\/uploads\/2024\/09\/Untitled-Frame-1-1-1024x546.jpg\" alt=\"\" class=\"wp-image-195\" style=\"width:1450px;height:auto\" srcset=\"https:\/\/blog.openshift.one\/wp-content\/uploads\/2024\/09\/Untitled-Frame-1-1-1024x546.jpg 1024w, https:\/\/blog.openshift.one\/wp-content\/uploads\/2024\/09\/Untitled-Frame-1-1-300x160.jpg 300w, https:\/\/blog.openshift.one\/wp-content\/uploads\/2024\/09\/Untitled-Frame-1-1-768x410.jpg 768w, https:\/\/blog.openshift.one\/wp-content\/uploads\/2024\/09\/Untitled-Frame-1-1-1536x819.jpg 1536w, https:\/\/blog.openshift.one\/wp-content\/uploads\/2024\/09\/Untitled-Frame-1-1-2048x1092.jpg 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>To make it slightly more complicated, the IP 
configuration is static &#8211; you don&#8217;t use DHCP to assign addresses dynamically to the nodes.<\/p>\n\n\n\n<p>The <code>agent-config.yaml<\/code> file used to deploy nodes with such a configuration could look like this:<\/p>\n\n\n\n<pre class=\"wp-block-code has-foreground-color has-tertiary-background-color has-text-color has-background\"><code>apiVersion: v1alpha1\nkind: AgentConfig\nrendezvousIP: 192.168.232.100\nhosts:\n(...)\n  - hostname: master-1\n    rootDeviceHints:\n      deviceName: \"\/dev\/sda\"\n<strong>    interfaces:\n      - name: enp1s0\n        macAddress: de:ad:be:ef:66:01<\/strong>\n    networkConfig:\n      routes:\n        config:\n          - destination: 0.0.0.0\/0\n            next-hop-address: 192.168.232.1\n            next-hop-interface: enp1s0\n            table-id: 254\n      dns-resolver:\n        config:\n          server:\n            - 192.168.232.1\n<strong>      interfaces:\n        - name: enp1s0\n          type: ethernet\n          state: up\n          ipv4:\n            dhcp: false\n            enabled: true\n            address:\n              - ip: 192.168.232.101\n                prefix-length: 24<\/strong>\n        - name: enp2s0\n          type: ethernet\n          state: down\n          ipv4:\n            dhcp: false\n            enabled: false\n        - name: enp3s0\n          type: ethernet\n          state: down\n          ipv4:\n            dhcp: false\n            enabled: false\n        - name: enp4s0\n          type: ethernet\n          state: down\n          ipv4:\n            dhcp: false\n            enabled: false\n(...)<\/code><\/pre>\n\n\n\n<div style=\"height:5px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>On the OpenShift node, the data above translates into the following interface configuration:<\/p>\n\n\n\n<pre class=\"wp-block-code has-background-color has-foreground-background-color has-text-color has-background\"><code># ip address show\n(...)\n<strong>2: enp1s0: 
&lt;BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ovs-system state UP group default qlen 1000\n    link\/ether de:ad:be:ef:66:01 brd ff:ff:ff:ff:ff:ff<\/strong>\n(...)\n<strong>8: br-ex: &lt;BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000\n    link\/ether de:ad:be:ef:66:01 brd ff:ff:ff:ff:ff:ff\n    inet 192.168.232.101\/24 brd 192.168.232.255 scope global noprefixroute br-ex\n       valid_lft forever preferred_lft forever\n<\/strong>    inet 169.254.169.2\/29 brd 169.254.169.7 scope global br-ex\n       valid_lft forever preferred_lft forever<strong>\n    inet 192.168.232.111\/32 scope global vip\n       valid_lft forever preferred_lft forever\n    inet 192.168.232.110\/32 scope global vip\n       valid_lft forever preferred_lft forever<\/strong>\n(...)\n# ovs-vsctl list-ports <strong>br-ex<\/strong>\n<strong>enp1s0<\/strong>\npatch-br-ex_master-1-to-br-int<\/code><\/pre>\n\n\n\n<p>As you can see, OpenShift configured the OVS bridge <code>br-ex<\/code> on top of the NIC that held the default route when Kubernetes started (at system boot). The bridge inherited that NIC&#8217;s IP configuration (the 192.168.232.101\/24 address), and additional IPs from the same range (machineNetwork &#8211; 192.168.232.0\/24) were configured as well (the API and Ingress virtual IPs).<\/p>\n\n\n\n<p>While the configuration itself is correct, it has one big disadvantage: if the NIC, cable or switch port fails, the node immediately becomes unreachable. If the platform is supposed to host any important workloads, redundancy is what you&#8217;re looking for. <\/p>\n\n\n\n<p>So how do you turn it into something more production grade? Configure a bond to provide NIC-level redundancy. <\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Network bond types in Linux<\/h2>\n\n\n\n<p>Several bond modes are available in Linux. 
For a detailed description please refer to <a href=\"https:\/\/docs.redhat.com\/en\/documentation\/red_hat_enterprise_linux\/9\/html\/configuring_and_managing_networking\/configuring-network-bonding_configuring-and-managing-networking\">https:\/\/docs.redhat.com\/en\/documentation\/red_hat_enterprise_linux\/9\/html\/configuring_and_managing_networking\/configuring-network-bonding_configuring-and-managing-networking<\/a>. From my perspective the two most commonly used are active-backup and 802.3ad (LACP). The first one keeps a single port active with the other on standby and does not require any fancy switch or switch-side configuration. The second is more complicated: it requires a switch that supports LACP and matching configuration on it, but in return it can use all links at the same time for higher aggregate throughput. Due to the restrictions in my lab I will focus on the easy one \ud83d\ude09<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Alternative options to apply configuration update<\/h2>\n\n\n\n<p>Since the node is already deployed, updating the <code>agent-config.yaml<\/code> file won&#8217;t help here.<\/p>\n\n\n\n<p>Using the NMState Operator (https:\/\/docs.openshift.com\/container-platform\/4.16\/networking\/k8s_nmstate\/k8s-nmstate-about-the-k8s-nmstate-operator.html) isn&#8217;t a good idea, as it applies the desired configuration AFTER Kubernetes has started and already bound to the IP address. In that case you will lose connectivity to the Kubernetes processes running on the node, as they don&#8217;t like the IP moving between interfaces. You could restart them after NMState has done its thing, but this would happen each time you reboot the node &#8211; a simple race condition.<\/p>\n\n\n\n<p>Potentially the MachineConfigOperator could help here (https:\/\/docs.openshift.com\/container-platform\/4.16\/machine_configuration\/index.html). The challenge is the statically configured IP addresses. The MCO is good at applying the same settings across a fleet of nodes, not at configuring each of them individually. 
That would work if you had DHCP, as every node would then have exactly the same configuration, with the per-node IP differences handled at the DHCP level. <\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Using nmcli in offline mode to generate configuration files<\/h2>\n\n\n\n<p>Nevertheless, the MachineConfigOperator approach would do exactly what I propose next &#8211; configure NetworkManager connection keyfiles in the <code>\/etc\/NetworkManager\/system-connections<\/code> folder. Let&#8217;s take a look at what&#8217;s there now:<\/p>\n\n\n\n<pre class=\"wp-block-code has-background-color has-foreground-background-color has-text-color has-background\"><code># ls -la \/etc\/NetworkManager\/system-connections\/\ntotal 16\ndrwxr-xr-x. 2 root root 114 Aug 30 12:27 .\ndrwxr-xr-x. 7 root root 134 Aug 30 12:26 ..\n<strong>-rw-------. 1 root root 383 Aug 30 12:27 enp1s0.nmconnection\n<\/strong>-rw-------. 1 root root 265 Aug 30 12:27 enp2s0.nmconnection\n-rw-------. 1 root root 265 Aug 30 12:27 enp3s0.nmconnection\n-rw-------. 1 root root 265 Aug 30 12:27 enp4s0.nmconnection\n\n# cat \/etc\/NetworkManager\/system-connections\/<strong>enp1s0.nmconnection<\/strong>\n&#91;connection]\nautoconnect=true\nautoconnect-slaves=-1\nid=<strong>enp1s0<\/strong>\ninterface-name=<strong>enp1s0<\/strong>\ntype=802-3-ethernet\nuuid=11655ff4-09bf-500f-9f38-9bfcc91ab36b\nautoconnect-priority=1\n\n&#91;ipv4]\naddress0=<strong>192.168.232.101\/24<\/strong>\ndhcp-timeout=2147483647\ndns=192.168.232.1\ndns-priority=40\nmethod=manual\nroute0=0.0.0.0\/0,192.168.232.1,0\nroute0_options=table=254\n\n&#91;ipv6]\ndhcp-timeout=2147483647\nmethod=disabled<\/code><\/pre>\n\n\n\n<p>As you can see above, the Agent-based Installer simply created these static files from the configuration specified in the <code>agent-config.yaml<\/code> file. 
The good thing is that the files persist across reboots and upgrades, so even if it isn&#8217;t the most elegant approach (on CoreOS you should refrain from editing files manually), you can update them by hand and nothing bad should happen \ud83d\ude09<\/p>\n\n\n\n<p>While you could write the new files by hand, I believe it is safer to use the <code>nmcli<\/code> tool to generate them &#8211; this ensures the syntax is correct and saves you some troubleshooting later on. To start, please read <a href=\"https:\/\/docs.redhat.com\/en\/documentation\/red_hat_enterprise_linux\/9\/html\/configuring_and_managing_networking\/assembly_networkmanager-connection-profiles-in-keyfile-format_configuring-and-managing-networking#proc_using-nmcli-to-create-keyfile-connection-profiles-in-offline-mode_assembly_networkmanager-connection-profiles-in-keyfile-format\">https:\/\/docs.redhat.com\/en\/documentation\/red_hat_enterprise_linux\/9\/html\/configuring_and_managing_networking\/assembly_networkmanager-connection-profiles-in-keyfile-format_configuring-and-managing-networking#proc_using-nmcli-to-create-keyfile-connection-profiles-in-offline-mode_assembly_networkmanager-connection-profiles-in-keyfile-format<\/a>, which explains how and why to use nmcli in offline mode (spoiler alert: offline mode creates keyfile connection profiles without touching the running configuration).<\/p>\n\n\n\n<p>Once you know how to use <code>nmcli<\/code> in offline mode and you want to configure NIC bonding, it is a good idea to become familiar with the bond configuration options too: <a 
href=\"https:\/\/docs.redhat.com\/en\/documentation\/red_hat_enterprise_linux\/9\/html\/configuring_and_managing_networking\/configuring-network-bonding_configuring-and-managing-networking#understanding-the-default-behavior-of-controller-and-port-interfaces_configuring-network-bonding\">https:\/\/docs.redhat.com\/en\/documentation\/red_hat_enterprise_linux\/9\/html\/configuring_and_managing_networking\/configuring-network-bonding_configuring-and-managing-networking#understanding-the-default-behavior-of-controller-and-port-interfaces_configuring-network-bonding<\/a><\/p>\n\n\n\n<p>Let&#8217;s create the bond interface first. Remember, nothing changes in the running configuration: the command below only creates the nmconnection keyfile, which is applied when you reboot the node. <br><strong>Please note the <code>umask<\/code> command &#8211; NetworkManager expects nmconnection files to be readable only by the root user, otherwise it will ignore them<\/strong>.<\/p>\n\n\n\n<pre class=\"wp-block-code has-background-color has-foreground-background-color has-text-color has-background\"><code># umask 0077 &amp;&amp; nmcli --offline connection add type bond con-name bond0 ifname bond0 bond.options 'mode=active-backup' ipv4.gateway '192.168.232.1' ipv4.addresses '192.168.232.101\/24' ipv4.dns '192.168.232.1' ipv4.method manual | tee \/etc\/NetworkManager\/system-connections\/bond0.nmconnection\n\n&#91;connection]\nid=bond0\nuuid=f86482ad-8145-4624-8018-b9257ef60686\ntype=bond\ninterface-name=bond0\n\n&#91;bond]\nmode=active-backup\n\n&#91;ipv4]\naddress1=192.168.232.101\/24,192.168.232.1\ndns=192.168.232.1;\nmethod=manual\n\n&#91;ipv6]\naddr-gen-mode=default\nmethod=auto\n\n&#91;proxy]<\/code><\/pre>\n\n\n\n<p>Now let&#8217;s make sure two NICs, <code>enp1s0<\/code> and <code>enp2s0<\/code>, are assigned to the bond:<\/p>\n\n\n\n<pre class=\"wp-block-code has-background-color has-foreground-background-color has-text-color 
has-background\"><code># umask 0077 &amp;&amp; nmcli --offline connection add type ethernet port-type bond con-name bond0-port1 ifname enp1s0 controller bond0 | sed '\/^Warning:.*\/d' | tee \/etc\/NetworkManager\/system-connections\/enp1s0.nmconnection\n\n&#91;connection]\nid=bond0-port1\nuuid=f4885096-6a7f-4741-ae32-2b35513ebf21\ntype=ethernet\ncontroller=bond0\ninterface-name=enp1s0\nmaster=bond0\nport-type=bond\nslave-type=bond\n\n&#91;ethernet]\n\n&#91;bond-port]\n\n# umask 0077 &amp;&amp; nmcli --offline connection add type ethernet port-type bond con-name bond0-port2 ifname enp2s0 controller bond0 | sed '\/^Warning:.*\/d' | tee \/etc\/NetworkManager\/system-connections\/enp2s0.nmconnection\n\n&#91;connection]\nid=bond0-port2\nuuid=32345c95-9809-49e8-9a89-ddfbc87be8d0\ntype=ethernet\ncontroller=bond0\ninterface-name=enp2s0\nmaster=bond0\nport-type=bond\nslave-type=bond\n\n&#91;ethernet]\n\n&#91;bond-port]<\/code><\/pre>\n\n\n\n<div style=\"height:25px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">New configuration verification<\/h2>\n\n\n\n<p>Now once you reboot the node, it should come back on-line with new network configuration:<\/p>\n\n\n\n<pre class=\"wp-block-code has-background-color has-foreground-background-color has-text-color has-background\"><code># ip address show\n(...)\n2: enp1s0: &lt;BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master bond0 state UP group default qlen 1000\n    link\/ether de:ad:be:ef:66:01 brd ff:ff:ff:ff:ff:ff\n3: enp2s0: &lt;BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master bond0 state UP group default qlen 1000\n    link\/ether de:ad:be:ef:66:01 brd ff:ff:ff:ff:ff:ff permaddr de:ad:be:ef:67:01\n(...)\n10: bond0: &lt;BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovs-system state UP group default qlen 1000\n    link\/ether de:ad:be:ef:66:01 brd ff:ff:ff:ff:ff:ff\n11: br-ex: &lt;BROADCAST,MULTICAST,UP,LOWER_UP> mtu 
1500 qdisc noqueue state UNKNOWN group default qlen 1000\n    link\/ether de:ad:be:ef:66:01 brd ff:ff:ff:ff:ff:ff\n    inet 192.168.232.101\/24 brd 192.168.232.255 scope global noprefixroute br-ex\n       valid_lft forever preferred_lft forever\n    inet 169.254.169.2\/29 brd 169.254.169.7 scope global br-ex\n       valid_lft forever preferred_lft forever\n    inet 192.168.232.111\/32 scope global vip\n       valid_lft forever preferred_lft forever\n    inet6 fe80::6913:da4:9d90:fb4\/64 scope link noprefixroute\n       valid_lft forever preferred_lft forever\n\n# cat \/proc\/net\/bonding\/bond0\nEthernet Channel Bonding Driver: v5.14.0-427.33.1.el9_4.x86_64\n\nBonding Mode: fault-tolerance (active-backup)\nPrimary Slave: None\nCurrently Active Slave: enp2s0\nMII Status: up\nMII Polling Interval (ms): 100\nUp Delay (ms): 0\nDown Delay (ms): 0\nPeer Notification Delay (ms): 0\n\nSlave Interface: enp2s0\nMII Status: up\nSpeed: Unknown\nDuplex: Unknown\nLink Failure Count: 0\nPermanent HW addr: de:ad:be:ef:67:01\nSlave queue ID: 0\n\nSlave Interface: enp1s0\nMII Status: up\nSpeed: Unknown\nDuplex: Unknown\nLink Failure Count: 0\nPermanent HW addr: de:ad:be:ef:66:01\nSlave queue ID: 0\n\n# ovs-vsctl list-ports br-ex\nbond0\npatch-br-ex_master-1-to-br-int<\/code><\/pre>\n\n\n\n<p>As you can see, there is a new bond interface connected to br-ex (this happens automatically when Kubernetes starts). The bond is configured on top of the enp1s0 and enp2s0 interfaces in active-backup mode. 
Therefore we can conclude this post now with the following network layout diagram.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"546\" src=\"https:\/\/blog.openshift.one\/wp-content\/uploads\/2024\/09\/Untitled-Frame-2-2-1024x546.jpg\" alt=\"\" class=\"wp-image-196\" style=\"width:1450px;height:auto\" srcset=\"https:\/\/blog.openshift.one\/wp-content\/uploads\/2024\/09\/Untitled-Frame-2-2-1024x546.jpg 1024w, https:\/\/blog.openshift.one\/wp-content\/uploads\/2024\/09\/Untitled-Frame-2-2-300x160.jpg 300w, https:\/\/blog.openshift.one\/wp-content\/uploads\/2024\/09\/Untitled-Frame-2-2-768x410.jpg 768w, https:\/\/blog.openshift.one\/wp-content\/uploads\/2024\/09\/Untitled-Frame-2-2-1536x819.jpg 1536w, https:\/\/blog.openshift.one\/wp-content\/uploads\/2024\/09\/Untitled-Frame-2-2-2048x1092.jpg 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>OpenShift can be installed on various platforms (cloud, virtual, baremetal etc) using various tools (IPI, UPI, Assisted Installer or Agent-based Installer). For the last months I use the later the most and in this post I will focus on a cluster being installed with the Agent-based Installer. 
The whole installation procedure is out of the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[19,6],"tags":[23,21,20,9],"class_list":["post-180","post","type-post","status-publish","format-standard","hentry","category-network","category-openshift","tag-bond","tag-configuration","tag-network","tag-openshift"],"_links":{"self":[{"href":"https:\/\/blog.openshift.one\/index.php\/wp-json\/wp\/v2\/posts\/180","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.openshift.one\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.openshift.one\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.openshift.one\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.openshift.one\/index.php\/wp-json\/wp\/v2\/comments?post=180"}],"version-history":[{"count":12,"href":"https:\/\/blog.openshift.one\/index.php\/wp-json\/wp\/v2\/posts\/180\/revisions"}],"predecessor-version":[{"id":200,"href":"https:\/\/blog.openshift.one\/index.php\/wp-json\/wp\/v2\/posts\/180\/revisions\/200"}],"wp:attachment":[{"href":"https:\/\/blog.openshift.one\/index.php\/wp-json\/wp\/v2\/media?parent=180"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.openshift.one\/index.php\/wp-json\/wp\/v2\/categories?post=180"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.openshift.one\/index.php\/wp-json\/wp\/v2\/tags?post=180"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}