In the first post on NSX-T Data Center 2.5 Upgrades, the admin went through all of the necessary checklists and proper prerequisite procedures to ensure a smooth and successful upgrade of their NSX-T Data Center 2.4.1 deployment.
Reviewing the requirements and the step-by-step process:
- Upgrade the NSX-T deployment from 2.4.1 to 2.5.0
- Note any outages that may occur
- Check VMware Product Interoperability Matrix
- Check VMware Product Interoperability Matrix – Upgrade Path
- Take Backup of NSX-T Manager
- Check NSX-T 2.5 Upgrade Guide Official Documentation
- Perform upgrade – Steps 4a – 4l
- Post-upgrade tasks
- Troubleshoot upgrade errors
The NSX-T Data Center 2.5 documentation offers a checklist for customers to follow that the admin has followed thus far up to Step 3g – If you are using NSX Cloud for your public cloud workload VMs, upgrade NSX Cloud components.
With the prerequisites checked, the admin is now ready to perform the actual upgrade of NSX-T Data Center from 2.4.1 to 2.5.
Step 4h – Upgrade your upgrade coordinator
The first step in the upgrade process is to SSH to the NSX-T Manager CLI as the admin user and run ‘get services’ to verify that the install-upgrade service is running on the NSX-T Manager node.
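On the NSX-T Manager CLI, the check might look like the following sketch (the hostname is an example); ‘get service install-upgrade’ narrows the output to just the one service:

```shell
# SSH to the NSX-T Manager as the admin user (hostname is an example)
ssh admin@nsxmgr01.corp.local

# List all services and their state on this Manager node
get services

# Or query just the install-upgrade service directly
get service install-upgrade
```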
The upgrade of the upgrade coordinator starts by uploading the NSX-T Data Center upgrade bundle .mub file to the NSX-T Manager.
Uploading and validating the upgrade bundle can take 10–20 minutes. Once the bundle checks out, the admin sees that they can now begin the upgrade procedure.
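For larger environments, the bundle upload can also be scripted against the NSX-T REST API rather than the UI. The sketch below is an assumption based on the NSX-T 2.5 upgrade API; the manager hostname, file server URL, and bundle filename are all placeholders:

```shell
# Sketch: have the Manager fetch the .mub bundle from a web server
# via the NSX-T REST API (endpoint and values are assumptions)
curl -k -u admin -X POST \
  -H 'Content-Type: application/json' \
  -d '{"url": "http://fileserver.corp.local/VMware-NSX-upgrade-bundle-2.5.0.mub"}' \
  https://nsxmgr01.corp.local/api/v1/upgrade/bundles
```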
Once the ‘Begin Upgrade’ button is pressed, the admin is presented with the EULA and then the ‘Continue’ button to upgrade the Upgrade Coordinator as the first step in the process.
With the Upgrade Coordinator finished upgrading, the admin can now run the prechecks for upgrading the rest of the system.
The prechecks surface any issues that could complicate the upgrade process. The admin notices a few issues reported back by the precheck scan.
Looking deeper, the admin can see exactly what each precheck issue is and whether it really impacts the upgrade.
Looking at the Hosts precheck issues, the admin can see that even with two compute managers connected to NSX-T, the upgrade can proceed across all of the hosts regardless of which compute manager owns them. The host with the issue is a single host in a cluster without DRS enabled. This is the test host the organization uses for testing restores; since it’s a single-host cluster with no workloads running on it, DRS isn’t necessary and the precheck issue can safely be ignored. However, the admin will have to place the host in maintenance mode manually to perform the upgrade on it. Had this been a multi-host cluster without DRS enabled, this precheck would be a reminder to enable it so that the hosts can be upgraded without impact to the workloads running on them: DRS evacuates the workloads from each host as it goes into maintenance mode for the NSX-T Data Center 2.5 upgrade. The admin decides to turn on DRS on the cluster to prevent any warnings or issues during the process.
Taking a look at the next issue, around the Management Nodes, the admin can see a warning about the need to ensure that TCP port 1234 is open to prevent communication errors.
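A quick way to confirm the port is reachable between Manager nodes is a simple TCP probe with ‘nc’; the IP address below is an example, and the command should be run from a node or jump host that sits on the management network:

```shell
# Probe TCP 1234 on a Manager node without sending data
# (-z: scan only, -v: verbose result; 10.1.1.11 is an example IP)
nc -zv 10.1.1.11 1234
```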
With all of the precheck issues having been triaged and accounted for, the admin can proceed to the next step of the upgrade process.
Step 4i – Upgrade the Edge Cluster
The first components of the NSX-T Data Center deployment to be upgraded are the Edge Clusters and the Edge Nodes that make up each cluster. The admin changes the upgrade mode to ‘Serial’ since this is their first upgrade and they want to watch the process take place, slow and steady. The NSX-T Data Center Upgrade Coordinator offers several ways to perform the upgrade, from a one-click run to more granular, deliberate approaches. During the upgrade, the admin can see the progress and which Edge Node is being upgraded at any given time.
The Edge Node upgrades finish, and the admin can see all of the details with ‘Successful’ as the status. They can now proceed to the next step.
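A quick sanity check after the Edge upgrade is to SSH to an Edge Node and confirm it reports the new version (the hostname is an example):

```shell
# SSH to an upgraded Edge Node (hostname is an example)
ssh admin@edgenode01.corp.local

# Confirm the node now reports the 2.5.0 build
get version
```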
Step 4k – Upgrade the hosts
Moving to the next step of upgrading the host clusters, the admin can see all of the clusters under the NSX-T domain, across both vCenter Servers. They select a serial upgrade order and press ‘Start’.
Each cluster is upgraded separately when using Serial; going forward, this can be switched to Parallel for faster upgrades. In vCenter Server, the admin can watch the process as the hosts are upgraded.
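To spot-check an individual host after its cluster completes, the admin can confirm the NSX VIB versions directly from the ESXi shell:

```shell
# On an upgraded ESXi host, list the installed NSX VIBs and their versions
esxcli software vib list | grep -i nsx
```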
Now that the hosts have been upgraded, the final step is to upgrade the NSX-T Manager node.
Step 4l – Upgrade the Management plane
The last step in the process is to upgrade the NSX-T Manager node itself. During the upgrade there is a warning that configuration changes, access to the UI, and REST API operations could be impacted. This is temporary, and all data plane operations continue to function regardless of the status of the NSX-T Manager.
During the upgrade of the NSX-T Manager, there is a connection loss with the UI. This is expected, and the connection is reestablished once the upgrade is completed.
When the upgrade finishes, the admin can log back into the NSX-T Manager UI and see the upgrade has completed successfully.
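Beyond the UI, the Manager’s own CLI can confirm the management plane came back healthy after the upgrade:

```shell
# On the NSX-T Manager CLI, verify the management cluster is stable
get cluster status

# And confirm the core services are running again
get services
```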
Step 5 – Post-upgrade tasks
With a successful upgrade completed, the admin can now run through the post-upgrade checklist to verify that the components are all online and functional. The post checks are similar to the pre-upgrade checks performed in the first blog post.
- Overview Dashboard
The admin notices an increase in system load and will continue to monitor to see if it goes down after time.
- Fabric Nodes
- Host Transport Nodes
The admin notices that one of the ESXi hosts has Tunnels ‘Not Available’. This is because there are no workloads on this specific host attached to the overlay segment, so no tunnels need to be up.
- Edge Transport Nodes
- Edge Clusters
The admin connects via SSH to one of the Edge Nodes in the Edge Cluster and runs ‘get edge-cluster status’ to verify that the high availability of the Edge Cluster is running like normal.
The last verification is to check connectivity from one of the workloads on the overlay.
- North-South Connectivity
- East-West Connectivity
The admin can verify both North-South and East-West connectivity from a workload in the NSX-T domain.
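From a guest VM attached to an overlay segment, the two checks can be as simple as a pair of pings; the addresses below are examples standing in for a VM behind the same Tier-1 gateway and a destination beyond the Tier-0:

```shell
# East-West: ping a VM on another overlay segment behind the same Tier-1
ping -c 4 172.16.20.10

# North-South: ping a destination outside the NSX-T domain via the Tier-0
ping -c 4 8.8.8.8
```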
Step 6 – Troubleshoot upgrade errors
The admin has run through all of the necessary verifications for the platform and only notes an increase in CPU load on the NSX-T Manager, which they will monitor to see if it comes down.
This concludes a successful upgrade of the Healthcare organization’s NSX-T Data Center 2.4.1 deployment to NSX-T Data Center 2.5. They’ll now be able to take advantage of all of the enhancements and new use cases they’ll need the platform for.