NSX-T Data Center 2.5 Upgrade – Practical

In the first post on NSX-T Data Center 2.5 Upgrades, the admin went through all of the necessary checklists and proper prerequisite procedures to ensure a smooth and successful upgrade of their NSX-T Data Center 2.4.1 deployment.

Reviewing the requirements and the step-by-step process:

Requirements:

  • Upgrade the NSX-T deployment from 2.4.1 to 2.5.0
  • Note any outages that may occur

Steps:

  1. Check VMware Product Interoperability Matrix
  2. Check VMware Product Interoperability Matrix – Upgrade Path
  3. Take Backup of NSX-T Manager
  4. Check NSX-T 2.5 Upgrade Guide Official Documentation
    1. Perform upgrade – Steps 4a – 4l
  5. Post-upgrade tasks
  6. Troubleshoot upgrade errors

The NSX-T Data Center 2.5 documentation offers a checklist for customers to follow, which the admin has worked through so far up to Step 3g – If you are using NSX Cloud for your public cloud workload VMs, upgrade NSX Cloud components.

upgrade_pic2

With the prerequisites checked, the admin is now ready to perform the actual upgrade of NSX-T Data Center from 2.4.1 to 2.5.

Step 4h – Upgrade the Upgrade Coordinator

The first step in the upgrade process is to SSH to the NSX-T Manager CLI as the admin user and run ‘get services’ to verify that the install-upgrade service is running on the NSX-T Manager node.
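
If the admin wants to script this check rather than run it interactively, roughly the same information can be pulled over the REST API. Below is a minimal sketch using PowerShell Core 6; the node-services endpoint and property name are assumptions based on the NSX-T node API, and the manager FQDN is illustrative.

    # Hedged sketch: confirm the install-upgrade service is running before starting the upgrade.
    $mgr  = "nsxmgr-01a.corp.local"                     # illustrative manager FQDN
    $cred = Get-Credential -UserName "admin" -Message "NSX-T Manager credentials"

    $svc = Invoke-RestMethod -Uri "https://$mgr/api/v1/node/services/install-upgrade/status" `
        -Authentication Basic -Credential $cred -SkipCertificateCheck
    $svc.runtime_state    # expecting "running"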

upgrade_practical_pic2

The upgrade of the upgrade coordinator starts by uploading the NSX-T Data Center upgrade bundle .mub file to the NSX-T Manager.

upgrade_practical_pic1

 

Uploading and validating the upgrade bundle can take 10-20 minutes.  Once the bundle checks out, the admin sees that they can now begin the upgrade procedure.

upgrade_practical_pic3

Once the ‘Begin Upgrade’ button is pressed, the admin is presented with the EULA and then the ‘Continue’ button to upgrade the Upgrade Coordinator as the first step in the process.

upgrade_practical_pic4

With the Upgrade Coordinator finished upgrading, the admin can now run the prechecks for upgrading the rest of the system.

upgrade_practical_pic5

The prechecks surface any issues that could complicate the upgrade process.  The admin notices a few issues reported by the precheck scan.

upgrade_practical_pic7

Looking deeper, the admin can see exactly what each precheck issue is and whether it really impacts the upgrade.

Looking at the Hosts precheck issues, the admin can see that even with two compute managers connected to NSX-T, the upgrade can run across all of the hosts regardless of which compute manager they belong to.  The host with the issue is a single host in a cluster without DRS enabled.  This host is the test host that the organization uses for testing restores; as a single-host cluster with no workloads on it, DRS is not necessary, so the precheck issue can be safely ignored.  However, the admin will have to place the host in maintenance mode manually to perform the upgrade on it.  For a regular cluster without DRS enabled, this precheck is a reminder to enable it so that hosts can be upgraded without impacting the workloads running on them; DRS evacuates workloads from each host as it enters maintenance mode for the NSX-T Data Center 2.5 upgrade.  The admin decides to turn on DRS on the cluster anyway to prevent any warnings or issues during the process.

upgrade_practical_pic6

Taking a look at the next issue, around the Management Nodes, the admin can see the warning about ensuring that TCP port 1234 is open to prevent communication errors.
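
A quick way to spot-check that nothing in the path is blocking the new port is a simple TCP test from a Windows host; a minimal sketch is below. The hostnames are illustrative, and the test should be run in whichever direction the environment's firewall rules actually apply.

    # Hedged sketch: spot-check TCP 1234 reachability (hostnames are illustrative).
    $targets = @("nsxmgr-01a.corp.local", "edgenode-01a.corp.local")
    foreach ($t in $targets) {
        Test-NetConnection -ComputerName $t -Port 1234 |
            Select-Object ComputerName, RemotePort, TcpTestSucceeded
    }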

upgrade_practical_pic8

With all of the precheck issues having been triaged and accounted for, the admin can proceed to the next step of the upgrade process.

Step 4i – Upgrade the Edge Cluster

The first components of the NSX-T Data Center deployment to be upgraded are the Edge Clusters and the Edge Nodes that make up each cluster.  The admin changes the upgrade mode to ‘Serial’ since this is their first upgrade and they want to watch the process take place, slow and steady.  The NSX-T Data Center Upgrade Coordinator offers a few ways to perform the upgrade, from one-click to more granular, deliberate approaches.  During the upgrade, the admin can see the progress and which Edge Node is being upgraded at the time.

upgrade_practical_pic10

The Edge Node upgrade finishes, and the admin can see all of the details with ‘Successful’ as the status.  They can now proceed to the next step.

upgrade_practical_pic11

Step 4k – Upgrade the hosts

Moving to the next step of upgrading the host clusters, the admin can see all of the clusters under the NSX-T domain, across both vCenter Servers.  They select a serial upgrade order and press ‘Start’.

upgrade_practical_pic12

Each cluster is upgraded separately when using Serial.  Going forward, this can be switched to parallel for faster upgrades.  Looking at vCenter Server, the admin can watch the process and the hosts being upgraded.

upgrade_practical_pic13

upgrade_practical_pic14

upgrade_practical_pic15

Now that the hosts have been upgraded, the final step is to upgrade the NSX-T Manager node.

Step 4l – Upgrade the Management plane

The last step in the process is to upgrade the NSX-T Manager node itself.  During the upgrade there is a warning that configuration changes, access to the UI, and REST API operations may be impacted.  This is temporary, and all data plane operations continue to function regardless of the status of the NSX-T Manager.

upgrade_practical_pic16

During the upgrade of the NSX-T Manager, the connection to the UI is lost.  This is typical, and the connection is reestablished for the admin when the upgrade completes.
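
Rather than refreshing the browser, the admin could also poll the Manager with a small loop like the sketch below (PowerShell Core 6+; the FQDN is illustrative) and come back once it answers again.

    # Hedged sketch: poll until the NSX-T Manager answers on HTTPS again after the upgrade.
    $mgr = "nsxmgr-01a.corp.local"    # illustrative FQDN
    do {
        Start-Sleep -Seconds 30
        try   { Invoke-WebRequest -Uri "https://$mgr/" -SkipCertificateCheck | Out-Null; $up = $true }
        catch { $up = $false }
        Write-Host "$(Get-Date -Format T)  Manager UI reachable: $up"
    } until ($up)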

upgrade_practical_pic17

When the upgrade finishes, the admin can log back into the NSX-T Manager UI and see the upgrade has completed successfully.

Step 5 – Post-upgrade tasks

With a successful upgrade completed, the admin can now run through the post-upgrade checklist to verify that the components are all online and functional.  The post-upgrade checks are similar to the pre-upgrade checks performed in the first blog post.

  • Overview Dashboard

upgrade_practical_pic18

The admin notices an increase in system load and will continue to monitor it to see if it goes down over time.

  • Fabric Nodes
    • Host Transport Nodes

upgrade_practical_pic19

The admin notices that one of the ESXi hosts shows Tunnels as ‘Not Available’.  This is because there are no workloads on this specific host attached to the overlay segment, so no tunnels need to be up.

  • Edge Transport Nodes

upgrade_practical_pic20

  • Edge Clusters

upgrade_practical_pic21

The admin connects via SSH to one of the Edge Nodes in the Edge Cluster and runs ‘get edge-cluster status’ to verify that the Edge Cluster’s high availability is functioning normally.

The last verification is to check connectivity from one of the workloads on the overlay.

  • North-South Connectivity
  • East-West Connectivity

upgrade_practical_pic22

The admin can see that there is both North-South and East-West connectivity from a workload in the NSX-T domain.

Step 6 – Troubleshoot upgrade errors

The admin has gone through all of the necessary verification for the platform and only notes an increase in CPU load on the NSX-T Manager, which they will monitor to see if it comes down.

This concludes a successful upgrade of the Healthcare organization’s NSX-T Data Center 2.4.1 deployment to NSX-T Data Center 2.5.  They can now take advantage of the enhancements and support the new use cases they need the platform for.

 


NSX-T Data Center 2.5 Upgrade – Planning


Upgrades to software products are often thought of as just ‘next, next, next’ and done.  When a software platform runs your most critical business workloads, you’re usually a bit more careful.

Having recently deployed NSX-T Data Center 2.4.1, the Healthcare organization has taken a look at the most recent release of NSX-T Data Center, version 2.5.  Several new use cases have come up that the organization feels 2.5 can address, so they’re looking into what is necessary to upgrade their current NSX-T deployment.

Requirements:

  • Upgrade the NSX-T deployment from 2.4.1 to 2.5.0
  • Note any outages that may occur

Steps:

  1. Check VMware Product Interoperability Matrix
  2. Check VMware Product Interoperability Matrix – Upgrade Path
  3. Take Backup of NSX-T Manager
  4. Check NSX-T 2.5 Upgrade Guide Official Documentation
    1. Perform upgrade – Steps 4a – 4l
  5. Post-upgrade tasks
  6. Troubleshoot upgrade errors

Step 1 – Check VMware Product Interoperability Matrix

One of the first things to do is to check the VMware Product Interoperability Matrix to ensure that the versions of ESXi and vCenter Server are compatible with NSX-T Data Center 2.5.  The organization’s infrastructure is running ESXi 6.7 U2 and vCenter Server 6.7 U3.

upgrade_pic1

From this chart, NSX-T Data Center 2.5.0 supports the ESXi and vCenter Server versions the organization is running.

Step 2 – Check VMware Product Interoperability Matrix – Upgrade Path

On the same web page, clicking the Upgrade Path tab shows the admin which versions of NSX-T Data Center have supported upgrade paths.

upgrade_pic6

Step 3 – Take Backup of NSX-T Manager

The admin runs a current backup of the NSX-T Manager in case a restore is necessary.

restore_pract_pic1

Step 4 – Check NSX-T 2.5 Upgrade Guide Official Documentation

Digging into the NSX-T 2.5 Upgrade Guide, the admin finds that VMware provides a checklist of items to review for upgrading NSX-T Data Center.

upgrade_pic2

Each of these tasks has sections to follow for performing the upgrade of NSX-T.  The admin will add these steps to the existing steps as part of the overall plan.

Step 3a – Review the known upgrade problems and workarounds documented in the NSX-T Data Center release notes.

Doing a quick search in the Release Notes for NSX-T Data Center 2.5 for the word ‘upgrade’, the admin starts looking through the matches to see any changes that might impact the upgrade process.

There are a few items that stand out:

  • Messaging Port Change – There is a port change in the messaging channel from the NSX-T Manager to the Transport and Edge Nodes. The port has changed from TCP 5671 to TCP 1234.
  • Upgrade Order Change – When upgrading to NSX-T 2.5, the new upgrade order is Edge-component upgrade before Host-component upgrade. This enhancement provides significant benefits when upgrading the cloud infrastructure by allowing optimizations that reduce the overall maintenance window.

The admin notes these impacts and also reviews the other, smaller upgrade-related changes in case they run into any of them.

Step 3b – Follow the system configuration requirements and prepare your infrastructure

Reviewing the NSX-T Data Center 2.5 Installation Guide, the admin takes a look at the following components:

  • NSX-T Manager – no changes in the requirements from 2.4.1 to 2.5 in terms of size, disk, vCPU, or memory requirements
  • ESXi Hypervisors – no changes in the requirements from 2.4.1 to 2.5 and the admin verified that the ESXi version was listed on the VMware Product Interoperability Matrix.

Step 3c – Evaluate the operational impact of the upgrade.

  • Manager Upgrade – TCP port 1234 will replace TCP port 5671 from NSX-T Manager to Edge Nodes and Transport Nodes. There should be no impact as there is currently no firewall between the NSX-T Manager and the Transport or Edge Nodes.
  • Edge Cluster Upgrade – One notable impact from the official upgrade documentation is possible disruption to the North-South datapath during the upgrade of the Edge Nodes, along with possible disruption to East-West datapath traffic. This possible disruption requires the admin to notify change management and perform the upgrade during a maintenance period where a disruption has minimal impact to the business.
  • Hosts Upgrade – All ESXi hosts are in a DRS-enabled cluster, and each host will be placed into maintenance mode and evacuated before being upgraded. No impact to the running VMs is anticipated.

Step 3d – Upgrade your supported hypervisor.

The admin confirms that they are running the appropriate version of VMware vSphere that is supported by NSX-T Data Center 2.5.

upgrade_pic3

upgrade_pic4

Provided from – https://kb.vmware.com/s/article/2143832

Step 3e – Verify that the NSX-T Data Center environment is in a healthy state.

To perform this step, the admin logs into the NSX-T Manager and checks the following locations for errors:

  • Overview dashboard

upgrade_pic8

  • Fabric Nodes
    • Host Transport Nodes

upgrade_pic9

  • Edge Transport Nodes

upgrade_pic10

  • Edge Clusters

Checking the Edge Cluster status and the high availability for the cluster requires the CLI on one of the Edge Nodes in the cluster.  Logging in as ‘admin’ via SSH to one of the Edge Nodes, the admin runs the following command – ‘get edge-cluster status’.

upgrade_pic11

Then the admin will double check from a VM:

  • North-South connectivity
  • East-West connectivity

A quick RDP session into one of the production servers lets both connectivity paths be tested.
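
From that session, a couple of quick pings cover both paths; a minimal sketch is below, with illustrative addresses standing in for the organization's real targets.

    # Hedged sketch: basic reachability checks from a workload VM (addresses are illustrative).
    Test-Connection -ComputerName "8.8.8.8"      -Count 2    # North-South (external destination)
    Test-Connection -ComputerName "172.16.20.11" -Count 2    # East-West (VM on another overlay segment)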

upgrade_pic7

Step 3f – Download the latest NSX-T Data Center upgrade bundle.

The admin visits http://my.vmware.com, logs in, and downloads the appropriate upgrade bundle for NSX-T Data Center 2.5.  The file comes with a .mub extension.

upgrade_pic5

Step 3g – If you are using NSX Cloud for your public cloud workload VMs, upgrade NSX Cloud components.

The organization is not currently using any cloud-based workloads, so this step is not applicable at this time.

The steps that follow are part of the actual upgrade process.  Those steps will be continued in the next post.

This blog goes through a typical set of checks during an upgrade process.  Other organizations may have additional processes prior to upgrading, and this blog is not meant to cover every step another organization may take.

Using NSX-T to Test NSX-T and Virtual Machine Recovery with Automation – Practical

Part 1 – Windows SFTP Backup Targets

Part 2a – Using NSX-T to Test NSX-T and Virtual Machine Recovery with Automation – Concept

Part 2b – Using NSX-T to Test NSX-T and Virtual Machine Recovery with Automation – Practical

In Part 2a, the Healthcare organization admins had created several scripts using VMware PowerCLI, PowerShell Core 6, OVF Tool, and NSX-T Policy REST APIs.  Those scripts are located at the following GitHub link for other community admins to consume as well.

The original requirements that the admins were asked to provide a design for were:

Requirements:

  1. Use NSX-T to build a production replica network to test restores of the NSX-T Manager and show virtual machines can also be restored and tested on the same network
  2. Use Veeam to restore the following virtual machines:
    1. Backup Server – Will be used to run automation scripts from
    2. Active Directory – Will be needed for DNS purposes
    3. SFTP Server – Hosts the NSX-T backups that restores will be tested from
  3. Deploy a new NSX-T Manager to test the restore process to it
  4. Use automation wherever possible to continue expanding automated techniques

To meet these requirements, the admin designed the following topology:

finish_topology

  • Standalone Tier-1 Gateway – not connected to any Tier-0 Gateway, preventing northbound communications that would conflict with the production networking
  • Restore Network Segment – Provides a logical network for the restored VMs to attach to
  • Restored Domain Controller – One of the organization’s domain controllers that will provide DNS for the replica network and the VMs attached
  • Restored Backup Server – Hosts the PowerShell scripts that are necessary for scripting part of the deployment on the restored NSX-T Manager. Some of the scripts will need to be run from the Production Backup Server and some of them from the Restored Backup Server since there will be no outside communications to the Restore environment other than vCenter Server direct console access
  • Restored SFTP Server – Hosts the backups of the NSX-T Manager
  • Restored NSX-T Manager – Will be used to test its own restores. NSX-T Manager restores require that the new NSX-T Manager have the same IP address as the production copy.  To test this appropriately, a copy of the production network and IP addressing has to be created
  • vCenter Server B – Manages the Compute Cluster B
  • Compute Cluster B – Provides a non-production host for the restored systems to be placed on that’s not managed by the production vCenter Server A.

For further details on the reasoning behind this topology, you can take a look at Part 2a referenced at the top of this post.

With the scripts created, it’s now time for the admin to work through the workflow processes and test that this strategy will meet the requirements in practice.  This is a review of the workflow process:

restores_pic10

Step 1 – Copy scripts to BACKUP-01a – GitHub download and copy

The scripts just need to be pulled down from GitHub and copied to a location on the BACKUP-01a server.

Step 2 – Copy NSX-T OVA to BACKUP-01a – Download and copy

Another straightforward step: download the NSX-T OVA that matches the exact version of the current NSX-T Manager and copy it to a location on BACKUP-01a.

Step 3 – Install PowerShell Core 6, PowerCLI, and OVFTool – Download installs and install

restore_pract_pic2

Step 4 – Perform a Backup of the NSX-T Manager – Native Backup Tool

A simple step: go into the NSX-T Manager’s Backup & Restore tab, press the ‘BACKUP NOW’ button, and verify its completion.

restore_pract_pic1

Step 5 – Backup SFTP-01a, AD-01a, BACKUP-01a – Single Veeam Backup Job

Once all of the components needed to perform the remaining workflows are installed and configured, the backups of the necessary virtual machines, especially the BACKUP-01a machine, can run.

restore_pract_pic3

Step 6 and 7 – Deploy Testing Tier-1 Gateway and Segment – NSX-T Policy API via PowerCLI

From the BACKUP-01a production server, the admin runs 01_NSXT_DEPLOY.ps1 to build the Tier-1 Gateway and Segment; the script then starts the OVF Tool to deploy the NSX-T Manager OVA file to Compute Cluster B.

restore_pract_pic4

The Tier-1 Gateway has been created, deliberately not linked to a Tier-0 Gateway to prevent northbound connectivity with the overlapping production network, and the ‘nsxt-restore-segment’ has been created for the virtual machines and the new NSX-T Manager to attach to.

restore_pract_pic5

restore_pract_pic6

The admin can also see that the new NSX-T Manager, connected to the ‘nsxt-restore-segment’, is being deployed.

restore_pract_pic7

Step 8 – Adjust NSX-T CPU/Mem Resources and Power-On – PowerCLI

Once the new NSX-T Manager is deployed, the admin wants to adjust the memory reservation so that they can start the NSX-T Manager without running into memory constraints, since the test environment is rather limited.  The deployed NSX-T Manager is the ‘small’ form factor but still has a 16GB memory reservation on it.  From the BACKUP-01a production server, the admin runs 02_NSXT_RESERVATION_ADJUST.ps1 to adjust the memory reservation down to 8GB and then power on the appliance.

restore_pract_pic8

Step 9 – Restore VMs to NSX-T Testing Segment – Veeam Restore Job

To get the virtual machines necessary to help with the NSX-T restore process, and to prove that the admins can restore NSX-T and virtual machines from native and Veeam backups respectively, the admin runs a Restore Entire VM job for the three VMs previously backed up, and…

  • Points the Veeam restores to the Compute Cluster B host
  • Places them on the VM Network
  • Appends ‘_restored’ to each of their VM names
  • Leaves them powered off, so that once restored, the admin can adjust their network configuration to attach them to the ‘nsxt-restore-segment’.

restore_pract_pic9

Step 10 – Change Restored VMs networking to NSX-T Testing Segment – vCenter Server network vMotion

The restored VMs can easily be moved in bulk to the ‘nsxt-restore-segment’ by using the Migrate VMs to Another Network option.
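
The admin uses the vSphere Client wizard here, but the same bulk change could be sketched in PowerCLI as below. Whether -NetworkName resolves an NSX-T segment directly depends on the PowerCLI version and how the segment is backed, so treat this purely as an illustration.

    # Hedged PowerCLI alternative to the 'Migrate VMs to Another Network' wizard.
    # Resolving an NSX-T segment by -NetworkName depends on PowerCLI version and switch backing.
    Get-VM -Name "*_restored" |
        Get-NetworkAdapter |
        Set-NetworkAdapter -NetworkName "nsxt-restore-segment" -Confirm:$false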

restore_pract_pic10

Once the VMs are restored and moved to the ‘nsxt-restore-segment’, they can be powered on and the next step can proceed.

Step 11 – Add NSX-T Restore Config – NSX-T Policy API via PowerCLI

Now that the restored VMs are all attached to the ‘nsxt-restore-segment’ and the new NSX-T Manager is online and attached as well, the admin can access these VMs by using the vSphere Client to open a direct console to the BACKUP-01a_restored VM.  It’s critical to run the remaining scripts from that machine, as there is no outside network access to the new NSX-T Manager appliance, as intended.

Consoling into the BACKUP-01a_restored server, the admin can make some checks to see if network connectivity is indeed limited to the ‘nsxt-restore-segment’.  Taking a quick look at the IPCONFIG output on the BACKUP-01a_restored server and running a few PINGs, the admin can see that the default gateway of the network cannot be reached, while the other VMs and the NSX-T Manager (which has the same IP address as the production NSX-T Manager) respond.
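
Expressed as a couple of Test-Connection calls, the isolation check looks roughly like the sketch below; the addresses are illustrative placeholders for the replica network.

    # Hedged sketch: verify isolation from the restored backup server (addresses are illustrative).
    Test-Connection -ComputerName "172.16.10.1"  -Count 2    # default gateway - expected to fail (no Tier-0 uplink)
    Test-Connection -ComputerName "172.16.10.21" -Count 2    # restored SFTP server - expected to respond
    Test-Connection -ComputerName "172.16.10.30" -Count 2    # restored NSX-T Manager (same IP as production)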

restore_pract_pic11

The admin can also log into the NSX-T Manager UI from the BACKUP-01a_restored server and see that this is a brand-new deployment with no configuration.

restore_pract_pic12

The admin can also see that the Restore configuration is no longer present.  The next step is to put the configuration for restoring the NSX-T Manager back into the new NSX-T Manager.  This NSX-T Manager already has the same IP and name as the production version, which is a requirement for restoration.

restore_pract_pic13

With connectivity to the NSX-T Manager, and confirmation that there is no configuration, the admin can proceed with running the PowerCLI script 03_NSXT_RESTORE_CONFIG.ps1 to add the Restore Configuration to the NSX-T Manager.

restore_pract_pic14

A quick run of the script and a refresh of the NSX-T Manager UI, and the admin can see that the SFTP server configuration is back and all of the backups that have been taken are showing up as well.

restore_pract_pic15

After checking the backup files, the admin picks the first one in the list of Available Backups and clicks on the restore button to apply the configuration.  During the restore process, since this is not a full restore and components such as Edge Nodes and Transport Node hosts are not contactable, the admin may get a few error messages that they can skip through.  Once the restore is done, the admin can take a look at the restored configuration and see that the NSX-T Manager configuration matches the production instance and the restore was successfully finished and validated.

restore_pract_pic16

restore_pract_pic17

With a successful test and the requirements accomplished, the admin can now perform the final steps by running the last two scripts on the BACKUP-01a production server.  One of the scripts, 04_NSXT_RESTORE_CLEANUP.ps1, will shut down and then forcibly delete all of the restored virtual machines and the NSX-T Manager.  The last script, 05_NSXT_DEPLOY_CLEANUP.ps1, runs a Policy API REST command to remove the Tier-1 Gateway and Segment, bringing the entire deployment back to its original, clean state.

restore_pract_pic18

restore_pract_pic19

restore_pract_pic20

The last two posts have shown the Healthcare organization the power of NSX-T and how, even with a small amount of automation, it can support several use cases and provide real value to an organization that is required to test its backups.

Using NSX-T to Test NSX-T and Virtual Machine Recovery with Automation – Conceptual

In the last post, the Healthcare organization configured their NSX-T Manager to send its backups to an SFTP backup server so they can perform restores if necessary.  The Healthcare organization also utilizes Veeam to provide virtual machine-based backups for their virtual infrastructure.  Unfortunately, backing up the NSX-T Manager with Veeam is not supported; restoring it requires a fresh NSX-T Manager installation to be deployed and a backup configuration restored to it, and the Healthcare organization would like to test restores of the NSX-T Manager.

Configuring and actually backing up the NSX-T Manager configuration or a virtual machine is one thing; actually being able to test the backups is another.  A backup is no good if you can’t restore from it.  The organization has found a way to test both their NSX-T backups and their virtual machine backups at the same time to meet the requirements.  Taking some pointers from what they’ve learned previously around using automation tools, they plan to expand their automation learning with this same process.

NSX-T can provide exact copies of production environments running on top of the same underlying physical network, with no changes to that physical network.  The Healthcare organization has placed a very large bet on NSX-T being the networking and security platform for their infrastructure, and is looking to use this capability to provide an isolated environment in which to test restoring their backups.  Keeping an automation mindset in place, the Healthcare organization admins take a look at the requirements they’ll need to meet to accomplish the tasks:

Requirements:

  1. Use NSX-T to build a production replica network to test restores of the NSX-T Manager and show virtual machines can also be restored and tested on the same network
  2. Use Veeam to restore the following virtual machines:
    1. Backup Server – Will be used to run automation scripts from
    2. Active Directory – Will be needed for DNS purposes
    3. SFTP Server – Hosts the NSX-T backups that restores will be tested from
  3. Deploy a new NSX-T Manager to test the restore process to it
  4. Use automation wherever possible to continue expanding automated techniques

The admin draws out the following topology to ensure that they can rebuild a production network replica while not overlapping with the actual production networking.  The topology consists of the following constructs to build out the production replica network:

restores_pic3

  • Standalone Tier-1 Gateway – not connected to any Tier-0 Gateway, preventing northbound communications that would conflict with the production networking
  • Restore Network Segment – Provides a logical network for the restored VMs to attach to
  • Restored Domain Controller – One of the organization’s domain controllers that will provide DNS for the replica network and the VMs attached
  • Restored Backup Server – Hosts the PowerShell scripts that are necessary for scripting part of the deployment on the restored NSX-T Manager. Some of the scripts will need to be run from the Production Backup Server and some of them from the Restored Backup Server since there will be no outside communications to the Restore environment other than vCenter Server direct console access
  • Restored SFTP Server – Hosts the backups of the NSX-T Manager
  • Restored NSX-T Manager – Will be used to test its own restores
  • vCenter Server B – Manages the Compute Cluster B
  • Compute Cluster B – Provides a non-production host for the restored systems to be placed on that’s not managed by the production vCenter Server A.

Before any automation can begin, the admin needs to understand all of the workflow steps that will be necessary and how to perform them so they can put automation around each workflow process.

restores_pic10

 

NSX-T 2.4.x provides a hierarchical intent-based Policy API for customers to use for automation techniques.  The admin takes a look at the NSX-T API official documentation on the Policy API and finds a few REST API commands that could be useful for creating the necessary constructs.  From the configuration of the backup of the NSX-T Manager in the previous post, the admin can also use the information collected there and REST API commands to automate adding the restore configuration into the NSX-T Manager that will be deployed.

The NSX-T Manager is downloaded from the VMware download site as an OVA.  A tool such as OVF Tool can be used to help automate the process of deploying the NSX-T Manager to the new network that will be created.

To wrap all of these different automation techniques into scripts that the admin can use, they’re planning to use PowerCLI and PowerShell Core 6 to build scripts that can be run to automate as much of this process as possible.

The admin performs the following actions to be able to use PowerShell Core 6 and VMware PowerCLI on the Backup Server.  The Backup Server will host the scripts and also be the server where the scripts are run from, both in production and in the restored segment.

Install Prerequisites:

  • PowerShell Core 6
  • VMware PowerCLI
  • OVF Tool

The post isn’t going to go into how to install these items, as they are fairly simple to install with mostly click, click, next.

Each of these processes will be necessary to meet all of the requirements.  There are specific portions of the workflow where processes can be joined together into singular scripts and the admin will attempt to do so within their experience.

The first and second workflow processes in the table consist of building a Veeam Backup Job around all of the virtual machines needed, and ensuring that NSX-T is sending backups to the SFTP server.

Requirement 2 – Use Veeam to restore the following virtual machines

  • Backup Server – Will be used to run automation scripts from
  • Active Directory – Will be needed for DNS purposes
  • SFTP Server – Hosts the NSX-T backups that restores will be tested from

Regardless of the order of the requirements, first and foremost the admin needs to ensure that they have backups of the systems they plan to test restores of.  The admin also hops into the NSX-T Manager console and checks that the latest backup job has completed, or presses the ‘BACKUP NOW’ button to start a fresh backup to the SFTP server.

restores_pic2

For the process of testing backups in this use case, the admins have configured a separate backup job in Veeam that has the 3 virtual machines that will be used for this testing procedure.

restores_pic1

The admin waits to start the backup job in Veeam until the scripts are all built, as they’ll be needed once the Backup Server is restored.  In the meantime, the admin can start to take a look at how to build an isolated NSX-T copy of the production network.

Requirement 1 – Use NSX-T to build a production IP-based isolated network to test restores to NSX-T and show virtual machines can also be restored and tested on the same network

Requirement 3 – Deploy a new NSX-T Manager to test the restore process to it

Requirement 4 – Use automation wherever possible to continue expanding automated techniques

The process of building the production replica network can be accomplished using the NSX-T REST API.  The admin has taken a look at the NSX-T REST API official documentation and found an example of using the hierarchical intent-based API to build the Tier-1 Gateway and the Segment that will be used.  The next process is around using the OVF Tool to deploy the NSX-T Manager to the same segment previously created.  Since these processes can be called from PowerCLI, the admin decides to combine these two workflows into one script.

The code that was built for this resembles the following:

restores_pic4

restores_pic5

This script builds the Tier-1 Gateway and Segment using the NSX-T Policy API, then immediately jumps to using the OVF Tool to deploy the new NSX-T Manager to the previously created Segment.  You can find the actual script over here – github link.  For ease of reading, the OVF arguments were word-wrapped; normally they need to be on one line.
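
As a rough, condensed sketch of that approach (not the actual script), the logic looks something like the following. The object names, subnet, transport zone path, and OVF Tool arguments are illustrative assumptions, and the required --prop: settings (passwords, IP configuration) are omitted for brevity.

    # Hedged sketch of the deploy logic: Policy API PUTs for the Tier-1 and Segment, then OVF Tool.
    $mgr  = "nsxmgr-01a.corp.local"                       # illustrative production manager FQDN
    $cred = Get-Credential -UserName "admin" -Message "Production NSX-T Manager"

    # 1. Standalone Tier-1 Gateway (deliberately not linked to any Tier-0)
    $t1Body = @{ display_name = "restore-t1" } | ConvertTo-Json
    Invoke-RestMethod -Method Put -Uri "https://$mgr/policy/api/v1/infra/tier-1s/restore-t1" `
        -Authentication Basic -Credential $cred -SkipCertificateCheck `
        -ContentType "application/json" -Body $t1Body

    # 2. Restore segment attached to that Tier-1
    $segBody = @{
        display_name        = "nsxt-restore-segment"
        connectivity_path   = "/infra/tier-1s/restore-t1"
        transport_zone_path = "/infra/sites/default/enforcement-points/default/transport-zones/overlay-tz"   # illustrative overlay TZ path
        subnets             = @(@{ gateway_address = "172.16.10.1/24" })
    } | ConvertTo-Json -Depth 5
    Invoke-RestMethod -Method Put -Uri "https://$mgr/policy/api/v1/infra/segments/nsxt-restore-segment" `
        -Authentication Basic -Credential $cred -SkipCertificateCheck `
        -ContentType "application/json" -Body $segBody

    # 3. Deploy the NSX-T Manager OVA to Compute Cluster B (arguments are illustrative)
    & "C:\Program Files\VMware\VMware OVF Tool\ovftool.exe" `
        --acceptAllEulas --allowExtraConfig --powerOffTarget `
        --datastore=CompB-DS01 --deploymentOption=small --name=nsxmgr-restore `
        '--net:Network 1=nsxt-restore-segment' `
        "C:\Scripts\nsx-unified-appliance-2.4.1.ova" `
        "vi://administrator%40vsphere.local@vcenter-b.corp.local/DC-B/host/Compute-Cluster-B/"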

The next process is around changing the memory resources of the NSX-T Manager.  Typically, the NSX-T Manager has a memory reservation to ensure enough memory is available for it to run.  Given this is a testing-environment restore, the admin wants to reduce this reservation so they can start the NSX-T Manager without running into issues.  The admin builds another script to adjust this and start the VM.

The code that was built for this resembles the following:

restores_pic6

This script adjusts the memory reservation down to 8GB and then starts the NSX-T Manager VM.
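
A minimal PowerCLI sketch of that adjustment might look like the following; the vCenter and VM names are illustrative.

    # Hedged sketch: lower the restored NSX-T Manager's memory reservation, then power it on.
    Connect-VIServer -Server "vcenter-b.corp.local"          # illustrative vCenter Server B

    $vm = Get-VM -Name "nsxmgr-restore"                      # illustrative restored manager VM name
    $vm | Get-VMResourceConfiguration |
        Set-VMResourceConfiguration -MemReservationGB 8      # down from the appliance's 16 GB reservation
    $vm | Start-VM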

The next piece of scripting that the admin chooses to do is around putting in the Restore Configuration for the NSX-T Manager into the new NSX-T Manager virtual machine using PowerCLI and the REST API.  The code that was built for this resembles the following:

restores_pic7

This script sends a REST API command to put the restore server configuration into the NSX-T Manager, so it can see the NSX-T backups on the restored SFTP-01a and the admin can choose which one to test a restore from.
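
In sketch form, that call is the same backup-configuration PUT used in Part 1, just pointed at the freshly deployed manager; the paths and values below are illustrative.

    # Hedged sketch: re-apply the SFTP backup configuration so existing backups become visible.
    $mgr  = "nsxmgr-01a.corp.local"     # the restored manager keeps the production name/IP
    $cred = Get-Credential -UserName "admin" -Message "Restored NSX-T Manager"

    $body = Get-Content "C:\Scripts\backup_config.json" -Raw   # illustrative: same JSON body as in Part 1
    Invoke-RestMethod -Method Put -Uri "https://$mgr/api/v1/cluster/backups/config" `
        -Authentication Basic -Credential $cred -SkipCertificateCheck `
        -ContentType "application/json" -Body $body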

The final script the admin decides to build is around clean up of all of the virtual machines and networking components created to test with.  The code built for this resembles the following:

restores_pic8

This script powers down and deletes all of the restored virtual machines and the NSX-T Manager, and then calls the NSX-T Policy API to remove the Tier-1 Gateway and testing Segment that were created, resetting the infrastructure back to its original configuration.
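
A condensed sketch of that cleanup, combining PowerCLI for the VMs with Policy API deletes for the networking (segment first, then the Tier-1), might look like this; the names are illustrative.

    # Hedged sketch of the cleanup logic (names are illustrative).
    Connect-VIServer -Server "vcenter-b.corp.local"

    # Power off and permanently delete the restored VMs and the test NSX-T Manager
    Get-VM -Name "*_restored", "nsxmgr-restore" | ForEach-Object {
        if ($_.PowerState -eq "PoweredOn") { Stop-VM -VM $_ -Confirm:$false }
        Remove-VM -VM $_ -DeletePermanently -Confirm:$false
    }

    # Remove the test networking from the production NSX-T Manager (segment before its Tier-1)
    $mgr  = "nsxmgr-01a.corp.local"
    $cred = Get-Credential -UserName "admin" -Message "Production NSX-T Manager"
    Invoke-RestMethod -Method Delete -Uri "https://$mgr/policy/api/v1/infra/segments/nsxt-restore-segment" `
        -Authentication Basic -Credential $cred -SkipCertificateCheck
    Invoke-RestMethod -Method Delete -Uri "https://$mgr/policy/api/v1/infra/tier-1s/restore-t1" `
        -Authentication Basic -Credential $cred -SkipCertificateCheck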

restores_pic9

There are obviously several areas where the scripting can be improved and even further simplified.  This is a good first start for the admin to meet the requirements and grow their automation skills and further refine the scripting.  In the next post, the admin will put all of these scripts and processes to work and test the full process.  The screenshots of the script code may be tough to read, so the admin has uploaded all of the scripts to this location – https://github.com/vwilmo/NSXT_RESTORE_TESTING 🙂

 

NSX-T Backup and Restore Configuration and Automation | Part 1 – Windows SFTP Backup Targets

Now that the Healthcare organization has completed their journey of migrating from NSX Data Center for vSphere over to NSX-T Data Center, it’s time to do a bit of day 2 configuration, specifically configuring the backups of the NSX-T Manager.

The infrastructure admins that are currently in charge of running the NSX-T environment for the organization are expanding their scripting knowledge a bit and working on automating many of the configurations and operations that NSX-T Data Center requires.  The first area where some simple scripting can help is around configuration and management of NSX-T Backups.

Typically, the admin could go into the NSX-T Manager UI and perform these configurations via the UI.

backups_pic1

Since the admins are wanting to expand their knowledge in scripting and using REST APIs, and the plan is to bring this knowledge forward into performing and checking NSX-T restores later, they’ve opted to use a different approach.

Requirements:

  • Setup Backup configuration for the NSX-T Manager with an eye on automation
  • At least 3 backups per day and automatic backups after configuration changes
  • Maintaining at least 30 days of backups for the NSX-T Manager

Requirement 1 – Setup Backup configuration for the NSX-T Manager with an eye on automation

Requirement 2 – At least 3 backups per day and automatic backups after configuration changes

The first two requirements can be handled with one straightforward approach.  The organization currently has a Cerberus SFTP server that backs up configuration from other devices on their network.  It’s a FIPS 140-2 compliant software package that will work well with NSX-T, and it runs on a Windows Server 2016 machine where the organization stores the backups.  Consulting the official NSX-T documentation for Backup and Restore, the admin finds the items required to perform the configuration.  The information is put into a chart for documentation purposes so that the settings can be tracked and the infrastructure and security teams know what is being used.

backups_table_pic1

Now that the settings have been documented, the admin can take a further look at how to configure them in NSX-T.  The admin has decided to automate the configuration by using the NSX-T REST API with the documented settings.  To be able to do this, a few things will need to happen.

  • Installation of a REST API client – Postman
  • Code example from the NSX-T Data Center API Guide for configuration and testing backups

This post will not go into the installation of Postman; it’s a simple installation.  The following configuration is, however, needed to ensure Postman can call the NSX-T Manager REST API.

backups_table_pic2

After consulting the NSX-T Data Center API Guide, the admin pulled the following code, which should provide the single API call necessary to configure the NSX-T Manager backup schedule.

Example code for backup configuration:

backups_table_pic3

Taking the information collected during the documentation process, the admin can now substitute in the organization-specific configuration that will be used for the body of the REST API call.

Organization-specific code for backup configuration:

backups_table_pic4

When the admin pastes the above configuration into the body of the REST API PUT command and sends the command, they receive a Status 200 OK meaning the command was realized and accepted.

backups_pic2

There are several ways the admin can check the work; the Status 200 OK response displays the result of the command in the Body section of the response.  It is also possible to change the same command from PUT to GET and resend it to retrieve the same configuration.
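
For reference, roughly the same call can be issued from PowerShell Core instead of Postman. The sketch below is a condensed version of the documented backup-configuration body; every value is a placeholder rather than the organization's actual settings.

    # Hedged sketch: PowerShell equivalent of the Postman PUT that configures scheduled backups.
    # Field names follow the NSX-T backup API; every value is a placeholder, not the real settings.
    $mgr  = "nsxmgr-01a.corp.local"
    $cred = Get-Credential -UserName "admin" -Message "NSX-T Manager"

    $config = @{
        backup_enabled  = $true
        backup_schedule = @{
            resource_type           = "IntervalBackupSchedule"
            seconds_between_backups = 28800                       # roughly 3 backups per day
        }
        remote_file_server = @{
            server         = "sftp-01a.corp.local"
            port           = 22
            directory_path = "/nsxt-backups"
            protocol       = @{
                protocol_name         = "sftp"
                ssh_fingerprint       = "SHA256:REPLACE_WITH_DOCUMENTED_FINGERPRINT"
                authentication_scheme = @{
                    scheme_name = "PASSWORD"
                    username    = "nsxbackup"
                    password    = "REPLACE_ME"
                }
            }
        }
        passphrase                 = "REPLACE_ME"
        inventory_summary_interval = 300
    }

    Invoke-RestMethod -Method Put -Uri "https://$mgr/api/v1/cluster/backups/config" `
        -Authentication Basic -Credential $cred -SkipCertificateCheck `
        -ContentType "application/json" -Body ($config | ConvertTo-Json -Depth 10)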

With the configuration in place, the admin can issue another command via the REST API that will initiate a backup from the NSX-T Manager to the SFTP server.

backups_table_pic5

Running this command will take some time to send the request and get a response, as the actual backup has to complete before the Status 200 OK is sent back.  As you can see from the Postman output below, the request took 1 minute and 1.08 seconds to complete.

backups_pic3
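
Reusing the $mgr and $cred variables from the earlier sketch, the on-demand backup call itself is a one-liner.

    # Hedged sketch: trigger an on-demand backup; the call returns once the backup completes.
    Invoke-RestMethod -Method Post -Uri "https://$mgr/api/v1/cluster?action=backup_to_remote" `
        -Authentication Basic -Credential $cred -SkipCertificateCheck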

The admin can now go into the NSX-T Manager UI and check the configuration and backup status visually as well, and it appears that everything is configured properly and backing up to the SFTP server as expected.

backups_pic4

The admin also takes a quick look at the SFTP server and the backup directory to check that files have been created.

backups_pic5

Requirement 3 – Maintaining at least 30 days of backups for the NSX-T Manager

To meet the last requirement, while still keeping Requirement 1’s eye toward automation, the admin needs to find a way to keep only 30 days of backups for the NSX-T Manager.  The official NSX-T documentation provides several scripts that can be run on Linux-based systems and, coupled with a cron job, used to clean up the backup directory on an automatic, scheduled basis.  However, no scripts are supplied for Windows-based SFTP systems, and the Healthcare organization is using a Windows machine for their SFTP server.  The admin decides to create their own script using PowerShell and a Windows scheduled task to provide the same benefit.

Taking a look at the SFTP server, the admin can see that there are several folders created for the backup files.

  • ccp-backups – Contains .tar files of the Control Plane backup for NSX-T
  • cluster-node-backups – Contains .tar files in date specific folders for the NSX-T Manager/Policy/Controller Cluster and each individual NSX-T Manager backup
  • inventory-summary – Contains .json files for every inventory object in the NSX-T Manager backup

Each of these folders contains multiple files after a backup occurs for NSX-T.  Below is an example:

backups_pic6

The admin determines that the easiest way to handle this is to use PowerShell to create a script that will automatically look for files older than 30 days and remove the folders and files within the folders appropriately.  The code looks like this and can be found on GitHub as well.

backups_table_pic6
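
The full script is on GitHub; a minimal sketch of the same idea is below, with an illustrative path standing in for the real SFTP root.

    # Hedged sketch of the retention clean-up (the actual script is linked on GitHub).
    $SftpRoot = "D:\SFTP\nsxt-backups"      # illustrative backup root on the Windows SFTP server
    $Daysback = -30                         # keep 30 days of backups
    $Cutoff   = (Get-Date).AddDays($Daysback)

    # Remove backup files older than the cutoff
    Get-ChildItem -Path $SftpRoot -Recurse -File |
        Where-Object { $_.LastWriteTime -lt $Cutoff } |
        Remove-Item -Force

    # Prune the now-empty dated folders while keeping the top-level folder structure intact
    Get-ChildItem -Path $SftpRoot -Recurse -Directory |
        Where-Object { -not (Get-ChildItem -Path $_.FullName -Recurse -File) } |
        Remove-Item -Recurse -Force -ErrorAction SilentlyContinue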

The admin tests this script by changing the $Daysback variable in the script to -0 as that will delete all of the backups that have been taken thus far.  Running the script, the admin can see that all of the backups have been removed and the folder structure for the backups is still intact.

backups_pic7

After running the backup again, the admin can see that the new backup files are present in the folder.

backups_pic8

With the script working as intended, the admin can now create a Windows scheduled task that calls the PowerShell script on a nightly basis to clean up the SFTP backup directory.

backups_table_pic7
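
The task itself could be registered with the built-in ScheduledTasks cmdlets; a minimal sketch follows, with an illustrative script path and schedule.

    # Hedged sketch: register a nightly task that runs the clean-up script (paths are illustrative).
    $action  = New-ScheduledTaskAction -Execute "powershell.exe" `
        -Argument "-NoProfile -ExecutionPolicy Bypass -File C:\Scripts\Clean-NsxtBackups.ps1"
    $trigger = New-ScheduledTaskTrigger -Daily -At "1:00 AM"
    Register-ScheduledTask -TaskName "NSX-T Backup Cleanup" -Action $action -Trigger $trigger `
        -User "SYSTEM" -RunLevel Highest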

With the task created, the admin runs it manually and verifies that the current backup is removed as intended.  The admin can now run a fresh backup of the configuration and change the $Daysback variable back to -30.

backups_pic9

The requirements have been fulfilled, and the admin can now move on to the next task, which is testing the backup and restore process in Part 2.

NSX Data Center for vSphere to NSX-T Data Center Migration – Part 3

Planning and preparation are complete and the Healthcare organization is now ready to proceed with Part 3 of the NSX Data Center for vSphere to NSX-T Data Center migration.

The migration from NSX Data Center for vSphere to NSX-T Data Center involves the following processes.  These efforts will be covered over a series of blog posts, one for each step in the process:

  • Understanding the NSX Data Center for vSphere Migration Process – Part 1
    • Checking Supported Features
    • Checking Supported Topologies
    • Checking Supported Limits
    • Reviewing the Migration Process and the prerequisites
  • Preparing to Migrate the NSX Data Center for vSphere Environment – Part 2
    • Prepare a new NSX-T Data Center Environment and necessary components
    • Prepare NSX Data Center for vSphere for Migration
  • Migration of NSX Data Center for vSphere to NSX-T Data Center – Part 3

As when they started the process in Part 1, consulting the official documentation on the process and the steps to perform is recommended.

https://docs.vmware.com/en/VMware-NSX-T-Data-Center/2.4/migration/GUID-78947686-CC6C-460B-A185-7E2EE7D3BCED.html

MIGRATION OF NSX DATA CENTER FOR VSPHERE TO NSX-T DATA CENTER

The migration to NSX-T Data Center is a multi-step process.  The steps are outlined below:

  • Import the NSX Data Center for vSphere Configuration
  • Resolve Issues with the NSX Data Center for vSphere Configuration
  • Migrate the NSX Data Center for vSphere Configuration
  • Migrate NSX Data Center for vSphere Edges
  • Migrate NSX Data Center for vSphere Hosts
  • Finish the NSX Data Center for vSphere Migration

Upon further review of each step, the organization deployed two NSX-T Data Center Edge Nodes which will be used as replacements for the NSX Data Center for vSphere Edge Services Gateways.  These Edge Nodes were deployed using the official documentation and added to the NSX-T Manager.

migration_coordinator_process_pic8

migration_coordinator_process_pic9

IMPORT THE NSX DATA CENTER FOR VSPHERE CONFIGURATION

To begin the process, the organization needs to enable the Migration Coordinator on the NSX-T Manager that they deployed.  A quick SSH session into the NSX-T Manager using the admin account provides the means to run the command that starts the Migration Coordinator service and enables the user interface that will be used for the migration in the NSX-T Manager:

migration_coordinator_start

Now that the Migration Coordinator service is running, the user interface in the NSX-T Manager will be enabled.

migration_coordinator_process_pic1

The next step in the process is to authenticate to the NSX Manager and the vCenter Server.

migration_coordinator_process_pic2

migration_coordinator_process_pic3

With the NSX Data Center for vSphere Manager and vCenter Server added in, the organization can start the import configuration step.

migration_coordinator_process_pic4

The organization sees ‘Successful’ on importing the existing configuration into NSX-T Data Center.  There is an option to ‘View Imported Topology’ which will give them a nice visual diagram of the configuration details that were imported.

migration_coordinator_process_pic5

A successful import allows the organization to proceed with the next step in the migration process.

RESOLVE ISSUES WITH THE NSX DATA CENTER FOR VSPHERE CONFIGURATION

Moving to the next step, the organization is presented with all of the ‘issues’ that need to be resolved to move forward with the migration process. The total number of inputs that need to be resolved is listed, and as they are resolved, they are tracked as well.

migration_coordinator_process_pic6

Several of the issues appear to be items the organization already has configured.  Each issue has a recommendation from the Migration Coordinator for the organization to consider before moving forward with the migration process.  The more important issues listed are the ones that deal with the ‘EDGE’, as those will result in new NSX-T Data Center Edge Nodes being deployed to replace the existing Edge Services Gateways.

migration_coordinator_process_pic7

After selecting the EDGE category of issues to resolve, the organization was met with the following items to remediate before it was able to proceed to the next step.

migration_coordinator_process_pic10

  • IP addresses for TEPs on all Edge transport nodes will be allocated from the selected IP Pool. You must ensure connectivity between Edge TEPs and NSX for vSphere VTEPs.

This issue requires putting in the TEP_POOL that was created for the Edge Nodes already.

  • An NSX-T Edge node will provide the connectivity to replace NSX-v edge. Enter an IP address for the uplink.

This issue requires putting in a valid uplink IP address for the NSX-T Edge Node.  The organization will want to use the same IP address that the NSX Data Center for vSphere Edge Services Gateway is currently using since the TOR is statically routed to that IP address.

  • An NSX-T Edge node will provide HA redundancy for NSX-v edge. Enter an IP address for the uplink on this Edge node. This IP address must be in the same subnet as the uplink of the other NSX-T Edge used to replace this edge.

This issue requires putting in a valid IP address for the HA redundancy that the Edge Node will provide.

  • An NSX-T Edge node will provide HA redundancy for edge replacing NSX-v edge. Enter an unused fabric ID for Edge node. See System > Fabric > Nodes > Edge Transport Nodes.

This issue requires selecting the UUID imported from the NSX-T Edge Nodes and choosing which one will replace the NSX Data Center for vSphere Edge Services Gateway.

  • An NSX-T Edge node will provide the connectivity to replace NSX-v edge. Enter an unused fabric ID for this Edge node. See System > Fabric > Nodes > Edge Transport Nodes.

This issue is similar to the one above but requires selecting the second NSX-T Edge Node UUID instead.

  • An NSX-T Edge node will provide the connectivity to replace NSX-v Edge. Enter a VLAN ID for the uplink on this Edge node.

This issue requires putting in the VLAN ID of the uplink adapter that will be used.

With all of the items resolved, the organization is ready to proceed with the actual migration.  Given that some data plane outages will occur during this process as the Edge Services Gateways migrate to NSX-T Gateways, the organization has decided to perform the migration during a scheduled maintenance window.

MIGRATE THE NSX DATA CENTER FOR VSPHERE CONFIGURATION

Pressing start, the Migration Coordinator begins migrating the configuration over to the NSX-T Data Center Manager.  This part of the process does not incur an outage, as it’s a copy of the configuration.

migration_coordinator_process_pic11

Once the configuration has been copied over, the organization can now see all of the components that have been created in NSX-T Data Center from the configuration imported.

NETWORKING

The organization can see that a new Tier-0 Gateway has been created and has the routing configuration that the Edge Services Gateways had.

networking

networking2networking3

GROUPS

The organization checks the new Group objects and can see that those new inventory objects have been created.

groups1

SECURITY

Lastly, the organization checks the security objects, specifically that their Distributed Firewall and Service Composer rulesets are migrated over properly.

security1

MIGRATE NSX DATA CENTER FOR VSPHERE EDGES

The next part will incur an outage as this is the process of migrating the NSX Data Center for vSphere Edge Services Gateways over to the NSX-T Data Center Edge Nodes.  This will involve moving the IP addressing over.

migration_coordinator_process_pic12

migration_coordinator_process_pic13

Once the Edges have been migrated over, the organization can see that a new Transport Zone, Edge Node Cluster, and N-VDS switch have been created.

MIGRATE NSX DATA CENTER FOR VSPHERE HOSTS

The next step involves swapping the ESXi host software components for NSX Data Center for vSphere out with NSX-T Data Center.

hosts1

With the ESXi hosts migrated, the organization has now successfully moved from NSX Data Center for vSphere over to NSX-T Data Center.

finished1.png

Now that the Healthcare organization has migrated over to NSX-T Data Center, they can start the decommissioning of the NSX Data Center for vSphere components that are no longer needed.  The topology of their data center environment with NSX-T Data Center now looks like this.

finish_topology

NSX Data Center for vSphere to NSX-T Data Center Migration – Part 2

Part 2 of the NSX Data Center for vSphere to NSX-T Data Center migration for the Healthcare organization is around preparing the new NSX-T Data Center environment by deploying, installing, and configuring the necessary components.

The migration from NSX Data Center for vSphere to NSX-T Data Center involves the following processes.  These efforts will be covered over a series of blog posts, one for each step in the process:

  • Understanding the NSX Data Center for vSphere Migration Process – Part 1
    • Checking Supported Features
    • Checking Supported Topologies
    • Checking Supported Limits
    • Reviewing the Migration Process and the prerequisites
  • Preparing to Migrate the NSX Data Center for vSphere Environment – Part 2
    • Prepare a new NSX-T Data Center Environment and necessary components
    • Prepare NSX Data Center for vSphere for Migration
  • Migration of NSX Data Center for vSphere to NSX-T Data Center – Part 3

As when they started the process in Part 1, consulting the official documentation on the process and the steps to perform is recommended.

https://docs.vmware.com/en/VMware-NSX-T-Data-Center/2.4/migration/GUID-78947686-CC6C-460B-A185-7E2EE7D3BCED.html

PREPARE A NEW NSX-T DATA CENTER ENVIRONMENT AND NECESSARY COMPONENTS

Preparing a new NSX-T Data Center environment involves deploying the NSX-T Manager.  Installation of the NSX-T Manager is beyond the scope of this blog post as the official documentation has the necessary steps involved.  The key piece of information for this part of the migration process is to deploy the NSX-T Manager appliance(s) on ESXi hosts that are NOT part of the NSX Data Center for vSphere environment that’s being migrated.  The Healthcare organization deployed the new NSX-T Manager on the same hosts that the NSX Data Center for vSphere Manager is currently deployed on.

before_topology.with.nsxt

The next step is to add the vCenter Server that is associated with the NSX Data Center for vSphere environment.  NSX-T Data Center has a completely separate user interface to manage the NSX-T installation that will not conflict with the NSX Data Center for vSphere user interface added as a plug-in to the vSphere Client.  The steps to add the vCenter Server into NSX-T as a Compute Manager are documented in the same official documentation as part 2 of the migration process.  Once added into NSX-T, this is what the organization sees:

nsxt_compute_manager_added

There is a recommendation to add more NSX-T Managers to form a cluster for a proper production deployment, but since the Migration Coordinator is only run on one of the NSX-T Manager appliances, they can be added later.

The last step to prepare the NSX-T side of the migration process for the organization is to create an IP Pool for the Edge Tunnel Endpoints (TEPs).  The organization already has a VLAN network for the VXLAN Tunnel Endpoints on the ESXi hosts for NSX Data Center for vSphere.  The VLAN is constrained using an IP range, and part of the VLAN network will be assigned for the Edge TEPs as well as the host TEPs that will also need to be created.

tep_pool_pic1

A TEP pool is created that the organization will reference during the migration.

tep_pool_pic2

An IP range of addresses in the VLAN network is allocated and verified to not be stepped on by any other devices in the range.

PREPARE NSX DATA CENTER FOR VSPHERE FOR MIGRATION

With the NSX-T Data Center environment setup and the steps followed, the next part of the migration process involves preparing the NSX Data Center for vSphere environment.

The first step involves configuring any hosts that might not already be added to a vSphere Distributed Switch.  The Healthcare organization has moved all of the data center hosts over to a vSphere Distributed Switch so this part of the process is not applicable to them.

The second step of this part of the migration process involves checking the Distributed Firewall Filter Export Version of the virtual machines.  This involves checking the ESXi hosts where these workloads reside and running a few simple commands.  Checking the vSphere Client, the workloads and the hosts they reside on can be seen, so the organization knows which hosts to check filter export versions on.

vcenter_vm_inventory

Now that the information on the virtual workload is confirmed, a simple SSH session into the ESXi host will determine if the export version is correct or needs to be modified to support the migration process.

export_filter_check

The check of the workload shows that the Distributed Firewall Filter Export Version is the correct version for this workload.  The organization can now check all of the other workloads to ensure this is the case with those as well.  This is the last step in part 2 of the process, and once it is fully completed the Healthcare organization can move to Part 3 and begin the actual migration process.