Using NSX-T to Test NSX-T and Virtual Machine Recovery with Automation – Practical

Part 1 – Windows SFTP Backup Targets

Part 2a – Using NSX-T to Test NSX-T and Virtual Machine Recovery with Automation – Concept

Part 2b – Using NSX-T to Test NSX-T and Virtual Machine Recovery with Automation – Practical

In Part 2a, the Healthcare organization's admins created several scripts using VMware PowerCLI, PowerShell Core 6, OVF Tool, and the NSX-T Policy REST APIs. Those scripts are located at the following GitHub link for other community admins to consume as well.

The original requirements the admins were asked to design against were:

Requirements:

  1. Use NSX-T to build a production replica network to test restores of the NSX-T Manager, and show that virtual machines can also be restored and tested on the same network
  2. Use Veeam to restore the following virtual machines:
    1. Backup Server – Will be used to run automation scripts from
    2. Active Directory – Will be needed for DNS purposes
    3. SFTP Server – Hosts the NSX-T backups that restores will be tested from
  3. Deploy a new NSX-T Manager to test the restore process against
  4. Use automation wherever possible to continue expanding automated techniques

To meet these requirements, the admin designed the following topology:

finish_topology

  • Standalone Tier-1 Gateway – not connected to any Tier-0 Gateway, preventing northbound communications that would conflict with the production networking
  • Restore Network Segment – Provides a logical network for the restored VMs to attach to
  • Restored Domain Controller – One of the organization's domain controllers, which will provide DNS for the replica network and the VMs attached to it
  • Restored Backup Server – Hosts the PowerShell scripts needed to script part of the deployment on the restored NSX-T Manager. Some of the scripts will need to run from the Production Backup Server and some from the Restored Backup Server, since the only way into the Restore environment is direct console access through vCenter Server
  • Restored SFTP Server – Hosts the backups of the NSX-T Manager
  • Restored NSX-T Manager – Will be used to test its own restores. An NSX-T Manager restore requires that the new NSX-T Manager have the same IP address as the production copy, so to test this appropriately, a copy of the production network and IP addressing has to be created
  • vCenter Server B – Manages the Compute Cluster B
  • Compute Cluster B – Provides a non-production host for the restored systems to be placed on that’s not managed by the production vCenter Server A.

For further details on the reasoning behind this topology, take a look at Part 2a, referenced at the top of this post.

With the scripts created, it's now time for the admin to work through the workflow and test that this strategy meets the requirements in practice. Here is a review of the workflow process:

restores_pic10

Step 1 – Copy scripts to BACKUP-01a – GitHub download and copy

The scripts just need to be pulled down from GitHub and copied to a location on the BACKUP-01a server.

Step 2 – Copy NSX-T OVA to BACKUP-01a – Download and copy

Another straightforward step: download the NSX-T OVA that exactly matches the version of the current NSX-T Manager and copy it to a location on BACKUP-01a.

Step 3 – Install PowerShell Core 6, PowerCLI, and OVF Tool – Download installers and install

restore_pract_pic2
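
If you're following along, the PowerCLI piece can be pulled straight from the PowerShell Gallery; here's a minimal sketch (OVF Tool itself is a separate installer downloaded from VMware):

# Install VMware PowerCLI from the PowerShell Gallery (works on PowerShell Core 6)
Install-Module -Name VMware.PowerCLI -Scope CurrentUser
# Lab environments typically use self-signed certificates, so relax certificate checking
Set-PowerCLIConfiguration -InvalidCertificateAction Ignore -Confirm:$false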

Step 4 – Perform a Backup of the NSX-T Manager – Native Backup Tool

A pretty simple step: go into the NSX-T Manager's Backup & Restore tab, press the 'BACKUP NOW' button, and verify its completion.

restore_pract_pic1
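
For those who want to keep everything scripted, the same on-demand backup can be triggered through the NSX-T REST API instead of the UI button; a sketch, with a placeholder manager FQDN:

# Trigger an on-demand backup to the configured SFTP target via the NSX-T API
# (-Authentication and -SkipCertificateCheck require PowerShell Core 6+)
$cred = Get-Credential   # NSX-T admin credentials
Invoke-RestMethod -Uri 'https://nsxmgr-01a.corp.local/api/v1/cluster?action=backup_to_remote' `
    -Method Post -Authentication Basic -Credential $cred -SkipCertificateCheck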

Step 5 – Backup SFTP-01a, AD-01a, BACKUP-01a – Single Veeam Backup Job

Once all of the components needed for the remaining workflows are installed and configured, the backups of the necessary virtual machines, especially the BACKUP-01a machine, can occur.

restore_pract_pic3

Step 6 and 7 – Deploy Testing Tier-1 Gateway and Segment – NSX-T Policy API via PowerCLI

From the BACKUP-01a production server, the admin runs 01_NSXT_DEPLOY.ps1, which builds the Tier-1 Gateway and Segment and then starts OVF Tool to deploy the NSX-T Manager OVA file to Compute Cluster B.
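
The script itself lives in the GitHub repository, but the Policy API calls at its core look roughly like the following sketch; the manager FQDN, object IDs, and transport zone path are placeholders, not the script's actual values:

# Connection details -- placeholder FQDN and credentials
$nsx  = 'https://nsxmgr-01a.corp.local'
$cred = Get-Credential

# Create a standalone Tier-1 Gateway. Note there is no tier0_path, which is
# what blocks northbound connectivity into the overlapping production network.
$t1 = @{ resource_type = 'Tier1'; display_name = 'restore-t1' } | ConvertTo-Json
Invoke-RestMethod -Uri "$nsx/policy/api/v1/infra/tier-1s/restore-t1" -Method Patch `
    -Body $t1 -ContentType 'application/json' `
    -Authentication Basic -Credential $cred -SkipCertificateCheck

# Create the restore segment and attach it to that Tier-1 Gateway
$seg = @{
    display_name        = 'nsxt-restore-segment'
    connectivity_path   = '/infra/tier-1s/restore-t1'
    transport_zone_path = '/infra/sites/default/enforcement-points/default/transport-zones/<overlay-tz-id>'
} | ConvertTo-Json
Invoke-RestMethod -Uri "$nsx/policy/api/v1/infra/segments/nsxt-restore-segment" -Method Patch `
    -Body $seg -ContentType 'application/json' `
    -Authentication Basic -Credential $cred -SkipCertificateCheck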

restore_pract_pic4

The Tier-1 Gateway has been created, not linked to a Tier-0 Gateway so that there is no northbound connectivity into the overlapping production network, and the 'nsxt-restore-segment' has been created for the virtual machines and the new NSX-T Manager to attach to.

restore_pract_pic5

restore_pract_pic6

The admin can also see that the new NSX-T Manager, connected to the 'nsxt-restore-segment', is being deployed.

restore_pract_pic7

Step 8 – Adjust NSX-T CPU/Mem Resources and Power-On – PowerCLI

Once the new NSX-T Manager is deployed, the admin wants to adjust the memory reservation so that the NSX-T Manager can start without running into memory constraints, since the test environment is rather limited. The deployed NSX-T Manager is in the 'small' form factor, but it still carries a 16GB memory reservation. From the BACKUP-01a production server, the admin runs 02_NSXT_RESERVATION_ADJUST.ps1 to adjust the memory reservation down to 8GB and then power on the appliance.
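
In PowerCLI terms, the heart of that script looks something like this sketch (the restored appliance's VM name is a placeholder):

# Lower the restored NSX-T Manager's memory reservation, then power it on
$vm = Get-VM -Name 'nsxtmgr-restored'
$vm | Get-VMResourceConfiguration |
    Set-VMResourceConfiguration -MemReservationGB 8
Start-VM -VM $vm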

restore_pract_pic8

Step 9 – Restore VMs to NSX-T Testing Segment – Veeam Restore Job

To get the virtual machines needed for the NSX-T restore process, and to prove that the admins can restore NSX-T and virtual machines from native and Veeam backups respectively, the admin runs a Restore Entire VM job for the three VMs previously backed up (sketched in PowerShell after the screenshot below), and…

  • Points the Veeam restores to the Compute Cluster B host
  • Places them on the VM Network
  • Appends ‘_restored’ to each of their VM names
  • Leaves them powered off, so that once restored, the admin can adjust their network configurations to attach to the ‘nsxt-restore-segment’

restore_pract_pic9
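
The same restore could also be driven from Veeam's PowerShell snap-in on BACKUP-01a; a rough sketch under that assumption (the target host name is a placeholder, and exact cmdlet parameters vary between Veeam versions):

# Load Veeam's PowerShell snap-in (newer releases ship a module instead)
Add-PSSnapin VeeamPSSnapin
# Restore the latest restore point of each VM to the Compute Cluster B host,
# renamed with the '_restored' suffix and left powered off
$targetHost = Get-VBRServer -Name 'esx-comp-b1.corp.local'
foreach ($name in 'BACKUP-01a','AD-01a','SFTP-01a') {
    $rp = Get-VBRRestorePoint -Name $name | Sort-Object CreationTime | Select-Object -Last 1
    Start-VBRRestoreVM -RestorePoint $rp -Server $targetHost `
        -VMName "${name}_restored" -PowerUp:$false
}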

Step 10 – Change Restored VMs' networking to NSX-T Testing Segment – vCenter Server network vMotion

The restored VMs can easily be moved in bulk to the ‘nsxt-restore-segment’ by using the Migrate VMs to Another Network option.

restore_pract_pic10
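
The same bulk move can be scripted with PowerCLI, assuming the segment shows up as a vCenter network object (older PowerCLI builds may need the port group object passed via -Portgroup instead of a name):

# Re-attach every restored VM's NIC to the NSX-T restore segment
Get-VM -Name '*_restored' | Get-NetworkAdapter |
    Set-NetworkAdapter -NetworkName 'nsxt-restore-segment' -Confirm:$false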

Once the VMs are restored and moved to the ‘nsxt-restore-segment’, they can be powered on and the next step can proceed.

Step 11 – Add NSX-T Restore Config – NSX-T Policy API via PowerCLI

Now that the restored VMs are all attached to the 'nsxt-restore-segment' and the new NSX-T Manager is online and attached as well, the admin can access these VMs by opening a direct console from the vSphere Client to the BACKUP-01a_restored VM. It's critical to run the remaining scripts from that machine, as there is no outside network access to the new NSX-T Manager appliance, as intended.

Consoling into the BACKUP-01a_restored server, the admin can run some checks to confirm that network connectivity is indeed limited to the 'nsxt-restore-segment'. Taking a quick look at the ipconfig output on the BACKUP-01a_restored server, the admin can see that they cannot ping the default gateway of the network; however, they are able to ping the other VMs and the NSX-T Manager (which has the same IP address as the production NSX-T Manager).

restore_pract_pic11

The admin can also log into the UI of the NSX-T Manager from the BACKUP-01a_restored server and can see that this is a brand-new deployment with no configuration.

restore_pract_pic12

The admin can also see that the Restore configuration is no longer present. The next step is to get the configuration for restoring the NSX-T Manager put back into the new appliance. This NSX-T Manager already has the same IP address and name as the production version, which is a requirement for restoration.

restore_pract_pic13

With connectivity to the NSX-T Manager, and confirmation that there is no configuration on it, the admin can proceed with running the PowerCLI script 03_NSXT_RESTORE_CONFIG.ps1 to add the Restore configuration to the NSX-T Manager.
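
Under the hood, the script pushes the SFTP settings back in through the NSX-T backup configuration API; a hedged sketch, where the server, directory, fingerprint, and credentials are all placeholders:

# Rebuild the SFTP backup/restore configuration on the new NSX-T Manager
$cred = Get-Credential   # NSX-T admin credentials
$body = @{
    remote_file_server = @{
        server         = 'sftp-01a.corp.local'
        port           = 22
        directory_path = '/nsx-backups'
        protocol       = @{
            protocol_name   = 'sftp'
            ssh_fingerprint = '<sftp-host-key-fingerprint>'
            authentication_scheme = @{
                scheme_name = 'PASSWORD'
                username    = 'backupuser'
                password    = '<password>'
            }
        }
    }
    passphrase = '<backup-passphrase>'
} | ConvertTo-Json -Depth 5
Invoke-RestMethod -Uri 'https://nsxmgr-01a.corp.local/api/v1/cluster/backups/config' `
    -Method Put -Body $body -ContentType 'application/json' `
    -Authentication Basic -Credential $cred -SkipCertificateCheck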

restore_pract_pic14

A quick run of the script and a refresh of the NSX-T Manager UI, and the admin can see that the SFTP server configuration is back and all of the backups that have been taken are showing up as well.

restore_pract_pic15

After checking the backup files, the admin picks the first one in the list of Available Backups and clicks the Restore button to apply the configuration. Since this is not a full restore, and components such as Edge Nodes and Transport Node hosts are not contactable, the admin may get a few error messages during the restore that they can skip through. Once the restore is done, the admin can take a look at the restored configuration and see that it matches the production instance and that the restore finished and validated successfully.

restore_pract_pic16

restore_pract_pic17

With a successful test and the requirements accomplished, the admin can now perform the final steps by running the last two scripts on the BACKUP-01a production server. The first script, 04_NSXT_RESTORE_CLEANUP.ps1, shuts down and then forcibly deletes all of the restored virtual machines and the NSX-T Manager. The last script, 05_NSXT_DEPLOY_CLEANUP.ps1, runs Policy API REST calls to remove the Tier-1 Gateway and Segment, bringing the entire deployment back to its original, clean state.
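
Roughly speaking, the two cleanup scripts boil down to something like this sketch (the VM names and the gateway/segment IDs are placeholders matching the earlier sketches):

# 04_NSXT_RESTORE_CLEANUP.ps1 equivalent: stop and delete the restored VMs
Get-VM -Name '*_restored', 'nsxtmgr-restored' | ForEach-Object {
    if ($_.PowerState -eq 'PoweredOn') { Stop-VM -VM $_ -Confirm:$false }
    Remove-VM -VM $_ -DeletePermanently -Confirm:$false
}

# 05_NSXT_DEPLOY_CLEANUP.ps1 equivalent: delete the segment first, then the Tier-1
$nsx  = 'https://nsxmgr-01a.corp.local'
$cred = Get-Credential
Invoke-RestMethod -Uri "$nsx/policy/api/v1/infra/segments/nsxt-restore-segment" `
    -Method Delete -Authentication Basic -Credential $cred -SkipCertificateCheck
Invoke-RestMethod -Uri "$nsx/policy/api/v1/infra/tier-1s/restore-t1" `
    -Method Delete -Authentication Basic -Credential $cred -SkipCertificateCheck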

restore_pract_pic18

restore_pract_pic19

restore_pract_pic20

The last two posts have shown the Healthcare organization the power of NSX-T and how, even with a small amount of automation, it can be used to work through several use case examples and provide real value to an organization that is required to test its backups.

VCP-NV Blueprint Breakdown

I spent the last few weeks breaking down the VCP-NV v1.0 blueprint as best I could. I wanted to share it with the community in the hope that it helps someone else out.

The blueprint is pretty extensive for a VCP-level exam, but several sections mirror sections in other blueprints. The tools provided in the blueprint document cover almost everything you would need to break it down, but I found that some of them simply weren't good enough to provide proper information, so I added links to additional resources at the bottom of this post. I spent countless hours in the VMware Hands-on-Labs running through the NSX interface and CLI; I probably loaded HOL-SDC-1303 over 50 times altogether. Bottom line: if you don't have direct access to NSX, those HOLs will provide you with pretty much everything you need for the VCP-NV blueprint other than deployment. I suggest going through those labs multiple times.

All in all, I hope this is helpful in your studies. I'll update the document as I catch things that are wrong or lacking.

You can find the breakdown here

Helpful sites –

http://www.yet.org/2014/09/nsxv-troubleshooting/

http://blogs.vmware.com/management/2014/05/vcac-nsx-dynamically-configuring-application-specific-network-services.html

http://blogs.vmware.com/networkvirtualization/tag/vmware-nsx

Helpful HOL training –

HOL-SDC-1303

HOL-SDC-1404

HOL-SDC-1425

Editing vmnic names – vSphere 5.5

One of the hosts in my lab had a bad NIC that I found the other day. It's a quad port NIC, and only two of the four ports were showing up. After some testing, I went ahead and replaced the NIC with a spare I had lying around, and found that when I rebooted the server, the vmnics were completely out of whack on the host.

Doing some research, I found a KB article explaining that changing the file was unsupported, but since this is a lab, and that's never stopped me in the past, I went ahead and edited the esx.conf file anyway.

I've already fixed this issue on my lab server, so I went back and broke it again for demonstration purposes. As you can see, the vmnic27 NIC is completely jacked up; it should be vmnic7 in our order. So let's edit the esx.conf file and make it work.

esx_conf_fix_pic1

The configuration file can be found in the following directory on the ESXi host: /etc/vmware/esx.conf. Below are the excerpts from the esx.conf file where you need to make the appropriate changes to the vmnic naming.


/vmkdevmgr/pci/s00000002:02.01/alias = "vmnic27"
/device/000:009:00.1/vmkname = "vmnic27"
/net/pnic/child[0014]/mac = "00:10:18:c0:f8:c6"
/net/pnic/child[0014]/virtualMac = "00:50:56:50:f8:c6"
/net/pnic/child[0014]/name = "vmnic27"

As you can see, there are three places in the file where the wrong name exists. We should double-check to make sure that what we're changing accurately reflects the true NIC we want to change. We're looking for '00:10:18:c0:f8:c6'.

esx_conf_fix_pic2

So we've confirmed that this is in fact the correct NIC, as the MAC addresses match.

Note – if you have a dual or quad port NIC, ESXi numbers the vmnics based on slot and then appears to number them by MAC address in order. As you can see above, the ports on the quad port NIC have the exact same MAC address with the exception of the last two hexadecimal characters. The lowest hex value gets the lowest vmnic number on that card, and the numbering goes up until all ports are numbered. Does this hold up? It does appear that way. If you take a look at the vmnics listed in the screenshot below, you can see that they correspond like this:

vmnic4 – 00:10:18:c0:f8:c0
vmnic5 – 00:10:18:c0:f8:c2
vmnic6 – 00:10:18:c0:f8:c4
vmnic7 – 00:10:18:c0:f8:c6 <- should be this but shows as vmnic27, because it’s broken

So the theory seems to hold up. You want to take this into consideration, because I had more than one NIC that was screwed up due to an entire quad NIC change-out, which meant I had to match MAC addresses to the vmnics to determine the order. Needless to say, it was a mess, but I was able to correct it.
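
One quick way to double-check the MAC-to-vmnic mapping before editing anything is with PowerCLI (the host name here is a placeholder):

# List the host's physical NICs and their MAC addresses to confirm the expected order
Get-VMHost -Name 'esx-01.lab.local' |
    Get-VMHostNetworkAdapter -Physical |
    Select-Object Name, Mac |
    Sort-Object Name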

Now let's fix it. Make a backup copy of the esx.conf file; call it esxoriginal.conf if you want. Then make the changes to the sections of the esx.conf file shown below, save it back to the host, and reboot.


/vmkdevmgr/pci/s00000002:02.01/alias = "vmnic7"
/device/000:009:00.1/vmkname = "vmnic7"
/net/pnic/child[0014]/mac = "00:10:18:c0:f8:c6"
/net/pnic/child[0014]/virtualMac = "00:50:56:50:f8:c6"
/net/pnic/child[0014]/name = "vmnic7"

Once the reboot is complete, we take a look at our network adapters in the vSphere Client, and we can see that vmnic7 has been restored to its proper name. Pretty simple.

esx_conf_fix_pic3

User-Defined Network Resource Pools

This isn't anything profound, but as I was going through my VCAP-DCA5 materials, one of the objectives concerned user-defined network resource pools. While creating them is well documented, the study guides I was using never mentioned where you actually enable them. It's pretty simple, actually. Here's a quick recap of what we're trying to accomplish:

Create a new user-defined network resource pool.

user_net_resource_step1

Name the pool

user_net_resource_step2

Assign the pool to a dvPortGroup

user_net_resource_step3

Confirm

user_net_resource_step4

You can also do this from another interface by clicking on the 'Manage Port Groups' link in the same window

user_net_resource_step5

Another objective point taken care of. Easy stuff.