NSX-T Backup and Restore Configuration and Automation | Part 1 – Windows SFTP Backup Targets

Now that the Healthcare organization has completed their journey of migrating from NSX Data Center for vSphere over to NSX-T Data Center, it’s time to do a bit of day 2 configuration, specifically configuring the backups of the NSX-T Manager.

The infrastructure admins who are currently in charge of running the NSX-T environment for the organization are expanding their scripting knowledge and working on automating many of the configurations and operations that NSX-T Data Center requires. The first area where some simple scripting can help is the configuration and management of NSX-T backups.

Typically, the admin could perform these configurations directly in the NSX-T Manager UI.

backups_pic1

Since the admins want to expand their knowledge of scripting and REST APIs, and the plan is to carry this knowledge forward into performing and checking NSX-T restores later, they’ve opted for a different approach.

Requirements:

  • Set up backup configuration for the NSX-T Manager with an eye on automation
  • At least 3 backups per day and automatic backups after configuration changes
  • Maintain at least 30 days of backups for the NSX-T Manager

Requirement 1 – Set up backup configuration for the NSX-T Manager with an eye on automation

Requirement 2 – At least 3 backups per day and automatic backups after configuration changes

The first two requirements can be handled with one straightforward approach.  The organization currently has a Cerberus SFTP server that backs up configurations from other devices on its network.  It’s a FIPS 140-2 compliant software package that will work well with NSX-T, and it runs on a Windows Server 2016 machine where the organization stores the backups.  Consulting the official NSX-T documentation for Backup and Restore, the admin finds the items required to perform the configuration.  The information is put into a chart for documentation purposes so that the settings can be tracked and the infrastructure and security teams know what is being used.

backups_table_pic1

Now that the settings have been documented accordingly, the admin can take a closer look at how to configure them in NSX-T.  The admin has decided to take the following approach to automating the configuration: they will use the NSX-T REST API to apply the documented settings.  To do this, a few things need to be in place:

  • Installation of a REST API client – Postman
  • A code example from the NSX-T Data Center API Guide for configuring and testing backups

This post will not go into the installation of Postman; it’s a simple installation.  The following configuration is needed, however, to ensure Postman can properly call the NSX-T Manager REST API.

backups_table_pic2

After consulting the NSX-T Data Center API Guide, the admin pulled the following code, which should provide the single API call needed to configure the NSX-T Manager backup schedule.

Example code for backup configuration:

backups_table_pic3

Taking the information collected during the documentation process, the admin can now substitute in the organization-specific configuration that will be used for the body of the REST API call.

Organization-specific code for backup configuration:

backups_table_pic4
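For reference, a minimal PowerShell sketch of the same PUT call is shown below as an alternative to pasting the body into Postman.  The manager FQDN, credentials, SFTP fingerprint, directory, and passphrase are all placeholders, and the body fields follow the BackupConfiguration schema from the NSX-T API guide; an 8-hour IntervalBackupSchedule satisfies the three-backups-per-day requirement, and the setting for backups after configuration changes is part of the same backup configuration but is left out of this sketch.

# Sketch only: configure NSX-T backups via PUT /api/v1/cluster/backups/config (PowerShell 7+)
# Every value below is a placeholder; substitute the documented, organization-specific settings.
$NsxManager = 'nsxmgr.corp.local'
$User       = 'admin'
$Pass       = 'VMware1!VMware1!'
$Headers    = @{
    Authorization  = 'Basic ' + [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes("${User}:${Pass}"))
    'Content-Type' = 'application/json'
}

# JSON body: an 8-hour IntervalBackupSchedule gives at least 3 backups per day,
# plus the SFTP target details and the backup passphrase
$Body = @'
{
  "backup_enabled": true,
  "backup_schedule": {
    "resource_type": "IntervalBackupSchedule",
    "seconds_between_backups": 28800
  },
  "remote_file_server": {
    "server": "sftp.corp.local",
    "port": 22,
    "directory_path": "/nsx-backups",
    "protocol": {
      "protocol_name": "sftp",
      "ssh_fingerprint": "SHA256:<fingerprint>",
      "authentication_scheme": {
        "scheme_name": "PASSWORD",
        "username": "nsxbackup",
        "password": "<password>"
      }
    }
  },
  "passphrase": "<backup-passphrase>"
}
'@

Invoke-RestMethod -Method Put -Uri "https://$NsxManager/api/v1/cluster/backups/config" `
    -Headers $Headers -Body $Body -SkipCertificateCheck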

When the admin pastes the above configuration into the body of the REST API PUT command and sends it, they receive a Status 200 OK, meaning the call was accepted and the configuration applied.

backups_pic2

There are several ways the admin can check the work; the Status 200 OK response displays the resulting configuration in the Body section of the response.  It is also possible to change the same call from PUT to GET and resend it to retrieve the same result.
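The GET version of the call is shown below as a quick sketch, reusing the placeholder $NsxManager and $Headers values from the PUT sketch above.

# Verify the stored backup configuration (reuses the placeholder values from the PUT sketch)
Invoke-RestMethod -Method Get -Uri "https://$NsxManager/api/v1/cluster/backups/config" `
    -Headers $Headers -SkipCertificateCheck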

With the configuration in place, the admin can issue another command via the REST API that will initiate a backup from the NSX-T Manager to the SFTP server.

backups_table_pic5
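A hedged PowerShell equivalent of that command is sketched below, assuming the cluster-level backup_to_remote action from the API guide and placeholder credentials.

# Sketch only: trigger an on-demand backup to the configured SFTP server (PowerShell 7+)
# The manager FQDN and credentials are placeholders.
$NsxManager = 'nsxmgr.corp.local'
$User       = 'admin'
$Pass       = 'VMware1!VMware1!'
$Headers    = @{ Authorization = 'Basic ' + [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes("${User}:${Pass}")) }

Invoke-RestMethod -Method Post -Uri "https://$NsxManager/api/v1/cluster?action=backup_to_remote" `
    -Headers $Headers -SkipCertificateCheck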

Running this command takes some time to return a response, since the backup itself has to complete before the Status 200 OK is sent back.  As you can see from the Postman output below, the request took 1 minute and 1.08 seconds to complete.

backups_pic3

The admin can now go into the NSX-T Manager UI and check the configuration and backup status visually as well, and it appears that everything is configured properly and backing up to the SFTP server as expected.

backups_pic4

The admin also takes a quick look at the SFTP server and the backup directory to check that files have been created.

backups_pic5

Requirement 3 – Maintain at least 30 days of backups for the NSX-T Manager

To meet the last requirement, while still keeping Requirement 1’s eye on automation, the admin needs a way to keep only 30 days of backups for the NSX-T Manager.  The official NSX-T documentation provides several scripts that can be run on Linux-based systems and, coupled with a cron job, can clean up the backup directory on an automatic, scheduled basis.  However, no scripts are supplied for Windows-based SFTP systems, and the Healthcare organization is using a Windows machine for its SFTP server.  The admin decides to create their own PowerShell script and a Windows scheduled task to provide the same benefit.

Taking a look at the SFTP server, the admin can see that there are several folders created for the backup files.

  • ccp-backups – Contains .tar files of the Control Plane backup for NSX-T
  • cluster-node-backups – Contains .tar files in date-specific folders for the NSX-T Manager/Policy/Controller cluster and each individual NSX-T Manager backup
  • inventory-summary – Contains .json files for every inventory object in the NSX-T Manager backup

Each of these folders contains multiple files after a backup occurs for NSX-T.  Below is an example:

backups_pic6

The admin determines that the easiest way to handle this is to create a PowerShell script that automatically looks for files older than 30 days and removes the files and folders appropriately.  The code looks like this and can be found on GitHub as well.

backups_table_pic6
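The script on GitHub is the working copy; as a rough sketch of the approach, it looks something like this, with a placeholder path for the SFTP backup root.

# Sketch of the cleanup logic: remove NSX-T backup files older than $Daysback days.
# The backup root is a placeholder path on the Windows SFTP server.
$BackupRoot = 'D:\SFTPRoot\nsx-backups'
$Daysback   = -30
$Cutoff     = (Get-Date).AddDays($Daysback)

# Remove backup files older than the cutoff from ccp-backups, cluster-node-backups,
# and inventory-summary
Get-ChildItem -Path $BackupRoot -Recurse -File |
    Where-Object { $_.LastWriteTime -lt $Cutoff } |
    Remove-Item -Force

# Clean out the now-empty date-specific folders under cluster-node-backups while
# leaving the top-level folder structure intact
Get-ChildItem -Path (Join-Path $BackupRoot 'cluster-node-backups') -Directory |
    Where-Object { -not (Get-ChildItem -Path $_.FullName -Recurse -File) } |
    Remove-Item -Recurse -Force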

The admin tests the script by changing the $Daysback variable in the script to -0, as that will delete all of the backups taken thus far.  Running the script, the admin can see that all of the backups have been removed and the folder structure for the backups is still intact.

backups_pic7

After running the backup again, the admin can see that the new backup files are present in the folder.

backups_pic8

With the script working as intended, the admin can now create a Windows scheduled task to call the PowerShell script on a nightly basis to clean up the SFTP backup directory.

backups_table_pic7
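A sketch of registering that task from PowerShell is shown below; the script path, task name, and start time are placeholders, and the same task can just as easily be built through the Task Scheduler UI.

# Sketch only: register a nightly task that runs the cleanup script.
# The script path, task name, and start time are placeholders.
$Action  = New-ScheduledTaskAction -Execute 'powershell.exe' `
    -Argument '-NoProfile -ExecutionPolicy Bypass -File "C:\Scripts\Remove-OldNsxBackups.ps1"'
$Trigger = New-ScheduledTaskTrigger -Daily -At 1am

Register-ScheduledTask -TaskName 'NSX-T SFTP Backup Cleanup' -Action $Action -Trigger $Trigger `
    -User 'NT AUTHORITY\SYSTEM' -RunLevel Highest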

With the task created, the admin runs it manually and verifies that the current backup is removed as intended.  The admin can now run a fresh backup of the configuration and change the $Daysback variable back to -30.

backups_pic9

The requirements have been fulfilled, and the admin can now move on to the next task: testing the backup and restore process in Part 2.


Getting the PernixData FVP Acceleration Policy from the CLI

When I was building out the PernixData/SnapMirror script, I was playing around with the ‘Set-PrnxAccelerationPolicy’ command. What I noticed was that there wasn’t a ‘Get-PrnxAccelerationPolicy’. A quick tweet later, I got a response from PernixData product management, Bala Narasimhan. He pointed me to a command you can run that gives you enough feedback to figure this out pretty easily. This command is only available within FVP 1.5.

The first thing we need to do is pull up the lab. I have two datastores and a VM that are being accelerated in both write-through and write-back to demonstrate the differences.

pernix_accel_cli_pic1

We’re going to run the following commands to list out the policy. Per Bala, we’re not going to see ‘Write-Back’ or ‘Write-through’ in the output. We’ll see a number returned. The numbers represent the policy being applied.

pernix_accel_cli_pic2

As you can see, we’re getting ‘cachePolicy’ values of 3 and 7. These values represent:

  • 3 – Write-Through
  • 7 – Write-Back
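Since it’s the numeric value that comes back, a tiny helper can translate it into a friendly name. This is just a sketch that works on the cachePolicy number itself rather than assuming any particular Prnx cmdlet.

# Translate an FVP cachePolicy number into a friendly name (values per the output above)
function Convert-PrnxCachePolicy {
    param([int]$CachePolicy)

    switch ($CachePolicy) {
        3       { 'Write-Through' }
        7       { 'Write-Back' }
        default { "Unknown cachePolicy value: $CachePolicy" }
    }
}

# Example usage: Convert-PrnxCachePolicy -CachePolicy 7   ->   Write-Back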

Thanks again Bala for the quick help!

Automating NetApp SnapMirror and PernixData FVP Write-Back Caching v1.0

I wanted to toss up a script I’ve been working on in my lab that automates transitioning SnapMirror volumes on a NetApp array from Write-Back caching with PernixData’s FVP to Write-Through, so you can properly take snapshots of the underlying volumes for replication purposes. This is just a v1.0 script and I’m sure I’ll modify it more going forward, but I wanted to give people a place to start.

Assumptions made:

  • You’re accelerating entire datastores and not individual VMs.
  • The naming schemes between LUNs in vCenter and Volumes on the NetApp Filer are close.

Requirements:

  • You’ll need the DataONTAP PowerShell Toolkit 3.1 from NetApp. It’s community driven, but you’ll still need a NetApp login to download it. It should be free to sign up. Here’s a link to it.
  • You’ll need to do some credential and password building first, the instructions are in the comments of the script.
  • You’ll need to be running FVP version 1.5.

What this script does:

  • Pulls the SnapMirror information from a NetApp controller, specifically source and destination information based on ‘Idle’ status and ‘LagTimeTS’. The ‘LagTimeTS’ threshold is adjustable so you can focus on SnapMirrors that have a distinct schedule based on lag time and aren’t currently in a transferring state.
  • Takes the names of the volumes in question and passes them to the PernixData Management Server to transition them from Write-Back to Write-Through, then waits an adjustable amount of time for the volumes to change to Write-Through and for the cache to de-stage back to the array.
  • Performs a SnapMirror update on the same volumes originally pulled and waits an adjustable amount of time for the snapshots to take place.
  • Resets the datastores back to Write-Back with 1 network peer (adjustable), as sketched below.
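To give a feel for the overall flow before downloading it, here is a rough sketch of the logic. Connect-NaController, Get-NaSnapmirror, and Invoke-NaSnapmirrorUpdate come from the DataONTAP toolkit, but the property names on the SnapMirror objects and the parameters on Set-PrnxAccelerationPolicy are assumptions for illustration only; the downloadable script is the working version.

# Rough sketch of the workflow only. The Prnx parameter names and SnapMirror property
# names below are assumptions; refer to the downloadable script for the working version.
Import-Module DataONTAP
# (assumes the PernixData FVP PowerShell module is already loaded)

$WaitForDestage  = 300    # seconds to wait for the cache to de-stage (adjustable)
$WaitForSnapshot = 300    # seconds to wait for the SnapMirror update (adjustable)
$MinLagSeconds   = 3600   # only act on relationships with at least this much lag (adjustable)

Connect-NaController -Name 'netapp01' | Out-Null    # controller name is a placeholder

# 1. Find idle SnapMirror relationships that have built up enough lag
$mirrors = Get-NaSnapmirror | Where-Object {
    $_.Status -eq 'idle' -and $_.LagTimeTS.TotalSeconds -ge $MinLagSeconds
}

foreach ($mirror in $mirrors) {
    # Derive the datastore name from the source volume (naming schemes are assumed to match)
    $datastore = ($mirror.Source -split ':')[-1]

    # 2. Transition the datastore to Write-Through and let the cache de-stage
    Set-PrnxAccelerationPolicy -Name $datastore -WriteThrough             # parameters assumed
    Start-Sleep -Seconds $WaitForDestage

    # 3. Kick off the SnapMirror update and give the snapshots time to complete
    Invoke-NaSnapmirrorUpdate -Destination $mirror.Destination | Out-Null
    Start-Sleep -Seconds $WaitForSnapshot

    # 4. Put the datastore back into Write-Back with 1 network peer
    Set-PrnxAccelerationPolicy -Name $datastore -WriteBack -NumWBPeers 1  # parameters assumed
}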

Comments and suggestions are always welcomed. I’m always open to learning how to make it more efficient and I’m sure there are several ways to tackle this.

You can download the script from here.