Using NSX-T to Test NSX-T and Virtual Machine Recovery with Automation – Practical

Part 1 – Windows SFTP Backup Targets

Part 2a – Using NSX-T to Test NSX-T and Virtual Machine Recovery with Automation – Concept

Part 2b – Using NSX-T to Test NSX-T and Virtual Machine Recovery with Automation – Practical

In Part 2a, the Healthcare organization admins had created several scripts using VMware PowerCLI, PowerShell Core 6, OVF Tool, and NSX-T Policy REST APIs.  Those scripts are located at the following GitHub link for other community admins to consume as well.

The original requirements the admins were asked to design for were:

Requirements:

  1. Use NSX-T to build a production replica network to test restores of the NSX-T Manager and show virtual machines can also be restored and tested on the same network
  2. Use Veeam to restore the following virtual machines:
    1. Backup Server – Will be used to run automation scripts from
    2. Active Directory – Will be needed for DNS purposes
    3. SFTP Server – Hosts the NSX-T backups that restores will be tested from
  3. Deploy a new NSX-T Manager to test the restore process to it
  4. Use automation wherever possible to continue expanding automated techniques

To meet these requirements, the admin designed the following topology:

finish_topology

  • Standalone Tier-1 Gateway – not connected to any Tier-0 Gateway, preventing northbound communications that would conflict with the production networking
  • Restore Network Segment – Provides a logical network for the restored VMs to attach to
  • Restored Domain Controller – One of the organization's domain controllers that will provide DNS for the replica network and the VMs attached
  • Restored Backup Server – Hosts the PowerShell scripts that are necessary for scripting part of the deployment on the restored NSX-T Manager. Some of the scripts will need to be run from the Production Backup Server and some of them from the Restored Backup Server since there will be no outside communications to the Restore environment other than vCenter Server direct console access
  • Restored SFTP Server – Hosts the backups of the NSX-T Manager
  • Restored NSX-T Manager – Will be used to test its own restores. NSX-T Manager restores require that the new NSX-T Manager have the same IP address as the production copy. To test this appropriately, we have to create a copy of the production network and IP addressing
  • vCenter Server B – Manages the Compute Cluster B
  • Compute Cluster B – Provides a non-production host for the restored systems to be placed on that’s not managed by the production vCenter Server A.

For further details on the reasoning behind this topology, you can take a look at Part 2a referenced at the top of this thread.

With the scripts created, it’s now time for the admin to work through the workflow processes and test that this strategy will meet the requirements in practice.  This is a review of the workflow process:

restores_pic10

Step 1 – Copy scripts to BACKUP-01a – GitHub download and copy

The scripts just need to be pulled down from GitHub and copied to a location on the BACKUP-01a server

Step 2 – Copy NSX-T OVA to BACKUP-01a – Download and copy

Another straightforward step: download the NSX-T OVA that matches the exact version of the current NSX-T Manager and copy it to a location on BACKUP-01a.

Step 3 – Install PowerShell Core 6, PowerCLI, and OVFTool – Download installers and install

restore_pract_pic2

Step 4 – Perform a Backup of the NSX-T Manager – Native Backup Tool

A pretty simple step: go into the NSX-T Manager, open the Backup & Restore tab, press the 'BACKUP NOW' button, and verify its completion.

restore_pract_pic1

Step 5 – Backup SFTP-01a, AD-01a, BACKUP-01a – Single Veeam Backup Job

Once all of the components needed to perform the remaining workflows are installed and configured, the backups of the necessary virtual machines, especially the BACKUP-01a machine, can occur.

restore_pract_pic3

Step 6 and 7 – Deploy Testing Tier-1 Gateway and Segment – NSX-T Policy API via PowerCLI

From the BACKUP-01a production server, the admin runs 01_NSXT_DEPLOY.ps1 to build the Tier-1 Gateway and Segment; the script then starts the OVF Tool to deploy the NSX-T Manager OVA file to Compute Cluster B.
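For readers who want a feel for the Policy API portion of that script, here is a minimal PowerShell Core sketch. The manager FQDN, credentials, Tier-1/Segment names, transport zone ID, and subnet are all placeholder assumptions for illustration; the actual 01_NSXT_DEPLOY.ps1 on GitHub is the authoritative version.

    # Placeholder values throughout; substitute your own NSX-T Manager, credentials, and transport zone
    $nsxManager = "nsxmgr-01a.corp.local"
    $cred = Get-Credential -Message "NSX-T admin credentials"
    $auth = [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes("$($cred.UserName):$($cred.GetNetworkCredential().Password)"))
    $headers = @{ Authorization = "Basic $auth"; "Content-Type" = "application/json" }

    # Create a standalone Tier-1 Gateway (no Tier-0 link, so no northbound connectivity)
    $t1Body = @{ display_name = "t1-restore" } | ConvertTo-Json
    Invoke-RestMethod -Method Patch -SkipCertificateCheck -Headers $headers -Body $t1Body `
        -Uri "https://$nsxManager/policy/api/v1/infra/tier-1s/t1-restore"

    # Create the restore segment attached to that Tier-1 Gateway
    $segBody = @{
        display_name        = "nsxt-restore-segment"
        transport_zone_path = "/infra/sites/default/enforcement-points/default/transport-zones/<overlay-tz-id>"
        subnets             = @(@{ gateway_address = "192.168.110.1/24" })
    } | ConvertTo-Json -Depth 5
    Invoke-RestMethod -Method Patch -SkipCertificateCheck -Headers $headers -Body $segBody `
        -Uri "https://$nsxManager/policy/api/v1/infra/tier-1s/t1-restore/segments/nsxt-restore-segment"

Because the Tier-1 Gateway is intentionally left without a Tier-0 link, the overlapping production IP space never leaves the test segment.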

restore_pract_pic4

The Tier-1 Gateway has been created, not linked to a Tier-0 Gateway, to prevent northbound connectivity with the overlapping production network, and the 'nsxt-restore-segment' has been created for the virtual machines and the new NSX-T Manager to attach to.

restore_pract_pic5

restore_pract_pic6

The admin can also see that the new NSX-T Manager, connected to the 'nsxt-restore-segment', is being deployed.

restore_pract_pic7

Step 8 – Adjust NSX-T CPU/Mem Resources and Power-On – PowerCLI

Once the new NSX-T Manager is deployed, the admin wants to adjust the memory reservation so that they can start the NSX-T Manager without running into memory constraints, since the test environment is rather limited. The deployed NSX-T Manager is in the 'small' form factor, but still has a 16GB memory reservation on it. From the BACKUP-01a production server, the admin runs 02_NSXT_RESERVATION_ADJUST.ps1 to adjust the memory reservation down to 8GB and then power on the appliance.
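The core of that adjustment, sketched with PowerCLI (the VM name below is a placeholder; the real logic lives in 02_NSXT_RESERVATION_ADJUST.ps1 on GitHub):

    # Assumes an existing Connect-VIServer session to vCenter Server B;
    # 'nsxtmanager-restore' is a placeholder name for the newly deployed appliance
    $vm = Get-VM -Name "nsxtmanager-restore"

    # Drop the default 16GB memory reservation down to 8GB for the constrained test environment
    Get-VMResourceConfiguration -VM $vm | Set-VMResourceConfiguration -MemReservationGB 8

    # Power on the appliance once the reservation fits
    Start-VM -VM $vm -Confirm:$false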

restore_pract_pic8

Step 9 – Restore VMs to NSX-T Testing Segment – Veeam Restore Job

To get the virtual machines necessary to help in the NSX-T restore process, and to prove that the admins can restore NSX-T and virtual machines from native and Veeam backups respectively, the admin runs a 'Restore entire VM' job for the three VMs previously backed up, and…

  • Points the Veeam restores to the Compute Cluster B host
  • Places them on the VM Network
  • Appends ‘_restored’ to each of their VM names
  • Leaves them powered off so that, once restored, the admin can adjust their network configurations to attach them to the 'nsxt-restore-segment'.

restore_pract_pic9

Step 10 – Change Restored VMs networking to NSX-T Testing Segment – vCenter Server network vMotion

The restored VMs can easily be moved in bulk to the ‘nsxt-restore-segment’ by using the Migrate VMs to Another Network option.

restore_pract_pic10

Once the VMs are restored and moved to the ‘nsxt-restore-segment’, they can be powered on and the next step can proceed.

Step 11 – Add NSX-T Restore Config – NSX-T Policy API via PowerCLI

Now that the restored VMs are all added to the 'nsxt-restore-segment' and the new NSX-T Manager is online and attached as well, the admin can access these VMs through the vSphere Client, using a direct console to the BACKUP-01a_restored VM. It's critical to run the remaining scripts from that machine, as there is no outside network access to the new NSX-T Manager appliance, as intended.

Consoling into the BACKUP-01a_restored server, the admin can make some checks to see if network connectivity is indeed limited to the 'nsxt-restore-segment'. Taking a quick look at the ipconfig output of the BACKUP-01a_restored server, the admin can see that they cannot ping the default gateway of the network; however, they are able to ping the other VMs and the NSX-T Manager (which has the same IP address as the production NSX-T Manager).

restore_pract_pic11

The admin can also log into the UI of the NSX-T Manager from the BACKUP-01a_restored server and can see that this is a brand-new deployment with no configurations.

restore_pract_pic12

The admin can also see that the Restore configuration is no longer present. The next step is to get the configuration for restoring the NSX-T Manager put back into the new NSX-T Manager. This NSX-T Manager already has the same IP address and name as the production version, which is a requirement for restoration.

restore_pract_pic13

With connectivity to the NSX-T Manager, and confirmation that there are no configurations present, the admin can proceed with running the PowerCLI script 03_NSXT_RESTORE_CONFIG.ps1 to add the Restore Configuration into the NSX-T Manager.
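At its core, that script likely amounts to a single authenticated PUT of the same SFTP configuration documented in Part 1; the FQDN, credential handling, and body file name below are placeholder assumptions for this sketch.

    # Run from BACKUP-01a_restored; the FQDN resolves to the restored NSX-T Manager inside the test segment
    $nsxManager = "nsxmgr-01a.corp.local"
    $cred = Get-Credential -Message "NSX-T admin credentials"
    $auth = [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes("$($cred.UserName):$($cred.GetNetworkCredential().Password)"))
    $headers = @{ Authorization = "Basic $auth"; "Content-Type" = "application/json" }

    # Same JSON body used for the backup configuration in Part 1 (SFTP server, port, directory, passphrase)
    $body = Get-Content -Raw .\backup_config.json

    Invoke-RestMethod -Method Put -SkipCertificateCheck -Headers $headers -Body $body `
        -Uri "https://$nsxManager/api/v1/cluster/backups/config"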

restore_pract_pic14

A quick run of the script and a refresh of the NSX-T Manager UI, and the admin can see that the SFTP server configuration is back and all of the backups that have been taken are showing up as well.

restore_pract_pic15

After checking the backup files, the admin picks the first one in the list of Available Backups and clicks on the restore button to apply the configuration. During the restore process, since this is not a full restore and components such as Edge Nodes and Transport Node hosts are not contactable, the admin may get a few error messages that they can skip through. Once the restore is done, the admin can take a look at the restored configuration and see that it matches the production instance, confirming that the restore finished successfully and has been validated.

restore_pract_pic16

restore_pract_pic17

With a successful test and the requirements accomplished, the admin can now perform the final steps by running the last two scripts on the BACKUP-01a production server. One of the scripts, 04_NSXT_RESTORE_CLEANUP.ps1, will shut down and then forcibly delete all of the restored virtual machines and the NSX-T Manager. The last script, 05_NSXT_DEPLOY_CLEANUP.ps1, runs a Policy API REST command to remove the Tier-1 Gateway and Segment to bring the entire deployment back to its original, clean state.
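A rough sketch of what those two cleanup steps boil down to is shown below; the VM naming pattern, appliance name, and Tier-1/Segment IDs are the same placeholder assumptions used earlier, and the real scripts are on GitHub.

    # 04 – power off and permanently delete the restored VMs and the test NSX-T Manager
    # (assumes a Connect-VIServer session to vCenter Server B)
    $testVMs = Get-VM -Name "*_restored", "nsxtmanager-restore"
    $testVMs | Stop-VM -Confirm:$false
    $testVMs | Remove-VM -DeletePermanently -Confirm:$false

    # 05 – remove the test Segment first, then the Tier-1 Gateway, via the Policy API
    # ($nsxManager and $headers are built the same way as in the earlier deploy sketch)
    Invoke-RestMethod -Method Delete -SkipCertificateCheck -Headers $headers `
        -Uri "https://$nsxManager/policy/api/v1/infra/tier-1s/t1-restore/segments/nsxt-restore-segment"
    Invoke-RestMethod -Method Delete -SkipCertificateCheck -Headers $headers `
        -Uri "https://$nsxManager/policy/api/v1/infra/tier-1s/t1-restore"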

restore_pract_pic18

restore_pract_pic19

restore_pract_pic20

The last two posts have shown the Healthcare organization the power of NSX-T and how, with even a small amount of automation, it can be used to accomplish several use cases and provide real value to an organization that is required to test its backups.

Using NSX-T to Test NSX-T and Virtual Machine Recovery with Automation – Conceptual

In the last post, the Healthcare organization configured their NSX-T Manager to send its backups to an SFTP backup server so they can perform restores if necessary. The Healthcare organization also utilizes Veeam Backup and Recovery to provide virtual machine-based backups for their virtual infrastructure. Unfortunately, backing up the NSX-T Manager with Veeam is not supported; a restore requires deploying a fresh NSX-T Manager installation and restoring a backup configuration to it, and the Healthcare organization would like to test restores of the NSX-T Manager.

Configuring and backing up the NSX-T Manager configuration or a virtual machine is one thing; actually being able to test the backups is another. A backup is no good if you can't restore from it. The organization has found a way to test both their NSX-T backups and their virtual machine backups at the same time to meet the requirements. Taking some pointers from what they've learned previously around using automation tools, they plan to expand their automation learning with this same process.

NSX-T can provide exact copies of production environments running on top of the same underlying physical network with no changes to the physical network. The Healthcare organization has placed a very large bet on NSX-T being the networking and security platform for their infrastructure, and is looking to use this capability to provide an isolated backup environment to test restoring their backups. Keeping an automation mindset in place, the Healthcare organization admins take a look at the requirements they'll need to accomplish the tasks:

Requirements:

  1. Use NSX-T to build a production replica network to test restores of the NSX-T Manager and show virtual machines can also be restored and tested on the same network
  2. Use Veeam to restore the following virtual machines:
    1. Backup Server – Will be used to run automation scripts from
    2. Active Directory – Will be needed for DNS purposes
    3. SFTP Server – Hosts the NSX-T backups that restores will be tested from
  3. Deploy a new NSX-T Manager to test the restore process to it
  4. Use automation wherever possible to continue expanding automated techniques

The admin draws out the following topology, which ensures they can rebuild a production network replica without overlapping the actual production networking. This topology consists of the following constructs to build out the production replica network:

restores_pic3

  • Standalone Tier-1 Gateway – not connected to any Tier-0 Gateway, preventing northbound communications that would conflict with the production networking
  • Restore Network Segment – Provides a logical network for the restored VMs to attach to
  • Restored Domain Controller – One of the organization's domain controllers that will provide DNS for the replica network and the VMs attached
  • Restored Backup Server – Hosts the PowerShell scripts that are necessary for scripting part of the deployment on the restored NSX-T Manager. Some of the scripts will need to be run from the Production Backup Server and some of them from the Restored Backup Server since there will be no outside communications to the Restore environment other than vCenter Server direct console access
  • Restored SFTP Server – Hosts the backups of the NSX-T Manager
  • Restored NSX-T Manager – Will be used to test its own restores
  • vCenter Server B – Manages the Compute Cluster B
  • Compute Cluster B – Provides a non-production host for the restored systems to be placed on that’s not managed by the production vCenter Server A.

Before any automation can begin, the admin needs to understand all of the workflow steps that will be necessary and how to perform them so they can put automation around each workflow process.

restores_pic10

 

NSX-T 2.4.x provides a hierarchical, intent-based Policy API for customers to use for automation. The admin takes a look at the official NSX-T API documentation on the Policy API and finds a few REST API commands that could be useful for creating the necessary constructs. Using the backup configuration information collected in the previous post, the admin can also use REST API commands to automate adding the restore configuration into the NSX-T Manager that will be deployed.

The NSX-T Manager comes from the VMware download site as an OVA. A tool such as OVF Tool can be used to help automate the process of deploying the NSX-T Manager to the new network that will be created.

To wrap all of these different automation techniques into scripts that the admin can use, they’re planning to use PowerCLI and PowerShell Core 6 to build scripts that can be run to automate as much of this process as possible.

The admin performs the following actions to be able to use PowerShell Core 6 and VMware PowerCLI on the Backup Server.  The Backup Server will host the scripts and also be the server where the scripts are run from, both in production and in the restored segment.

Install Prerequisites:

The post isn’t going to go into how to install these items as they are fairly simple to install with mostly click, click, next.

Each of these processes will be necessary to meet all of the requirements. There are specific portions of the workflow where processes can be joined together into single scripts, and the admin will attempt to do so where their experience allows.

The first and second workflow processes in the table consist of building a Veeam Backup Job around all of the virtual machines needed, and ensuring that NSX-T is sending backups to the SFTP server.

Requirement 2 – Use Veeam to restore the following virtual machines

  • Backup Server – Will be used to run automation scripts from
  • Active Directory – Will be needed for DNS purposes
  • SFTP Server – Hosts the NSX-T backups that restores will be tested from

Regardless of the order of the requirements, first and foremost, to test the restore process the admin needs to ensure that they have backups of the systems they're planning to perform restore testing on. The admin also hops into the NSX-T Manager console and checks that the latest backup job has completed, or can press the 'BACKUP NOW' button to start a fresh backup to the SFTP server.

restores_pic2

For the process of testing backups in this use case, the admins have configured a separate backup job in Veeam that has the 3 virtual machines that will be used for this testing procedure.

restores_pic1

The admin waits to start the backup job in Veeam until the scripts are all built as they’ll be needed once the Backup Server is restored.  The admin can start to take a look at how to build an NSX-T isolated copy of the production network.

Requirement 1 – Use NSX-T to build a production IP-based isolated network to test restores to NSX-T and show virtual machines can also be restored and tested on the same network

Requirement 3 – Deploy a new NSX-T Manager to test the restore process to it

Requirement 4 – Use automation wherever possible to continue expanding automated techniques

The process of building the production replica network can be accomplished using the NSX-T REST API.  The admin has taken a look at the NSX-T REST API official documentation and found an example of using the hierarchical intent-based API to build the Tier-1 Gateway and the Segment that will be used.  The next process is around using the OVF Tool to deploy the NSX-T Manager to the same segment previously created.  Since these processes can be called from PowerCLI, the admin decides to combine these two workflows into one script.

The code that was built for this resembles the following:

restores_pic4

restores_pic5

This script builds the Tier-1 Gateway and Segment using the NSX-T Policy API, then immediately jumps to using the OVF Tool to deploy the new NSX-T Manager to the previously created Segment. You can find the actual script over here – github link. For ease of reading, the OVF arguments were word wrapped; normally they need to be on one line.
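For reference, an OVF Tool invocation for the NSX-T unified appliance resembles the sketch below. Every value shown (paths, addresses, passwords, datastore, cluster, and OVA file name) is a placeholder, and the OVF property names should be verified against the specific OVA version being deployed; the actual arguments are in the script on GitHub.

    # Placeholder values only; the IP address and hostname must match the production NSX-T Manager exactly
    & "C:\Program Files\VMware\VMware OVF Tool\ovftool.exe" `
        --name="nsxtmanager-restore" --deploymentOption=small `
        --acceptAllEulas --allowExtraConfig `
        --datastore="cluster-b-datastore" --network="nsxt-restore-segment" `
        --prop:nsx_ip_0="192.168.110.15" --prop:nsx_netmask_0="255.255.255.0" `
        --prop:nsx_gateway_0="192.168.110.1" --prop:nsx_dns1_0="192.168.110.10" `
        --prop:nsx_domain_0="corp.local" --prop:nsx_hostname="nsxmgr-01a" `
        --prop:nsx_role="NSX Manager" --prop:nsx_passwd_0="VMware1!VMware1!" `
        --prop:nsx_cli_passwd_0="VMware1!VMware1!" --prop:nsx_isSSHEnabled=True `
        ".\nsx-unified-appliance-2.4.1.ova" `
        "vi://administrator%40vsphere.local@vcenter-b.corp.local/DC-B/host/Compute-Cluster-B/"

The appliance is deliberately not powered on at deploy time, since the memory reservation is adjusted by the next script before first boot.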

The next process is around changing the memory resources of the NSX-T Manager. Typically, the NSX-T Manager has a memory reservation to ensure enough memory is available for it to run. Given this is a testing environment restore, the admin wants to reduce this reservation so they can start the NSX-T Manager without running into issues. The admin builds another script to adjust this and start the VM.

The code that was built for this resembles the following:

restores_pic6

This script adjusts the memory reservation down to 8GB and then starts the NSX-T Manager VM.

The next piece of scripting that the admin chooses to do is around putting in the Restore Configuration for the NSX-T Manager into the new NSX-T Manager virtual machine using PowerCLI and the REST API.  The code that was built for this resembles the following:

restores_pic7

This script sends a REST API command to add the restore server configuration to the NSX-T Manager so it can see the NSX-T backups on the restored SFTP-01a, and the admin can choose which one to test the restore from.

The final script the admin decides to build is around clean up of all of the virtual machines and networking components created to test with.  The code built for this resembles the following:

restores_pic8

This script powers down and deletes all of the restored virtual machines and the NSX-T Manager, and then uses the NSX-T Policy API to remove the Tier-1 Gateway and testing Segment, resetting the infrastructure back to its original configuration.

restores_pic9

There are obviously several areas where the scripting can be improved and further simplified. This is a good first step for the admin to meet the requirements, grow their automation skills, and further refine the scripting. In the next post, the admin will put all of these scripts and processes to work and test the full process. The screenshots of the script code may be tough to read, so the admin has uploaded all of the scripts to this location – https://github.com/vwilmo/NSXT_RESTORE_TESTING 🙂

 

NSX-T Backup and Restore Configuration and Automation | Part 1 – Windows SFTP Backup Targets

Now that the Healthcare organization has completed their journey of migrating from NSX Data Center for vSphere over to NSX-T Data Center, it’s time to do a bit of day 2 configuration, specifically configuring the backups of the NSX-T Manager.

The infrastructure admins that are currently in charge of running the NSX-T environment for the organization are expanding their scripting knowledge a bit and working on automating many of the configurations and operations that NSX-T Data Center requires.  The first area where some simple scripting can help is around configuration and management of NSX-T Backups.

Typically, the admin could go into the NSX-T Manager UI and perform these configurations via the UI.

backups_pic1

Since the admins want to expand their knowledge of scripting and using REST APIs, and the plan is to bring this knowledge forward into performing and checking NSX-T restores later, they've opted to use a different approach.

Requirements:

  • Setup Backup configuration for the NSX-T Manager with an eye on automation
  • At least 3 backups per day and automatic backups after configuration changes
  • Maintaining at least 30 days of backups for the NSX-T Manager

Requirement 1 – Setup Backup configuration for the NSX-T Manager with an eye on automation

Requirement 2 – At least 3 backups per day and automatic backups after configuration changes

The first two requirements can be handled with one straightforward approach. The organization currently has a Cerberus SFTP server that backs up configuration from other devices on their network. It's a FIPS 140-2 compliant software package that will work well with NSX-T, and it runs on a Windows Server 2016 machine where the organization stores the backups. Consulting the official NSX-T documentation for Backup and Restore, the admin finds the required items to be able to perform the configuration. The information is put into a chart for documentation purposes so that the settings can be tracked and the infrastructure and security teams know what is being used.

backups_table_pic1

Now that the settings have been documented accordingly, the admin can take a further look at how to configure the settings in NSX-T.  The admin has decided that they will take the following approach around automating the installation of the configuration.  They will use the NSX-T REST API to perform the configuration using the documented settings.  To be able to do this a few things will need to happen.

  • Installation of a REST API client – Postman
  • Code example from the NSX-T Data Center API Guide for configuration and testing backups

This post will not go into the installation of Postman; it's a simple installation. The following configuration is, however, needed to ensure Postman can properly call the NSX-T Manager REST API.

backups_table_pic2

After consulting the NSX-T Data Center API Guide, the following code was pulled that should provide the necessary single API call to configure the NSX-T Manager backup schedule.

Example code for backup configuration:

backups_table_pic3
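In case the screenshot is hard to read, the call and a representative body look roughly like the following. The field names should be verified against the API guide for the NSX-T version in use, every value below is a placeholder, and the 28800-second interval is simply one way to satisfy the three-backups-per-day requirement.

    PUT https://nsxmgr-01a.corp.local/api/v1/cluster/backups/config

    {
      "backup_enabled": true,
      "backup_schedule": {
        "resource_type": "IntervalBackupSchedule",
        "seconds_between_backups": 28800
      },
      "remote_file_server": {
        "server": "sftp-01a.corp.local",
        "port": 22,
        "directory_path": "/nsx-backups",
        "protocol": {
          "protocol_name": "sftp",
          "ssh_fingerprint": "SHA256:<sftp-host-key-fingerprint>",
          "authentication_scheme": {
            "scheme_name": "PASSWORD",
            "username": "nsxbackup",
            "password": "<password>"
          }
        }
      },
      "passphrase": "<backup-passphrase>",
      "inline_compression": true
    }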

Taking the information collected during the documentation process, the admin can now substitute in the organization-specific configuration that will be used for the body of the REST API call.

Organization-specific code for backup configuration:

backups_table_pic4

When the admin pastes the above configuration into the body of the REST API PUT command and sends the command, they receive a Status 200 OK, meaning the command was accepted and the configuration applied.

backups_pic2

There are several ways the admin can check the work; the Status 200 OK response will display the result of the command in the Body section of the response. It is also possible to change the same command from PUT to GET and resend it to retrieve the same configuration.

With the configuration in place, the admin can issue another command via the REST API that will initiate a backup from the NSX-T Manager to the SFTP server.

backups_table_pic5
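To the best of my recollection of the API, that call is a single POST with no body (the manager FQDN is a placeholder):

    POST https://nsxmgr-01a.corp.local/api/v1/cluster?action=backup_to_remote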

This command takes some time to return a response, as the actual backup has to take place before the Status 200 OK is sent back, which only happens when the backup completes successfully. As you can see from the Postman output below, the request took 1 minute and 1.08 seconds to complete.

backups_pic3

The admin can now go into the NSX-T Manager UI and check the configuration and backup status visually as well; everything appears to be configured properly and backing up to the SFTP server as they'd expect.

backups_pic4

The admin also takes a quick look at the SFTP server and the backup directory to check that files have been created.

backups_pic5

Requirement 3 – Maintaining at least 30 days of backups for the NSX-T Manager

To meet the last requirement, while still keeping Requirement 1's focus on automation, the admin needs to find a way to keep only 30 days of backups for the NSX-T Manager. The official NSX-T documentation has several scripts that can be run on Linux-based systems and, coupled with a cron job, can be used to clean up the backup directory on an automatic, scheduled basis. However, there are no scripts supplied for Windows-based SFTP systems, and the Healthcare organization is using a Windows machine for their SFTP server. The admin decides to create their own script using PowerShell and a Windows scheduled task to provide the same benefit.

Taking a look at the SFTP server, the admin can see that there are several folders created for the backup files.

  • ccp-backups – Contains .tar files of the Control Plane backup for NSX-T
  • cluster-node-backups – Contains .tar files in date specific folders for the NSX-T Manager/Policy/Controller Cluster and each individual NSX-T Manager backup
  • inventory-summary – Contains .json files for every inventory object in the NSX-T Manager backup

Each of these folders contains multiple files after a backup occurs for NSX-T.  Below is an example:

backups_pic6

The admin determines that the easiest way to handle this is to use PowerShell to create a script that will automatically look for files older than 30 days and remove the folders and files within the folders appropriately.  The code looks like this and can be found on GitHub as well.

backups_table_pic6
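The retention logic in that script is along these lines; the backup root path and variable names here are assumptions for the sketch, and the authoritative version is the one on GitHub.

    # Placeholder path for the Cerberus SFTP home that holds the NSX-T backup folders
    $Daysback   = -30
    $BackupRoot = "C:\SFTP_Root\nsx-backups"
    $CutoffDate = (Get-Date).AddDays($Daysback)

    # Remove backup files older than the cutoff across ccp-backups, cluster-node-backups, and inventory-summary
    Get-ChildItem -Path $BackupRoot -Recurse -File |
        Where-Object { $_.LastWriteTime -lt $CutoffDate } |
        Remove-Item -Force

    # Remove the now-empty date-specific folders under cluster-node-backups, leaving the top-level structure intact
    Get-ChildItem -Path (Join-Path $BackupRoot "cluster-node-backups") -Directory |
        Where-Object { -not (Get-ChildItem -Path $_.FullName -Recurse -File) } |
        Remove-Item -Recurse -Force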

The admin tests this script by changing the $Daysback variable in the script to -0, which will delete all of the backups that have been taken thus far. Running the script, the admin can see that all of the backups have been removed and the folder structure for the backups is still intact.

backups_pic7

After running the backup again, the admin can see that the new backup files are present in the folder.

backups_pic8

With the script working as intended, the admin can now create a Windows scheduled task to call the PowerShell script on a nightly basis to clean up the SFTP backup directory

backups_table_pic7
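One way to register such a task is shown below; the task name, script path, run time, and account are assumptions for this sketch.

    # Nightly cleanup task; adjust the script path and schedule to match the environment
    $action  = New-ScheduledTaskAction -Execute "powershell.exe" `
        -Argument "-NoProfile -ExecutionPolicy Bypass -File C:\Scripts\NSXT_Backup_Cleanup.ps1"
    $trigger = New-ScheduledTaskTrigger -Daily -At 1am
    Register-ScheduledTask -TaskName "NSX-T SFTP Backup Cleanup" -Action $action -Trigger $trigger `
        -User "SYSTEM" -RunLevel Highest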

With the task created, the admin runs the task manually and verifies that the current backup is removed as intended. The admin can now run a fresh backup of the configuration and change the $Daysback variable back to -30.

backups_pic9

The requirements have been fulfilled and the admin can now move onto the next task which is testing the backup and restore process in Part 2.

NSX Data Center for vSphere to NSX-T Data Center Migration – Part 3

Planning and preparation are complete and the Healthcare organization is now ready to proceed with Part 3 of the NSX Data Center for vSphere to NSX-T Data Center migration.

Researching the process for migration from NSX Data Center for vSphere to NSX-T Data Center involves the following processes.  These efforts will be covered over a series of blog posts related to each step in the processes:

  • Understanding the NSX Data Center for vSphere Migration Process – Part 1
    • Checking Supported Features
    • Checking Supported Topologies
    • Checking Supported Limits
    • Reviewing the Migration Process and the prerequisites
  • Preparing to Migrate the NSX Data Center for vSphere Environment – Part 2
    • Prepare a new NSX-T Data Center Environment and necessary components
    • Prepare NSX Data Center for vSphere for Migration
  • Migration of NSX Data Center for vSphere to NSX-T Data Center – Part 3

As when they started the process in part 1, consulting the official documentation on the process and the steps to perform is recommended.

https://docs.vmware.com/en/VMware-NSX-T-Data-Center/2.4/migration/GUID-78947686-CC6C-460B-A185-7E2EE7D3BCED.html

MIGRATION OF NSX DATA CENTER FOR VSPHERE TO NSX-T DATA CENTER

The migration to NSX-T Data Center is a multi-step process.  The steps are outlined below:

  • Import the NSX Data Center for vSphere Configuration
  • Resolve Issues with the NSX Data Center for vSphere Configuration
  • Migrate the NSX Data Center for vSphere Configuration
  • Migrate NSX Data Center for vSphere Edges
  • Migrate NSX Data Center for vSphere Hosts
  • Finish the NSX Data Center for vSphere Migration

Upon further review of each step, the organization deployed two NSX-T Data Center Edge Nodes which will be used as replacements for the NSX Data Center for vSphere Edge Services Gateways.  These Edge Nodes were deployed using the official documentation and added to the NSX-T Manager.

migration_coordinator_process_pic8

migration_coordinator_process_pic9

IMPORT THE NSX DATA CENTER FOR VSPHERE CONFIGURATION

To begin the process, the organization needs to enable the Migration Coordinator on the NSX-T Manager that they deployed. A quick SSH session into the NSX-T Manager using the admin account will provide the means to run the command necessary to start the Migration Coordinator service and enable the user interface in the NSX-T Manager that will be used for the migration:

migration_coordinator_start
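For reference, this is a single NSXCLI command run as the admin user (the prompt below uses a placeholder hostname):

    nsxmgr-01a> start service migration-coordinator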

Now that the Migration Coordinator service is running, the user interface in the NSX-T Manager will be enabled.

migration_coordinator_process_pic1

The next step in the process is to authenticate to the NSX Manager and the vCenter Server.

migration_coordinator_process_pic2

migration_coordinator_process_pic3

With the NSX Data Center for vSphere Manager and vCenter Server added in, the organization can start the import configuration step.

migration_coordinator_process_pic4

The organization sees ‘Successful’ on importing the existing configuration into NSX-T Data Center.  There is an option to ‘View Imported Topology’ which will give them a nice visual diagram of the configuration details that were imported.

migration_coordinator_process_pic5

A successful import allows the organization to proceed with the next step in the migration process

RESOLVE ISSUES WITH THE NSX DATA CENTER FOR VSPHERE CONFIGURATION

Moving to the next step, the organization is presented with all of the 'issues' that need to be resolved to move forward with the migration process. The total number of inputs that need to be resolved is listed, and once they are resolved, the resolved inputs are listed as well.

migration_coordinator_process_pic6

Several of the issues appear to be items that the organization already has configured. Each issue has a recommendation from the Migration Coordinator for the organization to consider and move forward with the migration process. The more important issues listed are the ones that deal with the 'EDGE', as those issues will result in new NSX-T Data Center Edge Nodes being deployed to replace the existing Edge Services Gateways.

migration_coordinator_process_pic7

After selecting the EDGE category of issues to resolve, the organization was met with the following items to remediate before it was able to proceed to the next step.

migration_coordinator_process_pic10

  • IP addresses for TEPs on all Edge transport nodes will be allocated from the selected IP Pool. You must ensure connectivity between Edge TEPs and NSX for vSphere VTEPs.

This issue requires entering the TEP_POOL that was already created for the Edge Nodes.

  • An NSX-T Edge node will provide the connectivity to replace NSX-v edge. Enter an IP address for the uplink.

This issue requires putting in a valid uplink IP address for the NSX-T Edge Node.  The organization will want to use the same IP address that the NSX Data Center for vSphere Edge Services Gateway is currently using since the TOR is statically routed to that IP address.

  • An NSX-T Edge node will provide HA redundancy for NSX-v edge. Enter an IP address for the uplink on this Edge node. This IP address must be in the same subnet as the uplink of the other NSX-T Edge used to replace this edge.

This issue requires putting in a valid IP address for the HA redundancy that the Edge Node will provide

  • An NSX-T Edge node will provide HA redundancy for edge replacing NSX-v edge. Enter an unused fabric ID for Edge node. See System > Fabric > Nodes > Edge Transport Nodes.

This issue requires selecting the UUID that was imported from the NSX-T Edge Nodes and choosing which one will be replacing the NSX Data Center for vSphere Edge Services Gateway.

  • An NSX-T Edge node will provide the connectivity to replace NSX-v edge. Enter an unused fabric ID for this Edge node. See System > Fabric > Nodes > Edge Transport Nodes.

This issue is similar to the one above but requires selecting the second NSX-T Edge Node UUID instead.

  • An NSX-T Edge node will provide the connectivity to replace NSX-v Edge. Enter a VLAN ID for the uplink on this Edge node.

This issue requires putting in the VLAN ID of the uplink adapter that will be used.

With all of the items resolved, the organization is ready to proceed with the actual migration process. Given that some data plane outages will need to occur during this process as the Edge Services Gateways are migrated to NSX-T Gateways, the organization has decided to perform the actual migration during a scheduled maintenance window.

MIGRATE THE NSX DATA CENTER FOR VSPHERE CONFIGURATION

Pressing start, the Migration Coordinator begins migrating the configuration over to the NSX-T Data Center Manager. This part of the process does not incur an outage as it's a copy of the configuration.

migration_coordinator_process_pic11

Once the configuration has been copied over, the organization can now see all of the components that have been created in NSX-T Data Center from the configuration imported.

NETWORKING

The organization can see that a new Tier-0 Gateway has been created and has the routing configuration that the Edge Services Gateways had.

networking

networking2

networking3

GROUPS

The organization checks the new Group objects and can see that those new Inventory objects have been created

groups1

SECURITY

Lastly, the organization checks the security objects, specifically that their Distributed Firewall and Service Composer rulesets are migrated over properly.

security1

MIGRATE NSX DATA CENTER FOR VSPHERE EDGES

The next part will incur an outage as this is the process of migrating the NSX Data Center for vSphere Edge Services Gateways over to the NSX-T Data Center Edge Nodes.  This will involve moving the IP addressing over.

migration_coordinator_process_pic12

migration_coordinator_process_pic13

Once the Edges have been migrated over, the organization can see that a new Transport Zone, an Edge Node Cluster, and an N-VDS switch have been created.

MIGRATE NSX DATA CENTER FOR VSPHERE HOSTS

The next step involves swapping the ESXi host software components for NSX Data Center for vSphere out with NSX-T Data Center.

hosts1

With the ESXi hosts migrated, the organization has been successfully migrated from NSX Data Center for vSphere over to NSX-T Data Center.

finished1.png

Now that the Healthcare organization has migrated over to NSX-T Data Center, they can start the decommissioning of the NSX Data Center for vSphere components that are no longer needed.  The topology of their data center environment with NSX-T Data Center now looks like this.

finish_topology

NSX Data Center for vSphere to NSX-T Data Center Migration – Part 2

Part 2 of the NSX Data Center for vSphere to NSX-T Data Center migration for the Healthcare organization is around preparing the new NSX-T Data Center environment by deploying, installing, and configuring the necessary components.

Researching the process for migration from NSX Data Center for vSphere to NSX-T Data Center involves the following processes.  These efforts will be covered over a series of blog posts related to each step in the processes:

  • Understanding the NSX Data Center for vSphere Migration Process – Part 1
    • Checking Supported Features
    • Checking Supported Topologies
    • Checking Supported Limits
    • Reviewing the Migration Process and the prerequisites
  • Preparing to Migrate the NSX Data Center for vSphere Environment – Part 2
    • Prepare a new NSX-T Data Center Environment and necessary components
    • Prepare NSX Data Center for vSphere for Migration
  • Migration of NSX Data Center for vSphere to NSX-T Data Center – Part 3

As when they started the process in part 1, consulting the official documentation on the process and the steps to perform is recommended.

https://docs.vmware.com/en/VMware-NSX-T-Data-Center/2.4/migration/GUID-78947686-CC6C-460B-A185-7E2EE7D3BCED.html

PREPARE A NEW NSX-T DATA CENTER ENVIRONMENT AND NECESSARY COMPONENTS

Preparing a new NSX-T Data Center environment involves deploying the NSX-T Manager.  Installation of the NSX-T Manager is beyond the scope of this blog post as the official documentation has the necessary steps involved.  The key piece of information for this part of the migration process is to deploy the NSX-T Manager appliance(s) on ESXi hosts that are NOT part of the NSX Data Center for vSphere environment that’s being migrated.  The Healthcare organization deployed the new NSX-T Manager on the same hosts that the NSX Data Center for vSphere Manager is currently deployed on.

before_topology.with.nsxt

The next step is to add the vCenter Server that is associated with the NSX Data Center for vSphere environment. NSX-T Data Center has a completely separate user interface to manage the NSX-T installation, which will not conflict with the NSX Data Center for vSphere user interface that's added as a plug-in to the vSphere Client. The steps to add the vCenter Server, as a Compute Manager, into NSX-T are documented in the same official documentation as part 2 of the migration process. Once it is added into NSX-T, this is what the organization sees:

nsxt_compute_manager_added

There is a recommendation to add more NSX-T Managers to form a cluster for a proper production deployment, but since the Migration Coordinator is only run on one of the NSX-T Manager appliances, they can be added later.

The last step to prepare the NSX-T side of the migration process for the organization is to create an IP Pool for the Edge Tunnel Endpoints (TEP). The organization already has a VLAN network for the VXLAN Tunnel Endpoints on the ESXi hosts for NSX Data Center for vSphere. The VLAN is constrained using an IP range, and part of the VLAN network will be assigned for the Edge TEPs as well as for the host TEPs that will also need to be created.

tep_pool_pic1

A TEP pool is created that the organization will reference during the migration

tep_pool_pic2

An IP range of addresses in the VLAN network is allocated, and the organization ensures it is not stepped on by any other devices in the range.

PREPARE NSX DATA CENTER FOR VSPHERE FOR MIGRATION

With the NSX-T Data Center environment set up and the steps followed, the next part of the migration process involves preparing the NSX Data Center for vSphere environment.

The first step involves configuring any hosts that might not already be added to a vSphere Distributed Switch.  The Healthcare organization has moved all of the data center hosts over to a vSphere Distributed Switch so this part of the process is not applicable to them.

The second step of this part of the migration process involves checking the Distributed Firewall Filter Export Version of the virtual machines. This involves checking the ESXi hosts where these workloads reside and running a few simple commands. In the vSphere Client, the workloads and the hosts they reside on can be seen, so the organization knows which hosts to check filter export versions on.

vcenter_vm_inventory

Now that the information on the virtual workload is confirmed, a simple SSH session into the ESXi host will determine if the export version is correct or needs to be modified to support the migration process.

export_filter_check
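The checks are a couple of one-line commands on the host; the filter name below is an example and will differ per vNIC, so pull the real name from the dvfilter summary first.

    # List dvfilters on the host and note the sfw filter name for the workload's vNIC
    summarize-dvfilter

    # Check the Distributed Firewall filter export version for that filter (example filter name)
    vsipioctl getexportversion -f nic-38549-eth0-vmware-sfw.2

    # If the version needs to change, it can be set with vsipioctl as well (value per the migration guide)
    vsipioctl setexportversion -f nic-38549-eth0-vmware-sfw.2 -e 1000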

The check of the workload shows that the Distributed Firewall Filter Export Version is the correct version for this workload. The organization can now check all of the other workloads to ensure this is the case for those as well. This is the last step in part 2 of the process, and once it is fully completed the Healthcare organization can move to Part 3 and begin the actual migration process.

 

 

NSX Data Center for vSphere to NSX-T Data Center Migration – Part 1

It’s been well over a year since the last post discussing the usage of VMware NSX with the Healthcare organization.  In that time, they’ve deployed NSX controllers, and a small amount of VXLAN networks with a few workloads attached as well as continued their micro-segmentation journey, building security around their important workloads.  The release of NSX-T Data Center and the added benefits and support, have led the organization to look into migrating from NSX Data Center for vSphere to NSX-T Data Center.  NSX-T Data Center 2.4 now includes an NSX Data Center for vSphere to NSX-T Data Center Migration Coordinator that can help transition NSX Data Center for vSphere deployments over to new or existing NSX-T Data Center deployments.  The Healthcare organization has decided to pursue making use of the tool and moving their organization from NSX Data Center for vSphere to NSX-T Data Center.

Researching the process for migration from NSX Data Center for vSphere to NSX-T Data Center involves the following processes.  These efforts will be covered over a series of blog posts related to each step in the processes:

  • Understanding the NSX Data Center for vSphere Migration Process – Part 1
    • Checking Supported Features
    • Checking Supported Topologies
    • Checking Supported Limits
    • Reviewing the Migration Process and the prerequisites
  • Preparing to Migrate the NSX Data Center for vSphere Environment – Part 2
    • Prepare a new NSX-T Data Center Environment and necessary components
    • Prepare NSX Data Center for vSphere for Migration
  • Migration of NSX Data Center for vSphere to NSX-T Data Center – Part 3

With these processes in mind, it makes sense to start by taking a look at the official documentation on how to migrate from NSX for vSphere to NSX-T Data Center. The organization begins documenting the functionality it's currently using to compare it against the list of functions that the migration tool supports.

https://docs.vmware.com/en/VMware-NSX-T-Data-Center/2.4/migration/GUID-78947686-CC6C-460B-A185-7E2EE7D3BCED.html

CHECKING SUPPORTED FEATURES BY THE MIGRATION COORDINATOR

The organization reviewed the supported features and listed them in a chart to document the items that are relevant to their NSX Data Center for vSphere deployment and that they will need to pay attention to:

table1

Going through the supported features, the organization found a few things it was using that required slight changes to facilitate the migration.

table2

CHECKING SUPPORTED TOPOLOGIES BY THE MIGRATION COORDINATOR

The Healthcare organization built out the following diagram to show and document the infrastructure they had built as they continued to add new features of NSX Data Center for vSphere in their environment. This diagram is very beneficial for checking against the topologies that the Migration Coordinator supports.

before_topology

The Healthcare organization’s network topology contains the following configurations which we can use to compare against the supported topology for the Migration Coordinator to find one that matches.

table3

The network topology aligns with the Migration Coordinator supported topology represented in this diagram from the official documentation.

before_topology_supported

The supported topology from the official documentation covers the configurations that match the organization's current NSX Data Center for vSphere topology.

CHECKING SUPPORTED LIMITS BY THE MIGRATION COORDINATOR

The next step in the process is to review the supported limits of the Migration Coordinator against the current configurations in the existing NSX Data Center for vSphere environment. The Healthcare organization is a rather small deployment, so their current configurations should fall within the supported limits of the Migration Coordinator. The organization documented the following configuration information for the migration process to ensure they were within the limits:

table4

REVIEWING THE MIGRATION PROCESS AND THE PREREQUISITES

The last step in part 1 of the Migration process is to review the prerequisites that are necessary to facilitate the migration to NSX-T Data Center.  Reviewing the official documentation, the following items are required to migrate properly:

  • Deploy a new NSX-T Data Center environment
  • Import the configuration from the NSX Data Center for vSphere environment
  • Resolve issues with the configuration and deploy NSX-T Edge Nodes
  • Migrate Configuration
  • Migrate Edges
  • Migrate Hosts

After taking a look at the first part of the process, the Healthcare organization is ready to proceed to step 2 which will involve modifying their infrastructure to support the NSX-T Data Center environment they will be migrating to.

VMworld 2019 Sessions and looking forward

It’s been well over a year since I last posted anything on my blog, but since moving into the Networking and Security Business Unit at VMware and my girls getting older, I’ve been focusing on other commitments.  That being said, I have been compiling a list of things that I plan to start working on and blogging over the next several months to bring new content back to the blog.  Stay tuned, I have a massive list that I want to get posted.

Now for a bit of shameless self-promotion of my own sessions as well as ones from my peers. 🙂

We’re now just a few short weeks away from VMworld US 2019.  There are well over 200 different sessions for security including Breakout Sessions, Deep Dives, Self-Paced and Expert-led Hands-on labs, Meet the Expert, and many more.  Discussing with my peers, their sessions are filling up so if you haven’t signed up for all of your sessions take a look at the ones below and get them scheduled.

Below are the sessions that I will be presenting with one of my customers, CHRISTUS Health, my partner in EUC crime, Graeme Gordon, and last but not least my deep dive on NSX-T Guest Introspection (Endpoint Protection) that came out recently with the NSX-T 2.4 release.

SPEAKERS – Geoff Wilmington, Senior Technical Product Manager, VMware

Thursday, August 29, 09:00 AM – 10:00 AM

SPEAKERS – Graeme Gordon, Senior Staff EUC Architect, VMware and Geoff Wilmington, Senior Technical Product Manager, VMware

Tuesday, August 27, 05:00 PM – 06:00 PM

SPEAKERS – Brandon Rivera, Enterprise Infrastructure Architect, CHRISTUS Health and Geoff Wilmington, Senior Technical Product Manager, VMware

Monday, August 26, 02:30 PM – 03:30 PM

If you’re looking for other VMware security product related deep dive sessions, take a look at these from some of my peers that I highly recommend you attend.  These folks are all amazing presenters and their content is top-notch.

SPEAKERS – Stijn Vanveerdeghem, Senior Technical Product Manager, VMware

Wednesday, August 28, 03:30 PM – 04:30 PM

SPEAKERS – Ganapathi Bhat, Sr Technical Product Manager, VMware

Wednesday, August 28, 09:30 AM – 10:30 AM

SPEAKERS – Anthony Burke, Solutions Architect, VMware and Dale Coghlan, Staff Solution Architect, VMware

Wednesday, August 28, 08:00 AM – 09:00 AM

SPEAKERS – Kevin Berger, Director, Security Engineering, VMware and Chris Corde, Senior Director of Product Management, VMware

Wednesday, August 28, 01:00 PM – 02:00 PM

Last but not least, you’re definitely going to want to check out this session.  I won’t go into too many details, but Ray has some seriously cool stuff to show off.

SPEAKERS – Ray Budavari, Senior Staff Technical Product Manager, VMware

Wednesday, August 28, 02:00 PM – 03:00 PM

If we’ve never met, please don’t hesitate to come up and say ‘Hi’.

 

The VMware NSX Platform – Healthcare Series – Part 7.1: Secure End-User VDI Practical

In the last post we broke down each of the six use cases that the VMware NSX platform can provide for Secure End-User computing environments. Each of these use cases provides a Healthcare organization a different business value depending on their needs.

As we start to break each use case down, the first one we’ll be looking at is around Micro-segmentation of a Virtual Desktop Infrastructure (VDI) environment that a Healthcare organization may run.

practical_pic1

Speaking with my Healthcare customers running VDI systems, there weren't many security-based controls protecting traffic between individual desktops. These systems were fully allowed to communicate with each other regardless of any actual need to. Not only that, but those VDI systems could scale up or down on demand, and the security posture must be able to scale up and down in a similar fashion. These use cases for NSX are quite simple and easy to implement. By blocking unnecessary communications between these systems, we can ensure that if one desktop is compromised, the compromise cannot spread to other desktops. Several times in discussions, my Healthcare customers were using one or more technologies as they transitioned from one to the other or as part of a legacy acquisition. We'll take the top two VDI software providers, Citrix and Horizon, and show that it doesn't matter which one a customer is using; both can be secured in an identical fashion.

Healthcare organizations also have different types of users, for example clinicians and HR users, and there can be a need to create separate pools for each of these user types. There are other ways to address this problem with other products and a single pool, but in this case we're focusing on customers with this model. In a later blog we'll also explore how Identity-based Firewall in NSX can help.

Let’s look at our use case.

Use case – Augment an existing VDI deployment using VMware Horizon with NSX and secure traffic between each system and providing isolation for desktop pools

  • Block all VDI to VDI
  • Isolate HR and Clinician Desktop pool from each other
  • Maintain the existing infrastructure as much as possible and without major changes 

Technology used –

Windows systems:

  • External Client-01a – Windows 10 – HR User
  • External Client-02a – Windows 10 – Clinician User
  • EMR-VDI-01a > EMR-VDI-02a (HZN)
  • HR-VDI-01a > HR-VDI-02a (CTX)
  • Allow EMR-VDI pool to expand and contract based on Clinician need

VMware Products

  • vSphere
  • vCenter
  • NSX
  • Horizon View

Citrix Products

  • XenDesktop 

Applications in question –

Open Source Healthcare Application:

  • OpenMRS – Open Source EMR system
    • Apache/PHP Web Server
    • MySQL Database Server

Open Source HR Application:

  • IceHRM – Open Source HR system
    • Apache/PHP Web Server
    • MySQL Database Server

Environment –

practical_pic2

In this environment, the customer has a Citrix VDI environment and a Horizon VDI environment. They are in the middle of a transition from one platform to another; however, security needs to remain consistent. The HR VDI systems need to be able to access the HR application, and the Clinicians need to be able to access the EMR application. Neither should be able to talk to the other application, and neither pool of desktops should be able to talk to each other.

practical_pic3

We also need to provide a security posture for when the EMR Desktop pool needs to expand due to Clinician need.

We'll start by building out our NSX grouping constructs and putting each of the different objects into their respective containers. We'll also mock up our firewall rule sets.

NSX Groupings:

practical_pic4

The groupings are shown here for use in our rule sets.  Each of these NSX Security Groups will enable the DFW rule sets to scale as the environment scales without the need to make modifications to the DFW rules.  If more VDI systems are created, they will simply land in their appropriate Security Group, and the DFW policy will be applied.  We’ll be exploring the concept of automation for VDI in another post.

HR Application Communications:

practical_pic5

EMR Application Communications:

practical_pic6

VDI Communications:

practical_pic7

The rule sets are simple and segment each of the different VDI systems from their appropriate application servers. Also, by nesting all VDI VMs into one Security Group, we can block communications between the systems with a single rule. With all the necessary components to create our security policy, we can start putting it into the NSX DFW.

practical_pic8

All the rules are in place; we can now perform verification to see that our rules are working and the requirements have been achieved.

HR users can get to their HR application and EMR users cannot get to the HR application

practical_pic9

EMR users can get to their EMR application and HR users cannot get to the EMR application

practical_pic10

VDI desktops are not allowed to talk to each other

practical_pic11

Now that we’ve verified that the HR user and the Clinician can access their VDI desktops and the relevant applications they need, we’ll scale up the EMR VDI systems by adding one more desktop and attempt to connect to the newest desktop the same as we did in the previous verification.

practical_pic12

Currently, there are only two VDI desktop machines in our pool. We'll increase that to three, as a customer may need a new desktop should more clinicians need to log in during rush periods when not enough desktops are already online. The main consideration here is that we want to keep the same security posture as above, regardless of how many desktops we have in the pool.

practical_pic13

With the pool increased, our dynamic criteria has added the newly built machine into the Security Group as it should.  All the rules associated with this Security Group will now apply to this newly built VDI desktop as well.  Let’s verify.

practical_pic14

Our pings to the other VDI systems, including the new one we just built, drop as we'd expect them to. This fulfills the requirements of the customer.

Micro-segmentation for Secure End-User starts with securing VDI desktops. Regardless of the implementation technology, Horizon or Citrix, NSX can provide a security posture surrounding the VDI desktops and the application systems they require access to. In future posts I'll be showing how NSX can provide even further value for the other Secure End-User use cases.

The VMware NSX Platform – Healthcare Series – Part 6: DMZ Anywhere Practical

Continuing our discussion on the topic of Healthcare and the DMZ use case, we're going to put these concepts into actual practice. With Healthcare systems, patients want access to their information quickly and not necessarily within the four walls of a Healthcare organization. This means that this information needs to be provided to Internet-facing devices for secure access. Below is the layout we're going to use: a typical design with an Internet-facing EMR Patient Portal for customers, built using traditional methods.

Traditional Model

dmz_ms_pic1

For this post, we’re going to use a physical Perimeter Firewall, and an NSX Edge Services Gateway (ESG) as the Internal Firewall to separate the DMZ systems from the Internal data center systems.

In our concept post, we talked about how NSX can help augment an existing DMZ approach to simplify the restriction of communications between systems that reside there. For Healthcare providers, the EMR Internet-facing Web Servers should not allow communications between themselves. If one Web Server is compromised, lateral movement must be restricted. Traditional approaches to restricting intra-server traffic between the EMR Web Servers would require blocking the communication at Layer 2, using MAC addresses. With NSX, we can instantiate a firewall at the virtual machine layer, regardless of the servers being on the same Layer 2 network, and restrict the Web Servers from talking to each other without needing to know the MAC addresses or sending the intra-server traffic through an external firewall to be blocked. This concept of East-West Micro-Segmentation is covered in previous posts and is the same concept we can apply to DMZ workloads.

Let’s lay out the requirements from the customer for the first use case.

VMware NSX – DMZ Augment Model

dmz_ms_pic2

Use case – Augment the existing DMZ to remove communications between DMZ systems.

  • Block all EMR Web Servers from talking to each other
  • Maintain the existing infrastructure as much as possible and without major changes

Technology used

Windows Clients:

  • Windows 10 – Management Desktop – Jumpbox-01a (192.168.0.99)

VMware Products

  • vSphere
  • vCenter
  • NSX
  • Log Insight

Application in question

Open Source Healthcare Application:

  • OpenMRS – Open Source EMR system
    • Apache/PHP Web Server
    • MySQL Database Server

Let’s start things off like we normally do, with the layout of our methodology for writing our rule sets.

dmz_ms_table1

When we put in NSX, we can write one rule and get the following result.

dmz_ms_pic3

The rule is very simple to write.  We simply add any DMZ systems to a Security Group, add that Security Group as both the Source and the Destination, and apply a Block action.

dmz_ms_pic4
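For admins who prefer scripting, the same group and rule could also be built from BACKUP-01a with the community PowerNSX module rather than the vSphere Web Client.  The original configuration above was done in the UI, so treat the following as a minimal sketch: the vCenter name and firewall section name are placeholders, and cmdlet or parameter names may vary between PowerNSX versions, so verify them with Get-Help before running.

# Connect PowerNSX to the environment (prompts for credentials)
Connect-NsxServer -vCenterServer vc-01a.corp.local -Credential (Get-Credential)

# Security Group that will contain every DMZ system
$dmzSg = New-NsxSecurityGroup -Name "DMZ-SG-ALL" -Description "All DMZ systems"

# Add the existing DMZ Web Servers as members
Get-VM "EMR-DMZ-WEB-01a", "EMR-DMZ-WEB-02a" |
    ForEach-Object { Add-NsxSecurityGroupMember -SecurityGroup $dmzSg -Member $_ }

# One DFW rule with the group as both Source and Destination blocks all
# intra-DMZ traffic ("DMZ Rules" is a placeholder section name)
Get-NsxFirewallSection "DMZ Rules" |
    New-NsxFirewallRule -Name "Block DMZ intra-server traffic" `
        -Source $dmzSg -Destination $dmzSg -Action deny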

Once this rule is in place, any virtual machines we place into the DMZ-SG-ALL Security Group will be blocked from talking to each other.  Let’s verify this is working.

dmz_ms_pic24

As we can see, the Web Servers are no longer allowed to talk to each other.  We have produced a comparable result with less complexity, more scalability, and greater operational ease, without changing the existing infrastructure at all.

For the next use case, collapsing the traditional hardware DMZ back into the data center, the goal is to remove the need for the NSX Edge Services Gateway to provide an Internal Firewall and use the NSX Distributed Firewall (DFW) to handle access between the DMZ and the internal data center systems.

VMware NSX – DMZ Anywhere (Collapsed) Model

dmz_ms_pic6

You may notice that the ESG is still in place.  That’s because the Internal data center is running on VXLAN and there still needs to be an off-ramp router to get to the physical network.  However, we have disabled the ESG’s firewall to demonstrate removing the Internal Firewall separation and allowing the DFW to handle the restrictions.

Let’s lay out the use case and the requirements from the customer.

Use case – Collapse the existing DMZ back into the data center while still maintaining the same security posture as when it was isolated.

  • Restrict External Patients to connect only to the EMR DMZ Web Servers
  • Restrict Internal Clinicians to connect only to the internal EMR Web Server
  • Allow all EMR Web Servers to connect to the EMR DB Server
  • Block all EMR Web Servers from talking to each other
  • Maintain DMZ isolation of the EMR System from the HR System

Technology used

Windows Clients:

  • Windows 10 – Clinician Desktop – Client-01a (192.168.0.36)
  • Windows 10 – HR Desktop – Client-02a (192.168.0.33)
  • iPad – External Patients – (External IP)

VMware Products

  • vSphere
  • vCenter
  • NSX
  • Log Insight

Application in question

Open Source Healthcare Application:

  • OpenMRS – Open Source EMR system
    • Apache/PHP Web Server
    • MySQL Database Server
  • IceHRM – HRHIS (Human Resource for Health Information System)
    • Apache/PHP Web Server
    • MySQL Database Server

Let’s start things off like we normally do, with the layout of our methodology for writing our rule sets.  I’m not going to go through how to get these flows; please reference one of my previous posts on using Log Insight, Application Rule Manager, and vRealize Network Insight to gather this information.

dmz_ms_table2

A few things of note: we created an RFC 1918 IP Set in these groupings so that only external IP addresses can reach the EMR DMZ Web Servers.  We don’t want our internal Clinicians connecting to them, and by blocking the entire RFC 1918 range we should never get a connection from an internal system to the DMZ systems.  To do this, we create an IP Set containing all three RFC 1918 ranges, create a Security Group with that IP Set in its Inclusion Criteria, and then write a block rule, placed above an ANY rule, to filter the types of traffic that are allowed to hit the DMZ Web Servers.
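As a point of reference, that IP Set, Security Group, and block rule could be sketched out with PowerNSX as shown below.  Again, this is illustrative rather than the exact method used here: the object and section names are placeholders, and parameter names should be double-checked against your PowerNSX version.

# IP Set covering the three RFC 1918 ranges
$rfc1918 = New-NsxIpSet -Name "RFC1918-IPSET" `
    -IPAddresses "10.0.0.0/8,172.16.0.0/12,192.168.0.0/16"

# Security Group that includes the IP Set in its Inclusion Criteria
$rfc1918Sg = New-NsxSecurityGroup -Name "RFC1918-SG" -IncludeMember $rfc1918

# Group containing the DMZ Web Servers (or whichever group your rule set uses)
$dmzSg = Get-NsxSecurityGroup -Name "DMZ-SG-ALL"

# Block rule; order it above the broader ANY rule in the section so internal
# (RFC 1918) sources can never reach the DMZ Web Servers
Get-NsxFirewallSection "DMZ Rules" |
    New-NsxFirewallRule -Name "Block internal to DMZ Web" `
        -Source $rfc1918Sg -Destination $dmzSg -Action deny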

dmz_ms_table3

Let’s put our rules in the appropriate places on the appropriate firewalls and do some testing to verify the traditional method is working properly.

dmz_ms_pic7

NSX Edge Services Gateway Firewall Policy

This rule is in place to allow the EMR DMZ Web Servers to talk to the backend Database only.  We have to use an IP Set here because the DMZ Web Servers are outside the scope of NSX and do not have a firewall applied to them yet.  However, we can control what talks to the EMR-SG-DB Security Group from the physical environment.

dmz_ms_pic8

Physical Firewall Policy

We’re going to forward our DMZ Web Servers through our Physical Firewall to accept traffic on TCP 8080.  With this change we should be able to access our OpenMRS EMR system from the Internet.  Let’s verify.

dmz_ms_pic9

As you can see from the address bar, we’re able to hit one of the DMZ Web Servers from the Internet.  I’m using an iPad to demonstrate that the device doesn’t matter at this point.  We can also verify that our NSX ESG Firewall is being hit by the DMZ Web Servers.  Using Log Insight, we can verify this quickly.

dmz_ms_pic10

We can see that the DMZ Servers are hitting our rule and that the destination being hit is 172.16.20.11, which is the EMR-DB-01a server.

Let’s put our rules for inside the data center into the NSX DFW.

dmz_ms_pic11

This type of configuration represents how we’d have to build our rule sets to accommodate a segregated DMZ environment.  Let’s verify that our EMR DMZ and Internal EMR Web Servers can still hit the EMR DB, and that our Clinician Desktop and HR Desktop can still browse to their respective systems.

Clinician Desktop to Internal EMR

dmz_ms_pic12

HR Desktop to HRHIS

dmz_ms_pic13

We’ve confirmed that all the rules in place are working and the traditional approach still works.  Let’s collapse those two Web Servers back into the data center and show how we can still provide a similar security posture, without the physical hardware isolation.

To do this, we need to move the two EMR DMZ Web Servers back into our data center.  I’m going to create a new VXLAN network for them to live on that mimics their physical VLAN configuration inside the data center, so we can still keep network isolation.  Keeping the same network doesn’t technically matter since we can still control the traffic, but most production Healthcare organizations would prefer not to change the IP addresses of their production systems if they can help it.
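If you would rather stand the segment up from a script, the logical switch and the NIC reconnection can be sketched with PowerNSX and PowerCLI along the lines below.  The transport zone and switch names are placeholders, and the Connect-NsxLogicalSwitch step in particular is an assumption about the module’s cmdlets, so check it with Get-Help (or simply reconnect the NICs in the vSphere client) before relying on it.

# Create a VXLAN logical switch in the existing transport zone to mirror the
# old physical DMZ VLAN ("TZ-Local" and the switch name are placeholders)
$tz = Get-NsxTransportZone -Name "TZ-Local"
$dmzLs = New-NsxLogicalSwitch -TransportZone $tz -Name "LS-EMR-DMZ-WEB"

# Re-attach the relocated DMZ Web Servers' NICs to the new segment
Get-VM "EMR-DMZ-WEB-01a", "EMR-DMZ-WEB-02a" |
    Get-NetworkAdapter |
    Connect-NsxLogicalSwitch -LogicalSwitch $dmzLs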

dmz_ms_pic14

dmz_ms_pic15

As you can see, the EMR-DMZ-WEB-01a/02a machines are now inside the Compute cluster in my data center.  They’re also on the same Layer 2 network they were on before, when they were in hardware isolation.

We’ve disabled the Firewall on the ESG as well.

dmz_ms_pic16

And here are the modified DFW rule sets that accommodate a collapsed DMZ environment similar to the hardware-isolated configuration.

dmz_ms_pic17

So, here’s what we added or changed:

  • We added our RFC1918 Security Group so that internal systems cannot connect to the DMZ Web Servers.
  • We also created a PERIMETER-IPSET for the Physical Firewall. This is because the ports for the EMR DMZ Web Servers are being NAT’d through the Perimeter Firewall, so communications to the EMR DMZ Web Servers appear to come from an interface on that device.  Since that interface is on an RFC 1918 network, we add it to the RFC1918 Security Group as an Excluded host address.
  • Added DMZ Security Tags so that any new systems that are built can have the DMZ-ST-ALL Security Tag applied, which puts them into the DMZ-SG-ALL Security Group and blocks intra-server communications immediately.  A scripted sketch of that tagging workflow follows this list.
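Here is what that tagging step might look like in PowerNSX.  The assumption, to be clear, is that DMZ-SG-ALL has (or is given) a dynamic membership criterion matching the DMZ-ST-ALL tag, as configured in the screenshots above; the cmdlet and parameter names come from the community module and should be confirmed with Get-Help before use.

# Security Tag for anything that belongs in the DMZ
$dmzTag = New-NsxSecurityTag -Name "DMZ-ST-ALL"

# Apply the tag to the relocated DMZ Web Servers; any future DMZ build can be
# tagged the same way and will fall under the DMZ-SG-ALL rules immediately,
# provided the group's dynamic criteria match on this tag
Get-VM "EMR-DMZ-WEB-01a", "EMR-DMZ-WEB-02a" | ForEach-Object {
    New-NsxSecurityTagAssignment -ApplyToVm -VirtualMachine $_ -SecurityTag $dmzTag
}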

Now that all of our changes in architecture are in place, we can go through and verify that all the requirements are being accounted for.  Let’s revisit the requirements.

Use case – Collapse the existing DMZ back into the data center while still maintaining the same security posture as when it was isolated.

  • Restrict External Patients to connect only to the EMR DMZ Web Servers

dmz_ms_pic18

dmz_ms_pic19

dmz_ms_pic20

We can see that our External device, with an IP of 172.221.12.80, is connecting to our EMR-DMZ-WEB-01a server.  We can also see that the Web Server is talking to the backend EMR-DB-01a server.

  • Restrict Internal Clinicians to connect only to the internal EMR Web Server

dmz_ms_pic21

dmz_ms_pic22

dmz_ms_pic23

Here we can see that our Internal Clinician Desktop can connect to the Internal EMR Web Server, but when it attempts to connect to one of the DMZ Web Servers, it’s blocked.

  • Allow all EMR Web Servers to connect to the EMR DB Server

dmz_ms_pic18

dmz_ms_pic21

This requirement appears to be functioning as expected as well.

  • Block all EMR Web Servers from talking to each other

dmz_ms_pic24

A quick cURL to the Web Servers shows that the Internal and DMZ Web Servers are not communicating with each other.  From EMR-DMZ-WEB-02a to EMR-DMZ-WEB-01a we’re not getting a connection either.
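Since the admins in this series standardize on PowerShell Core, the same probe can be expressed with Invoke-WebRequest instead of cURL wherever PowerShell happens to be installed.  This is just a sketch: the hostnames are placeholders, and OpenMRS is assumed to be listening on TCP 8080 as it is elsewhere in this post.

# Run from EMR-DMZ-WEB-02a; with the intra-DMZ block rule in place both
# requests should fail to connect ("emr-web-01a" is a placeholder name for
# the internal EMR Web Server)
$targets = @("http://emr-dmz-web-01a:8080", "http://emr-web-01a:8080")

foreach ($url in $targets) {
    try {
        Invoke-WebRequest -Uri $url -TimeoutSec 5 -UseBasicParsing | Out-Null
        Write-Output "$url : reachable (unexpected)"
    }
    catch {
        Write-Output "$url : blocked or timed out (expected)"
    }
}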

  • Maintain DMZ isolation of the EMR System from the HR System

dmz_ms_pic25

Another cURL attempt to the HRHIS System shows that the EMR-DMZ-WEB-01a server is not able to communicate with the HRHIS System.  This completes the requirements set forth by the customer.  Access to patient information is now limited to the EMR system, and the compromise of any adjacent system within the Healthcare organization will not allow communications between that system and the EMR.  We have effectively reduced the attack surface and added defense-in-depth security with minimal effort.

As we look back, there are several ways to architect a DMZ environment.  Traditional hardware isolation methods can be augmented without massive infrastructure changes to an existing DMZ.  Customers looking to remove the hardware isolation altogether can do so by collapsing the DMZ environment back into the data center while still maintaining the same level of control over communications both into and out of the DMZ systems.  With NSX, the DFW and its ability to control security from an East-West perspective can be overlaid on top of any existing architecture.  This software-based approach helps secure a Healthcare organization’s most critical externally-facing patient systems and helps reduce exposure from adjacent threats in the data center.

The VMware NSX Platform – Healthcare Series – Part 5: DMZ Anywhere Concept

Healthcare organizations are being asked to expose Internet-based services and applications to their patients more than ever.  In Healthcare, exposure of PHI and PII is of the utmost concern.  The perimeter of the Healthcare organization needs to be as secure as possible, and exposing external systems and applications to the Internet falls under this scope as well.  Traditional DMZ approaches are hardware-centric, costly, and operationally difficult in most modern data centers.  With VMware NSX, we can take the concept of the DMZ and augment a current DMZ approach, or even collapse the DMZ back inside the data center, while still providing the robust security posture necessary for Internet-facing applications.

Let’s revisit the nine NSX use cases we identified previously.

dmz_aw_pic1

DMZ Anywhere is a use case that our customers are looking at that augments traditional hardware-based approaches and leverages the Distributed Firewall capabilities to segment how traffic is allowed to flow between systems anywhere in the data center.  Let’s be clear: VMware NSX is not in the business of replacing a hardware perimeter firewall.  But with NSX, you can fundamentally change how you design the DMZ environment once you’re inside the perimeter firewall, providing a solution that is much easier to manage and scale overall.  You can review previous posts on how micro-segmentation works here: https://vwilmo.wordpress.com/category/micro-segmentation/

Let’s take a quick look at traditional approaches to building a DMZ environment with physical devices.

dmz_aw_pic2

Traditional hardware-based approaches can leverage either Zone-based logical firewalling or physically independent firewalls to separate out a specific section, called the DMZ, for Internet-facing applications to sit in.  These zones are built to only allow specific sets of communication flows from the Internet-facing systems to their backend components.  The systems are typically on their own separate networks.  Typical applications exposed to the Internet are web-based front ends for major systems, and these can comprise several Web Servers, all of which can be used to provide multi-user access to the application.

If customers want to keep the same traditional approach using Zone-based firewalling, NSX can help block East-West movement for the virtual systems that reside within the DMZ.  In most cases, the systems that sit in the DMZ are Web-based systems, and these typically do not require communications between the Web Servers, or even between disparate applications.

dmz_aw_pic3

In the above examples, all the DMZ Servers can initiate a conversation bi-directionally with each other.  This is inherently insecure, and the only way to secure these flows is to send all the East-West traffic through the firewall.  When you add more systems, you add more rules, and the problem continues to compound the larger the DMZ gets.  What if you have multiple networks and systems in the DMZ?  That will require significantly more rules and more complexity.  If you need to scale out this environment, it becomes even more operationally difficult.  How can NSX plug into this scenario, reduce this complexity, and still provide a similar level of security?

With NSX, we can provide the East-West firewalling capabilities in both scenarios to secure the applications from one another.  If one system is breached, the attack surface for lateral movement is removed, as the systems don’t even know the other systems exist.

By putting in NSX, we’re now blocking the systems from talking to each other without changing any aspect of the underlying infrastructure.  We’re placing an NSX firewall at the virtual machine layer and blocking traffic there.  As you can see, NSX can be made to fit nearly any DMZ infrastructure architecture.

dmz_aw_pic4

Here we have our Electronic Medical Records application with an Internet-facing Patient Access Portal.  With a traditional approach, the Patient Portal may be on separate hardware, situated between two sets of hardware Firewalls (or one set of internally zoned Firewalls), and on a completely different logical network.  The backend systems required by the DMZ EMR systems sit behind another internal firewall along with the rest of the systems in the data center, in this case shared infrastructure systems and the EMR backend database.  Neither of these systems should have contact with the Internal HR Web or DB Server.  If they did, a compromise of the DMZ environment could give an attacker access to other sensitive internal systems like the HR system.

Now let’s look at how NSX can change the traditional design of a DMZ and collapse it back into the data center, while still allowing the same levels of security as traditional methods, but with a software-based focus.

dmz_aw_pic5

Using NSX in this approach, we’re doing the same thing we did when we augmented the existing hardware approach: placing a software-based Firewall on each Virtual Machine in the data center.  This fundamentally means that every VM has its own perimeter, and we can programmatically control how each of those VMs talks, or doesn’t talk, to the others.  This approach could enable a Healthcare organization to pull the hardware isolation for their DMZ back into their data center compute clusters and apply DMZ-level security to those specific workloads, thereby collapsing the isolation into software constructs instead of hardware ones.  In the collapsed DMZ model, we have no separate infrastructure to support a DMZ environment; we simply control the inbound traffic from the perimeter through the physical firewall as we normally would, and apply VM-level security with NSX between the systems that would have been separated out.  The DMZ EMR Web Servers are still restricted from accessing the HR system even though they technically live next to each other within the Internal data center.

Let’s contrast a software-based approach versus traditional hardware methods.

Hardware-based

  • For Zone-based firewalling leveraging a single hardware appliance, this is much less of an issue, but some organizations purchase multiple Firewalls at the perimeter for HA configurations.  If they separate their DMZ using two sets of Firewalls, that means they’ll need to purchase at least four Firewalls to perform this configuration.
    • New features and functions with hardware products can be tied to the hardware itself. Want these new items?  That could require a new hardware purchase.
  • Scale
    • Hardware-based scaling is generally scale-up. If the Firewall runs out of resources or becomes over-utilized, it could require moving to larger Firewalls to accommodate the load.  This means a rip-and-replace of the existing equipment.
  • Static
    • A hardware-based DMZ is very static. It doesn’t move within the data center, and the workloads have to be positioned in accordance with the network functions it provides.  In modern data centers, workloads can exist anywhere and on any host in the data center.  They can even exist between data centers.  Uptime is critical for Healthcare providers, as is maintaining data security.  Wherever the workload may end up, it requires the same, consistent security policy.
  • Cost
    • Buying multiple hardware Firewalls is not cheap. If the organization needs to scale up, ripping and replacing the existing Firewalls for new ones can be costly and incur downtime.  For Healthcare organizations, downtime affects patient care.  Some DMZ architectures have separate hardware to run only the workloads in the DMZ environment.  This separates the management of that environment from the internal data center environment.  It also means that, when architecting a hardware-based DMZ, you may end up with compute resources that are costly and underutilized, a concept that goes completely against virtualization in general and leads to higher operating costs in the data center and wasted resources.
  • Operationally difficult
    • If the customer goes with the multiple-Firewall method, configuring the allowed and disallowed traffic means working in two sets of Firewalls. Hardware Firewalls for the DMZ will require MAC addresses for all the workloads behind them.  A DMZ may span a few networks, but usually the Web Servers exist on the same logical network.  Healthcare systems can have massive Internet-facing infrastructures to serve their patients.

Software-based

  • By placing the Firewall capabilities within the ESXi kernel, we’re able to ensure security simply by virtue of the workload residing on a host running the vSphere hypervisor. Where new features and functions might require upgrading proprietary Firewall hardware, NSX runs on any x86 hardware and new features simply require an update to the software packages, reducing the possibility of ripping and replacing hardware.  For Healthcare customers, this reduces or eliminates the downtime required to keep systems up to date, where downtime comes at a premium.
  • Scale
    • The nature of NSX being in every hypervisor means the Firewall scales linearly as you add more hypervisors to a customer environment. It also means that instead of having to purchase large physical Firewalls for all your workloads, the DFW provides throughput and functionality for whatever your consolidation ratio is on your vSphere hosts.  Instead of a few physical devices supporting security for hundreds or thousands of virtual machines, each host with the vSphere hypervisor supports security for the VMs residing on it.  With a distributed model that scales as you add hosts, this creates a massive-scale platform for security needs.  Physical Firewalls with high-bandwidth ports are very expensive, and generally don’t have nearly as many ports as you can have in a distributed model across multiple x86 hardware platforms.
  • Mobility
    • Hardware-based appliances are generally static. They don’t move in your data center, although the workloads you’re trying to protect may.  These workloads, when virtualized, can be moved to any number of hosts within the data center and even between data centers.  With NSX, the Firewall policy follows the virtual workload no matter the location.  Healthcare providers care about uptime, and the ability to move sensitive data systems around to maintain uptime, while maintaining security, is crucial.
  • Cost-effective
    • Software-based solutions only need to be licensed for the hosts that the workloads will reside on. There is no need to purchase licensing for hosts where protected workloads may never reside.  Healthcare organizations can focus on the workloads that house their patients’ sensitive data and the systems that interact with them.
    • No need to spend money on separate hardware just for a DMZ. Collapse the DMZ workloads back to the compute environments and reduce wasted resources.
  • Operationally easier
    • By removing another configuration point within the security model, NSX can still provide the same level of security around DMZ workloads even if they sit on the same host as a non-DMZ workload, all while keeping them logically isolated rather than physically isolated. With NSX, there’s no reason to use multiple networks to segment DMZ traffic and the workloads on those segments.  NSX resolves the IP and MAC addresses so that rule and policy creation is much simpler and can be applied programmatically versus traditional manual methods.

When it comes to DMZ architecture, the traditional hardware approaches that have been followed in the past can be too static and inflexible for modern workloads.  Healthcare customers need uptime and scale, as the medical systems that house patient data are not getting smaller and patients’ requirements for access to their information continue to grow.  With NSX, we can augment a current DMZ strategy, or even collapse the physical DMZ back into the virtual compute environment, and still provide the same levels of security and protection as hardware-based approaches, at a lower cost and with easier maintenance.