This is a one-stop global knowledge base where you can learn about all products, solutions, and support features.
Product Release Date: 2022-08-02
Last updated: 2022-08-02
This Collector version includes the new features and updates from Collector version 4.1. For more information, see the Collector 4.1 Release Notes.
The following issue is resolved in this release:
For more information about Collector open source licensing details, see Open Source Licenses for Collector.
Product Release Date: 2022-09-14
Last updated: 2022-09-14
For more information about Collector open source licensing details, see Open Source Licenses for Collector.
Product Release Date: 2022-09-14
Last updated: 2022-09-14
Following are the feature updates made in this release:
A new column, Collector vPartition UUID, is added to the vPartition tab so that only unique entries are counted when calculating storage capacity and consumed storage.
The following issues are resolved in this release:
This section describes the issues found in this or recent releases that you might encounter:
For the workaround, see the Troubleshooting section of the Collector User Guide.
For more information about Collector open source licensing details, see Open Source Licenses for Collector.
Last updated: 2022-12-14
Nutanix Data Lens™ (Data Lens) provides a cloud-hosted analytics and monitoring service for all of your file servers hosted on Nutanix Files. Data Lens centralizes data from all of your clusters connected to Pulse, across various data center locations. Cloud resources reduce scaling constraints, as the Cloud is not dependent on the file server resources, letting you have near-real-time analytics and alerts even for load-heavy file servers of more than 250 million files and over 500 TB of storage. Hosting File Analytics on premises limits the service to local file servers only. In contrast, Data Lens functions on a global level, in a cluster-neutral environment, without being tied to a single Nutanix cluster.
Meet the requirements for running Data Lens.
Perform the tasks described in this chapter to get started with Data Lens.
Do the following:
You must have a My Nutanix Account to access the Data Lens console.
Perform the following procedure to create a My Nutanix account.
Follow the specified password complexity requirements when you are creating the password.
A confirmation page is displayed and you receive an email from mynutanix@nutanix.com after you successfully complete the sign-up process.
Following is an example of the email.
Hi First Name,
Welcome to the My Nutanix portal!
To get started, confirm your email by clicking on the link below. If clicking the link does not work, you can copy and paste the link into your browser's address window.
https://my.nutanix.com/#/verify?username=your_email_address_&confirmation=
If you run into any issues, please email portal-accounts@nutanix.com to speak with a Nutanix Portal representative. Please do not reply to this email directly.
Best Regards,
Nutanix Team
A confirmation message briefly appears and you are directed to the Nutanix Support portal.
The Welcome Back page appears.
Before you sign up for a paid plan to use Data Lens, you can start a 60-day free trial. To continue to use Data Lens after the trial period ends, you must upgrade your plan to one of the paid plans.
Perform the following procedure to subscribe to a free trial of Data Lens.
The Port Reference provides detailed port information for Data Lens and other Nutanix products and services, including port sources and destinations, service descriptions, directionality, and protocol requirements.
Lists the unsupported features for Data Lens.
The Global Dashboard details all registered file servers.
The Data Lens Global Dashboard is the landing page after launching Data Lens. The Global Dashboard includes a table that lists all file servers across all of your registered clusters.
The Global Dashboard consists of the following elements.
Column | Description |
---|---|
Status | Indicates if Data Lens is enabled or disabled. |
Vendor | Indicates the file server vendor. |
File Server Name | Indicates the name of the file server. |
Number of Shares | Indicates the number of shares and exports on the file server. |
File Server Version | Indicates the Files version of the file server. |
Data Retention | Indicates the data retention period of Data Lens data. |
Three-Dot Menu | Provides options to enable or disable Data Lens on the file server. |
In the Data Summary pane, clicking the number of Shares displays the Share Details view.
The Share Details view consists of the following elements.
Column | Description |
---|---|
Share Name | Indicates the name of the share. |
File Server | Displays the name of the file server of that share. |
Used Capacity | Indicates the capacity used by the share. |
Max Size | Indicates the quota assigned to a share, which is the maximum capacity the share can use. If the max size is not defined, the table displays Undefined. |
The Dashboard tab displays data on the operational trends of an entity (file server, object, or share).
The Dashboard tab is the opening screen that appears after launching Data Lens for a specific entity. The scope selector indicates the entity for which the dashboard displays data using various widgets. By default, the scope selector displays data for the file server (all shares). To have the widgets display data for a specific share, select a single share from the scope selector. See the "Dashboard Widgets" table for a description of each widget.
Tile Name | Description | Values and Intervals |
---|---|---|
Data Growth Trend | Displays data growth trends for the entity, including the data added, data removed, and net changes. Clicking an event period widget displays the Data Growth Trend Details view. Clicking View Forecast displays the data growth forecast for a time period. | Last 7 days, last 30 days, or last 1 year |
Data summary by age and storage tier | Displays the percentage of data by age. Data age determines the data heat: hot, warm, or cold. Includes an option to open the Smart Tiering Dashboard to configure tiering and cost savings (see Configuring Cost Savings). To edit the heat levels, see Configuring Data Heat Levels. | Default intervals (see Configuring Data Heat Levels) |
Permission denials | Displays users who have had excessive permission denials and the number of denials. Clicking a user displays audit details; see Audit Trails - Users View for more. | [user id], [number of permission denials] |
File distribution by size | Displays the number of files by file size. Provides trend details for the top 5 files. | Less than 1 MB, 1–10 MB, 10–100 MB, 100 MB to 1 GB, greater than 1 GB |
File distribution by type | Displays the space taken up by various applications and file types. The file extension determines the file type. See the File Types table for more details. | MB or GB |
File distribution by type details view | Displays a trend graph of the top 5 file types. File distribution details include file type, current space used, current number of files, and change in space for the last 7 or 30 days. Clicking View Details displays the File Distribution by Type view. | Daily size trend for top 5 files (GB), file type (see the "File Type" table), current space used (GB), current number of files (numeric), change in last 7 or 30 days (GB) |
Potential duplicate files | Displays a summary of potential duplicate files based on their name, size, and extension for files larger than 1 MB. | Integers for total files with duplicates and total count of duplicates. MB, GB, or TB for total size of duplicates |
Top 5 active users | Lists the users who have accessed the most files and the number of operations each user performed for the specified period. When there are more than 5 active users, the more link provides details on the top 50 users. Clicking the user name displays the audit view for the user; see Audit Trails - Users View for more. | 24 hours, 7 days, 1 month, or 1 year |
Top 5 accessed files | Lists the 5 most frequently accessed files. Clicking more provides details on the top 50 files. Clicking the file name displays the audit view details for the file; see Audit Trails - Files View for more. | 24 hours, 7 days, 1 month, or 1 year |
Files operations | Displays the distribution of operation types for the specified period, including a count for each operation type and the total sum of all operations. Operations include: create, delete, read, write, rename, permission changed, set attribute, symlink, permission denied, and permission denied (file blocking). Clicking an operation displays the File Operation Trend view. | 24 hours, 7 days, 1 month, or 1 year |
Clicking an event period in the Data Growth Trend widget displays the Data Growth Trend Details view for that period. The view includes the Share/Export and Category tabs (only the Category tab appears when viewing the details for a share). Each tab includes columns detailing entity details such as Name, Net Capacity Change, Data Added, and Data Removed.
Column | Description |
---|---|
Name | Name of share/export or category |
Net capacity change | The total difference between capacity at the beginning and the end of the specified period |
Data added | Total data added for the specified period |
Data removed | Total data removed for the specified period |
Clicking the View Forecast for the Data Growth Trend widget displays the data forecast for an entity over a time period.
The data forecast is based on the historical data of the last 90 days. From the drop-down option, select a forecast period of 3 months, 6 months, 9 months, or 1 year.
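The exact forecasting model that Data Lens uses is not documented here. The following Python sketch only illustrates the general idea of projecting days-until-full from historical usage with a simple linear trend; the function name, sample values, and the linear model are hypothetical.

```python
# Illustrative only: a naive linear-trend forecast over 90 days of history.
# The usage values and the model are hypothetical, not the Data Lens algorithm.
def days_until_full(daily_usage_gb, max_capacity_gb):
    """Estimate days until max capacity is reached from daily usage samples."""
    days = len(daily_usage_gb)
    growth_per_day = (daily_usage_gb[-1] - daily_usage_gb[0]) / (days - 1)
    if growth_per_day <= 0:
        return None  # no growth at this trend, so capacity is never reached
    remaining_gb = max_capacity_gb - daily_usage_gb[-1]
    return remaining_gb / growth_per_day

# Example: 90 days of usage growing from 400 GB to 578 GB, 1024 GB capacity.
history = [400 + i * 2 for i in range(90)]
print(days_until_full(history, 1024))  # about 223 days at 2 GB/day growth
```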
Usage | Description |
---|---|
Low Usage | The estimated number of days until the entity reaches its maximum capacity, assuming low usage. |
Medium Usage | The estimated number of days until the entity reaches its maximum capacity, assuming medium usage. |
High Usage | The estimated number of days until the entity reaches its maximum capacity, assuming high usage. |
Clicking View Details for the File Distribution by Type widget displays granular details of file distribution, see the File Types table for details.
Column | Description |
---|---|
File type | Name of file type |
Current space used | Space capacity occupied by the file type |
Current number of files | Number of files for the file type |
Change (in last 30 days) | The increase in capacity over a 30-day period for the specified file type |
Category | Supported File Type |
---|---|
Archives | .cab, .gz, .rar, .tar, .z, .zip |
Audio | .aiff, .au, .mp3, .mp4, .wav, .wma |
Backups | .bak, .bkf, .bkp |
CD/DVD images | .img, .iso, .nrg |
Desktop publishing | .qxd |
Email archives | .pst |
Hard drive images | .tib, .gho, .ghs |
Images | .bmp, .gif, .jpg, .jpeg, .pdf, .png, .psd, .tif, .tiff |
Installers | .msi, .rpm |
Log Files | .log |
Lotus notes | .box, .ncf, .nsf, .ns2, .ns3, .ns4, .ntf |
MS Office documents | .accdb, .accde, .accdt, .accdr, .doc, .docx, .docm, .dot, .dotx, .dotm, .xls, .xlsx, .xlsm, .xlt, .xltx, .xltm, .xlsb, .xlam, .ppt, .pptx, .pptm, .potx, .potm, .ppam, .ppsx, .ppsm, .mdb |
System files | .bin, .dll, .exe |
Text files | .csv, .pdf, .txt |
Video | .avi, .mpg, .mpeg, .mov, .m4v |
Disk image | .hlog, .nvram, .vmdk, .vmx, .vmxf, .vmtm, .vmem, .vmsn, .vmsd |
Clicking View Duplicates displays the Potential Duplicate Files view. The following table describes the information found on each pane in the view.
Pane | Summary |
---|---|
Overall Summary | Provides three high-level metrics: Total Files with Duplicates, Total Count of Duplicates, and Total Size of Duplicates. |
Files With Duplicates | Includes an option to search by file name and a table that lists potential duplicate files. The table is organized with the file names, a link to View All Instances, and the following columns: original creation date, duplicate count, size of duplicates, and number of duplicate file owners. Clicking View All Instances displays a table with information about the file organized by the following columns: Path, Owner, Share, Date Created, and Date Modified. |
Filters | Lets you filter by original creation date, duplicate count, size of duplicates, number of duplicate file owners, data type, owners, and shares. |
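As described above, potential duplicates are identified by matching file name, size, and extension for files larger than 1 MB. The following Python sketch only illustrates that grouping rule; the paths, sizes, and grouping function are hypothetical and not the Data Lens implementation.

```python
import os
from collections import defaultdict

ONE_MB = 1024 * 1024

def group_potential_duplicates(files):
    """files: list of (path, size_bytes). Group files > 1 MB by (name, extension, size)."""
    groups = defaultdict(list)
    for path, size in files:
        if size <= ONE_MB:
            continue  # only files larger than 1 MB are considered
        name = os.path.basename(path)
        ext = os.path.splitext(name)[1].lower()
        groups[(name, ext, size)].append(path)
    # Keep only groups with more than one file: these are potential duplicates.
    return {key: paths for key, paths in groups.items() if len(paths) > 1}

# Hypothetical example input
files = [("/share1/report.pdf", 5 * ONE_MB),
         ("/share2/archive/report.pdf", 5 * ONE_MB),
         ("/share1/notes.txt", 1024)]
print(group_potential_duplicates(files))
```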
Clicking an operation type in the File Operations widget displays the File Operation Trend view. The File Operation Trend view breaks down the specified period into smaller intervals, and displays the number of occurrences of the operation during each interval.
Element | Description |
---|---|
Operation type | A drop-down option to specify the operation type. See Files Operations in the Dashboard Widgets table for a list of operation types. |
Last (time period) | A drop-down option to specify the period for the file operation trend. |
File operation trend graph | The x-axis displays shorter intervals for the specified period. The y-axis displays the number of operations trend over the extent of the intervals. |
Data panes in the Anomalies view display data and trends for configured anomalies.
The Anomalies view provides options for creating anomaly policies and displays dashboards for viewing anomaly trends.
You can configure anomalies for the following operations:
Define anomaly rules by specifying the following conditions:
Meeting the lower operation threshold triggers an anomaly.
Consider a scenario where you have 1,000 files, the operation count threshold is defined as 10, and the operation percentage threshold is defined as 10%. Because 10% of 1,000 is 100, the count threshold of 10 is the lower of the two and therefore takes precedence: an anomaly triggers after 10 operations.
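A minimal Python sketch of the "lower threshold wins" rule described above (the function name is hypothetical; the numbers are the values from the example):

```python
# Illustrative sketch of the rule: the lower operation threshold triggers the anomaly.
def effective_threshold(total_files, count_threshold, percent_threshold):
    """Return the number of operations after which an anomaly triggers."""
    percent_as_count = total_files * percent_threshold / 100
    return min(count_threshold, percent_as_count)

# 1,000 files, count threshold 10, percentage threshold 10% (equals 100 operations):
print(effective_threshold(1000, 10, 10))  # 10 -> the count threshold applies
```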
Pane Name | Description | Values |
---|---|---|
Anomaly Trend | Displays the number of anomalies per day or per month. | Last 7 days, Last 30 days, Last 1 year |
Top Users | Displays the users with the most anomalies and the number of anomalies per user. | Last 7 days, Last 30 days, Last 1 year |
Top Folders | Displays the folders with the most anomalies and the number of anomalies per folder. | Last 7 days, Last 30 days, Last 1 year |
Operation Anomaly Types | Displays the percentage of occurrences per anomaly type. | Last 7 days, Last 30 days, Last 1 year |
Clicking an anomaly bar in the Anomaly Trend graph displays the Anomaly Details view.
Column | Description |
---|---|
Anomaly Type | The configured anomaly type. Anomaly types not configured do not show up in the table. |
Total User Count | The number of users that have performed the operation causing the specified anomaly during the specified time range. |
Total Folder Count | The number of folders in which the anomaly occurred during the specified time range. |
Total Operation Count | Total number of anomalies for the specified anomaly type that occurred during the specified time range. |
Time Range | The time range for which the total user count, total folder count, and total operation count are specified. |
Column | Description |
---|---|
Username or Folders | Indicates the entity for the operation count. Selecting the Users tab indicates operation count for specific users, and selecting the Folders tab indicates the operation count for specific folders. |
Operation count | The total number of operations causing anomalies for the selected user or folder during the time period for the bar in the Anomaly Trend graph. |
Steps for configuring anomaly rules.
To create an anomaly rule, do the following.
Use audit trails to look up operation data for a specific user, file, folder, or client.
The Audit Trails view includes Files , Folders , Users , and Client IP options for specifying the audit type. Use the search bar for specifying the entity for the audit (user, folder, file, or client IP).
The results table presents details for entities that match the search criteria. Clicking the entity name (or client IP number) takes you to the details for the target entity.
Audit a user, file, client, or folder.
Details for the user audit trails view.
When you search by user in Audit Trails , search results display the following information in a table:
Clicking View Audit displays the Audit Details page, which shows the following audit information for the selected user.
The Results table provides granular details of the audit results. The following data is displayed for every event.
Click the gear icon for options to download the data as an XLS, CSV, or JSON file.
Details for the folder audit trails view.
When you search by folder name in Audit Trails , search results display the following information in a table:
The Audit Details page shows the following audit information for the selected folder.
The Results table provides granular details of the audit results. Data Lens displays the following data for every event.
Click the gear icon for options to download the data as a CSV file.
Dashboards details for the files audit trails view.
When you search by file name in Audit Trails , search results display the following information in a table:
The Audit Details page shows the following audit information for the selected file.
The Results table provides granular details of the audit results. Data Lens displays the following data for every event.
Click the gear icon for options to download the data as a CSV file.
Details for the client IP audit trails view.
When you search by client IP in Audit Trails , search results display the following information in a table:
The Audit Details page shows the following audit information for the selected client.
The Results table provides granular details of the audit results. Data Lens displays the following data for every event.
Click the gear icon for an option to download the data as a CSV file.
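Once you download an audit trail export, you can post-process it with standard tooling. The Python sketch below assumes a file named audit.csv with hypothetical column names (user, operation); the actual export columns may differ, so adjust the names to match your file.

```python
import csv
from collections import Counter

# Count events per user in a downloaded audit-trail export.
# "audit.csv" and its column name "user" are assumptions for illustration only.
ops_per_user = Counter()
with open("audit.csv", newline="") as f:
    for row in csv.DictReader(f):
        ops_per_user[row["user"]] += 1

# Print the five users with the most audited events.
for user, count in ops_per_user.most_common(5):
    print(user, count)
```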
Protect your file server with zero-day ransomware detection and protection.
Data Lens scans file audit events for ransomware in near real time and notifies you in the event of a ransomware attack once you configure email notifications. Ransomware protection includes signature-based and event-pattern-based ransomware detection.
Signature-based detection uses the Nutanix Files ransomware file blocking mechanism to identify and block file renames whose extensions and file names match known ransomware signatures. Blocking these renames helps identify malicious activity and contains the ransomware attack, preventing it from infecting further files. The ransomware file blocking mechanism uses a dynamically curated list of signatures that frequently appear in ransomware files. The curated list is updated dynamically as new ransomware signatures become available. You can also modify the list by manually adding or removing signatures.
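Conceptually, signature-based blocking compares the new name of a renamed file against the curated signature list. The Python sketch below only illustrates that idea; the signature strings and matching logic are placeholders, not the Nutanix Files implementation or its curated list.

```python
# Hypothetical sketch of signature-based rename blocking.
# The signatures below are placeholders, not the curated Nutanix list.
BLOCKED_SIGNATURES = {".locked", ".crypt", "decrypt_instructions.txt"}

def should_block_rename(new_name):
    """Block the rename if the new file name or extension matches a signature."""
    lowered = new_name.lower()
    return any(lowered.endswith(sig) for sig in BLOCKED_SIGNATURES)

print(should_block_rename("report.docx.locked"))     # True: matches ".locked"
print(should_block_rename("quarterly_report.docx"))  # False: no signature match
```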
Event-pattern-based ransomware protection analyzes audit events in near real time to identify potential ransomware attacks. Configuring auto-remediation lets you block malicious clients from accessing all shares. You can also put the shares in read-only mode so that no client can perform write operations. Nutanix recommends upgrading to Files 4.2 to use these advanced auto-remediation features, which contain a ransomware attack and prevent it from infecting further files.
Data Lens also monitors shares for self-service restore (SSR) policies and identifies shares that do not have SSR enabled in the ransomware dashboard. You can enable SSR through the ransomware dashboard.
The Ransomware dashboard includes panes for viewing the threats summary and its details, managing and configuring ransomware protection, managing recovery settings, viewing blocked clients (users or client IP addresses), and viewing and updating blocked file signatures.
The Ransomware dashboard includes the following sections:
The Threats Summary pane of the ransomware dashboard displays the highlighted threats and their details (impacted shares, users, client IP address, impacted files, and Recover).
To view the threat details and the impacted files, do the following:
Enable ransomware protection on your file server.
Configure ransomware protection policies on file servers.
Do the following to configure a ransomware protection policy on a file server.
Search for, add, or remove a signature from the ransomware protection list.
Do the following to add or remove a signature from the protection list.
Enable self-service restore (SSR) on shares identified by Data Lens.
Data Lens scans shares for SSR policies. Do the following to protect the shares with the configured SSR policy.
Unblock blocked client IP addresses.
Do the following to unblock the blocked entities, such as client users and client IP addresses, on a file server.
Manage hot, warm, and cold file server data.
Use Smart Tiering to maximize the available file server space by moving cold data from the file server to an object store. Nutanix supports using Nutanix Objects, AWS Standard, AWS IA, or Wasabi (S3-compatible storage) as the object storage, which you must configure before setting up a tiering profile. For details on setting up Nutanix Objects, see the Objects User Guide.
Tiered storage does not contain the full data set. The full metadata and pointers to tiered data remain on the primary storage. However, tiering cold data to an object store does provide storage cost benefits, which you can calculate using the cost savings widget in the Data Tiering tab. You can recall tiered data from the object store by configuring an auto-recall policy during tiering policy creation, or by recalling data manually. You can also specify retention policies that indicate how long deleted data remains in secondary storage prior to permanent deletion.
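The cost savings widget estimates savings based on the cost model you configure (see Configuring Cost Savings). The Python sketch below shows only the general arithmetic with hypothetical per-TB prices; the widget's actual formula and your real storage costs may differ.

```python
# Hypothetical cost-savings arithmetic for tiered data.
# All prices are placeholders; use the rates from your own cost model.
def monthly_savings(tiered_tb, primary_cost_per_tb, object_cost_per_tb):
    """Estimated monthly savings from keeping tiered_tb on object storage."""
    return tiered_tb * (primary_cost_per_tb - object_cost_per_tb)

# 20 TB tiered, primary storage at $50/TB/month, object store at $10/TB/month.
print(monthly_savings(20, 50, 10))  # 800 (USD per month, illustrative)
```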
Do the following to enable tiering on the file server:
Meet the indicated requirements to configure and administer tiering.
Have the following AWS IAM user permissions.
Ensure that the security keys have the following permissions on the S3 bucket:
Nutanix recommends using the following security best practices:
The Data Age dashboard consists of the Smart Tiering and Explore tabs.
The Smart Tiering dashboard includes tools for managing the tiering configuration of a file server and consists of the following primary elements:
The following table provides a detailed description of the features of each pane in the dashboard.
Pane | Feature | Description |
---|---|---|
Tiering configuration | Tiering location | Indicates the name of the tiering profile and the object store type. Provides an option to configure or edit the tiering location. |
Capacity threshold | Indicates the configured capacity threshold and whether tiering is manual or scheduled. Provides option to configure the capacity threshold, edit the capacity threshold, and to set up a tiering schedule. | |
Tiering policy | Indicates the configured tiering policy. Provides option to define files for tiering. | |
Capacity summary | File server and capacity | Indicates the name of the file server, the capacity used, and the total capacity configured for the file server. |
Data distribution on primary storage | Indicates the distribution of data on the file server by the space used, the space planned for tiering, and the free space. Note: The widget refreshes hourly. | |
Total tiered data | The amount of data that has been moved to tiered storage. | |
Current cost savings | The approximate amount of money saved from tiering data. | |
Configure cost model option | See Configuring Cost Savings. | |
Manual recall option | See Manually Recalling Tiered Data. | |
Tier data option | See Tiering Data Manually. | |
Tiering summary | Number of files tiered | Indicates the total number of files tiered for the specified interval. |
Data tiered | Indicates the total amount of data tiered for the specified interval. | |
Data tiered graph | Displays the number of files tiered over time. Hovering over the data displays the value for the time specified on the horizontal axis. | |
Recall summary | Number of files recalled | Indicates the total number of files recalled for the specified interval. |
Data recalled | Indicates the total amount of data recalled for the specified interval. | |
Data recalled graph | Displays the number of files recalled over time. Hovering over the data displays the value for the time specified on the horizontal axis. |
The Explore tab consists of the following elements:
To tier data, configure a secondary storage object store.
Follow the steps as indicated to configure a tiering profile:
Specify when to tier data.
Follow the steps as indicated.
Add a policy that defines when to tier cold data to object storage.
To create a tiering policy, do the following:
For example, you can specify file extensions such as jpg, png, or mpeg, and users such as UID:13199 or nutanix\user.
Manually recall data from secondary to primary storage.
You can configure auto recall of data during tiering policy setup, see Creating a Tiering Policy. Otherwise, follow the steps as indicated to manually recall data.
Manually initiate tiering.
If you did not choose to tier data on a schedule in the capacity threshold configuration, tier data manually.
Edit an existing tiering configuration.
Follow the steps as indicated.
Update the values that constitute different data heat levels.
Configure the cost savings widget.
Configuring cost savings helps you estimate the amount of money saved by tiering data.
Generate a report for entities on the file server.
Create a report with custom attribute values or use one of the Data Lens pre-canned report templates. To create a custom report, specify the entity, attributes (and operators for some attributes), attribute values, column headings, and the number of columns, see Creating a Custom Report. Pre-canned reports define most of the attributes and headings based on the entity and template that you choose, see Creating a Pre-Canned Report. To schedule a report, see Scheduling a Report.
You can also rerun existing reports rather than creating new ones. After running a report, you can download it as a JSON or CSV file.
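After downloading a report as a JSON or CSV file, you can feed it into other tooling. The sketch below assumes a JSON export named report.json containing a list of row objects; the real export structure may differ, so inspect your file before relying on this layout.

```python
import json

# Load a downloaded report and print its row count and column names.
# "report.json" and the list-of-objects structure are assumptions for illustration.
with open("report.json") as f:
    rows = json.load(f)

print(f"{len(rows)} rows")
if rows:
    print("columns:", sorted(rows[0].keys()))
```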
The reports page provides options to create, schedule, and download reports.
The Reports page consists of the following elements:
Clicking Create a new report takes you to the New Report view, which includes Report builder and Pre-canned Reports Templates tabs.
The Report builder and Pre-canned Reports Templates tabs include the following elements:
Entity | Attributes (filters) | Operator | Value | Column |
---|---|---|---|---|
Events | event_date | | (date) | |
Event_operation | N/A | | ||
Files | Category | | (date) | |
Extensions | N/A | (type in value) | ||
Deleted | N/A | Last (number of days from 1 to 30) days | ||
creation_date | | (date) | ||
access_date | | (date) | ||
Size | | (number) (file size) File size options: | ||
Folders | Deleted | N/A | Last (number of days from 1 to 30) days | |
creation_date | | (date) | ||
Users | last_event_date | | (date) | |
Entity | Pre-canned report template | Columns |
---|---|---|
Events | | |
Files | | |
Users | | |
Create a custom report by defining the entity, attribute, filters, and columns.
Use one of the pre-canned Data Lens templates for your report.
Schedule a custom or pre-canned report.
Follow the steps as indicated to schedule a report.
View the tasks on the file server.
The Tasks dashboard displays a table with the status and other details of the tasks.
The Tasks dashboard displays the tasks that are in queue, in-progress, canceled, or in failed status.
The Tasks dashboard lists the following options. You can filter the tasks based on these options.
The tasks table lists the following details:
You can get more insight into the usage and contents of files on your system by configuring and updating Data Lens features and settings. Some options include scanning the files on your file server on demand, updating data retention, and configuring data protection.
The data retention period determines how long Data Lens retains event data.
Follow the steps as indicated to configure data retention.
Once enabled, Data Lens scans the metadata of all files and shares on the system. You can perform an on-demand scan of shares in your file system when new shares are created after the initial scan.
Data Lens uses the file category configuration to classify file extensions.
The capacity widget in the dashboard uses the category configuration to calculate capacity details.
Delete file server audit data or clean up the analyzed data of already deleted files and folders.
Follow the steps as indicated.
Use Data Lens as a lightweight analytics solution for file servers on Dell EMC Isilon, a third-party file server software.
The Data Lens integration with Isilon deploys an agent VM in the third-party environment of the Isilon file server. The agent VM combines the capabilities of a syslog server and Isilon's incremental scan to receive audit events, scan data, and send the heartbeat of the container for health monitoring. Nutanix also collects insights data from your Isilon file server using Pulse. Before enabling the Isilon file server on Data Lens, you must consent to insights data collection; refer to "Nutanix Insights" in Support Portal Help.
As a lightweight analytics and monitoring solution, Data Lens for Isilon currently does not include a full feature set, see Isilon Technical Preview for a list of unsupported features. As a result, the Data Lens UI for Isilon includes fewer widgets. Some of the supported features include audit trails, capacity trends, and heat monitoring. UI updates include a column listing the vendor in the file server table on the Global Dashboard and a new Registered Agents view for monitoring all registered agents. If the agent VM is down for 24 hours, an automated alert goes out to Nutanix Support.
Deploying and managing Data Lens for Isilon requires the following:
Technical preview of the Data Lens integration with Dell EMC Isilon.
This document describes the user experience for Data Lens with Isilon.
Data Lens for Isilon provides general storage capacity reporting and audit visibility.
The Isilon integration does not include the following elements and features:
The agent VM has the following limitations and restrictions:
Clicking the gear icon > Registered Agents displays a tabular view of all registered agents. The following table describes the columns in the Registered Agents view:
Column | Description |
---|---|
Name | Agent VM name |
IP address | IP address of the host where the agent VM is registered. |
ID | Unique identifier for the agent VM. |
No. of file servers served | Number of file servers on the host that the agent VM supports. |
Active and offline agents | A green icon indicates that the agent VM is online; a red icon indicates that it is offline. |
Install a Data Lens agent VM in your third-party environment.
Follow the steps as indicated:
nutanix@agentVM dl_agent_cli --register --token=token
nutanix@agentVM dl_agent_cli --add_server --type=isilon --host=IP/Host-name \
--port=REST-api-port --user=Isilon-server-user [--password_file=password-file-name]
Next, enable the Isilon file server, see Enabling a File Server.
Enable Data Lens for a file server.
Follow the steps as indicated.
Re-deploying an agent VM on a previously configured third-party server.
Follow the steps as indicated:
$ dl_agent_cli --register --token=token
nutanix@agentVM dl_agent_cli --add_server --type=isilon --host=IP/Host-name \
--port=REST-api-port --user=Isilon-server-user [--password_file=password-file-name]
After adding the file servers, run a full scan on all of the added file servers.
Product Release Date: 2022-07-25
Last updated: 2022-11-22
Legacy disaster recovery (DR) configurations use protection domains (PDs) and third-party integrations to protect your applications. These DR configurations replicate data between on-prem Nutanix clusters. Protection domains provide limited flexibility in terms of supporting complex operations (for example, VM boot order, network mapping). With protection domains, you have to perform manual tasks to protect new guest VMs as and when your application scales up.
Nutanix Disaster Recovery offers an entity-centric automated approach to protect and recover applications. It uses categories to group the guest VMs and automate the protection of the guest VMs as the application scales. Application recovery is more flexible with network mappings, an enforceable VM start sequence, and inter-stage delays. Application recovery can also be validated and tested without affecting your production workloads. Asynchronous, NearSync, and Synchronous replication schedules ensure that an application and its configuration details synchronize to one or more recovery locations for a smoother recovery.
Nutanix Disaster Recovery works with sets of physically isolated locations called availability zones. An instance of Prism Central represents an availability zone. One availability zone serves as the primary AZ for an application while one or more paired availability zones serve as the recovery AZs.
When paired, the primary AZ replicates the entities (protection policies, recovery plans, and recovery points) to the recovery AZs in the specified time intervals (RPO). The approach helps application recovery at any of the recovery AZs when there is a service disruption at the primary AZ (For example, natural disasters or scheduled maintenance). The entities start replicating back to the primary AZ when the primary AZ is up and running to ensure High Availability of applications. The entities you create or update synchronize continuously between the primary and recovery AZs. The reverse synchronization enables you to create or update entities (protection policies, recovery plans, or guest VMs) at either the primary or the recovery AZs.
This guide is primarily divided into the following two parts.
The section walks you through the procedure of application protection and DR to other Nutanix clusters at the same or different on-prem AZs. The procedure also applies to protection and DR to other Nutanix clusters in supported public cloud.
Xi Leap is essentially an extension of Leap to Xi Cloud Services. You can protect applications and perform DR to Xi Cloud Services or from Xi Cloud Services to an on-prem availability zone. The section describes application protection and DR from Xi Cloud Services to an on-prem Nutanix cluster. For application protection and DR to Xi Cloud Services, refer to the supported capabilities in Protection and DR between On-Prem AZs (Nutanix Disaster Recovery) because the protection procedure remains the same when the primary AZ is an on-prem availability zone.
Configuration tasks and DR workflows are largely the same regardless of the type of recovery AZ. For more information about the protection and DR workflow, see Nutanix Disaster Recovery Deployment Workflow.
The following section describes the terms and concepts used throughout the guide. Nutanix recommends gaining familiarity with these terms before you begin configuring protection with Nutanix Disaster Recovery or Xi Leap disaster recovery.
A zone that can have one or more independent datacenters inter-connected by low latency links. An AZ can either be in your office premises (on-prem) or in Xi Cloud Services. AZs are physically isolated from each other to ensure that a disaster at one AZ does not affect another AZ. An instance of Prism Central represents an on-prem AZ.
An AZ in your premises.
An AZ in the Nutanix Enterprise Cloud Platform (Xi Cloud Services).
An AZ that initially hosts guest VMs you want to protect.
An AZ where you can recover the protected guest VMs when a planned or an unplanned event occurs at the primary AZ causing its downtime. You can configure at most two recovery AZs for a guest VM.
A cluster running AHV or ESXi nodes on an on-prem AZ, Xi Cloud Services, or any supported public cloud. Leap does not support guest VMs from Hyper-V clusters.
The GUI that provides you the ability to configure, manage, and monitor a single Nutanix cluster. It is a service built into the platform for every Nutanix cluster deployed.
The GUI that allows you to monitor and manage many Nutanix clusters (Prism Element running on those clusters). Prism Starter, Prism Pro, and Prism Ultimate are the three flavors of Prism Central. For more information about the features available with these licenses, see Software Options.
Prism Central essentially is a VM that you deploy (host) in a Nutanix cluster (Prism Element). For more information about Prism Central, see Prism Central Guide. You can set up the following configurations of Prism Central VM.
A logically isolated network service in Xi Cloud Services. A VPC provides the complete IP address space for hosting user-configured VPNs. A VPC allows creating workloads manually or by failover from a paired primary AZ.
The following VPCs are available in each Xi Cloud Services account. You cannot create more VPCs in Xi Cloud Services.
The virtual network from which guest VMs migrate during a failover or failback.
The virtual network to which guest VMs migrate during a failover or failback operation.
A mapping between two virtual networks in paired AZs. A network mapping specifies a recovery network for all guest VMs of the source network. When you perform a failover or failback, the guest VMs in the source network recover in the corresponding (mapped) recovery network.
A VM category is a key-value pair that groups similar guest VMs. Associating a protection policy with a VM category ensures that the protection policy applies to all the guest VMs in the group regardless of how the group scales with time. For example, you can associate a group of guest VMs with the Department: Marketing category, where Department is a category that includes the value Marketing along with other values such as Engineering and Sales.
VM categories work the same way on on-prem AZs and in Xi Cloud Services. For more information about VM categories, see Category Management in the Prism Central Guide.
A copy of the state of a system at a particular point in time.
Application-consistent snapshots are more suited for systems and applications that can be quiesced and un-quiesced or thawed, such as database operating systems and applications such as SQL, Oracle, and Exchange.
A guest VM that you can recover from a recovery point.
A configurable policy that takes recovery points of the protected guest VMs in equal time intervals, and replicates those recovery points to the recovery AZs.
A configurable policy that orchestrates the recovery of protected guest VMs at the recovery AZ.
The time interval that refers to the acceptable data loss if there is a failure. For example, if the RPO is 1 hour, the system creates a recovery point every 1 hour. On recovery, you can recover the guest VMs with data as of up to 1 hour ago. Take Snapshot Every in the Create Protection Policy GUI represents RPO.
The time period from the failure event to the restoration of service. For example, an RTO of 30 minutes means that the protected guest VMs must be brought up and running within 30 minutes after the failure event.
The following flowchart provides you with the detailed representation of the disaster recovery (DR) solutions of Nutanix. This decision tree covers both the DR solutions—protection domain-based DR and Nutanix Disaster Recovery helping you to make quick decisions on which DR strategy will best suit your environment.
For information about protection domain-based (legacy) DR, see Data Protection and Recovery with Prism Element guide. With Leap, you can protect your guest VMs and perform DR to on-prem availability zones (AZs) or to Xi Cloud Services. A Leap deployment for DR from Xi Cloud Services to an on-prem Nutanix cluster is Xi Leap. The detailed information about Leap and Xi Leap DR configuration is available in the following sections of this guide.
Protection and DR between On-Prem AZs (Nutanix Disaster Recovery)
Protection and DR between On-Prem AZ and Xi Cloud Service (Xi Leap)
The workflow for entity-centric protection and disaster recovery (DR) configuration is as follows. The workflow is largely the same for both Nutanix Disaster Recovery and Xi Leap configurations except a few extra steps you must perform while configuring Xi Leap.
For DR solutions with Asynchronous, NearSync, and Synchronous replication schedules to succeed, the nodes in the on-prem availability zones (AZs) must have certain resources. This section provides information about the node, disk, and Foundation configurations necessary to support the RPO-based recovery point frequencies.
The conditions and configurations provided in this section apply to Local and Remote recovery points.
Any node configuration with two or more SSDs, each SSD being 1.2 TB or greater capacity, supports recovery point frequency for NearSync.
Any node configuration that supports recovery point frequency of six (6) hours also supports AHV-based Synchronous replication schedules because a protection policy with Synchronous replication schedule takes recovery points of the protected VMs every 6 hours. See Protection with Synchronous Replication Schedule (0 RPO) and DR for more details about Synchronous replication.
Both the primary cluster and replication target cluster must fulfill the same minimum resource requirements.
Ensure that any new node or disk additions made to the on-prem AZs (Availability Zones) meet the minimum requirements.
Features such as Deduplication and RF3 may require additional memory depending on the DR schedules and other workloads run on the cluster.
The table lists the supported frequency for the recovery points across various hardware configurations.
Type of disk | Capacity per node | Minimum recovery point frequency | Foundation Configuration - SSD and CVM requirements |
---|---|---|---|
Hybrid | Total HDD tier capacity of 32 TB or lower. Total capacity (HDD + SSD) of 40 TB or lower. | | No change required—Default Foundation configuration. |
Total HDD tier capacity between 32-64 TB. Total capacity (HDD + SSD) of 92 TB or lower. Up to 64 TB HDD. Up to 32 TB SSD (4 x 7.68 TB SSDs). | | Modify Foundation configurations to minimum: | |
Total HDD tier capacity between 32-64 TB. Total capacity (HDD + SSD) of 92 TB or lower. Up to 64 TB HDD. Up to 32 TB SSD. | Async (Every 6 hours) | No change required—Default Foundation configuration. | |
Total HDD tier capacity between 64-80 TB. Total capacity (HDD + SSD) of 96 TB or lower. | Async (Every 6 hours) | No change required—Default Foundation configuration. | |
Total HDD tier capacity greater than 80 TB. Total capacity (HDD + SSD) of 136 TB or lower. | Async (Every 6 hours) | Modify Foundation configurations to minimum: | |
All Flash | Total capacity of 48 TB or lower | | No change required—Default Foundation configuration. |
Total capacity between 48-92 TB | | Modify Foundation configurations to minimum: | |
Total capacity between 48-92 TB | Async (Every 6 hours) | No change required—Default Foundation configuration. | |
Total capacity greater than 92 TB | Async (Every 6 hours) | Modify Foundation configurations to minimum: | |
For details about the ports and protocols used by encrypted replication traffic, see Ports and Protocols.
nutanix@CVM:$ cd bin
nutanix@CVM:~/bin$
nutanix@CVM:~/bin$ python onwire_encryption_tool.py --leap --enable <remote_cluster_vip>
For example: If the IP address of the replication (remote) cluster is 10.xxx.xxx.xxx.
nutanix@CVM:~/bin$ python onwire_encryption_tool.py --leap --enable 10.xxx.xxx.xxx
Provide the nutanix user password for the remote cluster CVM when prompted.
nutanix@CVM:~/bin$ python onwire_encryption_tool.py --leap --enable 10.xxx.xxx.xxx
Checking Source Cluster Compatibility
Check Complete. Source is compatible
Checking Remote cluster Compatibility
nutanix@10.xxx.xxx.xxx's password:
Check Complete. Remote cluster is compatible
Importing root.crt from Remote Cluster: 10.xxx.xxx.xxx
nutanix@10.xxx.xxx.xxx's password:
Checking if Remote Cluster's root.crt file already exists
nutanix@10.xxx.xxx.xxx's password:
Encryption enabled Successfully. Please perform rolling restart of Cerebro and Stargate services for changes to take effect
nutanix@CVM:$ allssh "source /etc/profile; genesis stop cerebro stargate && cluster start; sleep 180"
You can check the enabled or disabled status of the encryption of replication traffic.
For details about the ports and protocols used by encrypted replication traffic, see Ports and Protocols.
To verify the status of encryption of replication traffic, perform the following step on the cluster.
nutanix@CVM:~/bin$ allssh "ls -la /home/nutanix/tmp/ | grep -i trusted"
nutanix@CVM:~/$ allssh "ls -la /home/nutanix/tmp/ | grep -i trusted"
================== xx:xx:xx:2 =================
================== xx:xx:xx:3 =================
================== xx:xx:xx:4 =================
-rw-------. 1 nutanix nutanix 1790 Dec xx 06:27 trusted_certs.crt.xx:xx:xx:xx
nutanix@CVM:~/bin$ python onwire_encryption_tool.py --leap --verify <remote_cluster_vip>
For example: If the IP address of the replication (remote) cluster is 10.xxx.xxx.xxx.
nutanix@CVM:~/bin$ python onwire_encryption_tool.py --leap --verify 10.xxx.xxx.xxx
Provide the nutanix user password for the remote cluster CVM when prompted.
nutanix@CVM:~/bin$ python onwire_encryption_tool.py --leap --verify 10.xxx.xxx.xxx
Checking Source Cluster Compatibility
Check Complete. Source is compatible
Checking Remote cluster Compatibility
nutanix@10.xxx.xxx.xxx's password:
Check Complete. Remote cluster is compatible
Verifying Encryption:
Checking if Remote Cluster's root.crt file already exists
nutanix@10.xxx.xxx.xxx's password:
Encryption Verification Successful. Encryption is already enabled for this remote
You can disable encryption of replication traffic.
To disable encryption of replication traffic, perform the following step on the cluster.
nutanix@CVM:$ cd bin
nutanix@CVM:~/bin$
nutanix@CVM:~/bin$ python onwire_encryption_tool.py --leap --disable <remote_cluster_vip>
For example: If the IP address of the replication (remote) cluster is 10.xxx.xxx.xxx.
nutanix@CVM:~/bin$ python onwire_encryption_tool.py --leap --disable 10.xxx.xxx.xxx
Provide the nutanix user password for the remote cluster CVM when prompted.
nutanix@CVM:~/bin$ python onwire_encryption_tool.py --leap --disable 10.xxx.xxx.xxx
Checking Source Cluster Compatibility
Stopping Cerebro on all nodes of the cluster
Encryption disabled Successfully. Please perform rolling restart of Cerebro and Stargate services for changes to take effect.
nutanix@CVM:$ allssh "source /etc/profile; genesis stop cerebro stargate && cluster start; sleep 180"
Leap protects your guest VMs and orchestrates their disaster recovery (DR) to other Nutanix clusters when events causing service disruption occur at the primary AZ. For protection of your guest VMs, protection policies with Asynchronous, NearSync, or Synchronous replication schedules generate and replicate recovery points to other on-prem AZs (AZs). Recovery plans orchestrate DR from the replicated recovery points to other Nutanix clusters at the same or different on-prem AZs.
Protection policies create a recovery point—and set its expiry time—in every iteration of the specified time period (RPO). For example, the policy creates a recovery point every 1 hour for an RPO schedule of 1 hour. The recovery point expires at its designated expiry time based on the retention policy—see step 3 in Creating a Protection Policy with an Asynchronous Replication Schedule (Nutanix Disaster Recovery). If there is a prolonged outage at an AZ, the Nutanix cluster retains the last recovery point to ensure you do not lose all the recovery points. For NearSync replication (lightweight snapshot), the Nutanix cluster retains the last full hourly snapshot. During the outage, the Nutanix cluster does not clean up the recovery points due to expiry. When the Nutanix cluster comes online, it cleans up the recovery points that are past expiry immediately.
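As an illustration of the schedule described above, the following Python sketch generates recovery point creation times for a 1-hour RPO and computes each point's expiry from a retention period. The retention arithmetic is a deliberate simplification and the function is hypothetical; the actual retention options are described in the protection policy creation procedure.

```python
from datetime import datetime, timedelta

# Simplified illustration of RPO-driven recovery point creation and expiry.
# A 1-hour RPO creates one recovery point per hour; each expires after the
# retention period. Retention handling here is a hypothetical simplification.
def recovery_point_schedule(start, rpo_hours, retention_hours, count):
    points = []
    for i in range(count):
        created = start + timedelta(hours=i * rpo_hours)
        expires = created + timedelta(hours=retention_hours)
        points.append((created, expires))
    return points

for created, expires in recovery_point_schedule(
        datetime(2022, 12, 1, 0, 0), rpo_hours=1, retention_hours=24, count=3):
    print(created.isoformat(), "->", expires.isoformat())
```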
For High Availability of a guest VM, Nutanix Disaster Recovery enables replication of its recovery points to one or more on-prem AZs. A protection policy can replicate recovery points to a maximum of two on-prem AZs. For replication, you must add a replication schedule between AZs. You can set up the on-prem AZs for protection and DR in the following arrangements.
The replication to multiple AZs enables DR to Nutanix clusters at all the AZs where the recovery points replicate or exist. To enable performing DR to a Nutanix cluster at the same or different AZ (recovery AZ), you must create a recovery plan. To enable performing DR to two different Nutanix clusters at the same or different recovery AZs, you must create two discrete recovery plans—one for each recovery AZ. In addition to performing DR to Nutanix clusters running the same hypervisor type, you can also perform cross-hypervisor disaster recovery (CHDR)—DR from AHV clusters to ESXi clusters, or from ESXi clusters to AHV clusters.
The protection policies and recovery plans you create or update synchronize continuously between the primary and recovery on-prem AZs. The reverse synchronization enables you to create or update entities (protection policies, recovery plans, and guest VMs) at either the primary or the recovery AZs.
The following section describes protection of your guest VMs and DR to a Nutanix cluster at the same or different on-prem AZs. The workflow is the same for protection and DR to a Nutanix cluster in supported public cloud platforms. For information about protection of your guest VMs and DR from Xi Cloud Services to an on-prem Nutanix cluster (Xi Leap), see Protection and DR between On-Prem AZ and Xi Cloud Service (Xi Leap).
The following are the general requirements of Nutanix Disaster Recovery . Along with the general requirements, there are specific requirements for protection with the following supported replication schedules.
The AOS license required depends on the features that you want to use. For information about the features that are available with an AOS license, see Software Options.
The underlying hypervisors required differ in all the supported replication schedules. For more information about underlying hypervisor requirements for the supported replication schedules, see:
Nutanix supports replication between all the latest supported LTS and STS AOS releases. To check the list of the latest supported AOS versions, see KB-5505. To determine if the AOS versions currently running on your clusters are EOL, see the EOL document.
Upgrade the AOS version to the next available supported LTS/STS release. To determine if an upgrade path is supported, check the Upgrade Paths page before you upgrade the AOS.
For example, the clusters are running AOS versions 5.5.x and 5.10.x respectively. Upgrade the cluster on 5.5.x to 5.10.x. After both the clusters are on 5.10.x, proceed to upgrade each cluster to 5.15.x (supported LTS). Once both clusters are on 5.15.x you can upgrade the clusters to 5.20.x or newer.
Nutanix recommends that both the primary and the replication clusters or AZs run the same AOS version.
You must have one of the following roles in Prism Central.
To view the available roles or create a role, click the hamburger icon at the top-left corner of the window and go to Administration > Roles in the left pane.
To allow two-way replication between Nutanix clusters at the same or different AZs, you must enable certain ports in your external firewall. To know about the required ports, see Disaster Recovery - Leap in Port Reference.
For information about installing NGT, see Nutanix Guest Tools in Prism Web Console Guide .
The empty CD-ROM is required for mounting NGT at the recovery AZ.
Set the NM_CONTROLLED field to yes. After setting the field, restart the network service on the VM.
For information about installing NGT, see Nutanix Guest Tools in Prism Web Console Guide .
The empty CD-ROM is required for mounting NGT at the recovery AZ.
For example, if you have VLAN with id 0 and network 10.45.128.0/17, and three clusters PE1, PE2, and PE3 at the AZ AZ1, all the clusters must maintain the same name, IP address range, and IP address prefix length (Gateway IP/Prefix Length) for VLAN with id 0.
When one cluster has m networks and the other cluster has n networks, ensure that the recovery cluster has m + n networks. Such a design ensures that all recovered VMs attach to a network.
For more information about the scaled-out deployments of a Prism Central, see Nutanix Disaster Recovery Terminology.
Nutanix VM mobility drivers are required for accessing the guest VMs after failover. Without Nutanix VM mobility drivers, the guest VMs become inaccessible after a failover.
For example, if you have VLAN with id 0 and network 10.45.128.0/17, and three clusters PE1, PE2, and PE3 at the AZ AZ1, all the clusters must maintain the same name, IP address range, and IP address prefix length (Gateway IP/Prefix Length) for VLAN with id 0.
Consider the following general limitations before configuring Nutanix Disaster Recovery . Along with the general limitations, there are specific protection limitations with the following supported replication schedules.
You cannot do or implement the following.
When you perform DR of vGPU console-enabled guest VMs, the VMs recover with the default VGA console (without any alert) instead of vGPU console. The guest VMs fail to recover when you perform cross-hypervisor disaster recovery (CHDR). For more information about DR and backup behavior of guest VMs with vGPU, see vGPU Enabled Guest VMs.
You can configure NICs for a guest VM associated with either production or test VPC.
You cannot protect volume groups.
You cannot apply network segmentation for management traffic (any traffic not on the backplane network) in Leap.
Although there is no limit to the number of VLANs that you can create, only the first 500 VLANs are listed in the drop-down of Network Settings while creating a recovery plan. For more information about VLANs in the recovery plan, see Nutanix Virtual Networks.
Due to the way the Nutanix architecture distributes data, there is limited support for mapping a Nutanix cluster to multiple vSphere clusters. If a Nutanix cluster is split into multiple vSphere clusters, migrate and recovery operations fail.
The following table lists the behavior of guest VMs with vGPU in disaster recovery (DR) and backup deployments.
Primary cluster | Recovery cluster | DR or Backup | Identical vGPU models | Unidentical vGPU models or no vGPU |
---|---|---|---|---|
AHV | AHV | Nutanix Disaster Recovery | Supported: | Supported: |
Backup: HYCU | Guest VMs with vGPU fail to recover. | Guest VMs with vGPU fail to recover. | ||
Backup: Veeam | Guest VMs with vGPU fail to recover. | Tip: The VMs start when you disable vGPU on the guest VM. | ||
ESXi | ESXi | Nutanix Disaster Recovery | Guest VMs with vGPU cannot be protected. | Guest VMs with vGPU cannot be protected. |
Backup | Guest VMs with vGPU cannot be protected. | Guest VMs with vGPU cannot be protected. | ||
AHV | ESXi | Nutanix Disaster Recovery | vGPU is disabled after failover of Guest VMs with vGPU. | vGPU is disabled after failover of Guest VMs with vGPU. |
ESXi | AHV | Nutanix Disaster Recovery | Guest VMs with vGPU cannot be protected. | Guest VMs with vGPU cannot be protected. |
For the maximum number of entities you can configure with different replication schedules and perform failover (disaster recovery) on, see Nutanix Configuration Maximums. The limits have been tested for Leap production deployments. Nutanix does not guarantee that the system can operate beyond these limits.
Nutanix recommends the following best practices for configuring Nutanix Disaster Recovery .
If you unpair the AZs while the guest VMs in the Nutanix clusters are still in synchronization, the Nutanix cluster becomes unstable. For more information about disabling Synchronous replication, see Synchronous Replication Management.
You can protect a guest VM either with the legacy DR solution (protection domain-based) or with Leap. To protect a legacy DR-protected guest VM with Leap, you must migrate the guest VM from the protection domain to a protection policy. During the migration, do not delete the guest VM snapshots in the protection domain. Nutanix recommends keeping the guest VM snapshots in the protection domain until the first recovery point for the guest VM is available on Prism Central. For more information, see Migrating Guest VMs from a Protection Domain to a Protection Policy.
If the single Prism Central that you use for protection and DR to Nutanix clusters at the same AZ becomes inactive, you cannot perform a failover when required. To avoid a single point of failure in such deployments, Nutanix recommends installing the single Prism Central at a different AZ (different fault domain).
Create storage containers with the same name on both the primary and recovery Nutanix clusters.
Leap automatically maps the storage containers during the first replication (seeding) of a guest VM. If a storage container with the same name exists on both the primary and recovery Nutanix clusters, the recovery points replicate to the storage container with the same name only. For example, if your protected guest VMs are in the SelfServiceContainer on the primary Nutanix cluster, and the recovery Nutanix cluster also has SelfServiceContainer , the recovery points replicate to SelfServiceContainer only. If a storage container with the same name does not exist at the recovery AZ, the recovery points replicate to a random storage container at the recovery AZ. For more information about creating storage containers on the Nutanix clusters, see Creating a Storage Container in Prism Web Console Guide .
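As a rough illustration, you can create a matching storage container on the recovery cluster from a Controller VM before the first replication. The container and storage pool names below are examples, and the exact ncli syntax can vary by AOS version, so verify it with ncli help on your cluster before using it.
# On a CVM of the recovery cluster, list the existing storage containers
# to confirm whether one with the required name already exists.
ncli ctr ls
# Create a container whose name matches the one on the primary cluster
# (container and storage pool names are examples).
ncli ctr create name=SelfServiceContainer sp-name=default-storage-pool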
Nutanix Disaster Recovery enables protection of your guest VMs and disaster recovery (DR) to one or more Nutanix clusters at the same or different on-prem AZs. A Nutanix cluster is essentially an AHV or an ESXi cluster running AOS. In addition to performing DR to Nutanix clusters running the same hypervisor type, you can also perform cross-hypervisor disaster recovery (CHDR)—DR from AHV clusters to ESXi clusters, or from ESXi clusters to AHV clusters.
Leap supports DR (and CHDR) to a maximum of two different Nutanix clusters at the same or different AZs. You can protect your guest VMs with the following replication schedules.
To maintain efficiency in protection and DR, Leap allows you to protect a guest VM with a Synchronous replication schedule to only one AHV cluster, at a different on-prem availability zone.
The disaster recovery (DR) views enable you to perform CRUD operations on the following types of Leap entities.
This chapter describes the views of Prism Central (on-prem AZ).
The AZs view under the hamburger icon > Administration lists all of your paired AZs.
The following figure is a sample view, and the tables describe the fields and the actions that you can perform in this view.
Field | Description |
---|---|
Name | Name of the AZ. |
Region | Region to which the AZ belongs. |
Type | Type of AZ. AZs that are backed by on-prem Prism Central instances are shown to be of type physical. The AZ that you are logged in to is shown as a local AZ. |
Connectivity Status | Status of connectivity between the local AZ and the paired AZ. |
Workflow | Description |
---|---|
Connect to AZ (on-prem Prism Central only) | Connect to an on-prem Prism Central or to Xi Cloud Services for data replication. |
Action | Description |
---|---|
Disconnect | Disconnect the remote AZ. When you disconnect an availability zone, the pairing is removed. |
The Protection Summary view under the hamburger icon > Data Protection shows detailed information about the Leap entities in an availability zone (AZ) and helps you generate DR reports for the specified time. The information in the Protection Summary view enables you to monitor the health of your DR deployments (Leap) and the activities performed on Leap entities. Select a topology in the left-hand pane; the protection and recovery information of the selected topology shows up in DR widgets on the right-hand pane. The following figures are a sample view, and the tables describe the fields and the actions that you can perform in the DR widgets.
Field | Description |
---|---|
Total | Number of guest VMs protected. Clicking the number shows the guest VMs protected in protection policies. |
RPO Not Met | Number of guest VMs that are protected but do not meet the specified RPO. Clicking the number shows the guest VMs that do not meet the specified RPO. |
Field | Description |
---|---|
Ongoing | Number of ongoing replication tasks. |
Stuck | Number of replication tasks that are stuck. Clicking the number shows the stuck alerts generated in Alerts . |
Failed | Number of replication tasks that failed. Clicking the number shows the alerts generated in Alerts . |
Field | Description |
---|---|
Measured by | Failover operations to check the readiness of the recovery plans. You can use Validate , Test Failover , or Planned Failover from the drop-down list to check recovery readiness. |
Succeeded | Number of recovery plans on which the selected failover operation ran successfully. Clicking the number shows the recovery plans on which the selected failover operation ran successfully. |
Succeeded With Warnings | Number of recovery plans on which the selected failover operation ran successfully but with warnings. Clicking the number shows the recovery plans on which the selected failover operation ran successfully with warnings. |
Failed | Number of recovery plans on which the selected failover operation failed to run successfully. Clicking the number shows the recovery plans on which the selected failover operation failed to run successfully. |
Not Executed | Number of recovery plans on which no failover operation ran. Clicking the number shows the recovery plans on which no failover operation ran. |
Field | Description |
---|---|
Name | Names of guest VMs that do not meet the specified RPO. You can use the filters on the guest VMs to investigate why the RPO is not met. |
Field | Description |
---|---|
Alert Description | Description of configuration alerts raised on protection policies and recovery plans. |
Impacted Entity | The entities impacted by the configuration alerts. |
Field | Description |
---|---|
Report Name | Name of the report. |
Generated at | Date and time when the report was generated. |
Download | Option to download the report as a PDF or a CSV document. |
This widget shows you a detailed view of the Recovery Readiness . You can view information about all the recovery plans that ran on the selected AZs in the last 3 months.
The Protection Policies view under the hamburger icon > Data Protection lists all of the configured protection policies from all the paired availability zones.
The following figure is a sample view, and the tables describe the fields and the actions that you can perform in this view.
Field | Description |
---|---|
Policy Name | Name of the protection policy. |
Schedules | Number of schedules configured in the protection policy. If the protection policy has multiple schedules, a drop-down icon is displayed. Click the drop-down icon to see the primary location: primary Nutanix cluster, recovery location: recovery Nutanix cluster, and RPO of the schedules in the protection policy. |
Alerts | Number of alerts issued for the protection policy. |
Workflow | Description |
---|---|
Create protection policy | Create a protection policy. |
Action | Description |
---|---|
Update | Update the protection policy. |
Clone | Clone the protection policy. |
Delete | Delete the protection policy. |
The Recovery Plans view under the hamburger icon > Data Protection lists all of the configured recovery plans from all the paired availability zones.
The following figure is a sample view, and the tables describe the fields and the actions that you can perform in this view.
Field | Description |
---|---|
Name | Name of the recovery plan. |
Primary Location | Replication source AZ for the recovery plan. |
Recovery Location | Replication target AZ for the recovery plan. |
Entities | Sum of the VMs included in the recovery plan. |
Last Validation Status | Status of the most recent validation of the recovery plan. |
Last Test Status | Status of the most recent test performed on the recovery plan. |
Last Failover Status | Status of the most recent failover performed on the recovery plan. |
Workflow | Description |
---|---|
Create Recovery Plan | Create a recovery plan. |
Action | Description |
---|---|
Validate | Validates the recovery plan to ensure that the VMs in the recovery plan have a valid configuration and can be recovered. |
Test | Tests the recovery plan. |
Clean-up test VMs | Cleans up the VMs failed over as a result of testing the recovery plan. |
Update | Updates the recovery plan. |
Failover | Performs a failover. |
Delete | Deletes the recovery plan. |
The VM Recovery Points view under the hamburger icon > Data Protection lists all the recovery points of all the protected guest VMs (generated over time).
The following figure is a sample view, and the tables describe the fields and the actions that you can perform in this view.
Field | Description |
---|---|
Name | Name of the recovery point. |
Latest Recovery Point on Local AZ | Date and time of the most recent recovery point available on the local AZ. |
Oldest Recovery Point on Local AZ | Date and time of the oldest recovery point available on the local AZ. |
Total Recovery Points | Number of recovery points generated for the guest VM. |
Owner | Owner account of the recovery point. |
Action | Description |
---|---|
Clone (Previously Restore) | Clones the guest VM from the selected recovery points. The operation creates a copy of the guest VM in the same Nutanix cluster without overwriting the original guest VM (out-of-place restore). For more information, see Manual Recovery of Guest VMs. |
Revert | Reverts the guest VMs to the selected recovery points. The operation recreates the guest VM in the same Nutanix cluster by overwriting the original guest VM (in-place restore). For more information, see Manual Recovery of Guest VMs. |
Replicate | Manually replicates the selected recovery points to a different Nutanix cluster in the same or different AZs. For more information, see Replicating Recovery Points Manually. |
The dashboard includes widgets that display the statuses of configured protection policies and recovery plans. If you have not configured these entities, the widgets display a summary of the steps required to get started with Leap.
To view these widgets, click the Dashboard tab.
The following figure is a sample view of the dashboard widgets.
To perform disaster recovery (DR) to Nutanix clusters at different on-prem availability zones (AZs), enable Leap at both the primary and recovery AZs (Prism Central). Without enabling Leap, you can configure protection policies and recovery plans that synchronize to the paired AZs, but you cannot perform failover and failback operations. To perform DR to different Nutanix clusters at the same AZ, enable Leap in the single Prism Central.
To enable Nutanix Disaster Recovery , perform the following procedure.
To replicate entities (protection policies, recovery plans, and recovery points) to different on-prem availability zones (AZs) bidirectionally, pair the AZs with each other. To replicate entities to different Nutanix clusters at the same AZ bidirectionally, you need not pair the AZs because the primary and the recovery Nutanix clusters are registered to the same AZ (Prism Central). Without pairing the AZs, you cannot perform DR to a different AZ.
To pair an on-prem AZ with another on-prem AZ, perform the following procedure at either of the on-prem AZs.
Automated disaster recovery (DR) configurations use protection policies to protect your guest VMs, and recovery plans to orchestrate the recovery of those guest VMs to different Nutanix clusters at the same or different availability zones (AZs). You can automate protection of your guest VMs with the following supported replication schedules in Nutanix Disaster Recovery.
To maintain efficiency in protection and DR, Leap allows you to protect a guest VM with a Synchronous replication schedule to only one AHV cluster, at a different on-prem availability zone.
Asynchronous replication schedules enable you to protect your guest VMs with an RPO of 1 hour or longer. A protection policy with an Asynchronous replication schedule creates a recovery point at an hourly time interval and replicates it to the recovery availability zones (AZs) for High Availability. For guest VMs protected with an Asynchronous replication schedule, you can perform disaster recovery (DR) to different Nutanix clusters at the same or different AZs. In addition to performing DR to Nutanix clusters running the same hypervisor type, you can also perform cross-hypervisor disaster recovery (CHDR), that is, DR from AHV clusters to ESXi clusters, or from ESXi clusters to AHV clusters.
The following are the specific requirements for protecting your guest VMs with Asynchronous replication schedule. Ensure that you meet the following requirements in addition to the general requirements of Nutanix Disaster Recovery .
For information about the general requirements of Nutanix Disaster Recovery , see Nutanix Disaster Recovery Requirements.
For information about node, disk and Foundation configurations required to support Asynchronous replication schedules, see On-Prem Hardware Resource Requirements.
AHV or ESXi
Each on-prem AZ must have a Nutanix Disaster Recovery enabled Prism Central instance.
The primary and recovery Prism Central and Prism Element on the Nutanix clusters must be running the following versions of AOS.
Guest VMs protected with Asynchronous replication schedule support cross-hypervisor disaster recovery. You can perform failover (DR) to recover guest VMs from AHV clusters to ESXi clusters or guest VMs from ESXi clusters to AHV clusters by considering the following requirements.
NGT configures the guest VMs with all the required drivers for VM portability. For more information about general NGT requirements, see Nutanix Guest Tools Requirements and Limitations in Prism Web Console Guide .
For information about operating systems that support UEFI and Secure Boot, see UEFI and Secure Boot Support for CHDR.
If you have delta disks attached to a guest VM and you proceed with failover, you get a validation warning and the guest VM does not recover. Contact Nutanix Support for assistance.
Operating System | Version | Requirements and limitations |
---|---|---|
Windows | | |
Linux | | |
The storage container name of the protected guest VMs must be the same on both the primary and recovery clusters. Therefore, a storage container must exist on the recovery cluster with the same name as the one on the primary cluster. For example, if the protected VMs are in the SelfServiceContainer storage container on the primary cluster, there must also be a SelfServiceContainer storage container on the recovery cluster.
Consider the following specific limitations before protecting your guest VMs with Asynchronous replication schedule. These limitations are in addition to the general limitations of Nutanix Disaster Recovery .
For information about the general limitations of Leap, see Nutanix Disaster Recovery Limitations.
CHDR does not preserve hypervisor-specific properties (for example, multi-writer flags, independent persistent and non-persistent disks, changed block tracking (CBT), PVSCSI disk configurations).
To protect the guest VMs in an hourly replication schedule, configure an Asynchronous replication schedule while creating the protection policy. The policy takes recovery points of those guest VMs at the specified time intervals (hourly) and replicates them to the recovery availability zones (AZs) for High Availability. To protect the guest VMs at the same or different recovery AZs, the protection policy allows you to configure Asynchronous replication schedules to at most two recovery AZs, with a unique replication schedule for each recovery AZ. The policy synchronizes continuously to the recovery AZs in a bidirectional way.
To create a protection policy with an Asynchronous replication schedule, do the following at the primary AZ. You can also create a protection policy at the recovery AZ. Protection policies you create or update at a recovery AZ synchronize back to the primary AZ.
The drop-down lists all the AZs paired with the local AZ. Local AZ represents the local AZ (Prism Central). For your primary AZ, you can check either the local AZ or a non-local AZ.
The drop-down lists all the Nutanix clusters registered to Prism Central representing the selected AZ. If you want to protect the guest VMs from multiple Nutanix clusters in the same protection policy, check the clusters that host those guest VMs. All Clusters protects the guest VMs of all Nutanix clusters registered to Prism Central.
Clicking Save activates the Recovery Location pane. After saving the primary AZ configuration, you can optionally add a local schedule (step iv) to retain the recovery points at the primary AZ.
Specify the following information in the Add Schedule window.
When you enter the frequency in minutes, the system selects the Roll-up retention type by default because minutely recovery points do not support the Linear retention type.
For more information about the roll-up recovery points, see step d.iii.
Irrespective of the local or replication schedules, the recovery points are of the specified type. If you check Take App-Consistent Recovery Point, the recovery points generated are application-consistent; if you do not check it, the recovery points generated are crash-consistent. If the times in the local schedule and the replication schedule match, the single recovery point generated is application-consistent.
The drop-down lists all the AZs paired with the local AZ. Local AZ represents the local AZ (Prism Central). Select Local AZ if you want to configure DR to a different Nutanix cluster at the same AZ.
If you do not select an AZ, local recovery points that are created by the protection policy do not replicate automatically. You can, however, replicate the recovery points manually and use recovery plans to recover the guest VMs. For more information, see Protection and Manual DR (Nutanix Disaster Recovery).
The drop-down lists all the Nutanix clusters registered to Prism Central representing the selected AZ. You can select one cluster at the recovery AZ. If you want to replicate the recovery points to more clusters at the same or different AZs, add another recovery AZ with a replication schedule. For more information about adding another recovery AZ with a replication schedule, see step e.
Clicking Save activates the + Add Schedule button between the primary and the recovery AZ. After saving the recovery AZ configuration, you can optionally add a local schedule to retain the recovery points at the recovery AZ.
Specify the following information in the Add Schedule window.
When you enter the frequency in minutes, the system selects the Roll-up retention type by default because minutely recovery points do not support the Linear retention type.
For more information about the roll-up recovery points, see step d.iii.
Irrespective of the local or replication schedules, the recovery points are of the specified type. If you check Take App-Consistent Recovery Point, the recovery points generated are application-consistent; if you do not check it, the recovery points generated are crash-consistent. If the times in the local schedule and the replication schedule match, the single recovery point generated is application-consistent.
Specify the following information in the Add Schedule window. The window auto-populates the Primary Location and Recovery Location that you have selected in step b and step c.
The specified frequency is the RPO. For more information about RPO, see Nutanix Disaster Recovery Terminology.
This field is unavailable if you do not specify a recovery location.
If you select linear retention, the remote and local retention count represents the number of recovery points to retain at any given time. If you select roll-up retention, these numbers specify the retention period.
Reverse retention maintains the retention numbers of recovery points even after failover to a recovery AZ in the same or different AZs. For example, if you retain two recovery points at the primary AZ and three recovery points at the recovery AZ, and you enable reverse retention, a failover event does not change the initial retention numbers when the recovery points replicate back to the primary AZ. The recovery AZ still retains two recovery points while the primary AZ retains three recovery points. If you do not enable reverse retention, a failover event changes the initial retention numbers when the recovery points replicate back to the primary AZ. The recovery AZ retains three recovery points while the primary AZ retains two recovery points.
Maintaining the same retention numbers at a recovery AZ is required if you want to retain a particular number of recovery points, irrespective of where the guest VM is after its failover.
Application-consistent recovery points ensure that application consistency is maintained in the replicated recovery points. For application-consistent recovery points, install NGT on the guest VMs running on AHV clusters. For guest VMs running on ESXi clusters, you can take application-consistent recovery points without installing NGT, but the recovery points are hypervisor-based and lead to VM stuns (temporarily unresponsive VMs) after failover to the recovery AZs.
By default, recovery point creation begins immediately after you create the protection policy. If you want to specify when recovery point creation must begin, click Immediately at the top-right corner, and then, in the Start Time dialog box, do the following.
For example, the guest VM VM_SherlockH is in the category Department:Admin , and you add this category to the protection policy named PP_AdminVMs . Now, if you add VM_SherlockH from the VMs page to another protection policy named PP_VMs_UK , VM_SherlockH is protected in PP_VMs_UK and unprotected from PP_AdminVMs .
If you do not want to protect the guest VMs category wise, proceed to the next step without checking VM categories. You can add the guest VMs individually to the protection policy later from the VMs page (see Adding Guest VMs Individually to a Protection Policy).
This topic describes the conditions and limitations for application-consistent recovery points that you can generate through a protection policy. For information about the operating systems that support the AOS version you have deployed, see the Compatibility Matrix.
Applications running in your guest VM must be able to quiesce I/O operations. For example, you can quiesce I/O operations for database applications and similar workload types.
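As a generic, filesystem-level illustration of quiescing I/O (application-level quiescing, such as the database flush shown in the example pre_freeze and post_thaw scripts later in this section, is usually what you want), the Linux fsfreeze utility can suspend and resume writes on a mounted filesystem. The mount point below is an example only; this is not the mechanism Nutanix uses internally.
# Suspend new writes and flush dirty data for the filesystem that holds
# the application data (mount point is an example).
fsfreeze --freeze /var/lib/mysql
# ...the recovery point is taken while the filesystem is frozen...
# Resume normal I/O once the recovery point completes.
fsfreeze --unfreeze /var/lib/mysql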
For installing and enabling NGT, see Nutanix Guest Tools in the Prism Web Console Guide .
For guest VMs running on ESXi, consider these points.
Operating system | Version |
---|---|
Windows | |
Linux | |
When you configure a protection policy and select Take App-Consistent Recovery Point, the Nutanix cluster transparently invokes VSS (Volume Shadow Copy Service, also known as shadow copy or volume snapshot service).
Third-party backup products can choose between the VSS_BT_FULL (full backup) and VSS_BT_COPY (copy backup) backup types.
Nutanix VSS recovery points fail for such guest VMs.
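When application-consistent recovery points unexpectedly fall back to crash-consistent, it can help to confirm from inside the Windows guest that the VSS framework and its writers are healthy. vssadmin is a built-in Windows tool; the exact output varies by Windows version.
C:\> vssadmin list writers
C:\> vssadmin list providers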
C:\Program Files\Nutanix\Scripts\pre_freeze.bat
C:\Program Files\Nutanix\Scripts\post_thaw.bat
/usr/local/sbin/pre_freeze
Replace pre_freeze with the script name (without extension).
/usr/local/sbin/post_thaw
Replace post_thaw with the script name (without extension).
#!/bin/sh
# pre_freeze script: quiesces the MySQL database before the recovery point is taken.
date >> '/scripts/pre_root.log'
echo -e "\n attempting to run pre_freeze script for MySQL as root user\n" >> /scripts/pre_root.log
if [ "$(id -u)" -eq "0" ]; then
  # Run the quiesce helper in the background so that it can hold the
  # database lock until the post_thaw script releases it.
  python '/scripts/quiesce.py' &
  echo -e "\n executing query flush tables with read lock to quiesce the database\n" >> /scripts/pre_freeze.log
  echo -e "\n Database is in quiesce mode now\n" >> /scripts/pre_freeze.log
else
  date >> '/scripts/pre_root.log'
  echo -e "not root user\n" >> '/scripts/pre_root.log'
fi
#!/bin/sh
# post_thaw script: releases the database lock after the recovery point is taken.
date >> '/scripts/post_root.log'
echo -e "\n attempting to run post_thaw script for MySQL as root user\n" >> /scripts/post_root.log
if [ "$(id -u)" -eq "0" ]; then
  python '/scripts/unquiesce.py'
else
  date >> '/scripts/post_root.log'
  echo -e "not root user\n" >> '/scripts/post_root.log'
fi
@echo off
echo Running pre_freeze script >C:\Progra~1\Nutanix\script\pre_freeze_log.txt
@echo off
echo Running post_thaw script >C:\Progra~1\Nutanix\script\post_thaw_log.txt
If these requirements are not met, the system captures crash-consistent snapshots.
Server | ESXi: NGT status | ESXi: Result | AHV: NGT status | AHV: Result |
---|---|---|---|---|
Microsoft Windows Server edition | Installed and active; pre-freeze and post-thaw scripts are present | Nutanix script-based VSS snapshots | Installed and active; pre-freeze and post-thaw scripts are present | Nutanix script-based VSS snapshots |
Microsoft Windows Server edition | Installed and active | Nutanix VSS-enabled snapshots | Installed and active | Nutanix VSS-enabled snapshots |
Microsoft Windows Server edition | Not enabled | Hypervisor-based application-consistent or crash-consistent snapshots | Not enabled | Crash-consistent snapshots |
Microsoft Windows Client edition | Installed and active; pre-freeze and post-thaw scripts are present | Nutanix script-based VSS snapshots | Installed and active; pre-freeze and post-thaw scripts are present | Nutanix script-based VSS snapshots |
Microsoft Windows Client edition | Not enabled | Hypervisor-based or crash-consistent snapshots | Not enabled | Crash-consistent snapshots |
Linux VMs | Installed and active; pre-freeze and post-thaw scripts are present | Nutanix script-based VSS snapshots | Installed and active; pre-freeze and post-thaw scripts are present | Nutanix script-based VSS snapshots |
Linux VMs | Not enabled | Hypervisor-based or crash-consistent snapshots | Not enabled | Crash-consistent snapshots |
To orchestrate the failover (disaster recovery) of the protected guest VMs to the recovery AZ, create a recovery plan. After a failover, a recovery plan recovers the protected guest VMs to the recovery AZ. If you have configured two on-prem recovery AZs in a protection policy, create two recovery plans for DR—one for recovery to each recovery AZ. The recovery plan synchronizes continuously to the recovery AZ in a bidirectional way.
To create a recovery plan, do the following at the primary AZ. You can also create a recovery plan at a recovery AZ. The recovery plan you create or update at a recovery AZ synchronizes back to the primary AZ.
Traditionally, to achieve this new configuration, you would manually log on to the recovered VM and modify the relevant files. With in-guest scripts, you write a script that automates the required steps and enable the script when you configure a recovery plan. The recovery plan execution automatically invokes the script, which reassigns the DNS IP address and reconnects to the database server at the recovery AZ.
Traditionally, to reconfigure, you would manually log on to the VM, remove the VM from an existing domain controller, and then add the VM to a new domain controller. With in-guest scripts, you can automate the task of changing the domain controller.
C:\Program Files\Nutanix\scripts\production\vm_recovery.bat
C:\Program Files\Nutanix\scripts\test\vm_recovery.bat
/usr/local/sbin/production_vm_recovery
/usr/local/sbin/test_vm_recovery
A command prompt icon appears against the guest VMs or VM categories to indicate that in-guest script execution is enabled on those guest VMs or VM categories.
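For illustration only, a minimal Linux production failover script of the kind described above might repoint the guest at a DNS server reachable in the recovery AZ. The DNS address, log path, and overall logic are assumptions for this sketch; the script must be placed at the Linux production path listed above and marked executable.
#!/bin/sh
# Example /usr/local/sbin/production_vm_recovery (illustrative only).
LOG=/var/log/vm_recovery.log
date >> "$LOG"
echo "Recovery plan invoked production_vm_recovery; updating DNS" >> "$LOG"
# 10.20.30.40 is a placeholder for the DNS server at the recovery AZ.
cat > /etc/resolv.conf <<EOF
nameserver 10.20.30.40
EOF
echo "DNS reconfiguration complete" >> "$LOG"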
A stage defines the order in which the protected guest VMs start at the recovery cluster. You can create multiple stages to prioritize the start sequence of the guest VMs. In the Power On Sequence , the VMs in the preceding stage start before the VMs in the succeeding stages. On recovery, it is desirable to start some VMs before the others. For example, database VMs must start before the application VMs. Place all the database VMs in the stage before the stage containing the application VMs, in the Power On Sequence .
You perform failover of the protected guest VMs when unplanned failure events (for example, natural disasters) or planned events (for example, scheduled maintenance) happen at the primary availability zone (AZ) or the primary cluster. The protected guest VMs migrate to the recovery AZ where you perform the failover operations. On recovery, the protected guest VMs start in the Nutanix cluster you specify in the recovery plan that orchestrates the failover.
The following are the types of failover operations.
At the recovery AZ, the guest VMs can recover using the recovery points replicated from the primary AZ only. The guest VMs cannot recover using the local recovery points. For example, if you perform an unplanned failover from the primary AZ AZ1 to the recovery AZ AZ2 , the guest VMs recover at AZ2 using the recovery points replicated from AZ1 to AZ2 .
You can perform a planned or an unplanned failover in different scenarios of network failure. For more information about network failure scenarios, see Nutanix Disaster Recovery and Xi Leap Failover Scenarios.
At the recovery AZ after a failover, the recovery plan creates only the VM category that was used to include the guest VM in the recovery plan. Manually create the remaining VM categories at the recovery AZ and associate the guest VMs with those categories.
The recovered guest VMs generate recovery points as per the replication schedule that protects them even after recovery. The recovery points replicate back to the primary AZ when the primary AZ starts functioning again. This reverse replication enables you to perform failover of the guest VMs from the recovery AZ back to the primary AZ (failback). The same recovery plan applies to both the failover and the failback operations. The difference is that for failover you run the failover operations on the recovery plan at the recovery AZ, whereas for failback you run them on the recovery plan at the primary AZ. For example, if a guest VM fails over from AZ1 (Local) to AZ2, the failback fails over the same VM from AZ2 (Local) back to AZ1.
You have the flexibility to perform a real or simulated failover for the full and partial workloads (with or without networking). The term virtual network is used differently on on-prem clusters and Xi Cloud Services. In Xi Cloud Services, the term virtual network is used to describe the two built-in virtual networks—production and test. Virtual networks on the on-prem clusters are virtual subnets bound to a single VLAN. Manually create these virtual subnets, and create separate virtual subnets for production and test purposes. Create these virtual subnets before you configure recovery plans. When configuring a recovery plan, you map the virtual subnets at the primary AZ to the virtual subnets at the recovery AZ.
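For example, on an AHV cluster you can create the production and test virtual subnets from a Controller VM before configuring the recovery plan. The subnet names, VLAN IDs, and IP ranges below are examples, and the exact acli options are an assumption to verify against your AOS version.
# Production subnet bound to VLAN 100 (all values are examples).
acli net.create prod-recovery-net vlan=100 ip_config=192.168.100.1/24
# Isolated test subnet bound to VLAN 200 for test failovers.
acli net.create test-recovery-net vlan=200 ip_config=192.168.200.1/24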
The following are the various scenarios that you can encounter in Leap configurations for disaster recovery (DR) to an on-prem availability zone (AZ) or to Xi Cloud (Xi Leap). Each scenario is explained with the required network-mapping configuration for Xi Leap. However, the configuration remains the same irrespective of whether you perform DR using Leap or Xi Leap. You can either create a recovery plan with the following network mappings (see Creating a Recovery Plan (Nutanix Disaster Recovery)) or update an existing recovery plan with the following network mappings (see Updating a Recovery Plan).
Full network failure is the most common scenario. In this case, it is desirable to bring up the whole primary AZ in the Xi Cloud. All the subnets must fail over, and the WAN IP address must change from the on-prem IP address to the Xi WAN IP address. Floating IP addresses can be assigned to individual guest VMs; otherwise, everything uses Xi network address translation (NAT) for external communication.
Perform the failover when the on-prem subnets are down and the jump host is available on the public Internet through the floating IP address of the Xi production network.
To set up the recovery plan that orchestrates the full network failover, perform the following.
The selection auto-populates the Xi production and test failover subnets.
Perform steps 1–4 for every subnet.
You want to fail over one or more subnets from the primary AZ to Xi Cloud. Communication between the AZs happens through the VPN or through the external NAT or floating IP addresses. A use case for this type of scenario is that the primary AZ needs maintenance, but some of its subnets must see no downtime.
Perform a partial failover when some subnets are active in the production networks at both on-prem and Xi Cloud, and the jump host is available on the public Internet through the floating IP address of the Xi production network.
On-prem guest VMs can connect to the guest VMs on the Xi Cloud Services.
To set up the recovery plan that orchestrates the partial network failover, perform the following.
The selection auto-populates the Xi production and test failover subnets.
Perform steps 1–4 for one or more subnets based on the maintenance plan.
You want to fail over some guest VMs to Xi Cloud while keeping the other guest VMs up and running at the on-prem cluster (primary AZ). A use case for this type of scenario is that the primary AZ needs maintenance, but some of its guest VMs must see no downtime.
This scenario requires changing IP addresses for the guest VMs running at Xi Cloud. Because you cannot have the subnet active on both the AZs, create a subnet to host the failed-over guest VMs. The jump host is available on the public Internet through the floating IP address of the Xi production network.
On-prem guest VMs can connect to the guest VMs on the Xi Cloud Services.
To set up the recovery plan that orchestrates the partial subnet network failover, perform the following.
The selection auto-populates the Xi production and test failover subnets for a full subnet failover.
Perform steps 1–4 for one or more subnets based on the maintenance plan.
You want to test all three preceding scenarios by creating an isolated test network so that no routing or IP address conflict happens. Clone all the guest VMs from a local recovery point and bring them up to test failover operations. Perform the test failover when all on-prem subnets are active and on-prem guest VMs can connect to the guest VMs at Xi Cloud. The jump host is available on the public Internet through the floating IP address of the Xi production network.
In this case, focus on the test failover section when creating the recovery plan. When you select a local AZ production subnet, it is copied to the test network. You can go one step further and create a test subnet at the Xi Cloud.
You can perform test failover, planned failover, and unplanned failover of the guest VMs protected with an Asynchronous replication schedule across different Nutanix clusters at the same or different on-prem availability zones (AZs). The steps to perform test, planned, and unplanned failover are largely the same irrespective of the replication schedules that protect the guest VMs.
After you create a recovery plan, you can run a test failover periodically to ensure that the failover occurs smoothly when required. To perform a test failover, do the following procedure at the recovery AZ. If you have two recovery AZs for DR, perform the test at the AZ where you want to recover the guest VMs.
Resolve the error conditions and then restart the test procedure.
After testing a recovery plan, you can remove the test VMs that the recovery plan creates in the recovery test network. To clean up the test VMs, do the following at the recovery AZ where the test failover created the test VMs.
If there is a planned event (for example, scheduled maintenance of guest VMs) at the primary availability zone (AZ), perform a planned failover to the recovery AZ. To perform a planned failover, do the following procedure at the recovery AZ. If you have two recovery AZs for DR, perform the failover at the AZ where you want to recover the guest VMs.
Resolve the error conditions and then restart the failover procedure.
The entities of AHV/ESXi clusters recover at a different path on the ESXi clusters if their files conflict with the existing files on the recovery ESXi cluster. For example, there is a file name conflict if a VM (VM1) migrates to a recovery cluster that already has a VM (VM1) in the same container.
However, the entities recover at a different path with VmRecoveredAtAlternatePath alert only if the following conditions are met.
If these conditions are not satisfied, the failover operation fails.
If there is an unplanned event (for example, a natural disaster or network failure) at the primary availability zone (AZ), perform an unplanned failover to the recovery AZ. To perform an unplanned failover, do the following procedure at the recovery AZ. If you have two recovery AZs for DR, perform the failover at the AZ where you want to recover the guest VMs.
Resolve the error conditions and then restart the failover procedure.
The entities of AHV/ESXi clusters recover at a different path on the ESXi clusters if their files conflict with the existing files on the recovery ESXi cluster. For example, there is a file name conflict if a VM (VM1) migrates to a recovery cluster that already has a VM (VM1) in the same container.
However, the entities recover at a different path with VmRecoveredAtAlternatePath alert only if the following conditions are met.
If these conditions are not satisfied, the failover operation fails.
A failback is a failover of the guest VMs from the recovery availability zone (AZ) back to the primary AZ. The same recovery plan applies to both the failover and the failback operations. The difference is that for failover you run the failover operations on the recovery plan at the recovery AZ, whereas for failback you run them on the recovery plan at the primary AZ.
To perform a failback, do the following procedure at the primary AZ.
Resolve the error conditions and then restart the failover procedure.
The entities of AHV/ESXi clusters recover at a different path on the ESXi clusters if their files conflict with the existing files on the recovery ESXi cluster. For example, there is a file name conflict if a VM (VM1) migrates to a recovery cluster that already has a VM (VM1) in the same container.
However, the entities recover at a different path with VmRecoveredAtAlternatePath alert only if the following conditions are met.
If these conditions are not satisfied, the failover operation fails.
After you trigger a failover operation, you can monitor failover-related tasks. To monitor a failover, perform the following procedure at the recovery AZ. If you have two recovery AZs for DR, perform the procedure at the AZ where you trigger the failover.
You can configure RBAC policies allowing other Prism Central Active Directory users (non-administrator roles) to perform operations on recovery points and recovery plans. This section guides you to configure recovery plan RBAC policies. For information about RBAC policies for recovery points, see Controlling User Access (RBAC) in the Nutanix Security Guide . Perform the following steps to configure recovery plan RBAC policies.
You must create a custom role because none of the in-built roles support recovery plan operations.
The entity access is revoked because the entity UUID changes after the unplanned failover.
To create a custom role, do the following:
The Roles page appears. See Custom Role Permissions for a list of the permissions available for each custom role option.
For example, for the VM entity, click the radio button for the desired VM permissions:
If you select Set Custom Permissions , click the Change link to display the Custom VM Permissions window, check all the permissions you want to enable, and then click the Save button. Optionally, check the Allow VM Creation box to allow this role to create VMs.
Perform the following procedure to modify or delete a custom role.
A selection of permission options is available when creating a custom role.
The following table lists the permissions you can grant when creating or modifying a custom role. When you select an option for an entity, the permissions listed for that option are granted. If you select Set custom permissions , a complete list of available permissions for that entity appears. Select the desired permissions from that list.
Entity | Option | Permissions |
---|---|---|
App (application) | No Access | (none) |
Basic Access | Abort App Runlog, Access Console VM, Action Run App, Clone VM, Create AWS VM, Create Image, Create VM, Delete AWS VM, Delete VM, Download App Runlog, Update AWS VM, Update VM, View App, View AWS VM, View VM | |
Set Custom Permissions (select from list) | Abort App Runlog, Access Console VM, Action Run App, Clone VM, Create App, Create AWS VM, Create Image, Create VM, Delete App, Delete AWS VM, Delete VM, Download App Runlog, Update App, Update AWS VM, Update VM, View App, View AWS VM, View VM | |
VM Recovery Point | No Access | (none) |
View Only | View VM Recovery Point | |
Full Access | Delete VM Recovery Point, Restore VM Recovery Point, Snapshot VM, Update VM Recovery Point, View VM Recovery Point, Allow VM Recovery Point creation | |
Set Custom Permissions (Change) | Abort App Runlog, Access Console VM, Action Run App, Clone VM, Create App, Create AWS VM, Create Image, Create VM, Delete App, Delete AWS VM, Delete VM, Download App Runlog, Update App, Update AWS VM, Update VM, View App, View AWS VM, View VM | |
Note: You can assign permissions for the VM Recovery Point entity to users or user groups in the following two ways.
Tip: When a recovery point is created, it is associated with the same category as the VM.
VM | No Access | (none) |
View Access | Access Console VM, View VM | |
Basic Access | Access Console VM, Update VM Power State, View VM | |
Edit Access | Access Console VM, Update VM, View Subnet, View VM | |
Full Access | Access Console VM, Clone VM, Create VM, Delete VM, Export VM, Update VM, Update VM Boot Config, Update VM CPU, Update VM Categories, Update VM Description, Update VM Disk List, Update VM GPU List, Update VM Memory, Update VM NIC List, Update VM Owner, Update VM Power State, Update VM Project, View Cluster, View Subnet, View VM. | |
Set Custom Permissions (select from list) | Access Console VM, Clone VM, Create VM, Delete VM, Update VM, Update VM Boot Config, Update VM CPU, Update VM Categories, Update VM Disk List, Update VM GPU List, Update VM Memory, Update VM NIC List, Update VM Owner, Update VM Power State, Update VM Project, View Cluster, View Subnet, View VM. | |
Granular permissions (applicable if IAM is enabled; see Granular Role-Based Access Control (RBAC) for details): Allow VM Power Off, Allow VM Power On, Allow VM Reboot, Allow VM Reset, Expand VM Disk Size, Mount VM CDROM, Unmount VM CDROM, Update VM Memory Overcommit, Update VM NGT Config, Update VM Power State Mechanism | |
Allow VM creation (additional option) | (n/a) | |
Blueprint | No Access | (none) |
View Access | View Account, View AWS AZ, View AWS Elastic IP, View AWS Image, View AWS Key Pair, View AWS Machine Type, View AWS Region, View AWS Role, View AWS Security Group, View AWS Subnet, View AWS Volume Type, View AWS VPC, View Blueprint, View Cluster, View Image, View Project, View Subnet | |
Basic Access | Access Console VM, Clone VM, Create App, Create Image, Create VM, Delete VM, Launch Blueprint, Update VM, View Account, View App, View AWS AZ, View AWS Elastic IP, View AWS Image, View AWS Key Pair, View AWS Machine Type, View AWS Region, View AWS Role, View AWS Security Group, View AWS Subnet, View AWS Volume Type, View AWS VPC, View Blueprint, View Cluster, View Image, View Project, View Subnet, View VM | |
Full Access | Access Console VM, Clone Blueprint, Clone VM, Create App, Create Blueprint, Create Image, Create VM, Delete Blueprint, Delete VM, Download Blueprint, Export Blueprint, Import Blueprint, Launch Blueprint, Render Blueprint, Update Blueprint, Update VM, Upload Blueprint, View Account, View App, View AWS AZ, View AWS Elastic IP, View AWS Image, View AWS Key Pair, View AWS Machine Type, View AWS Region, View AWS Role, View AWS Security Group, View AWS Subnet, View AWS Volume Type, View AWS VPC, View Blueprint, View Cluster, View Image, View Project, View Subnet, View VM | |
Set Custom Permissions (select from list) | Access Console VM, Clone VM, Create App, Create Blueprint, Create Image, Create VM, Delete Blueprint, Delete VM, Download Blueprint, Export Blueprint, Import Blueprint, Launch Blueprint, Render Blueprint, Update Blueprint, Update VM, Upload Blueprint, View Account, View App, View AWS AZ, View AWS Elastic IP, View AWS Image, View AWS Key Pair, View AWS Machine Type, View AWS Region, View AWS Role, View AWS Security Group, View AWS Subnet, View AWS Volume Type, View AWS VPC, View Blueprint, View Cluster, View Image, View Project, View Subnet, View VM | |
Marketplace Item | No Access | (none) |
View marketplace and published blueprints | View Marketplace Item | |
View marketplace and publish new blueprints | Update Marketplace Item, View Marketplace Item | |
Full Access | Config Marketplace Item, Create Marketplace Item, Delete Marketplace Item, Render Marketplace Item, Update Marketplace Item, View Marketplace Item | |
Set Custom Permissions (select from list) | Config Marketplace Item, Create Marketplace Item, Delete Marketplace Item, Render Marketplace Item, Update Marketplace Item, View Marketplace Item | |
Report | No Access | (none) |
View Only | Notify Report Instance, View Common Report Config, View Report Config, View Report Instance | |
Full Access | Create Common Report Config, Create Report Config, Create Report Instance, Delete Common Report Config, Delete Report Config, Delete Report Instance, Notify Report Instance, Run Report Config, Share Report Config, Share Report Instance, Update Common Report Config, Update Report Config, View Common Report Config, View Report Config, View Report Instance, View User, View User Group | |
Cluster | No Access | (none) |
View Access | View Cluster | |
Update Access | Update Cluster | |
Full Access | Update Cluster, View Cluster | |
VLAN Subnet | No Access | (none) |
View Access | View Subnet, View Virtual Switch | |
Edit Access | Update Subnet, View Cluster, View Subnet, View Virtual Switch | |
Full Access | Create Subnet, Delete Subnet, Update Subnet, View Cluster, View Subnet, View Virtual Switch | |
Image | No Access | (none) |
View Only | View Image | |
Set Custom Permissions (select from list) | Copy Image Remote, Create Image, Delete Image, Migrate Image, Update Image, View Image | |
OVA | No Access | (none) |
View Access | View OVA | |
Full Access | View OVA, Create OVA, Update OVA and Delete OVA | |
Set Custom Permissions (Change) | View OVA, Create OVA, Update OVA, and Delete OVA | |
Image Placement Policy | No Access | (none) |
View Access | View Image Placement Policy, View Name Category, View Value Category | |
Full Access | Create Image Placement Policy, Delete Image Placement Policy, Update Image Placement Policy, View Image Placement Policy, View Name Category, View Value Category | |
Set Custom Permissions (select from list) | Create Image Placement Policy, Delete Image Placement Policy, Update Image Placement Policy, View Image Placement Policy, View Name Category, View Value Category | |
File Server | No Access | (none) |
Allow File Server creation | Note: The role has full access if you select Allow File Server creation. | |
The following table describes the permissions.
Permission | Description | Assigned Implicitly By |
---|---|---|
Create App | Allows to create an application. | |
Delete App | Allows to delete an application. | |
View App | Allows to view an application. | |
Action Run App | Allows to run action on an application. | |
Download App Runlog | Allows to download an application runlog. | |
Abort App Runlog | Allows to abort an application runlog. | |
Access Console VM | Allows to access the console of a virtual machine. | |
Create VM | Allows to create a virtual machine. | |
View VM | Allows to view a virtual machine. | |
Clone VM | Allows to clone a virtual machine. | |
Delete VM | Allows to delete a virtual machine. | |
Export VM | Allows to export a virtual machine | |
Snapshot VM | Allows to snapshot a virtual machine. | |
View VM Recovery Point | Allows to view a vm_recovery_point. | |
Update VM Recovery Point | Allows to update a vm_recovery_point. | |
Delete VM Recovery Point | Allows to delete a vm_recovery_point. | |
Restore VM Recovery Point | Allows to restore a vm_recovery_point. | |
Update VM | Allows to update a virtual machine. | |
Update VM Boot Config | Allows to update a virtual machine's boot configuration. | Update VM |
Update VM CPU | Allows to update a virtual machine's CPU configuration. | Update VM |
Update VM Categories | Allows to update a virtual machine's categories. | Update VM |
Update VM Description | Allows to update a virtual machine's description. | Update VM |
Update VM GPU List | Allows to update a virtual machine's GPUs. | Update VM |
Update VM NIC List | Allows to update a virtual machine's NICs. | Update VM |
Update VM Owner | Allows to update a virtual machine's owner. | Update VM |
Update VM Project | Allows to update a virtual machine's project. | Update VM |
Update VM NGT Config | Allows updates to a virtual machine's Nutanix Guest Tools configuration. | Update VM |
Update VM Power State | Allows updates to a virtual machine's power state. | Update VM |
Update VM Disk List | Allows to update a virtual machine's disks. | Update VM |
Update VM Memory | Allows to update a virtual machine's memory configuration. | Update VM |
Update VM Power State Mechanism | Allows updates to a virtual machine's power state mechanism. | Update VM or Update VM Power State |
Allow VM Power Off | Allows power off and shutdown operations on a virtual machine. | Update VM or Update VM Power State |
Allow VM Power On | Allows power on operation on a virtual machine. | Update VM or Update VM Power State |
Allow VM Reboot | Allows reboot operation on a virtual machine. | Update VM or Update VM Power State |
Expand VM Disk Size | Allows to expand a virtual machine's disk size. | Update VM or Update VM Disk List |
Mount VM CDROM | Allows to mount an ISO to virtual machine's CDROM. | Update VM or Update VM Disk List |
Unmount VM CDROM | Allows to unmount ISO from virtual machine's CDROM. | Update VM or Update VM Disk List |
Update VM Memory Overcommit | Allows to update a virtual machine's memory overcommit configuration. | Update VM or Update VM Memory |
Allow VM Reset | Allows reset (hard reboot) operation on a virtual machine. | Update VM, Update VM Power State, or Allow VM Reboot |
View Cluster | Allows to view a cluster. | |
Update Cluster | Allows to update a cluster. | |
Create Image | Allows to create an image. | |
View Image | Allows to view an image. | |
Copy Image Remote | Allows to copy an image from local PC to remote PC. | |
Delete Image | Allows to delete an image. | |
Migrate Image | Allows to migrate an image from PE to PC. | |
Update Image | Allows to update an image. | |
Create Image Placement Policy | Allows to create an image placement policy. | |
View Image Placement Policy | Allows to view an image placement policy. | |
Delete Image Placement Policy | Allows to delete an image placement policy. | |
Update Image Placement Policy | Allows to update an image placement policy. | |
Create AWS VM | Allows to create an AWS virtual machine. | |
View AWS VM | Allows to view an AWS virtual machine. | |
Update AWS VM | Allows to update an AWS virtual machine. | |
Delete AWS VM | Allows to delete an AWS virtual machine. | |
View AWS AZ | Allows to view AWS Availability Zones. | |
View AWS Elastic IP | Allows to view an AWS Elastic IP. | |
View AWS Image | Allows to view an AWS image. | |
View AWS Key Pair | Allows to view AWS keypairs. | |
View AWS Machine Type | Allows to view AWS machine types. | |
View AWS Region | Allows to view AWS regions. | |
View AWS Role | Allows to view AWS roles. | |
View AWS Security Group | Allows to view an AWS security group. | |
View AWS Subnet | Allows to view an AWS subnet. | |
View AWS Volume Type | Allows to view AWS volume types. | |
View AWS VPC | Allows to view an AWS VPC. | |
Create Subnet | Allows to create a subnet. | |
View Subnet | Allows to view a subnet. | |
Update Subnet | Allows to update a subnet. | |
Delete Subnet | Allows to delete a subnet. | |
Create Blueprint | Allows to create the blueprint of an application. | |
View Blueprint | Allows to view the blueprint of an application. | |
Launch Blueprint | Allows to launch the blueprint of an application. | |
Clone Blueprint | Allows to clone the blueprint of an application. | |
Delete Blueprint | Allows to delete the blueprint of an application. | |
Download Blueprint | Allows to download the blueprint of an application. | |
Export Blueprint | Allows to export the blueprint of an application. | |
Import Blueprint | Allows to import the blueprint of an application. | |
Render Blueprint | Allows to render the blueprint of an application. | |
Update Blueprint | Allows to update the blueprint of an application. | |
Upload Blueprint | Allows to upload the blueprint of an application. | |
Create OVA | Allows to create an OVA. | |
View OVA | Allows to view an OVA. | |
Update OVA | Allows to update an OVA. | |
Delete OVA | Allows to delete an OVA. | |
Create Marketplace Item | Allows to create a marketplace item. | |
View Marketplace Item | Allows to view a marketplace item. | |
Update Marketplace Item | Allows to update a marketplace item. | |
Config Marketplace Item | Allows to configure a marketplace item. | |
Render Marketplace Item | Allows to render a marketplace item. | |
Delete Marketplace Item | Allows to delete a marketplace item. | |
Create Report Config | Allows to create a report_config. | |
View Report Config | Allows to view a report_config. | |
Run Report Config | Allows to run a report_config. | |
Share Report Config | Allows to share a report_config. | |
Update Report Config | Allows to update a report_config. | |
Delete Report Config | Allows to delete a report_config. | |
Create Common Report Config | Allows to create a common report_config. | |
View Common Report Config | Allows to view a common report_config. | |
Update Common Report Config | Allows to update a common report_config. | |
Delete Common Report Config | Allows to delete a common report_config. | |
Create Report Instance | Allows to create a report_instance. | |
View Report Instance | Allows to view a report_instance. | |
Notify Report Instance | Allows to notify a report_instance. | |
Share Report Instance | Allows to share a report_instance. | |
Delete Report Instance | Allows to delete a report_instance. | |
View Account | Allows to view an account. | |
View Project | Allows to view a project. | |
View User | Allows to view a user. | |
View User Group | Allows to view a user group. | |
View Name Category | Allows to view a category's name. | |
View Value Category | Allows to view a category's value. | |
View Virtual Switch | Allows to view a virtual switch. |
In addition to configuring basic role maps (see Configuring Role Mapping), you can configure more precise role assignments (AHV only). To assign a role to selected users or groups that applies just to a specified set of entities, do the following:
You are adding users or user groups and assigning entities to the new role in the next steps.
Typing a few letters in the search field displays a list of users from which you can select; you can add multiple user names in this field.
This table lists the available entities for each role:
Role | Entities |
---|---|
Consumer | AHV VM, Image, Image Placement Policy, OVA, Subnets: VLAN |
Developer | AHV VM, Cluster, Image, Image Placement Policy, OVA, Subnets:VLAN |
Operator | AHV VM, Subnets:VLAN |
Prism Admin | Individual entity (one or more clusters), All Clusters |
Prism Viewer | Individual entity (one or more clusters), All Clusters |
Custom role (User defined role) | Individual entity, In Category (only AHV VMs) |
This table shows the description of each entity:
Entity | Description |
---|---|
AHV VM | Allows you to manage VMs, including create and edit permissions |
Image | Allows you to access and manage image details |
Image Placement Policy | Allows you to access and manage image placement policy details |
OVA | Allows you to view and manage OVA details |
Subnets: VLAN | Allows you to view subnet details |
Cluster | Allows you to view and manage details of assigned clusters (AHV and ESXi clusters) |
All Clusters | Allows you to view and manage details of all clusters |
VM Recovery Points | Allows you to perform recovery operations with recovery points. |
Recovery Plan (single PC only) | Allows you to view, validate, and test recovery plans. Also allows you to clean up VMs created after a recovery plan test. |
Individual entity | Allows you to view and manage individual entities such as AHV VM, Clusters, and Subnets:VLAN |
The self-service restore (also known as file-level restore) feature allows you to perform self-service data recovery from Nutanix data protection recovery points with minimal intervention. You can perform self-service data recovery on both on-prem clusters and Xi Cloud Services.
You must deploy NGT 2.0 or newer on guest VMs to enable self-service restore from Prism Central. For more information about enabling and mounting NGT, see Enabling and Mounting Nutanix Guest Tools in the Prism Web Console Guide . When you enable self-service restore and attach a disk by logging into the VM, you can recover files within the guest OS. If you fail to detach the disk from the VM, the disk is detached automatically from the VM after 24 hours.
The requirements of self-service restore of Windows and Linux VMs are as follows.
The following are the general requirements of self-service restore. Ensure that you meet the requirements before configuring self-service restore for guest VMs.
The AOS license required depends on the features that you want to use. For information about the AOS license required for self-service restore, see Software Options.
AHV or ESXi
Prism Centrals and their registered on-prem clusters (Prism Elements) must be running the following versions of AOS.
The following are the specific requirements of self-service restore for guest VMs running Windows OS. Ensure that you meet the requirements before proceeding.
The following are the specific requirements of self-service restore for guest VMs running Linux OS. Ensure that you meet the requirements before proceeding.
The limitations of self-service restore of Windows and Linux VMs are as follows.
The following are the general limitations of self-service restore.
The following are the specific limitations of self-service restore for guest VMs running Windows OS.
Whenever the snapshot disk has an inconsistent filesystem (as indicated by the fsck check), the disk is only attached and not mounted.
After enabling NGT for a guest VM, you can enable the self-service restore for that guest VM. Also, you can enable the self-service restore for a guest VM while you are installing NGT on that guest VM.
For more information, see Enabling and Mounting Nutanix Guest Tools in the Prism Web Console Guide .
Ensure that you have installed and enabled NGT 2.0 or newer on the guest VM.
To enable self-service restore, perform the following procedure.
You can restore the desired files from the guest VM through the web interface or by using the ngtcli utility of self-service restore.
After you install NGT in the Windows guest VM, you can restore the desired files from the VM through the web interface.
To restore a file in Windows guest VMs by using web interface, perform the following.
After you install NGT in the Windows guest VM, you can restore the desired files from the VM through the ngtcli utility.
To restore a file in Windows guest VMs by using ngtcli, perform the following.
> cd c:\Program Files\Nutanix\ngtcli
> python ngtcli.py
Running python ngtcli.py creates a terminal with auto-complete.
ngtcli> ssr ls-snaps
The snapshot ID, disk labels, logical drives, and creation time of each snapshot are displayed. Use this information to decide which snapshot contains the data you want to restore.
ngtcli> ssr ls-snaps snapshot-count=count_value
Replace count_value with the number that you want to list.
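For example, to list the five most recent snapshots (the count value here is only illustrative), type the following command.
ngtcli> ssr ls-snaps snapshot-count=5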
ngtcli> ssr attach-disk disk-label=disk_label snapshot-id=snap_id
Replace disk_label with the name of the disk that you want to attach.
Replace snap_id with the snapshot ID of the disk that you want to attach.
For example, to attach a disk with snapshot ID 16353 and disk label scsi0:1, type the following command.
ngtcli> ssr attach-disk snapshot-id=16353 disk-label=scsi0:1
ngtcli> ssr detach-disk attached-disk-label=attached_disk_label
Replace attached_disk_label with the name of the disk that you want to detach.
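For example, to detach a previously attached disk that has the (illustrative) attached disk label scsi0:2, type the following command.
ngtcli> ssr detach-disk attached-disk-label=scsi0:2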
ngtcli> ssr list-attached-disks
The Linux guest VM user with sudo privileges can restore the desired files from the VM through the web interface or by using the ngtcli utility.
After you install NGT in the Linux guest VM, you can restore the desired files from the VM through the web interface.
To restore a file in Linux guest VMs by using web interface, perform the following.
The selected disk or disks are mounted and the relevant disk label is displayed.
After you install NGT in the Linux guest VM, you can restore the desired files from the VM through the ngtcli utility.
To restore a file in Linux guest VMs by using ngtcli, perform the following.
> cd /usr/local/nutanix/ngt/ngtcli
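Then start the utility, as in the Windows procedure. This is a sketch that assumes the same python-based ngtcli.py entry point is used on Linux; run it with sudo privileges if required.
> python ngtcli.py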
ngtcli> ssr ls-snaps
The snapshot ID, disk labels, logical drives, and creation time of each snapshot are displayed. Use this information to decide which snapshot contains the data you want to restore.
ngtcli> ssr ls-snaps snapshot-count=count_value
Replace count_value with the number that you want to list.
ngtcli> ssr attach-disk disk-label=disk_label snapshot-id=snap_id
Replace disk_label with the name of the disk that you want to attach.
Replace snap_id with the snapshot ID of the disk that you want to attach.
For example, to attach a disk with snapshot ID 1343 and disk label scsi0:2, type the following command.
ngtcli> ssr attach-disk snapshot-id=1343 disk-label=scsi0:2
After the command runs successfully, a new disk with a new label is attached to the guest VM.
ngtcli> ssr detach-disk attached-disk-label=attached_disk_label
Replace attached_disk_label with the name of the disk that you want to detach.
For example, to remove the disk with disk label scsi0:3, type the following command.
ngtcli> ssr detach-disk attached-disk-label=scsi0:3
ngtcli> ssr list-attached-disks
NearSync replication enables you to protect your guest VMs with an RPO as low as 1 minute. A protection policy with a NearSync replication schedule creates a recovery point at a minutely time interval (1–15 minutes) and replicates it to the recovery availability zones (AZs) for High Availability. For guest VMs protected with a NearSync replication schedule, you can perform disaster recovery (DR) to a different Nutanix cluster at the same or different AZs. In addition to DR to Nutanix clusters of the same hypervisor type, you can also perform cross-hypervisor disaster recovery (CHDR)—disaster recovery from AHV clusters to ESXi clusters, or from ESXi clusters to AHV clusters.
The following are the advantages of protecting your guest VMs with a NearSync replication schedule.
Stun time is the length of time for which an application freezes while the recovery point is taken.
To implement the NearSync feature, Nutanix has introduced a technology called lightweight snapshots (LWSs). LWS recovery points are created at the metadata level only, and they continuously replicate incoming data generated by workloads running on the active clusters. LWS recovery points are stored in the LWS store, which is allocated on the SSD tier. When you configure a protection policy with a NearSync replication schedule, the system allocates the LWS store automatically.
When you create a NearSync replication schedule, the schedule remains an hourly schedule until its transition into a minutely schedule is complete.
To transition into the NearSync (minutely) replication schedule, the system first seeds the recovery AZ with data: recovery points are taken on an hourly basis and replicated to the recovery AZ. After the system determines that the recovery points containing the seeding data have replicated within a specified amount of time (an hour, by default), it automatically transitions the replication schedule into a NearSync schedule, depending on the bandwidth and the change rate. After the transition into the NearSync replication schedule, you can see the configured minutely recovery points in the web interface.
The following are the characteristics of the process.
To transition out of the NearSync replication schedule, you can do one of the following.
Repeated transitioning in and out of NearSync replication schedule can occur because of the following reasons.
Depending on the RPO (1–15 minutes), the system retains the recovery points for a specific time period. For a NearSync replication schedule, you can configure the retention policy for days, weeks, or months on both the primary and recovery AZs instead of defining the number of recovery points you want to retain. For example, if you desire an RPO of 1 minute and want to retain the recovery points for 5 days, the retention policy works in the following way.
You can also define recovery point retention in weeks or months. For example, if you configure a 3-month schedule, the retention policy works in the following way.
The following are the specific requirements for protecting your guest VMs with NearSync replication schedule. Ensure that you meet the following requirements in addition to the general requirements of Nutanix Disaster Recovery .
For more information about the general requirements of Nutanix Disaster Recovery , see Nutanix Disaster Recovery Requirements.
For information about node, disk and Foundation configurations required to support NearSync replication schedules, see On-Prem Hardware Resource Requirements.
AHV or ESXi
Each on-prem AZ must have a Leap-enabled Prism Central instance.
The primary and recovery Prism Centrals and their registered Nutanix clusters must be running the following versions of AOS.
Guest VMs protected with NearSync replication schedule support cross-hypervisor disaster recovery. You can perform failover (DR) to recover guest VMs from AHV clusters to ESXi clusters or guest VMs from ESXi clusters to AHV clusters by considering the following requirements.
NGT configures the guest VMs with all the required drivers for VM portability. For more information about general NGT requirements, see Nutanix Guest Tools Requirements and Limitations in Prism Web Console Guide .
For information about operating systems that support UEFI and Secure Boot, see UEFI and Secure Boot Support for CHDR.
Operating System | Version | Requirements and limitations |
---|---|---|
Windows | | |
Linux | | |
Consider the following specific limitations before protecting your guest VMs with NearSync replication schedule. These limitations are in addition to the general limitations of Nutanix Disaster Recovery .
For information about the general limitations of Nutanix Disaster Recovery , see Nutanix Disaster Recovery Limitations.
Cross-hypervisor disaster recovery (CHDR) does not preserve hypervisor-specific properties (for example, multi-writer flags, independent persistent and non-persistent disks, changed block tracking (CBT), and PVSCSI disk configurations).
For example, suppose you have 1 day of retention at the primary AZ and 5 days of retention at the recovery AZ, and you want to go back to a recovery point from 5 days ago. A NearSync replication schedule does not support replicating the 5-day retention back from the recovery AZ to the primary AZ.
To protect the guest VMs in a minutely replication schedule, configure a NearSync replication schedule while creating the protection policy. The policy takes recovery points of the protected guest VMs at the specified time interval (1–15 minutes) and replicates them to the recovery availability zone (AZ) for High Availability. To maintain the efficiency of minutely replication, the protection policy allows you to configure a NearSync replication schedule to only one recovery AZ. When creating a protection policy, you can specify only VM categories. If you want to include VMs individually, you must first create the protection policy—which can also include VM categories, and then include the VMs individually in the protection policy from the VMs page.
Ensure that the primary and the recovery AHV or ESXi clusters at the same or different AZs are NearSync capable. A cluster is NearSync capable if the capacity of each SSD in the cluster is at least 1.2 TB.
See NearSync Replication Requirements (Nutanix Disaster Recovery) and NearSync Replication Limitations (Nutanix Disaster Recovery) before you start.
To create a protection policy with a NearSync replication schedule, do the following at the primary AZ. You can also create a protection policy at the recovery AZ. Protection policies you create or update at a recovery AZ synchronize back to the primary AZ.
The drop-down lists all the AZs paired with the local AZ. Local AZ represents the local AZ (Prism Central). For your primary AZ, you can check either the local AZ or a non-local AZ.
The drop-down lists all the Nutanix clusters registered to Prism Central representing the selected AZ. If you want to protect the guest VMs from multiple Nutanix clusters in the same protection policy, select the clusters that host those guest VMs. All Clusters protects the guest VMs of all Nutanix clusters registered to Prism Central.
Clicking Save activates the Recovery Location pane. After saving the primary AZ configuration, you can optionally add a local schedule to retain the recovery points at the primary AZ.
Specify the following information in the Add Schedule window.
When you enter the frequency in minutes, the system selects the Roll-up retention type by default because minutely recovery points do not support Linear retention types.
For more information about the roll-up recovery points, see step d.iii.
Irrespective of the local or replication schedules, the recovery points are of the specified type. If you check Take App-Consistent Recovery Point , the recovery points generated are application-consistent; if you do not check it, the recovery points generated are crash-consistent. If the times in the local schedule and the replication schedule match, the single recovery point generated is application-consistent.
The drop-down lists all the AZs paired with the local AZ. Local AZ represents the local AZ (Prism Central). Select Local AZ if you want to configure DR to a different Nutanix cluster at the same AZ.
If you do not select an AZ, local recovery points that are created by the protection policy do not replicate automatically. You can, however, replicate the recovery points manually and use recovery plans to recover the guest VMs. For more information, see Protection and Manual DR (Nutanix Disaster Recovery).
The drop-down lists all the Nutanix clusters registered to Prism Central representing the selected AZ. You can select one cluster at the recovery AZ. To maintain the efficiency of minutely replication, a protection policy allows you to configure only one recovery AZ for a NearSync replication schedule. However, you can add another Asynchronous replication schedule for replicating recovery points to the same or different AZs. For more information about adding another recovery AZ with a replication schedule, see step e.
Clicking Save activates the + Add Schedule button between the primary and the recovery AZ. After saving the recovery AZ configuration, you can optionally add a local schedule to retain the recovery points at the recovery AZ.
Specify the following information in the Add Schedule window.
When you enter the frequency in minutes, the system selects the Roll-up retention type by default because minutely recovery points do not support Linear retention types.
For more information about the roll-up recovery points, see step d.iii.
Irrespective of the local or replication schedules, the recovery points are of the specified type. If you check Take App-Consistent Recovery Point , the recovery points generated are application-consistent; if you do not check it, the recovery points generated are crash-consistent. If the times in the local schedule and the replication schedule match, the single recovery point generated is application-consistent.
Specify the following information in the Add Schedule window. The window auto-populates the Primary Location and Recovery Location that you have selected in step b and step c.
The specified frequency is the RPO. For more information about RPO, see Nutanix Disaster Recovery Terminology.
This field is unavailable if you do not specify a recovery location.
Reverse retention maintains the retention numbers of recovery points even after failover to a recovery AZ in the same or different AZs. For example, if you retain two recovery points at the primary AZ and three recovery points at the recovery AZ, and you enable reverse retention, a failover event does not change the initial retention numbers when the recovery points replicate back to the primary AZ. The recovery AZ still retains two recovery points while the primary AZ retains three recovery points. If you do not enable reverse retention, a failover event changes the initial retention numbers when the recovery points replicate back to the primary AZ. The recovery AZ retains three recovery points while the primary AZ retains two recovery points.
Maintaining the same retention numbers at a recovery AZ is required if you want to retain a particular number of recovery points, irrespective of where the guest VM is after its failover.
Application-consistent recovery points ensure that application consistency is maintained in the replicated recovery points. For application-consistent recovery points, install NGT on the guest VMs running on AHV clusters. For guest VMs running on ESXi clusters, you can take application-consistent recovery points without installing NGT, but the recovery points are hypervisor-based and lead to VM stuns (temporarily unresponsive VMs) after failover to the recovery AZs.
The Add Schedule window appears and auto-populates the Primary Location and the additional Recovery Location . Perform step d again to add the replication schedule.
By default, recovery point creation begins immediately after you create the protection policy. If you want to specify when recovery point creation must begin, click Immediately at the top-right corner, and then, in the Start Time dialog box, do the following.
For example, the guest VM VM_SherlockH is in the category Department:Admin , and you add this category to the protection policy named PP_AdminVMs . Now, if you add VM_SherlockH from the VMs page to another protection policy named PP_VMs_UK , VM_SherlockH is protected in PP_VMs_UK and unprotected from PP_AdminVMs .
If you do not want to protect the guest VMs category wise, proceed to the next step without checking VM categories. You can add the guest VMs individually to the protection policy later from the VMs page (see Adding Guest VMs Individually to a Protection Policy).
To orchestrate the failover of the protected guest VMs to the recovery AZ, create a recovery plan. After a failover, a recovery plan recovers the protected guest VMs to the recovery AZ. If you have configured two recovery AZs in a protection policy, create two recovery plans for DR—one for recovery to each recovery AZ. The recovery plan synchronizes continuously to the recovery AZ in a bidirectional way.
For more information about creating a recovery plan, see Creating a Recovery Plan (Nutanix Disaster Recovery).
You can perform test failover, planned failover, and unplanned failover of the guest VMs protected with NearSync replication schedule across different Nutanix clusters at the same or different on-prem availability zones (AZs). The steps to perform test, planned, and unplanned failover are largely the same irrespective of the replication schedules that protect the guest VMs.
Refer to Failover and Failback Management for test, planned, and unplanned failover procedures.
Synchronous replication enables you to protect your guest VMs with a zero recovery point objective (0 RPO). A protection policy with a Synchronous replication schedule replicates all the writes on the protected guest VMs synchronously to the recovery availability zone (AZ) for High Availability. For raw node (HDD+SSD) sizes up to 120 TB, the policy also takes recovery points of those protected VMs every 6 hours—the first snapshot is taken immediately. Since the replication is synchronous, the recovery points are crash-consistent only. For guest VMs (AHV) protected with a Synchronous replication schedule, you can perform DR only to an AHV cluster at the same or a different AZ. Replicating writes synchronously and also generating recovery points helps eliminate data loss due to:
Nutanix recommends that the round-trip latency (RTT) between AHV clusters be less than 5 ms for optimal performance of Synchronous replication schedules. Maintain adequate bandwidth to accommodate peak writes and have a redundant physical network between the clusters.
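As a quick, informal sanity check of the round-trip latency (a sketch assuming standard ping is available on the CVMs and that the recovery cluster virtual IP is reachable), you can run the following command.
nutanix@cvm$ ping -c 10 recovery_cluster_virtual_ip
Replace recovery_cluster_virtual_ip with the virtual IP address of the recovery cluster, and confirm that the reported average round-trip time stays well under 5 ms.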
To perform the replications synchronously yet efficiently, the protection policy limits you to configure only one recovery AZ if you add a Synchronous replication schedule. If you configure Synchronous replication schedule for a guest VM, you cannot add an Asynchronous or NearSync schedule to the same guest VM. Similarly, if you configure an Asynchronous or a NearSync replication schedule, you cannot add a Synchronous schedule to the same guest VM.
If you unpair the AZs while the guest VMs in the Nutanix clusters are still in synchronization, the Nutanix cluster becomes unstable. Therefore, disable Synchronous replication and clear stale stretch parameters, if any, on both the primary and recovery Prism Elements before unpairing the AZs. For more information about disabling Synchronous replication, see Synchronous Replication Management.
The following are the specific requirements for protecting your AHV guest VMs with Synchronous replication schedule. Ensure that you meet the following requirements in addition to the general requirements of Leap.
For information about the general requirements of Nutanix Disaster Recovery , see Nutanix Disaster Recovery Requirements.
For information about node, disk and Foundation configurations required to support Synchronous replication schedules, see On-Prem Hardware Resource Requirements.
AHV
The AHV clusters must be running on version 20190916.189 or newer.
The primary and recovery Nutanix Clusters can be registered with a single Prism Central instance or each can be registered with different Prism Central instances.
For hardware and Foundation configurations required to support Synchronous replication schedules, see On-Prem Hardware Resource Requirements.
nutanix@cvm$ allssh 'modify_firewall -f -r remote_cvm_ip,remote_virtual_ip -p 2030,2036,2073,2090 -i eth0'
Replace remote_cvm_ip with the IP address of the recovery cluster CVM. If there are multiple CVMs, replace remote_cvm_ip with the IP addresses of the CVMs separated by comma.
Replace remote_virtual_ip with the virtual IP address of the recovery cluster.
nutanix@cvm$ allssh 'modify_firewall -f -r source_cvm_ip,source_virtual_ip -p 2030,2036,2073,2090 -i eth0'
Replace source_cvm_ip with the IP address of the primary cluster CVM. If there are multiple CVMs, replace source_cvm_ip with the IP addresses of the CVMs separated by comma.
Replace source_virtual_ip with the virtual IP address of the primary cluster.
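For example, assuming hypothetical recovery-cluster CVM IP addresses of 10.10.10.11 and 10.10.10.12 and a virtual IP address of 10.10.10.10 (substitute your own addresses), the first command might look like the following.
nutanix@cvm$ allssh 'modify_firewall -f -r 10.10.10.11,10.10.10.12,10.10.10.10 -p 2030,2036,2073,2090 -i eth0'
Use the same pattern with the primary-cluster CVM and virtual IP addresses for the second command.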
Consider the following specific limitations before protecting your guest VMs with Synchronous replication schedule. These limitations are in addition to the general limitations of Nutanix Disaster Recovery .
For information about the general limitations of Leap, see Nutanix Disaster Recovery Limitations.
To protect the guest VMs in an instant replication schedule, configure a Synchronous replication schedule while creating the protection policy. The policy replicates all the writes on the protected guest VMs synchronously to the recovery availability zone (AZ) for High Availability. For a raw node (HDD+SSD) size up to 120 TB, the policy also takes crash-consistent recovery points of those guest VMs every 6 hours and replicates them to the recovery AZ—the first snapshot is taken immediately. To maintain the efficiency of synchronous replication, the protection policy allows you to add only one recovery AZ for the protected VMs. When creating a protection policy, you can specify only VM categories. If you want to protect guest VMs individually, you must first create the protection policy—which can also include VM categories, and then include the guest VMs individually in the protection policy from the VMs page.
To create a protection policy with the Synchronous replication schedule, do the following at the primary AZ. You can also create a protection policy at the recovery AZ. Protection policies you create or update at a recovery AZ synchronize back to the primary AZ.
The drop-down lists all the AZs paired with the local AZ. Local AZ represents the local AZ (Prism Central). For your primary AZ, you can check either the local AZ or a non-local AZ.
The drop-down lists all the Nutanix clusters registered to Prism Central representing the selected AZ. If you want to protect the guest VMs from multiple AHV clusters in the same protection policy, select the AHV clusters that host those guest VMs. All Clusters protects the guest VMs of all Nutanix clusters registered to Prism Central. Select All Clusters only if all the clusters are running AHV.
Clicking Save activates the Recovery Location pane. Do not add a local schedule to retain the recovery points locally. To maintain the replication efficiency, Synchronous replication allows only the replication schedule. If you add a local schedule, you cannot click Synchronous in step d.
The drop-down lists all the AZs paired with the local AZ. Local AZ represents the local AZ (Prism Central). Select Local AZ if you want to configure DR to a different AHV cluster at the same AZ.
If you do not select an AZ, local recovery points that are created by the protection policy do not replicate automatically. You can, however, replicate the recovery points manually and use recovery plans to recover the guest VMs. For more information, see Protection and Manual DR (Nutanix Disaster Recovery).
The drop-down lists all the Nutanix clusters registered to Prism Central representing the selected AZ. You can select one AHV cluster at the recovery AZ. Do not select an ESXi cluster because DR configurations using Leap support only AHV clusters. If you select an ESXi cluster and configure a Synchronous replication schedule, replications fail.
Clicking Save activates the + Add Schedule button between the primary and the recovery AZ. Do not add a local schedule to retain the recovery points locally. To maintain the replication efficiency, Synchronous replication allows only the replication schedule. If you add a local schedule, you cannot click Synchronous in step d.
Specify the following information in the Add Schedule window. The window auto-populates the Primary Location and Recovery Location that you have selected in step b and step c.
Clicking Save Schedule disables the + Add Recovery Location button at the top-right because, to maintain the efficiency of synchronous replication, the policy allows you to add only one recovery AZ.
By default, recovery point creation begins immediately after you create the protection policy. If you want to specify when recovery point creation must begin, click Immediately at the top-right corner, and then, in the Start Time dialog box, do the following.
For example, the guest VM VM_SherlockH is in the category Department:Admin , and you add this category to the protection policy named PP_AdminVMs . Now, if you add VM_SherlockH from the VMs page to another protection policy named PP_VMs_UK , VM_SherlockH is protected in PP_VMs_UK and unprotected from PP_AdminVMs .
If you do not want to protect the guest VMs category wise, proceed to the next step without checking VM categories. You can add the guest VMs individually to the protection policy later from the VMs page (see Adding Guest VMs Individually to a Protection Policy).
To orchestrate the failover of the protected guest VMs to the recovery AZ, create a recovery plan. After a failover, a recovery plan recovers the protected guest VMs to the recovery AZ. If you have configured two recovery AZs in a protection policy, create two recovery plans for DR—one for recovery to each recovery AZ. The recovery plan synchronizes continuously to the recovery AZ in a bidirectional way.
For more information about creating a recovery plan, see Creating a Recovery Plan (Nutanix Disaster Recovery).
Synchronous replication instantly replicates all writes on the protected guest VMs to the recovery cluster. Replication starts when you configure a protection policy and add the guest VMs to protect. You can manage the replication by enabling, disabling, pausing, or resuming the Synchronous replication on the protected guest VMs from the Prism Central.
When you configure a protection policy with Synchronous replication schedule and add guest VMs to protect, the replication is enabled by default. However, if you have disabled the Synchronous replication on a guest VM, you have to enable it to start replication.
To enable Synchronous replication on a guest VM, perform the following procedure at the primary availability zone (AZ). You can also perform the following procedure at the recovery AZ. The operations you perform at a recovery AZ synchronize back to the primary AZ.
The protected guest VMs on the primary cluster stop responding when the recovery cluster is disconnected abruptly (for example, due to a network outage or an internal service crash). To come out of the unresponsive state, you can pause Synchronous replication on the guest VMs. Pausing Synchronous replication temporarily suspends the replication state of the guest VMs without completely disabling the replication relationship.
To pause Synchronous replication on a guest VM, perform the following procedure.
You can resume the Synchronous replication that you had paused to come out of the unresponsive state of the primary cluster. Resuming Synchronous replication restores the replication status and reconciles the state of the guest VMs. To resume Synchronous replication on a guest VM, perform the following procedure.
You can perform test failover, planned failover, and unplanned failover of the guest VMs protected with Synchronous replication schedule across AHV clusters at different on-prem availability zones (AZs). The steps to perform test, planned, and unplanned failover are largely the same irrespective of the replication schedules that protect the guest VMs. Additionally, a planned failover of the guest VMs protected with Synchronous replication schedule also allows for live migration of the protected guest VMs.
Refer to Failover and Failback Management for test, planned, and unplanned failover procedures.
Planned failover of the guest VMs protected with Synchronous replication schedule supports live migration to another AHV cluster. Live migration offers zero downtime for your applications during a planned failover event to the recovery cluster (for example, during scheduled maintenance).
The following are the specific requirements to successfully migrate your guest VMs with Live Migration.
Ensure that you meet the following requirements in addition to the requirements of Synchronous replication schedule (Synchronous Replication Requirements) and general requirements of Leap (Nutanix Disaster Recovery Requirements).
Network stretch spans your network across different AZs. A stretched L2 network retains the IP addresses of guest VMs after their Live Migration to the recovery AZ.
The primary and recovery Nutanix clusters must have identical CPU feature sets. If the CPU feature sets (sets of CPU flags) are not identical, Live Migration fails.
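As an informal way to compare the CPU flag sets, you can dump the flags on a host in each cluster and compare the two outputs. This is a sketch that assumes shell access to the AHV hosts; the prompt and file name shown are illustrative.
root@ahv# grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' | sort > /tmp/cpu_flags.txt
Repeat the command on a host in the recovery cluster and compare the two files with diff; any difference indicates that the feature sets are not identical and Live Migration can fail.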
Consider the following limitation in addition to the limitations of Synchronous replication schedule (Synchronous Replication Limitations) and general limitations of Leap (Nutanix Disaster Recovery Limitations) before performing live migration of your guest VMs.
If, due to a planned event (for example, scheduled maintenance of guest VMs) at the primary availability zone (AZ), you want to migrate your applications to another AHV cluster without downtime, perform a planned failover with Live Migration to the recovery AZ.
To live migrate the guest VMs, do the following procedure at the recovery AZ.
Resolve the error conditions and then restart the failover procedure.
The Witness option extends the capability of metro availability to AHV clusters using Nutanix Disaster Recovery . To automate the recovery process, the recovery plan configuration is enhanced to handle failure scenarios automatically. The Witness is a service within Prism Central (including scale-out deployments) that monitors communication between the metro pair of clusters (primary and recovery AHV clusters). When the communication between the clusters is interrupted for a configurable time interval, the service executes an unplanned failover depending on the failure type and the actions you have configured in the associated recovery plan.
Witness continuously reads the health status from the metro pair of AHV clusters. If the communication between the two clusters is unavailable for a set time period, the service pauses the Synchronous replication between the clusters. If the primary cluster is unavailable, the service can also trigger an unplanned failover automatically to start the guest VMs specified in the Witness-configured recovery plan at the recovery cluster.
You can configure AHV metro for th