AHV Administration Guide

AHV 5.20

Product Release Date: 2021-05-17

Last updated: 2022-12-06

AHV Overview

As the default option for Nutanix HCI, the native Nutanix hypervisor, AHV, represents a unique approach to virtualization that offers the powerful virtualization capabilities needed to deploy and manage enterprise applications. AHV complements the HCI value by integrating native virtualization along with networking, infrastructure, and operations management with a single intuitive interface - Nutanix Prism.

Virtualization teams find AHV easy to learn and transition to from legacy virtualization solutions with familiar workflows for VM operations, live migration, VM high availability, and virtual network management. AHV includes resiliency features, including high availability and dynamic scheduling without the need for additional licensing, and security is integral to every aspect of the system from the ground up. AHV also incorporates the optional Flow Security and Networking, allowing easy access to hypervisor-based network microsegmentation and advanced software-defined networking.

See the Field Installation Guide for information about how to deploy and create a cluster. Once you create the cluster by using Foundation, you can use this guide to perform day-to-day management tasks.

AOS and AHV Compatibility

For information about the AOS and AHV compatibility with this release, see the Compatibility and Interoperability Matrix.

Limitations

For information about AHV configuration limitations, see the Nutanix Configuration Maximums webpage.

Nested Virtualization

Nutanix does not support nested virtualization (nested VMs) in an AHV cluster.

Storage Overview

AHV uses a Distributed Storage Fabric to deliver data services such as storage provisioning, snapshots, clones, and data protection to VMs directly.

In AHV clusters, AOS passes all disks to the VMs as raw SCSI block devices. By that means, the I/O path is lightweight and optimized. Each AHV host runs an iSCSI redirector, which establishes a highly resilient storage path from each VM to storage across the Nutanix cluster.

QEMU is configured with the iSCSI redirector as the iSCSI target portal. Upon a login request, the redirector performs an iSCSI login redirect to a healthy Stargate (preferably the local one).

Figure. AHV Storage
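
The redirect behavior described above can be pictured with a minimal sketch. It is not the redirector's implementation: the `Stargate` record, the `pick_redirect_target` function, and the fallback to the first healthy remote Stargate are all hypothetical and serve only to illustrate "redirect to a healthy Stargate, preferably the local one".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Stargate:
    host: str      # hypothetical identifier of the host that runs this Stargate
    healthy: bool  # hypothetical health flag


def pick_redirect_target(local_host: str, stargates: List[Stargate]) -> Optional[Stargate]:
    """Return the Stargate an iSCSI login would be redirected to, or None if none is healthy."""
    healthy = [s for s in stargates if s.healthy]
    # Prefer the Stargate that runs on the same host as the VM.
    for s in healthy:
        if s.host == local_host:
            return s
    # Assumption: otherwise take the first healthy one; the real selection policy
    # is not described in this guide.
    return healthy[0] if healthy else None
```

In this sketch, a VM on a host whose local Stargate is unhealthy is redirected to a remote healthy Stargate, which corresponds to the resilient storage path described above.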

AHV Turbo

AHV Turbo represents significant advances to the data path in AHV. AHV Turbo provides an I/O path that bypasses QEMU and services storage I/O requests, which lowers CPU usage and increases the amount of storage I/O available to VMs.

When you use QEMU, all I/O travels through a single queue, which can impact system performance. AHV Turbo provides an I/O path that uses a multi-queue approach to bypass QEMU. The multi-queue approach allows data to flow from a VM to storage more efficiently, which results in much higher I/O capacity and lower CPU usage. The storage queues automatically scale out to match the number of vCPUs configured for a given VM, which results in higher performance as the workload scales up.

AHV Turbo is transparent to VMs and is enabled by default on VMs that run in AHV clusters. For maximum VM performance, ensure that the following conditions are met:

  • The latest Nutanix VirtIO package is installed for Windows VMs. For information on how to download and install the latest VirtIO package, see Installing or Upgrading Nutanix VirtIO for Windows.
    Note: No additional configuration is required at this stage.
  • The VM has more than one vCPU.
  • The workloads are multi-threaded.
Note: Multi-queue is enabled by default in current Linux distributions. For details, refer to the vendor-specific documentation for your Linux distribution.
In addition to the multi-queue approach for storage I/O, you can also achieve the maximum network I/O performance by using the multi-queue approach for any vNICs in the system. For information about how to enable multi-queue and set an optimum number of queues, see Enabling RSS Virtio-Net Multi-Queue by Increasing the Number of VNIC Queues.
Note: Ensure that the guest operating system fully supports multi-queue before you enable it. For details, refer to the vendor-specific documentation for your Linux distribution.

Acropolis Dynamic Scheduling in AHV

Acropolis Dynamic Scheduling (ADS) proactively monitors your cluster for any compute and storage I/O contentions or hotspots over a period of time. If ADS detects a problem, ADS creates a migration plan that eliminates hotspots in the cluster by migrating VMs from one host to another.

You can monitor VM migration tasks from the Task dashboard of the Prism Element web console.

Following are the advantages of ADS:

  • ADS improves the initial placement of the VMs depending on the VM configuration.
  • Nutanix Volumes uses ADS for balancing sessions of the externally available iSCSI targets.
Note: ADS honors all the configured host affinities, VM-host affinities, VM-VM antiaffinity policies, and HA policies.

By default, ADS is enabled, and Nutanix recommends that you keep this feature enabled. For information about how to disable ADS, see Disabling Acropolis Dynamic Scheduling. For information about how to enable ADS again after you have disabled it, see Enabling Acropolis Dynamic Scheduling.

ADS monitors the following resources:

  • VM CPU Utilization: Total CPU usage of each guest VM.
  • Storage CPU Utilization: Storage controller (Stargate) CPU usage per VM or iSCSI target.

ADS does not monitor memory and networking usage.

How Acropolis Dynamic Scheduling Works

Lazan is the ADS service in an AHV cluster. AOS selects a Lazan manager and Lazan solver among the hosts in the cluster to effectively manage ADS operations.

ADS performs the following tasks to resolve compute and storage I/O contentions or hotspots:

  • The Lazan manager gathers statistics from the components it monitors.
  • The Lazan solver (runner) checks the statistics for potential anomalies and determines how to resolve them, if possible.
  • The Lazan manager invokes the tasks (for example, VM migrations) to resolve the situation.
Note:
  • During migration, a VM consumes resources on both the source and destination hosts as the High Availability (HA) reservation algorithm must protect the VM on both hosts. If a migration fails due to lack of free resources, turn off some VMs so that migration is possible.
  • If a problem is detected and ADS cannot solve the issue (for example, because of limited CPU or storage resources), the migration plan might fail. In these cases, an alert is generated. Monitor these alerts from the Alerts dashboard of the Prism Element web console and take necessary remedial actions.
  • If the host, firmware, or AOS upgrade is in progress and if any resource contention occurs during the upgrade period, ADS does not perform any resource contention rebalancing.

When Is a Hotspot Detected?

Lazan runs every 15 minutes and analyzes the resource usage for at least that period of time. If the resource utilization of an AHV host remains above 85% for a span of 15 minutes, Lazan triggers migration tasks to remove the hotspot.

Note: For a storage hotspot, ADS looks at the last 40 minutes of data and uses a smoothing algorithm to use the most recent data. For a CPU hotspot, ADS looks at the last 10 minutes of data only, that is, the average CPU usage over the last 10 minutes.
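
As a rough illustration of the CPU rule in the note above, the following sketch averages the samples from the last 10 minutes and compares the result with the 85% threshold. It is not Lazan's code: the function name, the `(minutes_ago, cpu_pct)` sample format, and the handling of an empty window are assumptions, and the storage smoothing algorithm is not modeled.

```python
from statistics import mean

HOTSPOT_THRESHOLD_PCT = 85.0  # threshold stated in this guide
CPU_WINDOW_MINUTES = 10       # CPU hotspots use the last 10 minutes of data


def is_cpu_hotspot(samples, window_minutes=CPU_WINDOW_MINUTES,
                   threshold=HOTSPOT_THRESHOLD_PCT):
    """samples: list of (minutes_ago, cpu_pct) tuples for one host (hypothetical format)."""
    # Keep only the samples that fall inside the CPU window.
    recent = [pct for minutes_ago, pct in samples if minutes_ago <= window_minutes]
    if not recent:
        return False  # assumption: no recent data means no hotspot is reported
    # A hotspot is flagged when the average usage over the window exceeds the threshold.
    return mean(recent) > threshold
```

For example, samples of 95%, 90%, and 88% within the window average 91% and are flagged, while an older sample outside the 10-minute window does not influence the result.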

If there is an obvious hotspot but the VMs did not migrate, the possible reasons are as follows:

  • Lazan cannot resolve a hotspot. For example:
    • A large VM (16 vCPUs) runs at 100% usage and accounts for 75% of the usage of its AHV host (which is also at 100% usage).
    • The other hosts are loaded at approximately 40% usage.

    In these situations, the other hosts cannot accommodate the large VM without causing contention there as well. Lazan does not prioritize one host or VM over others for contention, so it leaves the VM where it is hosted.

  • The number of all-flash nodes in the cluster is less than the replication factor.

    If the cluster has an RF2 configuration, the cluster must have a minimum of two all-flash nodes for successful migration of VMs on all the all-flash nodes.
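
The first reason above (a VM that is too large to place elsewhere) comes down to simple arithmetic, sketched here. The function name is hypothetical, and the sketch assumes that all hosts have equal capacity so that usage percentages can be added directly; the actual solver considers more than this.

```python
THRESHOLD_PCT = 85.0  # hotspot threshold stated in this guide


def hosts_that_can_accept(vm_load_pct, other_host_loads_pct, threshold=THRESHOLD_PCT):
    """Return the indexes of hosts that stay at or below the threshold after receiving the VM.

    Simplification: hosts are assumed to be identical, so percentages add directly.
    """
    return [
        i for i, load in enumerate(other_host_loads_pct)
        if load + vm_load_pct <= threshold
    ]


# The example from the text: the VM accounts for 75% of a host, other hosts are at 40%.
# 40 + 75 = 115, which is above 85, so no host qualifies and the VM stays where it is.
print(hosts_that_can_accept(75.0, [40.0, 40.0, 40.0]))
```

With a smaller VM (for example, 30% of a host), the same hosts at 40% would end up at 70% and could accept it.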

Migrations Audit

Prism Central displays the list of all the VM migration operations generated by ADS. In Prism Central, go to Menu -> Activity -> Audits to display the VM migrations list. You can filter the migrations by clicking Filters and selecting Migrate in the Operation Type tab. The list displays all the VM migration tasks created by ADS with details such as the source and target host, VM name, and time of migration.

Disabling Acropolis Dynamic Scheduling

Perform the procedure described in this topic to disable ADS. Nutanix recommends you keep ADS enabled.

Procedure

  1. Log on to a Controller VM in your cluster with SSH.
  2. Disable ADS.
    nutanix@cvm$ acli ads.update enable=false

    No action is taken by ADS to solve the contentions after you disable the ADS feature. You must manually take the remedial actions or you can enable the feature.

Enabling Acropolis Dynamic Scheduling

If you have disabled the ADS feature and want to enable the feature, perform the following procedure.

Procedure

  1. Log on to a Controller VM in your cluster with SSH.
  2. Enable ADS.
    nutanix@cvm$ acli ads.update enable=true

Virtualization Management Web Console Interface

You can manage the virtualization management features by using the Prism GUI (Prism Element and Prism Central web consoles).

You can do the following by using the Prism web consoles:

  • Configure network connections
  • Create virtual machines
  • Manage virtual machines (launch console, start/shut down, take snapshots, migrate, clone, update, and delete)
  • Monitor virtual machines
  • Enable VM high availability

See Prism Web Console Guide and Prism Central Guide for more information.

Viewing the AHV Version on Prism Element

You can see the AHV version installed in the Prism Element web console.

About this task

To view the AHV version installed on the host, do the following.

Procedure

  1. Log on to the Prism web console.
  2. The Hypervisor Summary widget on the top left side of the Home page displays the AHV version.
    Figure. LCM Page Displays AHV Version

Viewing the AHV Version on Prism Central

You can see the AHV version installed in the Prism Central console.

About this task

To view the AHV version installed on any host in the clusters managed by the Prism Central, do the following.

Procedure

  1. Log on to Prism Central.
  2. In the sidebar, select Hardware > Hosts > Summary tab.
  3. Click the host you want to see the hypervisor version for.
  4. The Host detail view page displays the Properties widget that lists the Hypervisor Version.
    Figure. Hypervisor Version in Host Detail View

Node Management

Nonconfigurable AHV Components

The components listed here are configured by the Nutanix manufacturing and installation processes. Do not modify any of these components except under the direction of Nutanix Support.

Nutanix Software

Modifying any of the following Nutanix software settings may inadvertently constrain performance of your Nutanix cluster or render the Nutanix cluster inoperable.

  • Local datastore name.
  • Configuration and contents of any CVM (except memory configuration to enable certain features).
Important: Note the following important considerations about CVMs.
  • Do not delete the Nutanix CVM.
  • Do not take a snapshot of the CVM for backup.
  • Do not rename, modify, or delete the admin and nutanix user accounts of the CVM.
  • Do not create additional CVM user accounts.

    Use the default accounts (admin or nutanix), or use sudo to elevate to the root account.

  • Do not decrease CVM memory below recommended minimum amounts required for cluster and add-in features.

    Nutanix Cluster Checks (NCC), preupgrade cluster checks, and the AOS upgrade process detect and monitor CVM memory.

  • Nutanix does not support the usage of third-party storage on hosts that are part of Nutanix clusters.

    Normal cluster operations might be affected if there are connectivity issues with the third-party storage you attach to the hosts in a Nutanix cluster.

  • Do not run any commands on a CVM that are not in the Nutanix documentation.

AHV Settings

Nutanix AHV is a cluster-optimized hypervisor appliance.

Alteration of the hypervisor appliance (unless advised by Nutanix Technical Support) is unsupported and may result in the hypervisor or VMs functioning incorrectly.

Unsupported alterations include (but are not limited to):

  • Hypervisor configuration, including installed packages
  • Controller VM virtual hardware configuration file (.xml file). Each AOS version and upgrade includes a specific Controller VM virtual hardware configuration. Therefore, do not edit or otherwise modify the Controller VM virtual hardware configuration file.
  • iSCSI settings
  • Open vSwitch settings

  • Installation of third-party software not approved by Nutanix
  • Installation or upgrade of software packages from non-Nutanix sources (using yum, rpm, or similar)
  • Taking snapshots of the Controller VM
  • Creating user accounts on AHV hosts
  • Changing the timezone of the AHV hosts. By default, the timezone of an AHV host is set to UTC.
  • Joining AHV hosts to Active Directory or OpenLDAP domains

Controller VM Access

Although each host in a Nutanix cluster runs a hypervisor independent of other hosts in the cluster, some operations affect the entire cluster.

Most administrative functions of a Nutanix cluster can be performed through the web console (Prism). However, some management tasks require access to the Controller VM (CVM) over SSH. Nutanix recommends restricting CVM SSH access with password or key authentication.

This topic provides information about how to access the Controller VM as an admin user and nutanix user.

admin User Access

Use the admin user access for all tasks and operations that you must perform on the controller VM. As an admin user with default credentials, you cannot access nCLI. You must change the default password before you can use nCLI. Nutanix recommends that you do not create additional CVM user accounts. Use the default accounts (admin or nutanix), or use sudo to elevate to the root account.

For more information about admin user access, see Admin User Access to Controller VM.

nutanix User Access

Nutanix strongly recommends that you do not use the nutanix user access unless the procedure (as provided in a Nutanix Knowledge Base article or user guide) specifically requires the use of the nutanix user access.

For more information about nutanix user access, see Nutanix User Access to Controller VM.

You can perform most administrative functions of a Nutanix cluster through the Prism web consoles or REST API. Nutanix recommends using these interfaces whenever possible and disabling Controller VM SSH access with password or key authentication. Some functions, however, require logging on to a Controller VM with SSH. Exercise caution whenever connecting directly to a Controller VM as it increases the risk of causing cluster issues.

Warning: When you connect to a Controller VM with SSH, ensure that the SSH client does not import or change any locale settings. The Nutanix software is not localized, and running the commands with any locale other than en_US.UTF-8 can cause severe cluster issues.

To check the locale used in an SSH session, run /usr/bin/locale. If any environment variables are set to anything other than en_US.UTF-8, reconnect with an SSH configuration that does not import or change any locale settings.
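
The check described above can be expressed as a small helper that scans the text printed by /usr/bin/locale. This is only a sketch: the function name is hypothetical, it works on output that you paste or capture yourself, and it assumes that an empty value (for example, `LC_ALL=`) means the variable is not set.

```python
def non_default_locale_vars(locale_output, expected="en_US.UTF-8"):
    """Return {variable: value} for locale variables set to something other than `expected`."""
    offending = {}
    for line in locale_output.splitlines():
        if "=" not in line:
            continue  # skip anything that is not a NAME=value line
        name, _, value = line.partition("=")
        # The locale command wraps derived values in double quotes; remove them.
        value = value.strip().strip('"')
        # Assumption: an empty value means "not set" and is ignored.
        if value and value != expected:
            offending[name.strip()] = value
    return offending


sample = 'LANG=en_US.UTF-8\nLC_CTYPE="en_US.UTF-8"\nLC_TIME="de_DE.UTF-8"\nLC_ALL=\n'
print(non_default_locale_vars(sample))  # reports LC_TIME only
```

If the returned dictionary is not empty, reconnect with an SSH configuration that does not import or change locale settings, as described above.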

Admin User Access to Controller VM

You can access the Controller VM as the admin user (admin user name and password) with SSH. For security reasons, the password of the admin user must meet Controller VM Password Complexity Requirements. When you log on to the Controller VM as the admin user for the first time, you are prompted to change the default password.

See Controller VM Password Complexity Requirements to set a secure password.

After you have successfully changed the password, the new password is synchronized across all Controller VMs and interfaces (Prism web console, nCLI, and SSH).

Note:
  • As an admin user, you cannot access nCLI by using the default credentials. If you are logging in as the admin user for the first time, you must log on through the Prism web console or SSH to the Controller VM. Also, you cannot change the default password of the admin user through nCLI. To change the default password of the admin user, you must log on through the Prism web console or SSH to the Controller VM.
  • When you make an attempt to log in to the Prism web console for the first time after you upgrade to AOS 5.1 from an earlier AOS version, you can use your existing admin user password to log in and then change the existing password (you are prompted) to adhere to the password complexity requirements. However, if you are logging in to the Controller VM with SSH for the first time after the upgrade as the admin user, you must use the default admin user password (Nutanix/4u) and then change the default password (you are prompted) to adhere to the Controller VM Password Complexity Requirements.
  • You cannot delete the admin user account.
  • The default password expiration age for the admin user is 60 days. You can configure the minimum and maximum password expiration days based on your security requirement.
    • nutanix@cvm$ sudo chage -M MAX-DAYS admin
    • nutanix@cvm$ sudo chage -m MIN-DAYS admin

When you change the admin user password, you must update any applications and scripts using the admin user credentials for authentication. Nutanix recommends that you create a user assigned with the admin role instead of using the admin user for authentication. The Prism Web Console Guide describes authentication and roles.

Following are the default credentials to access a Controller VM.

Table 1. Controller VM Credentials
Interface          Target                 User Name  Password
SSH client         Nutanix Controller VM  admin      Nutanix/4u
                                          nutanix    nutanix/4u
Prism web console  Nutanix Controller VM  admin      Nutanix/4u

Accessing the Controller VM Using the Admin User Account

About this task

Perform the following procedure to log on to the Controller VM by using the admin user with SSH for the first time.

Procedure

  1. Log on to the Controller VM with SSH by using the management IP address of the Controller VM and the following credentials.
    • User name: admin
    • Password: Nutanix/4u
    You are now prompted to change the default password.
  2. Respond to the prompts, providing the current and new admin user password.
    Changing password for admin.
    Old Password:
    New password:
    Retype new password:
    Password changed.
    

    See the requirements listed in Controller VM Password Complexity Requirements to set a secure password.

    For information about logging on to a Controller VM by using the admin user account through the Prism web console, see Logging Into The Web Console in the Prism Web Console Guide.

Nutanix User Access to Controller VM

You can access the Controller VM as the nutanix user (nutanix user name and password) with SSH. For security reasons, the password of the nutanix user must meet the Controller VM Password Complexity Requirements. When you log on to the Controller VM as the nutanix user for the first time, you are prompted to change the default password.

See Controller VM Password Complexity Requirements to set a secure password.

After you have successfully changed the password, the new password is synchronized across all Controller VMs and interfaces (Prism web console, nCLI, and SSH).

Note:
  • As a nutanix user, you cannot access nCLI by using the default credentials. If you are logging in as the nutanix user for the first time, you must log on through the Prism web console or SSH to the Controller VM. Also, you cannot change the default password of the nutanix user through nCLI. To change the default password of the nutanix user, you must log on through the Prism web console or SSH to the Controller VM.

  • When you make an attempt to log in to the Prism web console for the first time after you upgrade the AOS from an earlier AOS version, you can use your existing nutanix user password to log in and then change the existing password (you are prompted) to adhere to the password complexity requirements. However, if you are logging in to the Controller VM with SSH for the first time after the upgrade as the nutanix user, you must use the default nutanix user password (nutanix/4u) and then change the default password (you are prompted) to adhere to the Controller VM Password Complexity Requirements.

  • You cannot delete the nutanix user account.
  • You can configure the minimum and maximum password expiration days based on your security requirement.
    • nutanix@cvm$ sudo chage -M MAX-DAYS nutanix
    • nutanix@cvm$ sudo chage -m MIN-DAYS nutanix

When you change the nutanix user password, you must update any applications and scripts using the nutanix user credentials for authentication. Nutanix recommends that you create a user assigned with the nutanix role instead of using the nutanix user for authentication. The Prism Web Console Guide describes authentication and roles.

Following are the default credentials to access a Controller VM.

Table 1. Controller VM Credentials
Interface          Target                 User Name  Password
SSH client         Nutanix Controller VM  admin      Nutanix/4u
                                          nutanix    nutanix/4u
Prism web console  Nutanix Controller VM  admin      Nutanix/4u

Accessing the Controller VM Using the Nutanix User Account

About this task

Perform the following procedure to log on to the Controller VM by using the nutanix user with SSH for the first time.

Procedure

  1. Log on to the Controller VM with SSH by using the management IP address of the Controller VM and the following credentials.
    • User name: nutanix
    • Password: nutanix/4u
    You are now prompted to change the default password.
  2. Respond to the prompts, providing the current and new nutanix user password.
    Changing password for nutanix.
    Old Password:
    New password:
    Retype new password:
    Password changed.
    

    See Controller VM Password Complexity Requirements to set a secure password.

    For information about logging on to a Controller VM by using the nutanix user account through the Prism web console, see Logging Into The Web Console in the Prism Web Console Guide.

Controller VM Password Complexity Requirements

The password must meet the following complexity requirements:

  • At least eight characters long.
  • At least one lowercase letter.
  • At least one uppercase letter.
  • At least one number.
  • At least one special character.
    Note: Ensure that the following conditions are met for the special characters usage in the CVM password:
    • The special characters are used appropriately when you set the CVM password. In some cases, for example when you use ! followed by a number in the CVM password, the shell treats the sequence as special and may replace it with a command from the bash history. As a result, the password that is set may differ from the password that you intended to set.
    • The special characters used in the CVM password are ASCII printable characters only. For information about ASCII printable characters, refer to the ASCII printable characters (character code 32-127) article on the ASCII code website.
  • At least four characters difference from the old password.
  • Must not be among the last 5 passwords.
  • Must not have more than 2 consecutive occurrences of a character.
  • Must not be longer than 199 characters.
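
Most of the rules above can be checked before you type a candidate password at the prompt. The following sketch is an illustration only, not the check that the CVM performs: the function name is hypothetical, "special character" is assumed to mean ASCII punctuation, and the "four characters difference from the old password" rule is not implemented because the guide does not define how the difference is measured.

```python
import re
import string


def cvm_password_problems(password, previous_passwords=()):
    """Return a list of rule violations; an empty list means the checked rules pass."""
    problems = []
    if len(password) < 8:
        problems.append("shorter than 8 characters")
    if len(password) > 199:
        problems.append("longer than 199 characters")
    if not re.search(r"[a-z]", password):
        problems.append("no lowercase letter")
    if not re.search(r"[A-Z]", password):
        problems.append("no uppercase letter")
    if not re.search(r"[0-9]", password):
        problems.append("no number")
    # Assumption: "special character" means ASCII punctuation.
    if not any(c in string.punctuation for c in password):
        problems.append("no special character")
    # Character code range 32-127 as cited in this guide.
    if any(not (32 <= ord(c) <= 127) for c in password):
        problems.append("character outside the ASCII range 32-127")
    # Three identical characters in a row exceed the limit of 2 consecutive occurrences.
    if re.search(r"(.)\1\1", password):
        problems.append("a character occurs more than 2 times in a row")
    # previous_passwords is a hypothetical list, oldest first.
    if password in list(previous_passwords)[-5:]:
        problems.append("matches one of the last 5 passwords")
    return problems
```

A result such as `["no uppercase letter"]` indicates which rule to address. An empty list means only that the rules modeled here are satisfied.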

AHV Host Access

You can perform most of the administrative functions of a Nutanix cluster using the Prism web consoles or REST API. Nutanix recommends using these interfaces whenever possible. Some functions, however, require logging on to an AHV host with SSH.

Note: From AOS 5.15.5 with AHV 20190916.410 onwards, AHV has two new user accounts—admin and nutanix.

Nutanix provides the following users to access the AHV host:

  • root—It is used internally by the AOS. The root user is used for the initial access and configuration of the AHV host.
  • admin—It is used to log on to an AHV host. The admin user is recommended for accessing the AHV host.
  • nutanix—It is used internally by the AOS and must not be used for interactive logon.

Exercise caution whenever connecting directly to an AHV host as it increases the risk of causing cluster issues.

Following are the default credentials to access an AHV host:

Table 1. AHV Host Credentials
Interface   Target    User Name  Password
SSH client  AHV Host  root       nutanix/4u
                      admin      There is no default password for admin. You must set it during the initial configuration.
                      nutanix    nutanix/4u

Initial Configuration

About this task

The AHV host is shipped with the default password for the root and nutanix users, which must be changed using SSH when you log on to the AHV host for the first time. After changing the default passwords and the admin password, all subsequent logins to the AHV host must be with the admin user.

Perform the following procedure to change the admin user account password for the first time:
Note: Perform this initial configuration on all the AHV hosts.

Procedure

  1. Use SSH and log on to the AHV host using the root account.
    $ ssh root@<AHV Host IP Address>
    Nutanix AHV
    root@<AHV Host IP Address> password: # default password nutanix/4u
    
  2. Change the default root user password.
    root@ahv# passwd root
    Changing password for user root.
    New password: 
    Retype new password: 
    passwd: all authentication tokens updated successfully.
    
  3. Change the default nutanix user password.
    root@ahv# passwd nutanix
    Changing password for user nutanix.
    New password: 
    Retype new password: 
    passwd: all authentication tokens updated successfully.
    
  4. Change the admin user password.
    root@ahv# passwd admin
    Changing password for user admin.
    New password: 
    Retype new password: 
    passwd: all authentication tokens updated successfully.
    

Accessing the AHV Host Using the Admin Account

About this task

After setting the admin password in the Initial Configuration, use the admin user for all subsequent logins.

Perform the following procedure to log on to the AHV host by using the admin user with SSH.

Procedure

  1. Log on to the AHV host with SSH using the admin account.
    $ ssh admin@<AHV Host IP Address>
    Nutanix AHV
    
  2. Enter the admin user password configured in the Initial Configuration.
    admin@<AHV Host IP Address> password:
  3. Prefix the commands with sudo if privileged access is required.
    $ sudo ls /var/log

Changing Admin User Password

About this task

Perform these steps to change the admin password on every AHV host in the cluster:

Procedure

  1. Log on to the AHV host using the admin account with SSH.
  2. Enter the admin user password configured in the Initial Configuration.
  3. Run the following command to change the admin user password.
    $ sudo passwd admin
  4. Respond to the prompts and provide the new password.
    [sudo] password for admin: 
    Changing password for user admin.
    New password: 
    Retype new password: 
    passwd: all authentication tokens updated successfully.
    
    Note: Repeat this step for each AHV host.

    See AHV Host Password Complexity Requirements to set a secure password.

Changing the Root User Password

About this task

Perform these steps to change the root password on every AHV host in the cluster:

Procedure

  1. Log on to the AHV host using the admin account with SSH.
  2. Run the sudo command to change to the root user.
  3. Change the root password.
    root@ahv# passwd root
  4. Respond to the prompts and provide the current and new root password.
    Changing password for root.
    New password:
    Retype new password:
    passwd: all authentication tokens updated successfully.
    
    Note: Repeat this step for each AHV host.

    See AHV Host Password Complexity Requirements to set a secure password.

Changing Nutanix User Password

About this task

Perform these steps to change the nutanix password on every AHV host in the cluster:

Procedure

  1. Log on to the AHV host using the admin account with SSH.
  2. Run the sudo command to change to the root user.
  3. Change the nutanix password.
    root@ahv# passwd nutanix
  4. Respond to the prompts and provide the current and new nutanix password.
    Changing password for nutanix.
    New password:
    Retype new password:
    passwd: all authentication tokens updated successfully.
    
    Note: Repeat this step for each AHV host.

    See AHV Host Password Complexity Requirements to set a secure password.

AHV Host Password Complexity Requirements

The password you choose must meet the following complexity requirements:

  • In configurations with high-security requirements, the password must contain:
    • At least 15 characters.
    • At least one upper case letter (A–Z).
    • At least one lower case letter (a–z).
    • At least one digit (0–9).
    • At least one printable ASCII special (non-alphanumeric) character. For example, a tilde (~), exclamation point (!), at sign (@), number sign (#), or dollar sign ($).
    • At least eight characters different from the previous password.
    • At most three consecutive occurrences of any given character.
    • At most four consecutive occurrences of any given class.

The password cannot be the same as the last 5 passwords.

  • In configurations without high-security requirements, the password must contain:
    • At least eight characters.
    • At least one upper case letter (A–Z).
    • At least one lower case letter (a–z).
    • At least one digit (0–9).
    • At least one printable ASCII special (non-alphanumeric) character. For example, a tilde (~), exclamation point (!), at sign (@), number sign (#), or dollar sign ($).
    • At least three characters different from the previous password.
    • At most three consecutive occurrences of any given character.

The password cannot be the same as the last 5 passwords.

In both types of configuration, if a password for an account is entered three times unsuccessfully within a 15-minute period, the account is locked for 15 minutes.
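
The lockout rule above can be modeled as a sliding window over failed attempts. The sketch below is an illustration, not the mechanism that AHV uses: the class name is hypothetical, timestamps are plain seconds, and it assumes that the 15-minute lock starts at the third failed attempt.

```python
from collections import deque

MAX_FAILURES = 3          # failed attempts that trigger a lock
WINDOW_SECONDS = 15 * 60  # period in which the failures must occur
LOCK_SECONDS = 15 * 60    # duration of the lock


class LockoutTracker:
    """Tracks failed logins for one account; timestamps are in seconds."""

    def __init__(self):
        self._failures = deque()
        self._locked_until = None

    def is_locked(self, now):
        return self._locked_until is not None and now < self._locked_until

    def record_failure(self, now):
        # Drop failures that are no longer inside the 15-minute window.
        while self._failures and now - self._failures[0] >= WINDOW_SECONDS:
            self._failures.popleft()
        self._failures.append(now)
        if len(self._failures) >= MAX_FAILURES:
            # Assumption: the lock starts at the third failure.
            self._locked_until = now + LOCK_SECONDS
            self._failures.clear()
```

In this model, three failures at 0, 60, and 120 seconds lock the account until second 1020, while three failures spread over more than 15 minutes do not lock it.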

Verifying the Cluster Health

Before you perform operations such as restarting a CVM or AHV host and putting an AHV host into maintenance mode, check if the cluster can tolerate a single-node failure.

Before you begin

Ensure that you are running the most recent version of NCC.

About this task

Note: If you see any critical alerts, resolve the issues by referring to the indicated KB articles. If you are unable to resolve any issues, contact Nutanix Support.

Perform the following steps to avoid unexpected downtime or performance issues.

Procedure

  1. Review and resolve any critical alerts. Do one of the following:
    • In the Prism Element web console, go to the Alerts page.
    • Log on to a Controller VM (CVM) with SSH and display the alerts.
      nutanix@cvm$ ncli alert ls
    Note: If you receive alerts indicating expired encryption certificates or a key manager is not reachable, resolve these issues before you shut down the cluster. If you do not resolve these issues, data loss of the cluster might occur.
  2. Verify if the cluster can tolerate a single-node failure. Do one of the following:
    • In the Prism Element web console, in the Home page, check the status of the Data Resiliency Status dashboard.

      Verify that the status is OK. If the status is anything other than OK, resolve the indicated issues before you perform any maintenance activity.

    • Log on to a Controller VM (CVM) with SSH and check the fault tolerance status of the cluster.
      nutanix@cvm$ ncli cluster get-domain-fault-tolerance-status type=node
      

      An output similar to the following is displayed:

          Domain Type               : NODE
          Component Type            : STATIC_CONFIGURATION
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 14:22:09 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : ERASURE_CODE_STRIP_SIZE
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 13:19:58 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : METADATA
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Mon Sep 28 14:35:25 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : ZOOKEEPER
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Thu Sep 17 11:09:39 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : EXTENT_GROUPS
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 13:19:58 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : OPLOG
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 13:19:58 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : FREE_SPACE
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 14:20:57 GMT+05:00 2015
      

      The value of the Current Fault Tolerance field must be at least 1 for every component type listed in the output.
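The check on the Current Fault Tolerance values can be scripted. The following is a minimal sketch, assuming the ncli output keeps the "Field : value" layout shown in the sample above. The low_fault_tolerance function is a hypothetical helper, not part of ncli.

```shell
# Hypothetical helper (not part of ncli): reads the output of
#   ncli cluster get-domain-fault-tolerance-status type=node
# on stdin and prints every component whose Current Fault Tolerance is below 1.
# Returns non-zero if any such component is found.
low_fault_tolerance() {
    awk -F: '
        BEGIN { bad = 0 }
        /Component Type/ {
            comp = $2
            gsub(/[ \t]/, "", comp)      # strip padding around the value
        }
        /Current Fault Tolerance/ {
            ft = $2
            gsub(/[ \t]/, "", ft)
            if (ft + 0 < 1) { print comp; bad = 1 }
        }
        END { exit bad }
    '
}

# Possible usage on a CVM:
#   ncli cluster get-domain-fault-tolerance-status type=node | low_fault_tolerance
```

An empty result with a zero exit status suggests that every listed component can tolerate one node failure. Any printed component name needs attention before you start maintenance.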

Putting a Node into Maintenance Mode

You might need to put a node into maintenance mode in certain situations, such as when you change the network configuration of a node or perform manual firmware upgrades.

Before you begin

Caution: Verify the data resiliency status of your cluster. If the cluster uses only replication factor 2 (RF2), you can shut down only one node at a time. If you need to shut down more than one node in an RF2 cluster, shut down the entire cluster.

About this task

When you put a host into maintenance mode, AOS marks the host as unschedulable so that no new VM instances are created on it. AOS then attempts to evacuate the VMs from the host.

If the evacuation attempt fails, the host remains in the "entering maintenance mode" state, where it is marked unschedulable, waiting for user remediation. You can shut down VMs on the host or move them to other nodes. Once the host has no more running VMs, it is in maintenance mode.

When a host enters maintenance mode, its VMs are moved to other hosts in the cluster. After the host exits maintenance mode, those VMs are automatically returned to the original host, so you do not need to move them manually.

VMs with GPU, CPU passthrough, PCI passthrough, and host affinity policies are not migrated to other hosts in the cluster. You can choose to shut down such VMs while putting the node into maintenance mode.

Agent VMs are always shut down if you put a node in maintenance mode and are powered on again after exiting maintenance mode.

Perform the following steps to put the node into maintenance mode.

Procedure

  1. Use SSH to log on to a Controller VM in the cluster.
  2. Determine the IP address of the node that you want to put into maintenance mode.
    nutanix@cvm$ acli host.list

    Note the value of Hypervisor IP for the node that you want to put in maintenance mode.

  3. Put the node into maintenance mode.
    nutanix@cvm$ acli host.enter_maintenance_mode host-IP-address [wait="{ true | false }" ] [non_migratable_vm_action="{ acpi_shutdown | block }" ]
    Note: Never put Controller VM and AHV hosts into maintenance mode on single-node clusters. It is recommended to shut down guest VMs before proceeding with disruptive changes.

    Replace host-IP-address with either the IP address or host name of the AHV host that you want to put into maintenance mode.

    The following are optional parameters for running the acli host.enter_maintenance_mode command:

    • wait: Set the wait parameter to true to wait for the host evacuation attempt to finish.
    • non_migratable_vm_action: By default the non_migratable_vm_action parameter is set to block, which means VMs with GPU, CPU passthrough, PCI passthrough, and host affinity policies are not migrated or shut down when you put a node into maintenance mode.

      If you want to automatically shut down such VMs, set the non_migratable_vm_action parameter to acpi_shutdown.

  4. Verify if the host is in the maintenance mode.
    nutanix@cvm$ acli host.get host-ip

    In the output that is displayed, ensure that node_state is EnteredMaintenanceMode and schedulable is False.

    Do not continue if the host has failed to enter the maintenance mode.

  5. See Verifying the Cluster Health to once again check if the cluster can tolerate a single-node failure.
  6. Determine the ID of the host.
    nutanix@cvm$ ncli host list

    An output similar to the following is displayed:

    Id                        : aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee::1234
    Uuid                      : ffffffff-gggg-hhhh-iiii-jjjjjjjjjjj
    Name                      : XXXXXXXXXXX-X
    IPMI Address              : X.X.Z.3
    Controller VM Address     : X.X.X.1
    Hypervisor Address        : X.X.Y.2

    In this example, the host ID is 1234.

  7. Put the CVM into the maintenance mode.
    nutanix@cvm$ ncli host edit id=host-ID enable-maintenance-mode=true

    Replace host-ID with the ID of the host.

    This step prevents the CVM services from being affected by any connectivity issues.

    Wait for a few minutes until the CVM is put into the maintenance mode.

  8. Verify if the CVM is in the maintenance mode.

    Run the following command on the CVM that you put in the maintenance mode.

    nutanix@cvm$ genesis status | grep -v "\[\]"

    An output similar to the following is displayed:

    nutanix@cvm$ genesis status | grep -v "\[\]"
    2021-09-24 05:28:03.827628: Services running on this node:
      genesis: [11189, 11390, 11414, 11415, 15671, 15672, 15673, 15676]
      scavenger: [27241, 27525, 27526, 27527]
      xmount: [25915, 26055, 26056, 26074]
      zookeeper: [13053, 13101, 13102, 13103, 13113, 13130]
    nutanix@cvm$ 

    Only the genesis, scavenger, xmount, and zookeeper services must be running (the process IDs are displayed next to each service name).

    Do not continue if the CVM has failed to enter the maintenance mode, because it can cause a service interruption.
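The final verification in the procedure above can be automated. The following is a minimal sketch, assuming genesis status prints one "name: [PIDs]" line per service as in the sample output, with an empty list for a stopped service. The unexpected_services function is a hypothetical helper.

```shell
# Hypothetical helper: reads `genesis status` output on stdin, prints any
# running service other than the four expected in maintenance mode, and
# returns non-zero if one is found.
unexpected_services() {
    awk '
        BEGIN { bad = 0 }
        # Service lines look like "  name: [pid, pid, ...]"; "[]" means stopped.
        /^[ \t]*[A-Za-z0-9_]+: \[[0-9]/ {
            name = $1
            sub(/:$/, "", name)
            if (name !~ /^(genesis|scavenger|xmount|zookeeper)$/) {
                print name
                bad = 1
            }
        }
        END { exit bad }
    '
}

# Possible usage on the CVM that you put into maintenance mode:
#   genesis status | unexpected_services
```

If the function prints nothing and returns zero, no service outside the expected four reported a process ID.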

What to do next

Perform the maintenance activity. Once the maintenance activity is complete, remove the node from the maintenance mode. See Exiting a Node from the Maintenance Mode for more information.

Exiting a Node from the Maintenance Mode

After you perform any maintenance activity, exit the node from the maintenance mode.

About this task

Perform the following to exit the host from the maintenance mode.

Procedure

  1. Remove the CVM from the maintenance mode.
    1. Determine the ID of the host.
      nutanix@cvm$ ncli host list

      An output similar to the following is displayed:

      Id                        : aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee::1234  
      Uuid                      : ffffffff-gggg-hhhh-iiii-jjjjjjjjjjj 
      Name                      : XXXXXXXXXXX-X 
      IPMI Address              : X.X.Z.3 
      Controller VM Address     : X.X.X.1 
      Hypervisor Address        : X.X.Y.2
      

      In this example, the host ID is 1234.

    2. From any other CVM in the cluster, run the following command to exit the CVM from the maintenance mode.
      nutanix@cvm$ ncli host edit id=host-ID enable-maintenance-mode=false

      Replace host-ID with the ID of the host.

      Note: The command fails if you run the command from the CVM that is in the maintenance mode.
    3. Verify if all processes on all the CVMs are in the UP state.
      nutanix@cvm$ cluster status | grep -v UP
    Do not continue if the CVM has failed to exit the maintenance mode.
  2. Remove the AHV host from the maintenance mode.
    1. From any CVM in the cluster, run the following command to exit the AHV host from the maintenance mode.
      nutanix@cvm$ acli host.exit_maintenance_mode host-ip 
      

      Replace host-ip with the IP address of the host.

      This command migrates (live migration) all the VMs that were previously running on the host back to the host.

    2. Verify if the host has exited the maintenance mode.
      nutanix@cvm$ acli host.get host-ip 

      In the output that is displayed, ensure that node_state is kAcropolisNormal or AcropolisNormal and schedulable is True.

    Contact Nutanix Support if any of the steps described in this document produce unexpected results.
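The host state check in the procedure above can be sketched as a small filter. This is an illustration only: it assumes that acli host.get prints node_state and schedulable as "key: value" lines (optionally with quoted values), which you should confirm against the output on your cluster. The host_exited_maintenance function is a hypothetical helper.

```shell
# Hypothetical helper: reads `acli host.get host-ip` output on stdin and
# succeeds only if the host is back in the normal, schedulable state.
# Assumes "key: value" lines; quotes around values are ignored.
host_exited_maintenance() {
    awk '
        { gsub(/"/, "") }                       # drop optional quotes
        $1 == "node_state:"  { state = $2 }
        $1 == "schedulable:" { sched = $2 }
        END {
            ok = (state == "kAcropolisNormal" || state == "AcropolisNormal") && sched == "True"
            exit (ok ? 0 : 1)
        }
    '
}

# Possible usage from any CVM:
#   acli host.get host-ip | host_exited_maintenance && echo "host is schedulable"
```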

Shutting Down a Node in a Cluster (AHV)

Before you begin

  • Caution: Verify the data resiliency status of your cluster. If the cluster uses only replication factor 2 (RF2), you can shut down only one node at a time. If you need to shut down more than one node in an RF2 cluster, shut down the entire cluster.

    See Verifying the Cluster Health to check if the cluster can tolerate a single-node failure. Do not proceed if the cluster cannot tolerate a single-node failure.

  • Put the node you want to shut down into maintenance mode.

    See Putting a Node into Maintenance Mode for instructions about how to put a node into maintenance mode.

    You can list all the hosts in the cluster by running the acli host.list command from a CVM. Note the value of Hypervisor IP for the node you want to shut down.

About this task

Perform the following procedure to shut down a node.

Procedure

  1. Using SSH, log on to the Controller VM on the host you want to shut down.
  2. Shut down the Controller VM.
    nutanix@cvm$ cvm_shutdown -P now

    Note: After you issue the cvm_shutdown command, it might take a few minutes before the CVM is powered off completely. After the cvm_shutdown command completes successfully, Nutanix recommends that you wait up to 4 minutes before shutting down the AHV host.
  3. Log on to the AHV host with SSH.
  4. Shut down the host.
    root@ahv# shutdown -h now

What to do next

See Starting a Node in a Cluster (AHV) for instructions about how to start a node, including how to start a CVM and how to exit a node from maintenance mode.

Starting a Node in a Cluster (AHV)

About this task

Procedure

  1. On the hardware appliance, power on the node. The CVM starts automatically when you reboot the node.
  2. If the node is in maintenance mode, log on (SSH) to the Controller VM and remove the node from maintenance mode.
  3. Log on to another CVM in the Nutanix cluster with SSH.
  4. Verify that the status of all services on all the CVMs is UP.
    nutanix@cvm$ cluster status
    If the Nutanix cluster is running properly, output similar to the following is displayed for each node in the Nutanix cluster.
    CVM: <host IP-Address> Up
                                    Zeus   UP       [9935, 9980, 9981, 9994, 10015, 10037]
                               Scavenger   UP       [25880, 26061, 26062]
                                  Xmount   UP       [21170, 21208]
                        SysStatCollector   UP       [22272, 22330, 22331]
                               IkatProxy   UP       [23213, 23262]
                        IkatControlPlane   UP       [23487, 23565]
                           SSLTerminator   UP       [23490, 23620]
                          SecureFileSync   UP       [23496, 23645, 23646]
                                  Medusa   UP       [23912, 23944, 23945, 23946, 24176]
                      DynamicRingChanger   UP       [24314, 24404, 24405, 24558]
                                  Pithos   UP       [24317, 24555, 24556, 24593]
                              InsightsDB   UP       [24322, 24472, 24473, 24583]
                                  Athena   UP       [24329, 24504, 24505]
                                 Mercury   UP       [24338, 24515, 24516, 24614]
                                  Mantle   UP       [24344, 24572, 24573, 24634]
                              VipMonitor   UP       [18387, 18464, 18465, 18466, 18474]
                                Stargate   UP       [24993, 25032]
                    InsightsDataTransfer   UP       [25258, 25348, 25349, 25388, 25391, 25393, 25396]
                                   Ergon   UP       [25263, 25414, 25415]
                                 Cerebro   UP       [25272, 25462, 25464, 25581]
                                 Chronos   UP       [25281, 25488, 25489, 25547]
                                 Curator   UP       [25294, 25528, 25529, 25585]
                                   Prism   UP       [25718, 25801, 25802, 25899, 25901, 25906, 25941, 25942]
                                     CIM   UP       [25721, 25829, 25830, 25856]
                            AlertManager   UP       [25727, 25862, 25863, 25990]
                                Arithmos   UP       [25737, 25896, 25897, 26040]
                                 Catalog   UP       [25749, 25989, 25991]
                               Acropolis   UP       [26011, 26118, 26119]
                                   Uhura   UP       [26037, 26165, 26166]
                                    Snmp   UP       [26057, 26214, 26215]
                       NutanixGuestTools   UP       [26105, 26282, 26283, 26299]
                              MinervaCVM   UP       [27343, 27465, 27466, 27730]
                           ClusterConfig   UP       [27358, 27509, 27510]
                                Aequitas   UP       [27368, 27567, 27568, 27600]
                             APLOSEngine   UP       [27399, 27580, 27581]
                                   APLOS   UP       [27853, 27946, 27947]
                                   Lazan   UP       [27865, 27997, 27999]
                                  Delphi   UP       [27880, 28058, 28060]
                                    Flow   UP       [27896, 28121, 28124]
                                 Anduril   UP       [27913, 28143, 28145]
                                   XTrim   UP       [27956, 28171, 28172]
                           ClusterHealth   UP       [7102, 7103, 27995, 28209, 28495, 28496, 28503, 28510,
    28573, 28574, 28577, 28594, 28595, 28597, 28598, 28602, 28603, 28604, 28607, 28645, 28646, 28648, 28792,
    28793, 28837, 28838, 28840, 28841, 28858, 28859, 29123, 29124, 29127, 29133, 29135, 29142, 29146, 29150,
    29161, 29162, 29163, 29179, 29187, 29219, 29268, 29273]
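Because the cluster status output is long, it can help to list only the services that are not UP. The following is a minimal sketch, assuming each service line has the form "name, state, PID list" as in the sample output above. The services_not_up function is a hypothetical helper.

```shell
# Hypothetical helper: reads `cluster status` output on stdin and prints
# "CVM-address service state" for every service line whose state is not UP.
# Returns non-zero if any such line is found.
services_not_up() {
    awk '
        BEGIN { bad = 0 }
        /^CVM:/ { cvm = $2 }                    # remember the current CVM
        # Service lines have a name, a state, and a PID list starting with "[".
        NF >= 3 && $3 ~ /^\[/ && $2 != "UP" {
            print cvm, $1, $2
            bad = 1
        }
        END { exit bad }
    '
}

# Possible usage from any CVM:
#   cluster status | services_not_up
```

Wrapped PID lines, such as the continuation lines of ClusterHealth, do not start a PID list in the third field and are therefore ignored.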

Shutting Down an AHV Cluster

You might need to shut down an AHV cluster to perform a maintenance activity or tasks such as relocating the hardware.

Before you begin

Ensure the following before you shut down the cluster.

  1. Upgrade to the most recent version of NCC.
  2. Log on to a Controller VM (CVM) with SSH and run the complete NCC health check.
    nutanix@cvm$ ncc health_checks run_all

    If you receive any failure or error messages, resolve those issues by referring to the KB articles indicated in the output of the NCC check results. If you are unable to resolve these issues, contact Nutanix Support.

    Warning: If you receive alerts indicating expired encryption certificates or a key manager is not reachable, resolve these issues before you shut down the cluster. If you do not resolve these issues, data loss of the cluster might occur.

About this task

Shut down an AHV cluster in the following sequence.

Procedure

  1. Shut down the services or VMs associated with AOS features or Nutanix products. For example, shut down all the Nutanix file server VMs (FSVMs). See the documentation of those features or products for more information.
  2. Shut down all the guest VMs in the cluster in one of the following ways.
    • Shut down the guest VMs from within the guest OS.
    • Shut down the guest VMs by using the Prism Element web console.
    • If you are running many VMs, shut down the VMs by using aCLI:
    1. Log on to a CVM in the cluster with SSH.
    2. Shut down all the guest VMs in the cluster.
      nutanix@cvm$ for i in `acli vm.list power_state=on | awk '{print $1}' | grep -v NTNX` ; do acli vm.shutdown $i ; done
      
    3. Verify if all the guest VMs are shut down.
      nutanix@cvm$ acli vm.list power_state=on
    4. If any VMs are on, consider powering off the VMs from within the guest OS. To force shut down through AHV, run the following command:
      nutanix@cvm$ acli vm.off vm-name

      Replace vm-name with the name of the VM you want to shut down.

  3. Stop the Nutanix cluster.
    1. Log on to any CVM in the cluster with SSH.
    2. Stop the cluster.
      nutanix@cvm$ cluster stop
    3. Verify if the cluster services have stopped.
      nutanix@cvm$ cluster status

      The output displays the message The state of the cluster: stop, which confirms that the cluster has stopped.

      Note: Some system services continue to run even if the cluster has stopped.
  4. Shut down all the CVMs in the cluster. Log on to each CVM in the cluster with SSH and shut down that CVM.
    nutanix@cvm$ sudo shutdown -P now
  5. Shut down each node in the cluster. Perform the following steps for each node in the cluster.
    1. Log on to the IPMI web console of each node.
    2. Under Remote Control > Power Control, select Power Off Server - Orderly Shutdown to gracefully shut down the node.
    3. Ping each host to verify that all AHV hosts are shut down.
  6. Complete the maintenance activity or any other tasks.
  7. Start all the nodes in the cluster.
    1. Press the power button on the front of the block for each node.
    2. Log on to the IPMI web console of each node.
    3. On the System tab, check the Power Control status to verify if the node is powered on.
  8. Start the cluster.
    1. Wait for approximately 5 minutes after you start the last node to allow the cluster services to start.
      All CVMs start automatically after you start all the nodes.
    2. Log on to any CVM in the cluster with SSH.
    3. Start the cluster.
      nutanix@cvm$ cluster start
    4. Verify that all the cluster services are in the UP state.
      nutanix@cvm$ cluster status
    5. Start the guest VMs by using the Prism Element web console.

      If you are running many VMs, start the VMs by using aCLI:

      nutanix@cvm$ for i in `acli vm.list power_state=off | awk '{print $1}' | grep -v NTNX` ; do acli vm.on $i; done
    6. Start the services or VMs associated with AOS features or Nutanix products. For example, start all the FSVMs. See the documentation of those features or products for more information.
    7. Verify if all guest VMs are powered on by using the Prism Element web console.

Rebooting an AHV Node in a Nutanix Cluster

About this task

The Request Reboot operation in the Prism web console gracefully restarts the selected nodes, including each local CVM one after the other.

Perform the following procedure to restart the nodes in the cluster.

Before you begin

  • Ensure the Cluster Resiliency is OK on the Prism web console prior to any restart activities.
  • For successful automated restarts of hosts, ensure that the cluster has HA or resource capacity.
  • Ensure that the guest VMs can migrate between hosts as the hosts are placed in maintenance mode. If not, manual intervention may be required.

Procedure

  1. Click the gear icon in the main menu and then select Reboot in the Settings page.
  2. In the Request Reboot window, select the nodes you want to restart, and click Reboot.
    Figure. Request Reboot of AHV Node

    A progress bar is displayed that indicates the progress of the restart of each node.

Changing CVM Memory Configuration (AHV)

About this task

You can increase the memory reserved for each Controller VM in your cluster by using the 1-click Controller VM Memory Upgrade available from the Prism Element web console. Increase memory size depending on the workload type or to enable certain AOS features. See the Increasing the Controller VM Memory Size topic in the Prism Web Console Guide for CVM memory sizing recommendations and instructions about how to increase the CVM memory.

Changing the AHV Hostname

To change the name of an AHV host, log on to any Controller VM (CVM) in the cluster as admin or nutanix user and run the change_ahv_hostname script.

About this task

Perform the following procedure to change the name of an AHV host:

Procedure

  1. Log on to any CVM in the cluster with SSH.
  2. Change the hostname of the AHV host.
    • If you are logged in as nutanix user, run the following command:
      nutanix@cvm$ change_ahv_hostname --host_ip=host-IP-address --host_name=new-host-name
    • If you are logged in as admin user, run the following command:
      admin@cvm$ sudo change_ahv_hostname --host_ip=host-IP-address --host_name=new-host-name
    Note: The system prompts you to enter the admin user password if you run the change_ahv_hostname command with sudo.

    Replace host-IP-address with the IP address of the host whose name you want to change and new-host-name with the new hostname for the AHV host.

    Note: The hostname must follow these naming conventions:
    • The maximum length is 63 characters.
    • Allowed characters are uppercase and lowercase letters (A-Z and a-z), decimal digits (0-9), dots (.), and hyphens (-).
    • The entity name must start and end with a number or letter.

    If you want to update the hostname of multiple hosts in the cluster, run the script for one host at a time (sequentially).

    Note: The Prism Element web console displays the new hostname after a few minutes.
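The naming conventions in the procedure above can be checked before you run change_ahv_hostname. The following is a minimal sketch; the valid_ahv_hostname function is a hypothetical helper and is not part of the change_ahv_hostname script.

```shell
# Hypothetical helper: succeeds if the proposed AHV hostname follows the
# documented conventions (1 to 63 characters; letters, digits, dots, and
# hyphens only; starts and ends with a letter or digit).
valid_ahv_hostname() {
    name=$1
    [ "${#name}" -ge 1 ] && [ "${#name}" -le 63 ] || return 1
    printf '%s\n' "$name" |
        LC_ALL=C grep -Eq '^[A-Za-z0-9]([A-Za-z0-9.-]*[A-Za-z0-9])?$'
}

# Possible usage:
#   valid_ahv_hostname new-host-name && echo "name is acceptable"
```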

Changing the Name of the CVM Displayed in the Prism Web Console

You can change the CVM name that is displayed in the Prism web console. The procedure described in this document does not change the CVM name that is displayed in the terminal or console of an SSH session.

About this task

You can change the CVM name by using the change_cvm_display_name script. Run this script from a CVM other than the CVM whose name you want to change. When you run the change_cvm_display_name script, AOS performs the following steps:

    1. Checks if the new name starts with NTNX- and ends with -CVM. The CVM name must have only letters, numbers, and dashes (-).
    2. Checks if the CVM has received a shutdown token.
    3. Powers off the CVM. The script does not put the CVM or host into maintenance mode. Therefore, the VMs are not migrated from the host and continue to run with the I/O operations redirected to another CVM while the current CVM is in a powered off state.
    4. Changes the CVM name, enables autostart, and powers on the CVM.

Perform the following to change the CVM name displayed in the Prism web console.

Procedure

  1. Use SSH to log on to a CVM other than the CVM whose name you want to change.
  2. Change the name of the CVM.
    nutanix@cvm$ change_cvm_display_name --cvm_ip=CVM-IP --cvm_name=new-name

    Replace CVM-IP with the IP address of the CVM whose name you want to change and new-name with the new name for the CVM.

    The CVM name must have only letters, numbers, and dashes (-), and must start with NTNX- and end with -CVM.

    Note: Do not run this command from the CVM whose name you want to change, because the script powers off the CVM. In this case, when the CVM is powered off, you lose connectivity to the CVM from the SSH console and the script abruptly ends.
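Because the script rejects names that do not match the required pattern, you might want to test a name first. The following is a minimal sketch of the documented rule (letters, numbers, and dashes only, starting with NTNX- and ending with -CVM). The valid_cvm_name function is a hypothetical helper, and the exact check that change_cvm_display_name performs may differ.

```shell
# Hypothetical helper: succeeds if the proposed CVM display name contains
# only letters, digits, and dashes, starts with NTNX-, and ends with -CVM.
valid_cvm_name() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^NTNX-[A-Za-z0-9-]*-CVM$'
}

# Possible usage:
#   valid_cvm_name new-name && echo "name is acceptable"
```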

Compute-Only Node Configuration (AHV Only)

A compute-only (CO) node allows you to seamlessly and efficiently expand the computing capacity (CPU and memory) of your AHV cluster. The Nutanix cluster uses the resources (CPUs and memory) of a CO node exclusively for computing purposes.

Note: Clusters that have compute-only nodes do not support virtual switches. Instead, use bridge configurations for network connections. For more information, see Virtual Switch Limitations.

You can use a supported server or an existing hyperconverged (HC) node as a CO node. To use a node as CO, image the node as CO by using Foundation and then add that node to the cluster by using the Prism Element web console. For more information about how to image a node as a CO node, see the Field Installation Guide.

Note: If you want an existing HC node that is already a part of the cluster to work as a CO node, remove that node from the cluster, image that node as CO by using Foundation, and add that node back to the cluster. For more information about how to remove a node, see Modifying a Cluster.

Key Features of Compute-Only Node

Following are the key features of CO nodes.

  • CO nodes do not have a Controller VM (CVM) and local storage.
  • AOS sources the storage for vDisks associated with VMs running on CO nodes from the hyperconverged (HC) nodes in the cluster.
  • You can seamlessly manage your VMs (CRUD operations, ADS, and HA) by using the Prism Element web console.
  • AHV runs on the local storage media of the CO node.
  • To update AHV on a cluster that contains a compute-only node, use the Life Cycle Manager. For more information, see the LCM Updates topic in the Life Cycle Manager Guide.

Use Case of Compute-Only Node

CO nodes enable you to achieve more control and value from restrictive licenses such as Oracle. A CO node is part of a Nutanix HC cluster, and there is no CVM running on the CO node (VMs use CVMs running on the HC nodes to access disks). As a result, licensed cores on the CO node are used only for the application VMs.

Applications or databases that are licensed on a per CPU core basis require the entire node to be licensed and that also includes the cores on which the CVM runs. With CO nodes, you get a much higher ROI on the purchase of your database licenses (such as Oracle and Microsoft SQL Server) since the CVM does not consume any compute resources.

Minimum Cluster Requirements

Following are the minimum cluster requirements for compute-only nodes.

  • The Nutanix cluster must be at least a three-node cluster before you add a compute-only node.

    However, Nutanix recommends that the cluster has four nodes before you add a compute-only node.

  • The ratio of compute-only to hyperconverged nodes in a cluster must not exceed the following:

    1 compute-only : 2 hyperconverged

  • All the hyperconverged nodes in the cluster must be all-flash nodes.
  • The number of vCPUs assigned to CVMs on the hyperconverged nodes must be greater than or equal to the total number of available cores on all the compute-only nodes in the cluster. The CVM requires a minimum of 12 vCPUs. For more information about how Foundation allocates memory and vCPUs to your platform model, see CVM vCPU and vRAM Allocation in the Field Installation Guide.
  • The total amount of NIC bandwidth allocated to all the hyperconverged nodes must be twice the amount of the total NIC bandwidth allocated to all the compute-only nodes in the cluster.

    Nutanix recommends you use dual 25 GbE on CO nodes and quad 25 GbE on an HC node serving storage to a CO node.

  • The AHV version of the compute-only node must be the same as the other nodes in the cluster.

    When you are adding a CO node to the cluster, AOS checks if the AHV version of the node matches with the AHV version of the existing nodes in the cluster. If there is a mismatch, the add node operation fails.

For general requirements about adding a node to a Nutanix cluster, see Expanding a Cluster.
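The numeric requirements above reduce to a few comparisons. The following is a minimal sketch with a hypothetical helper, co_requirements_ok. It treats the NIC bandwidth rule as "at least twice", which is an interpretation of the requirement, and it does not cover the all-flash or AHV version requirements.

```shell
# Hypothetical helper: checks the numeric compute-only (CO) requirements.
# Arguments: HC node count, CO node count, total CVM vCPUs on HC nodes,
# total cores on CO nodes, total HC NIC bandwidth (Gbps),
# total CO NIC bandwidth (Gbps).
co_requirements_ok() {
    hc=$1; co=$2; cvm_vcpus=$3; co_cores=$4; hc_gbps=$5; co_gbps=$6
    # The cluster needs at least three nodes before a CO node is added.
    [ "$hc" -ge 3 ] || { echo "fewer than three hyperconverged nodes"; return 1; }
    # The ratio must not exceed 1 compute-only : 2 hyperconverged.
    [ $(( co * 2 )) -le "$hc" ] || { echo "CO:HC ratio exceeds 1:2"; return 1; }
    # CVM vCPUs must cover the cores on all CO nodes.
    [ "$cvm_vcpus" -ge "$co_cores" ] || { echo "not enough CVM vCPUs"; return 1; }
    # HC NIC bandwidth, read here as at least twice the CO NIC bandwidth.
    [ "$hc_gbps" -ge $(( co_gbps * 2 )) ] || { echo "not enough HC NIC bandwidth"; return 1; }
    echo "ok"
}
```

For example, four HC nodes with 12-vCPU CVMs and quad 25 GbE (48 vCPUs, 400 Gbps) can serve two CO nodes with 20 cores and dual 25 GbE each (40 cores, 100 Gbps): co_requirements_ok 4 2 48 40 400 100 reports ok.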

Restrictions

Nutanix does not support the following features or tasks on a CO node in this release:

  1. Host boot disk replacement
  2. Network segmentation
  3. Virtual Switch configuration: Use bridge configurations instead.

Supported AOS Versions

Nutanix supports compute-only nodes on AOS releases 5.11 or later.

Supported Hardware Platforms

Compute-only nodes are supported on the following hardware platforms.

  • All the NX series hardware
  • Dell XC Core
  • Cisco UCS

Networking Configuration

To perform network tasks on a compute-only node such as creating or modifying bridges or uplink bonds or uplink load balancing, you must use the manage_ovs commands and add the --host flag to the manage_ovs commands as shown in the following example:

nutanix@cvm$ manage_ovs --host IP_address_of_co_node --bridge_name bridge_name create_single_bridge

Replace IP_address_of_co_node with the IP address of the CO node and bridge_name with the name of the bridge you want to create.

Note: If you have storage-only AHV nodes in clusters where the compute-only nodes run ESXi or Hyper-V, deployment of the default virtual switch vs0 fails. In such cases, the Prism Element, Prism Central, or CLI workflows for virtual switch management are unavailable to manage the bridges and bonds. Use the manage_ovs command options to manage the bridges and bonds.

Note: Run the manage_ovs commands for a CO from any CVM running on a hyperconverged node.

Perform the networking tasks for each CO node in the cluster individually.

For more information about networking configuration of the AHV hosts, see Host Network Management in the AHV Administration Guide.

Adding a Compute-Only Node to an AHV Cluster

About this task

Perform the following procedure to add a compute-only node to a Nutanix cluster.

Procedure

  1. Log on to the Prism Element web console.
  2. Do one of the following:
    • Click the gear icon in the main menu and select Expand Cluster in the Settings page.
    • Go to the hardware dashboard (see Hardware Dashboard) and click Expand Cluster.
  3. In the Select Host screen, scroll down and, under Manual Host Discovery, click Discover Hosts Manually.
    Figure. Discover Hosts Manually

  4. Click Add Host.
    Figure. Add Host

  5. Under Host or CVM IP, type the IP address of the AHV host and click Save.
    This node does not have a Controller VM and you must therefore provide the IP address of the AHV host.
  6. Click Discover and Add Hosts.
    Prism Element discovers this node and the node appears in the list of nodes in the Select Host screen.
  7. Select the node to display the details of the compute-only node.
  8. Click Next.
  9. In the Configure Host screen, click Expand Cluster.

    The add node process begins and Prism Element performs a set of checks before the node is added to the cluster.

    Check the progress of the operation in the Tasks menu of the Prism Element web console. The operation takes approximately five to seven minutes to complete.

  10. Check the Hardware Diagram view to verify if the node is added to the cluster.
    You can identify a node as a CO node if the Prism Element web console does not display the IP address for the CVM.

Host Network Management

Network management in an AHV cluster consists of the following tasks:

  • Configuring Layer 2 switching through virtual switches and Open vSwitch bridges. When configuring a virtual switch, you configure bridges, bonds, and VLANs.
  • Optionally changing the IP address, netmask, and default gateway that were specified for the hosts during the imaging process.

Virtual Networks (Layer 2)

Each VM network interface is bound to a virtual network. Each virtual network is bound to a single VLAN; trunking VLANs to a virtual network is not supported. Networks are designated by the L2 type (vlan) and the VLAN number.

By default, each virtual network maps to virtual switch br0. However, you can change this setting to map a virtual network to a custom virtual switch. The user is responsible for ensuring that the specified virtual switch exists on all hosts, and that the physical switch ports for the virtual switch uplinks are properly configured to receive VLAN-tagged traffic.

A VM NIC must be associated with a virtual network. You can change the virtual network of a vNIC without deleting and recreating the vNIC.

You can configure VM NICs in trunk mode to support applications that use trunk mode. For information about configuring virtual NICs in trunk mode, see Configuring a Virtual NIC to Operate in Access or Trunk Mode.

Managed Networks (Layer 3)

A virtual network can have an IPv4 configuration, but it is not required. A virtual network with an IPv4 configuration is a managed network; one without an IPv4 configuration is an unmanaged network. A VLAN can have at most one managed network defined. If a virtual network is managed, every NIC is assigned an IPv4 address at creation time.

A managed network can optionally have one or more non-overlapping DHCP pools. Each pool must be entirely contained within the network's managed subnet.

If the managed network has a DHCP pool, the NIC automatically gets assigned an IPv4 address from one of the pools at creation time, provided at least one address is available. Addresses in the DHCP pool are not reserved. That is, you can manually specify an address belonging to the pool when creating a virtual adapter. If the network has no DHCP pool, you must specify the IPv4 address manually.
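The pool rules described here (each pool entirely inside the managed subnet, no two pools overlapping) can be checked before you create a network. The following is a minimal Python sketch, not Nutanix code: the function name validate_pools and the example addresses are hypothetical, and pools are assumed to be inclusive start-to-end ranges.

```python
import ipaddress


def validate_pools(subnet, pools):
    """Return a list of problems found in the DHCP pools (empty if valid).

    subnet: CIDR string for the managed network, for example "10.0.0.0/24".
    pools:  list of (start, end) IPv4 address strings, both inclusive.
    """
    network = ipaddress.ip_network(subnet)
    problems = []
    ranges = []
    for start, end in pools:
        lo, hi = ipaddress.ip_address(start), ipaddress.ip_address(end)
        if lo > hi:
            problems.append(f"{start}-{end}: start is after end")
            continue
        # Each pool must be entirely contained within the managed subnet.
        if lo not in network or hi not in network:
            problems.append(f"{start}-{end}: not inside {subnet}")
        ranges.append((lo, hi))
    # Pools must not overlap: after sorting by start address, each pool
    # must begin after the previous pool ends.
    ranges.sort()
    for (_, prev_hi), (next_lo, _) in zip(ranges, ranges[1:]):
        if next_lo <= prev_hi:
            problems.append(f"pool starting at {next_lo} overlaps the previous pool")
    return problems
```

An empty result means the pools satisfy both rules. Because addresses in a pool are not reserved, a manually specified address inside a pool is still valid and is not flagged by this check.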

All DHCP traffic on the network is rerouted to an internal DHCP server, which allocates IPv4 addresses. DHCP traffic on the virtual network (that is, between the guest VMs and the Controller VM) does not reach the physical network, and vice versa.

A network must be configured as managed or unmanaged when it is created. It is not possible to convert one to the other.

Figure. AHV Networking Architecture

Prerequisites for Configuring Networking

Change the configuration from the factory default to the recommended configuration. See AHV Networking Recommendations.

AHV Networking Recommendations

Nutanix recommends that you perform the following OVS configuration tasks from the Controller VM, as described in this documentation:

  • Viewing the network configuration
  • Configuring uplink bonds with desired interfaces using the Virtual Switch (VS) configurations.
  • Assigning the Controller VM to a VLAN

For performing other network configuration tasks such as adding an interface to a bridge and configuring LACP for the interfaces in a bond, follow the procedures described in the AHV Networking best practices documentation.

Nutanix recommends that you configure the network as follows:

Table 1. Recommended Network Configuration
Network Component Best Practice
Virtual Switch

Do not modify the OpenFlow tables of any bridges configured in any VS configurations in the AHV hosts.

Do not delete or rename OVS bridge br0.

Do not modify the native Linux bridge virbr0.

Switch Hops

Nutanix nodes send storage replication traffic to each other in a distributed fashion over the top-of-rack network. One Nutanix node can, therefore, send replication traffic to any other Nutanix node in the cluster. The network should provide low and predictable latency for this traffic. Ensure that there are no more than three switches between any two Nutanix nodes in the same cluster.

Switch Fabric

A switch fabric is a single leaf-spine topology or all switches connected to the same switch aggregation layer. The Nutanix VLAN shares a common broadcast domain within the fabric. Connect all Nutanix nodes that form a cluster to the same switch fabric. Do not stretch a single Nutanix cluster across multiple, disconnected switch fabrics.

Every Nutanix node in a cluster should therefore be in the same L2 broadcast domain and share the same IP subnet.

WAN Links

A WAN (wide area network) or metro link connects different physical sites over a distance. As an extension of the switch fabric requirement, do not place Nutanix nodes in the same cluster if they are separated by a WAN.

VLANs

Add the Controller VM and the AHV host to the same VLAN. Place all CVMs and AHV hosts in a cluster in the same VLAN. By default the CVM and AHV host are untagged, shown as VLAN 0, which effectively places them on the native VLAN configured on the upstream physical switch.

Note: Do not add any other device (including guest VMs) to the VLAN to which the CVM and hypervisor host are assigned. Isolate guest VMs on one or more separate VLANs.

Nutanix recommends configuring the CVM and hypervisor host VLAN as the native, or untagged, VLAN on the connected switch ports. This native VLAN configuration allows for easy node addition and cluster expansion. By default, new Nutanix nodes send and receive untagged traffic. If you use a tagged VLAN for the CVM and hypervisor hosts instead, you must configure that VLAN while provisioning the new node, before adding that node to the Nutanix cluster.

Use tagged VLANs for all guest VM traffic and add the required guest VM VLANs to all connected switch ports for hosts in the Nutanix cluster. Limit guest VLANs for guest VM traffic to the smallest number of physical switches and switch ports possible to reduce broadcast network traffic load. If a VLAN is no longer needed, remove it.

Default VS bonded port (br0-up)

Aggregate the fastest links of the same speed on the physical host to a VS bond on the default vs0 and provision VLAN trunking for these interfaces on the physical switch.

By default, interfaces in the bond in the virtual switch operate in the recommended active-backup mode.
Note: The mixing of bond modes across AHV hosts in the same cluster is not recommended and not supported.
1 GbE and 10 GbE interfaces (physical host)

If 10 GbE or faster uplinks are available, Nutanix recommends that you use them instead of 1 GbE uplinks.

Recommendations for 1 GbE uplinks are as follows:

  • If you plan to use 1 GbE uplinks, do not include them in the same bond as the 10 GbE interfaces.

    Nutanix recommends that you do not use uplinks of different speeds in the same bond.

  • If you choose to configure only 1 GbE uplinks, when migration of memory-intensive VMs becomes necessary, power off and power on in a new host instead of using live migration. In this context, memory-intensive VMs are VMs whose memory changes at a rate that exceeds the bandwidth offered by the 1 GbE uplinks.

    Nutanix recommends the manual procedure for memory-intensive VMs because live migration, which you initiate either manually or by placing the host in maintenance mode, might appear prolonged or unresponsive and might eventually fail.

    Use the aCLI on any CVM in the cluster to start the VMs on another AHV host:

    nutanix@cvm$ acli vm.on vm_list host=host

    Replace vm_list with a comma-delimited list of VM names and replace host with the IP address or UUID of the target host.

  • If you must use only 1 GbE uplinks, add them into a bond to increase bandwidth and use the balance-tcp (LACP) or balance-slb bond mode.

IPMI port on the hypervisor host

Do not use VLAN trunking on switch ports that connect to the IPMI interface. Configure the switch ports as access ports for management simplicity.

Upstream physical switch

Nutanix does not recommend the use of Fabric Extenders (FEX) or similar technologies for production use cases. While initial, low-load implementations might run smoothly with such technologies, poor performance, VM lockups, and other issues might occur as implementations scale upward (see Knowledge Base article KB1612). Nutanix recommends the use of 10 Gbps, line-rate, non-blocking switches with larger buffers for production workloads.

Cut-through versus store-and-forward selection depends on network design. In designs with no oversubscription and no speed mismatches you can use low-latency cut-through switches. If you have any oversubscription or any speed mismatch in the network design, then use a switch with larger buffers. Port-to-port latency should be no higher than 2 microseconds.

Use fast-convergence technologies (such as Cisco PortFast) on switch ports that are connected to the hypervisor host.

Physical Network Layout

Use redundant top-of-rack switches in a traditional leaf-spine architecture. This simple, flat network design is well suited for a highly distributed, shared-nothing compute and storage architecture.

Add all the nodes that belong to a given cluster to the same Layer-2 network segment.

Other network layouts are supported as long as all other Nutanix recommendations are followed.

Jumbo Frames

The Nutanix CVM uses the standard Ethernet MTU (maximum transmission unit) of 1,500 bytes for all the network interfaces by default. The standard 1,500 byte MTU delivers excellent performance and stability. Nutanix does not support configuring the MTU on network interfaces of a CVM to higher values.

You can enable jumbo frames (MTU of 9,000 bytes) on the physical network interfaces of AHV hosts and guest VMs if the applications on your guest VMs require them. If you choose to use jumbo frames on hypervisor hosts, be sure to enable them end to end in the desired network and consider both the physical and virtual network infrastructure impacted by the change.

Controller VM

Do not remove the Controller VM from either the OVS bridge br0 or the native Linux bridge virbr0.

Rack Awareness and Block Awareness

Block awareness and rack awareness provide smart placement of Nutanix cluster services, metadata, and VM data to help maintain data availability, even when you lose an entire block or rack. The same network requirements for low latency and high throughput between servers in the same cluster still apply when using block and rack awareness.

Note: Do not use features like block or rack awareness to stretch a Nutanix cluster between different physical sites.

Oversubscription

Oversubscription occurs when an intermediate network device or link does not have enough capacity to allow line rate communication between the systems connected to it. For example, if a 10 Gbps link connects two switches and four hosts connect to each switch at 10 Gbps, the connecting link is oversubscribed. Oversubscription is often expressed as a ratio—in this case 4:1, as the environment could potentially attempt to transmit 40 Gbps between the switches with only 10 Gbps available. Achieving a ratio of 1:1 is not always feasible. However, you should keep the ratio as small as possible based on budget and available capacity. If there is any oversubscription, choose a switch with larger buffers.

In a typical deployment where Nutanix nodes connect to redundant top-of-rack switches, storage replication traffic between CVMs traverses multiple devices. To avoid packet loss due to link oversubscription, ensure that the switch uplinks consist of multiple interfaces operating at a faster speed than the Nutanix host interfaces. For example, for nodes connected at 10 Gbps, the inter-switch connection should consist of multiple 10 Gbps or 40 Gbps links.
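The oversubscription ratio in the Oversubscription entry is simple arithmetic: the bandwidth the hosts can offer divided by the bandwidth of the inter-switch connection. The following Python sketch reproduces the 4:1 example from the text. The function name and parameters are illustrative only and are not part of any Nutanix tool.

```python
from fractions import Fraction


def oversubscription_ratio(host_count, host_gbps, uplink_count, uplink_gbps):
    """Return offered host bandwidth divided by inter-switch bandwidth."""
    offered = host_count * host_gbps        # what the hosts could send
    available = uplink_count * uplink_gbps  # what the inter-switch links carry
    return Fraction(offered, available)


# Example from the text: four hosts at 10 Gbps behind a single 10 Gbps link.
ratio = oversubscription_ratio(4, 10, 1, 10)
print(f"{ratio.numerator}:{ratio.denominator}")  # prints "4:1"
```

With two 40 Gbps inter-switch links instead, the same four hosts give a ratio of 1:2, which means the inter-switch connection is no longer oversubscribed.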

The following diagrams show sample network configurations using Open vSwitch and Virtual Switch.

Figure. Virtual Switch

Figure. AHV Bridge Chain

Figure. Default factory configuration of Open vSwitch in AHV

Figure. Open vSwitch Configuration

IP Address Management

IP Address Management (IPAM) is a feature of AHV that allows it to assign IP addresses automatically to VMs by using DHCP. You can configure each virtual network with a specific IP address subnet, associated domain settings, and IP address pools available for assignment to VMs.

An AHV network is defined as a managed network or an unmanaged network based on the IPAM setting.

Managed Network

Managed network refers to an AHV network in which IPAM is enabled.

Unmanaged Network

Unmanaged network refers to an AHV network in which IPAM is not enabled or is disabled.

You choose whether to enable IPAM in the Create Network dialog box when you create a virtual network for guest VMs. See the Configuring a Virtual Network for Guest VM Interfaces topic in the Prism Web Console Guide.
Note: You can enable IPAM only when you are creating a virtual network. You cannot enable or disable IPAM for an existing virtual network.

IPAM enabled or disabled status has implications. For example, when you want to reconfigure the IP address of a Prism Central VM, the procedure to do so may involve additional steps for managed networks (that is, networks with IPAM enabled) where the new IP address belongs to an IP address range different from the previous IP address range. See Reconfiguring the IP Address and Gateway of Prism Central VMs in Prism Central Guide.

Layer 2 Network Management

AHV uses virtual switch (VS) to connect the Controller VM, the hypervisor, and the guest VMs to each other and to the physical network. Virtual switch is configured by default on each AHV node and the VS services start automatically when you start a node.

To configure virtual networking in an AHV cluster, you need to be familiar with virtual switch. This documentation gives you a brief overview of virtual switch and the networking components that you need to configure to enable the hypervisor, Controller VM, and guest VMs to connect to each other and to the physical network.

About Virtual Switch

Virtual switches or VS are used to manage multiple bridges and uplinks.

The VS configuration is designed to provide flexibility in configuring virtual bridge connections. A virtual switch (VS) defines a collection of AHV nodes and the uplink ports on each node. It is an aggregation of the same OVS bridge on all the compute nodes in a cluster. For example, vs0, the default virtual switch, is an aggregation of the br0 bridge and br0-up uplinks of all the nodes.

After you configure a VS, you can use the VS as reference for physical network management instead of using the bridge names as reference.

For an overview of Virtual Switch, see Virtual Switch Considerations.

For information about OVS, see About Open vSwitch.

Virtual Switch Workflow

A virtual switch (VS) defines a collection of AHV compute nodes and the uplink ports on each node. It is an aggregation of the same OVS bridge on all the compute nodes in a cluster. For example, vs0, the default virtual switch, is an aggregation of the br0 bridge of all the nodes.

The system creates the default virtual switch vs0 connecting the default bridge br0 on all the hosts in the cluster during installation of or upgrade to the compatible versions of AOS and AHV. Default virtual switch vs0 has the following characteristics:

  • The default virtual switch cannot be deleted.

  • The default bridges br0 on all the nodes in the cluster map to vs0. Thus, vs0 is not empty. It has at least one uplink configured.

  • The default management connectivity to a node is mapped to default bridge br0 that is mapped to vs0.

  • The default parameter values of vs0 - Name, Description, MTU and Bond Type - can be modified subject to aforesaid characteristics.

  • The default virtual switch is configured with the Active-Backup uplink bond type.

    For more information about bond types, see the Bond Type table.

The virtual switch aggregates the same bridges on all nodes in the cluster. The bridges (for example, br1) connect to the physical port such as eth3 (Ethernet port) via the corresponding uplink (for example, br1-up). The uplink ports of the bridges are connected to the same physical network. For example, the following illustration shows that vs0 is mapped to the br0 bridge, in turn connected via uplink br0-up to various (physical) Ethernet ports on different nodes.

Figure. Virtual Switch

Uplink configuration uses bonds to improve traffic management. The bond types are defined for the aggregated OVS bridges. A new bond type, No uplink bond, provides a no-bonding option. A virtual switch configured with the No uplink bond type has 0 or 1 uplinks.

When you configure a virtual switch with any other bond type, you must select at least two uplink ports on every node.
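The uplink-count rule can be expressed as a small check. The following Python sketch uses the bond type names from this guide; the function name uplinks_valid and the shape of its input (a mapping of node name to a list of uplink ports) are assumptions made for illustration, not a Nutanix API.

```python
# Bond type names as they appear in this guide.
NO_UPLINK_BOND = "No Uplink Bond"
BOND_TYPES = {
    "Active-Backup",
    "Active-Active with MAC pinning",
    "Active-Active",
    NO_UPLINK_BOND,
}


def uplinks_valid(bond_type, uplinks_per_node):
    """Check the per-node uplink count rule for a bond type.

    uplinks_per_node maps a node name to its list of uplink ports.
    """
    if bond_type not in BOND_TYPES:
        raise ValueError(f"unknown bond type: {bond_type}")
    for ports in uplinks_per_node.values():
        count = len(ports)
        if bond_type == NO_UPLINK_BOND:
            if count > 1:   # No Uplink Bond allows 0 or 1 uplinks only
                return False
        elif count < 2:     # every other bond type needs at least two
            return False
    return True
```

For example, uplinks_valid("Active-Backup", {"node-1": ["eth2", "eth3"], "node-2": ["eth2"]}) returns False because node-2 has only one uplink.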

If you change the uplink configuration of vs0, AOS applies the updated settings to all the nodes in the cluster one after the other (the rolling update process). To update the settings in a cluster, AOS performs the following tasks when configuration method applied is Standard:

  1. Puts the node in maintenance mode (migrates VMs out of the node)
  2. Applies the updated settings
  3. Checks connectivity with the default gateway
  4. Exits maintenance mode
  5. Proceeds to apply the updated settings to the next node

AOS does not put the nodes in maintenance mode when the Quick configuration method is applied.
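The difference between the two configuration methods can be sketched as a loop over the nodes. This is an illustrative Python model only, not how AOS is implemented. The callbacks apply_settings and gateway_reachable are hypothetical, the sketch assumes the Quick method performs the same steps minus maintenance mode, and it assumes the update stops at the first failed gateway check (the guide does not describe failure handling).

```python
def rolling_update(nodes, apply_settings, gateway_reachable, method="Standard"):
    """Apply settings one node at a time and return the ordered step log.

    apply_settings(node) and gateway_reachable(node) are hypothetical
    callbacks supplied by the caller.
    """
    log = []
    use_maintenance = method == "Standard"  # Quick skips maintenance mode
    for node in nodes:
        if use_maintenance:
            log.append((node, "enter maintenance mode"))
        apply_settings(node)
        log.append((node, "apply settings"))
        reachable = gateway_reachable(node)
        log.append((node, "check default gateway"))
        if not reachable:
            # Assumption: stop here instead of touching the next node.
            log.append((node, "stopped: gateway unreachable"))
            break
        if use_maintenance:
            log.append((node, "exit maintenance mode"))
    return log
```

Running the sketch with method="Standard" logs four steps per node in the order listed above, while method="Quick" logs only the apply and check steps.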

Table 1. Bond Types
Active-Backup
    Use case: Recommended. Default configuration, which transmits all traffic over a single active adapter.
    Maximum VM NIC throughput: 10 Gb
    Maximum host throughput: 10 Gb

Active-Active with MAC pinning (also known as balance-slb)
    Use case: Works with caveats for multicast traffic. Increases host bandwidth utilization beyond a single 10 Gb adapter. Places each VM NIC on a single adapter at a time. Do not use this bond type with link aggregation protocols such as LACP.
    Maximum VM NIC throughput: 10 Gb
    Maximum host throughput: 20 Gb

Active-Active (also known as LACP with balance-tcp)
    Use case: LACP and link aggregation required. Increases host and VM bandwidth utilization beyond a single 10 Gb adapter by balancing VM NIC TCP and UDP sessions among adapters. Also used when network switches require LACP negotiation.
    The default LACP settings are:
      • Speed: Fast (1s)
      • Mode: Active fallback-active-backup
      • Priority: Default. This is not configurable.
    Maximum VM NIC throughput: 20 Gb
    Maximum host throughput: 20 Gb

No Uplink Bond
    Use case: No uplink or a single uplink on each host. A virtual switch configured with the No uplink bond type has 0 or 1 uplinks. When you configure a virtual switch with any other bond type, you must select at least two uplink ports on every node.
    Maximum VM NIC throughput: -
    Maximum host throughput: -

Note the following points about the uplink configuration.

  • Virtual switches are not enabled in a cluster that has one or more compute-only nodes. See Virtual Switch Limitations and Virtual Switch Requirements.
  • If you select the Active-Active policy, you must manually enable LAG and LACP on the corresponding ToR switch for each node in the cluster.
  • If you reimage a cluster with the Active-Active policy enabled, the default virtual switch (vs0) on the reimaged cluster reverts to the Active-Backup policy. The other virtual switches are removed during reimage.
  • Nutanix recommends configuring LACP with fallback to active-backup or individual mode on the ToR switches. The configuration and behavior vary based on the switch vendor. Use a switch configuration that allows both switch interfaces to pass traffic after LACP negotiation fails.

Virtual Switch Considerations

Virtual Switch Deployment

A VS configuration is deployed through a rolling update of the nodes in the cluster. After the VS configuration (creation or update) is received and execution starts, every node is first put into maintenance mode before the VS configuration is made or modified on the node. This is the Standard method, which is the recommended default method of configuring a VS.

You can also select the Quick method of configuration, in which the rolling update does not put the nodes in maintenance mode. The VS configuration task is marked as successful when the configuration is successful on the first node. Any configuration failure on successive nodes triggers corresponding NCC alerts but does not change the task status.

Note:

If you are modifying an existing bond, AHV removes the bond and then re-creates the bond with the specified interfaces.

Ensure that the interfaces you want to include in the bond are physically connected to the Nutanix appliance before you run the command described in this topic. If the interfaces are not physically connected to the Nutanix appliance, the interfaces are not added to the bond.


Ensure that the pre-checks listed in the LCM Prechecks section of the Life Cycle Manager Guide and the Always and Host Disruptive Upgrades types of pre-checks listed in KB-4584 pass for Virtual Switch deployments.

The VS configuration is stored and re-enforced at system reboot.

The VM NIC configuration also displays the VS details. When you Update VM configuration or Create NIC for a VM, the NIC details show the virtual switches that can be associated. This view allows you to change a virtual network and the associated virtual switch.

To change the virtual network, select the virtual network in the Subnet Name dropdown list in the Create NIC or Update NIC dialog box.

Figure. Create VM - VS Details

Figure. VM NIC - VS Details

Impact of Installation of or Upgrade to Compatible AOS and AHV Versions

See Virtual Switch Requirements for information about minimum and compatible AOS and AHV versions.

When you upgrade the AOS to a compatible version from an older version, the upgrade process:

  • Triggers the creation of the default virtual switch vs0, which is mapped to bridge br0 on all the nodes.

  • Validates bridge br0 and its uplinks for consistency in terms of MTU and bond-type on every node.

    If valid, it adds the bridge br0 of each node to the virtual switch vs0.

    If br0 configuration is not consistent, the system generates an NCC alert which provides the failure reason and necessary details about it.

    The system migrates only the bridge br0 on each node to the default virtual switch vs0 because the connectivity of bridge br0 is guaranteed.

  • Does not migrate any other bridges to any other virtual switches during upgrade. You need to manually migrate the other bridges after install or upgrade is complete.

Bridge Migration

After upgrading to a compatible version of AOS, you can migrate bridges other than br0 that existed on the nodes. When you migrate the bridges, the system converts the bridges to virtual switches.

See Virtual Switch Migration Requirements in Virtual Switch Requirements.

Note: You can migrate only those bridges that are present on every compute node in the cluster. See Migrating Bridges after Upgrade topic in Prism Web Console Guide.

Cluster Scaling Impact

VS management for cluster scaling (addition or removal of nodes) is seamless.

Node Removal

When you remove a node, the system detects the removal, automatically removes the node from all the VS configurations that include the node, and generates an internal system update. For example, a node has two virtual switches, vs1 and vs2, configured apart from the default vs0. When you remove the node from the cluster, the system automatically removes the node from the vs1 and vs2 configurations with an internal system update.

Node Addition

When you add a new node or host to a cluster, the bridges or virtual switches on the new node are treated in the following manner:

Note: If a host already included in a cluster is removed and then added back, it is treated as a new host.
  • The system validates the default bridge br0 and uplink bond br0-up to check if it conforms to the default virtual switch vs0 already present on the cluster.

    If br0 and br0-up conform, the system includes the new host and its uplinks in vs0.

    If br0 and br0-up do not conform, the system generates an NCC alert.

  • The system does not automatically add any other bridge configured on the new host to any other virtual switch in the cluster.

    It generates NCC alerts for all the other non-default virtual switches.

  • You can manually include the host in the required non-default virtual switches. Update a non-default virtual switch to include the host.

    For information about updating a virtual switch in Prism Element Web Console, see the Configuring a Virtual Network for Guest VM Interfaces section in Prism Web Console Guide.

    For information about updating a virtual switch in Prism Central, see the Network Connections section in the Prism Central Guide.

VS Management

You can manage virtual switches from Prism Central or Prism Web Console. You can also use aCLI or REST APIs to manage them. See the Acropolis API Reference and Command Reference guides for more information.

You can also use the appropriate aCLI commands for virtual switches from the following list:

  • net.create_virtual_switch

  • net.list_virtual_switch

  • net.get_virtual_switch

  • net.update_virtual_switch

  • net.delete_virtual_switch

  • net.migrate_br_to_virtual_switch

  • net.disable_virtual_switch

About Open vSwitch

Open vSwitch (OVS) is an open-source software switch implemented in the Linux kernel and designed to work in a multiserver virtualization environment. By default, OVS behaves like a Layer 2 learning switch that maintains a MAC address learning table. The hypervisor host and VMs connect to virtual ports on the switch.

Each hypervisor hosts an OVS instance, and all OVS instances combine to form a single switch. As an example, the following diagram shows OVS instances running on two hypervisor hosts.

Figure. Open vSwitch

Default Factory Configuration

The factory configuration of an AHV host includes a default OVS bridge named br0 (configured with the default virtual switch vs0) and a native Linux bridge called virbr0.

Bridge br0 includes the following ports by default:

  • An internal port with the same name as the default bridge; that is, an internal port named br0. This is the access port for the hypervisor host.
  • A bonded port named br0-up. The bonded port aggregates all the physical interfaces available on the node. For example, if the node has two 10 GbE interfaces and two 1 GbE interfaces, all four interfaces are aggregated on br0-up. This configuration is necessary for Foundation to successfully image the node regardless of which interfaces are connected to the network.
    Note:

    Before you begin configuring a virtual network on a node, you must disassociate the 1 GbE interfaces from the br0-up port. This disassociation occurs when you modify the default virtual switch (vs0) and create new virtual switches. Nutanix recommends that you aggregate only the 10 GbE or faster interfaces on br0-up and use the 1 GbE interfaces on a separate OVS bridge deployed in a separate virtual switch.

    See Virtual Switch Management for information about virtual switch management.

The following diagram illustrates the default factory configuration of OVS on an AHV node:

Figure. Default factory configuration of Open vSwitch in AHV

The Controller VM has two network interfaces by default. As shown in the diagram, one network interface connects to bridge br0. The other network interface connects to a port on virbr0. The Controller VM uses this bridge to communicate with the hypervisor host.

Virtual Switch Requirements

The requirements to deploy virtual switches are as follows:

  1. Virtual switches are supported on AOS 5.19 or later with AHV 20201105.12 or later. Therefore you must install or upgrade to AOS 5.19 or later, with AHV 20201105.12 or later, to use virtual switches in your deployments.

  2. Virtual bridges used for a VS on all the nodes must have the same specification such as name, MTU and uplink bond type. For example, if vs1 is mapped to br1 (virtual or OVS bridge 1) on a node, it must be mapped to br1 on all the other nodes of the same cluster.
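The minimum-version requirement can be checked with a simple numeric comparison. The following Python sketch assumes that version strings are purely numeric and dot-separated, as written in this section. Version strings reported by a real system may carry prefixes or suffixes that need to be stripped first. The helper names are hypothetical.

```python
def parse_version(text):
    """Turn a dotted version such as "5.19.1" into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


# Minimum versions stated in the requirements above.
MIN_AOS = parse_version("5.19")
MIN_AHV = parse_version("20201105.12")


def supports_virtual_switch(aos_version, ahv_version):
    """Return True when both versions meet the stated minimums."""
    return (
        parse_version(aos_version) >= MIN_AOS
        and parse_version(ahv_version) >= MIN_AHV
    )
```

Comparing tuples of integers avoids the string-comparison trap in which "5.9" sorts after "5.19".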

Virtual Switch Migration Requirements

The AOS upgrade process initiates the virtual switch migration. The virtual switch migration is successful only when the following requirements are fulfilled:

  • Before migrating to Virtual Switch, all bridge br0 bond interfaces should have the same bond type on all hosts in the cluster. For example, all hosts should use the Active-backup bond type or balance-tcp. If some hosts use Active-backup and other hosts use balance-tcp, virtual switch migration fails.
  • Before migrating to Virtual Switch, if using LACP:
    • Confirm that the lacp-fallback parameter of bridge br0 is set to the case-sensitive value True on all hosts. Check the value with manage_ovs show_uplinks |grep lacp-fallback:. Any host with lowercase true causes virtual switch migration failure.
    • Confirm that the LACP speed on the physical switch is set to fast or 1 second. Also ensure that the switch ports are ready to fall back to individual mode if LACP negotiation fails due to a configuration such as no lacp suspend-individual.
  • Before migrating to the Virtual Switch, confirm that the upstream physical switch is set to spanning-tree portfast or spanning-tree port type edge trunk. Failure to do so may lead to a 30-second network timeout, and the virtual switch migration may fail because it uses a 20-second non-modifiable timer.
  • Ensure that the pre-checks listed in the LCM Prechecks section of the Life Cycle Manager Guide and the Always and Host Disruptive Upgrades types of pre-checks listed in KB-4584 pass for Virtual Switch deployments.

  • For the default virtual switch vs0,
    • All configured uplink ports must be available for connecting the network. In Active-Backup bond type, the active port is selected from any configured uplink port that is linked. Therefore, the virtual switch vs0 can use all the linked ports for communication with other CVMs/hosts.
    • All the host IP addresses in the virtual switch vs0 must be resolvable to the configured gateway using ARP.
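The bond-type and lacp-fallback requirements lend themselves to a scripted pre-check once you have collected the br0 settings from each host. The following Python sketch works on data you supply. It does not query hosts, and the function name, dictionary keys, and host names are hypothetical. It treats balance-tcp as the LACP case, as in the Bond Types table.

```python
def migration_blockers(hosts):
    """Return reasons the br0 settings would block virtual switch migration.

    hosts maps a host name to a dict with these hypothetical keys:
      "bond_type":     for example "active-backup" or "balance-tcp"
      "lacp_fallback": the raw string reported for lacp-fallback, or None
    """
    blockers = []
    # All hosts must use the same bond type on bridge br0.
    bond_types = {cfg["bond_type"] for cfg in hosts.values()}
    if len(bond_types) > 1:
        blockers.append("mixed bond types: " + ", ".join(sorted(bond_types)))
    for name, cfg in sorted(hosts.items()):
        fallback = cfg.get("lacp_fallback")
        # Only the exact, case-sensitive value "True" is accepted with LACP.
        if cfg["bond_type"] == "balance-tcp" and fallback != "True":
            blockers.append(f"{name}: lacp-fallback is {fallback!r}, expected 'True'")
    return blockers
```

An empty list means these two requirements are met. The spanning-tree, LCM pre-check, and ARP requirements still need to be verified separately.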

Virtual Switch Limitations

MTU Restriction

The Nutanix Controller VM uses the standard Ethernet MTU (maximum transmission unit) of 1,500 bytes for all the network interfaces by default. The standard 1,500-byte MTU delivers excellent performance and stability. Nutanix does not support configuring higher values of MTU on the network interfaces of a Controller VM.

You can enable jumbo frames (MTU of 9,000 bytes) on the physical network interfaces of AHV, ESXi, or Hyper-V hosts and guest VMs if the applications on your guest VMs require such higher MTU values. If you choose to use jumbo frames on the hypervisor hosts, enable the jumbo frames end to end in the specified network, considering both the physical and virtual network infrastructure impacted by the change.
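Enabling jumbo frames end to end means that every hop between two endpoints must carry the larger MTU, because the smallest MTU on the path limits the frame size. The following Python sketch illustrates that check on values you have gathered yourself. The hop names and function names are hypothetical.

```python
JUMBO_MTU = 9000


def jumbo_ready(hop_mtus):
    """Return True when every hop on the path has an MTU of at least 9,000 bytes.

    hop_mtus maps a hop name (guest vNIC, bridge, host NIC, switch port,
    and so on) to its configured MTU.
    """
    return all(mtu >= JUMBO_MTU for mtu in hop_mtus.values())


def limiting_hops(hop_mtus):
    """List the hops whose MTU is too small for jumbo frames."""
    return sorted(name for name, mtu in hop_mtus.items() if mtu < JUMBO_MTU)
```

For example, a path of {"guest vNIC": 9000, "host NIC": 9000, "switch port": 1500} is not ready, and limiting_hops reports the switch port as the hop to fix.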

Single-node and two-node cluster configuration.

A virtual switch cannot be deployed if your single-node or two-node cluster has any instantiated user VMs. The virtual switch creation or update process involves a rolling restart, which checks for maintenance mode and whether you can migrate the VMs. On a single-node or two-node cluster, any instantiated user VMs cannot be migrated and the virtual switch operation fails.

Therefore, power down all user VMs for virtual switch operations in a single-node or two-node cluster.

Compute-only node is not supported.

Virtual switch is not compatible with Compute-only (CO) nodes. If a CO node is present in the cluster, then the virtual switches are not deployed (including the default virtual switch). You need to use the net.disable_virtual_switch aCLI command to disable the virtual switch workflow if you want to expand a cluster which has virtual switches and includes a CO node.

The net.disable_virtual_switch aCLI command cleans up all the virtual switch entries from the IDF. All the bridges mapped to the virtual switch or switches are retained as they are.

See Compute-Only Node Configuration (AHV Only).

Including a storage-only node in a VS is not necessary.

Virtual switch is compatible with Storage-only (SO) nodes but you do not need to include an SO node in any virtual switch, including the default virtual switch.

Mixed-mode Clusters with AHV Storage-only Nodes
Consider that you have deployed a mixed-node cluster where the compute-only nodes are ESXi or Hyper-V nodes and the storage-only nodes are AHV nodes. In such a case, the default virtual switch deployment fails.

Without the default VS, the Prism Element, Prism Central and CLI workflows for virtual switch required to manage the bridges and bonds are not available. You need to use the manage_ovs command options to update the bridge and bond configurations on the AHV hosts.

Virtual Switch Management

Virtual switches can be viewed, created, updated, or deleted from both the Prism Web Console and Prism Central.

Virtual Switch Views and Visualization

For information on the virtual switch network visualization in Prism Element Web Console, see the Network Visualization topic in the Prism Web Console Guide.

Virtual Switch Create, Update and Delete Operations

For information about the procedures to create, update and delete a virtual switch in Prism Element Web Console, see the Configuring a Virtual Network for Guest VM Interfaces section in the Prism Web Console Guide.

For information about the procedures to create, update and delete a virtual switch in Prism Central, see the Network Connections section in the Prism Central Guide.

Re-Configuring Bonds Across Hosts Manually

If you are upgrading AOS to 5.20, 6.0 or later, you need to migrate the existing bridges to virtual switches. If there are inconsistent bond configurations across hosts before migration of the bridges, then after migration of bridges the virtual switches may not be properly deployed. To resolve such issues, you must manually configure the bonds to make them consistent.

About this task

Important: Use this procedure only when inconsistent bond configurations in a migrated bridge across the hosts in a cluster are preventing Acropolis (AOS) from deploying the virtual switch for the migrated bridge, and you need to modify those bonds.

Do not use ovs-vsctl commands to make the bridge level changes. Use the manage_ovs commands, instead.

The manage_ovs command allows you to update the cluster configuration. The changes are applied and retained across host restarts. The ovs-vsctl command allows you to update the live running host configuration but does not update the AOS cluster configuration and the changes are lost at host restart. This behavior of ovs-vsctl introduces connectivity issues during maintenance, such as upgrades or hardware replacements.

ovs-vsctl is usually used during a break/fix situation where a host may be isolated on the network and requires a workaround to gain connectivity before the cluster configuration can actually be updated using manage_ovs.

Note: Disable the virtual switch before you attempt to change the bonds or bridge.

If you hit an issue where the virtual switch is automatically re-created after it is disabled (with AOS versions 5.20.0 or 5.20.1), follow steps 1 and 2 below to disable such an automatically re-created virtual switch again before migrating the bridges. For more information, see KB-3263.

Be cautious when using the disable_virtual_switch command because it deletes all the configurations from the IDF, not only for the default virtual switch vs0, but also any virtual switches that you may have created (such as vs1 or vs2). Therefore, before you use the disable_virtual_switch command, ensure that you check a list of existing virtual switches, that you can get using the acli net.get_virtual_switch command.

Complete this procedure on each host Controller VM that is sharing the bridge that needs to be migrated to a virtual switch.

Procedure

  1. To list the virtual switches, use the following command.
    nutanix@cvm$ acli net.list_virtual_switch
  2. Disable all the virtual switches.
    nutanix@cvm$ acli net.disable_virtual_switch 

    This disables all the virtual switches.

    Note: You can use the acli net.delete_virtual_switch vs_name command to delete a specific VS and re-create it with the appropriate bond type.
  3. Change the bond type to align with the same bond type on all the hosts for the specified virtual switch.
    nutanix@cvm$ manage_ovs --bridge_name bridge-name --bond_name bond-name --bond_mode bond-type update_uplinks

    Where:

    • bridge-name: Provide the name of the bridge, such as br0 for the virtual switch on which you want to set the uplink bond mode.
    • bond-name: Provide the name of the uplink port such as br0-up for which you want to set the bond mode.
    • bond-type: Provide the bond mode that you require to be used uniformly across the hosts on the named bridge.

    Use the manage_ovs --help command for help on this command.

    Note: To disable LACP, change the bond type from LACP Active-Active (balance-tcp) to Active-Backup/Active-Active with MAC pinning (balance-slb) by setting the bond_mode using this command as active-backup or balance-slb.

    Ensure that you turn off LACP on the connected ToR switch port as well. To avoid blocking of the bond uplinks during the bond type change on the host, ensure that you follow the ToR switch best practices to enable LACP fallback or passive mode.

    To enable LACP, configure bond-type as balance-tcp (Active-Active) with additional variables --lacp_mode fast and --lacp_fallback true.

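  4. (If migrating to an AOS version earlier than 5.20.2) Check whether the issue described in the note above occurs (the virtual switch is automatically re-created after it is disabled) and, if so, disable the virtual switch again.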
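
The command in step 3 can be assembled programmatically when the same bond mode must be applied on several hosts. The following Python sketch only builds the command string and does not run it. The helper name is hypothetical, the accepted bond modes are limited to the three mentioned in this procedure, and placing the LACP options before update_uplinks is an assumption about argument order.

```python
import shlex

# Bond modes named in this procedure.
VALID_BOND_MODES = {"active-backup", "balance-slb", "balance-tcp"}


def build_update_uplinks_command(bridge_name, bond_name, bond_mode):
    """Build the manage_ovs command line for a uniform bond mode (not executed)."""
    if bond_mode not in VALID_BOND_MODES:
        raise ValueError(f"unsupported bond mode: {bond_mode!r}")
    args = [
        "manage_ovs",
        "--bridge_name", bridge_name,
        "--bond_name", bond_name,
        "--bond_mode", bond_mode,
    ]
    if bond_mode == "balance-tcp":
        # Enabling LACP requires the additional options described in step 3.
        args += ["--lacp_mode", "fast", "--lacp_fallback", "true"]
    args.append("update_uplinks")
    return " ".join(shlex.quote(arg) for arg in args)


print(build_update_uplinks_command("br0", "br0-up", "active-backup"))
# manage_ovs --bridge_name br0 --bond_name br0-up --bond_mode active-backup update_uplinks
```

Generating the command from one place helps keep the bridge name, bond name, and bond mode identical on every host, which is the purpose of this procedure.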

What to do next

After making the bonds consistent across all the hosts configured in the bridge, migrate the bridge or enable the virtual switch. For more information, see:

To check whether LACP is enabled or disabled, use the following command.

nutanix@cvm$ manage_ovs show_uplinks

Enabling LACP and LAG (AHV Only)

If you select the Active-Active bond type, you must enable LACP and LAG on the corresponding ToR switch for each node in the cluster one after the other. This section describes the procedure to enable LAG and LACP in AHV nodes and the connected ToR switch.

About this task

Procedure

  1. Change the uplink Bond Type for the virtual switch.
    1. Open the Edit Virtual Switch window.
      • In Prism Central, open Network & Security > Subnets > Network Configuration > Virtual Switch.
      • In Prism Element or Web Console, open Settings > Network Configuration > Virtual Switch
    2. Click the Edit icon of the virtual switch for which you want to configure LAG and LACP.
    3. On the Edit Virtual Switch page, in the General tab, ensure that the Standard option is selected for the Select Configuration Method parameter. Click Next.
      The Standard configuration method puts each node in maintenance mode before applying the updated settings. After applying the updated settings, the node exits from maintenance mode. See Virtual Switch Workflow.
    4. On the Uplink Configuration tab, in Bond Type, select Active-Active.
    5. Click Save.
    The Active-Active bond type configures all AHV hosts with the fast setting for LACP speed, causing the AHV host to request LACP control packets at the rate of one per second from the physical switch. In addition, the Active-Active bond type configuration sets LACP fallback to Active-Backup on all AHV hosts. You cannot modify these default settings after you have configured them in Prism, even by using the CLI.

    This completes the LAG and LACP configuration on the cluster.

Perform the following steps on each node, one at a time.
  1. Put the node and the Controller VM into maintenance mode.
    Before you put a node in maintenance mode, see Verifying the Cluster Health and carry out the necessary checks.

    See Putting a Node into Maintenance Mode. Step 6 in this procedure puts the Controller VM in maintenance mode.

  2. Change the settings for the interface on the ToR switch that the node connects to, to match the LACP and LAG setting made on the cluster in step 1 above.
    This is an important step. See the documentation provided by the ToR switch vendor for more information about changing the LACP settings of the switch interface that the node is physically connected to.
    • Nutanix recommends that you enable LACP fallback.

    • Consider the LACP time options (slow and fast). If the switch has a fast configuration, set the LACP time to fast. This prevents an outage due to a mismatch between the LACP speeds of the cluster and the ToR switch. Keep in mind that the Active-Active bond type configuration sets the LACP speed of the cluster to fast.

    Verify that LACP negotiation status is negotiated.

  3. Remove the node and Controller VM from maintenance mode.
    See Exiting a Node from the Maintenance Mode. The Controller VM exits maintenance mode during the same process.

What to do next

Do the following after completing the procedure to enable LAG and LACP in all the AHV nodes and the connected ToR switches:
  • Verify that the status of all services on all the CVMs is Up. Run the following command and check that the status of the services is displayed as Up in the output:
    nutanix@cvm$ cluster status
  • Log on to the Prism Element of the node and check that the Data Resiliency Status widget displays OK.
    Figure. Data Resiliency Status

VLAN Configuration

You can set up a VLAN-based segmented virtual network on an AHV node by assigning the ports on virtual bridges managed by virtual switches to different VLANs. VLAN port assignments are configured from the Controller VM that runs on each node.

For best practices associated with VLAN assignments, see AHV Networking Recommendations. For information about assigning guest VMs to a virtual switch and VLAN, see Network Connections in the Prism Central Guide.

Assigning an AHV Host to a VLAN

About this task

Note: Perform the following procedure during a scheduled downtime. Before you begin, stop the cluster. Once the process begins, hosts and CVMs partially lose network access to each other and VM data or storage containers become unavailable until the process completes.

To assign an AHV host to a VLAN, do the following on every AHV host in the cluster:

Procedure

  1. Log on to the AHV host with SSH.
  2. Put the AHV host and the CVM in maintenance mode.
    See Putting a Node into Maintenance Mode for instructions about how to put a node into maintenance mode.
  3. Assign port br0 (the internal port on the default OVS bridge, br0) to the VLAN that you want the host to be on.
    root@ahv# ovs-vsctl set port br0 tag=host_vlan_tag

    Replace host_vlan_tag with the VLAN tag for hosts.

  4. Confirm VLAN tagging on port br0.
    root@ahv# ovs-vsctl list port br0
  5. Check the value of the tag parameter that is shown.
  6. Verify connectivity to the IP address of the AHV host by performing a ping test.
  7. Exit the AHV host and the CVM from the maintenance mode.
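
The check in steps 4 and 5 can be scripted by reading the tag column from the ovs-vsctl list port br0 output. The following Python sketch is a minimal example. The sample output is illustrative and trimmed to a few columns, and the function name is hypothetical. An unset tag is assumed to be printed as [].

```python
def parse_port_tag(list_port_output):
    """Return the VLAN tag from 'ovs-vsctl list port' output, or None if unset."""
    for line in list_port_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "tag":
            value = value.strip()
            return None if value == "[]" else int(value)
    raise ValueError("no tag column found in output")


# Illustrative, trimmed output for a port tagged with VLAN 10.
sample = """\
name                : br0
tag                 : 10
trunks              : []
"""

print(parse_port_tag(sample))  # 10
```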

Assigning the Controller VM to a VLAN

By default, the public interface of a Controller VM is assigned to VLAN 0. To assign the Controller VM to a different VLAN, change the VLAN ID of its public interface. After the change, you can access the public interface from a device that is on the new VLAN.

About this task

Note: Perform the following procedure during a scheduled downtime. Before you begin, stop the cluster. Once the process begins, hosts and CVMs partially lose network access to each other and VM data or storage containers become unavailable until the process completes.
Note: To avoid losing connectivity to the Controller VM, do not change the VLAN ID when you are logged on to the Controller VM through its public interface. To change the VLAN ID, log on to the internal interface that has IP address 192.168.5.254.

Perform these steps on every Controller VM in the cluster. To assign the Controller VM to a VLAN, do the following:

Procedure

  1. Log on to the AHV host with SSH.
  2. Put the AHV host and the Controller VM in maintenance mode.
    See Putting a Node into Maintenance Mode for instructions about how to put a node into maintenance mode.
  3. Check the Controller VM status on the host.
    root@host# virsh list

    An output similar to the following is displayed:

    root@host# virsh list
     Id    Name                           State
    ----------------------------------------------------
     1     NTNX-CLUSTER_NAME-3-CVM            running
     3     3197bf4a-5e9c-4d87-915e-59d4aff3096a running
     4     c624da77-945e-41fd-a6be-80abf06527b9 running
    
    root@host# logout
  4. Log on to the Controller VM.
    root@host# ssh nutanix@192.168.5.254

    Accept the host authenticity warning if prompted, and enter the Controller VM nutanix password.

  5. Assign the public interface of the Controller VM to a VLAN.
    nutanix@cvm$ change_cvm_vlan vlan_id

    Replace vlan_id with the ID of the VLAN to which you want to assign the Controller VM.

    For example, add the Controller VM to VLAN 201.

    nutanix@cvm$ change_cvm_vlan 201
  6. Confirm VLAN tagging on the Controller VM.
    root@host# virsh dumpxml cvm_name

    Replace cvm_name with the CVM name or CVM ID to view the VLAN tagging information.

    Note: Refer to step 3 for Controller VM name and Controller VM ID.

    An output similar to the following is displayed:

    root@host# virsh dumpxml 1 | grep "tag id" -C10 --color
          <target dev='vnet2'/>
          <model type='virtio'/>
          <driver name='vhost' queues='4'/>
          <alias name='net2'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
        </interface>
        <interface type='bridge'>
          <mac address='50:6b:8d:b9:0a:18'/>
          <source bridge='br0'/>
          <vlan>
               <tag id='201'/> 
          </vlan>
          <virtualport type='openvswitch'>
            <parameters interfaceid='c46374e4-c5b3-4e6b-86c6-bfd6408178b5'/>
          </virtualport>
          <target dev='vnet0'/>
          <model type='virtio'/>
          <driver name='vhost' queues='4'/>
          <alias name='net3'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
        </interface>
    root@host#
  7. Check the value of the tag parameter that is shown.
  8. Restart the network service.
    nutanix@cvm$ sudo service network restart
  9. Verify connectivity to the Controller VM's external IP address by performing a ping test from the same subnet. For example, perform a ping from another Controller VM or directly from the host itself.
  10. Exit the AHV host and the Controller VM from the maintenance mode.
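
Instead of reading the virsh dumpxml output by eye in steps 6 and 7, the VLAN tag of each interface can be extracted with a short script. The following Python sketch assumes that the complete domain XML is available as a string. The sample document is a reduced, illustrative version of the output shown in step 6, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def interface_vlan_tags(domain_xml):
    """Map each interface MAC address to the list of VLAN tag IDs set on it."""
    root = ET.fromstring(domain_xml)
    tags = {}
    for iface in root.iter("interface"):
        mac = iface.find("mac")
        if mac is None:
            continue
        tags[mac.get("address")] = [
            int(tag.get("id")) for tag in iface.findall("./vlan/tag")
        ]
    return tags


# Reduced, illustrative domain XML with one tagged interface.
sample = """<domain type='kvm'>
  <devices>
    <interface type='bridge'>
      <mac address='50:6b:8d:b9:0a:18'/>
      <source bridge='br0'/>
      <vlan>
        <tag id='201'/>
      </vlan>
      <virtualport type='openvswitch'/>
      <target dev='vnet0'/>
    </interface>
  </devices>
</domain>"""

print(interface_vlan_tags(sample))  # {'50:6b:8d:b9:0a:18': [201]}
```

An interface without a vlan element maps to an empty list, which corresponds to an untagged interface.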

Enabling RSS Virtio-Net Multi-Queue by Increasing the Number of VNIC Queues

Multi-Queue in VirtIO-net enables you to improve network performance for network I/O-intensive guest VMs or applications running on AHV hosts.

About this task

You can enable VirtIO-net multi-queue by increasing the number of VNIC queues. If an application uses many distinct streams of traffic, Receive Side Scaling (RSS) can distribute the streams across multiple VNIC DMA rings. This increases the amount of RX buffer space by the number of VNIC queues (N). Also, most guest operating systems pin each ring to a particular vCPU, handling the interrupts and ring-walking on that vCPU, thereby achieving N-way parallelism in RX processing. However, if you increase the number of queues beyond the number of vCPUs, you cannot achieve extra parallelism.

The following workloads benefit the most from VirtIO-net multi-queue:

  • VMs where traffic packets are relatively large
  • VMs with many concurrent connections
  • VMs with network traffic moving:
    • Among VMs on the same host
    • Among VMs across hosts
    • From VMs to the hosts
    • From VMs to an external system
  • VMs with high VNIC RX packet drop rate if CPU contention is not the cause

You can increase the number of queues of the AHV VM VNIC to allow the guest OS to use multi-queue VirtIO-net on guest VMs with intensive network I/O. Multi-Queue VirtIO-net scales the network performance by transferring packets through more than one Tx/Rx queue pair at a time as the number of vCPUs increases.

Nutanix recommends that you be conservative when increasing the number of queues. Do not set the number of queues larger than the total number of vCPUs assigned to a VM. Packet reordering and TCP retransmissions increase if the number of queues is larger than the number of vCPUs assigned to a VM. For this reason, start by increasing the number of queues to 2. The default number of queues is 1. After making this change, monitor the guest VM and network performance. Before you increase the number of queues further, verify that the vCPU usage has not dramatically or unreasonably increased.
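
The sizing guidance above can be summarized in a small Python sketch. Both function names are hypothetical. The sketch encodes only the two rules stated in this section: the number of queues must not exceed the number of vCPUs, and the first increase should be to 2 queues.

```python
DEFAULT_QUEUES = 1  # default number of VNIC queues


def is_valid_queue_count(queues, num_vcpus):
    """Return True if the queue count is between 1 and the number of vCPUs."""
    return 1 <= queues <= num_vcpus


def first_step_queue_count(num_vcpus):
    """Return the conservative first target: 2 queues, capped at the vCPU count."""
    if num_vcpus < 1:
        raise ValueError("a VM has at least one vCPU")
    return min(2, num_vcpus)


print(is_valid_queue_count(4, 2))  # False
print(first_step_queue_count(8))   # 2
print(first_step_queue_count(1))   # 1
```

For a VM with a single vCPU, the first-step value equals the default, so increasing the number of queues brings no benefit until more vCPUs are assigned.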

Perform the following steps to make more VNIC queues available to a guest VM. See your guest OS documentation to verify if you must perform extra steps on the guest OS to apply the additional VNIC queues.

Note: You must shut down the guest VM to change the number of queues. Therefore, make this change during a planned maintenance window. The VNIC status might change from Up->Down->Up or a restart of the guest OS might be required to finalize the settings depending on the guest OS implementation requirements.

Procedure

  1. (Optional) Nutanix recommends that you ensure the following:
    1. AHV and AOS are running the latest version.
    2. AHV guest VMs are running the latest version of the Nutanix VirtIO driver package.
      For RSS support, ensure you are running Nutanix VirtIO 1.1.6 or later. See Nutanix VirtIO for Windows for more information about Nutanix VirtIO.
  2. Determine the exact name of the guest VM for which you want to change the number of VNIC queues.
    nutanix@cvm$ acli vm.list

    An output similar to the following is displayed:

    nutanix@cvm$ acli vm.list
    VM name          VM UUID
    ExampleVM1       a91a683a-4440-45d9-8dbe-xxxxxxxxxxxx
    ExampleVM2       fda89db5-4695-4055-a3d4-xxxxxxxxxxxx
    ...
  3. Determine the MAC address of the VNIC and confirm the current number of VNIC queues.
    nutanix@cvm$ acli vm.nic_get VM-name

    Replace VM-name with the name of the VM.

    An output similar to the following is displayed:

    nutanix@cvm$ acli vm.nic_get VM-name
    ...
    mac_addr: "50:6b:8d:2f:zz:zz"
    ...
    (queues: 2)    <- If there is no output of 'queues', the setting is default (1 queue).
    Note: AOS defines queues as the maximum number of Tx/Rx queue pairs (default is 1).
  4. Check the number of vCPUs assigned to the VM.
    nutanix@cvm$ acli vm.get VM-name | grep num_vcpus

    An output similar to the following is displayed:

    nutanix@cvm$ acli vm.get VM-name | grep num_vcpus
    num_vcpus: 1
  5. Shut down the guest VM.
    nutanix@cvm$ acli vm.shutdown VM-name

    Replace VM-name with the name of the VM.

  6. Increase the number of VNIC queues.
    nutanix@cvm$ acli vm.nic_update VM-name vNIC-MAC-address queues=N

    Replace VM-name with the name of the guest VM, vNIC-MAC-address with the MAC address of the VNIC, and N with the number of queues.

    Note: N must be less than or equal to the vCPUs assigned to the guest VM.
  7. Start the guest VM.
    nutanix@cvm$ acli vm.on VM-name

    Replace VM-name with the name of the VM.

  8. Confirm in the guest OS documentation if any additional steps are required to enable multi-queue in VirtIO-net.
    Note: Microsoft Windows has RSS enabled by default.

    For example, for RHEL and CentOS VMs, do the following:

    1. Log on to the guest VM.
    2. Confirm if irqbalance.service is active or not.
      uservm# systemctl status irqbalance.service

      An output similar to the following is displayed:

      irqbalance.service - irqbalance daemon
         Loaded: loaded (/usr/lib/systemd/system/irqbalance.service; enabled; vendor preset: enabled)
         Active: active (running) since Tue 2020-04-07 10:28:29 AEST; Ns ago
    3. Start irqbalance.service if it is not active.
      Note: It is active by default on CentOS VMs. You might have to start it on RHEL VMs.
      uservm# systemctl start irqbalance.service
    4. Run the following command:
      uservm$ ethtool -L ethX combined M

      Replace M with the number of VNIC queues.

    Note the following caveat from the RHEL 7 Virtualization Tuning and Optimization Guide : 5.4. NETWORK TUNING TECHNIQUES document:

    "Currently, setting up a multi-queue virtio-net connection can have a negative effect on the performance of outgoing traffic. Specifically, this may occur when sending packets under 1,500 bytes over the Transmission Control Protocol (TCP) stream."

  9. Monitor the VM performance to make sure that the expected network performance increase is observed and that the guest VM vCPU usage has not increased so much that it impacts the application on the guest VM.
    For assistance with the steps described in this document, or if these steps do not resolve your guest VM network performance issues, contact Nutanix Support.

Changing the IP Address of an AHV Host

Change the IP address, netmask, or gateway of an AHV host.

Before you begin

Perform the following tasks before you change the IP address, netmask, or gateway of an AHV host:
Caution: All Controller VMs and hypervisor hosts must be on the same subnet.
Warning: Ensure that you perform the steps in the exact order as indicated in this document.
  1. Verify the cluster health by following the instructions in KB-2852.

    Do not proceed if the cluster cannot tolerate failure of at least one node.

  2. Put the AHV host into the maintenance mode.

    See Putting a Node into Maintenance Mode for instructions about how to put a node into maintenance mode.

About this task

Perform the following procedure to change the IP address, netmask, or gateway of an AHV host.

Procedure

  1. Edit the settings of port br0, which is the internal port on the default bridge br0.
    1. Log on to the host console as root.

      You can access the hypervisor host console either through IPMI or by attaching a keyboard and monitor to the node.

    2. Open the network interface configuration file for port br0 in a text editor.
      root@ahv# vi /etc/sysconfig/network-scripts/ifcfg-br0
    3. Update entries for host IP address, netmask, and gateway.

      The block of configuration information that includes these entries is similar to the following:

      ONBOOT="yes" 
      NM_CONTROLLED="no" 
      PERSISTENT_DHCLIENT=1
      NETMASK="subnet_mask" 
      IPADDR="host_ip_addr" 
      DEVICE="br0" 
      TYPE="ethernet" 
      GATEWAY="gateway_ip_addr"
      BOOTPROTO="none"
      • Replace host_ip_addr with the IP address for the hypervisor host.
      • Replace subnet_mask with the subnet mask for host_ip_addr.
      • Replace gateway_ip_addr with the gateway address for host_ip_addr.
    4. Save your changes.
    5. Restart network services.

      systemctl restart network.service
    6. Assign the host to a VLAN. For information about how to add a host to a VLAN, see Assigning an AHV Host to a VLAN.
    7. Verify network connectivity by pinging the gateway, other CVMs, and AHV hosts.
  2. Log on to the Controller VM that is running on the AHV host whose IP address you changed and restart genesis.
    nutanix@cvm$ genesis restart

    If the restart is successful, output similar to the following is displayed:

    Stopping Genesis pids [1933, 30217, 30218, 30219, 30241]
    Genesis started on pids [30378, 30379, 30380, 30381, 30403]

    See Controller VM Access for information about how to log on to a Controller VM.

    Genesis takes a few minutes to restart.

  3. Verify if the IP address of the hypervisor host has changed. Run the following nCLI command from any CVM other than the one in the maintenance mode.
    nutanix@cvm$ ncli host list 

    An output similar to the following is displayed:

    nutanix@cvm$ ncli host list 
        Id                        : aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee::1234  
        Uuid                      : ffffffff-gggg-hhhh-iiii-jjjjjjjjjjj 
        Name                      : XXXXXXXXXXX-X 
        IPMI Address              : X.X.Z.3 
        Controller VM Address     : X.X.X.1 
        Hypervisor Address        : X.X.Y.4 <- New IP Address 
    ... 
  4. Stop the Acropolis service on all the CVMs.
    1. Stop the Acropolis service on all the CVMs in the cluster.
      nutanix@cvm$ allssh genesis stop acropolis
      Note: You cannot manage your guest VMs after the Acropolis service is stopped.
    2. Verify if the Acropolis service is DOWN on all the CVMs, except the one in the maintenance mode.
      nutanix@cvm$ cluster status | grep -v UP 

      An output similar to the following is displayed:

      nutanix@cvm$ cluster status | grep -v UP 
      
      2019-09-04 14:43:18 INFO zookeeper_session.py:143 cluster is attempting to connect to Zookeeper 
      
      2019-09-04 14:43:18 INFO cluster:2774 Executing action status on SVMs X.X.X.1, X.X.X.2, X.X.X.3 
      
      The state of the cluster: start 
      
      Lockdown mode: Disabled 
              CVM: X.X.X.1 Up 
                                 Acropolis DOWN       [] 
              CVM: X.X.X.2 Up, ZeusLeader 
                                 Acropolis DOWN       [] 
              CVM: X.X.X.3 Maintenance
  5. From any CVM in the cluster, start the Acropolis service.
    nutanix@cvm$ cluster start 
  6. Verify if all processes on all the CVMs, except the one in the maintenance mode, are in the UP state.
    nutanix@cvm$ cluster status | grep -v UP 
  7. Exit the AHV host and the CVM from the maintenance mode.
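
Before editing ifcfg-br0 in step 1, it can help to validate the new values and generate the configuration block in one place. The following Python sketch is a minimal example that uses the standard ipaddress module. The function name and the example addresses are hypothetical. Checking that the gateway lies in the host subnet is an added sanity check, not a step from this procedure.

```python
import ipaddress


def render_ifcfg_br0(host_ip, netmask, gateway):
    """Validate the addresses and return the ifcfg-br0 block as text."""
    # strict=False lets a host address stand in for its network.
    network = ipaddress.ip_network(f"{host_ip}/{netmask}", strict=False)
    if ipaddress.ip_address(gateway) not in network:
        raise ValueError(f"gateway {gateway} is not in subnet {network}")
    lines = [
        'ONBOOT="yes"',
        'NM_CONTROLLED="no"',
        "PERSISTENT_DHCLIENT=1",
        f'NETMASK="{netmask}"',
        f'IPADDR="{host_ip}"',
        'DEVICE="br0"',
        'TYPE="ethernet"',
        f'GATEWAY="{gateway}"',
        'BOOTPROTO="none"',
    ]
    return "\n".join(lines) + "\n"


# Hypothetical addresses for illustration only.
print(render_ifcfg_br0("10.0.0.5", "255.255.255.0", "10.0.0.1"))
```

The same subnet test can be reused to confirm that the Controller VM addresses and the new host address are on the same subnet, as required in the caution above.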

Virtual Machine Management

The following topics describe various aspects of virtual machine management in an AHV cluster.

Supported Guest VM Types for AHV

The compatibility matrix available on the Nutanix Support portal includes the latest supported AHV guest VM OSes.

AHV Configuration Maximums

The Nutanix configuration maximums available on the Nutanix support portal include all the latest configuration limits applicable to AHV. Select the appropriate AHV version to view version-specific information.

Creating a VM (AHV)

In AHV clusters, you can create a new virtual machine (VM) through the Prism Element web console.

About this task

When creating a VM, you can configure all of its components, such as number of vCPUs and memory, but you cannot attach a volume group to the VM. Attaching a volume group is possible only when you are modifying a VM.

To create a VM, do the following:

Procedure

  1. In the VM dashboard, click the Create VM button.
    Note: This option does not appear in clusters that do not support this feature.
    The Create VM dialog box appears.
    Figure. Create VM Dialog Box

  2. Do the following in the indicated fields:
    1. Name: Enter a name for the VM.
    2. Description (optional): Enter a description for the VM.
    3. Timezone: Select the timezone that you want the VM to use. If you are creating a Linux VM, select (UTC) UTC.
      Note:

      The RTC of Linux VMs must be in UTC, so select the UTC timezone if you are creating a Linux VM.

      Windows VMs preserve the RTC in the local timezone, so set up the Windows VM with the hardware clock pointing to the desired timezone.

    4. Use this VM as an agent VM: Select this option to make this VM an agent VM.

      You can use this option for the VMs that must be powered on before the rest of the VMs (for example, to provide network functions before the rest of the VMs are powered on on the host) and must be powered off after the rest of the VMs are powered off (for example, during maintenance mode operations). Agent VMs are never migrated to any other host in the cluster. If an HA event occurs or the host is put in maintenance mode, agent VMs are powered off and are powered on on the same host once that host comes back to a normal state.

      If an agent VM is powered off, you can manually start that agent VM on another host and the agent VM now permanently resides on the new host. The agent VM is never migrated back to the original host. Note that you cannot migrate an agent VM to another host while the agent VM is powered on.

    5. vCPU(s): Enter the number of virtual CPUs to allocate to this VM.
    6. Number of Cores per vCPU: Enter the number of cores assigned to each virtual CPU.
    7. Memory: Enter the amount of memory (in GiB) to allocate to this VM.
  3. (For GPU-enabled AHV clusters only) To configure GPU access, click Add GPU in the Graphics section, and then do the following in the Add GPU dialog box:
    Figure. Add GPU Dialog Box

    For more information, see GPU and vGPU Support.

    1. To configure GPU pass-through, in GPU Mode, click Passthrough, select the GPU that you want to allocate, and then click Add.
      If you want to allocate additional GPUs to the VM, repeat the procedure as many times as you need to. Make sure that all the allocated pass-through GPUs are on the same host. If all specified GPUs of the type that you want to allocate are in use, you can proceed to allocate the GPU to the VM, but you cannot power on the VM until a VM that is using the specified GPU type is powered off.

      For more information, see GPU and vGPU Support.

    2. To configure virtual GPU access, in GPU Mode, click virtual GPU, select a GRID license, and then select a virtual GPU profile from the list.
      Note: This option is available only if you have installed the GRID host driver on the GPU hosts in the cluster.

      For more information about the NVIDIA GRID host driver installation instructions, see the NVIDIA Grid Host Driver for Nutanix AHV Installation Guide.

      You can assign multiple virtual GPUs to a VM. A vGPU is assigned to the VM only if a vGPU is available when the VM is starting up.

      Before you add multiple vGPUs to the VM, see Multiple Virtual GPU Support and Restrictions for Multiple vGPU Support.

      Note:

      Multiple vGPUs are supported on the same VM only if you select the highest vGPU profile type.

      After you add the first vGPU, to add multiple vGPUs, see Adding Multiple vGPUs to the Same VM.

  4. To attach a disk to the VM, click the Add New Disk button.
    The Add Disks dialog box appears.
    Figure. Add Disk Dialog Box

    Do the following in the indicated fields:
    1. Type: Select the type of storage device, DISK or CD-ROM, from the pull-down list.
      The following fields and options vary depending on whether you choose DISK or CD-ROM.
    2. Operation: Specify the device contents from the pull-down list.
      • Select Clone from ADSF file to copy any file from the cluster that can be used as an image onto the disk.
      • Select Empty CD-ROM to create a blank CD-ROM device. (This option appears only when CD-ROM is selected in the previous field.) A CD-ROM device is needed when you intend to provide a system image from CD-ROM.
      • Select Allocate on Storage Container to allocate space without specifying an image. (This option appears only when DISK is selected in the previous field.) Selecting this option means you are allocating space only. You have to provide a system image later from a CD-ROM or other source.
      • Select Clone from Image Service to copy an image that you have imported by using image service feature onto the disk. For more information about the Image Service feature, see Configuring Images and Image Management in the Prism Self Service Administration Guide.
    3. Bus Type: Select the bus type from the pull-down list. The choices are IDE, SCSI, SATA, or PCI, depending on the device type.

      The options displayed in the Bus Type drop-down list vary based on the storage device selected in the Type field (step 1).

      • For device Disk, select from SCSI, SATA, PCI, or IDE bus type.
      • For device CD-ROM, you can select either the IDE or SATA bus type.
      Note: SCSI bus is the preferred bus type and it is used in most cases. Ensure you have installed the VirtIO drivers in the guest OS.
      Caution: Use SATA, PCI, or IDE for compatibility purposes when the guest OS does not have VirtIO drivers to support SCSI devices. This may have performance implications.
      Note: For AHV 5.16 and later, you cannot use an IDE device if Secure Boot is enabled for UEFI Mode boot configuration.
    4. ADSF Path: Enter the path to the desired system image.
      This field appears only when Clone from ADSF file is selected. It specifies the image to copy. Enter the path name as /storage_container_name/iso_name.iso. For example, to clone an image from myos.iso in a storage container named crt1, enter /crt1/myos.iso. When you type the storage container name (/storage_container_name/), a list of the ISO files in that storage container appears (assuming one or more ISO files were previously copied to that storage container).
    5. Image: Select the image that you have created by using the image service feature.
      This field appears only when Clone from Image Service is selected. It specifies the image to copy.
    6. Storage Container: Select the storage container to use from the pull-down list.
      This field appears only when Allocate on Storage Container is selected. The list includes all storage containers created for this cluster.
    7. Size: Enter the disk size in GiB.
    8. When all the field entries are correct, click the Add button to attach the disk to the VM and return to the Create VM dialog box.
    9. Repeat this step to attach additional devices to the VM.
  5. Select one of the following firmware to boot the VM.
    • Legacy BIOS: Select legacy BIOS to boot the VM with legacy BIOS firmware.
    • UEFI: Select UEFI to boot the VM with UEFI firmware. UEFI firmware supports larger hard drives and faster boot times, and provides more security features. For more information about UEFI firmware, see UEFI Support for VM.
    • Secure Boot is supported with AOS 5.16. Current support for Secure Boot is limited to the aCLI. For more information about Secure Boot, see Secure Boot Support for VMs. To enable Secure Boot, do the following:
      • Select UEFI.
      • Power off the VM.
      • Log on to the aCLI and update the VM to enable Secure Boot. For more information, see Updating a VM to Enable Secure Boot in the AHV Administration Guide.
  6. To create a network interface for the VM, click the Add New NIC button.
    The Create NIC dialog box appears.
    Figure. Create NIC Dialog Box

    Do the following in the indicated fields:
    1. Subnet Name: Select the target virtual LAN from the drop-down list.
      The list includes all defined networks (see Network Configuration For VM Interfaces).
      Note: Selecting an IPAM-enabled subnet from the drop-down list displays the Private IP Assignment information, which shows the number of free IP addresses available in the subnet and in the IP pool.
    2. Network Connection State: Select the state for the network that you want it to operate in after VM creation. The options are Connected or Disconnected.
    3. Private IP Assignment: This is a read-only field and displays the following:
      • Network Address/Prefix: The network IP address and prefix.
      • Free IPs (Subnet): The number of free IP addresses in the subnet.
      • Free IPs (Pool): The number of free IP addresses available in the IP pools for the subnet.
    4. Assignment Type: This field applies to IPAM-enabled networks. Select Assign with DHCP to assign an IP address automatically to the VM using DHCP. For more information, see IP Address Management.
    5. When all the field entries are correct, click the Add button to create a network interface for the VM and return to the Create VM dialog box.
    6. Repeat this step to create additional network interfaces for the VM.
    Note: Nutanix guarantees a unique VM MAC address within a cluster. Two VMs in different clusters can have the same MAC address.
    Note: The Acropolis leader generates the MAC address for the VM on AHV. The first 24 bits of the MAC address are set to 50-6b-8d (0101 0000 0110 1011 1000 1101) and are reserved by Nutanix, the 25th bit is set to 1 (reserved by the Acropolis leader), and the 26th through 48th bits are auto-generated random numbers.
  7. To configure affinity policy for this VM, click Set Affinity.
    1. Select the host or hosts on which you want to configure the affinity for this VM.
    2. Click Save.
      The selected host or hosts are listed. This configuration is permanent. The VM is not moved from this host or hosts even in case of an HA event. The configuration takes effect once the VM starts.
  8. To customize the VM by using Cloud-init (for Linux VMs) or Sysprep (for Windows VMs), select the Custom Script check box.
    Fields required for configuring Cloud-init and Sysprep, such as options for specifying a configuration script or answer file and text boxes for specifying paths to required files, appear below the check box.
    Figure. Create VM Dialog Box (custom script fields)

  9. To specify a user data file (Linux VMs) or answer file (Windows VMs) for unattended provisioning, do one of the following:
    • If you uploaded the file to a storage container on the cluster, click ADSF path, and then enter the path to the file.

      Enter the ADSF prefix (adsf://) followed by the absolute path to the file. For example, if the user data is in /home/my_dir/cloud.cfg, enter adsf:///home/my_dir/cloud.cfg. Note the use of three slashes.

    • If the file is available on your local computer, click Upload a file, click Choose File, and then upload the file.
    • If you want to create or paste the contents of the file, click Type or paste script, and then use the text box that is provided.
  10. To copy one or more files to a location on the VM (Linux VMs) or to a location in the ISO file (Windows VMs) during initialization, do the following:
    1. In Source File ADSF Path, enter the absolute path to the file.
    2. In Destination Path in VM, enter the absolute path to the target directory and the file name.
      For example, if the source file entry is /home/my_dir/myfile.txt, then the entry for the Destination Path in VM should be /<directory_name>/<copy_destination>, for example, /mnt/myfile.txt.
    3. To add another file or directory, click the button beside the destination path field. In the new row that appears, specify the source and target details.
  11. When all the field entries are correct, click the Save button to create the VM and close the Create VM dialog box.
    The new VM appears in the VM table view.
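
The MAC address layout described in the NIC step above (a 50-6b-8d prefix followed by a 25th bit set to 1) can be checked with a short script. The following is a minimal sketch, not a Nutanix tool: the function names are hypothetical, and it assumes that bits are counted from the most significant bit of the first octet, so the 25th bit is the top bit of the fourth octet.

```python
# Illustrative sketch only; helper names are hypothetical.
NUTANIX_OUI = bytes.fromhex("506b8d")  # first 24 bits, as stated in the note


def parse_mac(mac: str) -> bytes:
    """Parse a MAC address written with ':' or '-' separators into 6 bytes."""
    raw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(raw) != 6:
        raise ValueError(f"expected 6 octets, got {len(raw)}")
    return raw


def looks_acropolis_generated(mac: str) -> bool:
    """Return True if the prefix is 50-6b-8d and the 25th bit is 1."""
    raw = parse_mac(mac)
    has_prefix = raw[:3] == NUTANIX_OUI
    # Assumption: bit 25 is the most significant bit of the fourth octet.
    bit25_set = bool(raw[3] & 0x80)
    return has_prefix and bit25_set


def prefix_bits(mac: str) -> str:
    """Render the first 24 bits in groups of four binary digits."""
    raw = parse_mac(mac)
    bits = "".join(f"{octet:08b}" for octet in raw[:3])
    return " ".join(bits[i:i + 4] for i in range(0, 24, 4))
```

For example, prefix_bits("50:6b:8d:a1:b2:c3") renders the prefix as 0101 0000 0110 1011 1000 1101, and looks_acropolis_generated returns True for that address because the fourth octet (a1) has its top bit set.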

Managing a VM (AHV)

You can use the web console to manage virtual machines (VMs) in Acropolis managed clusters.

About this task

After creating a VM, you can use the web console to start or shut down the VM, launch a console window, update the VM configuration, take a snapshot, attach a volume group, migrate the VM, clone the VM, or delete the VM.

Note: Your available options depend on the VM status, type, and permissions. Unavailable options are grayed out.

To accomplish one or more of these tasks, do the following:

Procedure

  1. In the VM dashboard, click the Table view.
  2. Select the target VM in the table (top section of screen).
    The Summary line (middle of screen) displays the VM name with a set of relevant action links on the right. You can also right-click on a VM to select a relevant action.

    The possible actions are Manage Guest Tools, Launch Console, Power on (or Power off), Take Snapshot, Migrate, Clone, Update, and Delete.

    Note: VM pause and resume feature is not supported on AHV.
    The following steps describe how to perform each action.
    Figure. VM Action Links

  3. To manage guest tools, click Manage Guest Tools and do the following.
    You can also enable NGT applications (self-service restore, Volume Snapshot Service, and application-consistent snapshots) as part of managing guest tools.
    1. Select the Enable Nutanix Guest Tools check box to enable NGT on the selected VM.
    2. Select Mount Nutanix Guest Tools to mount NGT on the selected VM.
      Ensure that the VM has at least one empty IDE CD-ROM slot to attach the ISO.
      The VM is registered with the NGT service. NGT is enabled and mounted on the selected virtual machine. A CD with volume label NUTANIX_TOOLS gets attached to the VM.
    3. To enable the self-service restore feature for Windows VMs, select the Self Service Restore (SSR) check box.
      The Self-Service Restore feature is enabled on the VM. The guest VM administrator can restore the desired file or files from the VM. For more information about the self-service restore feature, see Self-Service Restore in the Data Protection and Recovery with Prism Element guide.

    4. After you select the Enable Nutanix Guest Tools check box, the VSS snapshot feature is enabled by default.
      After this feature is enabled, the Nutanix native in-guest VmQuiesced Snapshot Service (VSS) agent takes snapshots for VMs that support VSS.
      Note:

      The AHV VM snapshots are not application consistent. The AHV snapshots are taken from the VM entity menu by selecting a VM and clicking Take Snapshot.

      The application consistent snapshots feature is available with Protection Domain based snapshots and Recovery Points in Prism Central. For more information, see Conditions for Application-consistent Snapshots in the Data Protection and Recovery with Prism Element guide.

    5. Click Submit.
      The VM is registered with the NGT service. NGT is enabled and mounted on the selected virtual machine. A CD with volume label NUTANIX_TOOLS gets attached to the VM.
      Note:
      If you eject the CD, you can mount it again by logging on to the Controller VM and running the following nCLI command.
      nutanix@cvm$ ncli ngt mount vm-id=virtual_machine_id

      For example, to mount the NGT on the VM with VM_ID=00051a34-066f-72ed-0000-000000005400::38dc7bf2-a345-4e52-9af6-c1601e759987, type the following command.

      nutanix@cvm$ ncli ngt mount vm-id=00051a34-066f-72ed-0000-000000005400::38dc7bf2-a345-4e52-9af6-c1601e759987
  4. To launch a console window, click the Launch Console action link.
    This opens a Virtual Network Computing (VNC) client and displays the console in a new tab or window. This option is available only when the VM is powered on. The console window includes four menu options (top right):
    • Clicking the Mount ISO button displays the following window that allows you to mount an ISO image to the VM. To mount an image, select the desired image and CD-ROM drive from the pull-down lists and then click the Mount button.
      Figure. Mount Disk Image Window

      Note: For information about how to select CD-ROM as the storage device when you intend to provide a system image from CD-ROM, see Add New Disk in Creating a VM (AHV).
    • Clicking the C-A-D icon button sends a Ctrl+Alt+Del command to the VM.
    • Clicking the camera icon button takes a screenshot of the console window.
    • Clicking the power icon button allows you to power on/off the VM. These are the same options that you can access from the Power On Actions or Power Off Actions action link below the VM table (see next step).
    Figure. Virtual Network Computing (VNC) Window

  5. To start or shut down the VM, click the Power on (or Power off) action link.

    Power on begins immediately. If you power off the VM, you are prompted to select one of the following options:

    • Power Off. Hypervisor performs a hard power off action on the VM.
    • Power Cycle. Hypervisor performs a hard restart action on the VM.
    • Reset. Hypervisor performs an ACPI reset action through the BIOS on the VM.
    • Guest Shutdown. Operating system of the VM performs a graceful shutdown.
    • Guest Reboot. Operating system of the VM performs a graceful restart.
    Note: If you perform power operations such as Guest Reboot or Guest Shutdown by using the Prism Element web console or API on Windows VMs, these operations might silently fail without any error messages if at that time a screen saver is running in the Windows VM. Perform the same power operations again immediately, so that they succeed.
  6. To make a snapshot of the VM, click the Take Snapshot action link.

    For more information, see Virtual Machine Snapshots.

  7. To migrate the VM to another host, click the Migrate action link.
    This displays the Migrate VM dialog box. Select the target host from the pull-down list (or select the System will automatically select a host option to let the system choose the host) and then click the Migrate button to start the migration.
    Figure. Migrate VM Dialog Box

    Note: Nutanix recommends live migrating VMs when they are under light load. If VMs are migrated while heavily utilized, migration may fail because of limited bandwidth.
  8. To clone the VM, click the Clone action link.
    This displays the Clone VM dialog box, which includes the same fields as the Create VM dialog box plus a field for the number of clones. All fields except the name are filled in with the current VM settings. Enter a name for the clone and the number of clones required, and then click the Save button to create the clone. You can create a modified clone by changing some of the settings. You can also customize the VM during initialization by providing a custom script and specifying files needed during the customization process.
    Figure. Clone VM Dialog Box

  9. To modify the VM configuration, click the Update action link.

    The Update VM dialog box appears, which includes the same fields as the Create VM dialog box. Modify the configuration as needed, and then save the configuration. In addition to modifying the configuration, you can attach a volume group to the VM and enable flash mode on the VM. If you attach a volume group to a VM that is part of a protection domain, the VM is not protected automatically. Add the VM to the same Consistency Group manually.

    (For GPU-enabled AHV clusters only) You can add pass-through GPUs if a VM is already using GPU pass-through. You can also change the GPU configuration from pass-through to vGPU or vGPU to pass-through, change the vGPU profile, add more vGPUs, and change the specified vGPU license. However, you need to power off the VM before you perform these operations.

    • Before you add multiple vGPUs to the VM, see Multiple Virtual GPU Support and Restrictions for Multiple vGPU Support in the AHV Administration Guide.

    • Multiple vGPUs are supported on the same VM only if you select the highest vGPU profile type.

    • For more information on vGPU profile selection, see:

      • Virtual GPU Types for Supported GPUs in the NVIDIA Virtual GPU Software User Guide on NVIDIA's Virtual GPU Software Documentation webpage, and

      • GPU and vGPU Support in the AHV Administration Guide.

    • After you add the first vGPU, to add multiple vGPUs, see Adding Multiple vGPUs to the Same VM in the AHV Administration Guide.

    You can add new network adapters or NICs using the Add New NIC option. You can also modify the network used by an existing NIC. See Limitation for vNIC Hot-Unplugging and Creating a VM (AHV) before you modify the NIC network or create a new NIC for a VM.

    Figure. VM Update Dialog Box

    Note: If you delete a vDisk attached to a VM and snapshots associated with this VM exist, space associated with that vDisk is not reclaimed unless you also delete the VM snapshots.
    To increase the memory allocation and the number of vCPUs on your VMs while the VMs are powered on (hot-pluggable), do the following:
    1. In the vCPUs field, you can increase the number of vCPUs on your VMs while the VMs are powered on.
    2. In the Number of Cores Per vCPU field, you can change the number of cores per vCPU only if the VMs are powered off.
      Note: This is not a hot-pluggable feature.
    3. In the Memory field, you can increase the memory allocation on your VMs while the VMs are powered on.
    For more information about hot-pluggable vCPUs and memory, see Virtual Machine Memory and CPU Hot-Plug Configurations in the AHV Administration Guide.
    To attach a volume group to the VM, do the following:
    1. In the Volume Groups section, click Add volume group, and then do one of the following:
      • From the Available Volume Groups list, select the volume group that you want to attach to the VM.
      • Click Create new volume group, and then, in the Create Volume Group dialog box, create a volume group (see Creating a Volume Group). After you create a volume group, select it from the Available Volume Groups list.
      Repeat these steps until you have added all the volume groups that you want to attach to the VM.
    2. Click Add.
    1. To enable flash mode on the VM, click the Enable Flash Mode check box.
      • After you enable this feature on the VM, the status is updated in the VM table view. To view the status of individual virtual disks (disks that are flashed to the SSD), go to the Virtual Disks tab in the VM table view.
      • You can disable the flash mode feature for individual virtual disks. To update the flash mode for individual virtual disks, click the update disk icon in the Disks pane and deselect the Enable Flash Mode check box.
  10. To delete the VM, click the Delete action link. A window prompt appears; click the OK button to delete the VM.
    The deleted VM disappears from the list of VMs in the table.
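
The NGT mount command shown in step 3 is easy to mistype because the VM ID is long and tends to wrap across lines when copied. The following is a minimal sketch of a helper that rejoins a wrapped ID and composes the command line. The function name is hypothetical, and the ID pattern (two UUID-like groups joined by "::") is an assumption inferred only from the example ID in this guide, not a documented format.

```python
import re

# Assumed shape, inferred from the example ID in this guide only:
# two 8-4-4-4-12 hexadecimal groups joined by "::".
_UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
_VM_ID = re.compile(rf"{_UUID}::{_UUID}", re.IGNORECASE)


def ngt_mount_command(vm_id: str) -> str:
    """Compose the nCLI command that remounts the NGT CD for a VM ID."""
    # Remove spaces and line breaks introduced by wrapped copy and paste.
    cleaned = "".join(vm_id.split())
    if not _VM_ID.fullmatch(cleaned):
        raise ValueError(f"unexpected VM ID format: {cleaned!r}")
    return f"ncli ngt mount vm-id={cleaned}"
```

Run the resulting command on the Controller VM, as described in step 3. The helper only builds the string and does not contact the cluster.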

Limitation for vNIC Hot-Unplugging

If you detach (hot-unplug) the vNIC of a VM that has a guest OS installed, AOS displays the detach result as successful, but whether the detach actually succeeds depends on the status of the ACPI mechanism in the guest OS.

The following table describes the vNIC detach observations and workaround applicable based on guest OS response to ACPI request:

Table 1. vNIC Detach - Observations and Workaround
Detach procedure followed: vNIC Detach (hot-unplug)
  • Using Prism Central: See Managing a VM (AHV) topic in Prism Central Guide.
  • Using Prism Element web console: See Managing a VM (AHV).
  • Using acli: Log on to the CVM with SSH and run the following command:

    nutanix@cvm$ acli vm.nic_delete <vm_name> <nic mac address>

    or,

    nutanix@cvm$ acli vm.nic_update <vm_name> <nic mac address> connected=false

    Replace the following attributes in the above commands:

    • <vm_name> with the name of the guest VM for which the vNIC is to be detached.
    • <nic mac address> with the vNIC MAC address that needs to be detached.

AOS behavior (in both cases): vNIC detach is successful. Observe the following logs:

    Device detached successfully

Guest OS responds to ACPI request: Yes
  • Actual detach result: vNIC detach is successful.
  • Workaround: No action needed.

Guest OS responds to ACPI request: No
  • Actual detach result: vNIC detach is not successful.
  • Workaround: Power cycle the VM for successful vNIC detach.
Note: In most cases, it is observed that the ACPI mechanism failure occurs when no guest OS is installed on the VM.
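
The detach behavior in Table 1 can be summarized as a small lookup, which may help when scripting checks around vNIC removal. The following is a minimal sketch under stated assumptions: the class and function names are hypothetical, the command strings only mirror the two acli commands shown in the table, and nothing here runs against a cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetachOutcome:
    aos_reports_success: bool  # what AOS displays
    actually_detached: bool    # what really happened in the guest
    workaround: str


def detach_outcome(guest_responds_to_acpi: bool) -> DetachOutcome:
    """Map the guest OS ACPI response to the outcome listed in Table 1."""
    if guest_responds_to_acpi:
        return DetachOutcome(True, True, "No action needed")
    # AOS still reports success, but the vNIC stays attached.
    return DetachOutcome(True, False, "Power cycle the VM")


def nic_delete_command(vm_name: str, nic_mac: str) -> str:
    """Build the acli command that deletes the vNIC."""
    return f"acli vm.nic_delete {vm_name} {nic_mac}"


def nic_disconnect_command(vm_name: str, nic_mac: str) -> str:
    """Build the acli command that sets the vNIC to disconnected."""
    return f"acli vm.nic_update {vm_name} {nic_mac} connected=false"
```

The point of the lookup is that aos_reports_success is true in both cases, so the displayed result alone cannot confirm that the vNIC was removed.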

Virtual Machine Snapshots

You can generate snapshots of virtual machines (VMs) manually or automatically. Some of the purposes that VM snapshots serve are as follows:

  • Disaster recovery
  • Testing, as a safe restoration point in case something goes wrong during testing
  • Migrating VMs
  • Creating multiple instances of a VM

A snapshot is a point-in-time state of entities such as VMs and volume groups, and is used for restoration and replication of data. You can generate snapshots and store them locally or remotely. Snapshots are a mechanism to capture the delta changes that have occurred over time. Snapshots are primarily used for data protection and disaster recovery. Snapshots are not autonomous like backups, in the sense that they depend on the underlying VM infrastructure and other snapshots to restore the VM. Snapshots consume fewer resources than a full autonomous backup. Typically, a VM snapshot captures the following:

  • The state including the power state (for example, powered-on, powered-off, suspended) of the VMs.
  • The data includes all the files that make up the VM. This data also includes the data from disks, configurations, and devices, such as virtual network interface cards.

For more information about creating VM snapshots, see Creating a VM Snapshot Manually section in the Prism Web Console Guide.

VM Snapshots and Snapshots for Disaster Recovery

The VM Dashboard only allows you to generate VM snapshots manually. You cannot select VMs and schedule snapshots of the VMs using the VM dashboard. The snapshots generated manually have very limited utility.

Note: These snapshots (stored locally) cannot be replicated to other sites.

You can schedule and generate snapshots as a part of the disaster recovery process using Nutanix DR solutions. AOS generates snapshots when you protect a VM with a protection domain using the Data Protection dashboard in Prism Web Console (see the Data Protection and Recovery with Prism Element guide). Similarly, AOS generates Recovery Points (snapshots are called Recovery Points in Prism Central) when you protect a VM with a protection policy using the Data Protection dashboard in Prism Central (see the Leap Administration Guide).

For example, in the Data Protection dashboard in Prism Web Console, you can create schedules to generate snapshots using various RPO schemes, such as asynchronous replication with frequency intervals of 60 minutes or more, or NearSync replication with frequency intervals from as low as 20 seconds up to 15 minutes. These schemes create snapshots in addition to the ones generated by the schedules. For example, asynchronous replication schedules generate snapshots according to the configured schedule and, in addition, an extra snapshot every 6 hours. Similarly, NearSync generates snapshots according to the configured schedule and also generates one extra snapshot every hour.

Similarly, you can use the options in the Data Protection section of Prism Central to generate Recovery Points using the same RPO schemes.
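
The interval ranges and extra-snapshot cadence described above can be expressed as a short calculation. The following is a minimal sketch that uses only the figures stated in this section (NearSync from 20 seconds to 15 minutes, asynchronous at 60 minutes or more, one extra snapshot per hour for NearSync and one per 6 hours for asynchronous). The function names are hypothetical, and intervals between 15 and 60 minutes are treated as unspecified because this section does not describe them.

```python
from typing import Optional

NEARSYNC_MIN_S = 20        # 20 seconds
NEARSYNC_MAX_S = 15 * 60   # 15 minutes
ASYNC_MIN_S = 60 * 60      # 60 minutes

# Hours between the extra snapshots that each scheme adds on top of
# the configured schedule.
EXTRA_SNAPSHOT_EVERY_HOURS = {"Asynchronous": 6, "NearSync": 1}


def rpo_scheme(interval_s: int) -> Optional[str]:
    """Return the scheme that matches a schedule interval, if any."""
    if NEARSYNC_MIN_S <= interval_s <= NEARSYNC_MAX_S:
        return "NearSync"
    if interval_s >= ASYNC_MIN_S:
        return "Asynchronous"
    return None  # not covered by the ranges in this section


def extra_snapshots_per_day(scheme: str) -> int:
    """Count the extra snapshots a scheme generates in 24 hours."""
    return 24 // EXTRA_SNAPSHOT_EVERY_HOURS[scheme]
```

Under these figures, an asynchronous schedule adds 4 extra snapshots per day and a NearSync schedule adds 24, in addition to the snapshots taken at the configured interval.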

Windows VM Provisioning

Nutanix VirtIO for Windows

Nutanix VirtIO is a collection of drivers for paravirtual devices that enhance the stability and performance of virtual machines on AHV.

Nutanix VirtIO is available in two formats:

  • To install Windows in a VM on AHV, use the VirtIO ISO.
  • To update VirtIO for Windows, use the VirtIO MSI installer file.

Use Nutanix Guest Tools (NGT) to install the Nutanix VirtIO package. For more information about installing the Nutanix VirtIO package using NGT, see NGT Installation in the Prism Web Console Guide.

VirtIO Requirements

Requirements for Nutanix VirtIO for Windows.

VirtIO supports the following operating systems:

  • Microsoft Windows server version: Windows 2008 R2 or later
  • Microsoft Windows client version: Windows 7 or later
Note: On Windows 7 and Windows Server 2008 R2, install Microsoft KB3033929 or update the operating system with the latest Windows Update to enable support for SHA2 certificates.
Caution: The VirtIO installation or upgrade may fail if multiple Windows VSS snapshots are present in the guest VM. The installation or upgrade failure is due to the timeout that occurs during installation of Nutanix VirtIO SCSI pass-through controller driver.

It is recommended to clean up the VSS snapshots or temporarily disconnect the drive that contains the snapshots. Ensure that you only delete the snapshots that are no longer needed. For more information about how to observe the VirtIO installation or upgrade failure that occurs due to availability of multiple Windows VSS snapshots, see KB-12374.

Installing or Upgrading Nutanix VirtIO for Windows

Download Nutanix VirtIO and the Nutanix VirtIO Microsoft installer (MSI). The MSI installs and upgrades the Nutanix VirtIO drivers.

Before you begin

Make sure that your system meets the VirtIO requirements described in VirtIO Requirements.

About this task

If you have already installed Nutanix VirtIO, use the following procedure to upgrade VirtIO to the latest version. If you have not yet installed Nutanix VirtIO, use the following procedure to install Nutanix VirtIO.

Procedure

  1. Go to the Nutanix Support portal and select Downloads > AHV and click VirtIO.
  2. Select the appropriate VirtIO package.
    • If you are creating a new Windows VM, download the ISO file. The installer is available on the ISO if your VM does not have Internet access.
    • If you are updating drivers in a Windows VM, download the MSI installer file.
    Figure. Search filter and VirtIO options

  3. Run the selected package.
    • For the ISO: Upload the ISO to the cluster, as described in the Configuring Images topic in Prism Web Console Guide.
    • For the MSI: Open the downloaded file to run the MSI.
  4. Read and accept the Nutanix VirtIO license agreement. Click Install.
    Figure. Nutanix VirtIO Windows Setup Wizard

    The Nutanix VirtIO setup wizard shows a status bar and completes installation.

Manually Installing or Upgrading Nutanix VirtIO

Manually install or upgrade Nutanix VirtIO.

Before you begin

Make sure that your system meets the VirtIO requirements described in VirtIO Requirements.

About this task

Note: To automatically install Nutanix VirtIO, see Installing or Upgrading Nutanix VirtIO for Windows.

If you have already installed Nutanix VirtIO, use the following procedure to upgrade VirtIO to the latest version. If you have not yet installed Nutanix VirtIO, use the following procedure to install Nutanix VirtIO.

Procedure

  1. Go to the Nutanix Support portal, select Downloads > AHV, and click VirtIO.
  2. Do one of the following:
    • Extract the VirtIO ISO into the same VM where you load Nutanix VirtIO, for easier installation.

      If you choose this option, proceed directly to step 7.

    • Download the VirtIO ISO for Windows to your local machine.

      If you choose this option, proceed to step 3.

  3. Upload the ISO to the cluster, as described in the Configuring Images topic of Prism Web Console Guide.
  4. Locate the VM where you want to install the Nutanix VirtIO ISO and update the VM.
  5. Add the Nutanix VirtIO ISO by clicking Add New Disk and complete the indicated fields.
    • TYPE: CD-ROM
    • OPERATION: CLONE FROM IMAGE SERVICE
    • BUS TYPE: IDE
    • IMAGE: Select the Nutanix VirtIO ISO
  6. Click Add.
  7. Log on to the VM and browse to Control Panel > Device Manager.
  8. Open the devices and select the specific Nutanix drivers for download. For each device, right-click, select Update Driver Software, and browse to the drive containing the VirtIO ISO. For each device, follow the wizard instructions until you receive installation confirmation.
    Note: Select the x86 subdirectory for 32-bit Windows, or the amd64 subdirectory for 64-bit Windows.
    1. System Devices > Nutanix VirtIO Balloon Drivers
    2. Network Adapter > Nutanix VirtIO Ethernet Adapter
    3. Processors > Storage Controllers > Nutanix VirtIO SCSI pass-through Controller
      The Nutanix VirtIO SCSI pass-through controller prompts you to restart your system. Restart at any time to install the controller.
      Figure. List of Nutanix VirtIO downloads

Creating a Windows VM on AHV with Nutanix VirtIO

Create a Windows VM in AHV, or migrate a Windows VM from a non-Nutanix source to AHV, with the Nutanix VirtIO drivers.

Before you begin

  • Upload the Windows installer ISO to your cluster as described in the Configuring Images topic in the Prism Web Console Guide.
  • Upload the Nutanix VirtIO ISO to your cluster as described in the Configuring Images topic in the Prism Web Console Guide.

About this task

To install a new or migrated Windows VM with Nutanix VirtIO, complete the following.

Procedure

  1. Log on to the Prism web console using your Nutanix credentials.
  2. At the top-left corner, click Home > VM.
    The VM page appears.
  3. Click + Create VM in the corner of the page.
    The Create VM dialog box appears.
    Figure. Create VM dialog box

  4. Complete the indicated fields.
    1. NAME: Enter a name for the VM.
    2. Description (optional): Enter a description for the VM.
    3. Timezone: Select the timezone that you want the VM to use. If you are creating a Linux VM, select (UTC) UTC.
      Note:

      The RTC of Linux VMs must be in UTC, so select the UTC timezone if you are creating a Linux VM.

      Windows VMs preserve the RTC in the local timezone, so set up the Windows VM with the hardware clock pointing to the desired timezone.

    4. Number of Cores per vCPU: Enter the number of cores assigned to each virtual CPU.
    5. MEMORY: Enter the amount of memory for the VM (in GiB).
  5. If you are creating a Windows VM, add a Windows CD-ROM to the VM.
    1. Click the pencil icon next to the CD-ROM that is already present and fill out the indicated fields.
      • OPERATION: CLONE FROM IMAGE SERVICE
      • BUS TYPE: IDE
      • IMAGE: Select the Windows OS install ISO.
    2. Click Update.
      The current CD-ROM opens in a new window.
  6. Add the Nutanix VirtIO ISO.
    1. Click Add New Disk and complete the indicated fields.
      • TYPE: CD-ROM
      • OPERATION: CLONE FROM IMAGE SERVICE
      • BUS TYPE: IDE
      • IMAGE: Select the Nutanix VirtIO ISO.
    2. Click Add.
  7. Add a new disk for the hard drive.
    1. Click Add New Disk and complete the indicated fields.
      • TYPE: DISK
      • OPERATION: ALLOCATE ON STORAGE CONTAINER
      • BUS TYPE: SCSI
      • STORAGE CONTAINER: Select the appropriate storage container.
      • SIZE: Enter the number for the size of the hard drive (in GiB).
    2. Click Add to add the disk drive.
  8. If you are migrating a VM, create a disk from the disk image.
    1. Click Add New Disk and complete the indicated fields.
      • TYPE: DISK
      • OPERATION: CLONE FROM IMAGE SERVICE
      • BUS TYPE: SCSI
      • IMAGE: Click the drop-down menu and choose the image you created previously.
    2. Click Add to add the disk drive.
  9. Optionally, after you have migrated or created a VM, add a network interface card (NIC).
    1. Click Add New NIC.
    2. In the VLAN ID field, choose the VLAN ID according to network requirements and enter the IP address, if necessary.
    3. Click Add.
  10. Click Save.
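
The device layout built in steps 5 through 7 (two IDE CD-ROMs cloned from the image service and one SCSI disk allocated on a storage container) can be written down as data and checked against the bus type rules listed in Creating a VM (AHV): SCSI, SATA, PCI, or IDE for a disk, and IDE or SATA for a CD-ROM. The following is a minimal sketch, not a Prism or aCLI payload format. The dictionary keys, the image names, and the 40 GiB size are hypothetical placeholders.

```python
# Bus types allowed per device type, as listed in Creating a VM (AHV).
ALLOWED_BUSES = {
    "DISK": {"SCSI", "SATA", "PCI", "IDE"},
    "CD-ROM": {"IDE", "SATA"},
}


def validate_disks(disks):
    """Return a list of problems found; an empty list means none."""
    problems = []
    for index, disk in enumerate(disks, start=1):
        allowed = ALLOWED_BUSES.get(disk["type"])
        if allowed is None:
            problems.append(f"device {index}: unknown type {disk['type']}")
        elif disk["bus"] not in allowed:
            problems.append(
                f"device {index}: bus {disk['bus']} not valid for {disk['type']}"
            )
    return problems


# Layout from steps 5 to 7; image names and size are placeholders.
windows_vm_disks = [
    {"type": "CD-ROM", "operation": "CLONE FROM IMAGE SERVICE",
     "bus": "IDE", "image": "windows-install.iso"},
    {"type": "CD-ROM", "operation": "CLONE FROM IMAGE SERVICE",
     "bus": "IDE", "image": "nutanix-virtio.iso"},
    {"type": "DISK", "operation": "ALLOCATE ON STORAGE CONTAINER",
     "bus": "SCSI", "size_gib": 40},
]
```

Calling validate_disks(windows_vm_disks) returns an empty list for this layout, while a CD-ROM entry with a SCSI bus would be reported as a problem.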

What to do next

Install Windows by following Installing Windows on a VM.

Installing Windows on a VM

Install a Windows virtual machine.

Before you begin

Create a Windows VM.

Procedure

  1. Log on to the web console.
  2. Click Home > VM to open the VM dashboard.
  3. Select the Windows VM.
  4. In the center of the VM page, click Power On.
  5. Click Launch Console.
    The Windows console opens in a new window.
  6. Select the desired language, time and currency format, and keyboard information.
  7. Click Next > Install Now.
    The Windows setup dialog box shows the operating systems to install.
  8. Select the Windows OS you want to install.
  9. Click Next and accept the license terms.
  10. Click Next > Custom: Install Windows only (advanced) > Load Driver > OK > Browse.
  11. Choose the Nutanix VirtIO driver.
    1. Select the Nutanix VirtIO CD drive.
    2. Expand the Windows OS folder and click OK.
    Figure. Select the Nutanix VirtIO drivers for your OS

    The Select the driver to install window appears.
  12. Select the VirtIO SCSI driver (vioscsi.inf) and click Next.
    Figure. Select the Driver for Installing Windows on a VM

    The amd64 folder contains drivers for 64-bit operating systems. The x86 folder contains drivers for 32-bit operating systems.
    Note: Starting with Nutanix VirtIO driver version 1.1.5, the driver package contains Windows Hardware Quality Lab (WHQL) certified drivers for Windows.
  13. Select the allocated disk space for the VM and click Next.
    Windows shows the installation progress, which can take several minutes.
  14. Enter your user name and password information and click Finish.
    Installation can take several minutes.
    Once you complete the logon information, Windows setup completes installation.
  15. Follow the instructions in Installing or Upgrading Nutanix VirtIO for Windows to install the other drivers that are part of the Nutanix VirtIO package.

Windows Defender Credential Guard Support in AHV

AHV enables you to use the Windows Defender Credential Guard security feature on Windows guest VMs.

The Windows Defender Credential Guard feature of Microsoft Windows operating systems allows you to securely isolate user credentials from the rest of the operating system. This isolation protects guest VMs from credential theft attacks such as Pass-the-Hash or Pass-the-Ticket.

See the Microsoft documentation for more information about the Windows Defender Credential Guard security feature.

Windows Defender Credential Guard Architecture in AHV

Figure. Architecture

Windows Defender Credential Guard uses Microsoft virtualization-based security to isolate user credentials in the virtualization-based security (VBS) module in AHV. When you enable Windows Defender Credential Guard on an AHV guest VM, the guest VM runs on top of AHV running both the Windows OS and VBS. Each Windows OS guest VM, which has credential guard enabled, has a VBS to securely store credentials.

Windows Defender Credential Guard Requirements

Ensure the following to enable Windows Defender Credential Guard:

  1. AOS, AHV, and Windows Server versions support Windows Defender Credential Guard:
    • AOS version must be 5.19 or later
    • AHV version must be AHV 20201007.1 or later
    • Windows version must be Windows Server 2016 or later (including Windows Server 2019 or later), or Windows 10 Enterprise or later
  2. UEFI, Secure Boot, and machine type q35 are enabled in the Windows VM from AOS.

    The Prism Element workflow to enable Windows Defender Credential Guard includes the workflow to enable these features.

Limitations

  • Windows Defender Credential Guard is not supported on hosts with AMD CPUs.
  • If you enable Windows Defender Credential Guard for your AHV guest VMs, the following optional configurations are not supported:

    • vTPM (Virtual Trusted Platform Modules) to store MS policies.
    • DMA protection (vIOMMU).
    • Nutanix Live Migration.
    • Cross hypervisor DR of Credential Guard VMs.
Caution: Use of Windows Defender Credential Guard in your AHV clusters impacts VM performance. If you enable Windows Defender Credential Guard on AHV guest VMs, VM density drops by ~15–20%. This expected performance impact is due to nested virtualization overhead added as a result of enabling credential guard.

Enabling Windows Defender Credential Guard Support in AHV Guest VMs

You can enable Windows Defender Credential Guard when you are either creating a VM or updating a VM.

About this task

Perform the following procedure to enable Windows Defender Credential Guard:

Procedure

  1. Enable Windows Defender Credential Guard when you are either creating a VM or updating a VM. Do one of the following:
    • If you are creating a VM, see step 2.
    • If you are updating a VM, see step 3.
  2. If you are creating a Windows VM, do the following:
    1. Log on to the Prism Element web console.
    2. In the VM dashboard, click Create VM.
    3. Fill in the mandatory fields to configure a VM.
    4. Under Boot Configuration, select UEFI, and then select the Secure Boot and Windows Defender Credential Guard options.
      Figure. Enable Windows Defender Credential Guard

      See UEFI Support for VM and Secure Boot Support for VMs for more information about these features.

    5. Proceed to configure other attributes for your Windows VM.
    6. Click Save.
    7. Turn on the VM.
  3. If you are updating an existing VM, do the following:
    1. Log on to the Prism Element web console.
    2. In the VM dashboard, click the Table view, select the VM, and click Update.
    3. Under Boot Configuration, select UEFI, and then select the Secure Boot and Windows Defender Credential Guard options.
      Note:

      If the VM is configured to use BIOS, install the guest OS again.

      If the VM is already configured to use UEFI, skip the step to select Secure Boot.

      See UEFI Support for VM and Secure Boot Support for VMs for more information about these features.

    4. Click Save.
    5. Turn on the VM.
  4. Enable Windows Defender Credential Guard in the Windows VM by using group policy.
    See the Enable Windows Defender Credential Guard by using the Group Policy procedure of the Manage Windows Defender Credential Guard topic in the Microsoft documentation to enable VBS, Secure Boot, and Windows Defender Credential Guard for the Windows VM.
  5. Open a command prompt in the Windows VM and apply the Group Policy settings:
    > gpupdate /force

    If you have not enabled Windows Defender Credential Guard (step 4) and perform this step (step 5), a warning similar to the following is displayed:

    Updating policy...
     
    Computer Policy update has completed successfully.
     
    The following warnings were encountered during computer policy processing:
     
    Windows failed to apply the {F312195E-3D9D-447A-A3F5-08DFFA24735E} settings. {F312195E-3D9D-447A-A3F5-08DFFA24735E} settings might have its own log file. Please click on the "More information" link.
    User Policy update has completed successfully.
     
    For more detailed information, review the event log or run GPRESULT /H GPReport.html from the command line to access information about Group Policy results.
    

    Event Viewer displays a warning for the group policy with an error message that indicates Secure Boot is not enabled on the VM.

    To view the warning message in Event Viewer, do the following:

    • In the Windows VM, open Event Viewer.
    • Go to Windows Logs -> System and click the warning with the Source as GroupPolicy (Microsoft-Windows-GroupPolicy) and Event ID as 1085.
    Figure. Warning in Event Viewer

    Note: Ensure that you follow the steps in the order that is stated in this document to successfully enable Windows Defender Credential Guard.
  6. Restart the VM.
  7. Verify if Windows Defender Credential Guard is enabled in your Windows VM.
    1. Start a Windows PowerShell terminal.
    2. Run the following command.
      PS > Get-CimInstance -ClassName Win32_DeviceGuard -Namespace 'root\Microsoft\Windows\DeviceGuard'

      An output similar to the following is displayed.

      PS > Get-CimInstance -ClassName Win32_DeviceGuard -Namespace 'root\Microsoft\Windows\DeviceGuard'
      AvailableSecurityProperties              	: {1, 2, 3, 5}
      CodeIntegrityPolicyEnforcementStatus     	: 0
      InstanceIdentifier                       	: 4ff40742-2649-41b8-bdd1-e80fad1cce80
      RequiredSecurityProperties               	: {1, 2}
      SecurityServicesConfigured               	: {1}
      SecurityServicesRunning                  	: {1}
      UsermodeCodeIntegrityPolicyEnforcementStatus : 0
      Version                                  	: 1.0
      VirtualizationBasedSecurityStatus        	: 2
      PSComputerName 
      

      Confirm that both SecurityServicesConfigured and SecurityServicesRunning have the value { 1 }.

    Alternatively, you can verify if Windows Defender Credential Guard is enabled by using System Information (msinfo32):

    1. In the Windows VM, open System Information by typing msinfo32 in the search field next to the Start menu.
    2. Verify if the values of the parameters are as indicated in the following screen shot:
      Figure. Verify Windows Defender Credential Guard
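
The check in step 7 can also be scripted. The following Python sketch parses text in the "Name : value" layout shown in the sample output and reports whether both SecurityServicesConfigured and SecurityServicesRunning include the value 1. The function names are hypothetical and are not part of any Nutanix or Microsoft tool, and the sketch assumes that you have already captured the command output as text.

```python
def parse_device_guard(text):
    """Parse 'Name : value' lines into a dict; '{1, 2}' values become lists of ints."""
    props = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines without a 'Name : value' separator
        value = value.strip()
        if value.startswith("{") and value.endswith("}"):
            value = [int(item) for item in value[1:-1].split(",") if item.strip()]
        props[name.strip()] = value
    return props


def credential_guard_running(props):
    """Return True if both properties named in step 7 include the value 1."""
    configured = props.get("SecurityServicesConfigured")
    running = props.get("SecurityServicesRunning")
    return (
        isinstance(configured, list) and 1 in configured
        and isinstance(running, list) and 1 in running
    )
```

For the sample output shown in step 7, credential_guard_running returns True.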

Affinity Policies for AHV

As an administrator of an AHV cluster, you can specify scheduling policies for virtual machines on an AHV cluster. By defining these policies, you can control the placement of the virtual machines on the hosts within a cluster.

You can define two types of affinity policies.

VM-Host Affinity Policy

The VM-host affinity policy controls the placement of the VMs. You can use this policy to specify that a selected VM can only run on the members of the affinity host list. This policy checks and enforces where a VM can be hosted when you restart or migrate the VM.
Note:
  • The VM-host affinity policy is mandatory. If you apply it, Acropolis HA and Acropolis Dynamic Scheduling (ADS) cannot restart or migrate a virtual machine to a host that does not conform to the requirements of the affinity policy.
  • The VM-host anti-affinity policy is not supported.
  • VMs configured with host affinity settings retain these settings if the VM is migrated to a new cluster. Remove the VM-host affinity policies applied to a VM that you want to migrate to another cluster, as the UUID of the host is retained by the VM and it does not allow VM restart on the destination cluster. When you attempt to protect such VMs, it is successful. However, some disaster recovery operations like migration fail and attempts to power on these VMs also fail.
  • VMs with host affinity policies can only be migrated to the hosts specified in the affinity policy. If only one host is specified, the VM cannot be migrated or started on another host during an HA event. For more information, see Non-Migratable Hosts.

You can define the VM-host affinity policies by using Prism Element during the VM create or update operation. For more information, see Creating a VM (AHV).

VM-VM Anti-Affinity Policy

You can use this policy to specify anti-affinity between the virtual machines. The VM-VM anti-affinity policy keeps the specified virtual machines apart so that a problem with one host does not affect both virtual machines. However, this is a preferential policy: it does not prevent the Acropolis Dynamic Scheduling (ADS) feature from taking necessary action in case of resource constraints.
Note: If a VM that has affinity policies configured is cloned, the policies are not automatically applied to the cloned VM. However, if a VM is restored from a DR snapshot, the policies are automatically applied to the VM.

Limitations of Affinity Rules

Even if a host is removed from a cluster, the host UUID is not removed from the host-affinity list for a VM.

Configuring VM-VM Anti-Affinity Policy

To configure VM-VM anti-affinity policies, you must first define a group and then add all the VMs on which you want to define VM-VM anti-affinity policy.

About this task

Note: Currently, the VM-VM affinity policy is not supported.

Perform the following procedure to configure the VM-VM anti-affinity policy.

Procedure

  1. Log on to the Controller VM with SSH.
  2. Create a group.
    nutanix@cvm$ acli vm_group.create group_name

    Replace group_name with the name of the group.

  3. Add the VMs on which you want to define anti-affinity to the group.
    nutanix@cvm$ acli vm_group.add_vms group_name vm_list=vm_name

    Replace group_name with the name of the group. Replace vm_name with the name of the VM on which you want to define anti-affinity. For multiple VMs, specify a comma-separated list of VM names.

  4. Configure VM-VM anti-affinity policy.
    nutanix@cvm$ acli vm_group.antiaffinity_set group_name

    Replace group_name with the name of the group.

    After you configure the group, the new anti-affinity rule is applied the next time ADS runs. ADS runs every 15 minutes.
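
The three commands in this procedure always run in the same order, so they can be generated together. The following Python sketch only builds the command strings from the steps above; it does not run them. The function name is hypothetical, and the sketch assumes that the group and VM names contain no spaces or commas.

```python
def anti_affinity_commands(group_name, vm_names):
    """Build the aCLI commands that create a group, add VMs, and set anti-affinity."""
    if not vm_names:
        raise ValueError("at least one VM name is required")
    vm_list = ",".join(vm_names)  # multiple VMs are passed as a comma-separated list
    return [
        f"acli vm_group.create {group_name}",
        f"acli vm_group.add_vms {group_name} vm_list={vm_list}",
        f"acli vm_group.antiaffinity_set {group_name}",
    ]
```

Run the resulting commands from a Controller VM in the order returned.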

Removing VM-VM Anti-Affinity Policy

Perform the following procedure to remove the VM-VM anti-affinity policy.

Procedure

  1. Log on to the Controller VM with SSH.
  2. Remove the VM-VM anti-affinity policy.
    nutanix@cvm$ acli vm_group.antiaffinity_unset group_name

    Replace group_name with the name of the group.

    The VM-VM anti-affinity policy is removed for the VMs that are present in the group, and they can start on any host during the next power on operation (as necessitated by the ADS feature).

Non-Migratable Hosts

VMs with GPU, CPU passthrough, PCI passthrough, and host affinity policies are not migrated to other hosts in the cluster. Such VMs are treated in a different manner in scenarios where VMs are required to migrate to other hosts in the cluster.

Table 1. Scenarios Where VMs Are Required to Migrate to Other Hosts

Scenario | Behavior
One-click upgrade | VM is powered off.
Life-cycle management (LCM) | Pre-check for LCM fails and the VMs are not migrated.
Rolling restart | VM is powered off.
AHV host maintenance mode | Use the tunable option to shut down the VMs while putting the node in maintenance mode. For more information, see Putting a Node into Maintenance Mode.

Performing Power Operations on VMs by Using Nutanix Guest Tools (aCLI)

You can initiate safe and graceful power operations such as soft shutdown and restart of the VMs running on the AHV hosts by using the aCLI. Nutanix Guest Tools (NGT) initiates and performs the soft shutdown and restart operations within the VM. This workflow ensures a safe and graceful shutdown or restart of the VM. You can create a pre-shutdown script that you can choose to run before a shutdown or restart of the VM. In the pre-shutdown script, include any tasks or checks that you want to run before a VM is shut down or restarted. You can choose to cancel the power operation if the pre-shutdown script fails. If the script fails, an alert (guest_agent_alert) is generated in the Prism web console.

Before you begin

Ensure that you have met the following prerequisites before you initiate the power operations:
  1. NGT is enabled on the VM. All operating systems that NGT supports are supported for this feature.
  2. NGT version running on the Controller VM and guest VM is the same.
  3. (Optional) If you want to run a pre-shutdown script, place the script in the following locations depending on your VMs:
    • Windows VMs: installed_dir\scripts\power_off.bat

      The file name of the script must be power_off.bat.

    • Linux VMs: installed_dir/scripts/power_off

      The file name of the script must be power_off.

About this task

Note: You can also perform these power operations by using the V3 API calls. For more information, see developer.nutanix.com.

Perform the following steps to initiate the power operations:

Procedure

  1. Log on to a Controller VM with SSH.
  2. Do one of the following:
    • Soft shut down the VM.
      nutanix@cvm$ acli vm.guest_shutdown vm_name enable_script_exec=[true or false] fail_on_script_failure=[true or false]

      Replace vm_name with the name of the VM.

    • Restart the VM.
      nutanix@cvm$ acli vm.guest_reboot vm_name enable_script_exec=[true or false] fail_on_script_failure=[true or false]

      Replace vm_name with the name of the VM.

    Set the value of enable_script_exec to true to run your pre-shutdown script and set the value of fail_on_script_failure to true to cancel the power operation if the pre-shutdown script fails.
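
As an illustration of how the two options combine, the following Python sketch builds the command string for either power operation. The function name and its arguments are hypothetical; only the aCLI subcommands and option names come from the procedure above, and the sketch does not run the command.

```python
def guest_power_command(vm_name, operation, run_script=False, fail_on_script_failure=False):
    """Build an aCLI guest shutdown or reboot command string."""
    subcommands = {"shutdown": "vm.guest_shutdown", "reboot": "vm.guest_reboot"}
    if operation not in subcommands:
        raise ValueError("operation must be 'shutdown' or 'reboot'")
    return (
        f"acli {subcommands[operation]} {vm_name} "
        f"enable_script_exec={str(run_script).lower()} "
        f"fail_on_script_failure={str(fail_on_script_failure).lower()}"
    )
```

Setting both arguments to True produces a command that runs the pre-shutdown script and cancels the power operation if the script fails.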

UEFI Support for VM

UEFI firmware is the successor to legacy BIOS firmware. It supports larger hard drives, boots faster, and provides more security features.

VMs with UEFI firmware have the following advantages:

  • Boot faster
  • Avoid legacy option ROM address constraints
  • Include robust reliability and fault management
  • Use UEFI drivers
Note:
  • Nutanix supports the starting of VMs with UEFI firmware in an AHV cluster. However, if a VM is added to a protection domain and later restored on a different cluster, the VM loses boot configuration. To restore the lost boot configuration, see Setting up Boot Device.
  • Nutanix also provides limited support for VMs migrated from a Hyper-V cluster.

You can create or update VMs with UEFI firmware by using acli commands, Prism Element web console, or Prism Central web console. For more information about creating a VM by using the Prism Element web console or Prism Central web console, see Creating a VM (AHV). For information about creating a VM by using aCLI, see Creating UEFI VMs by Using aCLI.

Note: If you are creating a VM by using aCLI commands, you can define the location of the storage container for UEFI firmware and variables. Prism Element web console or Prism Central web console does not provide the option to define the storage container to store UEFI firmware and variables.

For more information about the supported OSes for the guest VMs, see the AHV Guest OS section in the Compatibility and Interoperability Matrix document.

Creating UEFI VMs by Using aCLI

In AHV clusters, you can create a virtual machine (VM) to start with UEFI firmware by using Acropolis CLI (aCLI). This topic describes the procedure to create a VM by using aCLI. See the "Creating a VM (AHV)" topic for information about how to create a VM by using the Prism Element web console.

Before you begin

Ensure that the VM has an empty vDisk.

About this task

Perform the following procedure to create a UEFI VM by using aCLI:

Procedure

  1. Log on to any Controller VM in the cluster with SSH.
  2. Create a UEFI VM.
    
    nutanix@cvm$ acli vm.create vm-name uefi_boot=true
    A VM is created with UEFI firmware. Replace vm-name with a name of your choice for the VM. By default, the UEFI firmware and variables are stored in an NVRAM container. If you would like to specify a location of the NVRAM storage container to store the UEFI firmware and variables, do so by running the following command.
    nutanix@cvm$ acli vm.create vm-name uefi_boot=true nvram_container=NutanixManagementShare
    Replace NutanixManagementShare with a storage container in which you want to store the UEFI variables.
    If you do not specify a container, the UEFI variables are stored in a default NVRAM container. Nutanix recommends that you choose a storage container with at least an RF2 storage policy to ensure VM high availability in node failure scenarios. For more information about RF2 storage policy, see Failure and Recovery Scenarios in the Prism Web Console Guide document.
    Note: When you update the location of the storage container, clear the UEFI configuration and update the location of nvram_container to a container of your choice.
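
The two forms of the command in step 2 differ only in the optional nvram_container parameter. The following Python sketch builds either form as a string without running it. The function name is hypothetical, and the sketch assumes that the VM and container names contain no spaces.

```python
def uefi_vm_create_command(vm_name, nvram_container=None):
    """Build the aCLI command that creates a VM with UEFI firmware."""
    parts = ["acli", "vm.create", vm_name, "uefi_boot=true"]
    if nvram_container:
        # Optional: storage container for the UEFI firmware and variables.
        parts.append(f"nvram_container={nvram_container}")
    return " ".join(parts)
```

Omitting nvram_container leaves the choice of container to the default behavior described in step 2.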

What to do next

Go to the UEFI BIOS menu and configure the UEFI firmware settings. For more information about accessing and setting the UEFI firmware, see Getting Familiar with UEFI Firmware Menu.

Getting Familiar with UEFI Firmware Menu

After you launch a VM console from the Prism Element web console, the UEFI firmware menu allows you to do the following tasks for the VM.

  • Changing default boot resolution
  • Setting up boot device
  • Changing boot-time value

Changing Boot Resolution

You can change the default boot resolution of your Windows VM from the UEFI firmware menu.

Before you begin

Ensure that the VM is powered on.

About this task

Perform the following procedure to change the default boot resolution of your Windows VM by using the UEFI firmware menu.

Procedure

  1. Log on to the Prism Element web console.
  2. Launch the console for the VM.
    For more details about launching console for the VM, see Managing a VM (AHV) section in Prism Web Console Guide.
  3. To go to the UEFI firmware menu, press the F2 key on your keyboard.
    Tip: To enter UEFI menu, open the VM console, select Reset in the Power off/Reset VM dialog box, and immediately press F2 when the VM starts to boot.
    Important: Resetting the VM causes a downtime. We suggest that you reset the VM only during off-production hours or during a maintenance period.
    Figure. UEFI Firmware Menu

  4. Use the up or down arrow key to go to Device Manager and press Enter.
    The Device Manager page appears.
  5. In the Device Manager screen, use the up or down arrow key to go to OVMF Platform Configuration and press Enter.
    Figure. OVMF Settings

    The OVMF Settings page appears.
  6. In the OVMF Settings page, use the up or down arrow key to go to the Change Preferred field and use the right or left arrow key to increase or decrease the boot resolution.
    The default boot resolution is 1280X1024.
  7. Do one of the following.
    • To save the changed resolution, press the F10 key.
    • To go back to the previous screen, press the Esc key.
  8. Select Reset and click Submit in the Power off/Reset dialog box to restart the VM.
    After you restart the VM, the OS displays the changed resolution.

Setting up Boot Device

You cannot set the boot order for UEFI VMs by using the aCLI, Prism Central web console, or Prism Element web console. You can change the boot device for a UEFI VM by using the UEFI firmware menu.

Before you begin

Ensure that the VM is powered on.

Procedure

  1. Log on to the Prism Element web console.
  2. Launch the console for the VM.
    For more details about launching console for the VM, see Managing a VM (AHV) section in Prism Web Console Guide.
  3. To go to the UEFI firmware menu, press the F2 key on your keyboard.
    Tip: To enter UEFI menu, open the VM console, select Reset in the Power off/Reset VM dialog box, and immediately press F2 when the VM starts to boot.
    Important: Resetting the VM causes a downtime. We suggest that you reset the VM only during off-production hours or during a maintenance period.
  4. Use the up or down arrow key to go to Boot Manager and press Enter.
    The Boot Manager screen displays the list of available boot devices in the cluster.
    Figure. Boot Manager

  5. In the Boot Manager screen, use the up or down arrow key to select the boot device and press Enter.
    The boot device is saved. After you select and save the boot device, the VM boots up with the new boot device.
  6. To go back to the previous screen, press Esc.

Changing Boot Time-Out Value

The boot time-out value determines how long the boot menu is displayed (in seconds) before the default boot entry is loaded to the VM. This topic describes the procedure to change the default boot-time value of 0 seconds.

Before you begin

Ensure that the VM is powered on.

Procedure

  1. Log on to the Prism Element web console.
  2. Launch the console for the VM.
    For more details about launching console for the VM, see Managing a VM (AHV) section in Prism Web Console Guide.
  3. To go to the UEFI firmware menu, press the F2 key on your keyboard.
    Tip: To enter UEFI menu, open the VM console, select Reset in the Power off/Reset VM dialog box, and immediately press F2 when the VM starts to boot.
    Important: Resetting the VM causes a downtime. We suggest that you reset the VM only during off-production hours or during a maintenance period.
  4. Use the up or down arrow key to go to Boot Maintenance Manager and press Enter.
    Figure. Boot Maintenance Manager

  5. In the Boot Maintenance Manager screen, use the up or down arrow key to go to the Auto Boot Time-out field.
    The default boot-time value is 0 seconds.
  6. In the Auto Boot Time-out field, enter the boot-time value and press Enter.
    Note: The valid boot-time value ranges from 1 second to 9 seconds.
    The boot-time value is changed. The VM starts after the defined boot-time value.
  7. To go back to the previous screen, press Esc.

Secure Boot Support for VMs

The pre-operating system environment is vulnerable to attacks by possible malicious loaders. Secure boot addresses this vulnerability with UEFI secure boot using policies present in the firmware along with certificates, to ensure that only properly signed and authenticated components are allowed to execute.

Supported Operating Systems

For more information about the supported OSes for the guest VMs, see the AHV Guest OS section in the Compatibility and Interoperability Matrix document.

Secure Boot Considerations

This section provides the limitations and requirements to use Secure Boot.

Limitations

Secure Boot for guest VMs has the following limitations:

  • Nutanix does not support converting a VM that uses IDE disks or legacy BIOS to a VM that uses Secure Boot.
  • The minimum supported version of the Nutanix VirtIO package for Secure Boot-enabled VMs is 1.1.6.
  • Secure boot VMs do not permit CPU, memory, or PCI disk hot plug.

Requirements

Following are the requirements for Secure Boot:

  • Secure Boot is supported only on the Q35 machine type.

Creating/Updating a VM with Secure Boot Enabled

You can enable Secure Boot with UEFI firmware, either while creating a VM or while updating a VM by using aCLI commands or Prism Element web console.

See Creating a VM (AHV) for instructions about how to enable Secure Boot by using the Prism Element web console.

Creating a VM with Secure Boot Enabled

About this task

To create a VM with Secure Boot enabled:

Procedure

  1. Log on to any Controller VM in the cluster with SSH.
  2. To create a VM with Secure Boot enabled:
    nutanix@cvm$ acli vm.create <vm_name> secure_boot=true machine_type=q35
    Note: Specifying the machine type is required to enable the secure boot feature. UEFI is enabled by default when the Secure Boot feature is enabled.

Updating a VM to Enable Secure Boot

About this task

To update a VM to enable Secure Boot:

Procedure

  1. Log on to any Controller VM in the cluster with SSH.
  2. Ensure that the VM is powered off, and then update the VM to enable Secure Boot.
    nutanix@cvm$ acli vm.update <vm_name> secure_boot=true machine_type=q35
    Note:
    • If you disable the secure boot flag alone, the machine type remains q35, unless you disable that flag explicitly.
    • UEFI is enabled by default when the Secure Boot feature is enabled. Disabling Secure Boot does not revert the UEFI flags.
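
The create and update commands take the same two parameters, which the following Python sketch makes explicit by building either command string from one function. The function name and the existing_vm argument are hypothetical, and the sketch does not run the command or check that the VM is powered off.

```python
def secure_boot_command(vm_name, existing_vm=False):
    """Build the aCLI command that creates or updates a VM with Secure Boot enabled."""
    action = "vm.update" if existing_vm else "vm.create"
    # The machine type must be specified explicitly to enable Secure Boot.
    return f"acli {action} {vm_name} secure_boot=true machine_type=q35"
```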

Virtual Machine Network Management

Virtual machine network management involves configuring connectivity for guest VMs through virtual switches and VPCs.

For information about creating or updating a virtual switch and other VM network options, see Network and Security Management in Prism Central Guide. Virtual switch creation and updates are also covered in Network Management in Prism Web Console Guide.

Configuring a Virtual NIC to Operate in Access or Trunk Mode

By default, a virtual NIC on a guest VM operates in access mode. In this mode, the virtual NIC can send and receive traffic only over its own VLAN, which is the VLAN of the virtual network to which it is connected. If restricted to using access mode interfaces, a VM running an application on multiple VLANs (such as a firewall application) must use multiple virtual NICs—one for each VLAN. Instead of configuring multiple virtual NICs in access mode, you can configure a single virtual NIC on the VM to operate in trunk mode. A virtual NIC in trunk mode can send and receive traffic over any number of VLANs in addition to its own VLAN. You can trunk specific VLANs or trunk all VLANs. You can also convert a virtual NIC from the trunk mode to the access mode, in which case the virtual NIC reverts to sending and receiving traffic only over its own VLAN.

About this task

To configure a virtual NIC as an access port or trunk port, do the following:

Procedure

  1. Log on to the CVM with SSH.
  2. Do one of the following:
    1. Create a virtual NIC on the VM and configure the NIC to operate in the required mode.
      nutanix@cvm$ acli vm.nic_create <vm_name> network=network [vlan_mode={kAccess | kTrunked}] [trunked_networks=networks]

      Specify appropriate values for the following parameters:

      • <vm_name>. Name of the VM.
      • network. Name of the virtual network to which you want to connect the virtual NIC.
      • trunked_networks. Comma-separated list of the VLAN IDs that you want to trunk. The parameter is processed only if vlan_mode is set to kTrunked and is ignored if vlan_mode is set to kAccess. To include the default VLAN, VLAN 0, include it in the list of trunked networks. To trunk all VLANs, set vlan_mode to kTrunked and skip this parameter.
      • vlan_mode. Mode in which the virtual NIC must operate. Set the parameter to kAccess for access mode and to kTrunked for trunk mode. Default: kAccess.
    2. Configure an existing virtual NIC to operate in the required mode.
      nutanix@cvm$ acli vm.nic_update <vm_name> mac_addr update_vlan_trunk_info=true [vlan_mode={kAccess | kTrunked}] [trunked_networks=networks]

      Specify appropriate values for the following parameters:

      • <vm_name>. Name of the VM.
      • mac_addr. MAC address of the virtual NIC to update (the MAC address is used to identify the virtual NIC). Required to update a virtual NIC.
      • update_vlan_trunk_info. Update the VLAN type and list of trunked VLANs. Set update_vlan_trunk_info=true to enable trunked mode. If not specified, the parameter defaults to false and the vlan_mode and trunked_networks parameters are ignored.
        Note: You must set the update_vlan_trunk_info to true. If you do not set this parameter to true, "trunked_networks" are not changed.
      • vlan_mode. Mode in which the virtual NIC must operate. Set the parameter to kAccess for access mode and to kTrunked for trunk mode.
      • trunked_networks. Comma-separated list of the VLAN IDs that you want to trunk. The parameter is processed only if vlan_mode is set to kTrunked and is ignored if vlan_mode is set to kAccess. To include the default VLAN, VLAN 0, include it in the list of trunked networks. To trunk all VLANs, set vlan_mode to kTrunked and skip this parameter.
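
The interaction between vlan_mode and trunked_networks can be sketched in Python as follows. The function names are hypothetical, and the sketch only builds command strings from the parameters described above. It omits trunked_networks in access mode because the parameter is ignored there, and it also omits it in trunk mode when no VLANs are listed, which trunks all VLANs.

```python
def vlan_args(trunk, vlans=()):
    """Return the vlan_mode and trunked_networks arguments for a virtual NIC."""
    if not trunk:
        return ["vlan_mode=kAccess"]  # trunked_networks is ignored in access mode
    args = ["vlan_mode=kTrunked"]
    if vlans:
        # An empty list trunks all VLANs; include 0 to trunk the default VLAN.
        args.append("trunked_networks=" + ",".join(str(v) for v in vlans))
    return args


def nic_create_command(vm_name, network, trunk=False, vlans=()):
    """Build the aCLI command that creates a virtual NIC in the required mode."""
    return " ".join(["acli", "vm.nic_create", vm_name, f"network={network}", *vlan_args(trunk, vlans)])


def nic_update_command(vm_name, mac_addr, trunk=False, vlans=()):
    """Build the aCLI command that changes the mode of an existing virtual NIC."""
    # update_vlan_trunk_info=true is required; otherwise the VLAN settings are ignored.
    return " ".join(
        ["acli", "vm.nic_update", vm_name, mac_addr, "update_vlan_trunk_info=true", *vlan_args(trunk, vlans)]
    )
```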

Virtual Machine Memory and CPU Hot-Plug Configurations

Memory and CPUs are hot-pluggable on guest VMs running on AHV. You can increase the memory allocation and the number of CPUs on your VMs while the VMs are powered on. You can change the number of vCPUs (sockets) while the VMs are powered on. However, you cannot change the number of cores per socket while the VMs are powered on.

Note: You cannot decrease the memory allocation and the number of CPUs on your VMs while the VMs are powered on.

You can change the memory and CPU configuration of your VMs by using the Acropolis CLI (aCLI), the Prism Element web console (see Managing a VM (AHV) in the Prism Web Console Guide), or Prism Central (see Managing a VM (AHV) and Managing a VM (Self Service) in the Prism Central Guide).

See the AHV Guest OS Compatibility Matrix for information about operating systems on which you can hot plug memory and CPUs.

Memory OS Limitations

  1. On Linux operating systems, the Linux kernel might not make the hot-plugged memory online. If the memory is not online, you cannot use the new memory. Perform the following procedure to make the memory online.
    1. Identify the memory block that is offline.

      Display the status of all of the memory.

      $ grep line /sys/devices/system/memory/*/state

      Display the state of a specific memory block.

      $ cat /sys/devices/system/memory/memoryXXX/state
      
    2. Make the memory online.
      $ echo online > /sys/devices/system/memory/memoryXXX/state 
      
  2. If your VM runs CentOS 7.2 as the guest OS and has less than 3 GB of memory, hot plugging more memory so that the final memory is greater than 3 GB results in a memory-overflow condition. To resolve the issue, restart the guest OS (CentOS 7.2) with the following kernel boot setting:
    swiotlb=force 
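The procedure in step 1 can be applied to every memory block in one pass. The following is a minimal sketch, not a Nutanix-provided tool: the function name online_all_memory and its optional directory argument are assumptions made for illustration, and the function must run as root inside the guest to write to the real sysfs files.

```shell
# Bring every offline memory block online.
# $1 = sysfs memory directory; defaults to the standard location. The
# argument exists only so that the function can be tried on a test copy.
online_all_memory() {
    mem_dir="${1:-/sys/devices/system/memory}"
    for state_file in "$mem_dir"/memory*/state; do
        # Skip the unexpanded pattern if no memory blocks are present.
        [ -f "$state_file" ] || continue
        if [ "$(cat "$state_file")" = "offline" ]; then
            echo online > "$state_file"
            echo "onlined: $state_file"
        fi
    done
}
```

Calling online_all_memory with no argument acts on /sys/devices/system/memory and prints one line for each block that it brings online.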
    

CPU OS Limitation

On CentOS operating systems, if the hot-plugged CPUs are not displayed in /proc/cpuinfo, you might have to bring the CPUs online. For each hot-plugged CPU, run the following command to bring the CPU online.

$ echo 1 > /sys/devices/system/cpu/cpu<n>/online  

Replace <n> with the number of the hot-plugged CPU.
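If several CPUs were hot-plugged, the command above can be wrapped in a loop. This is a minimal sketch under the assumption that each hot-plugged CPU exposes an online file containing 0 until it is brought online; the function name online_all_cpus and its optional directory argument are illustrative only. Run it as root inside the guest.

```shell
# Bring every offline CPU online.
# $1 = sysfs CPU directory; defaults to the standard location. The
# argument exists only so that the function can be tried on a test copy.
online_all_cpus() {
    cpu_dir="${1:-/sys/devices/system/cpu}"
    for online_file in "$cpu_dir"/cpu[0-9]*/online; do
        # CPUs without an online file cannot be toggled and are skipped.
        [ -f "$online_file" ] || continue
        if [ "$(cat "$online_file")" = "0" ]; then
            echo 1 > "$online_file"
            echo "onlined: $online_file"
        fi
    done
}
```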

Hot-Plugging the Memory and CPUs on Virtual Machines (AHV)

About this task

Perform the following procedure to hot plug the memory and CPUs on the AHV VMs.

Procedure

  1. Log on to the Controller VM with SSH.
  2. Update the memory allocation for the VM.
    nutanix@cvm$ acli vm.update vm-name memory=new_memory_size 
    

    Replace vm-name with the name of the VM and new_memory_size with the memory size.

  3. Update the number of CPUs on the VM.
    nutanix@cvm$ acli vm.update vm-name num_vcpus=n 
    

    Replace vm-name with the name of the VM and n with the number of CPUs.

    Note: After you upgrade from a version that does not support hot-plug to a version that supports it, you must power cycle each VM that was created and powered on before the upgrade so that it is compatible with the memory and CPU hot-plug feature. This power cycle is required only once after the upgrade. New VMs created on the supported version have hot-plug compatibility by default.
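Because memory and CPUs can only be increased while a VM is powered on, it can help to check a request before running it. The following sketch prints the acli commands from the procedure above or rejects a decrease. The function name hotplug_commands is hypothetical, the current values are supplied by the caller, and the G suffix on the memory value is assumed to be accepted by vm.update in the same way that this guide shows it for vm.create.

```shell
# Hypothetical helper: prints the acli commands for a hot-plug change, or
# fails if the request would decrease memory or vCPUs on a powered-on VM.
# $1 = VM name, $2 = current memory (GB), $3 = new memory (GB),
# $4 = current vCPUs, $5 = new vCPUs
hotplug_commands() {
    vm="$1"; cur_mem="$2"; new_mem="$3"; cur_cpu="$4"; new_cpu="$5"

    if [ "$new_mem" -lt "$cur_mem" ] || [ "$new_cpu" -lt "$cur_cpu" ]; then
        echo "error: cannot decrease memory or vCPUs while the VM is powered on" >&2
        return 1
    fi
    # Emit a command only for values that actually change.
    if [ "$new_mem" -gt "$cur_mem" ]; then
        echo "acli vm.update $vm memory=${new_mem}G"
    fi
    if [ "$new_cpu" -gt "$cur_cpu" ]; then
        echo "acli vm.update $vm num_vcpus=$new_cpu"
    fi
}
```

The function only prints commands; run the printed commands from the Controller VM as described in the procedure.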

Virtual Machine Memory Management (vNUMA)

AHV hosts support Virtual Non-uniform Memory Access (vNUMA) on virtual machines. You can enable vNUMA on VMs when you create or modify the VMs to optimize memory performance.

Non-uniform Memory Access (NUMA)

In a NUMA topology, the memory access times of a VM depend on the memory location relative to a processor. A VM accesses memory local to a processor faster than non-local memory. If the VM uses both CPU and memory from the same physical NUMA node, you can achieve optimal resource utilization. If the VM's vCPUs run on one NUMA node (for example, node 0) and the VM accesses memory from another node (node 1), memory latency increases. Ensure that the virtual topology of VMs matches the physical hardware topology to achieve minimum memory latency.

Virtual Non-uniform Memory Access (vNUMA)

vNUMA optimizes the memory performance of virtual machines that require more vCPUs or memory than the capacity of a single physical NUMA node. In a vNUMA topology, you can create multiple vNUMA nodes where each vNUMA node includes vCPUs and virtual RAM. When you assign a vNUMA node to a physical NUMA node, the vCPUs can intelligently determine the memory latency (high or low). Low memory latency within a vNUMA node results in low latency in the physical NUMA node as well.

vNUMA vCPU hard-pinning

When you configure NUMA and hyper-threading, you ensure that the VM can schedule on virtual peers. You also expose the NUMA topology to the VM. While this configuration helps you limit the amount of memory that is available to each virtual NUMA node, the distribution underneath, in the hardware, still occurs randomly.

Enable virtual CPU (vCPU) hard-pinning in the topology to define which NUMA node the vCPUs (and hyper-threads or peers) are located on and how much memory that NUMA node has. vCPU hard-pinning also allows you to see a proper mapping of vCPU to CPU set (virtual CPU to physical core or hyper-thread). It ensures that a VM is never scheduled on a different core or peer that is not defined in the hard-pin configuration. It also results in memory being allocated and distributed correctly across the configured mapping.

While vCPU hard-pinning gives a benefit to scheduling operations and memory operations, it also has a couple of caveats.

  • Acropolis Dynamic Scheduling (ADS) is not NUMA aware, so the high availability (HA) process is not NUMA aware. This lack of awareness can lead to potential issues when a host fails.

  • When you start a VM, a process running in the background nullifies the memory pages for the VM. The more memory is allocated to a VM, the longer this process takes. Consider a deployment with 10 VMs: nine have 4 GB of RAM and one has 4.5 TB of RAM. The process finishes quickly on the nine smaller VMs (perhaps a couple of seconds) but takes longer on the large VM (potentially 2 minutes). This time lag can cause the following problem: the smaller VMs are already running on a socket, so when the large-memory VM tries to power on, that socket or its cores are unavailable. The unavailability can result in a boot failure and an error message when starting the VM.

    The workaround is to use affinity rules and ensure that large VMs that have vCPU hard-pinning configured have a failover node available to them, with a different affinity rule for the non-pinned VMs.

For information about configuring vCPU hard-pinning, see Enabling vNUMA on Virtual Machines.

Enabling vNUMA on Virtual Machines

Before you begin

Before you enable vNUMA, see AHV Best Practices Guide under Solutions Documentation.

About this task

Perform the following procedure to enable vNUMA on your VMs running on the AHV hosts.

Procedure

  1. Log on to a Controller VM with SSH.
  2. Check how many NUMA nodes are available on each AHV host in the cluster.
    nutanix@cvm$ hostssh "numactl --hardware"

    The console displays an output similar to the following:

    ============= 10.x.x.x ============
    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
    node 0 size: 128837 MB
    node 0 free: 862 MB
    node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31
    node 1 size: 129021 MB
    node 1 free: 352 MB
    node distances:
    node   0   1
      0:  10  21
      1:  21  10
    ============= 10.x.x.x ============
    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
    node 0 size: 128859 MB
    node 0 free: 1076 MB
    node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
    node 1 size: 129000 MB
    node 1 free: 436 MB
    node distances:
    node   0   1
      0:  10  21
      1:  21  10
    ============= 10.x.x.x ============
    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
    node 0 size: 128859 MB
    node 0 free: 701 MB
    node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
    node 1 size: 129000 MB
    node 1 free: 357 MB
    node distances:
    node   0   1
      0:  10  21
      1:  21  10
    ============= 10.x.x.x ============
    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
    node 0 size: 128838 MB
    node 0 free: 1274 MB
    node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
    node 1 size: 129021 MB
    node 1 free: 424 MB
    node distances:
    node   0   1
      0:  10  21
      1:  21  10
    ============= 10.x.x.x ============
    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
    node 0 size: 128837 MB
    node 0 free: 577 MB
    node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
    node 1 size: 129021 MB
    node 1 free: 612 MB
    node distances:
    node   0   1
      0:  10  21
      1:  21  10

    The example output shows that each AHV host has two NUMA nodes.

  3. Do one of the following:
    • Enable vNUMA if you are creating a VM.
      nutanix@cvm$ acli vm.create <vm_name> num_vcpus=x \
      num_cores_per_vcpu=x memory=xG \
      num_vnuma_nodes=x
    • Enable vNUMA if you are modifying an existing VM.
      nutanix@cvm$ acli vm.update <vm_name> \
      num_vnuma_nodes=x
    Replace <vm_name> with the name of the VM on which you want to enable vNUMA or vUMA. Replace x with the values for the following indicated parameters:
    • num_vcpus: Type the number of vCPUs for the VM.
    • num_cores_per_vcpu: Type the number of cores per vCPU.
    • memory: Type the memory in GB for the VM.
    • num_vnuma_nodes: Type the number of vNUMA nodes for the VM.

    For example:

    nutanix@cvm$ acli vm.create test_vm num_vcpus=20 memory=150G num_vnuma_nodes=2

    This command creates a VM with 2 vNUMA nodes, 10 vCPUs and 75 GB memory for each vNUMA node.
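Two parts of this procedure lend themselves to a short script: summarizing the numactl output from step 2, and checking how a planned configuration divides across vNUMA nodes. The sketch below is illustrative only. The function names are hypothetical, the parser assumes host banner lines of the form shown in the sample output, and the split assumes an even division as in the example above; the behavior for values that do not divide evenly is not described here, so the sketch rejects them rather than guessing.

```shell
# Print "<host> <node count>" for each host found in `numactl --hardware`
# output read from standard input. Host banner lines are expected to look
# like "============= 10.x.x.x ============".
numa_nodes_per_host() {
    awk '
        /^ *=+ .* =+ *$/ { host = $2 }
        /^ *available:/  { print host, $2 }
    '
}

# Split vCPUs and memory (GB) evenly across vNUMA nodes.
# $1 = num_vcpus, $2 = memory in GB, $3 = num_vnuma_nodes
vnuma_split() {
    if [ $(( $1 % $3 )) -ne 0 ] || [ $(( $2 % $3 )) -ne 0 ]; then
        echo "error: values do not divide evenly across $3 vNUMA nodes" >&2
        return 1
    fi
    echo "$(( $1 / $3 )) vCPUs and $(( $2 / $3 )) GB per vNUMA node"
}
```

On a Controller VM, the first function could be fed with hostssh "numactl --hardware" | numa_nodes_per_host. Running vnuma_split 20 150 2 reproduces the split described in the example.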

What to do next

To configure vCPU hard-pinning on existing VMs, do the following:
nutanix@cvm$ acli vm.update <vm_name> num_vcpus=x
nutanix@cvm$ acli vm.update <vm_name> num_cores_per_vcpu=x
nutanix@cvm$ acli vm.update <vm_name> num_threads_per_core=x
nutanix@cvm$ acli vm.update <vm_name> num_vnuma_nodes=x
nutanix@cvm$ acli vm.update <vm_name> vcpu_hard_pin=true

For example,

nutanix@cvm$ acli vm.update <vm_name> num_vcpus=3
nutanix@cvm$ acli vm.update <vm_name> num_cores_per_vcpu=28
nutanix@cvm$ acli vm.update <vm_name> num_threads_per_core=2
nutanix@cvm$ acli vm.update <vm_name> num_vnuma_nodes=3
nutanix@cvm$ acli vm.update <vm_name> vcpu_hard_pin=true
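As a worked check of the example values, the following sketch computes the size of the resulting topology. It assumes that the number of logical CPUs presented to the guest is the product of num_vcpus, num_cores_per_vcpu, and num_threads_per_core, and that they are spread evenly across the vNUMA nodes; both assumptions and both function names are illustrative and are not taken from this guide.

```shell
# Logical CPUs in a topology, assuming sockets x cores x threads.
# $1 = num_vcpus, $2 = num_cores_per_vcpu, $3 = num_threads_per_core
logical_cpus() {
    echo $(( $1 * $2 * $3 ))
}

# Logical CPUs per vNUMA node, assuming an even spread.
# $1 = total logical CPUs, $2 = num_vnuma_nodes
logical_cpus_per_vnuma_node() {
    echo $(( $1 / $2 ))
}
```

With the example values, logical_cpus 3 28 2 yields 168, and logical_cpus_per_vnuma_node 168 3 yields 56. Comparing that figure with the CPUs listed per physical NUMA node in the numactl output is one way to sanity-check a hard-pin plan before applying it.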

GPU and vGPU Support

AHV supports GPU-accelerated computing for guest VMs. You can configure either GPU pass-through or a virtual GPU.
Note: You can configure either pass-through or a vGPU for a guest VM but not both.

This guide describes the concepts related to the GPU and vGPU support in AHV. For the configuration procedures, see the Prism Web Console Guide.

For driver installation instructions, see the NVIDIA Grid Host Driver for Nutanix AHV Installation Guide.

Note: VMs with GPU are not migrated to other hosts in the cluster. For more information, see Non-Migratable Hosts.

Supported GPUs

The following GPUs are supported:
Note: These GPUs are supported only by the AHV version that is bundled with the AOS release.
  • NVIDIA® Ampere® A16
  • NVIDIA® Ampere® A30
  • NVIDIA® Ampere® A40
  • NVIDIA® Ampere® A100
  • NVIDIA® Quadro® RTX 6000
  • NVIDIA® Quadro® RTX 8000
  • NVIDIA® Tesla® M10
  • NVIDIA® Tesla® M60
  • NVIDIA® Tesla® P4
  • NVIDIA® Tesla® P40
  • NVIDIA® Tesla® P100
  • NVIDIA® Tesla® T4 16 GB
  • NVIDIA® Tesla® V100 16 GB
  • NVIDIA® Tesla® V100 32 GB
  • NVIDIA® Tesla® V100S 32 GB

GPU Pass-Through for Guest VMs

AHV hosts support GPU pass-through for guest VMs, allowing applications on VMs direct access to GPU resources. The Nutanix user interfaces provide a cluster-wide view of GPUs, allowing you to allocate any available GPU to a VM. You can also allocate multiple GPUs to a VM. However, in a pass-through configuration, only one VM can use a GPU at any given time.

Host Selection Criteria for VMs with GPU Pass-Through

When you power on a VM with GPU pass-through, the VM is started on the host that has the specified GPU, provided that the Acropolis Dynamic Scheduler determines that the host has sufficient resources to run the VM. If the specified GPU is available on more than one host, the Acropolis Dynamic Scheduler ensures that a host with sufficient resources is selected. If sufficient resources are not available on any host with the specified GPU, the VM is not powered on.

If you allocate multiple GPUs to a VM, the VM is started on a host if, in addition to satisfying Acropolis Dynamic Scheduler requirements, the host has all of the GPUs that are specified for the VM.

If you want a VM to always use a GPU on a specific host, configure host affinity for the VM.

Support for Graphics and Compute Modes

AHV supports running GPU cards in either graphics mode or compute mode. If a GPU is running in compute mode, Nutanix user interfaces indicate the mode by appending the string compute to the model name. No string is appended if a GPU is running in the default graphics mode.

Switching Between Graphics and Compute Modes

If you want to change the mode of the firmware on a GPU, put the host in maintenance mode, and then flash the GPU manually by logging on to the AHV host and performing standard procedures as documented for Linux VMs by the vendor of the GPU card.

Typically, you restart the host immediately after you flash the GPU. After restarting the host, redo the GPU configuration on the affected VM, and then start the VM. For example, consider that you want to re-flash an NVIDIA Tesla® M60 GPU that is running in graphics mode. The Prism web console identifies the card as an NVIDIA Tesla M60 GPU. After you re-flash the GPU to run in compute mode and restart the host, redo the GPU configuration on the affected VMs by adding back the GPU, which is now identified as an NVIDIA Tesla M60.compute GPU, and then start the VM.
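When GPU model names are collected in a script, the mode can be inferred from the name. The sketch below assumes that the compute-mode suffix is exactly .compute, as in the M60 example above; the function name gpu_mode is hypothetical.

```shell
# Report the GPU mode implied by a model name as displayed by Prism.
# Assumes compute mode is marked by a trailing ".compute".
gpu_mode() {
    case "$1" in
        *.compute) echo compute ;;
        *)         echo graphics ;;
    esac
}
```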

Supported GPU Cards

For a list of supported GPUs, see Supported GPUs.

Limitations

GPU pass-through support has the following limitations:

  • Live migration of VMs with a GPU configuration is not supported. Live migration of VMs is necessary when the BIOS, BMC, and the hypervisor on the host are being upgraded. During these upgrades, VMs that have a GPU configuration are powered off and then powered on automatically when the node is back up.
  • VM pause and resume are not supported.
  • You cannot hot add VM memory if the VM is using a GPU.
  • Hot add and hot remove support is not available for GPUs.
  • You can change the GPU configuration of a VM only when the VM is turned off.
  • The Prism web console does not support console access for VMs that are configured with GPU pass-through. Before you configure GPU pass-through for a VM, set up an alternative means to access the VM. For example, enable remote access over RDP.

    Removing GPU pass-through from a VM restores console access to the VM through the Prism web console.

Configuring GPU Pass-Through

For information about configuring GPU pass-through for guest VMs, see Creating a VM (AHV) in the "Virtual Machine Management" chapter of the Prism Web Console Guide.

NVIDIA GRID Virtual GPU Support on AHV

AHV supports NVIDIA GRID technology, which enables multiple guest VMs to use the same physical GPU concurrently. Concurrent use is possible by dividing a physical GPU into discrete virtual GPUs (vGPUs) and allocating those vGPUs to guest VMs. Each vGPU has a fixed range of frame buffer and uses all the GPU processing cores in a time-sliced manner.

Virtual GPUs are of different types (vGPU types are also called vGPU profiles). vGPUs differ by the amount of physical GPU resources allocated to them and the class of workload that they target. The number of vGPUs into which a single physical GPU can be divided therefore depends on the vGPU profile that is used on a physical GPU.

Each physical GPU supports more than one vGPU profile, but a physical GPU cannot run multiple vGPU profiles concurrently. After a vGPU of a given profile is created on a physical GPU (that is, after a vGPU is allocated to a VM that is powered on), the GPU is restricted to that vGPU profile until it is freed up completely. To understand this behavior, consider that you configure a VM to use an M60-1Q vGPU. When the VM is powering on, it is allocated an M60-1Q vGPU instance only if a physical GPU that supports M60-1Q is either unused or already running the M60-1Q profile and can accommodate the requested vGPU.

If an entire physical GPU that supports M60-1Q is free at the time the VM is powering on, an M60-1Q vGPU instance is created for the VM on the GPU, and that profile is locked on the GPU. In other words, until the physical GPU is completely freed up again, only M60-1Q vGPU instances can be created on that physical GPU (that is, only VMs configured with M60-1Q vGPUs can use that physical GPU).
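The profile-locking rule can be expressed as a small decision function. This is a sketch of the rule as described above, not of the scheduler's actual implementation. The function name can_place_vgpu is hypothetical, and the per-GPU capacity is a value that the caller supplies from the NVIDIA documentation for the profile in question.

```shell
# Decide whether a physical GPU can host one more vGPU of a given profile.
# $1 = requested profile (for example M60-1Q)
# $2 = profile currently locked on the GPU (empty string if the GPU is free)
# $3 = vGPU instances already running on the GPU
# $4 = maximum instances of the requested profile that the GPU can hold
# Returns 0 (success) if the vGPU can be created, 1 otherwise.
can_place_vgpu() {
    requested="$1"; locked="$2"; in_use="$3"; capacity="$4"

    # A completely free GPU accepts any profile that it supports.
    if [ -z "$locked" ]; then
        return 0
    fi
    # A locked GPU accepts only the same profile, and only while it has room.
    [ "$requested" = "$locked" ] && [ "$in_use" -lt "$capacity" ]
}
```

For example, with an assumed capacity of 8, can_place_vgpu M60-1Q M60-1Q 3 8 succeeds, whereas can_place_vgpu M60-2Q M60-1Q 3 8 fails because the GPU is locked to a different profile.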

Note: NVIDIA does not support Windows Guest VMs on the C-series NVIDIA vGPU types. See the NVIDIA documentation on Virtual GPU software for more information.

NVIDIA Grid Host Drivers and License Installation

To enable guest VMs to use vGPUs on AHV, you must install NVIDIA drivers on the guest VMs, install the NVIDIA GRID host driver on the hypervisor, and set up an NVIDIA GRID License Server.

See the NVIDIA Grid Host Driver for Nutanix AHV Installation Guide for details about the workflow to enable guest VMs to use vGPUs on AHV and the NVIDIA GRID host driver installation instructions.

vGPU Profile Licensing

vGPU profiles are licensed through an NVIDIA GRID license server. The choice of license depends on the type of vGPU that the applications running on the VM require. Licenses are available in various editions, and the vGPU profile that you want might be supported by more than one license edition.

Note: If the specified license is not available on the licensing server, the VM starts up and functions normally, but the vGPU runs with reduced capability.

You must determine the vGPU profile that the VM requires, install an appropriate license on the licensing server, and configure the VM to use that license and vGPU type. For information about licensing for different vGPU types, see the NVIDIA GRID licensing documentation.

Guest VMs check out a license over the network when starting up and return the license when shutting down. As the VM is powering on, it checks out the license from the licensing server. When a license is checked back in, the vGPU is returned to the vGPU resource pool.

When powered on, guest VMs use a vGPU in the same way that they use a physical GPU that is passed through.

Supported GPU Cards

For a list of supported GPUs, see Supported GPUs.

High Availability Support for VMs with vGPUs

Nutanix conditionally supports high availability (HA) of VMs that have NVIDIA GRID vGPUs configured. The cluster does not reserve any specific resources to guarantee HA for the VMs with vGPUs. The vGPU VMs are restarted on a best-effort basis in the event of a node failure. You can restart a VM with vGPUs on another (failover) host that has compatible or identical vGPU resources available. The vGPU profile available on the failover host must be identical to the vGPU profile configured on the VM that needs HA. The system attempts to restart the VM after an event. If the failover host has insufficient memory or vGPU resources for the VM to start, the VM fails to start after failover.

The following conditions are applicable to HA of VMs with vGPUs:

  • Memory is not reserved for the VM on the failover host by the HA process. When the VM fails over, if sufficient memory is not available, the VM cannot power on.
  • vGPU resources are not reserved on the failover host. When the VM fails over, if the required vGPU resources are not available on the failover host, the VM cannot power on.

Limitations for vGPU Support

vGPU support on AHV has the following limitations:

  • You cannot hot-add memory to VMs that have a vGPU.
  • The Prism web console does not support console access for a VM that is configured with multiple vGPUs. The Prism web console supports console access for a VM that is configured with a single vGPU only.

    Before you add multiple vGPUs to a VM, set up an alternative means to access the VM. For example, enable remote access over RDP. For Linux VMs, instead of RDP, use Virtual Network Computing (VNC) or equivalent.

Console Support for VMs with vGPU

As with other VMs, you can access a VM that has a single vGPU through the console. You can enable or disable console support only for a VM with one vGPU configured. Enabling console support for a VM with multiple vGPUs is not supported. By default, console support for a vGPU VM is disabled.

See Enabling or Disabling Console Support for vGPU VMs for more information about configuring the support.

Recovery of vGPU Console-enabled VMs

With AHV, you can recover vGPU console-enabled guest VMs efficiently. When you perform DR of vGPU console-enabled guest VMs, the VMs recover with the vGPU console. The guest VMs fail to recover when you perform cross-hypervisor disaster recovery (CHDR).

For AHV with minimum AOS versions 6.1, 6.0.2.4 and 5.20:

  • vGPU-enabled VMs can be recovered when protected by protection domains in PD-based DR or protection policies in Leap based solutions using asynchronous, NearSync, or Synchronous (Leap only) replications.
    Note: GPU Passthrough is not supported.
  • If both site A and site B have the same GPU boards (and the same assignable vGPU profiles), failovers work seamlessly. With protection domains, no additional steps are required: GPU profiles are restored correctly and vGPU console settings persist after recovery. With Leap DR, vGPU console settings do not persist after recovery.
  • If site A and site B have different GPU boards and vGPU profiles, you must manually remove the vGPU profile before you power on the VM in site B.

The vGPU console settings are persistent after recovery and all failovers are supported for the following:

Table 1. Persistent vGPU Console Settings with Failover Support

Recovery using                 For vGPU enabled AHV VMs
Protection domain based DR     Yes
VMware SRM with Nutanix SRA    Not applicable

For information about the behavior, see the Recovery of vGPU-enabled VMs topic in the Data Protection and Recovery with Prism Element guide.

See Enabling or Disabling Console Support for vGPU VMs for more information about configuring the support.

For SRA and SRM support, see the Nutanix SRA documentation.

ADS support for VMs with vGPUs

AHV supports Acropolis Dynamic Scheduling (ADS) for VMs with vGPUs.

Note: ADS support requires that live migration of VMs with vGPUs be operational in the cluster. See Live Migration of VMs with Virtual GPUs for the minimum NVIDIA and AOS versions that support live migration of VMs with vGPUs.

When a number of VMs with vGPUs are running on a host and you enable ADS support for the cluster, the Lazan manager invokes VM migration tasks to resolve resource hotspots or fragmentation in the cluster to power on incoming vGPU VMs. The Lazan manager can migrate vGPU-enabled VMs to other hosts in the cluster only if:

  • The other hosts support compatible or identical vGPU resources as the source host (hosting the vGPU-enabled VMs).

  • The host affinity is not set for the vGPU-enabled VM.

For more information about limitations, see Live Migration of VMs with Virtual GPUs and Limitations of Live Migration Support.

For more information about ADS, see Acropolis Dynamic Scheduling in AHV.

Multiple Virtual GPU Support

Prism Central and Prism Element Web Console can deploy VMs with multiple virtual GPU instances. This support harnesses the capabilities of NVIDIA GRID virtual GPU (vGPU) support for multiple vGPU instances for a single VM.

Note: Multiple vGPUs on the same VM are supported on NVIDIA Virtual GPU software version 10.1 (440.53) or later.

You can deploy virtual GPUs of different types. The number of vGPUs into which a single physical GPU can be divided depends on the vGPU profile that is used on the physical GPU. Each physical GPU on a GPU board supports more than one type of vGPU profile. For example, a Tesla® M60 GPU device provides different types of vGPU profiles like M60-0Q, M60-1Q, M60-2Q, M60-4Q, and M60-8Q.

You can only add multiple vGPUs of the same type of vGPU profile to a single VM. For example, consider that you configure a VM on a node that has one NVIDIA Tesla® M60 GPU board. Tesla® M60 provides two physical GPUs, each supporting one M60-8Q (profile) vGPU, thus supporting a total of two M60-8Q vGPUs for the entire host.
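The capacity arithmetic in this example, together with the rule that a VM powers on only if the requested number of vGPUs is available on the node, can be sketched as follows. The function names are hypothetical, and the number of vGPUs per physical GPU for a given profile must come from the GPU vendor's documentation.

```shell
# Total vGPUs of one profile that a host can provide.
# $1 = physical GPUs on the host, $2 = vGPUs of the profile per physical GPU
host_vgpu_capacity() {
    echo $(( $1 * $2 ))
}

# Succeeds only if the requested vGPUs fit into what is still free.
# $1 = host capacity, $2 = vGPUs already in use, $3 = vGPUs requested by the VM
can_power_on() {
    [ "$3" -le $(( $1 - $2 )) ]
}
```

For the M60 board described above, host_vgpu_capacity 2 1 yields 2. A VM that requests two M60-8Q vGPUs passes can_power_on 2 0 2 but fails can_power_on 2 1 2, which corresponds to the case in which another VM already uses one M60-8Q vGPU.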

For restrictions on configuring multiple vGPUs on the same VM, see Restrictions for Multiple vGPU Support.

For steps to add multiple vGPUs to the same VM, see Creating a VM (AHV) and Adding Multiple vGPUs to a VM in Prism Web Console Guide or Creating a VM through Prism Central (AHV) and Adding Multiple vGPUs to a VM in Prism Central Guide.

Restrictions for Multiple vGPU Support

You can configure multiple vGPUs subject to the following restrictions:

  • All the vGPUs that you assign to one VM must be of the same type. In the aforesaid example, with the Tesla® M60 GPU device, you can assign multiple M60-8Q vGPU profiles. You cannot assign one vGPU of the M60-1Q type and another vGPU of the M60-8Q type.

    Note: You can configure any number of vGPUs of the same type on a VM. However, the cluster calculates a maximum number of vGPUs of the same type per VM. This number is defined as max_instances_per_vm. This number is variable and changes based on the GPU resources available in the cluster and the number of VMs deployed. If the number of vGPUs of a specific type that you configured on a VM exceeds the max_instances_per_vm number, then the VM fails to power on and the following error message is displayed:
    Operation failed: NoHostResources: No host has enough available GPU for VM <name of VM>(UUID of VM).
    You could try reducing the GPU allotment...

    When you configure multiple vGPUs on a VM, after you select the appropriate vGPU type for the first vGPU assignment, Prism (Prism Central and Prism Element Web Console) automatically restricts the selection of vGPU type for subsequent vGPU assignments to the same VM.

    Figure. vGPU Type Restriction Message

    Note:

    You can use CLI (acli) to configure multiple vGPUs of multiple types to the same VM. See Acropolis Command-Line Interface (aCLI) for information about aCLI. Use the vm.gpu_assign <vm.name> gpu=<gpu-type> command multiple times, once for each vGPU, to add multiple vGPUs of multiple types to the same VM.

    See the GPU board and software documentation for information about the combinations of the number and types of vGPUs profiles supported by the GPU resources installed in the cluster. For example, see the NVIDIA Virtual GPU Software Documentation for the vGPU type and number combinations on the Tesla® M60 board.

  • Configure multiple vGPUs only of the highest type using Prism. The highest type of vGPU profile is based on the driver deployed in the cluster. In the aforesaid example, on a Tesla® M60 device, you can only configure multiple vGPUs of the M60-8Q type. Prism prevents you from configuring multiple vGPUs of any other type such as M60-2Q.

    Figure. vGPU Type Restriction Message (message showing the restriction on the number of vGPUs of the specified type)

    Note:

    You can use CLI (acli) to configure multiple vGPUs of other available types. See Acropolis Command-Line Interface (aCLI) for the aCLI information. Use the vm.gpu_assign <vm.name> gpu=<gpu-type> command multiple times, once for each vGPU, to configure multiple vGPUs of other available types.

    See the GPU board and software documentation for more information.

  • Configure either a passthrough GPU or vGPUs on the same VM. You cannot configure both passthrough GPU and vGPUs. Prism automatically disallows such configurations after the first GPU is configured.

  • The VM powers on only if the requested type and number of vGPUs are available in the node.

    In the aforesaid example, the VM, which is configured with two M60-8Q vGPUs, fails to power on if another VM sharing the same GPU board is already using one M60-8Q vGPU. This is because the Tesla® M60 GPU board allows only two M60-8Q vGPUs. Of these, one is already used by another VM. Thus, the VM configured with two M60-8Q vGPUs fails to power on due to unavailability of required vGPUs.

  • Multiple vGPUs on the same VM are supported on NVIDIA Virtual GPU software version 10.1 (440.53) or later. Ensure that the relevant GRID version license is installed and select it when you configure multiple vGPUs.

Adding Multiple vGPUs to the Same VM

About this task

You can add multiple vGPUs of the same vGPU type to:

  • A new VM when you create it.

  • An existing VM when you update it.

Important:

Before you add multiple vGPUs to the VM, see Multiple Virtual GPU Support and Restrictions for Multiple vGPU Support.

After you add the first vGPU, do the following on the Create VM or Update VM dialog box (the main dialog box) to add more vGPUs:

Procedure

  1. Click Add GPU.
  2. In the Add GPU dialog box, click Add.

    The License field is grayed out because you cannot select a different license when you add a vGPU for the same VM.

    The VGPU Profile is also auto-selected because you can only select the additional vGPU of the same vGPU type as indicated by the message at the top of the dialog box.

    Figure. Add GPU for multiple vGPUs

  3. In the main dialog box, you see the newly added vGPU.
    Figure. New vGPUs Added

  4. Repeat the steps for each vGPU addition you want to make.

Live Migration of VMs with Virtual GPUs

You can perform live migration of VMs enabled with virtual GPUs (vGPU-enabled VMs). The primary advantage of live migration support is that unproductive downtime is avoided: your vGPUs continue to run while the VMs that use them are migrated seamlessly in the background. Because stun times are very low, graphics users barely notice the migration.

Note: Live migration of VMs with vGPUs is supported for vGPUs created with minimum NVIDIA Virtual GPU software version 10.1 (440.53).

Table 1. Minimum Versions

Component    Supports                                  With Minimum Version
AOS          Live migration within the same cluster    5.18.1
AHV          Live migration within the same cluster    20190916.294
AOS          Live migration across cluster             6.1
AHV          Live migration across cluster             20201105.30142

Important: In an HA event involving any GPU node, the node locality of the affected vGPU VMs is not restored after GPU node recovery. The affected vGPU VMs are not migrated back to their original GPU host intentionally to avoid extended VM stun time expected while migrating vGPU frame buffer. If vGPU VM node locality is required, migrate the affected vGPU VMs to the desired host manually. For information about the steps to migrate a live VM with vGPUs, see Migrating Live a VM with Virtual GPUs in the Prism Central Guide and the Prism Web Console Guide.

Note:

Important frame buffer and VM stun time considerations are:

  • The GPU board (for example, NVIDIA Tesla M60) vendor provides the information for maximum frame buffer size of vGPU types (for example, M60-8Q type) that can be configured on VMs. However, the actual frame buffer usage may be lower than the maximum sizes.

  • The VM stun time depends on the number of vGPUs configured on the VM being migrated. Stun time may be longer in case of multiple vGPUs operating on the VM.

    The stun time also depends on network factors such as the bandwidth available for use during the migration.

For information about the limitations applicable to the live migration support, see Limitations of Live Migration Support and Restrictions for Multiple vGPU Support.

For information about the steps to live migrate a VM with vGPUs, see Migrating Live a VM with Virtual GPUs in the Prism Central Guide and Migrating Live a VM with Virtual GPUs in the Prism Web Console Guide.
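The minimum versions listed in this section can be checked with a simple dotted-version comparison. The following is a minimal sketch: the function name version_ge is hypothetical, and the installed version strings are supplied by the caller, because this section does not describe how to query them.

```shell
# Succeeds if dotted version $1 is greater than or equal to version $2.
# Components are compared numerically from left to right.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, ".")
        nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            # Missing components count as 0, so 6.1 equals 6.1.0.
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}
```

For example, version_ge 5.20 5.18.1 succeeds, so an AOS 5.20 cluster meets the minimum for live migration within the same cluster, whereas version_ge 5.20 6.1 fails, so it does not meet the minimum for live migration across clusters.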

Limitations of Live Migration Support
  • Live migration is supported for VMs configured with single or multiple virtual GPUs. It is not supported for VMs configured with passthrough GPUs.

  • The target cluster for the migration must have adequate and available GPU resources, with the same vGPU types as configured for the VMs to be migrated, to support the vGPUs on the VMs that need to be migrated.

    See Restrictions for Multiple vGPU Support for more details.

  • The VMs with vGPUs that need to be migrated live cannot be protected with high availability.
  • Ensure that the VM is not powered off.
  • Ensure that you have the right GPU software license that supports live migration of vGPUs. The source and target clusters must have the same license type. You require an appropriate license of NVIDIA GRID software version. See Live Migration of VMs with Virtual GPUs for minimum license requirements.

Enabling or Disabling Console Support for vGPU VMs

About this task

Enable or disable console support for a VM with only one vGPU configured. Enabling console support for a VM with multiple vGPUs is not supported. By default, console support for a vGPU VM is disabled.

To enable or disable console support for each VM with vGPUs, do the following:

Procedure

  1. Run the following aCLI command to check if console support is enabled or disabled for the VM with vGPUs.
    acli> vm.get vm-name

    Where vm-name is the name of the VM for which you want to check the console support status.

    The step result includes the following parameter for the specified VM:
    gpu_console=False

    Where False indicates that console support is not enabled for the VM. This parameter is displayed as True when you enable console support for the VM. The default value for gpu_console= is False since console support is disabled by default.

    Note: The console may not display the gpu_console parameter in the output of the vm.get command if the gpu_console parameter was not previously enabled.
  2. Run the following aCLI command to enable or disable console support for the VM with vGPU:
    acli> vm.update vm-name gpu_console=true | false

    Where:

    • true—indicates that you are enabling console support for the VM with vGPU.
    • false—indicates that you are disabling console support for the VM with vGPU.
  3. Run the vm.get command again to confirm that the gpu_console value matches your configuration: true if you enabled console support, or false if you disabled it.

    If the value indicated in the vm.get command output is not what is expected, then perform Guest Shutdown of the VM with vGPU. Next, run the vm.on vm-name aCLI command to turn the VM on again. Then run vm.get command and check the gpu_console= value.

  4. Click a VM name in the VM table view to open the VM details page. Click Launch Console.
    The Console opens but only a black screen is displayed.
  5. Click on the console screen. Press one of the following key combinations based on the operating system from which you are accessing the cluster.
    • For Apple Mac OS: Control+Command+2
    • For MS Windows: Ctrl+Alt+2
    The console is fully enabled and displays the content.
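The status check in steps 1 and 3 can be scripted. The following is a minimal sketch, assuming the vm.get output has been captured as text. The helper name gpu_console_enabled is hypothetical, and accepting a colon separator in addition to the equals sign shown above is an assumption about alternative output layouts.

```python
import re


def gpu_console_enabled(vm_get_output: str) -> bool:
    """Return True only if the captured vm.get output reports gpu_console as true.

    A missing gpu_console parameter is treated as disabled, because the
    parameter may be absent if console support was never enabled.
    """
    # "gpu_console=False" is the form shown in the procedure; the colon form
    # ("gpu_console: True") is an assumption, accepted for robustness.
    match = re.search(r"gpu_console\s*[=:]\s*(\w+)", vm_get_output)
    return bool(match) and match.group(1).lower() == "true"
```

Treating an absent parameter as disabled follows the note in step 1 and the documented default of False.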

PXE Configuration for AHV VMs

You can configure a VM to boot over the network in a Preboot eXecution Environment (PXE). Booting over the network is called PXE booting and does not require the use of installation media. When starting up, a PXE-enabled VM communicates with a DHCP server to obtain information about the boot file it requires.

Configuring PXE boot for an AHV VM involves performing the following steps:

  • Configuring the VM to boot over the network.
  • Configuring the PXE environment.

The procedure for configuring a VM to boot over the network is the same for managed and unmanaged networks. The procedure for configuring the PXE environment differs for the two network types, as follows:

  • An unmanaged network does not perform IPAM functions and gives VMs direct access to an external Ethernet network. Therefore, the procedure for configuring the PXE environment for AHV VMs is the same as for a physical machine or a VM that is running on any other hypervisor. VMs obtain boot file information from the DHCP or PXE server on the external network.
  • A managed network intercepts DHCP requests from AHV VMs and performs IP address management (IPAM) functions for the VMs. Therefore, you must add a TFTP server and the required boot file information to the configuration of the managed network. VMs obtain boot file information from this configuration.

A VM that is configured to use PXE boot boots over the network on subsequent restarts until the boot order of the VM is changed.

Configuring the PXE Environment for AHV VMs

The procedure for configuring the PXE environment for a VM on an unmanaged network is similar to the procedure for configuring a PXE environment for a physical machine on the external network and is beyond the scope of this document. This procedure configures a PXE environment for a VM in a managed network on an AHV host.

About this task

To configure a PXE environment for a VM on a managed network on an AHV host, do the following:

Procedure

  1. Log on to the Prism web console, click the gear icon, and then click Network Configuration in the menu.
  2. On Network Configuration > Subnets tab, click the Edit action link of the network for which you want to configure a PXE environment.
    The VMs that require the PXE boot information must be on this network.
  3. In the Update Subnet dialog box:
    1. Select the Enable IP address management check box and complete the following configurations:
      • In the Network IP Prefix field, enter the network IP address, with prefix, of the subnet that you are updating.
      • In the Gateway IP Address field, enter the gateway IP address of the subnet that you are updating.
      • To provide DHCP settings for the VM, select the DHCP Settings check box and provide the following information.
        Fields, with their descriptions and values:

        • Domain Name Servers: Provide a comma-separated list of DNS IP addresses. Example: 8.8.8.8, 9.9.9.9
        • Domain Search: Enter the VLAN domain name. Use only the domain name format. Example: nutanix.com
        • TFTP Server Name: Enter the host name of the TFTP server that hosts the boot file. The IP address of the TFTP server must be accessible to the virtual machines so that they can download the boot file. Example: tftp_vlan103
        • Boot File Name: Enter the name of the boot file that the VMs need to download from the TFTP server. Example: boot_ahv202010

    2. Under IP Address Pools, click Create Pool to add IP address pools for the subnet.

      (Mandatory for Overlay type subnets) This section provides the Network IP Prefix and Gateway IP fields for the subnet.

      (Optional for VLAN type subnet) Check this box to display the Network IP Prefix and Gateway IP fields and configure the IP address details.

    3. (Optional and for VLAN networks only) Select the Override DHCP Server check box and enter an IP address in the DHCP Server IP Address field.

      You can configure a DHCP server using the Override DHCP Server option only in case of VLAN networks.

      The DHCP Server IP address (reserved IP address for the Acropolis DHCP server) is visible only to VMs on this network and responds only to DHCP requests. If this box is not checked, the DHCP Server IP Address field is not displayed and the DHCP server IP address is generated automatically. The automatically generated address is network_IP_address_subnet.254, or if the default gateway is using that address, network_IP_address_subnet.253.

      Usually, the default DHCP server IP is configured as the last usable IP address in the subnet (for example, 10.0.0.254 for the 10.0.0.0/24 subnet). If you want to use a different IP address in the subnet as the DHCP server IP, use the override option.

  4. Click Close.
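To predict which address the automatically generated DHCP server IP is likely to take, the rule described above can be sketched as follows. This is an illustration only: the function name default_dhcp_server_ip is hypothetical, and applying the "last usable address, or the one before it if the gateway holds it" rule to prefixes other than /24 is an assumption based on the 10.0.0.0/24 example.

```python
import ipaddress


def default_dhcp_server_ip(network_prefix: str, gateway_ip: str) -> str:
    """Predict the automatically generated DHCP server IP for a managed subnet."""
    network = ipaddress.ip_network(network_prefix, strict=True)
    gateway = ipaddress.ip_address(gateway_ip)
    # Last usable address: one below the broadcast address.
    last_usable = network.broadcast_address - 1
    # If the default gateway already uses that address, step down by one.
    if last_usable == gateway:
        return str(last_usable - 1)
    return str(last_usable)


print(default_dhcp_server_ip("10.0.0.0/24", "10.0.0.1"))    # 10.0.0.254
print(default_dhcp_server_ip("10.0.0.0/24", "10.0.0.254"))  # 10.0.0.253
```

If the predicted address collides with an address you need for another purpose, use the Override DHCP Server option to choose a different one.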

Configuring a VM to Boot over a Network

To enable a VM to boot over the network, update the VM's boot device setting. Currently, the only user interface that enables you to perform this task is the Acropolis CLI (aCLI).

About this task

To configure a VM to boot from the network, do the following:

Procedure

  1. Log on to any CVM in the cluster with SSH.
  2. Create a VM.
    
    nutanix@cvm$ acli vm.create vm num_vcpus=num_vcpus memory=memory

    Replace vm with a name for the VM, and replace num_vcpus and memory with the number of vCPUs and amount of memory that you want to assign to the VM, respectively.

    For example, create a VM named nw-boot-vm.

    nutanix@cvm$ acli vm.create nw-boot-vm num_vcpus=1 memory=512
  3. Create a virtual interface for the VM and place it on a network.
    nutanix@cvm$ acli vm.nic_create vm network=network

    Replace vm with the name of the VM and replace network with the name of the network. If the network is an unmanaged network, make sure that a DHCP server and the boot file that the VM requires are available on the network. If the network is a managed network, configure the DHCP server to provide TFTP server and boot file information to the VM. See Configuring the PXE Environment for AHV VMs.

    For example, create a virtual interface for VM nw-boot-vm and place it on a network named network1.

    nutanix@cvm$ acli vm.nic_create nw-boot-vm network=network1
  4. Obtain the MAC address of the virtual interface.
    nutanix@cvm$ acli vm.nic_list vm

    Replace vm with the name of the VM.

    For example, obtain the MAC address of VM nw-boot-vm.

    nutanix@cvm$ acli vm.nic_list nw-boot-vm
    00-00-5E-00-53-FF
  5. Update the boot device setting so that the VM boots over the network.
    nutanix@cvm$ acli vm.update_boot_device vm mac_addr=mac_addr

    Replace vm with the name of the VM and mac_addr with the MAC address of the virtual interface that the VM must use to boot over the network.

    For example, update the boot device setting of the VM named nw-boot-vm so that the VM uses the virtual interface with MAC address 00-00-5E-00-53-FF.

    nutanix@cvm$ acli vm.update_boot_device nw-boot-vm mac_addr=00-00-5E-00-53-FF
  6. Power on the VM.
    nutanix@cvm$ acli vm.on vm_list [host="host"]

    Replace vm_list with the name of the VM. Replace host with the name of the host on which you want to start the VM.

    For example, start the VM named nw-boot-vm on a host named host-1.

    nutanix@cvm$ acli vm.on nw-boot-vm host="host-1"
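The procedure above can be captured as a small helper that assembles the documented aCLI commands in order. This is a sketch under stated assumptions: the function names are hypothetical, the helper only builds command strings (it does not run them), and the commands are split into two groups because the MAC address is known only after vm.nic_list has been run.

```python
def pxe_setup_commands(vm, network, num_vcpus=1, memory=512):
    """Commands for steps 2 to 4: create the VM, add a NIC, list its NICs."""
    return [
        f"acli vm.create {vm} num_vcpus={num_vcpus} memory={memory}",
        f"acli vm.nic_create {vm} network={network}",
        f"acli vm.nic_list {vm}",
    ]


def pxe_boot_commands(vm, mac_addr, host=None):
    """Commands for steps 5 and 6: set the boot device, then power on the VM."""
    power_on = f"acli vm.on {vm}"
    if host:
        # The host argument is optional in the documented syntax.
        power_on += f' host="{host}"'
    return [
        f"acli vm.update_boot_device {vm} mac_addr={mac_addr}",
        power_on,
    ]


for command in pxe_setup_commands("nw-boot-vm", "network1"):
    print(command)
for command in pxe_boot_commands("nw-boot-vm", "00-00-5E-00-53-FF", host="host-1"):
    print(command)
```

Run the first group on a CVM, read the MAC address from the vm.nic_list output, and then run the second group.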

Uploading Files to DSF for Microsoft Windows Users

If you are a Microsoft Windows user, you can securely upload files to DSF by using the following procedure.

Procedure

  1. Use WinSCP, with SFTP selected, to connect to a Controller VM through port 2222 and start browsing the DSF datastore.
    Note: The root directory displays storage containers and you cannot change it. You can only upload files to one of the storage containers and not directly to the root directory. To create or delete storage containers, you can use the Prism user interface.
  2. Authenticate by using Prism username and password or, for advanced users, use the public key that is managed through the Prism cluster lockdown user interface.
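If you script the connection details, the constraints above (SFTP, port 2222, uploads only into a storage container) can be encoded in a small helper. The following sketch only builds an sftp:// URL string. The function name dsf_upload_url is hypothetical, and whether your SFTP client accepts such a URL as a session address is an assumption to verify against the client documentation.

```python
from urllib.parse import quote

DSF_SFTP_PORT = 2222  # port named in step 1 of the procedure


def dsf_upload_url(cvm_address, username, container, remote_name=""):
    """Build an sftp:// URL that points inside a storage container."""
    container = container.strip("/")
    if not container:
        # The root directory only lists storage containers; it is not writable.
        raise ValueError("uploads must target a storage container, not the root directory")
    path = "/".join(quote(part) for part in (container, remote_name) if part)
    return f"sftp://{quote(username)}@{cvm_address}:{DSF_SFTP_PORT}/{path}"
```

The check on the container name mirrors the note in step 1: files can be uploaded only to one of the storage containers.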

Enabling Load Balancing of vDisks in a Volume Group

AHV hosts support load balancing of vDisks in a volume group for guest VMs. Load balancing of vDisks in a volume group enables IO-intensive VMs to use the storage capabilities of multiple Controller VMs (CVMs).

About this task

If you enable load balancing on a volume group, the guest VM communicates directly with each CVM hosting a vDisk. Each vDisk is served by a single CVM. Therefore, to use the storage capabilities of multiple CVMs, create more than one vDisk for a file system and use the OS-level striped volumes to spread the workload. This configuration improves performance and prevents storage bottlenecks.

Note:
  • vDisk load balancing is disabled by default for volume groups that are directly attached to VMs.

    However, vDisk load balancing is enabled by default for volume groups that are attached to VMs by using a data services IP address.

  • If you use the web console to clone a volume group that has load balancing enabled, the volume group clone does not have load balancing enabled by default. To enable load balancing on the volume group clone, you must set the load_balance_vm_attachments parameter to true by using aCLI or the REST API.
  • You can attach a maximum of 10 load-balanced volume groups per guest VM.
  • For Linux VMs, ensure that the SCSI device timeout is 60 seconds. For information about how to check and modify the SCSI device timeout, see the Red Hat documentation at https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/5/html/online_storage_reconfiguration_guide/task_controlling-scsi-command-timer-onlining-devices.
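For the Linux SCSI timeout requirement in the note above, a quick check inside the guest can be sketched as follows. It assumes the guest exposes the per-device timeout at /sys/block/<device>/device/timeout, as Linux SCSI disks commonly do; the function names are hypothetical, and the sys_block parameter exists only so that the check can be pointed at a different directory.

```python
from pathlib import Path

REQUIRED_TIMEOUT_SECONDS = 60  # value required by the note above


def scsi_timeouts(sys_block="/sys/block"):
    """Map each block device that exposes a SCSI timeout to its value in seconds."""
    timeouts = {}
    for timeout_file in sorted(Path(sys_block).glob("*/device/timeout")):
        device = timeout_file.parent.parent.name  # for example, "sda"
        timeouts[device] = int(timeout_file.read_text().strip())
    return timeouts


def devices_needing_update(sys_block="/sys/block"):
    """Return the devices whose SCSI timeout differs from the required value."""
    return [device for device, seconds in scsi_timeouts(sys_block).items()
            if seconds != REQUIRED_TIMEOUT_SECONDS]
```

The sketch only reports devices; see the Red Hat documentation referenced above for how to change the timeout persistently.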

Perform the following procedure to enable load balancing of vDisks by using aCLI.

Procedure

  1. Log on to a Controller VM with SSH.
  2. Do one of the following:
    • Enable vDisk load balancing if you are creating a volume group.
      nutanix@cvm$ acli vg.create vg_name load_balance_vm_attachments=true

      Replace vg_name with the name of the volume group.

    • Enable vDisk load balancing if you are updating an existing volume group.
      nutanix@cvm$ acli vg.update vg_name load_balance_vm_attachments=true

      Replace vg_name with the name of the volume group.

      Note: To modify an existing volume group, you must first detach all the VMs that are attached to that volume group before you enable vDisk load balancing.
  3. Verify if vDisk load balancing is enabled.
    nutanix@cvm$ acli vg.get vg_name

    An output similar to the following is displayed:

    nutanix@cvm$ acli vg.get ERA_DB_VG_xxxxxxxx
    ERA_DB_VG_xxxxxxxx {
      attachment_list {
        vm_uuid: "xxxxx"
    .
    .
    .
    .
    iscsi_target_name: "xxxxx"
    load_balance_vm_attachments: True
    logical_timestamp: 4
    name: "ERA_DB_VG_xxxxxxxx"
    shared: True
    uuid: "xxxxxx"
    }

    If vDisk load balancing is enabled on a volume group, load_balance_vm_attachments: True is displayed in the output. The output does not display the load_balance_vm_attachments: parameter at all if vDisk load balancing is disabled.

  4. (Optional) Disable vDisk load balancing.
    nutanix@cvm$ acli vg.update vg_name load_balance_vm_attachments=false

    Replace vg_name with the name of the volume group.
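The verification in step 3 can be automated by scanning the captured vg.get output. This is a minimal sketch: the function name load_balancing_enabled is hypothetical, and the parser assumes the "key: value" layout shown in the sample output above.

```python
def load_balancing_enabled(vg_get_output: str) -> bool:
    """Return True if the captured vg.get output reports vDisk load balancing as enabled."""
    for line in vg_get_output.splitlines():
        key, separator, value = line.strip().partition(":")
        if separator and key == "load_balance_vm_attachments":
            return value.strip().lower() == "true"
    # The parameter is not displayed at all when load balancing is disabled.
    return False
```

Returning False when the parameter is absent matches the behavior described in step 3.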

Viewing the List of Restarted VMs After an HA Event

This section describes how to view the list of VMs that are restarted after an HA event in the AHV cluster.

About this task

If an AHV host becomes inaccessible or fails due to an unplanned event, AOS restarts the VMs across the remaining hosts in the cluster.

To view the list of restarted VMs after an HA event:

Procedure

  1. Log in to Prism Central or Prism web console.
  2. View the list of restarted VMs on either of the following pages:
    • Events page:
      1. Navigate to Activity > Events from the entities menu to access the Events page in Prism Central.

        Navigate to Alerts > Events from the main menu to access the Events page in the Prism web console.

      2. Locate or search for the following string, and hover over or click the string:

        VMs restarted due to HA failover

        The system displays the list of restarted VMs in the Summary page and as a hover text for the selected event.

        For example:

        VMs restarted due to HA failover: <VM_Name1>, <VM_Name2>, <VM_Name3>, <VM_Name4>. VMs were running on host X.X.X.1 prior to HA.

      In your cluster, the placeholders <VM_Name1>, <VM_Name2>, <VM_Name3>, and <VM_Name4> are replaced by the names of the actual VMs.

    • Tasks page:
      1. Navigate to Activity > Tasks from the entities menu to access the Tasks page in Prism Central.

        Navigate to Tasks from the main menu to access the Tasks page in the Prism web console.

      2. Locate or search for the following task, and click Details:

        HA failover

        The system displays a list of related tasks for the HA failover event.

      3. Locate or search for the following related task, and click Details:

        Host restart all VMs

        The system displays Restart VM group task for the HA failover event.

      4. In the Entity Affected column, click Details, or hover over the VMs text for Restart VM group task:

      The system displays the list of restarted VMs:

      Figure. List of restarted VMs
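If you copy the event text out of the Events page, the VM names and the failed host can be extracted with a short script. The following is a sketch that assumes the message follows the example format shown above; the function name parse_ha_event and the sample VM names are hypothetical.

```python
import re

EVENT_PATTERN = re.compile(
    r"VMs restarted due to HA failover:\s*(?P<vms>.+?)\.\s*"
    r"VMs were running on host (?P<host>\S+) prior to HA"
)


def parse_ha_event(message: str):
    """Extract the restarted VM names and the original host from an HA event message."""
    match = EVENT_PATTERN.search(message)
    if match is None:
        return None  # not an HA failover event in the expected format
    vms = [name.strip() for name in match.group("vms").split(",") if name.strip()]
    return {"host": match.group("host"), "vms": vms}


event = ("VMs restarted due to HA failover: web-01, db-01. "
         "VMs were running on host 10.0.0.1 prior to HA.")
print(parse_ha_event(event))
```

The function returns None for text that does not match, so it can be applied to a mixed list of events.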

Live vDisk Migration Across Storage Containers

vDisk migration allows you to change the container of a vDisk. You can migrate vDisks across storage containers while they are attached to guest VMs without the need to shut down or delete VMs (live migration). You can either migrate all vDisks attached to a VM or migrate specific vDisks to another container.

In a Nutanix solution, you group vDisks into storage containers and attach vDisks to guest VMs. AOS applies storage policies such as replication factor, encryption, compression, deduplication, and erasure coding at the storage container level. If you apply a storage policy to a storage container, AOS enables that policy on all the vDisks of the container. If you want to change the policies of the vDisks (for example, from RF2 to RF3), create another container with a different policy and move the vDisk to that container. With live migration of vDisks across containers, you can migrate vDisk across containers even if those vDisks are attached to a live VM. Thus, live migration of vDisks across storage containers enables you to efficiently manage storage policies for guest VMs.

General Considerations

You cannot migrate images or volume groups.

You cannot perform the following operations during an ongoing vDisk migration:

  • Clone the VM
  • Resize the VM
  • Take a snapshot
Note: During vDisk migration, the reported logical usage of a vDisk can exceed the total capacity of the vDisk. This occurs because the logical usage includes the space occupied in both the source and destination containers. Once the migration is complete, the logical usage of the vDisk returns to its normal value.

Migration of vDisks stalls if sufficient storage space is not available in the target storage container. Ensure that the target container has sufficient storage space before you begin migration.

Disaster Recovery Considerations

Consider the following points if you have a disaster recovery and backup setup:

  • You cannot migrate vDisks of a VM that is protected by a protection domain or protection policy. When you start the migration, ensure that the VM is not protected by a protection domain or protection policy. If you want to migrate vDisks of such a VM, do the following:
    • Remove the VM from the protection domain or protection policy.
    • Migrate the vDisks to the target container.
    • Add the VM back to the protection domain or protection policy.
    • Configure the remote site with the details of the new container.

    vDisk migration fails if the VM is protected by a protection domain or protection policy.

  • If you are using a third-party backup solution, AOS temporarily blocks snapshot operations for a VM if vDisk migration is in progress for that VM.

Migrating a vDisk to Another Container

You can either migrate all vDisks attached to a VM or migrate specific vDisks to another container.

About this task

Perform the following procedure to migrate vDisks across storage containers:

Procedure

  1. Log on to a CVM in the cluster with SSH.
  2. Do one of the following:
    • Migrate all vDisks of a VM to the target storage container.
      nutanix@cvm$ acli vm.update_container vm-name container=target-container wait=false

      Replace vm-name with the name of the VM whose vDisks you want to migrate and target-container with the name of the target container.

    • Migrate specific vDisks by using either the UUID of the vDisk or address of the vDisk.

      Migrate specific vDisks by using the UUID of the vDisk.

      nutanix@cvm$ acli vm.update_container vm-name device_uuid_list=device_uuid container=target-container wait=false

      Replace vm-name with the name of VM, device_uuid with the device UUID of the vDisk, and target-container with the name of the target storage container.

      Run acli vm.get vm-name on the CVM to determine the device UUID of the vDisk.

      You can migrate multiple vDisks at a time by specifying a comma-separated list of device UUIDs of the vDisks.

      Alternatively, you can migrate vDisks by using the address of the vDisk.

      nutanix@cvm$ acli vm.update_container vm-name disk_addr_list=disk-address container=target-container wait=false

      Replace vm-name with the name of VM, disk-address with the address of the disk, and target-container with the name of the target storage container.

      Run acli vm.get vm-name on the CVM to determine the address of the vDisk.

      Following is the format of the vDisk address:

      bus.index

      Following is a section of the output of the acli vm.get vm-name command:

      disk_list {
           addr {
             bus: "scsi"
             index: 0
           }

      Combine the values of bus and index as shown in the following example:

      nutanix@cvm$ acli vm.update_container TestUVM_1 disk_addr_list=scsi.0 container=test-container-17475

      You can migrate multiple vDisks at a time by specifying a comma-separated list of vDisk addresses.

  3. Check the status of the migration in the Tasks menu of the Prism Element web console.
  4. (Optional) Cancel the migration if you no longer want to proceed with it.
    nutanix@cvm$ ecli task.cancel task_list=task-ID

    Replace task-ID with the ID of the migration task.

    Determine the task ID as follows:

    nutanix@cvm$ ecli task.list

    In the Type column of the tasks list, look for VmChangeDiskContainer.

    VmChangeDiskContainer indicates that it is a vDisk migration task. Note the ID of such a task.

    Note: Note the following points about canceling migration:
    • If you cancel an ongoing migration, AOS retains the vDisks that have not yet been migrated in the source container. AOS does not migrate vDisks that have already been migrated to the target container back to the source container.
    • If sufficient storage space is not available in the original storage container, migration of vDisks back to the original container stalls. To resolve the issue, ensure that the source container has sufficient storage space.
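The bus.index address format used in step 2 can be assembled programmatically from the bus and index values in the vm.get output. The following sketch only builds the command string; the function names are hypothetical, and the wait=false argument is emitted or omitted because both forms appear in the examples above.

```python
def disk_address(bus: str, index: int) -> str:
    """Combine the bus and index values from disk_list into a vDisk address."""
    return f"{bus}.{index}"


def update_container_command(vm_name, target_container, disk_addrs=(), wait_false=True):
    """Build the vm.update_container command for all vDisks or for specific addresses."""
    parts = ["acli", "vm.update_container", vm_name]
    if disk_addrs:
        # Multiple vDisks are passed as a comma-separated list of addresses.
        parts.append("disk_addr_list=" + ",".join(disk_addrs))
    parts.append(f"container={target_container}")
    if wait_false:
        parts.append("wait=false")
    return " ".join(parts)


print(update_container_command(
    "TestUVM_1", "test-container-17475",
    disk_addrs=[disk_address("scsi", 0)], wait_false=False))
```

Passing no addresses produces the form that migrates all vDisks of the VM.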

OVAs

An Open Virtual Appliance (OVA) file is a tar archive file created by converting a virtual machine (VM) into an Open Virtualization Format (OVF) package for easy distribution and deployment. OVA helps you to quickly create, move or deploy VMs on different hypervisors.
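Because an OVA file is a tar archive, you can inspect its contents before uploading it. The following sketch lists the archive members and reports which disk formats appear. The function name ova_summary is hypothetical, and identifying disk formats by the .qcow2 and .vmdk file extensions is an assumption about how the disk files are named.

```python
import tarfile

# Assumed mapping from disk file extension to disk format name.
DISK_FORMATS = {".qcow2": "QCOW2", ".vmdk": "VMDK"}


def ova_summary(fileobj):
    """List the members of an OVA (a tar archive) and the disk formats found in it."""
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        names = archive.getnames()
    formats = sorted({fmt for name in names
                      for extension, fmt in DISK_FORMATS.items()
                      if name.lower().endswith(extension)})
    return {"members": names, "disk_formats": formats}
```

To use it, open the OVA in binary mode and pass the file object, for example with open("vm.ova", "rb") as ova_file: ova_summary(ova_file).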

Prism Central helps you perform the following operations with OVAs:

  • Export an AHV VM as an OVA file.

  • Upload OVAs of VMs or virtual appliances (vApps). You can import (upload) an OVA file with the QCOW2 or VMDK disk formats from a URL or the local machine.

  • Deploy an OVA file as a VM.

  • Download an OVA file to your local machine.

  • Rename an OVA file.

  • Delete an OVA file.

  • Track or monitor the tasks associated with OVA operations in Tasks.

The access to OVA operations is based on your role. See Role Details View in the Prism Central Guide to check if your role allows you to perform the OVA operations.

OVA Restrictions

You can perform the OVA operations subject to the following restrictions:

  • Export to or upload OVAs with one of the following disk formats:
    • QCOW2: Default disk format auto-selected in the Export as OVA dialog box.
    • VMDK: Deselect QCOW2 and select VMDK, if required, before you submit the VM export request when you export a VM.
    • When you export a VM or upload an OVA and the VM or OVA does not have any disks, the disk format is irrelevant.
  • Upload an OVA to multiple clusters using a URL as the source for the OVA. You can upload an OVA only to a single cluster when you use the local OVA File source.
  • Perform the OVA operations only with appropriate permissions. You can run the OVA operations that you have permissions for, based on your assigned user role.
  • The OVA that results from exporting a VM on AHV is compatible with any AHV version 5.18 or later.
  • The minimum supported versions for performing OVA operations are AOS 5.18, Prism Central 2020.8, and AHV-20190916.253.
AHV Administration Guide

AHV 6.5

Product Release Date: 2022-07-25

Last updated: 2022-12-15

AHV Overview

As the default option for Nutanix HCI, the native Nutanix hypervisor, AHV, represents a unique approach to virtualization that offers the powerful virtualization capabilities needed to deploy and manage enterprise applications. AHV complements the HCI value by integrating native virtualization along with networking, infrastructure, and operations management with a single intuitive interface - Nutanix Prism.

Virtualization teams find AHV easy to learn and transition to from legacy virtualization solutions with familiar workflows for VM operations, live migration, VM high availability, and virtual network management. AHV includes resiliency features, including high availability and dynamic scheduling without the need for additional licensing, and security is integral to every aspect of the system from the ground up. AHV also incorporates the optional Flow Security and Networking, allowing easy access to hypervisor-based network microsegmentation and advanced software-defined networking.

See the Field Installation Guide for information about how to deploy and create a cluster. Once you create the cluster by using Foundation, you can use this guide to perform day-to-day management tasks.

AOS and AHV Compatibility

For information about the AOS and AHV compatibility with this release, see Compatibility and Interoperability Matrix.

Minimum Field Requirements for Nutanix Cloud Infrastructure (NCI)

For information about minimum field requirements for NCI, see the Minimum Field Requirements for Nutanix Cloud Infrastructure (NCI) topic in the Acropolis Advanced Administration Guide.

Limitations

For information about AHV configuration limitations, see the Nutanix Configuration Maximums webpage.

Nested Virtualization

Nutanix does not support nested virtualization (nested VMs) in an AHV cluster.

Storage Overview

AHV uses a Distributed Storage Fabric to deliver data services such as storage provisioning, snapshots, clones, and data protection to VMs directly.

In AHV clusters, AOS passes all disks to the VMs as raw SCSI block devices. By that means, the I/O path is lightweight and optimized. Each AHV host runs an iSCSI redirector, which establishes a highly resilient storage path from each VM to storage across the Nutanix cluster.

QEMU is configured with the iSCSI redirector as the iSCSI target portal. Upon a login request, the redirector performs an iSCSI login redirect to a healthy Stargate (preferably the local one).

Figure. AHV Storage

AHV Turbo

AHV Turbo represents significant advances to the data path in AHV. AHV Turbo provides an I/O path that bypasses QEMU and services storage I/O requests, which lowers CPU usage and increases the amount of storage I/O available to VMs.

When you use QEMU, all I/O travels through a single queue that can impact system performance. AHV Turbo provides an I/O path that uses the multi-queue approach to bypass QEMU. The multi-queue approach allows the data to flow from a VM to the storage more efficiently. This results in a much higher I/O capacity and lower CPU usage. The storage queues automatically scale out to match the number of vCPUs configured for a given VM, which results in higher performance as the workload scales up.

AHV Turbo is transparent to VMs and is enabled by default on VMs that run in AHV clusters. For maximum VM performance, ensure that the following conditions are met:

  • The latest Nutanix VirtIO package is installed for Windows VMs. For information on how to download and install the latest VirtIO package, see Installing or Upgrading Nutanix VirtIO for Windows.
    Note: No additional configuration is required at this stage.
  • The VM has more than one vCPU.
  • The workloads are multi-threaded.
Note: Multi-queue is enabled by default in current Linux distributions. For details, refer to the vendor-specific documentation for your Linux distribution.
In addition to the multi-queue approach for storage I/O, you can also achieve the maximum network I/O performance by using the multi-queue approach for any vNICs in the system. For information about how to enable multi-queue and set an optimum number of queues, see Enabling RSS Virtio-Net Multi-Queue by Increasing the Number of VNIC Queues.
Note: Ensure that the guest operating system fully supports multi-queue before you enable it. For details, refer to the vendor-specific documentation for your Linux distribution.

Acropolis Dynamic Scheduling in AHV

Acropolis Dynamic Scheduling (ADS) proactively monitors your cluster for any compute and storage I/O contentions or hotspots over a period of time. If ADS detects a problem, ADS creates a migration plan that eliminates hotspots in the cluster by migrating VMs from one host to another.

You can monitor VM migration tasks from the Task dashboard of the Prism Element web console.

Following are the advantages of ADS:

  • ADS improves the initial placement of the VMs depending on the VM configuration.
  • Nutanix Volumes uses ADS for balancing sessions of the externally available iSCSI targets.
Note: ADS honors all the configured host affinities, VM-host affinities, VM-VM anti-affinity policies, and HA policies.

By default, ADS is enabled, and Nutanix recommends that you keep this feature enabled. For information about how to disable ADS, see Disabling Acropolis Dynamic Scheduling. For information about how to enable ADS if you previously disabled it, see Enabling Acropolis Dynamic Scheduling.

ADS monitors the following resources:

  • VM CPU Utilization: Total CPU usage of each guest VM.
  • Storage CPU Utilization: Storage controller (Stargate) CPU usage per VM or iSCSI target.

ADS does not monitor memory and networking usage.

How Acropolis Dynamic Scheduling Works

Lazan is the ADS service in an AHV cluster. AOS selects a Lazan manager and Lazan solver among the hosts in the cluster to effectively manage ADS operations.

ADS performs the following tasks to resolve compute and storage I/O contentions or hotspots:

  • The Lazan manager gathers statistics from the components it monitors.
  • The Lazan solver (runner) checks the statistics for potential anomalies and determines how to resolve them, if possible.
  • The Lazan manager invokes the tasks (for example, VM migrations) to resolve the situation.
Note:
  • During migration, a VM consumes resources on both the source and destination hosts as the High Availability (HA) reservation algorithm must protect the VM on both hosts. If a migration fails due to lack of free resources, turn off some VMs so that migration is possible.
  • If a problem is detected and ADS cannot solve the issue (for example, because of limited CPU or storage resources), the migration plan might fail. In these cases, an alert is generated. Monitor these alerts from the Alerts dashboard of the Prism Element web console and take necessary remedial actions.
  • If the host, firmware, or AOS upgrade is in progress and if any resource contention occurs during the upgrade period, ADS does not perform any resource contention rebalancing.

When Is a Hotspot Detected?

Lazan runs every 15 minutes and analyzes the resource usage for at least that period of time. If the resource utilization of an AHV host remains above 85% for a span of 15 minutes, Lazan triggers migration tasks to remove the hotspot.

Note: For a storage hotspot, ADS looks at the last 40 minutes of data and uses a smoothing algorithm to use the most recent data. For a CPU hotspot, ADS looks at the last 10 minutes of data only, that is, the average CPU usage over the last 10 minutes.
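The CPU rule in the note can be illustrated with a short calculation. This sketch is not the Lazan implementation: the function name is_cpu_hotspot and the sample format are hypothetical, and it only applies the stated rule (average CPU usage over the last 10 minutes compared against the 85% threshold).

```python
from statistics import mean

HOTSPOT_THRESHOLD_PERCENT = 85.0  # threshold stated above
CPU_WINDOW_MINUTES = 10           # CPU hotspots use the last 10 minutes of data


def is_cpu_hotspot(samples) -> bool:
    """samples: (minutes_ago, cpu_percent) pairs for one host.

    Returns True if the average CPU usage over the window exceeds the threshold.
    """
    recent = [percent for minutes_ago, percent in samples
              if minutes_ago <= CPU_WINDOW_MINUTES]
    return bool(recent) and mean(recent) > HOTSPOT_THRESHOLD_PERCENT


print(is_cpu_hotspot([(1, 95), (5, 90), (9, 88), (20, 10)]))  # True
print(is_cpu_hotspot([(1, 95), (5, 60)]))                     # False
```

In the first call, the sample from 20 minutes ago falls outside the window, so the average is taken over 95, 90, and 88.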

Following are the possible reasons if there is an obvious hotspot, but the VMs did not migrate:

  • Lazan cannot resolve a hotspot. For example:
    • A large VM (16 vCPUs) runs at 100% usage and accounts for 75% of the usage of its AHV host (which is also at 100% usage).
    • The other hosts are loaded at approximately 40% usage.

    In these situations, the other hosts cannot accommodate the large VM without causing contention there as well. Lazan does not prioritize one host or VM over others for contention, so it leaves the VM where it is hosted.

  • Number of all-flash nodes in the cluster is less than the replication factor.

    If the cluster has an RF2 configuration, the cluster must have a minimum of two all-flash nodes for successful migration of VMs on all the all-flash nodes.
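The first case above (a large VM that no other host can absorb) comes down to simple arithmetic. The following sketch assumes, for illustration only, that all hosts have the same capacity, so a VM that accounts for 75% of its source host would add 75 percentage points to any target host. The function names are hypothetical and do not describe how Lazan computes placement.

```python
HOTSPOT_THRESHOLD = 85.0  # percent utilization, as stated in the text


def projected_usage(target_usage_pct, vm_share_pct):
    """Usage of the target host after receiving the VM.

    Assumption: source and target hosts have identical capacity.
    """
    return target_usage_pct + vm_share_pct


def can_host_vm(target_usage_pct, vm_share_pct, threshold=HOTSPOT_THRESHOLD):
    """True if the target host would stay at or below the hotspot threshold."""
    return projected_usage(target_usage_pct, vm_share_pct) <= threshold
```

With the numbers from the example, a target host at 40% would reach 115% after the migration, so moving the VM would only relocate the contention.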

Migrations Audit

Prism Central displays the list of all the VM migration operations generated by ADS. In Prism Central, go to Menu -> Activity -> Audits to display the VM migrations list. You can filter the migrations by clicking Filters and selecting Migrate in the Operation Type tab. The list displays all the VM migration tasks created by ADS with details such as the source and target host, VM name, and time of migration.

Disabling Acropolis Dynamic Scheduling

Perform the procedure described in this topic to disable ADS. Nutanix recommends you keep ADS enabled.

Procedure

  1. Log on to a Controller VM in your cluster with SSH.
  2. Disable ADS.
    nutanix@cvm$ acli ads.update enable=false

    No action is taken by ADS to solve the contentions after you disable the ADS feature. You must manually take the remedial actions or you can enable the feature.

Enabling Acropolis Dynamic Scheduling

If you have disabled the ADS feature and want to enable the feature, perform the following procedure.

Procedure

  1. Log on to a Controller VM in your cluster with SSH.
  2. Enable ADS.
    nutanix@cvm$ acli ads.update enable=true

Virtualization Management Web Console Interface

You can manage the virtualization management features by using the Prism GUI (Prism Element and Prism Central web consoles).

You can do the following by using the Prism web consoles:

  • Configure network connections
  • Create virtual machines
  • Manage virtual machines (launch console, start/shut down, take snapshots, migrate, clone, update, and delete)
  • Monitor virtual machines
  • Enable VM high availability

See Prism Web Console Guide and Prism Central Guide for more information.

Finding the AHV Version on Prism Element

You can see the installed AHV version in the Prism Element web console.

About this task

To view the AHV version installed on the host, do the following.

Procedure

  1. Log on to the Prism Element web console.
  2. The Hypervisor Summary widget on the top left side of the Home page displays the AHV version.
    Figure. LCM Page Displays AHV Version

Finding the AHV Version on Prism Central

You can see the installed AHV version in the Prism Central console.

About this task

To view the AHV version installed on any host in the clusters managed by the Prism Central, do the following.
Video: Finding the AHV Version on Prism Central

Procedure

  1. Log on to Prism Central.
  2. In the sidebar, select Hardware > Hosts > Summary tab.
  3. Click the host you want to see the hypervisor version for.
  4. The Host detail view page displays the Properties widget that lists the Hypervisor Version.
    Figure. Hypervisor Version in Host Detail View

Node Management

Nonconfigurable AHV Components

The components listed here are configured by the Nutanix manufacturing and installation processes. Do not modify any of these components except under the direction of Nutanix Support.

Nutanix Software

Modifying any of the following Nutanix software settings may inadvertently constrain performance of your Nutanix cluster or render the Nutanix cluster inoperable.

  • Local datastore name.
  • Configuration and contents of any CVM (except memory configuration to enable certain features).
Important: Note the following important considerations about CVMs.
  • Do not delete the Nutanix CVM.
  • Do not take a snapshot of the CVM for backup.
  • Do not rename, modify, or delete the admin and nutanix user accounts of the CVM.
  • Do not create additional CVM user accounts.

    Use the default accounts (admin or nutanix), or use sudo to elevate to the root account.

  • Do not decrease CVM memory below recommended minimum amounts required for cluster and add-in features.

    Nutanix Cluster Checks (NCC), preupgrade cluster checks, and the AOS upgrade process detect and monitor CVM memory.

  • Nutanix does not support the usage of third-party storage on the host part of Nutanix clusters.

    Normal cluster operations might be affected if there are connectivity issues with the third-party storage you attach to the hosts in a Nutanix cluster.

  • Do not run any commands on a CVM that are not in the Nutanix documentation.

AHV Settings

Nutanix AHV is a cluster-optimized hypervisor appliance.

Alteration of the hypervisor appliance (unless advised by Nutanix Technical Support) is unsupported and may result in the hypervisor or VMs functioning incorrectly.

Unsupported alterations include (but are not limited to):

  • Hypervisor configuration, including installed packages
  • Controller VM virtual hardware configuration file (.xml file). Each AOS version and upgrade includes a specific Controller VM virtual hardware configuration. Therefore, do not edit or otherwise modify the Controller VM virtual hardware configuration file.
  • iSCSI settings
  • Open vSwitch settings

  • Installation of third-party software not approved by Nutanix
  • Installation or upgrade of software packages from non-Nutanix sources (using yum, rpm, or similar)
  • Taking snapshots of the Controller VM
  • Creating user accounts on AHV hosts
  • Changing the timezone of the AHV hosts. By default, the timezone of an AHV host is set to UTC.
  • Joining AHV hosts to Active Directory or OpenLDAP domains

Controller VM Access

Although each host in a Nutanix cluster runs a hypervisor independent of other hosts in the cluster, some operations affect the entire cluster.

Most administrative functions of a Nutanix cluster can be performed through the web console (Prism); however, some management tasks require access to the Controller VM (CVM) over SSH. Nutanix recommends restricting CVM SSH access with password or key authentication.

This topic provides information about how to access the Controller VM as an admin user and nutanix user.

admin User Access

Use the admin user access for all tasks and operations that you must perform on the controller VM. As an admin user with default credentials, you cannot access nCLI. You must change the default password before you can use nCLI. Nutanix recommends that you do not create additional CVM user accounts. Use the default accounts (admin or nutanix), or use sudo to elevate to the root account.

For more information about admin user access, see Admin User Access to Controller VM.

nutanix User Access

Nutanix strongly recommends that you do not use the nutanix user access unless the procedure (as provided in a Nutanix Knowledge Base article or user guide) specifically requires the use of the nutanix user access.

For more information about nutanix user access, see Nutanix User Access to Controller VM.

You can perform most administrative functions of a Nutanix cluster through the Prism web consoles or REST API. Nutanix recommends using these interfaces whenever possible and disabling Controller VM SSH access with password or key authentication. Some functions, however, require logging on to a Controller VM with SSH. Exercise caution whenever connecting directly to a Controller VM as it increases the risk of causing cluster issues.

Warning: When you connect to a Controller VM with SSH, ensure that the SSH client does not import or change any locale settings. The Nutanix software is not localized, and running the commands with any locale other than en_US.UTF-8 can cause severe cluster issues.

To check the locale used in an SSH session, run /usr/bin/locale. If any environment variables are set to anything other than en_US.UTF-8, reconnect with an SSH configuration that does not import or change any locale settings.
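As an illustration of the check described above, the following sketch scans the text printed by /usr/bin/locale and reports every variable whose value is something other than en_US.UTF-8. The function name is hypothetical, and treating empty values (such as an unset LC_ALL) as "not set" is an assumption of this sketch.

```python
EXPECTED_LOCALE = "en_US.UTF-8"


def unexpected_locale_settings(locale_output, expected=EXPECTED_LOCALE):
    """Return {variable: value} for locale variables that differ from `expected`.

    `locale_output` is the text printed by /usr/bin/locale, one NAME=value
    pair per line. Values may be wrapped in double quotes.
    """
    problems = {}
    for line in locale_output.splitlines():
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not a NAME=value line
        value = value.strip().strip('"')
        # Assumption: an empty value means the variable is not set.
        if value and value != expected:
            problems[name.strip()] = value
    return problems
```

An empty result means the session matches the expected locale. Any entry in the result is a reason to reconnect with an SSH configuration that does not import or change locale settings.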

Admin User Access to Controller VM

You can access the Controller VM as the admin user (admin user name and password) with SSH. For security reasons, the password of the admin user must meet Controller VM Password Complexity Requirements. When you log on to the Controller VM as the admin user for the first time, you are prompted to change the default password.

See Controller VM Password Complexity Requirements to set a secure password.

After you have successfully changed the password, the new password is synchronized across all Controller VMs and interfaces (Prism web console, nCLI, and SSH).

Note:
  • As an admin user, you cannot access nCLI by using the default credentials. If you are logging in as the admin user for the first time, you must log on through the Prism web console or SSH to the Controller VM. Also, you cannot change the default password of the admin user through nCLI. To change the default password of the admin user, you must log on through the Prism web console or SSH to the Controller VM.
  • When you make an attempt to log in to the Prism web console for the first time after you upgrade to AOS 5.1 from an earlier AOS version, you can use your existing admin user password to log in and then change the existing password (you are prompted) to adhere to the password complexity requirements. However, if you are logging in to the Controller VM with SSH for the first time after the upgrade as the admin user, you must use the default admin user password (Nutanix/4u) and then change the default password (you are prompted) to adhere to the Controller VM Password Complexity Requirements.
  • You cannot delete the admin user account.
  • The default password expiration age for the admin user is 60 days. You can configure the minimum and maximum password expiration days based on your security requirement.
    • nutanix@cvm$ sudo chage -M MAX-DAYS admin
    • nutanix@cvm$ sudo chage -m MIN-DAYS admin

When you change the admin user password, you must update any applications and scripts using the admin user credentials for authentication. Nutanix recommends that you create a user assigned with the admin role instead of using the admin user for authentication. The Prism Web Console Guide describes authentication and roles.

Following are the default credentials to access a Controller VM.

Table 1. Controller VM Credentials
Interface          Target                 User Name  Password
SSH client         Nutanix Controller VM  admin      Nutanix/4u
SSH client         Nutanix Controller VM  nutanix    nutanix/4u
Prism web console  Nutanix Controller VM  admin      Nutanix/4u

Accessing the Controller VM Using the Admin User Account

About this task

Perform the following procedure to log on to the Controller VM by using the admin user with SSH for the first time.

Procedure

  1. Log on to the Controller VM with SSH by using the management IP address of the Controller VM and the following credentials.
    • User name: admin
    • Password: Nutanix/4u
    You are now prompted to change the default password.
  2. Respond to the prompts, providing the current and new admin user password.
    Changing password for admin.
    Old Password:
    New password:
    Retype new password:
    Password changed.
    

    See the requirements listed in Controller VM Password Complexity Requirements to set a secure password.

    For information about logging on to a Controller VM by using the admin user account through the Prism web console, see Logging Into The Web Console in the Prism Web Console Guide.

Nutanix User Access to Controller VM

You can access the Controller VM as the nutanix user (nutanix user name and password) with SSH. For security reasons, the password of the nutanix user must meet the Controller VM Password Complexity Requirements. When you log on to the Controller VM as the nutanix user for the first time, you are prompted to change the default password.

See Controller VM Password Complexity Requirements to set a secure password.

After you have successfully changed the password, the new password is synchronized across all Controller VMs and interfaces (Prism web console, nCLI, and SSH).

Note:
  • As a nutanix user, you cannot access nCLI by using the default credentials. If you are logging in as the nutanix user for the first time, you must log on through the Prism web console or SSH to the Controller VM. Also, you cannot change the default password of the nutanix user through nCLI. To change the default password of the nutanix user, you must log on through the Prism web console or SSH to the Controller VM.

  • When you make an attempt to log in to the Prism web console for the first time after you upgrade the AOS from an earlier AOS version, you can use your existing nutanix user password to log in and then change the existing password (you are prompted) to adhere to the password complexity requirements. However, if you are logging in to the Controller VM with SSH for the first time after the upgrade as the nutanix user, you must use the default nutanix user password (nutanix/4u) and then change the default password (you are prompted) to adhere to the Controller VM Password Complexity Requirements.

  • You cannot delete the nutanix user account.
  • You can configure the minimum and maximum password expiration days based on your security requirement.
    • nutanix@cvm$ sudo chage -M MAX-DAYS nutanix
    • nutanix@cvm$ sudo chage -m MIN-DAYS nutanix

When you change the nutanix user password, you must update any applications and scripts using the nutanix user credentials for authentication. Nutanix recommends that you create a user assigned with the nutanix role instead of using the nutanix user for authentication. The Prism Web Console Guide describes authentication and roles.

Following are the default credentials to access a Controller VM.

Table 1. Controller VM Credentials
Interface          Target                 User Name  Password
SSH client         Nutanix Controller VM  admin      Nutanix/4u
SSH client         Nutanix Controller VM  nutanix    nutanix/4u
Prism web console  Nutanix Controller VM  admin      Nutanix/4u

Accessing the Controller VM Using the Nutanix User Account

About this task

Perform the following procedure to log on to the Controller VM by using the nutanix user with SSH for the first time.

Procedure

  1. Log on to the Controller VM with SSH by using the management IP address of the Controller VM and the following credentials.
    • User name: nutanix
    • Password: nutanix/4u
    You are now prompted to change the default password.
  2. Respond to the prompts, providing the current and new nutanix user password.
    Changing password for nutanix.
    Old Password:
    New password:
    Retype new password:
    Password changed.
    

    See Controller VM Password Complexity Requirements to set a secure password.

    For information about logging on to a Controller VM by using the nutanix user account through the Prism web console, see Logging Into The Web Console in the Prism Web Console Guide.

Controller VM Password Complexity Requirements

The password must meet the following complexity requirements:

  • At least eight characters long.
  • At least one lowercase letter.
  • At least one uppercase letter.
  • At least one number.
  • At least one special character.
    Note: Ensure that the following conditions are met for the special characters usage in the CVM password:
    • The special characters are appropriately used while setting up the CVM password. In some cases, for example when you use ! followed by a number in the CVM password, it leads to a special meaning at the system end, and the system may replace it with a command from the bash history. In this case, you may generate a password string different from the actual password that you intend to set.
    • The special characters used in the CVM password are ASCII printable characters only. For information about ASCII printable characters, refer to the ASCII printable characters (character code 32-127) article on the ASCII code website.
  • At least four characters difference from the old password.
  • Must not be among the last 5 passwords.
  • Must not have more than 2 consecutive occurrences of a character.
  • Must not be longer than 199 characters.
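The requirements above can be pre-checked before you type a new password at the prompt. The following is a minimal sketch, not the check that the Controller VM itself performs. The function name is hypothetical, and two points are assumptions: how "characters difference from the old password" is counted, and the use of plain-text password history, which is shown only for illustration.

```python
import re
import string
from collections import Counter


def cvm_password_problems(new, old="", history=()):
    """Return a list of rule violations for a candidate CVM password."""
    problems = []
    if len(new) < 8:
        problems.append("shorter than 8 characters")
    if len(new) > 199:
        problems.append("longer than 199 characters")
    if not any(c in string.ascii_lowercase for c in new):
        problems.append("no lowercase letter")
    if not any(c in string.ascii_uppercase for c in new):
        problems.append("no uppercase letter")
    if not any(c in string.digits for c in new):
        problems.append("no number")
    if not any(c in string.punctuation for c in new):
        problems.append("no special character")
    # Printable ASCII is taken here as codes 32 to 126 (127 is DEL).
    if any(not 32 <= ord(c) <= 126 for c in new):
        problems.append("non-printable or non-ASCII character")
    if re.search(r"(.)\1\1", new):
        problems.append("more than 2 consecutive occurrences of a character")
    # Assumption: "difference" counts characters in the new password
    # that have no counterpart in the old one.
    if old and sum((Counter(new) - Counter(old)).values()) < 4:
        problems.append("fewer than 4 characters differ from the old password")
    # Assumption: history holds earlier passwords, oldest first.
    if new in list(history)[-5:]:
        problems.append("matches one of the last 5 passwords")
    # Not a complexity rule, but the note above warns about this pattern.
    if re.search(r"!\d", new):
        problems.append("warning: '!' followed by a number may trigger bash history expansion")
    return problems
```

An empty list means that the candidate passes every check in this sketch. The system remains the authority on whether a password is accepted.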

AHV Host Access

You can perform most of the administrative functions of a Nutanix cluster using the Prism web consoles or REST API. Nutanix recommends using these interfaces whenever possible. Some functions, however, require logging on to an AHV host with SSH.

Note: From AOS 5.15.5 with AHV 20190916.410 onwards, AHV has two new user accounts—admin and nutanix.

Nutanix provides the following users to access the AHV host:

  • root—It is used internally by the AOS. The root user is used for the initial access and configuration of the AHV host.
  • admin—It is used to log on to an AHV host. The admin user is recommended for accessing the AHV host.
  • nutanix—It is used internally by the AOS and must not be used for interactive logon.

Exercise caution whenever connecting directly to an AHV host as it increases the risk of causing cluster issues.

Following are the default credentials to access an AHV host:

Table 1. AHV Host Credentials
Interface   Target    User Name  Password
SSH client  AHV Host  root       nutanix/4u
SSH client  AHV Host  admin      There is no default password for admin. You must set it during the initial configuration.
SSH client  AHV Host  nutanix    nutanix/4u

Initial Configuration

About this task

The AHV host is shipped with the default password for the root and nutanix users, which must be changed using SSH when you log on to the AHV host for the first time. After changing the default passwords and the admin password, all subsequent logins to the AHV host must be with the admin user.

Perform the following procedure to change the admin user account password for the first time:
Note: Perform this initial configuration on all the AHV hosts.

Procedure

  1. Use SSH and log on to the AHV host using the root account.
    $ ssh root@<AHV Host IP Address>
    Nutanix AHV
    root@<AHV Host IP Address> password: # default password nutanix/4u
    
  2. Change the default root user password.
    root@ahv# passwd root
    Changing password for user root.
    New password: 
    Retype new password: 
    passwd: all authentication tokens updated successfully.
    
  3. Change the default nutanix user password.
    root@ahv# passwd nutanix
    Changing password for user nutanix.
    New password: 
    Retype new password: 
    passwd: all authentication tokens updated successfully.
    
  4. Change the admin user password.
    root@ahv# passwd admin
    Changing password for user admin.
    New password: 
    Retype new password: 
    passwd: all authentication tokens updated successfully.
    

Accessing the AHV Host Using the Admin Account

About this task

After setting the admin password in the Initial Configuration, use the admin user for all subsequent logins.

Perform the following procedure to log on to the AHV host by using the admin user with SSH.

Procedure

  1. Log on to the AHV host with SSH using the admin account.
    $ ssh admin@<AHV Host IP Address>
    Nutanix AHV
    
  2. Enter the admin user password configured in the Initial Configuration.
    admin@<AHV Host IP Address> password:
  3. Prefix commands with sudo if privileged access is required.
    $ sudo ls /var/log

Changing Admin User Password

About this task

Perform these steps to change the admin password on every AHV host in the cluster:

Procedure

  1. Log on to the AHV host using the admin account with SSH.
  2. Enter the admin user password configured in the Initial Configuration.
  3. Run the following command with sudo to change the admin user password.
    $ sudo passwd admin
  4. Respond to the prompts and provide the new password.
    [sudo] password for admin: 
    Changing password for user admin.
    New password: 
    Retype new password: 
    passwd: all authentication tokens updated successfully.
    
    Note: Repeat this step for each AHV host.

    See AHV Host Password Complexity Requirements to set a secure password.

Changing the Root User Password

About this task

Perform these steps to change the root password on every AHV host in the cluster:

Procedure

  1. Log on to the AHV host using the admin account with SSH.
  2. Run the sudo command to change to root user.
  3. Change the root password.
    root@ahv# passwd root
  4. Respond to the prompts and provide the current and new root password.
    Changing password for root.
    New password:
    Retype new password:
    passwd: all authentication tokens updated successfully.
    
    Note: Repeat this step for each AHV host.

    See AHV Host Password Complexity Requirements to set a secure password.

Changing Nutanix User Password

About this task

Perform these steps to change the nutanix password on every AHV host in the cluster:

Procedure

  1. Log on to the AHV host using the admin account with SSH.
  2. Run the sudo command to change to root user.
  3. Change the nutanix password.
    root@ahv# passwd nutanix
  4. Respond to the prompts and provide the current and new nutanix password.
    Changing password for nutanix.
    New password:
    Retype new password:
    passwd: all authentication tokens updated successfully.
    
    Note: Repeat this step for each AHV host.

    See AHV Host Password Complexity Requirements to set a secure password.

AHV Host Password Complexity Requirements

The password you choose must meet the following complexity requirements:

  • In configurations with high-security requirements, the password must contain:
    • At least 15 characters.
    • At least one upper case letter (A–Z).
    • At least one lower case letter (a–z).
    • At least one digit (0–9).
    • At least one printable ASCII special (non-alphanumeric) character. For example, a tilde (~), exclamation point (!), at sign (@), number sign (#), or dollar sign ($).
    • At least eight characters different from the previous password.
    • At most three consecutive occurrences of any given character.
    • At most four consecutive occurrences of any given class.

The password cannot be the same as the last 5 passwords.

  • In configurations without high-security requirements, the password must contain:
    • At least eight characters.
    • At least one upper case letter (A–Z).
    • At least one lower case letter (a–z).
    • At least one digit (0–9).
    • At least one printable ASCII special (non-alphanumeric) character. For example, a tilde (~), exclamation point (!), at sign (@), number sign (#), or dollar sign ($).
    • At least three characters different from the previous password.
    • At most three consecutive occurrences of any given character.

The password cannot be the same as the last 5 passwords.

In both types of configuration, if a password for an account is entered three times unsuccessfully within a 15-minute period, the account is locked for 15 minutes.
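The two rule sets above differ only in a few numbers, so they can be expressed as one check with two policies. The following is a minimal sketch under stated assumptions: the class and function names are hypothetical, the way changed characters are counted is an assumption, and the plain-text history list is for illustration only. It does not reproduce the check that the AHV host performs.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import groupby
from typing import Optional


@dataclass(frozen=True)
class HostPasswordPolicy:
    min_length: int
    min_changed: int                     # characters different from previous password
    max_char_run: int = 3                # same character in a row
    max_class_run: Optional[int] = None  # same class in a row; None means no limit


HIGH_SECURITY = HostPasswordPolicy(min_length=15, min_changed=8, max_class_run=4)
STANDARD = HostPasswordPolicy(min_length=8, min_changed=3)


def char_class(c):
    """Classify one character as upper, lower, digit, or special."""
    if "A" <= c <= "Z":
        return "upper"
    if "a" <= c <= "z":
        return "lower"
    if "0" <= c <= "9":
        return "digit"
    return "special"


def longest_run(items):
    """Length of the longest run of equal adjacent items (0 if empty)."""
    return max((len(list(group)) for _, group in groupby(items)), default=0)


def violations(new, policy, old="", history=()):
    """Return a list of rule violations for a candidate AHV host password."""
    found = []
    classes = [char_class(c) for c in new]
    if len(new) < policy.min_length:
        found.append(f"fewer than {policy.min_length} characters")
    for needed in ("upper", "lower", "digit", "special"):
        if needed not in classes:
            found.append(f"missing {needed} character")
    if longest_run(new) > policy.max_char_run:
        found.append("character repeated too many times in a row")
    if policy.max_class_run is not None and longest_run(classes) > policy.max_class_run:
        found.append("too many consecutive characters of one class")
    # Assumption: count characters in the new password with no counterpart in the old one.
    if old and sum((Counter(new) - Counter(old)).values()) < policy.min_changed:
        found.append(f"fewer than {policy.min_changed} characters differ from the previous password")
    if new in list(history)[-5:]:
        found.append("matches one of the last 5 passwords")
    return found
```

For example, a password that passes the STANDARD policy can still fail HIGH_SECURITY because it is shorter than 15 characters or contains more than four lowercase letters in a row.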

Verifying the Cluster Health

Before you perform operations such as restarting a CVM or AHV host and putting an AHV host into maintenance mode, check if the cluster can tolerate a single-node failure.

Before you begin

Ensure that you are running the most recent version of NCC.

About this task

Note: If you see any critical alerts, resolve the issues by referring to the indicated KB articles. If you are unable to resolve any issues, contact Nutanix Support.

Perform the following steps to avoid unexpected downtime or performance issues.

Procedure

  1. Review and resolve any critical alerts. Do one of the following:
    • In the Prism Element web console, go to the Alerts page.
    • Log on to a Controller VM (CVM) with SSH and display the alerts.
      nutanix@cvm$ ncli alert ls
    Note: If you receive alerts indicating expired encryption certificates or a key manager is not reachable, resolve these issues before you shut down the cluster. If you do not resolve these issues, the cluster might lose data.
  2. Verify if the cluster can tolerate a single-node failure. Do one of the following:
    • In the Prism Element web console, in the Home page, check the status of the Data Resiliency Status dashboard.

      Verify that the status is OK. If the status is anything other than OK, resolve the indicated issues before you perform any maintenance activity.

    • Log on to a Controller VM (CVM) with SSH and check the fault tolerance status of the cluster.
      nutanix@cvm$ ncli cluster get-domain-fault-tolerance-status type=node
      

      An output similar to the following is displayed:

      Important:
      Domain Type               : NODE
          Component Type            : STATIC_CONFIGURATION
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 14:22:09 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : ERASURE_CODE_STRIP_SIZE
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 13:19:58 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : METADATA
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Mon Sep 28 14:35:25 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : ZOOKEEPER
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Thu Sep 17 11:09:39 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : EXTENT_GROUPS
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 13:19:58 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : OPLOG
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 13:19:58 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : FREE_SPACE
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 14:20:57 GMT+05:00 2015
      

      The value of the Current Fault Tolerance column must be at least 1 for all the nodes in the cluster.
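If you capture the command output as text, the check above can be automated. The following sketch parses output in the format shown above and lists every component whose Current Fault Tolerance is below 1. The function names are hypothetical, and the sketch assumes the "Key : Value" layout of the sample output.

```python
def parse_fault_tolerance(output):
    """Return {component_type: current_fault_tolerance} from the ncli text output."""
    result = {}
    component = None
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or text without a key/value separator
        key, value = key.strip(), value.strip()
        if key == "Component Type":
            component = value
        elif key == "Current Fault Tolerance" and component is not None:
            result[component] = int(value)
    return result


def components_below(output, minimum=1):
    """Sorted names of components whose fault tolerance is below `minimum`."""
    tolerances = parse_fault_tolerance(output)
    return sorted(name for name, ft in tolerances.items() if ft < minimum)
```

An empty list from components_below means that every listed component can tolerate a single-node failure. Any name in the list is a reason to postpone maintenance until the issue is resolved.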

Node Maintenance Mode

You are required to gracefully place a node into the maintenance mode or non-operational state for reasons such as making changes to the network configuration of a node, performing manual firmware upgrades or replacements, performing CVM maintenance or any other maintenance operations.

Entering and Exiting Maintenance Mode

You can place only one node at a time in maintenance mode for each cluster. When a host is in maintenance mode, the CVM is placed in maintenance mode as part of the node maintenance operation and any associated RF1 VMs are powered-off. The cluster marks the host as unschedulable so that no new VM instances are created on it. When a node is placed in the maintenance mode from the Prism web console, an attempt is made to evacuate VMs from the host. If the evacuation attempt fails, the host remains in the entering maintenance mode state, where it is marked unschedulable, waiting for user remediation.

When a host is placed in the maintenance mode, the non-migratable VMs (for example, pinned or RF1 VMs which have affinity towards a specific node) are powered-off while live migratable or high availability (HA) VMs are moved from the original host to other hosts in the cluster. After exiting the maintenance mode, all non-migratable guest VMs are powered on again and the live migrated VMs are automatically restored on the original host.
Note: VMs with CPU passthrough or PCI passthrough, pinned VMs (with host affinity policies), and RF1 VMs are not migrated to other hosts in the cluster when a node undergoes maintenance. Click the View these VMs link to view the list of VMs that cannot be live-migrated.
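The behavior described above can be summarized as a simple classification. The following sketch is illustrative only: the VM record and its fields are hypothetical and do not correspond to any Nutanix API. It shows which VMs would be powered off and which would be live migrated when a host enters maintenance mode.

```python
from dataclasses import dataclass


@dataclass
class VM:
    # Hypothetical record; the field names are assumptions for this sketch.
    name: str
    pinned: bool = False       # has a host affinity policy
    rf1: bool = False          # RF1 VM
    passthrough: bool = False  # CPU or PCI passthrough


def plan_for_maintenance(vms):
    """Split VM names into those powered off and those live migrated."""
    plan = {"power_off": [], "live_migrate": []}
    for vm in vms:
        non_migratable = vm.pinned or vm.rf1 or vm.passthrough
        plan["power_off" if non_migratable else "live_migrate"].append(vm.name)
    return plan
```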

For information about how to place a node under maintenance, see Putting a Node into Maintenance Mode using Web Console.

You can also place an AHV host under maintenance mode or exit an AHV host from maintenance mode through the CLI.
Note: Using the CLI method to place an AHV host under maintenance only places the hypervisor under maintenance mode. The CVM remains up and running with this method. To place the entire node under maintenance, Nutanix recommends using the UI method (through web console).

Exiting a Node from Maintenance Mode

For information about how to remove a node from the maintenance mode, see Exiting a Node from the Maintenance Mode using Web Console.

Viewing a Node under Maintenance Mode

For information about how to view the node under maintenance mode, see Viewing a Node that is in Maintenance Mode.

UVM Status When Node under Maintenance Mode

For information about how to view the status of UVMs when a node is undergoing maintenance operations, see Guest VM Status when Node is in Maintenance Mode.

Best Practices and Recommendations

Nutanix strongly recommends using the Enter Maintenance Mode option on the Prism web console to place a node under maintenance.

Known Issues and Limitations

  • The Prism web console enabled maintenance operations (enter and exit node maintenance) are currently supported only on AHV.
  • Entering or exiting a node under maintenance from the CLI is not equivalent to entering or exiting the node under maintenance from the Prism Element web console. For example, placing a node under maintenance from the CLI places the AHV host and CVM under maintenance while the CVM continues to remain powered on.
  • You must exit the node from maintenance mode using the same method that you have used to put the node into maintenance mode. For example, if you used CLI to put the node into maintenance mode, you must use CLI to exit the node from maintenance mode. Similarly, if you used web console to put the node into maintenance mode, you must use the web console to exit the node from maintenance mode.

Putting a Node into Maintenance Mode using Web Console

Before you begin

Check the cluster status and resiliency before putting a node under maintenance. You can also verify the status of the UVMs. See Guest VM Status when Node is in Maintenance Mode for more information.

About this task

As the node enters the maintenance mode, the following high-level tasks are performed internally.
  • The AHV host initiates entering the maintenance mode.
  • The HA VMs are live migrated.
  • The pinned and RF1 VMs are powered-off.
  • The AHV host completes entering the maintenance mode.
    Note: At this stage, the AHV host is not shut down. For information about how to shut down the AHV host, see Shutting Down a Node in a Cluster (AHV). You can list all the hosts in the cluster by running nutanix@cvm$ acli host.list command, and note the value of Hypervisor IP for the node you want to shut down.
  • The CVM enters the maintenance mode.
  • The CVM is shut down.

For more information, see Guest VM Status when Node is in Maintenance Mode to view the status of the UVMs.

Perform the following steps to put the node into maintenance mode.

Procedure

  1. Log in to the Prism Element web console.
  2. On the home page, select Hardware from the drop-down menu.
  3. Go to the Table > Host view.
  4. Select the node which you intend to put under maintenance.
  5. Click the Enter Maintenance Mode option.
    Figure. Enter Maintenance Mode Option

    The Host Maintenance window appears with a prompt to power-off all VMs that cannot be live migrated.
    Figure. Host Maintenance Window (Enter Maintenance Mode Enabled)

    Note: VMs with CPU passthrough, VMs with PCI passthrough, pinned VMs (with host affinity policies), and RF1 VMs are not migrated to other hosts in the cluster when a node undergoes maintenance. Click the View these VMs link to view the list of VMs that cannot be live-migrated.
  6. Select the Power-off VMs that can not migrate check-box to enable the Enter Maintenance Mode button.
  7. Click the Enter Maintenance Mode button.
    • A revolving icon appears as a tool tip beside the selected node and also in the Host Details view. This indicates that the host is entering the maintenance mode.
    • The revolving icon disappears and the Exit Maintenance Mode option is enabled after the node completely enters the maintenance mode.
      Figure. Enter Node Maintenance (On-going)

    • You can also monitor the progress of the node maintenance operation through the newly created Host enter maintenance and Enter maintenance mode tasks which appear in the task tray.
    Note: If the node maintenance operation fails, certain operations are rolled back. For example, the CVM is rebooted. However, the live-migrated VMs are not restored to the original host.

What to do next

Once the maintenance activity is complete, you can perform any of the following.

Viewing a Node that is in Maintenance Mode

About this task

Note: This procedure is the same for AHV and ESXi nodes.

Perform the following steps to view a node under maintenance.

Procedure

  1. Log in to the Prism Element web console.
  2. On the home page, select Hardware from the drop-down menu.
  3. Go to the Table > Host view.
  4. Observe the icon along with a tool tip that appears beside the node which is under maintenance. You can also view this icon in the host details view.
    Figure. Example: Node under Maintenance (Table and Host Details View) in AHV

  5. Alternatively, view the node under maintenance from the Hardware > Diagram view.
    Figure. Example: Node under Maintenance (Diagram and Host Details View) in AHV

What to do next

You can:

Exiting a Node from the Maintenance Mode using Web Console

After you perform any maintenance activity, exit the node from the maintenance mode.

About this task

As the node exits the maintenance mode, the following high-level tasks are performed internally.
  • The CVM is powered on.
  • The CVM is taken out of maintenance.
  • The host is taken out of maintenance.

After the host exits the maintenance mode, the RF1 VMs continue to be powered on and the VMs migrate to restore host locality.

To view the status of the UVMs, see Guest VM Status when Node is in Maintenance Mode.

Perform the following steps to remove the node from maintenance mode.

Procedure

  1. On the Prism web console home page, select Hardware from the drop-down menu.
  2. Go to the Table > Host view.
  3. Select the node which you intend to remove from the maintenance mode.
  4. Click the Exit Maintenance Mode option.
    Figure. Exit Maintenance Mode Option

    The Host Maintenance window appears.
  5. Click the Exit Maintenance Mode button.
    Figure. Host Maintenance Window (Exit Maintenance Mode)

    • A revolving icon appears as a tool tip beside the selected node and also in the Host Details view. This indicates that the host is exiting the maintenance mode.
    • The revolving icon disappears and the Enter Maintenance Mode option is enabled after the node completely exits the maintenance mode.
      Figure. Exit Node Maintenance (On-going)

    • You can also monitor the progress of the exit node maintenance operation through the newly created Host exit maintenance and Exit maintenance mode tasks which appear in the task tray.

What to do next

Once a node exits the maintenance mode, you can perform any of the following.

Guest VM Status when Node is in Maintenance Mode

The following scenarios demonstrate the behavior of three guest VM types - high availability (HA) VMs, pinned VMs, and RF1 VMs, when a node enters and exits a maintenance operation. The HA VMs are live VMs that can migrate across nodes if the host server goes down or reboots. The pinned VMs have the host affinity set to a specific node. The RF1 VMs have affinity towards a specific node or a CVM. To view the status of the guest VMs, go to VM > Table.

Note: The following scenarios are the same for AHV and ESXi nodes.

Scenario 1: Guest VMs before Node Entering Maintenance Mode

In this example, you can observe the status of the guest VMs on the node prior to the node entering the maintenance mode. All the guest VMs are powered-on and reside on the same host.

Figure. Example: Original State of VM and Hosts in AHV

Scenario 2: Guest VMs during Node Maintenance Mode

  • As the node enters the maintenance mode, the following high-level tasks are performed internally.
    1. The host initiates entering the maintenance mode.
    2. The HA VMs are live migrated.
    3. The pinned and RF1 VMs are powered-off.
    4. The host completes entering the maintenance mode.
    5. The CVM enters the maintenance mode.
    6. The CVM is shut down.
Figure. Example: VM and Hosts before Entering Maintenance Mode

Scenario 3: Guest VMs after Node Exiting Maintenance Mode

  • As the node exits the maintenance mode, the following high-level tasks are performed internally.
    1. The CVM is powered on.
    2. The CVM is taken out of maintenance.
    3. The host is taken out of maintenance.
    After the host exits the maintenance mode, the RF1 VMs continue to be powered on and the VMs migrate to restore host locality.
Figure. Example: Original State of VM and Hosts in AHV

Putting a Node into Maintenance Mode using CLI

Put a node into maintenance mode before you perform tasks such as changing the network configuration of the node, performing manual firmware upgrades, or carrying out other maintenance activities.

Before you begin

Caution: Verify the data resiliency status of your cluster. If the cluster has only replication factor 2 (RF2), you can shut down only one node at a time. If more than one node in an RF2 cluster must be shut down, shut down the entire cluster instead.

About this task

When a host is in maintenance mode, AOS marks the host as unschedulable so that no new VM instances are created on it. Next, an attempt is made to evacuate VMs from the host.

If the evacuation attempt fails, the host remains in the "entering maintenance mode" state, where it is marked unschedulable, waiting for user remediation. You can shut down VMs on the host or move them to other nodes. Once the host has no more running VMs, it is in maintenance mode.

When a host is in maintenance mode, VMs are moved from that host to other hosts in the cluster. After exiting maintenance mode, those VMs are automatically returned to the original host, eliminating the need to manually move them.

VMs with GPU, CPU passthrough, PCI passthrough, and host affinity policies are not migrated to other hosts in the cluster. You can choose to shut down such VMs while putting the node into maintenance mode.

Agent VMs are always shut down if you put a node in maintenance mode and are powered on again after exiting maintenance mode.
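
The evacuation rules above can be summarized in a short sketch. The following Python is an illustrative model only, not Nutanix code: the Vm class and both function names are hypothetical, and the action values block and acpi_shutdown are the ones accepted by the non_migratable_vm_action parameter described in the procedure that follows.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Vm:
    """Hypothetical description of a guest VM; field names are illustrative."""
    name: str
    gpu: bool = False
    cpu_passthrough: bool = False
    pci_passthrough: bool = False
    host_affinity: bool = False
    agent_vm: bool = False


def planned_action(vm: Vm, non_migratable_vm_action: str = "block") -> str:
    """Return 'migrate', 'shutdown', or 'blocked' for a single VM."""
    if non_migratable_vm_action not in ("block", "acpi_shutdown"):
        raise ValueError("action must be 'block' or 'acpi_shutdown'")
    # Agent VMs are always shut down when the node enters maintenance mode.
    if vm.agent_vm:
        return "shutdown"
    non_migratable = (vm.gpu or vm.cpu_passthrough
                      or vm.pci_passthrough or vm.host_affinity)
    if not non_migratable:
        return "migrate"
    # With 'block', the VM is neither migrated nor shut down.
    return "shutdown" if non_migratable_vm_action == "acpi_shutdown" else "blocked"


def host_can_finish_entering(vms: Iterable[Vm],
                             non_migratable_vm_action: str = "block") -> bool:
    """The host reaches maintenance mode only when no VM is left running on it."""
    return all(planned_action(vm, non_migratable_vm_action) != "blocked"
               for vm in vms)
```

In this model, a host that carries a GPU VM stays in the "entering maintenance mode" state under the default block action until you shut down or move that VM yourself.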

Perform the following steps to put the node into maintenance mode.

Procedure

  1. Use SSH to log on to a Controller VM in the cluster.
  2. Determine the IP address of the node you want to put into maintenance mode.
    nutanix@cvm$ acli host.list

    Note the value of Hypervisor IP for the node you want to put in maintenance mode.

  3. Put the node into maintenance mode.
    nutanix@cvm$ acli host.enter_maintenance_mode hypervisor-IP-address [wait="{ true | false }" ] [non_migratable_vm_action="{ acpi_shutdown | block }" ]
    Note: Never put Controller VM and AHV hosts into maintenance mode on single-node clusters. It is recommended to shut down user VMs before proceeding with disruptive changes.

    Replace hypervisor-IP-address with either the IP address or host name of the AHV host that you want to put into maintenance mode.

    The following are optional parameters for running the acli host.enter_maintenance_mode command:

    • wait: Set the wait parameter to true to wait for the host evacuation attempt to finish.
    • non_migratable_vm_action: By default the non_migratable_vm_action parameter is set to block, which means VMs with GPU, CPU passthrough, PCI passthrough, and host affinity policies are not migrated or shut down when you put a node into maintenance mode.

      If you want to automatically shut down such VMs, set the non_migratable_vm_action parameter to acpi_shutdown.

  4. Verify if the host is in the maintenance mode.
    nutanix@cvm$ acli host.get host-ip

    In the output that is displayed, ensure that node_state equals EnteredMaintenanceMode and schedulable equals False.

    Do not continue if the host has failed to enter the maintenance mode.

  5. See Verifying the Cluster Health to once again check if the cluster can tolerate a single-node failure.
  6. Determine the ID of the host.
    nutanix@cvm$ ncli host list

    An output similar to the following is displayed:

    Id                        : aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee::1234
    Uuid                      : ffffffff-gggg-hhhh-iiii-jjjjjjjjjjj
    Name                      : XXXXXXXXXXX-X
    IPMI Address              : X.X.Z.3
    Controller VM Address     : X.X.X.1
    Hypervisor Address        : X.X.Y.2

    In this example, the host ID is 1234.

  7. Put the CVM into the maintenance mode.
    nutanix@cvm$ ncli host edit id=host-ID enable-maintenance-mode=true

    Replace host-ID with the ID of the host.

    This step prevents the CVM services from being affected by any connectivity issues.

    Wait for a few minutes until the CVM is put into the maintenance mode.

  8. Verify if the CVM is in the maintenance mode.

    Run the following command on the CVM that you put in the maintenance mode.

    nutanix@cvm$ genesis status | grep -v "\[\]"

    An output similar to the following is displayed:

    nutanix@cvm$ genesis status | grep -v "\[\]"
    2021-09-24 05:28:03.827628: Services running on this node:
      genesis: [11189, 11390, 11414, 11415, 15671, 15672, 15673, 15676]
      scavenger: [27241, 27525, 27526, 27527]
      xmount: [25915, 26055, 26056, 26074]
      zookeeper: [13053, 13101, 13102, 13103, 13113, 13130]
    nutanix@cvm$ 

    Only the Genesis, Scavenger, Xmount, and Zookeeper processes must be running (process ID is displayed next to the process name).

    Do not continue if the CVM has failed to enter the maintenance mode, because it can cause a service interruption.
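
The check on the genesis status output can also be scripted. The following Python is a minimal sketch that assumes the output has the same shape as the sample shown in this procedure (one "name: [PIDs]" line per service). The function names are hypothetical and are not part of any Nutanix tool.

```python
import re

# Services that are expected to keep running while the CVM is in maintenance mode.
EXPECTED = {"genesis", "scavenger", "xmount", "zookeeper"}

# Matches lines such as "  genesis: [11189, 11390]".
_SERVICE_LINE = re.compile(r"^\s*([A-Za-z_]\w*):\s*\[([^\]]*)\]\s*$")

SAMPLE = """\
2021-09-24 05:28:03.827628: Services running on this node:
  genesis: [11189, 11390, 11414, 11415, 15671, 15672, 15673, 15676]
  scavenger: [27241, 27525, 27526, 27527]
  xmount: [25915, 26055, 26056, 26074]
  zookeeper: [13053, 13101, 13102, 13103, 13113, 13130]
"""


def running_services(genesis_status_output: str) -> set:
    """Return the names of services that list at least one process ID."""
    running = set()
    for line in genesis_status_output.splitlines():
        match = _SERVICE_LINE.match(line)
        if match and match.group(2).strip():
            running.add(match.group(1))
    return running


def cvm_in_maintenance_mode(genesis_status_output: str) -> bool:
    """True when exactly the expected services are running."""
    return running_services(genesis_status_output) == EXPECTED
```

Services with an empty PID list are ignored, which mirrors the grep -v "\[\]" filter in the command.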

What to do next

Perform the maintenance activity. Once the maintenance activity is complete, remove the node from the maintenance mode. See Exiting a Node from the Maintenance Mode Using CLI for more information.

Exiting a Node from the Maintenance Mode Using CLI

After you perform any maintenance activity, exit the node from the maintenance mode.

About this task

Perform the following to exit the host from the maintenance mode.

Procedure

  1. Remove the CVM from the maintenance mode.
    1. Determine the ID of the host.
      nutanix@cvm$ ncli host list

      An output similar to the following is displayed:

      Id                        : aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee::1234  
      Uuid                      : ffffffff-gggg-hhhh-iiii-jjjjjjjjjjj 
      Name                      : XXXXXXXXXXX-X 
      IPMI Address              : X.X.Z.3 
      Controller VM Address     : X.X.X.1 
      Hypervisor Address        : X.X.Y.2
      

      In this example, the host ID is 1234.

    2. From any other CVM in the cluster, run the following command to exit the CVM from the maintenance mode.
      nutanix@cvm$ ncli host edit id=host-ID enable-maintenance-mode=false

      Replace host-ID with the ID of the host.

      Note: The command fails if you run the command from the CVM that is in the maintenance mode.
    3. Verify if all the processes on all the CVMs are in the UP state.
      nutanix@cvm$ cluster status | grep -v UP
    Do not continue if the CVM has failed to exit the maintenance mode.
  2. Remove the AHV host from the maintenance mode.
    1. From any CVM in the cluster, run the following command to exit the AHV host from the maintenance mode.
      nutanix@cvm$ acli host.exit_maintenance_mode host-ip 
      

      Replace host-ip with the new IP address of the host.

      This command migrates (live migration) all the VMs that were previously running on the host back to the host.

    2. Verify if the host has exited the maintenance mode.
      nutanix@cvm$ acli host.get host-ip 

      In the output that is displayed, ensure that node_state equals kAcropolisNormal or AcropolisNormal and schedulable equals True.

    Contact Nutanix Support if any of the steps described in this document produce unexpected results.
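
If you script this procedure, the host ID can be extracted from the ncli host list output instead of being read by eye. The following Python is a sketch under the assumption that the output matches the sample above, where the host ID is the part of the Id field after the "::" separator. The function names are hypothetical; the command string is the ncli host edit command used in this procedure.

```python
from typing import List

SAMPLE = """\
Id                        : aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee::1234
Uuid                      : ffffffff-gggg-hhhh-iiii-jjjjjjjjjjj
Name                      : XXXXXXXXXXX-X
IPMI Address              : X.X.Z.3
Controller VM Address     : X.X.X.1
Hypervisor Address        : X.X.Y.2
"""


def host_ids(ncli_host_list_output: str) -> List[str]:
    """Return the host ID of every 'Id' line, in order of appearance."""
    ids = []
    for line in ncli_host_list_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Id":
            # The host ID follows the last "::" in the value.
            ids.append(value.strip().rsplit("::", 1)[-1])
    return ids


def cvm_maintenance_command(host_id: str, enable: bool) -> str:
    """Build the ncli command that toggles CVM maintenance mode."""
    flag = "true" if enable else "false"
    return f"ncli host edit id={host_id} enable-maintenance-mode={flag}"
```

The sketch only builds the command text. Running it, and choosing a CVM that is not itself in maintenance mode, remains a manual step.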

Shutting Down a Node in a Cluster (AHV)

Before you begin

Caution: Verify the data resiliency status of your cluster. If the cluster has only replication factor 2 (RF2), you can shut down only one node at a time. If more than one node in an RF2 cluster must be shut down, shut down the entire cluster instead.

See Verifying the Cluster Health to check if the cluster can tolerate a single-node failure. Do not proceed if the cluster cannot tolerate a single-node failure.

About this task

Perform the following procedure to shut down a node.

Procedure

  1. Put the node into maintenance mode as described in Putting a Node into Maintenance Mode using Web Console.
  2. Log on to the AHV host with SSH.
  3. Shut down the host.
    root@ahv# shutdown -h now

What to do next

See Starting a Node in a Cluster (AHV) for instructions about how to start a node, including how to start a CVM and how to exit a node from maintenance mode.

Starting a Node in a Cluster (AHV)

About this task

Procedure

  1. On the hardware appliance, power on the node. The CVM starts automatically when you reboot the node.
  2. If the node is in maintenance mode, log on to Prism Web Console and remove the node from the maintenance mode.
  3. Log on to another CVM in the Nutanix cluster with SSH.
  4. Verify that the status of all services on all the CVMs is Up.
    nutanix@cvm$ cluster status
    If the Nutanix cluster is running properly, output similar to the following is displayed for each node in the Nutanix cluster.
    CVM:host IP-Address Up
                                    Zeus   UP       [9935, 9980, 9981, 9994, 10015, 10037]
                               Scavenger   UP       [25880, 26061, 26062]
                                  Xmount   UP       [21170, 21208]
                        SysStatCollector   UP       [22272, 22330, 22331]
                               IkatProxy   UP       [23213, 23262]
                        IkatControlPlane   UP       [23487, 23565]
                           SSLTerminator   UP       [23490, 23620]
                          SecureFileSync   UP       [23496, 23645, 23646]
                                  Medusa   UP       [23912, 23944, 23945, 23946, 24176]
                      DynamicRingChanger   UP       [24314, 24404, 24405, 24558]
                                  Pithos   UP       [24317, 24555, 24556, 24593]
                              InsightsDB   UP       [24322, 24472, 24473, 24583]
                                  Athena   UP       [24329, 24504, 24505]
                                 Mercury   UP       [24338, 24515, 24516, 24614]
                                  Mantle   UP       [24344, 24572, 24573, 24634]
                              VipMonitor   UP       [18387, 18464, 18465, 18466, 18474]
                                Stargate   UP       [24993, 25032]
                    InsightsDataTransfer   UP       [25258, 25348, 25349, 25388, 25391, 25393, 25396]
                                   Ergon   UP       [25263, 25414, 25415]
                                 Cerebro   UP       [25272, 25462, 25464, 25581]
                                 Chronos   UP       [25281, 25488, 25489, 25547]
                                 Curator   UP       [25294, 25528, 25529, 25585]
                                   Prism   UP       [25718, 25801, 25802, 25899, 25901, 25906, 25941, 25942]
                                     CIM   UP       [25721, 25829, 25830, 25856]
                            AlertManager   UP       [25727, 25862, 25863, 25990]
                                Arithmos   UP       [25737, 25896, 25897, 26040]
                                 Catalog   UP       [25749, 25989, 25991]
                               Acropolis   UP       [26011, 26118, 26119]
                                   Uhura   UP       [26037, 26165, 26166]
                                    Snmp   UP       [26057, 26214, 26215]
                       NutanixGuestTools   UP       [26105, 26282, 26283, 26299]
                              MinervaCVM   UP       [27343, 27465, 27466, 27730]
                           ClusterConfig   UP       [27358, 27509, 27510]
                                Aequitas   UP       [27368, 27567, 27568, 27600]
                             APLOSEngine   UP       [27399, 27580, 27581]
                                   APLOS   UP       [27853, 27946, 27947]
                                   Lazan   UP       [27865, 27997, 27999]
                                  Delphi   UP       [27880, 28058, 28060]
                                    Flow   UP       [27896, 28121, 28124]
                                 Anduril   UP       [27913, 28143, 28145]
                                   XTrim   UP       [27956, 28171, 28172]
                           ClusterHealth   UP       [7102, 7103, 27995, 28209, 28495, 28496, 28503, 28510,
    28573, 28574, 28577, 28594, 28595, 28597, 28598, 28602, 28603, 28604, 28607, 28645, 28646, 28648, 28792,
    28793, 28837, 28838, 28840, 28841, 28858, 28859, 29123, 29124, 29127, 29133, 29135, 29142, 29146, 29150,
    29161, 29162, 29163, 29179, 29187, 29219, 29268, 29273]
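
Scanning this output by eye is error-prone on a large cluster. The following Python is a minimal sketch that lists the services whose state is not UP, assuming each service line has the "name, state, [PIDs]" layout of the sample above. The function name is hypothetical, and the DOWN state in the usage example is an assumption about how a stopped service is reported.

```python
import re
from typing import List

# Matches service lines such as "   Stargate   UP       [24993, 25032]".
# The "CVM:..." header and wrapped PID lines do not match this pattern.
_SERVICE_LINE = re.compile(r"^\s*(\w+)\s+(\S+)\s+\[")


def services_not_up(cluster_status_output: str) -> List[str]:
    """Return the names of services whose state column is not 'UP'."""
    not_up = []
    for line in cluster_status_output.splitlines():
        match = _SERVICE_LINE.match(line)
        if match and match.group(2) != "UP":
            not_up.append(match.group(1))
    return not_up
```

For example, given a line such as "Stargate   DOWN   []", the function returns ["Stargate"]. An empty list suggests that every listed service is UP.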

Rebooting an AHV Node in a Nutanix Cluster

About this task

The Request Reboot operation in the Prism web console gracefully restarts the selected nodes one after the other.

Perform the following procedure to restart the nodes in the cluster.

Procedure

  1. Click the gear icon in the main menu and then select Reboot in the Settings page.
  2. In the Request Reboot window, select the nodes you want to restart, and click Reboot.
    Figure. Request Reboot of AHV Node

    A progress bar is displayed that indicates the progress of the restart of each node.

Shutting Down an AHV Cluster

You might need to shut down an AHV cluster to perform a maintenance activity or tasks such as relocating the hardware.

Before you begin

Ensure the following before you shut down the cluster.

  1. Upgrade to the most recent version of NCC.
  2. Log on to a Controller VM (CVM) with SSH and run the complete NCC health check.
    nutanix@cvm$ ncc health_checks run_all

    If you receive any failure or error messages, resolve those issues by referring to the KB articles indicated in the output of the NCC check results. If you are unable to resolve these issues, contact Nutanix Support.

    Warning: If you receive alerts indicating expired encryption certificates or an unreachable key manager, resolve these issues before you shut down the cluster. If you do not resolve these issues, the cluster might lose data.

About this task

Shut down an AHV cluster in the following sequence.

Procedure

  1. Shut down the services or VMs associated with AOS features or Nutanix products. For example, shut down all the Nutanix file server VMs (FSVMs). See the documentation of those features or products for more information.
  2. Shut down all the guest VMs in the cluster in one of the following ways.
    • Shut down the guest VMs from within the guest OS.
    • Shut down the guest VMs by using the Prism Element web console.
    • If you are running many VMs, shut down the VMs by using aCLI:
    1. Log on to a CVM in the cluster with SSH.
    2. Shut down all the guest VMs in the cluster.
      nutanix@cvm$ for i in `acli vm.list power_state=on | awk '{print $1}' | grep -v NTNX` ; do acli vm.shutdown $i ; done
      
    3. Verify if all the guest VMs are shut down.
      nutanix@cvm$ acli vm.list power_state=on
    4. If any VMs are on, consider powering off the VMs from within the guest OS. To force shut down through AHV, run the following command:
      nutanix@cvm$ acli vm.off vm-name

      Replace vm-name with the name of the VM you want to shut down.

  3. Stop the Nutanix cluster.
    1. Log on to any CVM in the cluster with SSH.
    2. Stop the cluster.
      nutanix@cvm$ cluster stop
    3. Verify if the cluster services have stopped.
      nutanix@cvm$ cluster status

      The output displays the message The state of the cluster: stop, which confirms that the cluster has stopped.

      Note: Some system services continue to run even if the cluster has stopped.
  4. Shut down all the CVMs in the cluster. Log on to each CVM in the cluster with SSH and shut down that CVM.
    nutanix@cvm$ sudo shutdown -P now
  5. Shut down each node in the cluster. Perform the following steps for each node in the cluster.
    1. Log on to the IPMI web console of each node.
    2. Under Remote Control > Power Control, select Power Off Server - Orderly Shutdown to gracefully shut down the node.
    3. Ping each host to verify that all AHV hosts are shut down.
  6. Complete the maintenance activity or any other tasks.
  7. Start all the nodes in the cluster.
    1. Press the power button on the front of the block for each node.
    2. Log on to the IPMI web console of each node.
    3. On the System tab, check the Power Control status to verify if the node is powered on.
  8. Start the cluster.
    1. Wait for approximately 5 minutes after you start the last node to allow the cluster services to start.
      All CVMs start automatically after you start all the nodes.
    2. Log on to any CVM in the cluster with SSH.
    3. Start the cluster.
      nutanix@cvm$ cluster start
    4. Verify that all the cluster services are in the UP state.
      nutanix@cvm$ cluster status
    5. Start the guest VMs by using the Prism Element web console.

      If you are running many VMs, start the VMs by using aCLI:

      nutanix@cvm$ for i in `acli vm.list power_state=off | awk '{print $1}' | grep -v NTNX` ; do acli vm.on $i; done
    6. Start the services or VMs associated with AOS features or Nutanix products. For example, start all the FSVMs. See the documentation of those features or products for more information.
    7. Verify if all guest VMs are powered on by using the Prism Element web console.
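
The aCLI loops in this procedure select VMs by taking the first whitespace-separated field of each acli vm.list line and dropping names that contain NTNX. The following Python sketch mirrors that selection so that you can preview which commands a loop would run. The function names and the sample VM names are hypothetical, and the sketch makes no assumption about the other columns of the acli vm.list output.

```python
from typing import List


def guest_vm_names(vm_list_output: str) -> List[str]:
    """Mirror `awk '{print $1}' | grep -v NTNX` on acli vm.list output."""
    names = []
    for line in vm_list_output.splitlines():
        fields = line.split()
        if not fields:
            continue  # blank lines produce no loop iteration
        first = fields[0]  # same as awk '{print $1}'
        if "NTNX" in first:  # same as grep -v NTNX
            continue
        names.append(first)
    return names


def power_commands(names: List[str], verb: str) -> List[str]:
    """Build one acli command per VM; verb is vm.shutdown, vm.on, or vm.off."""
    if verb not in ("vm.shutdown", "vm.on", "vm.off"):
        raise ValueError("unsupported verb: " + verb)
    return [f"acli {verb} {name}" for name in names]
```

Because only the first field is used, a VM name that contains spaces is truncated to its first word, both here and in the shell loops. Review the generated list before you run anything against such VMs.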

Changing CVM Memory Configuration (AHV)

About this task

You can increase the memory reserved for each Controller VM in your cluster by using the 1-click Controller VM Memory Upgrade available from the Prism Element web console. Increase memory size depending on the workload type or to enable certain AOS features. See the Increasing the Controller VM Memory Size topic in the Prism Web Console Guide for CVM memory sizing recommendations and instructions about how to increase the CVM memory.

Changing the AHV Hostname

To change the name of an AHV host, log on to any Controller VM (CVM) in the cluster as the admin or nutanix user and run the change_ahv_hostname script.

About this task

Perform the following procedure to change the name of an AHV host:

Procedure

  1. Log on to any CVM in the cluster with SSH.
  2. Change the hostname of the AHV host.
    • If you are logged in as the nutanix user, run the following command:
      nutanix@cvm$ change_ahv_hostname --host_ip=host-IP-address --host_name=new-host-name
    • If you are logged in as the admin user, run the following command:
      admin@cvm$ sudo change_ahv_hostname --host_ip=host-IP-address --host_name=new-host-name
    Note: The system prompts you to enter the admin user password if you run the change_ahv_hostname command with sudo.

    Replace host-IP-address with the IP address of the host whose name you want to change and new-host-name with the new hostname for the AHV host.

    Note: The hostname must follow these naming conventions:
    • The maximum length is 63 characters.
    • Allowed characters are uppercase and lowercase letters (A-Z and a-z), decimal digits (0-9), dots (.), and hyphens (-).
    • The hostname must start and end with a number or letter.

    If you want to update the hostname of multiple hosts in the cluster, run the script for one host at a time (sequentially).

    Note: The Prism Element web console displays the new hostname after a few minutes.
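
You can check a candidate hostname against the naming conventions before you run the script. The following Python is a minimal sketch of the three rules listed in this procedure. The function name is hypothetical, and the change_ahv_hostname script might apply further checks that are not documented here.

```python
import re

# Letters, digits, dots, and hyphens; first and last character alphanumeric.
_HOSTNAME = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9.-]*[A-Za-z0-9])?")


def is_valid_ahv_hostname(name: str) -> bool:
    """Return True if name satisfies the documented naming conventions."""
    return len(name) <= 63 and _HOSTNAME.fullmatch(name) is not None
```

For example, ahv-node-01 passes, while -node (leading hyphen), node_1 (underscore), and any name longer than 63 characters fail.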

Changing the Name of the CVM Displayed in the Prism Web Console

You can change the CVM name that is displayed in the Prism web console. The procedure described in this document does not change the CVM name that is displayed in the terminal or console of an SSH session.

About this task

You can change the CVM name by using the change_cvm_display_name script. Run this script from a CVM other than the CVM whose name you want to change. When you run the change_cvm_display_name script, AOS performs the following steps:

    1. Checks if the new name starts with NTNX- and ends with -CVM. The CVM name must have only letters, numbers, and dashes (-).
    2. Checks if the CVM has received a shutdown token.
    3. Powers off the CVM. The script does not put the CVM or host into maintenance mode. Therefore, the VMs are not migrated from the host and continue to run with the I/O operations redirected to another CVM while the current CVM is in a powered off state.
    4. Changes the CVM name, enables autostart, and powers on the CVM.

Perform the following to change the CVM name displayed in the Prism web console.

Procedure

  1. Use SSH to log on to a CVM other than the CVM whose name you want to change.
  2. Change the name of the CVM.
    nutanix@cvm$ change_cvm_display_name --cvm_ip=CVM-IP --cvm_name=new-name

    Replace CVM-IP with the IP address of the CVM whose name you want to change and new-name with the new name for the CVM.

    The CVM name must have only letters, numbers, and dashes (-), and must start with NTNX- and end with -CVM.

    Note: Do not run this command from the CVM whose name you want to change, because the script powers off the CVM. In this case, when the CVM is powered off, you lose connectivity to the CVM from the SSH console and the script abruptly ends.
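
The naming rules for the CVM display name can be checked in advance in the same way. The following Python is a sketch of the rules stated above (NTNX- prefix, -CVM suffix, and only letters, numbers, and dashes). The function name is hypothetical, and the change_cvm_display_name script remains the authority on what it accepts.

```python
import re


def is_valid_cvm_display_name(name: str) -> bool:
    """Return True if name follows the documented CVM naming rules."""
    return (
        name.startswith("NTNX-")
        and name.endswith("-CVM")
        and re.fullmatch(r"[A-Za-z0-9-]+", name) is not None
    )
```

For example, NTNX-Block1-A-CVM passes, while Block1-CVM (missing prefix) and NTNX-Block_1-CVM (underscore) fail.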

Adding a Never-Schedulable Node (AHV Only)

Add a never-schedulable node if you want to add a node to increase data storage on your Nutanix cluster, but do not want any AHV VMs to run on that node.

About this task

AOS never schedules any VMs on a never-schedulable node. Therefore, a never-schedulable node configuration ensures that no additional compute resources such as CPUs are consumed from the Nutanix cluster. In this way, you can meet the compliance and licensing requirements of your virtual applications.

Note the following points about a never-schedulable node configuration.

Note:
  • Ensure that at any given time, the cluster has a minimum of three functioning nodes (never-schedulable or otherwise). To add your first never-schedulable node to your Nutanix cluster, the cluster must consist of at least three schedulable nodes.
  • You can add any number of never-schedulable nodes to your Nutanix cluster.
  • If you want a node that is already a part of the cluster to work as a never-schedulable node, remove that node from the cluster and then add that node as a never-schedulable node.
  • If you no longer need a node to work as a never-schedulable node, remove the node from the cluster.

Procedure

You can add a never-schedulable node (storage-only node) to a cluster using the Expand Cluster operation from Prism Web Console.
For information about how to add a never-schedulable node to a cluster, see the Expanding a Cluster topic in Prism Web Console Guide.

Compute-Only Node Configuration (AHV Only)

A compute-only (CO) node allows you to seamlessly and efficiently expand the computing capacity (CPU and memory) of your AHV cluster. The Nutanix cluster uses the resources (CPUs and memory) of a CO node exclusively for computing purposes.

Note: Clusters that have compute-only nodes do not support virtual switches. Instead, use bridge configurations for network connections. For more information, see Virtual Switch Limitations.

You can use a supported server or an existing hyperconverged (HC) node as a CO node. To use a node as CO, image the node as CO by using Foundation and then add that node to the cluster by using the Prism Element web console. For more information about how to image a node as a CO node, see the Field Installation Guide.

Note: If you want an existing HC node that is already a part of the cluster to work as a CO node, remove that node from the cluster, image that node as CO by using Foundation, and add that node back to the cluster. For more information about how to remove a node, see Modifying a Cluster.

Key Features of Compute-Only Node

Following are the key features of CO nodes.

  • CO nodes do not have a Controller VM (CVM) and local storage.
  • AOS sources the storage for vDisks associated with VMs running on CO nodes from the hyperconverged (HC) nodes in the cluster.
  • You can seamlessly manage your VMs (CRUD operations, ADS, and HA) by using the Prism Element web console.
  • AHV runs on the local storage media of the CO node.
  • To update AHV on a cluster that contains a compute-only node, use the Life Cycle Manager. For more information, see the LCM Updates topic in the Life Cycle Manager Guide.

Use Case of Compute-Only Node

CO nodes enable you to achieve more control and value from restrictive licenses such as Oracle. A CO node is part of a Nutanix HC cluster, and there is no CVM running on the CO node (VMs use CVMs running on the HC nodes to access disks). As a result, licensed cores on the CO node are used only for the application VMs.

Applications or databases that are licensed on a per CPU core basis require the entire node to be licensed and that also includes the cores on which the CVM runs. With CO nodes, you get a much higher ROI on the purchase of your database licenses (such as Oracle and Microsoft SQL Server) since the CVM does not consume any compute resources.

Minimum Cluster Requirements

Following are the minimum cluster requirements for compute-only nodes.

  • The Nutanix cluster must be at least a three-node cluster before you add a compute-only node.

    However, Nutanix recommends that the cluster has four nodes before you add a compute-only node.

  • The ratio of compute-only to hyperconverged nodes in a cluster must not exceed the following:

    1 compute-only : 2 hyperconverged

  • All the hyperconverged nodes in the cluster must be all-flash nodes.
  • The number of vCPUs assigned to CVMs on the hyperconverged nodes must be greater than or equal to the total number of available cores on all the compute-only nodes in the cluster. The CVM requires a minimum of 12 vCPUs. For more information about how Foundation allocates memory and vCPUs to your platform model, see CVM vCPU and vRAM Allocation in the Field Installation Guide.
  • The total amount of NIC bandwidth allocated to all the hyperconverged nodes must be twice the amount of the total NIC bandwidth allocated to all the compute-only nodes in the cluster.

    Nutanix recommends you use dual 25 GbE on CO nodes and quad 25 GbE on an HC node serving storage to a CO node.

  • The AHV version of the compute-only node must be the same as the other nodes in the cluster.

    When you are adding a CO node to the cluster, AOS checks if the AHV version of the node matches with the AHV version of the existing nodes in the cluster. If there is a mismatch, the add node operation fails.

For general requirements about adding a node to a Nutanix cluster, see Expanding a Cluster.
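
The numeric requirements above lend themselves to a quick planning check. The following Python is an illustrative sketch only: the CoPlan class, its field names, and the function name are hypothetical, the bandwidth rule is read as "at least twice" (an assumption), and the per-CVM minimum of 12 vCPUs is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CoPlan:
    """Planned cluster state; all field names are illustrative."""
    nodes_before_add: int      # nodes in the cluster before adding the CO node
    hc_nodes: int              # hyperconverged nodes after the change
    co_nodes: int              # compute-only nodes after the change
    hc_all_flash: bool         # True if every HC node is all-flash
    cvm_vcpus_total: int       # vCPUs assigned to CVMs across all HC nodes
    co_cores_total: int        # available cores across all CO nodes
    hc_nic_gbps_total: float   # total NIC bandwidth of all HC nodes
    co_nic_gbps_total: float   # total NIC bandwidth of all CO nodes
    ahv_versions_match: bool   # True if the CO node runs the cluster's AHV version


def co_requirement_violations(plan: CoPlan) -> List[str]:
    """Return a list of violated requirements; an empty list means none found."""
    problems = []
    if plan.nodes_before_add < 3:
        problems.append("cluster has fewer than three nodes")
    if plan.co_nodes * 2 > plan.hc_nodes:
        problems.append("CO:HC ratio exceeds 1:2")
    if not plan.hc_all_flash:
        problems.append("not all HC nodes are all-flash")
    if plan.cvm_vcpus_total < plan.co_cores_total:
        problems.append("CVM vCPUs are fewer than CO cores")
    # Assumption: "twice" is interpreted as "at least twice".
    if plan.hc_nic_gbps_total < 2 * plan.co_nic_gbps_total:
        problems.append("HC NIC bandwidth is less than twice CO NIC bandwidth")
    if not plan.ahv_versions_match:
        problems.append("AHV version mismatch")
    return problems
```

As a usage example, four all-flash HC nodes with quad 25 GbE (400 Gbps total) and two CO nodes with dual 25 GbE (100 Gbps total) satisfy the ratio and bandwidth rules, whereas adding a third CO node to the same four HC nodes breaks the 1:2 ratio.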

Restrictions

Nutanix does not support the following features or tasks on a CO node in this release:

  1. Host boot disk replacement
  2. Network segmentation
  3. Virtual Switch configuration: Use bridge configurations instead.

Supported AOS Versions

Nutanix supports compute-only nodes on AOS releases 5.11 or later.

Supported Hardware Platforms

Compute-only nodes are supported on the following hardware platforms.

  • All the NX series hardware
  • Dell XC Core
  • Cisco UCS

Networking Configuration

Note: If you have storage-only AHV nodes in clusters with compute-only nodes being ESXi or Hyper-V, deployment of the default virtual switch vs0 fails. In such cases, the Prism Element, Prism Central, or CLI workflows for virtual switch management are unavailable to manage the bridges and bonds. Use the manage_ovs command options to manage the bridges and bonds.

To perform network tasks on a compute-only node, such as creating or modifying bridges, uplink bonds, or uplink load balancing, you must use the manage_ovs commands with the --host flag, as shown in the following example:

nutanix@cvm$ manage_ovs --host IP_address_of_co_node --bridge_name bridge_name create_single_bridge

Replace IP_address_of_co_node with the IP address of the CO node and bridge_name with the name of the bridge you want to create.

Note: Run the manage_ovs commands for a CO node from any CVM running on a hyperconverged node.

Perform the networking tasks for each CO node in the cluster individually.

For more information about networking configuration of the AHV hosts, see Host Network Management in the AHV Administration Guide.

Adding a Compute-Only Node to an AHV Cluster

About this task

Perform the following procedure to add a compute-only node to a Nutanix cluster.

Procedure

  1. Log on to the Prism Element web console.
  2. Do one of the following:
    • Click the gear icon in the main menu and select Expand Cluster in the Settings page.
    • Go to the hardware dashboard (see Hardware Dashboard) and click Expand Cluster.
  3. In the Select Host screen, scroll down and, under Manual Host Discovery, click Discover Hosts Manually.
    Figure. Discover Hosts Manually

  4. Click Add Host.
    Figure. Add Host

  5. Under Host or CVM IP, type the IP address of the AHV host and click Save.
    This node does not have a Controller VM and you must therefore provide the IP address of the AHV host.
  6. Click Discover and Add Hosts.
    Prism Element discovers this node and the node appears in the list of nodes in the Select Host screen.
  7. Select the node to display the details of the compute-only node.
  8. Click Next.
  9. In the Configure Host screen, click Expand Cluster.

    The add node process begins and Prism Element performs a set of checks before the node is added to the cluster.

    Check the progress of the operation in the Tasks menu of the Prism Element web console. The operation takes approximately five to seven minutes to complete.

  10. Check the Hardware Diagram view to verify that the node was added to the cluster.
    You can identify a node as a CO node if the Prism Element web console does not display the IP address for the CVM.

Host Network Management

Network management in an AHV cluster consists of the following tasks:

  • Configuring Layer 2 switching through virtual switches and Open vSwitch bridges. When configuring a virtual switch, you configure bridges, bonds, and VLANs.
  • Optionally changing the IP address, netmask, and default gateway that were specified for the hosts during the imaging process.

Virtual Networks (Layer 2)

Each VM network interface is bound to a virtual network. Each virtual network is bound to a single VLAN; trunking VLANs to a virtual network is not supported. Networks are designated by the Layer 2 type (vlan) and the VLAN number.

By default, each virtual network maps to a virtual switch such as the default virtual switch vs0. However, you can change this setting to map a virtual network to a custom virtual switch. You are responsible for ensuring that the specified virtual switch exists on all hosts, and that the physical switch ports for the virtual switch uplinks are properly configured to receive VLAN-tagged traffic.

For more information about virtual switches, see About Virtual Switch.

A VM NIC must be associated with a virtual network. You can change the virtual network of a vNIC without deleting and recreating the vNIC.

Managed Networks (Layer 3)

A virtual network can have an IPv4 configuration, but it is not required. A virtual network with an IPv4 configuration is a managed network; one without an IPv4 configuration is an unmanaged network. A VLAN can have at most one managed network defined. If a virtual network is managed, every NIC is assigned an IPv4 address at creation time.

A managed network can optionally have one or more non-overlapping DHCP pools. Each pool must be entirely contained within the network's managed subnet.

If the managed network has a DHCP pool, the NIC automatically gets assigned an IPv4 address from one of the pools at creation time, provided at least one address is available. Addresses in the DHCP pool are not reserved. That is, you can manually specify an address belonging to the pool when creating a virtual adapter. If the network has no DHCP pool, you must specify the IPv4 address manually.
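
The two pool constraints described above (pools must not overlap, and each pool must lie entirely within the managed subnet) can be checked with the Python standard library. This is a minimal sketch, not the validation that AHV performs. The function name `validate_pools` and the representation of a pool as an inclusive (first, last) address pair are assumptions made for illustration.

```python
import ipaddress
from typing import List, Tuple

Pool = Tuple[str, str]  # (first address, last address), both inclusive


def validate_pools(subnet: str, pools: List[Pool]) -> List[str]:
    """Return a list of problems found in the DHCP pools of a managed network."""
    net = ipaddress.ip_network(subnet)
    errors = []
    ranges = []
    for start_s, end_s in pools:
        start = ipaddress.ip_address(start_s)
        end = ipaddress.ip_address(end_s)
        if start > end:
            errors.append(f"{start_s}-{end_s}: start is after end")
            continue
        # Each pool must be entirely contained within the managed subnet.
        if start not in net or end not in net:
            errors.append(f"{start_s}-{end_s}: not contained in {subnet}")
        ranges.append((start, end))
    # After sorting by start address, any overlap shows up between neighbors.
    ranges.sort()
    for (s1, e1), (s2, e2) in zip(ranges, ranges[1:]):
        if s2 <= e1:
            errors.append(f"{s1}-{e1} overlaps {s2}-{e2}")
    return errors


print(validate_pools("10.0.0.0/24", [("10.0.0.10", "10.0.0.50"),
                                     ("10.0.0.100", "10.0.0.150")]))  # prints []
```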

All DHCP traffic on the network is rerouted to an internal DHCP server, which allocates IPv4 addresses. DHCP traffic on the virtual network (that is, between the guest VMs and the Controller VM) does not reach the physical network, and vice versa.

A network must be configured as managed or unmanaged when it is created. It is not possible to convert one to the other.

Figure. AHV Networking Architecture

Prerequisites for Configuring Networking

Change the configuration from the factory default to the recommended configuration. See AHV Networking Recommendations.

AHV Networking Recommendations

Nutanix recommends that you perform the following OVS configuration tasks from the Controller VM, as described in this documentation:

  • Viewing the network configuration
  • Configuring uplink bonds with desired interfaces using the Virtual Switch (VS) configurations.
  • Assigning the Controller VM to a VLAN

For performing other network configuration tasks such as adding an interface to a bridge and configuring LACP for the interfaces in a bond, follow the procedures described in the AHV Networking best practices documentation.

Nutanix recommends that you configure the network as follows:

Table 1. Recommended Network Configuration
Network Component Best Practice
Virtual Switch

Do not modify the OpenFlow tables of any bridges configured in any VS configurations in the AHV hosts.

Do not rename default virtual switch vs0. You cannot delete the default virtual switch vs0.

Do not delete or rename OVS bridge br0.

Do not modify the native Linux bridge virbr0.

Switch Hops

Nutanix nodes send storage replication traffic to each other in a distributed fashion over the top-of-rack network. One Nutanix node can, therefore, send replication traffic to any other Nutanix node in the cluster. The network should provide low and predictable latency for this traffic. Ensure that there are no more than three switches between any two Nutanix nodes in the same cluster.
Switch Fabric

A switch fabric is a single leaf-spine topology or all switches connected to the same switch aggregation layer. The Nutanix VLAN shares a common broadcast domain within the fabric. Connect all Nutanix nodes that form a cluster to the same switch fabric. Do not stretch a single Nutanix cluster across multiple, disconnected switch fabrics.

Every Nutanix node in a cluster should therefore be in the same L2 broadcast domain and share the same IP subnet.

WAN Links

A WAN (wide area network) or metro link connects different physical sites over a distance. As an extension of the switch fabric requirement, do not place Nutanix nodes in the same cluster if they are separated by a WAN.
VLANs

Add the Controller VM and the AHV host to the same VLAN. Place all CVMs and AHV hosts in a cluster in the same VLAN. By default the CVM and AHV host are untagged, shown as VLAN 0, which effectively places them on the native VLAN configured on the upstream physical switch.

Note: Do not add any other device (including guest VMs) to the VLAN to which the CVM and hypervisor host are assigned. Isolate guest VMs on one or more separate VLANs.

Nutanix recommends configuring the CVM and hypervisor host VLAN as the native, or untagged, VLAN on the connected switch ports. This native VLAN configuration allows for easy node addition and cluster expansion. By default, new Nutanix nodes send and receive untagged traffic. If you use a tagged VLAN for the CVM and hypervisor hosts instead, you must configure that VLAN while provisioning the new node, before adding that node to the Nutanix cluster.

Use tagged VLANs for all guest VM traffic and add the required guest VM VLANs to all connected switch ports for hosts in the Nutanix cluster. Limit guest VLANs for guest VM traffic to the smallest number of physical switches and switch ports possible to reduce broadcast network traffic load. If a VLAN is no longer needed, remove it.

Default VS bonded port (br0-up)

Aggregate the fastest links of the same speed on the physical host to a VS bond on the default vs0 and provision VLAN trunking for these interfaces on the physical switch.

By default, interfaces in the bond in the virtual switch operate in the recommended active-backup mode.
Note: The mixing of bond modes across AHV hosts in the same cluster is not recommended and not supported.
1 GbE and 10 GbE interfaces (physical host)

If 10 GbE or faster uplinks are available, Nutanix recommends that you use them instead of 1 GbE uplinks.

Recommendations for 1 GbE uplinks are as follows:

  • If you plan to use 1 GbE uplinks, do not include them in the same bond as the 10 GbE interfaces.

    Nutanix recommends that you do not use uplinks of different speeds in the same bond.

  • If you choose to configure only 1 GbE uplinks and migration of memory-intensive VMs becomes necessary, power the VMs off and then on again on a new host instead of using live migration. In this context, memory-intensive VMs are VMs whose memory changes at a rate that exceeds the bandwidth offered by the 1 GbE uplinks.

    Nutanix recommends the manual procedure for memory-intensive VMs because live migration, which you initiate either manually or by placing the host in maintenance mode, might appear prolonged or unresponsive and might eventually fail.

    Use the aCLI on any CVM in the cluster to start the VMs on another AHV host:

    nutanix@cvm$ acli vm.on vm_list host=host

    Replace vm_list with a comma-delimited list of VM names and replace host with the IP address or UUID of the target host.

  • If you must use only 1 GbE uplinks, add them into a bond to increase bandwidth and use the balance-TCP (LACP) or balance-SLB bond mode.

IPMI port on the hypervisor host

Do not use VLAN trunking on switch ports that connect to the IPMI interface. Configure the switch ports as access ports for management simplicity.
Upstream physical switch

Nutanix does not recommend the use of Fabric Extenders (FEX) or similar technologies for production use cases. While initial, low-load implementations might run smoothly with such technologies, poor performance, VM lockups, and other issues might occur as implementations scale upward (see Knowledge Base article KB1612). Nutanix recommends the use of 10Gbps, line-rate, non-blocking switches with larger buffers for production workloads.

Cut-through versus store-and-forward selection depends on network design. In designs with no oversubscription and no speed mismatches you can use low-latency cut-through switches. If you have any oversubscription or any speed mismatch in the network design, then use a switch with larger buffers. Port-to-port latency should be no higher than 2 microseconds.

Use fast-convergence technologies (such as Cisco PortFast) on switch ports that are connected to the hypervisor host.

Physical Network Layout

Use redundant top-of-rack switches in a traditional leaf-spine architecture. This simple, flat network design is well suited for a highly distributed, shared-nothing compute and storage architecture.

Add all the nodes that belong to a given cluster to the same Layer-2 network segment.

Other network layouts are supported as long as all other Nutanix recommendations are followed.

Jumbo Frames

The Nutanix CVM uses the standard Ethernet MTU (maximum transmission unit) of 1,500 bytes for all the network interfaces by default. The standard 1,500 byte MTU delivers excellent performance and stability. Nutanix does not support configuring the MTU on network interfaces of a CVM to higher values.

You can enable jumbo frames (MTU of 9,000 bytes) on the physical network interfaces of AHV hosts and guest VMs if the applications on your guest VMs require them. If you choose to use jumbo frames on hypervisor hosts, be sure to enable them end to end in the desired network and consider both the physical and virtual network infrastructure impacted by the change.

Controller VM

Do not remove the Controller VM from either the OVS bridge br0 or the native Linux bridge virbr0.

Rack Awareness and Block Awareness

Block awareness and rack awareness provide smart placement of Nutanix cluster services, metadata, and VM data to help maintain data availability, even when you lose an entire block or rack. The same network requirements for low latency and high throughput between servers in the same cluster still apply when using block and rack awareness.
Note: Do not use features like block or rack awareness to stretch a Nutanix cluster between different physical sites.
Oversubscription

Oversubscription occurs when an intermediate network device or link does not have enough capacity to allow line rate communication between the systems connected to it. For example, if a 10 Gbps link connects two switches and four hosts connect to each switch at 10 Gbps, the connecting link is oversubscribed. Oversubscription is often expressed as a ratio—in this case 4:1, as the environment could potentially attempt to transmit 40 Gbps between the switches with only 10 Gbps available. Achieving a ratio of 1:1 is not always feasible. However, you should keep the ratio as small as possible based on budget and available capacity. If there is any oversubscription, choose a switch with larger buffers.

In a typical deployment where Nutanix nodes connect to redundant top-of-rack switches, storage replication traffic between CVMs traverses multiple devices. To avoid packet loss due to link oversubscription, ensure that the switch uplinks consist of multiple interfaces operating at a faster speed than the Nutanix host interfaces. For example, for nodes connected at 10 Gbps, the inter-switch connection should consist of multiple 10 Gbps or 40 Gbps links.
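
The oversubscription ratio in the example above is the aggregate host bandwidth divided by the inter-switch capacity. The following Python sketch reproduces that arithmetic. The function and parameter names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def oversubscription_ratio(hosts_per_switch: int, host_gbps: int,
                           uplink_gbps: int, uplink_count: int = 1) -> Fraction:
    """Ratio of potential host traffic to available inter-switch capacity."""
    demand = hosts_per_switch * host_gbps   # traffic the hosts could attempt to send
    capacity = uplink_gbps * uplink_count   # bandwidth between the two switches
    return Fraction(demand, capacity)


# Four hosts at 10 Gbps behind a single 10 Gbps inter-switch link.
ratio = oversubscription_ratio(hosts_per_switch=4, host_gbps=10, uplink_gbps=10)
print(f"{ratio.numerator}:{ratio.denominator}")  # prints 4:1
```

With four 10 Gbps inter-switch links instead of one, the same calculation returns 1:1.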

The following diagrams show sample network configurations using Open vSwitch and Virtual Switch.

Figure. Virtual Switch

Figure. AHV Bridge Chain

Figure. Default factory configuration of Open vSwitch in AHV

Figure. Open vSwitch Configuration

IP Address Management

IP Address Management (IPAM) is a feature of AHV that allows it to assign IP addresses automatically to VMs by using DHCP. You can configure each virtual network with a specific IP address subnet, associated domain settings, and IP address pools available for assignment to VMs.

An AHV network is defined as a managed network or an unmanaged network based on the IPAM setting.

Managed Network

Managed network refers to an AHV network in which IPAM is enabled.

Unmanaged Network

Unmanaged network refers to an AHV network in which IPAM is not enabled or is disabled.

You enable IPAM, or leave it disabled, in the Create Network dialog box when you create a virtual network for guest VMs. See the Configuring a Virtual Network for Guest VM Interfaces topic in the Prism Web Console Guide.
Note: You can enable IPAM only when you are creating a virtual network. You cannot enable or disable IPAM for an existing virtual network.

Whether IPAM is enabled or disabled affects other procedures. For example, when you want to reconfigure the IP address of a Prism Central VM, the procedure may involve additional steps for managed networks (that is, networks with IPAM enabled) where the new IP address belongs to an IP address range different from the previous IP address range. See Reconfiguring the IP Address and Gateway of Prism Central VMs in the Prism Central Guide.

Traffic Marking for Quality of Service

To prioritize outgoing (or egress) traffic as required, you can configure quality of service on the traffic for a cluster.

There are two distinct types of outgoing or egress traffic:

  • Management traffic (mgmt)
  • Data services (data-svc)

Data services traffic consists of the following protocols:

Table 1. Data Services Protocols
Protocol | Port | Nutanix Services
NFS | Source ports (TCP): 445, 2049, 20048, 20049, 20050, and 7508. Source ports (UDP): 2049, 20048, 20049, 20050, and 7508. Source and destination ports (TCP for Replicator-dr): 7515. | Nutanix Files
SMB | Source ports (TCP): 445, 2049, 20048, 20049, 20050, and 7508. Source ports (UDP): 2049, 20048, 20049, 20050, and 7508. Source and destination ports (TCP for Replicator-dr): 7515. | Nutanix Files
Cluster-to-cluster replications (external or inter-site) | Destination ports: 2009 and 2020 on CVM. | Stargate and Cerebro
Node-to-node replications (internal or intra-site) | Destination ports: 2009 and 2020 on CVM. | Stargate and Cerebro
iSCSI | Source ports: 3260, 3261, 3205 on CVM. Destination ports: 3260, 3261, 3205 on AHV. | Nutanix Files and Volumes

Traffic other than data services traffic is management traffic. Traffic marking for QoS is disabled by default.

When you enable QoS, you can mark both the types of traffic with QoS values. AOS considers the values in hexadecimal even if you provide the values in decimal. When you view or get the QoS configuration enabled on the cluster, nCLI provides the QoS values in hexadecimal format (0xXX where XX is hexadecimal value in the range 00–3f).
Note: Set any QoS value in the range 0x0–0x3f. The default QoS values for the traffic are as follows:
  • Management traffic (mgmt) = 0x10
  • Data services (data-svc) = 0xa
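
Because AOS treats a supplied value as hexadecimal even when it looks decimal, a value entered as 10 is 0x10 (decimal 16), not decimal 10. The following Python sketch models that interpretation and the documented range check. It is an illustration of the behavior described here, not the AOS implementation, and the function names are hypothetical.

```python
def parse_qos_value(text: str) -> int:
    """Interpret text as hexadecimal, with or without a 0x prefix."""
    value = int(text, 16)
    # Valid QoS values are in the range 0x0 to 0x3f (decimal 0 to 63).
    if not 0x0 <= value <= 0x3F:
        raise ValueError(f"QoS value {text!r} is outside the range 0x0-0x3f")
    return value


def format_qos_value(value: int) -> str:
    """Format a value in the 0xXX style used in the examples in this section."""
    return f"0x{value:x}"


print(parse_qos_value("10"))    # prints 16, because "10" is read as 0x10
print(parse_qos_value("0xa"))   # prints 10
print(format_qos_value(16))     # prints 0x10
```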

Configuring Traffic Marking for QoS

Configure Quality of Service (QoS) for management and data services traffic using nCLI.

About this task

To perform the following operations for QoS on the egress traffic of a cluster, use the nCLI commands in this section:

  • Enable traffic marking for QoS on the cluster. QoS traffic marking is disabled by default.
  • View or get the QoS configuration enabled on the cluster.
  • Set QoS values for all traffic types or specific traffic types.
  • Disable QoS on the cluster.

When you run any of the QoS configuration commands and the command succeeds, the console displays the following output indicating the successful command run:

QoSUpdateStatusDTO(status=true, message=null)

Where:

  • status=true indicates that the command succeeded.
  • message=null indicates that there is no error.

When you run any of the QoS configuration commands and the command fails, the console displays the following sample output indicating the failure:

QoSUpdateStatusDTO(status=false, message=QoS is already enabled.)

Where:

  • status=false indicates that the command failed.
  • message=QoS is already enabled. indicates why the command failed. This sample error message indicates that the net enable-qos command failed because it was run when QoS was already enabled.

Procedure

  • To enable QoS on a cluster, run the following command:
    ncli> net enable-qos [data-svc="data-svc value"][mgmt="mgmt value"]

    If you run the command as net enable-qos without the options, AOS enables QoS with the default values (mgmt=0x10 and data-svc=0xa).

    Note: After you run the net enable-qos command, if you run it again, the command fails and AOS displays the following output:
    QoSUpdateStatusDTO(status=false, message=QoS is already enabled.)
    Note: If you need to change the QoS values after you enable QoS, run the net edit-qos command with the required options (data-svc, mgmt, or both).
    Note: Set any QoS value in the range 0x0–0x3f.
  • To view or get the QoS configuration enabled on a cluster, run the following command:
    ncli> net get-qos
    Note: When you get the QoS configuration enabled on the cluster, nCLI provides the QoS values in hexadecimal format (0xXX where XX is hexadecimal value in the range 00–3f).

    A sample output on the console is as follows:

    QoSDTO(status=true, isEnabled=true, mgmt=0x10, dataSvc=0xa, message=null)

    Where:

    • status=true indicates that the net get-qos command passed. status=false indicates that the net get-qos command failed. See the message= value for the failure error message.
    • isEnabled=true indicates that QoS is enabled. isEnabled=false indicates that QoS is not enabled.
    • mgmt=0x10 indicates that the QoS value for management traffic (mgmt option) is set to 0x10 (hexadecimal). If you disabled QoS, this parameter is displayed as mgmt=null.
    • dataSvc=0xa indicates that the QoS value for data services traffic (data-svc option) is set to 0xa (hexadecimal). If you disabled QoS, this parameter is displayed as dataSvc=null.
    • message=null indicates there is no error message. message= parameter provides the command failure error message if the command fails.
  • To set the QoS values for the traffic types on a cluster after you enabled QoS on the cluster, run the following command:
    ncli> net edit-qos [data-svc="data-svc value"][mgmt="mgmt value"]

    You can provide QoS values in the range 0x0–0x3f for one or both of the options. Each value is the hexadecimal representation of a decimal value from 0 to 63, inclusive.

  • To disable QoS on a cluster, run the following command:
    ncli> net disable-qos
    QoSDTO(status=true, isEnabled=false, mgmt=null, dataSvc=null, message=null)
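
If you capture the console output of these commands in a script, the status lines shown in this section can be turned into a dictionary for checking. The following Python sketch is one way to do that. It assumes the output has exactly the Name(key=value, ...) shape of the samples above, and `parse_dto` is a hypothetical helper, not part of nCLI.

```python
import re


def parse_dto(line: str) -> dict:
    """Parse a line such as 'QoSDTO(status=true, ..., message=null)'."""
    match = re.fullmatch(r"\s*\w+\((.*)\)\s*", line)
    if match is None:
        raise ValueError(f"unexpected output: {line!r}")
    fields = {}
    # A value runs until the next ', key=' or the end of the body, so a
    # message that contains spaces or a trailing period stays intact.
    for key, raw in re.findall(r"(\w+)=(.*?)(?=, \w+=|$)", match.group(1)):
        if raw == "true":
            fields[key] = True
        elif raw == "false":
            fields[key] = False
        elif raw == "null":
            fields[key] = None
        else:
            fields[key] = raw
    return fields


result = parse_dto("QoSUpdateStatusDTO(status=false, message=QoS is already enabled.)")
if not result["status"]:
    print("Command failed:", result["message"])
```

The sketch leaves values such as 0x10 as strings, and it would misread a message that itself contains a comma followed by a key=value pair.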

Layer 2 Network Management

AHV uses a virtual switch (VS) to connect the Controller VM, the hypervisor, and the guest VMs to each other and to the physical network. A virtual switch is configured by default on each AHV node, and the VS services start automatically when you start a node.

To configure virtual networking in an AHV cluster, you need to be familiar with virtual switch. This documentation gives you a brief overview of virtual switch and the networking components that you need to configure to enable the hypervisor, Controller VM, and guest VMs to connect to each other and to the physical network.

About Virtual Switch

Virtual switches or VS are used to manage multiple bridges and uplinks.

The VS configuration is designed to provide flexibility in configuring virtual bridge connections. A virtual switch (VS) defines a collection of AHV nodes and the uplink ports on each node. It is an aggregation of the same OVS bridge on all the compute nodes in a cluster. For example, vs0, the default virtual switch, is an aggregation of the br0 bridge and br0-up uplinks of all the nodes.

After you configure a VS, you can use the VS as a reference for physical network management instead of using the bridge names as a reference.

For an overview of virtual switches, see Virtual Switch Considerations.

For information about OVS, see About Open vSwitch.

Virtual Switch Workflow

A virtual switch (VS) defines a collection of AHV compute nodes and the uplink ports on each node. It is an aggregation of the same OVS bridge on all the compute nodes in a cluster. For example, vs0, the default virtual switch, is an aggregation of the br0 bridge of all the nodes.

The system creates the default virtual switch vs0 connecting the default bridge br0 on all the hosts in the cluster during installation of or upgrade to the compatible versions of AOS and AHV. Default virtual switch vs0 has the following characteristics:

  • The default virtual switch cannot be deleted.

  • The default bridges br0 on all the nodes in the cluster map to vs0. Thus, vs0 is not empty. It has at least one uplink configured.

  • The default management connectivity to a node is mapped to default bridge br0 that is mapped to vs0.

  • The default parameter values of vs0 (Name, Description, MTU, and Bond Type) can be modified, subject to the preceding characteristics.

  • The default virtual switch is configured with the Active-Backup uplink bond type.

    For more information about bond types, see the Bond Type table.

The virtual switch aggregates the same bridges on all nodes in the cluster. The bridge (for example, br1) connects to the physical port such as eth3 (Ethernet port) via the corresponding uplink (for example, br1-up). The uplink ports of the bridges are connected to the same physical network. For example, the following illustration shows that vs0 is mapped to the br0 bridge, in turn connected via uplink br0-up to various (physical) Ethernet ports on different nodes.

Figure. Virtual Switch

Uplink configuration uses bonds to improve traffic management. The bond types are defined for the aggregated OVS bridges. A new bond type - No uplink bond - provides a no-bonding option. A virtual switch configured with the No uplink bond uplink bond type has 0 or 1 uplinks.

When you configure a virtual switch with any other bond type, you must select at least two uplink ports on every node.
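
The uplink-count rule above (0 or 1 uplinks per node for No uplink bond, at least two per node for every other bond type) can be sketched as a small check. The function name, the bond-type string, and the per-node mapping of uplink port names are hypothetical and used only for illustration.

```python
from typing import Dict, List

NO_UPLINK_BOND = "No Uplink Bond"


def nodes_with_invalid_uplinks(bond_type: str,
                               uplinks: Dict[str, List[str]]) -> List[str]:
    """Return the nodes whose uplink count does not fit the bond type."""
    bad = []
    for node, ports in uplinks.items():
        if bond_type == NO_UPLINK_BOND:
            ok = len(ports) <= 1   # 0 or 1 uplinks
        else:
            ok = len(ports) >= 2   # every other bond type needs two or more
        if not ok:
            bad.append(node)
    return bad


selection = {"node-1": ["eth2", "eth3"], "node-2": ["eth2"]}
print(nodes_with_invalid_uplinks("Active-Backup", selection))  # prints ['node-2']
```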

If you change the uplink configuration of vs0, AOS applies the updated settings to all the nodes in the cluster one after the other (the rolling update process). To update the settings in a cluster, AOS performs the following tasks when the Standard configuration method is applied:

  1. Puts the node in maintenance mode (migrates VMs out of the node)
  2. Applies the updated settings
  3. Checks connectivity with the default gateway
  4. Exits maintenance mode
  5. Proceeds to apply the updated settings to the next node

AOS does not put the nodes in maintenance mode when the Quick configuration method is applied.
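
The rolling update described above can be pictured as a loop over the nodes. The following Python sketch is a simplified model, not the AOS implementation. The callback names are hypothetical, and it assumes that the Quick method still applies the settings and checks connectivity on each node, and that the update stops at the first node that fails the connectivity check. This section does not state either behavior.

```python
from typing import Callable, Iterable, List, Tuple


def rolling_update(nodes: Iterable[str],
                   apply_settings: Callable[[str], None],
                   gateway_ok: Callable[[str], bool],
                   method: str = "Standard") -> List[Tuple[str, str]]:
    """Apply settings one node at a time and return the ordered list of events."""
    events = []
    for node in nodes:
        if method == "Standard":
            # Step 1: maintenance mode migrates the VMs off the node.
            events.append((node, "enter-maintenance"))
        # Step 2: apply the updated settings.
        apply_settings(node)
        events.append((node, "apply"))
        # Step 3: check connectivity with the default gateway.
        if not gateway_ok(node):
            events.append((node, "connectivity-failed"))
            break  # assumption: stop rather than continue to the next node
        if method == "Standard":
            # Step 4: exit maintenance mode before moving to the next node.
            events.append((node, "exit-maintenance"))
    return events


for node, event in rolling_update(["node-1", "node-2"],
                                  apply_settings=lambda n: None,
                                  gateway_ok=lambda n: True):
    print(node, event)
```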

Note:

Do not create, update or delete any virtual switch when the AOS or AHV upgrade process is running.

Table 1. Bond Types
Bond Type | Use Case | Maximum VM NIC Throughput | Maximum Host Throughput
Active-Backup | Recommended. Default configuration, which transmits all traffic over a single active adapter. | 10 Gb | 10 Gb
Active-Active with MAC pinning (also known as balance-slb) | Works with caveats for multicast traffic. Increases host bandwidth utilization beyond a single 10 Gb adapter. Places each VM NIC on a single adapter at a time. Do not use this bond type with link aggregation protocols such as LACP. | 10 Gb | 20 Gb
Active-Active (also known as LACP with balance-tcp) | LACP and link aggregation required. Increases host and VM bandwidth utilization beyond a single 10 Gb adapter by balancing VM NIC TCP and UDP sessions among adapters. Also used when network switches require LACP negotiation. The default LACP settings are: Speed: Fast (1s); Mode: Active fallback-active-backup; Priority: Default (not configurable). | 20 Gb | 20 Gb
No Uplink Bond | No uplink or a single uplink on each host. A virtual switch configured with the No uplink bond uplink bond type has 0 or 1 uplinks. When you configure a virtual switch with any other bond type, you must select at least two uplink ports on every node. | - | -

Note the following points about the uplink configuration.

  • Virtual switches are not enabled in a cluster that has one or more compute-only nodes. See Virtual Switch Limitations and Virtual Switch Requirements.
  • If you select the Active-Active policy, you must manually enable LAG and LACP on the corresponding ToR switch for each node in the cluster.
  • If you reimage a cluster with the Active-Active policy enabled, the default virtual switch (vs0) on the reimaged cluster once again uses the Active-Backup policy. The other virtual switches are removed during reimage.
  • Nutanix recommends configuring LACP with fallback to active-backup or individual mode on the ToR switches. The configuration and behavior vary based on the switch vendor. Use a switch configuration that allows both switch interfaces to pass traffic after LACP negotiation fails.

Virtual Switch Considerations

Virtual Switch Deployment

A VS configuration is deployed using a rolling update of the cluster. After the VS configuration (creation or update) is received and execution starts, every node is first put into maintenance mode before the VS configuration is made or modified on the node. This is the Standard method, which is the recommended default method of configuring a VS.

You can also select the Quick method of configuration, in which the rolling update does not put the nodes in maintenance mode. The VS configuration task is marked as successful when the configuration is successful on the first node. Any configuration failure on successive nodes triggers corresponding NCC alerts. There is no change to the task status.

Note: If you are modifying an existing bond, AHV removes the bond and then re-creates the bond with the specified interfaces.

Ensure that the interfaces you want to include in the bond are physically connected to the Nutanix appliance before you run the command described in this topic. If the interfaces are not physically connected to the Nutanix appliance, the interfaces are not added to the bond.

Ensure that the pre-checks listed in the LCM Prechecks section of the Life Cycle Manager Guide and the Always and Host Disruptive Upgrades types of pre-checks listed in KB-4584 pass for Virtual Switch deployments.

The VS configuration is stored and re-enforced at system reboot.

The VM NIC configuration also displays the VS details. When you Update VM configuration or Create NIC for a VM, the NIC details show the virtual switches that can be associated. This view allows you to change a virtual network and the associated virtual switch.

To change the virtual network, select the virtual network in the Subnet Name dropdown list in the Create NIC or Update NIC dialog box.

Figure. Create VM - VS Details

Figure. VM NIC - VS Details

Impact of Installation of or Upgrade to Compatible AOS and AHV Versions

See Virtual Switch Requirements for information about minimum and compatible AOS and AHV versions.

When you upgrade the AOS to a compatible version from an older version, the upgrade process:

  • Triggers the creation of the default virtual switch vs0, which is mapped to bridge br0 on all the nodes.

  • Validates bridge br0 and its uplinks for consistency in terms of MTU and bond-type on every node.

    If valid, it adds the bridge br0 of each node to the virtual switch vs0.

    If br0 configuration is not consistent, the system generates an NCC alert which provides the failure reason and necessary details about it.

    The system migrates only the bridge br0 on each node to the default virtual switch vs0 because the connectivity of bridge br0 is guaranteed.

  • Does not migrate any other bridges to any other virtual switches during upgrade. You need to manually migrate the other bridges after install or upgrade is complete.

Note:

Do not create, update or delete any virtual switch when the AOS or AHV upgrade process is running.

Bridge Migration

After upgrading to a compatible version of AOS, you can migrate bridges other than br0 that existed on the nodes. When you migrate the bridges, the system converts the bridges to virtual switches.

See Virtual Switch Migration Requirements in Virtual Switch Requirements.

Note: You can migrate only those bridges that are present on every compute node in the cluster. See the Migrating Bridges after Upgrade topic in the Prism Web Console Guide.

Cluster Scaling Impact

VS management for cluster scaling (addition or removal of nodes) is seamless.

Node Removal

When you remove a node, the system detects the removal, automatically removes the node from all the VS configurations that include the node, and generates an internal system update. For example, a node has two virtual switches, vs1 and vs2, configured apart from the default vs0. When you remove the node from the cluster, the system automatically removes the node from the vs1 and vs2 configurations and generates an internal system update.

Node Addition

When you add a new node or host to a cluster, the bridges or virtual switches on the new node are treated in the following manner:

Note: If a host already included in a cluster is removed and then added back, it is treated as a new host.
  • The system validates the default bridge br0 and uplink bond br0-up to check if it conforms to the default virtual switch vs0 already present on the cluster.

    If br0 and br0-up conform, the system includes the new host and its uplinks in vs0.

    If br0 and br0-up do not conform, then the system generates an NCC alert.

  • The system does not automatically add any other bridge configured on the new host to any other virtual switch in the cluster.

    It generates NCC alerts for all the other non-default virtual switches.

  • You can manually include the host in the required non-default virtual switches. Update a non-default virtual switch to include the host.

    For information about updating a virtual switch in Prism Element Web Console, see the Configuring a Virtual Network for Guest VM Interfaces section in Prism Web Console Guide.

    For information about updating a virtual switch in Prism Central, see the Network Connections section in the Prism Central Guide.

VS Management

You can manage virtual switches from Prism Central or Prism Web Console. You can also use aCLI or REST APIs to manage them. See the Acropolis API Reference and Command Reference guides for more information.

You can also use the appropriate aCLI commands for virtual switches from the following list:

  • net.create_virtual_switch

  • net.list_virtual_switch

  • net.get_virtual_switch

  • net.update_virtual_switch

  • net.delete_virtual_switch

  • net.migrate_br_to_virtual_switch

  • net.disable_virtual_switch

About Open vSwitch

Open vSwitch (OVS) is an open-source software switch implemented in the Linux kernel and designed to work in a multiserver virtualization environment. By default, OVS behaves like a Layer 2 learning switch that maintains a MAC address learning table. The hypervisor host and VMs connect to virtual ports on the switch.

Each hypervisor hosts an OVS instance, and all OVS instances combine to form a single switch. As an example, the following diagram shows OVS instances running on two hypervisor hosts.

Figure. Open vSwitch

Default Factory Configuration

The factory configuration of an AHV host includes a default OVS bridge named br0 (configured with the default virtual switch vs0) and a native Linux bridge called virbr0.

Bridge br0 includes the following ports by default:

  • An internal port with the same name as the default bridge; that is, an internal port named br0. This is the access port for the hypervisor host.
  • A bonded port named br0-up. The bonded port aggregates all the physical interfaces available on the node. For example, if the node has two 10 GbE interfaces and two 1 GbE interfaces, all four interfaces are aggregated on br0-up. This configuration is necessary for Foundation to successfully image the node regardless of which interfaces are connected to the network.
    Note:

    Before you begin configuring a virtual network on a node, you must disassociate the 1 GbE interfaces from the br0-up port. This disassociation occurs when you modify the default virtual switch (vs0) and create new virtual switches. Nutanix recommends that you aggregate only the 10 GbE or faster interfaces on br0-up and use the 1 GbE interfaces on a separate OVS bridge deployed in a separate virtual switch.

    See Virtual Switch Management for information about virtual switch management.

The following diagram illustrates the default factory configuration of OVS on an AHV node:

Figure. Default factory configuration of Open vSwitch in AHV

The Controller VM has two network interfaces by default. As shown in the diagram, one network interface connects to bridge br0. The other network interface connects to a port on virbr0. The Controller VM uses this bridge to communicate with the hypervisor host.

Virtual Switch Requirements

The requirements to deploy virtual switches are as follows:

  1. Virtual switches are supported on AOS 5.19 or later with AHV 20201105.12 or later. Therefore you must install or upgrade to AOS 5.19 or later, with AHV 20201105.12 or later, to use virtual switches in your deployments.

  2. Virtual bridges used for a VS on all the nodes must have the same specification such as name, MTU and uplink bond type. For example, if vs1 is mapped to br1 (virtual or OVS bridge 1) on a node, it must be mapped to br1 on all the other nodes of the same cluster.
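
The second requirement can be checked offline before you map a bridge to a virtual switch. The following is a minimal Python sketch, not a Nutanix tool: the BridgeSpec record, the find_mismatched_hosts helper, and the sample host names are hypothetical, and the per-host values are assumed to have been collected separately.

```python
from collections import Counter
from typing import Dict, List, NamedTuple


class BridgeSpec(NamedTuple):
    """Hypothetical record of the bridge settings that must match on every node."""
    name: str
    mtu: int
    bond_type: str


def find_mismatched_hosts(specs: Dict[str, BridgeSpec]) -> List[str]:
    """Return the hosts whose bridge spec differs from the most common spec."""
    if not specs:
        return []
    # Treat the most common specification as the reference for the cluster.
    reference, _ = Counter(specs.values()).most_common(1)[0]
    return sorted(host for host, spec in specs.items() if spec != reference)


hosts = {
    "host-a": BridgeSpec("br1", 1500, "active-backup"),
    "host-b": BridgeSpec("br1", 1500, "active-backup"),
    "host-c": BridgeSpec("br1", 9000, "active-backup"),  # MTU differs
}
print(find_mismatched_hosts(hosts))
```

Using the most common specification as the reference is a design choice of this sketch; the guide only states that the specifications must be identical on all nodes.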

Virtual Switch Migration Requirements

The AOS upgrade process initiates the virtual switch migration. The virtual switch migration is successful only when the following requirements are fulfilled:

  • Before migrating to Virtual Switch, all bridge br0 bond interfaces should have the same bond type on all hosts in the cluster. For example, all hosts should use the Active-backup bond type or balance-tcp. If some hosts use Active-backup and other hosts use balance-tcp, virtual switch migration fails.
  • Before migrating to Virtual Switch, if using LACP:
    • Run manage_ovs show_uplinks |grep lacp-fallback: to confirm that the lacp-fallback parameter of bridge br0 is set to the case-sensitive value True on all hosts. Any host with lowercase true causes virtual switch migration failure.
    • Confirm that the LACP speed on the physical switch is set to fast or 1 second. Also ensure that the switch ports are ready to fall back to individual mode if LACP negotiation fails, by using a configuration such as no lacp suspend-individual.
  • Before migrating to the Virtual Switch, confirm that the upstream physical switch is set to spanning-tree portfast or spanning-tree port type edge trunk. Failure to do so may lead to a 30-second network timeout, and the virtual switch migration may fail because it uses a 20-second non-modifiable timer.
  • Ensure that the pre-checks listed in the LCM Prechecks section of the Life Cycle Manager Guide and the Always and Host Disruptive Upgrades types of pre-checks listed in KB-4584 pass for Virtual Switch deployments.

  • For the default virtual switch vs0:
    • All configured uplink ports must be available for connecting the network. In Active-Backup bond type, the active port is selected from any configured uplink port that is linked. Therefore, the virtual switch vs0 can use all the linked ports for communication with other CVMs/hosts.
    • All the host IP addresses in the virtual switch vs0 must be resolvable to the configured gateway using ARP.
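
The case-sensitive lacp-fallback check in the list above can be scripted once the manage_ovs show_uplinks output has been collected from each host. This is a minimal Python sketch under stated assumptions: the output is assumed to contain one "key: value" pair per line, and the bad_lacp_fallback_hosts helper and the sample text are hypothetical.

```python
from typing import Dict, List


def bad_lacp_fallback_hosts(outputs: Dict[str, str]) -> List[str]:
    """Return hosts where any lacp-fallback value is not exactly 'True'."""
    bad = []
    for host, text in outputs.items():
        for line in text.splitlines():
            key, sep, value = line.strip().partition(":")
            # The comparison is deliberately case-sensitive.
            if sep and key == "lacp-fallback" and value.strip() != "True":
                bad.append(host)
                break
    return sorted(bad)


# Assumed output layout; the real command output may differ.
outputs = {
    "host-a": "Bridge: br0\n  Bond: br0-up\n    lacp-fallback: True\n",
    "host-b": "Bridge: br0\n  Bond: br0-up\n    lacp-fallback: true\n",
}
print(bad_lacp_fallback_hosts(outputs))
```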

Virtual Switch Limitations

Virtual Switch Operations During Upgrade

Do not create, update or delete any virtual switch when the AOS or AHV upgrade process is running.

MTU Restriction

The Nutanix Controller VM uses the standard Ethernet MTU (maximum transmission unit) of 1,500 bytes for all the network interfaces by default. The standard 1,500-byte MTU delivers excellent performance and stability. Nutanix does not support configuring higher values of MTU on the network interfaces of a Controller VM.

You can enable jumbo frames (MTU of 9,000 bytes) on the physical network interfaces of AHV, ESXi, or Hyper-V hosts and guest VMs if the applications on your guest VMs require such higher MTU values. If you choose to use jumbo frames on the hypervisor hosts, enable the jumbo frames end to end in the specified network, considering both the physical and virtual network infrastructure impacted by the change.
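
The end-to-end requirement means that a single hop with a smaller MTU blocks jumbo frames on the whole path. The following Python sketch illustrates the idea only; the hops_blocking_jumbo helper, the hop names, and the MTU values are hypothetical and must be replaced with values gathered from your own environment.

```python
from typing import Dict, List

JUMBO_MTU = 9000  # jumbo-frame MTU named in the text


def hops_blocking_jumbo(path_mtus: Dict[str, int], required: int = JUMBO_MTU) -> List[str]:
    """Return the hops whose MTU is below the required value."""
    return [hop for hop, mtu in path_mtus.items() if mtu < required]


# Hypothetical path between two guest VMs on different hosts.
path = {
    "guest VM NIC": 9000,
    "AHV host uplink": 9000,
    "ToR switch port": 1500,
}
print(hops_blocking_jumbo(path))
```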

Single node and Two-node cluster configuration.

A virtual switch cannot be deployed if your single-node or two-node cluster has any instantiated user VMs. The virtual switch creation or update process involves a rolling restart, which checks for maintenance mode and whether you can migrate the VMs. On a single-node or two-node cluster, instantiated user VMs cannot be migrated, so the virtual switch operation fails.

Therefore, power down all user VMs for virtual switch operations in a single-node or two-node cluster.

Compute-only node is not supported.

Virtual switch is not compatible with Compute-only (CO) nodes. If a CO node is present in the cluster, then the virtual switches are not deployed (including the default virtual switch). You need to use the net.disable_virtual_switch aCLI command to disable the virtual switch workflow if you want to expand a cluster which has virtual switches and includes a CO node.

The net.disable_virtual_switch aCLI command cleans up all the virtual switch entries from the IDF. All the bridges mapped to the virtual switch or switches are retained as they are.

See Compute-Only Node Configuration (AHV Only).

Including a storage-only node in a VS is not necessary.

Virtual switch is compatible with Storage-only (SO) nodes but you do not need to include an SO node in any virtual switch, including the default virtual switch.

Mixed-mode Clusters with AHV Storage-only Nodes
Consider that you have deployed a mixed-node cluster where the compute-only nodes are ESXi or Hyper-V nodes and the storage-only nodes are AHV nodes. In such a case, the default virtual switch deployment fails.

Without the default VS, the Prism Element, Prism Central and CLI workflows for virtual switch required to manage the bridges and bonds are not available. You need to use the manage_ovs command options to update the bridge and bond configurations on the AHV hosts.

Virtual Switch Management

You can view, create, update, or delete virtual switches from both Prism Web Console and Prism Central.

Virtual Switch Views and Visualization

For information on the virtual switch network visualization in Prism Element Web Console, see the Network Visualization topic in the Prism Web Console Guide.

Virtual Switch Create, Update and Delete Operations

For information about the procedures to create, update and delete a virtual switch in Prism Element Web Console, see the Configuring a Virtual Network for Guest VM Interfaces section in the Prism Web Console Guide.

For information about the procedures to create, update and delete a virtual switch in Prism Central, see the Network Connections section in the Prism Central Guide.

Note:

Do not create, update or delete any virtual switch when the AOS or AHV upgrade process is running.

Uplinks for Virtual Private Cloud Traffic

Starting with a minimum AOS version 6.1.1 with Prism Central version pc.2022.4 and Flow networking controller version 2.1.1, you can use virtual switches to separate traffic of the guest VMs that are networked using Flow Networking Virtual Private Cloud (VPC) configurations.

AHV uses the default virtual switch for the management and other Controller VM traffic (unless you have configured network segmentation to route the Controller VM traffic on another virtual switch). When you enable Flow networking in a cluster, Prism Central with the Flow networking controller and network gateway allows you to deploy Virtual Private Clouds (VPCs) that network guest VMs on hosts within the cluster and on other clusters. By default, AHV uses the default virtual switch vs0 for the VPC (Flow networking) traffic as well.

You can configure AHV to route the VPC (Flow networking) traffic on a different virtual switch, other than the default virtual switch.

Conditions for VPC Uplinks

Certain conditions apply to the use of virtual switches to separate the Controller VM traffic and traffic of the guest VMs that are networked using Virtual Private Cloud (VPC) configurations.

Host IP Addresses in Virtual Switch

The virtual switch selected for Flow networking VPC traffic must have IP addresses configured on the hosts. If the selected virtual switch does not have IP addresses configured on the hosts, then the following error is displayed:

Bridge interface IP address is not configured for host: <host-UUID> on virtual_switch <name-of-selected-virtual-switch>

Configure IP addresses from this subnet on the hosts in the virtual switch.

Note:

Do not create, update or delete any virtual switch when the AOS or AHV upgrade process is running.

Requirements

Ensure that the default virtual switch vs0 is enabled.

The following conditions apply to the IP addresses that you configure:

  • Ensure that the host IP addresses in the subnet do not overlap with the primary IP addresses of the host configured during installation or the IP addresses used in any other configured virtual switches.
  • Ensure that the host IP addresses in the subnet do not overlap with the IP addresses configured for the backplane operations (using network segmentation).
  • Ensure that none of the host IP addresses configured on the hosts in the virtual switch is the network IP address of the subnet. For example, in a subnet 10.10.10.0/24, the network IP address of the subnet is 10.10.10.0. Ensure that this IP address (10.10.10.0) is not configured as a host IP address in the virtual switch. The failure message is as follows:
    Host IP address cannot be assigned equal to the subnet.
  • Ensure that none of the host IP addresses configured on the hosts in the virtual switch is the broadcast IP address of the subnet. For example, in a subnet 10.10.10.0/24, the broadcast IP address of the subnet is 10.10.10.255. Ensure that this IP address (10.10.10.255) is not configured as a host IP address in the virtual switch. The failure message is as follows:
    Host IP address cannot be assigned equal to the subnet broadcast address.
  • Ensure that the subnet configured in the virtual switch has a prefix of /30 or less. For example, you can configure a subnet with a prefix of /30 such as 10.10.10.0/30, but not a subnet with a prefix of /31 or /32 such as 10.10.10.0/31 or 10.10.10.0/32. Any subnet that you configure in a virtual switch must have at least two usable IP addresses. The failure message is as follows:
    Prefix length cannot be greater than 30.
  • Ensure that the host IP addresses configured in a virtual switch belong to the same subnet. In other words, you cannot configure host IP addresses from two or more different subnets. For example, one host IP address is 10.10.10.10 from the subnet 10.10.10.0/24 and another host IP address is 10.100.10.10 from the subnet 10.100.10.0/24. This configuration fails. Both hosts must have IP addresses from the 10.10.10.0/24 subnet (or both IP addresses must be from the 10.100.10.0/24 subnet). The failure message is as follows:
    Different host IP address subnets found.
  • Ensure that the gateway IP address for the host IP addresses configured in a virtual switch belongs to the same subnet as the host IP addresses. In other words, you cannot configure host IP addresses from one subnet while the gateway IP address of any of those host IP addresses is in a different subnet. The failure message is as follows:
    Gateway IP address is not in the same subnet.
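
Several of the conditions above can be pre-checked with the Python standard library before you run any aCLI command. The following is a minimal sketch, not the validation that AOS performs: the validate_vs_ip_config helper is hypothetical, it assumes IPv4 addresses written as address/prefix, it reuses the failure messages quoted above only as labels, and it does not cover the overlap checks against primary host IP addresses, other virtual switches, or backplane addresses.

```python
import ipaddress
from typing import Dict, List


def validate_vs_ip_config(host_ips: Dict[str, str], gateway: str) -> List[str]:
    """Return failure messages for hosts given as {host: 'address/prefix'}."""
    errors: List[str] = []
    interfaces = [ipaddress.ip_interface(ip) for ip in host_ips.values()]
    networks = {iface.network for iface in interfaces}

    # All host IP addresses must come from one subnet.
    if len(networks) > 1:
        errors.append("Different host IP address subnets found.")

    for iface in interfaces:
        net = iface.network
        if net.prefixlen > 30:
            errors.append("Prefix length cannot be greater than 30.")
        elif iface.ip == net.network_address:
            errors.append("Host IP address cannot be assigned equal to the subnet.")
        elif iface.ip == net.broadcast_address:
            errors.append(
                "Host IP address cannot be assigned equal to the subnet broadcast address."
            )

    # The gateway must fall inside every host subnet.
    gateway_ip = ipaddress.ip_interface(gateway).ip
    if any(gateway_ip not in net for net in networks):
        errors.append("Gateway IP address is not in the same subnet.")

    # Drop repeated messages while keeping their first-seen order.
    return list(dict.fromkeys(errors))


print(validate_vs_ip_config(
    {"host-1": "10.10.10.10/24", "host-2": "10.100.10.10/24"},
    "10.10.10.1/24",
))
```

An empty list means that none of the checked conditions failed; it does not guarantee that the cluster accepts the configuration.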
Configuring Virtual Switch for VPC Traffic

Configure a new or existing non-default virtual switch for Flow networking VPC traffic.

Before you begin

You need a virtual switch, other than the default virtual switch vs0, that can be used to route the VPC traffic. Create a separate virtual switch that you can use to route the Flow networking VPC traffic.

For information about the procedures to create or update a virtual switch in Prism Element Web Console, see the Configuring a Virtual Network for Guest VMs section in the Prism Web Console Guide.

For information about the procedures to create or update a virtual switch in Prism Central, see Network Connections in the Prism Central Guide.

Note:

Do not create, update or delete any virtual switch when the AOS or AHV upgrade process is running.

About this task

Follow these steps to configure the uplinks for guest VMs networked by Flow networking VPCs:

Procedure

  1. Create the virtual switch you want to use for VPC traffic. For example, create vs1 as a virtual switch for the VPC traffic.
    See Creating or Updating a Virtual Switch in the Prism Web Console Guide.
  2. Configure IP addresses for the hosts that you have included in the virtual switch and a gateway IP address for the network.
    Note: You can configure the IP addresses for the hosts when you are creating the virtual switch. Ensure that you add other necessary options, such as host_uplink_config and bond_uplink, in the net.create_virtual_switch or net.update_virtual_switch commands when you create or update a virtual switch, respectively, with the host IP addresses and gateway IP address.

    See the Command Reference for more information.

    The options are:

    • host_ip_addr_config=: Provide the host UUID and associated IP address with prefix as follows:
      host_ip_addr_config={host-uuid1:host_ip_address/prefix}
      If there is more than one host on the virtual switch, use a semicolon-separated list as follows:
      host_ip_addr_config={host-uuid1:host_ip_address/prefix;host-uuid2:host_ip_address/prefix;host-uuid3:host_ip_address/prefix}
    • gateway_ip_address=: Provide the gateway IP address as follows:
      gateway_ip_address=IP_address/prefix

    For example, to update the host IP addresses and gateway IP address for virtual switch vs1, the sample command would be as follows:

    nutanix@cvm$ acli net.update_virtual_switch vs1 host_ip_addr_config={ebeae8d8-47cb-40d0-87f9-d03a762ffad7:10.XX.XX.15/24} gateway_ip_address=10.XX.XX.1/24
  3. Set the virtual switch for use with Flow networking VPCs.
    Use the following command:
    nutanix@cvm$ acli net.set_vpc_east_west_traffic_config virtual_switch=virtual-switch-name
    Note: When you run this command, if the virtual switch does not have IP addresses configured for the hosts, the command fails with an error message. See Conditions for VPC Uplinks for more information.

    For example, to set vs1 as the virtual switch for Flow networking VPC traffic, the sample command is as follows:

    nutanix@cvm$ acli net.set_vpc_east_west_traffic_config virtual_switch=vs1
    Note: You can configure the virtual switch to route all traffic. Set the value for the permit_all_traffic= option in the net.set_vpc_east_west_traffic_config command to true to route all the traffic using the virtual switch. The default value for this option is false which allows the virtual switch to route only VPC traffic.

    Do not configure the permit_all_traffic= option if you want to use the virtual switch only for VPC traffic. Configure the permit_all_traffic= option with the value true only when you want the virtual switch to allow all traffic.

  4. You can update the virtual switch that is set for Flow networking VPC traffic using the following command:
    nutanix@cvm$ acli net.update_vpc_east_west_traffic_config virtual_switch=vs1

    To update the virtual switch to allow all traffic, use the permit_all_traffic= option with the value true as follows:

    nutanix@cvm$ acli net.update_vpc_east_west_traffic_config permit_all_traffic=true
  5. Update the subnet to use the new virtual switch for the external traffic, on the Prism Central VM.
    nutanix@pcvm$ atlas_cli subnet.update external_subnet_name virtual_switch_uuid=virtual_switch_uuid
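
The host_ip_addr_config value described in step 2 becomes error-prone to type when the virtual switch spans several hosts. The following Python sketch assembles the command line from a mapping; the build_update_command helper and the placeholder UUIDs and addresses are hypothetical, and the sketch only formats text using the option names shown in step 2.

```python
from typing import Dict


def build_update_command(vs_name: str, host_ips: Dict[str, str], gateway: str) -> str:
    """Build the acli command line from {host UUID: 'address/prefix'} pairs."""
    # Multiple hosts are joined with semicolons, as described in step 2.
    pairs = ";".join(f"{uuid}:{ip}" for uuid, ip in host_ips.items())
    return (
        f"acli net.update_virtual_switch {vs_name} "
        f"host_ip_addr_config={{{pairs}}} gateway_ip_address={gateway}"
    )


# Placeholder UUIDs and addresses for illustration only.
command = build_update_command(
    "vs1",
    {"host-uuid1": "10.10.10.15/24", "host-uuid2": "10.10.10.16/24"},
    "10.10.10.1/24",
)
print(command)
```

Semicolons and braces are shell metacharacters, so a generated string with more than one host may need quoting when you paste it into a shell. That is a general shell consideration rather than something this guide states.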

What to do next

  • To verify if the settings are made as required, use the atlas_config.get command and check the output.

    <acropolis> atlas_config.get
    config {
      anc_domain_name_server_list: "10.xxx.xxx.xxx"
      dvs_physnet_mapping_list {
        dvs_uuid: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
        physnet: "physnet1"
      }
      enable_atlas_networking: True
      logical_timestamp: 54
      minimum_ahv_version: "20201105.2016"
      ovn_cacert_path: "/home/certs/OvnController/ca.pem"
      ovn_certificate_path: "/home/certs/OvnController/OvnController.crt"
      ovn_privkey_path: "/home/certs/OvnController/OvnController.key"
      ovn_remote_address: "ssl:anc-ovn-external.default.xxxx.nutanix.com:6652"
      vpc_east_west_traffic_config {
        dvs_uuid: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
        permit_all_traffic: True
      }
    }

    Where:

    • dvs_physnet_mapping_list provides details of the virtual switch.
    • vpc_east_west_traffic_config provides the configuration for traffic with permit_all_traffic being True. It also provides the UUID of the virtual switch being used for traffic as dvs_uuid: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
  • VLAN tagging: See VLAN Configuration.
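
If you capture the atlas_config.get output as text, the two fields of interest can be extracted for a quick check. The following Python sketch assumes the output has the same layout as the sample shown above; the vpc_traffic_config helper and the shortened sample values are hypothetical.

```python
import re
from typing import Optional, Tuple


def vpc_traffic_config(output: str) -> Optional[Tuple[str, bool]]:
    """Return (dvs_uuid, permit_all_traffic) from the vpc_east_west_traffic_config block."""
    block = re.search(r"vpc_east_west_traffic_config\s*\{(.*?)\}", output, re.DOTALL)
    if block is None:
        return None  # the block is absent from the captured output
    body = block.group(1)
    uuid = re.search(r'dvs_uuid:\s*"([^"]*)"', body)
    permit = re.search(r"permit_all_traffic:\s*(\w+)", body)
    return (
        uuid.group(1) if uuid else "",
        bool(permit) and permit.group(1) == "True",
    )


sample = """
config {
  dvs_physnet_mapping_list {
    dvs_uuid: "aaaa"
    physnet: "physnet1"
  }
  vpc_east_west_traffic_config {
    dvs_uuid: "bbbb"
    permit_all_traffic: True
  }
}
"""
print(vpc_traffic_config(sample))
```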
Clearing, Disabling and Deleting the Virtual Switch

You can disable and delete a non-default virtual switch used for Flow networking VPC traffic.

Before you begin

Before you delete a virtual switch that allows Flow networking VPC traffic, you must clear the virtual switch configuration that assigns the VPC traffic to that virtual switch. Use the net.clear_vpc_east_west_traffic_config command.
Note:

Do not create, update or delete any virtual switch when the AOS or AHV upgrade process is running.

About this task

To disable or delete a virtual switch configured to manage Flow networking VPC traffic, do the following:

Note: After you clear the virtual switch settings using step 1, you can disable and delete the virtual switch in Prism Central.

Procedure

  1. Use the net.clear_vpc_east_west_traffic_config command to remove the settings on the virtual switch or switches (vs1 per example) configured for Flow networking VPC traffic.
  2. Use the net.disable_virtual_switch virtual_switch=<virtual-switch-name> command to disable the virtual switch.
  3. Use the net.delete_virtual_switch virtual_switch=<virtual-switch-name> command to delete the virtual switch.

Re-Configuring Bonds Across Hosts Manually

If you are upgrading AOS to 5.20, 6.0 or later, you need to migrate the existing bridges to virtual switches. If there are inconsistent bond configurations across hosts before migration of the bridges, then after migration of bridges the virtual switches may not be properly deployed. To resolve such issues, you must manually configure the bonds to make them consistent.

About this task

Important: Use this procedure only when you need to modify inconsistent bonds in a migrated bridge across hosts in a cluster that are preventing Acropolis (AOS) from deploying the virtual switch for the migrated bridge.

Do not use ovs-vsctl commands to make the bridge level changes. Use the manage_ovs commands, instead.

The manage_ovs command allows you to update the cluster configuration. The changes are applied and retained across host restarts. The ovs-vsctl command allows you to update the live running host configuration but does not update the AOS cluster configuration and the changes are lost at host restart. This behavior of ovs-vsctl introduces connectivity issues during maintenance, such as upgrades or hardware replacements.

ovs-vsctl is usually used during a break/fix situation where a host may be isolated on the network and requires a workaround to gain connectivity before the cluster configuration can actually be updated using manage_ovs.

Note: Disable the virtual switch before you attempt to change the bonds or bridge.

If you hit an issue where the virtual switch is automatically re-created after it is disabled (with AOS versions 5.20.0 or 5.20.1), follow steps 1 and 2 below to disable such an automatically re-created virtual switch again before migrating the bridges. For more information, see KB-3263.

Be cautious when using the disable_virtual_switch command because it deletes all the configurations from the IDF, not only for the default virtual switch vs0, but also for any virtual switches that you may have created (such as vs1 or vs2). Therefore, before you use the disable_virtual_switch command, review the list of existing virtual switches, which you can get using the acli net.get_virtual_switch command.

Complete this procedure on each host Controller VM that is sharing the bridge that needs to be migrated to a virtual switch.

Procedure

  1. To list the virtual switches, use the following command.
    nutanix@cvm$ acli net.list_virtual_switch
  2. Disable all the virtual switches.
    nutanix@cvm$ acli net.disable_virtual_switch 

    This disables all the virtual switches.

    Note: You can use the nutanix@cvm$ acli net.delete_virtual_switch vs_name command to delete a specific VS and re-create it with the appropriate bond type.
  3. Change the bond type to align with the same bond type on all the hosts for the specified virtual switch.
    nutanix@cvm$ manage_ovs --bridge_name bridge-name --bond_name bond-name --bond_mode bond-type update_uplinks

    Where:

    • bridge-name: Provide the name of the bridge, such as br0 for the virtual switch on which you want to set the uplink bond mode.
    • bond-name: Provide the name of the uplink port such as br0-up for which you want to set the bond mode.
    • bond-type: Provide the bond mode that you require to be used uniformly across the hosts on the named bridge.

    Use the manage_ovs --help command for help on this command.

    Note: To disable LACP, change the bond type from LACP Active-Active (balance-tcp) to Active-Backup/Active-Active with MAC pinning (balance-slb) by setting the bond_mode using this command as active-backup or balance-slb.

    Ensure that you turn off LACP on the connected ToR switch port as well. To avoid blocking of the bond uplinks during the bond type change on the host, ensure that you follow the ToR switch best practices to enable LACP fallback or passive mode.

    To enable LACP, configure bond-type as balance-tcp (Active-Active) with additional variables --lacp_mode fast and --lacp_fallback true.

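  4. (If migrating to an AOS version earlier than 5.20.2) Check whether the issue described in the note above has occurred, that is, whether the virtual switch was automatically re-created, and if so, disable the virtual switch again.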
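
Because step 3 must be repeated identically on every host, it can help to generate the manage_ovs command line rather than type it each time. The following Python sketch builds the argument list from the options named in step 3; the build_update_uplinks helper is hypothetical, and the set of accepted bond modes is limited to the three modes that this section mentions.

```python
from typing import List

LACP_MODE = "balance-tcp"
NON_LACP_MODES = ("active-backup", "balance-slb")


def build_update_uplinks(bridge: str, bond: str, bond_mode: str) -> List[str]:
    """Build the manage_ovs argument list for a bond-mode change."""
    if bond_mode != LACP_MODE and bond_mode not in NON_LACP_MODES:
        raise ValueError(f"unsupported bond mode: {bond_mode}")
    args = ["manage_ovs", "--bridge_name", bridge, "--bond_name", bond,
            "--bond_mode", bond_mode]
    if bond_mode == LACP_MODE:
        # Enabling LACP needs the two extra options named in step 3.
        args += ["--lacp_mode", "fast", "--lacp_fallback", "true"]
    args.append("update_uplinks")
    return args


print(" ".join(build_update_uplinks("br0", "br0-up", "balance-tcp")))
```

Rejecting unknown bond modes early is a choice of this sketch; run manage_ovs --help to see what the command itself accepts.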

What to do next

After making the bonds consistent across all the hosts configured in the bridge, migrate the bridge or enable the virtual switch. For more information, see:

To check whether LACP is enabled or disabled, use the following command.

nutanix@cvm$ manage_ovs show_uplinks

Enabling LACP and LAG (AHV Only)

If you select the Active-Active bond type, you must enable LACP and LAG on the corresponding ToR switch for each node in the cluster one after the other. This section describes the procedure to enable LAG and LACP in AHV nodes and the connected ToR switch.

About this task

Procedure

  1. Change the uplink Bond Type for the virtual switch.
    1. Open the Edit Virtual Switch window.
      • In Prism Central, open Network & Security > Subnets > Network Configuration > Virtual Switch.
      • In Prism Element or Web Console, open Settings > Network Configuration > Virtual Switch
    2. Click the Edit icon of the virtual switch for which you want to configure LAG and LACP.
    3. On the Edit Virtual Switch page, in the General tab, ensure that the Standard option is selected for the Select Configuration Method parameter. Click Next.
      The Standard configuration method puts each node in maintenance mode before applying the updated settings. After applying the updated settings, the node exits from maintenance mode. See Virtual Switch Workflow.
    4. On the Uplink Configuration tab, in Bond Type, select Active-Active.
    5. Click Save.
    The Active-Active bond type configures all AHV hosts with the fast setting for LACP speed, causing the AHV host to request LACP control packets at the rate of one per second from the physical switch. In addition, the Active-Active bond type configuration sets LACP fallback to Active-Backup on all AHV hosts. You cannot modify these default settings after you have configured them in Prism, even by using the CLI.

    This completes the LAG and LACP configuration on the cluster.

Perform the following steps on each node, one at a time.
  1. Put the node and the Controller VM into maintenance mode.
    Before you put a node in maintenance mode, see Verifying the Cluster Health and carry out the necessary checks.

    See Putting a Node into Maintenance Mode using Web Console. Step 6 in this procedure puts the Controller VM in maintenance mode.

  2. Change the settings for the interface on the ToR switch that the node connects to, to match the LACP and LAG setting made on the cluster in step 1 above.
    This is an important step. See the documentation provided by the ToR switch vendor for more information about changing the LACP settings of the switch interface that the node is physically connected to.
    • Nutanix recommends that you enable LACP fallback.

    • Consider the LACP time options (slow and fast). If the switch has a fast configuration, set the LACP time to fast. This prevents an outage due to a mismatch between the LACP speeds of the cluster and the ToR switch. Keep in mind that the Active-Active bond type configuration sets the LACP speed of the cluster to fast.

    Verify that LACP negotiation status is negotiated.

  3. Remove the node and Controller VM from maintenance mode.
    See Exiting a Node from the Maintenance Mode using Web Console. The Controller VM exits maintenance mode during the same process.

What to do next

Do the following after completing the procedure to enable LAG and LACP on all the AHV nodes and the connected ToR switches:
  • Verify that the status of all services on all the CVMs is Up. Run the following command and check if the status of the services is displayed as Up in the output:
    nutanix@cvm$ cluster status
  • Log on to the Prism Element of the node and check that the Data Resiliency Status widget displays OK.
    Figure. Data Resiliency Status

VLAN Configuration

You can set up a VLAN-based segmented virtual network on an AHV node by assigning the ports on virtual bridges managed by virtual switches to different VLANs. VLAN port assignments are configured from the Controller VM that runs on each node.

For best practices associated with VLAN assignments, see AHV Networking Recommendations. For information about assigning guest VMs to a virtual switch and VLAN, see Network Connections in the Prism Central Guide.

Assigning an AHV Host to a VLAN

About this task

Note: Perform the following procedure during a scheduled downtime. Before you begin, stop the cluster. Once the process begins, hosts and CVMs partially lose network access to each other and VM data or storage containers become unavailable until the process completes.

To assign an AHV host to a VLAN, do the following on every AHV host in the cluster:

Procedure

  1. Log on to the AHV host with SSH.
  2. Put the AHV host and the CVM in maintenance mode.
    See Putting a Node into Maintenance Mode using CLI for instructions about how to put a node into maintenance mode.
  3. Assign port br0 (the internal port on the default OVS bridge br0, on the default virtual switch vs0) to the VLAN that you want the host to be on.
    root@ahv# ovs-vsctl set port br0 tag=host_vlan_tag

    Replace host_vlan_tag with the VLAN tag for hosts.

  4. Confirm VLAN tagging on port br0.
    root@ahv# ovs-vsctl list port br0
  5. Check the value of the tag parameter that is shown.
  6. Verify connectivity to the IP address of the AHV host by performing a ping test.
  7. Exit the AHV host and the CVM from the maintenance mode.
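
Steps 4 and 5 can be automated if you capture the ovs-vsctl list port br0 output as text. The following Python sketch assumes the usual "column : value" layout of ovs-vsctl list output, with an unset tag printed as []; the port_vlan_tag helper and the sample text are hypothetical.

```python
from typing import Optional


def port_vlan_tag(list_port_output: str) -> Optional[int]:
    """Return the VLAN tag from 'ovs-vsctl list port' output, or None if unset."""
    for line in list_port_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "tag":
            value = value.strip()
            # An unset tag is assumed to be printed as an empty set, [].
            return int(value) if value.isdigit() else None
    return None


# Shortened, assumed sample of the command output.
sample = (
    "name                : br0\n"
    "tag                 : 10\n"
    "trunks              : []\n"
)
print(port_vlan_tag(sample))
```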

Assigning the Controller VM to a VLAN

By default, the public interface of a Controller VM is assigned to VLAN 0. To assign the Controller VM to a different VLAN, change the VLAN ID of its public interface. After the change, you can access the public interface from a device that is on the new VLAN.

About this task

Note: Perform the following procedure during a scheduled downtime. Before you begin, stop the cluster. Once the process begins, hosts and CVMs partially lose network access to each other and VM data or storage containers become unavailable until the process completes.
Note: To avoid losing connectivity to the Controller VM, do not change the VLAN ID when you are logged on to the Controller VM through its public interface. To change the VLAN ID, log on to the internal interface that has IP address 192.168.5.254.

Perform these steps on every Controller VM in the cluster. To assign the Controller VM to a VLAN, do the following:

Procedure

  1. Log on to the AHV host with SSH.
  2. Put the AHV host and the Controller VM in maintenance mode.
    See Putting a Node into Maintenance Mode using CLI for instructions about how to put a node into maintenance mode.
  3. Check the Controller VM status on the host.
    root@host# virsh list

    An output similar to the following is displayed:

    root@host# virsh list
     Id    Name                           State
    ----------------------------------------------------
     1     NTNX-CLUSTER_NAME-3-CVM            running
     3     3197bf4a-5e9c-4d87-915e-59d4aff3096a running
     4     c624da77-945e-41fd-a6be-80abf06527b9 running
    
    root@host# logout
  4. Log on to the Controller VM.
    root@host# ssh nutanix@192.168.5.254

    Accept the host authenticity warning if prompted, and enter the Controller VM nutanix password.

  5. Assign the public interface of the Controller VM to a VLAN.
    nutanix@cvm$ change_cvm_vlan vlan_id

    Replace vlan_id with the ID of the VLAN to which you want to assign the Controller VM.

    For example, add the Controller VM to VLAN 201.

    nutanix@cvm$ change_cvm_vlan 201
  6. Confirm VLAN tagging on the Controller VM.
    root@host# virsh dumpxml cvm_name

    Replace cvm_name with the CVM name or CVM ID to view the VLAN tagging information.

    Note: Refer to step 3 for Controller VM name and Controller VM ID.

    An output similar to the following is displayed:

    root@host# virsh dumpxml 1 | grep "tag id" -C10 --color
          <target dev='vnet2'/>
          <model type='virtio'/>
          <driver name='vhost' queues='4'/>
          <alias name='net2'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
        </interface>
        <interface type='bridge'>
          <mac address='50:6b:8d:b9:0a:18'/>
          <source bridge='br0'/>
          <vlan>
               <tag id='201'/> 
          </vlan>
          <virtualport type='openvswitch'>
            <parameters interfaceid='c46374e4-c5b3-4e6b-86c6-bfd6408178b5'/>
          </virtualport>
          <target dev='vnet0'/>
          <model type='virtio'/>
          <driver name='vhost' queues='4'/>
          <alias name='net3'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
        </interface>
    root@host#
  7. Check the value of the tag parameter that is shown.
  8. Restart the network service.
    nutanix@cvm$ sudo service network restart
  9. Verify connectivity to the Controller VM's external IP address by performing a ping test from the same subnet. For example, perform a ping from another Controller VM or directly from the host itself.
  10. Exit the AHV host and the Controller VM from the maintenance mode.
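The VLAN tag check in steps 6 and 7 can also be done programmatically rather than with grep. The sketch below uses the Python standard library XML parser on a trimmed, illustrative domain definition modeled on the output shown in step 6. The function name and the sample XML are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def interface_vlan_tags(domain_xml: str) -> dict:
    """Map each interface MAC address to the list of its VLAN tag IDs."""
    root = ET.fromstring(domain_xml)
    tags = {}
    for iface in root.iter("interface"):
        mac = iface.find("mac")
        if mac is None:
            continue
        # An interface without a <vlan> element yields an empty list.
        tags[mac.get("address")] = [
            int(tag.get("id")) for tag in iface.findall("./vlan/tag")
        ]
    return tags


# Trimmed, illustrative domain XML; a real dump contains many more elements.
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <interface type='bridge'>
      <mac address='50:6b:8d:b9:0a:18'/>
      <source bridge='br0'/>
      <vlan>
        <tag id='201'/>
      </vlan>
    </interface>
  </devices>
</domain>
"""

print(interface_vlan_tags(SAMPLE_XML))
```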

IGMP Snooping

On an AHV host, when multicast traffic flows to a virtual switch, the host floods the multicast traffic to all the VMs on the specific VLAN. This mechanism is inefficient when many of the VMs on the VLAN do not need that multicast traffic. IGMP snooping allows the host to track which VMs on the VLAN need the multicast traffic and send the multicast traffic to only those VMs. For example, assume there are 50 VMs on VLAN 100 on virtual switch vs1 and only 25 VMs (the receiver VMs) need to receive the multicast traffic. Turn on IGMP snooping to help the AHV host track the 25 receiver VMs and deliver the multicast traffic to only those 25 VMs instead of pushing it to all 50 VMs.

When IGMP snooping is enabled in a virtual switch on a VLAN, the ToR switch or router queries the VMs about the multicast traffic that the VMs are interested in. When the switch receives a join request from a VM in response to the query, it adds the VM to the multicast list for that source entry as a receiver VM. When the switch sends a query, only the VMs that require the multicast traffic respond to the switch. The VMs that do not need the traffic do not respond at all. So, the switch does not add a VM to a multicast group or list unless it receives a response from that VM for the query.

Typically, in a multicast scenario, there is a source entity that sends the multicast traffic. This source may be another VM in the target cluster (the cluster that contains the VMs that need to receive the multicast traffic) or another cluster connected to the target cluster. The host in the target cluster acts as the multicast router. Enable IGMP snooping in the virtual switch that hosts the VLAN connecting the VMs. You must also enable either the native Acropolis IGMP querier on the host or a separate third-party querier that you install on the host. The native Acropolis IGMP querier sends IGMP v2 query packets to the VMs.

The IGMP Querier sends out queries periodically to keep the multicast groups or lists updated. The periodicity of the query is determined by the IGMP snooping timeout value that you must specify when you enable IGMP snooping. For example, if you have configured the IGMP snooping timeout as 30 seconds then the IGMP Querier sends out a query every 15 seconds.
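The relationship between the timeout and the query period can be written as a one-line calculation. This is a sketch that assumes the period is always half the configured timeout, which is inferred only from the single 30-second example above. The function name is illustrative.

```python
def igmp_query_interval(snooping_timeout_s: int) -> float:
    """Return the assumed query period, in seconds, for a snooping timeout."""
    if snooping_timeout_s <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    # Assumption: period = timeout / 2, generalized from the 30 s -> 15 s example.
    return snooping_timeout_s / 2


print(igmp_query_interval(30))
```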

When you enable IGMP snooping and are using the native Acropolis IGMP querier, you must configure the IGMP VLAN list. The IGMP VLAN list is a list of VLANs that the native IGMP Querier must send the query out to. This list value is a comma-separated list of the VLAN IDs that the query needs to be sent to. If you do not provide a list of VLANs, then the native IGMP Querier sends the query to all the VLANs in the switch.

When a VM needs to receive the multicast traffic from a specific multicast source, configure the multicast application on the VM to listen to the queries received by the VM from the IGMP Querier. Also, configure the multicast application on the VM to respond to the relevant query, that is, the query for the specific multicast source. The virtual switch logs the response that the application sends and then sends the multicast traffic to that VM instead of flooding it to all the VMs on the VLAN.

A multicast source always sends multicast traffic to a multicast group or list that is indicated by a multicast group IP address.

Enabling or Disabling IGMP Snooping

IGMP snooping helps you manage multicast traffic to specific VMs configured on a VLAN.

About this task

You can only enable IGMP snooping using aCLI.

Procedure

Run the following command:
net.update_virtual_switch virtual-switch-name enable_igmp_snooping=true enable_igmp_querier=[true | false] igmp_query_vlan_list=VLAN IDs igmp_snooping_timeout=timeout

Provide:

  • virtual-switch-name: The name of the virtual switch in which the VLANs are configured. For example, the name of the default virtual switch is vs0. Provide the name of the virtual switch exactly as it is configured.
  • enable_igmp_snooping=[true | false]: Provide true to enable IGMP snooping. Provide false to disable IGMP snooping. The default setting is false.
  • enable_igmp_querier=[true | false]: Provide true to enable the native IGMP querier. Provide false to disable the native IGMP querier. The default setting is false.
  • igmp_query_vlan_list=VLAN IDs: List of VLAN IDs mapped to the virtual switch for which the IGMP querier is enabled. When the list is not set or is set as an empty list, the querier is enabled for all VLANs of the virtual switch.
  • igmp_snooping_timeout=timeout: An integer indicating time in seconds. For example, you can provide 30 to indicate an IGMP snooping timeout of 30 seconds.

    The default timeout is 300 seconds.

    You can set the timeout in the range of 15 - 3600 seconds.
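The parameter rules above can be checked before you run the command. The following sketch assembles the aCLI command string and enforces the documented timeout range. The function name is hypothetical, and the rendering of the VLAN list as comma-separated IDs without spaces is an assumption based on the description of the IGMP VLAN list.

```python
from typing import Iterable, Optional

MIN_TIMEOUT_S = 15       # documented lower bound
MAX_TIMEOUT_S = 3600     # documented upper bound
DEFAULT_TIMEOUT_S = 300  # documented default


def build_igmp_command(
    virtual_switch: str,
    enable_snooping: bool,
    enable_querier: bool = False,
    query_vlans: Optional[Iterable[int]] = None,
    timeout_s: int = DEFAULT_TIMEOUT_S,
) -> str:
    """Return a net.update_virtual_switch command line for IGMP snooping."""
    if not MIN_TIMEOUT_S <= timeout_s <= MAX_TIMEOUT_S:
        raise ValueError("timeout must be between 15 and 3600 seconds")
    args = [
        "net.update_virtual_switch",
        virtual_switch,
        f"enable_igmp_snooping={str(enable_snooping).lower()}",
        f"enable_igmp_querier={str(enable_querier).lower()}",
    ]
    vlans = list(query_vlans or [])
    if vlans:
        # Assumption: VLAN IDs are joined with commas and no spaces.
        args.append("igmp_query_vlan_list=" + ",".join(str(v) for v in vlans))
    args.append(f"igmp_snooping_timeout={timeout_s}")
    return " ".join(args)


print(build_igmp_command("vs1", True, True, [100, 200], 30))
```

Omitting query_vlans leaves the list unset, which corresponds to the querier being enabled for all VLANs of the virtual switch.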

What to do next

You can verify whether IGMP snooping is enabled or disabled by running the following command:
net.get_virtual_switch virtual-switch-name

The output of this command includes the following sample configuration:

igmp_config {
  enable_querier: True
  enable_snooping: True
 }

The above sample shows that IGMP snooping and the native Acropolis IGMP querier are enabled.

Switch Port ANalyzer on AHV Hosts

Switch Port ANalyzer (SPAN) or port mirroring enables you to mirror traffic from interfaces of the AHV hosts to the VNIC of guest VMs. SPAN mirrors some or all packets from a set of source ports to a set of destination ports. You can mirror inbound, outbound, or bidirectional traffic on a set of source ports. You can then use the mirrored traffic for security analysis and gain visibility of traffic flowing through the set of source ports. SPAN is a useful tool for troubleshooting packets and can prove to be necessary for compliance reasons.

AHV supports the following types of source ports in a SPAN session:

  1. A bond port that is already mapped to a Virtual Switch (VS) such as vs0, vs1, or any other VS you have created.

  2. A non-bond port that is already mapped to a VS such as vs0, vs1, or any other VS you have created.

  3. An uplink port that is not assigned to any VS or bridge on the host.

Important Considerations

Consider the following before you configure SPAN on AHV hosts:

  • In this release, AHV supports mirroring of traffic only from physical interfaces.

  • The SPAN destination VM or guest VM must be running on the same AHV host where the source ports are located.

  • Delete the SPAN session before you delete the SPAN destination VM or VNIC. Otherwise, the state of the SPAN session is displayed as error.

  • AHV does not support SPAN from a member of a bond port. For example, if you have mapped br0-up to bridge br0 with members eth0 and eth1, you cannot create a SPAN session with either eth0 or eth1 as the source port. You must use only br0-up as the source port.

  • AHV supports different types of source ports in one session. For example, you can create a session with br0-up (bond port) and eth5 (single uplink port) on the same host as two different source ports in the same session. You can even have two different bond ports in the same session.

  • One SPAN session supports up to two source and two destination ports.

  • One host supports up to two SPAN sessions.

  • You cannot create a SPAN session on an AHV host that is in the maintenance mode.

  • If you move the uplink interface to another Virtual Switch, the SPAN session fails. Note that the system does not generate an alert in this situation.

  • With TCP Segmentation Offload, multiple packets belonging to the same stream can be coalesced into a single one before being delivered to the SPAN destination VM. With TCP Segmentation Offload enabled, there can be a difference between the number of packets received on the uplink interface and packets forwarded to the SPAN destination VM (session packet count <= uplink interface packet count). However, the byte count at the SPAN destination VM is closer to the number at the uplink interface.
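Several of these considerations are simple rules that can be checked before you create a session. The sketch below encodes them as a validation function. The SpanPlan structure and all of its field names are hypothetical and exist only for this illustration. The checks cover the host placement, port count, bond member, session count, and maintenance mode rules listed above.

```python
from dataclasses import dataclass, field
from typing import List

MAX_PORTS_PER_SESSION = 2  # up to two source and two destination ports
MAX_SESSIONS_PER_HOST = 2  # up to two SPAN sessions per host


@dataclass
class SpanPlan:
    source_host: str               # host that owns the source ports
    destination_vm_host: str       # host where the SPAN destination VM runs
    source_ports: List[str]
    destination_ports: List[str]   # MAC addresses of SPAN destination VNICs
    bond_members: List[str] = field(default_factory=list)
    sessions_on_host: int = 0      # SPAN sessions that already exist on the host
    host_in_maintenance: bool = False


def check_span_plan(plan: SpanPlan) -> List[str]:
    """Return a list of rule violations; an empty list means no rule was broken."""
    problems = []
    if plan.destination_vm_host != plan.source_host:
        problems.append("destination VM must run on the host that owns the source ports")
    if not 1 <= len(plan.source_ports) <= MAX_PORTS_PER_SESSION:
        problems.append("a session needs one or two source ports")
    if not 1 <= len(plan.destination_ports) <= MAX_PORTS_PER_SESSION:
        problems.append("a session needs one or two destination ports")
    members = set(plan.bond_members)
    for port in plan.source_ports:
        if port in members:
            problems.append(f"{port} is a bond member; use the bond port instead")
    if plan.sessions_on_host >= MAX_SESSIONS_PER_HOST:
        problems.append("host already has two SPAN sessions")
    if plan.host_in_maintenance:
        problems.append("host is in maintenance mode")
    return problems


plan = SpanPlan("host-1", "host-1", ["br0-up", "eth5"], ["50:6b:8d:de:c6:44"],
                bond_members=["eth0", "eth1"])
print(check_span_plan(plan))
```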

Configuring SPAN on an AHV Host

To configure SPAN on an AHV host, create a SPAN destination VNIC and assign it to a guest VM (the SPAN destination VM). After you create the VNIC, create a SPAN session that specifies the source and destination ports for the session.

Before you begin

Ensure that you have created the guest VM that you want to configure as the SPAN destination VM.
Note: The SPAN destination VM must run on the same AHV host where the source ports are located. Therefore, Nutanix highly recommends that you create or modify the guest VM as an agent VM so that the VM is not migrated from the host.

Command and example for modifying a guest VM as an agent VM (recommended):

nutanix@cvm$ acli vm.update vm-name agent_vm=true
nutanix@cvm$ acli vm.update span-dest-VM agent_vm=true

In this example, span-dest-VM is the name of the guest VM that you are modifying as an agent VM.

About this task

Perform the following procedure to configure SPAN on an AHV host:

Procedure

  1. Log on to a Controller VM in the cluster with SSH.
  2. Determine the name and UUID of the guest VM that you want to configure as the SPAN destination VM.
    nutanix@cvm$ acli vm.list

    Example:

    nutanix@cvm$ acli vm.list
    VM name       VM UUID
    span-dest-VM  85abfdd5-7419-4f7c-bffa-8f961660e516

    In this example, span-dest-VM is the name and 85abfdd5-7419-4f7c-bffa-8f961660e516 is the UUID of the guest VM.

    Note: If you delete the SPAN destination VM without deleting the SPAN session you create with this SPAN destination VM, the SPAN session State displays kError.
  3. Create a SPAN destination VNIC for the guest VM.
    nutanix@cvm$ acli vm.nic_create vm-name type=kSpanDestinationNic

    Replace vm-name with the name of the guest VM on which you want to configure SPAN.

    Note: Do not include any other parameter when you are creating a SPAN destination VNIC.

    Example:

    nutanix@cvm$ acli vm.nic_create span-dest-VM type=kSpanDestinationNic
    NicCreate: complete
    Note: If you delete the SPAN destination VNIC without deleting the SPAN session you create with this SPAN destination VNIC, the SPAN session State displays kError.
  4. Determine the MAC address of the VNIC.
    nutanix@cvm$ acli vm.nic_get vm-name
    

    Replace vm-name with the name of the guest VM to which you assigned the VNIC.

    Example:

    nutanix@cvm$ acli vm.nic_get span-dest-VM
    x.x.x.x {
      connected: True
      ip_address: "x.x.x.x"
      mac_addr: "50:6b:8d:8b:2c:94"
      network_name: "mgmt"
      network_type: "kNativeNetwork"
      network_uuid: "c14b0092-877e-489b-a399-2749a60b3206"
      type: "kNormalNic"
      uuid: "9dd4f307-2506-4354-86a3-0b99abdeba6c"
      vlan_mode: "kAccess"
    }
    50:6b:8d:de:c6:44 {
      mac_addr: "50:6b:8d:de:c6:44"
      network_type: "kNativeNetwork"
      type: "kSpanDestinationNic"
      uuid: "b59e99bc-6bc7-4fab-ac35-543695c300d1"
    }
    

    Note the MAC address (value of mac_addr) of the VNIC whose type is set to kSpanDestinationNic.

  5. Determine the UUID of the host whose traffic you want to monitor by using SPAN.
    nutanix@cvm$ acli host.list
  6. Create a SPAN session.
    nutanix@cvm$ acli net.create_span_session span-session-name description="description-text" source_list=\{uuid=host-uuid,type=kHostNic,identifier=source-port-name,direction=traffic-type} dest_list=\{uuid=vm-uuid,type=kVmNic,identifier=vnic-mac-address}

    Replace the variables in the command with appropriate values as follows:

    • span-session-name: Replace span-session-name with a name for the session.
    • description (Optional): Replace description-text with a description for the session.
    Note:

    All source_list and dest_list parameters are mandatory inputs. The parameters do not have default values. Provide an appropriate value for each parameter.

    Source list parameters:

    • uuid: Replace host-uuid with the UUID of the host whose traffic you want to monitor by using SPAN (determined in step 5).
    • type: Specify kHostNic as the type. Only the kHostNic type is supported in this release.
    • identifier: Replace source-port-name with the name of the source port whose traffic you want to mirror. For example, br0-up, eth0, or eth1.
    • direction: Replace traffic-type with kIngress if you want to mirror inbound traffic, kEgress for outbound traffic, or kBiDir for bidirectional traffic.

    Destination list parameters:

    • uuid: Replace vm-uuid with the UUID of the guest VM that you want to configure as the SPAN destination VM (determined in step 2).
    • type: Specify kVmNic as the type. Only the kVmNic type is supported in this release.
    • identifier: Replace vnic-mac-address with the MAC address of the destination port where you want to mirror the traffic (determined in step 4).
    Note: The syntax for source_list and dest_list is as follows:

    source_list/dest_list=[{key1=value1,key2=value2,..}]

    Each pair of curly brackets includes the details of one source or destination port with a comma-separated list of the key-value pairs. There must not be any space between two key-value pairs.

    One SPAN session supports up to two source and two destination ports. If you want to include an extra port, separate the curly brackets with a semicolon (no space) and list the key-value pairs of the second port in the other curly bracket.

    Example:

    nutanix@cvm$ acli net.create_span_session span1 description="span session 1" source_list=\{uuid=492a2bda-ffc0-486a-8bc0-8ae929471714,type=kHostNic,identifier=br0-up,direction=kBiDir} dest_list=\{uuid=85abfdd5-7419-4f7c-bffa-8f961660e516,type=kVmNic,identifier=50:6b:8d:de:c6:44}
    SpanCreate: complete
  7. Display the list of all SPAN sessions running on a host.
    nutanix@cvm$ acli net.list_span_session

    Example:

    nutanix@cvm$ acli net.list_span_session
    Name   UUID                                  State
    span1  69252eb5-8047-4e3a-8adc-91664a7104af  kActive

    Possible values for State are:

    • kActive: Denotes that the SPAN session is active.
    • kError: Denotes that there is an error and the configuration is not working. For example, if there are two sources and one source is down, the State of the session is displayed as kError.
  8. Display the details of a SPAN session.
    nutanix@cvm$ acli net.get_span_session span-session-name

    Replace span-session-name with the name of the SPAN session whose details you want to view.

    Example:

    nutanix@cvm$ acli net.get_span_session span1
    span1 {
      config {
        datapath_name: "s6925"
        description: "span session 1"
        destination_list {
          nic_type: "kVmNic"
          port_identifier: "50:6b:8d:de:c6:44"
          uuid: "85abfdd5-7419-4f7c-bffa-8f961660e516"
        }
        name: "span1"
        session_uuid: "69252eb5-8047-4e3a-8adc-91664a7104af"
        source_list {
          direction: "kBiDir"
          nic_type: "kHostNic"
          port_identifier: "br0-up"
          uuid: "492a2bda-ffc0-486a-8bc0-8ae929471714"
        }
      }
      stats {
        name: "span1"
        session_uuid: "69252eb5-8047-4e3a-8adc-91664a7104af"
        state: "kActive"
        stats_list {
          tx_byte_cnt: 67498
          tx_pkt_cnt: 436
        }
      }
    }

    Note the value of the datapath_name field in the SPAN session configuration, which is a unique key that identifies the SPAN session. You might need the unique key to correctly identify the SPAN session for troubleshooting reasons.
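The source_list and dest_list syntax described in step 6 can be generated rather than typed by hand. The following sketch renders one or two ports in the documented form: key-value pairs separated by commas with no spaces, and a semicolon between the two pairs of curly brackets. The function name is hypothetical. The sketch returns only the list value. Escaping or quoting the value for the shell, as the backslash before the opening curly bracket does in the documented command, is left to the caller. A semicolon is also a command separator in most shells, so a two-port value needs quoting there.

```python
from typing import Dict, List

MAX_PORTS = 2  # one SPAN session supports up to two ports per list


def format_port_list(ports: List[Dict[str, str]]) -> str:
    """Render ports as {k=v,k=v} or {k=v,k=v};{k=v,k=v} with no spaces."""
    if not 1 <= len(ports) <= MAX_PORTS:
        raise ValueError("provide one or two ports")
    rendered = []
    for port in ports:
        pairs = ",".join(f"{key}={value}" for key, value in port.items())
        if " " in pairs:
            raise ValueError("key-value pairs must not contain spaces")
        rendered.append("{" + pairs + "}")
    return ";".join(rendered)


# Values taken from the example in step 6.
source = {
    "uuid": "492a2bda-ffc0-486a-8bc0-8ae929471714",
    "type": "kHostNic",
    "identifier": "br0-up",
    "direction": "kBiDir",
}
destination = {
    "uuid": "85abfdd5-7419-4f7c-bffa-8f961660e516",
    "type": "kVmNic",
    "identifier": "50:6b:8d:de:c6:44",
}

print("source_list=" + format_port_list([source]))
print("dest_list=" + format_port_list([destination]))
```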

Updating a SPAN Session

You can update any of the details of a SPAN session. When you are updating a SPAN session, specify the values of the parameters you want to update and then specify the rest of the parameters again as you specified them when you created the SPAN session. For example, if you want to change only the name and description, specify the updated name and description and then include the complete details of the source and destination ports again even though you are not updating those details.

About this task

Perform the following procedure to update a SPAN session:

Procedure

  1. Log on to a Controller VM in the cluster with SSH.
  2. Update the SPAN session.
    nutanix@cvm$ acli net.update_span_session span-session-name description="description-text" source_list=\{uuid=host-uuid,type=kHostNic,identifier=source-port-name,direction=traffic-type} dest_list=\{uuid=vm-uuid,type=kVmNic,identifier=vnic-mac-address}

    The update command includes the same parameters as the create command. See Configuring SPAN on an AHV Host for more information.

    Example:

    nutanix@cvm$ acli net.update_span_session span1 name=span_br0_to_span_dest description="span from br0-up to span-dest VM" source_list=\{uuid=492a2bda-ffc0-486a-8bc0-8ae929471714,type=kHostNic,identifier=br0-up,direction=kBiDir} dest_list=\{uuid=85abfdd5-7419-4f7c-bffa-8f961660e516,type=kVmNic,identifier=50:6b:8d:de:c6:44}
    SpanUpdate: complete
    
    nutanix@cvm$ acli net.list_span_session
    Name                   UUID                                  State
    span_br0_to_span_dest  69252eb5-8047-4e3a-8adc-91664a7104af  kActive
    
    nutanix@cvm$ acli net.get_span_session span_br0_to_span_dest
    span_br0_to_span_dest {
      config {
        datapath_name: "s6925"
        description: "span from br0-up to span-dest VM"
        destination_list {
          nic_type: "kVmNic"
          port_identifier: "50:6b:8d:de:c6:44"
          uuid: "85abfdd5-7419-4f7c-bffa-8f961660e516"
        }
        name: "span_br0_to_span_dest"
        session_uuid: "69252eb5-8047-4e3a-8adc-91664a7104af"
        source_list {
          direction: "kBiDir"
          nic_type: "kHostNic"
          port_identifier: "br0-up"
          uuid: "492a2bda-ffc0-486a-8bc0-8ae929471714"
        }
      }
      stats {
        name: "span_br0_to_span_dest"
        session_uuid: "69252eb5-8047-4e3a-8adc-91664a7104af"
        state: "kActive"
        stats_list {
          tx_byte_cnt: 805705
          tx_pkt_cnt: 4792
        }
      }
    }

    In this example, only the name and description were updated. However, complete details of the source and destination ports were included in the command again.

    If you want to change the name of a SPAN session, specify the existing name first and then include the new name by using the “name=” parameter as shown in this example.
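The rule that an update must repeat every parameter can be expressed as a small helper that starts from the current session and overrides only what changes. This is a sketch under stated assumptions: the SpanSession structure and the function name are hypothetical, the source and destination lists are stored as already rendered strings, and each returned element is one command argument with no shell quoting applied.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SpanSession:
    name: str
    description: str
    source_list: str  # rendered value, for example "{uuid=...,type=kHostNic,...}"
    dest_list: str    # rendered value, for example "{uuid=...,type=kVmNic,...}"


def update_arguments(
    current: SpanSession,
    new_name: Optional[str] = None,
    new_description: Optional[str] = None,
) -> List[str]:
    """Return the argument list for net.update_span_session."""
    # The existing name always comes first; a new name follows as name=...
    args = ["net.update_span_session", current.name]
    if new_name is not None:
        args.append(f"name={new_name}")
    description = current.description if new_description is None else new_description
    args.append(f"description={description}")
    # Unchanged port details are repeated, as the update requires.
    args.append(f"source_list={current.source_list}")
    args.append(f"dest_list={current.dest_list}")
    return args


session = SpanSession(
    name="span1",
    description="span session 1",
    source_list="{uuid=492a2bda-ffc0-486a-8bc0-8ae929471714,type=kHostNic,identifier=br0-up,direction=kBiDir}",
    dest_list="{uuid=85abfdd5-7419-4f7c-bffa-8f961660e516,type=kVmNic,identifier=50:6b:8d:de:c6:44}",
)
for argument in update_arguments(session, new_name="span_br0_to_span_dest"):
    print(argument)
```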

Deleting a SPAN Session

Delete the SPAN session if you want to disable SPAN on an AHV host. Nutanix recommends that you delete the SPAN session associated with a SPAN destination VM or SPAN destination VNIC before you delete that VM or VNIC.

About this task

Perform the following procedure to delete a SPAN session:

Procedure

  1. Log on to a Controller VM in the cluster with SSH.
  2. Delete the SPAN session.
    nutanix@cvm$ acli net.delete_span_session span-session-name

    Replace span-session-name with the name of the SPAN session you want to delete.

Enabling RSS Virtio-Net Multi-Queue by Increasing the Number of VNIC Queues

Multi-Queue in VirtIO-net enables you to improve network performance for network I/O-intensive guest VMs or applications running on AHV hosts.

About this task

You can enable VirtIO-net multi-queue by increasing the number of VNIC queues. If an application uses many distinct streams of traffic, Receive Side Scaling (RSS) can distribute the streams across multiple VNIC DMA rings. This increases the amount of RX buffer space by the number of VNIC queues (N). Also, most guest operating systems pin each ring to a particular vCPU, handling the interrupts and ring-walking on that vCPU, thereby achieving N-way parallelism in RX processing. However, if you increase the number of queues beyond the number of vCPUs, you cannot achieve extra parallelism.

The following workloads benefit the most from VirtIO-net multi-queue:

  • VMs where traffic packets are relatively large
  • VMs with many concurrent connections
  • VMs with network traffic moving:
    • Among VMs on the same host
    • Among VMs across hosts
    • From VMs to the hosts
    • From VMs to an external system
  • VMs with high VNIC RX packet drop rate if CPU contention is not the cause

You can increase the number of queues of the AHV VM VNIC to allow the guest OS to use multi-queue VirtIO-net on guest VMs with intensive network I/O. Multi-Queue VirtIO-net scales the network performance by transferring packets through more than one Tx/Rx queue pair at a time as the number of vCPUs increases.

Nutanix recommends that you be conservative when increasing the number of queues. Do not set the number of queues larger than the total number of vCPUs assigned to a VM. Packet reordering and TCP retransmissions increase if the number of queues is larger than the number of vCPUs assigned to a VM. For this reason, start by increasing the number of queues to 2. The default number of queues is 1. After making this change, monitor the guest VM and network performance. Before you increase the number of queues further, verify that the vCPU usage has not dramatically or unreasonably increased.

Perform the following steps to make more VNIC queues available to a guest VM. See your guest OS documentation to verify if you must perform extra steps on the guest OS to apply the additional VNIC queues.

Note: You must shut down the guest VM to change the number of queues. Therefore, make this change during a planned maintenance window. The VNIC status might change from Up->Down->Up or a restart of the guest OS might be required to finalize the settings depending on the guest OS implementation requirements.

Procedure

  1. (Optional) Nutanix recommends that you ensure the following:
    1. AHV and AOS are running the latest version.
    2. AHV guest VMs are running the latest version of the Nutanix VirtIO driver package.
      For RSS support, ensure you are running Nutanix VirtIO 1.1.6 or later. See Nutanix VirtIO for Windows for more information about Nutanix VirtIO.
  2. Determine the exact name of the guest VM for which you want to change the number of VNIC queues.
    nutanix@cvm$ acli vm.list

    An output similar to the following is displayed:

    nutanix@cvm$ acli vm.list
    VM name          VM UUID
    ExampleVM1       a91a683a-4440-45d9-8dbe-xxxxxxxxxxxx
    ExampleVM2       fda89db5-4695-4055-a3d4-xxxxxxxxxxxx
    ...
  3. Determine the MAC address of the VNIC and confirm the current number of VNIC queues.
    nutanix@cvm$ acli vm.nic_get VM-name

    Replace VM-name with the name of the VM.

    An output similar to the following is displayed:

    nutanix@cvm$ acli vm.nic_get VM-name
    ...
    mac_addr: "50:6b:8d:2f:zz:zz"
    ...
    (queues: 2)    <- If there is no output of 'queues', the setting is default (1 queue).
    Note: AOS defines queues as the maximum number of Tx/Rx queue pairs (default is 1).
  4. Check the number of vCPUs assigned to the VM.
    nutanix@cvm$ acli vm.get VM-name | grep num_vcpus

    An output similar to the following is displayed:

    nutanix@cvm$ acli vm.get VM-name | grep num_vcpus
    num_vcpus: 1
  5. Shut down the guest VM.
    nutanix@cvm$ acli vm.shutdown VM-name

    Replace VM-name with the name of the VM.

  6. Increase the number of VNIC queues.
    nutanix@cvm$ acli vm.nic_update VM-name vNIC-MAC-address queues=N

    Replace VM-name with the name of the guest VM, vNIC-MAC-address with the MAC address of the VNIC, and N with the number of queues.

    Note: N must be less than or equal to the number of vCPUs assigned to the guest VM.
  7. Start the guest VM.
    nutanix@cvm$ acli vm.on VM-name

    Replace VM-name with the name of the VM.

  8. Confirm in the guest OS documentation if any additional steps are required to enable multi-queue in VirtIO-net.
    Note: Microsoft Windows has RSS enabled by default.

    For example, for RHEL and CentOS VMs, do the following:

    1. Log on to the guest VM.
    2. Confirm if irqbalance.service is active or not.
      uservm# systemctl status irqbalance.service

      An output similar to the following is displayed:

      irqbalance.service - irqbalance daemon
         Loaded: loaded (/usr/lib/systemd/system/irqbalance.service; enabled; vendor preset: enabled)
         Active: active (running) since Tue 2020-04-07 10:28:29 AEST; Ns ago
    3. Start irqbalance.service if it is not active.
      Note: It is active by default on CentOS VMs. You might have to start it on RHEL VMs.
      uservm# systemctl start irqbalance.service
    4. Run the following command:
      uservm# ethtool -L ethX combined M

      Replace ethX with the name of the network interface and M with the number of VNIC queues.

    Note the following caveat from the RHEL 7 Virtualization Tuning and Optimization Guide, section 5.4, Network Tuning Techniques:

    "Currently, setting up a multi-queue virtio-net connection can have a negative effect on the performance of outgoing traffic. Specifically, this may occur when sending packets under 1,500 bytes over the Transmission Control Protocol (TCP) stream."

  9. Monitor the VM performance to make sure that you observe the expected network performance increase and that the guest VM vCPU usage does not increase so much that it affects the application on the guest VM.
    For assistance with the steps described in this document, or if these steps do not resolve your guest VM network performance issues, contact Nutanix Support.
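The constraint between the queue count and the vCPU count in steps 4 and 6 can be checked before you shut down the VM. The sketch below parses the num_vcpus line shown in step 4 and refuses to build the update command when N exceeds that value. Both function names are hypothetical, and the MAC address in the usage line is a made-up placeholder.

```python
import re


def parse_num_vcpus(output: str) -> int:
    """Extract the num_vcpus value from the output shown in step 4."""
    match = re.search(r"num_vcpus:\s*(\d+)", output)
    if match is None:
        raise ValueError("num_vcpus not found in output")
    return int(match.group(1))


def build_nic_update_command(vm_name: str, vnic_mac: str, queues: int, num_vcpus: int) -> str:
    """Return the vm.nic_update command if the queue count is acceptable."""
    if queues < 1:
        raise ValueError("the number of queues must be at least 1")
    if queues > num_vcpus:
        # N must be less than or equal to the number of vCPUs assigned to the VM.
        raise ValueError("the number of queues exceeds the number of vCPUs")
    return f"acli vm.nic_update {vm_name} {vnic_mac} queues={queues}"


vcpus = parse_num_vcpus("num_vcpus: 4")
print(build_nic_update_command("ExampleVM1", "50:6b:8d:2f:00:01", 2, vcpus))
```

With the single vCPU shown in the sample output of step 4, a request for two queues is rejected, which matches the recommendation not to exceed the vCPU count.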

Changing the IP Address of an AHV Host

Change the IP address, netmask, or gateway of an AHV host.

Before you begin

Perform the following tasks before you change the IP address, netmask, or gateway of an AHV host:
Caution: All Controller VMs and hypervisor hosts must be on the same subnet.
Warning: Ensure that you perform the steps in the exact order as indicated in this document.
  1. Verify the cluster health by following the instructions in KB-2852.

    Do not proceed if the cluster cannot tolerate failure of at least one node.

  2. Put the AHV host into the maintenance mode.

    See Putting a Node into Maintenance Mode using CLI for instructions about how to put a node into maintenance mode.

About this task

Perform the following procedure to change the IP address, netmask, or gateway of an AHV host.

Procedure

  1. Edit the settings of port br0, which is the internal port on the default bridge br0.
    1. Log on to the host console as root.

      You can access the hypervisor host console either through IPMI or by attaching a keyboard and monitor to the node.

    2. Open the network interface configuration file for port br0 in a text editor.
      root@ahv# vi /etc/sysconfig/network-scripts/ifcfg-br0
    3. Update entries for host IP address, netmask, and gateway.

      The block of configuration information that includes these entries is similar to the following:

      ONBOOT="yes" 
      NM_CONTROLLED="no" 
      PERSISTENT_DHCLIENT=1
      NETMASK="subnet_mask" 
      IPADDR="host_ip_addr" 
      DEVICE="br0" 
      TYPE="ethernet" 
      GATEWAY="gateway_ip_addr"
      BOOTPROTO="none"
      • Replace host_ip_addr with the IP address for the hypervisor host.
      • Replace subnet_mask with the subnet mask for host_ip_addr.
      • Replace gateway_ip_addr with the gateway address for host_ip_addr.
    4. Save your changes.
    5. Restart network services.

      root@ahv# systemctl restart network.service
    6. Assign the host to a VLAN. For information about how to add a host to a VLAN, see Assigning an AHV Host to a VLAN.
    7. Verify network connectivity by pinging the gateway, other CVMs, and AHV hosts.
  2. Log on to the Controller VM that is running on the AHV host whose IP address you changed and restart genesis.
    nutanix@cvm$ genesis restart

    If the restart is successful, output similar to the following is displayed:

    Stopping Genesis pids [1933, 30217, 30218, 30219, 30241]
    Genesis started on pids [30378, 30379, 30380, 30381, 30403]

    See Controller VM Access for information about how to log on to a Controller VM.

    Genesis takes a few minutes to restart.

  3. Verify if the IP address of the hypervisor host has changed. Run the following nCLI command from any CVM other than the one in the maintenance mode.
    nutanix@cvm$ ncli host list 

    An output similar to the following is displayed:

    nutanix@cvm$ ncli host list 
        Id                        : aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee::1234  
        Uuid                      : ffffffff-gggg-hhhh-iiii-jjjjjjjjjjj 
        Name                      : XXXXXXXXXXX-X 
        IPMI Address              : X.X.Z.3 
        Controller VM Address     : X.X.X.1 
        Hypervisor Address        : X.X.Y.4 <- New IP Address 
    ... 
  4. Stop the Acropolis service on all the CVMs.
    1. Stop the Acropolis service on all the CVMs in the cluster.
      nutanix@cvm$ allssh genesis stop acropolis
      Note: You cannot manage your guest VMs after the Acropolis service is stopped.
    2. Verify if the Acropolis service is DOWN on all the CVMs, except the one in the maintenance mode.
      nutanix@cvm$ cluster status | grep -v UP 

      An output similar to the following is displayed:

      nutanix@cvm$ cluster status | grep -v UP 
      
      2019-09-04 14:43:18 INFO zookeeper_session.py:143 cluster is attempting to connect to Zookeeper 
      
      2019-09-04 14:43:18 INFO cluster:2774 Executing action status on SVMs X.X.X.1, X.X.X.2, X.X.X.3 
      
      The state of the cluster: start 
      
      Lockdown mode: Disabled 
              CVM: X.X.X.1 Up 
                                 Acropolis DOWN       [] 
              CVM: X.X.X.2 Up, ZeusLeader 
                                 Acropolis DOWN       [] 
              CVM: X.X.X.3 Maintenance
  5. From any CVM in the cluster, start the Acropolis service.
    nutanix@cvm$ cluster start 
  6. Verify if all processes on all the CVMs, except the one in the maintenance mode, are in the UP state.
    nutanix@cvm$ cluster status | grep -v UP 
  7. Exit the AHV host and the CVM from the maintenance mode.
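
The host IP address, netmask, and gateway entries edited in the ifcfg-br0 step of this procedure must agree with each other: the gateway has to be in the same subnet as the host address. The following Python sketch checks that and prints the block in the format shown in that step. The function name render_ifcfg_br0 and the sample addresses are illustrative assumptions and are not part of AHV; the sketch only formats text and does not modify any file on the host.

```python
import ipaddress


def render_ifcfg_br0(host_ip: str, subnet_mask: str, gateway_ip: str) -> str:
    """Return the ifcfg-br0 entries after a basic consistency check.

    Hypothetical helper for illustration only: it formats text and never
    touches /etc/sysconfig/network-scripts/ifcfg-br0 itself.
    """
    # strict=False accepts a host address and derives the enclosing network.
    network = ipaddress.ip_network(f"{host_ip}/{subnet_mask}", strict=False)
    if ipaddress.ip_address(gateway_ip) not in network:
        raise ValueError(f"gateway {gateway_ip} is outside {network}")
    entries = [
        'ONBOOT="yes"',
        'NM_CONTROLLED="no"',
        "PERSISTENT_DHCLIENT=1",
        f'NETMASK="{subnet_mask}"',
        f'IPADDR="{host_ip}"',
        'DEVICE="br0"',
        'TYPE="ethernet"',
        f'GATEWAY="{gateway_ip}"',
        'BOOTPROTO="none"',
    ]
    return "\n".join(entries)


# Sample addresses are placeholders, not values from any real cluster.
print(render_ifcfg_br0("10.1.1.25", "255.255.255.0", "10.1.1.1"))
```

A gateway outside the subnet (for example 10.1.2.1 with the values above) raises ValueError, which is a cheap way to catch a typing mistake before you restart network services.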

Virtual Machine Management

The following topics describe various aspects of virtual machine management in an AHV cluster.

Supported Guest VM Types for AHV

The compatibility matrix available on the Nutanix Support portal includes the latest supported AHV guest VM OSes.

AHV Configuration Maximums

The Nutanix configuration maximums available on the Nutanix Support portal include all the latest configuration limits applicable to AHV. Select the appropriate AHV version to view version-specific information.

Creating a VM (AHV)

In AHV clusters, you can create a new virtual machine (VM) through the Prism Element web console.

About this task

Note: Use Prism Central to create a VM with the memory overcommit feature enabled. Prism Element web console does not allow you to enable memory overcommit while creating a VM. If you create a VM using the Prism Element web console and want to enable memory overcommit for it, update the VM using Prism Central and enable memory overcommit in the Update VM page in Prism Central.

When creating a VM, you can configure all of its components, such as number of vCPUs and memory, but you cannot attach a volume group to the VM. Attaching a volume group is possible only when you are modifying a VM.

To create a VM, do the following:

Procedure

  1. In the VM dashboard, click the Create VM button.
    Note: This option does not appear in clusters that do not support this feature.
    The Create VM dialog box appears.
    Figure. Create VM Dialog Box

  2. Do the following in the indicated fields:
    1. Name: Enter a name for the VM.
    2. Description (optional): Enter a description for the VM.
    3. Timezone: Select the timezone that you want the VM to use. If you are creating a Linux VM, select (UTC) UTC.
      Note:

      The RTC of Linux VMs must be in UTC, so select the UTC timezone if you are creating a Linux VM.

      Windows VMs preserve the RTC in the local timezone, so set up the Windows VM with the hardware clock pointing to the desired timezone.

    4. Use this VM as an agent VM: Select this option to make this VM an agent VM.

      You can use this option for the VMs that must be powered on before the rest of the VMs (for example, to provide network functions before the rest of the VMs are powered on on the host) and must be powered off after the rest of the VMs are powered off (for example, during maintenance mode operations). Agent VMs are never migrated to any other host in the cluster. If an HA event occurs or the host is put in maintenance mode, agent VMs are powered off and are powered on on the same host once that host comes back to a normal state.

      If an agent VM is powered off, you can manually start that agent VM on another host and the agent VM now permanently resides on the new host. The agent VM is never migrated back to the original host. Note that you cannot migrate an agent VM to another host while the agent VM is powered on.

    5. vCPU(s): Enter the number of virtual CPUs to allocate to this VM.
    6. Number of Cores per vCPU: Enter the number of cores assigned to each virtual CPU.
    7. Memory: Enter the amount of memory (in GiB) to allocate to this VM.
  3. (For GPU-enabled AHV clusters only) To configure GPU access, click Add GPU in the Graphics section, and then do the following in the Add GPU dialog box:
    Figure. Add GPU Dialog Box

    For more information, see GPU and vGPU Support.

    1. To configure GPU pass-through, in GPU Mode, click Passthrough, select the GPU that you want to allocate, and then click Add.
      If you want to allocate additional GPUs to the VM, repeat the procedure as many times as you need to. Make sure that all the allocated pass-through GPUs are on the same host. If all specified GPUs of the type that you want to allocate are in use, you can proceed to allocate the GPU to the VM, but you cannot power on the VM until a VM that is using the specified GPU type is powered off.

      For more information, see GPU and vGPU Support.

    2. To configure virtual GPU access, in GPU Mode, click virtual GPU, select a GRID license, and then select a virtual GPU profile from the list.
      Note: This option is available only if you have installed the GRID host driver on the GPU hosts in the cluster.

      For more information about the NVIDIA GRID host driver installation instructions, see the NVIDIA Grid Host Driver for Nutanix AHV Installation Guide.

      You can assign multiple virtual GPUs to a VM. A vGPU is assigned to the VM only if a vGPU is available when the VM is starting up.

      Before you add multiple vGPUs to the VM, see Multiple Virtual GPU Support and Restrictions for Multiple vGPU Support.

      Note:

      Multiple vGPUs are supported on the same VM only if you select the highest vGPU profile type.

      After you add the first vGPU, to add multiple vGPUs, see Adding Multiple vGPUs to the Same VM.

  4. Select one of the following firmware types to boot the VM.
    • Legacy BIOS: Select legacy BIOS to boot the VM with legacy BIOS firmware.
    • UEFI: Select UEFI to boot the VM with UEFI firmware. UEFI firmware supports larger hard drives and faster boot times, and provides more security features. For more information about UEFI firmware, see UEFI Support for VM.

    If you select UEFI, you can enable the following features:

    • Secure Boot: Select this option to enable UEFI secure boot policies for your guest VMs. For more information about Secure Boot, see Secure Boot Support for VMs.
    • Windows Defender Credential Guard: Select this option to enable the Windows Defender Credential Guard feature of Microsoft Windows operating systems that allows you to securely isolate user credentials from the rest of the operating system. Follow the detailed instructions described in Windows Defender Credential Guard Support in AHV to enable this feature.
      Note: To add virtual TPM, see Creating AHV VMs with vTPM (aCLI).
  5. To attach a disk to the VM, click the Add New Disk button.
    The Add Disk dialog box appears.
    Figure. Add Disk Dialog Box

    Do the following in the indicated fields:
    1. Type: Select the type of storage device, DISK or CD-ROM, from the drop-down list.
      The following fields and options vary depending on whether you choose DISK or CD-ROM.
    2. Operation: Specify the device contents from the drop-down list.
      • Select Clone from ADSF file to copy any file from the cluster that can be used as an image onto the disk.
      • Select Empty CD-ROM to create a blank CD-ROM device. (This option appears only when CD-ROM is selected in the previous field.) A CD-ROM device is needed when you intend to provide a system image from CD-ROM.
      • Select Allocate on Storage Container to allocate space without specifying an image. (This option appears only when DISK is selected in the previous field.) Selecting this option means you are allocating space only. You have to provide a system image later from a CD-ROM or other source.
      • Select Clone from Image Service to copy an image that you have imported by using the image service feature onto the disk. For more information about the Image Service feature, see Configuring Images and Image Management in the Prism Self Service Administration Guide.
    3. Bus Type: Select the bus type from the drop-down list.

      The options displayed in the Bus Type drop-down list vary based on the storage device type selected in the Type field.

      • For device Disk, select from SCSI, SATA, PCI, or IDE bus type.
      • For device CD-ROM, you can select either IDE or SATA bus type.
      Note: SCSI bus is the preferred bus type and it is used in most cases. Ensure you have installed the VirtIO drivers in the guest OS.
      Caution: Use SATA, PCI, or IDE for compatibility purposes when the guest OS does not have VirtIO drivers to support SCSI devices. This may have performance implications.
      Note: For AHV 5.16 and later, you cannot use an IDE device if Secure Boot is enabled for UEFI Mode boot configuration.
    4. ADSF Path: Enter the path to the desired system image.
      This field appears only when Clone from ADSF file is selected. It specifies the image to copy. Enter the path name as /storage_container_name/iso_name.iso. For example, to clone an image from myos.iso in a storage container named crt1, enter /crt1/myos.iso. When you type the storage container name (/storage_container_name/), a list of the ISO files in that storage container appears (assuming one or more ISO files had previously been copied to that storage container).
    5. Image: Select the image that you have created by using the image service feature.
      This field appears only when Clone from Image Service is selected. It specifies the image to copy.
    6. Storage Container: Select the storage container to use from the drop-down list.
      This field appears only when Allocate on Storage Container is selected. The list includes all storage containers created for this cluster.
    7. Size: Enter the disk size in GiB.
    8. Index: Displays Next Available by default.
    9. When all the field entries are correct, click the Add button to attach the disk to the VM and return to the Create VM dialog box.
    10. Repeat this step to attach additional devices to the VM.
  6. To create a network interface for the VM, click the Add New NIC button.
    Prism console displays the Create NIC dialog box.
    Note: To create or update a SPAN destination type VM or vNIC, use the command line interface. Prism does not support SPAN destination type configurations. See Switch Port ANalyzer on AHV Hosts.

    Figure. Create NIC Dialog Box

    Do the following in the indicated fields:
    1. Subnet Name: Select the target virtual LAN from the drop-down list.
      The list includes all defined networks (see Network Configuration For VM Interfaces).
      Note: Selecting an IPAM-enabled subnet from the drop-down list displays the Private IP Assignment information, which shows the number of free IP addresses available in the subnet and in the IP pool.
    2. Network Connection State: Select the state that you want the network connection to be in after VM creation. The options are Connected or Disconnected.
    3. Private IP Assignment: This is a read-only field and displays the following:
      • Network Address/Prefix: The network IP address and prefix.
      • Free IPs (Subnet): The number of free IP addresses in the subnet.
      • Free IPs (Pool): The number of free IP addresses available in the IP pools for the subnet.
    4. Assignment Type: This field applies to IPAM-enabled networks. Select Assign with DHCP to assign an IP address automatically to the VM using DHCP. For more information, see IP Address Management.
    5. When all the field entries are correct, click the Add button to create a network interface for the VM and return to the Create VM dialog box.
    6. Repeat this step to create additional network interfaces for the VM.
    Note: Nutanix guarantees a unique VM MAC address within a cluster. However, two VMs in different clusters can have the same MAC address.
    Note: The Acropolis leader generates the MAC address for the VM on AHV. The first 24 bits of the MAC address are set to 50-6b-8d (0101 0000 0110 1011 1000 1101) and are reserved by Nutanix, the 25th bit is set to 1 (reserved by the Acropolis leader), and the 26th to 48th bits are auto-generated random numbers.
  7. To configure affinity policy for this VM, click Set Affinity.
    The Set VM Host Affinity dialog box appears.
    1. Select the host or hosts on which you want to configure the affinity for this VM.
    2. Click Save.
      The selected host or hosts are listed. This configuration is permanent and takes effect once the VM starts. The VM is not moved from this host or hosts even in case of an HA event.
  8. To customize the VM by using Cloud-init (for Linux VMs) or Sysprep (for Windows VMs), select the Custom Script check box.
    Fields required for configuring Cloud-init and Sysprep, such as options for specifying a configuration script or answer file and text boxes for specifying paths to required files, appear below the check box.
    Figure. Create VM Dialog Box (custom script fields)

  9. To specify a user data file (Linux VMs) or answer file (Windows VMs) for unattended provisioning, do one of the following:
    • If you uploaded the file to a storage container on the cluster, click ADSF path, and then enter the path to the file.

      Enter the ADSF prefix (adsf://) followed by the absolute path to the file. For example, if the user data is in /home/my_dir/cloud.cfg, enter adsf:///home/my_dir/cloud.cfg. Note the use of three slashes.

    • If the file is available on your local computer, click Upload a file, click Choose File, and then upload the file.
    • If you want to create or paste the contents of the file, click Type or paste script, and then use the text box that is provided.
  10. To copy one or more files to a location on the VM (Linux VMs) or to a location in the ISO file (Windows VMs) during initialization, do the following:
    1. In Source File ADSF Path, enter the absolute path to the file.
    2. In Destination Path in VM, enter the absolute path to the target directory and the file name.
      For example, if the source file entry is /home/my_dir/myfile.txt, then the entry for the Destination Path in VM should be /<directory_name>/<copy_destination>, for example /mnt/myfile.txt.
    3. To add another file or directory, click the button beside the destination path field. In the new row that appears, specify the source and target details.
  11. When all the field entries are correct, click the Save button to create the VM and close the Create VM dialog box.
    The new VM appears in the VM table view.
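
The MAC address layout described in the note under the NIC step (a fixed 24-bit prefix of 50-6b-8d, the 25th bit set to 1, and 23 random bits) can be made concrete with a short Python sketch. This is not how the Acropolis leader is implemented; it only builds and checks an address with the documented bit layout. The names NUTANIX_OUI, sketch_ahv_mac, and matches_documented_layout are hypothetical.

```python
import secrets

# First 24 bits of the MAC address, as stated in the note (50-6b-8d).
NUTANIX_OUI = 0x506B8D


def sketch_ahv_mac() -> str:
    """Build a MAC address with the documented layout (illustration only)."""
    random_low_bits = secrets.randbits(23)  # bits 26 to 48
    # Bit 25 is the most significant bit of the fourth octet.
    value = (NUTANIX_OUI << 24) | (1 << 23) | random_low_bits
    return "-".join(f"{octet:02x}" for octet in value.to_bytes(6, "big"))


def matches_documented_layout(mac: str) -> bool:
    """Check the 24-bit prefix and the 25th bit of a MAC address string."""
    parts = mac.replace(":", "-").split("-")
    octets = bytes(int(part, 16) for part in parts)
    return (
        len(octets) == 6
        and octets[:3] == bytes([0x50, 0x6B, 0x8D])
        and bool(octets[3] & 0x80)
    )


print(format(NUTANIX_OUI, "024b"))  # prints 010100000110101110001101
print(matches_documented_layout(sketch_ahv_mac()))  # prints True
```

Because the 25th bit is always 1, the fourth octet of such an address is always in the range 80 to ff, which gives a quick visual check when reading a MAC address in the VM details.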

Managing a VM (AHV)

You can use the web console to manage virtual machines (VMs) in AHV managed clusters.

About this task

Note: Use Prism Central to update a VM if you want to enable memory overcommit for it. Prism Element web console does not allow you to enable memory overcommit while updating a VM. You can enable memory overcommit in the Update VM page in Prism Central.

After creating a VM (see Creating a VM (AHV)), you can use the web console to start or shut down the VM, launch a console window, update the VM configuration, take a snapshot, attach a volume group, migrate the VM, clone the VM, or delete the VM.

Note: Your available options depend on the VM status, type, and permissions. Unavailable options are grayed out.

To accomplish one or more of these tasks, do the following:

Procedure

  1. In the VM dashboard, click the Table view.
  2. Select the target VM in the table (top section of screen).
    The Summary line (middle of screen) displays the VM name with a set of relevant action links on the right. You can also right-click on a VM to select a relevant action.

    The possible actions are Manage Guest Tools, Launch Console, Power on (or Power off), Take Snapshot, Migrate, Clone, Update, and Delete.

    Note: VM pause and resume feature is not supported on AHV.
    The following steps describe how to perform each action.
    Figure. VM Action Links

  3. To manage guest tools, click Manage Guest Tools.
    You can also enable NGT applications (self-service restore, Volume Snapshot Service, and application-consistent snapshots) as part of managing guest tools.
    1. Select the Enable Nutanix Guest Tools check box to enable NGT on the selected VM.
    2. Select Mount Nutanix Guest Tools to mount NGT on the selected VM.
      Ensure that the VM has at least one empty IDE CD-ROM slot to attach the ISO.
      The VM is registered with the NGT service. NGT is enabled and mounted on the selected virtual machine. A CD with volume label NUTANIX_TOOLS gets attached to the VM.
    3. To enable the self-service restore feature for Windows VMs, select the Self Service Restore (SSR) check box.
      The Self-Service Restore feature is enabled on the VM. The guest VM administrator can restore the desired file or files from the VM. For more information about the self-service restore feature, see Self-Service Restore in the Data Protection and Recovery with Prism Element guide.

    4. After you select the Enable Nutanix Guest Tools check box, the VSS snapshot feature is enabled by default.
      After this feature is enabled, Nutanix native in-guest VmQuiesced Snapshot Service (VSS) agent takes snapshots for VMs that support VSS.
      Note:

      The AHV VM snapshots are not application consistent. The AHV snapshots are taken from the VM entity menu by selecting a VM and clicking Take Snapshot.

      The application consistent snapshots feature is available with Protection Domain based snapshots and Recovery Points in Prism Central. For more information, see Conditions for Application-consistent Snapshots in the Data Protection and Recovery with Prism Element guide.

    5. Click Submit.
      The VM is registered with the NGT service. NGT is enabled and mounted on the selected virtual machine. A CD with volume label NUTANIX_TOOLS gets attached to the VM.
      Note:
      If you eject the CD, you can mount the CD back again by logging into the Controller VM and running the following nCLI command.
      nutanix@cvm$ ncli ngt mount vm-id=virtual_machine_id

      For example, to mount the NGT on the VM with VM_ID=00051a34-066f-72ed-0000-000000005400::38dc7bf2-a345-4e52-9af6-c1601e759987, type the following command.

      nutanix@cvm$ ncli ngt mount vm-id=00051a34-066f-72ed-0000-000000005400::38dc7bf2-a345-4e52-9af6-c1601e759987
  4. To launch a console window, click the Launch Console action link.
    This opens a Virtual Network Computing (VNC) client and displays the console in a new tab or window. This option is available only when the VM is powered on. The console window includes four menu options (top right):
    • Clicking the Mount ISO button displays the following window that allows you to mount an ISO image to the VM. To mount an image, select the desired image and CD-ROM drive from the drop-down lists and then click the Mount button.
      Figure. Mount Disk Image Window

      Note: For information about how to select CD-ROM as the storage device when you intend to provide a system image from CD-ROM, see Add New Disk in Creating a VM (AHV).
    • Clicking the C-A-D icon button sends a Ctrl+Alt+Del command to the VM.
    • Clicking the camera icon button takes a screenshot of the console window.
    • Clicking the power icon button allows you to power on/off the VM. These are the same options that you can access from the Power On Actions or Power Off Actions action link below the VM table (see next step).
    Figure. Virtual Network Computing (VNC) Window

  5. To start or shut down the VM, click the Power on (or Power off) action link.

    Power on begins immediately. If you want to power off the VMs, you are prompted to select one of the following options:

    • Power Off. Hypervisor performs a hard power off action on the VM.
    • Power Cycle. Hypervisor performs a hard restart action on the VM.
    • Reset. Hypervisor performs an ACPI reset action through the BIOS on the VM.
    • Guest Shutdown. Operating system of the VM performs a graceful shutdown.
    • Guest Reboot. Operating system of the VM performs a graceful restart.
    Note: If you perform power operations such as Guest Reboot or Guest Shutdown by using the Prism Element web console or API on Windows VMs, these operations might silently fail without any error messages if at that time a screen saver is running in the Windows VM. Perform the same power operations again immediately, so that they succeed.
  6. To make a snapshot of the VM, click the Take Snapshot action link.

    For more information, see Virtual Machine Snapshots.

  7. To migrate the VM to another host, click the Migrate action link.
    This displays the Migrate VM dialog box. Select the target host from the drop-down list (or select the System will automatically select a host option to let the system choose the host) and then click the Migrate button to start the migration.
    Figure. Migrate VM Dialog Box

    Note: Nutanix recommends that you live migrate VMs when they are under light load. If they are migrated while heavily utilized, migration may fail because of limited bandwidth.
  8. To clone the VM, click the Clone action link.

    This displays the Clone VM dialog box, which includes the same fields as the Create VM dialog box. A cloned VM inherits most of the configurations (except the name) of the source VM. Enter a name for the clone and then click the Save button to create the clone. You can optionally override some of the configurations before clicking the Save button. For example, you can override the number of vCPUs, memory size, boot priority, NICs, or the guest customization.

    Note:
    • You can clone up to 250 VMs at a time.
    • You cannot override the secure boot setting while cloning a VM, unless the source VM already had secure boot setting enabled.

    Figure. Clone VM Window

  9. To modify the VM configuration, click the Update action link.

    The Update VM dialog box appears, which includes the same fields as the Create VM dialog box. Modify the configuration as needed, and then save the configuration. In addition to modifying the configuration, you can attach a volume group to the VM and enable flash mode on the VM. If you attach a volume group to a VM that is part of a protection domain, the VM is not protected automatically. Add the VM to the same Consistency Group manually.

    (For GPU-enabled AHV clusters only) You can add pass-through GPUs if a VM is already using GPU pass-through. You can also change the GPU configuration from pass-through to vGPU or vGPU to pass-through, change the vGPU profile, add more vGPUs, and change the specified vGPU license. However, you need to power off the VM before you perform these operations.

    • Before you add multiple vGPUs to the VM, see Multiple Virtual GPU Support and Restrictions for Multiple vGPU Support in the AHV Administration Guide.

    • Multiple vGPUs are supported on the same VM only if you select the highest vGPU profile type.

    • For more information on vGPU profile selection, see:

      • Virtual GPU Types for Supported GPUs in the NVIDIA Virtual GPU Software User Guide in the NVIDIA's Virtual GPU Software Documentation webpage, and

      • GPU and vGPU Support in the AHV Administration Guide.

    • After you add the first vGPU, to add multiple vGPUs, see Adding Multiple vGPUs to the Same VM in the AHV Administration Guide.

    You can add new network adapters or NICs using the Add New NIC option. You can also modify the network used by an existing NIC. See Limitation for vNIC Hot-Unplugging and Creating a VM (AHV) before you modify the NIC network or create a new NIC for a VM.

    Note: To create or update a SPAN destination type VM or vNIC, use the command line interface. Prism does not support SPAN destination type configurations. See Switch Port ANalyzer on AHV Hosts.

    Figure. VM Update Dialog Box

    Note: If you delete a vDisk attached to a VM and snapshots associated with this VM exist, space associated with that vDisk is not reclaimed unless you also delete the VM snapshots.
    To increase the memory allocation and the number of vCPUs on your VMs while the VMs are powered on (hot-pluggable), do the following:
    1. In the vCPUs field, you can increase the number of vCPUs on your VMs while the VMs are powered on.
    2. In the Number of Cores Per vCPU field, you can change the number of cores per vCPU only if the VMs are powered off.
      Note: This is not a hot-pluggable feature.
    3. In the Memory field, you can increase the memory allocation on your VMs while the VMs are powered on.
    For more information about hot-pluggable vCPUs and memory, see Virtual Machine Memory and CPU Hot-Plug Configurations in the AHV Administration Guide.
    To attach a volume group to the VM, do the following:
    1. In the Volume Groups section, click Add volume group, and then do one of the following:
      • From the Available Volume Groups list, select the volume group that you want to attach to the VM.
      • Click Create new volume group, and then, in the Create Volume Group dialog box, create a volume group (see Creating a Volume Group). After you create a volume group, select it from the Available Volume Groups list.
      Repeat these steps until you have added all the volume groups that you want to attach to the VM.
    2. Click Add.
  10. To enable flash mode on the VM, click the Enable Flash Mode check box.
    • After you enable this feature on the VM, the status is updated in the VM table view. To view the status of individual virtual disks (disks that are flashed to the SSD), click the update disk icon in the Disks pane in the Update VM window.
    • You can disable the flash mode feature for individual virtual disks. To update the flash mode for individual virtual disks, click the update disk icon in the Disks pane and deselect the Enable Flash Mode check box.
    Figure. Update VM Resources

    Figure. Update VM Resources - VM Disk Flash Mode

  11. To delete the VM, click the Delete action link. A window prompt appears; click the OK button to delete the VM.
    The deleted VM disappears from the list of VMs in the table.
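
The hot-plug rules in the Update step (vCPUs and memory can be increased while the VM is powered on, while the number of cores per vCPU can be changed only when the VM is powered off) can be summarized as a small check. The following Python sketch is illustrative only; VmShape and changes_requiring_power_off are hypothetical names, and treating a decrease in vCPUs or memory as requiring a power-off is an assumption, because the text above describes only increases as hot-pluggable.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VmShape:
    """Compute settings of a VM as shown in the Update VM dialog box."""
    vcpus: int
    cores_per_vcpu: int
    memory_gib: int


def changes_requiring_power_off(current: VmShape, desired: VmShape) -> List[str]:
    """List the fields that cannot be changed while the VM is powered on."""
    blocked = []
    # Only an increase in vCPUs is described as hot-pluggable (assumption:
    # a decrease needs the VM to be powered off).
    if desired.vcpus < current.vcpus:
        blocked.append("vcpus")
    # Cores per vCPU is not a hot-pluggable setting.
    if desired.cores_per_vcpu != current.cores_per_vcpu:
        blocked.append("cores_per_vcpu")
    # Only an increase in memory is described as hot-pluggable (same assumption).
    if desired.memory_gib < current.memory_gib:
        blocked.append("memory_gib")
    return blocked


running = VmShape(vcpus=2, cores_per_vcpu=1, memory_gib=8)
print(changes_requiring_power_off(running, VmShape(4, 1, 16)))  # prints []
print(changes_requiring_power_off(running, VmShape(2, 2, 8)))   # prints ['cores_per_vcpu']
```

An empty list means the requested change fits within what the Update step describes as possible on a powered-on VM; any listed field means the VM should be powered off first.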

Limitation for vNIC Hot-Unplugging

If you detach (hot-unplug) the vNIC for the VM with guest OS installed on it, the AOS displays the detach result as successful, but the actual detach success depends on the status of the ACPI mechanism in guest OS.

The following table describes the vNIC detach observations and workaround applicable based on guest OS response to ACPI request:

Table 1. vNIC Detach - Observations and Workaround
Detach procedure followed (both cases): vNIC detach (hot-unplug)
  • Using Prism Central: See Managing a VM (AHV) topic in Prism Central Guide.
  • Using Prism Element web console: See Managing a VM (AHV).
  • Using acli: Log on to the CVM with SSH and run the following command:

    nutanix@cvm$ acli vm.nic_delete <vm_name> <nic mac address>

    or,

    nutanix@cvm$ acli vm.nic_update <vm_name> <nic mac address> connected=false

    Replace the following attributes in the above commands:

    • <vm_name> with the name of the guest VM for which the vNIC is to be detached.
    • <nic mac address> with the vNIC MAC address that needs to be detached.

AOS behavior (both cases): vNIC detach is successful. Observe the following logs:

  Device detached successfully

Guest OS responds to ACPI request: Yes
  • Actual detach result: vNIC detach is successful.
  • Workaround: No action needed.

Guest OS responds to ACPI request: No
  • Actual detach result: vNIC detach is not successful.
  • Workaround: Power cycle the VM for successful vNIC detach.
    Note: In most cases, it is observed that the ACPI mechanism failure occurs when no guest OS is installed on the VM.
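
When you detach vNICs for several VMs, it can help to generate the two acli command forms listed above rather than typing them by hand. The following Python sketch only builds the command strings; it does not run them or contact a cluster. The function name nic_detach_commands and the sample VM name and MAC address are hypothetical, and the quoting assumes the commands are pasted into a shell on the CVM.

```python
import shlex


def nic_detach_commands(vm_name: str, nic_mac: str) -> dict:
    """Return the two acli command lines for detaching a vNIC.

    Illustration only: the strings mirror the command forms shown above and
    are quoted for a shell so that VM names with spaces stay intact.
    """
    vm = shlex.quote(vm_name)
    mac = shlex.quote(nic_mac)
    return {
        # Removes the vNIC from the VM.
        "delete": f"acli vm.nic_delete {vm} {mac}",
        # Keeps the vNIC but marks it as disconnected.
        "disconnect": f"acli vm.nic_update {vm} {mac} connected=false",
    }


# Placeholder VM name and MAC address.
for label, command in nic_detach_commands("web01", "50:6b:8d:aa:bb:cc").items():
    print(f"{label}: {command}")
```

Whichever form you run, the guest OS still has to respond to the ACPI request for the detach to actually take effect, as described above.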

Virtual Machine Snapshots

You can generate snapshots of virtual machines (VMs) manually or automatically. Some of the purposes that VM snapshots serve are as follows:

  • Disaster recovery
  • Testing - as a safe restoration point in case something went wrong during testing.
  • Migrate VMs
  • Create multiple instances of a VM.

A snapshot is a point-in-time state of entities such as VMs and volume groups, and is used for restoration and replication of data. You can generate snapshots and store them locally or remotely. Snapshots are a mechanism to capture the delta changes that have occurred over time. Snapshots are primarily used for data protection and disaster recovery. Snapshots are not autonomous like backups, in the sense that they depend on the underlying VM infrastructure and other snapshots to restore the VM. Snapshots consume fewer resources compared to a full autonomous backup. Typically, a VM snapshot captures the following:

  • The state including the power state (for example, powered-on, powered-off, suspended) of the VMs.
  • The data includes all the files that make up the VM. This data also includes the data from disks, configurations, and devices, such as virtual network interface cards.

VM Snapshots and Snapshots for Disaster Recovery

The VM Dashboard only allows you to generate VM snapshots manually. You cannot select VMs and schedule snapshots of the VMs using the VM dashboard. The snapshots generated manually have very limited utility.

Note: These snapshots (stored locally) cannot be replicated to other sites.

You can schedule and generate snapshots as a part of the disaster recovery process using Nutanix DR solutions. AOS generates snapshots when you protect a VM with a protection domain using the Data Protection dashboard in Prism Web Console (see the Data Protection and Recovery with Prism Element guide). Similarly, Recovery Points (snapshots are called Recovery Points in Prism Central) are generated when you protect a VM with a protection policy using the Data Protection dashboard in Prism Central (see the Leap Administration Guide).

For example, in the Data Protection dashboard in Prism Web Console, you can create schedules to generate snapshots using various RPO schemes such as asynchronous replication with frequency intervals of 60 minutes or more, or NearSync replication with frequency intervals of as low as 20 seconds and up to 15 minutes. These schemes create snapshots in addition to the ones generated by the schedules. For example, asynchronous replication schedules generate snapshots according to the configured schedule and, in addition, an extra snapshot every 6 hours. Similarly, NearSync generates snapshots according to the configured schedule and also generates one extra snapshot every hour.
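
To get a feel for the numbers, the following Python sketch classifies a schedule interval using the ranges given above and counts the snapshots generated per day. It is a simplified arithmetic illustration under stated assumptions: the function and constant names are hypothetical, extra snapshots are assumed never to coincide with scheduled ones, and retention and expiry are ignored.

```python
DAY_S = 24 * 60 * 60

ASYNC_MIN_S = 60 * 60        # asynchronous: 60 minutes or more
NEARSYNC_MIN_S = 20          # NearSync: as low as 20 seconds
NEARSYNC_MAX_S = 15 * 60     # NearSync: up to 15 minutes

# Interval of the extra snapshot generated in addition to the schedule.
EXTRA_INTERVAL_S = {"async": 6 * 60 * 60, "nearsync": 60 * 60}


def classify_rpo(interval_s: int) -> str:
    """Map a schedule interval in seconds to the scheme described above."""
    if interval_s >= ASYNC_MIN_S:
        return "async"
    if NEARSYNC_MIN_S <= interval_s <= NEARSYNC_MAX_S:
        return "nearsync"
    raise ValueError(f"{interval_s} s is outside the ranges described above")


def snapshots_per_day(interval_s: int):
    """Return (scheme, scheduled snapshots per day, extra snapshots per day)."""
    scheme = classify_rpo(interval_s)
    scheduled = DAY_S // interval_s
    extra = DAY_S // EXTRA_INTERVAL_S[scheme]
    return scheme, scheduled, extra


print(snapshots_per_day(60 * 60))  # prints ('async', 24, 4)
print(snapshots_per_day(15 * 60))  # prints ('nearsync', 96, 24)
```

With an hourly asynchronous schedule, the extra 6-hour snapshots add 4 per day to the 24 scheduled ones; with a 15-minute NearSync schedule, the hourly extras add 24 to the 96 scheduled ones.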

Similarly, you can use the options in the Data Protection section of Prism Central to generate Recovery Points using the same RPO schemes.

Windows VM Provisioning

Nutanix VirtIO for Windows

Nutanix VirtIO is a collection of drivers for paravirtual devices that enhance the stability and performance of virtual machines on AHV.

Nutanix VirtIO is available in two formats:

  • To install Windows in a VM on AHV, use the VirtIO ISO.
  • To update VirtIO for Windows, use the VirtIO MSI installer file.

Use Nutanix Guest Tools (NGT) to install the Nutanix VirtIO package. For more information about installing the Nutanix VirtIO package using NGT, see NGT Installation in the Prism Web Console Guide.

VirtIO Requirements

Requirements for Nutanix VirtIO for Windows.

VirtIO supports the following operating systems:

  • Microsoft Windows server version: Windows 2008 R2 or later
  • Microsoft Windows client version: Windows 7 or later
Note: On Windows 7 and Windows Server 2008 R2, install Microsoft KB3033929 or update the operating system with the latest Windows Update to enable support for SHA2 certificates.
Caution: The VirtIO installation or upgrade may fail if multiple Windows VSS snapshots are present in the guest VM. The installation or upgrade failure is due to the timeout that occurs during installation of Nutanix VirtIO SCSI pass-through controller driver.

To avoid this failure, clean up the VSS snapshots or temporarily disconnect the drive that contains the snapshots. Ensure that you delete only the snapshots that are no longer needed. For more information about how to identify a VirtIO installation or upgrade failure caused by multiple Windows VSS snapshots, see KB-12374.
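The operating system requirements above can be expressed as a small check. The following Python sketch is illustrative only: the function name is hypothetical, and the mapping of both Windows 7 and Windows Server 2008 R2 to Windows NT version 6.1 comes from general Windows versioning rather than from this guide.

```python
def virtio_os_check(major: int, minor: int) -> dict:
    """Check a Windows NT version against the VirtIO requirements (sketch).

    Assumption: Windows 7 and Windows Server 2008 R2 both report NT 6.1,
    so "Windows 7 or later" and "Windows 2008 R2 or later" become >= 6.1.
    """
    version = (major, minor)
    return {
        # Supported: Windows 7 / Windows Server 2008 R2 or later.
        "supported": version >= (6, 1),
        # On Windows 7 and Windows Server 2008 R2, KB3033929 (or the latest
        # Windows Update) is needed for SHA2 certificate support.
        "needs_sha2_update": version == (6, 1),
    }
```

For example, `virtio_os_check(6, 1)` reports a supported system that still needs the SHA2 update, while `virtio_os_check(10, 0)` reports a supported system without that extra step.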

Installing or Upgrading Nutanix VirtIO for Windows

Download Nutanix VirtIO and the Nutanix VirtIO Microsoft installer (MSI). The MSI installs and upgrades the Nutanix VirtIO drivers.

Before you begin

Make sure that your system meets the VirtIO requirements described in VirtIO Requirements.

About this task

If you have already installed Nutanix VirtIO, use the following procedure to upgrade VirtIO to the latest version. If you have not yet installed Nutanix VirtIO, use the following procedure to install Nutanix VirtIO.

Procedure

  1. Go to the Nutanix Support portal, select Downloads > AHV, and click VirtIO.
  2. Select the appropriate VirtIO package.
    • If you are creating a new Windows VM, download the ISO file. The installer is available on the ISO if your VM does not have Internet access.
    • If you are updating drivers in a Windows VM, download the MSI installer file.
    Figure. Search filter and VirtIO options. Use the filter to search for the latest VirtIO package, ISO or MSI.

  3. Run the selected package.
    • For the ISO: Upload the ISO to the cluster, as described in the Configuring Images topic in Prism Web Console Guide.
    • For the MSI: Open the downloaded file to run the MSI.
  4. Read and accept the Nutanix VirtIO license agreement. Click Install.
    Figure. Nutanix VirtIO Windows Setup Wizard. Accept the License Agreement for Nutanix VirtIO Windows Installer.

    The Nutanix VirtIO setup wizard shows a status bar and completes installation.

Manually Installing or Upgrading Nutanix VirtIO

Manually install or upgrade Nutanix VirtIO.

Before you begin

Make sure that your system meets the VirtIO requirements described in VirtIO Requirements.

About this task

Note: To automatically install Nutanix VirtIO, see Installing or Upgrading Nutanix VirtIO for Windows.

If you have already installed Nutanix VirtIO, use the following procedure to upgrade VirtIO to the latest version. If you have not yet installed Nutanix VirtIO, use the following procedure to install Nutanix VirtIO.

Procedure

  1. Go to the Nutanix Support portal, select Downloads > AHV, and click VirtIO.
  2. Do one of the following:
    • Extract the VirtIO ISO into the same VM where you load Nutanix VirtIO, for easier installation.

      If you choose this option, proceed directly to step 7.

    • Download the VirtIO ISO for Windows to your local machine.

      If you choose this option, proceed to step 3.

  3. Upload the ISO to the cluster, as described in the Configuring Images topic of Prism Web Console Guide.
  4. Locate the VM where you want to install the Nutanix VirtIO ISO and update the VM.
  5. Add the Nutanix VirtIO ISO by clicking Add New Disk and complete the indicated fields.
    • TYPE: CD-ROM
    • OPERATION: CLONE FROM IMAGE SERVICE
    • BUS TYPE: IDE
    • IMAGE: Select the Nutanix VirtIO ISO
  6. Click Add.
  7. Log on to the VM and browse to Control Panel > Device Manager.
  8. Open the devices and select the specific Nutanix drivers to update. For each device, right-click the device, select Update Driver Software, and browse to the drive containing the VirtIO ISO. Follow the wizard instructions until you receive installation confirmation.
    Note: Select the x86 subdirectory for 32-bit Windows, or the amd64 subdirectory for 64-bit Windows.
    1. System Devices > Nutanix VirtIO Balloon Drivers
    2. Network Adapter > Nutanix VirtIO Ethernet Adapter.
    3. Processors > Storage Controllers > Nutanix VirtIO SCSI pass through Controller
      The Nutanix VirtIO SCSI pass-through controller prompts you to restart your system. Restart at any time to install the controller.
      Figure. List of Nutanix VirtIO downloads. This image lists the Nutanix VirtIO downloads required for 32-bit Windows.
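The driver selection in the last step can be summarized as a checklist. The following Python sketch is a hypothetical helper, not a Nutanix tool: it only maps the Windows bitness to the subdirectory named in the procedure and lists the devices to update.

```python
from typing import List

# Device names as listed in the manual procedure.
DEVICES = [
    "Nutanix VirtIO Balloon Drivers",
    "Nutanix VirtIO Ethernet Adapter",
    "Nutanix VirtIO SCSI pass through Controller",
]


def driver_subdir(bits: int) -> str:
    """Return the driver subdirectory for 32-bit or 64-bit Windows."""
    if bits == 64:
        return "amd64"
    if bits == 32:
        return "x86"
    raise ValueError("bits must be 32 or 64")


def update_checklist(bits: int) -> List[str]:
    """Build one reminder line per device (illustrative wording only)."""
    subdir = driver_subdir(bits)
    return [f"Update '{name}' using the {subdir} subdirectory" for name in DEVICES]
```

Calling `update_checklist(64)` returns three reminder lines that all point to the amd64 subdirectory.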

Creating a Windows VM on AHV with Nutanix VirtIO

Create a Windows VM in AHV, or migrate a Windows VM from a non-Nutanix source to AHV, with the Nutanix VirtIO drivers.

Before you begin

  • Upload the Windows installer ISO to your cluster as described in the Configuring Images topic in Web Console Guide.
  • Upload the Nutanix VirtIO ISO to your cluster as described in the Configuring Images topic in Web Console Guide.

About this task

To install a new or migrated Windows VM with Nutanix VirtIO, complete the following.

Procedure

  1. Log on to the Prism web console using your Nutanix credentials.
  2. At the top-left corner, click Home > VM.
    The VM page appears.
  3. Click + Create VM in the corner of the page.
    The Create VM dialog box appears.
    Figure. Create VM dialog box

  4. Complete the indicated fields.
    1. NAME: Enter a name for the VM.
    2. Description (optional): Enter a description for the VM.
    3. Timezone: Select the timezone that you want the VM to use. If you are creating a Linux VM, select (UTC) UTC.
      Note:

      The RTC of Linux VMs must be in UTC, so select the UTC timezone if you are creating a Linux VM.

      Windows VMs preserve the RTC in the local timezone, so set up the Windows VM with the hardware clock pointing to the desired timezone.

    4. Number of Cores per vCPU: Enter the number of cores assigned to each virtual CPU.
    5. MEMORY: Enter the amount of memory for the VM (in GiB).
  5. If you are creating a Windows VM, add a Windows CD-ROM to the VM.
    1. Click the pencil icon next to the CD-ROM that is already present and fill out the indicated fields.
      • OPERATION: CLONE FROM IMAGE SERVICE
      • BUS TYPE: IDE
      • IMAGE: Select the Windows OS install ISO.
    2. Click Update.
      The current CD-ROM opens in a new window.
  6. Add the Nutanix VirtIO ISO.
    1. Click Add New Disk and complete the indicated fields.
      • TYPE: CD-ROM
      • OPERATION: CLONE FROM IMAGE SERVICE
      • BUS TYPE: IDE
      • IMAGE: Select the Nutanix VirtIO ISO.
    2. Click Add.
  7. Add a new disk for the hard drive.
    1. Click Add New Disk and complete the indicated fields.
      • TYPE: DISK
      • OPERATION: ALLOCATE ON STORAGE CONTAINER
      • BUS TYPE: SCSI
      • STORAGE CONTAINER: Select the appropriate storage container.
      • SIZE: Enter the number for the size of the hard drive (in GiB).
    2. Click Add to add the disk driver.
  8. If you are migrating a VM, create a disk from the disk image.
    1. Click Add New Disk and complete the indicated fields.
      • TYPE: DISK
      • OPERATION: CLONE FROM IMAGE
      • BUS TYPE: SCSI
      • CLONE FROM IMAGE SERVICE: Click the drop-down menu and choose the image you created previously.
    2. Click Add to add the disk driver.
  9. Optionally, after you have migrated or created a VM, add a network interface card (NIC).
    1. Click Add New NIC.
    2. In the VLAN ID field, choose the VLAN ID according to network requirements and enter the IP address, if necessary.
    3. Click Add.
  10. Click Save.
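The disk layout that this procedure builds for a new Windows VM (two IDE CD-ROMs and one SCSI hard disk) can be checked before you start the installation. The following Python sketch uses a hypothetical `Disk` record whose fields mirror the TYPE, OPERATION, BUS TYPE, and IMAGE fields of the dialog box; it does not call any Nutanix API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Disk:
    """Hypothetical record that mirrors the Add New Disk fields."""
    type: str        # "CD-ROM" or "DISK"
    operation: str   # for example "CLONE FROM IMAGE SERVICE"
    bus_type: str    # "IDE" or "SCSI"
    image: str = ""  # image name when cloned from the image service


def missing_for_windows_install(disks: List[Disk], windows_iso: str,
                                virtio_iso: str) -> List[str]:
    """Return the items from the procedure that the disk list lacks."""
    ide_cdrom_images = {d.image for d in disks
                        if d.type == "CD-ROM" and d.bus_type == "IDE"}
    missing = []
    if windows_iso not in ide_cdrom_images:
        missing.append("Windows installer ISO CD-ROM (IDE)")
    if virtio_iso not in ide_cdrom_images:
        missing.append("Nutanix VirtIO ISO CD-ROM (IDE)")
    if not any(d.type == "DISK" and d.bus_type == "SCSI" for d in disks):
        missing.append("SCSI hard disk")
    return missing
```

An empty result means that the list contains all three disks that the procedure adds. The image names passed in are whatever names you used when uploading the ISOs.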

What to do next

Install Windows by following Installing Windows on a VM.

Installing Windows on a VM

Install a Windows virtual machine.

Before you begin

Create a Windows VM.

Procedure

  1. Log on to the web console.
  2. Click Home > VM to open the VM dashboard.
  3. Select the Windows VM.
  4. In the center of the VM page, click Power On.
  5. Click Launch Console.
    The Windows console opens in a new window.
  6. Select the desired language, time and currency format, and keyboard information.
  7. Click Next > Install Now.
    The Windows setup dialog box shows the operating systems to install.
  8. Select the Windows OS you want to install.
  9. Click Next and accept the license terms.
  10. Click Next > Custom: Install Windows only (advanced) > Load Driver > OK > Browse.
  11. Choose the Nutanix VirtIO driver.
    1. Select the Nutanix VirtIO CD drive.
    2. Expand the Windows OS folder and click OK.
    Figure. Select the Nutanix VirtIO drivers for your OS. Choose the driver folder.

    The Select the driver to install window appears.
  12. Select the VirtIO SCSI driver (vioscsi.inf) and click Next.
    Figure. Select the Driver for Installing Windows on a VM. Choose the VirtIO driver.

    The amd64 folder contains drivers for 64-bit operating systems. The x86 folder contains drivers for 32-bit operating systems.
    Note: From Nutanix VirtIO driver version 1.1.5, the driver package contains Windows Hardware Quality Lab (WHQL) certified driver for Windows.
  13. Select the allocated disk space for the VM and click Next.
    Windows shows the installation progress, which can take several minutes.
  14. Enter your user name and password information and click Finish.
    Installation can take several minutes.
    Once you complete the logon information, Windows setup completes installation.
  15. Follow the instructions in Installing or Upgrading Nutanix VirtIO for Windows to install other drivers which are part of Nutanix VirtIO package.

Windows Defender Credential Guard Support in AHV

AHV enables you to use the Windows Defender Credential Guard security feature on Windows guest VMs.

The Windows Defender Credential Guard feature of Microsoft Windows operating systems securely isolates user credentials from the rest of the operating system. This isolation helps protect guest VMs from credential theft attacks such as Pass-the-Hash or Pass-the-Ticket.

See the Microsoft documentation for more information about the Windows Defender Credential Guard security feature.

Windows Defender Credential Guard Architecture in AHV

Figure. Architecture

Windows Defender Credential Guard uses Microsoft virtualization-based security to isolate user credentials in the virtualization-based security (VBS) module in AHV. When you enable Windows Defender Credential Guard on an AHV guest VM, the guest VM runs on top of AHV running both the Windows OS and VBS. Each Windows OS guest VM, which has credential guard enabled, has a VBS to securely store credentials.

Windows Defender Credential Guard Requirements

Ensure the following to enable Windows Defender Credential Guard:

  1. AOS, AHV, and Windows Server versions support Windows Defender Credential Guard:
    • AOS version must be 5.19 or later
    • AHV version must be AHV 20201007.1 or later
    • Windows version must be Windows Server 2016 or later (including Windows Server 2019 or later), or Windows 10 Enterprise or later
  2. UEFI, Secure Boot, and machine type q35 are enabled in the Windows VM from AOS.

    The Prism Element workflow to enable Windows Defender Credential Guard includes the workflow to enable these features.
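The AOS and AHV version requirements in item 1 lend themselves to a simple comparison. The following Python sketch is a minimal illustration with hypothetical function names; it assumes that version strings are plain dotted numbers such as 5.19 or 20201007.1 and does not handle other formats.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version into a tuple for comparison.

    Assumption: the string contains only digits and dots.
    """
    return tuple(int(part) for part in text.split("."))


# Minimum versions stated in the requirements list.
MIN_AOS = parse_version("5.19")
MIN_AHV = parse_version("20201007.1")


def meets_credential_guard_versions(aos: str, ahv: str) -> bool:
    """Return True if both AOS and AHV meet the stated minimums."""
    return parse_version(aos) >= MIN_AOS and parse_version(ahv) >= MIN_AHV
```

Tuples compare element by element, so 5.20 correctly ranks above 5.19, which a plain string or floating-point comparison would not guarantee.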

Limitations

  • Windows Defender Credential Guard is not supported on hosts with AMD CPUs.
  • If you enable Windows Defender Credential Guard for your AHV guest VMs, the following optional configurations are not supported:

    • vTPM (Virtual Trusted Platform Modules) to store MS policies.
      Note: vTPM is supported with AOS 6.5.1 or later and AHV 20220304.242 or later release versions only.
    • DMA protection (vIOMMU).
    • Nutanix Live Migration.
    • Cross hypervisor DR of Credential Guard VMs.
Caution: Use of Windows Defender Credential Guard in your AHV clusters impacts VM performance. If you enable Windows Defender Credential Guard on AHV guest VMs, VM density drops by ~15–20%. This expected performance impact is due to nested virtualization overhead added as a result of enabling credential guard.
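To make the density impact concrete, the following Python sketch applies the approximate 15 to 20 percent drop to a baseline VM count. The function name and the baseline figures are illustrative assumptions, and the result is only a planning estimate because the guide gives the range as approximate.

```python
from typing import Tuple


def density_after_credential_guard(baseline_vms: int) -> Tuple[int, int]:
    """Return the (low, high) VM count after a 20% and a 15% density drop.

    Integer arithmetic rounds down, which keeps the estimate conservative.
    """
    low = baseline_vms * (100 - 20) // 100   # worst case: 20% drop
    high = baseline_vms * (100 - 15) // 100  # best case: 15% drop
    return low, high
```

For example, a host that runs 100 VMs without Credential Guard would be expected to run roughly 80 to 85 VMs with it enabled.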

Enabling Windows Defender Credential Guard Support in AHV Guest VMs

You can enable Windows Defender Credential Guard when you are either creating a VM or updating a VM.

About this task

Perform the following procedure to enable Windows Defender Credential Guard:

Procedure

  1. Enable Windows Defender Credential Guard when you are either creating a VM or updating a VM. Do one of the following:
    • If you are creating a VM, see step 2.
    • If you are updating a VM, see step 3.
  2. If you are creating a Windows VM, do the following:
    1. Log on to the Prism Element web console.
    2. In the VM dashboard, click Create VM.
    3. Fill in the mandatory fields to configure a VM.
    4. Under Boot Configuration, select UEFI, and then select the Secure Boot and Windows Defender Credential Guard options.
      Figure. Enable Windows Defender Credential Guard

      See UEFI Support for VM and Secure Boot Support for VMs for more information about these features.

    5. Proceed to configure other attributes for your Windows VM.
    6. Click Save.
    7. Turn on the VM.
  3. If you are updating an existing VM, do the following:
    1. Log on to the Prism Element web console.
    2. In the VM dashboard, click the Table view, select the VM, and click Update.
    3. Under Boot Configuration, select UEFI, and then select the Secure Boot and Windows Defender Credential Guard options.
      Note:

      If the VM is configured to use BIOS, install the guest OS again.

      If the VM is already configured to use UEFI, skip the step to select Secure Boot.

      See UEFI Support for VM and Secure Boot Support for VMs for more information about these features.

    4. Click Save.
    5. Turn on the VM.
  4. Enable Windows Defender Credential Guard in the Windows VM by using group policy.
    See the Enable Windows Defender Credential Guard by using the Group Policy procedure of the Manage Windows Defender Credential Guard topic in the Microsoft documentation to enable VBS, Secure Boot, and Windows Defender Credential Guard for the Windows VM.
  5. Open command prompt in the Windows VM and apply the Group Policy settings:
    > gpupdate /force

    If you have not enabled Windows Defender Credential Guard (step 4) and perform this step (step 5), a warning similar to the following is displayed:

    Updating policy...
     
    Computer Policy update has completed successfully.
     
    The following warnings were encountered during computer policy processing:
     
    Windows failed to apply the {F312195E-3D9D-447A-A3F5-08DFFA24735E} settings. {F312195E-3D9D-447A-A3F5-08DFFA24735E} settings might have its own log file. Please click on the "More information" link.
    User Policy update has completed successfully.
     
    For more detailed information, review the event log or run GPRESULT /H GPReport.html from the command line to access information about Group Policy results.
    

    Event Viewer displays a warning for the group policy with an error message that indicates Secure Boot is not enabled on the VM.

    To view the warning message in Event Viewer, do the following:

    • In the Windows VM, open Event Viewer.
    • Go to Windows Logs -> System and click the warning with the Source as GroupPolicy (Microsoft-Windows-GroupPolicy) and Event ID as 1085.
    Figure. Warning in Event Viewer

    Note: Ensure that you follow the steps in the order that is stated in this document to successfully enable Windows Defender Credential Guard.
  6. Restart the VM.
  7. Verify if Windows Defender Credential Guard is enabled in your Windows VM.
    1. Start a Windows PowerShell terminal.
    2. Run the following command.
      PS > Get-CimInstance -ClassName Win32_DeviceGuard -Namespace 'root\Microsoft\Windows\DeviceGuard'

      An output similar to the following is displayed.

      PS > Get-CimInstance -ClassName Win32_DeviceGuard -Namespace 'root\Microsoft\Windows\DeviceGuard'
      AvailableSecurityProperties              	: {1, 2, 3, 5}
      CodeIntegrityPolicyEnforcementStatus     	: 0
      InstanceIdentifier                       	: 4ff40742-2649-41b8-bdd1-e80fad1cce80
      RequiredSecurityProperties               	: {1, 2}
      SecurityServicesConfigured               	: {1}
      SecurityServicesRunning                  	: {1}
      UsermodeCodeIntegrityPolicyEnforcementStatus : 0
      Version                                  	: 1.0
      VirtualizationBasedSecurityStatus        	: 2
      PSComputerName 
      

      Confirm that both SecurityServicesConfigured and SecurityServicesRunning have the value { 1 }.

    Alternatively, you can verify if Windows Defender Credential Guard is enabled by using System Information (msinfo32):

    1. In the Windows VM, open System Information by typing msinfo32 in the search field next to the Start menu.
    2. Verify if the values of the parameters are as indicated in the following screen shot:
      Figure. Verify Windows Defender Credential Guard
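If you capture the Get-CimInstance output from the verification step as text, the check on SecurityServicesConfigured and SecurityServicesRunning can be automated. The following Python sketch is a hypothetical helper; it assumes the "Name : value" layout shown in the sample output and treats any value set that contains 1 as a pass, which is a slightly looser test than an exact match on { 1 }.

```python
import re


def credential_guard_running(output: str) -> bool:
    """Return True if both security service fields contain the value 1."""
    fields = {}
    for line in output.splitlines():
        name, sep, raw = line.partition(":")
        if sep:  # skip lines without a "Name : value" separator
            fields[name.strip()] = raw.strip()

    def contains_one(key: str) -> bool:
        # Assumption: values look like "{1}" or "{1, 2}".
        numbers = {int(n) for n in re.findall(r"\d+", fields.get(key, ""))}
        return 1 in numbers

    return (contains_one("SecurityServicesConfigured")
            and contains_one("SecurityServicesRunning"))
```

The function returns False when either field is missing, so an empty or truncated capture is reported as a failed verification.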

Windows Subsystem for Linux (WSL2) Support on AHV

AHV supports WSL2, which enables you to run a Linux environment on a Windows OS without a dedicated VM or a dual-boot environment.

For more information about WSL, see the What is the Windows Subsystem for Linux? topic in Microsoft Technical Documentation.
Note:
  • Both hardware and software support are required to enable a guest VM to communicate with its nested guest VMs in a WSL2 setup.
  • System performance can be affected in a WSL2 environment, depending on the workload and on whether the processors lack the hardware features that enhance the virtualization environment.
  • VM live migration is currently not supported for WSL. You must power off the VM during any AOS or AHV upgrades.

Limitations

The following table lists the limitations that apply to WSL2, based on the AOS and AHV versions:

Table 1. Limitations for WSL2

AOS Version: AOS 6.5.1
AHV Version: AHV 20201105.30411 (default bundled AHV with AOS 6.5.1)
Limitations for WSL2: The following optional configurations are not supported:

  • vTPM (Virtual Trusted Platform Modules) to store MS policies
  • Hosts with AMD CPUs
  • DMA protection (vIOMMU)
  • Nutanix Live Migration
  • Cross hypervisor DR of WSL2 VM

AOS Version: AOS 6.5.1 or later
AHV Version: AHV 20220304.242 or later
Limitations for WSL2: The following optional configurations are not supported:

  • Hosts with AMD CPUs
  • DMA protection (vIOMMU)
  • Nutanix Live Migration
  • Cross hypervisor DR of WSL2 VM
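The two rows of Table 1 can be read as a lookup from the AOS and AHV versions to a list of unsupported configurations. The following Python sketch encodes only those two rows; the function name is hypothetical, version strings are assumed to be plain dotted numbers, and any combination that the table does not list returns None.

```python
from typing import List, Optional, Tuple

COMMON = [
    "Hosts with AMD CPUs",
    "DMA protection (vIOMMU)",
    "Nutanix Live Migration",
    "Cross hypervisor DR of WSL2 VM",
]
VTPM = "vTPM (Virtual Trusted Platform Modules) to store MS policies"


def _v(text: str) -> Tuple[int, ...]:
    """Parse a dotted numeric version (assumption: digits and dots only)."""
    return tuple(int(part) for part in text.split("."))


def wsl2_unsupported(aos: str, ahv: str) -> Optional[List[str]]:
    """Return the unsupported configurations for a version pair."""
    aos_v, ahv_v = _v(aos), _v(ahv)
    # Second row: AOS 6.5.1 or later with AHV 20220304.242 or later.
    if aos_v >= _v("6.5.1") and ahv_v >= _v("20220304.242"):
        return list(COMMON)
    # First row: AOS 6.5.1 with its default bundled AHV.
    if aos_v == _v("6.5.1") and ahv_v == _v("20201105.30411"):
        return [VTPM] + COMMON
    return None  # combination not covered by Table 1
```

The only difference between the two rows is the vTPM entry, which appears in the list only for the bundled AHV version.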

Enabling WSL2 on AHV

This section describes how to enable WSL2 on AHV.

Before you begin

Ensure that AOS 6.5.1 or later and AHV 20220304.242 or later are deployed at your site.

About this task

Note:

In the following procedure, replace <VM_name> with the actual guest VM name.

To configure WSL2 on AHV:

Procedure

  1. Power off the guest VM on which you want to configure WSL2.
  2. Log on to any CVM in the cluster with SSH.
  3. Run the following command to enable the guest VM to support WSL2:
    nutanix@CVM~ $ acli vm.update <VM_name> hardware_virtualization=true
    Note: If you need to create a new guest VM (Windows VM) on AHV with Nutanix VirtIO, see Creating a Windows VM on AHV with Nutanix VirtIO.
  4. (Optional) Retrieve the guest VM details using the following command to check whether all the attributes are correctly set for the guest VM:
    nutanix@CVM~ $ acli vm.get <VM_name>

    Check for the following attribute in the output to verify that the infrastructure to support WSL2 is configured in the guest VM:

    hardware_virtualization: True
  5. Power on the guest VM using the following command:
    nutanix@CVM~ $ acli vm.on <VM_name>
  6. Enable WSL2 on Windows OS. For information on how to install WSL, see the Install WSL topic in Microsoft Technical Documentation.
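The acli commands in steps 3 to 5 can be assembled for a given VM name, and the vm.get output can be checked for the expected attribute. The following Python sketch only builds command strings and parses text; the function names are hypothetical, and it does not power off the VM or run anything on the CVM.

```python
import shlex
from typing import List


def wsl2_enable_commands(vm_name: str) -> List[str]:
    """Return the acli commands from steps 3 to 5, in order.

    Step 1 (powering off the guest VM) must already be done.
    """
    vm = shlex.quote(vm_name)  # quote names that contain spaces
    return [
        f"acli vm.update {vm} hardware_virtualization=true",
        f"acli vm.get {vm}",
        f"acli vm.on {vm}",
    ]


def hardware_virtualization_enabled(vm_get_output: str) -> bool:
    """Look for 'hardware_virtualization: True' in the vm.get output."""
    for line in vm_get_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "hardware_virtualization":
            return value.strip().lower() == "true"
    return False  # attribute not present in the output
```

You would run the returned commands in an SSH session on a CVM and pass the text printed by the second command to `hardware_virtualization_enabled` before powering on the VM.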

Affinity Policies for AHV

As an administrator of an AHV cluster, you can create VM-Host affinity policies for virtual machines on an AHV cluster. By defining these policies, you can control the placement of virtual machines on the hosts within a cluster.

Note: VMs with Host affinity policies can only be migrated to the hosts specified in the affinity policy. If only one host is specified, the VM cannot be migrated or started on another host during an HA event. For more information, see Non-Migratable Hosts.

You can define affinity policies for a VM at two levels:

Affinity Policies defined in Prism Element

In Prism Element, you can define affinity policies at VM level during the VM create or update operation. You can use an affinity policy to specify that a particular VM can only run on the members of the affinity host list.

Affinity Policies defined in Prism Central

In Prism Central, you can define category-based VM-Host affinity policies, where a set of VMs can be affined to run only on a particular set of hosts. Category-based affinity policy enables you to easily manage affinities for a large number of VMs.

Affinity Policies Defined in Prism Element

In Prism Element, you can define scheduling policies for virtual machines on an AHV cluster at a VM level. By defining these policies, you can control the placement of a virtual machine on specific hosts within a cluster.

You can define two types of affinity policies in Prism Element.

VM-Host Affinity Policy

The VM-host affinity policy controls the placement of a VM. You can use this policy to specify that a selected VM can only run on the members of the affinity host list. This policy checks and enforces where a VM can be hosted when you power on or migrate the VM.
Note:
  • If you apply the VM-host affinity policy, it limits Acropolis HA and Acropolis Dynamic Scheduling (ADS): because the policy is mandatory, a virtual machine cannot be powered on or migrated to a host that does not conform to the requirements of the affinity policy.
  • The VM-host anti-affinity policy is not supported.
  • VMs configured with host affinity settings retain these settings if the VM is migrated to a new cluster. Remove the VM-host affinity policies applied to a VM that you want to migrate to another cluster, as the UUID of the host is retained by the VM and it does not allow VM restart on the destination cluster. When you attempt to protect such VMs, it is successful. However, some disaster recovery operations like migration fail and attempts to power on these VMs also fail.

You can define the VM-host affinity policies by using Prism Element during the VM create or update operation. For more information, see Creating a VM (AHV).
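The effect of a mandatory VM-host affinity policy on HA can be illustrated with a small model. The following Python sketch is not how AHV schedules VMs; the function names and host names are hypothetical, and it ignores resource availability on the hosts. It only shows why a VM with a single affined host has no restart target when that host fails.

```python
from typing import Iterable, List


def eligible_hosts(affinity_hosts: Iterable[str],
                   available_hosts: Iterable[str]) -> List[str]:
    """Hosts that are both in the affinity list and currently available."""
    available = set(available_hosts)
    return [host for host in affinity_hosts if host in available]


def can_restart_after_failure(affinity_hosts: Iterable[str],
                              cluster_hosts: Iterable[str],
                              failed_host: str) -> bool:
    """Return True if an affined host remains after one host fails."""
    remaining = [host for host in cluster_hosts if host != failed_host]
    return bool(eligible_hosts(affinity_hosts, remaining))
```

With an affinity list of one host, `can_restart_after_failure` returns False when that host fails, which matches the note about VMs that cannot be started on another host during an HA event.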

VM-VM Anti-Affinity Policy

You can use this policy to specify anti-affinity between virtual machines. The VM-VM anti-affinity policy keeps the specified virtual machines apart so that a problem with one host does not affect both virtual machines. However, this is a preferential policy: it does not prevent the Acropolis Dynamic Scheduling (ADS) feature from taking necessary action in case of resource constraints.
Note: If a VM that has affinity policies configured is cloned, the policies are not automatically applied to the cloned VM. However, if a VM is restored from a DR snapshot, the policies are automatically applied to the VM.
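Because the VM-VM anti-affinity policy is preferential, VMs from the same group can still end up on one host. The following Python sketch shows one way to spot that situation from a VM-to-host mapping; the function name and the mapping are assumptions for this example, and how you obtain the current placement is outside its scope.

```python
from typing import Dict, List


def anti_affinity_violations(placement: Dict[str, str],
                             group: List[str]) -> Dict[str, List[str]]:
    """Return hosts that run more than one VM from the group.

    placement maps a VM name to the host it currently runs on.
    VMs without a placement entry (for example, powered off) are ignored.
    """
    vms_by_host: Dict[str, List[str]] = {}
    for vm in group:
        host = placement.get(vm)
        if host is not None:
            vms_by_host.setdefault(host, []).append(vm)
    return {host: vms for host, vms in vms_by_host.items() if len(vms) > 1}
```

An empty result means that every VM in the group runs on a different host.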

Limitations of Affinity Rules

Even if a host is removed from a cluster, the host UUID is not removed from the host-affinity list for a VM.

Configuring VM-VM Anti-Affinity Policy

To configure VM-VM anti-affinity policies, you must first define a group and then add all the VMs on which you want to define VM-VM anti-affinity policy.

About this task

Note: Currently, the VM-VM affinity policy is not supported.

Perform the following procedure to configure the VM-VM anti-affinity policy.

Procedure

  1. Log on to a Controller VM with SSH.
  2. Create a group.
    nutanix@cvm$ acli vm_group.create group_name

    Replace group_name with the name of the group.

  3. Add the VMs on which you want to define anti-affinity to the group.
    nutanix@cvm$ acli vm_group.add_vms group_name vm_list=vm_name

    Replace group_name with the name of the group. Replace vm_name with the name of the VMs that you want to define anti-affinity on. In case of multiple VMs, you can specify comma-separated list of VM names.

  4. Configure VM-VM anti-affinity policy.
    nutanix@cvm$ acli vm_group.antiaffinity_set group_name

    Replace group_name with the name of the group.

    After you configure the group and then power on the VMs, the VMs that are part of the group are started on different hosts where possible. However, this is a preferential policy: it does not prevent the Acropolis Dynamic Scheduling (ADS) feature from taking necessary action in case of resource constraints.
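The three acli commands in this procedure can be generated for a group and its VMs. The following Python sketch only builds the command strings shown in steps 2 to 4, including the comma-separated VM list; the function name is hypothetical, and it assumes that group and VM names contain no spaces or commas.

```python
from typing import List


def anti_affinity_commands(group_name: str, vm_names: List[str]) -> List[str]:
    """Return the acli commands that create and configure the group."""
    if not vm_names:
        raise ValueError("at least one VM name is required")
    vm_list = ",".join(vm_names)  # multiple VMs: comma-separated list
    return [
        f"acli vm_group.create {group_name}",
        f"acli vm_group.add_vms {group_name} vm_list={vm_list}",
        f"acli vm_group.antiaffinity_set {group_name}",
    ]
```

For example, `anti_affinity_commands("db_group", ["db1", "db2"])` produces the three commands to run in order from a Controller VM.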

Removing VM-VM Anti-Affinity Policy

Perform the following procedure to remove the VM-VM anti-affinity policy.

Procedure

  1. Log on to a Controller VM with SSH.
  2. Remove the VM-VM anti-affinity policy.
    nutanix@cvm$ acli vm_group.antiaffinity_unset group_name

    Replace group_name with the name of the group.

    The VM-VM anti-affinity policy is removed for the VMs that are present in the group, and they can start on any host during the next power on operation (as necessitated by the ADS feature).

Affinity Policies Defined in Prism Central

In Prism Central, you can define category-based VM-Host affinity policies, where a set of VMs can be affined to run only on a particular set of hosts. Category-based affinity policy enables you to easily manage affinities for a large number of VMs. In case of any changes to the affined hosts, you only need to update the category of the host, and it updates the affinity policy for all the affected VMs.

This policy checks and enforces where a VM can be hosted when you start or migrate the VM. If there are no resources available on any of the affined hosts, the VM does not get started.

Note:

If you create a VM-Host affinity policy for a VM that is configured for asynchronous replication, you must create similar categories and corresponding policies on the remote site as well. If you define similar categories and policies on the remote site, affinity policies will be applied when the VMs are migrated to the remote site.
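The category-based model can be pictured as a mapping from VM categories to host categories. The following Python sketch is a simplified illustration, not the Prism Central data model: the function name, the category strings, and the representation of a policy as a (VM category, host category) pair are all assumptions. It shows why changing a host's category changes the allowed hosts for every affected VM without editing the policy.

```python
from typing import Dict, List, Optional, Set, Tuple


def allowed_hosts(vm: str,
                  vm_categories: Dict[str, Set[str]],
                  host_categories: Dict[str, Set[str]],
                  policies: List[Tuple[str, str]]) -> Optional[Set[str]]:
    """Return the hosts a VM may run on, or None if no policy applies.

    Each policy is a (vm_category, host_category) pair (assumed shape).
    """
    vm_cats = vm_categories.get(vm, set())
    wanted = {host_cat for vm_cat, host_cat in policies if vm_cat in vm_cats}
    if not wanted:
        return None  # no affinity policy covers this VM
    return {host for host, cats in host_categories.items() if cats & wanted}
```

An empty set means that a policy applies but no host currently carries the required category, which corresponds to the case where the VM does not get started.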

Limitations of Affinity Policies

Affinity policies created in Prism Central have the following limitations:

  • Only a super admin can create, modify, or delete affinity policies.
  • The minimum supported versions for VM-Host affinity policies are version 6.1 for Prism Element and version 2022.1 for Prism Central.
  • You cannot apply VM-Host affinity policy on a VM that is enabled for synchronous replication. Also, you cannot enable synchronous replication on a VM that is associated with a VM-Host affinity policy.
  • Host category attachment or detachment takes around 5 minutes to get reflected in the applicable affinity policies.

    When you assign a category to a host and map the host category to the affinity policy, you can observe that the host count gets updated immediately on the Entities page.

    The following figure shows the host count on Entities page:

    Figure. Host Count - Entities Page

    However, the system takes approximately 5 minutes to update the host count on Affinity Policies page.
    Note: The delay in host count update is due to the usage of different APIs to derive the host count on Entities and Affinity Policies pages.

    The following figure shows the host count on Affinity Policies page after delay of approximately 5 minutes:

    Figure. Host Count - Affinity Policies Page

    For information about how to create a category, see Creating a Category topic in Prism Central Guide.

    For information about how to assign a category to host, see Associating Hosts with Categories.

    For information about how to create the affinity policy and map the host category to the affinity policy, see Creating an Affinity Policy.

Affinity Policy Configuration Workflow

About this task

To set up an affinity policy, do the following:

Procedure

  1. Create categories for the following entities:
    1. VMs
    2. Hosts
    For information about creating a category, see Creating a Category.
  2. Apply the VM categories to the VMs and host categories to the hosts.
    For information about associating categories with VMs, see Associating VMs with Categories topic in the Prism Central Guide (see Prism). For information about associating categories with hosts, see Associating hosts with Categories.
  3. Create the affinity policy. See Creating an Affinity Policy topic in the Prism Central Guide (see Prism).

Associating VMs with Categories

About this task

To associate categories with VMs, do the following:

Procedure