UCX Fabric Support (DAOS 2.2 Technology Preview)
DAOS 2.2 includes a technology preview of UCX support for clusters using InfiniBand, as an alternative to the default libfabric network stack.
Note
 UCX support is available on EL8 and Leap15 only. It is not supported on CentOS7.
The goal of this technology preview is to allow early evaluation and testing. DAOS over UCX has not yet been fully validated and is not recommended for production use with DAOS 2.2. Full UCX support is a roadmap item for DAOS 2.4.
Note
 The network provider is an immutable property of a DAOS system. Changing the network provider to UCX requires that the DAOS storage is reformatted.
To enable DAOS UCX support on InfiniBand fabrics, the following steps are needed:
- A supported version of MLNX_OFED must be installed before DAOS is installed. This is the same for libfabric and for UCX: DAOS only supports the NVIDIA-provided MLNX_OFED stack, not the inbox drivers. Refer to the DAOS Support Matrix for information about supported MLNX_OFED releases.
- The mercury-ucx RPM package needs to be manually selected for installation. For the technology preview, the mercury package is provided in two different versions, which are mutually exclusive:
   - The standard mercury RPM supports libfabric but not UCX. This RPM is installed by default, and must be used in non-InfiniBand environments.
   - A new mercury-ucx RPM is also provided, which supports both libfabric and UCX. This RPM must be used in InfiniBand environments when the intention is to use UCX. It may also be used in InfiniBand environments if the intention is to use libfabric. Attempts to install this RPM in non-InfiniBand environments will fail, because it has a dependency on UCX packages.
- At DAOS installation time, to enable UCX support the new mercury-ucx RPM package must be explicitly listed in order to prevent the installation of the default mercury package (which does not include UCX support). For example, using the yum package manager on EL8:
      # on DAOS_ADMIN nodes:
      yum install mercury-ucx daos-admin
      # on DAOS_SERVER nodes:
      yum install mercury-ucx daos-server
      # on DAOS_CLIENT nodes:
      yum install mercury-ucx daos-client
- To change an existing DAOS installation from libfabric to
   UCX, the default mercury RPM first needs to be uninstalled, and the mercury-ucx RPM must be installed instead. To prevent the removal of DAOS altogether (it has a package dependency on mercury), the rpm command with the --nodeps option should be used:
      # on EL8:
      rpm -e --nodeps mercury
      yum install mercury-ucx
      # on Leap15:
      rpm -e --nodeps mercury
      zypper install mercury-ucx
- To update from DAOS 2.0 (with libfabric) to DAOS 2.2 with
   UCX, the recommended path is to first perform a standard DAOS
   RPM update (which will update the default mercury package). After the update, the mercury RPM package can be replaced by mercury-ucx as described above.
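Because the two Mercury packages are mutually exclusive, it can be useful to confirm which one a node currently has before and after the swap. The following is a minimal sketch, not part of the DAOS tooling: the function name `mercury_flavor` and the `RPM_QUERY` override variable are hypothetical names introduced here for illustration, and the check relies only on the exit status of `rpm -q`.

```shell
#!/bin/sh
# Sketch: report which Mercury package flavor is installed on this node.
# RPM_QUERY is a hypothetical override (defaults to `rpm -q`) so the
# function can be exercised on machines without the DAOS packages.
mercury_flavor() {
    query="${RPM_QUERY:-rpm -q}"
    if $query mercury-ucx >/dev/null 2>&1; then
        # UCX-enabled build (also supports libfabric)
        echo "mercury-ucx"
    elif $query mercury >/dev/null 2>&1; then
        # Default build (libfabric only)
        echo "mercury"
    else
        echo "none"
    fi
}
```

Calling `mercury_flavor` on a node prints `mercury-ucx`, `mercury`, or `none`. If it prints `mercury` on an InfiniBand node that is meant to use UCX, the package replacement described above has not been performed yet.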
After UCX support has been enabled by installing the mercury-ucx
package, the network provider must be changed in the DAOS server's
configuration file (/etc/daos/daos_server.yml).
A sample YAML file is available on GitHub.
The recommended setting for UCX is provider: ucx+dc_x.
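After editing the configuration file, a quick check of the configured provider can catch a node that was missed. The sketch below assumes that `provider:` appears as an unindented, top-level key in the YAML file, as in the recommended setting above; the function name `daos_provider` is a hypothetical helper, and the parsing is a simple line match rather than a full YAML parse.

```shell
#!/bin/sh
# Sketch: print the value of the top-level `provider:` key from a DAOS
# server configuration file. The path defaults to the location named in
# this section; pass another path as the first argument to override it.
daos_provider() {
    conf="${1:-/etc/daos/daos_server.yml}"
    # Keep the first token after "provider:", dropping trailing
    # whitespace and any "#" comment on the same line.
    sed -n 's/^provider:[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$conf" | head -n 1
}
```

For a file containing the recommended setting, `daos_provider` prints `ucx+dc_x`. Any other value indicates that the server is still configured for a different provider.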