
DAOS System Administration

RAS Events

Reliability, Availability, and Serviceability (RAS) related events are communicated and logged within DAOS and syslog.

Event Structure

The following table describes the structure of a DAOS RAS event, including descriptions of mandatory and optional fields.

| Field | Optional/Mandatory | Description |
|---|---|---|
| ID | Mandatory | Unique event identifier referenced in the manual. |
| Type | Mandatory | An event of type STATE_CHANGE causes an update to the Management Service (MS) database in addition to the event being written to SYSLOG. INFO_ONLY type events are only written to SYSLOG. |
| Timestamp | Mandatory | Microsecond resolution; includes the timezone offset to avoid locality issues. |
| Severity | Mandatory | Indicates event severity, Error/Warning/Notice. |
| Msg | Mandatory | Human readable message. |
| HID | Optional | Identifies hardware components involved in the event, e.g., PCI address for SSD, network interface. |
| Rank | Optional | DAOS rank involved in the event. |
| PID | Optional | Identifier of the process involved in the RAS event. |
| TID | Optional | Identifier of the thread involved in the RAS event. |
| JOBID | Optional | Identifier of the job involved in the RAS event. |
| Hostname | Optional | Hostname of the node involved in the event. |
| PUUID | Optional | Pool UUID involved in the event, if any. |
| CUUID | Optional | Container UUID involved in the event, if relevant. |
| OID | Optional | Object identifier involved in the event, if relevant. |
| Control Operation | Optional | Recommended automatic action, if any. |
| Data | Optional | Specific instance data treated as a blob. |

Below is an example of a RAS event signaling an exclusion of an unresponsive engine:

&&& RAS EVENT id: [swim_rank_dead] ts: [2021-11-21T13:32:31.747408+0000] host: [] type: [STATE_CHANGE] sev: [NOTICE] msg: [SWIM marked rank as dead.] pid: [253454] tid: [1] rank: [6] inc: [63a058833280000]
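When scanning syslog for these events, it can help to split a line into its fields. The following is a minimal sketch that assumes every field has the `key: [value]` shape seen in the example above. The function name `parse_ras_event` is hypothetical, and a message that itself contains a `]` character (such as the layout-version messages) would be truncated by this simple pattern.

```python
import re

# Matches "key: [value]" pairs as they appear in the example event line.
FIELD_RE = re.compile(r"(\w+): \[([^\]]*)\]")
MARKER = "&&& RAS EVENT"


def parse_ras_event(line):
    """Return a dict of field name -> value, or None if this is not a RAS event."""
    start = line.find(MARKER)
    if start < 0:
        return None
    return dict(FIELD_RE.findall(line[start + len(MARKER):]))


example = (
    "&&& RAS EVENT id: [swim_rank_dead] ts: [2021-11-21T13:32:31.747408+0000] "
    "host: [] type: [STATE_CHANGE] sev: [NOTICE] msg: [SWIM marked rank as dead.] "
    "pid: [253454] tid: [1] rank: [6] inc: [63a058833280000]"
)
event = parse_ras_event(example)
print(event["id"], event["type"], event["rank"])
```

All values are returned as strings; an empty field such as `host: []` becomes an empty string.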

Event List

The following table lists supported DAOS RAS events, including IDs, type, severity, message, description, and cause.

| Event | Event type | Severity | Message | Description | Cause |
|---|---|---|---|---|---|
| engine_format_required | INFO_ONLY | NOTICE | DAOS engine <idx> requires a <type> format | Indicates engine is waiting for allocated storage to be formatted on instance <idx> with dmg tool. <type> can be either SCM or Metadata. | DAOS server attempts to bring up an engine that has unformatted storage. |
| engine_died | STATE_CHANGE | ERROR | DAOS engine <idx> exited unexpectedly: <error> | Indicates engine instance <idx> exited unexpectedly. <error> describes the exit state returned from the exited daos_engine process. | N/A |
| engine_asserted | STATE_CHANGE | ERROR | TBD | Indicates engine instance threw a runtime assertion, causing a crash. | An unexpected internal state resulted in assert failure. |
| engine_clock_drift | INFO_ONLY | ERROR | clock drift detected | Indicates CART comms layer has detected clock skew between engines. | NTP may not be syncing clocks across DAOS system. |
| pool_rebuild_started | INFO_ONLY | NOTICE | Pool rebuild started. | Indicates a pool rebuild has started. The event data field contains pool map version and pool operation identifier. | When a pool rank becomes unavailable a rebuild will be triggered. |
| pool_rebuild_finished | INFO_ONLY | NOTICE | Pool rebuild finished. | Indicates a pool rebuild has finished successfully. The event data field includes the pool map version and pool operation identifier. | N/A |
| pool_rebuild_failed | INFO_ONLY | ERROR | Pool rebuild failed: <rc>. | Indicates a pool rebuild has failed. The event data field includes the pool map version and pool operation identifier. <rc> provides a string representation of DER code. | N/A |
| pool_replicas_updated | STATE_CHANGE | NOTICE | List of pool service replica ranks has been updated. | Indicates a pool service replica list has changed. The event contains the new service replica list in a custom payload. | When a pool service replica rank becomes unavailable a new rank is selected to replace it (if available). |
| pool_durable_format_incompat | INFO_ONLY | ERROR | incompatible layout version: <current> not in [<min>, <max>] | Indicates the given pool's layout version does not match any of the versions supported by the currently running DAOS software. | DAOS engine is started with pool data in local storage that has an incompatible layout version. |
| container_durable_format_incompat | INFO_ONLY | ERROR | incompatible layout version[: <current> not in [<min>, <max>]] | Indicates the given container's layout version does not match any of the versions supported by the currently running DAOS software. | DAOS engine is started with container data in local storage that has an incompatible layout version. |
| rdb_durable_format_incompatible | INFO_ONLY | ERROR | incompatible layout version[: <current> not in [<min>, <max>]] OR incompatible DB UUID: <uuid> | Indicates the given RDB's layout version does not match any of the versions supported by the currently running DAOS software, or the given RDB's UUID does not match the expected UUID (usually because the RDB belongs to a pool created by a pre-2.0 DAOS version). | DAOS engine is started with rdb data in local storage that has an incompatible layout version. |
| swim_rank_alive | STATE_CHANGE | NOTICE | TBD | The SWIM protocol has detected the specified rank is responsive. | A remote DAOS engine has become responsive. |
| swim_rank_dead | STATE_CHANGE | NOTICE | SWIM rank marked as dead. | The SWIM protocol has detected the specified rank is unresponsive. | A remote DAOS engine has become unresponsive. |
| system_start_failed | INFO_ONLY | ERROR | System startup failed, <errors> | Indicates that a user initiated controlled startup failed. <errors> shows which ranks failed. | Ranks failed to start. |
| system_stop_failed | INFO_ONLY | ERROR | System shutdown failed during <action> action, <errors> | Indicates that a user initiated controlled shutdown failed. <action> identifies the failing shutdown action and <errors> shows which ranks failed. | Ranks failed to stop. |

System Logging

Engine logging is initially configured by setting the log_file and log_mask parameters in the server config file. Logging is described in detail in the Debugging System section.

Engine log levels can be changed dynamically (at runtime) by setting log masks for a set of facilities to a given level. Settings are applied to all running DAOS I/O Engines present in the configured dmg hostlist using the command dmg server set-logmasks [<masks>]. The command accepts zero or one positional argument. If no argument is passed, the log masks for each running engine are reset to the value of the engine "log_mask" parameter in the server config file (as set at the time of daos_server startup). If a single argument is passed, it is used as the log masks setting.

Example usage:

dmg server set-logmasks ERR,mgmt=DEBUG

The input string should look like PREFIX1=LEVEL1,PREFIX2=LEVEL2,... where the syntax is identical to what is expected by the 'D_LOG_MASK' environment variable. If the 'PREFIX=' part is omitted, then the level applies to all defined facilities (e.g., a value of 'WARN' sets everything to WARN).

Supported priority levels for engine logging are FATAL, CRIT, ERR, WARN, NOTE, INFO, DEBUG.
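The mask syntax described above can be checked before it is passed to dmg. The sketch below is illustrative only: the helper name `parse_logmasks` is hypothetical, and the strict upper-case level check is an assumption rather than documented dmg behavior. Facility names are not validated because the list of facilities is not given here.

```python
# Levels listed in this section; exact-case matching is an assumption.
LEVELS = {"FATAL", "CRIT", "ERR", "WARN", "NOTE", "INFO", "DEBUG"}


def parse_logmasks(masks):
    """Split e.g. 'ERR,mgmt=DEBUG' into (default_level, {facility: level})."""
    default = None
    per_facility = {}
    for item in masks.split(","):
        prefix, sep, level = item.strip().partition("=")
        if not sep:
            # No 'PREFIX=' part: the level applies to all facilities.
            prefix, level = None, prefix
        elif not prefix:
            raise ValueError(f"missing facility name in {item!r}")
        if level not in LEVELS:
            raise ValueError(f"unknown log level: {level!r}")
        if prefix is None:
            default = level
        else:
            per_facility[prefix] = level
    return default, per_facility


print(parse_logmasks("ERR,mgmt=DEBUG"))
```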

System Monitoring

The DAOS servers maintain a set of metrics on I/O and internal state of the DAOS processes. The metrics collection is very lightweight and is always enabled. It cannot be manually enabled or disabled.

The DAOS metrics can be accessed locally on each DAOS server, or remotely by configuring an HTTP endpoint on each server.

Local metrics collection with daos_metrics

The daos-server package includes the daos_metrics command-line tool. This tool fetches metrics from the local host only. No configuration is required to use the daos_metrics command.

By default, daos_metrics displays the metrics in a human-readable tree format. To produce CSV formatted output, use daos_metrics --csv.

Each DAOS engine maintains its own metrics. The --srv_idx parameter can be used to specify which engine to query, if there are multiple engines configured per server. The default is to query the first engine on the server (index 0).

See daos_metrics -h for details on how to filter metrics.

Configuring the servers for remote metrics collection

Each DAOS server can be configured to provide an HTTP endpoint for metrics collection. This endpoint presents the data in a format compatible with Prometheus.

To enable remote telemetry collection, update the control plane section of your DAOS server configuration file:

telemetry_port: 9191

By default, the HTTP endpoint is disabled. The default port number is 9191, and it is recommended to use this port as it is also the default for the clients that will collect the metrics. Each control plane server will present its local metrics via the endpoint: http://<host>:<port>/metrics
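Because the endpoint serves the Prometheus text format, it can also be read with a few lines of code for quick checks. The following is a rough sketch: `fetch_metrics` and `parse_metrics` are hypothetical helper names, the metric name in the sample is invented for illustration, and the parser assumes sample lines carry no trailing timestamp.

```python
import re
import urllib.request

LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def fetch_metrics(host, port=9191, timeout=5.0):
    """Fetch the raw metrics text from http://<host>:<port>/metrics."""
    url = f"http://{host}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_metrics(text):
    """Parse exposition-format lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and "# HELP" / "# TYPE" comments
        head, _, value = line.rpartition(" ")
        name, _, label_part = head.partition("{")
        labels = dict(LABEL_RE.findall(label_part))
        samples.append((name, labels, float(value)))
    return samples


# Invented sample text; real metric names come from the endpoint itself.
sample = """\
# HELP example_engine_counter An illustrative counter
# TYPE example_engine_counter counter
example_engine_counter{rank="0"} 12
example_engine_counter{rank="1"} 7
"""
for name, labels, value in parse_metrics(sample):
    print(name, labels.get("rank"), value)
```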

Remote metrics collection with dmg telemetry

The dmg telemetry administrative command can be used to query an individual DAOS server for metrics. Only one DAOS host may be queried at a time. The command will return information for all engines on that server, identified by the "rank" attribute.

The metrics have the same names as seen on the telemetry web endpoint.

By default, the dmg telemetry command produces human readable output. The output can be formatted in JSON by running dmg -j telemetry.

To list all metrics for the server with their name, type and description:

dmg telemetry [-l <host>] [-p <telemetry-port>] metrics list

If no host is provided, the default is localhost. The default port is 9191.

To query the values of one or more metrics on the server:

dmg telemetry [-l <host>] [-p <telemetry-port>] metrics query [-m <metric_name>]

If no host is provided, the default is localhost. The default port is 9191.

Metric names may be provided in a comma-separated list. If no metric names are provided, all metrics are queried.

Remote metrics collection with Prometheus

Prometheus is the preferred way to collect metrics from multiple DAOS servers at the same time.

To integrate with Prometheus, add a new job to your Prometheus server's configuration file, with the targets set to the hosts and telemetry ports of your DAOS servers:

- job_name: daos
  scrape_interval: 5s
  static_configs:
  - targets: ['<host>:<telemetry-port>']

If there is not already a Prometheus server set up, DMG offers quick setup options for DAOS.

To install and configure Prometheus on the local machine:

dmg telemetry config [-i <install-dir>]

If no install-dir is provided, DMG will attempt to install Prometheus in the first writable directory found in the user's PATH.

The Prometheus configuration file will be populated based on the DAOS server list in your dmg configuration file. The Prometheus configuration will be written to $HOME/.prometheus.yml.

To start the Prometheus server with the configuration file generated by dmg:

prometheus --config.file=$HOME/.prometheus.yml

Storage Operations

Space Utilization

To query SCM and NVMe storage space usage and show how much space is available to create new DAOS pools with, run the following command:

$ dmg storage query usage
Hosts   SCM-Total SCM-Free SCM-Used NVMe-Total NVMe-Free NVMe-Used
-----   --------- -------- -------- ---------- --------- ---------
wolf-71 6.4 TB    2.0 TB   68 %     1.5 TB     1.1 TB    27 %
wolf-72 6.4 TB    2.0 TB   68 %     1.5 TB     1.1 TB    27 %

The command output shows online DAOS storage utilization, only including storage statistics for devices that have been formatted by DAOS control-plane and assigned to a currently running rank of the DAOS system. This represents the storage that can host DAOS pools.

Note that the table values are per-host (storage server) and SCM/NVMe capacity pool component values specified in dmg pool create are per rank. If multiple ranks (I/O processes) have been configured per host in the server configuration file daos_server.yml then the values supplied to dmg pool create should be a maximum of the SCM/NVMe free space divided by the number of ranks per host.

For example, if dmg storage query usage reports 2.0 TB of SCM and 10.0 TB of NVMe free space, and the server configuration file used to start the system specifies 2 I/O processes (2 "server" sections), the maximum pool size that can be specified is approximately dmg pool create -s 1T -n 5T (it may be necessary to specify slightly less than the maximum to account for the small metadata overhead).
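The division described above can be written out as a small calculation. This is only a sketch of the arithmetic: the function name is hypothetical, decimal terabytes are assumed, and the metadata overhead mentioned above is not modeled.

```python
TB = 10**12  # decimal terabyte; an assumption about the units dmg reports


def max_pool_size_per_rank(scm_free, nvme_free, ranks_per_host):
    """Return the per-rank (SCM, NVMe) upper bounds in bytes for dmg pool create."""
    if ranks_per_host < 1:
        raise ValueError("ranks_per_host must be at least 1")
    return scm_free // ranks_per_host, nvme_free // ranks_per_host


# Values from the example above: 2.0 TB SCM and 10.0 TB NVMe free, 2 ranks per host.
scm, nvme = max_pool_size_per_rank(2 * TB, 10 * TB, 2)
print(scm / TB, nvme / TB)  # 1.0 5.0
```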

SSD Management

Health Monitoring

Useful admin dmg commands to query NVMe SSD health:

  • Query Per-Server Metadata:
  • dmg storage query (list-devices|list-pools)
  • dmg storage scan --nvme-meta shows mapping of metadata to NVMe controllers

The NVMe storage query list-devices and list-pools commands query the persistently stored SMD device and pool tables, respectively. The device table maps the internal device UUID to attached VOS target IDs. The rank number of the server where the device is located is also listed, along with the current device state. The current device states are the following:

  • NORMAL: a fully functional device in use by DAOS
  • EVICTED: the device is no longer in use by DAOS
  • UNPLUGGED: the device is currently unplugged from the system (may or may not be evicted)
  • NEW: the device is plugged in and available, and not currently in use by DAOS

The transport address is also listed for the device. This is either the PCIe address for normal NVMe SSDs, or the BDF format address of the backing NVMe SSDs behind a VMD (Volume Management Device) address. In the example below, the last two listed devices are both VMD devices with transport addresses in the BDF format behind the VMD address 0000:5d:05.5.

The pool table maps the DAOS pool UUID to attached VOS target IDs and will list all of the server ranks that the pool is distributed on. With the additional verbose flag, the mapping of SPDK blob IDs to VOS target IDs will also be displayed.

$ dmg -l boro-11,boro-13 storage query list-devices
    UUID:5bd91603-d3c7-4fb7-9a71-76bc25690c19 [TrAddr:0000:8a:00.0]
      Targets:[0 2] Rank:0 State:NORMAL
    UUID:80c9f1be-84b9-4318-a1be-c416c96ca48b [TrAddr:0000:8b:00.0]
      Targets:[1 3] Rank:0 State:NORMAL
    UUID:051b77e4-1524-4662-9f32-f8e4d2542c2d [TrAddr:0000:8c:00.0]
      Targets:[] Rank:0 State:NEW
    UUID:81905b24-be44-4106-8ff9-03002e9dd86a [TrAddr:5d0505:01:00.0]
      Targets:[0 2] Rank:1 State:EVICTED
    UUID:2ccb8afb-5d32-454e-86e3-762ec5dca7be [TrAddr:5d0505:03:00.0]
      Targets:[1 3] Rank:1 State:NORMAL
$ dmg -l boro-11,boro-13 storage query list-pools
      Rank:0 Targets:[0 1 2 3]
      Rank:1 Targets:[0 1 2 3]

$ dmg -l boro-11,boro-13 storage query list-pools --verbose
      Rank:0 Targets:[0 1 2 3] Blobs:[4294967404 4294967405 4294967407 4294967406]
      Rank:1 Targets:[0 1 2 3] Blobs:[4294967410 4294967411 4294967413 4294967412]
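To pick out devices that need attention, the human-readable list-devices output shown above can be scanned for states other than NORMAL. The sketch below assumes the two-line-per-device layout of that example; `parse_devices` is a hypothetical helper, and the layout may differ between DAOS versions.

```python
import re

# One device spans two lines: "UUID:... [TrAddr:...]" then "Targets:[...] Rank:N State:S".
DEVICE_RE = re.compile(
    r"UUID:(?P<uuid>\S+) \[TrAddr:(?P<traddr>[^\]]+)\]\s+"
    r"Targets:\[(?P<targets>[^\]]*)\] Rank:(?P<rank>\d+) State:(?P<state>\w+)"
)


def parse_devices(output):
    """Return one dict per device found in list-devices style output."""
    devices = []
    for m in DEVICE_RE.finditer(output):
        devices.append({
            "uuid": m["uuid"],
            "traddr": m["traddr"],
            "targets": [int(t) for t in m["targets"].split()],
            "rank": int(m["rank"]),
            "state": m["state"],
        })
    return devices


# Three entries copied from the example output above.
sample = """\
    UUID:5bd91603-d3c7-4fb7-9a71-76bc25690c19 [TrAddr:0000:8a:00.0]
      Targets:[0 2] Rank:0 State:NORMAL
    UUID:051b77e4-1524-4662-9f32-f8e4d2542c2d [TrAddr:0000:8c:00.0]
      Targets:[] Rank:0 State:NEW
    UUID:81905b24-be44-4106-8ff9-03002e9dd86a [TrAddr:5d0505:01:00.0]
      Targets:[0 2] Rank:1 State:EVICTED
"""
devices = parse_devices(sample)
not_normal = [d["uuid"] for d in devices if d["state"] != "NORMAL"]
print(not_normal)
```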

  • Query Storage Device Health Data:
  • dmg storage query (device-health|target-health)
  • dmg storage scan --nvme-health shows NVMe controller health stats

The NVMe storage query device-health and target-health commands query the device health data, including NVMe SSD health stats and in-memory I/O error and checksum error counters. The server rank and device state are also listed. The device health data can be queried either by device UUID (device-health command) or by VOS target ID along with the server rank (target-health command). Both commands display the same device health information. Additionally, vendor-specific SMART stats are displayed, currently for Intel devices only. Note: a timed workload of more than 60 minutes must be run for the timed-workload SMART stats to register; until then, the raw values read 65535. Media wear percentage can be calculated by dividing the raw Media Wear value by 1024 to find the percentage of the maximum rated cycles.

$ dmg -l boro-11 storage query device-health --uuid=5bd91603-d3c7-4fb7-9a71-76bc25690c19
$ dmg -l boro-11 storage query target-health --rank=0 --tgtid=0
    UUID:5bd91603-d3c7-4fb7-9a71-76bc25690c19 [TrAddr:0000:8a:00.0]
      Targets:[0 1 2 3] Rank:0 State:NORMAL
      Health Stats:
        Controller Busy Time:0s
        Power Cycles:0
        Power On Duration:0s
        Unsafe Shutdowns:0
        Media Errors:0
        Read Errors:0
        Write Errors:0
        Unmap Errors:0
        Checksum Errors:0
        Error Log Entries:0
      Critical Warnings:
        Temperature: OK
        Available Spare: OK
        Device Reliability: OK
        Read Only: OK
        Volatile Memory Backup: OK
      Intel Vendor SMART Attributes:
        Program Fail Count:
        Erase Fail Count:
        Wear Leveling Count:
        End-to-End Error Detection Count:0
        CRC Error Count:0
        Timed Workload, Media Wear:65535
        Timed Workload, Host Read/Write Ratio:65535
        Timed Workload, Timer:65535
        Thermal Throttle Status:0%
        Thermal Throttle Event Count:0
        Retry Buffer Overflow Counter:0
        PLL Lock Loss Count:0
        NAND Bytes Written:244081
        Host Bytes Written:52114
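The media wear calculation mentioned above is a single division. The sketch below also treats the raw value 65535 as "not yet registered", following the note about the timed workload; the function name is hypothetical.

```python
NOT_REGISTERED = 65535  # raw value reported before the timed workload has registered


def media_wear_percent(raw_media_wear):
    """Convert the raw 'Timed Workload, Media Wear' value to a percentage.

    Returns None when the stat has not registered yet.
    """
    if raw_media_wear == NOT_REGISTERED:
        return None
    return raw_media_wear / 1024


print(media_wear_percent(65535))  # None, as in the example output above
print(media_wear_percent(512))    # 0.5
```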

Exclusion and Hotplug

  • Manually exclude an NVMe SSD: dmg storage set nvme-faulty

To manually evict an NVMe SSD (auto eviction will be supported in a future release), the device state needs to be set to "FAULTY" by running the following command:

$ dmg -l boro-11 storage set nvme-faulty --uuid=5bd91603-d3c7-4fb7-9a71-76bc25690c19
    UUID:5bd91603-d3c7-4fb7-9a71-76bc25690c19 Targets:[] Rank:1 State:FAULTY

The device state will transition from "NORMAL" to "FAULTY" (shown above), which will trigger the faulty device reaction (all targets on the SSD will be rebuilt, and the SSD will remain evicted until device replacement occurs).


Full NVMe hot plug capability will be available and supported in the DAOS 2.2 release. Use is currently intended for testing only and is not supported for production.

  • Replace an excluded SSD with a New Device: dmg storage replace nvme

To replace an evicted NVMe SSD with a new device and reintegrate it into use with DAOS, run the following command:

$ dmg -l boro-11 storage replace nvme --old-uuid=5bd91603-d3c7-4fb7-9a71-76bc25690c19 --new-uuid=80c9f1be-84b9-4318-a1be-c416c96ca48b
    UUID:80c9f1be-84b9-4318-a1be-c416c96ca48b Targets:[] Rank:1 State:NORMAL

The old, now replaced device will remain in an "EVICTED" state until it is unplugged. The new device will transition from a "NEW" state to a "NORMAL" state (shown above).

  • Reuse a FAULTY Device: dmg storage replace nvme

In order to reuse a device that was previously set as FAULTY and evicted from the DAOS system, an admin can run the following command (setting the old device UUID to be the new device UUID):

$ dmg -l boro-11 storage replace nvme --old-uuid=5bd91603-d3c7-4fb7-9a71-76bc25690c19 --new-uuid=5bd91603-d3c7-4fb7-9a71-76bc25690c19
    UUID:5bd91603-d3c7-4fb7-9a71-76bc25690c19 Targets:[] Rank:1 State:NORMAL

The FAULTY device will transition from an "EVICTED" state back to a "NORMAL" state, and will again be available for use with DAOS. The use case of this command will mainly be for testing or for accidental device eviction.


The SSD identification feature is simply a way to quickly and visually locate a device. It requires the use of Intel VMD (Volume Management Device), which needs to be physically available on the hardware as well as enabled in the system BIOS. The feature supports two LED device events: locating a healthy device and locating an evicted device.

  • Locate a Healthy SSD: dmg storage identify vmd

To quickly identify an SSD in question, an administrator can run the following command:

$ dmg -l boro-11 storage identify vmd --uuid=6fccb374-413b-441a-bfbe-860099ac5e8d

If a non-VMD device UUID is used with the command, the following error will occur:
localhost DAOS error (-1010): DER_NOSYS

The status LED on the VMD device is now set to an "IDENTIFY" state, represented by a quick, 4Hz blinking amber light. The device will quickly blink by default for about 60 seconds and then return to the default "OFF" state. The LED event duration can be customized by setting the VMD_LED_PERIOD environment variable if a duration other than the default value is desired.

  • Locate an Evicted SSD:

If an NVMe SSD is evicted, the status LED on the VMD device is set to a "FAULT" state, represented by a solidly ON amber light. No additional command apart from the SSD eviction command would be needed, and this would visually indicate that the device needs to be replaced and is no longer in use by DAOS. The LED of the VMD device would remain in this state until replaced by a new device.

System Operations

The DAOS server acting as the access point records details of engines that join the DAOS system. Once an engine has joined the DAOS system, it is identified by a unique system "rank". Multiple ranks can reside on the same host machine, accessible via the same network address.

A DAOS system can be shut down and restarted to perform maintenance and/or reboot hosts. Pool data and state will be maintained provided no changes are made to the rank's metadata stored on persistent memory.

Storage reformat can also be performed after system shutdown. Pools will be removed and storage wiped.

System commands will be handled by the DAOS Server listening at the access point address specified as the first entry in the DMG config file "hostlist" parameter. See daos_control.yml for details.

The "access point" address should be the same as the one specified in the server config file daos_server.yml used when starting the daos_server instances.


The system membership can be queried using the command:

$ dmg system query [--verbose] [--ranks <rankset>|--host-ranks <hostset>]

  • <rankset> is a pattern describing rank ranges e.g., 0,5-10,20-100
  • <hostset> is a pattern describing host ranges e.g., storagehost[0,5-10],10.8.1.[20-100]
  • --verbose flag gives more information on each rank
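To make the rankset pattern concrete, the sketch below expands a string such as 0,5-10,20-100 into individual rank numbers. It only illustrates the notation and is not how dmg parses its arguments; `expand_rankset` is a hypothetical helper, and hostset patterns are not handled.

```python
def expand_rankset(rankset):
    """Expand a pattern like '0,5-10,20-100' into a sorted list of ranks."""
    ranks = set()
    for part in rankset.split(","):
        low, sep, high = part.strip().partition("-")
        first = int(low)
        last = int(high) if sep else first
        if last < first:
            raise ValueError(f"invalid range: {part!r}")
        ranks.update(range(first, last + 1))
    return sorted(ranks)


print(expand_rankset("0,5-7"))               # [0, 5, 6, 7]
print(len(expand_rankset("0,5-10,20-100")))  # 88
```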

The output table will provide system rank mappings to host address and instance UUID, in addition to the rank state.

DAOS engines run a gossip-based protocol called SWIM that provides efficient and scalable fault detection. When an engine is reported as unresponsive, a RAS event is raised and the associated engine is marked as excluded in the output of dmg system query. The engine can be stopped (see next section) and then restarted to rejoin the system. A failed engine might also be excluded from the pools it hosted; see the pool operation section for how to reintegrate an excluded engine.


When up and running, the entire system can be shut down with the command:

$ dmg system stop [--force]

The output table will indicate action and result.

While the engines are stopped, the DAOS servers will continue to operate and listen on the management network.


All engines monitor each other and proactively exclude unresponsive members. It is critical to stop a DAOS system properly with dmg system stop when planned maintenance affects all or a majority of the DAOS storage nodes. An abrupt reboot of the storage nodes might result in massive exclusion that will take time to recover from.

The force option can be passed to dmg system stop for cases when a clean shutdown is not working. Monitoring is not disabled in this case and spurious exclusion might happen, but the engines are guaranteed to be killed.

dmg can also stop a list of engines identified by ranks or hostnames. This is useful to stop (and restart) misbehaving engines.

$ dmg system stop [--force] [--ranks <rankset>|--host-ranks <hostset>]

  • <rankset> is a pattern describing rank ranges e.g., 0,5-10,20-100
  • <hostset> is a pattern describing host ranges e.g., storagehost[0,5-10],10.8.1.[20-100]


To start the system after a controlled shutdown, run the command:

$ dmg system start

The output table will indicate action and result.

DAOS I/O Engines will be started.

As for shutdown, a list of engines to restart can be specified on the command line:

$ dmg system start [--ranks <rankset>|--host-ranks <hostset>]

  • <rankset> is a pattern describing rank ranges e.g., 0,5-10,20-100
  • <hostset> is a pattern describing host ranges e.g., storagehost[0,5-10],10.8.1.[20-100]

If the ranks were excluded from pools (e.g., unclean shutdown), they will need to be reintegrated. Please see the pool operation section for more information.

Storage Reformat

To reformat the system after a controlled shutdown, run the command:

$ dmg storage format --force

  • --force flag indicates that a (re)format operation should be performed disregarding existing filesystems
  • if no record of previously running ranks can be found, reformat is performed on the hosts that are specified in the daos_control.yml config file's hostlist parameter.
  • if system membership has records of previously running ranks, storage allocated to those ranks will be formatted

The output table will indicate action and result.

DAOS I/O Engines will be started, and all DAOS pools will have been removed.


While it should not be required during normal operations, one may still want to restart the DAOS installation from scratch without using the DAOS control plane.

First, ensure all daos_server processes on all hosts have been stopped, then for each SCM mount specified in the config file (scm_mount in the servers section) umount and wipe FS signatures.

$ umount /mnt/daos0
$ umount /mnt/daos1
$ wipefs -a /dev/pmem0
$ wipefs -a /dev/pmem1

Then restart DAOS Servers and format.

System Erase

To erase the DAOS storage configuration, the dmg system erase command can be used. Before doing this, the affected engines need to be stopped by running dmg system stop (if necessary with the --force flag). The erase operation will destroy any pools that may still exist, and will unconfigure the storage. It will not stop the daos_server process, so the dmg command can still be used. For example, the system can be formatted again by running dmg storage format.


Note that dmg system erase does not currently reset the SCM. The /dev/pmemX devices will remain mounted, and the PMem configuration will not be reset to Memory Mode. To completely unconfigure the SCM, it is advisable to run daos_server storage prepare --scm-only --reset which will completely reset the PMem. A reboot will be required to finalize the change of the PMem allocation goals.

System Extension

To add a new server to an existing DAOS system, one should install:

  • the relevant certificates
  • the server yaml file pointing to the access points of the running DAOS system

The daos_control.yml file should also be updated to include the new DAOS server.

Then start the daos_server via systemd and format the new server via dmg as follows:

$ dmg storage format -l ${new_storage_node}

new_storage_node should be replaced with the hostname or the IP address of the new storage node (comma separated list or range of hosts for multiple nodes) to be added.

Upon completion of the format operation, the new storage nodes will join the system (this can be checked with dmg system query -v).


New pools created after the extension will automatically use the newly added nodes (if membership is not restricted on the dmg command line). That being said, existing pools won't be automatically extended to use the new servers. Please see the pool operation section for how to extend the pool membership.

Software Update

The DAOS v2.0 wire protocol and persistent layout are not compatible with previous DAOS versions. Upgrading requires a reformat, and all client and server nodes must be updated to a 2.x version.


Attempts to start DAOS v2.0 over a system formatted with a previous DAOS version will trigger a RAS event and cause all the engines to abort. Similarly, a 2.0 DAOS client or engine will refuse to communicate with a peer that runs an incompatible version.

DAOS v2.0 will maintain interoperability for both the wire protocol and persistent layout with any future v2.x versions. That being said, it is required that all engines in the same system run the same DAOS version.


Rolling update is not supported at this time.
