Understanding "acidiag" Commands

Written by Austin Peacock
Published on Aug. 2, 2024, 4:44 p.m.

Introduction

The acidiag command set, like moquery, is specific to ACI, so it can be difficult to find much documentation on its functions and uses. I am going to break down its capabilities into two categories: the Daily Drivers and Advanced. There are several commands I am going to leave out, since they are rarely used outside of development and deep troubleshooting.

The Daily Drivers

These are the commands that you could be using every single time you hop on the CLI of an APIC. Note that "avread" and "fnvread" are present on the switches as well, but "rvread" is not.

The first one I'll mention here is "acidiag fnvread". This lists all of the Leafs and Spines in the fabric, as well as their TEP addresses and current state.

apic1# acidiag fnvread
      ID   Pod ID                 Name    Serial Number         IP Address    Role        State   LastUpdMsgId
--------------------------------------------------------------------------------------------------------------
     201        1            spine-201        TEP-1-103     10.0.136.65/32   spine         active   0
    1001        1            leaf-1001        TEP-1-101     10.0.136.64/32    leaf         active   0
    1002        1            leaf-1002        TEP-1-102     10.0.136.66/32    leaf         active   0

Total 3 nodes
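
If you capture this output from a script, a small parser makes it easy to spot nodes that are not active. The following is a minimal Python sketch, assuming the whitespace-separated column layout shown above and node names without spaces; the names FabricNode, parse_fnvread, and inactive_nodes are hypothetical helpers, not part of ACI.

```python
from typing import List, NamedTuple


class FabricNode(NamedTuple):
    node_id: int
    pod_id: int
    name: str
    serial: str
    ip: str
    role: str
    state: str


def parse_fnvread(output: str) -> List[FabricNode]:
    """Parse captured 'acidiag fnvread' text into FabricNode records."""
    nodes = []
    for line in output.splitlines():
        parts = line.split()
        # Data rows start with a numeric node ID and a numeric pod ID;
        # the prompt, header, separator, and "Total" lines do not.
        if len(parts) >= 7 and parts[0].isdigit() and parts[1].isdigit():
            nodes.append(FabricNode(int(parts[0]), int(parts[1]), *parts[2:7]))
    return nodes


def inactive_nodes(nodes: List[FabricNode]) -> List[FabricNode]:
    """Return every node whose State column is not 'active'."""
    return [n for n in nodes if n.state != "active"]
```

Calling inactive_nodes(parse_fnvread(text)) on saved output returns an empty list when every Leaf and Spine is active.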


The next command is "acidiag avread", which shows you information about your APIC cluster, including the out-of-band management addresses, APIC IDs, health, overlay-1 IPs, and more.

apic1# acidiag avread
Local appliance ID=1 ADDRESS=10.0.0.1 TEP ADDRESS=10.0.0.0/16 ROUTABLE IP ADDRESS=0.0.0.0 CHASSIS_ID=10220833-ea00-3bb3-93b2-ef1e7e645889
Cluster of 1 lm(t):1(zeroTime) appliances (out of targeted 1 lm(t):1(2024-08-01T17:11:04.355+00:00)) with FABRIC_DOMAIN name=ACI Fabric1 set to version=4.2(7l) lm(t):1(2024-08-01T17:11:04.675+00:00); discoveryMode=PERMISSIVE lm(t):0(zeroTime); drrMode=OFF lm(t):0(zeroTime); kafkaMode=OFF lm(t):0(zeroTime)
        appliance id=1  address=10.0.0.1 lm(t):1(2024-08-01T17:09:40.617+00:00) tep address=10.0.0.0/16 lm(t):1(2024-08-01T17:09:40.617+00:00) routable address=0.0.0.0 lm(t):1(zeroTime) oob address=10.10.20.2/24 lm(t):1(2024-08-01T17:09:43.354+00:00) version=4.2(7l) lm(t):1(2024-08-01T17:09:43.573+00:00) chassisId=10220833-ea00-3bb3-93b2-ef1e7e645889 lm(t):1(2024-08-01T17:09:43.573+00:00) capabilities=0X3EEFFFFFFFFF--0X2020--0X1 lm(t):1(2024-08-01T17:15:43.770+00:00) rK=(stable,present,0X206173722D687373) lm(t):1(2024-08-01T17:09:43.359+00:00) aK=(stable,present,0X206173722D687373) lm(t):1(2024-08-01T17:09:43.359+00:00) oobrK=(stable,present,0X206173722D687373) lm(t):1(2024-08-01T17:09:43.359+00:00) oobaK=(stable,present,0X206173722D687373) lm(t):1(2024-08-01T17:09:43.359+00:00) cntrlSbst=(APPROVED, TEP-1-1) lm(t):1(2024-08-01T17:09:43.573+00:00) (targetMbSn= lm(t):0(zeroTime), failoverStatus=0 lm(t):0(zeroTime)) podId=1 lm(t):1(2024-08-01T17:09:40.617+00:00) commissioned=YES lm(t):1(zeroTime) registered=YES lm(t):1(2024-08-01T17:09:40.617+00:00) standby=NO lm(t):1(2024-08-01T17:09:40.617+00:00) DRR=NO lm(t):0(zeroTime) apicX=NO lm(t):1(2024-08-01T17:09:40.617+00:00) virtual=NO lm(t):1(2024-08-01T17:09:40.617+00:00) active=YES(2024-08-01T17:09:40.617+00:00) health=(applnc:255 lm(t):1(2024-08-01T17:10:25.573+00:00) svc's)
---------------------------------------------
clusterTime=>
---------------------------------------------
As you can see, this is a bit of a nightmare to read, and this is only for one APIC. If you are on ACI version 4.2 or later, I would recommend using "Cluster avread" instead, as it is much more human-readable:

apic1# Cluster avread
Cluster:
-------------------------------------------------------------------------
fabricDomainName        ACI Fabric1
discoveryMode           PERMISSIVE
clusterSize             1
version                 4.2(7l)
drrMode                 OFF
operSize                1

APICs:
-------------------------------------------------------------------------
                    APIC 1
version           4.2(7l)
address           10.0.0.1
oobAddress        10.10.20.2/24
routableAddress   0.0.0.0
tepAddress        10.0.0.0/16
podId             1
chassisId         10220833-.-7e645889
cntrlSbst_serial  (APPROVED,TEP-1-1)
active            YES
flags             cra-
health            255
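
If you are stuck with the raw "acidiag avread" output, a few regular expressions can pull out the fields you usually care about. This is a rough Python sketch, assuming each "appliance id=..." line is formatted like the sample earlier in this section; FIELDS and parse_appliance are hypothetical names, and the patterns may need adjusting for other versions.

```python
import re

# One pattern per field of interest. Each pattern is searched independently,
# so the order of fields within the line does not matter.
FIELDS = {
    "id": r"appliance id=(\d+)",
    "address": r"\baddress=(\S+)",  # first match is the fabric address
    "oob": r"oob address=(\S+)",
    "version": r"\bversion=(\S+)",
    "commissioned": r"commissioned=(\w+)",
    "active": r"active=(\w+)",
    "health": r"health=\(applnc:(\d+)",
}


def parse_appliance(line: str) -> dict:
    """Extract selected fields from one 'appliance id=...' line.

    Fields that are not found are returned as None.
    """
    result = {}
    for name, pattern in FIELDS.items():
        match = re.search(pattern, line)
        result[name] = match.group(1) if match else None
    return result
```

Run it once per appliance line; the cluster summary line above the appliance entries is not handled by this sketch.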


"acidiag rvread" shows you all of the shards of each process on each APIC, as well as their replicas. Remember that (almost) every process is broken up into 32 shards in the database, and then replicated twice so that there are 3 copies total, which will be spread out across all APICs. Note that running more APICs doesn't increase the number of replicas. This command is helpful to run if you are getting strange errors around sharding, or process or database errors.

apic1# acidiag rvread
\- unexpected state;    /-unexpected mutator;
s->  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32lcl
r->123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123lcl
  1
  2
  3
  4
  5
  6       /         /                          //               /              //
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
Replicas are in expected states and are mutated by proper apic's
---------------------------------------------
clusterTime=>
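This output should be completely clean, with no "X"s, "\"s, or "/"s. Per the legend at the top of the output, "\" marks a replica in an unexpected state and "/" marks an unexpected mutator. If you see any of these, you need to contact TAC and get them to help you with your cluster. If you want to try to troubleshoot it yourself, say in your lab or non-prod setup, I'll give some pointers below.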
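
With 32 shards and 3 replicas each, every row has 96 positions, which makes it tedious to work out by eye which replica a mark belongs to. Here is a small Python sketch that maps a mark back to its shard and replica. It assumes the data columns start right after a three-character row label, as the "s->" and "r->" header lines suggest, and it assumes "X" means both marks at once; find_flags is a hypothetical helper, and copied output with altered whitespace will give wrong positions.

```python
# Meaning of each mark. The first two come from the legend in the output;
# the reading of "X" as both at once is an assumption.
FLAGS = {
    "\\": "unexpected state",
    "/": "unexpected mutator",
    "X": "unexpected state and mutator",
}
LABEL_WIDTH = 3          # width of the row label, e.g. "  6" or "s->"
REPLICAS_PER_SHARD = 3   # each shard has three replica columns


def find_flags(row: str) -> list:
    """Return (row number, shard, replica, meaning) for each mark in a row."""
    label = row[:LABEL_WIDTH].strip()
    if not label.isdigit():
        return []  # header, legend, or footer line
    hits = []
    for col, char in enumerate(row[LABEL_WIDTH:]):
        if char in FLAGS:
            shard, replica = divmod(col, REPLICAS_PER_SHARD)
            hits.append((int(label), shard + 1, replica + 1, FLAGS[char]))
    return hits
```

Feeding every line of a saved "acidiag rvread" capture through find_flags gives a short list to hand to TAC instead of a screenshot of the grid.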

Advanced Commands

These commands are used less frequently, but they do have their uses.

"acidiag touch clean" is the command you use when you want to "clean reload" an APIC or a Leaf/Spine. It is more common to use this command on a Leaf or a Spine that you want to pull its entire configuration again from the APIC. This is useful when you are having policy issues, meaning that you don't see the ACI objects on the leaf, and therefore you don't see the NX-OS configuration on the leaf (usually the result of a defect).

Note: Once you run this command, it touches a file under /firmware called ".clean". If you reload the box after this, the device will come up "clean" without any policy, and it will try to pull policy from the fabric.

"acidiag oob enable" is a powerful command that you can use to have your APICs attempt to communicate via their out-of-band interfaces instead of their fabric interfaces. This can be useful when the fabric is in a strange state that prevents the APICs from communicating, and it can't get out of that state because the APICs can't become fully fit to make any changes.

apic1# acidiag oob enable  
apic1# acidiag oob disable
Once connectivity is restored, make sure to set it back with "acidiag oob disable".

"acidiag logs acidiaglog" is a command that you may use if you want to see what acidiag commands have been run on this box in the past. One such scenario may be to check if someone has recently run an "acidiag touch clean" before you reload.

apic1# acidiag logs acidiaglog
2024-08-01 20:47:29,798 | INFO | admin | avread
2024-08-01 20:49:09,351 | INFO | admin | avread
2024-08-01 20:50:19,981 | INFO | admin | --help
2024-08-01 20:50:51,028 | INFO | admin | avread
2024-08-01 20:50:53,387 | INFO | admin | fnvread
2024-08-01 20:51:56,310 | INFO | admin | verifyapic
2024-08-01 20:52:14,580 | INFO | admin | logs
2024-08-01 20:52:20,556 | INFO | admin | logs | acidiaglog
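
To automate the "did anyone run touch clean?" check, you can split each log line on the pipe character. This is a minimal Python sketch based on the log format shown above. It assumes that "touch clean" is logged as "touch | clean", by analogy with the "logs | acidiaglog" entry; parse_acidiaglog and touch_clean_runs are hypothetical helper names.

```python
def parse_acidiaglog(output: str) -> list:
    """Turn captured 'acidiag logs acidiaglog' text into a list of dicts."""
    entries = []
    for line in output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 4:
            continue  # prompt line, blank line, or anything unexpected
        timestamp, level, user, *args = fields
        entries.append({
            "time": timestamp,
            "level": level,
            "user": user,
            # Arguments are logged pipe-separated; rejoin them with spaces.
            "command": " ".join(args),
        })
    return entries


def touch_clean_runs(entries: list) -> list:
    """Return the entries that recorded a 'touch clean'."""
    return [e for e in entries if e["command"].startswith("touch clean")]
```

An empty result from touch_clean_runs only tells you nothing was logged in the text you captured, so check how far back the log goes before relying on it.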


"acidiag restart mgmt" is useful when you are seeing issues with your sharding or processes, and want a way to non-disruptively restart every process that the APICs have control over. Theoretically this should not cause any data-plane impact, but unless this is a lab or non-prod environment, it's best to engage TAC before using this command.

apic1# acidiag restart mgmt
Note that you can also use this command to restart a single process by name. For example, you could restart dbgr with "acidiag restart dbgr". You can also use "start" and "stop" in place of "restart" if you want to start or stop a process, much like you would with systemctl, if you are familiar with that command set.
