Why is IPMI better than SNMP?
I received an email asking why I prefer IPMI over SNMP.
IPMI is more powerful and more flexible, and it provides better information. Let's have a look at some examples:
Investigating a restart without logging directly into the ILOM interface:
ipmitool -I lanplus -H 192.169.4.1 -U my_user -P my_password chassis restart_cause
System restart cause: chassis power control command
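Every example in this post repeats the same connection options and passes the password with -P, which can leave it visible in the process list and the shell history. A minimal sketch of a wrapper, assuming a POSIX shell: ipmitool's -E option reads the password from the IPMI_PASSWORD environment variable, while the function name ilom and the variables IPMI_HOST and IPMI_USER are hypothetical names chosen for this sketch.

```shell
# Hypothetical helper: wraps the connection options used throughout this post.
# IPMI_HOST and IPMI_USER are names made up for this sketch; IPMI_PASSWORD is
# the variable ipmitool itself reads when -E is given.
ilom() {
    ipmitool -I lanplus \
        -H "${IPMI_HOST:?set IPMI_HOST}" \
        -U "${IPMI_USER:?set IPMI_USER}" \
        -E "$@"
}

# Usage, once IPMI_HOST, IPMI_USER and IPMI_PASSWORD are exported:
#   ilom chassis restart_cause
#   ilom sel list
```

The remaining examples keep the full command line so that each one can be copied on its own.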
Checking whether the flashing LOCATE LED is ON or OFF:
ipmitool -I lanplus -H 192.169.4.1 -U my_user -P my_password sunoem led get LOCATE
LOCATE | ON
Event list:
ipmitool -I lanplus -H 192.169.4.1 -U my_user -P my_password sel list
1 | 09/28/2015 | 20:35:35 | Power Supply #0xb8 | State Asserted | Asserted
2 | 09/28/2015 | 20:35:36 | Power Supply #0xb7 | State Asserted | Asserted
3 | 09/28/2015 | 20:35:46 | Power Supply #0xb8 | State Deasserted | Asserted
4 | 09/28/2015 | 20:35:47 | Power Supply #0xb7 | State Deasserted | Asserted
5 | 09/28/2015 | 21:17:47 | Power Supply #0xb8 | State Asserted | Asserted
6 | 09/28/2015 | 21:17:48 | Power Supply #0xb7 | State Asserted | Asserted
7 | 09/28/2015 | 21:17:59 | Power Supply #0xb8 | State Deasserted | Asserted
8 | 09/28/2015 | 21:18:00 | Power Supply #0xb7 | State Deasserted | Asserted
9 | 09/28/2015 | 21:39:49 | Power Supply #0xb4 | State Deasserted | Asserted
a | 09/28/2015 | 21:39:51 | Power Supply #0xb2 | State Deasserted | Asserted
b | 09/28/2015 | 21:39:51 | System ACPI Power State #0x0f | S0/G0: working | Deasserted
We can list all the events, as above, or clear them all:
ipmitool -I lanplus -H 192.169.4.1 -U my_user -P my_password sel clear
Clearing SEL. Please allow a few seconds to erase.
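Before clearing the log it can be useful to see which sensors generated the most events. The following is only a sketch: it assumes the pipe-separated layout of the sel list output shown earlier, with the sensor name in the fourth column, and the function name sel_count_by_sensor is made up for this example.

```shell
# Reads `ipmitool ... sel list` output on stdin and prints "count sensor"
# for each sensor (4th column), most frequent first.
sel_count_by_sensor() {
    awk -F'|' 'NF >= 5 {
        s = $4
        gsub(/^[ \t]+|[ \t]+$/, "", s)   # trim padding around the field
        n[s]++
    }
    END { for (s in n) printf "%d %s\n", n[s], s }' | sort -k1,1nr -k2
}

# Usage (same connection options as in the examples above):
#   ipmitool -I lanplus -H 192.169.4.1 -U my_user -P my_password sel list | sel_count_by_sensor
```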
Chaining more commands:
ipmitool -I lanplus -H 192.169.4.1 -U my_user -P my_password sunoem cli "show /SP/powermgmt/ actual_power" "show /SP/services"
Connected. Use ^D to exit.
-> show /SP/powermgmt/ actual_power
/SP/powermgmt
Properties:
actual_power = 741
-> show /SP/services
/SP/services
Targets:
http
https
ipmi
kvms
servicetag
snmp
ssh
sso
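For scripting, a single property can be pulled out of the sunoem cli output. A minimal sketch, assuming properties are always printed as "name = value" lines as in the output above; the function name ilom_prop is hypothetical.

```shell
# Reads `sunoem cli "show ..."` output on stdin and prints the value of the
# property named in $1 (first match only).
ilom_prop() {
    awk -v key="$1" '{
        line = $0
        sub(/^[ \t]+/, "", line)             # drop leading indentation
        i = index(line, " = ")               # position of the separator
        if (i > 0 && substr(line, 1, i - 1) == key) {
            print substr(line, i + 3)        # everything after " = "
            exit
        }
    }'
}

# Usage:
#   ipmitool -I lanplus -H 192.169.4.1 -U my_user -P my_password \
#       sunoem cli "show /SP/powermgmt/ actual_power" | ilom_prop actual_power
```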
Check if a PSU has faulted:
ipmitool -I lanplus -H 192.169.4.1 -U my_user -P my_password sunoem cli "show /SYS/PSU_FAULT"
Connected. Use ^D to exit.
-> show /SYS/PSU_FAULT
/SYS/PSU_FAULT
Targets:
Properties:
type = Indicator
ipmi_name = SYS/PSFAIL/LED
value = On
Check a hardware category:
ipmitool -I lanplus -H 192.169.4.1 -U my_user -P my_password sdr type "Power Supply"
PS0/VINOK | B1h | ok | 10.0 | State Asserted
PS0/PWROK | B2h | ok | 10.0 | State Asserted
PS1/VINOK | B4h | ok | 10.1 | State Asserted
PS1/PWROK | B5h | ok | 10.1 | State Asserted
PS2/VINOK | B7h | ok | 10.2 | State Deasserted
PS2/PWROK | B8h | ok | 10.2 | State Deasserted
PS3/VINOK | BAh | ok | 10.3 | State Deasserted
PS3/PWROK | BBh | ok | 10.3 | State Deasserted
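The same listing can be reduced to the sensors that need attention. This sketch assumes that "State Asserted" is the healthy reading for the VINOK and PWROK sensors on this chassis, which may not hold on other hardware; the function name psu_not_asserted is made up for the example.

```shell
# Reads `sdr type "Power Supply"` output on stdin and prints the name of
# every sensor whose last column is not "State Asserted".
psu_not_asserted() {
    awk -F'|' 'NF >= 5 {
        name = $1; state = $5
        gsub(/^[ \t]+|[ \t]+$/, "", name)    # trim padding
        gsub(/^[ \t]+|[ \t]+$/, "", state)
        if (state != "State Asserted") print name
    }'
}

# Usage:
#   ipmitool -I lanplus -H 192.169.4.1 -U my_user -P my_password \
#       sdr type "Power Supply" | psu_not_asserted
```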
List all the categories…:
ipmitool -I lanplus -H 192.169.4.1 -U my_user -P my_password sdr type
Temperature | (0x03) | Fan | (0x04)
Physical Security | (0x05) | Platform Security | (0x06)
Processor | (0x07) | Power Supply | (0x08)
Power Unit | (0x09) | Cooling Device | (0x0a)
Other | (0x0b) | Memory | (0x0c)
Drive Slot / Bay | (0x0d) | POST Memory Resize | (0x0e)
System Firmwares | (0x0f) | Event Logging Disabled | (0x10)
Watchdog1 | (0x11) | System Event | (0x12)
Critical Interrupt | (0x13) | Button | (0x14)
Module / Board | (0x15) | Microcontroller | (0x16)
Add-in Card | (0x17) | Chassis | (0x18)
Chip Set | (0x19) | Other FRU | (0x1a)
Cable / Interconnect | (0x1b) | Terminator | (0x1c)
System Boot Initiated | (0x1d) | Boot Error | (0x1e)
OS Boot | (0x1f) | OS Critical Stop | (0x20)
Slot / Connector | (0x21) | System ACPI Power State | (0x22)
Watchdog2 | (0x23) | Platform Alert | (0x24)
Entity Presence | (0x25) | Monitor ASIC | (0x26)
LAN | (0x27) | Management Subsys Health | (0x28)
Battery | (0x29) | Session Audit | (0x2a)
Version Change | (0x2b) | FRU State | (0x2c)
… and choose one:
ipmitool -I lanplus -H 192.169.4.1 -U my_user -P my_password sdr type 0x0d
DBP/HDD0/FAIL | 21h | ok | 4.0 | Predictive Failure Deasserted
DBP/HDD1/FAIL | 22h | ok | 4.1 | Predictive Failure Deasserted
DBP/HDD2/FAIL | 23h | ok | 4.2 | Predictive Failure Deasserted
DBP/HDD3/FAIL | 24h | ok | 4.3 | Predictive Failure Deasserted
Disk Backplane Present:
ipmitool -I lanplus -H 192.169.4.1 -U my_user -P my_password sunoem cli "show /SYS/DBP/PRSNT"
Connected. Use ^D to exit.
-> show /SYS/DBP/PRSNT
/SYS/DBP/PRSNT
Targets:
Properties:
type = Entity Presence
ipmi_name = DBP/PRSNT
class = Discrete Sensor
value = Present
alarm_status = cleared
Check whether hard disk 0 has failed:
ipmitool -I lanplus -H 192.169.4.1 -U my_user -P my_password sunoem cli "show /SYS/DBP/HDD0/FAIL"
Connected. Use ^D to exit.
-> show /SYS/DBP/HDD0/FAIL
/SYS/DBP/HDD0/FAIL
Targets:
Properties:
type = Drive Slot
ipmi_name = DBP/HDD0/FAIL
class = Discrete Sensor
value = Predictive Failure Deasserted
alarm_status = cleared
We can also filter the queries:
ipmitool -I lanplus -H 192.169.4.1 -U my_user -P my_password sunoem led get all | grep -Ev "OFF|na|OK.*ON"
SYS/POWER/LED | ON
SYS/ALERT/LED | ON
SYS/PSFAIL/LED | ON
ipmitool -I lanplus -H 192.169.4.1 -U my_user -P my_password sunoem cli "show /SP/logs/event/list Severity==(critical) -level all -output table"
Connected. Use ^D to exit.
-> show /SP/logs/event/list Severity==(critical) -level all -output table
Target | Property | Value
--------------------+------------------------+---------------------------------
ID Date/Time Class Type Severity
----- ------------------------ -------- -------- --------
1726 Fri Oct 2 14:19:46 2015 IPMI Log critical ID = 4 : 10/02/2015 : 14:19:46 : System ACPI Power State : ACPI : S5/G2: soft-off : Asserted
1725 Fri Oct 2 14:19:46 2015 IPMI Log critical ID = 3 : 10/02/2015 : 14:19:46 : System ACPI Power State : ACPI : S0/G0: working : Deasserted
1724 Fri Oct 2 14:19:45 2015 IPMI Log critical ID = 2 : 10/02/2015 : 14:19:45 : Power Supply : PS0/PWROK : State Deasserted
1723 Fri Oct 2 14:19:42 2015 IPMI Log critical ID = 1 : 10/02/2015 : 14:19:42 : Power Supply : PS1/PWROK : State Deasserted
Check all temperatures:
ipmitool -I lanplus -H 192.169.4.1 -U my_user -P my_password sdr type temperature
MB/T_AMB0 | 01h | ok | 7.0 | 35 degrees C
MB/T_AMB1 | 02h | ok | 7.0 | 35 degrees C
MB/T_AMB2 | 03h | ok | 7.0 | 38 degrees C
P0/T_AMB | 32h | ok | 3.0 | 20 degrees C
P0/T_CORE | 33h | ok | 3.0 | 27 degrees C
P1/T_AMB | 42h | ok | 3.0 | 23 degrees C
P1/T_CORE | 43h | ok | 3.0 | 25 degrees C
P2/T_AMB | 52h | ok | 3.0 | 22 degrees C
P2/T_CORE | 53h | ok | 3.0 | 26 degrees C
P3/T_AMB | 62h | ok | 3.0 | 21 degrees C
P3/T_CORE | 63h | ok | 3.0 | 23 degrees C
P4/T_AMB | 72h | ok | 3.0 | 19 degrees C
P4/T_CORE | 73h | ok | 3.0 | 27 degrees C
P5/T_AMB | 82h | ok | 3.0 | 18 degrees C
P5/T_CORE | 83h | ok | 3.0 | 22 degrees C
P6/T_AMB | 92h | ok | 3.0 | 19 degrees C
P6/T_CORE | 93h | ok | 3.0 | 26 degrees C
P7/T_AMB | A2h | ok | 3.0 | 20 degrees C
P7/T_CORE | A3h | ok | 3.0 | 21 degrees C
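A listing like this is easy to turn into a simple threshold check. The sketch below assumes the reading sits in the fifth column as "NN degrees C", as in the output above; the function name hot_sensors and the limit of 35 are arbitrary choices for illustration, not recommended values.

```shell
# Reads `sdr type temperature` output on stdin and prints "name temp" for
# every sensor reading strictly above the limit given in $1 (degrees C).
hot_sensors() {
    awk -F'|' -v limit="$1" 'NF >= 5 && $5 ~ /degrees C/ {
        name = $1
        gsub(/^[ \t]+|[ \t]+$/, "", name)    # trim padding
        temp = $5 + 0                        # numeric prefix of " 38 degrees C"
        if (temp > limit + 0) printf "%s %d\n", name, temp
    }'
}

# Usage:
#   ipmitool -I lanplus -H 192.169.4.1 -U my_user -P my_password \
#       sdr type temperature | hot_sensors 35
```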
We can even power the server on or off. Here, power soft requests a graceful shutdown through ACPI:
ipmitool -I lanplus -H 192.169.4.1 -U my_user -P my_password power soft
Chassis Power Control: Soft