1. Check the CPU
| lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 57 bits virtual
CPU(s): 160 # 160 logical CPUs
On-line CPU(s) list: 0-159
Thread(s) per core: 2 # each core supports 2 threads
Core(s) per socket: 40
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 106
Model name: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz
Stepping: 6
CPU MHz: 3000.000 # current operating frequency is 3000 MHz
BogoMIPS: 4600.00
Virtualization: VT-x
L1d cache: 3.8 MiB
L1i cache: 2.5 MiB
L2 cache: 100 MiB
L3 cache: 120 MiB
|
If the current operating frequency is lower than the nominal frequency, the CPU may not be running in high-performance mode.
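The frequency check above, and the relation between sockets, cores, and threads, can be sketched in Python. This is a minimal sketch, not a definitive tool: the function names are hypothetical, it assumes the nominal frequency appears in `Model name` as `@ x.xxGHz`, and it assumes the `CPU MHz` field is present, which is not the case for every `lscpu` version.

```python
import re


def parse_lscpu(text: str) -> dict:
    """Split `lscpu` output into a {field: value} dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def logical_cpus(fields: dict) -> int:
    """Sockets x cores per socket x threads per core."""
    return (int(fields["Socket(s)"])
            * int(fields["Core(s) per socket"])
            * int(fields["Thread(s) per core"]))


def nominal_mhz(fields: dict):
    """Nominal frequency from a model name like '... @ 2.30GHz', or None."""
    match = re.search(r"@\s*([\d.]+)\s*GHz", fields.get("Model name", ""))
    return float(match.group(1)) * 1000 if match else None


def is_below_nominal(fields: dict) -> bool:
    """True when the reported 'CPU MHz' is lower than the nominal frequency."""
    nominal = nominal_mhz(fields)
    current = fields.get("CPU MHz")
    if nominal is None or current is None:
        return False  # not enough information to decide
    return float(current) < nominal
```

For the output shown above, 2 sockets × 40 cores × 2 threads gives the 160 logical CPUs that `lscpu` reports, and 3000 MHz is above the nominal 2.30 GHz.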
2. Check memory
| free -h
total used free shared buff/cache available
Mem: 1.0Ti 319Gi 284Gi 9.2Gi 403Gi 673Gi
Swap: 0B 0B 0B
|
| dmidecode -t memory
Physical Memory Array
Location: System Board Or Motherboard
Use: System Memory
Error Correction Type: Single-bit ECC
Maximum Capacity: 12 TB # maximum supported memory capacity is 12 TB
Error Information Handle: Not Provided
Number Of Devices: 32 # the array has 32 memory slots
# Details of each memory module follow
Handle 0x0058, DMI type 17, 92 bytes
Memory Device
Array Handle: 0x0057
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 32 GB # size of a single memory module
Form Factor: DIMM
Set: None
Locator: CPU0_C0D0
Bank Locator: NODE 0
Type: DDR4 # memory type
Type Detail: Synchronous Registered (Buffered)
Speed: 3200 MT/s
Manufacturer: Samsung # vendor
Serial Number: H0GE000240426DAE90
Asset Tag: 042240
Part Number: M393A4K40EB3-CWE
Rank: 2
Configured Memory Speed: 3200 MT/s # millions of transfers per second
Minimum Voltage: 1.2 V
Maximum Voltage: 1.2 V
Configured Voltage: 1.2 V
Memory Technology: DRAM
Memory Operating Mode Capability: Volatile memory
Firmware Version: 0000
Module Manufacturer ID: Bank 1, Hex 0xCE
Module Product ID: Unknown
Memory Subsystem Controller Manufacturer ID: Unknown
Memory Subsystem Controller Product ID: Unknown
Non-Volatile Size: None
Volatile Size: 32 GB
Cache Size: None
Logical Size: None
...
|
Theoretical peak bandwidth of one module = 3200 MT/s × 64 / 8 bytes = 25,600 MB/s = 25.6 GB/s
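The bandwidth arithmetic can be written as a small helper. This is only a sketch: the function name and the `channels` parameter are hypothetical, and the number of memory channels actually populated is platform-specific and not given in the `dmidecode` output above, so it defaults to 1.

```python
def memory_peak_bandwidth_gb_s(speed_mt_s: float,
                               data_width_bits: int = 64,
                               channels: int = 1) -> float:
    """Theoretical peak bandwidth in GB/s (decimal units).

    transfers per second x bytes per transfer x number of channels.
    `channels` is an assumption supplied by the caller, not read from DMI.
    """
    bytes_per_transfer = data_width_bits / 8
    return speed_mt_s * 1e6 * bytes_per_transfer * channels / 1e9


# Values taken from the dmidecode output: 3200 MT/s, 64-bit data width.
print(memory_peak_bandwidth_gb_s(3200, 64))  # 25.6
```

Note that the 72-bit `Total Width` includes ECC bits; only the 64-bit `Data Width` carries payload, which is why 64 is used here.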
3. Check disks
| lsblk -o NAME,TYPE,SIZE,MODEL,UUID,MOUNTPOINT
NAME TYPE SIZE MODEL UUID MOUNTPOINT
loop0 loop 63.9M /snap/core20/2105
loop1 loop 63.9M /snap/core20/2318
loop3 loop 87M /snap/lxd/28373
loop4 loop 38.7M /snap/snapd/21465
loop5 loop 38.8M /snap/snapd/21759
loop6 loop 87M /snap/lxd/29351
sda disk 446.6G MR9560-8i
├─sda1 part 2M
├─sda2 part 3.8G ff254d69-c5ca-4766-84e9-99925a81d97e /boot
└─sda3 part 442.8G 05d3b075-25a6-427d-b5f1-5947f93bebca /
nvme0n1 disk 14T INTEL SSDPF2NV153TZ d7e9389d-3a71-447d-9d2f-9544a85de5e3 /data
nvme1n1 disk 14T INTEL SSDPF2NV153TZ d73c1906-7bc1-411e-92a8-a737ee16adae /data1
|
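In lsblk, a ROTA value of 0 indicates an SSD and 1 indicates an HDD; the column is not among those requested above, so add ROTA to the -o list to see it. The output also shows the product model, for example INTEL SSDPF2NV153TZ, which can be used to look up detailed specifications.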
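As an illustration of the ROTA rule, the sketch below classifies devices from the text printed by `lsblk -d -n -o NAME,ROTA` (`-d` skips partitions, `-n` drops the header). The function names are hypothetical, and the device names in the usage example are made up rather than taken from the machine above. A value of 0 strictly means "non-rotational", so virtual devices such as loop devices may also be labeled SSD by this simple rule.

```python
import subprocess


def classify_disks(lsblk_output: str) -> dict:
    """Map device name -> 'SSD' or 'HDD' from `NAME ROTA` lines."""
    kinds = {}
    for line in lsblk_output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        name, rota = parts
        kinds[name] = "HDD" if rota == "1" else "SSD"
    return kinds


def read_lsblk() -> str:
    """Run lsblk on the local machine (requires util-linux)."""
    result = subprocess.run(
        ["lsblk", "-d", "-n", "-o", "NAME,ROTA"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical sample input, not output from the machine in this article.
print(classify_disks("sdx 1\nnvme9n1 0\n"))
```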
| df -H | grep -vE '^Filesystem|tmpfs|cdrom|loop|udev' | awk '{ print $5 "/" $2 " " $1 }' | grep ' /'
12%/468G /dev/sda3
1%/16T /dev/nvme1n1
1%/16T /dev/nvme0n1
9%/4.0G /dev/sda2
|
4. Check PCI bandwidth
Disks are used as the example here.
| lspci | grep -iE "SATA|SAS|NVM Express|NVMe"
00:17.0 SATA controller: Intel Corporation Device 1ba2 (rev 11)
00:18.0 SATA controller: Intel Corporation Device 1bf2 (rev 11)
1a:00.0 Non-Volatile memory controller: Intel Corporation NVMe DC SSD [3DNAND, Sentinel Rock Controller]
3a:00.0 Non-Volatile memory controller: Intel Corporation NVMe DC SSD [3DNAND, Sentinel Rock Controller]
a6:00.0 RAID bus controller: Broadcom / LSI MegaRAID 12GSAS/PCIe Secure SAS39xx
|
| lspci -vv -s 1a:00.0 | grep -i "LnkCap:"
LnkCap: Port #0, Speed 16GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <512ns, L1 <16us
|
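The raw PCIe bandwidth of this NVMe device is 16 GT/s × 4 lanes / 8 bits per byte = 8 GB/s per direction; after 128b/130b encoding overhead this is about 7.877 GB/s. LnkCap reports the maximum the port is capable of, while the LnkSta line of the same lspci output shows the speed and width actually negotiated.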
The following is a quick reference table. Bandwidth figures are in GB/s per direction.
Version | Year introduced | Line encoding | Transfer rate per lane | x1 | x2 | x4 | x8 | x16 |
---|---|---|---|---|---|---|---|---|
1.0 | 2003 | 8b/10b | 2.5 GT/s | 0.250 | 0.500 | 1.000 | 2.000 | 4.000 |
2.0 | 2007 | 8b/10b | 5.0 GT/s | 0.500 | 1.000 | 2.000 | 4.000 | 8.000 |
3.0 | 2010 | 128b/130b | 8.0 GT/s | 0.985 | 1.969 | 3.938 | 7.877 | 15.754 |
4.0 | 2017 | 128b/130b | 16.0 GT/s | 1.969 | 3.938 | 7.877 | 15.754 | 31.508 |
5.0 | 2019 | 128b/130b | 32.0 GT/s | 3.938 | 7.877 | 15.754 | 31.508 | 63.015 |
6.0 | 2022 | 242B/256B | 64.0 GT/s | 7.563 | 15.125 | 30.250 | 60.500 | 121.000 |
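The figures in the table follow from the per-lane transfer rate, the encoding efficiency, and the lane count. A minimal sketch of that calculation, with hypothetical names and using only the three encodings listed in the table:

```python
# Fraction of transferred bits that carry payload for each line encoding.
ENCODING_EFFICIENCY = {
    "8b/10b": 8 / 10,
    "128b/130b": 128 / 130,
    "242B/256B": 242 / 256,
}


def pcie_bandwidth_gb_s(rate_gt_s: float, lanes: int, encoding: str) -> float:
    """Usable bandwidth per direction in GB/s.

    rate (GT/s per lane) x encoding efficiency x lanes / 8 bits per byte.
    """
    return rate_gt_s * ENCODING_EFFICIENCY[encoding] * lanes / 8


# The NVMe device above: 16 GT/s, x4, which corresponds to PCIe 4.0.
print(round(pcie_bandwidth_gb_s(16, 4, "128b/130b"), 3))  # 7.877
```

The same function gives the raw 8 GB/s figure if the efficiency is taken as 1, which is the simplification used in the one-line estimate for the NVMe device.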
5. Check network adapters
| lshw -class network
*-network:0
description: Ethernet interface
product: MT2894 Family [ConnectX-6 Lx]
vendor: Mellanox Technologies
physical id: 0
bus info: pci@0000:26:00.0
logical name: eth0
version: 00
serial: a0:88:c2:3a:23:90
capacity: 25Gbit/s
width: 64 bits
clock: 33MHz
capabilities: pciexpress vpd msix pm bus_master cap_list rom ethernet physical fibre 1000bt-fd 10000bt-fd 25000bt-fd autonegotiation
configuration: autonegotiation=on broadcast=yes driver=mlx5_core driverversion=23.10-0.5.5 duplex=full firmware=26.36.1010 (MT_0000000546) latency=0 link=yes multicast=yes port=fibre slave=yes
resources: iomemory:fff0-ffef irq:16 memory:ffff0000000-ffff1ffffff memory:a6a00000-a6afffff memory:ffff2800000-ffff2ffffff
*-network:1
description: Ethernet interface
product: MT2894 Family [ConnectX-6 Lx]
vendor: Mellanox Technologies
physical id: 0.1
bus info: pci@0000:26:00.1
logical name: eth1
version: 00
serial: a0:88:c2:3a:23:90
capacity: 25Gbit/s
width: 64 bits
clock: 33MHz
capabilities: pciexpress vpd msix pm bus_master cap_list rom ethernet physical fibre 1000bt-fd 10000bt-fd 25000bt-fd autonegotiation
configuration: autonegotiation=on broadcast=yes driver=mlx5_core driverversion=23.10-0.5.5 duplex=full firmware=26.36.1010 (MT_0000000546) latency=0 link=yes multicast=yes port=fibre slave=yes
resources: iomemory:fff0-ffef irq:17 memory:fffee000000-fffefffffff memory:a6900000-a69fffff memory:ffff2000000-ffff27fffff
|
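This shows two Ethernet interfaces on the machine, eth0 and eth1, each with a capacity of 25 Gbit/s. Both sit on PCI device 26:00 (functions 0 and 1), so they appear to be the two ports of a single ConnectX-6 Lx adapter.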
6. Check IB adapters
| ibdev2netdev
mlx5_0 port 1 ==> ibs10 (Up)
mlx5_1 port 1 ==> ibs11 (Up)
mlx5_4 port 1 ==> ibs18 (Up)
mlx5_5 port 1 ==> ibs19 (Up)
mlx5_bond_0 port 1 ==> bond1 (Up)
|
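This shows five RDMA devices, all of them Up. Four map to IB interfaces (ibs10, ibs11, ibs18, ibs19), and mlx5_bond_0 maps to the bonded interface bond1.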
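When many nodes have to be checked, the `ibdev2netdev` listing can be parsed instead of read by eye. The sketch below assumes every line has exactly the `device port N ==> netdev (State)` shape shown above; the function names are hypothetical.

```python
import re

# Matches lines such as: mlx5_0 port 1 ==> ibs10 (Up)
MAPPING_LINE = re.compile(r"^(\S+) port (\d+) ==> (\S+) \((\w+)\)$")


def parse_ibdev2netdev(text: str) -> list:
    """Return one dict per device/port mapping found in the output."""
    mappings = []
    for line in text.splitlines():
        match = MAPPING_LINE.match(line.strip())
        if match:
            mappings.append({
                "device": match.group(1),
                "port": int(match.group(2)),
                "netdev": match.group(3),
                "state": match.group(4),
            })
    return mappings


def devices_not_up(mappings: list) -> list:
    """Names of devices whose state is anything other than 'Up'."""
    return [m["device"] for m in mappings if m["state"] != "Up"]
```

For the output above, `parse_ibdev2netdev` yields five mappings and `devices_not_up` returns an empty list.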
| ibstat
CA 'mlx5_5'
CA type: MT4123
Number of ports: 1
Firmware version: 20.35.1012
Hardware version: 0
Node GUID: 0x946dae03008bc71c
System image GUID: 0x946dae03008bc71c
Port 1:
State: Active
Physical state: LinkUp
Rate: 200
Base lid: 118
LMC: 0
SM lid: 1
Capability mask: 0xa651e848
Port GUID: 0x946dae03008bc71c
Link layer: InfiniBand
|
Here, a Rate of 200 means the port's link speed is 200 Gbit/s.
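To compare the IB link speed with the byte-based figures used for memory and PCIe, the Rate value can be extracted and divided by 8. This is a rough sketch with hypothetical function names; it assumes the `CA '<name>'` and `Rate: <number>` line shapes shown in the `ibstat` output above and ignores link encoding overhead.

```python
import re


def parse_ibstat_rates(text: str) -> dict:
    """Map each CA name to the list of its port rates in Gbit/s."""
    rates = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        ca = re.match(r"CA '([^']+)'", line)
        if ca:
            current = ca.group(1)
            rates[current] = []
            continue
        rate = re.match(r"Rate:\s*(\d+(?:\.\d+)?)", line)
        if rate and current is not None:
            rates[current].append(float(rate.group(1)))
    return rates


def gbit_to_gbyte(rate_gbit_s: float) -> float:
    """Convert Gbit/s to GB/s (8 bits per byte, no overhead considered)."""
    return rate_gbit_s / 8


print(gbit_to_gbyte(200))  # 25.0
```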
7. Check accelerator devices
| nvidia-smi -L
GPU 0: NVIDIA A800-SXM4-80GB (UUID: GPU-ace08757-67d5-1c00-2885-56bffc0f9199)
GPU 1: NVIDIA A800-SXM4-80GB (UUID: GPU-a7f1053a-5d92-1d5a-2d65-621c95fb228d)
GPU 2: NVIDIA A800-SXM4-80GB (UUID: GPU-96effe18-1cee-e4b0-7120-3961ced74d58)
GPU 3: NVIDIA A800-SXM4-80GB (UUID: GPU-5b9a3601-4690-8fd7-baff-c2086e984e01)
GPU 4: NVIDIA A800-SXM4-80GB (UUID: GPU-3636db36-51b8-7209-79e6-cc05b5acb6ea)
GPU 5: NVIDIA A800-SXM4-80GB (UUID: GPU-e67ab276-2d74-30fc-6e97-79e0da803bb0)
GPU 6: NVIDIA A800-SXM4-80GB (UUID: GPU-b6a8e448-5b04-517d-5daa-6c9457e116f0)
GPU 7: NVIDIA A800-SXM4-80GB (UUID: GPU-9a128cae-7f75-e8f2-1c75-2b16ccacb4b8)
|
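The `nvidia-smi -L` listing is regular enough to turn into an inventory, for example to confirm that a node exposes the expected number of GPUs. A minimal sketch, assuming each line has the `GPU <index>: <name> (UUID: <uuid>)` shape shown above; the function name is hypothetical.

```python
import re

# Matches lines such as: GPU 0: NVIDIA A800-SXM4-80GB (UUID: GPU-ace0...)
GPU_LINE = re.compile(r"^GPU (\d+): (.+?) \(UUID: (GPU-[0-9a-fA-F-]+)\)$")


def parse_gpu_list(text: str) -> list:
    """Return one dict (index, name, uuid) per GPU line in the output."""
    gpus = []
    for line in text.splitlines():
        match = GPU_LINE.match(line.strip())
        if match:
            gpus.append({
                "index": int(match.group(1)),
                "name": match.group(2),
                "uuid": match.group(3),
            })
    return gpus
```

Applied to the listing above, `len(parse_gpu_list(output))` is 8 and every entry has the name `NVIDIA A800-SXM4-80GB`.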
| npu-smi info
+------------------------------------------------------------------------------------------------+
| npu-smi 23.0.2.1 Version: 23.0.2.1 |
+---------------------------+---------------+----------------------------------------------------+
| NPU Name | Health | Power(W) Temp(C) Hugepages-Usage(page)|
| Chip | Bus-Id | AICore(%) Memory-Usage(MB) HBM-Usage(MB) |
+===========================+===============+====================================================+
| 0 910B2C | OK | 384.6 79 0 / 0 |
| 0 | 0000:5A:00.0 | 69 0 / 0 61544/ 65536 |
+===========================+===============+====================================================+
| 1 910B2C | OK | 400.3 80 0 / 0 |
| 0 | 0000:19:00.0 | 70 0 / 0 60765/ 65536 |
+===========================+===============+====================================================+
| 2 910B2C | OK | 384.9 75 0 / 0 |
| 0 | 0000:49:00.0 | 68 0 / 0 60766/ 65536 |
+===========================+===============+====================================================+
| 3 910B2C | OK | 387.4 80 0 / 0 |
| 0 | 0000:39:00.0 | 71 0 / 0 60765/ 65536 |
+===========================+===============+====================================================+
| 4 910B2C | OK | 368.9 76 0 / 0 |
| 0 | 0000:DA:00.0 | 69 0 / 0 60765/ 65536 |
+===========================+===============+====================================================+
|