lmbench Performance Tool: Introduction and Detailed Result Analysis

Published: 2024/5/16 2:11:27, Thursday

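The lmbench Tool

1. Tool overview

lmbench measures the performance of the basic system calls provided by the OS. It focuses on two key characteristics: latency and bandwidth.

Main features of lmbench:

Bandwidth benchmarks: cached file read, memory copy, memory read, memory write, pipe, TCP.

Latency benchmarks: context switching, networking (connection establishment, pipe, TCP, UDP, and RPC hot potato), file system creation and deletion, process creation, signal handling, system call overhead, memory read latency.

Other: processor clock rate calculation.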

2. Installation and usage

1. Unpack the tool package:

# tar zxvf lmbench-3.0-a9.tgz
# cd lmbench-3.0-a9

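2. Remove any build files and results left over from earlier runs:

# (cd results && ls | grep -vi Makefile | xargs rm -rf)
# make clean

3. Configure and run once:

# make results

Configure the following parameters when prompted: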

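- MULTIPLE COPIES: the number of benchmark copies to run in parallel; this is the "scal load" item in the results.
- Job placement selection: how job scheduling is controlled; choose 1 to let the scheduler place jobs.
- Options to control job placement: choose 1.
- Memory: set this to slightly more than 4 times the cache size. A larger value gives more accurate results but a longer run.
- SUBSET: the subset of benchmarks to run, one of ALL/HARDWARE/OS/DEVELOPMENT.
- Email: best answered with no, so the run does not take too long.
- Leave the remaining options at their defaults.

4. Write out the results and view them:

# make see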

5. Run again:

# make rerun    [no need to reconfigure]
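The Memory guideline in step 3 (slightly more than 4 times the cache size) can be turned into a quick calculation. The sketch below is only an illustration: the helper name `lmbench_memory_mb`, the 25% safety margin, and the assumption that the prompt takes a value in MB are not from the original text.

```python
import math


def lmbench_memory_mb(cache_bytes, factor=4, margin=0.25):
    """Suggest a Memory value in MB: a bit more than `factor` x cache size.

    `margin` (25% by default) is an assumed reading of "slightly more than".
    """
    if cache_bytes <= 0:
        raise ValueError("cache_bytes must be positive")
    target_bytes = cache_bytes * factor * (1 + margin)
    # Round up to a whole number of MB, with a floor of 1 MB.
    return max(1, math.ceil(target_bytes / (1024 * 1024)))


if __name__ == "__main__":
    # Example: a hypothetical 2 MiB last-level cache.
    print(lmbench_memory_mb(2 * 1024 * 1024))
```

For a 2 MiB cache this suggests 10 MB. Larger values trade a longer run for more stable results, as noted above.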

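3. Result analysis

The results report the speed or latency of each test on the host. Latencies are mostly in microseconds; each table header states its own unit.

3.1. Basic system parameters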

Basic system parameters
------------------------------------------------------------------------------
Host                 OS Description              Mhz  tlb  cache  mem   scal
                                                     pages line   par   load
                                                           bytes
--------- ------------- ----------------------- ---- ----- ----- ------ ----
c-Lenovo- Linux 3.8.13.       i686-pc-linux-gnu 1731    84   128 3.4200    1

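tlb pages: number of pages covered by the TLB (translation lookaside buffer).

cache line bytes: size of a cache line in bytes.

mem par: memory hierarchy parallelism.

scal load: number of lmbench copies running in parallel.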

3.2. Processor performance [unit: μs; smaller is better]

Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                             call  I/O stat clos TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
c-Lenovo- Linux 3.8.13. 1731 0.19 0.36 1.48 3.05 7.60 0.53 2.32 497. 1474 3674

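null call: time to execute getppid.

null I/O: time A to read one byte from /dev/zero and time B to write one byte to /dev/null; the reported value is the average of A and B.

stat: time to stat a file (that is, to fetch its metadata).

open clos: total time to open a file and then close it (excluding the time to read the directory and inode).

slct TCP: time to select on 100 file descriptors over TCP connections.

sig inst: time to install a signal handler.

sig hndl: time to catch a signal.

fork proc: time to fork a process into two identical copies and have one of them exit.

exec proc: models how a shell process works: time to fork a new process and have it execute a new command.

sh proc: models the most common case: time to fork a new process and ask the system shell to find and run a new program.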
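The definitions above can be mirrored in a few lines of Python to get a feel for what each column measures. This is a rough sketch, not lmbench's implementation: the helper names are made up for illustration, and the numbers include Python interpreter overhead, so they will be noticeably larger than the values in the table.

```python
import os
import time


def avg_us(fn, iterations=10000):
    """Average wall-clock time of one fn() call, in microseconds."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        fn()
    return (time.perf_counter_ns() - start) / iterations / 1000.0


def null_call_us(iterations=10000):
    # "null call": cost of a trivial system call, getppid.
    return avg_us(os.getppid, iterations)


def null_io_us(iterations=10000):
    # "null I/O": average of a 1-byte read from /dev/zero (A)
    # and a 1-byte write to /dev/null (B).
    rfd = os.open("/dev/zero", os.O_RDONLY)
    wfd = os.open("/dev/null", os.O_WRONLY)
    try:
        read_us = avg_us(lambda: os.read(rfd, 1), iterations)
        write_us = avg_us(lambda: os.write(wfd, b"x"), iterations)
    finally:
        os.close(rfd)
        os.close(wfd)
    return (read_us + write_us) / 2


def stat_us(path, iterations=10000):
    # "stat": fetch a file's metadata.
    return avg_us(lambda: os.stat(path), iterations)


def open_close_us(path, iterations=10000):
    # "open clos": open a file and immediately close it.
    return avg_us(lambda: os.close(os.open(path, os.O_RDONLY)), iterations)


def fork_exit_us(iterations=200):
    # "fork proc": fork, let the child exit at once, reap it in the parent.
    def once():
        pid = os.fork()
        if pid == 0:
            os._exit(0)
        os.waitpid(pid, 0)
    return avg_us(once, iterations)


if __name__ == "__main__":
    target = "/dev/null"  # any existing file works here
    print(f"null call : {null_call_us():8.2f} us")
    print(f"null I/O  : {null_io_us():8.2f} us")
    print(f"stat      : {stat_us(target):8.2f} us")
    print(f"open clos : {open_close_us(target):8.2f} us")
    print(f"fork proc : {fork_exit_us():8.2f} us")
```

The relative ordering is the useful part: a null call should be far cheaper than stat or open/close, and fork should cost orders of magnitude more than any of them, which matches the shape of the table above.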

3.3. Arithmetic operations [unit: ns; smaller is better]

Integer:

Basic integer operations - times in nanoseconds - smaller is better
-------------------------------------------------------------------
Host                 OS  intgr  intgr  intgr  intgr  intgr
                          bit    add    mul    div    mod
--------- ------------- ------ ------ ------ ------ ------
c-Lenovo- Linux 3.8.13. 0.5800 0.2900 1.1600   13.1   12.0

Unsigned 64-bit integer (only four values are reported for the five columns; the int64 add column is assumed to be the empty one):

Basic uint64 operations - times in nanoseconds - smaller is better
------------------------------------------------------------------
Host                 OS  int64  int64  int64  int64  int64
                          bit    add    mul    div    mod
--------- ------------- ------ ------ ------ ------ ------
c-Lenovo- Linux 3.8.13.  1.040        3.7100   33.7   40.0

Float:

Basic float operations - times in nanoseconds - smaller is better
-----------------------------------------------------------------
Host                 OS  float  float  float  float
                          add    mul    div    bogo
--------- ------------- ------ ------ ------ ------
c-Lenovo- Linux 3.8.13. 1.1600 2.0300   13.7   13.3

Double:

Basic double operations - times in nanoseconds - smaller is better
------------------------------------------------------------------
Host                 OS  double double double double
                          add    mul    div    bogo
--------- ------------- ------ ------ ------ ------
c-Lenovo- Linux 3.8.13. 1.1600 2.3200   13.4   13.3

3.4. Context switching [unit: μs; smaller is better]

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------------------
Host                 OS  2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                         ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ------ ------ ------ ------ ------ ------- -------
c-Lenovo- Linux 3.8.13. 3.2100 1.0800 5.1300 4.5200 6.5200    13.0    29.9

Several processes are connected in a ring with Unix pipes. Each process reads a token from its own pipe, does some work, and then writes the token to the next process.

The context switching time consists of the time to switch processes plus the time to restore all of the process's state (including its cache state).

2p/0K: context switch time with 2 processes, each of size 0 (doing no work).

2p/16K: context switch time with 2 processes, each of size 16K (doing work).
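The token ring described above can be sketched with `os.pipe` and `os.fork`. This is a simplified illustration of the 0K case only: the function name is hypothetical, the processes do no work between reading and writing the token, and the pipe and interpreter overhead is not subtracted, so the result is an upper bound rather than a pure context switch time.

```python
import os
import time


def token_ring_hop_us(nprocs=2, laps=1000):
    """Average time for the token to move one hop around the ring, in us."""
    if nprocs < 2:
        raise ValueError("need at least 2 processes")
    # pipes[i] is the pipe that process i reads its token from.
    pipes = [os.pipe() for _ in range(nprocs)]
    children = []
    for i in range(1, nprocs):
        pid = os.fork()
        if pid == 0:
            rfd = pipes[i][0]
            wfd = pipes[(i + 1) % nprocs][1]
            try:
                for _ in range(laps):
                    token = os.read(rfd, 1)   # wait for the token
                    os.write(wfd, token)      # pass it to the next process
            finally:
                os._exit(0)
        children.append(pid)

    # The parent acts as process 0: it injects the token and waits for
    # it to come back around the ring once per lap.
    rfd = pipes[0][0]
    wfd = pipes[1][1]
    start = time.perf_counter_ns()
    for _ in range(laps):
        os.write(wfd, b"t")
        os.read(rfd, 1)
    elapsed_ns = time.perf_counter_ns() - start

    for pid in children:
        os.waitpid(pid, 0)
    for r, w in pipes:
        os.close(r)
        os.close(w)
    # Each lap crosses nprocs hops (one per process in the ring).
    return elapsed_ns / (laps * nprocs) / 1000.0


if __name__ == "__main__":
    for n in (2, 8, 16):
        print(f"{n}p/0K: {token_ring_hop_us(n):.2f} us per hop")
```

Reproducing the 16K and 64K columns would additionally require each process to touch a buffer of that size before forwarding the token, which is what makes cache state part of the measured cost.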

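3.5. Local communication latency [unit: μs; smaller is better]

*Local* Communication latencies in microseconds - smaller is better
---------------------------------------------------------------------
Host                 OS 2p/0K  Pipe AF     UDP  RPC/   TCP  RPC/ TCP
                        ctxsw       UNIX         UDP         TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
c-Lenovo- Linux 3.8.13. 3.210  16.0 12.6  12.9        21.5       103.

Only six values are reported for the eight columns; the two RPC columns are assumed to be the empty ones.

2p/0K ctxsw: context switch time with 2 processes, each of size 0 (doing no work).

Pipe: the so-called hot potato test. Two processes with no real work communicate over a Unix pipe and pass a token back and forth; the value is the average time for one round trip.

AF UNIX: same as Pipe, except that the two processes communicate over a Unix socket.

UDP: same as Pipe, except that the two processes communicate over UDP/IP.

RPC/UDP: same as Pipe, except that the two processes communicate via Sun RPC; by default RPC is carried over UDP.

TCP: same as Pipe, except that the two processes communicate over TCP/IP.

RPC/TCP: same as Pipe, except that the two processes communicate via Sun RPC, with RPC explicitly carried over TCP.

TCP conn: time to create an AF_INET (aka TCP/IP) socket and connect it to a remote host. This covers only socket creation and connection establishment, not other steps such as resolving the host name.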
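Two of the columns above, AF UNIX and TCP conn, are easy to approximate with the standard `socket` module. The sketch below is an illustration under assumptions: the function names are invented, the TCP server is a listener on 127.0.0.1 in the same process, and Python overhead is included in the timings.

```python
import os
import socket
import time


def unix_socket_round_trip_us(laps=1000):
    """Hot potato over a Unix socket pair: average round trip in us."""
    parent_end, child_end = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        parent_end.close()
        try:
            for _ in range(laps):
                child_end.sendall(child_end.recv(1))  # echo the token back
        finally:
            os._exit(0)
    child_end.close()
    start = time.perf_counter_ns()
    for _ in range(laps):
        parent_end.sendall(b"t")
        parent_end.recv(1)
    elapsed_ns = time.perf_counter_ns() - start
    os.waitpid(pid, 0)
    parent_end.close()
    return elapsed_ns / laps / 1000.0


def tcp_connect_us(count=100):
    """Average time to create an AF_INET socket and connect it, in us."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(16)
    addr = server.getsockname()
    total_ns = 0
    for _ in range(count):
        start = time.perf_counter_ns()
        client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        client.connect(addr)       # a numeric address, so no name lookup
        total_ns += time.perf_counter_ns() - start
        # Accept and close outside the timed region to drain the backlog.
        conn, _peer = server.accept()
        conn.close()
        client.close()
    server.close()
    return total_ns / count / 1000.0


if __name__ == "__main__":
    print(f"AF UNIX : {unix_socket_round_trip_us():.2f} us per round trip")
    print(f"TCP conn: {tcp_connect_us():.2f} us per connection")
```

Swapping `socket.socketpair()` for a connected UDP or TCP socket pair would give rough analogues of the UDP and TCP columns.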

3.6. File and VM latency [unit: μs; smaller is better]

File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host                 OS   0K File      10K File     Mmap    Prot   Page   100fd
                        Create Delete Create Delete Latency Fault  Fault  selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
c-Lenovo- Linux 3.8.13.   15.7   13.5   52.8   19.5   35.8K 0.341 3.79810 3.535

0K File Create: time to create a 0K file.

0K File Delete: time to delete a 0K file.

10K File Create / Delete: the same measurements for a 10K file.

Mmap Latency: maps the first n bytes of the given file into memory and then unmaps them, timing each map and unmap pair; the reported value is the maximum of these times.

Prot Fault: protection fault latency.

Page Fault: page fault latency.

100fd selct: time to select on 100 file descriptors.
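The file creation, mmap, and select measurements above can be approximated as follows. This is a sketch with hypothetical helper names; it uses a temporary directory and /dev/null as stand-ins for the files lmbench would use, and the timings include Python overhead.

```python
import mmap
import os
import select
import tempfile
import time


def create_delete_us(directory, size=0, count=200):
    """Return (create_us, delete_us) per file for files of `size` bytes."""
    payload = b"\0" * size
    names = [os.path.join(directory, f"lm_{i}") for i in range(count)]
    start = time.perf_counter_ns()
    for name in names:
        with open(name, "wb") as f:
            f.write(payload)
    create_ns = time.perf_counter_ns() - start
    start = time.perf_counter_ns()
    for name in names:
        os.unlink(name)
    delete_ns = time.perf_counter_ns() - start
    return create_ns / count / 1000.0, delete_ns / count / 1000.0


def mmap_unmap_max_us(path, nbytes, repeats=100):
    """Map the first nbytes of path, unmap, and return the worst time in us."""
    fd = os.open(path, os.O_RDONLY)
    worst_ns = 0
    try:
        for _ in range(repeats):
            start = time.perf_counter_ns()
            mapping = mmap.mmap(fd, nbytes, access=mmap.ACCESS_READ)
            mapping.close()
            worst_ns = max(worst_ns, time.perf_counter_ns() - start)
    finally:
        os.close(fd)
    return worst_ns / 1000.0


def select_us(nfds=100, repeats=1000):
    """Average time of one select() call over nfds descriptors, in us."""
    base = os.open("/dev/null", os.O_RDONLY)
    fds = [os.dup(base) for _ in range(nfds)]
    try:
        start = time.perf_counter_ns()
        for _ in range(repeats):
            select.select(fds, [], [], 0)  # timeout 0: poll, do not block
        elapsed_ns = time.perf_counter_ns() - start
    finally:
        for fd in fds + [base]:
            os.close(fd)
    return elapsed_ns / repeats / 1000.0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for size in (0, 10 * 1024):
            c, d = create_delete_us(tmp, size)
            print(f"{size // 1024}K file: create {c:.1f} us, delete {d:.1f} us")
        sample = os.path.join(tmp, "sample")
        with open(sample, "wb") as f:
            f.write(b"\0" * 65536)
        print(f"mmap latency (max): {mmap_unmap_max_us(sample, 65536):.1f} us")
    print(f"100fd select: {select_us():.2f} us")
```

Reporting the maximum for mmap, rather than the mean, follows the definition given above; a single slow mapping therefore dominates that column.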

3.7. Local communication bandwidth [unit: MB/s; bigger is better]

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------------------------
Host                 OS Pipe AF   TCP   File   Mmap  Bcopy  Bcopy  Mem   Mem
                             UNIX      reread reread (libc) (hand) read write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
c-Lenovo- Linux 3.8.13. 836. 1268 916. 1239.2 2947.9 1298.4 1360.6 2972 1829.

Pipe: a Unix pipe is created between two processes, with a chunk size of 64K, and 50M of data is moved through the pipe; the time this takes ...
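The Pipe setup described above (64K chunks, 50M of data) can be sketched like this. It is an illustration only: the function name is hypothetical, and converting the elapsed time into MB/s is an assumption based on the table's unit, since the original description is cut off.

```python
import os
import time


def pipe_bandwidth_mb_s(total_bytes=50 * 1024 * 1024, chunk=64 * 1024):
    """Move total_bytes through a Unix pipe in chunk-sized writes."""
    rfd, wfd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the writer.
        os.close(rfd)
        buf = b"\0" * chunk
        try:
            sent = 0
            while sent < total_bytes:
                sent += os.write(wfd, buf)
        finally:
            os._exit(0)
    # Parent: the reader, timing how long the transfer takes.
    os.close(wfd)
    received = 0
    start = time.perf_counter_ns()
    while received < total_bytes:
        data = os.read(rfd, chunk)
        if not data:
            break  # writer closed its end early
        received += len(data)
    elapsed_s = (time.perf_counter_ns() - start) / 1e9
    os.close(rfd)
    os.waitpid(pid, 0)
    return received / (1024 * 1024) / elapsed_s


if __name__ == "__main__":
    print(f"Pipe: {pipe_bandwidth_mb_s():.0f} MB/s")
```

A smaller chunk size increases the number of system calls per megabyte and usually lowers the measured bandwidth, which is one reason the chunk size is stated alongside the result.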
