Linux之random子系统问题解决分享

您所在的位置:网站首页 shrd指令 Linux之random子系统问题解决分享

Linux之random子系统问题解决分享

#Linux之random子系统问题解决分享| 来源: 网络整理| 查看: 265

文章目录 @[toc]问题信息问题描述 问题排查测试一、Hicore卡顿业务定位结论 测试二、系统资源信息分析结论 测试三、random相关分析结论一结论二总结 问题拓展random模块基本概念接口说明使用关注 hw_random模块基本概念接口说明使用说明 问题解决crng_ready方案trust_cpu方案hw_random方案random加速方案haveged工具 备注术语与缩写调试命令 问题信息 问题描述

T673 V1.1项目在集成测试Build 1期间, 使用T673设备升级系统测试Build1整机包测试启动耗时, 发现设备启动至进入应用用时1分钟左右, 耗时较其他项目(T671B等)时间长, 用户体验差,需要修复。

与应用同事沟通收集信息: T673 V1.1项目在今年四月份异常结项,未有关注启动耗时优化;T673 V1.1项目在今年九月份重新开启,至十月下旬的系统测试Build1期间未有关注启动耗时优化;在项目重启至系统测试B1之前有合入以下内容: 触摸屏新物料支持;PHY新物料支持;韦根驱动从基线同步;应用程序支持硬解密;其他; 问题排查 测试一、Hicore卡顿业务定位

应用同事在Hicore启动过程中增加时间戳信息,以初步定位耗时位置。调试整机固件, ACS_673_F1plus_CN_STD_V1.0.0_build221020.zip

测试日志

[10-20 09:33:14.152][E][anymouse]|acs_config_db_set_version|206|timetj exec start // 卡顿起始位置 [10-20 09:33:24.009][W][anymouse]|hardWatchDog|172|Feed dog is working properly. [10-20 09:33:34.025][W][anymouse]|hardWatchDog|172|Feed dog is working properly. [10-20 09:33:34.420][E][anymouse]|acs_config_db_set_version|212|timetj exec end // 卡顿消失位置 结论

​ 应用同事初步确认:卡顿长位置位于acs_config_db_set_version,用时约20s。

测试二、系统资源信息分析

top信息

Mem: 123584K used, 321324K free, 19752K shrd, 1088K buff, 70360K cached CPU: 52.3% usr 0.0% sys 0.0% nic 47.6% idle 0.0% io 0.0% irq 0.0% sirq Load average: 1.85 0.40 0.13 2/112 1000 PID PPID USER STAT VSZ %VSZ CPU %CPU COMMAND 955 951 root S 120m 27.7 0 52.3 [uicore] // uicore处于休眠状态,CPU占用较高 952 1 root S 178m 40.9 1 0.0 [hicore] 859 856 root S 37896 8.5 0 0.0 [log4j]

Uicore的CPU占用较高,做移除测试。 移除Uicore后,再次捕捉卡顿时间点的top信息

[01-25 01:14:25.143][E][anymouse]|acs_config_db_set_version|206|timetj exec start Enter Debug Mode. [root@dvrdvs ] # TMOUT=0 [root@dvrdvs ] # top -n1 Mem: 116800K used, 328108K free, 18524K shrd, 1076K buff, 64360K cached CPU: 0.0% usr 4.5% sys 0.0% nic 95.4% idle 0.0% io 0.0% irq 0.0% sirq Load average: 2.35 0.52 0.17 1/94 969 PID PPID USER STAT VSZ %VSZ CPU %CPU COMMAND 927 1 root S 178m 40.9 1 0.0 [hicore] 858 856 root S 28676 6.4 0 0.0 [log4j] 750 1 root S 14404 3.2 0 0.0 [bsp_log4j] 。。。 [01-25 01:14:51.871][E][anymouse]|acs_config_db_set_version|212|timetj exec end

perf分析文件,截图: 在这里插入图片描述

结论 从数据分析:CPU 95.4% idle, CPU占用少,IO占用低, 中断少。因为禁用uicore启动后,卡顿问题仍旧存在且时间上没有变化,所以排除掉uicore程序占用CPU高而导致的卡顿问题。如上说明top信息分析:CPU占用低、IO占用低、中断触发少,所以卡顿问题与负载无直接关联,可以关注其他性能分析方向,如: 锁竞争;资源同步;进程调度; 测试三、random相关分析 [11.16-15:44:27]/home/config/config.db init ok...[ 19.600532] random: hicore: uninitialized urandom read (256 bytes read) [11.16-15:44:27] [11.16-15:44:27][11-16 15:48:49.144][E][anymouse]|dev_param_cfg_file_init|113|timetj cccccccccccccccc [11.16-15:44:27][11-16 15:48:49.144][E][anymouse]|acs_config_db_set_version|206|timetj exec start // 应用调试打印 ... [11.16-15:44:36][ 28.118694] random: crng init done [11.16-15:44:36][11-16 15:48:57.673][E][anymouse]|acs_config_db_set_version|212|timetj exec end // 退出

经测试,每次进入(timetj exec start)、退出(timetj exec end)卡顿时都有随机妖数相关打印信息:

Hicore向随机随机数发生器读取256节点随机数:random: hicore: uninitialized urandom read (256 bytes read) ;

随机数发生器初始化完成:random: crng init done;

源码查找

gaoyang3513@General:~/Source_code/01-Nt9852x/04-T673/01-IBPPrj/SDK_T673_V1.1$ grep -rn "urandom read" . ./kernel/drivers/char/random.c:1896: "urandom read (%zd bytes read)\n", gaoyang3513@General:~/Source_code/01-Nt9852x/04-T673/01-IBPPrj/SDK_T673_V1.1$ grep -rn "crng init done" kernel/ kernel/drivers/char/random.c:962: pr_notice("random: crng init done\n");

栈回溯确认调用路径 测试补丁:

Index: kernel/drivers/char/random.c =================================================================== --- kernel/drivers/char/random.c (revision 523917) +++ kernel/drivers/char/random.c (working copy) @@ -238,6 +238,7 @@ * Eastlake, Steve Crocker, and Jeff Schiller. */ +#include "linux/printk.h" #include #include #include @@ -960,6 +961,7 @@ process_random_ready_list(); wake_up_interruptible(&crng_init_wait); pr_notice("random: crng init done\n"); + dump_stack(); if (unseeded_warning.missed) { pr_notice("random: %d get_random_xx warning(s) missed " "due to ratelimiting\n", @@ -1824,6 +1826,9 @@ urandom_warning.interval = 0; unseeded_warning.interval = 0; } + + printk("[%s|%u] -- GY\n", __FUNCTION__, __LINE__); + return 0; } early_initcall(rand_initialize); @@ -1891,10 +1896,12 @@ if (!crng_ready() && maxwarn > 0) { maxwarn--; - if (__ratelimit(&urandom_warning)) + if (__ratelimit(&urandom_warning)) { printk(KERN_NOTICE "random: %s: uninitialized " "urandom read (%zd bytes read)\n", current->comm, nbytes); + dump_stack(); + } spin_lock_irqsave(&primary_crng.lock, flags); crng_init_cnt = 0; spin_unlock_irqrestore(&primary_crng.lock, flags); 使用内核dump_stack实现栈回溯打印;在random模块初始化位置,增加打印信息;在打印random: hicore: uninitialized urandom read (256 bytes read) 之后,打印栈回溯信息;在打印random: crng init done之后,打印栈回溯信息;

打印信息如下:

[ 1.063632] nvt_rng f0680000.rng: Register nvt_rng_probe successfully // 硬件随机数驱动加载成功 ... [ 8.726708] random: fast init done // fast_pool初始化成功,crng_init=1; ... /home/config/config.db init ok...[ 9.267438] random: hicore: uninitialized urandom read (256 bytes read) // 获取random [ 9.276544] CPU: 0 PID: 946 Comm: hicore Tainted: P O 4.19.91 #2-ga8b08bf6-dirty [ 9.285224] Hardware name: Novatek Video Platform [ 9.289913] Backtrace: [ 9.292362] [] (dump_backtrace) from [] (show_stack+0x18/0x1c) [ 9.299914] r7:7ee32760 r6:60030013 r5:00000000 r4:80c890bc [ 9.305563] [] (show_stack) from [] (dump_stack+0x90/0xac) [ 9.312775] [] (dump_stack) from [] (urandom_read+0x84/0x2e4) [ 9.320239] r7:7ee32760 r6:00000100 r5:00000100 r4:80ccc784 [ 9.325889] [] (urandom_read) from [] (__vfs_read+0x40/0x15c) // vfs_read调用字符设备的read实际:urandom_read, [ 9.333357] r10:00000003 r9:00000100 r8:c10d7f60 r7:80c24508 r6:00000100 r5:8043b3d0 [ 9.341165] r4:c2a91200 [ 9.343692] [] (__vfs_read) from [] (vfs_read+0x98/0x104) [ 9.350812] r9:00000100 r8:c10d7f60 r7:7ee32760 r6:00000100 r5:00000000 r4:c2a91200 [ 9.358539] [] (vfs_read) from [] (ksys_read+0x64/0xc0) [ 9.365485] r9:00000100 r8:7ee32760 r7:80c24508 r6:c2a91201 r5:7ee32760 r4:c2a91200 [ 9.373211] [] (ksys_read) from [] (sys_read+0x10/0x14) [ 9.380157] r9:c10d6000 r8:80101204 r7:00000003 r6:00000015 r5:7ee32760 r4:00000100 [ 9.387885] [] (sys_read) from [] (ret_fast_syscall+0x0/0x58) [ 9.395347] Exception stack(0xc10d7fa8 to 0xc10d7ff0) [ 9.400385] 7fa0: 00000100 7ee32760 00000015 7ee32760 00000100 00000000 [ 9.408545] 7fc0: 00000100 7ee32760 00000015 00000003 00fcd90c 00000000 76f25000 7ee3273c [ 9.416702] 7fe0: 745ff4d0 7ee32700 00000000 74c07a24 BusyBox v1.31.1 (2022-11-20 20:01:27 CST) built-in shell (ash) Enter 'help' for a list of built-in commands. BusyBox v1.2.1 Protect Shell (psh svn326548) Build Time: Mar 17 2021:10:03:53 Enter 'help' for a list system commands. Random number is:18050941 # [ 24.626902] random: crng init done // crng初始化完成 [ 24.630304] CPU: 0 PID: 0 Comm: swapper/0 Tainted: P O 4.19.91 #2-ga8b08bf6-dirty [ 24.639067] Hardware name: Novatek Video Platform [ 24.643755] Backtrace: [ 24.646204] [] (dump_backtrace) from [] (show_stack+0x18/0x1c) [ 24.653756] r7:80c5bc68 r6:600e0193 r5:00000000 r4:80c890bc [ 24.659404] [] (show_stack) from [] (dump_stack+0x90/0xac) [ 24.666616] [] (dump_stack) from [] (crng_reseed.constprop.21+0x11c/0x1e8) [ 24.675208] r7:80c5bc68 r6:80c5bc68 r5:80c24508 r4:80c5bb2c [ 24.680858] [] (crng_reseed.constprop.21) from [] (credit_entropy_bits+0x178/0x31c) [ 24.690230] r9:0000004c r8:809bb505 r7:00000001 r6:80c5bb2c r5:80c5bb5c r4:00000080 [ 24.697958] [] (credit_entropy_bits) from [] (add_interrupt_randomness+0x1c0/0x1e4) [ 24.707333] r10:80c01ef0 r9:80169c44 r8:436c1000 r7:80c5bb80 r6:fffebd4b r5:80c24508 [ 24.715140] r4:c4232388 [ 24.717672] [] (add_interrupt_randomness) from [] (handle_irq_event_percpu+0x40/0x84) // 硬件随机数据生成器,在random中注册的回调; [ 24.727218] r10:80b5ca38 r9:80c00000 r8:c3c24000 r7:00000030 r6:80c08500 r5:00000001 [ 24.735027] r4:80c24508 [ 24.737555] [] (handle_irq_event_percpu) from [] (handle_irq_event+0x40/0x64) [ 24.746405] r6:80c08568 r5:80c08568 r4:80c08500 [ 24.751013] [] (handle_irq_event) from [] (handle_fasteoi_irq+0xc4/0x138) // CPU在每次处理完中断后发出EOI(end of interrupt), 由GIC自实现中断流控; [ 24.759518] r7:00000030 r6:80c08568 r5:80c24a78 r4:80c08500 [ 24.765166] [] (handle_fasteoi_irq) from [] (generic_handle_irq+0x20/0x30) [ 24.773757] r7:00000030 r6:80b71988 r5:00000000 r4:00000000 [ 24.779404] [] (generic_handle_irq) from [] (__handle_domain_irq+0xa8/0xbc) [ 24.788087] [] (__handle_domain_irq) from [] (gic_handle_irq+0x54/0x80) [ 24.796420] r9:80c00000 r8:80c24508 r7:c4803100 r6:80c01ef0 r5:80c24a78 r4:c4802100 [ 24.804147] [] (gic_handle_irq) from [] (__irq_svc+0x6c/0xa8) // 中断回调, [ 24.811609] Exception stack(0x80c01ef0 to 0x80c01f38) [ 24.816647] 1ee0: 00000000 00058e50 c422f48c 80117dc0 [ 24.824807] 1f00: 00000001 80c00000 80c24530 80c24500 80c24508 c45fcb00 80b5ca38 80c01f4c [ 24.832966] 1f20: 80c01f50 80c01f40 80108be8 80108bd8 600e0013 ffffffff [ 24.839564] r7:80c01f24 r6:ffffffff r5:600e0013 r4:80108bd8 [ 24.845215] [] (arch_cpu_idle) from [] (default_idle_call+0x30/0x34) [ 24.853290] [] (default_idle_call) from [] (do_idle+0xd8/0x128) [ 24.860928] [] (do_idle) from [] (cpu_startup_entry+0x20/0x28) [ 24.868479] r7:80c24500 r6:ffffffff r5:00000002 r4:000000c7 [ 24.874127] [] (cpu_startup_entry) from [] (rest_init+0xbc/0xdc) [ 24.881858] [] (rest_init) from [] (start_kernel+0x478/0x518) [ 24.889320] r5:80c9e48c r4:00000000 [ 24.892884] [] (start_kernel) from [] ( (null))

源码跟踪urandom_read

# kernel/drivers/char/random.c static ssize_t urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos) if (!crng_ready() && maxwarn > 0) { // crng未初始化完成 maxwarn--; if (__ratelimit(&urandom_warning)) // 在多次(10次)尝试仍失败时,问题打印 printk(KERN_NOTICE "random: %s: uninitialized " "urandom read (%zd bytes read)\n", current->comm, nbytes); nbytes = min_t(size_t, nbytes, INT_MAX >> (ENTROPY_SHIFT + 3)); extract_crng_user(buf, nbytes); int large_request = (nbytes > 256); while (nbytes) { if (large_request && need_resched()) { // 单次获取超过256字节时,主动调度释放CPU; schedule(); extract_crng(tmp); crng = &primary_crng; _extract_crng(crng, out); if (crng_ready() && // crng初始化完成,且超过老化时间时,重新写入熵值; (time_after(crng_global_init_time, crng->init_time) || time_after(jiffies, crng->init_time + CRNG_RESEED_INTERVAL))) crng_reseed(crng, crng == &primary_crng ? &input_pool : NULL); if (crng == &primary_crng && crng_init .read = urandom_read, // read为字符/dev/urandom的读实现 .write = random_write, .unlocked_ioctl = random_ioctl, }; # kernel/drivers/char/mem.c static const struct memdev devlist[] = { ... [8] = { "random", 0666, &random_fops, 0 }, [9] = { "urandom", 0666, &urandom_fops, 0 }, }; static int __init chr_dev_init(void) register_chrdev(MEM_MAJOR, "mem", &memory_fops); // 注册字符设备mem, 主设备 mem_class = class_create(THIS_MODULE, "mem"); // 创建设备类 for (minor = 1; minor int ret; if (flags & ~(GRND_NONBLOCK|GRND_RANDOM)) return -EINVAL; if (count > INT_MAX) // 限制最大读取长度(int值长度) count = INT_MAX; if (flags & GRND_RANDOM) // 如果指定使用random,替代urandom return _random_read(flags & GRND_NONBLOCK, buf, count); if (!crng_ready()) { // 如果crng未准备完成,等待 if (flags & GRND_NONBLOCK) // 如果传入的flags指定不阻塞,直接返回 return -EAGAIN; ret = wait_for_random_bytes(); // 否则挂起等待 if (likely(crng_ready())) // 如果crng准备完成,退出 return 0; return wait_event_interruptible(crng_init_wait, crng_ready()); // 提交至等待对列crng_init_wait, 条件为crng_ready() if (unlikely(ret)) return ret; } return urandom_read(NULL, buf, count, NULL); // 读取随机数 }

第二处进入条件为: 直接调用系统接口getrandom。 其中wait_event_interruptible为同步阻塞,所以系统调用getrandom历可能将线程挂起直至条件crng_ready()满足; 结合问题分析:在随机字节池未填满的前提下(crng_ready()不满足),Hicore陷入内核态后直接被挂起,不可能打印random: hicore: uninitialized urandom read (256 bytes read)(该打印在urandom_read中执行, 在唤醒后被调用)。

结论一

综上,有以下结论:

1. `/dev/urandom`不会引起进程阻塞,另外Hicore有调用`read("dev/urandom", xxx, 256)`,进而有的内核打印信息`random: hicore: uninitialized urandom read (256 bytes read)`; 2. `getrandom`可能会引起进程阻塞,如果进程被挂起,可能是调用了系统接口`getrandom`;

结合问题,random: hicore: uninitialized urandom read (256 bytes read)由用户使用urandom调用read(“dev/urandom”, xxx, 256)而有的内核打印信息。该流程无阻塞发生完整结束。

源码跟踪crng_reseed

# kernel/drivers/char/random.c #define crng_ready() (likely(crng_init > 1)) static void crng_reseed(struct crng_state *crng, struct entropy_store *r) ... if (crng == &primary_crng && crng_init unsigned int anfrac = min(pnfrac, pool_size/2); unsigned int add = ((pool_size - entropy_count)*anfrac*3) >> s; entropy_count += add; pnfrac -= anfrac; } while (unlikely(entropy_count crng_reseed(&primary_crng, r);

调用条件为:

用户主动调用ioctl(“/dev/random”, RNDADDTOENTCNT(或RNDADDENTROPY), xxx); RNDADDTOENTCNT, 仅增加熵池数量;RNDADDENTROPY, 向熵池增加若干随机字(单位:字节) 1; 熵池输入源一:add_timer_randomness; input_handle_event, input子系统event事件: 键盘、触摸等输入事件; 熵池输入源二:add_disk_randomness; blk_update_bidi_request, 块设备的的BIO请求:读、写磁盘(包括eMMC)动作; 熵池输入源三:add_interrupt_randomness; handle_irq_event_percpu, 所有中断的必经之路:所有硬中断; 熵池输入源三:add_hwgenerator_randomness; hwrng_fillfn, 内核线程"hwrng": 持续写入,最大延时1s; 其他: xfer_secondary_pool, 从主熵池推送到次熵池;push_to_pool, 从‘主熵池’推送到‘输出熵池’,工作对列回调;

由上可知credit_entropy_bits做为熵池填充的入口,与crng初始化完成有直接关系;

第二处:_extract_crng,

static void _extract_crng(struct crng_state *crng, __u8 out[CHACHA20_BLOCK_SIZE]) if (crng_ready() && (time_after(crng_global_init_time, crng->init_time) || time_after(jiffies, crng->init_time + CRNG_RESEED_INTERVAL))) crng_reseed(crng, crng == &primary_crng ? &input_pool : NULL);

必要条件

crng_ready(),crng初始化完成;

调用条件为

crng_reseed, 当入参entropy_store(熵池抽象类),即未指定熵池对象,递归一次;extract_crng, 当入参entropy_store(熵池抽象类)为input_pool,即指定主crng; crng_backtrack_protect, 规避chacha20-key回溯保护机制;urandom_read, 用户主动调用read(“/dev/urandom”, …);get_random_(bytes/u32/u64), 导出符号:涉及模块过多;

以上调用条件都基于必要条件:crng_ready()。

第三处:random_ioctl, /dev/randomr操作

static long random_ioctl(struct file *f, unsigned int cmd, unsigned long arg) case RNDRESEEDCRNG: if (crng_init


【本文地址】


今日新闻


推荐新闻


CopyRight 2018-2019 办公设备维修网 版权所有 豫ICP备15022753号-3