Linux kernel: oom-killer
Apache finally stopped serving requests. The machine runs a 2.6 kernel on a 64-bit OS with 2 GB of RAM.
/var/log/messages contained the following errors:
Mar 8 20:18:24 2hei-net kernel: oom-killer: gfp_mask=0x1d2
Mar 8 20:18:24 2hei-net kernel: Mem-info:
Mar 8 20:18:24 2hei-net kernel: Node 0 DMA per-cpu:
Mar 8 20:18:24 2hei-net kernel: cpu 0 hot: low 2, high 6, batch 1
Mar 8 20:18:24 2hei-net kernel: cpu 0 cold: low 0, high 2, batch 1
Mar 8 20:18:24 2hei-net kernel: cpu 1 hot: low 2, high 6, batch 1
Mar 8 20:18:24 2hei-net kernel: cpu 1 cold: low 0, high 2, batch 1
Mar 8 20:18:24 2hei-net kernel: cpu 2 hot: low 2, high 6, batch 1
Mar 8 20:18:24 2hei-net kernel: cpu 2 cold: low 0, high 2, batch 1
Mar 8 20:18:24 2hei-net kernel: cpu 3 hot: low 2, high 6, batch 1
Mar 8 20:18:24 2hei-net kernel: cpu 3 cold: low 0, high 2, batch 1
Mar 8 20:18:24 2hei-net kernel: Node 0 Normal per-cpu:
Mar 8 20:18:26 2hei-net kernel: cpu 0 hot: low 32, high 96, batch 16
Mar 8 20:18:26 2hei-net kernel: cpu 0 cold: low 0, high 32, batch 16
Mar 8 20:18:26 2hei-net kernel: cpu 1 hot: low 32, high 96, batch 16
Mar 8 20:18:26 2hei-net kernel: cpu 1 cold: low 0, high 32, batch 16
Mar 8 20:18:26 2hei-net kernel: cpu 2 hot: low 32, high 96, batch 16
Mar 8 20:18:26 2hei-net kernel: cpu 2 cold: low 0, high 32, batch 16
Mar 8 20:18:26 2hei-net kernel: cpu 3 hot: low 32, high 96, batch 16
Mar 8 20:18:26 2hei-net kernel: cpu 3 cold: low 0, high 32, batch 16
Mar 8 20:18:26 2hei-net kernel: Node 0 HighMem per-cpu: empty
Mar 8 20:18:26 2hei-net kernel:
Mar 8 20:18:26 2hei-net kernel: Free pages: 17536kB (0kB HighMem)
Mar 8 20:18:26 2hei-net kernel: Active:257583 inactive:239023 dirty:0 writeback:0 unstable:0 free:4384 slab:3787 mapped:497010 pagetables:2846
Mar 8 20:18:26 2hei-net kernel: Node 0 DMA free:11832kB min:44kB low:88kB high:132kB active:0kB inactive:0kB present:16384kB pages_scanned:0 all_unreclaimable? yes
Mar 8 20:18:26 2hei-net kernel: protections[]: 0 0 0
Mar 8 20:18:26 2hei-net kernel: Node 0 Normal free:5704kB min:5720kB low:11440kB high:17160kB active:1029692kB inactive:956732kB present:2080416kB pages_scanned:3188856 all_unreclaimable? yes
Mar 8 20:18:26 2hei-net kernel: protections[]: 0 0 0
Mar 8 20:18:26 2hei-net kernel: Node 0 HighMem free:0kB min:128kB low:256kB high:384kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar 8 20:18:26 2hei-net kernel: protections[]: 0 0 0
Mar 8 20:18:26 2hei-net kernel: Node 0 DMA: 6*4kB 4*8kB 2*16kB 3*32kB 2*64kB 2*128kB 2*256kB 1*512kB 0*1024kB 1*2048kB 2*4096kB = 11832kB
Mar 8 20:18:26 2hei-net kernel: Node 0 Normal: 0*4kB 1*8kB 4*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 0*2048kB 1*4096kB = 5704kB
Mar 8 20:18:26 2hei-net kernel: Node 0 HighMem: empty
Mar 8 20:18:26 2hei-net kernel: Swap cache: add 1070476, delete 1070476, find 119872/179715, race 0+20
Mar 8 20:18:26 2hei-net kernel: Free swap: 0kB
Mar 8 20:18:26 2hei-net kernel: 524200 pages of RAM
Mar 8 20:18:27 2hei-net kernel: 10214 reserved pages
Mar 8 20:18:27 2hei-net kernel: 67015 pages shared
Mar 8 20:18:27 2hei-net kernel: 0 pages swap cached
Mar 8 20:18:27 2hei-net kernel: Out of Memory: Killed process 26079 (httpd).
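Several counters in this dump are page counts rather than kilobytes. As a reading aid, here is a minimal sketch that converts them, assuming 4 kB pages (the smallest block size in the buddy lists, "6*4kB ..."). The names `PAGE_KB` and `pages_to_kb` are made up for this illustration.

```python
PAGE_KB = 4  # assumption: 4 kB pages, the order-0 size in the buddy lists

# Page counts copied from the dump above.
pages = {
    "ram": 524200,       # "524200 pages of RAM"
    "free": 4384,        # "free:4384"
    "active": 257583,    # "Active:257583"
    "inactive": 239023,  # "inactive:239023"
}


def pages_to_kb(n):
    """Convert a page count to kilobytes."""
    return n * PAGE_KB


kb = {name: pages_to_kb(n) for name, n in pages.items()}

for name, value in kb.items():
    print(f"{name:>8}: {value:>8} kB ({value / 1024:.1f} MiB)")

# Share of RAM sitting on the active + inactive lists.
in_use = (pages["active"] + pages["inactive"]) / pages["ram"]
print(f"active+inactive: {in_use:.1%} of RAM")
```

The converted free count (4384 pages, 17536 kB) matches the "Free pages: 17536kB" line, which supports the 4 kB assumption. Nearly all of the 2 GB is on the active and inactive lists.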
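This is how I finally got to know Linux's OOM Killer (Out of Memory: Killed process). When the system runs out of memory and can reclaim no more (note that free swap was also 0kB here), the kernel invokes the OOM Killer, which picks a process, typically one of the largest memory consumers, and kills it. The horror stories online include MySQL, Oracle, and Apache all being OOM-killed.
Since this machine runs nothing but Apache, I went back and looked at my Apache configuration: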
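To see in the log above how close to empty the machine was, one can compare each zone's `free` value with its `min` watermark. The following is only a rough sketch for reading the dump, not a description of how the kernel itself decides. The helper `zones_below_min` and the regular expression are my own, and the zone lines are copied from the log with the syslog prefix dropped.

```python
import re

# Zone summary lines copied from the log above, without the syslog prefix.
ZONE_LINES = [
    "Node 0 DMA free:11832kB min:44kB low:88kB high:132kB active:0kB "
    "inactive:0kB present:16384kB pages_scanned:0 all_unreclaimable? yes",
    "Node 0 Normal free:5704kB min:5720kB low:11440kB high:17160kB "
    "active:1029692kB inactive:956732kB present:2080416kB "
    "pages_scanned:3188856 all_unreclaimable? yes",
    "Node 0 HighMem free:0kB min:128kB low:256kB high:384kB active:0kB "
    "inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no",
]

ZONE_RE = re.compile(
    r"Node \d+ (\w+) free:(\d+)kB min:(\d+)kB .*?present:(\d+)kB"
)


def zones_below_min(lines):
    """Return (zone, free_kb, min_kb) for populated zones with free < min."""
    below = []
    for line in lines:
        m = ZONE_RE.search(line)
        if m is None:
            continue  # not a zone summary line
        zone = m.group(1)
        free_kb, min_kb, present_kb = (int(m.group(i)) for i in (2, 3, 4))
        if present_kb == 0:
            continue  # empty zone (HighMem on this 64-bit box)
        if free_kb < min_kb:
            below.append((zone, free_kb, min_kb))
    return below


for zone, free_kb, min_kb in zones_below_min(ZONE_LINES):
    print(f"{zone}: free {free_kb} kB is below min {min_kb} kB")
```

Only the Normal zone is reported: 5704 kB free against a 5720 kB minimum. With swap already exhausted, there was nothing left to hand out.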
<IfModule worker.c>
StartServers 2
MaxClients 2048
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 128
MaxRequests
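I had only wanted Apache to handle more connections. I did not expect the MaxClients setting to affect the stability of the whole system.
With this configuration, VIRT already showed 1702m right after startup, and RES started at 11m and climbed steadily from there.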
PID USER PR NI %CPU TIME+ %MEM VIRT RES SHR S COMMAND
7009 apache 16 0 1 0:06.95 0.8 1702m 11m 2172 S /home/local/apache/bin/httpd
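After a few days VIRT had passed 2 GB and RES was close to 1 GB, and today it finally went down: the httpd process was killed by Linux.
So I tried changing Apache's httpd.conf. With MaxClients set to 1536, VIRT at startup reached 1024m; with 1024 it was 776m. On a machine with 2 GB of RAM it is therefore best not to go above 1536.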
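With the worker MPM, MaxClients caps the total number of threads across all child processes, so the maximum number of children is MaxClients divided by ThreadsPerChild. The sketch below works that out for the three MaxClients values tried above and lists it next to the VIRT figures I observed at startup. The function name `max_child_processes` is made up for this illustration, and the VIRT numbers are measurements from this one machine, not a general formula.

```python
THREADS_PER_CHILD = 128  # from the worker.c block in this post

# MaxClients -> VIRT at startup in MB, as observed on this machine.
observed_virt_mb = {2048: 1702, 1536: 1024, 1024: 776}


def max_child_processes(max_clients, threads_per_child=THREADS_PER_CHILD):
    """Upper bound on worker children: ceil(MaxClients / ThreadsPerChild)."""
    return -(-max_clients // threads_per_child)


for max_clients in sorted(observed_virt_mb, reverse=True):
    procs = max_child_processes(max_clients)
    virt = observed_virt_mb[max_clients]
    print(f"MaxClients {max_clients}: up to {procs} children "
          f"x {THREADS_PER_CHILD} threads, VIRT at startup {virt}m")
```

That gives at most 16, 12, and 8 child processes respectively, each with 128 threads. Lowering MaxClients reduces how many such processes Apache may spawn under load.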
References:
http://lwn.net/Articles/104179/
Permalink: https://www.2hei.net/2009/03/08/linux_kernel_oomkiller/ | 2hei.net