2014-10-16 110 views
5

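Linux OOM killer and Java process

I frequently hit a problem in production where the Tomcat process is killed by the Linux OOM killer.

Checking /var/log/messages shows "java Not tainted" and "java invoked oom-killer".

The JVM runs with -Xms20480m -Xmx20480m on a 32 GB box.

I see the crash below: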

Is the OOM causing this crash, or is the crash causing the OOM? How should I debug this problem?

# 
# A fatal error has been detected by the Java Runtime Environment: 
# 
# SIGSEGV (0xb) at pc=0x00007f4c3230aad7, pid=16248, tid=139964439296320 
# 
# JRE version: Java(TM) SE Runtime Environment (7.0_45-b18) (build 1.7.0_45-b18) 
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.45-b08 mixed mode linux-amd64 compressed oops) 
# Problematic frame: 
# V [libjvm.so+0x674ad7] JVM_Clone+0x97 
# 
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again 
# 
# If you would like to submit a bug report, please visit: 
# http://bugreport.sun.com/bugreport/crash.jsp 
# The crash happened outside the Java Virtual Machine in native code. 
# See problematic frame for where to report the bug. 
# 

--------------- T H R E A D --------------- 

Current thread (0x0000000001313800): JavaThread "http-bio-14080-exec-17" daemon [_thread_in_native, id=18943, stack(0x00007f4c029f7000,0x00007f4c02af8000)] 

siginfo:si_signo=SIGSEGV: si_errno=0, si_code=128(), si_addr=0x0000000000000000 

Registers: 
RAX=0x0000000000010000, RBX=0x80000001000004db, RCX=0x0000000302bd6fb0, RDX=0x00007f4c32acda50 
RSP=0x00007f4c02af2c88, RBP=0x00007f4c02af2d70, RSI=0x00007f4c02af2d80, RDI=0x00000000013139e8 
R8 =0x0000000000000001, R9 =0x00000002f8191228, R10=0x00007f4c2e231658, R11=0x00007f4c2e231638 
R12=0x0000000302bd6fb0, R13=0x00000007e0002420, R14=0x0000000000000000, R15=0x0000000001313800 
RIP=0x00007f4c3230aad7, EFLAGS=0x0000000000010202, CSGSFS=0x0000000000000033, ERR=0x0000000000000000 
    TRAPNO=0x000000000000000d 

Top of Stack: (sp=0x00007f4c02af2c88) 
0x00007f4c02af2c88: 00007f4c3230aa63 00000000129196d1 
0x00007f4c02af2c98: 00000002f817b620 00000000013139e8 
0x00007f4c02af2ca8: 00007f4c02af2d18 00000002f8191080 
0x00007f4c02af2cb8: 0000000000000009 0000000001313800 
0x00007f4c02af2cc8: 0000000000000000 0000000001313800 
0x00007f4c02af2cd8: 00000000138f5509 0000000000000000 
0x00007f4c02af2ce8: 00007f4c2e382ae8 0000000000000000 
0x00007f4c02af2cf8: 00007f4c02af2cf8 00000002f817b620 
0x00007f4c02af2d08: 00000002f8190fd8 00000007e01ab278 
0x00007f4c02af2d18: 00000002f8190ec0 00000007e0002420 
0x00007f4c02af2d28: 0000000000000000 00000007e0002420 
0x00007f4c02af2d38: 0000000000000000 0000000000000000 
0x00007f4c02af2d48: 00000007e0002420 0000000000000000 
0x00007f4c02af2d58: 00007f4c02af2de0 0000000000000000 
0x00007f4c02af2d68: 0000000001313800 00007f4c02af2dc0 
0x00007f4c02af2d78: 00007f4c2e2316c6 0000000302bd6fb0 
0x00007f4c02af2d88: 0000000000000000 00007f4c02af2e08 
0x00007f4c02af2d98: 00007f4c2e1bc8e1 00007f4c02af2db8 
0x00007f4c02af2da8: 00007f4c2e1bc8e1 00000002f8190fd8 
0x00007f4c02af2db8: 00000002f817b620 00007f4c02af2e28 
0x00007f4c02af2dc8: 00007f4c2e1bc233 00000007e1ec4e9f 
0x00007f4c02af2dd8: 00007f4c2e1bc233 0000000302bd6fb0 
0x00007f4c02af2de8: 00007f4c02af2de8 00000007e1b7a5db 
0x00007f4c02af2df8: 00007f4c02af2e30 00000007e37575a8 
0x00007f4c02af2e08: 0000000000000000 00000007e1b7a5e8 
0x00007f4c02af2e18: 00007f4c02af2de0 00007f4c02af2e40 
0x00007f4c02af2e28: 00007f4c02af2ed0 00007f4c2edaaf34 
0x00007f4c02af2e38: 00007f4c2edaaf34 00007f4c0000000a 
0x00007f4c02af2e48: 00000007e416ec20 0000000000000000 
0x00007f4c02af2e58: 00000007e416e2b8 00007f4c02af2ed0 
0x00007f4c02af2e68: 00007f4c2e1bc233 00007f4c02af2ed0 
0x00007f4c02af2e78: 00007f4c2e1bc233 000000000000000a 

Instructions: (pc=0x00007f4c3230aad7) 
0x00007f4c3230aab7: 85 60 02 00 00 06 00 00 00 4c 89 6d b0 4c 8b 23 
0x00007f4c3230aac7: 4d 85 e4 0f 84 68 02 00 00 49 8b 9d 20 01 00 00 
0x00007f4c3230aad7: 48 83 7b 10 f7 0f 87 6e 02 00 00 48 8b 43 10 48 
0x00007f4c3230aae7: 8d 50 08 48 3b 53 18 0f 87 9c 02 00 00 48 89 53 

Register to memory mapping: 

RAX=0x0000000000010000 is an unknown value 
RBX=0x80000001000004db is an unknown value 
RCX=0x0000000302bd6fb0 is an oop 
[Lcom.mycompany.MyClass$Type; 
- klass: 'com/mycompany/MyClass$Type'[] 
- length: 16 
RDX=0x00007f4c32acda50: <offset 0xe37a50> in /usr/java/jre64-1.7.0_45/jre/lib/amd64/server/libjvm.so at 0x00007f4c31c96000 
RSP=0x00007f4c02af2c88 is pointing into the stack for thread: 0x0000000001313800 
RBP=0x00007f4c02af2d70 is pointing into the stack for thread: 0x0000000001313800 
RSI=0x00007f4c02af2d80 is pointing into the stack for thread: 0x0000000001313800 
RDI=0x00000000013139e8 is an unknown value 
R8 =0x0000000000000001 is an unknown value 
R9 = 
[error occurred during error reporting (printing register info), id 0xb] 

Stack: [0x00007f4c029f7000,0x00007f4c02af8000], sp=0x00007f4c02af2c88, free space=1007k 
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) 
V [libjvm.so+0x674ad7] JVM_Clone+0x97 
J java.lang.Object.clone()Ljava/lang/Object; 
j com.mycompany.MyClass$Type.values()[Lcom/mycompany/MyClass$Type;+3 

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) 
J java.lang.Object.clone()Ljava/lang/Object; 
j com.mycompany.MyClass$Type.values()[Lcom/mycompany/MyClass$Type;+3 

Output from /var/log/messages:

myhostname kernel: java invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0 
myhostname kernel: java cpuset=/ mems_allowed=0 
myhostname kernel: Pid: 32307, comm: java Not tainted 2.6.39-400.209.1.el5uek #1 
myhostname kernel: Call Trace: 
myhostname kernel: [<ffffffff811136b4>] dump_header+0x94/0xe0 
myhostname kernel: [<ffffffff811137fd>] oom_kill_process+0x6d/0x160 
myhostname kernel: [<ffffffff811139ec>] out_of_memory+0xfc/0x210 
myhostname kernel: [<ffffffff811187ec>] __alloc_pages_slowpath+0x64c/0x660 
myhostname kernel: [<ffffffff811189b4>] __alloc_pages_nodemask+0x1b4/0x200 
myhostname kernel: [<ffffffff8111b140>] ? __do_page_cache_readahead+0xe0/0x170 
myhostname kernel: [<ffffffff81150893>] alloc_pages_current+0xb3/0x120 
myhostname kernel: [<ffffffff811100da>] __page_cache_alloc+0x9a/0xb0 
myhostname kernel: [<ffffffff8111097f>] page_cache_read+0x4f/0xb0 
myhostname kernel: [<ffffffff81111a54>] filemap_fault+0x174/0x270 
myhostname kernel: [<ffffffff81137a2c>] __do_fault+0x5c/0x550 
myhostname kernel: [<ffffffff81137fc6>] do_linear_fault+0x36/0x40 
myhostname kernel: [<ffffffff81510b6e>] ? call_function_interrupt+0xe/0x20 
myhostname kernel: [<ffffffff81138044>] handle_pte_fault+0x74/0x190 
myhostname kernel: [<ffffffff815106ae>] ? apic_timer_interrupt+0xe/0x20 
myhostname kernel: [<ffffffff8113828f>] handle_mm_fault+0x12f/0x1b0 
myhostname kernel: [<ffffffff8150b1cd>] do_page_fault+0x17d/0x4b0 
myhostname kernel: [<ffffffff8117b821>] ? user_path_at+0x11/0x20 
myhostname kernel: [<ffffffff81170516>] ? vfs_fstatat+0x56/0x90 
myhostname kernel: [<ffffffff8117067b>] ? vfs_stat+0x1b/0x20 
myhostname kernel: [<ffffffff81507cd5>] page_fault+0x25/0x30 
myhostname kernel: Mem-Info: 
myhostname kernel: Node 0 DMA per-cpu: 
myhostname kernel: CPU 0: hi: 0, btch: 1 usd: 0 
myhostname kernel: CPU 1: hi: 0, btch: 1 usd: 0 
myhostname kernel: CPU 2: hi: 0, btch: 1 usd: 0 
myhostname kernel: CPU 3: hi: 0, btch: 1 usd: 0 
myhostname kernel: CPU 4: hi: 0, btch: 1 usd: 0 
myhostname kernel: CPU 5: hi: 0, btch: 1 usd: 0 
myhostname kernel: Node 0 DMA32 per-cpu: 
myhostname kernel: CPU 0: hi: 186, btch: 31 usd: 0 
myhostname kernel: CPU 1: hi: 186, btch: 31 usd: 0 
myhostname kernel: CPU 2: hi: 186, btch: 31 usd: 0 
myhostname kernel: CPU 3: hi: 186, btch: 31 usd: 11 
myhostname kernel: CPU 4: hi: 186, btch: 31 usd: 30 
myhostname kernel: CPU 5: hi: 186, btch: 31 usd: 0 
myhostname kernel: Node 0 Normal per-cpu: 
myhostname kernel: CPU 0: hi: 186, btch: 31 usd: 0 
myhostname kernel: CPU 1: hi: 186, btch: 31 usd: 0 
myhostname kernel: CPU 2: hi: 186, btch: 31 usd: 0 
myhostname kernel: CPU 3: hi: 186, btch: 31 usd: 0 
myhostname kernel: CPU 4: hi: 186, btch: 31 usd: 0 
myhostname kernel: CPU 5: hi: 186, btch: 31 usd: 0 
myhostname kernel: active_anon:4372468 inactive_anon:275213 isolated_anon:0 
myhostname kernel: active_file:17 inactive_file:21 isolated_file:0 
myhostname kernel: unevictable:6002 dirty:24 writeback:4 unstable:0 
myhostname kernel: free:38369 slab_reclaimable:3708 slab_unreclaimable:9265 
myhostname kernel: mapped:1253 shmem:67 pagetables:10880 bounce:0 
myhostname kernel: Node 0 DMA free:15880kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15688kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes 
myhostname kernel: lowmem_reserve[]: 0 3000 32290 32290 
myhostname kernel: Node 0 DMA32 free:119288kB min:2136kB low:2668kB high:3204kB active_anon:30224kB inactive_anon:9152kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3072096kB mlocked:0kB dirty:16kB writeback:4kB mapped:36kB shmem:0kB slab_reclaimable:40kB slab_unreclaimable:232kB kernel_stack:24kB pagetables:48kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:67 all_unreclaimable? yes 
myhostname kernel: lowmem_reserve[]: 0 0 29290 29290 
myhostname kernel: Node 0 Normal free:17564kB min:20856kB low:26068kB high:31284kB active_anon:17459648kB inactive_anon:1091700kB active_file:120kB inactive_file:88kB unevictable:24008kB isolated(anon):0kB isolated(file):0kB present:29992960kB mlocked:24008kB dirty:80kB writeback:12kB mapped:4976kB shmem:268kB slab_reclaimable:14792kB slab_unreclaimable:36828kB kernel_stack:2768kB pagetables:43472kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:14972 all_unreclaimable? yes 
myhostname kernel: lowmem_reserve[]: 0 0 0 0 
myhostname kernel: Node 0 DMA: 0*4kB 1*8kB 0*16kB 0*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15880kB 
myhostname kernel: Node 0 DMA32: 458*4kB 494*8kB 284*16kB 55*32kB 7*64kB 4*128kB 5*256kB 7*512kB 7*1024kB 6*2048kB 20*4096kB = 119288kB 
myhostname kernel: Node 0 Normal: 3333*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 17428kB 
myhostname kernel: 115783 total pagecache pages 
myhostname kernel: 114466 pages in swap cache 
myhostname kernel: Swap cache stats: add 2183060, delete 2068594, find 1860109/1985849 
myhostname kernel: Free swap = 0kB 
myhostname kernel: Total swap = 2097148kB 
myhostname kernel: 8388592 pages RAM 
myhostname kernel: 134806 pages reserved 
myhostname kernel: 6845 pages shared 
myhostname kernel: 8210059 pages non-shared 
+0

Please post the relevant messages from /var/log/messages. – helmy 2014-10-16 12:03:32

+0

Added to the question. I have included the parts of the log that looked informative. – 2014-10-16 12:12:17

+1

dmesg | egrep -i 'killed process' – 2014-10-16 12:15:23

Answers

0

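It is more likely that the OOM problem is causing the crash. A common reason for a Java process to use more memory than the -Xmx setting is native memory. That can happen if you use a JNI library that allocates a lot of objects, or if you memory-map files, and so on.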
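As a rough illustration of the native-memory point, the sketch below lists the NIO buffer pools the JVM tracks outside the heap (direct and memory-mapped buffers). It uses the standard java.lang.management API available since Java 7; the class name NativeBufferReport is hypothetical. It only covers NIO buffers, so memory allocated by a JNI library through malloc would not show up here.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.List;

/** Hypothetical helper: reports off-heap NIO buffer usage known to the JVM. */
final class NativeBufferReport {

    /** Returns one line per buffer pool (typically "direct" and "mapped"). */
    static String describeBufferPools() {
        StringBuilder sb = new StringBuilder();
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            sb.append(String.format("%s: count=%d, used=%d bytes, capacity=%d bytes%n",
                    pool.getName(), pool.getCount(),
                    pool.getMemoryUsed(), pool.getTotalCapacity()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Allocate 1 MiB off-heap purely so the "direct" pool is non-empty in this demo.
        ByteBuffer demo = ByteBuffer.allocateDirect(1024 * 1024);
        System.out.print(describeBufferPools());
        System.out.println("demo buffer capacity=" + demo.capacity());
    }
}
```

If these pools are small but the resident size of the process is still far above the heap size, the extra memory is probably coming from somewhere the JVM does not account for, such as native allocations inside a JNI library.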

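You should try adding some log statements to your Java code that print how much memory Java thinks it is using, via Runtime.totalMemory and related methods. Then compare those values with what top reports for the process and see whether something else is consuming the memory.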
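A minimal sketch of such a log statement, assuming it is acceptable to call it periodically from application code. The class name HeapReport is hypothetical; the Runtime methods are the standard ones.

```java
/** Hypothetical helper: prints the heap figures the JVM itself reports. */
final class HeapReport {

    private static final long MIB = 1024L * 1024L;

    /** Formats the current heap figures from java.lang.Runtime in MiB. */
    static String describeHeap() {
        Runtime rt = Runtime.getRuntime();
        long total = rt.totalMemory(); // heap currently committed by the JVM
        long free = rt.freeMemory();   // unused part of the committed heap
        long max = rt.maxMemory();     // upper bound the heap may grow to (roughly -Xmx)
        long used = total - free;
        return String.format("heap used=%d MiB, committed=%d MiB, max=%d MiB",
                used / MIB, total / MIB, max / MIB);
    }

    public static void main(String[] args) {
        System.out.println(describeHeap());
    }
}
```

Compare the committed figure with the RES column that top shows for the same process. A resident size well above the committed heap suggests memory is being used outside the Java heap.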

0

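This is Linux limiting the number of threads.

The default limit on Linux is 1024, and the maximum is 65535.

You can raise the limit with ulimit -u. Note that ulimit -c unlimited, the command suggested in the crash log, only removes the core dump size limit; it does not change the thread limit.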
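One way to check whether the JVM is anywhere near a thread limit is to log its own thread counts and compare them with the limit reported by ulimit -u for the user running Tomcat. The sketch below uses the standard ThreadMXBean; the class name ThreadCountReport is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper: reports how many threads the JVM currently has. */
final class ThreadCountReport {

    /** Formats live, peak, and daemon thread counts as seen by the JVM. */
    static String describeThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return String.format("live threads=%d, peak=%d, daemon=%d",
                threads.getThreadCount(),
                threads.getPeakThreadCount(),
                threads.getDaemonThreadCount());
    }

    public static void main(String[] args) {
        System.out.println(describeThreads());
    }
}
```

These counts cover only this JVM, while the ulimit -u limit applies to all processes and threads of the same user, so treat the comparison as a lower bound.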