jvm · 2023-02-13

Troubleshooting JVM memory problems

1. Purpose

Diagnose a Java program whose memory usage keeps climbing or that fails with OutOfMemoryError.

2. Inspect JVM memory

2.1 Find the process ID with jps

zxm@zxm-pc:~$ jps -l
2643 com.intellij.idea.Main
20068 org.jetbrains.jps.cmdline.Launcher
19990 org.jetbrains.idea.maven.server.RemoteMavenServer36
20231 sun.tools.jps.Jps
20079 org.example.DemoApplication

2.2 Inspect heap memory and object sizes

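  • jmap -heap <pid>: prints the heap configuration and current usage (this is the JDK 8 form; on JDK 9 and later the same report comes from jhsdb jmap --heap --pid <pid>)
  • jmap -histo <pid>: prints a heap histogram with the class name, instance count, and total bytes for each class
  • The arthas memory command also shows heap usage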

jmap -heap

zxm@zxm-pc:~$ jmap -heap 20079
Attaching to process ID 20079, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.341-b10

using thread-local object allocation.
Parallel GC with 8 thread(s)

Heap Configuration:
   MinHeapFreeRatio         = 0
   MaxHeapFreeRatio         = 100
   MaxHeapSize              = 134217728 (128.0MB)
   NewSize                  = 67108864 (64.0MB)
   MaxNewSize               = 67108864 (64.0MB)
   OldSize                  = 67108864 (64.0MB)
   NewRatio                 = 2
   SurvivorRatio            = 8
   MetaspaceSize            = 67108864 (64.0MB)
   CompressedClassSpaceSize = 58720256 (56.0MB)
   MaxMetaspaceSize         = 67108864 (64.0MB)
   G1HeapRegionSize         = 0 (0.0MB)

Heap Usage:
PS Young Generation
Eden Space:
   capacity = 23068672 (22.0MB)
   used     = 18431480 (17.57762908935547MB)
   free     = 4637192 (4.422370910644531MB)
   79.89831404252486% used
From Space:
   capacity = 22020096 (21.0MB)
   used     = 0 (0.0MB)
   free     = 22020096 (21.0MB)
   0.0% used
To Space:
   capacity = 22020096 (21.0MB)
   used     = 0 (0.0MB)
   free     = 22020096 (21.0MB)
   0.0% used
PS Old Generation
   capacity = 67108864 (64.0MB)
   used     = 61432632 (58.58672332763672MB)
   free     = 5676232 (5.413276672363281MB)
   91.54175519943237% used

13506 interned Strings occupying 1166688 bytes.

jmap -histo

zxm@zxm-pc:~$ jmap -histo 20079

 num     #instances         #bytes  class name
----------------------------------------------
   1:       2097152       50331648  org.example.bean.Person
   2:          6970       18589368  [Ljava.lang.Object;
   3:         33332        3143664  [C
   4:          3410         990312  [I
   5:         33109         794616  java.lang.String
   6:          7016         777240  java.lang.Class
   7:          7817         687896  java.lang.reflect.Method
   8:          1953         537376  [B
   9:         13958         446656  java.util.concurrent.ConcurrentHashMap$Node
  10:          2583         223504  [Ljava.util.HashMap$Node;
  11:          6782         217024  java.util.HashMap$Node
  12:          9549         215560  [Ljava.lang.Class;
  13:          5172         206880  java.util.LinkedHashMap$Entry
  14:           127         155568  [Ljava.util.concurrent.ConcurrentHashMap$Node;
  15:          2690         150640  java.util.LinkedHashMap
  16:          9117         145872  java.lang.Object
  17:          1429         102888  java.lang.reflect.Field
...               ...               ...                         ...
2926:             1             16  sun.util.calendar.Gregorian
2927:             1             16  sun.util.locale.InternalLocaleBuilder$CaseInsensitiveChar
2928:             1             16  sun.util.locale.provider.AuxLocaleProviderAdapter$NullProvider
2929:             1             16  sun.util.locale.provider.CalendarDataUtility$CalendarWeekParameterGetter
2930:             1             16  sun.util.locale.provider.SPILocaleProviderAdapter
2931:             1             16  sun.util.locale.provider.TimeZoneNameUtility$TimeZoneNameGetter
2932:             1             16  sun.util.resources.LocaleData
2933:             1             16  sun.util.resources.LocaleData$LocaleDataResourceBundleControl
Total       2310431       79979448
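
The first row is the one to look at: dividing #bytes by #instances gives the average shallow size per object, and dividing #bytes by the Total row gives the class's share of the histogram. The sketch below does that arithmetic for a single row. HistoRow and its methods are names made up for this illustration, and the parser assumes the JDK 8 four-column layout shown above.

```java
import java.util.Locale;

/** Parses one data row of `jmap -histo` output and derives per-class figures. */
class HistoRow {
    final long instances;
    final long bytes;
    final String className;

    HistoRow(long instances, long bytes, String className) {
        this.instances = instances;
        this.bytes = bytes;
        this.className = className;
    }

    /** Expects a row such as "1:  2097152  50331648  org.example.bean.Person". */
    static HistoRow parse(String line) {
        String[] cols = line.trim().split("\\s+");
        if (cols.length < 4) {
            throw new IllegalArgumentException("not a histogram row: " + line);
        }
        // cols[0] is the rank ("1:"), followed by #instances, #bytes, class name.
        return new HistoRow(Long.parseLong(cols[1]), Long.parseLong(cols[2]), cols[3]);
    }

    /** Average shallow size of one instance, in bytes. */
    double bytesPerInstance() {
        return (double) bytes / instances;
    }

    /** Share of the histogram total taken by this class, in percent. */
    double shareOf(long totalBytes) {
        return 100.0 * bytes / totalBytes;
    }

    public static void main(String[] args) {
        HistoRow row = parse("   1:       2097152       50331648  org.example.bean.Person");
        long totalBytes = 79979448L; // taken from the "Total" row of the output above
        System.out.printf(Locale.ROOT, "%s: %.1f bytes/instance, %.1f%% of total%n",
                row.className, row.bytesPerInstance(), row.shareOf(totalBytes));
    }
}
```

For the output above this works out to 24 bytes per Person instance and about 63% of all histogram bytes, which makes Person the first suspect.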

arthas memory command

[arthas@20079]$ memory 
Memory                                                              used                  total                  max                   usage                  
heap                                                                83M                   107M                   107M                  78.49%                 
ps_eden_space                                                       20M                   22M                    22M                   92.13%                 
ps_survivor_space                                                   0K                    21504K                 21504K                0.00%                  
ps_old_gen                                                          63M                   64M                    64M                   99.56%                 
nonheap                                                             53M                   56M                    360M                  14.83%                 
code_cache                                                          6M                    6M                     240M                  2.68%                  
metaspace                                                           41M                   43M                    64M                   64.66%                 
compressed_class_space                                              5M                    6M                     56M                   9.94%                  
direct                                                              16K                   16K                    -                     100.01%                
mapped                                                              0K                    0K                     -                     0.00%  
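
Similar per-pool figures can be read from inside a running JVM through the standard java.lang.management API. The following is a minimal sketch, not a description of how arthas itself is implemented. MemoryPools and describe are illustrative names, and the pool names printed depend on the garbage collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.Locale;

class MemoryPools {
    /** Formats used/committed/max for one memory area; "-" marks an undefined max. */
    static String describe(String name, MemoryUsage u) {
        if (u == null) {
            return name + " (pool is no longer valid)";
        }
        boolean hasMax = u.getMax() > 0; // getMax() returns -1 when no limit is defined
        String max = hasMax ? (u.getMax() / 1024) + "K" : "-";
        String usage = hasMax
                ? String.format(Locale.ROOT, "%.2f%%", 100.0 * u.getUsed() / u.getMax())
                : "-";
        return String.format(Locale.ROOT, "%-32s used=%dK total=%dK max=%s usage=%s",
                name, u.getUsed() / 1024, u.getCommitted() / 1024, max, usage);
    }

    public static void main(String[] args) {
        System.out.println(describe("heap",
                ManagementFactory.getMemoryMXBean().getHeapMemoryUsage()));
        System.out.println(describe("nonheap",
                ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage()));
        // One bean per pool, e.g. eden, survivor, old gen, metaspace, code cache.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(describe(pool.getName(), pool.getUsage()));
        }
    }
}
```

Here "total" is the committed size and usage is computed against max, so the percentages need not match another tool that uses a different denominator.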

2.3 Inspect GC

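  • jstat -gc <pid>: prints GC statistics, including collection counts, collection times, and memory usage (an interval in milliseconds and a sample count may be appended, e.g. jstat -gc <pid> 1000 5)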
zxm@zxm-pc:~$ jstat -gc 20079
 S0C    S1C    S0U    S1U      EC       EU        OC         OU       MC     MU    CCSC   CCSU   YGC     YGCT    FGC    FGCT     GCT   
21504.0 21504.0  0.0    0.0   22528.0  21104.4   65536.0    65248.6   44544.0 42249.4 6144.0 5681.1      7    0.102  10      3.768    3.869
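  • S0C: capacity of survivor space 0 (KB)
  • S1C: capacity of survivor space 1 (KB)
  • S0U: used size of survivor space 0 (KB)
  • S1U: used size of survivor space 1 (KB)
  • EC: capacity of the Eden space (KB)
  • EU: used size of the Eden space (KB)
  • OC: capacity of the old generation (KB)
  • OU: used size of the old generation (KB)
  • MC: capacity of the metaspace, which implements the method area on JDK 8 (KB)
  • MU: used size of the metaspace (KB)
  • CCSC: capacity of the compressed class space (KB)
  • CCSU: used size of the compressed class space (KB)
  • YGC: number of young generation GC events
  • YGCT: total time spent in young generation GC (seconds)
  • FGC: number of full GC events
  • FGCT: total time spent in full GC (seconds)
  • GCT: total GC time (seconds)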
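
Two derived figures are usually more telling than the raw columns: old generation occupancy (OU / OC) and the average pause per full GC (FGCT / FGC). The sketch below pairs a jstat header row with its value row and computes both. JstatGc and its method names are made up for this example, and the parser assumes the values use "." as the decimal separator, as in the output above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class JstatGc {
    /** Pairs each column name of the header row with the value printed below it. */
    static Map<String, Double> parse(String header, String values) {
        String[] names = header.trim().split("\\s+");
        String[] nums = values.trim().split("\\s+");
        if (names.length != nums.length) {
            throw new IllegalArgumentException("header/value column count mismatch");
        }
        Map<String, Double> row = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            row.put(names[i], Double.parseDouble(nums[i]));
        }
        return row;
    }

    /** Old generation occupancy in percent: OU / OC. */
    static double oldGenPercent(Map<String, Double> row) {
        return 100.0 * row.get("OU") / row.get("OC");
    }

    /** Average time per full GC in seconds: FGCT / FGC (0 when none has run). */
    static double avgFullGcSeconds(Map<String, Double> row) {
        double count = row.get("FGC");
        return count == 0 ? 0.0 : row.get("FGCT") / count;
    }

    public static void main(String[] args) {
        // Header and values copied from the jstat -gc output above.
        Map<String, Double> row = parse(
                "S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT",
                "21504.0 21504.0 0.0 0.0 22528.0 21104.4 65536.0 65248.6 44544.0 42249.4 "
                        + "6144.0 5681.1 7 0.102 10 3.768 3.869");
        System.out.println("old gen used %: " + oldGenPercent(row));
        System.out.println("avg full GC s: " + avgFullGcSeconds(row));
    }
}
```

For the sample above the old generation is about 99.56% full and a full GC takes roughly 0.38 s on average. Ten full GCs that leave the old generation this full are a strong hint that the objects in it are still referenced and cannot be reclaimed.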

The arthas dashboard command shows threads, memory, and GC activity in one view

[arthas@20079]$ dashboard
ID     NAME                                   GROUP               PRIORITY     STATE        %CPU         DELTA_TIME    TIME         INTERRUPTED  DAEMON       
-1     GC task thread#0 (ParallelGC)          -                   -1           -            0.0          0.000         0:3.317      false        true         
-1     GC task thread#5 (ParallelGC)          -                   -1           -            0.0          0.000         0:3.307      false        true         
-1     GC task thread#2 (ParallelGC)          -                   -1           -            0.0          0.000         0:3.300      false        true         
-1     GC task thread#3 (ParallelGC)          -                   -1           -            0.0          0.000         0:3.299      false        true         
-1     GC task thread#6 (ParallelGC)          -                   -1           -            0.0          0.000         0:3.291      false        true         
-1     GC task thread#4 (ParallelGC)          -                   -1           -            0.0          0.000         0:3.270      false        true         
-1     GC task thread#1 (ParallelGC)          -                   -1           -            0.0          0.000         0:3.226      false        true         
-1     GC task thread#7 (ParallelGC)          -                   -1           -            0.0          0.000         0:3.224      false        true         
40     DestroyJavaVM                          main                5            RUNNABLE     0.0          0.000         0:2.474      false        false        
-1     C1 CompilerThread3                     -                   -1           -            0.0          0.000         0:1.515      false        true         
-1     VM Periodic Task Thread                -                   -1           -            0.0          0.000         0:1.428      false        true         
-1     VM Thread                              -                   -1           -            0.0          0.000         0:1.208      false        true         
57     arthas-NettyHttpTelnetBootstrap-3-2    system              5            RUNNABLE     0.0          0.000         0:0.223      false        true         
23     Catalina-utility-2                     main                1            WAITING      0.0          0.000         0:0.177      false        false        
22     Catalina-utility-1                     main                1            TIMED_WAITIN 0.0          0.000         0:0.170      false        false        
26     http-nio-8080-exec-1                   main                5            WAITING      0.0          0.000         0:0.160      false        true         
36     http-nio-8080-ClientPoller             main                5            RUNNABLE     0.0          0.000         0:0.114      false        true         
25     http-nio-8080-BlockPoller              main                5            RUNNABLE     0.0          0.000         0:0.079      false        true         
58     arthas-command-execute                 system              5            TIMED_WAITIN 0.0          0.000         0:0.067      false        true         
-1     C2 CompilerThread1                     -                   -1           -            0.0          0.000         0:0.052      false        true         
-1     C2 CompilerThread2                     -                   -1           -            0.0          0.000         0:0.048      false        true         
-1     C2 CompilerThread0                     -                   -1           -            0.0          0.000         0:0.048      false        true         
50     arthas-NettyHttpTelnetBootstrap-3-1    system              5            RUNNABLE     0.0          0.000         0:0.030      false        true         
27     http-nio-8080-exec-2                   main                5            WAITING      0.0          0.000         0:0.030      false        true         
Memory                            used       total      max         usage      GC                                                                             
heap                              85M        107M       107M        79.88%     gc.ps_scavenge.count                    7                                      
ps_eden_space                     21M        22M        22M         98.87%     gc.ps_scavenge.time(ms)                 101                                    
ps_survivor_space                 0K         21504K     21504K      0.00%      gc.ps_marksweep.count                   10                                     
ps_old_gen                        63M        64M        64M         99.56%     gc.ps_marksweep.time(ms)                3767                                   
nonheap                           53M        56M        360M        14.86%                                                                                    
code_cache                        6M         6M         240M        2.71%                                                                                     
metaspace                         41M        43M        64M         64.70%                                                                                    
compressed_class_space            5M         6M         56M         9.94%                                                                                     
direct                            16K        16K        -           100.01%                                                                                   
mapped                            0K         0K         -           0.00%                                                                                     
Runtime                                                                                                                                                       
os.name                                                                        Linux                                                                          
os.version                                                                     5.15.0-58-generic                                                              
java.version                                                                   1.8.0_341                                                                      
java.home                                                                      /opt/jdk1.8.0_341/jre                                                          
systemload.average                                                             0.60                                                                           
processors                                                                     8                                                                              
timestamp/uptime                                                               Mon Feb 12 20:14:43 CST 2023/1661s      
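
The GC counters in the dashboard (count and time per collector) resemble what the standard GarbageCollectorMXBean interface reports. As a rough sketch, assuming only the java.lang.management API and not arthas internals, an application could log the same counters itself. GcCounters and snapshot are illustrative names.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcCounters {
    /** One line per collector: name, collection count, accumulated time in ms. */
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not report the value.
            lines.add(gc.getName() + " count=" + gc.getCollectionCount()
                    + " time(ms)=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : snapshot()) {
            System.out.println(line);
        }
    }
}
```

The collector names depend on the GC in use. With the Parallel GC shown in this article they would be expected to be PS Scavenge and PS MarkSweep.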

3. Generate a dump file

  • Automatic: let the JVM write a heap snapshot when an OutOfMemoryError is thrown. This requires starting it with -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/heapdump.hprof
  • Manual: jmap -dump:format=b,file=/tmp/heapdump.hprof <pid>
  • arthas: heapdump /tmp/heapdump.hprof
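
A fourth option, not covered by the list above, is to trigger the dump from inside the application through the HotSpot diagnostic MXBean. The sketch below assumes a HotSpot-based JVM. HeapDumper and dump are names chosen for this example, and the default path only mirrors the one used above.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;

class HeapDumper {
    /**
     * Writes an hprof dump of the current JVM to the given path.
     * live = true dumps only reachable objects, like jmap -dump:live.
     */
    static void dump(String path, boolean live) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(path, live); // throws IOException if the file already exists
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String path = args.length > 0 ? args[0] : "/tmp/heapdump.hprof";
        dump(path, true);
        System.out.println("heap dump written to " + path);
    }
}
```

Recent JDK releases also reject file names that do not end in .hprof, so keeping that extension is the safe choice.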

4. Analyze the dump file with MAT

MAT (Memory Analyzer Tool) is a fast, feature-rich Java heap analyzer that helps you find memory leaks and reduce memory consumption.

When is MAT useful?

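  • On an OutOfMemoryError: Full GC is triggered but cannot reclaim space, so the heap overflows
  • When a Java application server misbehaves, for example with a load spike, I/O errors, or deadlocked threads, analyzing the objects in the heap may help locate the cause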

4.1 Object histogram

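  • Shallow Heap: the memory taken by the object itself, excluding the objects it references. Example: a List object that occupies 4k.
  • Retained Heap: the object's own size plus the size of the objects that are kept alive only by it, that is, the memory freed once the object is garbage collected. Example: the List's 4k plus 123k for the User objects it holds.

    Note: if an ArrayList holds 100,000 objects of 16 bytes each and nothing else references them, removing the ArrayList frees 16 x 100,000 + X bytes, where X is the shallow size of the ArrayList. Retained heap therefore reflects what an object really costs more accurately than shallow heap, because all of its retained heap is released when the object is released.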

mat_histogram
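
To make the retained heap definition concrete, here is a toy model of it. The retained size of an object is the sum of the shallow sizes of everything that becomes unreachable from the GC roots once that object is removed. This is a simplified sketch on a hand-built graph with a single root, not MAT's actual algorithm. The class name, object ids, and sizes are all invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RetainedSize {
    /** Ids reachable from root without passing through `removed` (-1 removes nothing). */
    static Set<Integer> reachable(List<List<Integer>> refs, int root, int removed) {
        Set<Integer> seen = new HashSet<>();
        Deque<Integer> todo = new ArrayDeque<>();
        if (root != removed) {
            seen.add(root);
            todo.push(root);
        }
        while (!todo.isEmpty()) {
            for (int next : refs.get(todo.pop())) {
                if (next != removed && seen.add(next)) {
                    todo.push(next);
                }
            }
        }
        return seen;
    }

    /** Sum of shallow sizes of everything that becomes unreachable once target is gone. */
    static long retained(List<List<Integer>> refs, long[] shallow, int root, int target) {
        Set<Integer> before = reachable(refs, root, -1);
        Set<Integer> after = reachable(refs, root, target);
        long sum = 0;
        for (int id : before) {
            if (!after.contains(id)) {
                sum += shallow[id];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // 0 = GC root, 1 = list, 2 and 3 = elements referenced only by the list,
        // 4 = element that the root also references directly.
        List<List<Integer>> refs = Arrays.asList(
                Arrays.asList(1, 4),
                Arrays.asList(2, 3, 4),
                Collections.<Integer>emptyList(),
                Collections.<Integer>emptyList(),
                Collections.<Integer>emptyList());
        long[] shallow = {0, 24, 16, 16, 16};
        System.out.println("retained(list) = " + retained(refs, shallow, 0, 1)); // 56
    }
}
```

The list retains 24 + 16 + 16 = 56 bytes. Element 4 is not counted because the root still references it after the list is gone, which is why retained heap can be smaller than the sum of everything an object points to.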

4.2 Object references

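  • outgoing references: the objects that the current object refers to (fields of primitive type are not included; their values can be viewed in the inspector).
  • incoming references: the objects that directly refer to the current object. An object may have zero or more incoming references.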

mat_references

mat_incoming_references
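
Why primitive fields do not show up as outgoing references can be seen from the class definition: a primitive value is stored inside the object, while a field of reference type points at another heap object. The sketch below uses reflection to list the fields that would become outgoing references. Person here is a hypothetical bean written for this illustration, not the org.example.bean.Person class from the dump, and only fields declared directly on the class are inspected.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class OutgoingRefs {
    /** Hypothetical bean, only for this illustration. */
    static class Person {
        int age;        // primitive: stored inside the object, not a reference
        String name;    // reference: an outgoing reference to a String
        Person friend;  // reference: an outgoing reference to another Person
    }

    /** Names of the declared instance fields of `type` that hold object references. */
    static List<String> referenceFields(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue; // static fields belong to the class, not to an instance
            }
            if (!f.getType().isPrimitive()) {
                names.add(f.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(referenceFields(Person.class));
    }
}
```

For this Person the result contains name and friend but not age. Conversely, any object whose field points at a given Person, such as the backing array of a list that holds it, appears among that Person's incoming references.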

4.3 Thread stack at the time of the OOM

mat_to_gc_root

mat_to_gc_thread

mat_stack