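This post covers how to analyze high memory usage and high CPU usage in a .NET process from a dump.
1. Analyzing high memory usage
Note: if a GC happens to be in progress at the moment the dump is taken, most commands run against that dump will fail, and !VerifyHeap will also report that the dump is inconsistent. The only fix is to capture a new dump. The error looks like this: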
The garbage collector data structures are not in a valid state for traversal.
It is either in the "plan phase," where objects are being moved around, or
we are at the initialization or shutdown of the gc heap. Commands related to
displaying, finding or traversing objects as well as gc heap segments may not
work properly. !dumpheap and !verifyheap may incorrectly complain of heap
consistency errors.
Could not request method table data for object 6E8B4D74 (MethodTable: FFE00F74).
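procdump -ma TranProc //Capture a full dump with procdump, open it in WinDbg, and load the SOS extension
!eeheap [-gc] [-loader] //Start here for a rough check of whether memory usage really is high and where it is concentrated. The LoaderHeap is usually small; the problem is most often in the GC heap.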
Total LoaderHeap size: Size: 0x2f000 (192512) bytes.
=======================================
Number of GC Heaps: 1
generation 0 starts at 0x02d01018
generation 1 starts at 0x02d0100c
generation 2 starts at 0x02d01000
ephemeral segment allocation context: none
segment begin allocated size
02d00000 02d01000 02e99ff4 0x198ff4(1675252)
Large object heap starts at 0x03d01000
segment begin allocated size
03d00000 03d01000 03d12408 0x11408(70664)
Total Size: Size: 0x1aa3fc (1745916) bytes.
------------------------------
GC Heap Size: Size: 0x1aa3fc (1745916) bytes.
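To compare the loader heap with the GC heap without reading hex by hand, the summary lines of !eeheap can be pulled out of a saved log. This is only a minimal sketch: the function name `eeheap_totals` is made up for illustration, and it assumes the summary lines have exactly the `Size: 0x... (n) bytes.` shape shown above.

```python
import re

# Matches summary lines such as "GC Heap Size: Size: 0x1aa3fc (1745916) bytes."
SIZE_RE = re.compile(
    r"^(?P<label>.+?):\s+Size:\s+0x(?P<hex>[0-9a-fA-F]+)\s+\((?P<dec>\d+)\)\s+bytes"
)

def eeheap_totals(text):
    """Return {label: size_in_bytes} for every summary line in !eeheap output.

    eeheap_totals is a hypothetical helper, not part of SOS.
    """
    totals = {}
    for line in text.splitlines():
        m = SIZE_RE.match(line.strip())
        if m:
            # Trust the hex value; the decimal in parentheses is the same number.
            totals[m.group("label")] = int(m.group("hex"), 16)
    return totals

# Summary lines copied from the !eeheap output above.
sample = """\
Total LoaderHeap size: Size: 0x2f000 (192512) bytes.
Total Size: Size: 0x1aa3fc (1745916) bytes.
GC Heap Size: Size: 0x1aa3fc (1745916) bytes.
"""

totals = eeheap_totals(sample)
loader = totals["Total LoaderHeap size"]
gc_heap = totals["GC Heap Size"]
print(f"loader heap: {loader} bytes, GC heap: {gc_heap} bytes")
print(f"GC heap is {gc_heap / loader:.1f}x the loader heap")
```

If the GC heap dominates, as it usually does, the next step is !DumpHeap -stat.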
!DumpHeap -stat //The biggest consumer is System.String: about 1.43 million instances taking roughly 171 MB
0x00007ffbae467df0 1,435,313 171,657,048 System.String
!DumpHeap -stat -gen 0 // Note: the -gen option is only supported by the DumpHeap in psscor2!
!DumpHeap -stat -gen 1
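!DumpHeap -stat -gen 2 // Look at the statistics for each of the three generations; most of the memory sits in gen 2, which matches what perfmon shows
!DumpHeap -stat -type System.String //Statistics restricted to this type show a similar result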
MT Count TotalSize Class Name
0x00007ffbae467df0 1,435,313 171,657,048 System.String
Total 1,435,313 objects, Total size: 171,657,048
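When the !DumpHeap -stat listing is long, it can help to rank the rows by total size offline. The sketch below assumes the four-column layout shown above (MT, Count, TotalSize, Class Name) with optional thousands separators; `parse_dumpheap_stat` is a hypothetical helper name, and the second sample row is invented purely so there is something to rank against.

```python
import re

# One row of "!DumpHeap -stat": MethodTable, Count, TotalSize, Class Name.
ROW_RE = re.compile(r"^(?:0x)?([0-9a-fA-F]{8,16})\s+([\d,]+)\s+([\d,]+)\s+(\S.*)$")

def parse_dumpheap_stat(text):
    """Parse !DumpHeap -stat rows and return them sorted by total size, largest first."""
    rows = []
    for line in text.splitlines():
        m = ROW_RE.match(line.strip())
        if not m:
            continue  # header, "Total ..." footer, or anything else
        mt, count, total, name = m.groups()
        rows.append({
            "mt": mt.lower(),
            "count": int(count.replace(",", "")),
            "total_size": int(total.replace(",", "")),
            "name": name.strip(),
        })
    return sorted(rows, key=lambda r: r["total_size"], reverse=True)

# The System.String row is copied from the output above.
# The System.Object[] row is HYPOTHETICAL, added only to show the ranking.
sample = """\
MT Count TotalSize Class Name
0x00007ffbae46a000 1,200 96,000 System.Object[]
0x00007ffbae467df0 1,435,313 171,657,048 System.String
Total 1,435,313 objects, Total size: 171,657,048
"""

ranked = parse_dumpheap_stat(sample)
top = ranked[0]
avg = top["total_size"] / top["count"]
print(f"{top['name']}: {top['count']} objects, {top['total_size']} bytes, avg {avg:.0f} bytes each")
```

The average size per instance is a useful hint: many small strings usually point at a collection that keeps growing rather than at a few huge buffers.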
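//The key questions now: why does the program create so many string objects, and why are they not collected? Find out who (which objects) references these strings, then locate the corresponding variables in the source and check whether the code is written incorrectly.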
!DumpHeap -type System.String
Address MT Size
02e928b8 6fbd2300 72
02e92900 6fbd2300 1168
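Pick the objects with the largest Size and inspect them, for example !do 02e92900, to see what the string contains and get a hint of where it was produced. Then run !gcroot 02e92900 to see where the object is rooted and why it has not been collected.
!gcroot -mt 0x00007ffbae467df0 //When you know only the type and not a specific instance, !gcroot -mt [MethodTable] is very useful: it shows which objects (with their <object address>, which you can pass to !do or !DumpMT) reference large numbers of the target objects (strings here). Unfortunately it cannot tell you how many target objects each container holds. Here one reference chain is Reporter->Report->List<string>->object[] and another is AppDomain->AppDomainSetup->object[]. The problem is of course in our own code, so analyze the source of Reporter and Report to fix it. If you do not have the dll, dump it with SaveModule and open it in ILSpy.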
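Picking the largest instances out of a long !DumpHeap -type listing is tedious by eye, so here is a rough sketch that does it on a saved log. It assumes each instance row is exactly three columns (Address, MT, Size) with the size in decimal, as in the output above; `largest_objects` is a made-up helper name.

```python
def largest_objects(text, top=5):
    """Return [(address, size)] for the biggest instances in !DumpHeap -type output.

    largest_objects is a hypothetical helper, not an SOS command.
    """
    objs = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        addr, mt, size = parts
        try:
            int(addr, 16)          # address column must be hex
            int(mt, 16)            # MethodTable column must be hex
            size_bytes = int(size) # size column is decimal
        except ValueError:
            continue  # header line or unrelated text
        objs.append((size_bytes, addr))
    objs.sort(reverse=True)
    return [(addr, size_bytes) for size_bytes, addr in objs[:top]]

# Rows copied from the !DumpHeap -type System.String output above.
sample = """\
Address MT Size
02e928b8 6fbd2300 72
02e92900 6fbd2300 1168
"""

biggest = largest_objects(sample, top=1)
for addr, size_bytes in biggest:
    # Print the follow-up commands to paste into the debugger.
    print(f"!do {addr}      (size {size_bytes})")
    print(f"!gcroot {addr}")
```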
Scan Thread 0 OSThread c10
RSP:cee858:Root: 0000000002b837c8(TranProc.Reporter)->
0000000002b83768(TranProc.Report)->
0000000002b83780(System.Collections.Generic.List`1[[System.String, mscorlib]])->
0000000013b85470(System.Object[])
……
Scan Thread 4 OSThread b90
rdx:Root: 00000000123de5f8(System.String)
r8:Root: 00000000123de5f8(System.String)
DOMAIN(0000000000E50EE0):HANDLE(Strong):213f8:Root: 0000000002b81478(System.AppDomain)->
0000000002b81560(System.AppDomainSetup)->
0000000002b815b0(System.Object[])
!do 0000000002b83768
Name: TranProc.Report
MethodTable: 00007ffb50dc3718
EEClass: 00007ffb50ef3640
Size: 24(0x18) bytes
GC Generation: 2
(D:\IT\debug\Exercise\Labdata\lab6\TranProc\Debug\TranProc.exe)
Fields:
MT Field Offset Type VT Attr Value Name
0000000000000000 4000007 8 0 instance 0000000002b83780 reportDetails
//Finally, save the dll with SaveModule and inspect it closely. This process is usually enough to pin down where the high memory usage comes from.
!SaveModule 00400000 d:\ProcessItems.dll
2. Analyzing high CPU usage
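High CPU often shows up together with hangs caused by lock contention or other thread-synchronization problems (a pure deadlock on its own burns no CPU, because the blocked threads just wait). The dump analysis below shows one way to approach it.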
!syncblk //Thread ba8 holds a lock on an AsyncRendering+EventHandlerObject instance at address 000000000559f920
Index SyncBlock MonitorHeld Recursion Owning Thread Info SyncBlock Owner
802 000000001e086658 11 1 000000000404cf30 ba8 10 000000000559f920 Microsoft.Windows.ManagementUI.CombinedControls.AsyncRendering+EventHandlerObject
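The MonitorHeld column is worth decoding. As far as I know from the usual descriptions of SOS, the owner contributes 1 to this counter and every waiting thread contributes 2, so the value 11 above would mean one owner plus five waiters. Treat that encoding as an assumption and check it against your SOS version; `decode_monitor_held` is just an illustrative name.

```python
def decode_monitor_held(value):
    """Split the MonitorHeld counter into (has_owner, waiter_count).

    Assumes the commonly described encoding: +1 for the owning thread,
    +2 for each thread waiting on the lock.
    """
    has_owner = value % 2 == 1
    waiters = value // 2
    return has_owner, waiters

# MonitorHeld value taken from the !syncblk row above.
has_owner, waiters = decode_monitor_held(11)
print(f"owner present: {has_owner}, waiting threads: {waiters}")
```

Several threads queued on one lock is a strong hint that the owner, thread ba8 here, deserves a closer look.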
!threads //Two AppDomains are visible (406e3b0 and 40f8d10), each with one STA UI thread
ID OSID ThreadOBJ State GC GC Alloc Context Domain Count APT Exception
0 1 10d8 0000000004076870 4220 Enabled 00000000057713c8:0000000005771430 000000000406e3b0 0 STA System.IO.FileNotFoundException (0000000005205eb0)
7 3 1e90 00000000040f25d0 2007020 Enabled 0000000005776a10:0000000005777430 00000000040f8d10 2 STA
10 6 ba8 000000000404cf30 380b220 Enabled 00000000056ee470:00000000056efb50 00000000040f8d10 2 MTA (Threadpool Worker)
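!syncblk reports the lock owner by OS thread ID (ba8), while the `~Ns` command wants the debugger thread number, so the two listings have to be joined. A small sketch, assuming the first three columns of each !threads row are debugger ID, managed ID, and OSID as above; `threads_by_osid` is a hypothetical helper.

```python
def threads_by_osid(text):
    """Map OS thread ID (lowercase hex string) -> debugger thread number."""
    mapping = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        dbg_id, managed_id, osid = parts[:3]
        # Skip the header and rows without a numeric debugger ID.
        if not (dbg_id.isdigit() and managed_id.isdigit()):
            continue
        mapping[osid.lower()] = int(dbg_id)
    return mapping

# Rows copied from the !threads output above.
sample = """\
ID OSID ThreadOBJ State GC GC Alloc Context Domain Count APT Exception
0 1 10d8 0000000004076870 4220 Enabled 00000000057713c8:0000000005771430 000000000406e3b0 0 STA System.IO.FileNotFoundException (0000000005205eb0)
7 3 1e90 00000000040f25d0 2007020 Enabled 0000000005776a10:0000000005777430 00000000040f8d10 2 STA
10 6 ba8 000000000404cf30 380b220 Enabled 00000000056ee470:00000000056efb50 00000000040f8d10 2 MTA (Threadpool Worker)
"""

osid_map = threads_by_osid(sample)
owner = osid_map["ba8"]  # OSID of the lock owner reported by !syncblk
print(f"lock owner is debugger thread {owner}; switch to it with ~{owner}s")
```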
~10s //Switch to thread 10
!clrstack //Look at the stack of this thread: it appears to be waiting on a WaitHandle, and nothing else stands out
000000001e97e760 000007feebb7250b System.Threading.WaitHandle.WaitOne(Int64, Boolean)
000000001e97e7a0 000007feeb537707 System.Windows.Forms.Control.WaitForWaitHandle(System.Threading.WaitHandle)
~7s //Switch to thread 7
!clrstack //Everything here is code in MIGUIControls.dll, so save the module and take a look!
Microsoft.Windows.ManagementUI.CombinedControls.AsyncRendering.RenderValueInt(Microsoft.Windows.ManagementUI.CombinedControls.RenderingContext, Microsoft.Windows.ManagementUI.CombinedControls.RenderValueHandler, Boolean, System.String ByRef)
Microsoft.Windows.ManagementUI.CombinedControls.AsyncRendering.AddRenderingToBatch(Int32, Microsoft.Windows.ManagementUI.CombinedControls.RenderingContext, Microsoft.Windows.ManagementUI.CombinedControls.RenderValueHandler)
Microsoft.Windows.ManagementUI.CombinedControls.CrimsonEvent.GetEventMessage(Boolean, Microsoft.Windows.ManagementUI.CombinedControls.RenderValueHandler, Int32, Int32)
Microsoft.Windows.ManagementUI.CombinedControls.CrimsonEvent.GetMessage(Microsoft.Windows.ManagementUI.CombinedControls.RenderValueHandler, Int32, Int32)
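!runaway //Check how much time each thread has consumed. A hang is usually a deadlock, an infinite loop, or code that is simply too slow. The result shows that thread 7 has run the longest, while thread 10 holds the lock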
User Mode Time
Thread Time
7:1e90 0 days 0:00:02.090
0:10d8 0 days 0:00:00.733
10:ba8 0 days 0:00:00.655
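With many threads, sorting the !runaway rows by time makes the busy ones obvious. The sketch below assumes the `N:osid D days H:MM:SS.mmm` row format shown above; `parse_runaway` is a made-up name for illustration.

```python
import re

# One !runaway row, e.g. "7:1e90 0 days 0:00:02.090"
ROW_RE = re.compile(
    r"^\s*(\d+):([0-9a-fA-F]+)\s+(\d+) days (\d+):(\d{2}):(\d{2})\.(\d{3})\s*$"
)

def parse_runaway(text):
    """Return [(debugger_thread, osid, user_time_ms)] sorted by time, busiest first."""
    rows = []
    for line in text.splitlines():
        m = ROW_RE.match(line)
        if not m:
            continue  # "User Mode Time" / "Thread Time" headers
        tid, osid, days, hours, minutes, seconds, millis = m.groups()
        total_ms = (
            ((int(days) * 24 + int(hours)) * 60 + int(minutes)) * 60 + int(seconds)
        ) * 1000 + int(millis)
        rows.append((int(tid), osid.lower(), total_ms))
    return sorted(rows, key=lambda r: r[2], reverse=True)

# Output copied from the !runaway listing above.
sample = """\
User Mode Time
Thread Time
7:1e90 0 days 0:00:02.090
0:10d8 0 days 0:00:00.733
10:ba8 0 days 0:00:00.655
"""

busiest = parse_runaway(sample)
for tid, osid, ms in busiest:
    print(f"thread {tid} (OSID {osid}): {ms} ms user time")
```

A single dump only shows accumulated time. Taking two dumps a few seconds apart and comparing the per-thread totals shows which threads are burning CPU right now.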
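!SaveAllModules d:\ //Save the dlls first. The slowness is almost certainly in the user's own code, most likely this MIGUIControls.dll; the places to look are the slow thread 7 and the lock-holding thread 10. How to analyze the specific problem from there takes accumulated experience.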