Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
293 views
in Technique[技术] by (71.8m points)

java - Profiled app with YourKit, still can't identify the CPU hog

I've got a Java application that is consuming 100% of the CPU most of the time (as indicated by cacti and top monitoring). We fired up YourKit (which confirms the CPU resource issue) and it identifies java.net.SocketInputStream.read(byte[], int, int) as the biggest hot spot at 15% of time. I believe they aren't accurately measuring CPU time for methods that perform blocking IO like SocketInputStream.read would.

There are 6 other identified hot spots, but they account for less than 20% of accounted for CPU time combined. all in the 5%-1% range.

So I know I have a problem, I can see the problem, YourKit does too, but I am no closer to identifying the actual problem.

I am pretty new to using a profiler, and am most likely missing something. Any ideas?

EDIT: Sean makes a good point about using tools built into the system. If I use top and shift+h to view threads, it displays anywhere from 7-15 threads, and the CPU utilization jumps around. I don't believe it's any one thread that is causing the problem, rather it is a piece of code each thread executes at some time.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

I would recommend running this on a Solaris box if you can. If you don't have a Solaris box consider setting a Virtual Machine up with Open Solaris running on it.

Solaris offers a tool called prstat

Prstat works much like top which most people are familiar with. The important difference is prstat can break the processes up for you and show each thread within a process.

For your case the usage would be prstat -L 0 1

Paired with a thread dump (doing this in a script is preferred) you can match the LWPID together to find exactly which thread is the CPU hog.

Here is a functional example (I created a small app going in a big loop for poc)

Standard Top will show you something like the following

 PID USERNAME NLWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
  924 username   10  59    0   31M   11M run      0:53 36.02% java

Then using prstat The following command was used

 prstat -L 0 1 | grep java > /export/home/username/Desktop/output.txt

And the output from prstat

PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/LWPID    
924 username   31M   10M run     30    0   0:00:09  35% java/10
924 username   31M   10M sleep   59    0   0:00:00 0.8% java/3
924 username   31M   10M sleep   59    0   0:00:00 0.6% java/2
924 username   31M   10M sleep   59    0   0:00:00 0.3% java/1

This may not look much different then top, but if you notice to the right side of the data, the PROCESS/LWPID is telling you the exact thread within the java process which is consuming the CPU. the thread running with the light weight process id (lwpid) 10 is consuming 35% of the CPU. As I mentioned before, if you pair this with a thread dump, you can find the exact thread. For my case, this is the relevant portion of the thread dump

"Thread-0" prio=3 tid=0x08173800 nid=0xa runnable [0xc60fc000..0xc60fcae0]
   java.lang.Thread.State: RUNNABLE
    at java.util.Random.next(Random.java:139)
    at java.util.Random.nextInt(Random.java:189)
    at ConsumerThread.run(ConsumerThread.java:13)

On the top line of the thread, the nid can be matched to the LWPID. nid=0xa (which is 10 in dec when converted from Hex)

If you can put the prstat and thread dump commands in a script and run it 4-5 times during high CPU usages you will begin to see patterns and able to determine the cause of your high CPU that way.

In my time, I have seen this result from long running gc times to a misconfiguration of an LDAP connection. Have fun :)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...