Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


java - Difference between Intel and AMD multithreading

I have an application that transfers data between 2 databases. Most of its operations are independent and run concurrently. Until now the application ran on a 4-core Intel machine, and it now needs to be ported to an AMD quad-core (4-core) machine. I have doubts about a couple of points:

  1. I found that AMD does not support Hyper-Threading (HTT), which obviously means application performance (throughput) will degrade. Will performance degrade due to context switching? If so, will decreasing the number of threads running concurrently help?

  2. Are any code changes required on my side to increase application throughput?



1 Answer


Instead of Hyper-Threading, AMD took an alternative route starting with Bulldozer, called (by some) clustering.
As explained in the link MinGW posted, this means that a single AMD core can now sustain 2 integer "HW threads" (much like HT) plus one dedicated floating-point one. Note that unlike HT, which shares all core resources between the HW threads, only the frontend (instruction fetch and decode) is shared in this scheme. The backend is duplicated, meaning that you should be able to get up to twice the backend resource bandwidth of HT if you are backend-bound (execution takes most of your time), and roughly the same as HT if you are frontend-bound (e.g. you have complex control flow with many branches).

Notice the following quote saying pretty much the same:

"All else being the same, it should give you more threaded performance than a single SMT (Hyper Threaded) core but less than two dedicated cores."

So essentially each AMD HW thread is now more than a single Intel HW thread, but less than a full Intel core. You can consider it either a super HW thread or a lame core, depending on your personal preference.

However, and this is a big "however", AMD used to cheat a bit here - they published core counts based on these "super" threads and not on the actual combos (newly dubbed "modules"). This means that a 4-core AMD machine actually has 2 modules with 4 super threads. It would therefore have the same HW thread count as a 2-core Intel machine with HT (although with stronger threads), but only half as many threads as a 4-core Intel machine with HT enabled. You did not specify which machine you intend to use, so make sure the core count has the right meaning.
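To make the counting above concrete, here is a minimal sketch of the arithmetic. The class and method names (`HwThreadCount`, `hwThreads`) are hypothetical, and the 2-threads-per-unit figure is taken from the description above rather than from any specific CPU model. `Runtime.availableProcessors()` is the standard JVM call for the number of logical processors the JVM can use, which is probably the quickest way to check what the target machine really exposes.

```java
// Hypothetical helper for illustration only; not part of any library.
final class HwThreadCount {

    /** HW threads = physical units (Intel cores or AMD modules) x threads per unit. */
    static int hwThreads(int physicalUnits, int threadsPerUnit) {
        if (physicalUnits <= 0 || threadsPerUnit <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
        return physicalUnits * threadsPerUnit;
    }

    public static void main(String[] args) {
        // Assumed configurations, following the answer's description.
        int amdFourCore = hwThreads(2, 2);    // marketed as 4 cores: 2 modules x 2 integer threads
        int intelTwoCoreHt = hwThreads(2, 2); // 2 cores x 2 HT threads
        int intelFourCoreHt = hwThreads(4, 2); // 4 cores x 2 HT threads

        System.out.println("AMD '4-core' (2 modules): " + amdFourCore);
        System.out.println("Intel 2-core with HT:     " + intelTwoCoreHt);
        System.out.println("Intel 4-core with HT:     " + intelFourCoreHt);

        // What the JVM reports on the machine this actually runs on.
        System.out.println("JVM logical processors:   "
                + Runtime.getRuntime().availableProcessors());
    }
}
```

Running this on both the old and the new machine and comparing the last line should show whether the application loses logical processors in the move.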

The performance may vary, as I said above. For execution-intensive workloads you may see similar results between a 4-core AMD and a 4-core Intel machine, since you have the same number of parallel pipelines and HT may not help Intel much (although "may" is used here in a very wide sense - a better comparison would take into account the sizes of the different buffers on each machine, the number of parallel ALUs and ports, issue width, etc.). On the other hand, on branchy or memory-intensive workloads, where you tend to get stuck a lot waiting for data or branch resolution, Intel can pull in the extra 4 HW threads in parallel without any context-switching overhead, getting more work done.
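The answer does not prescribe code changes, but if the thread count is currently hard-coded, one hedged option is to derive it from what the machine reports, so the same build adapts to either CPU. The sketch below assumes the transfer operations can be expressed as independent `Callable` tasks. `TransferPool`, `poolSize`, `runAll` and the `multiplier` knob are hypothetical names introduced here, and the right multiplier would have to be found by measuring on the target machine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; names and the multiplier knob are assumptions.
final class TransferPool {

    /** Pool size = logical processors x multiplier, rounded, never below 1. */
    static int poolSize(int logicalProcessors, double multiplier) {
        return Math.max(1, (int) Math.round(logicalProcessors * multiplier));
    }

    /** Runs independent tasks on a fixed-size pool; results come back in submission order. */
    static <T> List<T> runAll(List<Callable<T>> tasks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<T> results = new ArrayList<>();
            // invokeAll blocks until every task has completed.
            for (Future<T> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        int threads = poolSize(cpus, 1.0); // 1.0 is a starting point, not a recommendation

        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            final int id = i;
            tasks.add(() -> id * id); // stand-in for one independent transfer job
        }
        System.out.println(threads + " threads, results: " + runAll(tasks, threads));
    }
}
```

Since database transfers tend to spend much of their time waiting on I/O, a multiplier above 1.0 may still pay off even with fewer HW threads. Trying a few values and comparing throughput would show whether reducing the thread count helps on the AMD machine.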

