1) Primarily, the overhead of Spring JMS comes from using JmsTemplate to send messages without any caching mechanism underneath. Essentially, JmsTemplate does the following for each message you send:
- Create Connection
- Create Session
- Create Producer
- Create Message
- Send Message
- Close Session
- Close Connection
This could be compared to manually written code where you reuse resources:
- Create Connection
- Create Session
- Create Producer
- Create Message
- Send Message
- Create Message
- Send Message
- Create Message
- Send Message
- Close Session
- Close Connection
Since creating connections, sessions and producers requires communication between your client and the JMS provider, as well as resource allocation, it adds considerable overhead when you send many small messages.
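A minimal sketch of the reuse variant with the plain JMS API (the queue name, the `payloads` collection and the `connectionFactory` variable are assumptions; error handling is omitted):

```java
// Assumes a javax.jms.ConnectionFactory from your provider is available
// as "connectionFactory"; queue name and payloads are made up for the example.
Connection connection = connectionFactory.createConnection();
try {
    Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
    MessageProducer producer = session.createProducer(session.createQueue("my.queue"));
    for (String payload : payloads) {
        // Only the message itself is created per send; the connection,
        // session and producer are reused for the whole batch.
        producer.send(session.createTextMessage(payload));
    }
} finally {
    connection.close(); // closing the connection closes the session and producer too
}
```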
You can easily get around this by caching JMS resources, for instance with Spring's CachingConnectionFactory or ActiveMQ's PooledConnectionFactory (if you are using ActiveMQ, which you tagged this question with).
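A sketch of how that caching could be wired up in XML (bean names, broker URL and cache size are assumptions, not recommendations):

```xml
<!-- The provider's own connection factory -->
<bean id="targetConnectionFactory" class="org.apache.activemq.ActiveMQConnectionFactory">
    <property name="brokerURL" value="tcp://localhost:61616"/>
</bean>

<!-- Wraps it so sessions/producers are cached and reused by JmsTemplate -->
<bean id="connectionFactory" class="org.springframework.jms.connection.CachingConnectionFactory">
    <property name="targetConnectionFactory" ref="targetConnectionFactory"/>
    <property name="sessionCacheSize" value="10"/>
</bean>
```

Point your JmsTemplate at the caching factory instead of the provider's factory and the create/close cycle per message disappears.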
If you are running inside a full Java EE container, pooling/caching is often built in and implicit when you retrieve your connection factory from JNDI.
For receiving, with Spring's DefaultMessageListenerContainer, there is a thin layer in Spring that might add a little overhead, but the main point is that you can tune its performance in terms of concurrency etc. This article explains it very well.
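For illustration, a DefaultMessageListenerContainer with its concurrency tuned might be configured like this (bean names, destination and consumer counts are assumptions):

```xml
<bean id="listenerContainer"
      class="org.springframework.jms.listener.DefaultMessageListenerContainer">
    <property name="connectionFactory" ref="connectionFactory"/>
    <property name="destinationName" value="my.queue"/>
    <property name="messageListener" ref="myListener"/>
    <!-- Scale the number of concurrent consumers to the load -->
    <property name="concurrentConsumers" value="5"/>
    <property name="maxConcurrentConsumers" value="10"/>
</bean>
```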
2)
Pub/sub is a usage pattern in which the publisher does not need to know which subscribers exist. You can't simply emulate that with P2P. And, without any proof at hand, I would argue that if you want to send an identical message from one application to ten other applications, a pub/sub setup will be faster than sending the message ten times over P2P.
On the other hand, if you have only one producer and one consumer, choose the P2P pattern with queues instead, since it's easier to manage in some respects. P2P (queues) allows load balancing, which pub/sub does not (at least not as easily).
ActiveMQ also has a hybrid, Virtual Destinations, which are essentially topics with load balancing.
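For reference, ActiveMQ enables Virtual Topics by naming convention in its default broker configuration: you publish to a topic such as VirtualTopic.Orders, and each consuming application reads from its own queue, e.g. Consumer.A.VirtualTopic.Orders, giving fan-out between applications and load balancing within each one. The interceptor behind it looks roughly like this in activemq.xml:

```xml
<!-- Inside <broker>; matches the out-of-the-box default -->
<destinationInterceptors>
  <virtualDestinationInterceptor>
    <virtualDestinations>
      <virtualTopic name=">" prefix="Consumer.*."/>
    </virtualDestinations>
  </virtualDestinationInterceptor>
</destinationInterceptors>
```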
The actual implementation differs between vendors, but topics and queues are not fundamentally different and should show similar performance. What you should check instead is:
- Persistence? (=slower)
- Message selectors? (=slower)
- Concurrency?
- Durable subscribers? (=slower)
- Request/reply, "synchronously" with temporary queues (= overhead = slower)
- Queue prefetching (=impacts performance in some aspects)
- Caching