I've hit this issue as well. It basically boils down to having a variable number of values in your IN clause and Hibernate trying to cache those query plans.
There are two great blog posts on this topic.
The first:
Using Hibernate 4.2 and MySQL in a project with an in-clause query
such as: select t from Thing t where t.id in (?)
Hibernate caches these parsed HQL queries. Specifically the Hibernate
SessionFactoryImpl has QueryPlanCache with queryPlanCache and
parameterMetadataCache. But this proved to be a problem when the
number of parameters for the in-clause is large and varies.
These caches grow for every distinct query. So this query with 6000
parameters is not the same as 6001.
The in-clause query is expanded to the number of parameters in the
collection. Metadata is included in the query plan for each parameter
in the query, including a generated name like x10_, x11_ , etc.
Imagine 4000 different variations in the number of in-clause parameter
counts, each of these with an average of 4000 parameters. The query
metadata for each parameter quickly adds up in memory, filling up the
heap, since it can't be garbage collected.
This continues until all different variations in the query parameter
count are cached or the JVM runs out of heap memory and starts throwing
java.lang.OutOfMemoryError: Java heap space.
Avoiding in-clauses is an option, as well as using a fixed collection
size for the parameter (or at least a smaller size).
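To make the "smaller, fixed size" suggestion concrete, here is a minimal sketch of splitting a large ID list into chunks before binding it to the IN clause. The class InClauseChunks and its split method are hypothetical helpers of mine, not part of Hibernate; the caller would run the same query once per chunk and merge the results.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a Hibernate API): caps the size of IN-clause parameter lists. */
final class InClauseChunks {
    private InClauseChunks() {}

    /**
     * Splits ids into consecutive chunks of at most chunkSize elements.
     * Every chunk except possibly the last has exactly chunkSize elements.
     */
    static <T> List<List<T>> split(List<T> ids, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int from = 0; from < ids.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, ids.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(ids.subList(from, to)));
        }
        return chunks;
    }
}
```

With a chunk size of 1000, a list of 2500 IDs becomes chunks of 1000, 1000 and 500. That bounds both the number of parameters per plan and the number of distinct parameter counts, although the last chunk can still vary in size unless it is padded as well.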
For configuring the query plan cache max size, see the property
hibernate.query.plan_cache_max_size, defaulting to 2048 (easily too
large for queries with many parameters).
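For reference, a sketch of how the cache sizes could be lowered through plain configuration properties. The values 64 and 32 are arbitrary illustrations, not recommendations, and QueryPlanCacheSettings is a made-up class name. Besides the property mentioned above, I'm assuming you also want to bound the parameter metadata cache via hibernate.query.plan_parameter_metadata_max_size.

```java
import java.util.Properties;

/** Hypothetical holder for illustrative settings; the numbers are assumptions, tune them yourself. */
final class QueryPlanCacheSettings {
    private QueryPlanCacheSettings() {}

    static Properties smallPlanCache() {
        Properties props = new Properties();
        // Maximum number of cached query plans (default 2048 per the quote above).
        props.setProperty("hibernate.query.plan_cache_max_size", "64");
        // Maximum number of cached parameter metadata entries.
        props.setProperty("hibernate.query.plan_parameter_metadata_max_size", "32");
        return props;
    }
}
```

Since Properties is a Map, the result can be handed to Persistence.createEntityManagerFactory(unitName, props), or the same two keys can go into persistence.xml or hibernate.properties. A smaller cache only limits how much memory the plans can pin; it does not stop the thrashing itself.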
And second (also referenced from the first):
Hibernate internally uses a cache that maps HQL statements (as
strings) to query plans. The cache consists of a bounded map limited
by default to 2048 elements (configurable). All HQL queries are loaded
through this cache. In case of a miss, the entry is automatically
added to the cache. This makes it very susceptible to thrashing - a
scenario in which we constantly put new entries into the cache without
ever reusing them and thus preventing the cache from bringing any
performance gains (it even adds some cache management overhead). To
make things worse, it is hard to detect this situation by chance - you
have to explicitly profile the cache in order to notice that you have
a problem there. I will say a few words on how this could be done
later on.
So the cache thrashing results from new queries being generated at
high rates. This can be caused by a multitude of issues. The two most
common that I have seen are - bugs in hibernate which cause parameters
to be rendered in the JPQL statement instead of being passed as
parameters and the use of an "in" - clause.
Due to some obscure bugs in hibernate, there are situations when
parameters are not handled correctly and are rendered into the JPQL
query (as an example check out HHH-6280). If you have a query that is
affected by such defects and it is executed at high rates, it will
thrash your query plan cache because each JPQL query generated is
almost unique (containing IDs of your entities for example).
The second issue lies in the way that hibernate processes queries with
an "in" clause (e.g. give me all person entities whose company id
field is one of 1, 2, 10, 18). For each distinct number of parameters
in the "in"-clause, hibernate will produce a different query - e.g.
select x from Person x where x.company.id in (:id0_) for 1 parameter,
select x from Person x where x.company.id in (:id0_, :id1_) for 2
parameters and so on. All these queries are considered different, as
far as the query plan cache is concerned, resulting again in cache
thrashing. You could probably work around this issue by writing a
utility class to produce only a certain number of parameters - e.g. 1,
10, 100, 200, 500, 1000. If you, for example, pass 22 parameters, it
will return a list of 100 elements with the 22 parameters included in
it and the remaining 78 parameters set to an impossible value (e.g. -1
for IDs used for foreign keys). I agree that this is an ugly hack but
it could get the job done. As a result you will only have at most 6
unique queries in your cache and thus reduce thrashing.
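A rough sketch of the padding utility described above, using the bucket sizes from the post (1, 10, 100, 200, 500, 1000). The class and method names are my own invention, and the filler value is whatever "impossible" value fits your column, e.g. -1 for generated IDs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not a Hibernate API): pads IN-clause values to a few fixed sizes. */
final class InClausePadding {
    // Bucket sizes taken from the quoted post.
    private static final int[] BUCKETS = {1, 10, 100, 200, 500, 1000};

    private InClausePadding() {}

    /** Returns the smallest bucket that can hold count values. */
    static int bucketFor(int count) {
        if (count < 1) {
            throw new IllegalArgumentException("need at least one value");
        }
        for (int bucket : BUCKETS) {
            if (count <= bucket) {
                return bucket;
            }
        }
        throw new IllegalArgumentException("more than 1000 values; split the list first");
    }

    /** Returns a copy of values, extended with filler up to the bucket size. */
    static <T> List<T> pad(List<T> values, T filler) {
        int target = bucketFor(values.size());
        List<T> padded = new ArrayList<>(values);
        padded.addAll(Collections.nCopies(target - values.size(), filler));
        return padded;
    }
}
```

Passing 22 IDs with a filler of -1L returns a 100-element list: the 22 originals followed by 78 copies of -1L. You would then bind the padded list instead of the original, e.g. with query.setParameterList("ids", InClausePadding.pad(ids, -1L)), so Hibernate only ever sees six different parameter counts for that query.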
So how do you find out that you have the issue? You could write some
additional code and expose metrics with the number of entries in the
cache e.g. over JMX, tune logging and analyze the logs, etc. If you do
not want to (or cannot) modify the application, you could just dump
the heap and run this OQL query against it (e.g. using Eclipse MAT):
SELECT l.query.toString() FROM INSTANCEOF org.hibernate.engine.query.spi.QueryPlanCache$HQLQueryPlanKey l
It
will output all queries currently located in any query plan cache on
your heap. It should be pretty easy to spot whether you are affected
by any of the aforementioned problems.
As far as the performance impact goes, it is hard to say as it depends
on too many factors. I have seen a very trivial query causing 10-20 ms
of overhead spent in creating a new HQL query plan. In general, if
there is a cache somewhere, there must be a good reason for that - a
miss is probably expensive so you should try to avoid misses as much
as possible. Last but not least, your database will have to handle
large amounts of unique SQL statements too - causing it to parse them
and maybe create different execution plans for every one of them.