PROBLEM
I have recently been put in charge of a Java web application with code already written and in place. The app receives moderately high traffic and has peak hours of traffic between 11am to 3pm everyday.
The application uses Spring, JPA(Hibernate), MYSQL DB.
Spring has been configured to use tomcat jdbc connection pool to make connections to the DB.
(Details of configuration at the end of the post)
For the past few days, during the application's peak load hours, the application has been going down due to tomcat going unresponsive to requests. It has required tomcat to be restarted multiple times.
Going through the tomcat catalina.out logs, I noticed a whole lot of
Caused by: java.sql.SQLException: Connection has already been closed.
at org.apache.tomcat.jdbc.pool.ProxyConnection.invoke(ProxyConnection.java:117)
at org.apache.tomcat.jdbc.pool.JdbcInterceptor.invoke(JdbcInterceptor.java:109)
at org.apache.tomcat.jdbc.pool.DisposableConnectionFacade.invoke(DisposableConnectionFacade.java:80)
at com.sun.proxy.$Proxy28.prepareStatement(Unknown Source)
at org.hibernate.jdbc.AbstractBatcher.getPreparedStatement(AbstractBatcher.java:505)
at org.hibernate.jdbc.AbstractBatcher.getPreparedStatement(AbstractBatcher.java:423)
at org.hibernate.jdbc.AbstractBatcher.prepareQueryStatement(AbstractBatcher.java:139)
at org.hibernate.loader.Loader.prepareQueryStatement(Loader.java:1547)
at org.hibernate.loader.Loader.doQuery(Loader.java:673)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:236)
at org.hibernate.loader.Loader.loadCollection(Loader.java:1994)
... 115 more
These appear frequently just before the crash.
Going further earlier before these exceptions, I noticed a whole lot of Connections being abandoned just before the Connection Closed exceptions.
WARNING: Connection has been abandoned PooledConnection[com.mysql.jdbc.Connection@543c2ab5]:java.lang.Exception
at org.apache.tomcat.jdbc.pool.ConnectionPool.getThreadDump(ConnectionPool.java:1065)
at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:782)
at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:618)
at org.apache.tomcat.jdbc.pool.ConnectionPool.getConnection(ConnectionPool.java:188)
at org.apache.tomcat.jdbc.pool.DataSourceProxy.getConnection(DataSourceProxy.java:128)
at org.hibernate.ejb.connection.InjectedDataSourceConnectionProvider.getConnection(InjectedDataSourceConnectionProvider.java:47)
at org.hibernate.jdbc.ConnectionManager.openConnection(ConnectionManager.java:423)
at org.hibernate.jdbc.ConnectionManager.getConnection(ConnectionManager.java:144)
at org.hibernate.jdbc.AbstractBatcher.prepareQueryStatement(AbstractBatcher.java:139)
These seem to appear frequently just before the Connection Closed exceptions. And these seem to be the first symptoms of impending doom in the logs.
ANALYSIS
Going by the logs, I set out to see if there is any connection pool configuration/mysql configuration that might be causing the issue.
Went through a couple of excellent articles that show tuning of the pool for Production environment. Links 1 & 2
Going by these articles, I noticed that:
The below line in JHanik's article (link 1) mentions this
Setting the value of abandonWhenPercentageFull to 100 would mean that connections are not > considered abandoned unless we've reached our maxActive limit.
I figured this might be important in my case, because I see a lot of connections being abandoned.
My max_connections setting does not match what is being recommended (in link 2)
mysql max_connections should be equal to max_active+max_idle
WHAT I TRIED
So, as per the recommendations from the articles, I did the following two things:
- Changed abandonWhenPercentageFull to 100
- In my MYSQL server, max_connections was set as 500. Increased it to 600
In my connection pool settings, max_active was 200 and max_idle was 50.
Changed it to max_active=350, max_idle = 250
THIS DID NOT HELP
The next day, the following observations were made during peak hours:
- Tomcat did not go down. The app stayed up throughout peak hours.
However the performance went from bad to worse and then the app was barely usable even though it did not really go down.
- DB Connection pool, though increased in size, got completely utilized, and I could see 350 active connections to the DB at one point.
FINALLY, MY QUESTION:
It clearly looks like there are issues with the way DB connections are being made from the app server.
So I have two directions to take this analysis forward.
My question is which of these should I be taking?
1. The issue is not with the connection pool settings. The code is what is causing the issue
There might be places in the code where DB connections are not being closed. Which is causing the high number of connections being open.
The code uses a GenericDao which is extended in every Dao class. The GenericDao uses Spring's JpaTemplate to fetch a EntityManager instance which in turn is used for all DB operations. My understanding is using the JpaTemplate handles the nitty gritty of closing DB connections internally.
So, where exactly should I looking for possible connection leaks?
2. The issue is with the connections pool/mysql config parameters. However, the optimizations I have put in need to be tuned further
If yes, which parameters should I be looking at?
Should I be collecting some data to use to determine more appropriate values for my connection pool. (For eg, for max_active, max_idle, max_connections)
Addendum: Complete Connection Pool configuration
<bean id="dataSource" class="org.apache.tomcat.jdbc.pool.DataSource" destroy-method="close">
<property name="driverClassName" value="com.mysql.jdbc.Driver" />
<property name="url" value="jdbc:mysql://xx.xx.xx.xx" />
<property name="username" value="xxxx" />
<property name="password" value="xxxx" />
<property name="initialSize" value="10" />
<property name="maxActive" value="350" />
<property name="maxIdle" value="250" />
<property name="minIdle" value="90" />
<property name="timeBetweenEvictionRunsMillis" value="30000" />
<property name="removeAbandoned" value="true" />
<property name="removeAbandonedTimeout" value="60" />
<property name="abandonWhenPercentageFull" value="100" />
<property name="testOnBorrow" value="true" />
<property name="validationQuery" value="SELECT 1" />
<property name="validationInterval" value="30000" />
<property name="logAbandoned" value="true" />
<property name="jmxEnabled" value="true" />
</bean>
See Question&Answers more detail:
os