Is there a way to cause stale connections to time out in ActiveMQ Artemis? Connections are accumulating on my broker until I get the "newSocketStream(..) failed: Too many open files" error, which I believe is caused by those connections.
How should I diagnose this problem?
2021-01-28 01:20:39,492 WARN [io.netty.channel.DefaultChannelPipeline] An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.: io.netty.channel.unix.Errors$NativeIoException: accept(..) failed: Too many open files
2021-01-28 01:20:39,656 WARN [io.netty.channel.DefaultChannelPipeline] An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.: io.netty.channel.unix.Errors$NativeIoException: accept(..) failed: Too many open files
2021-01-28 01:20:39,937 ERROR [org.apache.activemq.artemis.core.client] AMQ214016: Failed to create netty connection: io.netty.channel.ChannelException: Unable to create Channel from class class io.netty.channel.epoll.EpollSocketChannel
at io.netty.channel.ReflectiveChannelFactory.newChannel(ReflectiveChannelFactory.java:46) [netty-all-4.1.48.Final.jar:4.1.48.Final]
at io.netty.bootstrap.AbstractBootstrap.initAndRegister(AbstractBootstrap.java:310) [netty-all-4.1.48.Final.jar:4.1.48.Final]
at io.netty.bootstrap.Bootstrap.doResolveAndConnect(Bootstrap.java:155) [netty-all-4.1.48.Final.jar:4.1.48.Final]
at io.netty.bootstrap.Bootstrap.connect(Bootstrap.java:139) [netty-all-4.1.48.Final.jar:4.1.48.Final]
at org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector.createConnection(NettyConnector.java:818) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector.createConnection(NettyConnector.java:785) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.openTransportConnection(ClientSessionFactoryImpl.java:1076) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.createTransportConnection(ClientSessionFactoryImpl.java:1125) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.establishNewConnection(ClientSessionFactoryImpl.java:1336) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.getConnection(ClientSessionFactoryImpl.java:931) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.getConnectionWithRetry(ClientSessionFactoryImpl.java:820) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.connect(ClientSessionFactoryImpl.java:252) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.connect(ClientSessionFactoryImpl.java:268) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl$StaticConnector$Connector.tryConnect(ServerLocatorImpl.java:1813) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl$StaticConnector.connect(ServerLocatorImpl.java:1682) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:536) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:524) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl$4.run(ServerLocatorImpl.java:482) [artemis-core-client-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42) [artemis-commons-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31) [artemis-commons-2.14.0.jar:2.14.0]
at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:65) [artemis-commons-2.14.0.jar:2.14.0]
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [java.base:]
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [java.base:]
at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118) [artemis-commons-2.14.0.jar:2.14.0]
Caused by: java.lang.reflect.InvocationTargetException
at jdk.internal.reflect.GeneratedConstructorAccessor17.newInstance(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) [java.base:]
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490) [java.base:]
at io.netty.channel.ReflectiveChannelFactory.newChannel(ReflectiveChannelFactory.java:44) [netty-all-4.1.48.Final.jar:4.1.48.Final]
... 23 more
Caused by: io.netty.channel.ChannelException: io.netty.channel.unix.Errors$NativeIoException: newSocketStream(..) failed: Too many open files
at io.netty.channel.unix.Socket.newSocketStream0(Socket.java:421) [netty-all-4.1.48.Final.jar:4.1.48.Final]
at io.netty.channel.epoll.LinuxSocket.newSocketStream(LinuxSocket.java:319) [netty-all-4.1.48.Final.jar:4.1.48.Final]
at io.netty.channel.epoll.LinuxSocket.newSocketStream(LinuxSocket.java:323) [netty-all-4.1.48.Final.jar:4.1.48.Final]
at io.netty.channel.epoll.EpollSocketChannel.<init>(EpollSocketChannel.java:45) [netty-all-4.1.48.Final.jar:4.1.48.Final]
... 27 more
Caused by: io.netty.channel.unix.Errors$NativeIoException: newSocketStream(..) failed: Too many open files
This problem looks similar: SocketException : TOO MANY OPEN FILES
As for my use case, I'm receiving orders from a website and processing them into an ERP, then transmitting status back to the website and other systems. Sending messages back to the website API is a bit slow, and near the time of the incident there were about 700 messages queued.
The website uses AMQP and my message routing is done with JMS.
Here is the ulimit output for the user that runs the broker:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 63805
max locked memory (kbytes, -l) 16384
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 63805
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
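The limits above are what the user's shell reports; the running broker process can have a different effective limit, and the count alone does not say whether the descriptors are sockets or journal files. A check along the following lines could show both. This is only a minimal sketch, assuming Linux with /proc and permission to read the broker process's entries; the function names `fd_kinds` and `max_open_files` are hypothetical helpers, not part of Artemis.

```python
import os
from collections import Counter


def fd_kinds(pid):
    """Tally the open descriptors of a process by the kind of target."""
    kinds = Counter()
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed while we were scanning
        if target.startswith("socket:"):
            kinds["socket"] += 1  # network connections show up here
        elif target.startswith("pipe:"):
            kinds["pipe"] += 1
        elif target.startswith("anon_inode:"):
            kinds["anon_inode"] += 1  # e.g. epoll and eventfd handles
        else:
            kinds["file"] += 1  # regular files, devices, directories
    return kinds


def max_open_files(pid):
    """Return the (soft, hard) 'Max open files' limits; None means unlimited."""
    with open(f"/proc/{pid}/limits") as limits:
        for line in limits:
            if line.startswith("Max open files"):
                soft, hard = line.split()[3:5]
                return tuple(
                    None if value == "unlimited" else int(value)
                    for value in (soft, hard)
                )
    raise RuntimeError("no 'Max open files' entry found")
```

Calling `fd_kinds(pid)` and `max_open_files(pid)` with the broker's PID a few times while the queue backs up would show whether the socket count is the part that climbs toward the limit.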
My JVM memory setting: -Xms1024M -Xmx8G
And here is my broker.xml:
<configuration xmlns="urn:activemq"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xi="http://www.w3.org/2001/XInclude"
xsi:schemaLocation="urn:activemq /schema/artemis-configuration.xsd">
<core xmlns="urn:activemq:core" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="urn:activemq:core ">
<name>0.0.0.0</name>
<persistence-enabled>true</persistence-enabled>
<journal-type>NIO</journal-type>
<paging-directory>/nfs/amqprod/data/paging</paging-directory>
<bindings-directory>/nfs/amqprod/data/bindings</bindings-directory>
<journal-directory>/nfs/amqprod/data/journal</journal-directory>
<large-messages-directory>/nfs/amqprod/data/large-messages</large-messages-directory>
<journal-datasync>true</journal-datasync>
<journal-min-files>2</journal-min-files>
<journal-pool-files>10</journal-pool-files>
<journal-device-block-size>4096</journal-device-block-size>
<journal-file-size>10M</journal-file-size>
<journal-buffer-timeout>2628000</journal-buffer-timeout>
<journal-max-io>1</journal-max-io>
<disk-scan-period>5000</disk-scan-period>
<max-disk-usage>90</max-disk-usage>
<critical-analyzer>true</critical-analyzer>
<critical-analyzer-timeout>120000</critical-analyzer-timeout>
<critical-analyzer-check-period>60000</critical-analyzer-check-period>
<critical-analyzer-policy>HALT</critical-analyzer-policy>
<page-sync-timeout>2628000</page-sync-timeout>
<jmx-management-enabled>true</jmx-management-enabled>
<global-max-size>2G</global-max-size>
<acceptors>
<!-- keystores will be found automatically if they are on the classpath -->
<acceptor name="netty-ssl-acceptor">tcp://0.0.0.0:5500?sslEnabled=true;keyStorePath={path}/keystore.ks;keyStorePassword={password};protocols=CORE,AMQP,STOMP,HORNETQ,MQTT,OPENWIRE</acceptor>
<!-- Acceptor for every supported protocol -->
<acceptor name="artemis">tcp://0.0.0.0:61616?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;amqpMinLargeMessageSize=102400;protocols=CORE,AMQP,STOMP,HORNETQ,MQTT,OPENWIRE;useEpoll=true;amqpCredits=1000;amqpLowCredits=300;amqpDuplicateDetection=true</acceptor>
</acceptors>
<!-- HA -->
<connectors>
<connector name="artemis">tcp://{Primary IP}:61616</connector>
<connector name="artemis-backup">tcp://{Secondary IP}:61616</connector>
</connectors>
<cluster-user>activemq</cluster-user>
<cluster-password>{cluster password}</cluster-password>
<ha-policy>
<shared-store>
<master>
<failover-on-shutdown>true</failover-on-shutdown>
</master>
</shared-store>
</ha-policy>
<cluster-connections>
<cluster-connection name="cluster-1">
<connector-ref>artemis</connector-ref>
<!--<discovery-group-ref discovery-group-name="discovery-group-1"/>-->
<static-connectors>
<connector-ref>artemis-backup</connector-ref>
</static-connectors>
</cluster-connection>
</cluster-connections>
<!-- HA -->
<security-settings>
<security-setting match="#">
<permission type="createNonDurableQueue" roles="amq"/>
<permission type="deleteNonDurableQueue" roles="amq"/>
<permission type="createDurableQueue" roles="amq"/>
<permission type="deleteDurableQueue" roles="amq"/>
<permission type="createAddress"