Problem:
We are testing a Spring Boot application in AWS with an ELB in front.
After a while under load, the application hung:
- HTTP 504 errors in the JMeter client
- HTTP 502 errors if we raise the ELB timeout
- Once logged in on the server:
  - telnet localhost 8080 connected fine
  - sending GET / on that socket got no response
  - plenty of sockets in CLOSE_WAIT
  - wget also hung (as expected)
  - the connection was established while wget hung
  - nothing in the log
Solution:
I initially suspected Tomcat's keep-alive timeout and connection pool, but:
- the embedded Tomcat already uses the connectionTimeout value as keepAliveTimeout
- new sockets were accepted and established
- the CLOSE_WAIT sockets were still not closed after an hour
After running the test many times, I finally saw a classic "Too many open files" error in the log. That is why I could not see anything more in the log during the hang.
So we raised nproc and nofile in /etc/security/limits.conf
And ta-da! Nothing changed in:
cat /proc/<$PID>/limits
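Thanks to blogs around the world like this one, we found out why:
- the service is started by systemd, which does not apply /etc/security/limits.conf to services
- to override resource limits with systemd, set them in the unit file: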
[Service]
...
LimitNOFILE=500000
LimitNPROC=500000
Last but not least, Tomcat's NIO socket queue holds around 10000 sockets, plus other files, plus other processes... so choose your limit wisely.
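To confirm from inside the JVM that the systemd override really took effect, the process can read its own limits file. Here is a minimal sketch, assuming Linux and the usual column layout of /proc/<pid>/limits; the class name ProcLimits and the method softLimit are hypothetical helpers, not part of Spring Boot or Tomcat.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.OptionalLong;

// Hypothetical helper: reads the soft limits the running process actually got.
public class ProcLimits {

    /**
     * Extracts the soft value of one limit from the text of a /proc/<pid>/limits file.
     * Returns empty if the limit is missing or reported as "unlimited".
     */
    public static OptionalLong softLimit(String limitsText, String limitName) {
        for (String line : limitsText.split("\n")) {
            if (!line.startsWith(limitName)) {
                continue;
            }
            // Columns after the name: soft limit, hard limit, units.
            String[] cols = line.substring(limitName.length()).trim().split("\\s+");
            if (cols[0].isEmpty() || cols[0].equals("unlimited")) {
                return OptionalLong.empty();
            }
            return OptionalLong.of(Long.parseLong(cols[0]));
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) throws IOException {
        Path limits = Paths.get("/proc/self/limits"); // exists on Linux only
        if (!Files.isReadable(limits)) {
            System.out.println("/proc/self/limits is not available on this system");
            return;
        }
        String text = new String(Files.readAllBytes(limits), StandardCharsets.UTF_8);
        System.out.println("Max open files (soft): " + softLimit(text, "Max open files"));
        System.out.println("Max processes (soft): " + softLimit(text, "Max processes"));
    }
}
```

Logging these two values at startup would have shown immediately that the limits.conf change was ignored for the systemd-started service.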
The problem:
A singleton EJB used as a cache raises an error with JBoss 7.1.1:
EJB Invocation failed on component XXXX for method public java.lang.Object:
javax.ejb.EJBTransactionRolledbackException: JBAS014373: EJB 3.1 PFD2 4.8.5.5.1
concurrent access timeout on org.jboss.invocation.InterceptorContext$Invocation@398e7b17
- could not obtain lock within 5000MILLISECONDS
Solution:
Everything is already documented, of course; I just needed to read it :)
By default, @Singleton EJBs are:
- @ConcurrencyManagement(ConcurrencyManagementType.CONTAINER)
- protected by a @Lock(LockType.WRITE)
so their methods are synchronized by the container. You can:
- either switch the method or the class to LockType.READ,
- or switch to bean-managed concurrency (@ConcurrencyManagement(ConcurrencyManagementType.BEAN)). In that case you manage everything yourself.
To reproduce the error every time, just put a 6 s sleep in the offending method, which is longer than the 5000 ms access timeout.
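The difference between the two lock types can be illustrated outside the container with plain java.util.concurrent. This is only an analogy under stated assumptions: a ReentrantReadWriteLock stands in for the container lock, and the class SingletonLockSketch and its method secondCallerGetsIn are hypothetical names. It is not how JBoss implements the lock internally.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Plain-Java analogy of a singleton guarded by a WRITE or READ lock with an access timeout.
public class SingletonLockSketch {

    // Stands in for the lock guarding the singleton (an assumption, not JBoss code).
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    private Lock pick(boolean readLock) {
        return readLock ? lock.readLock() : lock.writeLock();
    }

    /**
     * A first caller holds the lock (the "slow" method) while a second caller
     * waits at most timeoutMs. Returns true if the second caller got in,
     * false if it timed out (the analogue of the concurrent access timeout).
     */
    public static boolean secondCallerGetsIn(boolean readLock, long timeoutMs) {
        SingletonLockSketch bean = new SingletonLockSketch();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread first = new Thread(() -> {
            Lock l = bean.pick(readLock);
            l.lock();
            try {
                held.countDown();
                release.await(); // simulates the slow method body
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                l.unlock();
            }
        });
        first.start();

        boolean entered = false;
        try {
            held.await(); // make sure the first caller owns the lock
            Lock l = bean.pick(readLock);
            entered = l.tryLock(timeoutMs, TimeUnit.MILLISECONDS);
            if (entered) {
                l.unlock();
            }
            release.countDown();
            first.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            release.countDown();
        }
        return entered;
    }

    public static void main(String[] args) {
        System.out.println("WRITE lock, second caller enters: " + secondCallerGetsIn(false, 200));
        System.out.println("READ lock, second caller enters: " + secondCallerGetsIn(true, 200));
    }
}
```

With the write lock the second caller gives up after the timeout, which mirrors the error above; with the read lock both callers are inside at the same time. Read locks are only safe if the method does not modify shared state.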
Note:
- If the EJB pool is not large enough, you get a permit timeout rather than a concurrent access timeout.
- The timeout in JBoss is configured in this very interesting section:
<subsystem xmlns="urn:jboss:domain:ejb3:1.2">
    <session-bean>
        <stateless>
            <bean-instance-pool-ref pool-name="slsb-strict-max-pool"/>
        </stateless>
        <stateful default-access-timeout="5000" cache-ref="simple"/>
        <singleton default-access-timeout="5000"/>
    </session-bean>
    <pools>
        <bean-instance-pools>
            <strict-max-pool name="slsb-strict-max-pool" max-pool-size="20" instance-acquisition-timeout="5" instance-acquisition-timeout-unit="MINUTES"/>
            <strict-max-pool name="mdb-strict-max-pool" max-pool-size="20" instance-acquisition-timeout="5" instance-acquisition-timeout-unit="MINUTES"/>
        </bean-instance-pools>
    </pools>
    <caches>
        <cache name="simple" aliases="NoPassivationCache"/>
        <cache name="passivating" passivation-store-ref="file" aliases="SimpleStatefulCache"/>
    </caches>
    <passivation-stores>
        <file-passivation-store name="file"/>
    </passivation-stores>
    <async thread-pool-name="default"/>
    <timer-service thread-pool-name="default">
        <data-store path="timer-service-data" relative-to="jboss.server.data.dir"/>
    </timer-service>
    <remote connector-ref="remoting-connector" thread-pool-name="default"/>
    <thread-pools>
        <thread-pool name="default">
            <max-threads count="10"/>
            <keepalive-time time="100" unit="milliseconds"/>
        </thread-pool>
    </thread-pools>
</subsystem>