Tomcat, NIO, Hanging and CLOSE_WAIT
Written by gorki
Problem:
We are testing a Spring Boot application on AWS with an ELB in front.
After a while under load testing, the application hung:
- HTTP 504 errors reported by the JMeter client
- HTTP 502 errors if we raise the ELB timeout
- Once logged in on the server:
  - telnet localhost 8080 connected fine
  - sending GET / on that socket got no response
  - plenty of sockets in CLOSE_WAIT
  - wget was also hanging (as expected)
  - the connection was established while wget was hanging
  - nothing in the log
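To put a number on "plenty of CLOSE_WAIT", the socket states can be counted straight from /proc/net/tcp. What follows is only a minimal sketch, assuming a Linux host and the port 8080 used above. The function name count_states is made up for the illustration, and the state table lists just the few kernel codes that matter here.

```python
from collections import Counter

# A few TCP state codes as they appear (in hex) in /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED",
    "06": "TIME_WAIT",
    "08": "CLOSE_WAIT",
    "0A": "LISTEN",
}


def count_states(proc_net_tcp_text, port):
    """Count sockets per TCP state whose local port equals `port`."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        # local_address looks like "0100007F:1F90" (hex address:hex port)
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port == port:
            # Unknown codes are kept as their raw hex value.
            counts[TCP_STATES.get(fields[3], fields[3])] += 1
    return counts


if __name__ == "__main__":
    for path in ("/proc/net/tcp", "/proc/net/tcp6"):
        try:
            with open(path) as f:
                print(path, dict(count_states(f.read(), 8080)))
        except OSError:
            pass  # file missing or unreadable on this host
```

Run in a loop during the load test, it shows whether the CLOSE_WAIT count keeps growing on the Tomcat port.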
Solution:
I initially suspected the keep-alive timeout and the Tomcat pool, but:
- Spring Boot copies the connectionTimeout parameter to keepAliveTimeout
- new sockets were still accepted and established
- the CLOSE_WAIT sockets were still not closed after an hour
After running the test many times, I finally saw a classic "Too many open files" error in the log. That is why I could not see any more log output during the hang.
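Sockets and log files draw on the same per-process file descriptor budget, which is why both stop working at once. The sketch below is only an illustration in Python, not what Tomcat does internally. It lowers the soft limit of the current process to an arbitrary 64, opens /dev/null until the OS refuses, then restores the limit. The function name open_until_failure is hypothetical.

```python
import errno
import os
import resource


def open_until_failure(soft_limit=64):
    """Lower the soft RLIMIT_NOFILE, then open files until the OS refuses.

    Returns (number of descriptors opened, errno of the failure).
    """
    old_soft, old_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if old_hard != resource.RLIM_INFINITY:
        soft_limit = min(soft_limit, old_hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, old_hard))
    fds = []
    try:
        while True:
            fds.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        # EMFILE is the errno behind the "Too many open files" message.
        return len(fds), exc.errno
    finally:
        for fd in fds:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, old_hard))


if __name__ == "__main__":
    opened, err = open_until_failure()
    print(opened, errno.errorcode.get(err), os.strerror(err))
```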
So we changed nproc and nofile in /etc/security/limits.conf
And ta-da! Nothing changed in:
cat /proc/<$PID>/limits
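The same check can be scripted to compare the limit with the number of descriptors actually in use. This is a minimal sketch assuming Linux and the usual /proc/<pid>/limits layout (limit name, soft value, hard value, units). parse_limit and open_fd_count are hypothetical helper names.

```python
import os


def parse_limit(limits_text, name="Max open files"):
    """Return (soft, hard) for one row of /proc/<pid>/limits.

    Values stay strings because a limit may read "unlimited".
    """
    for line in limits_text.splitlines():
        if line.startswith(name):
            soft, hard = line[len(name):].split()[:2]
            return soft, hard
    return None


def open_fd_count(pid="self"):
    """Count the descriptors currently open in the given process."""
    return len(os.listdir(f"/proc/{pid}/fd"))


if __name__ == "__main__":
    try:
        with open("/proc/self/limits") as f:
            print("nofile (soft, hard):", parse_limit(f.read()))
        print("open descriptors:", open_fd_count())
    except OSError:
        print("/proc is not available on this host")
```

Pass the PID of the Java process instead of "self" to inspect the running service. Reading another user's /proc/<pid>/fd usually requires root.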
Thanks to blogs around the world like this one:
- the service is started by systemd, which does not read /etc/security/limits.conf (that file is applied by PAM to login sessions)
- to override resource limits with systemd:
[Service]
...
LimitNOFILE=500000
LimitNPROC=500000
Last but not least, the Tomcat NIO socket queue holds around 10000 sockets; add to that the other open files and the other processes... choose your limit wisely.
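As a rough back-of-the-envelope sketch of that sizing: only the figure of 10000 comes from the text above. The other_files count and the 25% headroom are hypothetical placeholders to replace with values measured on your own service.

```python
import math


def suggested_nofile(max_connections=10000, other_files=1000, headroom=0.25):
    """Estimate a per-process open-file limit.

    max_connections: sockets Tomcat may hold (about 10000, per the text)
    other_files:     jars, logs, pipes, etc. (hypothetical figure)
    headroom:        safety margin as a fraction (hypothetical figure)
    """
    needed = max_connections + other_files
    return math.ceil(needed * (1 + headroom))


if __name__ == "__main__":
    print(suggested_nofile())
```

Whatever the result, confirm it against the live process in /proc/<pid>/limits after restarting the service.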