History of a bug

Debian upgrade and lost network

Written by gorki. No comments.

Problem :

I manage a dedicated server at OVH and upgraded its Debian from jessie to buster. The upgrade seemed to go well, so I rebooted.

After the reboot the server was unreachable; fortunately, OVH rescue mode let me log in.

I checked the error logs and first got lost in RAID error messages, but the cause was simpler than that.

Solution :

I checked the /etc/network/interfaces file: it was OK.

I checked the log files, cleaned up, rebooted, checked again: still OK, except that named reported the network as unreachable.

I finally remembered that Debian had switched to systemd in recent releases, so I tried to write the systemd networking unit by hand: too complicated, and it did not work.

In rescue mode, your files are only available under a mount point, so the usual commands such as systemctl do not act on the installed system.

The solution was to chroot a shell :

  1. chroot /mnt/md2 bash
  2. systemctl enable networking

And it worked...

Now I have to check the rest of the system to be sure that everything is working...

Beginning with :

sudo apt-get update

sudo apt-get clean

sudo apt-get autoremove

sudo apt-get update && sudo apt-get upgrade

sudo dpkg --configure -a


Samba mount and misleading errors

Written by gorki. No comments.

Problem :

In two cases, I run the mount command through Ansible to mount a Samba share on two Linux clients.

In both cases, the mount fails with :

  • either "CIFS VFS: validate protocol negotiate failed: -13"

Solution :

Each time, the errors came from my Samba configuration on the server side.

The tests I ran to track down the cause :

  1. Test mounting the share from Windows (I got the same error: the problem comes from the server)
  2. From the Linux clients : smbclient -L <monserveur> -A /path/to/mycredentials
    • this worked in one case; in the other case the share name was wrong: bingo! (no. 1)
  3. As the samba user, I went into the shared directory to check that I really had the permissions
    • and that is where it failed for the second case (negotiation failed)

After restoring the permissions for one and fixing my Ansible template for the other, everything works.

For reference, the configuration in place :

  workgroup = SAMBA
  security = user
  unix password sync = no
  log file = /var/log/samba/log.%m
  guest account = {{samba_user.user}}
  force group = {{samba_user.group}}
  force user = {{samba_user.user}}
  create mode = 0660
  directory mode = 0770

  valid users={{samba_user.user}}
  browseable = yes
  force create mode = 0660
  force directory mode = 0770

And to let users connect with their Unix account while still allowing this generic user to access the files :

# Ensure all files are owned by {{ samba_user.user }}
  shell: "chown -R {{ samba_user.user }}:{{ samba_user.group }} {{samba_share.export_path}}" 

# Ensure the setgid bit is present on all directories (g+s is setgid, not the sticky bit)
  shell: "find {{ samba_share.export_path }} -type d -exec chmod g+s {} +"

# Add a default rwx ACL for the owning group on {{ samba_share.export_path }}
  shell: "setfacl -m d:g::rwx {{ samba_share.export_path }}"

# Add the same default rwx ACL for the owning group on subdirectories
  shell: "find {{ samba_share.export_path }} -type d -exec setfacl -m d:g::rwx {} +"

Thanks to them :


systemd and Tomcat hang on startup

Written by gorki. No comments.

Problem :

I used robertdebock/ansible-role-tomcat to install a Tomcat instance using Ansible. It worked well until I deployed an application on it. Then the java process hung with 100% system CPU.

Starting Tomcat as the tomcat user, without systemd, worked correctly.

Solution :

I suspected :

  • SELinux
  • Linux limits
  • VM slow I/O

But after a while I ran strace :

  • by modifying the systemd configuration
  • by modifying the catalina.sh configuration

All I got was a simple futex wait...

Then I read the manual. It was as simple as :

strace -f -e trace=all -p <PID>

No need to trace from startup. Without -f, strace follows only the single thread it attaches to, which here was just waiting on a futex; with -f it follows every thread of the process.

After that it was easy: the process was recursively reading :


I just fixed the working_directory in the Ansible role, and everything works.

Issue reported here.


Spring Boot 2, OAuth 2 and tokenStore

Written by gorki. No comments.

Problem :

Following the previous posts (this one and this one), I configured an authorization server and a resource server in the same JVM.

Everything worked on my local machine, but once I deployed the Spring Boot application on the server, I got an "Invalid access token". The authorization request was accepted and I received an access token, but the resource server refused it.


Solution :

I enabled remote debugging and ran some tests (I also had a reverse proxy, but that was not the problem). The issue came from the tokenStore, which :

  • on my local machine was one shared instance
  • on the server was two different instances

In fact, depending on the bean initialization order, the token store may or may not be shared, because the authorization server and the resource server each take it through an optional autowired field. If the bean is not available at init time, each one can fall back to its own local instance.
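The mechanism can be shown with a small toy model. This is a Python sketch of the idea only, not Spring code: the class names mirror the roles in this post, but every class and method here is hypothetical and written for illustration.

```python
class InMemoryTokenStore:
    """Toy stand-in for a token store: tokens live only inside this instance."""

    def __init__(self):
        self._tokens = {}

    def store(self, token, user):
        self._tokens[token] = user

    def read(self, token):
        return self._tokens.get(token)


class AuthorizationServer:
    def __init__(self, token_store=None):
        # Optional dependency: if nothing was injected, fall back to a private store.
        self.token_store = token_store if token_store is not None else InMemoryTokenStore()

    def issue(self, user):
        token = f"token-for-{user}"
        self.token_store.store(token, user)
        return token


class ResourceServer:
    def __init__(self, token_store=None):
        # Same optional dependency, same silent fallback.
        self.token_store = token_store if token_store is not None else InMemoryTokenStore()

    def check(self, token):
        user = self.token_store.read(token)
        if user is None:
            raise PermissionError("Invalid access token")
        return user


# Case 1: the shared store was available when both servers were built.
shared = InMemoryTokenStore()
auth, resource = AuthorizationServer(shared), ResourceServer(shared)
assert resource.check(auth.issue("alice")) == "alice"

# Case 2: neither server received the store, so each silently created its own.
auth2, resource2 = AuthorizationServer(), ResourceServer()
try:
    resource2.check(auth2.issue("alice"))
    rejected = False
except PermissionError:
    rejected = True
```

In the second case the token is issued without any error and only fails at validation time, which matches the symptom described above: a valid-looking access token refused by the resource server.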

So here is my updated configuration :

My AuthorizationServer is now :

package com.example;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.oauth2.config.annotation.configurers.ClientDetailsServiceConfigurer;
import org.springframework.security.oauth2.config.annotation.web.configuration.AuthorizationServerConfigurerAdapter;
import org.springframework.security.oauth2.config.annotation.web.configuration.EnableAuthorizationServer;
import org.springframework.security.oauth2.config.annotation.web.configurers.AuthorizationServerEndpointsConfigurer;
import org.springframework.security.oauth2.config.annotation.web.configurers.AuthorizationServerSecurityConfigurer;
import org.springframework.security.oauth2.provider.token.TokenStore;

// Annotations and closing braces were lost in this copy of the post; they are restored here as assumptions.
@Configuration
@EnableAuthorizationServer
public class AuthorizationServerConfig extends AuthorizationServerConfigurerAdapter {

    // Shared TokenStore bean, injected instead of being created locally
    @Autowired
    public TokenStore tokenStore;

    @Override
    public void configure(ClientDetailsServiceConfigurer clients) throws Exception {
        clients.withClientDetails(new MyClientDetailsService());
    }

    @Override
    public void configure(AuthorizationServerSecurityConfigurer security) throws Exception {
        // Body not preserved in this copy of the post
    }

    @Override
    public void configure(AuthorizationServerEndpointsConfigurer endpoints) throws Exception {
        // Assumed body (lost in this copy): hand the shared store to the endpoints
        endpoints.tokenStore(tokenStore);
    }
}

And my resource server :

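package com.hexagon.hpa.security;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.oauth2.config.annotation.web.configuration.EnableResourceServer;
import org.springframework.security.oauth2.config.annotation.web.configuration.ResourceServerConfigurerAdapter;
import org.springframework.security.oauth2.config.annotation.web.configurers.ResourceServerSecurityConfigurer;
import org.springframework.security.oauth2.provider.error.OAuth2AccessDeniedHandler;
import org.springframework.security.oauth2.provider.token.TokenStore;

// Annotations and closing braces were lost in this copy of the post; they are restored here as assumptions.
@Configuration
@EnableResourceServer
public class ResourceServerConfig extends ResourceServerConfigurerAdapter {

    private static final String RESOURCE_ID = "RESSOURCE_ID";

    // Same shared TokenStore bean as in the authorization server
    @Autowired
    TokenStore tokenStore;

    @Override
    public void configure(ResourceServerSecurityConfigurer resources) {
        // Assumed body (lost in this copy): use the shared store to validate tokens
        resources.resourceId(RESOURCE_ID).tokenStore(tokenStore);
    }

    @Override
    public void configure(HttpSecurity http) throws Exception {
        // The request rules were lost in this copy; anyRequest().authenticated() is only a placeholder
        http.authorizeRequests().anyRequest().authenticated()
                .and().exceptionHandling().accessDeniedHandler(new OAuth2AccessDeniedHandler());
    }
}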


A simple configurer :

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.oauth2.provider.token.TokenStore;
import org.springframework.security.oauth2.provider.token.store.InMemoryTokenStore;

@Configuration
public class TokenStoreProvider {

    // Single TokenStore bean that both server configurations inject
    @Bean
    public TokenStore tokenStore() {
        return new InMemoryTokenStore();
    }
}


Tomcat, NIO, hanging and CLOSE_WAIT

Written by gorki. No comments.

Problem :

We were testing a Spring Boot application on AWS with an ELB in front.

After a while under load testing, the application hung :

  • HTTP 504 error code on the JMeter client
  • HTTP 502 if we raised the ELB timeout
  • Once logged on to the server :
    • telnet localhost 8080 connected fine
    • sending GET / on this socket got no response
    • plenty of sockets in CLOSE_WAIT
    • wget was also hanging (as expected)
    • the connection was established during the wget hang
    • nothing in the log
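To put a number on "plenty of CLOSE_WAIT", the socket states can be counted from the text of /proc/net/tcp on Linux, where the fourth column holds the state as a hex code (08 is CLOSE_WAIT, 0A is LISTEN). The following is a minimal Python sketch; the function name and the sample rows are made up for illustration, and a JVM listening on an IPv6 socket would show up in /proc/net/tcp6 instead.

```python
from collections import Counter

# State codes used in the "st" column of /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV", "04": "FIN_WAIT1",
    "05": "FIN_WAIT2", "06": "TIME_WAIT", "07": "CLOSE", "08": "CLOSE_WAIT",
    "09": "LAST_ACK", "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text, local_port=None):
    """Count sockets per TCP state, optionally only those bound to local_port."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines():
        fields = line.split()
        # Data rows start with an index such as "0:"; this skips the header and blanks.
        if len(fields) < 4 or not fields[0].endswith(":"):
            continue
        port = int(fields[1].rsplit(":", 1)[1], 16)  # local address is HEXIP:HEXPORT
        if local_port is not None and port != local_port:
            continue
        counts[TCP_STATES.get(fields[3].upper(), fields[3])] += 1
    return counts


# Made-up sample in the /proc/net/tcp layout (0x1F90 is port 8080, 0x0016 is port 22).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000  1000        0 11111 1
   1: 0100007F:1F90 0100007F:C350 08 00000000:00000000 00:00000000 00000000  1000        0 22222 1
   2: 0100007F:1F90 0100007F:C351 08 00000000:00000000 00:00000000 00000000  1000        0 33333 1
   3: 0100007F:0016 0100007F:C352 01 00000000:00000000 00:00000000 00000000     0        0 44444 1
"""

on_8080 = count_tcp_states(SAMPLE, local_port=8080)
```

On a real server the text would come from reading /proc/net/tcp; a CLOSE_WAIT count on port 8080 that keeps growing and never drops is the pattern described in the list above.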


Solution :


I initially suspected the keep-alive timeout and the Tomcat pool, but :

  1. Spring Boot copies the connectionTimeout parameter to keepAliveTimeout
  2. new sockets were still accepted and established
  3. the CLOSE_WAIT sockets were still there after an hour

After repeating the test many times, I finally saw a classic "Too many open files" in the log. That is why I could not see any further log during the hang.

So we changed nproc and nofile in /etc/security/limits.conf

And ta-da! Nothing changed in :

cat /proc/<PID>/limits
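The same check can be scripted so that it is easy to rerun after each configuration change. This is a minimal Python sketch that extracts one row from the text of that limits file; the function name and the sample values are made up for illustration, and the 10000 threshold is only the rough figure quoted in this post.

```python
def read_limit(limits_text, name):
    """Return (soft, hard) for one row of a /proc/<PID>/limits dump, as strings."""
    for line in limits_text.splitlines():
        if line.startswith(name):
            # The columns after the limit name are: soft limit, hard limit, units.
            soft, hard = line[len(name):].split()[:2]
            return soft, hard
    raise KeyError(name)


# Made-up excerpt in the layout of /proc/<PID>/limits.
SAMPLE_LIMITS = """\
Limit                     Soft Limit           Hard Limit           Units
Max processes             15664                15664                processes
Max open files            1024                 4096                 files
"""

soft, hard = read_limit(SAMPLE_LIMITS, "Max open files")
needed = 10000  # rough order of magnitude mentioned in this post, not a measured value
enough = soft == "unlimited" or int(soft) >= needed
```

Against a live process, the text would be read from /proc/<PID>/limits for the Tomcat PID; if the soft limit still shows the old value after editing limits.conf, the change never reached the service.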

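Thanks to blogs all over the world, like this one :

  • the service is started by systemd
  • systemd services do not read /etc/security/limits.conf (it is applied by PAM to login sessions), so resource limits have to be overridden in the unit itself, with directives such as LimitNOFILE= and LimitNPROC=

Last but not least, the Tomcat NIO socket queue is around 10000, plus the other files, plus the other processes... choose your limit wisely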
