I'm seeing very strange behaviour on my CI/CD Linux machine (Ubuntu 16.04.4 LTS, running Gitlab CI in a Docker container), and I'm looking for a way to debug the problem.
The problem is that after every reboot my Gitlab CI container cannot start, because port 443 (which it's supposed to use) is already occupied. Netstat shows:
~$ netstat -ano | grep 443
tcp6       0      0 :::443         :::*           LISTEN      off (0.00/0/0)
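(For context: `netstat -ano` never shows the owning process; the `off` in the timer column is a retransmission-timer state, not a PID. A sketch of how to name the actual holder, assuming `ss` or a `-p`-capable `netstat` is installed; both need root to see other users' processes:)

```shell
# Show which process holds port 443; -p prints PID/program name.
# Falls back to a note if neither tool reports a listener.
netstat -tlnp 2>/dev/null | grep ':443' \
  || ss -tlnp 2>/dev/null | grep ':443' \
  || echo "no listener on 443 reported (or tools unavailable)"
```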
I tried tcpkill and many other solutions I found; none of them actually worked. The port always appears to be in use by PID 1.
But then I decided to execute nmap 127.0.0.1, which showed me:
~$ nmap 127.0.0.1
Starting Nmap 7.01 ( https://nmap.org ) at 2019-03-18 09:31 CET
Nmap scan report for localhost (127.0.0.1)
Host is up (0.00015s latency).
Not shown: 997 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
443/tcp  open  https
5900/tcp open  vnc
Nmap done: 1 IP address (1 host up) scanned in 0.15 seconds
And after that… the port became free. The second execution of this command shows:
~$ nmap 127.0.0.1
Starting Nmap 7.01 ( https://nmap.org ) at 2019-03-18 09:31 CET
Nmap scan report for localhost (127.0.0.1)
Host is up (0.00016s latency).
Not shown: 998 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
5900/tcp open  vnc
Nmap done: 1 IP address (1 host up) scanned in 0.09 seconds
How is it even possible that nmap is able to release a busy port? It works every time.
I'm very curious why this is happening, but I don't know where to start debugging. Or maybe it is a common problem and I just cannot find any description of it?
Have you checked the system logs on your Ubuntu host? You can use the journalctl command for this purpose. One potential reason that I can think of is socket activation, where systemd (which runs as PID 1) listens on the port and starts a process when something (like nmap) tries to connect to it.
To test this theory, you could for example reboot, run journalctl -f to follow the logs, and run nmap again in a different shell.
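If socket activation is indeed involved, the listening port should show up as a systemd socket unit. A quick check (a sketch; systemctl must be run on the host, not inside the container):

```shell
# Socket units are how systemd (PID 1) holds a port on behalf of a
# service it will start on the first connection. Port 443 appearing
# here would explain both the LISTEN entry and nmap "releasing" it.
systemctl list-sockets --all 2>/dev/null \
  || echo "systemctl not available here"
```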
Apart from checking logs, you could also run systemctl status to figure out which services were started or have failed.
Finally, it is also entirely possible that a service failed to start earlier in the boot process due to missing dependencies. For example, if your service depends on Docker but does not explicitly declare it as such, then the initial attempt to start it at boot could fail, whereas a manual start could work by luck since Docker has already been started.
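As an illustration of that last point, a hypothetical unit file for the container (name and paths invented here) would need explicit ordering against Docker to avoid exactly that race:

```
# /etc/systemd/system/gitlab-ci.service  -- hypothetical example
[Unit]
Description=GitLab CI container
# Without these two lines the unit may be started before dockerd is up:
Requires=docker.service
After=docker.service

[Service]
ExecStart=/usr/bin/docker start -a gitlab-ci
ExecStop=/usr/bin/docker stop gitlab-ci

[Install]
WantedBy=multi-user.target
```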
@Lekensteyn, thank you! Your answer led me to the solution.
As it may help others with a similar problem, here is what I did:
~$ systemctl list-units --state=failed
This showed that there were 4 failed services (legacy units that are no longer in use).
Then, as root:
~# systemctl stop <a_failed_service_name> && systemctl disable <a_failed_service_name>
executed for every failed unit.
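The stop-and-disable step can also be scripted over all failed units at once (a sketch; --no-legend and --plain strip the table header and bullet markers so only unit names are parsed):

```shell
# Iterate over every failed unit and stop + disable it (run as root).
# awk '{print $1}' keeps only the UNIT column of each row.
for unit in $(systemctl list-units --state=failed --no-legend --plain 2>/dev/null | awk '{print $1}'); do
    systemctl stop "$unit"
    systemctl disable "$unit"
done
```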
After a restart, port 443 is free.