Linux router frequently drops connections

Problem:

Until a few months ago, I had my Linux desktop serving double duty as a router for my home network, and all was well.

Then I set up a small Linux machine to act as a stand-alone router, and since then I have experienced lots of dropped connections.

Errors like this are frequent (this one during an rsync session):

Write failed: Broken pipe
rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
rsync: connection unexpectedly closed (1576 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(601) [sender=3.0.7]

My IM sessions disconnect and reconnect frequently. SSH sessions drop, and sometimes web pages fail to load.

It’s always an active connection that gets dropped (as opposed to problems establishing new connections). It’s most frequent when my Internet connection is busy in terms of the number of active connections, not in terms of bandwidth. For instance, running BitTorrent makes the problem much worse, but downloading or uploading a single large file that consumes 100% of my bandwidth does not seem to trigger it. I can always reconnect immediately (although the new connection often gets dropped soon, too).

I have an 8 Mbit (ha! yeah right!) cable modem connection from Telecable (one of the big cable companies in Mexico). I would have assumed it was a problem with their service, except that I don’t have the problem when I’m not using my router.

So it seems pretty apparent to me that I’m reaching some sort of “max connections” limit in my Linux router.

I have experienced similar problems in the past, on very busy systems, and increasing the connection tracking limit (ip_conntrack_max, or its equivalent in other kernel versions) has always solved the problem. But this time that doesn’t seem to be the issue. These values were read immediately after a series of disconnections:

/proc/sys/net/ipv4/netfilter/ip_conntrack_max: 48324
/proc/sys/net/ipv4/netfilter/ip_conntrack_count: 75
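
A single reading taken after the fact could miss a short spike in the table. As a minimal sketch for sampling the two values repeatedly while the drops are happening: the function name conntrack_usage is made up for illustration, and the default paths are the ones quoted above (newer kernels expose nf_conntrack_count and nf_conntrack_max under /proc/sys/net/netfilter instead).

```shell
# Print connection-tracking usage as "count/max (pct%)".
# Both file paths can be overridden; the defaults are the
# ip_conntrack files quoted in the question.
conntrack_usage() {
    count_file=${1:-/proc/sys/net/ipv4/netfilter/ip_conntrack_count}
    max_file=${2:-/proc/sys/net/ipv4/netfilter/ip_conntrack_max}
    count=$(cat "$count_file") || return 1
    max=$(cat "$max_file") || return 1
    # Integer percentage, rounded down.
    echo "$count/$max ($((100 * count / max))%)"
}
```

Something like `while sleep 1; do conntrack_usage; done` would then log one reading per second. If the count stays far below the maximum while connections drop, the conntrack table on the Linux router is probably not the limit being hit.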

For what it’s worth, the output of `iptables -L -t nat`:

target     prot opt source               destination
DNAT       tcp  --  anywhere             anywhere            multiport dports 6881:6999 to:
DNAT       udp  --  anywhere             anywhere            multiport dports 6881:6999 to:
DNAT       tcp  --  anywhere             anywhere            tcp dpt:4380 to:
DNAT       udp  --  anywhere             anywhere            udp dpt:4380 to:
DNAT       tcp  --  anywhere             anywhere            tcp dpt:49181 to:
DNAT       udp  --  anywhere             anywhere            udp dpt:49181 to:

target     prot opt source               destination
MASQUERADE  all  --  anywhere             anywhere

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

What else do I need to check?

Load average and memory usage, as requested:

            total       used       free     shared    buffers     cached
Mem:           755        747          8          0        154        504
-/+ buffers/cache:         88        667
Swap:         1903          0       1903

 18:32:19 up 4 days, 19:53,  2 users,  load average: 0.00, 0.00, 0.00

I also forgot to include the uname -a output originally:

Linux reep 2.6.32-5-686 #1 SMP Wed Jan 12 04:01:41 UTC 2011 i686 GNU/Linux

Solution:

Interestingly, one of the things your ‘router’ does is masquerade / NAT (if I’m reading that right).

If your cable modem normally does that task, its ability to handle outbound connections may be affected by the fact that everything on the other side of your ‘router’ (in quotes because it does more than just route) appears to come from a single IP address while the router is in use. In other words, if it has a per-host bucket for connections, or if its internal handling of connections simply can’t cope with a large number of connections from a single host, then the cable modem router could be overloaded.

One way to test this would be to configure your router to simply push packets to the cable modem router, instead of doing masquerade / NAT. This means configuring a default route and telling the kernel to forward packets (if I recall correctly). If I’m right, the problem should go away again.
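
On the Linux router, that test might look roughly like the following. This is a dry-run sketch under several assumptions: plain_routing_commands is a made-up helper name, 192.0.2.1 is only a placeholder for the cable modem router's LAN address, and the MASQUERADE rule is assumed to be rule 1 in the POSTROUTING chain, as the listing in the question suggests. The function only prints the commands so they can be reviewed before anything is changed.

```shell
# Dry run: print the commands instead of executing them.
# $1 is the LAN address of the cable modem router (placeholder value
# in the usage note; substitute the real one).
plain_routing_commands() {
    gateway=$1
    # Make sure the kernel forwards packets between interfaces.
    echo "sysctl -w net.ipv4.ip_forward=1"
    # Remove rule 1 from the nat POSTROUTING chain (assumed to be the
    # MASQUERADE rule; check with 'iptables -t nat -S POSTROUTING').
    echo "iptables -t nat -D POSTROUTING 1"
    # Send everything without a more specific route to the modem.
    echo "ip route replace default via $gateway"
}
```

Running `plain_routing_commands 192.0.2.1` prints three commands, which can be run as root once they look right. Without NAT, the cable modem router also needs a route back to the LAN subnet behind the Linux box, otherwise replies will not find their way back to the clients.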
