OOM killer despite lots of free memory on PAE kernel

Problem :

We have an application server which, for legacy reasons, still runs a 32-bit kernel with PAE (Ubuntu 12.04 LTS). The server has 24GB of RAM, as seen in the output of free:

$> free -lmt
             total       used       free     shared    buffers     cached
Mem:         24256      19468       4788          0          0       2382
Low:           189        146         42
High:        24067      19321       4745
-/+ buffers/cache:      17085       7170
Swap:        19956         47      19908
Total:       44212      19515      24697

However, as soon as real memory usage rises above approximately 16GB, processes are killed by the OOM killer (notably Google Chrome), and some memory allocations from Java also tend to fail. I have already set

vm.overcommit_memory = 1

via sysctl, but it doesn’t seem to help. Here is an excerpt of dmesg which shows the output after one of the OOMs.

Solution :

A quick search for "oom killer premature" suggests there are several reasons the OOM killer can be invoked even when the system appears to have plenty of free memory and swap.

One possible explanation is memory fragmentation. In particular, this line from your dmesg output:

Normal: 2386*4kB 2580*8kB 197*16kB 6*32kB 4*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 35576kB

suggests that the Normal zone has very few large contiguous free blocks: most of the 35576kB that is free sits in 4kB, 8kB and 16kB blocks.
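
As a rough check of that reading, the Normal line quoted above can be tallied per block size. This is only a sketch: the helper names parse_zone_line and free_kb_at_or_above are made up for illustration, and the 64kB cut-off for a "large" block is an arbitrary choice, not something the kernel defines.

```python
import re

# Free-list line for the Normal zone, copied verbatim from the dmesg output above.
ZONE_LINE = ("Normal: 2386*4kB 2580*8kB 197*16kB 6*32kB 4*64kB 0*128kB "
             "1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 35576kB")


def parse_zone_line(line):
    """Return a list of (block_count, block_size_kb) pairs from one zone line."""
    # Only "count*sizekB" entries match; the trailing "= 35576kB" total does not.
    return [(int(count), int(size))
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)]


def free_kb_at_or_above(blocks, min_size_kb):
    """Sum the free memory (in kB) held in blocks of at least min_size_kb."""
    return sum(count * size for count, size in blocks if size >= min_size_kb)


if __name__ == "__main__":
    blocks = parse_zone_line(ZONE_LINE)
    total_kb = free_kb_at_or_above(blocks, 0)
    # 64 kB is an illustrative threshold chosen for this example only.
    large_kb = free_kb_at_or_above(blocks, 64)
    print(f"total free in zone:         {total_kb} kB")
    print(f"free in blocks >= 64 kB:    {large_kb} kB")
```

For the line above, the per-size entries add up to 35576kB, matching the total the kernel printed, and only 2048kB of that is in blocks of 64kB or larger.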

I’m afraid this isn’t a complete answer to your question, but it could point you in one possible direction of inquiry.

