16 September 2009

ejabberd crash solved by raising the maximum number of network connections

I was bothered by ejabberd crashes. Suddenly it started to crash quickly after startup. The crash dumps said something like:

eheap_alloc: Cannot allocate 747325720 bytes of memory (of type "heap")
and:
eheap_alloc: Cannot allocate 934157120 bytes of memory (of type "old_heap").
I tried the installer, and I tried compiling Erlang and ejabberd from source. I usually run ejabberd in a virtual server, so I also tried it on the native host. Nothing helped. We had a similar problem before with a memory bug related to Mnesia tables, where a server with 5000 clients reached 3 GB in 5 days and crashed. That case was solved by upgrading the Erlang runtime from R12B-3 to R12B-5.

But in this case it took only 5 minutes and 1000 users to eat up the entire system memory. This was different. The one constant across all crashes was the number of client connections: about 1000. @zeank suspected that the problem was the limit on open file descriptors, which is 1024 by default. Each TCP (client) connection uses one file descriptor, so this limit also caps the number of connections. Setting it to a higher value, e.g. 32000, solves the problem.
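To check whether a server is getting close to that limit, something like the following sketch may help. It assumes a Linux host with /proc mounted; the helper names fd_limit and fd_count are made up for this example and are not part of ejabberd.

```shell
#!/bin/sh
# Hypothetical helpers, not part of ejabberd or ejabberdctl.

# Print the soft limit on open file descriptors for the current shell.
fd_limit() {
    ulimit -n
}

# Count the descriptors currently open by a given PID.
# Assumption: Linux, where /proc/<pid>/fd lists one entry per descriptor.
# Prints 0 if the PID does not exist or /proc is not available.
fd_count() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

echo "soft limit for this shell: $(fd_limit)"
echo "descriptors open in this shell: $(fd_count $$)"
```

To look at the running server instead of the shell, pass the PID of the Erlang VM to fd_count (the process is usually called beam or beam.smp, but check this on your system). If the count sits just below 1024 when the crashes start, the descriptor limit is a likely suspect.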

In practice: in the ejabberdctl script, find the line:
ERL_MAX_PORTS=32000
and insert directly after it:
ulimit -n $ERL_MAX_PORTS
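If the hard limit of the account is lower than the requested value, a plain ulimit call fails. A slightly more defensive variant could look like the sketch below. The function name raise_fd_limit is made up for this example, and this is not what ejabberdctl ships with.

```shell
#!/bin/sh
# Hypothetical helper: try to set the open-file limit and warn,
# instead of failing silently, when the shell is not allowed to.
raise_fd_limit() {
    target=$1
    if ulimit -n "$target" 2>/dev/null; then
        return 0
    fi
    echo "warning: cannot set open-file limit to $target (hard limit: $(ulimit -Hn))" >&2
    return 1
}

ERL_MAX_PORTS=32000
# Keep going even if the limit could not be changed.
raise_fd_limit "$ERL_MAX_PORTS" || true
echo "open-file limit is now $(ulimit -n)"
```

If the warning shows up, the hard limit has to be raised first, for example by starting the script as root or by adjusting the limits configuration of your system.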
_happy_ejabbering()
