SIP2 servers using lots of memory

Article Type: General
Product: Aleph
Product Version: 20

Description:
Under normal daily operations, one of our four SIP2 servers uses a significant amount of memory.

As the day goes on, its memory usage keeps increasing until our monitoring system begins alerting on swap space usage. Restarting all of the SIP2 servers frees several gigabytes of memory from the first SIP2 server, on port 5333, and no noticeable amount from the other servers.

When started, each SIP2 server spawns about 10 child processes, but the server on port 5333 quickly balloons to 35 or more child processes.

This has been a problem for us since version 18. It continued through version 19 and is still present in version 20.

Resolution:
Checking with the

aleph@libserv4(a20_3) ABC50> ps -ef | grep sip2 | grep 5333

command, I see that there are currently 44 child processes.
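
To track this count over the day instead of reading the listing by eye, the same filter can be wrapped in a small shell function. This is only a minimal sketch: the name `count_sip2_procs` is hypothetical, and it assumes, as the command above implies, that each SIP2 process line in `ps -ef` output contains the string `sip2` and the port number.

```shell
# Hypothetical helper: count ps-style lines (read from stdin) that
# mention "sip2" and the given port number.
# Assumption: the port appears in the process command line.
count_sip2_procs() {
    awk -v port="$1" '
        # Skip the grep process itself; require the port to stand alone,
        # so a PID such as 15333 does not count as a match for 5333.
        /sip2/ && !/grep/ && $0 ~ ("(^|[^0-9])" port "([^0-9]|$)") { n++ }
        END { print n + 0 }
    '
}

# Usage against the live process table:
#   ps -ef | count_sip2_procs 5333
```

Running this at intervals (for example from cron) and logging the result would show whether the number of processes on port 5333 grows steadily or jumps at particular times of day.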

The large number of current SIP2 processes results from the self-check (SC) machines opening more connections than they close.

There is no automatic timeout of these connections on the Aleph side, so it is very important that your SC machines always close each connection. There is no way to force a connection to close from within Aleph.
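
One way to check whether connections are piling up is to count the established TCP connections on the SIP2 port and compare the figure over the course of the day. The following is a sketch under stated assumptions: the function name `count_established` is hypothetical, and it expects Linux-style `netstat -ant` output, with the local address in the fourth column and the state in the sixth. On other platforms the column layout differs and the filter would need adjusting.

```shell
# Hypothetical helper: count ESTABLISHED connections whose local port
# matches the argument, reading `netstat -ant`-style lines from stdin.
# Assumption: Linux column layout
# (Proto Recv-Q Send-Q Local-Address Foreign-Address State).
count_established() {
    awk -v port="$1" '
        $6 == "ESTABLISHED" {
            # The port is the last colon-separated part of the local address.
            n_parts = split($4, parts, ":")
            if (parts[n_parts] == port) count++
        }
        END { print count + 0 }
    '
}

# Usage:
#   netstat -ant | count_established 5333
```

A count that keeps rising while the number of SC machines stays the same would be consistent with connections being opened and never closed.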

Note that rebooting the SC machines by unplugging them can leave the corresponding SIP2 process running, because the connection is never closed.

Could it be that your sites are unplugging the machines at some point?

Also, checking with util w/1/7/3, I see this:

| Port | Pid | Server Type | Started At | Status
|---------|---------|-----------------|-----------------|--------------------
| 5333 | 13985 | SC Server | Aug 10 14:47:53 | Free
| 5343 | 24927 | SC Server | Aug 06 06:40:08 | Free
| 5353 | 25268 | SC Server | Aug 06 06:40:20 | Free
| 5363 | 25400 | SC Server | Aug 06 06:40:31 | Free

I suggest that you restart all of the self-check servers each night. This can be done through the job_list, as described in KB 16384-29841.

Article last edited: 10/8/2013