Long-running rts32 (ue_01) sessions with high CPU
- Article Type: General
- Product: Aleph
- Product Version: 20
Description:
A number of Unix sessions were running rts32 on our test server, using 98% of physical memory and 96% of virtual memory. They had been logged into the database since May 27.
Two sessions, with Unix PIDs 2032 and 2033, appeared to be using the most resources. I killed both, and the CPU usage for the database dropped to a fraction of what it had been, but memory usage is still tight: available free memory on the server increased only from 90 MB to 390 MB (total memory is 6 GB). The same thing happened a couple of weeks ago, when we had to kill all of the rts32 sessions and restart the Aleph applications.
Now the same problem has recurred. Looks like there are also a number of Unix sessions that have been running rts32 in production since as early as May 11. Are these sessions regular sessions? Could you help identify the root cause of the problem?
UID PID PPID C STIME TTY TIME CMD
aleph 2380 1 0 May 27 pts/2 0:01 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_a ABC11.a18_1
aleph 2176 1 0 May 27 pts/2 0:01 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_a ABC10.a18_1
aleph 2854 1 0 May 27 pts/2 0:01 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_word_index ABC30.a18_1
aleph 2668 1 0 May 27 pts/2 0:01 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_z0102_index ABC13.a18_1
aleph 1893 1 0 May 27 pts/2 0:01 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_a USM01.a18_1
aleph 2033 1 1 May 27 pts/2 1177:36 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_z0102_index ABC01.a18_1
aleph 2853 1 0 May 27 pts/2 0:02 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_z0102_index ABC30.a18_1
aleph 3022 1 0 May 27 pts/2 0:16 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_08_a ABC01.a18_1 C
aleph 2034 1 0 May 27 pts/2 1:36 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_word_index ABC01.a18_1
aleph 2852 1 0 May 27 pts/2 0:06 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_a ABC30.a18_1
aleph 2381 1 0 May 27 pts/2 0:01 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_z0102_index ABC11.a18_1
aleph 2949 1 0 May 27 pts/2 0:01 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_08_a USM01.a18_1 C
aleph 2533 1 0 May 27 pts/2 0:01 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_word_index ABC12.a18_1
aleph 2382 1 0 May 27 pts/2 0:01 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_word_index ABC11.a18_1
aleph 2178 1 0 May 27 pts/2 0:01 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_word_index ABC10.a18_1
aleph 2177 1 0 May 27 pts/2 0:01 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_z0102_index ABC10.a18_1
aleph 1895 1 0 May 27 pts/2 0:01 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_word_index USM01.a18_1
aleph 2669 1 0 May 27 pts/2 0:01 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_word_index ABC13.a18_1
aleph 2532 1 0 May 27 pts/2 0:01 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_z0102_index ABC12.a18_1
aleph 2032 1 3 May 27 pts/2 2874:22 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_a ABC01.a18_1
aleph 2531 1 0 May 27 pts/2 0:01 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_a ABC12.a18_1
aleph 15401 1 0 00:05:36 pts/2 0:06 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_11_a ABC0.a18_1
aleph 15445 1 1 00:05:43 pts/2 46:07 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_06_a ABC50.a18_1
aleph 3081 1 0 May 27 pts/2 3:29 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_06_a USM50.a18_1
aleph 2667 1 0 May 27 pts/2 0:01 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_a ABC13.a18_1
aleph 1894 1 0 May 27 pts/2 0:01 /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_z0102_index USM01.a18_1
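For reference, a listing like the one above can be gathered, and the runaway sessions stopped, with standard Unix commands. This is only a sketch; the PIDs are the two heavy sessions from the listing, and kill -9 should be a last resort:
ps -ef | grep '[r]ts32'    # list all rts32 sessions ([r] keeps the grep process itself out of the output)
kill 2032 2033             # send the default TERM signal to the two heaviest sessions
kill -9 2032 2033          # only if they are still running and ignoring TERM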
Resolution:
I see that the 2032 process is: /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_a ABC01.a18_1
And the 2033 process is: /exlibris/aleph/a18_1/aleph/exe/rts32 ue_01_z0102_index ABC01.a18_1
These are the ue_01 daemon and its z0102 index process.
You say:
"Looks like there are also a number of Unix sessions that have been running rts32 in production since as early as May 11. Are these sessions regular sessions?"
I see this in the mgu01 $data_scratch directory:
-rw-rw-r-- 1 aleph aleph 3671553 May 16 18:20 run_e_01.8986
-rw-r--r-- 1 aleph aleph 16523617 Jun 2 17:33 run_e_01.981
This indicates that run_e_01.981 was running from May 17 until June 2. Was Aleph restarted during that time? If so, it seems that this ue_01 process was not shut down; it needs to be. (If a ue_01 process continues running through an Oracle shutdown/startup, it can behave badly in the fashion you describe.)
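As a sketch of how to check for a stale ue_01 and restart it cleanly (assuming a standard Aleph installation, where ue_01 is controlled from the UTIL E menu; the exact menu option numbers vary by version):
dlib mgu01                       # switch to the affected library
ls -lt $data_scratch/run_e_01.*  # a log file with an old date points to a long-lived ue_01
util e                           # use the menu options that stop, then restart, ue_01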
It is good practice to shut down and start up Aleph at least once per week.
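One way to enforce this is a scheduled restart. The sketch below assumes the standard aleph_shutdown and aleph_startup scripts; the restart_aleph wrapper script and the schedule are hypothetical and site-specific, and the wrapper must load the Aleph environment before calling the scripts. Shut Aleph down before any Oracle restart and bring it back up only after Oracle is running, so that no ue_01 process survives an Oracle shutdown/startup:
aleph_shutdown    # stop the Aleph servers and daemons
aleph_startup     # start them again
# Example crontab entry for the aleph user (install with crontab -e), restarting every Sunday at 05:00;
# restart_aleph is a hypothetical site wrapper around the two scripts above:
0 5 * * 0 /exlibris/aleph/a18_1/aleph/restart_aleph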
- Article last edited: 2/4/2015