p_manage_01: HEAP DUMP ... "Execution error .../io_z00.gnt"

Last updated
Save as PDF
Share
1. Share
2. Tweet
3. Share

Article Type: General
Product: Aleph
Product Version: 16.02

Description:
We are running parallel indexing on our test server, with odn04 as the parallel library. p_manage_01 is suspended with this in the $data_scratch /p_manage_01_a.err file:

FAILURE Tue Oct 17 21:47:25 CDT 2006 ==================
step 1
Cycle: 4

b_manage_01_a: Retrieve Failure

Job Suspended !!!

And then we see this in the p_manage_01_a_4.log:
....
15015 READING DOC 000165015 AT 21:47:01
kghalo bad size 0x04000020
********** Internal heap ERROR KGHALO2 addr=0 *********

******************************************************
HEAP DUMP heap name="Alloc environm" desc=14c28b4

<snip>

003754C0 01000000 03000738 30BFFE31 01000000 03000744 30BFFE2E 01000000 03000750
003754E0 30BFFE2B 01000000 0300075C 30BFFE28 01000000 03000768 30BFFE25 01000000

----- End of Call Stack Trace -----

Execution error : file '/exlibris/aleph/a16_1/aleph/exe/io_z00.gnt'
error code: 114, pc=0, call=14, seg=0
114 Attempt to access item beyond bounds of memory (Signal 11)

Tue Oct 17 21:47:25 CDT 2006

FAILURE Tue Oct 17 21:47:25 CDT 2006 ==================
step 1
Cycle: 4

b_manage_01_a: Retrieve Failure

Three attempts to run this job (with 7 processes) have all failed with this error.

Resolution:
Running the job with only three processes -- and canceling a tivoli dsmc process which was constantly taking 25% of CPU -- worked.

It seems that either this server cannot handle 7-9 process jobs -- or perhaps the out-of control tivoli dsmc process made the server susceptible to the HEAP DUMP crash when there were 7-9 processes.

It may be that there is something about the two server (Oracle back server) setup which is also a factor in this.

Additional Information

heap dump

Article last edited: 10/8/2013