Optimal cycle size / number of processes for indexing jobs
- Article Type: Q&A
- Product: Aleph
- Product Version: 20
Question
What are the best cycle size (loop length) and number of processes to use when running Aleph indexing jobs? (The cycle size is specified in $data_root/prof_library, for example in "p_manage_01_loop_length".)
Answer
A cycle size of 10% of the number of records to be indexed -- with a maximum of 50000 -- is optimal.
Thus,
for 100,000 records ... 10000
for 200,000 records ... 20000
for 300,000 records ... 30000
for 400,000 records ... 40000
for 500,000 and up ... 50000
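The rule above is simple arithmetic and can be sketched in a few lines of Python. This is only an illustration, not part of Aleph: the function name recommended_cycle_size is hypothetical, and the floor of 1 for very small record counts is an assumption that the article does not state.

```python
MAX_CYCLE_SIZE = 50000  # cap stated in the answer above


def recommended_cycle_size(record_count: int) -> int:
    """Return 10% of record_count, capped at MAX_CYCLE_SIZE."""
    if record_count < 0:
        raise ValueError("record_count must be non-negative")
    # The floor of 1 is an assumption for tiny inputs, not from the article.
    return max(1, min(record_count // 10, MAX_CYCLE_SIZE))


if __name__ == "__main__":
    for n in (100_000, 250_000, 500_000, 2_000_000):
        print(n, recommended_cycle_size(n))
```

The resulting value is what would be entered for the relevant p_manage_nn_loop_length entry in prof_library.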
For the manage-01 and manage-02 jobs, which have a single-threaded step, the number of processes should be 8 on a small machine and 16 on a large one. For other jobs, 16 processes can be used.
Additional Information
Note 1: If (number of processes) * (cycle size) is greater than the number of records, that is not a problem. The extra cycles will simply go unused.
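To make Note 1 concrete, the following Python sketch computes how far (number of processes) * (cycle size) exceeds the record count. It is plain arithmetic for illustration only; the function name surplus_capacity is hypothetical and nothing here reflects how Aleph schedules cycles internally.

```python
def surplus_capacity(processes: int, cycle_size: int, record_count: int) -> int:
    """Return how many records' worth of capacity is left over.

    The result is 0 when processes * cycle_size does not exceed record_count.
    """
    return max(0, processes * cycle_size - record_count)


if __name__ == "__main__":
    # 16 processes with a cycle size of 10000 against 100,000 records
    print(surplus_capacity(16, 10000, 100_000))  # 60000
```

A positive result corresponds to the harmless case described in Note 1.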
Note 2: Article 000033393 (Loop length recommendations) describes the job/function for each prof_library p_manage_nn_loop_length entry.
Category: Background processing (500)
- Article last edited: 1/15/2015