P_PNX table seems to grow when normalization is re-run
- Article Type: General
- Product: Primo
- Product Version: 2
Description:
We have been renormalizing the same set of records as we tweak our normalization rules. The source record count stays the same, which is as it should be since we are normalizing exactly the same set of records.
What is odd is that after each normalization, the PNX count goes up by about the same amount, and that amount is exactly equal to the number of new dedup and FRBR records. It looks like Primo is adding the new FRBR and dedup records to the PNX database instead of overlaying the existing ones, so if we continue to renormalize, the PNX database will bloat artificially.
Is this behavior a bug, expected, or something else? If it is expected, why isn't the PNX database being cleaned up?
Resolution:
When data is reloaded, the system deletes the FRBR and dedup records that were previously created for the updated records. The deletion is logical rather than physical, so the old rows stay in the table, and the P_PNX table grows by the number of FRBR and dedup records created on each run.
If you want to start from scratch or avoid growth of the P_PNX table, you can run the clean_data_all.sh script, which removes the data and the indexes.
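To illustrate why logical deletion makes the table grow, here is a minimal Python sketch using an in-memory SQLite database. It is not Primo's actual schema or code: the `pnx` table, its `kind` and `deleted` columns, and the `renormalize`, `counts`, and `simulate` functions are all hypothetical names chosen for this example. It only assumes the behavior described above, where each run flags the old dedup/FRBR rows as deleted and inserts a new set.

```python
import sqlite3


def renormalize(conn, derived_count):
    """Simulate one renormalization run that uses logical deletes."""
    # Flag the previously generated dedup/FRBR rows as deleted.
    # The rows are NOT physically removed (hypothetical 'deleted' column).
    conn.execute(
        "UPDATE pnx SET deleted = 1 WHERE kind = 'derived' AND deleted = 0"
    )
    # Insert a fresh set of dedup/FRBR rows for this run.
    conn.executemany(
        "INSERT INTO pnx (kind, deleted) VALUES (?, 0)",
        [("derived",)] * derived_count,
    )


def counts(conn):
    """Return (total rows, rows not flagged as deleted)."""
    total = conn.execute("SELECT COUNT(*) FROM pnx").fetchone()[0]
    active = conn.execute(
        "SELECT COUNT(*) FROM pnx WHERE deleted = 0"
    ).fetchone()[0]
    return total, active


def simulate(source_count, derived_count, runs):
    """Run several renormalizations and record the counts after each one."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pnx (id INTEGER PRIMARY KEY, kind TEXT, deleted INTEGER)"
    )
    # Source-derived rows: their number stays constant across runs.
    conn.executemany(
        "INSERT INTO pnx (kind, deleted) VALUES (?, 0)",
        [("source",)] * source_count,
    )
    history = []
    for _ in range(runs):
        renormalize(conn, derived_count)
        history.append(counts(conn))
    conn.close()
    return history


if __name__ == "__main__":
    for run, (total, active) in enumerate(simulate(100, 30, 3), start=1):
        print(f"run {run}: total rows = {total}, active rows = {active}")
```

In this sketch the number of active rows stays constant between runs, while the total row count rises by the number of dedup/FRBR rows on every run, which matches the growth pattern described in the question.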
Additional Information
dedup, frbr, p_pnx, database
- Article last edited: 10/8/2013