Configuring Pipes
This functionality is not used in Primo VE environments. If you are looking to load records from external data sources, see Loading Records from External Sources into Primo VE.
Defining a Pipe
To create an effective pipe for your system, first create your data sources, normalization mapping sets, and enrichment sets.
-
Click Pipe Configuration Wizard on the Ongoing Configuration Wizard page. The Pipe Configuration Wizard page opens.
You can also access the Define Pipe page by clicking Create new pipe on the Primo Home > Monitor Primo Status > Pipe Monitoring page.
-
Click Pipes Configuration. The Pipes Configuration page opens.
Figure: Pipe Configuration Page
-
Click Define Pipe to open the Define Pipe page.
-
Select the name of the institution from the Owner drop-down list. For institution-level staff users, your institution will already be selected.
For installation-level users, you must select an institution before the associated values appear in the drop-down lists that display the Select Institution value.
-
In the Pipe Name field, enter the name of the new pipe.
The pipe name can contain only letters, numbers, and the underscore character.
-
In the Pipe Description field, enter a description for the new pipe.
-
Enter the remaining fields as described in the following table.
Define Pipe Details
Field name / Description
Pipe Type
Indicates the type of pipe. The following types are valid:
-
Regular – This type of pipe uses records harvested from the data source to create, update, and delete PNX records. For more information on the stages of pipe execution, see Configuring the Publishing Platform Pipe Flow.
-
Delete Data Source – This type of pipe is used to delete a data source from the Primo database, including data from dedup and FRBR groups. It removes all previously harvested records from the P_PNX and P_SOURCE_RECORD tables for the specified data source. In addition, it removes all tags and reviews.
-
No Harvesting – Update Data Source – This pipe is similar to a “Regular” pipe, but records are not harvested from the data source. It uses all of the previously harvested source records from the P_SOURCE_RECORD table instead of the data source. This type of pipe is typically used when it is necessary to re-normalize and/or enrich all records from a specific data source (for example, due to a change in normalization rules).
-
Delete Data Source and Reload – This pipe is similar to the Regular pipe, but it first removes all harvested records from the P_PNX and P_SOURCE_RECORD tables before reloading the PNX records from the data source. This option is intended for data sources (such as MetaLib) that have to harvest the entire database each time. This ensures that records deleted from the data source are also removed from Primo.
The default value is Regular.
When running pipes (such as pipes set to No Harvesting – Update Data Source) that add or change a large amount of data, it is recommended that you stop Oracle archiving, because archiving slows down the process and fills up the disk. Immediately after the process is complete, perform a full cold backup and then turn archiving back on.
Records that are deleted and re-inserted using the Delete Data Source and Reload option may be counted as updated records (instead of as deleted and inserted records) in the pipe’s log.
Data Source
The data source of the pipe.
Normalization Mapping Set
The normalization mapping set used to map the source records to the PNX.
Priority
The priority of the pipe: Low, Medium, High, or Critical. Pipes with the highest priority run first. The default setting is Medium.
Maximum error threshold
The maximum percentage of errors allowed before the system stops running the pipe.
Harvesting method
The method used to harvest the source information. The following methods can be selected: FTP, Copy, OAI, and SFTP.
If Copy is selected, the user must have read permission for the directory.
Enrichment Set
The enrichment set used to enrich the records.
Harvested File Format
Indicates the format of the harvested file. The following values are valid: *.tar.gz, *.tar, *.gz, *.warc, *.warc.gz, and *.zip.
This field is not available with all types of pipes, such as Delete Data Source.
The *.gz, *.warc, *.warc.gz, and *.zip formats require the data source to use the WARC file splitter.
Start harvesting files/records from
The date from which to harvest the records.
-
For FTP/Copy, this is the date and time of the file to harvest. Following harvesting, this date is updated with the date of the latest harvested file.
-
For OAI, this is the date and time on which the file is to be updated. Following harvesting, this date is updated with the date of the request.
This date is updated after each successful run of the pipe to ensure that all harvested files have been processed completely.
Use Static Date
When selected, the date shown in the Start harvesting files/records from field will not be updated each time the pipe is run. This is useful if you need to reload all the source data each time and want to retain the original start from date.
Start time
The time from which to harvest the records.
System Last Stage
This field allows you to change the last stage that is run during the execution of a pipe. By default, this field is set to FRBR, the last stage of pipe execution. The following values are valid:
-
PERSISTENCE – This option stops the execution of the pipe after loading records to the database. Note that the Dedup and FRBRization stages are not executed.
-
DEDUP – This option stops the execution of the pipe after the Dedup stage. Note that the FRBRization stage is not executed.
-
FRBR – This default option stops the execution of the pipe after the FRBRization process completes.
-
FRBR WITHOUT DEDUP – This option skips the Dedup stage and stops the execution of the pipe after the FRBRization process completes.
This field does not display when the Parallel Processing of Pipes mode is set to Harvesting, NEP on the General Configuration page.
Include DEDUP
Indicates whether the Dedup stage will be executed when the Parallel Processing of Pipes mode is set to Harvesting, NEP on the General Configuration page.
Include FRBR
Indicates whether the FRBR stage will be executed when the Parallel Processing of Pipes mode is set to Harvesting, NEP on the General Configuration page.
Force DEDUP
Indicates whether Dedup processing is performed on PNX records that have no changes to the dedup section. This allows you to apply changes made to the Dedup rules.
Force Dedup and Force FRBR are settings that were designed to be enabled only occasionally, as needed. They are not intended to be enabled permanently, and are best used for pipes that are executed during off hours to minimize the impact on patrons and staff users.
-
If the pipe is not configured to run the Dedup stage, Dedup processing will not be forced regardless of this setting.
-
Pipes with this setting are meant to be run occasionally as needed. To minimize the impact on patrons and staff, it is recommended that you execute the pipe during off hours.
Force FRBR
Indicates whether FRBR processing is performed on PNX records that have no changes to the frbr section. This allows you to apply changes made to the FRBR rules.
-
If the pipe is not configured to run the FRBR stage, FRBR processing will not be forced regardless of this setting.
-
Pipes with this setting are meant to be run occasionally as needed. To minimize the impact on patrons and staff, it is recommended that you execute the pipe during off hours.
Server
The IP address used to access the server.
This field appears only if the harvesting method is OAI, FTP, or SFTP.
For OAI, the system supports the HTTPS protocol for harvesting.
Username
The user name used to access the server.
This field appears only if the harvesting method is FTP or SFTP.
Password
The password used to access the server.
This field appears only if the harvesting method is FTP or SFTP.
Metadata format (OAI only)
All OAI-PMH compliant repositories can return records in Dublin Core format. The Dublin Core format is usually expressed as oai_dc, but some repositories use a different code. Enter the term used by your repository.
This field appears only if the harvesting method is OAI.
Set (OAI only)
OAI repositories may organize items into sets, allowing you to selectively harvest information. Specify the name of the set if you want to harvest only a specific part of the OAI repository.
This field appears only if the harvesting method is OAI.
Encode Resumption Token (OAI only)
Indicates whether to encode the resumption token (for example, when it contains characters such as @) within the OAI protocol. The valid values are true and false. The default value is false.
This field appears only if the harvesting method is OAI.
Source directory
The directory of the source record. This is used for copy only.
This field appears only if the harvesting method is Copy, FTP, or SFTP.
Delete after copy
Indicates whether the system should delete the source files after the harvest. If selected, the files are deleted as follows, per harvesting method:
-
Copy – The files are removed from the directory on the Primo server.
-
FTP/SFTP – The files are removed from the directory on the source server. If the staff user does not have write permissions to the source files, the system will stop the pipe and log the following error:
stop harvest error
If this check box is not selected, the source files are not removed from their respective directories after harvesting.
After the harvest, the system stores a copy of the source files in the harvest directory. To view the harvested files, enter the following commands:
-
be_pipes
-
cd <pipe_name>/<data_source>/<timestamp-of-the-pipe_run>/harvest
Configure Server Locale
When this field is selected, the page displays the Server Locale field.
This field appears only if the harvesting method is FTP.
Server Locale
Select a locale from the drop-down list.
This field appears only if the harvesting method is FTP and the Configure Server Locale check box is selected.
By default, the harvester assumes the locale of the server is English. If the locale of your server is different, you must select the relevant locale.
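Several of the rules in the Define Pipe Details field descriptions can be summarized in a short sketch. The following Python is illustrative only: the names LAST_STAGE_EFFECT, FIELD_METHODS, visible_fields, effective_force_flags, and next_start_date are hypothetical and are not Primo identifiers or APIs. It restates which stages run for each System Last Stage value, which connection fields appear for each harvesting method, and when the Start harvesting files/records from date changes. It does not model the Harvesting, NEP parallel processing mode, in which Include DEDUP and Include FRBR replace System Last Stage.

```python
# Illustrative sketch only: all names here are hypothetical, not Primo identifiers.

# System Last Stage value -> (Dedup stage runs, FRBR stage runs)
LAST_STAGE_EFFECT = {
    "PERSISTENCE": (False, False),
    "DEDUP": (True, False),
    "FRBR": (True, True),  # default value
    "FRBR WITHOUT DEDUP": (False, True),
}

# Field -> harvesting methods for which the field is described as appearing
FIELD_METHODS = {
    "Server": {"OAI", "FTP", "SFTP"},
    "Username": {"FTP", "SFTP"},
    "Password": {"FTP", "SFTP"},
    "Metadata format": {"OAI"},
    "Set": {"OAI"},
    "Encode Resumption Token": {"OAI"},
    "Source directory": {"Copy", "FTP", "SFTP"},
    "Configure Server Locale": {"FTP"},
}


def visible_fields(method, configure_server_locale=False):
    """Return the method-dependent fields shown for a harvesting method."""
    fields = [name for name, methods in FIELD_METHODS.items() if method in methods]
    # Server Locale appears only for FTP when the check box is selected.
    if method == "FTP" and configure_server_locale:
        fields.append("Server Locale")
    return fields


def effective_force_flags(last_stage, force_dedup, force_frbr):
    """A Force setting has no effect if the corresponding stage does not run."""
    runs_dedup, runs_frbr = LAST_STAGE_EFFECT[last_stage]
    return (force_dedup and runs_dedup, force_frbr and runs_frbr)


def next_start_date(current_start, latest_harvest, run_succeeded, use_static_date):
    """Date kept in 'Start harvesting files/records from' after a pipe run.

    latest_harvest stands for the date of the latest harvested file
    (FTP/Copy) or the date of the request (OAI).
    """
    if use_static_date or not run_succeeded:
        return current_start
    return latest_harvest
```

For example, visible_fields("Copy") returns only the Source directory field, and effective_force_flags("DEDUP", True, True) leaves Force FRBR without effect because the FRBR stage is not executed.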
-
For FTP, OAI, and SFTP harvesting methods, click Test Connection to verify the connection to the server.
-
Click Save.
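Before entering values on the Define Pipe page, it can help to check a draft definition against the constraints stated in the procedure. The following Python is a minimal sketch, not part of Primo: validate_pipe, PIPE_NAME_RE, and PRIORITIES are hypothetical names, and the sketch assumes that "letters" means ASCII letters and that the error threshold is a percentage from 0 to 100, neither of which the procedure states explicitly.

```python
import re

# Assumption: "letters" means ASCII letters; the procedure does not specify this.
PIPE_NAME_RE = re.compile(r"[A-Za-z0-9_]+")

# Priority values listed in the Define Pipe Details table.
PRIORITIES = ("Low", "Medium", "High", "Critical")


def validate_pipe(name, priority="Medium", max_error_percent=None):
    """Return a list of problems found in a draft pipe definition."""
    problems = []
    if not PIPE_NAME_RE.fullmatch(name):
        problems.append("Pipe name may contain only letters, numbers, and underscores")
    if priority not in PRIORITIES:
        problems.append("Priority must be one of: " + ", ".join(PRIORITIES))
    # Assumption: the threshold is a percentage between 0 and 100 inclusive.
    if max_error_percent is not None and not 0 <= max_error_percent <= 100:
        problems.append("Maximum error threshold must be between 0 and 100")
    return problems
```

A call such as validate_pipe("Alma_Full_01") returns an empty list, whereas a name that contains a space or punctuation is reported as a problem.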
Editing a Pipe
-
On the Primo Home > Monitor Primo Status > Pipe Monitoring page, click Edit next to the pipe that you want to update. The Define Pipe page opens, showing the details of the specified pipe (see Define Pipe Page).
-
Edit the fields according to Define Pipe Details.
-
Click Save to update the pipe's settings.
Deleting a Pipe
-
On the Primo Home > Monitor Primo Status > Pipe Monitoring page, click Edit next to the pipe that you want to delete. The Define Pipe page displays the specified pipe's details (see Define Pipe Page).
-
Click Delete Pipe to delete the pipe.