Note: These instructions are for Bepress Digital Commons, but many steps will apply to any harvest pipe.

Basic Information

Useful Orbis Cascade instructions to start with: https://www.orbiscascade.org/blog/9/?bid=153

Important to know before you begin:

Go in order: source, scope, norm rules, then pipe last.
Deploy after every step, and Deploy All before running the pipe for the first time.
Records will not be discoverable in OneSearch until indexing is run for the entire installation. The pipe is for our local instance, but the records are still added to a single installation-level database, which is all indexed at once.
View harvested records in the PNX viewer at any time after the pipe finishes. Search using [source code*] in Record ID. Adding date/time to Created after or Updated after will make the search go much better (it's super slow otherwise). Put date AND time (e.g. 05/03/2017 01:01) or it won't work.

...

Data Source
Note that the input path is oai_dc:dc, despite the fact that we harvest qualified DC data.
[Image]

Scope

To view records in Primo, you have to create a scope and then put that scope in a view. The scope must have the same name as your source, because the out-of-the-box (OTB) norm rules are already set up to write the source as a scope value in the PNX.

[Image]

Add the scope to any view where you want the IR records to be in your search results. One Ex Libris trainer defined the scopes you build in the Views area as "scope buckets." They're called scopes, but they're really containers of one or more scopes. You can add your IR as a separate scope bucket so it will be searched in a separate tab/dropdown, or add it as a value to an existing scope bucket so the results are automatically included with whatever else that bucket searches.

Separate scope bucket:
[Image]

Add to existing scope bucket:
[Image]

If you added it to an existing scope bucket, you're done. If not, you need to choose which tab to put it in, click Edit, and then Add a New Scope:
[Image]

Norm Rules

Duplicate from Generic Dublin Core. Start with the Orbis Cascade recommended changes. For an example of modified norm rules (for local additions and to adapt to Qualified Dublin Core) see here: Example of DC Norm Rules Changes for Institutional Repository Harvest

Test records: Load a harvested file, or do this quick-and-dirty method:

...

Regular
- Used for harvesting
Delete DS
- Used to delete records if necessary for a completely fresh start
No Harvest
- Update the records without re-harvesting. Use after any change to norm rules.

Determine whether to harvest the whole source or a smaller set. Set names can be found in the header when viewing an XML record:

[Image]
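To check set names without reading the XML by eye, the header of a harvested record can be parsed with a few lines of Python. This is only a sketch: the sample record, its identifier, and the function name set_specs are made up for illustration, and it assumes the record follows the standard OAI-PMH 2.0 layout, where set names appear in setSpec elements inside the header.

```python
import xml.etree.ElementTree as ET

# Standard OAI-PMH 2.0 namespace; record headers live in this namespace.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical sample record, shaped like an OAI-PMH record header.
SAMPLE = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:ir.example.edu:students-1001</identifier>
    <datestamp>2017-05-03T01:01:00Z</datestamp>
    <setSpec>publication:students</setSpec>
  </header>
</record>"""


def set_specs(xml_text):
    """Return every setSpec value found in the record header(s) of xml_text."""
    root = ET.fromstring(xml_text)
    return [
        el.text.strip()
        for el in root.iterfind(".//oai:header/oai:setSpec", OAI_NS)
        if el.text
    ]


if __name__ == "__main__":
    print(set_specs(SAMPLE))
```

The value returned for a record is the exact string to enter in the pipe's Set field.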

Pipe configuration for Digital Commons:

Data Source: Institutional Repository
Harvesting Method: OAI
Server: [base URL]/do/oai/ (e.g. http://digitalcommons.calpoly.edu/do/oai/)
Metadata format: qdc
Set: exact name of set as found in XML header (e.g. "publication:students"; separate multiple sets with commas, e.g. publication:seniorprojects, publication:theses)
Start harvesting files/records from: (use date of oldest content; auto-updates to most recent harvest date)
Start time: (default)
Normalization Mapping Set: your new norm rules (e.g. 01CALS_PSU_SP)
Enrichment Set: No Enrichments - Template
Harvested File Format: *.tar.gz
Include DEDUP (checked)
Include FRBR (checked)
Force DEDUP (checked)
Force FRBR (checked)
Maximum error threshold: 5% (default)
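Before running the pipe, it can help to preview what the source returns for the same settings by opening the equivalent OAI-PMH request in a browser. The sketch below builds that URL from the Server, Metadata format, Set, and start date values above. The function name list_records_url and the host ir.example.edu are placeholders, and it assumes the repository answers standard OAI-PMH ListRecords requests at [base URL]/do/oai/. It shows a comparable request, not necessarily the exact one Primo sends.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="qdc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords URL for a Digital Commons style endpoint.

    base_url  -- repository base URL (hypothetical example: https://ir.example.edu)
    set_spec  -- exact set name from the XML header, or None for the whole source
    from_date -- optional lower date bound in YYYY-MM-DD form
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date
    # urlencode percent-encodes the colon in set names such as publication:students.
    return base_url.rstrip("/") + "/do/oai/?" + urlencode(params)


if __name__ == "__main__":
    print(list_records_url("https://ir.example.edu",
                           set_spec="publication:students",
                           from_date="2017-05-03"))
```

Leaving set_spec as None corresponds to harvesting the whole source rather than a smaller set.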

Example of Regular (harvest) pipe:

[Image]

Example of Delete pipe:

[Image]

Example of No Harvest (renorm) pipe:

[Image]

Deleting a pipe:

A pipe that has never been executed can be deleted in the Back Office. If the pipe has ever run, only Ex Libris can delete it; open a Salesforce ticket to request this.

...

Note that a status of completed when Stage still says "harvesting" isn't necessarily an error. It can mean that there were no new records in the source available for harvesting, so the job ended there.

[Image]
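One way to confirm independently that the source really had nothing new is to request the same date range from the repository yourself and look at the response. In OAI-PMH 2.0, a request that matches no records returns an error element with the code noRecordsMatch. The sketch below checks a response for that code. The sample response and the function name has_no_new_records are illustrative, and whether Primo's "completed while harvesting" status always corresponds to this response is an assumption, not something the Back Office confirms.

```python
import xml.etree.ElementTree as ET

# Standard OAI-PMH 2.0 namespace.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical response for a date range that contains no new records.
EMPTY_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2017-05-03T01:01:00Z</responseDate>
  <request verb="ListRecords">https://ir.example.edu/do/oai/</request>
  <error code="noRecordsMatch">No records match the request.</error>
</OAI-PMH>"""


def has_no_new_records(response_xml):
    """Return True if the OAI-PMH response reports the noRecordsMatch error."""
    root = ET.fromstring(response_xml)
    return any(
        err.get("code") == "noRecordsMatch"
        for err in root.iterfind("oai:error", OAI_NS)
    )


if __name__ == "__main__":
    print(has_no_new_records(EMPTY_RESPONSE))
```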

Additional Back Office Configuration

If an extraneous and non-functional link appears in the GetIT! 2 (At the Library) section of your records, here is how to remove it without disturbing possibly necessary GetIT! 2 link functions for Alma records.

ADD row, ENABLED: Online Resource not_restricted 01CALS_ALMA display openurl
ADD row, NOT ENABLED: Online Resource not_restricted /[your data source code] display openurl
DISABLE row: Online Resource not_restricted ALL display openurl
You must ADD a row for every data source if you take this approach.

[Image]
