Note:  These instructions are for Bepress Digital Commons, but many steps will apply to any harvest pipe.

Basic Information

Ex Libris documentation

Useful Orbis Cascade instructions to start with:    https://www.orbiscascade.org/blog/9/?bid=153

Important to know before you begin:

  • Go in order:  source, scope, norm rules, then pipe last.
  • Deploy after every step, and Deploy All before running the pipe for the first time.
  • Records will not be discoverable in OneSearch until indexing is run for the entire installation.  The pipe is for our local instance, but the records are still added to a single installation-level database, which is all indexed at once.
  • View harvested records in the PNX viewer at any time after the pipe finishes.  Search using [source code*] in Record ID.  Adding a date and time to Created after or Updated after makes the search much faster; without it the search is very slow.  Enter both the date AND the time (e.g. 05/03/2017 01:01) or it won't work.

...

Data Source
Note that the input path is oai_dc:dc, despite the fact that we harvest qualified DC data.
[Image]


Scope

To view records in Primo, you have to create a scope, and then put that scope in a view.  The scope needs to be named the same thing as your source, because the OTB norm rules are already set up to write the source as a scope value in the PNX.

[Image]

Add the scope to any view where you want the IR records to be in your search results.  One Ex Libris trainer described the scopes you build in the Views area as "scope buckets."  They're called scopes, but they're really containers of one or more scopes.  You can add your IR as a separate scope bucket so it will be searched in a separate tab/dropdown, or add it as a value to an existing scope bucket so the results are automatically included with whatever else that bucket searches.

Separate scope bucket:
[Image]

Add to existing scope bucket:
[Image]

If you added it to an existing scope bucket, you're done.  If not, you need to choose which tab to put it in, click Edit, and then Add a New Scope:
[Image]

Norm Rules

Duplicate from Generic Dublin Core.  Start with the Orbis Cascade recommended changes.  For an example of modified norm rules (for local additions and to adapt to Qualified Dublin Core) see here:  Example of DC Norm Rules Changes for Institutional Repository Harvest

Test records:  Load a harvested file, or do this quick-and-dirty method:

...

  • Regular
    • Used for harvesting
  • Delete DS
    • Used to delete records if necessary for a completely fresh start
  • No Harvest
    • Update the records without re-harvesting.  Use after any change to norm rules.

Determine whether to harvest the whole source or a smaller set.  Set names can be found in the header when viewing an XML record:

  • [Image]
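As a rough illustration of where the set name lives, the sketch below pulls the setSpec value out of an OAI-PMH record header.  The sample record is made up for the example (the identifier and datestamp are placeholders, not real data); only the header layout and namespace come from the OAI-PMH standard.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 standard.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical record, trimmed to the header; values are placeholders.
SAMPLE_RECORD = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:example.edu:students-1001</identifier>
    <datestamp>2017-05-03T01:01:00Z</datestamp>
    <setSpec>publication:students</setSpec>
  </header>
</record>"""


def set_specs(record_xml):
    """Return every setSpec value found in the record's header."""
    root = ET.fromstring(record_xml)
    found = root.findall(".//oai:header/oai:setSpec", OAI_NS)
    return [el.text.strip() for el in found if el.text]


print(set_specs(SAMPLE_RECORD))
```

The string returned here is what would be entered, exactly as written, in the Set field of the pipe.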

Pipe configuration for Digital Commons:

  • Data Source:  Institutional Repository
  • Harvesting Method:  OAI
  • Server:  [base URL]/do/oai/ (e.g. http://digitalcommons.calpoly.edu/do/oai/)
  • Metadata format:  qdc
  • Set:  exact name of set as found in XML header (e.g. "publication:students")
  • Start harvesting files/records from: (use date of oldest content; auto-updates to most recent harvest date)
  • Start time: (default)
  • Normalization Mapping Set:  your new norm rules
  • Enrichment Set:  No Enrichments - Template
  • Harvested File Format:  *.tar.gz
  • Include DEDUP (checked)
  • Include FRBR (checked)
  • Force DEDUP (checked)
  • Force FRBR (checked)
  • Maximum error threshold:  5% (default)
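To sanity-check the Server, Metadata format, Set, and start date values before running the pipe, it can help to build the equivalent OAI-PMH request by hand and open it in a browser.  The sketch below only assembles the URL.  It assumes the standard OAI-PMH ListRecords parameters; the function name and the sample date are illustrative, and it is not a description of how Primo itself issues the request.

```python
from urllib.parse import urlencode


def list_records_url(base_url, set_spec, from_date, metadata_prefix="qdc"):
    """Build an OAI-PMH ListRecords URL for a Digital Commons repository.

    base_url is the repository root; /do/oai/ is appended as in the
    Server setting above. from_date uses the OAI-PMH YYYY-MM-DD form.
    """
    endpoint = base_url.rstrip("/") + "/do/oai/"
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": set_spec,
        "from": from_date,
    }
    return endpoint + "?" + urlencode(params)


# Example values; the date is a placeholder.
print(list_records_url("http://digitalcommons.calpoly.edu",
                       "publication:students", "2017-05-03"))
```

If the resulting URL returns records in the browser, the same values should be safe to enter in the pipe configuration.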

Example of Regular (harvest) pipe:

[Image: Harvest Pipe configuration]

Example of Delete pipe:

[Image]

Example of No Harvest (renorm) pipe:

[Image]

Deleting a pipe:

  • A pipe that has never been executed can be deleted in the Back Office.  If the pipe has ever run, only Ex Libris can delete it; open a Salesforce ticket for this.

...

Note that a status of "completed" while Stage still says "harvesting" isn't necessarily an error.  It can mean that the source had no new records available for harvesting, so the job ended there.

[Image]
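One way to confirm the "no new records" case independently of Primo is to query the source directly and look for the OAI-PMH noRecordsMatch error, which the standard defines for requests that match nothing.  The sketch below checks a response body for that error.  The sample response is fabricated for illustration, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 standard.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical response for a date range with nothing new in it.
SAMPLE_EMPTY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2017-05-03T01:01:00Z</responseDate>
  <request verb="ListRecords">http://example.edu/do/oai/</request>
  <error code="noRecordsMatch">No records match the request.</error>
</OAI-PMH>"""


def has_no_new_records(response_xml):
    """True if the OAI-PMH response reports noRecordsMatch."""
    root = ET.fromstring(response_xml)
    errors = root.findall("oai:error", OAI_NS)
    return any(err.get("code") == "noRecordsMatch" for err in errors)


print(has_no_new_records(SAMPLE_EMPTY))
```

If the source returns this error for the pipe's set and start date, a job that ends at the harvesting stage is most likely just reporting that there was nothing to harvest.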


Additional Back Office Configuration

If an extraneous and non-functional link appears in the GetIT! 2 (At the Library) section of your records, here is how to remove it without disturbing possibly necessary GetIT! 2 link functions for Alma records.

  • ADD row, ENABLED: Online Resource not_restricted 01CALS_ALMA display openurl
  • ADD row, NOT ENABLED: Online Resource not_restricted /[your data source code] display openurl
  • DISABLE row: Online Resource not_restricted ALL display openurl
  • You must ADD a row for every data source if you take this approach.

[Image]