Migration & Implementation Plan

ScholarWorks Implementation Team

March 27, 2019

Introduction

The decision to adopt a new deployment model for ScholarWorks -- now focused around a single, systemwide instance of Hyrax instead of separate campus instances -- allows us to rethink the migration and implementation strategy for this new system.

Rather than migrating and implementing each campus in turn, as we were doing previously, we propose setting up all of the campuses on Hyrax simultaneously and directing all new submissions to this new system, while migrating historical submissions from DSpace to Hyrax on an ongoing basis.  

This will allow us to get campuses up and running in the new system quickly, while the process of migrating older submissions -- which can, in total, take upwards of 12 months to complete -- runs in the background.  Once all of the historical submissions have been migrated for a particular campus, we can then put redirects in place from that campus's DSpace records to the corresponding records in Hyrax.
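
As a rough illustration of the redirect step, the sketch below builds a redirect map for a campus only after every expected DSpace record has a migrated counterpart. It is a minimal sketch and not the team's actual tooling: the record fields, the "/handle/..." source path, the "/concern/<work_type>/<id>" target path, the "generic_works" work type, and the example.edu host are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MigratedRecord:
    """One historical submission that has been loaded into Hyrax (hypothetical shape)."""
    campus: str
    dspace_handle: str                # placeholder handle, e.g. "12345/1"
    hyrax_id: str                     # placeholder Hyrax work identifier
    work_type: str = "generic_works"  # assumed route segment; depends on local work types


def campus_fully_migrated(expected_handles, records):
    """True only if every expected DSpace handle has a migrated record."""
    migrated = {r.dspace_handle for r in records}
    return set(expected_handles) <= migrated


def build_redirects(expected_handles, records, hyrax_base):
    """Map DSpace paths to Hyrax URLs, or return {} if migration is incomplete."""
    if not campus_fully_migrated(expected_handles, records):
        return {}
    base = hyrax_base.rstrip("/")
    return {
        f"/handle/{r.dspace_handle}": f"{base}/concern/{r.work_type}/{r.hyrax_id}"
        for r in records
    }
```

Returning an empty map for a partially migrated campus mirrors the plan above: redirects go in only once all of a campus's historical submissions are in Hyrax, so no DSpace record points at a missing target.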

See Project calendar for timelines.  Note there are two tabs: one for Implementation tasks and the other for Data Migration tasks.

1. Implementation

1a. Initial set-up

The initial set-up of the system will include:

  1. Upgrades to Hyrax and our AWS infrastructure
  2. Authentication
  3. Configuration of metadata schemas and facets
  4. Basic branding, design, and labels

The majority of this work is already completed, but should be reviewed and approved by the Implementation Team before being finalized.  Authentication will require light configuration work by each campus identity management group, facilitated by the Chancellor’s Office. Once we have these four basic elements in place, we can begin the pilot phase.

1b. Pilot

We need to run a pilot to ensure that the new single-instance Hyrax is working as expected.  We think the following three campuses are good participants for a pilot:

  • Dominguez Hills
  • Fullerton
  • Sacramento

Sacramento has a typical, medium-sized repository in ScholarWorks, and has a number of new collections they’d like to upload.  We think they will be a good representative of current CSU DSpace users. Dominguez Hills and Fullerton do not currently use DSpace, and so can start using the new system without consideration for historical data or current workflows.  This will allow them to get up and running faster than most campuses. Having three campuses will let us test the multi-campus configuration of the new system.

If any of the campuses identified here are unwilling to participate, we can select alternates from among the second-wave campuses below.

The pilot should have a tight scope of work focused on:

  • Onboarding the pilot campuses, including
    • Training
    • Setting up workflows
    • Setting up local controlled vocabularies
    • Authentication
  • Running through a comprehensive set of tests
  • Reporting any bugs or performance issues

The pilot should not include piling on new feature requests.  Rather, such requests should be taken up more formally within the proposed new governance structure.

Campuses in the pilot should be willing to help train campuses in the subsequent phases.

1c. Second-wave campuses

Once we are satisfied that the new Hyrax instance is performing as expected via the pilot, we can start to onboard the next phase of campuses.

The primary tasks needed for each campus to get started are:

  • Authentication
  • Configuration of workflows
  • Local controlled vocabularies (e.g., college and department names)

Authentication will be an ongoing task for all campus identity management groups from the initial set-up onward, so authentication will likely be in place for some libraries even before the second phase begins.  Configuration of workflows and local controlled vocabularies is rather simple and can be done quickly.

Campuses available for this second phase include:

  • Bakersfield
  • Channel Islands
  • Chico
  • Fresno
  • Humboldt
  • Long Beach
  • Maritime
  • Moss Landing
  • San Francisco
  • San Marcos
  • Sonoma
  • Stanislaus

1d. Third-wave campuses

The last wave of campuses will include those that require additional development in Hyrax before using the system.  This includes campuses that require SWORD to support integration with ProQuest ETD Administrator or a local campus ETD management system. It can also include any campus preliminarily marked for the second wave that deems the Handle Service a requirement.

Campuses in this phase include:

  • East Bay
  • Los Angeles
  • Northridge
  • Pomona

2. Data migration


While the process of setting up and onboarding campuses on the new systemwide Hyrax continues, we will be simultaneously migrating historical submissions from DSpace to the new system.  The CO has already migrated data for six campuses to individual campus Hyrax instances for testing and review. More will follow. Each campus can undertake the following steps before we run a second, final migration of its data to the new system.

  • Campus will review migration data in their test instances
  • Campus will identify any problems
  • CO will document and address any corrections needed
  • Final load of data to new system

Campuses migrating from third-party systems, including Islandora (Moss Landing) and III Symposia (Long Beach), will require additional development work to migrate data from those systems before that process can begin.

3. Project structure & communication


A key component of this new plan is to pair the above tasks with a more robust governance structure.  This will allow us to ensure that key decisions are being made both quickly and collaboratively.

Additionally, each campus involved in the project will designate a single representative to serve as the principal point of contact and local project manager for the migration and implementation.  This group of project managers will meet bi-weekly (separately and then together with the existing monthly ScholarWorks open forum) to go over project updates, tasks, and due dates.

These two components of the project -- similar in structure to the ULMS migration and implementation -- will help ensure that all decisions, changes, and timelines are communicated regularly to all campuses and to COLD throughout the project.