Digital Repositories Update - July, August, September 2023

It’s been a while since our last digital repositories update.  With Systemwide Digital Library Services now back up to a full complement of staff, we’ll be getting back to our regular, written reports. 

Digital Repositories Committee (DRC)

The DRC is off to a quick start this year, with Andrew Weiss now chairing the Steering Committee, Carmen Mitchell chairing the Institutional Repository Working Group (IRWG), and Nicole Shibata and Steve Kutay co-chairing the Digital Archives Working Group (DAWG).

Both IRWG and DAWG have made improving documentation a major goal.  We’ve had some preliminary discussions about putting together new documentation for ScholarWorks and the Digital Archives similar to the Publishing Interest Group’s excellent Journal Publishing Guide.

The Steering Committee is also interested in auditing ScholarWorks’ status as a trusted repository.  This multi-year project would review the current state of the repository and how well it aligns with next-generation repository goals, and would include ongoing gap and needs analyses.

Digital Archives

Work continues apace with the new CSU Digital Archives.  Since our last report in the spring, the production system has come online, with basic configurations and functionality in place.  We’re still working with several campuses to complete authentication work for the new system, and we’re still ironing out some final issues with new submissions.

Most of our focus over the summer has been on migrating the remaining collections in DSpace to the new Digital Archives.  Eleven campuses have some type of historical collection in DSpace, including several with historical photograph collections, student newspapers, oral histories, and other institutional documents.  We’re halfway through the migration of those collections, with the goal of wrapping up all DSpace migrations by January.

In the meantime, DAWG continues to beta test the Digital Archives in order to recommend changes and feature enhancements, and to refine the Metadata Application Profile.

ScholarWorks

We’ve recently put in place configurations to block known AI bots from harvesting content in ScholarWorks.  The DRC voted to take this action for three reasons: a lack of faculty and student consent for this reuse, unclear legality due to pending lawsuits over AI harvesting of author content, and the fact that some AI bots have been slamming repositories, including ScholarWorks, with unthrottled harvesting, in effect creating denial-of-service attacks.  The DRC intends to review this decision periodically.
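
For readers curious about what this kind of blocking can look like, here is a minimal sketch in Python.  It is illustrative only and is not the actual ScholarWorks configuration, which this report does not describe.  It assumes the bots identify themselves in the User-Agent header, and the blocklist entries and function names are hypothetical placeholders chosen for the example.

```python
# Illustrative sketch only: NOT the actual ScholarWorks configuration.
# Assumption: crawlers identify themselves honestly in the User-Agent header.
# The tokens below are example crawler names, not the DRC's actual blocklist.
BLOCKED_AGENT_TOKENS = ("GPTBot", "CCBot")


def is_blocked(user_agent: str) -> bool:
    """Return True if the User-Agent string contains a blocklisted token."""
    ua = (user_agent or "").lower()
    return any(token.lower() in ua for token in BLOCKED_AGENT_TOKENS)


def response_status(user_agent: str) -> int:
    """Return the HTTP status a front end might send for this request:
    403 (Forbidden) for blocklisted bots, 200 (OK) for everyone else."""
    return 403 if is_blocked(user_agent) else 200


if __name__ == "__main__":
    for agent in (
        "Mozilla/5.0 (compatible; GPTBot/1.0)",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/117.0",
    ):
        print(agent, "->", response_status(agent))
```

Matching on User-Agent only stops bots that announce themselves, so a check like this would usually sit alongside rate limiting to deal with unthrottled harvesting from other sources.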

We’re wrapping up a few changes to the submission forms approved by IRWG last spring, including moving the degree name and language fields to controlled vocabularies (as pull-down menus in the forms).  We also continue to update department and college controlled vocabularies for campuses.

We’re also finalizing work to allow abstract-only records in ScholarWorks.  While we encourage campuses to upload the full text for all works, we recognize there is a need at some campuses to include only the abstract of a work in ScholarWorks.  Our goal is to make it clear to end-users that these records contain only an abstract, while also excluding them from harvesting into Primo (OneSearch).