...

Time

Item

Presenter

Notes

Annual meeting update

Carmen Mitchell


Working groups restructuring

Nicole Shibata


New committee/Working group members

Carmen Mitchell

Please invite the new members to your July meetings. July is the start of the new term, when outgoing members rotate off.

Annual report prep

Carmen Mitchell

Could the WG chairs, Dave, and Carmen plan to compile their end-of-year updates by June 10th? This will give us some time to circulate the report and send it to COLD. Previous examples are available.

Information item:

Tesseract OCR
https://tesseract-ocr.github.io/

From Mark Bilby

Just read up for the first time on the Tesseract 5.0 open source OCR solution after seeing it used in recent Internet Archive digitization batches. We’re about to gear up to digitize thousands of retro theses and had considered using ABBYY for this piece of the workflow, but it looks like Tesseract may be the better way to go. Made me wonder whether other CSUs have ever used it, and if not, whether it might be worth exploring and providing some training. Another approach might be for one campus or the CSUCO to spin up a virtual machine once a week or once a month and run a batch task that processes all PDFs or image files from an input group of folders in cloud storage and saves the results to an output folder.
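For discussion, here is a rough sketch of what such a batch task might look like in Python. It is not a tested workflow. It assumes the `tesseract` command-line tool is installed on the virtual machine and that the cloud storage folders have already been synced or mounted to local paths. The folder arguments and the function names `plan_batch` and `run_batch` are made up for illustration. The sketch only handles image files, since Tesseract reads images rather than PDFs, so scanned PDFs would first need to be converted to page images by a separate tool.

```python
import subprocess
from pathlib import Path

# Image types handled by this sketch; PDFs would need rasterizing first.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def plan_batch(input_dir: Path, output_dir: Path) -> list[list[str]]:
    """Build one tesseract command per image found under input_dir."""
    commands = []
    for image in sorted(input_dir.rglob("*")):
        if not image.is_file() or image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # Mirror the input folder layout under output_dir.
        out_base = output_dir / image.relative_to(input_dir).with_suffix("")
        # `tesseract <image> <output base> pdf` writes <output base>.pdf
        commands.append(["tesseract", str(image), str(out_base), "pdf"])
    return commands


def run_batch(input_dir: Path, output_dir: Path, runner=subprocess.run) -> int:
    """Run the planned commands, skipping images that already have output."""
    processed = 0
    for cmd in plan_batch(input_dir, output_dir):
        out_pdf = Path(cmd[2] + ".pdf")
        if out_pdf.exists():
            continue  # already OCRed in an earlier run
        out_pdf.parent.mkdir(parents=True, exist_ok=True)
        runner(cmd, check=True)
        processed += 1
    return processed
```

A weekly or monthly scheduled job could then call something like `run_batch(Path("ocr_input"), Path("ocr_output"))`, where both folder names are placeholders. Because images that already have an output PDF are skipped, rerunning the job would only pick up newly added files.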

...