Background
Numeric Character References (NCRs) are constructs common in markup languages such as HTML and XML, in which a sequence of characters is rendered as a single character. The NCRs discussed here consist of an ampersand ( & ), a pound sign ( # ), a lowercase letter x, a four-digit hexadecimal Unicode code point, and a trailing semicolon ( ; ). For example, &#x091A; is rendered as च. This policy covers the use of NCRs in MARC cataloging records in OCLC (and hence Alma).
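The NCR structure described above can be illustrated with a short Python sketch. This is only an illustration and is not part of any OCLC or Alma tooling; the names NCR_PATTERN, decode_ncrs and encode_ncr are hypothetical. It assumes hexadecimal NCRs only, and it accepts up to six hex digits because characters above U+FFFF need more than four.

```python
import re

# Hexadecimal NCR: "&#x" + hex digits + ";".
# Four digits is the usual form; up to six are accepted for code points above U+FFFF.
NCR_PATTERN = re.compile(r"&#x([0-9A-Fa-f]{1,6});")


def decode_ncrs(text: str) -> str:
    """Replace each hexadecimal NCR in text with the character it refers to."""

    def _replace(match: re.Match) -> str:
        code_point = int(match.group(1), 16)
        if code_point > 0x10FFFF:
            # Not a valid Unicode code point: leave the text unchanged.
            return match.group(0)
        return chr(code_point)

    return NCR_PATTERN.sub(_replace, text)


def encode_ncr(char: str) -> str:
    """Return the hexadecimal NCR for a single character, padded to four digits."""
    return f"&#x{ord(char):04X};"


if __name__ == "__main__":
    print(encode_ncr("\u091a"))        # NCR for the Devanagari letter in the example
    print(decode_ncrs("&#x091A;"))     # the literal character
```

A helper like decode_ncrs could be used to preview what an NCR in a record stands for before deciding whether the record should be upgraded.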
Policy Statement
Catalogers most often use NCRs in the context of non-Latin scripts. Catalogers may supply parallel non-Latin fields only for scripts supported by OCLC.
MARC-8 scripts (their characters are a subset of Unicode, so they can also be encoded in UTF-8): Arabic, CJK (Chinese, Japanese, Korean), Cyrillic (within the MARC-8 character set), Greek, or Hebrew scripts.
UTF-8 Unicode only scripts: Armenian, Bengali, Cyrillic (outside the MARC-8 character set), Devanagari, Ethiopic, Syriac, Tamil, or Thai scripts. These scripts are not included in MARC-8.
Notice that Cyrillic is the one script that straddles both categories. In general, modern Slavic languages using Cyrillic (like Russian, Bulgarian, Ukrainian) are within MARC-8, and Old Church Slavic and non-Slavic languages using Cyrillic (like Kazakh, Uzbek and Mongolian) use additional Unicode characters.
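The two script lists above can be expressed as a rough lookup. The sketch below is a heuristic, not an OCLC tool: it uses the first word of each character's Unicode name as a stand-in for its script, and the function name script_category is hypothetical. Treating Hiragana, Katakana and Hangul as part of CJK is an assumption. Because the policy text does not list which Cyrillic characters fall inside MARC-8, Cyrillic characters are only flagged for manual checking.

```python
import unicodedata

# Script lists taken from the policy text. Keys are the first word of the
# Unicode character name, used as a rough proxy for the script.
# Assumption: HIRAGANA, KATAKANA and HANGUL are counted under CJK.
MARC8_SCRIPTS = {"ARABIC", "CJK", "HIRAGANA", "KATAKANA", "HANGUL", "GREEK", "HEBREW"}
UNICODE_ONLY_SCRIPTS = {
    "ARMENIAN", "BENGALI", "DEVANAGARI", "ETHIOPIC", "SYRIAC", "TAMIL", "THAI",
}


def script_category(char: str) -> str:
    """Classify one character against the policy's script lists (heuristic)."""
    name = unicodedata.name(char, "")
    prefix = name.split(" ")[0] if name else ""
    if prefix == "CYRILLIC":
        # Cyrillic straddles both lists; this sketch cannot tell which applies.
        return "Cyrillic (check MARC-8 coverage)"
    if prefix in MARC8_SCRIPTS:
        return "MARC-8"
    if prefix in UNICODE_ONLY_SCRIPTS:
        return "Unicode only"
    return "not in either list"


if __name__ == "__main__":
    for ch in ["\u091a", "\u05d0", "\u04d9"]:
        print(f"U+{ord(ch):04X}", script_category(ch))
```

A character reported as "not in either list" is simply outside the two lists in this policy; the sketch says nothing about whether OCLC supports it.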
Old records using NCRs should be upgraded when used for cataloging (Mongolian example):
Upgraded record using newly supported characters (Kazakh example):
Connexion Client Help advises that non-Latin characters not supported by OCLC can be entered and exported to a local system using Unicode, and then deleted from the OCLC master record (NOTE: This violates CSU policy; don’t do it!). Alternatively, enter the name of the character in square brackets using its Unicode standard name (for example, enter [schwa]) or, for CJK characters, enter the reading of the character (for example, enter [yin]). NOTE: Client Help does not advise using NCRs. OCLC has, however, noted: “However, for the stray unsupported character that appears in a supported script, then the NCR is appropriate. If the script is not supported, it should not be represented in the database, but rather be transliterated.”
Action log
Section | Point Person | Expected Completion Date | Last action taken | Next action required |
---|---|---|---|---|
Articulate the need for the policy (background) | Cataloging Task Force | | Discussed need to adopt policy to ensure appropriate use of NCRs for non-Latin scripts | To be discussed with TS Working Group. |
Finalize Policy Statement | Cataloging Task Force | | | |
Identify and create best practice recommendations | | | | |
Where applicable, identify and write up procedures in Alma | | | | |