
Document status: APPROVED
Area covered: Cataloging
Prepared by: Cataloging Task Force
Adapted from: Orbis Cascade Alliance Collaborative Technical Services Team


...

Numeric Character References (NCRs) are markup constructs, common in languages such as HTML and XML, in which a sequence of characters is rendered as a single character. In the form used here, an NCR consists of an ampersand ( & ), a pound sign ( # ), a lowercase letter x, a four-position hexadecimal Unicode character code, and a trailing semicolon ( ; ). For example, &#x091A; is rendered as च. This policy covers the use of NCRs in MARC cataloging records in OCLC (and hence Alma).
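The NCR structure described above can be sketched in a few lines of Python. This is only an illustration, not part of the policy or of any OCLC or Alma tooling: the names `NCR_PATTERN`, `to_ncr`, and `decode_ncrs` are hypothetical, and the pattern assumes the four-position hexadecimal form only.

```python
import re

# Assumption: only the four-position hexadecimal form (&#xHHHH;) is handled.
NCR_PATTERN = re.compile(r"&#x([0-9A-Fa-f]{4});")


def to_ncr(char: str) -> str:
    """Return the hexadecimal NCR for one character.

    Code points are zero-padded to at least four positions; characters
    outside the Basic Multilingual Plane would yield five or six.
    """
    return f"&#x{ord(char):04X};"


def decode_ncrs(text: str) -> str:
    """Replace each four-position hexadecimal NCR with its character."""
    return NCR_PATTERN.sub(lambda m: chr(int(m.group(1), 16)), text)


print(to_ncr("\u091a"))                     # &#x091A;
print(decode_ncrs("&#x091A;") == "\u091a")  # True
```

Text that contains no NCRs passes through `decode_ncrs` unchanged, so the function can be applied to a whole field value.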

Policy Statement

Catalogers most often use NCRs in the context of non-Latin scripts. Catalogers may supply parallel non-Latin fields only for scripts supported by OCLC.

...

These are:

  • MARC-8 scripts (subsets of UTF-8 characters, so they are also compatible with UTF-8 Unicode): Arabic, CJK (Chinese, Japanese, Korean), Cyrillic (within the MARC-8 character set), Greek, and Hebrew.

  • UTF-8 Unicode only scripts: Armenian, Bengali, Cyrillic (outside the MARC-8 character set), Devanagari, Ethiopic, Syriac, Tamil, and Thai. These scripts are not included in MARC-8.

Notice that Cyrillic is the one script that straddles both categories. In general, modern Slavic languages using Cyrillic (like Russian, Bulgarian, Ukrainian) are within MARC-8, and Old Church Slavic and non-Slavic languages using Cyrillic (like Kazakh, Uzbek and Mongolian) use additional Unicode characters.
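As a rough illustration of the two categories above, the following Python sketch sorts a character into one of them by looking at the first word of its Unicode character name. The function name `script_category` and the returned labels are hypothetical. Treating Hiragana, Katakana, and Hangul as part of CJK is an assumption, and Cyrillic is reported separately because the character name alone does not show whether a given letter falls inside the MARC-8 repertoire.

```python
import unicodedata

# Script lists copied from the policy text; each entry is the first word
# of the corresponding Unicode character names.
# Assumption: Hiragana, Katakana, and Hangul are counted under "CJK".
MARC8_SCRIPTS = {"ARABIC", "CJK", "HIRAGANA", "KATAKANA", "HANGUL",
                 "GREEK", "HEBREW"}
UTF8_ONLY_SCRIPTS = {"ARMENIAN", "BENGALI", "DEVANAGARI", "ETHIOPIC",
                     "SYRIAC", "TAMIL", "THAI"}


def script_category(char: str) -> str:
    """Classify one character against the script lists in the policy."""
    script = unicodedata.name(char, "").split(" ")[0]
    if script == "CYRILLIC":
        # Cyrillic straddles both categories; the name cannot decide it.
        return "Cyrillic (check MARC-8 repertoire)"
    if script in MARC8_SCRIPTS:
        return "MARC-8"
    if script in UTF8_ONLY_SCRIPTS:
        return "UTF-8 only"
    return "not listed"


print(script_category("\u091a"))  # Devanagari letter: UTF-8 only
print(script_category("\u10d0"))  # Georgian letter: not listed
```

A production check would need the actual MARC-8 Cyrillic character table to resolve the Cyrillic case; this sketch only flags it for review.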


If any non-MARC-8 scripts are exported in MARC-8 data format, the non-MARC-8 characters are saved in Numeric Character Reference (NCR) format (see Numeric Character Reference for more information). However, all settings for Alma should be UTF-8 Unicode. NCRs should NOT be used to create non-Latin fields for scripts not supported by OCLC. Examples include Georgian, Khmer, and any other script not listed above. Non-Latin fields that use NCRs to render unsupported scripts should be deleted from OCLC master records when cataloging.
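One way to spot fields affected by this rule is to decode each NCR and check its script against the supported list. The Python sketch below is a hedged illustration only: `unsupported_ncrs` is a hypothetical helper, it works on a plain string rather than a real MARC record, and including LATIN in the allowed set (so that accented Latin letters are not flagged) is an assumption rather than something the policy states.

```python
import re
import unicodedata

# Assumption: only the four-position hexadecimal form (&#xHHHH;) is handled.
NCR_PATTERN = re.compile(r"&#x([0-9A-Fa-f]{4});")

# Scripts listed as supported in the policy, keyed by the first word of the
# Unicode character name. LATIN and the kana/Hangul entries are assumptions.
SUPPORTED_SCRIPTS = {
    "ARABIC", "CJK", "HIRAGANA", "KATAKANA", "HANGUL", "CYRILLIC",
    "GREEK", "HEBREW", "ARMENIAN", "BENGALI", "DEVANAGARI", "ETHIOPIC",
    "SYRIAC", "TAMIL", "THAI", "LATIN",
}


def unsupported_ncrs(field_text: str) -> list[tuple[str, str]]:
    """Return (NCR, script) pairs for letters in scripts not listed above."""
    flagged = []
    for match in NCR_PATTERN.finditer(field_text):
        char = chr(int(match.group(1), 16))
        script = unicodedata.name(char, "UNKNOWN").split(" ")[0]
        # Only letters are checked; digits and punctuation are ignored.
        if char.isalpha() and script not in SUPPORTED_SCRIPTS:
            flagged.append((match.group(0), script))
    return flagged


print(unsupported_ncrs("&#x10D0;&#x10D1;"))
# [('&#x10D0;', 'GEORGIAN'), ('&#x10D1;', 'GEORGIAN')]
```

A non-empty result suggests the field is a candidate for deletion under the policy; an empty result means every NCR in the field decodes to a listed script and may instead be a candidate for upgrading.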

...

However, some scripts/characters that were not supported in the past are supported by OCLC now, so non-Latin fields using NCRs in those cases will need to be upgraded.

Old records using NCRs should be upgraded when used for cataloging (Mongolian example):

[Image: Mongolian example record]

Upgraded record using newly supported characters (Kazakh example):

[Image: Kazakh example record]

...


Exceptions to this policy may be made in the case of large record sets provided by vendors, but in such cases CSU Libraries must commit to using the available records that most closely adhere to this policy. See the OCLC Connexion Client guide International Cataloging: Use Non-Latin Scripts for more details.

Action log

...