Corrigendum #9 clarifies noncharacter usage in Unicode

Date: 2013-02-21 02:33:07 | Source: The Unicode Blog | Author: Unicode, Inc.

There has been confusion about whether noncharacters were permitted in Unicode text. The new Corrigendum #9: Clarification About Noncharacters makes it clear that noncharacters are permissible even in open interchange, although their intended semantics may not be interpretable in such contexts. The UTF-8, UTF-16, UTF-32 & BOM FAQ has also been updated for clarity, and other informative text about noncharacters, including the Core Specification, will be revised over time.

Background. There are 66 noncharacters permanently reserved for internal use, typically used for some sort of internally-defined control function or sentinel value. They should be supported by APIs, components, and applications that handle (i.e., either process or pass through) all Unicode strings, such as a text editor or string class. Where an application does make internal use of a noncharacter, it should take some measures to sanitize input text from unknown sources. The best practice is to replace that particular noncharacter on input by U+FFFD. (The noncharacter should not be simply deleted, since that can cause security problems. For more information, see Section 3.5 Deletion of Code Points in UTR #36, Unicode Security Guidelines.)
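As an illustration of the practice described above, here is a minimal Python sketch. It assumes an application that uses one or more noncharacters as internal sentinels; the names `is_noncharacter`, `sanitize`, and `sentinels` are hypothetical and do not come from the corrigendum. The 66 noncharacters are U+FDD0..U+FDEF plus the last two code points of each of the 17 planes (U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, and so on up to U+10FFFF).

```python
from typing import Iterable

REPLACEMENT = "\uFFFD"  # U+FFFD REPLACEMENT CHARACTER


def is_noncharacter(cp: int) -> bool:
    """Return True if code point cp is one of the 66 Unicode noncharacters."""
    if not 0 <= cp <= 0x10FFFF:
        return False
    # Contiguous block of 32 noncharacters in the BMP.
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # Last two code points of every plane: xxFFFE and xxFFFF.
    return (cp & 0xFFFE) == 0xFFFE


def sanitize(text: str, sentinels: Iterable[str]) -> str:
    """Replace only the noncharacters this application uses internally.

    Each occurrence is replaced by U+FFFD rather than deleted, so the
    surrounding text is never silently joined together. All other code
    points, including other noncharacters, pass through unchanged.
    """
    reserved = frozenset(sentinels)
    return "".join(REPLACEMENT if ch in reserved else ch for ch in text)


# Hypothetical application that uses U+FFFF as an internal end-of-record marker.
untrusted = "abc\uFFFFdef\uFDD0"
cleaned = sanitize(untrusted, {"\uFFFF"})
```

Here `cleaned` keeps U+FDD0 intact, because this hypothetical application attaches no internal meaning to it, while the U+FFFF from the untrusted input becomes U+FFFD.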
