In addition to the specific escape sequences for important control characters, Emacs provides general categories of escape syntax that you can use to specify non-ASCII text characters.
For instance, you can specify characters by their Unicode values.
`?\unnnn' represents a character that maps to the Unicode
code point `U+nnnn', where nnnn is four hexadecimal digits. There is
a slightly different syntax for specifying characters with code points
above #xFFFF: `?\U00nnnnnn' represents the character whose Unicode code
point is `U+nnnnnn', where nnnnnn is six hexadecimal digits, if such a
character is supported by Emacs. If the corresponding character is not
supported, Emacs signals an error.
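As a brief illustration, the forms below can be evaluated one at a time (for example with `C-x C-e'). This is only a sketch: it assumes an Emacs whose character codes are Unicode code points (version 23 or later), and the particular code points U+00E0 and U+1F600 are arbitrary choices made for the example.

```elisp
;; Four-digit form: U+00E0 is `a' with grave accent.
(= ?\u00e0 #xe0)              ; compares the character code with hex E0

;; Eight-digit form, needed for a code point above #xFFFF.
(= ?\U0001F600 #x1F600)

;; The same escape can be used inside a string; the string then
;; contains a single character with that code.
(length "\u00e0")
(= (aref "\u00e0" 0) #xe0)
```

The two comparisons return `t' on such an Emacs because a character in Emacs Lisp is simply an integer, so the escape and the hexadecimal integer denote the same value.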
This peculiar and inconvenient syntax was adopted for compatibility with other programming languages. Unlike some other languages, Emacs Lisp supports this syntax only in character literals and strings.
The most general read syntax for a character represents the
character code in either octal or hex. To use octal, write a question
mark followed by a backslash and the octal character code (up to three
octal digits); thus, `?\101' for the character A,
`?\001' for the character C-a, and `?\002' for the
character C-b. Although this syntax can represent any
ASCII character, it is preferred only when the precise octal
value is more important than the ASCII representation.
     ?\012 => 10         ?\n => 10         ?\C-j => 10
     ?\101 => 65         ?A => 65
To use hex, write a question mark followed by a backslash, `x',
and the hexadecimal character code. You can use any number of hex
digits, so you can represent any character code in this way.
Thus, `?\x41' for the character A, `?\x1' for the
character C-a, and `?\xe0' for the Latin-1 character
`a' with grave accent.
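The following forms sketch how the octal and hex syntaxes relate to each other and to the ordinary read syntax. The last form assumes an Emacs whose character codes are Unicode code points (version 23 or later); the Greek letter alpha, U+03B1, is an arbitrary choice for the example.

```elisp
;; Octal, hex, and plain syntax all read as the same integer.
(list ?\101 ?\x41 ?A)         ; three spellings of character code 65
(list ?\001 ?\x1 ?\C-a)       ; three spellings of character code 1

;; Hex is not limited to three digits, so it can reach codes that
;; the octal syntax cannot.  Assumes Unicode-based character codes.
(= ?\x3b1 ?\u03b1)
```

Since all of these forms read as plain integers, the choice between them matters only for how clearly the source conveys the intended character.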