Wikifunctions:Catalogue/Character operations
The term "character" is ambiguous. This page lists functions that handle its different interpretations.
Unicode code points
These functions deal with the type Unicode code point (Z86): a single code point in Unicode.
Conversions
- natural number to codepoint (Z23022): Description missing
- code point to natural number (Z23063): Description missing
- codepoint to U+HHHH notation (Z23061): returns a string showing the code point in Unicode hexadecimal notation
- codepoint to string (Z15631): type conversion from Code point object to string object
- String to codepoint list (Z22717): Converts a string to a list of codepoints. Reverse at Z22693
- Codepoint list to string (Z22693): Converts a list of code points to a string. Reverse at Z22717
- hex code point to string (Z22829): takes the hexadecimal code point of a Unicode character and returns that character as a string
- digit string to codepoint (Z23028): converts a string of (western Arabic) decimal digits into the codepoint corresponding to the number in the string
- codepoint from string leniently (Z23031): Converts a string to a codepoint with as little ambiguity and as much flexibility as possible. If the string consists only of digits, it maps to the codepoint with that number. Otherwise, it returns the first codepoint of the character.
- read code point (Z23041): reads a string leniently with an associated language to return a code point
- code point to digit string (Z23053): return the string of western Arabic digits representing the value of the code point
- code point as character and digits (Z23056): verbose display, format: "c (nnn)"
- get Nth code point of String (Z24472): counts codepoints from 1; returns U+0015 (negative acknowledge) if the index is out of bounds
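Several of the conversions listed above can be approximated in a few lines of Python. This is a minimal sketch and not the Wikifunctions implementations: the helper names are hypothetical, code points are modelled as plain integers instead of Z86 objects, and the lenient conversion assumes that "digits only" means ASCII decimal digits.

```python
NAK = 0x15  # U+0015, the out-of-bounds result the catalogue gives for Z24472


def to_u_notation(cp: int) -> str:
    """Format a code point as U+HHHH, padded to at least four hex digits."""
    return f"U+{cp:04X}"


def string_to_code_points(text: str) -> list[int]:
    """Split a string into its code points (Python strings index by code point)."""
    return [ord(ch) for ch in text]


def code_points_to_string(cps: list[int]) -> str:
    """Join a list of code points back into a string."""
    return "".join(chr(cp) for cp in cps)


def hex_to_string(hex_digits: str) -> str:
    """Interpret a hexadecimal string as a code point and return that character."""
    return chr(int(hex_digits, 16))


def code_point_leniently(text: str) -> int:
    """Digits-only input is read as a code point number; otherwise take the first code point.

    Assumption: "digits only" means ASCII decimal digits, and empty input is an error.
    """
    if not text:
        raise ValueError("empty input has no code point")
    if text.isascii() and text.isdigit():
        return int(text)
    return ord(text[0])


def nth_code_point(text: str, n: int) -> int:
    """Return the n-th code point, counting from 1, or NAK when n is out of bounds."""
    if 1 <= n <= len(text):
        return ord(text[n - 1])
    return NAK
```

For example, `to_u_notation(0x1F600)` gives `"U+1F600"`, and `nth_code_point("abc", 4)` falls back to `0x15` because the index is past the end of the string.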
Checks
- Code point equality (Z22683): tests whether two code-point realisations (as entered) have identical Unicode code-point representations
- get general Grapheme_Cluster_Break from codepoint (Z24459): see https://www.unicode.org/reports/tr29/#Default_Grapheme_Cluster_Table. The output is an enum from 1 to 17, one for each possible value mentioned in the table, in the table's order. Lacks CLDR language-specific rules.
- is Extended_Pictographic codepoint (Z24460): see also [1]
Operations
- code point prefix (Z15991): returns a string prefixed with a code point
- plane of code point (Z24736): Returns the number of the Unicode plane to which the given code point belongs
- number of bytes for code point in UTF-8 (Z24809): The number of bytes the given Unicode code point needs in UTF-8 encoding
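The plane and UTF-8 length of a code point follow directly from its numeric value: each plane spans 0x10000 code points, and UTF-8 uses one to four bytes depending on the range. The sketch below illustrates this with hypothetical helper names and code points modelled as plain integers; the argument order of `prefix_with_code_point` is an assumption, not taken from Z15991.

```python
MAX_CODE_POINT = 0x10FFFF


def _check_range(cp: int) -> None:
    """Reject values outside the Unicode code space."""
    if not 0 <= cp <= MAX_CODE_POINT:
        raise ValueError(f"not a Unicode code point: {cp!r}")


def plane_of_code_point(cp: int) -> int:
    """Return the plane number (0 to 16); each plane holds 0x10000 code points."""
    _check_range(cp)
    return cp >> 16


def utf8_byte_count(cp: int) -> int:
    """Return how many bytes the code point occupies in UTF-8."""
    _check_range(cp)
    if cp < 0x80:
        return 1
    if cp < 0x800:
        return 2
    if cp < 0x10000:
        return 3
    return 4


def prefix_with_code_point(cp: int, text: str) -> str:
    """Return the text with the given code point placed in front (assumed argument order)."""
    _check_range(cp)
    return chr(cp) + text
```

The byte count can be cross-checked against Python's own encoder with `len(chr(cp).encode("utf-8"))`, except for surrogate code points (U+D800 to U+DFFF), which Python refuses to encode.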
Search for
Functions expecting or returning an explicit Unicode code point object, singly or in a list
String functions that deal with single characters
There are functions that deal with strings that represent a single character.
- get first character of string (Z10901): returns the first character of a string
- get last character of string (Z11060): returns the last character of a string
- get Nth character of a string (Z14244): counting each codepoint as a character, counting from 1
- previous character (Z11564): returns the character one codepoint before the input character
- next character (Z11538): returns the character one codepoint above the input character
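- first letter of strings: codepoints in ascending order (Z11523): checks that the first character of the first string has a codepoint less than or equal to that of the first character of the second string. Only the first character of each string is considered.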
- in codepoint order (three characters) (Z11528): characters in codepoint order (each equal or ascending)
- chr of codepoint value (Z11534): the chr() function in Python
- unicode of first character (Z11515): Python ord() function. Return the natural number that represents the (single/first) character input.
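Because the last two entries are described in terms of Python's chr() and ord(), the single-character functions in this section can be sketched with those two built-ins. The helper names below are hypothetical, and the behaviour for empty strings or for the ends of the Unicode range is an assumption, since the catalogue does not specify it.

```python
def first_character(text: str) -> str:
    """Return the first code point of the string, or "" for empty input."""
    return text[:1]


def last_character(text: str) -> str:
    """Return the last code point of the string, or "" for empty input."""
    return text[-1:]


def next_character(char: str) -> str:
    """Return the character one code point above the input.

    chr() raises ValueError if the result would exceed U+10FFFF.
    """
    return chr(ord(char) + 1)


def previous_character(char: str) -> str:
    """Return the character one code point below the input.

    chr() raises ValueError if the input is U+0000.
    """
    return chr(ord(char) - 1)


def first_letters_ascending(first: str, second: str) -> bool:
    """Compare only the first code point of each (non-empty) string."""
    return ord(first[0]) <= ord(second[0])


def in_code_point_order(a: str, b: str, c: str) -> bool:
    """True when three single characters are in equal-or-ascending code point order."""
    return ord(a) <= ord(b) <= ord(c)
```

Note that these helpers count code points, as the catalogue entries do, so a user-perceived character made of several code points (such as a base letter followed by a combining accent) is split.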