Wikifunctions:Type proposals/Grapheme
Summary
A user-perceived character: the smallest functional unit of writing (see w:grapheme). A value includes not just the "base" character but also any combining characters that follow it.
Uses
Often, we need to process Strings by user-perceived character. "1️⃣" should be processed as one grapheme, not three characters (1+variation selector+combining keycap⃣). When the user wants to select the last character of "I won't decline this proposal 1️⃣", they want to select the full 1️⃣, not just the ⃣. Likewise, we don't want nonsense like ⃣1 to emerge when we reverse the String. This doesn't just apply to keycap emojis. It also applies to real natural-language writing systems, like the grapheme "A̧" (A+̧).
To do all of that, we need to split a String into graphemes, and to do that, we need the grapheme type. Such a splitter would fix characters with diacritics (Z22735).
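The splitting step could be sketched roughly as follows. This is a simplified approximation using only Python's standard library, not the full Unicode grapheme-cluster algorithm (UAX #29): it attaches combining marks and zero-width joiners to the preceding character and ignores cases such as Hangul jamo, regional-indicator flags, and CRLF. The names `split_graphemes` and `reverse_by_grapheme` are hypothetical and are not the actual Wikifunctions implementations.

```python
import unicodedata

ZWJ = "\u200d"  # zero-width joiner, glues emoji sequences together
MARK_CATEGORIES = {"Mn", "Mc", "Me"}  # Unicode combining-mark categories


def split_graphemes(text: str) -> list[str]:
    """Split text into approximate graphemes (simplified; not full UAX #29)."""
    clusters: list[str] = []
    for ch in text:
        attaches = (
            unicodedata.category(ch) in MARK_CATEGORIES
            or ch == ZWJ
            or (bool(clusters) and clusters[-1].endswith(ZWJ))
        )
        if clusters and attaches:
            # Combiner (or character right after a ZWJ): extend the last cluster.
            clusters[-1] += ch
        else:
            # Anything else starts a new cluster.
            clusters.append(ch)
    return clusters


def reverse_by_grapheme(text: str) -> str:
    """Reverse text one grapheme at a time, keeping combiners with their base."""
    return "".join(reversed(split_graphemes(text)))


# digit one + variation selector-16 + combining enclosing keycap
keycap = "1\ufe0f\u20e3"
print(split_graphemes("a" + keycap))
print(reverse_by_grapheme("a" + keycap) == keycap + "a")
```

Reversing by cluster rather than by code point is what keeps the keycap attached to its digit.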
Structure
Like String: a single key, K1, whose value is a String holding the full sequence of characters that make up this one grapheme.
Example values
{
  "type": "grapheme",
  "value": "A̧"
}

{
  "Z1K1": "Zxyz",
  "ZxyzK1": "A̧"
}
Validator
The validator ensures that all characters under the "value" field combine to form exactly one grapheme.
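A validator along these lines might look like the sketch below. It assumes a simplified notion of "one grapheme" (a first character followed only by combining marks, zero-width joiners, or characters joined by a preceding zero-width joiner), which is narrower than the full Unicode rules in UAX #29. The function name `is_single_grapheme` is hypothetical.

```python
import unicodedata

ZWJ = "\u200d"  # zero-width joiner
MARK_CATEGORIES = {"Mn", "Mc", "Me"}  # Unicode combining-mark categories


def is_single_grapheme(value: str) -> bool:
    """Return True if value looks like exactly one grapheme (simplified check)."""
    if not value:
        return False  # an empty String holds no grapheme at all
    # Every character after the first must attach to what precedes it.
    for prev, ch in zip(value, value[1:]):
        attaches = (
            unicodedata.category(ch) in MARK_CATEGORIES
            or ch == ZWJ
            or prev == ZWJ
        )
        if not attaches:
            return False  # a second base character starts a second grapheme
    return True


print(is_single_grapheme("A\u0327"))        # A + combining cedilla
print(is_single_grapheme("1\ufe0f\u20e3"))  # keycap one
print(is_single_grapheme("ab"))             # two separate base characters
```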
Identity
Two graphemes are the same if their value Strings are the same.
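As a small illustration of this rule, the sketch below compares value Strings exactly. `Zxyz` is the placeholder ZID from the example above, and `make` and `same_grapheme` are hypothetical helper names. One consequence worth noting: under exact comparison, a precomposed character and its decomposed equivalent count as different graphemes. The last line shows Unicode NFC normalization only as an assumption about how they could be equated; it is not part of the proposal.

```python
import unicodedata


def make(value: str) -> dict:
    # "Zxyz" is the proposal's placeholder ZID, not a real one.
    return {"Z1K1": "Zxyz", "ZxyzK1": value}


def same_grapheme(a: dict, b: dict) -> bool:
    """Identity as proposed: compare the value Strings exactly."""
    return a["ZxyzK1"] == b["ZxyzK1"]


precomposed = make("\u00e9")  # e with acute accent as one code point
decomposed = make("e\u0301")  # e followed by combining acute accent

print(same_grapheme(precomposed, precomposed))
print(same_grapheme(precomposed, decomposed))
# Assumption, not part of the proposal: normalizing first would equate them.
print(unicodedata.normalize("NFC", "e\u0301") == "\u00e9")
```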
Converting to code
Probably just return the String value, K1?
Display function
Display K1.
Read function
The input should be K1. Another function can split an arbitrary String into a typed array of graphemes; see Special:Permalink/32335, which can also be reengineered into a validator.
Alternatives
We can represent the grapheme as a plain String, but then nothing guarantees that the String holds exactly one grapheme, so each function dealing with graphemes could need to bundle its own validator and equality check.
Comments
Support as proposer. I might be able to spin up a splitter function specified to output a String while the grapheme type isn't there yet soon. Aaron Liu (talk) 01:24, 3 May 2025 (UTC)
- The function in question is Z24453, still just a skeleton.
- But already I've used it in my implementation of reverse string (grapheme level) (Z10548). That function serves as a good example of why this Type is necessary, take a look at the test Z30031 compared to Z30032. YoshiRulz (talk) 13:41, 4 December 2025 (UTC)
- I got it mostly working, sans some bug in Z31149 that means the emoji examples don't work :( YoshiRulz (talk) 19:23, 16 January 2026 (UTC)
Support YoshiRulz (talk) 13:41, 4 December 2025 (UTC)
Support--So9q (talk) 16:16, 4 December 2025 (UTC)
Support John Samuel 12:05, 17 April 2026 (UTC)
Support why not. Feeglgeef (talk) 22:51, 11 May 2026 (UTC)
Support --Ameisenigel (talk) 08:24, 12 May 2026 (UTC)