Wikifunctions:Catalogue/String operations

Strings
These functions deal with String (Z6): a sequence of characters, and one of the fundamental Types (Z4) available in the Wikifunctions system.
String evaluation operations
These functions perform simple tests on a text string to tell you whether it is already in an expected format or whether something else needs to be done.
- string length (Z11040): Return the length of this string
- string length in UTF-8 code units (Z17036): Return the length of this string in UTF-8 code units
- string length in UTF-16 code units (Z17030): Return the length of this string in UTF-16 code units
- is empty string (Z10008): true if the input string is strictly empty, even without any non-printing characters, and false otherwise
- is string blank (Z10083): Checks if a string just contains whitespaces
- is numeric (Z10715): Checks if a string contains only numeric characters
- is uppercase (Z10336): checks if string is uppercase (equal to its own uppercase - so blanks and empty strings count)
- has and is uppercase (Z11349): checks if string has and is uppercase (blanks and strings without letters don't count)
- is lowercase (Z10346): checks if string is all lowercase
- has and is lowercase (Z11383): checks if string has and is lowercase (blanks and strings without letters don't count)
- is title case (Z10375): checks if string is in title case
- is pascal case (Z10363): checks if string is in pascal case
- is camel case (Z10897): true if the entered string is in camel case (e.g. 'camelCase')
- is snake case (Z10324): checks if string is in snake case
- fallback if string is empty (Z11082): returns a fallback string if the value is empty, and the value itself if not
- has specified chars paired (Z11678): checks whether a string has correctly paired chars (for example, brackets); the left and right chars are specified in sequence
- has all brackets paired (Z11684): check for pairing of all possible left-right paired characters, feel free to extend
- is pangram (Latin alphabet) (Z12626): checks whether a string of characters possesses every letter from the Latin alphabet at least once
- is a palindrome (Z10096): test if a string is the same when read forward and backward (see Z10553 for one with Unicode grapheme support)
- has double letter (Z19170): tests whether the string has any letter (case sensitive) used twice in a row
- is square-free (Z19191): Combinatorial term. A word that avoids the pattern XX where X is any non-empty sequence of letters
- Is it a valid ISO 6709 code (Z19217): checks if a string matches ISO 6709.
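The three length functions and the two uppercase checks listed above can give different answers for the same input. The following Python sketch illustrates the distinction. It is not the Wikifunctions implementation: the function names are hypothetical, and it assumes that string length (Z11040) counts Unicode code points.

```python
def length_in_code_points(text: str) -> int:
    # Assumption: the plain "string length" counts Unicode code points.
    return len(text)


def length_in_utf8_code_units(text: str) -> int:
    # One UTF-8 code unit is one byte.
    return len(text.encode("utf-8"))


def length_in_utf16_code_units(text: str) -> int:
    # One UTF-16 code unit is two bytes; "utf-16-le" adds no byte order mark.
    return len(text.encode("utf-16-le")) // 2


def is_uppercase(text: str) -> bool:
    # Equal to its own uppercase, so blank and empty strings count.
    return text == text.upper()


def has_and_is_uppercase(text: str) -> bool:
    # Additionally requires at least one character that has a case.
    return text == text.upper() and text != text.lower()


sample = "a\u00e9\U0001F600"  # "a", "é", and an emoji outside the BMP
print(length_in_code_points(sample))
print(length_in_utf8_code_units(sample))
print(length_in_utf16_code_units(sample))
```

The emoji takes four UTF-8 code units and two UTF-16 code units, which is why the three counts diverge.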
String comparison operations
- string equality (Z866): True if the first string and the second string are the same
- case-insensitive string equality (Z10539): returns true if both strings are the same if converted to lowercase
- string inequality (Z10379): true if two text strings are not exactly equal
- has substring (Z10070): Check if a substring exists within another string. Case-sensitive. For case-insensitive support, see Z22812
- count substrings (Z14450): returns the number of times a substring occurs in a string
- string starts with (Z10615): returns true if the substring exists at the beginning of the string
- string ends with (Z10618): true if the substring exists at the end of the string
- strings equal length (Z11690): checks if the two input strings are of equal length
- longer string (Z11519): Returns the longer of two strings. If equal, defaults to returning the first.
- string only has characters from alphabet (Z11693): check if all of the characters in the tested string are from the alphabet string
- common codepoints in strings (Z14483): true if the two strings contain any codepoints (~characters) in common
- is pangram of alphabet (Z13119): check if the string uses every letter of a specified (lowercase) alphabet
- is anagram (simple) (Z10973): test if the same characters at the same number of times are used in two strings (characters must be exact code points).
- is heterogram (Z11573): True if no character occurs more than once
- is ISO 639-1 language code (Z13482): validates whether a string is a valid ISO 639-1 language code
- is ISO 639-2 language code (Z14083): validates whether a string is a valid ISO 639-2 language code
- is subword of string (Z19177): the subword is contained (in order) in the string, but may be interspersed with other letters
- hamming distance between two strings (Z11328): the strings should be of equal length, otherwise will return -1
- Levenshtein Distance (Z10393): returns a number as the result
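Two of the comparisons above have behaviour that is easy to misread: the Hamming distance returns -1 for strings of unequal length, and a subword only needs to appear in order, not contiguously. A minimal Python sketch of both, with hypothetical function names and no claim to match the Wikifunctions implementations:

```python
def hamming_distance(first: str, second: str) -> int:
    # The catalogue entry states that unequal lengths yield -1.
    if len(first) != len(second):
        return -1
    return sum(a != b for a, b in zip(first, second))


def is_subword(subword: str, text: str) -> bool:
    # Consume the text left to right; each subword character must be
    # found after the position where the previous one was found.
    remaining = iter(text)
    return all(character in remaining for character in subword)


print(hamming_distance("karolin", "kathrin"))
print(is_subword("ace", "abcde"))
```

Using a single iterator in `is_subword` is what enforces the ordering: once a character has been consumed, it cannot be matched again.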
String discard functions
- discard from start of first substring (Z11410): if the substring is found in the full string, discard everything after and including the first occurrence, otherwise leave unchanged
- discard from end of first substring (Z11412): if the substring is found in the full string, discard everything after but not including the first occurrence, otherwise leave unchanged
- discard until start of first substring (Z11418): if the substring is found in the full string, discard everything before but not including the first occurrence, otherwise leave unchanged
- discard until end of first substring (Z11420): if the substring is found in the full string, discard everything before and including the first occurrence, otherwise leave unchanged
- discard from start of last substring (Z11414): if the substring is found in the full string, discard everything after and including the last occurrence, otherwise leave unchanged
- discard from end of last substring (Z11416): if the substring is found in the full string, discard everything after but not including the last occurrence, otherwise leave unchanged
- discard until start of last substring (Z11422): if the substring is found in the full string, discard everything before but not including the last occurrence, otherwise leave unchanged
- discard until end of last substring (Z11424): if the substring is found in the full string, discard everything before and including the last occurrence, otherwise leave unchanged
- remove at end (Z11170): if a string ends with the given suffix, remove the suffix, otherwise return the string unchanged
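The eight discard functions above differ along three axes: first or last occurrence, keep what comes before or after it, and whether the substring itself is kept. The Python sketch below makes those axes explicit. The helper `cut` and its parameter names are hypothetical, and the behaviour for an empty substring is not specified by the catalogue, so it is left to whatever `str.find` does.

```python
def cut(text: str, sub: str, last: bool, keep_before: bool, keep_sub: bool) -> str:
    index = text.rfind(sub) if last else text.find(sub)
    if index == -1:
        return text  # substring not found: leave unchanged
    if keep_before:
        return text[: index + (len(sub) if keep_sub else 0)]
    return text[index + (0 if keep_sub else len(sub)):]


def discard_from_start_of_first(text: str, sub: str) -> str:
    return cut(text, sub, last=False, keep_before=True, keep_sub=False)


def discard_from_end_of_first(text: str, sub: str) -> str:
    return cut(text, sub, last=False, keep_before=True, keep_sub=True)


def discard_until_start_of_last(text: str, sub: str) -> str:
    return cut(text, sub, last=True, keep_before=False, keep_sub=True)


def discard_until_end_of_last(text: str, sub: str) -> str:
    return cut(text, sub, last=True, keep_before=False, keep_sub=False)


print(discard_from_start_of_first("a/b/c", "/"))  # keeps "a"
print(discard_until_end_of_last("a/b/c", "/"))    # keeps "c"
```

The remaining four variants follow by changing the three flags in the same way.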
String character discard functions
These reduce a string by discarding certain characters.
- remove regular spaces (Z10052): remove all regular spaces (U+0020) from a string
- trim string (Z10079): Remove starting and ending whitespaces
- remove characters in character range (Z11531): strips all characters from a codepoint block from a string
- remove characters in unicode range (Z14119): strips all characters from a codepoint block (specified by Unicode code points) from a string
- remove interpunction (Z11193): remove all interpunction characters
- remove first character (Z14456): removes the first character of a string and returns the rest
- remove last character (Z11879): returns the string without its last character
- final N characters of string (Z14460): return only the last N characters of the initial string
- remove first N characters of string (Z14636): return the string with the first N characters removed
- first N characters of string (Z14592): returns a substring from the beginning of a specified string up to a number of characters
- str left (Z22344): returns the first <count> characters of the trimmed string, duplicating the string as needed to reach that length (same as Template:Str left on EnWp)
- character Nth from the end of the string (Z14463): returns the character as a String
- remove all characters except Arabic numerals (Z14494): no description
- remove all characters except ASCII digits, uppercase Latin letters and lowercase Latin letters (Z10171): removes all characters other than half-width alphanumerics from the string
- remove all characters not in second string (Z14515): leaves only the characters in string 1 that are also in string 2
- remove all characters in second string (Z14520): leaves only the characters in string 1 that are not in string 2
- remove repeated characters (Z19185): remove repeat occurrences of any character in the string, just leave the first one
- replace multiple spaces with single spaces (Z22507): removes repeated (regular U+0020) spaces in a string, leaving only a single space in place
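Three of the character-level filters above can be sketched in a few lines of Python. The function names are hypothetical and the code only illustrates the behaviour described in the catalogue entries; it is not taken from Wikifunctions.

```python
import re


def keep_only_characters_in(text: str, allowed: str) -> str:
    # "remove all characters not in second string"
    return "".join(character for character in text if character in allowed)


def remove_repeated_characters(text: str) -> str:
    # dict preserves insertion order, so only first occurrences survive.
    return "".join(dict.fromkeys(text))


def collapse_regular_spaces(text: str) -> str:
    # Only U+0020 is collapsed; tabs and other whitespace are untouched.
    return re.sub(" {2,}", " ", text)


print(remove_repeated_characters("banana"))
print(collapse_regular_spaces("a   b  c"))
```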
Simple String transformations
These perform character replacements and other basic operations.
Replace suffix
- replace at end (Z11178): replaces suffix with replacement if input ends with suffix; if not, returns input unchanged
- replace suffix "a" with "ors" (Z18092): no description
- replace suffix "a" with "ons" (Z18026): no description
- replace suffix "a" with "on" (Z17827): E.g. öga -> ögon
- replace suffix "a" with "orna" (Z17915): E.g. gata -> gatorna
- replace suffix "a" with "ornas" (Z17918): E.g. gata -> gatornas
Add string suffix if not already present
- add suffix to string if it does not already end with the suffix (Z17973): E.g. testing + ing -> testing
- add suffix "ns" to string if it does not end with "ns" (Z18066): for Swedish
- add suffix "en" to string if it does not end with "en" (Z18050): no description
- add suffix "ets" to string if it does not end with "ets" (Z18042): no description
- add suffix "ens" to string if it does not end with "ens" (Z18039): no description
- add suffix "enas" to string if it does not end with "enas" (Z18036): E.g. huvud -> huvudenas
- add suffix "s" to string if it does not already end with "s" (Z18020): E.g. test -> tests
- add suffix "ts" to string if it does not already ends with "ts" (Z18017): no description
- add suffix "nas" to string if it does not already end with "nas" (Z17952): no description
- add suffix "rnas" to string if it does not end with "rnas" (Z17942): E.g. fiende -> fiendernas
- add suffix "rna" to string if it does not end with "rna" (Z17939): E.g. fiende -> fienderna
- add suffix "na" to string if it does not end with "na" (Z17946): no description
- add suffix "r" to string if it does not end with "r" (Z17749): E.g. fiende -> fiender
- add suffix "t" to string if it does not end with "t" (Z17904): E.g. äpple -> äpplet
- add suffix "a" to string if it does not end in "a" (Z17948): no description
- add suffix "n" to string if it does not already end with "n" (Z17791): E.g. äpple -> äpplen
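The specialised suffix functions in the two lists above are instances of two general operations: replace at end (Z11178) and add suffix if not already present (Z17973). A minimal Python sketch of both follows, using hypothetical function names and the examples given in the catalogue entries.

```python
def replace_at_end(text: str, suffix: str, replacement: str) -> str:
    # Unchanged when the input does not end with the suffix.
    if not text.endswith(suffix):
        return text
    return text[: len(text) - len(suffix)] + replacement


def add_suffix_if_missing(text: str, suffix: str) -> str:
    return text if text.endswith(suffix) else text + suffix


print(replace_at_end("gata", "a", "orna"))
print(add_suffix_if_missing("testing", "ing"))
print(add_suffix_if_missing("fiende", "r"))
```

Slicing with `len(text) - len(suffix)` rather than a negative index keeps the function correct when the suffix is empty.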
Other transformations
- join two strings (Z10000): combine two strings, one after the other
- concatenate many strings (Z21394): no description
- join list of strings (Z12899): returns string composed of list elements separated by a given delimiter
- join list of strings with spaces (Z22504): joins a list of strings inserting a single space between each
- sentence from list of words (Z22514): takes a list of words, joins with spaces, collapses multiple spaces, then turns to sentence case and adds a full stop at the end
- reverse string (Z10012): Inverts the order of the characters in a String (see Z10548 for one with Unicode grapheme support)
- replace all substrings (Z10075): finds and replaces all instances of a substring in an input string
- replace character set (Z14613): replaces each character of the first string that appears in the second string with the corresponding character in the third string
- wrap string (Z11145): add wrapper text to the start and end of a string
- unwrap string (Z11151): removes text from start and end of string once only if both are present
- duplicate string (Z10753): takes a string and returns it duplicated
- Replicate string n-times (Z12624): Replicates a string n times: (e.g. f("a",5) -> "aaaaa")
- duplicate string N times (Z10911): duplicates the input string N times and outputs the concatenated result
- regular expression substitute with flags (Z12316): $N for capture groups. Flags supported should at least be 'i', 'm', and 'g'.
- replace all (regex, case sensitive) (Z10193): replace characters in a string with another string according to a regex pattern
- echo string except for specific replacement (Z18898): returns the same string, unless it matches a specific string when it returns a specific string
- to Title Case (Z10251): converts a string to title case
- to PascalCase (Z10290): convert string to Pascal Case
- to camelCase (Z10816): convert string to lower camelCase
- to snake_case (Z10281): convert string to snake case
- string to hex (UTF-8) (Z10366): convert string of UTF-8 characters into hexadecimal
- hex (string) to string (UTF-8) (Z10373): hex to string
- URI percent encode (Z10761): encodes certain characters using URI percent encoding syntax
- URI percent decode (Z10774): decodes a percent-encoded input string
- international morse code encode (Z10944): encodes the supplied string in morse code, separating letter encodings by spaces and words by " / "
- international morse code decode (Z10956): decodes the supplied string in morse code: separate letter encodings by spaces and words by " / "
- encode NATO phonetic alphabet code (Z10309): requires ALLCAPS input, e.g. EXAMPLE
- decode NATO phonetic alphabet code (Z10970): case insensitive
- Infix to Postfix (Z13060): converts infix operators and operands to postfix format
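The descriptions of replace character set (Z14613) and unwrap string (Z11151) are compact, so a small Python sketch may help. The function names are hypothetical, and the sketch assumes the second and third strings of the character-set replacement have the same length.

```python
def replace_character_set(text: str, targets: str, replacements: str) -> str:
    # Assumption: targets and replacements are the same length, so each
    # target character maps to the replacement at the same position.
    return text.translate(str.maketrans(targets, replacements))


def unwrap(text: str, start: str, end: str) -> str:
    # Strip once, and only when both wrappers are present and do not overlap.
    long_enough = len(text) >= len(start) + len(end)
    if long_enough and text.startswith(start) and text.endswith(end):
        return text[len(start): len(text) - len(end)]
    return text


print(replace_character_set("hello", "el", "ip"))
print(unwrap("(abc)", "(", ")"))
```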
Color operations
- Mix colours (Z12997): Calculates the midpoint between two colours. It prefers input in hexadecimal but also accepts basic colour names.
- convert hex color (Z13017): converts a hexadecimal color code into HSL, HSV, RGB, and CMYK formats
- convert hex colour to [R,G,B] (Z17664): output is a list of three natural numbers, each between 0 and 255
- convert [R,G,B] to hex colour (Z17687): input is triplets of natural numbers between 0 and 255. output is lowercase preceded by #
- convert X11 colour to hex (Z17713): converts colour names to hex (including leading #) https://www.w3.org/TR/css-color-3/#svg-color
- opposite colour (Z13023): in the RGB color space
- colour contrast ratio (Z13028): returns colour contrast ratio 'X:1' for given hex colours
- Tint of colour (Z18184): It will mix a color with white by a given percentage.
- Shade of colour (Z18189): Returns the shade of a colour by mixing it with a percentage of black.
- Tone of colour (Z18196): Returns the tone of a color by mixing it with gray
- Analogous colour (Z18204): Returns the colours which are 30 degrees apart from the input base colour.
- Tetradic colours (square) (Z18208): Returns colours that are 90 degrees apart from the input base colour.
- Triadic colours (Z18212): Returns the two colours that are 120 degrees and 240 degrees apart from the input base colour.
- Saturation of colour (Z18263): Returns the intensity of a colour. 100% saturation means there is no addition of gray.
- Lightness of colour (Z18268): Returns the measure of how light or dark a colour is, with 0% being completely black and 100% being completely white
- Subtractive colour (Z18296): Subtract the second colour from the first colour.
- Additive colours (Z18300): Additively mix two hex colours using the RGB model.
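Several of the colour functions above reduce to arithmetic on the three RGB channels of a hex colour. The Python sketch below shows the conversions in both directions, the RGB opposite, and a midpoint mix. The function names are hypothetical, only six-digit hex input is handled (not colour names), and rounding the midpoint down is an assumption; the real functions may round differently.

```python
def hex_to_rgb(colour: str) -> list[int]:
    digits = colour.lstrip("#")
    return [int(digits[i:i + 2], 16) for i in (0, 2, 4)]


def rgb_to_hex(rgb: list[int]) -> str:
    # Lowercase output preceded by "#", as described for Z17687.
    return "#" + "".join(f"{channel:02x}" for channel in rgb)


def opposite_colour(colour: str) -> str:
    # Invert each channel in the RGB colour space.
    return rgb_to_hex([255 - channel for channel in hex_to_rgb(colour)])


def mix_colours(first: str, second: str) -> str:
    # Channel-wise midpoint; integer division rounds down (assumption).
    pairs = zip(hex_to_rgb(first), hex_to_rgb(second))
    return rgb_to_hex([(a + b) // 2 for a, b in pairs])


print(hex_to_rgb("#ff8000"))
print(opposite_colour("#ff8000"))
```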
String presentation transformations
- to uppercase (Z10018): Convert a string to uppercase letters
- to lowercase (Z10047): Convert a string to lowercase letters
- turn to superscript (Z19612): Takes a text, and all characters that have a superscript version are replaced with such.
- pretty " (Z11484): replace " with pretty left-right quotes depending on position
- pretty ' (Z11490): replace ' with pretty left-right quotes depending on position
- format large natural number strings by adding commas (Z13473): formats natural numbers that have many digits by adding commas.
- pad string with leading characters to specified length (Z14770): add specified characters at the start until the string is of the required length
- capitalise first letter and add full stop (Z22511): turn a string of words into a sentence format, with an initial capital, and a full stop at the end.
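Padding and digit grouping from the list above can be sketched in Python as follows. The function names are hypothetical. The sketch assumes a single padding character and groups of three digits; the catalogue does not say which grouping convention Z13473 uses.

```python
def pad_with_leading_characters(text: str, pad_character: str, length: int) -> str:
    # str.rjust requires exactly one fill character and never truncates.
    return text.rjust(length, pad_character)


def add_commas(digits: str) -> str:
    # Assumption: Western-style grouping in threes.
    return f"{int(digits):,}"


print(pad_with_leading_characters("7", "0", 3))
print(add_commas("1234567"))
```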
Uncommon String operations
These functions perform more advanced transformations, hold more state, and showcase the more advanced capabilities of Wikifunctions.
- left/inner/right mark replacement (Z11492): replaces the same mark (or substring) in a string with different replacements depending on position
- general positional mark replacement (Z11501): a generalisation of Z11492 to allow different spacers and specify isolated replacement
String classical cipher functions
(alphabet needs to be specified when calling these functions)
- Caesar cipher (Latin alphabet) (Z12812): rotates letters in the Latin alphabet forward by a defined number of places
- ROT1 (Latin alphabet) (Z10846): move each letter one letter forward in the English alphabet
- ROT13 (Latin alphabet) (Z10627): encode or decode a Latin alphabet string using the ROT13 cipher
- ROT25 (Latin alphabet) (Z10851): move each letter one letter back in the English alphabet
- to Scream Cipher (Z22725): Based on xkcd.com/3054 - reverse at Z22728
- from Scream Cipher (Z22728): Based on xkcd.com/3054 - reverse at Z22725
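ROT1, ROT13, and ROT25 are all Caesar ciphers with a fixed shift. The Python sketch below shows one way to express that relationship. The function name and the `alphabet` parameter are hypothetical, and preserving letter case while leaving other characters untouched is an assumption about the intended behaviour.

```python
import string


def caesar(text: str, shift: int, alphabet: str = string.ascii_lowercase) -> str:
    size = len(alphabet)
    result = []
    for character in text:
        lower = character.lower()
        if lower in alphabet:
            rotated = alphabet[(alphabet.index(lower) + shift) % size]
            # Restore uppercase when the input character was uppercase.
            result.append(rotated.upper() if character != lower else rotated)
        else:
            result.append(character)  # not in the alphabet: keep as-is
    return "".join(result)


print(caesar("Hello, World!", 3))
print(caesar("Hello", 13))   # ROT13
print(caesar("b", 25))       # ROT25 moves one letter back
```

Because the alphabet has 26 letters, applying a shift of 13 twice returns the original text, which is why ROT13 serves for both encoding and decoding.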
Cryptographic hash functions
(would be better with types representing a stream of bytes)
- SHA-1 (Z10148): SHA-1 hash of the UTF-8 representation of a string, as a lowercase hexadecimal byte string
- SHA-256 (Z10124): returns the hexadecimal hash of a string in SHA-256
- SHA-384 (Z10132): returns the hexadecimal hash of a string in SHA-384
- SHA-512 (Z10067): hash a string using the SHA-512 function
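Because these functions take a String rather than a stream of bytes, the string has to be encoded before hashing. The Python sketch below uses the standard `hashlib` module and assumes UTF-8 encoding for all four hashes, which the catalogue states explicitly only for SHA-1. The function name is hypothetical.

```python
import hashlib


def hex_digest(text: str, algorithm: str = "sha1") -> str:
    # Encode to UTF-8 bytes first, then return lowercase hexadecimal.
    return hashlib.new(algorithm, text.encode("utf-8")).hexdigest()


for name in ("sha1", "sha256", "sha384", "sha512"):
    print(name, hex_digest("abc", name))
```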
Experimental String operations
TODO: Explain why these exist and when people might use them.
- (!) get lemma string from Lexeme JSON (Z10037): (!) approximates mw.wikibase.lexeme.entity.lexeme:getLemma(languageCode)
- Turkish final-obstruent devoicing for string (Z10022): returns the spelling of a Turkish string with its final voiced obstruent devoiced
- Base16 Encode (Z11003): Encode a string into base16
- Base16 Decode (Z11007): Decode a string from base16
- Base32 Encode (Z14189): no description
- Base32 Decode (Z14195): Decode a string from Base32
- Base64 Encode (Z10057): Encode a string into base64
- Base64 decode (Z10062): Decode a string from base64 (needed to demonstrate base64 encode/decode examples)
- debug (Z12941): prints the non-empty string passed to it as a debug and returns true, if empty returns false
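The Base16, Base32, and Base64 pairs above are inverse operations on the bytes of a string. The Python sketch below shows the round trip with the standard `base64` module. The function names are hypothetical, and UTF-8 encoding and uppercase Base16 output are assumptions; the Wikifunctions versions may differ on both points.

```python
import base64


def base16_encode(text: str) -> str:
    return base64.b16encode(text.encode("utf-8")).decode("ascii")


def base16_decode(encoded: str) -> str:
    return base64.b16decode(encoded).decode("utf-8")


def base32_encode(text: str) -> str:
    return base64.b32encode(text.encode("utf-8")).decode("ascii")


def base64_encode(text: str) -> str:
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def base64_decode(encoded: str) -> str:
    return base64.b64decode(encoded).decode("utf-8")


print(base16_encode("hi"))
print(base32_encode("hi"))
print(base64_encode("hi"))
```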
Wikitext and Mediawiki string operations
- italicise a simple string in Wikitext (Z11019): wrap string with two pairs of single quotes (ABC -> ''ABC''). Careful using this if your text has special formatting characters.
- bold in Wikitext (Z11139): bold a string by triple quoting, e.g. (ABC -> '''ABC'''). Careful if there are special characters.
- csv record to wikitable row (Z10919): Converts a validly formatted (RFC 4180) comma-separated value series into the contents of a valid wikitable row (not including the row start or row end characters) where variables are separated by '||', and any whitespace is unchanged. Be careful to validly render CSV with quoted fields and with pipes ('|') in the field.
- wrap with XML tag (Z11156): adds <tag> and </tag> around a string
- substitute mediawiki editchangetags query (Z17954): 36621225 ... ↓ ?action=editchangetags&ids%5B36621225%5D=1 ... &ids
- substitute mediawiki revisiondelete query (Z17956): 36621225 ... ↓ ?action=revisiondelete&ids%5B36621225%5D=1 ... &ids
Comma-separated operations
- string is element of CSV (Z11094): tests whether a string is an element of a validly formatted (RFC 4180) comma-separated value series (single row, not whole file); be careful to validly interpret a CSV with quoted fields
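Both CSV-related functions in this catalogue, string is element of CSV (Z11094) and csv record to wikitable row (Z10919), depend on parsing quoted fields correctly rather than splitting on every comma. The Python sketch below uses the standard `csv` module for a single record. The function names are hypothetical, and the sketch does not attempt any escaping of pipes ('|') inside fields, which a real wikitable row would need.

```python
import csv


def parse_csv_record(record: str) -> list[str]:
    # csv.reader handles quoted fields and embedded commas.
    return next(csv.reader([record]), [])


def string_is_element_of_csv(candidate: str, record: str) -> bool:
    return candidate in parse_csv_record(record)


def csv_record_to_wikitable_row(record: str) -> str:
    # Cells joined by "||"; whitespace inside fields is left unchanged.
    return "||".join(parse_csv_record(record))


record = 'a,"b,c",d'
print(string_is_element_of_csv("b,c", record))
print(csv_record_to_wikitable_row(record))
```

A naive `record.split(",")` would report "b" as an element of this record, which is the mistake the catalogue entry warns about.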