Wikifunctions:Suggest a function
Do you have an idea for a new function? Suggest it here! It may help to refer to our glossary.
If you have the necessary user rights and the function aligns with other work, you can go ahead and create it right away.
Note that for now we only support strings and booleans as input and output types of functions. More types are coming in the next few months.
Once created, consider adding new Functions to the catalogue.
Proposed functions requiring only String and Boolean types
String manipulation functions
- discard from start of first substring
- discard from end of first substring
- discard until start of first substring
- discard until end of first substring
- discard from start of last substring
- discard from end of last substring
- replace at end(string, suffix to replace, suffix to add)
Done replace at end (Z11178)
- remove at end(string, suffix)
Done remove at end (Z11170)
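The two suffix functions above can be sketched in Python as follows. This is only an illustration of the proposed behaviour, not the implementation behind Z11178 or Z11170; the function names are hypothetical, and leaving the string unchanged when the suffix is empty or absent is an assumption.

```python
def replace_at_end(text: str, old_suffix: str, new_suffix: str) -> str:
    """Swap old_suffix for new_suffix, but only when text ends with old_suffix."""
    # Assumption: an empty or non-matching suffix leaves the input untouched.
    if old_suffix and text.endswith(old_suffix):
        return text[: len(text) - len(old_suffix)] + new_suffix
    return text


def remove_at_end(text: str, suffix: str) -> str:
    """Drop suffix from the end of text (a replacement with the empty string)."""
    return replace_at_end(text, suffix, "")
```

For example, `replace_at_end("walking", "ing", "ed")` gives `"walked"`, and `remove_at_end("walk", "ing")` returns `"walk"` unchanged.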
String character discard functions
- remove all punctuation
Done Remove interpunction (Z11193)
- remove all emoticons/emoji
String character replacement functions
- replace all U+0022 quotation marks with pretty quotation marks (usually U+201x, but to be specified as arguments – it's language-specific)
- replace all U+0027 apostrophe marks with pretty quotation marks
String search functions
String escaping and unescaping functions
String encoding and decoding functions
- Backslash-U with delimiters ASCII encoding of Unicode encode
- Backslash-U with delimiters ASCII encoding of Unicode decode
- XML and HTML ASCII encoding of Unicode encode
- XML and HTML ASCII encoding of Unicode decode
- HTML named character encode - HTML named character escape (Z10987)
- HTML named character decode - HTML named character unescape (Z10938)
- Punycode encode - Punycode encode (Z10178) (part only, not whole url); see also IDNA encode (Z10185)
- Punycode decode - Punycode decode (Z10181) (part only, not whole url); see also IDNA decode (Z10188)
- Unified English Braille encode (discarding invalid characters?)
- Unified English Braille decode
- ASCII Braille encode (discarding invalid characters?)
- ASCII Braille decode
Done NATO phonetic alphabet code word decode - Decode NATO phonetic alphabet code (Z10970) ("ALPHA BRAVO CHARLIE NINE" ⇒ "ABC9")
- NATO phonetic alphabet ICAO pronunciations encode ("ABC9" ⇒ "AL FAH. BRAH VOH. CHAR LEE. NIN-er.") (discarding invalid characters?)
- NATO phonetic alphabet ICAO pronunciations decode ("AL FAH. BRAH VOH. CHAR LEE. NIN-er." ⇒ "ABC9")
- NATO phonetic alphabet ICAO IPA transcription encode ("ABC9" ⇒ "ˈælfa ˈbraːˈvo ˈtʃɑːli ˈnaɪnə") (discarding invalid characters?)
- NATO phonetic alphabet ICAO IPA transcription decode ("ˈælfa ˈbraːˈvo ˈtʃɑːli ˈnaɪnə" ⇒ "ABC9")
- NATO phonetic alphabet DIN 5009 IPA transcription encode ("ABC9" ⇒ "ˈalfa ˈbravo ˈtʃali ˈnaɪnə") (discarding invalid characters?)
- NATO phonetic alphabet DIN 5009 IPA transcription decode ("ˈalfa ˈbravo ˈtʃali ˈnaɪnə" ⇒ "ABC9")
String presentation functions
- add locale-specific quotation marks to string
- format a large integer with "," between every three digits. e.g. "1000000" -> "1,000,000"
- Shouldn't the output depend on the locale? See mw.language:formatNum. —Dexxor (talk) 17:15, 4 September 2023 (UTC)
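A minimal Python sketch of the digit-grouping proposal, with the separator exposed as an argument as one possible answer to the locale question. The function name and the `separator` parameter are hypothetical, and grouping by three digits is itself a locale assumption (some locales group differently).

```python
def group_digits(digits: str, separator: str = ",") -> str:
    """Insert a separator between every three digits of an integer string."""
    # Python's "," format option always groups by three with commas;
    # the comma is then swapped for the requested separator.
    return f"{int(digits):,}".replace(",", separator)
```

With the default separator, `group_digits("1000000")` gives `"1,000,000"`; passing `"."` gives `"1.000.000"`.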
String colour notation functions
- complementary colour in RGB colour model ("#FF0000" ⇒ "#00FFFF")
- Great question. I don't think there is a position documented on Wikifunctions for how to handle invalid input to a function. Can we throw exceptions? Return null? Dhx1 (talk) 13:23, 6 August 2023 (UTC)
- This shouldn't be a string function. This should be a type that represents a RGB color (with corresponding validation function (hopefully it can just be three unsigned 8bit integers)) and a function that returns the complementary color. 0xDeadbeef (talk) 12:38, 7 August 2023 (UTC)
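For illustration, here is a string-based Python sketch of the complementary colour proposal. It assumes the complement is the per-channel inversion (255 minus each channel), which matches the "#FF0000" ⇒ "#00FFFF" example. Raising an exception on invalid input is only one of the options raised in the discussion, chosen here for the sketch; the function name is hypothetical.

```python
import re

_HEX_COLOUR = re.compile(r"#[0-9A-Fa-f]{6}")


def complementary_colour(colour: str) -> str:
    """Return the RGB complement of a colour written as '#RRGGBB'."""
    if not _HEX_COLOUR.fullmatch(colour):
        # Assumption: invalid input raises instead of returning a sentinel.
        raise ValueError(f"not a #RRGGBB colour: {colour!r}")
    value = int(colour[1:], 16)
    # Subtracting from white inverts all three channels at once.
    return f"#{0xFFFFFF - value:06X}"
```

A dedicated RGB colour type, as suggested in the thread, would move the validation out of this function entirely.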
String notation validation checks
- check if string is a nucleic acid notation
- check if string is a simplified molecular-input line-entry system (SMILES) notation - is SMILES notation (Z11208)
- check if string is a SMILES arbitrary target specification (SMARTS) notation
- check if string is an ABC notation
- check if string is a LilyPond notation
- check if string is a portable game notation
Done check if string is a Whyte notation - is Whyte notation (Z10524)
- check if string is a UIC classification of locomotive axle arrangements notation
- check if string is an AAR wheel arrangement notation
- check if string is an IPv4 address - is IPv4 (Z10476)
- check if string is an IPv6 address - is IPv6 (Z10786)
- check if a string is a valid ISBN-10
- check if a string is a valid ISBN-13 (probably just a simple variant of is EAN (Z10821), dropping/validating the hyphens)
Done check if a string is a valid ISSN - is ISSN (Z10765)
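As a sketch of the two ISBN proposals, the check-digit rules can be written in Python like this. The function names are hypothetical, hyphens and spaces are simply dropped rather than validated, and the ISBN-13 check does not verify the 978/979 prefix; all of these are simplifying assumptions.

```python
DIGITS = "0123456789"


def _strip_separators(text: str) -> str:
    """Remove the hyphens and spaces commonly used to group ISBN digits."""
    return text.replace("-", "").replace(" ", "")


def is_isbn10(text: str) -> bool:
    """Weighted sum (weights 10 down to 1) must be divisible by 11."""
    chars = _strip_separators(text).upper()
    if len(chars) != 10 or any(c not in DIGITS for c in chars[:9]):
        return False
    if chars[9] not in DIGITS + "X":
        return False
    # A final "X" stands for the value 10.
    values = [int(c) for c in chars[:9]]
    values.append(10 if chars[9] == "X" else int(chars[9]))
    total = sum(w * v for w, v in zip(range(10, 0, -1), values))
    return total % 11 == 0


def is_isbn13(text: str) -> bool:
    """Alternating weights 1 and 3 must sum to a multiple of 10 (as for EAN-13)."""
    chars = _strip_separators(text)
    if len(chars) != 13 or any(c not in DIGITS for c in chars):
        return False
    total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(chars))
    return total % 10 == 0
```

For example, `is_isbn10("0-306-40615-2")` and `is_isbn13("978-0-306-40615-7")` are both true, while changing either check digit makes the result false.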
String validation checks
Done check if string A is an anagram of string B - is anagram (simple) (Z10973)
- Hi. I don’t know where we are supposed to discuss function propositions. I just want to note that the notion of anagram is language dependent. For example, in French, the diacritics doesn’t count (“niché” is an anagram of “chien”). But in Esperanto, the letters with diacritics are considered different letters. So, “ico” is not an anagram of “ĉio”. Lepticed7 (talk) 10:40, 5 August 2023 (UTC)
- @Lepticed7: assuming that in the future a "remove diacritics" function would exist, anagram in this context would be strictly defined as string having exactly the same set of characters. 0xDeadbeef (talk) 16:17, 5 August 2023 (UTC)
- @0xDeadbeef Thus giving wrong results for Esperanto. Or at least results that do not match the common understanding of "anagrams" in Esperanto. Lepticed7 (talk) 16:34, 5 August 2023 (UTC)
- @Lepticed7: therefore you can't call it an anagram in Esperanto then. You have to translate what the function does. 0xDeadbeef (talk) 16:36, 5 August 2023 (UTC)
- @0xDeadbeef Well, what’s your definition of "anagram"? Lepticed7 (talk) 16:43, 5 August 2023 (UTC)
- @Lepticed7: For this function I am working on: are strings anagrams (Z10423) It is defined as, whether one string with its codepoints rearranged is equivalent to the other string. Actually I misread the discussion above I think. I'm just saying anyone can just create another function for anagrams in French that ignores diacritics, while anagrams can remain having a strict definition (sensitive to diacritics). 0xDeadbeef (talk) 16:47, 5 August 2023 (UTC)
- Ok. So maybe should we indicate in the function description that are strings anagrams (Z10423) is diacritics-sensitive? Lepticed7 (talk) 16:50, 5 August 2023 (UTC)
- Sure, but I hit the character limit for the short description, so maybe we could do that when long descriptions become a thing. 0xDeadbeef (talk) 16:51, 5 August 2023 (UTC)
- Well, if you start talking about codepoints, you have to be even more careful, as Unicode is tricky. Because now, your function says “QR̀” and “RQ̀“ are anagrams (while “AÈ” and “EÀ” are not)… [Note that I am not sure if some specification/design document of Wikifunctions even specifies that all strings are NFC?] --Mormegil (talk) 18:17, 19 August 2023 (UTC)
- Sorry, are you talking about unicode graphemes? I don't think codepoints are as tricky as graphemes. (since graphemes can be ligated and etc.) 0xDeadbeef (talk) 04:40, 20 August 2023 (UTC)
- I have done this and named it "is anagram (simple)" to avoid confusing it with a more linguistically aware version. Mtanti (talk) 12:56, 31 August 2023 (UTC)
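The normalisation point raised in the thread can be made concrete with a short Python sketch. It is not the implementation of are strings anagrams (Z10423) or is anagram (simple) (Z10973); the function name and the optional `normal_form` parameter are hypothetical, and it simply compares sorted code points after an optional Unicode normalisation.

```python
import unicodedata
from typing import Optional


def is_anagram_by_code_points(a: str, b: str, normal_form: Optional[str] = None) -> bool:
    """Compare two strings as multisets of Unicode code points."""
    if normal_form is not None:
        # "NFC" composes base letter + combining mark where a precomposed
        # character exists; "NFD" always decomposes.
        a = unicodedata.normalize(normal_form, a)
        b = unicodedata.normalize(normal_form, b)
    return sorted(a) == sorted(b)
```

Under NFC, `"QR\u0300"` and `"RQ\u0300"` compare as anagrams because the combining grave accent stays a separate code point, while `"A\u00c8"` and `"E\u00c0"` do not. Under NFD both pairs compare as anagrams, so the result depends on which normal form the strings are in.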
- check if string is a heterogram
- check if string is a tautogram
- check if string A includes all letters of string B ("Nadine" & "and" ⇒ true)
- check if string is in lower camel case
- check if string is a valid ISO 639-1 language code
- check if string is a valid ISO 639-2 language code
- check if string is a valid ISO 3166 country code
- check if string is a valid ISO 8601 date/time (2023-08-03 ⇒ true; 2023-02-30 ⇒ false; 2023-08-03 15:00:00.000 ⇒ true; 2023-08-03 25:00:00.000 ⇒ false)
- check if string is a valid EDTF date/time
Doing... check if string is a valid email address (watch out, see this list of falsehoods about email addresses to create unit tests - email addresses are more complicated than they seem) - is valid email address (Z10410): creating test cases is in progress. It is currently stuck on figuring out what exactly a valid email address is. Nearly every erratum for RFC:3696 is about that.
Doing... check if string is a valid Wikidata item — Is valid Qid (Z10696) (possibly stuck on phab:T343593?)
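The ISO 8601 examples in the list above can be reproduced with a small Python sketch built on the standard library parser. The function name is hypothetical. Note the assumption: `datetime.fromisoformat` accepts only the formats Python supports (which vary by Python version and include a space instead of "T" as separator), so this is an approximation of the proposal, not a full ISO 8601 validator.

```python
from datetime import datetime


def is_valid_iso_datetime(text: str) -> bool:
    """Return True if Python's ISO parser accepts text as a date or date-time."""
    try:
        # Rejects impossible dates (30 February) and out-of-range hours.
        datetime.fromisoformat(text)
    except ValueError:
        return False
    return True
```

This gives true for "2023-08-03" and "2023-08-03 15:00:00.000", and false for "2023-02-30" and "2023-08-03 25:00:00.000", matching the examples in the proposal.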
Wikitext string operations
- italicise in Wikitext (Z11019) italicise (ABC -> ''ABC''), careful if the input has wikitext special characters in it
- bold in Wikitext (Z11139) bold (ABC -> '''ABC'''), careful if the input has wikitext special characters in it
- Escape all special characters in a string so that they don't become functional when in Wikitext. (Is this possible? Or would it always be better to wrap with <nowiki></nowiki>?)
CSV list operations
Morphological functions
Croatian
- gender * case * number for each noun
- genders: feminine, masculine, neuter
- cases: nominative, genitive, dative, accusative, locative, vocative, instrumental
- numbers: singular, plural
Done regular Croatian feminine genitive singular (Z11165)
Done regular Croatian feminine dative singular (Z11199)
Doing... regular Croatian feminine accusative singular (Z11204)
Proposed functions requiring future types
Note these functions cannot be implemented yet as Wikifunctions currently only supports Boolean and String types for function definitions.
If one wishes to nevertheless attempt to define and implement them,
- the functions and implementations should indicate prominently in their labels that their input/output types must be adjusted once support for the appropriate replacement types becomes available; and
- the functions should not be used in the implementations of any other functions, as the later adjustment of input/output types to appropriate replacements will break those implementations.
String manipulation functions
- right (returns a substring from the end of a specified string up to a number of characters)
- left (returns a substring from the beginning of a specified string up to a number of characters)
- duplicate string n times (e.g. dup("a",5) -> "aaaaa")
String analysis functions
- string length (Hello -> 5)
- count distance between two letters in given alphabet (default to 26-character western alphabet. case insensitive. e.g. "a" & "A" ⇒ 0; "K" & "N" ⇒ 3)
- Hamming distance between two strings of equal length, e.g. "Wikipedia" & "Wikimedia" ⇒ 1.
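Once a number type exists, the analysis functions above are short. A Python sketch, with hypothetical function names and the assumption that unequal lengths or letters outside the alphabet raise an error:

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def letter_distance(a: str, b: str, alphabet: str = "abcdefghijklmnopqrstuvwxyz") -> int:
    """Distance between two letters' positions in the alphabet, ignoring case."""
    # str.index raises ValueError for a letter that is not in the alphabet.
    return abs(alphabet.index(a.lower()) - alphabet.index(b.lower()))
```

Here `hamming_distance("Wikipedia", "Wikimedia")` is 1, `letter_distance("K", "N")` is 3, and the string length proposal corresponds to `len("Hello")`, which is 5.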
String encoding and decoding functions
(would be better with types representing a stream of bytes)
Done BASE16 encode - Base16 Encode (Z11003)
Done BASE16 decode - Base16 Decode (Z11007)
- BASE32 encode
- BASE32 decode
- BASE45 encode
- BASE45 decode
Done BASE64 encode - Base64 Encode (Z10057)
Done BASE64 decode - Base64 Decode (Z10062)
- Hexadecimal UTF-8 encode ("ABC ₤" ⇒ "41 42 43 20 E2 82 A4")
- Hexadecimal UTF-8 decode ("41 42 43 20 E2 82 A4" ⇒ "ABC ₤")
- Decimal UTF-8 encode ("ABC ₤" ⇒ "65 66 67 32 226 130 164")
- Decimal UTF-8 decode ("65 66 67 32 226 130 164" ⇒ "ABC ₤")
- Octal UTF-8 encode ("ABC ₤" ⇒ "101 102 103 40 342 202 244")
- Octal UTF-8 decode ("101 102 103 40 342 202 244" ⇒ "ABC ₤")
- Binary UTF-8 encode ("ABC ₤" ⇒ "01000001 01000010 01000011 00100000 11100010 10000010 10100100")
- Binary UTF-8 decode ("01000001 01000010 01000011 00100000 11100010 10000010 10100100" ⇒ "ABC ₤")
- Unicode code point encode ("ABC ₤" ⇒ "41 42 43 20 20A4") - Unicode code point encode hex (Z10785)
- Unicode code point decode ("41 42 43 20 20A4" ⇒ "ABC ₤")
- Create regular expression object/string (i.e: "test" & "i" to /test/i)
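The UTF-8 and code point notations in the list above differ only in the number base and padding. A Python sketch follows; the helper names are hypothetical, and using a single space as the delimiter is taken from the examples.

```python
def utf8_encode(text: str, number_format: str) -> str:
    """Write each UTF-8 byte of text using a format spec such as '02X' or '08b'."""
    return " ".join(format(byte, number_format) for byte in text.encode("utf-8"))


def utf8_decode(encoded: str, base: int) -> str:
    """Rebuild the text from space-separated byte values in the given base."""
    return bytes(int(token, base) for token in encoded.split()).decode("utf-8")


def code_points_encode(text: str) -> str:
    """Write each Unicode code point of text in upper-case hexadecimal."""
    return " ".join(f"{ord(char):X}" for char in text)


def code_points_decode(encoded: str) -> str:
    """Rebuild the text from space-separated hexadecimal code points."""
    return "".join(chr(int(token, 16)) for token in encoded.split())
```

The format specs `"02X"`, `"d"`, `"o"` and `"08b"` give the hexadecimal, decimal, octal and binary variants respectively, and the bases 16, 10, 8 and 2 decode them.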
String validation checks
(alphabet needs to be specified when calling these functions)
- check if string is a pangram for a specified alphabet
- Could you just specify the alphabet as a second string, e.g. "abcdefghijklmnopqrstuvwxyz"?
Natural language functions
- Choose singular or plural based on number (e.g. singularOrPlural("person", 6) -> "people")
Cryptographic hash functions
(would be better with types representing a stream of bytes)
To do MD2 - MD2 (Z10135)
To do MD4 - MD4 (Z10136)
To do MD5 - MD5 (Z10137)
To do RIPEMD-128 - RIPEMD-128 (Z10138)
To do RIPEMD-160 - RIPEMD-160 (Z10139)
To do BLAKE2b-160 - BLAKE2b-160 (Z10140)
To do BLAKE2b-256 - BLAKE2b-256 (Z10141)
To do BLAKE2b-384 - BLAKE2b-384 (Z10142)
To do BLAKE2b-512 - BLAKE2b-512 (Z10143)
To do BLAKE2s-128 - BLAKE2s-128 (Z10144)
To do BLAKE2s-160 - BLAKE2s-160 (Z10145)
To do BLAKE2s-224 - BLAKE2s-224 (Z10146)
To do BLAKE2s-256 - BLAKE2s-256 (Z10147)
To do SHA-224 - SHA-224 (Z10149)
To do HMAC-SHA-256
To do SHAKE-128 - SHAKE-128 (Z10150)
To do SHAKE-256 - SHAKE-256 (Z10151)
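Several of the hashes above are available in Python's standard `hashlib` and `hmac` modules, which gives a rough idea of what string-only versions might look like. The wrapper names are hypothetical, and encoding the input string as UTF-8 before hashing is an assumption, which is exactly why a byte-stream type would be preferable.

```python
import hashlib
import hmac


def md5_hex(text: str) -> str:
    """MD5 of the UTF-8 bytes of text, as lower-case hexadecimal."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def sha224_hex(text: str) -> str:
    return hashlib.sha224(text.encode("utf-8")).hexdigest()


def blake2b_hex(text: str, bits: int) -> str:
    """BLAKE2b with a chosen output size, e.g. 160, 256, 384 or 512 bits."""
    return hashlib.blake2b(text.encode("utf-8"), digest_size=bits // 8).hexdigest()


def shake128_hex(text: str, output_bytes: int) -> str:
    """SHAKE-128 is extendable-output, so the caller picks the length."""
    return hashlib.shake_128(text.encode("utf-8")).hexdigest(output_bytes)


def hmac_sha256_hex(key: str, text: str) -> str:
    return hmac.new(key.encode("utf-8"), text.encode("utf-8"), hashlib.sha256).hexdigest()
```

Note that the SHAKE functions need an extra output-length argument, so their function signatures would differ from the fixed-length hashes.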
Colour functions
- return colour contrast ratio (per [1]) of two RGB colours (provided as strings e.g. "#FF0000")
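A possible sketch in Python, assuming the reference above points to the WCAG 2.x definition of contrast ratio (relative luminance with sRGB linearisation); if a different definition is meant, the formula below does not apply. The function names are hypothetical and the input is assumed to be already validated "#RRGGBB" strings.

```python
def _relative_luminance(colour: str) -> float:
    """Relative luminance of a '#RRGGBB' colour, per the WCAG 2.x formula."""
    channels = [int(colour[i:i + 2], 16) / 255 for i in (1, 3, 5)]
    # Undo the sRGB gamma curve for each channel.
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    red, green, blue = linear
    return 0.2126 * red + 0.7152 * green + 0.0722 * blue


def contrast_ratio(colour_a: str, colour_b: str) -> float:
    """Contrast ratio between 1 (identical) and 21 (black on white)."""
    lum_a = _relative_luminance(colour_a)
    lum_b = _relative_luminance(colour_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)
```

The result is a non-integer number, so this proposal needs a floating-point or rational type for its output as well as a colour type for its inputs.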
Date functions
- check if year (for a specific calendar) is a leap year:
Done is leap year (Gregorian calendar) (Z10996) - Gregorian calendar (2020 ⇒ true; 2023 ⇒ false; 2100 ⇒ false)
Done is leap year (Julian calendar) (Z11015) - Julian calendar (2020 ⇒ true; 2023 ⇒ false; 2100 ⇒ true)
Done is leap year (Jalali calendar) (Z11011) - Jalali (or Persian) calendar (1399 ⇒ true; 1400 ⇒ false)
- Solar Hijri (or Islamic) calendar
- Indian national calendar
- Bengali calendar
- Chinese calendar
- Thai calendar
- etc.
- return weekday (2023-08-03 ⇒ Thursday (or should this return Q129, for Thursday (Q129)?))
- date to weekday number (0-6)
- advance n days (2023-08-03 & "69" ⇒ 2023-10-11)
- go back n days
- string to date
- date to ISO 8601 string
- date to year (yyyy)
- date to month of the year (1-12)
- date to month name (January-December)
- date to day of the month (1-31)
- date to hour of the day (0-23)
- date to minutes (0-59)
- date to seconds (0-59)
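A few of the date proposals above, sketched in Python with the standard `datetime` module. The function names are hypothetical; dates are passed as ISO 8601 strings because no date type is assumed, and the Julian rule shown applies to years counted in the common era.

```python
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def is_leap_year_gregorian(year: int) -> bool:
    """Divisible by 4, except centuries, unless divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def is_leap_year_julian(year: int) -> bool:
    """Every fourth year is a leap year."""
    return year % 4 == 0


def weekday_name(iso_date: str) -> str:
    """English weekday name for a date given as YYYY-MM-DD."""
    return WEEKDAYS[date.fromisoformat(iso_date).weekday()]


def advance_days(iso_date: str, days: int) -> str:
    """Move forward by days (or back, if days is negative)."""
    return (date.fromisoformat(iso_date) + timedelta(days=days)).isoformat()
```

This reproduces the examples in the list: 2100 is a leap year in the Julian calendar but not in the Gregorian one, 2023-08-03 is a Thursday, and advancing it by 69 days gives 2023-10-11. "Go back n days" falls out of `advance_days` with a negative argument.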
Basic list/iterable functions
- map, return a list of the elements resulting from applying a given function to each item
- filter, return elements meeting criteria given by a function
- take, return n items of a list
- skip, return a new list without the first n items of the given list
- reverse, return a reversed list of a given list
- contains, if a list has a specific item
- every/all, return true if all items of a list meet criteria given by a function, otherwise false
- any, return true if at least one of the items of a list meets criteria given by a function, otherwise false
- slice, a combination of take/skip
- length, size of a list
- fold/reduce
- group
- flat
- sort, by a given function
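To pin down the intended semantics, here is how some of the list functions above could look in Python. The names follow the proposals but are otherwise hypothetical, `n` is assumed to be non-negative, and `group` returning a mapping from key to items is one possible reading of that proposal.

```python
from functools import reduce


def take(items, n):
    """The first n items."""
    return items[:n]


def skip(items, n):
    """Everything except the first n items."""
    return items[n:]


def slice_list(items, start, count):
    """A combination of skip and take."""
    return take(skip(items, start), count)


def every(items, predicate):
    return all(predicate(item) for item in items)


def some(items, predicate):
    return any(predicate(item) for item in items)


def fold(items, function, initial):
    """Combine items left to right, starting from initial."""
    return reduce(function, items, initial)


def flat(nested):
    """Flatten one level of nesting."""
    return [item for inner in nested for item in inner]


def group(items, key):
    """Collect items under the value that key returns for them."""
    groups = {}
    for item in items:
        groups.setdefault(key(item), []).append(item)
    return groups
```

Most of these take a function as an argument, so they also depend on Wikifunctions supporting functions as inputs, not just on a list type.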
Basic numerical functions
- round up ("1.289" & "2" ⇒ "1.29"; "5678" & "2" ⇒ "5700")
- round down
- return integer value (5678.678 ⇒ 5678)
- decode Roman numerals ("X" ⇒ 10; "MMXXIII" ⇒ 2023)
Doing... : Roman to Arabic numeral (Z11023): Convert a Roman numeral to Arabic numeral
- encode as Roman numerals (10 ⇒ "X"; 2023 ⇒ "MMXXIII")
Doing... : Arabic to Roman numeral (Z11022): Convert a positive integer [1, 4999] to Roman numeral
- return cardinal (23 ⇒ "twenty-three")
- return ordinal (23 ⇒ "twenty-third")
- Body Mass Index (80kg and 1.80m ⇒ 24)
- Convert money from US$ to anything else
- Kronecker delta
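The Roman numeral conversions above can be sketched in Python as below. This is not the code behind Z11022 or Z11023; the function names are hypothetical, the [1, 4999] range is taken from the description above, and the decoder is lenient (it does not reject malformed numerals such as "IIII").

```python
_ROMAN_VALUES = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]
_SYMBOL_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def to_roman(number: int) -> str:
    """Encode an integer from 1 to 4999 as a Roman numeral."""
    if not 1 <= number <= 4999:
        raise ValueError("number must be between 1 and 4999")
    parts = []
    for value, symbol in _ROMAN_VALUES:
        # Use each symbol as many times as it fits, largest first.
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


def from_roman(numeral: str) -> int:
    """Decode a Roman numeral; a smaller symbol before a larger one subtracts."""
    total = 0
    largest_seen = 0
    for symbol in reversed(numeral.upper()):
        value = _SYMBOL_VALUES[symbol]
        if value < largest_seen:
            total -= value
        else:
            total += value
            largest_seen = value
    return total
```

For example, `to_roman(2023)` gives `"MMXXIII"` and `from_roman("X")` gives 10.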
Data serialization functions
- parse a string as JSON
- extract string from JSON object based on JSONPath ({"name":"Alice"}, "$.name" ⇒ "Alice")
- Why not first convert a JSON string to an object, and then have a function that extracts fields based on JSONPath? Doing Stringly-typed things like this proposal as defined isn't a good idea. 0xDeadbeef (talk) 16:16, 5 August 2023 (UTC)
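The two-step approach suggested in the comment (parse first, then extract) might look like this in Python. Both function names are hypothetical, and the path handling is a deliberately tiny stand-in that only understands "$" followed by dot-separated member names; real JSONPath has a much richer syntax that is not covered here.

```python
import json


def parse_json(text: str):
    """Step 1: turn a JSON string into a Python object."""
    return json.loads(text)


def extract_by_simple_path(document, path: str):
    """Step 2: follow a path such as '$.name' or '$.a.b' through nested objects."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    value = document
    # Assumption: only plain member names separated by dots are supported.
    for key in filter(None, path[1:].split(".")):
        value = value[key]
    return value
```

With these, `extract_by_simple_path(parse_json('{"name":"Alice"}'), "$.name")` gives `"Alice"`, and the parsed object can be reused for further lookups without parsing the string again.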
CSV list operations
- Convert a validly formatted CSV list to a list (new datatype) of strings.
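As an illustration, Python's standard `csv` module already handles the quoting rules, so a sketch of this conversion is short. The function name is hypothetical, and treating the input as a single CSV record with comma delimiters and double-quote quoting is an assumption about what "validly formatted" means.

```python
import csv
import io


def csv_to_list(record: str) -> list:
    """Split one CSV record into its fields, honouring quoted fields."""
    reader = csv.reader(io.StringIO(record, newline=""))
    # An empty input has no record at all, so fall back to an empty list.
    return next(reader, [])
```

For example, `csv_to_list('a,"b,c",d')` gives `["a", "b,c", "d"]`: the comma inside the quoted field is kept as data rather than treated as a delimiter.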
External function lists
Wikititle function
See discussion at en:User talk:RMCD bot#Circular RM notice posted at talk page. Wbm1058 (talk) 02:11, 17 August 2023 (UTC)