Wikifunctions:Suggest a function
Do you have an idea for a new function? Suggest it here! It may help to refer to our glossary.
You can create a function right away if you have the necessary user rights and it aligns with other work.
Note that for now we only support strings and booleans as input and output types of functions. More types are coming in the next few months.
Once created, consider adding new Functions to the catalogue.
Proposed functions requiring only String and Boolean types
String manipulation functions
- discard from start of first substring (Z11410)
- discard from end of first substring (Z11412)
- discard until start of first substring (Z11418)
- discard until end of first substring (Z11420)
- discard from start of last substring (Z11414)
- discard from end of last substring (Z11416)
- discard until start of last substring (Z11422)
- discard until end of last substring (Z11424)
Done replace at end (Z11178)
Done remove at end (Z11170)
String character discard functions
- remove all punctuation
Done remove interpunction (Z11193)
- remove all emoticons/emoji - remove emoticons/emoji (Z11553)
Done remove isotopic specificity in SMILES (Z11811) remove isotopic specificity in SMILES string
- remove stereochemical specificity in SMILES string
- simplify SMILES string according to some basic simplifications
Done condense to numeric charges in SMILES string (Z11884): e.g. [Ti++++] to [Ti+4]
- expand numeric charges in SMILES string (e.g. [Ti+4] to [Ti++++])
- remove charges in SMILES string
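The two charge rewrites listed above are mirror images of each other. A minimal Python sketch follows, assuming charges only need rewriting inside bracket atoms such as [Ti++++]; the function names are hypothetical and this is not the implementation behind Z11884.

```python
import re

# A bracket atom is anything between "[" and "]" (no nesting in SMILES).
_BRACKET_ATOM = re.compile(r"\[[^\[\]]*\]")


def condense_charges(smiles: str) -> str:
    """Rewrite runs of two or more '+' or '-' as sign plus count."""
    def in_atom(atom: re.Match) -> str:
        return re.sub(
            r"\+{2,}|-{2,}",
            lambda run: f"{run.group(0)[0]}{len(run.group(0))}",
            atom.group(0),
        )
    return _BRACKET_ATOM.sub(in_atom, smiles)


def expand_charges(smiles: str) -> str:
    """Rewrite a sign followed by a count as a run of repeated signs."""
    def in_atom(atom: re.Match) -> str:
        return re.sub(
            r"([+-])(\d+)",
            lambda m: m.group(1) * int(m.group(2)),
            atom.group(0),
        )
    return _BRACKET_ATOM.sub(in_atom, smiles)


print(condense_charges("[Ti++++]"))  # [Ti+4]
print(expand_charges("[Ti+4]"))      # [Ti++++]
```

Restricting the rewrite to bracket atoms keeps single-bond "-" characters elsewhere in the string untouched.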
String character replacement functions
Done pretty " (Z11484) replace all U+0022 quotation marks with pretty quotation marks (usually U+201x, but to be specified as arguments; it's language-specific)
Done pretty ' (Z11490) replace all U+0027 apostrophe marks with pretty quotation marks
String search functions
String escaping and unescaping functions
String encoding and decoding functions
- Backslash-U with delimiters ASCII encoding of Unicode encode
- Backslash-U with delimiters ASCII encoding of Unicode decode
- XML and HTML ASCII encoding of Unicode encode
- XML and HTML ASCII encoding of Unicode decode
- HTML named character encode - HTML named character escape (Z10987)
Done HTML named character decode - HTML named character unescape (Z10938)
- Punycode encode - Punycode encode (Z10178) (part only, not whole url); see also IDNA encode (Z10185)
- Punycode decode - Punycode decode (Z10181) (part only, not whole url); see also IDNA decode (Z10188)
- Unified English Braille encode (discarding invalid characters?)
- Unified English Braille decode
- ASCII Braille encode (discarding invalid characters?)
- ASCII Braille decode
Done NATO phonetic alphabet code word decode - Decode NATO phonetic alphabet code (Z10970) ("ALFA BRAVO CHARLIE NINE" ⇒ "ABC9")
Done NATO phonetic alphabet ICAO pronunciations encode - NATO phonetic alphabet ICAO pronunciations encode (Z11642) ("ABC9" ⇒ "AL FAH. BRAH VOH. CHAR LEE. NIN-er.") (discarding invalid characters?)
- NATO phonetic alphabet ICAO pronunciations decode - NATO phonetic alphabet ICAO pronunciations decode (Z11668) ("AL FAH. BRAH VOH. CHAR LEE. NIN-er." ⇒ "ABC9")
- NATO phonetic alphabet ICAO IPA transcription encode - NATO phonetic alphabet ICAO IPA transcription encode (Z11670) ("ABC9" ⇒ "ˈælfa ˈbraːˈvo ˈtʃɑːli ˈnaɪnə") (discarding invalid characters?)
- NATO phonetic alphabet ICAO IPA transcription decode - NATO phonetic alphabet ICAO IPA transcription decode (Z11672) ("ˈælfa ˈbraːˈvo ˈtʃɑːli ˈnaɪnə" ⇒ "ABC9")
- NATO phonetic alphabet DIN 5009 IPA transcription encode - NATO phonetic alphabet DIN 5009 IPA transcription encode (Z11674) ("ABC9" ⇒ "ˈalfa ˈbravo ˈtʃali ˈnaɪnə") (discarding invalid characters?)
- NATO phonetic alphabet DIN 5009 IPA transcription decode - NATO phonetic alphabet DIN 5009 IPA transcription decode (Z11676) ("ˈalfa ˈbravo ˈtʃali ˈnaɪnə" ⇒ "ABC9")
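As an illustration of the code word decoding in the list above ("ALFA BRAVO CHARLIE NINE" ⇒ "ABC9"), here is a hedged Python sketch. The function name is hypothetical, the plain English digit words are assumed, and raising an error on an unknown word is only one possible answer to the "discarding invalid characters?" question.

```python
LETTER_WORDS = [
    "ALFA", "BRAVO", "CHARLIE", "DELTA", "ECHO", "FOXTROT", "GOLF",
    "HOTEL", "INDIA", "JULIETT", "KILO", "LIMA", "MIKE", "NOVEMBER",
    "OSCAR", "PAPA", "QUEBEC", "ROMEO", "SIERRA", "TANGO", "UNIFORM",
    "VICTOR", "WHISKEY", "X-RAY", "XRAY", "YANKEE", "ZULU",
]
DIGIT_WORDS = [
    "ZERO", "ONE", "TWO", "THREE", "FOUR",
    "FIVE", "SIX", "SEVEN", "EIGHT", "NINE",
]

# Every letter code word starts with the letter it stands for.
DECODE = {word: word[0] for word in LETTER_WORDS}
DECODE.update({word: str(i) for i, word in enumerate(DIGIT_WORDS)})


def decode_nato(text: str) -> str:
    """Decode space-separated code words; unknown words raise ValueError."""
    result = []
    for word in text.upper().split():
        if word not in DECODE:
            raise ValueError(f"not a code word: {word!r}")
        result.append(DECODE[word])
    return "".join(result)


print(decode_nato("ALFA BRAVO CHARLIE NINE"))  # ABC9
```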
String presentation functions
- add locale-specific quotation marks to string
- format a large integer with "," between every three digits. e.g. "1000000" -> "1,000,000"
- Shouldn't the output depend on the locale? See mw.language:formatNum. —Dexxor (talk) 17:15, 4 September 2023 (UTC)
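A small Python sketch of the digit grouping proposed above, with the separator passed as an argument so that the locale concern can at least be expressed by the caller. The function name and the separator parameter are assumptions, and grouping by three is not correct for every locale.

```python
def group_digits(digits: str, separator: str = ",") -> str:
    """Insert a separator between every three digits, counted from the right."""
    sign = "-" if digits.startswith("-") else ""
    body = digits[1:] if sign else digits
    if not (body.isascii() and body.isdigit()):
        raise ValueError(f"not an integer string: {digits!r}")
    groups = []
    while body:
        groups.append(body[-3:])  # take the last three digits
        body = body[:-3]
    return sign + separator.join(reversed(groups))


print(group_digits("1000000"))       # 1,000,000
print(group_digits("1000000", "."))  # 1.000.000
```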
String colour notation functions
- complementary colour in RGB colour model ("#FF0000" ⇒ "#00FFFF")
- Great question. I don't think there is a position documented on Wikifunctions for how to handle invalid input to a function. Can we throw exceptions? Return null? Dhx1 (talk) 13:23, 6 August 2023 (UTC)
- This shouldn't be a string function. This should be a type that represents a RGB color (with corresponding validation function (hopefully it can just be three unsigned 8bit integers)) and a function that returns the complementary color. 0xDeadbeef (talk) 12:38, 7 August 2023 (UTC)
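For the string-based variant of the complementary colour proposal, a minimal Python sketch might look as follows. The function name is hypothetical, and raising an exception on invalid input is just one of the options raised in the discussion, not a documented Wikifunctions convention.

```python
import re

_HEX_COLOUR = re.compile(r"#[0-9A-Fa-f]{6}")


def complementary_colour(colour: str) -> str:
    """Return the RGB complement of a '#RRGGBB' string."""
    if not _HEX_COLOUR.fullmatch(colour):
        raise ValueError(f"not a #RRGGBB colour: {colour!r}")
    value = int(colour[1:], 16)
    # Subtracting from white inverts each 8-bit channel at once.
    return f"#{0xFFFFFF - value:06X}"


print(complementary_colour("#FF0000"))  # #00FFFF
```

With a dedicated colour type, as suggested in the comment above, the validation step would move into the type and the function body would shrink to the subtraction.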
String notation validation checks
- check if string is a nucleic acid notation - is DNA nucleic acid notation (Z11342)
- check if string is a simplified molecular-input line-entry system (SMILES) notation - is SMILES notation (Z11208)
- check if string is an en:International_Chemical_Identifier
- check if string is a SMILES arbitrary target specification (SMARTS) notation
- check if string is an ABC notation
- check if string is a LilyPond notation
- check if string is a portable game notation
Done check if string is a Whyte notation - is Whyte notation (Z10524)
- check if string is a UIC classification of locomotive axle arrangements notation
- check if string is an AAR wheel arrangement notation
- check if string is an IPv4 address - is IPv4 (Z10476)
- check if string is an IPv6 address - is IPv6 (Z10786)
Done check if a string is a valid ISBN-10 - is ISBN-10 (Z11705)
- check if a string is a valid ISBN-13 (probably just a simple variant of is EAN (Z10821), dropping/validating the hyphens)
Done check if a string is a valid ISSN - is ISSN (Z10765)
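The ISBN-13 check mentioned above can be sketched in Python as an EAN-style checksum after dropping the hyphens. The function name is hypothetical, and requiring a 978 or 979 prefix is an assumption about what "valid" should mean here; hyphen positions are not validated.

```python
def is_isbn13(text: str) -> bool:
    """Check length, prefix and the alternating 1/3 weighted checksum."""
    digits = text.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    if digits[:3] not in ("978", "979"):
        return False
    # Weights alternate 1, 3, 1, 3, ... from the left.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0


print(is_isbn13("978-0-306-40615-7"))  # True
print(is_isbn13("978-0-306-40615-8"))  # False
```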
String validation checks
Done check if string A is an anagram of string B - is anagram (simple) (Z10973)
- Discussion moved to Talk:Z10423. --99of9 (talk) 01:46, 29 September 2023 (UTC)
Done is heterogram (Z11573) check if string is a heterogram
- is tautogram (Z11577) check if string is a tautogram
Done string only has characters from alphabet (Z11693): check if all of the characters in the tested string are from the alphabet string - check if string A includes all letters of string B ("Nadine" & "and" ⇒ true)
- check if string is in lower camel case
- check if string is a valid ISO 639-1 language code
- check if string is a valid ISO 639-2 language code
- check if string is a valid ISO 3166 country code
- check if string is a valid ISO 8601 date/time (2023-08-03 ⇒ true; 2023-02-30 ⇒ false; 2023-08-03 15:00:00.000 ⇒ true; 2023-08-03 25:00:00.000 ⇒ false)
- check if string is a valid EDTF date/time
Doing... check if string is a valid email address (watch out: email addresses are more complicated than they seem; see this list of falsehoods about email addresses when creating unit tests) - is valid email address (Z10410), creating test cases in progress. Currently it is stuck on figuring out what exactly a valid email address is. Nearly every erratum for RFC:3696 is about that.
Doing... check if string is a valid Wikidata item — Is valid Qid (Z10696) (possibly stuck on phab:T343593?)
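The four date/time examples given for the ISO 8601 check above can be reproduced with a short Python sketch. It leans on the standard library's datetime.fromisoformat, which accepts only a subset of ISO 8601 (and a space instead of "T"), so it is an approximation rather than a full validator; the function name is hypothetical.

```python
from datetime import datetime


def is_valid_iso_datetime(text: str) -> bool:
    """Return True if the standard library can parse the string as a date/time."""
    try:
        datetime.fromisoformat(text)
    except ValueError:
        # Raised both for malformed strings and for impossible dates or times.
        return False
    return True


for sample in ("2023-08-03", "2023-02-30",
               "2023-08-03 15:00:00.000", "2023-08-03 25:00:00.000"):
    print(sample, is_valid_iso_datetime(sample))
```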
Wikitext string operations
Done italicise in Wikitext (Z11019) italicise (ABC -> ''ABC''), careful if the input has wikitext special characters in it
Done bold in Wikitext (Z11139) bold (ABC -> '''ABC'''), careful if the input has wikitext special characters in it
- Escape all special characters in a string so that they don't become functional when in Wikitext. (Is this possible? Or would it always be better to wrap with <nowiki></nowiki>?)
CSV list operations
Morphological functions
German
- tense * person * number for each verb
- tenses: present, past, ...?
- person: first, second, third
- number: singular, plural
Done regular German First person singular present verb (Z11256)
Done regular German Second person singular present verb (Z11264)
Doing... third person singular present
Done regular German First person plural present verb (Z11268)
Doing... regular German verb in the second person plural present (Z11272)
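The tense * person * number grid above can be illustrated for the present tense of regular verbs with a hedged Python sketch. The function name and the (person, number) keys are hypothetical; only infinitives in -en are handled, and irregular verbs or stems such as "atm-" are out of scope.

```python
PRESENT_ENDINGS = {
    (1, "singular"): "e",
    (2, "singular"): "st",
    (3, "singular"): "t",
    (1, "plural"): "en",
    (2, "plural"): "t",
    (3, "plural"): "en",
}


def conjugate_present(infinitive: str, person: int, number: str) -> str:
    """Conjugate a regular German -en verb in the present tense."""
    if not infinitive.endswith("en"):
        raise ValueError("this sketch only handles infinitives ending in -en")
    stem = infinitive[:-2]
    ending = PRESENT_ENDINGS[(person, number)]
    if stem.endswith(("d", "t")) and ending in ("st", "t"):
        ending = "e" + ending  # arbeiten -> du arbeitest, er arbeitet
    elif stem.endswith(("s", "ß", "z", "x")) and ending == "st":
        ending = "t"           # tanzen -> du tanzt
    return stem + ending


print(conjugate_present("machen", 1, "singular"))  # mache
print(conjugate_present("machen", 2, "plural"))    # macht
```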
English
- English verb to agent noun (Z11390) Verb -> agent noun, e.g. "dance"->"dancer"
- English nominative to accusative (Z11651): converts a nominative (subject) pronoun to the accusative (objective) case
French
- French masculine adjective to feminine (Z11590) Masculine adjective -> feminine, e.g. "exact"->"exacte"
Proposed functions requiring future types
Note these functions cannot be implemented yet as Wikifunctions currently only supports Boolean and String types for function definitions.
If one wishes to nevertheless attempt to define and implement them,
- the functions and implementations should indicate prominently in their labels that their input/output types must be adjusted once support for the appropriate replacement types becomes available; and
- the functions should not be used in the implementations of any other functions, as the later adjustment of input/output types to appropriate replacements will break those implementations.
String manipulation functions
- right (returns a substring from the end of a specified string up to a number of characters)
- left (returns a substring from the beginning of a specified string up to a number of characters)
- duplicate string n times (e.g. dup("a",5) -> "aaaaa")
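The three proposals above need an integer argument, which is why they sit in this section. A minimal Python sketch, with hypothetical function names and the assumption that a count larger than the string simply returns the whole string:

```python
def left(text: str, count: int) -> str:
    """Return up to `count` characters from the beginning of the string."""
    if count < 0:
        raise ValueError("count must not be negative")
    return text[:count]


def right(text: str, count: int) -> str:
    """Return up to `count` characters from the end of the string."""
    if count < 0:
        raise ValueError("count must not be negative")
    # Avoid text[-count:], which returns the whole string when count is 0.
    return text[len(text) - min(count, len(text)):]


def duplicate(text: str, times: int) -> str:
    """Repeat the string `times` times."""
    return text * times


print(left("Hello", 2), right("Hello", 2), duplicate("a", 5))  # He lo aaaaa
```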
String analysis functions
- string length (Hello -> 5)
- count distance between two letters in given alphabet (defaults to the 26-character western alphabet; case insensitive; e.g. "a" & "A" ⇒ 0; "K" & "N" ⇒ 3)
Done Hamming distance between two strings of equal length, e.g. "Wikipedia" & "Wikimedia" ⇒ 1. - (!) hamming distance between two strings (Z11328)
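The two distance proposals above both return a number, which is what keeps them in this section. A short Python sketch using the examples from the list; the function names and the default alphabet argument are assumptions.

```python
def hamming_distance(first: str, second: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(first) != len(second):
        raise ValueError("strings must have equal length")
    return sum(a != b for a, b in zip(first, second))


def alphabet_distance(first: str, second: str,
                      alphabet: str = "abcdefghijklmnopqrstuvwxyz") -> int:
    """Distance between two letters in an alphabet, ignoring case."""
    return abs(alphabet.index(first.lower()) - alphabet.index(second.lower()))


print(hamming_distance("Wikipedia", "Wikimedia"))  # 1
print(alphabet_distance("K", "N"))                 # 3
```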
String encoding and decoding functions
(would be better with types representing a stream of bytes)
Done BASE16 encode - Base16 Encode (Z11003)
Done BASE16 decode - Base16 Decode (Z11007)
- BASE32 encode
- BASE32 decode
- BASE45 encode
- BASE45 decode
Done BASE64 encode - Base64 Encode (Z10057)
Done BASE64 decode - Base64 Decode (Z10062)
- Hexadecimal UTF-8 encode ("ABC ₤" ⇒ "41 42 43 20 E2 82 A4")
- Hexadecimal UTF-8 decode ("41 42 43 20 E2 82 A4" ⇒ "ABC ₤")
- Decimal UTF-8 encode ("ABC ₤" ⇒ "65 66 67 32 226 130 164")
- Decimal UTF-8 decode ("65 66 67 32 226 130 164" ⇒ "ABC ₤")
- Octal UTF-8 encode ("ABC ₤" ⇒ "101 102 103 40 342 202 244")
- Octal UTF-8 decode ("101 102 103 40 342 202 244" ⇒ "ABC ₤")
- Binary UTF-8 encode ("ABC ₤" ⇒ "01000001 01000010 01000011 00100000 11100010 10000010 10100100")
- Binary UTF-8 decode ("01000001 01000010 01000011 00100000 11100010 10000010 10100100" ⇒ "ABC ₤")
- Unicode code point encode ("ABC ₤" ⇒ "41 42 43 20 20A4") - Unicode code point encode hex (Z10785)
- Unicode code point decode ("41 42 43 20 20A4" ⇒ "ABC ₤")
Done chr of codepoint string (Z11534) Unicode code point decimal decode - single character ("65" ⇒ "A")
- Create regular expression object/string (e.g. "test" & "i" ⇒ /test/i)
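The UTF-8 and code point proposals above differ only in whether bytes or code points are written out. A Python sketch of the hexadecimal variants, reproducing the "ABC ₤" examples; the function names are hypothetical.

```python
def hex_utf8_encode(text: str) -> str:
    """Write each UTF-8 byte as two upper-case hex digits."""
    return " ".join(f"{byte:02X}" for byte in text.encode("utf-8"))


def hex_utf8_decode(hex_text: str) -> str:
    """Inverse of hex_utf8_encode; invalid UTF-8 raises UnicodeDecodeError."""
    return bytes(int(part, 16) for part in hex_text.split()).decode("utf-8")


def code_point_encode(text: str) -> str:
    """Write each Unicode code point in hex, one per character."""
    return " ".join(f"{ord(char):X}" for char in text)


def code_point_decode(hex_text: str) -> str:
    """Inverse of code_point_encode."""
    return "".join(chr(int(part, 16)) for part in hex_text.split())


print(hex_utf8_encode("ABC ₤"))    # 41 42 43 20 E2 82 A4
print(code_point_encode("ABC ₤"))  # 41 42 43 20 20A4
```

The decimal, octal and binary variants would follow the same pattern with a different format specifier and base.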
String validation checks
(alphabet needs to be specified when calling these functions)
- check if string is a pangram for a specified alphabet
- Could you just specify the "alphabet" in a second string "abcdefghijklmnopqrstuvwxyz" ?
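Following the suggestion above to pass the alphabet as a second string, a pangram check could be sketched in Python like this. The function name is hypothetical and case folding via lower() is an assumption that will not suit every script.

```python
def is_pangram(text: str, alphabet: str) -> bool:
    """True if every character of the alphabet occurs in the text."""
    return set(alphabet.lower()) <= set(text.lower())


LATIN = "abcdefghijklmnopqrstuvwxyz"
print(is_pangram("The quick brown fox jumps over the lazy dog", LATIN))  # True
print(is_pangram("Hello", LATIN))                                        # False
```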
Natural language functions
- Choose singular or plural based on number (e.g. singularOrPlural("person",6) -> "people")
- Note that there are also dual and other grammatical numbers in other languages. 魔琴 (talk) 18:54, 26 October 2023 (UTC)
Cryptographic hash functions
(would be better with types representing a stream of bytes)
To do MD2 - MD2 (Z10135)
To do MD4 - MD4 (Z10136)
To do MD5 - MD5 (Z10137)
To do RIPEMD-128 - RIPEMD-128 (Z10138)
To do RIPEMD-160 - RIPEMD-160 (Z10139)
To do BLAKE2b-160 - BLAKE2b-160 (Z10140)
To do BLAKE2b-256 - BLAKE2b-256 (Z10141)
To do BLAKE2b-384 - BLAKE2b-384 (Z10142)
To do BLAKE2b-512 - BLAKE2b-512 (Z10143)
To do BLAKE2s-128 - BLAKE2s-128 (Z10144)
To do BLAKE2s-160 - BLAKE2s-160 (Z10145)
To do BLAKE2s-224 - BLAKE2s-224 (Z10146)
To do BLAKE2s-256 - BLAKE2s-256 (Z10147)
To do SHA-224 - SHA-224 (Z10149)
To do HMAC-SHA-256
To do SHAKE-128 - SHAKE-128 (Z10150)
To do SHAKE-256 - SHAKE-256 (Z10151)
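The remark about byte streams can be made concrete with Python's standard hashlib and hmac modules: a string has to be turned into bytes before it can be hashed, so the result depends on the chosen encoding. The wrapper names below are hypothetical and UTF-8 is an assumption.

```python
import hashlib
import hmac


def sha224_hex(text: str) -> str:
    """SHA-224 of the UTF-8 bytes of a string, as lower-case hex."""
    return hashlib.sha224(text.encode("utf-8")).hexdigest()


def hmac_sha256_hex(key: str, message: str) -> str:
    """HMAC-SHA-256 with both key and message encoded as UTF-8."""
    return hmac.new(key.encode("utf-8"), message.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def shake128_hex(text: str, length_bytes: int = 32) -> str:
    """SHAKE-128 needs an explicit output length, unlike fixed-size hashes."""
    return hashlib.shake_128(text.encode("utf-8")).hexdigest(length_bytes)


print(len(sha224_hex("abc")))              # 56 hex digits = 224 bits
print(len(hmac_sha256_hex("key", "msg")))  # 64 hex digits = 256 bits
```

The SHAKE functions illustrate why a second, numeric argument would be needed once integer types are available.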
Colour functions
- return colour contrast ratio (per [1]) of two RGB colours (provided as strings e.g. "#FF0000")
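A hedged Python sketch of a contrast ratio between two "#RRGGBB" strings. It assumes the WCAG 2 definition of relative luminance and contrast ratio; the reference cited above may define it differently, and the function names are hypothetical.

```python
def _linear(channel_8bit: int) -> float:
    """Convert an 8-bit sRGB channel to its linear value (WCAG 2 formula)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(colour: str) -> float:
    """Relative luminance of a '#RRGGBB' colour, between 0 and 1."""
    value = colour.lstrip("#")
    red, green, blue = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(red) + 0.7152 * _linear(green) + 0.0722 * _linear(blue)


def contrast_ratio(first: str, second: str) -> float:
    """Ratio from 1 (identical colours) up to 21 (black against white)."""
    a, b = relative_luminance(first), relative_luminance(second)
    lighter, darker = max(a, b), min(a, b)
    return (lighter + 0.05) / (darker + 0.05)


print(round(contrast_ratio("#FFFFFF", "#000000"), 2))  # 21.0
```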
Date functions
- check if year (for a specific calendar) is a leap year:
Done is leap year (Gregorian calendar) (Z10996) - Gregorian calendar (2020 ⇒ true; 2023 ⇒ false; 2100 ⇒ false)
Done is leap year (Julian calendar) (Z11015) - Julian calendar (2020 ⇒ true; 2023 ⇒ false; 2100 ⇒ true)
Done is leap year (Jalali calendar) (Z11011) - Jalali (or Persian) calendar (1399 ⇒ true; 1400 ⇒ false)
- Lunar Hijri (or Islamic) calendar
- Indian national calendar
- Bengali calendar
- Chinese calendar
- Thai calendar
- etc.
- return weekday (2023-08-03 ⇒ Thursday (or should this return Q129, for Thursday (Q129)?))
- date to weekday number (0-6)
- advance n days (2023-08-03 & "69" ⇒ 2023-10-11)
- go back n days
- string to date
- date to ISO 8601 string
- date to year (yyyy)
- date to month of the year (1-12)
- date to month name (January-December)
- date to day of the month (1-31)
- date to hour of the day (0-23)
- date to minutes (0-59)
- date to seconds (0-59)
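Several of the date proposals above can be illustrated with Python's standard datetime module, using the examples from the list. Dates are passed as ISO 8601 strings here only because no date type exists yet; the function names are hypothetical, and the weekday numbering (Monday = 0) is Python's convention, not something the proposal specifies.

```python
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def is_leap_year_gregorian(year: int) -> bool:
    """Every 4th year, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def is_leap_year_julian(year: int) -> bool:
    """Every 4th year, without the century exception."""
    return year % 4 == 0


def weekday_name(iso_date: str) -> str:
    """English weekday name of a proleptic Gregorian date."""
    return WEEKDAYS[date.fromisoformat(iso_date).weekday()]


def advance_days(iso_date: str, days: int) -> str:
    """Move forward (or back, for negative values) by a number of days."""
    return (date.fromisoformat(iso_date) + timedelta(days=days)).isoformat()


print(weekday_name("2023-08-03"))      # Thursday
print(advance_days("2023-08-03", 69))  # 2023-10-11
```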
Basic list/iterable functions
- map, return a list of the elements resulting from applying a given function to each item
- filter, return elements meeting criteria given by a function
- take, return the first n items of a list
- skip, return a new list without the first n items of the given list
- reverse, return a reversed list of a given list
- contains, return true if a list has a specific item
- every/all, return true if all items of a list meet criteria given by a function, otherwise false
- any, return true if at least one of the items of a list meets criteria given by a function, otherwise false
- slice, a combination of take/skip
- length, size of a list
- fold/reduce
- group
- flat
- sort, by a given function
- apply a function to every element of a list, and get a list of answers
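A few of the list operations above, sketched in Python to pin down one possible reading of each. The names and argument orders are hypothetical, and n is assumed to be non-negative.

```python
from functools import reduce


def take(items: list, n: int) -> list:
    """The first n items."""
    return items[:n]


def skip(items: list, n: int) -> list:
    """Everything except the first n items."""
    return items[n:]


def slice_list(items: list, start: int, count: int) -> list:
    """A combination of skip and take."""
    return take(skip(items, start), count)


def every(items: list, predicate) -> bool:
    """True if all items meet the predicate (True for an empty list)."""
    return all(predicate(item) for item in items)


def any_item(items: list, predicate) -> bool:
    """True if at least one item meets the predicate."""
    return any(predicate(item) for item in items)


def flat(nested: list) -> list:
    """Flatten one level of nesting."""
    return [item for sub in nested for item in sub]


numbers = [1, 2, 3, 4, 5]
print(slice_list(numbers, 1, 3))                 # [2, 3, 4]
print(reduce(lambda acc, x: acc + x, numbers, 0))  # 15
```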
Basic numerical functions
- round up ("1.289" & "2" ⇒ "1.29"; "5678" & "2" ⇒ "5700")
- round down
- return integer value (5678.678 ⇒ 5678)
- decode Roman numerals ("X" ⇒ 10; "MMXXIII" ⇒ 2023)
Doing... : Roman to Arabic numeral (Z11023): Convert a Roman numeral to Arabic numeral
- encode as Roman numerals (10 ⇒ "X"; 2023 ⇒ "MMXXIII")
Doing... : Arabic to Roman numeral (Z11022): Convert a positive integer [1, 4999] to roman numeral
- return cardinal (23 ⇒ "twenty-three")
- return ordinal (23 ⇒ "twenty-third")
- Body Mass Index (80 kg and 1.80 m ⇒ 24.7)
- Convert money from US$ to anything else
- Kronecker delta
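The Roman numeral conversions and the Kronecker delta above can be sketched in Python as follows. The range 1 to 4999 follows the description of Z11022; the function names are hypothetical, and rejecting non-canonical numerals such as "IIII" by a round trip is an assumption rather than the behaviour of the existing functions.

```python
_PAIRS = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
          (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
          (5, "V"), (4, "IV"), (1, "I")]
_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def to_roman(number: int) -> str:
    """Encode an integer from 1 to 4999 as a Roman numeral."""
    if not 1 <= number <= 4999:
        raise ValueError("number must be between 1 and 4999")
    parts = []
    for value, symbol in _PAIRS:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


def from_roman(text: str) -> int:
    """Decode a canonical Roman numeral; anything else raises ValueError."""
    numeral = text.upper()
    if not numeral or any(char not in _VALUES for char in numeral):
        raise ValueError(f"not a Roman numeral: {text!r}")
    total = 0
    for char, following in zip(numeral, numeral[1:] + " "):
        value = _VALUES[char]
        # A smaller symbol before a larger one is subtracted (IV, IX, ...).
        total += -value if _VALUES.get(following, 0) > value else value
    if not 1 <= total <= 4999 or to_roman(total) != numeral:
        raise ValueError(f"not a canonical Roman numeral: {text!r}")
    return total


def kronecker_delta(i: int, j: int) -> int:
    """1 if the two indices are equal, otherwise 0."""
    return 1 if i == j else 0


print(to_roman(2023), from_roman("MMXXIII"))  # MMXXIII 2023
```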
Data serialization functions
- parse a string as JSON
- extract string from JSON object based on JSONPath ({"name":"Alice"}, "$.name" ⇒ "Alice")
- Why not first convert a JSON string to an object, and then have a function that extracts fields based on JSONPath? Doing Stringly-typed things like this proposal as defined isn't a good idea. 0xDeadbeef (talk) 16:16, 5 August 2023 (UTC)
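The two-step approach suggested in the comment above (parse first, then extract) can be sketched in Python with the standard json module. The function names are hypothetical, and the path handling covers only a tiny subset of JSONPath: a leading "$" followed by dotted member names.

```python
import json


def parse_json(text: str):
    """Step 1: turn a JSON string into Python objects."""
    return json.loads(text)


def extract_path(value, path: str):
    """Step 2: follow a path such as '$.name' or '$.a.b' through nested objects."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    for key in path[1:].split("."):
        if key == "":
            continue  # the empty piece before the first dot, or a bare '$'
        value = value[key]
    return value


document = parse_json('{"name":"Alice"}')
print(extract_path(document, "$.name"))  # Alice
```

Keeping the two steps separate means the extraction function never sees a raw string, which is the point of the objection to the stringly-typed version.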
CSV list operations
- Convert a validly formatted CSV list to a list (new datatype) of strings.
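A minimal Python sketch of the CSV conversion above, using the standard csv module so that quoted fields containing commas are handled. The function name is hypothetical, and treating the input as a single CSV record with the default comma dialect is an assumption.

```python
import csv
import io


def csv_to_list(text: str) -> list:
    """Split one CSV record into a list of strings."""
    rows = list(csv.reader(io.StringIO(text)))
    if len(rows) > 1:
        raise ValueError("expected a single CSV record")
    return rows[0] if rows else []


print(csv_to_list('a,"b,c",d'))  # ['a', 'b,c', 'd']
```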
External function lists
Wikititle function
See discussion at en:User talk:RMCD bot#Circular RM notice posted at talk page. Wbm1058 (talk) 02:11, 17 August 2023 (UTC)