Wikifunctions:Suggest a function
Do you have an idea for a new function? Suggest it here! It may help to refer to our glossary.
You can create a function right away if you have the user-rights.
If a function requires a new type, consider proposing it.
Note that for now we only support a limited number of types as input and output types of functions. More types are coming in the next few months. For the full list, see WF:Type.
Once created, consider adding new Functions to the catalogue.
Proposed functions requiring only available types (string, Boolean, Natural number, list)
String
String character discard functions
- remove stereochemical specificity in SMILES string, like e/z isomers
- already fulfilled by someone else at: remove stereochemical specificity at tetrahedral sites in SMILES string (Z11815)
- simplify SMILES string according to some basic simplifications
Partly done, see simplify SMILES string (Z19380). There are test cases, and I (or someone else) can get around to the coding later. MolecularPilot (talk) 10:21, 26 October 2024 (UTC)
Done completely, still at simplify SMILES string (Z19380). Another user helpfully wrote a Python script that passed 1 of my test cases between October and now. I just rewrote the script to pass all 3 test cases, and also created a JavaScript version. MolecularPilot (talk) 03:43, 10 January 2025 (UTC)
String character replacement functions
String search functions
String escaping and unescaping functions
String encoding and decoding functions
- Unicode normalising functions (there are several types of normalisation)
- Backslash-U with delimiters ASCII encoding of Unicode encode
- Can someone elaborate on this? No example cases were given on the document, and backslash-U with delimiters is anyway not that prevalent as far as I have seen. BrightSunMan (talk) 15:24, 26 December 2023 (UTC)
Done, see \u with delimiters ASCII encoding of Unicode (Z21486). I've made 5 test cases (achieving 100% coverage) and implementations in both JavaScript and Python, which pass all test cases. It supports both the Basic Multilingual Plane (BMP) and supplementary characters (using surrogate pairs). MolecularPilot (talk) 02:49, 10 January 2025 (UTC)
- XML and HTML ASCII encoding of Unicode encode
Done, see HTML/XML ASCII encoding of unicode string (Z21503). Again, I've also made 5 test cases which cover a wide variety of Unicode characters, and implementations in JS and Python (which pass all the tests). As before successful support for both the Basic Multilingual Plane (BMP) and supplementary characters (this time not using surrogate pairs, as is customary for HTML/XML encoding). MolecularPilot (talk) 05:30, 10 January 2025 (UTC)
- HTML named character encode
- Punycode encode - Punycode encode (Z10178) (part only, not whole url); see also IDNA encode (Z10185)
- Unified English Braille encode (discarding invalid characters?)
Done, see Unified English Braille encode (Z21514). 6 test cases this time, and support for both letters and numbers, with implementations in JS and Python (both passing all the tests). MolecularPilot (talk) 06:00, 10 January 2025 (UTC)
String presentation functions
- add locale-specific quotation marks to string
- Shouldn't the output depend on the locale? See mw.language:formatNum. —Dexxor (talk) 17:15, 4 September 2023 (UTC)
String colour notation functions
- complementary colour in RGB colour model ("#FF0000" ⇒ "#00FFFF")
- Any specification on invalid inputs? MilkyDefer 11:22, 5 August 2023 (UTC)
- Great question. I don't think there is a position documented on Wikifunctions for how to handle invalid input to a function. Can we throw exceptions? Return null? Dhx1 (talk) 13:23, 6 August 2023 (UTC)
- This shouldn't be a string function. This should be a type that represents a RGB color (with corresponding validation function (hopefully it can just be three unsigned 8bit integers)) and a function that returns the complementary color. 0xDeadbeef (talk) 12:38, 7 August 2023 (UTC)
- Work on the color type has been stalled for over a year. But this task is done: I have made Complementary color (Z21554), which uses string hex codes (with or without the initial # and supporting the short hex format). This is probably the best format, as I can imagine this function being used on-wiki for things like the style parameter (CSS) of MediaWiki tags, or of templates etc. There are 5 test cases I've made, which are passed by both my JS and Python versions. :) MolecularPilot (talk) 03:39, 11 January 2025 (UTC)
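For readers following the thread above, here is a minimal Python sketch of a string-based complement. It assumes the complement is plain channel inversion (255 minus each channel), which matches the "#FF0000" ⇒ "#00FFFF" example; whether Complementary color (Z21554) works exactly this way is not stated here. The name `complementary_colour` and the choice to raise an error on invalid input are illustrative assumptions, since the discussion leaves invalid-input handling open.

```python
import string


def complementary_colour(hex_code: str) -> str:
    """Return the RGB complement of a hex colour, e.g. "#FF0000" -> "#00FFFF"."""
    digits = hex_code[1:] if hex_code.startswith("#") else hex_code
    # Expand the short form ("F00") to the long form ("FF0000").
    if len(digits) == 3:
        digits = "".join(ch * 2 for ch in digits)
    if len(digits) != 6 or any(ch not in string.hexdigits for ch in digits):
        # Assumption: invalid input raises; the thread above leaves this open.
        raise ValueError(f"not a hex colour: {hex_code!r}")
    # Subtracting from white inverts all three 8-bit channels at once.
    return f"#{0xFFFFFF - int(digits, 16):06X}"
```

A typed RGB colour, as suggested in the thread, would move the validation step into the type itself and leave only the subtraction.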
String notation validation checks
- check if string is an en:International_Chemical_Identifier
Partly done, see Validate International Chemical Identifier (Z21539). Supports the verification of the chemical formula and the stereochemical layer. There are 13 test cases that I've written, all of which are passed by my JavaScript implementation. Note that a Python implementation is not possible as the regex module is not available in Wikifunctions. MolecularPilot (talk) 03:09, 11 January 2025 (UTC)
- To do:
- Needs to verify the hydrogen and connection sections of the main layer
- Support the charge layer
- Support the isotopic layer
- check if string is a SMILES arbitrary target specification (SMARTS) notation
- check if string is an ABC notation
- check if string is a LilyPond notation
Doing... check if string is a portable game notation for a chess game (is portable game notation (Z15867), figuring out how to add newlines to the test input)
- is Forsyth–Edwards Notation (Z14643) check if string is Forsyth–Edwards Notation for a chess position
- check if string is a UIC classification of locomotive axle arrangements notation
- check if a string is a valid ISBN-13 (probably just a simple variant of is EAN (Z10821), dropping/validating the hyphens)
- check if a string is a valid DOI
- Something about implementation difficulties: https://stackoverflow.com/questions/27910/finding-a-doi-in-a-document-or-page Alexander-Mart-Earth (talk) 14:28, 21 December 2023 (UTC)
- check if a string is a valid ISWN
Done, see ISWN validator (Z21562). Contains 6 test cases that I made, all of which are passed by my Python and JavaScript implementations. It supports both just numbers, and a string containing the "separator" symbols (like ., - and /). MolecularPilot (talk) 07:32, 11 January 2025 (UTC)
String validation checks
Doing... check if string is in lower camel case
- check if string is a valid ISO 3166 country code
- check if string is a valid ISO 8601 date/time (2023-08-03 ⇒ true; 2023-02-30 ⇒ false; 2023-08-03 15:00:00.000 ⇒ true; 2023-08-03 25:00:00.000 ⇒ false)
- check if string is a valid EDTF date/time
Doing... check if string is a valid email address (watch out, see this list of falsehoods about email addresses to create unit tests - email addresses are more complicated than they seem): is valid email address (Z10410), creating test cases in progress. Currently it is stuck on figuring out what exactly is a valid email address. Nearly every erratum for RFC:3696 is about that.
Doing... check if string is a valid Wikidata item — is it a valid Qid? (Z10696) (possibly stuck on phab:T343593?)
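The four ISO 8601 examples listed above can be reproduced with a short Python sketch. It leans on the standard library's `datetime.fromisoformat`, which accepts only a subset of ISO 8601 (the exact subset depends on the Python version) and also tolerates a space in place of the "T" separator, as the examples on this page do. The function name `is_valid_iso_datetime` is hypothetical.

```python
from datetime import datetime


def is_valid_iso_datetime(text: str) -> bool:
    """Return True if the standard library can parse text as a date or date-time."""
    try:
        # Rejects impossible calendar dates (30 February) and times (25:00).
        datetime.fromisoformat(text)
    except ValueError:
        return False
    return True
```

A full validator would also need to decide on week dates, ordinal dates, durations and intervals, none of which this sketch covers.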
String analysis functions
- Word frequency counting. Provide a list of words and their frequencies.
Done, see Word frequency (Z21593). Providing a list of words and frequencies would require a new type, so instead it requires the sentence and the word you want to count, and returns the occurrences. Hyphenated words are not considered a match of their components, i.e. "fast-forward" is a match of "fast-forward" but neither "fast" nor "forward". I think this is the optimal behaviour, but if someone disagrees we can change it. There is a JS implementation that I made which passes all of my 4 test cases. MolecularPilot (talk) 05:41, 12 January 2025 (UTC)
- @MolecularPilot And I have added a Python implementation at Word frequency counter, Python (Z22473). I would appreciate if you could attach it! ~/Bunnypranav:<ping> 08:13, 15 February 2025 (UTC)
- Cool, thank you so much for doing it! Someone's already beat me to it re attaching it, but great work! :) MolecularPilot (talk) 21:33, 16 February 2025 (UTC)
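The hyphenation rule described in this thread can be illustrated with a small Python sketch. It is not the code of Word frequency (Z21593); the tokenising regular expression, the case-insensitive comparison and the name `word_frequency` are all assumptions made for illustration.

```python
import re

# A token is a run of letters/digits, optionally joined by hyphens or apostrophes,
# so "fast-forward" stays one token rather than splitting into its parts.
_TOKEN = re.compile(r"[a-z0-9]+(?:[-'][a-z0-9]+)*")


def word_frequency(sentence: str, word: str) -> int:
    """Count whole-token, case-insensitive occurrences of word in sentence."""
    tokens = _TOKEN.findall(sentence.lower())
    return tokens.count(word.lower())
```

For example, in "Fast-forward the tape, fast!" the word "fast" is counted once and "forward" not at all.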
Monolingual text
String Wikitext operations
...
Natural number
- rectified linear unit (ReLU) - https://www.wikifunctions.org/view/en/Z13909
Integer
Done - multiply vectors (get cross product (Z21903), get dot product (Z20659))
Object
List
Basic list/iterable functions
- group
- w:Circular shift
Complex list functions
- zip lists together: for [ A .. Z ] and [ 1 .. 26 ] return [ [ A, 1 ], [ B, 2 ], .. ]
- Unsure what happens if input lists are of different lengths.
- If possible this function should be able to zip more than 2 lists together... 3, 4, n? Perhaps the input should be list(list, list, list, list, ..).
- remove elements common to second list (Z19198): return the first list with any elements common to the second list removed
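The zip proposal above raises two open questions: lists of different lengths, and more than two inputs. One possible answer to both, sketched in Python: take a list of lists and stop at the shortest one. Truncation is an assumption, not a decision recorded on this page, and `zip_lists` is a hypothetical name.

```python
def zip_lists(lists):
    """Zip any number of lists together, stopping at the shortest input."""
    # zip(*lists) pairs up the i-th elements of every input list.
    return [list(group) for group in zip(*lists)]
```

The alternative is to pad the shorter lists with a filler value (as `itertools.zip_longest` does), which would require agreeing on what the filler is.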
CSV list operations
- list of strings to csv
- number -> list of decimal digits
number -> list of binary digits
number -> list of digits in base provided Well very well (talk) 11:20, 18 May 2024 (UTC)
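The three number-to-digits proposals above are the same function with different bases. A minimal Python sketch, assuming natural-number input and most-significant-digit-first output (the page does not specify the order); `digits_in_base` is a hypothetical name.

```python
def digits_in_base(number: int, base: int = 10) -> list:
    """Return the digits of a natural number in the given base, most significant first."""
    if number < 0 or base < 2:
        raise ValueError("expected a natural number and a base of at least 2")
    if number == 0:
        return [0]
    digits = []
    while number:
        # divmod peels off the least significant digit each time round.
        number, remainder = divmod(number, base)
        digits.append(remainder)
    return digits[::-1]
```

Base 10 gives the decimal-digit variant and base 2 the binary-digit variant.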
Functions with functions as arguments
- sort, by a given function
- test whether certain functions have specific properties of homogeneous relations for particular lists/sets
- remove first element matching filter from list
Morphological functions
Morphology is the part of linguistics that studies how language parts are 'shaped' and how they change, both diachronically and when inflected. Hausa, Igbo, Malayalam, Bangla/Bengali and Dagbani are focus languages for Wikidata's lexicographic dataset, which is an important aspect of Abstract Wikipedia.
mul - Multiple languages
- inputs: natural number (new numeric type) and language Z-number; output: 'singular', 'dual', 'paucal', 'plural', etc. as string
ase - American Sign Language
bn - Bangla
cy - Welsh
w:en:colloquial Welsh morphology
dag - Dagbani
de - German
- tense * person * number for each verb
- tenses: present, past, ...?
- person: first, second, third
- number: singular, plural
Doing... third person singular present
- second person singular preterite
en - English
- English verb to agent noun (Z11390) Verb -> agent noun, e.g. "dance"->"dancer"
- Join English morphemes (extends suffix English word (Z13254) to cases like re + en + able + er + s → re-enablers). suffix English word (Z13254) will correctly join re-enable + ers or re- + enablers, but re + enablers → “renablers” (incorrect). English morpheme agglutination (Z13275) tests the Reduce function to produce “detoxification” from a list of four morphemes (orchestrator limit exceeded with five). I doubt we’ll want to derive “toxify” from “toxic”, however.
- Derive lemmas from a form. This is envisaged as the converse of Join English morphemes. The focus would be identifying the base form (the lexeme’s lemma) rather than further segmenting the lemma. For example, “underlay” should return “underlie” (for which it is the past participle) and the noun “underlay” (for which it is the lemma) and (perhaps) the verb “underlay”, which might be the tendency of an unproductive hen or the activity of a carpet-fitter. As this is a purely functional converse, every string will have itself as a possible lemma.
- Generate Numerical prefixes of various kinds from a natural number input.
- sort English adjectives (Z19499): Sort a list of English adjectives into the correct order: quantity, opinion, size, physical quality, shape, age, colour, origin, material, type, purpose. Where equal, leave in original order.
eu - Basque
- The Basque declension system is rather regular and based on suffixes.
- Here are a few examples of Basque declension:
- Basque plural noun (Z18541): This function creates the plural for a noun in Basque
- Basque ergative singular case declension (Z18670): This function returns the singular ergative case for a noun (proper or not) in Basque language
- Before implementing all of them, we may propose an overall classification that eases both the implementation and the future usage of the functions. Here is a first attempt, based on bibliography from the Basque Language Academy:
- Personal pronouns: they can be treated as exceptions (e.g. "zuek -> zuei", etc.) together with proper noun declension, or as a separate case.
- Determiners: they can be treated as exceptions (e.g. "hau" -> "honek", etc.) together with common noun declension, or as a separate case
- Grammatical cases:
- Absolutive ("Nor"): indefinite, singular and plural
- Ergative ("Nork"): indefinite, singular and plural
- Dative ("Nori"): indefinite, singular and plural
- Place and Time: we must distinguish animate (AN) and inanimate (IN)
- Inessive IN ("Non"): indefinite, singular and plural
- Inessive AN ("Norengan"): indefinite, singular and plural - It could be a composition of "Noren" + "-gan"
- Place and time ("Nongo"): indefinite, singular and plural
- Allative IN ("Nora"): indefinite, singular and plural
- Allative AN ("Norengana"): indefinite, singular and plural - It could be a composition of "Noren/Norengan" + "-gan/-a"
- Finished Allative IN ("Noraino"): indefinite, singular and plural - It could be a composition of "Nora" + "-ino"
- Finished Allative AN ("Norenganaino"): indefinite, singular and plural - It could be a composition of "Norengana" + "-ino"
- Right way Allative IN ("Norantz"): indefinite, singular and plural - It could be a composition of "Nora" + "-ntz"
- Right way Allative AN ("Norenganantz"): indefinite, singular and plural - It could be a composition of "Norengana" + "-ntz"
- Ablative IN ("Nondik"): indefinite, singular and plural
- Ablative AN ("Norengandik"): indefinite, singular and plural
- Rest of the cases:
- Partitive ("Zerik"): indefinite
- Possessive ("Noren"): indefinite, singular and plural
- Sociative ("Norekin"): indefinite, singular and plural
- Instrumental ("Zerez"): indefinite, singular and plural
- Motivative ("Zerengatik"): indefinite, singular and plural
- Destinative ("Norentzat"): indefinite, singular and plural - It could be a composition of "Noren" + "-tzat"
- Special case:
- Prolative ("Nortzat"): indefinite
- To take into consideration:
- Together with animate and inanimate classification, we should also consider if the noun is a proper noun ("izen berezia"). We can identify that automatically (e.g. check if written in Title case, but this may not be always possible like in the beginning of sentences), but a dedicated function may be preferred (or a boolean to the generic function saying it is a proper noun).
- The main distinction is between nouns ending in a vowel and nouns ending in a consonant, which can be easily computed
fr - French
- French masculine adjective to feminine (Z11590) Masculine adjective -> feminine, e.g. "exact"->"exacte"
- Conjugated verb => Infinitive, e.g. "alla" => "aller", "mordit" => "mordre"
ha - Hausa
A notated demo sentence ("Aishà taa jeefar dà kàren Indoo" ― "Aisha threw away Indo's dog") is available at http://intent.xigt.org
ig - Igbo
ldn - Láadan
section moved to WF:human languages/Z1882
ml - Malayalam
Proposed functions requiring future types
Note these functions cannot be implemented properly until the needed types are requested and approved.
If one wishes to nevertheless attempt to define and implement them,
- the functions and implementations should indicate prominently in their labels that their input/output types must be adjusted once support for the appropriate replacement types becomes available; and
- the functions should not be used in the implementations of any other functions, as the later adjustment of input/output types to appropriate replacements will break those implementations.
String manipulation functions
String analysis functions
- count distance between two letters in a given alphabet (defaults to the 26-character western alphabet; case insensitive; e.g. "a" & "A" ⇒ 0; "K" & "N" ⇒ 3)
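A minimal Python sketch of the letter-distance proposal, reproducing both examples. The name `letter_distance`, the requirement that a custom alphabet be passed in lowercase, and the error raised for unknown letters are assumptions.

```python
import string


def letter_distance(first: str, second: str, alphabet: str = string.ascii_lowercase) -> int:
    """Return how many positions apart two letters are in alphabet, ignoring case."""
    positions = []
    for letter in (first, second):
        letter = letter.lower()
        if len(letter) != 1 or letter not in alphabet:
            # Assumption: anything outside the alphabet is an error.
            raise ValueError(f"{letter!r} is not a single letter of the alphabet")
        positions.append(alphabet.index(letter))
    return abs(positions[0] - positions[1])
```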
String encoding and decoding functions
(would be better with types representing a stream of bytes)
- BASE45 encode
- BASE45 decode
- Hexadecimal UTF-8 encode ("ABC ₤" ⇒ "41 42 43 20 E2 82 A4")
- Hexadecimal UTF-8 decode ("41 42 43 20 E2 82 A4" ⇒ "ABC ₤")
- Decimal UTF-8 encode ("ABC ₤" ⇒ "65 66 67 32 226 130 164")
- Decimal UTF-8 decode ("65 66 67 32 226 130 164" ⇒ "ABC ₤")
- Octal UTF-8 encode ("ABC ₤" ⇒ "101 102 103 40 342 202 244")
- Octal UTF-8 decode ("101 102 103 40 342 202 244" ⇒ "ABC ₤")
- Binary UTF-8 encode ("ABC ₤" ⇒ "01000001 01000010 01000011 00100000 11100010 10000010 10100100")
- Binary UTF-8 decode ("01000001 01000010 01000011 00100000 11100010 10000010 10100100" ⇒ "ABC ₤")
- Unicode code point encode ("ABC ₤" ⇒ "41 42 43 20 20A4") - Unicode code point encode hex (Z10785)
- Unicode code point decode ("41 42 43 20 20A4" ⇒ "ABC ₤")
- Create regular expression object/string (i.e: "test" & "i" to /test/i)
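The UTF-8 and code point proposals above differ only in number base, so one parameterised helper can cover them. The Python sketch below works on strings because byte types are not yet available; the function names and the `base` parameter are illustrative assumptions.

```python
# How each byte is rendered per base: binary is zero-padded to 8 digits and
# hexadecimal to 2, matching the examples in the list above.
_FORMATS = {2: "{:08b}", 8: "{:o}", 10: "{:d}", 16: "{:02X}"}


def utf8_encode(text: str, base: int = 16) -> str:
    """Render the UTF-8 bytes of text as space-separated numbers in base."""
    fmt = _FORMATS[base]
    return " ".join(fmt.format(byte) for byte in text.encode("utf-8"))


def utf8_decode(encoded: str, base: int = 16) -> str:
    """Inverse of utf8_encode: parse the numbers and decode them as UTF-8."""
    return bytes(int(token, base) for token in encoded.split()).decode("utf-8")


def code_points_encode(text: str) -> str:
    """Render each character as its hexadecimal Unicode code point."""
    return " ".join(f"{ord(char):X}" for char in text)
```

With a byte-stream type, the encode functions would split into "string to bytes" and "bytes to notation", which is the composition the note above hints at.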
Natural language functions
- Choose singular or plural based on number (e.g. singularOrPlural("person", 6) -> "people")
- Note that there are also dual and other grammatical numbers in other languages. 魔琴 (talk) 18:54, 26 October 2023 (UTC)
- relevant interwiki link: d:WD:property proposal/plural forms Arlo Barnes (talk) 04:15, 9 February 2024 (UTC)
Cryptographic hash functions
(would be better with types representing a stream of bytes)
To do MD2 - MD2 Hashing (Z10135)
To do MD4 - MD4 (Z10136)
To do MD5 - MD5 (Z10137)
To do RIPEMD-128 - RIPEMD-128 (Z10138)
To do RIPEMD-160 - RIPEMD-160 (Z10139)
To do BLAKE2b-160 - BLAKE2b-160 (Z10140)
To do BLAKE2b-256 - BLAKE2b-256 (Z10141)
To do BLAKE2b-384 - BLAKE2b-384 (Z10142)
To do BLAKE2b-512 - BLAKE2b-512 (Z10143)
To do BLAKE2s-128 - BLAKE2s-128 (Z10144)
To do BLAKE2s-160 - BLAKE2s-160 (Z10145)
To do BLAKE2s-224 - BLAKE2s-224 (Z10146)
To do BLAKE2s-256 - BLAKE2s-256 (Z10147)
To do SHA-224 - SHA-224 (Z10149)
To do HMAC-SHA-256
To do SHAKE-128 - SHAKE128 (Z10150)
To do SHAKE-256 - SHAKE256 (Z10151)
Colour functions
- return colour contrast ratio (per [1]) of two RGB colours (provided as strings e.g. "#FF0000")
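The reference "[1]" is not reproduced on this page. The Python sketch below assumes it points to the WCAG 2.x definition of contrast ratio, which is based on relative luminance. The function names are hypothetical, and input is assumed to be a six-digit hex string with an optional leading "#".

```python
def _linearise(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x constants)."""
    value = channel / 255
    return value / 12.92 if value <= 0.03928 else ((value + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_code: str) -> float:
    """Relative luminance of a colour given as "#RRGGBB" or "RRGGBB"."""
    digits = hex_code[1:] if hex_code.startswith("#") else hex_code
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearise(red) + 0.7152 * _linearise(green) + 0.0722 * _linearise(blue)


def contrast_ratio(first: str, second: str) -> float:
    """Contrast ratio between two colours, from 1 (identical) up to about 21."""
    lum_a, lum_b = relative_luminance(first), relative_luminance(second)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)
```

The result is a non-integer ratio, so a proper implementation would need a rational or floating-point output type rather than a natural number.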
Date, time, and calendric functions
Note: 'time' type not yet supported, use 'string' (or for strictly numeric values, 'natural number')
Bengali calendar
Gregorian to Bengali date (Bangladesh) (Z12926): Converts a Gregorian date to Bangla date per Bangladeshi calendar. Inputs: Year, Month, Day.
Chinese calendar
French Republican Calendar
decimalises and secularises the Gregorian
- day names: is the day name part of the French Republican Calendar 'rural' naming? (Z13006): tests if input is one of the French-language names for days in the FRC
Not done yet
Gregorian
widely used calendar derived from the Julian, basis for ISO 8601
- date to ISO week number ISO week date (Q2110154)
- string to date
- date to ISO 8601 string
- date to year (yyyy)
- date to month of the year (1-12)
- date to month name (January-December)
- date to day of the month (1-31)
- date to hour of the day (0-23)
- date to minutes (0-59)
- date to seconds (0-59)
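Until a date type exists, the Gregorian accessors listed above can be prototyped on ISO 8601 strings. This Python sketch uses the standard `datetime` module; the function name `date_parts`, the dictionary keys and the English month names are illustrative assumptions.

```python
from datetime import datetime

MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def date_parts(iso_text: str) -> dict:
    """Parse an ISO 8601 string and return the components proposed above."""
    moment = datetime.fromisoformat(iso_text)  # string to date
    return {
        "iso_string": moment.isoformat(),       # date to ISO 8601 string
        "iso_week": moment.isocalendar()[1],    # date to ISO week number
        "year": moment.year,
        "month": moment.month,                  # 1-12
        "month_name": MONTH_NAMES[moment.month - 1],
        "day": moment.day,                      # 1-31
        "hour": moment.hour,                    # 0-23
        "minute": moment.minute,                # 0-59
        "second": moment.second,                # 0-59
    }
```

On Wikifunctions each dictionary entry would be its own function; they are grouped here only to keep the sketch short.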
Holocene calendar
Indian national calendar
Islamic
a Lunar calendar, also called Hijri
Julian
mostly used by astronomers, some historians, and some Orthodox Christian denominations
Mesoamerican calendars
including civil and clerical forms
Persian
also called Jalali
Thai calendar
Hebrew calendar
Darian calendar
Proposed time-keeping system for Mars, requires Julian Date/Time to calculate.
Basic numerical functions
- round up ("1.289" & "2" ⇒ "1.29"; "5678" & "2" ⇒ "5700")
- So if the number is floating point, round to n decimal places, and if not, round to n significant figures. Is that right? BrightSunMan (talk) 19:36, 24 December 2023 (UTC)
- round down
- return integer value (5678.678 ⇒ 5678)
- English cardinal (Z13587): expresses a natural number in English words (23 ⇒ "twenty-three")
- Convert money from US$ to anything else
- requires source of conversion rates, which is a hole in function-likeness
- Arabic numeral to Etruscan numeral
- Etruscan numeral to Arabic numeral
Data serialization functions
- parse a string as JSON
- extract string from JSON object based on JSONPath ({"name":"Alice"}, "$.name" ⇒ "Alice")
- Why not first convert a JSON string to an object, and then have a function that extracts fields based on JSONPath? Doing Stringly-typed things like this proposal as defined isn't a good idea. 0xDeadbeef (talk) 16:16, 5 August 2023 (UTC)
- This seems to be a good idea, thanks! I moved and split the proposal accordingly. --1-Byte (talk) 09:51, 6 August 2023 (UTC)
- is it okay to go ahead to create this 'extract string from JSON object based on JSONPath' as a function ? Dolphyb (talk) 16:14, 15 February 2024 (UTC)
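The two-step approach suggested in this thread (parse first, then extract) can be sketched in Python. The standard library has no JSONPath support, so the extractor below handles only a tiny assumed subset: a leading "$" followed by ".name" and "[index]" steps. It also does not check for unsupported syntax in the path. Both function names are hypothetical.

```python
import json
import re

# One step of the path: either ".name" or "[index]".
_STEP = re.compile(r"\.([A-Za-z_]\w*)|\[(\d+)\]")


def parse_json(text: str):
    """Step 1: turn a JSON string into a Python object."""
    return json.loads(text)


def extract_by_path(value, path: str):
    """Step 2: follow a simple "$.a[0].b"-style path through a parsed object."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    for name, index in _STEP.findall(path[1:]):
        # Exactly one of the two groups is non-empty for each step.
        value = value[name] if name else value[int(index)]
    return value
```

Composing the two, `extract_by_path(parse_json('{"name":"Alice"}'), "$.name")` gives the string "Alice", as in the example above.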
Basic list/iterable functions requiring numeric types
- Sum the elements of a numeric list - sum the elements of a list of natural numbers (Z14038)
- Product of the elements of a numeric list
- flat a list (Z12676): flatten a list to limited depth
- Slice of list elements: for the supplied list, return a list of elements that are at indexes between a supplied range n:m
- Zero indexing is used (first element is index 0)?
- n and m are included in the range?
- What happens if n and/or m are invalid indexes?
- Remove slice of elements from list: return the supplied list with elements between a supplied range of indexes removed
- Zero indexing is used (first element is index 0)?
- n and m are included in the range?
- What happens if n and/or m are invalid indexes?
- Every nth element of list: returns every nth element of the supplied list
- Remove every nth element of list: removes every nth element of the supplied list
- sample n objects from list (return up to n random objects from the list)
- Jaccard similarity coefficient (see https://en.wikipedia.org/wiki/Jaccard_index)
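The Jaccard similarity coefficient mentioned above is the size of the intersection divided by the size of the union. A minimal Python sketch treats the input lists as sets, so duplicates are ignored. The name `jaccard_similarity` and the value returned for two empty lists are assumptions, since the ratio is undefined in that case.

```python
def jaccard_similarity(first: list, second: list) -> float:
    """Return |A ∩ B| / |A ∪ B| for the sets of elements in the two lists."""
    set_a, set_b = set(first), set(second)
    if not set_a and not set_b:
        # Assumption: two empty inputs count as identical.
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)
```

Like the contrast ratio, the output is a fraction between 0 and 1, so it depends on a non-integer numeric type.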
Geodetics functions
w:en:planetary coordinate system, w:en:well-known text representation of coordinate reference systems
Earth
- convert coordinates outside of the ranges (-180, 180) for longitude and (-90, 90) for latitude to a canonical form
Mars
- convert coordinates outside of the ranges [0, 360) for longitude and (-90, 90) for latitude to a canonical form
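Both canonicalisation proposals above (Earth and Mars) can share one routine that differs only in the longitude range. In the Python sketch below, a latitude past a pole is reflected back and the longitude is moved to the opposite meridian. The function name, the argument order, and the half-open range [-180, 180) used for Earth are assumptions.

```python
def canonical_coordinates(latitude, longitude, longitude_from_zero=False):
    """Fold any latitude/longitude pair (in degrees) into a canonical range."""
    # Wrap latitude into [-90, 270) so that one reflection over a pole suffices.
    latitude = (latitude + 90) % 360 - 90
    if latitude > 90:
        # Past the pole: reflect the latitude and switch to the opposite meridian.
        latitude = 180 - latitude
        longitude += 180
    if longitude_from_zero:
        longitude %= 360                            # Mars-style [0, 360)
    else:
        longitude = (longitude + 180) % 360 - 180   # Earth-style [-180, 180)
    return latitude, longitude
```

For example, latitude 100 at longitude 10 is 10 degrees past the north pole, so it becomes latitude 80 at longitude -170.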
Unit conversion functions
- Fahrenheit to Celsius (Z15560): Converts Fahrenheit (°F) to Celsius (°C)
Object / type / function functions
- ZID of object type (Z17893): returns a string with the ZID of the type of the object passed in
- function call of test case (Z21180): returns the function call that the test case performs
- function called by test case (Z21182): no description
- function this implementation is implementing (Z21193): no description