
Wikifunctions:Suggest a function

From Wikifunctions

Do you have an idea for a new function? Suggest it here! It may help to refer to our glossary.

You can create a function right away if you have the user-rights.

If a function requires a new type, consider proposing it.

Note that for now we only support a limited number of types as input and output types of functions. More types are coming in the next few months. For the full list, see WF:Type.

Once created, consider adding new Functions to the catalogue.

Proposed functions requiring only available types (string, Boolean, Natural number, list)

String

String character discard functions

String character replacement functions

String search functions

String escaping and unescaping functions

String encoding and decoding functions

String presentation functions

String colour notation functions

  • complementary colour in RGB colour model ("#FF0000" ⇒ "#00FFFF")
    Great question. I don't think there is a position documented on Wikifunctions for how to handle invalid input to a function. Can we throw exceptions? Return null? Dhx1 (talk) 13:23, 6 August 2023 (UTC)
    This shouldn't be a string function. This should be a type that represents an RGB color (with a corresponding validation function; hopefully it can just be three unsigned 8-bit integers) and a function that returns the complementary color. 0xDeadbeef (talk) 12:38, 7 August 2023 (UTC)
    • Work on the color type has been stalled for over a year. But this task is Done: I have made Z21554, which uses string hex codes (with or without the initial # and supporting short hex format). This is probably the most practical format, as I can imagine this function being used on-wiki for things like the style parameter (CSS) of MediaWiki tags, or of templates etc. There are 5 test cases I've made, which are passed by both my JS and Python versions. :) MolecularPilot (talk) 03:39, 11 January 2025 (UTC)
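As an illustration of the behaviour described in this thread, here is a minimal Python sketch. It is not the Z21554 implementation; the function name `complementary_hex`, the choice to always return a six-digit uppercase code with a leading #, and raising `ValueError` on invalid input are all assumptions made for the example.

```python
import string


def complementary_hex(colour: str) -> str:
    """Return the RGB complement of a hex colour such as "#FF0000" or "f00"."""
    digits = colour[1:] if colour.startswith("#") else colour
    if len(digits) == 3:
        # Expand the short form: "f00" becomes "ff0000".
        digits = "".join(ch * 2 for ch in digits)
    if len(digits) != 6 or any(ch not in string.hexdigits for ch in digits):
        # Assumption: invalid input raises, since no policy is documented.
        raise ValueError(f"not a hex colour: {colour!r}")
    complement = 0xFFFFFF - int(digits, 16)
    return f"#{complement:06X}"


print(complementary_hex("#FF0000"))
```

Subtracting from 0xFFFFFF inverts each of the three channels at once, which matches the "#FF0000" ⇒ "#00FFFF" example above.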

String notation validation checks

String validation checks

  • Doing... check if string is in lower camel case
  • check if string is a valid ISO 3166 country code
  • check if string is a valid ISO 8601 date/time (2023-08-03 ⇒ true; 2023-02-30 ⇒ false; 2023-08-03 15:00:00.000 ⇒ true; 2023-08-03 25:00:00.000 ⇒ false)
  • check if string is a valid EDTF date/time
  • Doing... check if string is a valid email address (watch out: email addresses are more complicated than they seem, so see this list of falsehoods about email addresses when creating unit tests). Z10410: creating test cases is in progress, but it is currently stuck on determining what exactly counts as a valid email address; nearly every erratum for RFC:3696 is about that.
  • Doing... check if string is a valid Wikidata item: Z10696 (possibly stuck on phab:T343593?)
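For the ISO 8601 date/time check in the list above, a rough Python sketch could lean on the standard library. The name `is_valid_iso_datetime` is hypothetical, and `datetime.fromisoformat` only accepts the subset of ISO 8601 that the running Python version supports (it also tolerates a space as the date/time separator, as in the examples given), so this is an approximation rather than a full validator.

```python
from datetime import datetime


def is_valid_iso_datetime(text: str) -> bool:
    """Return True if Python can parse `text` as an ISO-style date or date/time."""
    try:
        # Rejects impossible dates (30 February) and times (hour 25).
        datetime.fromisoformat(text)
    except ValueError:
        return False
    return True


for candidate in ("2023-08-03", "2023-02-30",
                  "2023-08-03 15:00:00.000", "2023-08-03 25:00:00.000"):
    print(candidate, is_valid_iso_datetime(candidate))
```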

String analysis functions

  • Word frequency counting. Provide a list of words and their frequencies.
    • Done, see Z21593. Providing a list of words and frequencies would require a new type, so instead it requires the sentence and the word you want to count, and returns the occurrences. Hyphenated words are not considered a match of their components, i.e. "fast-forward" is a match of "fast-forward" but neither "fast" nor "forward". I think this is the optimal behaviour, but if someone disagrees we can change it. There is a JS implementation that I made, on which all 4 of my test cases are based. MolecularPilot (talk) 05:41, 12 January 2025 (UTC)
    @MolecularPilot And I have added a Python implementation at Z22473. I would appreciate it if you could attach it! ~/Bunnypranav:<ping> 08:13, 15 February 2025 (UTC)
    Cool, thank you so much for doing it! Someone's already beaten me to it re attaching it, but great work! :) MolecularPilot (talk) 21:33, 16 February 2025 (UTC)
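The hyphenation rule discussed above can be sketched in Python as follows. This is not the code of Z21593 or Z22473; the name `count_word`, the tokenising regular expression, and the case-insensitive comparison are assumptions for illustration.

```python
import re


def count_word(sentence: str, word: str) -> int:
    """Count whole-word occurrences of `word` in `sentence`."""
    # A token is a run of word characters, optionally joined by hyphens or
    # apostrophes, so "fast-forward" stays one token and never matches "fast".
    tokens = re.findall(r"\w+(?:[-']\w+)*", sentence.lower())
    return tokens.count(word.lower())


print(count_word("Fast-forward the tape, fast!", "fast"))
```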

Monolingual text

String Wikitext operations

...

Natural number

Integer

Byte

  • bitwise XOR
  • bitwise AND
  • bitwise OR
  • previous byte
  • add bytes
  • subtract bytes
  • multiply bytes
  • modulo bytes
  • byte division
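A few of the byte operations listed above, sketched in Python with a byte modelled as an int from 0 to 255. Wrapping around on overflow (arithmetic modulo 256) is an assumption; the list does not say whether overflow should wrap or be an error. All function names are hypothetical.

```python
def _check(value: int) -> int:
    """Ensure the value fits in one byte."""
    if not 0 <= value <= 255:
        raise ValueError(f"not a byte: {value}")
    return value


def add_bytes(a: int, b: int) -> int:
    return (_check(a) + _check(b)) & 0xFF  # wraps on overflow (assumption)


def subtract_bytes(a: int, b: int) -> int:
    return (_check(a) - _check(b)) & 0xFF  # wraps below zero (assumption)


def multiply_bytes(a: int, b: int) -> int:
    return (_check(a) * _check(b)) & 0xFF


def previous_byte(a: int) -> int:
    return (_check(a) - 1) & 0xFF  # the byte before 0 is 255 here


def xor_bytes(a: int, b: int) -> int:
    return _check(a) ^ _check(b)


print(add_bytes(200, 100), previous_byte(0), xor_bytes(0b1100, 0b1010))
```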

Unicode code point

  • Codepoint to natural number
  • Natural number to codepoint
  • Codepoint to hex representation with U+ prefix
  • Codepoint to list of bytes for UTF-8
  • Codepoint to list of bytes for UTF-16
  • Codepoint to list of bytes for UTF-32
  • which plane of codepoint?
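The code point conversions above map fairly directly onto Python's built-in codecs. The sketch below assumes big-endian byte order without a byte order mark for UTF-16 and UTF-32, which the list does not specify; the function names are hypothetical. Surrogate code points (U+D800 to U+DFFF) cannot be encoded and raise an error here.

```python
def to_u_plus(code_point: int) -> str:
    """Format a code point as U+XXXX (at least four hex digits)."""
    return f"U+{code_point:04X}"


def utf8_bytes(code_point: int) -> list[int]:
    return list(chr(code_point).encode("utf-8"))


def utf16_bytes(code_point: int) -> list[int]:
    return list(chr(code_point).encode("utf-16-be"))  # big-endian, no BOM


def utf32_bytes(code_point: int) -> list[int]:
    return list(chr(code_point).encode("utf-32-be"))  # big-endian, no BOM


def plane(code_point: int) -> int:
    """Each Unicode plane holds 0x10000 code points."""
    return code_point >> 16


print(to_u_plus(0x20A4), utf8_bytes(0x20A4), plane(0x1F600))
```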

Object

List

Basic list/iterable functions

Complex list functions

  • zip lists together: for [ A .. Z ] and [ 1 .. 26 ] return [ [ A, 1 ], [ B, 2 ], .. ]
    • Unsure what happens if input lists are of different lengths.
    • If possible this function should be able to zip more than 2 lists together... 3, 4, n? Perhaps the input should be list(list, list, list, list, ..).
  • remove elements common to second list (Z19198): return the first list with any elements common to the second list removed
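One possible answer to the open questions on zipping, as a Python sketch: the input is a list of lists (so any number of lists can be zipped), and the result is truncated to the shortest input. Truncation is an assumption; padding with a filler value (as `itertools.zip_longest` does) or raising an error would be equally defensible. The function names are hypothetical and this is not the implementation of Z19198.

```python
def zip_lists(lists: list[list]) -> list[list]:
    """Zip any number of lists; stops at the shortest one (assumption)."""
    return [list(group) for group in zip(*lists)]


def remove_common(first: list, second: list) -> list:
    """Return `first` without any element that also occurs in `second`."""
    return [item for item in first if item not in second]


print(zip_lists([["A", "B", "C"], [1, 2, 3]]))
print(remove_common([1, 2, 3, 2], [2]))
```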

CSV list operations

  • list of strings to csv
  • number -> list of decimal digits
  • number -> list of binary digits
  • number -> list of digits in base provided Well very well (talk) 11:20, 18 May 2024 (UTC)
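The three number-to-digits suggestions are the same function with different bases. A minimal Python sketch, assuming natural-number input and most-significant-digit-first output (neither is stated above); the name `digits_in_base` is hypothetical.

```python
def digits_in_base(number: int, base: int = 10) -> list[int]:
    """Return the digits of a natural number, most significant first."""
    if number < 0 or base < 2:
        raise ValueError("need a natural number and a base of at least 2")
    if number == 0:
        return [0]
    digits = []
    while number:
        number, digit = divmod(number, base)
        digits.append(digit)  # collected least significant first
    return digits[::-1]


print(digits_in_base(2024), digits_in_base(10, 2), digits_in_base(255, 16))
```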

Functions with functions as arguments

  • sort, by a given function
  • test whether certain functions have specific properties of homogeneous relations for particular lists/sets
  • remove first element matching filter from list
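For the first and last items above, a short Python sketch of what functions taking functions as arguments might look like. The names `sort_by` and `remove_first_matching` are hypothetical, and returning the list unchanged when nothing matches is an assumption.

```python
from typing import Any, Callable


def sort_by(items: list, key: Callable[[Any], Any]) -> list:
    """Sort by the value a given function returns for each element."""
    return sorted(items, key=key)


def remove_first_matching(items: list, predicate: Callable[[Any], bool]) -> list:
    """Return a copy of `items` without the first element matching `predicate`."""
    for index, item in enumerate(items):
        if predicate(item):
            return items[:index] + items[index + 1:]
    return list(items)  # nothing matched: unchanged copy (assumption)


print(sort_by(["ccc", "a", "bb"], len))
print(remove_first_matching([1, 2, 3, 4], lambda n: n % 2 == 0))
```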

Morphological functions

Morphology is the part of linguistics that studies how language parts are 'shaped' and change diachronically and when inflected. Hausa, Igbo, Malayalam, Bangla/Bengali and Dagbani are focus languages for Wikidata's lexicographic dataset, which is an important aspect of Abstract Wikipedia.

mul - Multiple languages

ase - American Sign Language

  • string: Stokoe to ase-Sgnw and vice-versa (consult @Slevinski: as to best approach)

bn - Bangla

cy - Welsh

w:en:colloquial Welsh morphology

dag - Dagbani

de - German

  • tense * person * number for each verb
    • tenses: present, past, ...?
    • person: first, second, third
    • number: singular, plural
    • Doing... third person singular present
    • second person singular preterite

en - English

  • Z11390 Verb -> agent noun, e.g. "dance"->"dancer"
  • Join English morphemes: extends Z13254 to cases like re + en + able + er + s → re-enablers. Z13254 will correctly join re-enable + ers or re- + enablers, but re + enablers → “renablers” (incorrect). Z13275 tests the Reduce function to produce “detoxification” from a list of four morphemes (orchestrator limit exceeded with five). I doubt we’ll want to derive “toxify” from “toxic”, however.
  • Derive lemmas from a form. This is envisaged as the converse of Join English morphemes. The focus would be identifying the base form (the lexeme’s lemma) rather than further segmenting the lemma. For example, “underlay” should return “underlie” (for which it is the past participle) and the noun “underlay” (for which it is the lemma) and (perhaps) the verb “underlay”, which might be the tendency of an unproductive hen or the activity of a carpet-fitter. As this is a purely functional converse, every string will have itself as a possible lemma.
  • Generate Numerical prefixes of various kinds from a natural number input.
  • sort English adjectives (Z19499): Sort a list of English adjectives into the correct order: quantity, opinion, size, physical quality, shape, age, colour, origin, material, type, purpose. Where equal, leave in original order.

eu - Basque

  • The Basque declension system is rather regular and based on suffixes.
    • Here are a few examples of Basque declension:
    • Before implementing all of them, we may propose an overall classification that eases both the implementation and the future usage of the functions. Here is a first attempt, based on bibliography from the Basque Language Academy:
      • Personal pronouns: they can be treated as exceptions (e.g. "zuek" -> "zuei", etc.) together with proper noun declension, or as a separate case.
      • Determiners: they can be treated as exceptions (e.g. "hau" -> "honek", etc.) together with common noun declension, or as a separate case.
      • Grammatical cases:
        • Absolutive ("Nor"): indefinite, singular and plural
        • Ergative ("Nork"): indefinite, singular and plural
        • Dative ("Nori"): indefinite, singular and plural
      • Place and Time: we must distinguish animate (AN) and inanimate (IN)
        • Inessive IN ("Non"): indefinite, singular and plural
        • Inessive AN ("Norengan"): indefinite, singular and plural - It could be a composition of "Noren" + "-gan"
        • Place and time ("Nongo"): indefinite, singular and plural
        • Allative IN ("Nora"): indefinite, singular and plural
        • Allative AN ("Norengana"): indefinite, singular and plural - It could be a composition of "Noren/Norengan" + "-gan/-a"
        • Finished Allative IN ("Noraino"): indefinite, singular and plural - It could be a composition of "Nora" + "-ino"
        • Finished Allative AN ("Norenganaino"): indefinite, singular and plural - It could be a composition of "Norengana" + "-ino"
        • Right way Allative IN ("Norantz"): indefinite, singular and plural - It could be a composition of "Nora" + "-ntz"
        • Right way Allative AN ("Norenganantz"): indefinite, singular and plural - It could be a composition of "Norengana" + "-ntz"
        • Ablative IN ("Nondik"): indefinite, singular and plural
        • Ablative AN ("Norengandik"): indefinite, singular and plural
      • Rest of the cases:
        • Partitive ("Zerik"): indefinite
        • Possessive ("Noren"): indefinite, singular and plural
        • Sociative ("Norekin"): indefinite, singular and plural
        • Instrumental ("Zerez"): indefinite, singular and plural
        • Motivative ("Zerengatik"): indefinite, singular and plural
        • Destinative ("Norentzat"): indefinite, singular and plural - It could be a composition of "Noren" + "-tzat"
      • Special case:
        • Prolative ("Nortzat"): indefinite
      • To take into consideration:
        • Together with animate and inanimate classification, we should also consider if the noun is a proper noun ("izen berezia"). We can identify that automatically (e.g. check if written in Title case, but this may not be always possible like in the beginning of sentences), but a dedicated function may be preferred (or a boolean to the generic function saying it is a proper noun).
        • The main distinction is between nouns ending in a vowel and nouns ending in a consonant, which can be easily computed.

fr - French

  • Z11590 Masculine adjective -> feminine, e.g. "exact"->"exacte"
  • Conjugated verb => Infinitive, e.g. "alla" => "aller", "mordit" => "mordre"

ha - Hausa

A notated demo sentence ("Aishà taa jeefar dà kàren Indoo" ― "Aisha threw away Indo's dog") is available at http://intent.xigt.org

ig - Igbo

ldn - Láadan

section moved to WF:human languages/Z1882

ml - Malayalam

Proposed functions requiring future types

Note these functions cannot be implemented properly until the needed types are requested and approved.

If one wishes to nevertheless attempt to define and implement them,

  • the functions and implementations should indicate prominently in their labels that their input/output types must be adjusted once support for the appropriate replacement types become available; and
  • the functions should not be used in the implementations of any other functions, as the later adjustment of input/output types to appropriate replacements will break those implementations.

String manipulation functions

String analysis functions

  • count distance between two letters in a given alphabet (defaults to the 26-character western alphabet; case insensitive; e.g. "a" & "A" ⇒ 0; "K" & "N" ⇒ 3)
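A possible reading of the letter-distance function, sketched in Python. The alphabet is passed as a plain string and the name `letter_distance` is hypothetical; raising `ValueError` for a letter outside the alphabet is an assumption.

```python
import string


def letter_distance(first: str, second: str,
                    alphabet: str = string.ascii_lowercase) -> int:
    """Return how many positions apart two letters are, ignoring case."""
    letters = alphabet.lower()
    positions = []
    for letter in (first, second):
        if len(letter) != 1 or letter.lower() not in letters:
            raise ValueError(f"not a letter of the alphabet: {letter!r}")
        positions.append(letters.index(letter.lower()))
    return abs(positions[0] - positions[1])


print(letter_distance("K", "N"))
```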

String encoding and decoding functions

(would be better with types representing a stream of bytes)

  • BASE45 encode
  • BASE45 decode
  • Hexadecimal UTF-8 encode ("ABC ₤" ⇒ "41 42 43 20 E2 82 A4")
  • Hexadecimal UTF-8 decode ("41 42 43 20 E2 82 A4" ⇒ "ABC ₤")
  • Decimal UTF-8 encode ("ABC ₤" ⇒ "65 66 67 32 226 130 164")
  • Decimal UTF-8 decode ("65 66 67 32 226 130 164" ⇒ "ABC ₤")
  • Octal UTF-8 encode ("ABC ₤" ⇒ "101 102 103 40 342 202 244")
  • Octal UTF-8 decode ("101 102 103 40 342 202 244" ⇒ "ABC ₤")
  • Binary UTF-8 encode ("ABC ₤" ⇒ "01000001 01000010 01000011 00100000 11100010 10000010 10100100")
  • Binary UTF-8 decode ("01000001 01000010 01000011 00100000 11100010 10000010 10100100" ⇒ "ABC ₤")
  • Unicode code point encode ("ABC ₤" ⇒ "41 42 43 20 20A4") - Z10785
  • Unicode code point decode ("41 42 43 20 20A4" ⇒ "ABC ₤")
  • Create regular expression object/string (i.e: "test" & "i" to /test/i)
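The UTF-8 and code point examples above differ only in the number base and padding, so they can share one helper. The Python sketch below reproduces the formatting used in the examples (two-digit uppercase hex, unpadded decimal and octal, eight-digit binary); the function names are hypothetical, and a bytes type would replace the space-separated strings once one exists.

```python
# Output format per base, matching the examples in the list above.
_FORMATS = {16: "{:02X}", 10: "{}", 8: "{:o}", 2: "{:08b}"}


def utf8_encode(text: str, base: int = 16) -> str:
    """Encode text as space-separated UTF-8 byte values in the given base."""
    fmt = _FORMATS[base]
    return " ".join(fmt.format(byte) for byte in text.encode("utf-8"))


def utf8_decode(encoded: str, base: int = 16) -> str:
    """Inverse of utf8_encode."""
    return bytes(int(token, base) for token in encoded.split()).decode("utf-8")


def code_points_encode(text: str) -> str:
    """Encode text as space-separated hexadecimal Unicode code points."""
    return " ".join(f"{ord(char):X}" for char in text)


def code_points_decode(encoded: str) -> str:
    return "".join(chr(int(token, 16)) for token in encoded.split())


print(utf8_encode("ABC ₤"))
print(code_points_encode("ABC ₤"))
```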

Natural language functions

Cryptographic hash functions

(would be better with types representing a stream of bytes)

Colour functions

  • return colour contrast ratio (per ) of two RGB colours (provided as strings e.g. "#FF0000")

Date, time, and calendric functions

Note: 'time' type not yet supported, use 'string' (or for strictly numeric values, 'natural number')

Bengali calendar

Gregorian to Bengali date (Bangladesh) (Z12926): Converts a Gregorian date to Bangla date per Bangladeshi calendar. Inputs: Year, Month, Day.

Chinese calendar

French Republican Calendar

decimalises and secularises the Gregorian

Gregorian

widely used calendar derived from the Julian, basis for ISO 8601


Named Day from Date or day of year. Input type: Date. Output type: String. The initial use case was automated population of On The Day, based on various collections of holidays, festival days and observances. ShakespeareFan00 (talk) 19:35, 26 March 2025 (UTC)

So if you gave it 2025-05-01, it would return "All Fools Day", etc. Possibly an additional input of enumerated type to indicate which data set to pull holidays, festivals and observances from.

ShakespeareFan00 (talk) 19:35, 26 March 2025 (UTC)

Diary/calendar header function: using the above and other date functions, generates a data set from a given date. Hence if you give it 2003-05-01, you get back a JSON set containing {Day of week: String, Day in the Month, Observances}, etc. ShakespeareFan00 (talk) 19:35, 26 March 2025 (UTC)

Holocene calendar

Indian national calendar

Islamic

a Lunar calendar, also called Hijri

Julian

mostly used by astronomers, some historians, and some Orthodox Christian denominations

Mesoamerican calendars

including civil and clerical forms

Persian

also called Jalali

Thai calendar

Hebrew calendar

Darian calendar

Proposed time-keeping system for Mars, requires Julian Date/Time to calculate.

Basic numerical functions

  • round up ("1.289" & "2" ⇒ "1.29"; "5678" & "2" ⇒ "5700")
    So if the number is floating point, round to n decimal places, and if not, round to n significant figures. Is that right? BrightSunMan (talk) 19:36, 24 December 2023 (UTC)[reply]
  • round down
  • return integer value (5678.678 ⇒ 5678)
  • English cardinal (Z13587): expresses a natural number in English words (23 ⇒ "twenty-three")
  • Convert money from US$ to anything else
    • requires source of conversion rates, which is a hole in function-likeness
  • Arabic numeral to Etruscan numeral
  • Etruscan numeral to Arabic numeral
  • floor and ceiling functions, based on defined standards.

Data serialization functions

Basic list/iterable functions requiring numeric types

  • Sum the elements of a numeric list - Z14038
  • Product of the elements of a numeric list
  • flatten untyped list (Z12676): flatten an (untyped) list to limited depth
  • Slice of list elements: for the supplied list, return a list of elements that are at indexes between a supplied range n:m
    • Zero indexing is used (first element is index 0)?
    • n and m are included in the range?
    • What happens if n and/or m are invalid indexes?
  • Remove slice of elements from list: return the supplied list with elements between a supplied range of indexes removed
    • Zero indexing is used (first element is index 0)?
    • n and m are included in the range?
    • What happens if n and/or m are invalid indexes?
  • Every nth element of list: returns every nth element of the supplied list
  • Remove every nth element of list: removes every nth element of the supplied list
  • sample n objects from list (return up to n random objects from the list)
  • Jaccard similarity coefficient (see https://en.wikipedia.org/wiki/Jaccard_index)
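To make the open questions on slicing concrete, here is a Python sketch that picks one set of answers: zero-based indexes, both n and m included, and an error for invalid indexes. These choices, the function names, and counting "every nth" from 1 (so n = 2 selects the 2nd, 4th, ... elements) are all assumptions, not decisions recorded on this page.

```python
def slice_inclusive(items: list, n: int, m: int) -> list:
    """Elements at zero-based indexes n..m, both ends included."""
    if not 0 <= n <= m < len(items):
        raise IndexError(f"invalid range {n}:{m} for {len(items)} elements")
    return items[n:m + 1]


def remove_slice(items: list, n: int, m: int) -> list:
    """The list without the elements at indexes n..m (inclusive)."""
    if not 0 <= n <= m < len(items):
        raise IndexError(f"invalid range {n}:{m} for {len(items)} elements")
    return items[:n] + items[m + 1:]


def every_nth(items: list, n: int) -> list:
    if n < 1:
        raise ValueError("n must be at least 1")
    return items[n - 1::n]


def remove_every_nth(items: list, n: int) -> list:
    if n < 1:
        raise ValueError("n must be at least 1")
    return [item for position, item in enumerate(items, start=1) if position % n]


def jaccard(first: list, second: list) -> float:
    """Size of the intersection divided by size of the union, as sets."""
    a, b = set(first), set(second)
    if not a and not b:
        return 1.0  # convention for two empty sets (assumption)
    return len(a & b) / len(a | b)


print(slice_inclusive(list("abcdef"), 1, 3), jaccard([1, 2, 3], [2, 3, 4]))
```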

Geodetics functions

w:en:planetary coordinate system, w:en:well-known text representation of coordinate reference systems

Earth

  • convert coordinates outside of the ranges (-180, 180) for longitude and (-90, 90) for latitude to a canonical form

Mars

  • convert coordinates outside of the ranges [0, 360) for longitude and (-90, 90) for latitude to a canonical form
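Both canonicalisation functions can share one routine that differs only in the longitude convention. The Python sketch below assumes coordinates in degrees, treats a latitude beyond a pole as continuing down the opposite meridian, and returns Earth longitudes in the half-open range [-180, 180); the function name and the `east_0_360` flag are hypothetical.

```python
def canonical_coordinates(lat: float, lon: float,
                          east_0_360: bool = False) -> tuple[float, float]:
    """Normalise (latitude, longitude) in degrees to a canonical form."""
    lat = (lat + 90.0) % 360.0 - 90.0  # now in [-90, 270)
    if lat > 90.0:
        # Past a pole: reflect the latitude and switch to the opposite meridian.
        lat = 180.0 - lat
        lon += 180.0
    if east_0_360:
        lon %= 360.0                         # Mars-style [0, 360)
    else:
        lon = (lon + 180.0) % 360.0 - 180.0  # Earth-style [-180, 180)
    return lat, lon


print(canonical_coordinates(0, 190))
print(canonical_coordinates(0, -10, east_0_360=True))
```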

Unit conversion functions

Conversion function : 2D Cartesian to 2D Polar

Input: matrix [x, y]. Output: matrix [θ, r]. Short text: Polar conversion of x, y to a polar space centred at 0,0 in the Cartesian plane. Constraints: x, y, r are reals (float64); θ lies in the range -π < θ ≤ π (sign determined in relation to standards used in STEM applications). ShakespeareFan00 (talk) 14:30, 26 March 2025 (UTC)

The companion could also be provided. As I never did Geodetic functions, I am not sure how Lat, Long to map projection would work , but useful. ShakespeareFan00 (talk) 14:30, 26 March 2025 (UTC)[reply]
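The conversion and its companion can be sketched in Python with the standard `math` module. Plain tuples stand in for the proposed matrix type, and the function names are hypothetical; `math.atan2` gives an angle in radians that is positive counter-clockwise from the positive x axis.

```python
import math


def cartesian_to_polar(x: float, y: float) -> tuple[float, float]:
    """Return (theta, r) for the point (x, y), with theta in radians."""
    return math.atan2(y, x), math.hypot(x, y)


def polar_to_cartesian(theta: float, r: float) -> tuple[float, float]:
    """Companion conversion back to (x, y)."""
    return r * math.cos(theta), r * math.sin(theta)


theta, r = cartesian_to_polar(0.0, 2.0)
print(theta, r)
```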

Trigonometric Functions

  • sine, cosine,
    Input : float64 Angle in radians.
    Output : float64 desired trigonometric value

ShakespeareFan00 (talk) 19:40, 26 March 2025 (UTC)[reply]

Function Proposal : Decimalise angle of the form ('1:x' or '1 in x') to % (in 100) or ‰ (in 1000)

  • Suggested name: gradient_decimal.
  • Input type: Integer (the 1 is implied). Lower bound: 1. Upper bound: 1000 (for most practical situations?)
  • Output type: Real/float 64.

Proposer: ShakespeareFan00 (talk) 19:05, 28 March 2025 (UTC)[reply]
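A minimal Python sketch of this proposal, named `gradient_decimal` here. It treats "1 in x" as the ratio 1/x; the `per_mille` flag for choosing between % and ‰ is a hypothetical parameter, since the proposal does not say how the output unit is selected.

```python
def gradient_decimal(x: int, per_mille: bool = False) -> float:
    """Convert a gradient of "1 in x" to percent, or per mille if requested."""
    if not 1 <= x <= 1000:
        raise ValueError("x must be between 1 and 1000")
    scale = 1000.0 if per_mille else 100.0
    return scale / x


print(gradient_decimal(50), gradient_decimal(50, per_mille=True))
```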

Color Functions

Colorspace Conversion

x,y,Y to sRGB (Illuminant D65). Input: 3-tuple of float64. Output: 3-tuple of integers (r, g, b), where 0 ≤ r ≤ 255, 0 ≤ g ≤ 255, 0 ≤ b ≤ 255.

Convert a color specified as 3 float64 values from the x,y,Y colorspace to sRGB, or raise an "Out of Gamut" exception. ShakespeareFan00 (talk) 19:09, 7 April 2025 (UTC)
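A rough Python sketch of the conversion, going from xyY to XYZ, then to linear sRGB with the commonly published four-decimal D65 matrix, then through the sRGB transfer function. It assumes Y is scaled so that 1.0 is white. The small `tolerance` parameter is an addition for this example: with rounded matrix coefficients the D65 white point lands very slightly above 1.0, so values within the tolerance are clamped instead of rejected. The names `xyy_to_srgb` and `OutOfGamutError` are hypothetical.

```python
class OutOfGamutError(ValueError):
    """The colour cannot be represented in sRGB."""


def xyy_to_srgb(x: float, y: float, Y: float,
                tolerance: float = 1e-3) -> tuple[int, int, int]:
    if y <= 0:
        raise OutOfGamutError("y must be positive")
    # xyY -> XYZ
    X = x * Y / y
    Z = (1.0 - x - y) * Y / y
    # XYZ -> linear sRGB (D65), rounded published coefficients
    linear = (
        3.2406 * X - 1.5372 * Y - 0.4986 * Z,
        -0.9689 * X + 1.8758 * Y + 0.0415 * Z,
        0.0557 * X - 0.2040 * Y + 1.0570 * Z,
    )
    channels = []
    for value in linear:
        if value < -tolerance or value > 1.0 + tolerance:
            raise OutOfGamutError(f"linear channel value {value:.4f} out of range")
        value = min(max(value, 0.0), 1.0)
        # sRGB transfer function (gamma encoding)
        if value <= 0.0031308:
            encoded = 12.92 * value
        else:
            encoded = 1.055 * value ** (1 / 2.4) - 0.055
        channels.append(round(encoded * 255))
    return tuple(channels)


print(xyy_to_srgb(0.3127, 0.3290, 1.0))
```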

Music Functions

It would be nice to have 12 equal temperament pitch class and 12 equal temperament pitch types, as they would be useful for calculating harmonies and melodies. The pitch classes could be stored as natural numbers from 0 to 11, and represented with symbols C, C♯, D, ..., B. The pitches could be stored as integers with -1 being B3, 0 being C4, 1 being C♯4, etc. Over time, we could expand the pitch class and pitch types to other temperaments and just intonation. As I'm new to Wikifunctions and my coding skills are next to zero, this is just a suggestion to the community. (edited) CaffeineP (talk) 14:48, 9 April 2025 (UTC)[reply]

Yes… There are some notational challenges because of enharmonics as well as naming conventions varying by language/culture, so English A♯ is equivalent to German B and English B♭, for example. Ideally, I would want the (English) pitch class that is five semitones higher than G♭ to be displayed as C♭ rather than B.
Also, given some reference pitch like A4 = 440 Hz, we should be able to return the frequency in hertz of a given pitch and, conversely, the nearest pitch for a given frequency and its offset in cents (or whatever). The computation is a lot simpler than representing the result (or capturing how the result should be represented)! GrounderUK (talk) 20:08, 9 April 2025 (UTC)[reply]
  • 12-ET Pitch Class of a Pitch: Return the 12 equal temperament pitch class of a given 12 equal temperament pitch. For example, C4 returns C.
  • 12-ET Pitch based on Pitch Class: Return a 12 equal temperament pitch based on a given 12 equal temperament pitch class and a given integer. For example, C and 4 return C4.
  • Interval between 12-ET Pitch Classes in Semitones: Get the interval in semitones between two 12 equal temperament pitch classes, always assuming that the first is lower than (or the same as) the second, and the interval is less than an octave. For example, C and B return 11, while B and C return 1.
  • Interval between 12-ET Pitches in Semitones: Get the interval in semitones between two 12 equal temperament pitches. For example, C4 and B3 return -1, while C3 and B4 return 23.
  • Raise 12-ET Pitch Class by Semitones: Get a new 12 equal temperament pitch class through raising a given pitch class by the provided number of semitones. For example, raising B by 1 semitone returns C.
  • Lower 12-ET Pitch Class by Semitones: Same as above, but lower the pitch class instead of raising it.
  • Raise 12-ET Pitch by Semitones: Get a new 12 equal temperament pitch through raising a given pitch by the provided number of semitones. For example, raising B3 by 1 semitone returns C4.
  • Lower 12-ET Pitch by Semitones: Same as above, but lower the pitch instead of raising it.
  • Frequency of a 12-ET Pitch: Return a float64 frequency in Hz based on the provided 12 equal temperament pitch (and possibly a reference pitch with its frequency; if not provided, take default A4 = 440 Hz).
  • Approximate 12-ET Pitch Class based on Frequency: Return a 12 equal temperament pitch class approximately based on the provided frequency in Hz.
  • Approximate 12-ET Pitch based on Frequency: Return a 12 equal temperament pitch approximately based on the provided frequency in Hz.
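Several of the functions above reduce to integer arithmetic under the representation suggested at the start of this section (pitch classes 0 to 11, pitches as integers with 0 = C4). The Python sketch below follows that representation; the function names are hypothetical, and the sharps-only name table deliberately ignores the enharmonic spelling problem raised in the reply (it would show B, never C♭).

```python
import math

NAMES = ["C", "C♯", "D", "D♯", "E", "F", "F♯", "G", "G♯", "A", "A♯", "B"]
A4 = 9  # A4 is nine semitones above C4 (pitch 0)


def pitch_class(pitch: int) -> int:
    return pitch % 12


def pitch_from_class(pc: int, octave: int) -> int:
    return pc + 12 * (octave - 4)


def pitch_name(pitch: int) -> str:
    return f"{NAMES[pitch % 12]}{4 + pitch // 12}"


def class_interval(lower: int, upper: int) -> int:
    """Semitones from one pitch class up to the next occurrence of another."""
    return (upper - lower) % 12


def pitch_interval(first: int, second: int) -> int:
    return second - first


def raise_class(pc: int, semitones: int) -> int:
    return (pc + semitones) % 12


def frequency(pitch: int, ref_pitch: int = A4, ref_hz: float = 440.0) -> float:
    return ref_hz * 2 ** ((pitch - ref_pitch) / 12)


def nearest_pitch(hz: float, ref_pitch: int = A4,
                  ref_hz: float = 440.0) -> tuple[int, float]:
    """Return the nearest pitch and the offset from it in cents."""
    exact = ref_pitch + 12 * math.log2(hz / ref_hz)
    pitch = round(exact)
    return pitch, (exact - pitch) * 100


print(pitch_name(-1), class_interval(11, 0), frequency(A4))
```

Lowering is the same as raising by a negative number of semitones, so separate "lower" functions could simply negate the argument.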

SVG Functions

It would be nice to generate SVG (an XML-based vector image format, which is basically a long string); it could allow us to replace a lot of images on Commons or templates/tools. Here are some examples:

  • create simple forms,
  • create graphs (line graph/bar graph for population or for production, elections diagrams like File:1900Hawaii.svg, etc.),
  • create more complex visualisation like genealogical trees,
  • create coat of arms (?),
  • etc.

Cheers, VIGNERON (talk) 10:47, 23 April 2025 (UTC)[reply]

@VIGNERON: Eventually that is something we might support, but there'll be nothing any time soon. It has a number of complex security and scalability concerns, sadly. Jdforrester (WMF) (talk) 13:21, 23 April 2025 (UTC)[reply]
@Jdforrester (WMF): thanks. I talked about it for the last Corner but I wanted to leave a record here, if we have time, maybe I'll use that time to write some things to prepare (like listing templates and tools on the Wikimedia projects that generate SVG or visualisations). Cheers, VIGNERON (talk) 15:28, 23 April 2025 (UTC)[reply]
Of course! I've explicitly added a section on this here: Wikifunctions:Embedded function calls#Non-text output — hope that helps assure you that we're thinking about it. Jdforrester (WMF) (talk) 21:40, 23 April 2025 (UTC)[reply]
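To give a sense of the kind of function meant in this thread, here is a small Python sketch that builds a bar chart as an SVG string. It is purely illustrative and says nothing about how or whether Wikifunctions will support such output; the function name, parameters, and styling are all hypothetical.

```python
def bar_chart_svg(values: list[float], bar_width: int = 20,
                  gap: int = 5, height: int = 100) -> str:
    """Return a minimal SVG bar chart for non-negative values."""
    peak = max(values, default=0) or 1  # avoid dividing by zero
    width = len(values) * (bar_width + gap) + gap
    bars = []
    for index, value in enumerate(values):
        bar_height = height * value / peak
        x = gap + index * (bar_width + gap)
        # SVG's y axis points down, so a bar starts at height - bar_height.
        bars.append(
            f'<rect x="{x}" y="{height - bar_height:.1f}" '
            f'width="{bar_width}" height="{bar_height:.1f}" fill="steelblue"/>'
        )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" '
        f'height="{height}" viewBox="0 0 {width} {height}">'
        + "".join(bars)
        + "</svg>"
    )


print(bar_chart_svg([3, 7, 5]))
```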

Object / type / function functions

External function lists