Wikifunctions talk:Naming conventions

From Wikifunctions
(Redirected from Wikifunctions talk:Naming convention)
Latest comment: 1 month ago by Hogü-456 in topic NLG Test labels

Discussion elsewhere

This page's content has ongoing discussions under the Project chat sections Function naming? and Naming convention draft.

/ Autom (talk) 19:22, 27 August 2023 (UTC)Reply

Labels of language functions?

I support the naming conventions. I would like an example of how to name language functions, e.g. "Swedish noun declension -an" -> 'Swedish noun declension, indefinite plural ending with "an"'. Is it OK to use - or " in that case? So9q (talk) 10:59, 19 July 2024 (UTC)

I'd imagine that'd be fine, in that case. Added to the draft. infernostars (talk) (contribs) 20:16, 22 July 2024 (UTC)Reply

Code for cross-language labels

Pretty sure it would be und-Zmth or zxx, and not mul. The latter seems to be intended for mixed-language content. YoshiRulz (talk) 16:47, 3 January 2025 (UTC)Reply

@YoshiRulz: kinda, but no, it's not. und is for Undetermined (when you find a text but don't know what language it's in, which is obviously not the case; und-Zmth is weird: if you know it's mathematical notation, then it's not undetermined) and zxx is for No linguistic content / Not applicable. It could kinda work in some cases (like + for addition, which is borderline not linguistic: zxx-Zmth) but it's strange. mul was indeed originally for multiple languages in one string, but it's also commonly used (in Wikidata, for instance) to mean "used in multiple languages". (Note that the ISO code was initially meant for texts and content; fragments of text and content are a different case. In a long text it's quite easy to have multiple languages without being able to tag each specific part, but in a few words it's rare to have multiple languages.) Cheers, VIGNERON (talk) 14:47, 4 January 2025 (UTC)

Per-language pages

The recommendations must be split into things that are specific to English and things that are general to all languages. Capitalization, for example, is not used at all in many languages, and it is used differently in many others.

So there should be one general page, and language-specific pages. This is comparable to how translatewiki has general Localisation guidelines, or how Lexicographical Data on Wikidata has general documentation and per-language modeling guides.

If there is no objection to this, I'll split the parts that are specific to English after a few weeks.

Whenever a language becomes active, the editors should write a guide for it. Amir E. Aharoni (talk) 13:36, 11 January 2025 (UTC)Reply

Thanks. We’ll probably want a major overhaul, depending on how the proposals in Wikifunctions:Design/Naming conventions recommendations play out. There was some discussion on Telegram following on from this suggestion.
In theory, all languages are “active”. If we get more than one contributor using a language, there may be room for constructive dialogue between them, but I imagine that tailored naming standards would be agreed for only a small number of languages in the next two years (arbitrary timescale). Technically, we haven’t even agreed these for English (see Wikifunctions talk:Best practices#Wikifunctions:Naming conventions). GrounderUK (talk) 14:40, 11 January 2025 (UTC)
Exactly, in theory. For the practice, see the examples I mentioned above. Another example is that editions of Wikipedia in more than a hundred languages have a policy for article naming, which are probably similar in spirit, but different in details and length.
When at least one person actually starts doing something in a certain language, a practice emerges, and even if it's one person, it's not necessarily consistent in the beginning. If more people start doing stuff in the same language, then they should become coördinated. It doesn't mean that they will document the common practices, but sooner or later they should. Because of all that, it's good to separate the general recommendations from the language-specific recommendations and to set up the framework for adding specific languages as early as possible. Amir E. Aharoni (talk) 16:00, 11 January 2025 (UTC)Reply
I agree. I guess it won’t be too much wasted effort if we later effectively evolve the new proposals into separate global recommendations and English-language recommendations. GrounderUK (talk) 16:07, 11 January 2025 (UTC)

Proposed changes, starting with labels for Types

The momentum from WF:Design/Naming conventions recommendations died pretty quickly and nothing has happened since, so I'd like to restart the conversation with this blunderbuss of ideas.
Starting with:


Casing: Types' labels in English should be in "Sentence case". This is already done, just not documented. YoshiRulz (talk) 21:33, 1 December 2025 (UTC)Reply

Support Support as proposer. YoshiRulz (talk) 21:33, 1 December 2025 (UTC)Reply
I think title case could make sense for these to stand out as an alternative. Jdforrester (WMF) (talk) 14:15, 2 December 2025 (UTC)Reply
One exception is float64 (Z20838).
We also treat Wikidata as a proper noun, so it is not immediately obvious that Wikidata reference (Z6008) is not the target of Get Wikidata reference from enum instance (Z6895), for example, where the targeted “Wikidata entity reference” is not a Type, but a union of Wikidata item reference (Z6091) and Wikidata lexeme reference (Z6095). Consistent use of hyphens within multi-word (English) Type labels is an alternative that is worth considering. This would allow us to distinguish between “Wikidata-reference” (the Type) and “Wikidata reference” (the conceptual union of “Wikidata-item-reference” and “Wikidata-lexeme-reference”, both being “Wikidata entity references”). I would retain the initial capital where normal English does not require it, allowing “Integer” and “Natural-number” to be differentiated from ordinary use of “integer” and “natural number”. (See, for example, is rational number an integer (Z19806), which might be labelled “is Rational-number an integer”.) GrounderUK (talk) 21:29, 2 December 2025 (UTC)
My gut reaction to the hyphens is that they feel out-of-place in English text... but of course that's not true, and anyway this is about labels, which are like code.
Objectively speaking, "Snake-case" for Types would resolve most of the ambiguities I can imagine in labels, even more than "Title Case" would. YoshiRulz (talk) 16:35, 3 December 2025 (UTC)Reply

Type names in Function labels

Casing: When a Function's label includes the name of a Type, it should at least match the casing of that Type's label, if not match it verbatim.
(In English, this would mean Functions labelled like "concatenate many Strings" or "sum of Naturals"). YoshiRulz (talk) 21:34, 1 December 2025 (UTC)Reply

Support Support as proposer. YoshiRulz (talk) 21:34, 1 December 2025 (UTC)Reply

Functions returning Booleans

Functions: The labels for 'predicate' Functions (those that return a Z40, excluding those which have only Z40 inputs) should be phrased as a question, and include the '?' or equivalent in languages that use one.
For example, "is String empty?" or "are Integers equal?", maybe "equal Integers?".
(Under my proposal to copy the Function's label for Implementations, the '?' can be omitted there.) YoshiRulz (talk) 21:34, 1 December 2025 (UTC)Reply

Support Support as proposer. YoshiRulz (talk) 21:34, 1 December 2025 (UTC)Reply
I don't think this is a good idea; it adds a lot of verbiage to a limited length, and the question mark in particular is confusing. Also, a lot of people will try to normalise their language's labels against the English ones, and this may be even worse in them. We have "string equality" etc. already; turning that to "are strings equal?" isn't more findable. Jdforrester (WMF) (talk) 14:19, 2 December 2025 (UTC)Reply
I think long function names with question marks at the end are a speciality of Wikifunctions because of the internal use of IDs. In spreadsheets, for example, function names are short and translated into different languages. If people want to add a question mark, they can do it. I think that, as this is a project of individuals, naming conventions are only recommendations and will not be enforced in a strict way. At least this is how I have experienced it so far and how I wish it to be in the future. If it is not the current function name, then it can be added as an alias. Hogü-456 (talk) 22:35, 5 December 2025 (UTC)

Arithmetic Functions

Functions: The labels for arithmetic and other mathematical Functions should include the name of the numeric Type(s) they operate on. And prefer e.g. "add two Natural numbers" over "add two numbers (Naturals)".
I believe this would be applicable across languages. YoshiRulz (talk) 21:34, 1 December 2025 (UTC)Reply

Support Support as proposer. YoshiRulz (talk) 21:34, 1 December 2025 (UTC)Reply
I think using any parentheticals is extremely bad even for unusual edge cases; to require them for everything is actively harmful. Your given example also breaks your proposal above (the type is "Natural number", not "Natural"). Jdforrester (WMF) (talk) 14:20, 2 December 2025 (UTC)
Edited for clarity. I intended the opposite. YoshiRulz (talk) 15:10, 2 December 2025 (UTC)Reply
Aha, thanks. In that case, yes, this looks like a good idea. Jdforrester (WMF) (talk) 15:43, 3 December 2025 (UTC)

Brevity in Function labels

Functions: In choosing an English label, constructions like "calculate the ..." or "get the ... of ..." are verbose and should be avoided unless that would introduce ambiguity:
"calculate the distance between the Earth and Sun on the given day" could be shortened to "distance between Earth and Sun on the given day"; "is integral" couldn't be shortened to "integral" because there are other operations by that name; "get unit of Wikidata quantity" could be shortened to "unit of Wikidata quantity", but not to "unit" or "quantity unit" since those could be Type names; and "compute SHA-512" couldn't be shortened to "SHA-512" if there was a Type by that name.
I'm not sure whether this principle is applicable across languages, but the goals are firstly to avoid ambiguity and secondly to avoid verbosity. YoshiRulz (talk) 21:34, 1 December 2025 (UTC)Reply

This builds on earlier discussions at WF:Project chat/Archive/2023/08#Naming convention draft and WF:Project chat/Archive/2023/08#Function naming?.
Support Support as proposer. YoshiRulz (talk) 21:34, 1 December 2025 (UTC)Reply

Out-of-order marker

Functions: Standardise on a marker that can be added at the start of a Function's label to indicate that it shouldn't be used. Preferably a single ASCII character. There could also be a separate character to indicate 'deprecated', as opposed to just broken for technical reasons.
I propose '!' for broken, and '#' for working but deprecated. YoshiRulz (talk) 21:35, 1 December 2025 (UTC)Reply

Support Support as proposer. YoshiRulz (talk) 21:35, 1 December 2025 (UTC)Reply
I'd rather we didn't need to have a standard for this. :-( Jdforrester (WMF) (talk) 14:21, 2 December 2025 (UTC)Reply
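
For illustration only, here is a minimal Python sketch of how a tool could read the proposed markers. The '!' and '#' characters come from the proposal above and are not an adopted convention; the names `Status`, `MARKERS` and `split_marker` are hypothetical.

```python
from enum import Enum


class Status(Enum):
    OK = "ok"
    BROKEN = "broken"
    DEPRECATED = "deprecated"


# Marker characters as proposed in this thread; not an adopted convention.
MARKERS = {"!": Status.BROKEN, "#": Status.DEPRECATED}


def split_marker(label: str) -> tuple[Status, str]:
    """Return the status implied by a leading marker and the label without it."""
    if label and label[0] in MARKERS:
        return MARKERS[label[0]], label[1:].lstrip()
    return Status.OK, label
```

A single leading character keeps the cost against the label length limit to one character, which seems to be the point of preferring ASCII markers.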

Implementation labels

Implementations: In English, these should be labelled "<Function's label>, <language>", where the language of compositions is "composition". (Because the length limit on Functions' and Implementations' labels is the same, it's often necessary to abbreviate the part taken from the Function. I've been doing this and it hasn't been too hard, but that's only an anecdote and only for English. Also, "JavaScript" can be abbreviated to "JS" as a first step before you start dissecting the Function part.)
Where there are multiple Implementations in one language, they should be disambiguated with "<Function's label>, <algorithm> <language>" e.g. "replicate object n times, map composition", "first word of String, RegEx JS".
This is just a refinement of the current documented conventions. I think this pattern can be adapted to other (human) languages. YoshiRulz (talk) 21:35, 1 December 2025 (UTC)Reply

Support Support as proposer. The current convention only says that the language should be mentioned in the label. I've seen it in parentheses, which is wasting a character compared to a comma. YoshiRulz (talk) 21:35, 1 December 2025 (UTC)Reply
The interface already shows for what they're an implementation, but sure, this avoids clashes where there's space. Jdforrester (WMF) (talk) 14:21, 2 December 2025 (UTC)Reply
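
A rough Python sketch of the proposed pattern "<Function's label>, <algorithm> <language>", for illustration. The helper name, the `ABBREVIATIONS` table and the default `max_length` of 50 are assumptions; the actual length limit is not stated in this thread, and the final truncation step is much cruder than abbreviating by hand.

```python
# Abbreviations tried before touching the Function part.
# Only "JS" comes from the thread; the table itself is hypothetical.
ABBREVIATIONS = {"JavaScript": "JS"}


def implementation_label(function_label: str, language: str,
                         algorithm: str = "", max_length: int = 50) -> str:
    """Build "<Function's label>, <algorithm> <language>" within max_length.

    max_length is a placeholder value, not the real Wikifunctions limit.
    """
    def build(lang: str, func: str) -> str:
        suffix = f"{algorithm} {lang}".strip()
        return f"{func}, {suffix}"

    label = build(language, function_label)
    if len(label) <= max_length:
        return label
    # First resort: abbreviate the language name.
    short_lang = ABBREVIATIONS.get(language, language)
    label = build(short_lang, function_label)
    if len(label) <= max_length:
        return label
    # Last resort: truncate the Function part to the remaining room.
    room = max_length - len(build(short_lang, ""))
    return build(short_lang, function_label[:max(room, 0)].rstrip())
```

With the defaults, `implementation_label("replicate object n times", "composition", "map")` gives "replicate object n times, map composition", matching the example in the proposal.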

Range notation

Integer ranges/domains: These should be represented as either "[start]..=[endIncl]" or "[start]..<[endExcl]", e.g. "1..=26" or "0..<N" ("N" referring to the length of a list input). Other numeral systems can be used instead if there's consensus within a language. YoshiRulz (talk) 21:35, 1 December 2025 (UTC)Reply

Support Support as proposer. YoshiRulz (talk) 21:35, 1 December 2025 (UTC)Reply
Can you give an example of where we would apply this naming rule? It's a very complicated, and not user-friendly label, but maybe it's fine if the usages are going to be very rare. Jdforrester (WMF) (talk) 14:22, 2 December 2025 (UTC)Reply
index of first listing (1...N) – note limitation (Z13708): returns the position (1…n) of the first matching element, except where the list is untyped (Z1) and the match-object has custom conversion (its type has a converter to code)
generate untyped list of length (Z21821): calls the given function for each index 1..=n, appending the results into a list (zero-indexed version at Z24387)
take sub-sequence of list (Z26556): clamps indices to valid range (1..=N); returns empty if end < start
Since there can be 0..<N and 1..=N versions of a Function, this information really needs to go in the label. YoshiRulz (talk) 15:23, 2 December 2025 (UTC)Reply
@YoshiRulz: Right. Wouldn't 0..N be shorter, and 0..<N could be used in the rarer case? Jdforrester (WMF) (talk) 15:45, 3 December 2025 (UTC)Reply
I hadn't considered that closed ranges will probably be much more common than half-open.
As long as it's applied consistently, bare .. implying closed ranges would be fine too (that's the syntax used by Kotlin, for example). YoshiRulz (talk) 16:04, 3 December 2025 (UTC)Reply
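
To make the three spellings concrete, here is a small Python sketch that converts them to inclusive bounds. It assumes the reading discussed above ("..=" and bare ".." are closed, "..<" is half-open) and that "N" stands for a list length supplied by the caller; the function name and regular expression are hypothetical.

```python
import re
from typing import Optional

# "..=" and bare ".." are treated as closed, "..<" as half-open.
# This follows the discussion above; it is not an adopted convention.
RANGE_PATTERN = re.compile(r"^(\w+)\.\.(=|<)?(\w+)$")


def inclusive_bounds(notation: str, n: Optional[int] = None) -> tuple[int, int]:
    """Convert e.g. "1..=26" or "0..<N" to inclusive (start, end) integers."""
    match = RANGE_PATTERN.match(notation)
    if match is None:
        raise ValueError(f"not a range: {notation!r}")
    start_text, operator, end_text = match.groups()

    def resolve(text: str) -> int:
        # "N" (or "n") refers to the length of a list input.
        if text in ("N", "n"):
            if n is None:
                raise ValueError("n is required to resolve 'N'")
            return n
        return int(text)

    start, end = resolve(start_text), resolve(end_text)
    # A half-open range excludes its end, so step back by one.
    return (start, end - 1) if operator == "<" else (start, end)
```

For a list of length 5, "0..<N" and "1..N" then describe the zero-indexed positions 0 to 4 and the one-indexed positions 1 to 5 respectively.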

Phraseology of lists, part 1

Lists: The whole untyped/typed thing is confusing even though I know how it works. I think this stems from the system being dynamically-typed, so the type annotations on Functions can't be trusted.
My proposals are 1. make functions preserve list typing wherever possible, and improve documentation,
but more relevantly 2. change the phraseology (for English) to "Z1-typed list" and something like "Any-typed list". YoshiRulz (talk) 21:35, 1 December 2025 (UTC)Reply

Support Support as proposer. YoshiRulz (talk) 21:35, 1 December 2025 (UTC)Reply
Any rule that uses ZIDs in user-facing content is a bad rule. "Untyped" is also shorter than "Z1-typed", but "Any-typed" is fine. Why not just go with that? Jdforrester (WMF) (talk) 14:24, 2 December 2025 (UTC)
"untyped" is really the term I take issue with. The representation still includes a type indicator; it's just Z1.
I accept that using the ZID isn't great. What about "Object-typed" (swapping in Z1's label)? YoshiRulz (talk) 15:17, 2 December 2025 (UTC)Reply
Z1 isn't a real Type; it is the fallback concept (as we don't have Type inheritance). "Object-typed" is quite long compared to "any-typed". Also, in general we want to eliminate the use of untyped lists (but we'll need Unions before that's possible for some cases). Jdforrester (WMF) (talk) 15:46, 3 December 2025 (UTC)
That's interesting. Do you have a vision for eliminating untyped lists? How will you indicate contravariance?
For context, I've been interpreting a Z881(Z1) type annotation as ReadOnlyList<out Object> (syntax more familiar to me, just meaning an ordered collection that can contain values of unknown and disparate types).
With union types it's obvious how you could express a list of, say, QIDs or PIDs Z881(Either(Z6091, Z6092)), but a lot of list operations are necessarily generic across lists of any type, hence Z1 as a top type. YoshiRulz (talk) 16:19, 3 December 2025 (UTC)Reply
Yes, there’s a semantic deficit here. We have no way of differentiating between a properly-typed list where the type is unknown/irrelevant and a truly heterogeneous list. Thus we have no way of requiring that a list that happens to be homogeneous should nevertheless be returned as “untyped” (hence the need to untype at each recursion). GrounderUK (talk) 17:44, 3 December 2025 (UTC)Reply

Phraseology of lists, part 2

Lists: Function labels should prefer the term "list" (in English; in general, follow Z881's label). Aliases (and/or labels of inputs) can be used to indicate that it's semantically an "ordered set", "vector"/"matrix", or whatever else. YoshiRulz (talk) 21:35, 1 December 2025 (UTC)Reply

Support Support as proposer. I'd add an explicit exception for where Z881 is being used in lieu of another type, like Z882 which I believe is still not working; such a function could be labelled e.g. "filter Pairs by second element". YoshiRulz (talk) 21:35, 1 December 2025 (UTC)
@YoshiRulz: Can you link to the Phabricator task for Pairs not working for some use case? I'm not aware of such issues. Jdforrester (WMF) (talk) 14:25, 2 December 2025 (UTC)Reply
Talk:Z882#Problems making use of this Type. I searched Phabricator for Z882 and Typed pair and didn't see anything. YoshiRulz (talk) 16:32, 2 December 2025 (UTC)
@YoshiRulz: Thanks for collating. I'll ping the team. Jdforrester (WMF) (talk) 15:48, 3 December 2025 (UTC)Reply
Please see phab:T384104. Since this was created, support for adding local keys has become available, but it is still necessary to specify each object’s type and there is no validation that both the paired objects have types that are consistent with the Typed pair’s specification. Please see this simple example.
Issues like this have not been reported because, according to Wikifunctions:Status#Type creation is locked-down to staff, “Generic types and generic functions require a bit of development and bug-fixing, and are not ready yet.” I believe the most recent announcement on the subject is Wikifunctions:Status updates/2024-07-10#Typed lists now open beyond Booleans and Strings, which implicitly excludes Typed pair (and Typed map). This is also the reason why Z882 has no test cases. GrounderUK (talk) 20:19, 2 December 2025 (UTC)Reply
@GrounderUK: I don't think that the nice-to-edit UX work is the same level as the concerns @YoshiRulz has highlighted, but yes, that also needs fixing. Type creation being locked down is not the same thing as using existing Types. Assuming that we intended you to never use them without asking seems like what led to this. Jdforrester (WMF) (talk) 15:49, 3 December 2025 (UTC)Reply
I think having a rule for temporary usage of lists where we're waiting for wider type support is, like the advice to use (!) above, a bit sad to have to do. Jdforrester (WMF) (talk) 14:25, 2 December 2025 (UTC)Reply

Functions with multiple inputs of the same Type

Same-typed inputs: Where these are transposable (the function is commutative), they should be labelled "1st <type>", "2nd <type>" (i.e. with ordinal numbers, and preferably using digits rather than words).
Where the order is semantic (non-commutative functions), I'm not sure it's possible to have a one-size-fits-all guideline. We should at least encourage the use of precise terms like "minuend"/"subtrahend" where they exist. YoshiRulz (talk) 21:36, 1 December 2025 (UTC)Reply

Support Support as proposer. YoshiRulz (talk) 21:36, 1 December 2025 (UTC)Reply
I think this could be more simply worded as something like:

Functions' inputs should be briefly labelled in a way that easily distinguishes them. This is to help users understand what to do when using the Function. It is especially important where there are multiple inputs of the same Type. If the order of the inputs matters, they should have semantic labels like "numerator" and "denominator", or "subject", "action", and "object". If the order does not matter, they can have ordinal labels ("1st Natural number", "2nd Natural number", …) if no common semantic labels exist.

i.e., focus on the semantic-label-first rule, and push the ordinality question to the end as a back-up rule. What do you think? Jdforrester (WMF) (talk) 14:33, 2 December 2025 (UTC)Reply
I like how you've phrased it. YoshiRulz (talk) 14:57, 2 December 2025 (UTC)Reply
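
As a small illustration of the back-up rule (ordinal labels with digits, for inputs whose order does not matter), here is a Python sketch. The helper names are hypothetical and the suffix logic is English-only.

```python
def ordinal(number: int) -> str:
    """Return an English ordinal using digits, e.g. 1 -> "1st", 12 -> "12th"."""
    if 10 <= number % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(number % 10, "th")
    return f"{number}{suffix}"


def ordinal_input_labels(type_label: str, count: int) -> list[str]:
    """Label `count` interchangeable inputs of the same Type."""
    return [f"{ordinal(i)} {type_label}" for i in range(1, count + 1)]
```

For example, `ordinal_input_labels("Natural number", 2)` gives "1st Natural number" and "2nd Natural number"; semantic labels such as "numerator" and "denominator" would still take priority where they exist.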

Preventative Test disambiguation

Tests: It seems Tests' labels must be globally unique, for some reason. We could recommend that their labels include a word specific to the Function, especially for NLG Functions' Tests which I suspect are more at risk of label collision. YoshiRulz (talk) 21:37, 1 December 2025 (UTC)Reply

Support Support as proposer. YoshiRulz (talk) 21:37, 1 December 2025 (UTC)Reply
Historically we've mostly put the output as the label, sometimes with the language code if the tested output is a monolingual string, or the inputs and outputs if there's space (e.g. "1+2 => 3", "[1,2,3,4] => 2.5", "population of Chicago in 2024", "[en] An archæopteryx is a genus of dinosaurs."). Trying to push in also the Function's name, especially when many of the NLG Functions' names are very similar, seems like it's going to make things particularly hard. Jdforrester (WMF) (talk) 14:36, 2 December 2025 (UTC)Reply
Rather than repeating the function's name, I was thinking of just a keyword or the name of the input/return Type, like I've done for prepend "pfx:" to a Z11 (Z30037). YoshiRulz (talk) 15:04, 2 December 2025 (UTC)Reply

NLG Test labels

Tests for NLG Functions: Their labels should be prefixed with the output language/dialect e.g. "[en] 2, good => second-best".
I guess these should be moved to mul too? YoshiRulz (talk) 21:37, 1 December 2025 (UTC)Reply

Support Support as proposer. YoshiRulz (talk) 21:37, 1 December 2025 (UTC)Reply
See above; though I like this pattern for functions that return a monolingual string, I don't think these should necessarily go in mul given wider usage patterns. Jdforrester (WMF) (talk) 14:38, 2 December 2025 (UTC)Reply
I like seeing a label even when it has not been translated yet. Wikidata has a language fallback, with a small note showing the language of the label. It is not always possible to describe all the inputs and the resulting output in the label, so it can be necessary to look into the test case to understand it. This is fine for me. Hogü-456 (talk) 22:48, 5 December 2025 (UTC)
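
A minimal Python sketch of the label shapes mentioned in this section and the previous one ("[en] 2, good => second-best", "1+2 => 3"). The function name and parameters are hypothetical, and nothing here checks the label length limit.

```python
from typing import Optional, Sequence


def format_test_label(inputs: Sequence[str], output: str,
                      language: Optional[str] = None) -> str:
    """Build "[lang] in1, in2 => out"; the language prefix is optional."""
    body = f"{', '.join(inputs)} => {output}"
    return f"[{language}] {body}" if language else body
```

Calling `format_test_label(["2", "good"], "second-best", "en")` reproduces the example from the proposal; omitting the language gives the plain "inputs => output" form.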

Void is meaningless

Void: Change the English label for Z24 to "null tuple" or "nullary tuple", or rename Z23 to "Never" and use "nothing" for Z24.
(Z21 is fine as "Unit"; it's a Type with exactly 1 possible realisation.) YoshiRulz (talk) 21:38, 1 December 2025 (UTC)Reply

Support Support as proposer. YoshiRulz (talk) 21:38, 1 December 2025 (UTC)Reply
I'm not convinced. But feel free to at least list them as aliases. 99of9 (talk) 22:37, 1 December 2025 (UTC)Reply
Void is the formal term; it's far from meaningless. As @99of9 says, if you are troubled by the term, you could add it as an alias, but I think this deserves a wider discussion specific to that Type and not just as part of wider long-term naming rules discussions. Please also don't confuse things by merging in Z23/Nothing's rôle. Jdforrester (WMF) (talk) 14:39, 2 December 2025 (UTC)Reply
Clearly I didn't put enough time into this proposal, there's no explanation behind the new names nor for removing the old one.
I don't actually believe "void" is meaningless; actually, the fact that it has multiple meanings is IMO one of the problems with it. And the other big one is the contexts in which it appears, which includes any time an execution fails. Why is my function call "void", and is that "null and void" or "an empty void"? (In a way, both are true, since my call timed out and returned nothing.) I was already familiar with void in JS and the void return type in other languages so I had a head start in working out what Wikifunctions' void means.
Some programming languages like Rust represent their unit type's value as an empty tuple (), and I think that's the correct analogy because you're not getting nothing from the function, you're getting an indication of success, an empty box. Its contents are a "void", but it is not one. (And you're not being given a Nothing/Never because by definition there are no values of that type.) On Wikifunctions, a timeout means you get an empty box for your result plus a timeout error. YoshiRulz (talk) 17:05, 3 December 2025 (UTC)Reply
Isn’t it just that you always get an Evaluation result (Z22) but sometimes the object associated with the "Z22K1" Key reference (Z39) is literally a Reference to void (Z24) (even if you just echo the empty Unit object)? GrounderUK (talk) 18:01, 3 December 2025 (UTC)
Sure, but I was referring to the UI. With this call for example: under "Result" it just says "void", then you can click "Details" and see that the error's Type is Z576 in this case.
I don't know why a Function would intentionally return Z24 except as a marker that an item should be dropped from a list (Wolfram Language Nothing, JS undefined IIRC), but yes I would expect to see Z24's label under "Result" then. YoshiRulz (talk) 18:52, 3 December 2025 (UTC)Reply
@YoshiRulz: I think it's totally fair to say that the case when we get a response that's a Z24 with an error, instead of showing "void" we could make the software show something different — but that's not really a naming convention, and more of a feature request. Jdforrester (WMF) (talk) 15:26, 5 December 2025 (UTC)Reply