Wikifunctions:Project chat/Archive/2025/02

From Wikifunctions

Unable to serialize Natural Numbers?

It seems that natural numbers (and quite a few other objects) can't be serialized into the Python/JavaScript runtime. Am I missing something?

WikipediaNeko (talk) 20:57, 4 February 2025 (UTC)

I’m guessing the function does not specify a type for the list it returns (other than “Z1”). The error text may be misleading, but it is generally conversion of integer, number or BigInt values back into Wikifunctions objects that fails, because the appropriate Z64 is not identified (or something). Officially, type conversion is supported only for properly typed lists, but yours may be a context where Z17895 may be of use. Further details are available! GrounderUK (talk) 21:35, 4 February 2025 (UTC)
This is the function I'm currently working on: https://www.wikifunctions.org/view/en/Z22180
This is my first experience with WF, I can barely understand what's going on :P. Thanks for the help though! WikipediaNeko (talk) 22:08, 4 February 2025 (UTC)
Looks like my guess was right. You can either change your test so that the list’s type is Z1 or you can have a function that only works for Natural numbers.
If you want a function that works for any type of list, I think your best option is to create a new function like Z18479 with an implementation like Z18496. This would call your existing function after converting a properly typed list to a Z1-typed list.
(This is a limitation that is really difficult to understand or explain, so please don’t let it put you off!) GrounderUK (talk) 23:22, 4 February 2025 (UTC)
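As a rough illustration of the workaround described above (not the actual implementation of Z18496), the sketch below relaxes a typed list to a Z1-typed list. It assumes the canonical JSON form in which the first element of a list names its item type; the key names used for the natural number objects are likewise an assumption, and `to_untyped_list` is a hypothetical helper name.

```python
def to_untyped_list(canonical_list):
    """Return a copy of a canonical list whose item type is relaxed to Z1.

    Assumption: element 0 of a canonical list is the item type (a ZID string)
    and the remaining elements are the items themselves.
    """
    if not canonical_list:
        raise ValueError("a canonical list must start with its item type")
    return ["Z1", *canonical_list[1:]]


# Hypothetical sample: a list typed as Natural number (Z13518).
TYPED = [
    "Z13518",
    {"Z1K1": "Z13518", "Z13518K1": "3"},
    {"Z1K1": "Z13518", "Z13518K1": "5"},
]
UNTYPED = to_untyped_list(TYPED)
```

The items themselves are left untouched; only the declared item type changes, which is what lets a function that expects a Z1-typed list accept them.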
To me it seems like Untyped Lists should be a first-class citizen. But ah well, WF is still in beta anyways. Again, thanks for all the help! WikipediaNeko (talk) 00:54, 5 February 2025 (UTC)
Oh, also - How come Z22182 is able to correctly process the natural numbers but not the others? WikipediaNeko (talk) 01:01, 5 February 2025 (UTC)
Because that composition relies on functions where type converting back is not needed. Feeglgeef (talk) 01:04, 5 February 2025 (UTC)
This section was archived on a request by: WikipediaNeko (talk) 02:23, 5 February 2025 (UTC)

Kleenean type available

We published the Kleenean type (still marked for Testing for now, just in case I got something wrong), based on the Type proposal for Kleenean. I wanted to kick off the year with something simple, before we get back into the business of regularly creating more types. If anything seems wrong, please let us know! Otherwise I will remove the testing mark tomorrow. --DVrandecic (WMF) (talk) 17:46, 3 February 2025 (UTC)

I removed the (Testing) tag, as it seems to work. Thanks everyone for testing, and thanks for reporting an issue with one of the converters, @Feeglgeef! --DVrandecic (WMF) (talk) 15:37, 4 February 2025 (UTC)
This section was archived on a request by: Mohanad (talk) 19:48, 12 February 2025 (UTC)

Creating new pages

Hi, if I may, I have a small note. In my opinion, when creating a new page (even a translation paragraph), it is better to fill in the edit summary, even with a single character. This prevents the system from quoting part of the page content in the summary, which distorts the page history and recent changes. Best regards --Mohanad (talk) 05:41, 5 February 2025 (UTC)

This section was archived on a request by: Mohanad (talk) 19:48, 12 February 2025 (UTC)

How to test an Implementation

Yet another newbie question - How do I test an implementation I'm writing for a function that has no connected implementations? WikipediaNeko (talk) 22:54, 5 February 2025 (UTC)

You need a connected test. This can be run while you are editing the first implementation. You can have more than one, of course. GrounderUK (talk) 23:10, 5 February 2025 (UTC)
Even when neither the implementation nor the test is connected, you can see the success/failure in the table at the bottom of the function page itself. But only after you've saved the implementation, so it's a little frustrating. I'd encourage you to apply for functioneer if you plan on continuing to write functions. 99of9 (talk) 23:24, 5 February 2025 (UTC)
But assuming Z22227 is the function you're talking about, it looks like there is already a problem. What value will be sent to the function that is chosen to run? There is no input argument for that. Also, you've chosen a very challenging start, because functions of functions are not yet properly supported. Perhaps try with some much simpler functions first - there are plenty to try. For example, Kleenean xor is not written yet. Or there are plenty of floating point functions that do not yet have tests. --99of9 (talk) 23:28, 5 February 2025 (UTC)
Is there any existing way to run two different functions sequentially in composition? If so I definitely missed it. WikipediaNeko (talk) 23:34, 5 February 2025 (UTC)
Z13351 might be what you're after. 99of9 (talk) 23:38, 5 February 2025 (UTC)
Yes, it is! Thanks. WikipediaNeko (talk) 23:41, 5 February 2025 (UTC)
This section was archived on a request by: Mohanad (talk) 19:48, 12 February 2025 (UTC)

Wikifunctions & Abstract Wikipedia Newsletter #188 is out: Invitation to the Natural Language Generation Special Interest Group

There is a new update for Abstract Wikipedia and Wikifunctions. Please, come and read it!

In this issue, we present a proposal to restructure our Natural Language Generation Special Interest Group (NLG SIG) meeting, we announce the creation of a new type, and we take a look at the latest software developments.

Want to catch up with the previous updates? Check our archive!

Enjoy the reading! -- User:Sannita (WMF) (talk) 17:17, 6 February 2025 (UTC)

@99of9 I saw that you mentioned the algebraic formula implementation of "Kleenean and" in the Function of the week section. Just a random question: is there a list of algebraic equation equivalents of three-valued logical functions, and are there equivalents in Boolean algebra as well? Xeroctic (talk) 12:03, 8 February 2025 (UTC)
@Xeroctic I got that one from the three-valued logic Wikipedia page. Sorry, I don't know if they're neatly listed somewhere. It seems like something that could be done computationally without too much difficulty. 99of9 (talk) 12:10, 8 February 2025 (UTC)
This section was archived on a request by: DVrandecic (WMF) (talk) 14:13, 13 February 2025 (UTC)

Byte type

We implemented the proposal to fix up the Byte type. We removed the markers from the type to invite usage, and now more functions can be created for the Z80/Byte type. We are inviting you to suggest display and read functions for this type, too. If you find any issues with the type, please let us know. Enjoy! -- DVrandecic (WMF) (talk) 15:06, 13 February 2025 (UTC)

This section was archived on a request by: DVrandecic (WMF) (talk) 09:55, 26 February 2025 (UTC)

Wikifunctions & Abstract Wikipedia Newsletter #189 is out: Restricting the World, redux

There is a new update for Abstract Wikipedia and Wikifunctions. Please, come and read it!

In this issue, we have an essay from Denny, we discuss the fix to the Byte Type, and we take a look at the latest software developments.

Want to catch up with the previous updates? Check our archive!

Enjoy the reading! -- User:Sannita (WMF) (talk) 11:19, 14 February 2025 (UTC)

This section was archived on a request by: DVrandecic (WMF) (talk) 09:54, 26 February 2025 (UTC)

Wikifunctions & Abstract Wikipedia Newsletter #190 is out: A proposal for types per language and part of speech

There is a new update for Abstract Wikipedia and Wikifunctions. Please, come and read it!

In this issue, we present a proposal for types to represent a part of speech in a language, we present some events that we have taken part in (or are taking part in), and we take a look at the latest software developments.

Want to catch up with the previous updates? Check our archive!

Enjoy the reading! -- User:Sannita (WMF) (talk) 15:50, 20 February 2025 (UTC)

This section was archived on a request by: DVrandecic (WMF) (talk) 09:55, 26 February 2025 (UTC)

New Codex table

Hi, I don't know if it's just me or everyone, but there's a display problem with the word "Passed": in some table cells it wraps onto new lines letter by letter. You can see that in the test table of this function. I think a CSS rule is responsible (it shows up in the browser dev tools), maybe overflow-wrap: anywhere;. The issue depends on column width: in narrow columns, the word "Passed" appears like that. --Mohanad (talk) 20:04, 11 February 2025 (UTC)

Yes, this sometimes happens for me too. 99of9 (talk) 09:42, 12 February 2025 (UTC)
I cannot reproduce the issue on the provided link in Firefox, Chrome or Safari. Would you mind telling me which browser and dimensions (screen size) you are using when you see this? DSmit-WMF (talk) 10:26, 12 February 2025 (UTC)
Hi @DSmit-WMF see this function --Mohanad (talk) 10:41, 12 February 2025 (UTC)
Or this --Mohanad (talk) 10:45, 12 February 2025 (UTC)
I am not seeing it locally, so my guess is that it is caused by this reverted mixin in Codex.
I asked internally. I will get back to you! DSmit-WMF (talk) 11:16, 12 February 2025 (UTC)
I'm seeing it right now on Chrome in a full screen browser on a 1920x1080 display. 99of9 (talk) 11:30, 12 February 2025 (UTC)
@99of9 I tested it on different desktop & mobile browsers and got the same issue. It also affects the words "composition" & "javascript" in the Implementations table. --Mohanad (talk) 11:49, 12 February 2025 (UTC)
Interesting. I don't get that behaviour in the implementation table. 99of9 (talk) 11:59, 12 February 2025 (UTC)
Just confirmed: This week's Codex release will fix the table behaviour. DSmit-WMF (talk) 14:39, 12 February 2025 (UTC)
@Mohanad@99of9 Did the issues mentioned here get resolved for you? --DVrandecic (WMF) (talk) 09:52, 26 February 2025 (UTC)
I can't reproduce it at the link I previously shared, and haven't seen the issue recently. But then again I haven't contributed much this week, so I may not be the best judge. --99of9 (talk) 11:37, 26 February 2025 (UTC)
@DVrandecic (WMF) It was actually resolved the same day DSmit-WMF commented, I left discussion open in case she had any final comments before closing it --Mohanad (talk) 19:22, 26 February 2025 (UTC)
@Mohanad: Thanks! I'm closing it then! Cheers --DVrandecic (WMF) (talk) 12:07, 27 February 2025 (UTC)
This section was archived on a request by: DVrandecic (WMF) (talk) 12:07, 27 February 2025 (UTC)

More detailed request

Moved from Administrators' noticeboard
Hi, I just noticed that the template here doesn't have any prominent information indicating exactly what permission is requested, which makes the sub-pages of the archive page here (all requests together) a bit less clear.
Maybe a line like this one used on mediawiki wiki will be helpful. --Mohanad (talk) 10:37, 3 February 2025 (UTC)

Natural language functions

Hello everyone,

We’re working on functions that return natural-language outputs, and we should think about how to handle both these outputs and their inputs consistently. For example:

  • How should we represent inputs like Wikidata’s “grammatical features” (gender, number, etc.) across different languages?
  • How do we decide whether a function’s output should be a simple string or monolingual text?
  • How can we create functions that work across languages and sentence structures, or that can be combined with other functions to do so?

What are your thoughts? Any suggestions or examples? GrounderUK (talk) 13:55, 1 February 2025 (UTC)

Thanks. These are important questions. I don't have a good answer for now, but I did see some strange and inconsistent things, so we need some clarity. For example, "[gender] is a [country] [professional]", English Lexemes (Z21765) uses the datatype Sign (Z16659) as the input for gender (it works in this case but it still feels wrong, and it won't work for most languages and grammatical features); there is also Conjugate regular -er verb (Z21617), which uses arbitrary Natural number (Z13518) values (see the two proposals WF:FRENCHSUBJ and WF:FRENCHTENSE by MolecularPilot).
My two cents to help move forward:
  • grammatical features can be strange (natural languages are full of exceptions and unexpected things; in Breton, for example, prepositions are conjugated like verbs), so whatever we choose needs to be flexible enough.
    • At the same time, I guess we don't want to recreate a list/datatype for each language, as most behave similarly and it would be redundant (6000+ languages, with most of the gendered ones having masculine/feminine; see https://wals.info/feature/30A for instance).
  • Some of these functions will rely on Wikidata, so Wikifunctions should understand and accept the grammatical features used on Lexemes (and there are a lot: 989 right now, see https://qlever.cs.uni-freiburg.de/wikidata/7n3eYj ).
  • Right now, it seems most natural language functions return a simple string (String (Z6)). That's not wrong, but a monolingual text (Monolingual text (Z11)) would be cleaner and clearer. Removing the language tag afterwards is super easy (we already have string of monolingual text (Z14396)), but adding it could be trickier: when the label of a function says "English", is it American English, British English, both, or neither? A monolingual text would be explicit.
Cheers, VIGNERON (talk) 14:27, 1 February 2025 (UTC)
The 989 grammatical features are probably all useful, but the first few I looked at (e.g. "singular") would just be enumeration values for a type (in this case "grammatical number", of which there are currently 24 possible values). I think we should have a single type for grammatical number, and some languages will hardly use any values, but every language would find what they need. Similar for other categories of grammatical features. At present, this would result in long dropdowns (e.g. 24 items of which I might only know what two of them are), but if sorted reasonably well, I think that would be fine. 99of9 (talk) 10:55, 2 February 2025 (UTC)
If we are ramping up the use of monolingual texts (I'm not opposed, when functions are returning sentences or phrases directly for a language), then we should start building quite a few more monolingual text helper functions (e.g. join texts - even the simple ones haven't been written). I'm a little bit concerned that we'll need separate functions to generate each of the monolingual texts for each of the English variants. It would be good to call a single one and share results whenever possible. 99of9 (talk) 11:03, 2 February 2025 (UTC)
As I suggested on Telegram, I think the “gender” input for Z21765 is best understood as a placeholder for a noun phrase. The function supplies the copula and sentence complement for an indeterminate person who is the grammatical subject. This is an English language (family) function that assumes a third-person (semantically) singular subject (a living human being who is currently active in some profession or role). For English, such a context is not sufficient to determine the required placeholder (a pronoun), because third-person (semantically) singular pronouns are marked for “gender”. This additional context is therefore required as input. The use of Z16659 here is unfortunate, but we do not have a general-purpose Type to represent “one of three options” (and I’m not suggesting we should). A similar solution was not available for Z21617, hence the use of arbitrary natural numbers (and I’m not suggesting we should do that, either).
Although I don’t object to Z21765, I don’t believe it provides a useful pattern for future functions. To be useful in a Wikipedia context, the end result (like “they are an American actor”) would need to change when the person gives up acting or dies, or changes their pronoun preference or nationality. Of course, we could have a separate function to handle the past tense or whatever and rely on prior functions to call a different function when the context changes, but I don’t think that would be sensible. In a more multilingual context, we would presumably characterise the context in a language-neutral way but expect (more) language-specific functions to determine the form of the copula (if any), and the required forms of any article, adjective or noun (or, indeed, a different sentence structure altogether). None of this is straightforward but it is characterising the context that poses a particular challenge, as more languages are considered.
The “grammatical features” for a lexeme form on Wikidata suggest a way forward, since they can account for the variety of forms that are available. In effect, a particular function will produce sensible results for some subset of all supported contexts. However, we need to be able to handle the normal cases where the available context provides information that is unnecessary for the function, as well as the cases where the function supports distinctions that the context does not. This suggests the need for some intermediate interface function(s) that can reduce or extend the available context according to the expectations of the function being called. For example, if the mood is not available in context when calling a French conjugation function, it would default to the indicative mood. This implies that the user interface for such a function would support the provision of the context as an input object, presumably (at its most basic) as a list of grammatical features. How we could restrict such a list to values for relevant grammatical features is an open question (see, for example, phab:T379338 and Wikifunctions talk:Representing identity#Functionally constrained lists). GrounderUK (talk) 12:06, 2 February 2025 (UTC)
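A minimal sketch of the kind of intermediate interface function described above, which reduces or extends the available context to what the called function expects. It uses plain strings rather than Wikidata item references, and `fit_context`, the category names and the default table are all hypothetical.

```python
def fit_context(available, expected, defaults):
    """Reduce or extend a context to the categories a function expects.

    available: dict mapping a grammatical category to its value in context
    expected:  categories the called function needs
    defaults:  fallback values for categories missing from the context
    """
    fitted = {}
    for category in expected:
        if category in available:
            fitted[category] = available[category]
        elif category in defaults:
            fitted[category] = defaults[category]
        else:
            raise KeyError(f"no value or default for {category!r}")
    return fitted


# The context knows the gender (unused here) but not the mood.
CONTEXT = {"person": "first", "number": "singular", "gender": "feminine"}
FITTED = fit_context(
    CONTEXT,
    expected=["person", "number", "mood"],
    defaults={"mood": "indicative"},
)
```

Here the unnecessary "gender" entry is dropped and the missing "mood" falls back to the indicative, mirroring the French conjugation example.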
Using grammatical features in this way has now been prototyped at Z22097. This calls one of these existing functions based on a supplied list of grammatical features. It has two implementations but these are set up to call only a few functions while we evaluate this approach. GrounderUK (talk) 21:20, 2 February 2025 (UTC)
I’ve also created Z22107 to demonstrate the expansion of composite items like first-person singular (Q51929218) into its basic components. It currently recognises only three such items and passes any other grammatical features through unaltered.
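The expansion step can be sketched roughly as follows. This is an illustration under assumptions, not the implementation of Z22107: the table holds a single entry, `Q_FIRST_PERSON` is a placeholder rather than a real item ID, and only Q51929218 and Q110786 are taken from this discussion.

```python
# Composite item -> basic components. "Q_FIRST_PERSON" is a placeholder,
# not a real Wikidata ID; Q110786 is "singular" as cited in this thread.
EXPANSIONS = {
    "Q51929218": ["Q_FIRST_PERSON", "Q110786"],  # first-person singular
}


def expand_features(features):
    """Expand known composite features; pass everything else through."""
    expanded = []
    for feature in features:
        expanded.extend(EXPANSIONS.get(feature, [feature]))
    return expanded


EXPANDED = expand_features(["Q51929218", "Q_SOME_OTHER_FEATURE"])
```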
After discussions with @99of9 and @Feeglgeef on Telegram, I created an implementation of Z22097 that uses Z19601. This allows a flatter conditional structure similar to a case construct, which is easier to work with but doesn’t scale well to support a large number of function calls. This implementation currently supports calls to seven of the Breton conjugation functions. GrounderUK (talk) 13:25, 3 February 2025 (UTC)
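For readers unfamiliar with the pattern, a flat, case-like dispatch on a list of grammatical features might look like the sketch below. The feature names and the two stand-in conjugation functions are hypothetical; this does not reproduce Z22097 or Z19601.

```python
def conjugate_present(verb):
    return f"present({verb})"  # stand-in for a real conjugation function


def conjugate_imperfect(verb):
    return f"imperfect({verb})"  # stand-in for a real conjugation function


# One flat list of cases: the first case whose required features are all
# present in the supplied list wins.
CASES = [
    (frozenset({"present"}), conjugate_present),
    (frozenset({"imperfect"}), conjugate_imperfect),
]


def dispatch(features, verb):
    supplied = set(features)
    for required, function in CASES:
        if required <= supplied:
            return function(verb)
    raise LookupError("no function matches the supplied grammatical features")


RESULT = dispatch(["singular", "imperfect"], "kanañ")
```

Extra features are ignored and missing ones raise an error. Every supported function needs its own entry in `CASES`, which is why such a structure is easy to read but grows linearly with the number of functions.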
  1. No opinion
  2. I'd actually quite like to use monolingual texts for ones that we don't intend to use on Wikipedia
  3. Generally I think we should try to have the same input types, even if that means a lot of redundant inputs.
Feeglgeef (talk) 16:50, 1 February 2025 (UTC)
I like it when outputs are simple strings. I usually try not to care about types while programming and leave that decision to the language interpreter, so this seems the easiest option to me. As different people implement functions, it will not be possible to be completely consistent here. I prefer referring to objects. For using functions across languages, I need to think about how far that is possible. I will write something about it, maybe in the next few days. Hogü-456 (talk) 22:58, 1 February 2025 (UTC)
Thanks.
  1. Do you have any concerns about the use of grammatical features?
  2. Same here, although my comment seems to have been overlooked (except by you).
  3. The problem I have with redundant inputs is that they are liable to be inconsistent with the grammatical features that are actually present on Wikidata. The approach I’ve adopted so far with Z22097 and Z22107 is tolerant of redundancy and intolerant of deficiencies, but that is more “line of least resistance” than a firm conviction.
GrounderUK (talk) 14:41, 3 February 2025 (UTC)
Since first text of lexeme matching grammatical features (Z19530) takes a list of Wikidata item reference (Z6091), I'd expect other functions to use that too (though I'm not sure how you'd include number). For languages with only a few cases, there could be persistent (named) lists for each as a shorthand. YoshiRulz (talk) 06:41, 2 February 2025 (UTC)
Please see, for example, Z22098 specifying grammatical number (Q104083) as singular (Q110786). We might also consider using expansions of items like first-person singular (Q51929218), as suggested by User:VIGNERON on Talk:Z22097 GrounderUK (talk) 22:58, 2 February 2025 (UTC)
My preference would be to introduce precise enumerations for grammatical features. For example, we would have one enumeration for grammatical genders for languages that have feminine and masculine genders (e.g. for Spanish and French), and one for languages that have three grammatical genders such as German, and so on. Then there are individual enumerations for grammatical numbers: there's one for languages with singular and plural, one for singular, dual, and plural, etc. And each language-part of speech would only use the relevant enumerations.
This means creating quite a few enumerations, but I think that's OK.
Furthermore I think we should have individual types for each pair of language and part of speech, i.e. a type for English noun, a type for Breton verb, a type for Hausa verb, a type for Ukrainian adjective, etc. And each of these would be using the right enumerations as created above.
I know that it is a bit of work, but in the end it allows the user experience to provide much more guidance.
I think this is a really important discussion, and it would be good to get this right! --Denny (talk) 15:25, 3 February 2025 (UTC)
I would be happy to create a few grammatical gender enumerations for now, e.g. one for feminine / masculine and one for feminine / masculine / neuter, and maybe a few more, depending on the languages people would like to work on. Creating enumerations is not that much work, and I agree that it would be good to get rid of using sign to represent gender rather sooner than later. --Denny (talk) 15:28, 3 February 2025 (UTC)
Instead of this, can we have a "flexible enum" type that has a key-value pairing for dropdowns?
{"Masculine": 1, "Feminine": 2, "Neuter": 3}
This would create a dropdown, and the selected value would be passed into the function. This would still be reusable and scale much better. Feeglgeef (talk) 15:47, 3 February 2025 (UTC)
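To make the suggestion concrete, here is a small sketch of how such a key-value pairing could drive a dropdown. It is only an illustration of the idea, not an existing Wikifunctions feature, and the helper names are hypothetical.

```python
GENDER_OPTIONS = {"Masculine": 1, "Feminine": 2, "Neuter": 3}


def dropdown_labels(options):
    """Labels to show in the dropdown, in the order they were declared."""
    return list(options)


def selected_value(options, label):
    """Value handed to the function for the label the user picked."""
    if label not in options:
        raise ValueError(f"{label!r} is not one of {dropdown_labels(options)}")
    return options[label]


LABELS = dropdown_labels(GENDER_OPTIONS)
VALUE = selected_value(GENDER_OPTIONS, "Feminine")
```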
The idea of having a separate type for each pairing of language and part of speech sounds pretty scary. Or were you thinking of generic types?
I’m not opposed to specific enumerations for grammatical features like “gender” and “number” but naming might be a problem. For pronouns and verb conjugations, a pairing of gender and number (like first-person singular (Q51929218), as suggested on Talk:Z22097) should be considered (and deferred).
It seems important that we have seamless conversion to and from Wikidata item references so that we can select appropriate forms. This is why I still prefer to use Z6091 directly. But I recognise the usability advantages of a dropdown that is limited to the tiny subset of all Wikidata items that are the relevant kind of grammatical feature. This is why I favour “functionally constrained lists” or something similar, in the medium term. This month, I support a few gender and number types, as you suggest. I would include common/neuter for Swedish etc. (For English, we should think about gender-neutral/masculine/feminine nouns like actor/actress and (s)he/they/it singular pronouns.) GrounderUK (talk) 15:21, 5 February 2025 (UTC)
Yes, I fully agree that we should have a smooth transition from Wikidata Items to e.g. enumeration values. So if we had, e.g., French tenses, with a value for "Présent", we should have functions that map to the appropriate Item and back. I see that Gregorian calendar month to Wikidata reference (Z22240) is heading in exactly that direction, thank you!
This would allow us to have exactly the values that we need, without needing to force Wikidata to follow the same structure. For example, if we decide that person and number should be combined, we could have mappings to the number, mappings to the person, and mappings to combinations.
Regarding separate types, I wasn't thinking of generic types, but of normal types. Do you find it scary due to the potential number of types, or due to other reasons? I want to write up more on my thinking around that, so questions are good. --Denny (talk) 12:29, 7 February 2025 (UTC)
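The mapping between enumeration values and Wikidata Items could be sketched along these lines. The enumeration and helper names are hypothetical; Q110786 (singular) is cited in this discussion, while the ID used for plural is quoted from memory and should be treated as an assumption to verify.

```python
from enum import Enum


class GrammaticalNumber(Enum):
    SINGULAR = "singular"
    PLURAL = "plural"


# Enumeration value -> Wikidata item ID.
TO_ITEM = {
    GrammaticalNumber.SINGULAR: "Q110786",  # cited in this thread
    GrammaticalNumber.PLURAL: "Q146786",    # assumption: verify before use
}
# The reverse mapping is derived so the two directions cannot drift apart.
FROM_ITEM = {qid: value for value, qid in TO_ITEM.items()}


def to_item(value):
    return TO_ITEM[value]


def from_item(qid):
    return FROM_ITEM[qid]
```

Because the enumeration is independent of the items, a combined value such as a person-number pairing could map to one item, or to several, without requiring Wikidata to follow the same structure.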
1000 languages * 10 enums per language * 10 items per enums gives us a rough estimate of 100,000 objects. The number is definitely the problem. Feeglgeef (talk) 13:48, 7 February 2025 (UTC)
Oh, I don't think that's how it would play out. The enums are often reusable across different languages. For example, I expect a few grammatical gender enums, but certainly not one per language, more like in the low tens. Same thing for grammatical numbers, where there are probably fewer than ten different ones. --Denny (talk) 13:53, 7 February 2025 (UTC)
You mentioned one per part of speech per language. That's not reusable (and would have much much more than 10 items). Perhaps the team can provide a way to filter a lexeme/item search box with a generic type? I've created a task for this, phab:T385895. Feeglgeef (talk) 16:54, 7 February 2025 (UTC)
Ah, yes, but those wouldn't be enums. But yes, I currently don't see how those could be made generic, nor how this could be avoided without the system becoming very fragile. But I am afraid this requires an essay to explain. I will do so, but give me a few days. Thanks for prompting. --Denny (talk) 07:27, 8 February 2025 (UTC)
I made proposals for grammatical gender enumerations: masculine / feminine, masculine / feminine / neuter, common / neuter, animate / inanimate, in order to demonstrate what I mean. I am not sure it makes sense to discuss them individually, so I suggest discussing them all together here. --Denny (talk) 14:52, 7 February 2025 (UTC)
Hi @Denny, just a small question about "Gregorian calendar month to Wikidata reference": I found that the returned object in the return statement works fine in JS, but in Python it always gives errors. I tried rewriting it from scratch but got the same object. Shouldn't it have the same form in Python and JS? Thx --Mohanad (talk) 20:21, 7 February 2025 (UTC)
The way I did it for JS was to copy from and adapt one of the converters from code. So for Python, I would start with Z13532. Is that roughly what you already tried? Would you like to try this way? I can try too if you like. @GrounderUK also mentioned problems when he tried with Python a few weeks ago. 99of9 (talk) 01:32, 8 February 2025 (UTC)
I tried at Z22256 (direct copy of the natural number converter). I have the same problem as you. So yes @DVrandecic (WMF) this is worth a better investigation. 99of9 (talk) 02:14, 8 February 2025 (UTC)
Thanks @Mohanad for raising it, and @99of9 for looking into it! I looked into it and am at a loss too. I will raise it with the team next week. Thanks! --Denny (talk) 08:17, 8 February 2025 (UTC)
@Denny Thx in advance. @99of9 I followed the function model to write what I thought would work, and it didn't. Thank you for contributing to the discussion --Mohanad (talk) 09:05, 8 February 2025 (UTC)
@Mohanad: Cory on our team fixed it. This is not yet documented, so yeah, there wasn't much chance to get this right, apologies. But now it works! --DVrandecic (WMF) (talk) 13:53, 10 February 2025 (UTC)
@Denny oh that's different, thx again --Mohanad (talk) 14:20, 10 February 2025 (UTC)
Yes, sorry for the missing documentation! --Denny (talk) 18:57, 10 February 2025 (UTC)
I was close! Feeglgeef (talk) 14:36, 10 February 2025 (UTC)
Closer than me! --Denny (talk) 18:57, 10 February 2025 (UTC)
Maybe it’s just scary because I’ve always assumed we definitely wouldn’t be having separate types for each language. I won’t try to second guess your thinking now, if you’re planning to write it up in the next week or two. At the phrase/lexeme level, we just need to refer to the linguistic context. The language, part of speech, person, number, tense etc are all included in that context and if you multiply them up, you end up with a big number. The two biggest contributors to the final answer are language and “etc”. We can quantify language at around a thousand, but I don’t have a good sense of how large “etc” might be. In any event, since you mention only language x part of speech, my concern is how we represent functions that determine the part of speech and whose return type represents “noun or verb (infinitive or participle)”, for example (“I need to think” or “I need a think” or “thought is needed” or “thinking is necessary” etc). GrounderUK (talk) 10:55, 8 February 2025 (UTC)
Just repeating myself from Wikifunctions:Type proposals/Grammatical gender (m/f)#Comments, as the concern applies to all the proposed grammatical “gender” proposals.
“The conversion to code is unclear, but arbitrary strings like “m” or “f” may reflect a Latin bias and alternatives may not appear to be gender neutral. We could just use QID or ZID strings, but the readability of the code would suffer. In any event, the “masculine” QID happens to be alphabetically later and numerically lower (and shorter) than the “feminine”. When it comes to creating ZIDs, we might choose to create them in the opposite order (and perhaps create “common” and “neuter” before we create either of them).”
Although the concept and terminology of grammatical “gender” is well established and has some objective justification in many languages, we might take this opportunity to consider whether we can adopt a more neutral classification, especially with regards to cultural bias in favour of “western” languages. GrounderUK (talk) 19:24, 16 February 2025 (UTC)
As promised before, I published a proposal for a new type for representing parts of speech in certain languages, and why I think that's helpful, and how it would work, and why I think that, though it would be a lot of Types, many of them are kinda easy to build. I am also putting this up for discussion in today's NLG SIG meeting, but I also invite discussion on-wiki. I don't know a good name for the proposal, and I am sure it can be explained better what I mean, so please ask questions! --DVrandecic (WMF) (talk) 12:31, 18 February 2025 (UTC)

Translation statistics

I updated my subpage with statistics about the translation of Z-Objects. I am not sure if the numbers are accurate. According to the numbers so far, many languages have only a small number of translations. I hope there will be more translations into the languages that have few translations of Z-Objects so far. What do you think needs to change to get more translations in Wikifunctions? Hogü-456 (talk) 19:14, 16 February 2025 (UTC)

I don't really understand what those numbers mean. For English it says:
Z2K2 value: 2993
Z2K3 label: 12136
Z2K3 alias: 1
Z2K4 short description: 2633
Total: 17763
Usually a Z-Object would be in a few categories (e.g. it might have a label and a description). Where would it show up? If it showed up twice, then the total should not be the sum of all the categories. I'm also surprised that alias = 1, when heaps of our ZIDs have English aliases. Finally, what does Z2K2 value mean? Doesn't every ZID have a value? The value doesn't really have a language does it? 99of9 (talk) 23:39, 17 February 2025 (UTC)
I'm assuming it means that English is referenced in the Z2K2. I don't think that is accurate and I wouldn't trust any of this data. Feeglgeef (talk) 01:59, 18 February 2025 (UTC)
Here are my results for English in a python program on the most recent dump:
{'Z2K1': 1, 'Z2K2': 2193, 'Z2K3': 12700, 'Z2K4': 1230, 'Z2K5': 2808}
The subpage seems to be using a simple substring ("text includes") search, which is problematic and sometimes overcounts. It also seems to be using an older dump.
Feeglgeef (talk) 02:49, 18 February 2025 (UTC)
Yes, it may be that the data is not accurate. The table shows the total number of translations of Text objects within Z-Objects. It counts how often specific combinations of labels or other texts can be found in a Z-Object. The number for alias is not correct. I will look at the code again; maybe I can explain, using one Z-Object as an example, how the counting works at the moment, and learn how it should work. Hogü-456 (talk) 21:16, 18 February 2025 (UTC)
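
The overcounting concern raised above can be illustrated with a minimal Python sketch. It is not the code used for either set of statistics. The toy object is hypothetical, and the layout (Z11 monolingual strings with the language in Z11K1, Z1002 standing for English, typed lists written as arrays whose first element is the type) is an assumption about the canonical JSON form of a Z-Object. A substring search for "Z1002" also matches an unrelated reference such as "Z10020", whereas walking the structure counts only real English text entries.

```python
import json

# Hypothetical persistent object; the ZIDs and layout are assumptions for illustration.
TOY_OBJECT = {
    "Z1K1": "Z2",
    "Z2K1": {"Z1K1": "Z6", "Z6K1": "Z99999"},
    # The value merely references some function whose ZID starts with "Z1002".
    "Z2K2": {"Z1K1": "Z7", "Z7K1": "Z10020"},
    "Z2K3": {
        "Z1K1": "Z12",
        "Z12K1": [
            "Z11",
            {"Z1K1": "Z11", "Z11K1": "Z1002", "Z11K2": "example"},
        ],
    },
}


def naive_count(zobject, language_zid):
    """Count substring hits in the serialized JSON (prone to overcounting)."""
    return json.dumps(zobject).count(language_zid)


def structured_count(node, language_zid):
    """Count only Z11 monolingual strings whose language is exactly language_zid."""
    if isinstance(node, dict):
        hit = int(node.get("Z1K1") == "Z11" and node.get("Z11K1") == language_zid)
        return hit + sum(structured_count(v, language_zid) for v in node.values())
    if isinstance(node, list):
        return sum(structured_count(item, language_zid) for item in node)
    return 0


def count_per_key(zobject, language_zid):
    """Report the structured count separately for each top-level key."""
    return {
        key: structured_count(value, language_zid)
        for key, value in zobject.items()
        if key.startswith("Z2K")
    }


print(naive_count(TOY_OBJECT, "Z1002"))
print(structured_count(TOY_OBJECT, "Z1002"))
print(count_per_key(TOY_OBJECT, "Z1002"))
```

Counting per top-level key also avoids the ambiguity about totals: each text entry is attributed to exactly one key, so the per-key numbers can be summed without double counting.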

equality function for natural languages

Please can we add Z14326 as the equality function for Z60? As far as I understand, the only consequence of adding it to the Type is that when we create tests on functions which return languages, it shows up as the default result validator. This would almost always be an improvement over not having any default and so having to choose a function every time (especially for newcomers who don't know which equality functions are on offer). 99of9 (talk) 04:03, 13 February 2025 (UTC)

@99of9: Done. If there is disagreement on this, we can still change it. Thanks for the suggestion! --DVrandecic (WMF) (talk) 14:12, 13 February 2025 (UTC)
I very much disagree with this change. I think to newcomers, prefilling often seems mandatory (which may mean we need UI improvements), but I also think that to an average person, American English and British English are the same language, so I Oppose this change. Feeglgeef (talk) 21:44, 13 February 2025 (UTC)
I don't think we should treat American English and British English as identical or the same. They have different QIDs and different IETF codes.
On the other hand, it would make a lot of sense to introduce a weaker notion of "similar enough", which could be based solely on the language code being the same (in which case "en-GB" and "en-US" could be the same, just as "pt-PT" and "pt-BR" could be the same), and even a mutual intelligibility test, in which case e.g. Urdu and Hindi or Serbian and Croatian may pass (I am not a linguist and might have gotten my examples wrong).
For testing though, I think that strict identity is the right thing to test. --Denny (talk) 10:35, 14 February 2025 (UTC)
An equality with the semantics of "is mutually intelligible" would be neither symmetric nor transitive, limiting how it could be used. YoshiRulz (talk) 17:37, 14 February 2025 (UTC)
Good point -- that's no equality. --Denny (talk) 09:53, 26 February 2025 (UTC)
I support the change that has been made, for the time being. We should consider a function that extends the concept of “similarity”, one of whose options would correspond to “identical” in the current sense given by same language (Z14326). The other options will be difficult to get right but will be needed for graceful fallback. In general, en-gb text would tend to prefer an en-au or en-ca option over en-us, for example. Likewise, I imagine, en-au text would favour en-gb if it differs from en, especially if there is no en-us text or when en-us and en are the same. In the context of result validation, we could simply use is language in list (Z14321) rather than same language (Z14326), as it’s pretty obvious that listing only one language requires an exact match. This would mean that something like Z22385 could not be specified but I’m not actually convinced that it should pass anyway. GrounderUK (talk) 13:10, 16 February 2025 (UTC)
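
The difference between strict identity and the weaker "similar enough" notion discussed in this thread can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, languages are represented by plain IETF-style codes rather than Z60 objects, and the fallback order is deliberately simplistic (it does not model the preference of en-gb for en-au or en-ca over en-us mentioned above).

```python
def primary_subtag(tag):
    """Return the primary language subtag, e.g. 'en' for 'en-GB'."""
    return tag.split("-")[0].lower()


def same_language_strict(a, b):
    """Strict identity: the full codes must match (case-insensitively)."""
    return a.lower() == b.lower()


def similar_enough(a, b):
    """Weaker notion: only the primary subtag has to match."""
    return primary_subtag(a) == primary_subtag(b)


def pick_fallback(wanted, available):
    """Hypothetical fallback: exact match first, then the first code sharing the primary subtag."""
    for code in available:
        if same_language_strict(wanted, code):
            return code
    for code in available:
        if similar_enough(wanted, code):
            return code
    return None


print(same_language_strict("en-GB", "en-US"))
print(similar_enough("en-GB", "en-US"))
print(pick_fallback("en-GB", ["fr", "en-US", "en"]))
```

Because `similar_enough` compares a key derived from each code, it is reflexive, symmetric and transitive, so it behaves like an equivalence relation. A mutual intelligibility test gives no such guarantee, which is the limitation pointed out in the thread.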
This section was archived on a request by: 99of9 (talk) 00:34, 26 March 2025 (UTC)

Unicode code point is now available for testing

Hello all! As per Wikifunctions:Type proposals/Unicode codepoint, the type Unicode code point (Z86): A single code point in Unicode has been (hopefully) fixed. All functions using it have also been adapted. The functions are collected in the function catalogue: Character operations. Please give this type a whirl, and if it is all fine, I will remove the 'testing' tag after the weekend. --Denny (talk) 16:05, 21 February 2025 (UTC)

I removed the "testing" from the name. --DVrandecic (WMF) (talk) 09:57, 26 February 2025 (UTC)
This section was archived on a request by: 99of9 (talk) 00:35, 26 March 2025 (UTC)

Wikifunctions & Abstract Wikipedia Newsletter #191 is out: From things to words

There is a new update for Abstract Wikipedia and Wikifunctions. Please, come and read it!

In this issue, we discuss the deployment of the possibility of getting the right Lexeme given a Wikidata Item, we discuss the newest updates about Types, and we take a look at the latest software developments.

Want to catch up with the previous updates? Check our archive!

Also, we remind you that if you have questions or ideas to discuss, the next Volunteers' Corner will be held on March 3, at 18:30 UTC (link to the meeting).

Enjoy the reading! -- User:Sannita (WMF) (talk) 16:16, 28 February 2025 (UTC)

This section was archived on a request by: 99of9 (talk) 00:35, 26 March 2025 (UTC)

proposed Reader and Display functions for Gregorian Year Type

I suggest we use the following functions as a reader and display function for the Gregorian Year Type:

Feel free to add a localised function for your language to the configuration if you would like to replace the default display function which uses ISO 8601 integer strings. Adopting the English AD/BC is not the best default here. --99of9 (talk) 03:13, 28 February 2025 (UTC)

Good work. It would probably make more sense to use (B)CE for the default display case, but there are no great options here, as far as I can see. The other issue we might consider concerns “year 0”. Possibly the read function should do the least illogical thing, treating unmarked or signed 0 as 1 BC, with 0 AD being the year before 1 AD and 0 BC being the year after 1 BC. Currently, I think -0 becomes 1 AD but 0 BC becomes 1 BC; I’m inclined to switch those around. GrounderUK (talk) 11:43, 2 March 2025 (UTC)
I agree with your two illogical reader examples. I'll add them as tests. For the default display (B)CE doesn't solve the problem of the three abbreviated words being English words. So even before we go to other scripts, even other western languages use different abbreviations. Although it looks ugly, ISO 8601 was invented to solve this kind of problem, so seems like the natural default to me. I hope that the ugliness will spur editors to write their localised display asap! 99of9 (talk) 22:29, 2 March 2025 (UTC)
For the reader, I think we need to choose between Z20205 and Z23019. The former seems somewhat natural, the latter aligns with ISO 8601 and also with our integer representation in code. 99of9 (talk) 23:23, 2 March 2025 (UTC)
Yeah 🙁 Nothing about negative years feels natural to me! But different interpretations in different contexts would just be doubly evil. Perhaps some other symbol(s) could represent BC(E), so, for example, 512- or -512= would read as 512 BC? GrounderUK (talk) 00:08, 3 March 2025 (UTC)
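
The two readings of a negative year discussed in this thread can be compared with a small Python sketch. It assumes nothing about how Z20205 or Z23019 are actually implemented; the function names are hypothetical. The only fact relied on is that ISO 8601 uses astronomical year numbering, in which year 0 is 1 BC and year -1 is 2 BC.

```python
def iso_to_era(iso_year):
    """Convert an ISO 8601 (astronomical) year number to an (era, year) pair."""
    if iso_year >= 1:
        return ("AD", iso_year)
    # Year 0 is 1 BC, year -1 is 2 BC, and so on.
    return ("BC", 1 - iso_year)


def era_to_iso(era, year):
    """Convert an (era, year) pair back to an ISO 8601 year number."""
    if year < 1:
        raise ValueError("era-based years start at 1; there is no year 0")
    if era == "AD":
        return year
    if era == "BC":
        return 1 - year
    raise ValueError(f"unknown era: {era!r}")


def read_sign_as_bc(text):
    """Alternative reading: treat a leading minus sign as shorthand for BC."""
    value = int(text)
    if value < 0:
        return ("BC", -value)
    if value == 0:
        raise ValueError("0 has no agreed meaning under this reading")
    return ("AD", value)


print(iso_to_era(-512))
print(read_sign_as_bc("-512"))
print(era_to_iso("BC", 1))
```

The same input "-512" comes out one year apart under the two readings, which is why the reader has to commit to a single interpretation.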

Rollback for Functioneer?

Do people here think the rollback user right can be granted to Wikifunctions:Functioneers? That is a trusted group for connecting new creations to their functions, and can be trusted with easily and quickly reverting vandalism using the Rollback feature. ~/Bunnypranav:<ping> 13:49, 15 February 2025 (UTC)

I'm not sure I see this crossover of rights as very useful, as the rest of the functioneer permission set is content related, not anti-vandalism related. Feeglgeef (talk) 19:50, 15 February 2025 (UTC)
Eventually we can add another group or change it. But for the current status of the wiki, it may be good to include it in functioneer. ~/Bunnypranav:<ping> 03:22, 16 February 2025 (UTC)
How much vandalism have you been seeing? I think there's not too much to worry about. --99of9 (talk) 04:46, 16 February 2025 (UTC)
It's not that I'm seeing vandalism; the question is just whether the right should also be assigned to non-admins. Now that I think of it again, the limited (if any) vandalism here may just be reverted using undo/Twinkle Global. ~/Bunnypranav:<ping> 07:49, 16 February 2025 (UTC)
I think it would be a better idea to have a patroller group with patrol and rollback rights. --Ameisenigel (talk) 17:21, 16 February 2025 (UTC)
That is actually a pretty good idea. Any comments @99of9 @Feeglgeef? ~/Bunnypranav:<ping> 11:20, 17 February 2025 (UTC)
Looks fine to me. Feeglgeef (talk) 15:40, 17 February 2025 (UTC)
I agree that would make sense. But I'm not in a big rush, because the list of recentchanges is still short enough to actively monitor. 99of9 (talk) 23:04, 17 February 2025 (UTC)
I think Rollback as an independent user group will be a better idea. Some users are not as interested in patrolling other users' edits as they are in rollbacking vandalism, especially those active in anti-vandalism across projects. Also, it will be useful for autopatrollers --Mohanad (talk) 16:56, 5 March 2025 (UTC)

Should we have a discord channel

Do the people contributing to this project feel that we can benefit from a Discord channel in the Wikimedia Discord? Currently we have a Telegram channel bridged to IRC. The proposed Discord channel will of course be bridged to Telegram and IRC. ~/Bunnypranav:<ping> 13:50, 21 February 2025 (UTC)

Even if one is set up, I oppose it being bridged to Telegram/IRC. (I don't mind anyone using Discord or creating channels there, but don't bridge proprietary, non-self-hosted platforms.) Ainali (talk) 13:58, 21 February 2025 (UTC)
I'll go wherever the community goes. I'm not against Discord, but I'm comfortable on Telegram too. 99of9 (talk) 00:07, 22 February 2025 (UTC)
I don't see what harm bridging it would cause, even if Discord is proprietary. Doing otherwise would probably fragment the Discord chatroom from the other chatrooms, which IMO defeats the purpose of hypothetically starting this in the first place. rae5e <talk> 03:49, 11 March 2025 (UTC)
@Theki: Indeed, given that Telegram is equally proprietary I'm not sure why bridging to it is fine but to Discord isn't. I know that a good chunk of wider Wikimedia community folks are on Discord, so being there doesn't feel like a bad step, but I don't want to advocate for something that would be at odds with this community's feelings. Jdforrester (WMF) (talk) 15:17, 11 March 2025 (UTC)
@Ainali, what do you think about the above? ~/Bunnypranav:<ping> 01:43, 12 March 2025 (UTC)
When the Telegram client is open source, "equally proprietary" seems like an odd comparison. Ainali (talk) 06:37, 14 March 2025 (UTC)

expression error with eastern numerals languages

Hi, for a while I have noticed an "expression error" appearing on some pages whose languages use eastern numerals (according to the MediaWiki settings). Perhaps the numbers are converted to eastern numerals and the system does not recognize the value. Examples of these pages are the main page templates: ar, fa, bn, hi. Also, look at this old revision of a page that was translated from one that used the #expr: magic word for the number of administrators; it was later reverted. --Mohanad (talk) 13:23, 15 February 2025 (UTC)

I think {{#time:U}} in {{Random function}} needs to be {{#time:U|now|en}}. YoshiRulz (talk) 19:44, 15 February 2025 (UTC)
Maybe turning off $wgTranslateNumerals for Wikifunctions is a better long-term solution for a code publishing project; multiple projects have done it. --Mohanad (talk) 23:07, 17 February 2025 (UTC)
Another solution is to use the magic word {{formatnum:...any variable num...|R}} wherever #expr: is used with variables, e.g. {{#expr:{{formatnum:{{NUMBEROFADMINS}}|R}}-2}} --Mohanad (talk) 16:29, 5 March 2025 (UTC)
I’m just hoping that it won’t be long before we can use a Wikifunctions function here. GrounderUK (talk) 16:46, 5 March 2025 (UTC)
It's for RNG though? Anyway, I already posted a solution. YoshiRulz (talk) 16:01, 28 March 2025 (UTC)
I've made the change you suggested here. Did that fix the issue? --99of9 (talk) 13:32, 7 April 2025 (UTC)
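
As a rough analogy for the failure described in this thread (this is Python, not MediaWiki's parser code, and the function names are hypothetical), the sketch below shows a strict evaluator that accepts only ASCII digits rejecting a localized number, and a normalization step that plays the role the thread assigns to {{formatnum:...|R}}.

```python
import re
import unicodedata


def to_ascii_digits(text):
    """Replace every Unicode decimal digit with its ASCII equivalent."""
    out = []
    for ch in text:
        value = unicodedata.decimal(ch, None)
        out.append(str(value) if value is not None else ch)
    return "".join(out)


def strict_minus_two(text):
    """Stand-in for an expression evaluator that understands ASCII digits only."""
    if not re.fullmatch(r"[0-9]+", text):
        raise ValueError("expression error: unrecognized number")
    return int(text) - 2


localized = "\u0662\u0665"  # the number 25 written with Arabic-Indic digits

try:
    strict_minus_two(localized)
except ValueError as error:
    print(error)

print(strict_minus_two(to_ascii_digits(localized)))
```

The explicit regular expression is needed for the illustration because Python's own `int()` already accepts Unicode decimal digits.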

proposed Reader and Display functions for Byte Type

I suggest we use the following functions as a reader and display function for the Byte Type:

Feel free to add an implementation to either with a language configuration if you would like to convert to a different numeral script. --99of9 (talk) 05:40, 27 February 2025 (UTC)

How do you feel about displaying in three or more formats: hex, binary, decimal and (maybe) octal? Something like “F1 (hex), 11110001 (binary), 241 (decimal), 361 (octal)” or “F1 [16], 11110001 [2], 241 [10], 361 [8]”? GrounderUK (talk) 12:09, 2 March 2025 (UTC)
They all seem too long IMO. Imagine how they would show up in rendered displays (e.g. on Tests). I agree that choosing a base has no right answer, but I think choosing any is still better than all. --99of9 (talk) 22:21, 2 March 2025 (UTC)
Ah, yes… you’re right, of course! GrounderUK (talk) 23:24, 2 March 2025 (UTC)
Can we proceed with this one? In particular the reader will be useful to accept a wider range of inputs. 99of9 (talk) 00:31, 26 March 2025 (UTC)
@DVrandecic (WMF): I know you've been busy, and am sorry to be pushy, but you asked for this about 6 weeks ago, and I tried to get it done the next day. --99of9 (talk) 06:32, 14 April 2025 (UTC)
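
The trade-off discussed in this thread, one short display form versus a reader that accepts a wider range of inputs, can be sketched in Python. This is not the proposed Wikifunctions implementation. The function names are hypothetical, and the choice of two-digit upper-case hexadecimal as the single display form is an assumption made only for the example.

```python
def display_byte(value):
    """Display a byte in one compact form (two upper-case hex digits, an assumed choice)."""
    if not 0 <= value <= 255:
        raise ValueError("a byte must be between 0 and 255")
    return format(value, "02X")


def display_byte_all_bases(value):
    """The longer multi-base display floated in the thread, for comparison."""
    if not 0 <= value <= 255:
        raise ValueError("a byte must be between 0 and 255")
    return (
        f"{value:02X} (hex), {value:08b} (binary), "
        f"{value} (decimal), {value:o} (octal)"
    )


def read_byte(text):
    """Read a byte from decimal text or from text prefixed with 0x, 0b or 0o."""
    value = int(text.strip(), 0)  # base 0 lets int() infer the base from the prefix
    if not 0 <= value <= 255:
        raise ValueError("a byte must be between 0 and 255")
    return value


print(display_byte(241))
print(display_byte_all_bases(241))
print(read_byte("0b11110001"))
```

The multi-base string is several times longer than the single form, which matches the concern about how it would look in rendered displays such as Tests.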