Wikifunctions:Status updates/2025-03-20


Wikidata-based simple enumerations

We introduced enumerations last year, and they have been a very popular kind of Type which has fulfilled many different use cases. But it is becoming clear that some of the use cases served by the current enumerations could be served with much less work for the community if we introduced a slightly different form of enumerations: Wikidata-based enumerations.

For example, there are suggestions for Types for grammatical genders. We might have one Type for grammatical genders with two values, feminine and masculine (e.g. in Tuareg or French), and another Type with three values, feminine, masculine, and neuter (e.g. in Telugu or German). With the way our system currently works, we would need to create two different Types, with distinct objects for each of them. That means many labels in many different languages would have to be maintained, along with confusing explanations like "this masculine is for languages with three grammatical genders". We would further need to map those objects to Wikidata items in lots of places so that Functions could reliably map to Lexemes.

We have been discussing alternatives over the last few weeks, and came up with an idea we want to suggest: Wikidata-based enumerations.

Wikidata-based enumerations are a new kind of enumeration Type. Each one will be created from a list of Wikidata item references representing the different objects of the enumeration. Different Wikidata-based enumerations can share the same Wikidata item references, e.g. the feminine gender can be part of several such Types. Since we use Wikidata items to represent the values of the Type, we can also use their labels directly from Wikidata, considerably reducing the maintenance work for these Types.
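To make the idea more concrete, here is a rough Python sketch of the concept. It is not the planned Wikifunctions design, which is still being finalised. The class name `WikidataEnumeration`, the `LABELS` table, and the item IDs (`Q_FEMININE` and so on) are all hypothetical placeholders: the IDs are not real Wikidata QIDs, and the table only stands in for labels that would really be read from Wikidata.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical stand-in for labels that would be fetched from Wikidata.
# The item IDs below are placeholders, not real Wikidata QIDs.
LABELS = {
    "Q_FEMININE": {"en": "feminine", "de": "Femininum"},
    "Q_MASCULINE": {"en": "masculine", "de": "Maskulinum"},
    "Q_NEUTER": {"en": "neuter", "de": "Neutrum"},
}


@dataclass(frozen=True)
class WikidataEnumeration:
    """An enumeration defined only by a list of item references."""

    name: str
    items: Tuple[str, ...]

    def contains(self, item_id: str) -> bool:
        # A value belongs to the Type if its item is in the list.
        return item_id in self.items

    def label(self, item_id: str, lang: str) -> str:
        # Labels come from the shared item, not from the Type itself,
        # so nothing has to be maintained per enumeration.
        if not self.contains(item_id):
            raise ValueError(f"{item_id} is not a value of {self.name}")
        labels = LABELS[item_id]
        return labels.get(lang, labels["en"])  # fall back to English


# Two Types that share the same items for feminine and masculine.
two_genders = WikidataEnumeration(
    "two grammatical genders", ("Q_FEMININE", "Q_MASCULINE")
)
three_genders = WikidataEnumeration(
    "three grammatical genders", ("Q_FEMININE", "Q_MASCULINE", "Q_NEUTER")
)
```

In this sketch, `two_genders` and `three_genders` both accept `Q_FEMININE`, and both read its label from the same place, so there is no need for a separate "masculine for languages with three genders" object.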

We are currently finishing up the design of this new kind of Type, and plan to develop it in the upcoming Quarter, so this is the right time for feedback!

One open question is how to deal with existing Types that would have benefitted from this method, such as Gregorian calendar months. Technically, there is no need to switch existing enumerations to the new kind of Wikidata-based enumerations. But there might be an advantage in doing so, because the labels would then be maintained on Wikidata. In the end it is up to the community to decide how to proceed. My suggestion would be to wait until the new kind of enumeration has shown that it works, and then to decide. We will be happy to help with the transition.

Since we want to use this new kind of enumeration Type where it makes sense, we have paused the creation of new enumerations that could benefit from the new Type for now.

Recent Changes in the software

There are relatively few changes shipping this week, as we home in on completing our work for the Quarter.

As part of the Wikidata support work, we've expanded Z6006/Wikidata lexeme sense to include a new key, 'Z6006K4/lexeme' (T388086). This will allow us to better represent the relationships between lexemes and their senses.

We updated the front-end code to streamline our use of mixins and utils, as part of our work to embed Wikifunctions calls inside Wikipedia articles. We expanded the test coverage of the middleware API that handles embedded function calls, so we're more confident it will operate as designed. We've landed the majority of the code to show entries about changes made on Wikifunctions.org in the Recent Changes of wikis that use Wikifunctions in their articles, in the same way that Wikidata does (T386090 and T386089).

Our work on back-end performance continues, with work to replace use of JSON Schema validation with simpler, hand-built heuristics (T381597) and unifying our Type-checking logic to reduce data conversions (T383575).

We've added support for a natural language, Z1967/mrt, as part of wider work to add that to MediaWiki (T388157).

Natural Language Generation Special Interest Group (NLG SIG) meeting recording available

The recording of the March meeting of the Natural Language Generation Special Interest Group (NLG SIG) is now available on Commons. We thank Mahir256 for his presentation “Finding lexemes for concepts, and actual and potential fallback mechanisms for this task”, all volunteers who joined us, and wish you happy watching!

Fresh functions weekly: 35 new functions

This week we had 35 new functions. Here is a list of functions with implementations and passing tests, to give a taste of what has been created. Many of these functions deal with lexemes and related types. Thanks everybody for contributing!

A complete list of all functions sorted by when they were created is available on wiki.