Wikifunctions:Status updates/2025-07-04


Coverage of 1298

Iranian tiles, created 1298

Many language editions of Wikipedia have articles about individual years – usually a few thousand of them, covering the time from 500 BC to about 2030 AD. Many Wikipedias start earlier, and some Wikipedias go further into the future.

We picked a random year, 1298 AD, and looked at how well that year is covered in different language editions of Wikipedia. The idea is to assess not simply the number of articles in a language edition, but how much knowledge the given article provides.

When I worked on the Croatian Wikipedia, one of the early things I did was to write a Python bot that created the year articles for the Croatian Wikipedia. This reduced the number of red links (i.e., links that go to an article that does not exist yet) considerably, and made it easy to start collecting knowledge on those articles, and use the link structure for building and maintaining the encyclopedia. I wouldn’t be surprised if, in many languages, the year articles were also started by a single author, creating 2500+ articles with a bot.

Any bot doing this would be able to provide some structural information about the year – Was it a leap year? What century was it in? – but usually not more than that. Back then, I would have had neither the idea nor the means to use a bot to write more information about the year, such as births, deaths, and events. Today, thanks to Wikidata, that is far more conceivable: we can quite easily get a list of all people in Wikidata who died in 1298, sorted by the number of sitelinks they have, and use that to create parts of the article.
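As a rough illustration of the Wikidata lookup described above, here is a minimal Python sketch that only builds the SPARQL query text; it does not contact any service. The function name `deaths_in_year_query` and the default limit are made up for this example. The query assumes the usual Wikidata Query Service prefixes and uses P31/Q5 (instance of human), P570 (date of death), and `wikibase:sitelinks`. It ignores date precision and calendar model, so real results for a medieval year would need manual review.

```python
def deaths_in_year_query(year: int, limit: int = 50) -> str:
    """Build a SPARQL query for humans who died in `year`,
    ordered by sitelink count (most-linked first).

    Hypothetical helper for illustration; it returns the query text only.
    """
    return f"""
SELECT ?person ?personLabel ?sitelinks WHERE {{
  ?person wdt:P31 wd:Q5 ;                    # instance of: human
          wdt:P570 ?dateOfDeath ;            # date of death
          wikibase:sitelinks ?sitelinks .    # number of sitelinks
  FILTER(YEAR(?dateOfDeath) = {year})
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
ORDER BY DESC(?sitelinks)
LIMIT {limit}
""".strip()


if __name__ == "__main__":
    print(deaths_in_year_query(1298))
```

The printed text can be pasted into the Wikidata Query Service; turning the result rows into article text is a separate step that this sketch leaves out.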

These bot-created articles often remained untouched for a long time afterwards.

Monastery in Spain, built in 1298

The analysis here was manual. I aimed to be rather strict with regard to the capabilities of Abstract Wikipedia, and generous regarding the complexity of the content of the language editions. A full list of my assessments can be found here. The analysis was conducted on 8 April 2025.

At that point in time, there were 341 language editions of Wikipedia. An article for the year 1298 was available in 131 languages, and was entirely missing from 208 languages. 2 language editions had redirects to the relevant century instead of an article for the given year. This means that communities can potentially fill the gap with Abstract Wikipedia in 61% of languages, more than doubling the number of languages in which an article for the year 1298 is available, by creating a single function (the introduction to year function for the respective language, as discussed in a previous newsletter).

In 65 languages, the article consisted of a simple skeleton for that year, with no further information, likely created by a bot and not touched by a human hand since. Many of these skeleton year articles were even more bare than the basic introduction to year function would provide. Once Abstract Wikipedia is launched in these languages, these articles would be possible candidates for deletion, as it is likely that Abstract Wikipedia will very quickly improve those articles compared to the current situation.

Now it starts to get interesting: in 28 languages, we have not only the skeleton, but also a list of some births, deaths, and very simple events (as far as I was able to tell). These all looked like they could very easily be covered in Abstract Wikipedia, and most of these would also benefit from being sourced from Abstract Wikipedia. This means that we expect that 89% of Wikipedias would immediately benefit from Abstract Wikipedia.
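The percentages quoted so far can be re-derived from the counts in the text. The short Python check below does that; the variable names are mine. Note that the 89% figure only comes out if the two redirecting editions are counted among those that would benefit, which is my reading rather than something the text states.

```python
# Counts taken from the analysis above (8 April 2025).
total_editions = 341
with_article = 131
missing = 208
redirects = 2
skeleton_only = 65
simple_lists = 28

# The three groups partition all language editions.
assert with_article + missing + redirects == total_editions

share_missing = missing / total_editions
growth_factor = (with_article + missing) / with_article

# Assumption: redirects count as benefiting, alongside missing,
# skeleton-only, and simple-list articles.
benefiting = missing + redirects + skeleton_only + simple_lists
share_benefiting = benefiting / total_editions

# Articles with richer content are whatever is left over.
richer_articles = with_article - skeleton_only - simple_lists

print(f"missing: {share_missing:.0%}")        # missing: 61%
print(f"growth: {growth_factor:.2f}x")        # growth: 2.59x
print(f"benefiting: {share_benefiting:.0%}")  # benefiting: 89%
print(f"richer articles: {richer_articles}")  # richer articles: 38
```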

By the way, one of the interesting things to find was that the Thai Wikipedia covers the concept of the Western year 1298 with the mostly overlapping year พ.ศ. 1841 in the Thai solar calendar – a great example of cultural differences influencing the structure of a multi-lingual encyclopedia. In Abstract Wikipedia, we are going to great lengths to ensure that this remains possible in the future.
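The year mapping mentioned here follows the conventional offset of 543 between the Common Era and the Buddhist Era used in the Thai solar calendar. A minimal sketch, with a function name of my own choosing, and treating the two years as fully aligned even though the text says they only mostly overlap:

```python
BUDDHIST_ERA_OFFSET = 543  # conventional CE-to-BE offset; an approximation


def ce_to_thai_solar_year(ce_year: int) -> int:
    """Map a Common Era year to the mostly overlapping Thai solar year."""
    return ce_year + BUDDHIST_ERA_OFFSET


print(ce_to_thai_solar_year(1298))  # 1841
```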

Drawing from a Japanese scroll, created 1298

This leaves us with 38 languages in which the articles describe complex events, or may even offer a worthwhile article covering the trends and changes in that year. The content of these articles will take quite a bit more work than anything mentioned above to really be captured in Abstract Wikipedia. I want to positively call out the articles in Slovenian and Ukrainian for their narrative content. Yet, it is likely that some of these articles will still benefit from being replaced or augmented by the Abstract Wikipedia version.

Here is one big advantage of the Abstract Wikipedia model compared to the current situation, based on my experience with the Croatian Wikipedia: if the articles are created as a one-off by a bot that a single contributor wrote and ran on their own machine, it becomes quite difficult for the community to maintain proper ownership of the outcome, especially later on. In small communities, the expertise to write, adapt, and run such bots might not be readily available. So if someone runs a bot for thousands of articles, they take on an outsized responsibility for large swaths of content on the given Wikipedia.

That will change with Abstract Wikipedia: because the mechanisms creating the content are on the wikis themselves, they can be collaboratively maintained, improvements will roll out to the respective Wikipedias, and the results are owned by the community together. Also, some simple mistakes will hopefully be less likely, such as copy and paste errors leading to wrong Roman numerals, the wrong instantiation of a template, or errors in the template.
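To illustrate why computing such details beats copying them by hand, here is a small Python sketch that derives the century of a year and writes it as a Roman numeral. Both function names are invented for this example and are not actual Wikifunctions functions; the sketch only handles positive AD years.

```python
def century_of(year: int) -> int:
    """Return the 1-based century of a positive AD year (1201..1300 -> 13)."""
    if year < 1:
        raise ValueError("this sketch only handles positive AD years")
    return (year - 1) // 100 + 1


# Value/symbol pairs in descending order, including subtractive forms.
_ROMAN_SYMBOLS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(number: int) -> str:
    """Convert an integer from 1 to 3999 to a Roman numeral."""
    if not 0 < number < 4000:
        raise ValueError("number must be between 1 and 3999")
    parts = []
    for value, symbol in _ROMAN_SYMBOLS:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


print(to_roman(century_of(1298)))  # XIII
```

Because the numeral is computed from the year, a fix to either function applies to every generated article at once, instead of having to be repeated across thousands of pages.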

Recent Changes in the software

We landed the last expected part of the proof-of-concept work on rich text (HTML) output support from embedded Function calls, handling the output by passing it through MediaWiki's HTML sanitisation code to ensure it is safe to insert into the content page (T391983 and T391984). We will demonstrate it working shortly.

We adjusted our code that runs when an Object is deleted, which ensures the Object is dropped from any caches, so that it also disconnects the deleted Object on-wiki from any other Objects if it was an approved Implementation or Test case (T392160 and T383502).

We have fixed the Function Evaluator component to properly re-initialise its status when switching which Function is being used (T395119). We also fixed a recent bug in that component that meant that Functions which have a Typed list as an input parameter would crash the UX (T397682). We fixed the code that shows an Object as a simple string to not show 'undefined' as a tooltip in some circumstances.

As part of our work to better monitor how well the system is behaving and performing, we've changed the middleware code to pass on non-200 HTTP responses from the back-end services, rather than immediately raising an error. This will allow us to more correctly return HTTP 400s when user input is at fault and, more importantly, HTTP 500s when the service is breaking in some way, so we can spot and fix it faster (T393522). We also lowered the alert level of the logging for when we get an HTTP 400 from the back-end.
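The idea of relaying the back-end status rather than collapsing everything into one error can be sketched as follows. This is a hypothetical Python illustration only: the real middleware is not written in Python, and `BackendResponse`, `relay`, and the specific log levels are assumptions made for the example.

```python
import logging
from dataclasses import dataclass


@dataclass
class BackendResponse:
    """Hypothetical stand-in for what a back-end service returned."""
    status: int
    body: str


def relay(response: BackendResponse) -> tuple[int, str, int]:
    """Pass the back-end status through and choose a log level for it.

    Returns (status, body, log_level). The level choices are assumptions.
    """
    if 200 <= response.status < 300:
        level = logging.DEBUG    # success, nothing to flag
    elif 400 <= response.status < 500:
        level = logging.INFO     # user input at fault: lower alert level
    elif response.status >= 500:
        level = logging.ERROR    # service is breaking: should be noticed
    else:
        level = logging.WARNING  # anything unexpected
    return response.status, response.body, level


status, body, level = relay(BackendResponse(400, "invalid input"))
print(status, logging.getLevelName(level))  # 400 INFO
```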

We extended our LoadJsonDump maintenance script to let us import a range of Objects rather than the whole database, for faster local debugging based on production data. Now that all MediaWiki code requires PHP 8.1 or later, we started gently switching our code to not declare catch variable names that we don't use. We also now catch Exception rather than Throwable for user-response-level issues where catching Throwable is not appropriate.

We, along with all Wikimedia-deployed code, are now using the latest version of the Codex UX library, v2.2.0, as of this week. We believe that there should be no user-visible changes on Wikifunctions, so please comment on the Project chat or file a Phabricator task if you spot an issue.

Volunteers’ Corner on Monday

On Monday, 7 July 2025, at 17:30 UTC, we will have our monthly Volunteers’ Corner. Unless you have many questions, we will follow our usual agenda: giving updates on upcoming plans and recent activities, leaving plenty of time and space for your questions, and building a Function together. Looking forward to seeing you online on Monday!

Fresh Functions weekly: 47 new Functions

This week we had 47 new functions. Here is a list of functions with implementations and passing tests to get a taste of what functions have been created. Thanks everybody for contributing!

A complete list of all functions sorted by when they were created is available.