Wikifunctions:Type proposals/Gregorian calendar date

From Wikifunctions

Done Z20420

Summary

A Gregorian calendar date identifies a specific day using the Gregorian calendar system introduced in 1582. It is the most widely used calendar system today.

The Type is proleptic, i.e. it is also calculated backwards for dates before its introduction. There is no year 0; another Type can be introduced that has a year 0. The Type is naïve with regard to UTC, i.e. it ignores time zones because it only resolves to the level of days. When we introduce Functions and Types with a higher resolution, we will need to resolve possible discrepancies.

Uses

  • Why should this exist?

In order to be able to reference dates and have functions that work with dates.

  • What kinds of functions would be created using this?
    • How old was a person when they died?
    • How many days have passed between two dates?
    • What day of the week was a certain day (requires days of the week as a type)
    • What is this date in another calendar? (requires the other calendar)
    • What is the Julian number of a given date?
    • When is Easter Sunday in a given year? (one of the main use cases for introducing the calendar)
  • What standard concepts, if any, does this align with?

The Gregorian calendar date is widely used. It was introduced through the Papal bull Inter gravissimas.

This is not the same as the time datatype in Wikidata, but it can be used when working with that datatype.

Structure

A Gregorian calendar date has two keys:

  1. K1 of Type Gregorian year
  2. K2 of Type Day of the Roman year

Example values

Value for October 27, 2014:

{
  "type": "Gregorian calendar date",
  "year": {
    "type": "Gregorian year",
    "era": "CE",
    "year": {
      "type": "Natural number",
      "value": "2014"
    }
  },
  "day of the year": {
    "type": "Day of the Roman year",
    "month": "October",
    "day": {
      "type": "Natural number",
      "value": "27"
    }
  }
}

The same value written with ZIDs (Znnn, Zppp, Zqqq and Zmmm are placeholders):

{
  "Z1K1": "Znnn",
  "ZnnnK1": {
    "Z1K1": "Zppp",
    "ZpppK1": "Zqqq",
    "ZpppK2": {
      "Z1K1": "Z13518",
      "Z13518K1": "2014"
    }
  },
  "ZnnnK2": {
    "Z1K1": "Zmmm",
    "ZmmmK1": "Z16110",
    "ZmmmK2": {
      "Z1K1": "Z13518",
      "Z13518K1": "27"
    }
  }
}

Validator

The validator ensures that:

  • February 29 only appears in leap years
  • Further validation will be performed by the types used in the keys.
  • If we limit the years, the validator should implement the limitations
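
The February 29 check can be sketched in Python as follows. This is only an illustration, not the validator used on Wikifunctions: the function names are hypothetical, and the sketch assumes the year arrives as a single proleptic integer in which 0 stands for 1 BC, so that the usual divisibility rule also works before year 1.

```python
def is_leap_year(year: int) -> bool:
    """Gregorian leap-year rule on a proleptic year number (assumed: 0 stands for 1 BC)."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def february_29_is_valid(year: int, month: int, day: int) -> bool:
    """Return False only for a February 29 that falls in a non-leap year.

    All other checks (month in 1..12, day within the month) are assumed to be
    handled by the validators of the key types.
    """
    if month == 2 and day == 29:
        return is_leap_year(year)
    return True
```

For example, `february_29_is_valid(2024, 2, 29)` holds, while `february_29_is_valid(1900, 2, 29)` does not, because 1900 is divisible by 100 but not by 400.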

Identity

Two dates are the same if their day of the year and their year are the same.

Converting to code

Python

Here are three proposals for how to convert to Python.

4 keys

We convert the Gregorian calendar date into a dictionary with the following structure (for the above example date):

{
  'K1': True,
  'K2': 2014,
  'K3': 10,
  'K4': 27
}

3 keys

1 BC is represented by 0, and 2 BC by -1, etc.

{
  'K1': 2014,
  'K2': 10,
  'K3': 27
}
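
The year mapping used by the three-key form (1 BC is 0, 2 BC is -1, and so on) can be sketched like this. The helper names and the boolean era flag are assumptions made for the illustration; they are not part of the proposal.

```python
from typing import Tuple


def to_integer_year(is_ce: bool, year: int) -> int:
    """Map an (era, year) pair without a year 0 to a single integer.

    Years CE map to themselves; 1 BC maps to 0, 2 BC to -1, and so on.
    """
    if year < 1:
        raise ValueError("the year within an era must be at least 1")
    return year if is_ce else 1 - year


def from_integer_year(value: int) -> Tuple[bool, int]:
    """Inverse mapping: return (is_ce, year within the era)."""
    if value >= 1:
        return (True, value)
    return (False, 1 - value)
```

With these helpers, `to_integer_year(False, 2)` gives -1 and `from_integer_year(0)` gives `(False, 1)`, i.e. 1 BC.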

2 keys

We could use a two-key object, with one key being Python's date object, and the other being an offset. The offset must be a multiple of 400, in order to ensure that weekdays line up. It is usually 0, unless it is out of range for Python (i.e. after December 31st 9999 or before January 1st 1). For conversion, the offset is a multiple of 2000. The multiple can be negative.

{
  'K1': datetime.date(2014, 10, 27),
  'K2': 0
}

The proper handling of the offset is a bit iffy.
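
One possible way to handle the offset, as a sketch only: the function names are hypothetical, the input year is assumed to be a proleptic integer with 0 standing for 1 BC, and out-of-range years are shifted into the years 1 to 2000 by a multiple of 2000. Because 2000 is a multiple of 400, leap years and weekdays of the shifted date match those of the original date.

```python
import datetime

STEP = 2000  # a multiple of 400, so leap years and weekdays line up


def to_two_keys(year: int, month: int, day: int) -> dict:
    """Split a date into a datetime.date plus a year offset.

    Assumption: `year` is a proleptic integer year with 0 standing for 1 BC.
    """
    if datetime.MINYEAR <= year <= datetime.MAXYEAR:
        offset = 0
    else:
        # Shift the year into the range 1..2000, which Python supports.
        offset = ((year - 1) // STEP) * STEP
    return {'K1': datetime.date(year - offset, month, day), 'K2': offset}


def from_two_keys(value: dict) -> tuple:
    """Recombine the two keys into (year, month, day)."""
    d = value['K1']
    return (d.year + value['K2'], d.month, d.day)
```

For a date within Python's range, such as `to_two_keys(2014, 10, 27)`, the offset stays 0; for the year 10000 the sketch stores the year 2000 with an offset of 8000.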

JavaScript

4 keys

We will use the following object to convert to:

{
  K1: true,
  K2: 2014n,
  K3: 9,
  K4: 27
}

Note that, as with Gregorian calendar months, months are counted starting from 0, i.e. October is 9, not 10.

3 keys

We will use the following object:

{
  K1: 2014n,
  K2: 9,
  K3: 27
}

Non-positive numbers for K1 represent the years BC, with 0 being 1 BC, -1 being 2 BC, etc.

2 keys

The language standard Date object has an impressive range, covering more than a quarter million years into the future and the past (to be exact, from 20 April 271821 BCE to 13 September 275760 CE). Nevertheless, in order to cover the unlimited range of the Wikifunctions type, we need more.

We use a two-key object, with one key being JavaScript's Date object, and the other being an offset. The offset must be a multiple of 200000 as a BigInt, in order to ensure that weekdays line up. It is usually 0, unless it is out of range for JavaScript (i.e. after September 13th 275760 or before April 20th 271821 BC). The multiple can be negative.

{
  K1: new Date(2014, 9, 27), // month index 9 is October
  K2: 0n
}

The proper handling of the offset is a bit iffy.

Pure date object

We could limit the Gregorian calendar date to an arbitrary range of years that lies within the range of the JavaScript Date object, e.g. by requiring that all dates are before 100000 AD and after 100000 BC. In that case we can just use JavaScript's built-in Date object directly.

Renderer

Renderers depend on the language. We will start with a general renderer outputting an ISO string as the default behaviour, i.e. “2014-10-27 CE”, but we will have a configuration that can be adjusted for a given language, e.g. "27 October 2014" or "le 27 octobre 2014 AD".
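
A minimal sketch of such a renderer in Python, matching the “2014-10-27 CE” example. The function names, the boolean era flag, the "BCE" label, the zero-padding of the year to four digits, and the English variant are all assumptions made for the illustration.

```python
MONTHS_EN = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def render_default(is_ce: bool, year: int, month: int, day: int) -> str:
    """Default rendering: an ISO-like string followed by the era."""
    era = "CE" if is_ce else "BCE"
    return f"{year:04d}-{month:02d}-{day:02d} {era}"


def render_english(is_ce: bool, year: int, month: int, day: int) -> str:
    """Example of a language-specific configuration (English, day first)."""
    text = f"{day} {MONTHS_EN[month - 1]} {year}"
    return text if is_ce else f"{text} BCE"
```

Here `render_default(True, 2014, 10, 27)` returns `2014-10-27 CE`, and `render_english(True, 2014, 10, 27)` returns `27 October 2014`.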

Parsers

Parsers depend on the language. We will start with a general parser that can take an ISO string as the default behaviour, but we will have a configuration that can be adjusted for a given language.
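
A matching default parser could look roughly like the following sketch. The accepted format (year, month, day separated by hyphens, with an optional "CE" or "BCE" suffix that defaults to CE) and the function name are assumptions; the sketch only checks coarse ranges and leaves leap-year validation to the type's validator.

```python
import re

# year-month-day, optionally followed by a space and the era
_DATE_PATTERN = re.compile(r"^(\d+)-(\d{2})-(\d{2})(?: (CE|BCE))?$")


def parse_default(text: str) -> tuple:
    """Parse an ISO-like string into (is_ce, year, month, day)."""
    match = _DATE_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised date string: {text!r}")
    year = int(match.group(1))
    month = int(match.group(2))
    day = int(match.group(3))
    is_ce = match.group(4) != "BCE"  # a missing era is read as CE
    if year < 1 or not 1 <= month <= 12 or not 1 <= day <= 31:
        raise ValueError(f"date components out of range: {text!r}")
    return (is_ce, year, month, day)
```

For example, `parse_default("2014-10-27 CE")` yields `(True, 2014, 10, 27)`.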

Alternatives

  1. We could use different calendars for dates. And we certainly should! This is just to support a first calendar. Proposals for other calendars are welcome.
  2. We could follow ISO 8601 and have a year 0. But this would be inconsistent with most usages on Wikipedia. The suggestion is that we should have an ISO 8601 compatible calendar date as its own Type.
  3. The Type could be non-proleptic, i.e. not allow dates before its introduction (though its introduction varied by location and polity, so this becomes complicated).
  4. The Type could use both the Julian calendar before the introduction of the Gregorian calendar, and Gregorian afterwards, instead of being proleptic. Whereas such a date Type might be very interesting, as it may be the closest to what most written texts including Wikipedia and encyclopaedias are doing, it would be very difficult to implement correctly, might be confusing for users, and it would need an underlying proleptic Gregorian calendar date as a supporting Type anyway. So, we start here with the proleptic Gregorian calendar date, and allow for the development of a more complex Type later, that supports a mixed calendar model.
  5. Instead of using two keys with the new “day of the Roman year” Type and “Gregorian year” type, we could have a flatter representation with four keys, for a day, month, year, and era. Since both these subtypes seem useful in their own right, we used the more composed approach instead.
  6. Some mixes between the previous and current proposal could also be possible, i.e. flatten the day of the year but not the year or the other way around.
  7. Instead of using a year and an era, we could use the Integer Type, and interpret negative numbers as being BCE. This seems more aligned with the ISO 8601 calendar though, which allows a year 0. Since we do not have a year 0, using the Integer Type could more easily lead to mistakes.
  8. We could represent every day with just an Integer for the Julian day number, and make it look like a calendar day using parsers and renderers.
  9. The Type could be aware of UTC and define itself with a specific time zone in mind. There is a necessity for a naive date type, in order to express birthdays, events, etc., which often are intentionally naive with regards to a timezone (e.g. if a person is born in San Francisco at 23:30 on December 31st 2000, the person would have been born on January 1st 2001 at 7:30 UTC; we don’t want to record their birthdate as January 1st 2001 instead of December 31st 2000). So we need to have Functions that assume naivety with regards to UTC.
  10. Instead of leaving unlimited time frames, we could stop at some big (but ultimately arbitrary) date, e.g. 100,000 BCE to 100,000 CE. Given the imprecision of the Gregorian calendar and the change in speed of the Earth, it is likely that the Gregorian proleptic calendar would fail outside of this time frame anyway. In addition, this would allow us to use the built-in JavaScript Date object, which could be a real advantage of this limitation. Dates outside this timeframe seem extremely rare.
  11. We could even constrain it to the space that Python covers (from 1 CE to 9999 CE), but that seems too limiting.
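
To illustrate alternative 8, here is a sketch of the conversion from a proleptic Gregorian date to its Julian day number, using a standard integer-arithmetic formula that is not taken from this proposal. The function names are hypothetical, and the year is assumed to be a proleptic integer with 0 standing for 1 BC.

```python
def julian_day_number(year: int, month: int, day: int) -> int:
    """Julian day number of a proleptic Gregorian date.

    Assumption: `year` is a proleptic integer year (0 stands for 1 BC).
    Python's floor division keeps the formula valid for negative years.
    """
    a = (14 - month) // 12      # 1 for January and February, else 0
    y = year + 4800 - a         # count years from a March-based epoch
    m = month + 12 * a - 3      # March is 0, ..., February is 11
    return (day + (153 * m + 2) // 5 + 365 * y
            + y // 4 - y // 100 + y // 400 - 32045)


def days_between(first: tuple, second: tuple) -> int:
    """Number of days from `first` to `second`, each a (year, month, day) tuple."""
    return julian_day_number(*second) - julian_day_number(*first)
```

With a single integer per day, the difference between two dates is a plain subtraction, and a span of 400 years always comes out as 146,097 days.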

Discussion

  • Support as proposer with the three-key representation. --DVrandecic (WMF) (talk) 20:30, 26 June 2024 (UTC)
    Another alternative is a variation of 8 that recognises that the Gregorian calendar is a 146,097-day cycle. Specifying a Natural number representing the day within the cycle and an integer representing the cycle is guaranteed to convert as simply as possible. GrounderUK (talk) 18:23, 27 November 2024 (UTC)
    (Bearing in mind that 146,097 is a multiple of seven, so the weekdays also repeat.) GrounderUK (talk) 19:22, 27 November 2024 (UTC)
    Oppose this, we should approach how people think of a calendar. This is convincing in its simplicity. Maybe its own type in the mid-far future? Feeglgeef (talk) 19:28, 27 November 2024 (UTC)
    Yeah, that’s why I flagged it as an “alternative”. It’s relevant for extensions beyond the ranges supported by date types in Python and JavaScript, however, as in Z20311. GrounderUK (talk) 19:52, 27 November 2024 (UTC)
  • Support: will be a useful type --Ameisenigel (talk) 17:54, 6 July 2024 (UTC)
  • I'm personally in favor of a three key type converter, where K1 is the ISO year, same as the existing year type converter, K2 is the month, from 1-12, and K3 is the day, from 1-31. This would be better than the offset system, which I think will lead to confusion and complicate things, and better than the 4 key system, because it matches the year conversion and is easier to work with. Feeglgeef (talk) 17:29, 27 November 2024 (UTC)
    @Feeglgeef I like the proposal in general, but shouldn't the month be 0-11 in JavaScript and 1-12 in Python, to keep it consistent with the respective languages? --Denny (talk) 18:53, 27 November 2024 (UTC)
    Yes, sorry! Feeglgeef (talk) 19:11, 27 November 2024 (UTC)
    1. I can’t see why the Wikifunctions representation of month would be anything other than Z16098. This avoids any possibility of confusion between days and months. It already converts to an integer in Python and JavaScript and I’m not aware of any issues with that.
    2. Automatic conversion to native Date representations in code is a high priority. I don’t see that introducing an intermediate representation (as in the original proposal) is an advantage.
    3. Avoiding a year zero is desirable, but precise dates from the period are uncommon and, of course, were not recorded using this calendar.
    4. For years, I do see advantages in consistency with ISO 8601, however. It may be unimportant to recall that the ISO 8601 representation of a year is a string with a minimum of four characters (where year 0000 represents 1 BC). Years outside this range require an initial + or - character. This converts easily to an integer, of course, but the decision of when to convert it for Wikifunctions seems finely balanced. A hybrid representation with an ISO 8601 string year and a Z16098 seems a viable date object, at least, and would offer simpler conversions to an ISO 8601 type, once it’s available.
    5. It would seem a little odd not to extend that thinking to the day as well. However, I think people generally intuit the day of the month as a positive Natural number. I am tempted to propose a new type of “little counting number” that represents the natural numbers from 1 to 31, but I won’t.
    6. I see no real advantage in embedding the day of the month within a Z20342 (in the case where the year is known), but consistency between the date type and Z20342 should be conserved, even if that means changes to Z20342. This reinforces point 1.
    7. Neutral: {Z6, Z16098, Z13518}, where Z6 is an ISO 8601 representation of the year (which would be better as a specific subtype of Type Z6, even if that is a general “constrained string” of some kind (with a Regular Expression filter, for example)).
    8. Neutral: {Z16683, Z16098, Z13518}. This is likely to be less efficient because of the explicit Z16659 in the Z16683.
    GrounderUK (talk) 11:25, 28 November 2024 (UTC)
  • I'm going to create a definition of this type from scratch, as I'd like to make this technically real-calendar independent. Below is a possibly up-for-interpretation definition I will use.
    • "The start of 30 Nov" will be the start of the date 30 November, 2024 in UTC, around the time this comment was sent. It is the Unix timestamp in seconds 1732924800. You can view said time in your timezone here
With that out of the way, here's my definition for this type:
A day is equal to the duration of 794,243,384,932,000 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the caesium 133 atom, at rest, at a temperature of absolute zero
A month can be equal to anywhere from 28-31 days.
  • January is 31 days long
  • February is 28 or 29 days long, depending on whether the day is in a leap year.
  • March is 31 days long
  • April is 30 days long
  • May is 31 days long
  • June is 30 days long
  • July is 31 days long
  • August is 31 days long
  • September is 30 days long
  • October is 31 days long
  • November is 30 days long
  • December is 31 days long
A literal year is equal to the duration of 290,091,439,521,026,010 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the caesium 133 atom, at rest, at a temperature of 0 Kelvin
To advance days, you add one to the day number in the month. If this number becomes larger than the length of the month, the day number is set to 1 and the month is incremented by 1. If the month is December, one is added to the calendar year counter, and the month counter is reset to January and the day counter to 1.
The start of 30 Nov is the time at which the day 30 November 2024 (Day counter: 30, Month: November, Calendar year: 2024) is started, thus, 1 December 2024 (Day counter: 1, Month: December, Calendar year: 2024) counter starts one day after 30 November.
This can also be applied in reverse, so the 29 November 2024 (Day counter: 29, Month: November, Calendar year: 2024) starts one day before The start of 30 Nov.
Because there is not a round number of days in one literal year, some calendar years are leap years, where February has 29 days. February only has 29 days in calendar years where the number is divisible by 4, and either the number is not divisible by 100 or the number is divisible by 400.
Days can be added or subtracted infinitely. The only anchor for this system is The start of 30 Nov.
When the Calendar year goes to or below 0, the date is in the era BC. Otherwise, the date is in the era AD.
Every day has an attached Weekday. Weekdays are a cycle of 7. They are Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, where adding one day moves one forward, and removing one day moves one backward. The day of 30 November 2024 is Saturday.
I hope this clears everything up. Thanks! Feeglgeef (talk) 00:06, 30 November 2024 (UTC)
The current definition of the SI second is likely to be superseded within ten years.
https://www.bipm.org/en/redefinition-second
Although it is unlikely that additional leap seconds will be agreed between now and 2035 (when it has been agreed that they will cease), it makes sense to define tO as some point in 2036. I don’t see the relevance of the definition of a second for this particular type, however. GrounderUK (talk) 10:59, 30 November 2024 (UTC)
I don't see where the definition of a second is mentioned? This actively avoids said definition. Feeglgeef (talk) 13:06, 30 November 2024 (UTC)

Comments and Votes

Given the above options for conversion, which ones should be used?


@GrounderUK @Ameisenigel @Feeglgeef -- pings to see if you have thoughts on the three options. --Denny (talk) 20:14, 28 November 2024 (UTC)
@99of9: too Feeglgeef (talk) 20:39, 28 November 2024 (UTC)
JavaScript’s range is difficult to justify, so extending it further by the use of a BigInt for the year seems gratuitous. As it adds some complexity for the coder if the three numbers are of different types, I would prefer to see the year being converted into a Number (integer) (or an ISO 8601 string if the BigInt range is important).
On the assumption that some sort of access to common Python functions (implementations) will become available even if full re-entrancy (across languages) is delayed, I’m inclined to oppose conversion to a Python Date object with offset. I don’t oppose a separate value representing the Gregorian era, allowing the avoidance of a year zero but it would be confusing to diverge from JavaScript for little benefit. (This would add a little weight to the option for the year to be an ISO 8601 string in both conversions but I don’t believe Python natively supports the extended range above 9999 or below 0000, so, in fact, it pushes us back to an integer year.)
So, yeah…
Support: Three keys, with the year as either Number or BigInt in JavaScript, depending on whether the range is limited by the converter (which I would argue is sensible, even if the JavaScript limits are not chosen; the actual limits could be changed with little or no impact, so long as they remain within the safe integer range, I believe). GrounderUK (talk) 12:25, 29 November 2024 (UTC)
Oppose: On reflection, I believe that year, month and day should all be converted to BigInt in JavaScript. This is because accompanying number arguments will be BigInt-based and having to approach months and days differently from years is an unnecessary source of confusion and error. (It will also be consistent with Python.) I also believe that it was a mistake to implement Z20342 with conversions to Number rather than BigInt (as in Z20236). We should change this while we still have very few JavaScript implementations (like Z20335, where the impact is small).
(For the avoidance of doubt, I remain ambivalent on the question of limiting years in any way, but this would be an irrelevant consideration if year, month and day are all BigInt.) GrounderUK (talk) 12:40, 1 December 2024 (UTC)
Support, this is fine. Feeglgeef (talk) 12:58, 1 December 2024 (UTC)
On second thought, I Oppose this, making months and days BigInts is counterintuitive, especially with what the BigInt spec says they are for. Feeglgeef (talk) 01:06, 3 December 2024 (UTC)
This argument-typing issue has the potential to be very difficult. I agree that if we have 3 JS keys, they are better off all being either int or bigint (I'd go with int though, to simplify the efforts of implementation writers, and because int covers the natural range of all three args). But I see that if we are calling our other functions, they are currently bigint based. Are we anticipating never ever having a small int type? If we anticipate one, maybe we should make that first? It might also help to have more information about how re-entrancy will work. If one JS function calls another inline, will the object returned go through conversion from code and then conversion to code before it is returned, or do we get it directly? 99of9 (talk) 00:04, 3 December 2024 (UTC)
Instead of re-entrancy, what we really should do is eval the functions themselves inside the evaluator, avoiding the whole re-entrancy thing. This does mean that a function must have an implementation in the programming language you are using, but this is a sacrifice worth making. Otherwise, I see a large amount of implementations that are too slow to run. Feeglgeef (talk) 01:04, 3 December 2024 (UTC)
  • I'll put my placeholder vote in, but I feel we still need discussion/information to get this right. For Python I agree with 3 keys. For JS: my preference if we can get it to work is for a single key in Date type. My second preference would be three keys all int. My third would be three keys all bigint. My fourth would be three keys with int,int,bigint. I don't put any value on Gregorian dates outside the (-250k,+250k) range, so am happy for the JS code-converter to return an error if infinite-input is allowed through the validator. But as in my reply to GrounderUK above, if he is right about re-entrancy issues, then my programmer-centric-idealism order may have to change. --99of9 (talk) 00:18, 3 December 2024 (UTC)
    You not putting any value on dates outside 250000>x>-250000 does not mean that others won't! Feeglgeef (talk) 01:07, 3 December 2024 (UTC)

Decision

Unfortunately, we didn't come to a consensus, but the discussion has quieted down. So I made a decision to go with the three key solution for both programming languages. For Python it was uncontroversial, for JavaScript there have been good arguments laid out for both the 1-key and 3-key solution. I chose the 3-key solution in order to not have an arbitrary (although very defensible) limit on the type. I also chose to remain consistent with the JavaScript representations of the components, Z20342 and Z20159. I have seen good arguments for other options, but in the end I decided that consistency is the way to go.

Z20420 is the type, and I hope I got the conversions right. A few functions have already been created, but the implementations are not perfect yet, which should become visible with more test cases:

  • Z20430, this one should be good
  • Z20440, this one should be fine for Python, but JavaScript should run into issues beyond the JS date type borders
  • Z20421, this one should run into issues both for Python and for JavaScript beyond their respective borders

For now, I am still leaving a warning on the type, but I plan to remove this within 24 hours unless significant issues are discovered.

Thanks everybody for joining this lively discussion! --DVrandecic (WMF) (talk) 16:07, 10 December 2024 (UTC)