Wikifunctions:Type proposals/SI units

From Wikifunctions

Why

These types will allow for smoother use of SI units on Wikifunctions. Having types for these units will ensure consistency when measurements are used, allowing for:

  • The chaining of SI Unit functions.
  • Consistency in what is shown to a user to represent SI units across Wikifunctions.
  • Consistency in the parser of SI unit inputs.

SI units are proposed first, ahead of other systems of measurement. Types for other measurement systems can be created later and will benefit from conversion to and from the SI type. Starting with SI has the following advantages:

  • Its wide adoption. SI is used almost universally, with the most notable exceptions being the US and, for some everyday measurements, the UK.
  • Its coverage. SI covers almost all quantities you would need to measure, which is not true of other systems.
  • Its base-10 structure. Today, base 10 feels more intuitive to more people and is more widely used than systems built on other bases, such as the base-12 subdivisions found in the imperial system.

Types

This proposal would create 2 types.

SI Unit

This type identifies which unit is being used. It can be taken as input or produced as output by functions. It may be referred to by ZAAA. It should have 8 keys: 7 integers and one string. The 7 integers give the exponent to which each SI base unit is raised. For example, meters cubed (m^3) would have a 3 for meters and a 0 for everything else. Newtons would have a -2 for seconds, a 1 for kilograms, and a 1 for meters. The last key is a string that represents the name of the unit. It will be used to:

  • Disambiguate units that share a definition with other SI units, for example radian and steradian, or hertz and becquerel.
  • Allow readers and writers of code to easily tell what a unit is, without having to work it out manually.

The name will not be required, and functions that input and output multiple types of units should leave the name in the output blank.
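
As a rough illustration of the shape described above, the following Python sketch models the unit as seven integer exponents plus an optional name. The class name `SIUnit`, the field names, and the helper `same_dimension` are hypothetical and are not part of the proposal; the key order (seconds, meters, kilograms, ...) follows the example values further down the page and could change.

```python
from typing import NamedTuple


class SIUnit(NamedTuple):
    """Hypothetical model: seven base-unit exponents plus an optional name."""
    second: int = 0
    meter: int = 0
    kilogram: int = 0
    ampere: int = 0
    kelvin: int = 0
    mole: int = 0
    candela: int = 0
    name: str = ""  # optional; left blank when the unit has no known name


# Meters cubed: exponent 3 for meters, 0 for everything else.
CUBIC_METER = SIUnit(meter=3)

# Newton: kg * m / s^2. The name here is the Wikidata ID used in the examples table.
NEWTON = SIUnit(second=-2, meter=1, kilogram=1, name="Q12438")

# Hertz and becquerel have identical exponents and differ only by name.
HERTZ = SIUnit(second=-1, name="Q39369")
BECQUEREL = SIUnit(second=-1, name="Q102573")


def same_dimension(a: SIUnit, b: SIUnit) -> bool:
    """Compare only the seven exponents, ignoring the optional name."""
    return a[:7] == b[:7]
```

With this model, `same_dimension(HERTZ, BECQUEREL)` is true while `HERTZ == BECQUEREL` is false, which is the disambiguation the name key is meant to provide.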

SI Measurement

This type represents a specific SI measurement. It will contain a rational number, which may be called ZRR, and an SI Unit. It may be referred to by ZBBB. The rational number should be the value expressed in the included unit, without a metric prefix (other than the kilo in kilogram), which means that 10 km should be represented as 10,000 m with this type.
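
The prefix-free storage rule can be sketched in Python as follows. This is only an illustration: `PREFIX_FACTORS` is a small hypothetical subset of the metric prefixes, `make_measurement` is an invented helper, and Python's `Fraction` stands in for the Wikifunctions rational number type.

```python
from fractions import Fraction

# Hypothetical subset of metric prefixes and their factors.
PREFIX_FACTORS = {
    "": Fraction(1),
    "k": Fraction(1000),
    "c": Fraction(1, 100),
    "m": Fraction(1, 1000),
}

# Exponents in the order seconds, meters, kilograms, amperes, kelvins, moles, candelas.
METER = (0, 1, 0, 0, 0, 0, 0)


def make_measurement(value, prefix, unit):
    """Store the value with the prefix already multiplied in."""
    return [Fraction(value) * PREFIX_FACTORS[prefix], unit]


ten_km = make_measurement(10, "k", METER)        # stored as 10000 m
half_cm = make_measurement("0.5", "c", METER)    # stored as 1/200 m
```

Keeping the stored value prefix-free means that arithmetic never has to reconcile two different prefixes; the prefix only reappears when the value is rendered.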

Renderer/Parser

For ZAAA, the symbol of the unit should be used if the unit can be written as a single symbol. If not, the renderer should chain symbols together with multiplication. For ZIII, the renderer should get the symbol if available and otherwise fall back to the chained form used for ZHHH. It should then choose the correct metric prefix, divide the value by the value of that prefix, and concatenate the results.

Examples
ZRRRK1  ZRRRK2  ZBBBK1  ZBBBK2  ZBBBK3  ZBBBK4  ZBBBK5  ZBBBK6  ZBBBK7  ZBBBK8      Output
5800    1       0       1       0       0       0       0       0       Q11573      5.8 km
5800    1       -2      1       1       0       0       0       0       Q12438      5.8 kN
1       100     -2      1       1       0       0       0       0       Q104180541  1 cm*kg/s^2
5800    1       0       -1      1       0       0       0       0       Q92896481   5.8 km/kg
5800    1       -1      0       0       0       0       0       0       Q39369      5.8 kHz
5800    1       -1      0       0       0       0       0       0       Q102573     5.8 kBq
5800    1       -1      0       0       0       0       0       0       Q104180541  5.8 ks^-1

The parser should perform the inverse of this rendering.
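
The prefix-selection step of the renderer might look something like the Python sketch below. It assumes the unit symbol has already been looked up, picks the largest prefix that leaves at least one digit in the ones place, and does not handle the special case of the kilogram. The `PREFIXES` table and the function name `render` are hypothetical.

```python
from fractions import Fraction

# Hypothetical subset of metric prefixes, largest factor first.
PREFIXES = [
    (Fraction(10**9), "G"),
    (Fraction(10**6), "M"),
    (Fraction(10**3), "k"),
    (Fraction(1), ""),
    (Fraction(1, 10**2), "c"),
    (Fraction(1, 10**3), "m"),
    (Fraction(1, 10**6), "µ"),
]


def render(value: Fraction, symbol: str) -> str:
    """Pick a prefix, divide the value by its factor, and concatenate."""
    magnitude = abs(value)
    for factor, prefix in PREFIXES:
        if magnitude >= factor:
            break
    else:
        # Zero gets no prefix; anything smaller than the table covers
        # falls back to the smallest listed prefix.
        factor, prefix = (Fraction(1), "") if magnitude == 0 else PREFIXES[-1]
    scaled = value / factor
    return f"{float(scaled):g} {prefix}{symbol}"


print(render(Fraction(5800), "m"))    # 5.8 km
print(render(Fraction(5800), "N"))    # 5.8 kN
print(render(Fraction(1, 100), "N"))  # 1 cN
```

A parser would run the same table in reverse: split off the prefix, multiply the number by its factor, and look up the exponents for the remaining symbol.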

Comparisons

For SI Measurements, comparisons can be made based on the values. If the two compared measurements have different units, either false or void should be returned. For SI Units, two are equal if they have the same values (with some consideration given to tolerances for the equality of the rational numbers).
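
A minimal Python sketch of this comparison rule, assuming a measurement is a `[value, unit]` pair with a `Fraction` value and a 7-tuple of exponents. The function name is hypothetical, and `None` stands in for "void". Exact rational arithmetic is used here, so no tolerance is applied.

```python
from fractions import Fraction
from typing import Optional

KILOGRAM = (0, 0, 1, 0, 0, 0, 0)
METER = (0, 1, 0, 0, 0, 0, 0)


def compare_measurements(a, b) -> Optional[int]:
    """Return -1, 0 or 1 for comparable measurements, None if the units differ."""
    value_a, unit_a = a
    value_b, unit_b = b
    if tuple(unit_a) != tuple(unit_b):
        return None  # different units: not comparable
    return (value_a > value_b) - (value_a < value_b)


print(compare_measurements([Fraction(5, 2), KILOGRAM], [Fraction(10, 4), KILOGRAM]))  # 0
print(compare_measurements([Fraction(5, 2), METER], [Fraction(5, 2), KILOGRAM]))      # None
```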

Example values

ZHH Newtons

{
  "type": "derived-unit-supporting SI unit",
  "seconds": {
    "type": "integer",
    "sign": "negative",
    "absolute value": {
      "type": "natural number",
      "value": "2"
    }
  },
  "meter": {
    "type": "integer",
    "sign": "positive",
    "absolute value": {
      "type": "natural number",
      "value": "1"
    }
  },
  "kilogram": {
    "type": "integer",
    "sign": "positive",
    "absolute value": {
      "type": "natural number",
      "value": "1"
    }
  },
  "ampere": {
    "type": "integer",
    "sign": "neutral",
    "absolute value": {
      "type": "natural number",
      "value": "0"
    }
  },
  "kelvin": {
    "type": "integer",
    "sign": "neutral",
    "absolute value": {
      "type": "natural number",
      "value": "0"
    }
  },
  "mole": {
    "type": "integer",
    "sign": "neutral",
    "absolute value": {
      "type": "natural number",
      "value": "0"
    }
  },
  "candela": {
    "type": "integer",
    "sign": "neutral",
    "absolute value": {
      "type": "natural number",
      "value": "0"
    }
  }
}
{
  "Z1K1": "ZHHH",
  "ZHHHK1": {
      "Z1K1": "Z16683",
      "Z16683K1": "Z16662",
      "Z16683K2": {
        "Z1K1": "Z13518",
        "Z13518K1": "2"
      }
  },
  "ZHHHK2": {
      "Z1K1": "Z16683",
      "Z16683K1": "Z16660",
      "Z16683K2": {
        "Z1K1": "Z13518",
        "Z13518K1": "1"
      }
  },
  "ZHHHK3": {
      "Z1K1": "Z16683",
      "Z16683K1": "Z16660",
      "Z16683K2": {
        "Z1K1": "Z13518",
        "Z13518K1": "1"
      }
  },
  "ZHHHK4": {
      "Z1K1": "Z16683",
      "Z16683K1": "Z16661",
      "Z16683K2": {
        "Z1K1": "Z13518",
        "Z13518K1": "0"
      }
  },
  "ZHHHK5": {
      "Z1K1": "Z16683",
      "Z16683K1": "Z16661",
      "Z16683K2": {
        "Z1K1": "Z13518",
        "Z13518K1": "0"
      }
  },
  "ZHHHK6": {
      "Z1K1": "Z16683",
      "Z16683K1": "Z16661",
      "Z16683K2": {
        "Z1K1": "Z13518",
        "Z13518K1": "0"
      }
  },
  "ZHHHK7": {
      "Z1K1": "Z16683",
      "Z16683K1": "Z16661",
      "Z16683K2": {
        "Z1K1": "Z13518",
        "Z13518K1": "0"
      }
  }
}

ZIII 5.8 Newtons

{
  "type": "SI measurement",
  "unit": {
    "seconds": {
      "type": "integer",
      "sign": "negative",
      "absolute value": {
        "type": "natural number",
        "value": "2"
      }
    },
    "meter": {
      "type": "integer",
      "sign": "positive",
      "absolute value": {
        "type": "natural number",
        "value": "1"
      }
    },
    "kilogram": {
      "type": "integer",
      "sign": "positive",
      "absolute value": {
        "type": "natural number",
        "value": "1"
      }
    },
    "ampere": {
      "type": "integer",
      "sign": "neutral",
      "absolute value": {
        "type": "natural number",
        "value": "0"
      }
    },
    "kelvin": {
      "type": "integer",
      "sign": "neutral",
      "absolute value": {
        "type": "natural number",
        "value": "0"
      }
    },
    "mole": {
      "type": "integer",
      "sign": "neutral",
      "absolute value": {
        "type": "natural number",
        "value": "0"
      }
    },
    "candela": {
      "type": "integer",
      "sign": "neutral",
      "absolute value": {
        "type": "natural number",
        "value": "0"
      }
    }
  },
  "value": {
    "type": "rational number",
    "numerator": {
      "type": "integer",
      "sign": "positive",
      "absolute value": {
        "type": "natural number",
        "value": "29"
      }
    },
    "denominator": {
      "type": "natural number",
      "value": "5"
    }
  }
}
{
  "Z1K1": "ZIII",
  "ZIIIK1": {
      "Z1K1": "ZHHH",
      "ZHHHK1": {
          "Z1K1": "Z16683",
          "Z16683K1": "Z16662",
          "Z16683K2": {
            "Z1K1": "Z13518",
            "Z13518K1": "2"
          }
      },
      "ZHHHK2": {
          "Z1K1": "Z16683",
          "Z16683K1": "Z16660",
          "Z16683K2": {
            "Z1K1": "Z13518",
            "Z13518K1": "1"
          }
      },
      "ZHHHK3": {
          "Z1K1": "Z16683",
          "Z16683K1": "Z16660",
          "Z16683K2": {
            "Z1K1": "Z13518",
            "Z13518K1": "1"
          }
      },
      "ZHHHK4": {
          "Z1K1": "Z16683",
          "Z16683K1": "Z16661",
          "Z16683K2": {
            "Z1K1": "Z13518",
            "Z13518K1": "0"
          }
      },
      "ZHHHK5": {
          "Z1K1": "Z16683",
          "Z16683K1": "Z16661",
          "Z16683K2": {
            "Z1K1": "Z13518",
            "Z13518K1": "0"
          }
      },
      "ZHHHK6": {
          "Z1K1": "Z16683",
          "Z16683K1": "Z16661",
          "Z16683K2": {
            "Z1K1": "Z13518",
            "Z13518K1": "0"
          }
      },
      "ZHHHK7": {
          "Z1K1": "Z16683",
          "Z16683K1": "Z16661",
          "Z16683K2": {
            "Z1K1": "Z13518",
            "Z13518K1": "0"
          }
      }
   },
   "ZIIIK2": {
    "Z1K1": "ZRRR",
    "ZRRRK1": {
      "Z1K1": "Z16683",
      "Z16683K1": "Z16660",
      "Z16683K2": {
        "Z1K1": "Z13518",
        "Z13518K1": "29"
      }
    },
    "ZRRRK2": {
      "Z1K1": "Z13518",
      "Z13518K1": "5"
    }
   }
}

Converting to code

SI Unit

This should be represented as a tuple in Python and as an array in JavaScript. It should consist of 7 integer values.
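
A conversion to Python could be sketched as below, working from the human-readable object form shown under "Example values". The key names and their order are taken from that example; the function names and the `make_integer` helper (used only to build test input) are hypothetical.

```python
# Key order as used in the example values on this page.
KEY_ORDER = ("seconds", "meter", "kilogram", "ampere", "kelvin", "mole", "candela")


def integer_from_object(obj: dict) -> int:
    """Turn a sign-plus-absolute-value integer object into a Python int."""
    magnitude = int(obj["absolute value"]["value"])
    return -magnitude if obj["sign"] == "negative" else magnitude


def unit_to_tuple(unit_obj: dict) -> tuple:
    """Convert an SI Unit object into a 7-tuple of integer exponents."""
    return tuple(integer_from_object(unit_obj[key]) for key in KEY_ORDER)


def make_integer(n: int) -> dict:
    """Helper for building example input in the object form."""
    sign = "negative" if n < 0 else "positive" if n > 0 else "neutral"
    return {
        "type": "integer",
        "sign": sign,
        "absolute value": {"type": "natural number", "value": str(abs(n))},
    }


newton = dict(zip(KEY_ORDER, map(make_integer, (-2, 1, 1, 0, 0, 0, 0))))
print(unit_to_tuple(newton))  # (-2, 1, 1, 0, 0, 0, 0)
```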

SI Measurement

This should consist of a list, with a float or int as the first item and the language's conversion for SI Unit as the second.

Built-in functions to consider

Given the complexity of these types, I suggest that adding a few built-in functions be considered.

Multiply SI Measurements

This should multiply the two rational number values and add together the corresponding exponents in the units. For example: [5, (1, 0, 1, 0, 1, 0, 1)] * [7, (-1, 5, 2, -2, 0, 0, 0)] = [35, (0, 5, 3, -2, 1, 0, 1)]

Divide SI Measurements

This should divide the two rational number values and subtract the corresponding exponents in the units. For example: [5, (1, 0, 1, 0, 1, 0, 1)] / [7, (-1, 5, 2, -2, 0, 0, 0)] = [5/7, (2, -5, -1, 2, 1, 0, 1)], where 5/7 is approximately 0.714.

Multiply SI Units

This should add together the corresponding exponents in the units. For example: (1, 0, 1, 0, 1, 0, 1) * (-1, 5, 2, -2, 0, 0, 0) = (0, 5, 3, -2, 1, 0, 1)

Divide SI Units

This should subtract the corresponding exponents in the units. For example: (1, 0, 1, 0, 1, 0, 1) / (-1, 5, 2, -2, 0, 0, 0) = (2, -5, -1, 2, 1, 0, 1)
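
The four operations in this section can be sketched in Python as follows, using the tuple and list conversions described under "Converting to code". The function names are hypothetical, and `Fraction` is used instead of a float so that 5/7 stays exact.

```python
from fractions import Fraction


def multiply_units(a, b):
    """Multiplying units adds the exponents element by element."""
    return tuple(x + y for x, y in zip(a, b))


def divide_units(a, b):
    """Dividing units subtracts the exponents element by element."""
    return tuple(x - y for x, y in zip(a, b))


def multiply_measurements(a, b):
    """Multiply the values and multiply the units."""
    return [a[0] * b[0], multiply_units(a[1], b[1])]


def divide_measurements(a, b):
    """Divide the values and divide the units (raises ZeroDivisionError for a zero divisor)."""
    return [a[0] / b[0], divide_units(a[1], b[1])]


left = [Fraction(5), (1, 0, 1, 0, 1, 0, 1)]
right = [Fraction(7), (-1, 5, 2, -2, 0, 0, 0)]

print(multiply_measurements(left, right))  # [Fraction(35, 1), (0, 5, 3, -2, 1, 0, 1)]
print(divide_measurements(left, right))    # [Fraction(5, 7), (2, -5, -1, 2, 1, 0, 1)]
```

The results carry no unit name, in line with the rule that functions handling multiple types of units leave the name blank.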

Alternatives

Comments

  • Support as proposer. Feeglgeef (talk) 19:20, 27 October 2024 (UTC)[reply]
    • Please insert this link somewhere: w:International System of Units.
    • There are seven base units and twenty-two (named) derived units, apparently. I don’t see why we would not have an identity Type for each of these (“simple”) units. A structure that combines a number and a unit would naturally (or naively) be a Typed pair (Z882), a generic Type that would not need to be pre-defined (given full support for Typed pairs).
    • An array of such structures would naturally (or naively) be a Typed list (Z881) with Typed pair (“number”, unit) as its Type. (If we need to restrict “number” to a single numeric Type, I would favour Rational number.)
    • For a “measurement” (quantity, datum) using any of the 29 SI units, a “normalized” representation is proposed. Here, each (explicit or implicit) base unit becomes an explicit Key reference (Z39) with “number” as its associated value.
    • This representation can also be used for a measurement that uses an arbitrary combination of SI units.
    • Calculations follow the normal rules of arithmetic for the number part and normal rules of algebra for the unit part. The latter connects this proposal to phab:T378381.
    • In code, we may use a common representation for all SI unit quantities, whether or not specific units are used.
    • This tends to imply that the notion of equality will extend to different representations of the same quantity. (I don’t see any reason to resist this.)
    • In general, a function signature should specify the units that it expects and the units it produces. A function may accept or produce a “normalized” quantity and we should consider whether these might usefully be constrained in the function specification.
    • Whether converted explicitly or implicitly (by a Type converter to code (Z46)), the original (simple) Type can (should) be conserved for future reference. A function may also specify what (simple) Type of unit it is returning but type-checking will reject a result that conflicts with the function signature. It may also be appropriate to reject as invalid a result with a specified Type that is inconsistent with its base unit exponents. If not, the incompatible Type should be highlighted in the display of the returned result.
    • If a specific Type is specified in the function signature, the returned result will be converted to that Type by the appropriate Type converter from code (Z64). Otherwise, conversion from a normalized representation will always be explicit (either by specifying a unit Type within the calculating function or by means of a user-specified conversion function). If no explicit unit Type is specified, the result will be displayed using (by default, as a short form) the algebraic representation of the non-zero Keys (which view can be expanded to the full object form).
    • Metric prefixes are an integral part of the SI and deserve a mention. It would be convenient to have the option of specifying the prefix by means of a dropdown. Whether we should extend this to scaling options in the results is a question I am happy to leave to another time.
    GrounderUK (talk) 12:19, 30 October 2024 (UTC)[reply]
    • It's at the end
    • I'd like it to have its own parser and renderer, so I disagree with this.
    • I don't object to this.
    • This is for representing any unit, beyond the 29 that are in SI. For example, SI does not give a name to meter/kilo, but I'd like to be able to represent this with [1, -1, 0, 0, 0, 0, 0]
    • Yes
    • Sure
    • Yes
    • If you mean that 5/2 kg = 10/4 kg, yes. If you mean that 5/2 m = 5/2 kg, no.
    • Sure
    • Sure
    • Sure
    • I'd support a dropdown if available. Otherwise, the closest prefix that would allow for the ones place should be used.
    Oqwd3892 (talk) 12:49, 30 October 2024 (UTC)[reply]
  • I'm getting whiplash with all the total rewrites of this proposal, and deletion of all prior discussions. But I am relatively happy that this version is converging on what I was suggesting earlier. --99of9 (talk) 21:31, 29 October 2024 (UTC)[reply]
    I'm still not sold on the string ZBBBK8. For one, it seems to prioritise the English name, which is an error for multilingual WF. Personally I don't think Hertz and Becquerel need to be distinguished at the level of types. They can both just be returned as s^-1, then they can be called by functions that know what they are calling for. "Allow readers and writers of code to easily tell what a unit is" probably wouldn't be established by adding an extra key, and anyway, WF already has plenty of key ZIDs that are not explained within the code, that's just a consequence of having a multilingual project. 99of9 (talk) 21:42, 29 October 2024 (UTC)[reply]
    Oh, I forgot about the multilingual thing. I'd like to distinguish Hertz and Becquerels because of their symbol. The renderer would be weird to show an answer in hertz if it should be showing Becquerels. Would you consider using wikidata items for them, to not make it English-centered? This would mean that instead of "meters", "Q11573" would be used. Feeglgeef (talk) 21:56, 29 October 2024 (UTC)[reply]
    I'd suggest that it renders the answer as 1.1 s^-1 (possibly with HTML superscripting if we get fancy). That is neither clearly Hz nor clearly Becquerels, and the answer will make sense in the context of a function which expects to return either one. Remember that the renderer is just for the human audience of the website itself. If we want a function to return a radioactive activity as a string, we would need to process the underlying type anyway. Something like Activity_as_string(Calculate_SI_activity(), number_of_significant_figures). Within that kind of function we could easily set it to return "1.1 Bq" whenever it received an input of ([[−1,0,0,0,0,0,0],[+11,10]], 2). 99of9 (talk) 01:27, 30 October 2024 (UTC)[reply]
    I think that we should display it in a general population readable way. s^-1 looks weird and is harder to understand to the general public than the Hz symbol. Feeglgeef (talk) 01:51, 30 October 2024 (UTC)[reply]
    There are alternatives like 1.1 s−1 or 1.1 /s if that makes it easier to read. Short forms are well established in scientific calculations. Imagine getting back the result for the molar heat capacity of water "75.3 joule per mole kelvin" (label drawn from Q20966455) instead of 75.3 J⋅K−1⋅mol−1. The latter is much better than spelling it out in every answer, hence its adoption everywhere a measurement value is written in w:Molar_heat_capacity. 99of9 (talk) 02:21, 30 October 2024 (UTC)[reply]
    I'm not sure where you get outputting the full name from. All of the rows in the renderer table, especially the 4th row, support 75.3 J⋅K^-1⋅mol^-1. Where I disagree with you is Hz instead of s^-1, because Hz is explicitly labeled in SI. Feeglgeef (talk) 02:34, 30 October 2024 (UTC)[reply]
    I got it from the (English) label of joule per mole kelvin (Q20966455). Under what circumstances would you use the value of ZBBBK8 in your renderer output? In all of the examples in the table, the value of ZBBBK8 is ignored. If you never use it, it's pointless having a key for it. By suggesting using the QID in ZBBBK8, I thought you were planning to use the label of the QID as part of the rendered output? Can you add some lines for a measurement of a frequency and a measurement of an activity to show how you'd like to use/render them? 99of9 (talk) 02:50, 30 October 2024 (UTC)[reply]
    The value would be used to get what symbol is supposed to be used. If no matching symbol is found, it will use the values in ZBBBK1-ZBBBK2 to generate it, so [0,0,-1,0,0,0,0,Q39369] would give Hz, and [1,0,0,0,0,0,0,""] would give s^-1. The Wikidata ID is just to identify it, not to use any of its properties. Feeglgeef (talk) 03:02, 30 October 2024 (UTC)[reply]
    Also, if you look closely at 3, it is not ignoring it. It is reading it as invalid, and using cm*kg/s^2 instead of cN. Feeglgeef (talk) 03:06, 30 October 2024 (UTC)[reply]
    It should be noted that I made this change at only 1:00, and before it was only showing cN. Feeglgeef (talk) 03:42, 30 October 2024 (UTC)[reply]
    Done, rows 5-7. Oqwd3892 (talk) 18:48, 30 October 2024 (UTC)[reply]
    The users are ultimately more important here. If that means we have to add an optional extra item, I think it should be done. Perhaps, if nothing is specified, it can display s^-1. This would allow non-labeling if a label could not be returned or if the function author prefers not to use non-base symbols. Feeglgeef (talk) 01:56, 30 October 2024 (UTC)[reply]
    Regarding attaching a Wikidata item as a final key, I agree it would solve the language issue, but it would unnecessarily make functions using this type a fair bit more complicated. A simple example would be the divide_two_SI_units function, when sent a distance and a time to calculate a speed, it would receive Q11573 and Q11574, and would have to figure out to return Q182429 as part of the answer. --99of9 (talk) 01:45, 30 October 2024 (UTC)[reply]
    Functions that divide SI units (I'm actually in support of this being built in) should not include one. Under the Types section it says that "This will not be required, and functions that input and output multiple types of units should leave the name in the output blank". Feeglgeef (talk) 01:49, 30 October 2024 (UTC)[reply]
    Sorry, I didn't read that carveout carefully enough. In that case, I still see at least two issues. Firstly, many results of calculations would have that key blank, and would have to render the value into a string anyway, so we'd already need to set up the renderer well as per my comment at 01:27, which means that this key has minimal advantage for disambiguation. Secondly any composition that relied on "multiple type of unit" functions, but returns a consistent output unit would need to hardcode in the unit of its answer. So, for example Rectangular_prism_volume(height, length, depth) would receive Q11573 and in my model would simply be composed as multiply_3_SI_measurements(height,length,depth), in your model would also need to replace the blank return key with "Q25517". 99of9 (talk) 02:07, 30 October 2024 (UTC)[reply]
    My comment at 1:56 answers this. I'm in favor of displaying it in only SI base units if nothing is listed. This would
    • Allow the K8 key to be optional
    • Allow devs to explicitly decide to show s^-1 instead of Hz, should they want to.
    For the composition, either it can just leave it blank, or a function for adding a key could be made or built in. Feeglgeef (talk) 02:39, 30 October 2024 (UTC)[reply]
  • A minor tweak: I think we should match the order of keys to the order defined in ISO 80000-⁠1, which appears to be m, kg, s, A, K, mol, cd. 99of9 (talk) 02:38, 30 October 2024 (UTC)[reply]
    Yeah, I'm in favor of this. Feeglgeef (talk) 02:39, 30 October 2024 (UTC)[reply]
    I had originally gotten this from the Wikipedia article about it, but we should definitely prefer ISO over Wikipedia. Oqwd3892 (talk) 12:16, 30 October 2024 (UTC)[reply]
@99of9: In "Types for other measurement systems can be created later", what systems are you referencing? The only 2 things I can think of are the imperial system and currency. Feeglgeef (talk) 01:07, 31 October 2024 (UTC)[reply]
inch, teaspoon, kiloton of TNT, calorie, BTU, football field, etc... and that's just some of the western ones. 99of9 (talk) 22:33, 6 November 2024 (UTC)[reply]
Can we have a general version of this with just a rational number and a QID? Feeglgeef (talk) 22:40, 16 November 2024 (UTC)[reply]
You mean a Wikidata item reference (Z6091)? One challenge there is type safety, but I had a few thoughts about that some years ago (probably here). GrounderUK (talk) 00:08, 17 November 2024 (UTC)[reply]
Yeah. Feeglgeef (talk) 01:01, 17 November 2024 (UTC)[reply]
I don't know the best answer for them. Hence I defer until later when we will have more experience. 99of9 (talk) 00:58, 17 November 2024 (UTC)[reply]

Question: would representing wikidata:property:P4020 (International System of Quantities dimension) be a helpful preliminary step? It seems independent of the system of units... Arlo Barnes (talk) 20:43, 28 November 2024 (UTC)[reply]

It effectively would be the type "SI Unit" described in this proposal. I'd be fine if that were created first. Feeglgeef (talk) 20:50, 28 November 2024 (UTC)[reply]