Wikifunctions talk:Representing identity
Questions
I've probably misunderstood, but some questions I have based on my current understanding:
- Is another term for this "primary key"? Is it related to that concept in database language?
- Can "identity" ever be rightly spread across two keys? Since your examples were always Booleans, there was only one key, which had to be the single identity. But, for example, if Wikifunctions:Type_proposals/Integer was created as proposed, I imagine the identity might be spread across K1 and K2 (i.e. you can't know the value unless you know both).
Thanks. --99of9 (talk) 00:00, 14 March 2024 (UTC)
- Let’s call “primary key” a potentially useful analogy. In the world of tables and relational database management systems, “primary key” is a uniqueness constraint. A table can have only one primary key and rows in the table cannot have identical values for their primary key. In this sense, a primary key can be defined across more than one column. However, it is common practice to instead choose to have an additional column containing a unique identifier and call that the primary key.
- On the question of whether integers have identity, I am undecided. I mean, it’s obvious that they do, right? Otherwise we couldn’t tell them apart. And yet… We can take some arbitrary integer and generate all the others by successive additions and successive subtractions, and it is this that, at least historically, defines the type. Given a true identity for the predecessor of zero (“-1”), we can generate the negative integers from the Natural numbers, and that is the most straightforward approach.
- Sorry if that’s a bit off-topic, but there are analogies that seem to be worth considering. GrounderUK (talk) 10:36, 14 March 2024 (UTC)
- @99of9, @GrounderUK: I don't think we'd need to (or want to) add an identity value to arbitrary-input Types like integers, no. Most Types won't want an Identity, but for those that will it's likely to be quite important. (The uses in Functions are a bit different, so I won't mention those here.)
- The main purpose of using identities on Types is to distinguish between a fixed, known set of values, e.g. with Booleans (as already hard-coded into the system), but also for user-controlled Types like the selection controls we'll need for meaningfully interacting with Lexemes, etc.:
- singular/plural
- first/second/third-person
- nominative/vocative/accusative/genitive/dative/ablative concepts
- etc.
- This will allow the interface to know and guide the user to select from a limited set of values (e.g. a drop-down or whatever), rather than have them guess that they need to enter some magic English term to get the result they want. That way, a Wikifunctions natural language generation Function will be able to take a concept (Wikidata Lexeme Sense), a Language, and whatever else is appropriate, and provide a simple, user-friendly interface through which users could generate e.g. [en]"she was singing Songs from the Auvergne" or [fr]"elle chantait des Chants d'Auvergne" or [de]"sie sang Lieder aus der Auvergne" or [ja]"彼女はオーヴェルニュの歌を歌っていた" or whatever.
- In these cases, each of the potential values is a concrete, saved Object (Z2) that can be referenced by ZID, rather than a fingerprint of their current values as would be needed for identifying e.g. '-3'.
- [Edit – Sorry, I half-wrote this and then neither finished nor removed it; completed now! Jdforrester (WMF) (talk) 15:34, 15 March 2024 (UTC)] There are admittedly potential situations I can think of where we might in future want to provide a limited-value set of prompts to users but free-form entry (e.g. in a planetary body selector, prompting with Q2/Earth and Q111/Mars but allowing Q2259589/Telos IV), but that is probably dealt with later in the project's development when we have a firmer grasp of what features we might want from it, rather than trying a priori to develop the perfect system now. Jdforrester (WMF) (talk) 15:39, 14 March 2024 (UTC)
- The example I chose of the as-proposed integer was less about its infinite nature and more about the fact that the proposal has the value split across two keys, so I fear the identity may also be across two keys (if it had an identity.)
- So let me choose a different (this time finite and small) hypothetical to ask my same question: Can "identity" ever be rightly spread across two keys? Imagine you had a Type called a coloured baryon quark, which had two keys: K1=colour(red/green/blue) and K2=type(down/up/top/bottom/charm/strange). Now we want to guide the user to select from a limited set of 18 values (e.g. red-charm or blue-down). Does that mean the identity for this type is spread across both keys? Would this cause an issue for any of the identity proposals discussed here? --99of9 (talk) 04:20, 15 March 2024 (UTC)
- I’ll reply later. I want to reply to James first. (And the appearance of series of unreplied-to replies is unsatisfactory.) GrounderUK (talk) 10:17, 15 March 2024 (UTC)
- @99of9: In practice, I'd model this as having the keys each having a Type with its own set of identities (so a coloured-quark Type and identity each for red-quark, blue-quark, and green-quark, and one for flavoured-quark for down-quark, etc.); the upper, compound coloured-flavoured-quark Type would just implicitly inherit the identities of its sub-type values, but it itself would not have identities for its values directly. Jdforrester (WMF) (talk) 15:29, 15 March 2024 (UTC)
- Sorry it’s taken so long to get back to you; I slipped down a rabbit hole! It happens to be the case in your example that all the combinations are possible, so you can indeed treat each “selection” separately, as @Jdforrester (WMF) suggests. What happens, though, when only some of the combinations are valid, as with languages and scripts? We should then say (logically) that each language has its own identity, each script has its own identity and each “valid” combination has its own identity. Which of the logical identities should have physical counterparts is an open question, but I think they are all viable in all eight Proposals (or all nine, if you include the status quo).
- The current set of Z60s are actually language/script/orthography/… combinations. Natural language functions will often be applicable across a set of such combinations, as we already see with representations of Natural numbers. It will be interesting to see what challenges the current approach presents in practice (aside from performance and associated costs). GrounderUK (talk) 09:38, 26 March 2024 (UTC)
- I agree (although your final sentence defeated me). I rather hoped we could avoid having a discrete persistent object for each relevant identity, but I have never thought of a way to isolate Wikifunctions from side-effects arising from Wikidata updates. (Well, I did, but I didn’t like it and Denny was happy to defer further discussion at that time.) I’m happy to discuss this further but it’s a little off-topic at this time, don’t you think? GrounderUK (talk) 10:38, 15 March 2024 (UTC)
- [Sorry, my final sentence was incomplete; have completed it to be hopefully more readable.]
- Yes, I fear that identities in practice will have to be tied to persistent Objects, at least for the foreseeable. Jdforrester (WMF) (talk) 15:36, 15 March 2024 (UTC)
- Just adding a link back to Meta and the conversation with Denny, for future reference: m:Talk:Abstract_Wikipedia/Object_creation_requirements#Wikifunctions_and_Wikidata GrounderUK (talk) 22:31, 17 March 2024 (UTC)
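The modelling discussed in this thread can be sketched in Python, assuming each key has its own enumerated Type and the compound Type simply combines them. This is only an illustration: the class names, field names and the `valid_pairs` helper are hypothetical and are not part of Wikifunctions.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Colour(Enum):
    RED = "red"
    GREEN = "green"
    BLUE = "blue"


class Flavour(Enum):
    DOWN = "down"
    UP = "up"
    TOP = "top"
    BOTTOM = "bottom"
    CHARM = "charm"
    STRANGE = "strange"


@dataclass(frozen=True)
class ColouredQuark:
    """Compound value: no identity of its own, only those of its two keys."""
    colour: Colour    # K1
    flavour: Flavour  # K2


# Every combination is valid, so the selectable values are the full product.
ALL_QUARKS = [ColouredQuark(c, f) for c, f in product(Colour, Flavour)]


def valid_pairs(allowed):
    """When only some combinations are valid, the valid set must be listed
    explicitly; `allowed` maps a first-key value to its permitted partners."""
    return [(a, b) for a, partners in allowed.items() for b in partners]
```

Here `ALL_QUARKS` holds the 18 selectable values, and two `ColouredQuark` values compare equal only when both keys match. For the language and script case, `valid_pairs` would be fed an explicit table of permitted combinations instead of taking the full product.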
Comments
Question on semantics/vocabulary
The term "key" is closer to an id itself, as in a key-value database? Values can be anything, even complete objects or arrays that also contain other keys, values, and objects (like arrays)? But still, I think the simpler terminology of just "id" can suffice, rather than "key". Where in the examples you have "Keys", that could instead be called "IDs", or narrowly "Type IDs". Another example is this sentence: "Z891 takes the type the identity key expects and returns something like Z91 from Proposal 1, but type safe." Here I think the phrase "identity key expects..." could simply be rewritten as "identity id expects..." and still make sense. Hence, I think that "Keys" could be renamed to "IDs" or just "ID" itself (since the term "ID" is an abbreviation of "identifier"). Thus, more simply, "ID" has a `"Type": "Identity"`? Wouldn't that be much, much simpler?
If we break down the semantics, I would expect that the concepts of identity, type, etc. would first be isolated and written down, so that the actual vocabulary can be looked at objectively, and concepts clearly connected to each other. Currently, the vocabulary seems to be quite "awkward" and getting more awkward in this set of identity proposals. So could we see the vocabulary as it stands currently written out in a semantic form "has a"/"is a" such as:
Identity "has a" : - Names - Markers - Type - ???
Type "has a" : - Names - Identity - ???
Keys "has a" : - ???
ID "has a" : - Type - Names - Identity (boolean)???
Putting descriptions onto each property would probably help TREMENDOUSLY to help everyone including the team actually think about them and the values that they could hold and how they would be used. For instance, what is a "key marker", why is it useful, what Types use it and how? We did this A LOT in Schema.org and so when it came to programmatic constructs in Python it was easy to build code for the object modeling because we had really good semantics from the start.
-- Thadguidry (talk) 06:21, 14 March 2024 (UTC)
- And, of course:
Object "has a" : - Identity
"Z1K1": "Z2", "Z2K1": { "Z1K1": "Z6", "Z6K1": "Z2" }
Comments on the document
Having read the linked Phabricator tickets, it is still unclear to me what the consequences of doing nothing are. I think these should be clarified before alternative solutions are compared.
If we were having this discussion three years ago, and with the benefit of hindsight, would the proposals be different?
- For example, I wouldn’t have come up with Option 5, since it is simply a phased approach to Option 4. If “identity” had been the only question, I might have come up with Option 3, but probably not Option 4.--GrounderUK (talk) 17:06, 16 March 2024 (UTC)
More theoretically, persistent objects have “identity” in their Z2K1, and this is true of persistent types too. Transient objects are anonymous, as I understand it. To me, that means they simply do not have “identity” (or, like a tuple, they are their own identity). I’m sure there is a benefit in having a convenient “handle” for a transient, but I don’t (yet) believe that calling this an “identity” (or a “reference” or a “key”) would be helpful.
I am tempted to suppose that persistent objects whose types are not persistent are a special case of the above: their types do not have “identity”. This may be where “generic types” prove problematic. Here we allow the specification of the function call to provide the “handle”. This is satisfactory (if inconvenient) because we assume consistent evaluation (even though the function being called can itself change over time). But the evaluation is still a transient object and so (according to my theory) it is anonymous and lacks identity.
When it comes to solutions, I think it will prove helpful to allocate reproducible identifiers (“handles”) to transient objects but I don’t think these should appear in a Persistent object (Z2) (which is why I specify “reproducible”). They should also take account of the version of code for the implementation and recognise that successive evaluations while code is being edited (or compositions composed) may be completely different versions of the code (or composition).
Key optionality does not seem to have a clear connection to “identity” and its transient analogue (“handle”). Conceptually, a Key may be mandatory in nature (like Z1K1 and Z4K1) or mandatory in context (all the others?). Mandatory Keys are (conceptually) a different Type, so one might think that the Z4 should have one list of mandatory keys and a separate list of optional Keys. However, it is likely that optional Keys will not always be optional (for example, if you have one Key, you must also have some other Key, or you must have exactly one of some group of Keys). I would expect the Validator to enforce these constraints, so it might as well enforce all constraints. Rather than messing with Z4K2, it makes more sense to me (just for optionality) to have a separate Key for a list of constraints and validations, which is how I view Z4K3. But I agree that optionality and identity should be considered together when evaluating solutions.—GrounderUK (talk) 12:52, 14 March 2024 (UTC) --GrounderUK (talk) 16:43, 15 March 2024 (UTC)
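One way to picture the reproducible "handle" for a transient object mentioned above is as a hash of the object's canonical serialisation together with the implementation version. The sketch below assumes exactly that; the function name, its parameters and the choice of SHA-256 are illustrative assumptions, not anything Wikifunctions currently does.

```python
import hashlib
import json


def transient_handle(zobject, implementation_version):
    """Derive a reproducible handle from an object's content and code version.

    Both parameters are hypothetical: `zobject` is any JSON-serialisable
    value and `implementation_version` is whatever identifies the code
    revision that produced it.
    """
    canonical = json.dumps(
        {"object": zobject, "version": implementation_version},
        sort_keys=True,          # key order must not affect the handle
        separators=(",", ":"),   # no insignificant whitespace
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

The same content evaluated by the same version of the code always yields the same handle, while an edit to the implementation yields a different one, so nothing needs to be stored in a persistent object.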
Proposal 5: A new persistent object for each Key marker for each affected type
This may be an alternative or phased approach to Proposal 3 or Proposal 4, and may defer or avoid changing the Z4K2. Instead (or first), we create the new persistent object Type that references a Z4K2 and makes an assertion about it, like whether it is an identity or optional or wears a blue hat. The persistent object’s own Keys would include a Z9/reference to the subject Z4/type and a Z39/key reference to the relevant Key in that Type. A Key marker could either be a separate Z2 for each marker, or a single Z2 for all the markers for a particular Key. The latter would be required for incorporation into the subject Z4K2, as in Option 4, but it is just a Typed list of the separate Z2s, if that approach is preferred. GrounderUK (talk) 16:26, 14 March 2024 (UTC)
Required work
(Derived from Option 4 by guesswork)
- Add key marker type (no difference)
- Add a persistent object with each value for the key marker type (one per marker for each affected type, as each is rolled out)
- Add key Z3K4 to Type Z3 / Key (may be deferred and rolled out over an extended period)
- Adapt types that use identity (may be deferred and/or rolled out progressively)
- Specific changes to backend or frontend would be required to find the marker for a Key where it is not explicit (because this has been deferred)
Proposal 6: Multi-typed lists
In T282062#7074909, @Jdforrester-WMF wrote [of optionality]: “Is this properly a property of the Key or of the Type which uses it?”
Yes. And no.
It is “properly” a property of the relationship of the Key to the Type. When the cardinality is greater than one, you need a list; when the cardinality may be less than one, you have optionality. More generally, though, various kinds of dependency (one-of, some-or-none-of, all-or-none-of etc.) are difficult to express within the Key and less difficult to express in its relationship to the Type. In Z14, for example, we might have [one-of[Z14K1], one-of[Z14K2, Z14K3, Z14K4]] (they’re mutually exclusive but you must have one). Here, the two embedded lists happen to have the same “type” (corresponding to one-of). If there were an optional [Z14K5], we would have zero-or-one-of[Z14K5], which would make typing the overall list problematic. Using Z1 is always an option (sadly), but I would be happier with the first element in the Benjamin array itself being a “one-of” list of permitted types. In the hypothetical case, that would be (informally): [[one-key-from, all-keys-or-none], [one-key-from, Z14K1], [one-key-from, Z14K2, Z14K3, Z14K4], [all-keys-or-none, Z14K5]]. This assumes that none of the Keys is an identity. (The “types” here are still a bit functional, but this will probably sort itself out.)
In the proposals, “identity” turns out to be an unfortunate label (well, it confused me for quite a while). It is simply a reference to a Z2, a “foreign” Z2K1. This is an important distinction because “properly”, the fact that a Key must contain such a value is indeed a property of the Key, and not of the relationship of the Key to its Type. So, assuming (for illustrative purposes) that a Z14K3 is an identity, we might try to express that (informally) as [[one-key-from, all-keys-or-none], [one-key-from, Z14K1], [one-key-from, Z14K2, [identity, Z14K3], Z14K4], [all-keys-or-none, Z14K5]]. This breaks the typing, however. It would have to be more like [[one-key-from, [one-key-from, identity], all-keys-or-none], [one-key-from, Z14K1], [[one-key-from, identity], [one-key-from, Z14K2, [identity, Z14K3], Z14K4], [all-keys-or-none, Z14K5]].
It doesn’t look pretty or easy to implement, but it does offer a way to escape the proliferation of untyped lists across the Function and Type domains. In the more immediate context of identities, it tilts me away from favouring Option 4 and towards Option 3 (with some flavour of Option 6 to follow in due course). However, I don’t think it is simply a question of whether a Key’s value is or is not an identity, because we have different types of identity. Specifically, we want to handle ZIDs, QIDs, PIDs and LIDs differently and require, for example, that a Key link to a Z2 or to Wikidata Item. See, for example, phab:T344170, (the proposal to add the QID to Z60) and consider doing the same for Z41 and Z42. GrounderUK (talk) 00:31, 17 March 2024 (UTC)
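The informal Z14 constraint list given above (before identities are added) can be read as a set of group rules checked against the Keys present in an object. The following is a minimal sketch under that reading; the function names and the tuple representation are assumptions made for illustration, and Z14K5 is the hypothetical Key from the text.

```python
def check_group(kind, keys, present):
    """Return True if the keys present in an object satisfy one group rule."""
    count = sum(1 for key in keys if key in present)
    if kind == "one-key-from":
        return count == 1
    if kind == "all-keys-or-none":
        return count == 0 or count == len(keys)
    raise ValueError(f"unknown group kind: {kind}")


def check_object(groups, zobject):
    """Check every (kind, keys) group against the keys of `zobject`."""
    present = set(zobject)
    return all(check_group(kind, keys, present) for kind, keys in groups)


# The hypothetical Z14 rules from the text (Z14K5 does not exist today).
Z14_GROUPS = [
    ("one-key-from", ["Z14K1"]),
    ("one-key-from", ["Z14K2", "Z14K3", "Z14K4"]),
    ("all-keys-or-none", ["Z14K5"]),
]
```

With these rules, an object carrying Z14K1 and exactly one of Z14K2, Z14K3 or Z14K4 passes, with or without Z14K5, while an object carrying both Z14K2 and Z14K3 is rejected. Keys not mentioned in any group, such as Z1K1, are ignored by this sketch.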
With that in mind, I would be inclined to remove “identity” from the Z3/key relationship and add “reference type” to the Key itself. This clarifies that it is the value in the Key that is being “typed”, not the Key itself. The general extension of the Z3 definition is assumed, but the additional keys for “identity” are a list of two keys from which both must be selected, more generally “all listed keys” (a subtype of Z3). Consequentially, the Z3-typed list becomes a Multi-typed list:
{
"keys": [
["Type",
"Key",
"all-listed-keys"],
{
"type": "Key",
"value type": "Boolean",
"mandatory keys": [
"all-listed-keys", //like Z3
{
"identity": "true",
"reference type": "identity"
},
],
"labels": […]
},
]
}
There is an implicit “or” between "Key" and "all-listed-keys", because each element in the list must be of exactly one type. I guess the current "Key" is implicitly “optional” (but not generally). Expressing optionality more formally would shift the interpretation of a simple K3 to “mandatory”. I think that allows us to leave “type”, “value type” and “labels” unchanged.
It is tempting to generalize all-listed-keys. The “listed” is redundant and included here for clarity. The “keys”, however, begins to suggest a generic type to extend it to types other than Z3. For Z3 subtypes, my preference is for separate persistent types for each of the required subtypes (all-listed-keys-or-none, exactly-one-key-from and all-listed-keys: simply optional, mutually exclusive and mandatory, respectively).
I hope that makes some sort of sense. I have no more time available today to refine it further.--GrounderUK (talk) 16:15, 17 March 2024 (UTC)
- Returning to Jdforrester (WMF)’s original question, I think I might now reply: “The Key is the Key; all else is the Type.” This means that I am inclined to oppose not only Proposals 1 to 6, but also the status quo. If there is supplementary information about a Key, it should be in a separate Key. We could use some of the thinking about optionality at the Type level… or not …but I’m thinking it might be better thinking of it as a parallel structure to (or within) Z4K2. Then, all Keys are simple and similar. Probably, all lists of Keys in Types are simply lists. Key and Key relationship metadata is a structured list of Keys in a new Z4 Key (maybe the Z4K2’s functional constructor?). No time now to elaborate, I’m afraid 😉 GrounderUK (talk) 08:21, 18 March 2024 (UTC)
Functionally constrained lists
- Maybe a better way to think about Multi-typed lists “in the wild” is “functionally constrained” Typed lists. When the user chooses Typed list, they get to choose (or compose) a function that constrains future choices of elements in the list. In particular, it allows the permitted types to be specified but it might also allow values to be constrained. That is informal (sub-)typing, so a persistent function that achieves the desired effect can become the Validator for the custom type, should one be required. The same approach could apply for objects of type Z1 (in particular) and any other type. Such functions would (necessarily, I think) imply the object or element type but I am currently thinking of the user selecting both the type and (optionally) a constraining function (or Guards). There could also be constraints for the list as a whole (permitted and prohibited types, most notably). Should this be a Feature request? GrounderUK (talk) 10:12, 18 March 2024 (UTC)
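As a rough illustration of a “functionally constrained” list, the sketch below pairs a set of permitted types with optional guard functions that are consulted on every append. The class, its attributes and the sample guard are hypothetical names chosen for this example; Python types stand in for Wikifunctions Types.

```python
class ConstrainedList:
    """A list whose appends are checked by a type test and optional guards."""

    def __init__(self, permitted_types, guards=()):
        self.permitted_types = tuple(permitted_types)
        self.guards = tuple(guards)  # each guard: (items_so_far, candidate) -> bool
        self.items = []

    def append(self, candidate):
        if not isinstance(candidate, self.permitted_types):
            raise TypeError(f"{type(candidate).__name__} is not a permitted type")
        for guard in self.guards:
            if not guard(self.items, candidate):
                raise ValueError(f"{candidate!r} rejected by {guard.__name__}")
        self.items.append(candidate)


def no_duplicates(items, candidate):
    """Example guard: reject a value that is already in the list."""
    return candidate not in items
```

A list created as `ConstrainedList([int, str], [no_duplicates])` accepts `1` and `"a"`, then rejects a second `1` through the guard and rejects `2.5` through the type test. A persistent guard of this kind is what could later serve as the Validator for a custom type.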
Proposal 7 Benjamin array or embedded object (Z93)
It is confusing to hear that the definition of Z91 is the same as Z9. How, then, are they different types?
I’m guessing that Z91 is better understood as a Type for a Key whose value is a Reference (but why not Z93?). Using that definition here, in any event, we can simply use a Benjamin array to constrain the Key’s value to have that Type: {"Z3K2": ["Z93", "Z40K1"]}. Z40K1 might usefully follow a similar pattern in conformity: {"Z40K1": ["Z9", "Z41"]}. GrounderUK (talk) 00:17, 19 March 2024 (UTC)
More conventionally and in context, that is simply [or not so simply, corrected--GrounderUK (talk) 12:21, 20 March 2024 (UTC)]:
Z40/Boolean keys with embedded Z93/Reference key
"keys": [
"Key",
{
"type": "Key",
"value type": "Identity",
"key id": {
"type": "Reference key",
"reference type": "Boolean",
"reference id": "Identity"
},
"labels": {…<Identity>…}
}
],
"validator":…
The same, in ZID form:
"Z4K2": [
"Z3",
{
"Z1K1": "Z3",
"Z3K1": "Z93",
"Z3K2": {
"Z1K1": "Z93",
"Z93K1": "Z40",
"Z93K2": "Z40K1"
},
"Z3K3": {…"identity"…}
}
],
"Z4K3": …
}
--GrounderUK (talk) 10:13, 19 March 2024 (UTC)
Optionality and cardinality
- Thinking about different kinds of optionality, currently we say that a Z12/texts is mandatory in the Key and must contain a list of Z11-type elements but may contain zero such elements. In a Z11, however, both keys should have non-null values. We might consider different types of Typed list to express and enforce the cardinality (prohibiting an empty list, or specifying a maximum number of elements, for example). But we also want to say that each Z60 must be different. This is presumably in the Validator for the Z12, but it suggests a different type of Typed list (and we are back to #functionally constrained lists).
- We see a different kind of constraint in the Z14/implementation, where there is mutual exclusivity. I’m inclined to the view that the three mutually exclusive Keys here are more logically understood as a single Key. (I don’t understand why Z14K2 is a Z1, rather than a Z7, by the way. To my mind, an Implementation is either a function call or it’s a code-function evaluation. If it’s the latter, it’s either built-in or “community”. I note that Z14K4 is defined as a Z8/function, however.) Perhaps what we have here is a sub-list of Keys with each Key having a different type of Z93/reference key. Informally,
- Z14K?: [(one of) Z93, Z14K2, Z14K3, Z14K4].
- As with the Z40, this does not express the different identity Types (which are defined against the Keys themselves, so the fact that the Keys are all identity keys is perhaps incidental, in which case the Z93 could just be Z39). The parenthetical “one-of” could simply indicate that the Typed list has a different function from Z881, one which (presumably) accepts the minimum cardinality of one. There is nothing here to explicitly constrain Z14K? to a single one of the referenced Keys (in a given object of this Type). This may resolve itself in the formalization, but I’m not convinced that it should. (It’s a question of levels. Is the need for a single value in Z14K? a characteristic of the Key itself, or of its value? One way or another, though, it’s min. 1 max. 1 at the Object level, which is a constraint that should be explicit in the Type.)
- GrounderUK (talk) 12:29, 19 March 2024 (UTC)
Z14K2/composition (Key)
Here is one of the Keys in Z14/implementation, following the pattern proposed above for the Z40K1:
"keys": [
"Key",
{
"type": "Key",
"value type": "Identity",
"key id": {
"type": "Reference key",
"reference type": "Function call",
"reference id": "Z14K2"
},
"labels": {…<composition>…}
},
…
],
"validator": …
The same, in ZID form:
"Z4K2": [
"Z3",
…
{
"Z1K1": "Z3",
"Z3K1": "Z93",
"Z3K2": {
"Z1K1": "Z93",
"Z93K1": "Z7",
"Z93K2": "Z14K2"
},
"Z3K3": {…"composition"…}
},
…
],
"Z4K3": …
}
Cardinalized Z3/key
If we want to formalize the rule that this Key can take a value only when the other two Keys (Z14K3 and Z14K4) do not, we could place all three inside a cardinalized key-list object (a new Type). Let’s see what the Keys of that Type might look like:
"keys": [
{
"type": "Key",
"value type": "Natural number",
"key id": "ZnnK1",
"label": {…[…{…<minimum Keys from list>}…]}
},
{
"type": "Key",
"value type": "Natural number",
"key id": "ZnnK2",
"label": {…[…{…<maximum Keys from list>}…]}
},
[ //following the current Z3 pattern
"Key",
{
"type": "Key",
"value type": "Type",
"key id": "Z3K1",
"label": {…[…{…<value type>}…]}
},
{
"type": "Key",
"value type": "Type",
"key id": "Z3K2",
"label": {…[…{…<key id>}…]}
},
{
"type": "Key",
"value type": "Type",
"key id": "Z3K3",
"label": {…[…{…<label>}…]}
}
]
],
"validator": …
The same, in ZID form:
"Z4K2": [
{
"Z1K1": "Z3",
"Z3K1": "Z13518",
"Z3K2": "ZnnK1",
"Z3K3": {…[…{…"minimum Keys"}…]}
},
{
"Z1K1": "Z3",
"Z3K1": "Z13518",
"Z3K2": "ZnnK2",
"Z3K3": {…[…{…"maximum Keys"}…]}
},
[ //following the current Z3 pattern
"Z3",
{
"Z1K1": "Z3",
"Z3K1": "Z4",
"Z3K2": "Z3K1",
"Z3K3": {…[…{"value type"}…]}
},
{
"Z1K1": "Z3",
"Z3K1": "Z4",
"Z3K2": "Z3K2",
"Z3K3": {…[…{"key id"}…]}
},
{
"Z1K1": "Z3",
"Z3K1": "Z4",
"Z3K2": "Z3K3",
"Z3K3": {…[…{"label"}…]}
}
]
],
"Z4K3": …
}
- This is just Z3 with two additional Keys for min. and max. so ZnnK1 could be Z3K4 and ZnnK2, Z3K5. These Keys are independently optional (either, both or neither), but we are still formalizing cardinality, which, here, would be min: 0, max: 2.
- For now, we want to construct a Key with cardinality min:1, max:1 containing a list of three Keys (Z14K2, Z14K3 and Z14K4), but these things take time…
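The minimum and maximum described above can be pictured as bounds on how many of the listed Keys carry a value in a given object. This sketch assumes that reading; the class and attribute names are hypothetical stand-ins for the proposed Type and its ZnnK1 and ZnnK2 Keys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CardinalizedKeylist:
    """Hypothetical stand-in for the cardinalized key-list Type."""
    minimum_keys: int  # ZnnK1 in the text
    maximum_keys: int  # ZnnK2 in the text
    key_ids: tuple     # the Keys the bounds apply to

    def accepts(self, zobject):
        """True if the number of listed Keys with a value is within bounds."""
        used = sum(1 for key in self.key_ids if zobject.get(key) is not None)
        return self.minimum_keys <= used <= self.maximum_keys


# min: 1, max: 1 over the three mutually exclusive Z14 Keys.
IMPLEMENTATION_BODY = CardinalizedKeylist(1, 1, ("Z14K2", "Z14K3", "Z14K4"))
```

An implementation with only Z14K2 set is accepted; one with both Z14K2 and Z14K3, or with none of the three, is rejected. Note that a plain minimum and maximum cannot express “all listed Keys or none” for a list of more than one Key, since the permitted counts (zero or all) do not form a contiguous range.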
Because the final representation of the full (transient) Z14/implementation Type is rather long, only the version labelized in English is given.
{
"type": "Type",
"identity": "Implementation",
"keys": [
"Key",
{
"type": "Key",
"value type": "Function",
"key id": {
"type": "Key",
"id type": "Reference key",
"identity": "Z14K1"
},
"label": {
"type": "Multilingual text",
"texts": [
"Monolingual text",
{
"type": "Monolingual text",
"language": "English",
"text": "function"
}
]
}
},
{
"type": "Key",
"value type": "Cardinalized keylist",
"key id": {
"type": "Cardinalized keylist",
"minimum Keys": {
"type": "Natural number",
"value": "1"
},
"maximum Keys": {
"type": "Natural number",
"value": "1"
},
"key id": "Z14K5",
"value object": [
"Key",
{
"type": "Key",
"value type": "Function call",
"key id": "Z14K2",
"label": {
"type": "Multilingual text",
"texts": [
"Monolingual text",
{
"type": "Monolingual text",
"language": "English",
"text": "composition"
}
]
}
},
{
"type": "Key",
"value type": "Reference key",
"key id": {
"type": "Reference key",
"id type": "Code",
"identity": "Z14K3"
},
"label": {
"type": "Multilingual text",
"texts": [
"Monolingual text",
{
"type": "Monolingual text",
"language": "English",
"text": "code"
}
]
}
},
{
"type": "Key",
"value type": "Reference key",
"key id": {
"type": "Reference key",
"id type": "Function",
"identity": "Z14K4"
},
"label": {
"type": "Multilingual text",
"texts": [
"Monolingual text",
{
"type": "Monolingual text",
"language": "English",
"text": "built-in"
}
]
}
}
],
"label": {
"type": "Multilingual text",
"texts": [
"Monolingual text",
{
"type": "Monolingual text",
"language": "English",
"text": "implementation id"
}
]
}
}
}
]
}
For the time being, I assume that we would not be changing the Z3 across the board. As mentioned before, it is reasonable to consider that all specified Keys are mandatory unless they are explicitly marked with some kind of optionality.
In this representation, Z14 has just two level-1 keys: the function and the implementation id. The implementation id is not directly constrained to be an identity, but perhaps it should be. The effect of such a constraint would be to require that all elements in its keylist would themselves be identities (as they are represented above). --GrounderUK (talk) 23:37, 20 March 2024 (UTC)
- Amended Z14K1 to also be an identity. I’ve noticed that the keylist is lacking a key, although it has a label. This is something that has often troubled me about Z3 (that the id is the list), so I’m giving its cardinalized version both an id (ZnnK3: Z14K5) and a “value object” (ZnnK4) (without adjusting the indentation). This is not reflected in the Type definition that precedes it.--GrounderUK (talk) 10:04, 22 March 2024 (UTC)
- I don’t know why I was thinking Z14K2 was an “identity”. I’ll correct that when I have a moment. GrounderUK (talk) 12:49, 24 March 2024 (UTC) Done--GrounderUK (talk) 19:09, 24 March 2024 (UTC)
Proposal 8: Type-constrained Z9/reference
It seems that our use of Z3K1/value type diverges from my expectation. In Z40, I think that the Z40K1 must be a Z9/reference (canonically). To me, that suggests that its value type should be "Z9" rather than "Z40". When a Reference is valid only if the object it references has a particular type, we have a type-constrained Reference. I suspect this is what the Boolean in Proposal 1 is intended to indicate. I suggest that what we need instead is a “reference id value type” to accompany the mandatory Z9. I think that would be a new key, Z9K2 on Z9 (which could take "Z1" as a value), or a new Z9?/Typed reference Type with the additional key. In either case, the value type of the additional key would be a Z4-list.
I believe we currently allow that a value can be either a Z7/function call that evaluates to an object of the specified type or a Z9/reference to a persistent object with that Type. In the case of the Z40, though, it looks like we would need any function call to evaluate to a Reference to a Z40, rather than to a Z40 object. This seems to be what in fact happens, but {"Z1K1": "Z40", "Z40K1": "Z41"}, for example, is only valid because "Z41" has "Z40" as its type. What I propose here is that we would instead get {"Z1K1": "Z9", "Z9K1": "Z41", "Z9K2": ["Z4", "Z40"]}. Z40 validation allows "Z41" because it is a Reference and (new) Z9 validation allows "Z41" because the Z1K1 in the referenced object’s Z2K2 has the literal value "Z40".
- See the revised Z40K1 below. I think a transient can continue to be a simple Boolean object (or appear to be such an object) but the object it in fact references is a reference to itself, not a Boolean value. Presumably self-reference in persistent objects is currently handled as a special case but I think such objects can ground themselves instead, as described below--GrounderUK (talk) 09:15, 25 March 2024 (UTC)
We should also consider listing prohibited Types in a new Z9K3 (for the sake of argument). I would expect "Z1" to appear in exactly one of the two additional keys. Perhaps that should carry forward into each Reference object. This could be helpful in the Z9K2 but not in a separate Z9K3. In effect, Z9K2 and Z9K3 are alternative representations of the same constraint, so it would seem more appropriate to have a single additional key (Z9K2) whose value is the applicable constraint, expressed positively or negatively (or as a function call or Reference). This might be most conveniently represented as a function call. GrounderUK (talk) 13:09, 23 March 2024 (UTC)
- Actually, the problem I have with Z41 and Z42 is that the persistent objects should not function as References to themselves. In order to stop them functioning as References, the value should be escaped as a Z6/string in the Z2: {"Z2K2": "Z40", "Z40K1": {"Z1K1": "Z9", "Z9K1": {"Z1K1": "Z6", "Z6K1": "Z41"}}}. Functions returning Z41 are returning a Reference to that object, {"Z1K1": "Z9", "Z9K1": "Z41"} (represented canonically as “Z41”), within an object of Type Z40. From that, it seems to follow that the correct Z3K1/value type for Z40K1 is indeed "Z9" rather than "Z40".
- The new "Z9K2" type constraint would be expressed in the Z3/key for Z40K1:
{
"Z3K1": {
"Z1K1": "Z9",
"Z9K1": "Z9",
"Z9K2": ["Z4", "Z40"]
},
"Z3K2": "Z40K1",
"Z3K3": { <Boolean> }
}
- GrounderUK (talk) 12:40, 24 March 2024 (UTC)
- As clarified below, the object is still a Z40, but the “type” of value is a Reference. I suspect Reference is not a type at all, properly speaking (rather like Function call). Whilst I find Proposal 8 less flawed than Proposals 1 and 2, I would still be inclined now to reject it (preferring to add a “value mode” key directly to the Z3, as envisaged below, but recognising that a mandatory list of permitted value modes or an optional list of excluded value modes would be required). GrounderUK (talk) 09:49, 3 April 2024 (UTC)
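As a rough illustration of the Z9K2 check discussed in this thread, here is a minimal Python sketch. It is not how the orchestrator or function-schemata validates anything: the in-memory store, the function names, the treatment of a missing Z9K2 as "unconstrained", and the reading of "Z1" as "any Type" are all assumptions made for the example. It also assumes that the first element of the Z9K2 list ("Z4") is the list's element type rather than a permitted Type.

```python
# Hypothetical snapshot of persistent objects, keyed by ZID. Only the
# Z1K1 of each Z2K2 (the Type of the persisted value) is needed here.
PERSISTENT_OBJECTS = {
    "Z41": {"Z1K1": "Z2", "Z2K2": {"Z1K1": "Z40"}},
    "Z42": {"Z1K1": "Z2", "Z2K2": {"Z1K1": "Z40"}},
    "Z6": {"Z1K1": "Z2", "Z2K2": {"Z1K1": "Z4"}},
}


def permitted_types(reference):
    """Return the Types listed in Z9K2, skipping the leading element type."""
    z9k2 = reference.get("Z9K2", [])
    return z9k2[1:]


def validate_reference(reference, store=PERSISTENT_OBJECTS):
    """Check a (proposed) extended Reference against its own Z9K2 constraint."""
    if reference.get("Z1K1") != "Z9":
        return False
    zid = reference.get("Z9K1")
    if not isinstance(zid, str):
        # An escaped (Z6) value is deliberately not followed as a Reference.
        return False
    target = store.get(zid)
    if target is None:
        return False
    allowed = permitted_types(reference)
    if not allowed or "Z1" in allowed:
        # Assumption: no list, or a list containing Z1, means "any Type".
        return True
    # The literal Z1K1 inside the referenced object's Z2K2 must be permitted.
    return target["Z2K2"]["Z1K1"] in allowed
```

With this sketch, {"Z1K1": "Z9", "Z9K1": "Z41", "Z9K2": ["Z4", "Z40"]} passes, while the same constraint applied to a Reference to "Z6" fails, because the persisted value there is a Z4 rather than a Z40.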
Proposed Z9 or Z9?
{
"Z1K1": "Z2",
"Z2K1": {
"Z1K1": "Z6",
"Z6K1": "Z9"
},
"Z2K2": {
"Z1K1": "Z4",
"Z4K1": "Z9",
"Z4K2": [
"Z3",
{
"Z1K1": "Z3",
"Z3K1": "Z6",
"Z3K2": "Z9K1",
"Z3K3": { <"Reference id"> }
},
{
"Z1K1": "Z3",
"Z3K1": {
"Z1K1": "Z7",
"Z7K1": "Z881",
"Z881K1": "Z4"
},
"Z3K2": "Z9K2",
"Z3K3": { <"Reference object types permitted"> }
}
],
"Z4K3": "Z109"
},
"Z2K3": { ... },
"Z2K4": { ... },
"Z2K5": { ... }
}
--GrounderUK (talk) 21:12, 24 March 2024 (UTC)
ChatGPT summary of GrounderUK's input
I fed most of this into ChatGPT because I am aware that my contributions have become a bit of a sprawl. This is what it made of it:
“The text you provided seems to be a discussion on various proposals and considerations regarding the management of keys, types, and references within a system. The author appears to be grappling with conceptual and practical challenges related to identity, optionality, cardinality, and type constraints.
Some key points and proposals discussed include:
1. Clarifying the consequences of inaction before evaluating alternative solutions.
2. Reflecting on how proposals might have differed with hindsight.
3. Exploring the distinction between persistent and transient objects, and the implications for identity and references.
4. Proposing different approaches, such as creating new persistent objects for key markers or multi-typed lists.
5. Considering the relationship between optionality, identity, and type constraints, and suggesting changes to how these are represented and enforced.
6. Discussing the complexity of expressing various kinds of constraints within keys and types, including cardinality and mutual exclusivity.
7. Proposing adjustments to the representation of references to enforce type constraints more effectively.
8. Reflecting on the need for clarity and refinement in the design of the system, including the potential for introducing new features or structures.
Overall, the discussion appears to be focused on finding effective and efficient ways to manage keys, types, and references within a system, with the aim of improving clarity, flexibility, and usability.”
Further interaction might be helpful: https://chat.openai.com/share/431e8a77-cab1-4b13-ae24-fcf6864e0a72 GrounderUK (talk) 13:43, 23 March 2024 (UTC)
Phabricator tickets
- as listed on the main page
- T282716: ZObject model: Create a type for identity
- Why? How? Ultimately, if objects are always Types, it seems to me that identity objects that are persistent must be self-referential, and perhaps never anything more than their Z2 keys. When would such an object not be an identity? When would an identity not be such an object?
- T320557: Consider adding an identity key to the functional model for ZImplementations
- Why? Implementations have versions and it is the versions we probably want to be tracking. Would we update the so-called “identity key” every time the code or composition changed? Sounds a bit like oldid to me. But in any event code can be changed and executed without a new version, so some sort of hash of the code might be required to allow separate evaluations using the same code to be identified, and differentiated from evaluations for the “same” implementation with different code (or a different composition), and from the “current” persisted version (if the implementation has changed).
- T282062: ZObject model: Add a key to Z3 to mark a key as optional
- Maybe 😉 But this is just a facet of cardinality.
- T290996: Support Optional Keys in Orchestrator
- Also phab:T292892
- T296755: Check current relevance of SELF_REFERENTIAL_KEYS and remove if unnecessary
- Presumably we do this even if we do nothing else (since Z40s and Z60s, at least, are self-referential).
- T304682: function-schemata: Fix requiredKeys for user defined validators
- Child of phab:T343469. We should explore the user journey for creating custom types but I am inclined to suppose that we (the users) would and should create some relevant functions with informal types before formalising the type (or, where an informal type will not work, proposing the Type along with the additional functionality required).
- T315914: Discuss whether we need to consider the case of new keys being added to existing types with instances
- As a fairly firm rule, no! This might be a bit like Proposal 5. Should we even allow a singleton Type to have keys that are not Z2 keys? I suppose Z60 already does… In any event, I’d expect we would have a new Type with the additional keys and convert to the new Type before eliminating the original. This corresponds to an alternative perspective on optionality, which is that the “optional” keys are mandatory at the Type level, but there is the option for concurrent Types (where one is the specialization of the other).
- T296400: When a key of an object is for a Type, default the UX to a reference rather than an instance
- Yes. When do we not? When should we not? See, for example, #Functionally constrained lists.
- T343614: Validation of types should check for identity
- I confess I don’t understand the use case. If the Type is neither a Reference nor a function call, I can’t think what its Z4K1 value might be (unless it’s "Z0", when you’re creating the Type itself).
GrounderUK (talk) 18:05, 31 March 2024 (UTC)
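To make the comment on T320557 more concrete, here is a minimal Python sketch of the "hash of the code" idea. The function names and the placeholder ZID are hypothetical, and SHA-256 is only one possible choice of hash. Nothing here reflects how Wikifunctions currently identifies evaluations.

```python
import hashlib


def code_fingerprint(code: str) -> str:
    """Return a stable hex digest of an implementation's source text."""
    return hashlib.sha256(code.encode("utf-8")).hexdigest()


def evaluation_key(implementation_zid: str, code: str) -> tuple:
    """Pair the implementation's ZID with a fingerprint of the code that ran.

    Two evaluations share a key only if they used the same implementation
    AND byte-identical code, whether or not a new version was persisted.
    """
    return (implementation_zid, code_fingerprint(code))


# Hypothetical ZID and code, for illustration only.
original = evaluation_key("Z99999", "def Z99999(x):\n    return x + 1\n")
edited = evaluation_key("Z99999", "def Z99999(x):\n    return x + 2\n")
```

Here `original` and `edited` have the same ZID but different fingerprints, so evaluations of the "same" implementation with different code can be told apart.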
Proposals 3 and 4
These are introduced with: “Instead of introducing a new type, we extend Z3/Key by another key…” However, the right-hand examples have Z3K1 with the value Z91, rather than Z40. I suspect this is a drafting error. It is essential (in my opinion) that Z41 and Z42 are objects of type Z40 and, therefore, that the Z3K1 in Z40 can only be Z40. The effect of the additional key is to constrain Z40 subtypes, and I believe the required constraint is that there be exactly two subtypes and that they must both be persistent objects (as is currently the case).
(Proposal 8 seems to conform to this requirement indirectly, because the extended Reference type constrains the Type of the referenced object to be Z40. However, in the text preceding the Z40K1 key definition, I appear to have referred to “Z41” where it should be “Z40”. Apologies for any confusion. Although, in a sense, Z41 is both a Reference and a Boolean, it is primarily a Boolean, and this is why I reject Proposals 1 and 2.) — GrounderUK (talk) 09:45, 2 April 2024 (UTC)
- Note that none of the Proposals constrain the subtypes to being only Z2/Persistent objects. If this is a meaningful constraint, Proposal 4 could be extended to have an additional key-marker for the referenced object being of Type Z2. For Proposal 8, the most natural approach is to add an explicit “value mode” to Z3. The value modes I have in mind are: function call, persistent reference, transient reference and literal object. In the case of Z40, the Reference would be a persistent reference. It would be possible to limit this new key to Z9, initially, but it seems to me that it is really a sub-key of Z9K2:
"Z9K2": {"Z1K1": "Z3?", "Z3?Kn": "Z92", "Z3?K1": ["Z4", "Z40"]}
(where, for illustrative purposes, “Z3?” is Z3 or an equivalent and “Z92” is a Reference to a persistent object).
- Also note that the comments on #Optionality and cardinality relate specifically to Keys. A constraint applying to subtypes, such that there are exactly two Z40 objects or, more subtly, that a new Z60 (can or) cannot be added by an ordinary user, might require enhancements to Z4. We might consider suppressing new subtype instances through their required identity Key but the structure of Z60 subtypes is a little odd in referencing a code in Z60K1, rather than its own identity (as is the case for Z40s). Perhaps with the benefit of hindsight we would have done this differently? In any event, although we could probably make this work, it seems to me that it is not quite the most logical approach. Constraints applying to the set of subtypes seem more logical at the Type level. In relation to Proposal 4, key markers should not be used for subtype constraints, only for Key constraints (accepting that this distinction may be fuzzy in a specific context, such as providing different widgets, behaviour or validation according to the subtype cardinality, as mentioned on the main page). GrounderUK (talk) 12:11, 2 April 2024 (UTC)
Consider using Type “Tests” to express constraints
Constraints such as the required number of persistent Z40s could be expressed as additional Z3/keys on Z4/Type. Whilst this could be quite successful for the simpler constraints like cardinality, I suspect it could easily become unwieldy. Given that users will be more familiar with functions than JSON objects, I propose that we consider expressing constraints using functional notation.
Evaluation of such a function would depend on the state of the system, so these representations would not be pure functions. Nevertheless, a representation like Natural number is between(count Persistent objects("Z40"), 2, 2)
should be generally accessible to users of multiple languages. Using Z20s (or an equivalent) to express these constraints as predicates should be equally accessible. In any event, all the applicable constraints could be expressed as a list of such predicates (given the required impure functions like “count Persistent objects”), so future enhancements to the constraints that can be expressed and applied would not require further structural changes to Z4.
(It seems that it would not be wholly inappropriate to refer to m:Talk:Abstract Wikipedia/Object creation requirements#Z4/Type redefinition through additional validation at this point. Creating a custom Type by altering a copy of the set of constraints applicable to an existing Type seems to offer a reasonably robust user journey, especially if the new Type’s Validator can construct itself from the declared predicates and the existing Type’s structural validation.) GrounderUK (talk) 10:54, 3 April 2024 (UTC)
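The predicate idea in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the snapshot of persisted objects, the function names (modelled loosely on “count Persistent objects” and “Natural number is between”) and the shape of the constraint list are all hypothetical. None of them exist on Wikifunctions as written here.

```python
# Hypothetical snapshot of the system state: ZID -> Type of the persisted value.
PERSISTED_TYPES = {"Z41": "Z40", "Z42": "Z40", "Z6": "Z4"}


def count_persistent_objects(type_zid, store):
    """Impure in spirit: the answer depends on the current state of the store."""
    return sum(1 for value_type in store.values() if value_type == type_zid)


def is_between(n, low, high):
    """Inclusive range test, mirroring 'Natural number is between'."""
    return low <= n <= high


# A Type's constraints as a plain list of predicates over the store, so that
# new kinds of constraint need no structural change to the Type itself.
Z40_CONSTRAINTS = [
    lambda store: is_between(count_persistent_objects("Z40", store), 2, 2),
]


def constraints_hold(constraints, store):
    """A Type's declared constraints hold only if every predicate is true."""
    return all(predicate(store) for predicate in constraints)
```

Against the snapshot above, the Z40 constraint holds. Adding a third persistent object of Type Z40 to the store would make it fail, which is the behaviour the cardinality constraint is meant to capture.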
Type safety
Please see Wikifunctions:Type proposals/configuration of functions for given types by @99of9 and phab:T358589. The overlap is probably off-topic here. I suppose we shall not pursue multi-typed lists in this particular context, but even if they are never a formal type, it seems sensible and convenient (to me) that a user who wishes to mix, for example, Natural numbers and Strings in a list (to represent arithmetical or algebraic expressions, for example) should be able to indicate, having chosen a Z1/Object “untyped” list, that Booleans, lists, functions etc should not be elements in said list. Even ignoring type safety (and who would?), it would simply be more convenient to select from the pre-defined types when adding an item to a list, rather than searching repeatedly for one of the required types. Alternatively, the types of existing elements should be suggested when adding a new item, perhaps even as a default where only one type has thus far been selected. GrounderUK (talk) 10:35, 5 April 2024 (UTC)
- Can you check the phab link? --99of9 (talk) 11:28, 5 April 2024 (UTC)
- Done. Apologies for any inconvenience. GrounderUK (talk) 17:52, 5 April 2024 (UTC)
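The list behaviour suggested under “Type safety” could look something like the Python sketch below. It is a sketch only: the helper names are hypothetical, elements are assumed to be objects carrying an explicit Z1K1, and the ZID used for Natural number is an assumption that should be checked before relying on it.

```python
NATURAL_NUMBER = "Z13518"  # assumed ZID for Natural number
STRING = "Z6"
BOOLEAN = "Z40"


def element_type(element):
    """Assumes each element is an object with an explicit Z1K1."""
    return element["Z1K1"]


def suggested_types(elements):
    """Types already used in the list, in order of first appearance."""
    seen = []
    for element in elements:
        value_type = element_type(element)
        if value_type not in seen:
            seen.append(value_type)
    return seen


def default_type(elements):
    """Offer a default only when exactly one Type has been used so far."""
    types = suggested_types(elements)
    return types[0] if len(types) == 1 else None


def may_add(element, excluded_types):
    """Reject elements whose Type the user has excluded from the list."""
    return element_type(element) not in excluded_types


# An "untyped" list mixing Natural numbers and Strings, with Booleans excluded.
expression = [
    {"Z1K1": NATURAL_NUMBER, "Z13518K1": "2"},
    {"Z1K1": STRING, "Z6K1": "+"},
    {"Z1K1": NATURAL_NUMBER, "Z13518K1": "3"},
]
```

For this list, the suggestions would be Natural number and then String, no single default would be offered, and an attempt to add a Z40 would be refused if Booleans were in the excluded set.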