Wikifunctions:Determinism
This page is currently a draft. More information pertaining to this may be available on the talk page.
When returning execution results, the Wikifunctions engine caches aggressively. This reduces server load and lets more people take advantage of the project.
Because of caching, functions should be pure; that is, their result should depend solely on the function's inputs. Such a function can be executed once for a given set of arguments and then cached virtually forever. An example of a pure function is addition: every time you add 1 to 2, you get 3. On the other hand, a function returning the current time is not pure. Executing it now and an hour ago gives different results even though the parameters are unchanged. Pure functions are also better suited for debugging and testing, because they have neither hidden state (e.g. a counter that's incremented after every execution) nor external dependencies.
However, the aims of Wikifunctions could not all be fulfilled if the project completely forbade impure functions. Therefore, some guidelines regarding function determinism and purity had to be put in place.
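The difference can be sketched in Python. This is only an illustration: `functools.lru_cache` stands in for the engine's cache and is not how Wikifunctions actually caches results, and the function names `add` and `current_timestamp` are made up for this example.

```python
import time
from functools import lru_cache


@lru_cache(maxsize=None)
def add(a: int, b: int) -> int:
    """Pure: the result depends only on a and b, so caching it is safe."""
    return a + b


def current_timestamp() -> float:
    """Impure: the result changes between calls although there are no inputs."""
    return time.time()


print(add(1, 2))              # 3 (computed)
print(add(1, 2))              # 3 (served from the cache)
print(add.cache_info().hits)  # 1
```

Wrapping `current_timestamp` in the same cache would freeze it at the first value it returned, which is why impure functions and aggressive caching do not mix.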
All the rules outlined below have a common goal: to reduce the number of nondeterministic functions, so that the execution engine can support them effectively.
Date and time
Functions that perform an analysis based on the current date or time are undoubtedly useful. However, they should be decoupled from reading the current time and take it as a parameter instead. Any calculation based on the current time is just a special case of a calculation for a given date. For example, instead of creating a function get today's day of week (with no arguments), one should create a function get day of week for a date (with the date being passed as an argument).
This doesn't mean, however, that the former function cannot exist. It can be defined as a composition: a call to get day of week for a date, with the "current date" passed in as a value at run time, e.g. from a wikitext parser function. This way, the date-related non-determinism is confined to outside the Wikifunctions system, and the evaluation can be read from cache.
Note: Currently, during the early days of Wikifunctions, date and time are not supported as data types, so this whole category does not apply yet.
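A minimal Python sketch of this decomposition follows. The names `day_of_week_for_date` and `todays_day_of_week` are hypothetical, and the wrapper stands in for whatever outside caller (such as a wikitext parser function) supplies the current date.

```python
from datetime import date

DAY_NAMES = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def day_of_week_for_date(d: date) -> str:
    """Pure: the same date always yields the same day name."""
    return DAY_NAMES[d.weekday()]  # weekday() returns 0 for Monday


def todays_day_of_week() -> str:
    """Impure wrapper: the only place where the current date is read."""
    return day_of_week_for_date(date.today())


print(day_of_week_for_date(date(2024, 1, 1)))  # Monday
```

Only the thin wrapper is non-deterministic; every result of `day_of_week_for_date` can be cached by its argument.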
Environment properties
Functions should not depend on properties of the execution environment. The environment can be updated or otherwise altered at any time, and functions are expected to keep behaving as they did before.
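As a hedged illustration, the Python sketch below contrasts a function that reads a setting from the environment with one that receives the same information as an argument. The environment variable `DECIMAL_SEPARATOR` and both function names are invented for this example.

```python
import os


def format_decimal_from_environment(value: float) -> str:
    """Environment-dependent: the output changes if the variable changes."""
    separator = os.environ.get("DECIMAL_SEPARATOR", ".")  # hypothetical variable
    return f"{value:.2f}".replace(".", separator)


def format_decimal(value: float, decimal_separator: str) -> str:
    """Environment-independent: everything it needs is an explicit argument."""
    return f"{value:.2f}".replace(".", decimal_separator)


print(format_decimal(3.14159, ","))  # 3,14
```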
Randomness
Random values are troublesome for caching and should therefore be used sparingly. Any algorithm that makes use of randomness should accept a seed as an argument. That seed should be used to initialize any random generators invoked by the function.
Since random number generators have to be deterministic as well, they should follow one of these patterns:
- To obtain the next random number, one has to pass the previous value as the seed, for example:
  - Assume seed is passed to our function as an argument.
  - Let random_value1 be the result of random(seed).
  - Let random_value2 be the result of random(random_value1).
  - etc.
- The random generator accepts two arguments, the seed and the previous value, for example:
  - Assume seed is passed to our function as an argument.
  - Choose some initial value.
  - Let random_value1 be the result of random(seed, initial).
  - Let random_value2 be the result of random(seed, random_value1).
  - etc.
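Both patterns can be sketched in Python with a toy linear congruential generator. The constants, the function names, and the way the seed is mixed in for the second pattern are assumptions made for illustration; this is not a generator provided by Wikifunctions and it is not suitable where statistical quality matters.

```python
MODULUS = 2**31
MULTIPLIER = 1103515245
INCREMENT = 12345


def random_from_previous(previous: int) -> int:
    """Pattern 1: the previous value (initially the seed) is the only input."""
    return (MULTIPLIER * previous + INCREMENT) % MODULUS


def random_from_seed_and_previous(seed: int, previous: int) -> int:
    """Pattern 2: the seed and the previous value are separate inputs."""
    return (MULTIPLIER * previous + INCREMENT + seed) % MODULUS


def chained_values(seed: int, count: int) -> list[int]:
    """random(seed), random(random_value1), ... as in the first pattern."""
    values = []
    current = seed
    for _ in range(count):
        current = random_from_previous(current)
        values.append(current)
    return values


def seeded_values(seed: int, initial: int, count: int) -> list[int]:
    """random(seed, initial), random(seed, random_value1), ... as in the second."""
    values = []
    current = initial
    for _ in range(count):
        current = random_from_seed_and_previous(seed, current)
        values.append(current)
    return values


# The same inputs always give the same sequence, so results stay cacheable.
print(chained_values(42, 3) == chained_values(42, 3))      # True
print(seeded_values(42, 0, 3) == seeded_values(42, 0, 3))  # True
```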
Wikidata
Eventually, we are going to support receiving data from Wikidata. This action is also deterministic (though determined by the edits to the referenced Wikidata entities).
Prices
Fluctuating prices, such as those of gas or stocks, should be passed in as parameters so that functions don't need to look them up.
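For instance, a trip-cost calculation might take the fuel price as an argument instead of fetching it. The function `fuel_cost` and its parameter names are hypothetical, and the numbers are made up.

```python
def fuel_cost(distance_km: float, litres_per_100_km: float,
              price_per_litre: float) -> float:
    """Pure: the price is supplied by the caller rather than looked up."""
    return distance_km / 100 * litres_per_100_km * price_per_litre


print(fuel_cost(200.0, 5.0, 2.0))  # 20.0
```

The caller decides which price to use, so the same arguments always produce the same cacheable result.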