We define a pattern as a relation between functions of sequences of measurements. Any pattern that has never been falsified (and thus has always been confirmed, i.e. has always held true) is called an "emergent law". Our algorithms creatively construct new patterns by recursively applying functions to sequences of varying length and screening them for such emergent laws.
A first example: Law about the home advantage in European soccer
We use the match results of all soccer matches in the major European leagues (England, Spain, Germany, France and Italy) from 2004 to 2016. The blue line shows the mean number of goals for the home team in each sequence of T games; the red line shows the mean number of goals for the away team. By gradually increasing T we found that for sequences of length T=154 the red line was always below the blue one: mean home-team goals were always greater than mean away-team goals. The graph thus shows an empirical law about the existence of a home advantage in European soccer leagues, discoverable by varying T and comparing the respective means.
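The check behind such a law is simple to sketch. The following Python snippet is only an illustration under assumed data structures (two chronologically ordered lists with the goals of the home and away team per match); it is not the actual implementation of our search algorithms:

```python
# Minimal sketch (illustration only): verify that mean home goals exceed
# mean away goals in every non-overlapping sequence of T matches.
# home_goals and away_goals are assumed to be chronologically ordered lists.

def law_holds(home_goals, away_goals, T):
    """True if the home-advantage pattern holds in every sequence of T matches."""
    for i in range(len(home_goals) // T):
        block = slice(i * T, (i + 1) * T)
        mean_home = sum(home_goals[block]) / T
        mean_away = sum(away_goals[block]) / T
        if mean_home <= mean_away:
            return False      # pattern falsified for this sequence length
    return True               # pattern held in every sequence

# e.g. law_holds(home_goals, away_goals, T=154) for the law described above
```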
We have developed algorithms that autonomously search for such laws in databases by automatically creating and testing billions of patterns. We have found a huge number of relevant emergent laws for every subject we have tried. These laws are the foundation of all applications of Emergent Law Based Statistics.
Examples of laws in engineering and the natural sciences
As stated previously, emergent laws can be found across many domains. For example, if we look at a time series of carbon monoxide (CO) measurements in an Italian city, we can see that in each sequence of T=180 days the mean CO concentration at 7 pm was higher than the mean concentration at midnight. However, this does not imply that the CO concentration at 7 pm was higher than at midnight on every single day. The law only emerged after combining 180 measurements; that is why we call it an "emergent law".
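The following sketch illustrates, under assumed variable names (daily CO readings aligned by date), how the smallest sequence length at which such a law emerges can be found; it is an illustration only, not our production code:

```python
# Illustration only: find the smallest T for which the mean 7 pm CO
# concentration exceeds the mean midnight concentration in every
# non-overlapping sequence of T days. co_7pm and co_midnight are assumed
# to be daily readings aligned by date.

def holds_for_T(co_7pm, co_midnight, T):
    """True if the 7 pm mean exceeds the midnight mean in every sequence of T days."""
    for i in range(len(co_7pm) // T):
        block = slice(i * T, (i + 1) * T)
        if sum(co_7pm[block]) / T <= sum(co_midnight[block]) / T:
            return False
    return True

def smallest_emergent_T(co_7pm, co_midnight, T_max=365):
    """Smallest sequence length at which the pattern was never falsified."""
    for T in range(1, T_max + 1):
        if holds_for_T(co_7pm, co_midnight, T):
            return T
    return None   # no emergent law up to T_max
```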
Examples of laws in business administration and economics
Do you know a single empirical law in economics or business administration that has always made true predictions? Probably the following is the first empirical law in business administration you have ever seen.
If we analyze prices for used cars offered on an internet platform (eBay), we can see that in each sequence of T=1750 offers the mean price for VWs was higher than that for Renaults. This pattern was observed in 186.03 non-overlapping sequences (of 1750 offers each). We could therefore have predicted (186.03 - 1) times, once for each non-overlapping sequence after the first, that the pattern would hold, and we would have been right every time. We call the number of verifications of such a prediction the "Degree of inductive Verification" (DiV). In this case the prediction was verified 185.03 times, so this law has a DiV of 185.03.
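To make the bookkeeping explicit, here is the DiV calculation for this example; the total number of offers is back-calculated from the figures above and therefore only approximate:

```python
T = 1750                   # sequence length
n_offers = 325_552         # assumed total (≈ 186.03 * 1750, back-calculated)
sequences = n_offers / T   # ≈ 186.03 non-overlapping sequences
div = sequences - 1        # each sequence after the first verifies the prediction
print(round(div, 2))       # ≈ 185.03
```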
Even in the domain of financial markets we can find patterns that have always held true so far. For example, the correlation between the returns of the S&P 500 and the Dow Jones is greater than zero in every sequence of at least 18 trading days. This has been the case for 1778 non-overlapping sequences since 1896.
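A sketch of this check, with assumed variable names for the two aligned daily return series (statistics.correlation requires Python 3.10 or newer):

```python
# Illustration only: is the return correlation positive in every
# non-overlapping sequence of T trading days?
from statistics import correlation   # Pearson correlation, Python 3.10+

def correlation_law_holds(sp500_returns, dow_returns, T=18):
    """True if the return correlation is positive in every sequence of T days."""
    for i in range(len(sp500_returns) // T):
        block = slice(i * T, (i + 1) * T)
        if correlation(sp500_returns[block], dow_returns[block]) <= 0:
            return False
    return True
```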
We have good reason to believe that this law will remain true and thus that its DiV will keep increasing in the future.
Definite falsification
There is no sequence length T<=150000 (half of the observations) for which the mean offer price for BMWs was always lower, or always higher, than that of Mercedes cars. We can say with absolute certainty that the universal statement "there is always a difference in mean prices" will never be true for sequences with T<=150000. This universal statement has been falsified once and for all, and we will never have to test it again.
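The exhaustive check behind such a definite falsification can be sketched as follows; the data layout (a chronologically ordered list of (brand, price) offers) and the function names are assumptions for illustration only:

```python
# Illustration only: check whether the mean BMW price was above the mean
# Mercedes price in every non-overlapping block of T offers, or below it
# in every block, for any T up to T_max.

def always_one_sided(offers, T, brand_a="BMW", brand_b="Mercedes"):
    """offers: chronologically ordered (brand, price) tuples."""
    directions = set()
    for i in range(len(offers) // T):
        block = offers[i * T:(i + 1) * T]
        prices_a = [price for brand, price in block if brand == brand_a]
        prices_b = [price for brand, price in block if brand == brand_b]
        if not prices_a or not prices_b:
            return False          # a block without both brands breaks the pattern
        directions.add(sum(prices_a) / len(prices_a) > sum(prices_b) / len(prices_b))
    return len(directions) == 1   # same direction in every block

def definitively_falsified(offers, T_max=150_000):
    """True if no sequence length T <= T_max makes the pattern always true."""
    return not any(always_one_sided(offers, T) for T in range(1, T_max + 1))
```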
The role of induction and falsification in science and learning
The explorative search for emergent laws can be seen as complementing the currently dominant deductive approach to science and learning. Consider the often-used example of deductive reasoning:
- Socrates is human,
- all humans are mortal,
- → Socrates is mortal.
We immediately see that the foundation of empirically relevant deduction is the ability to identify the pattern "is human" and the law "all humans are mortal". From our point of view, the identification of patterns (via empirical laws showing that certain patterns of measurements are always classified as human) and of other empirical laws will always rely on induction. And we are convinced that the main instrument for finding empirical laws is search, not deduction. Search processes rely on the ability to construct and evaluate hypotheses autonomously. By far the most frequent outcome of evaluating a hypothesis is falsification; the rate of found laws is very low. It is therefore very important that falsification in Emergent Law Based Statistics is final, because the efficiency of the search relies heavily on the fact that knowledge about what is eternally false remains accessible.