Driven IFS and Data Analysis

Depth of History

If the bin of the current data point does not affect the bin of the next data point, then
Prob(a point lands in address ij) = Prob(current data point is in bin i and the previous was in bin j)
= P(i|j)⋅P(j) (conditional probability)
= P(i)⋅P(j) (independence, since then P(i|j) = P(i))
Denote by N(i) the number of driven IFS points with address i, and by N(ij) the number with address ij.
Then in a data set of M points, the probabilities are given by
P(i) = N(i)/M and P(ij) = N(ij)/M
This gives a simple test of independence: use Address Stats to find N(1) through N(4), and N(11) through N(44). Then compare each P(ij) with P(i)⋅P(j).
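A minimal Python sketch of this comparison, assuming the data have already been binned into a list of labels 1 through 4. The names `address_counts` and `independence_table` and the sample sequence are hypothetical stand-ins for illustration, not the Address Stats tool itself. As above, address ij means the current point is in bin i and the previous point was in bin j.

```python
from collections import Counter

def address_counts(bins):
    """Return (N1, N2): counts of length-1 and length-2 addresses.

    bins is the list of bin labels (1..4) of the data points, in order.
    Address ij means: current point in bin i, previous point in bin j.
    """
    n1 = Counter(str(b) for b in bins)
    n2 = Counter(f"{cur}{prev}" for prev, cur in zip(bins, bins[1:]))
    return n1, n2

def independence_table(bins):
    """For each address ij return (P(ij), P(i)*P(j)), with P = N/M as in the text."""
    m = len(bins)
    n1, n2 = address_counts(bins)
    table = {}
    for i in "1234":
        for j in "1234":
            p_ij = n2[i + j] / m
            table[i + j] = (p_ij, (n1[i] / m) * (n1[j] / m))
    return table

# Hypothetical short bin sequence, for illustration only.
sample = [1, 4, 4, 2, 1, 3, 4, 1, 1, 4]
for address, (observed, product) in sorted(independence_table(sample).items()):
    print(address, round(observed, 3), round(product, 3))
```

With a sequence this short the two columns differ widely; the comparison only becomes meaningful for data sets with many points in every address.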
By how much must they differ before we can conclude that the difference is significant?
On the right are the numbers N(11) through N(44) for the previous random example.
For ease of comparison, on the left are the products of the probabilities, expressed on the same scale as the counts. That these numbers sum to 9999 instead of 10000 is a consequence of rounding.
Left: products P(i)⋅P(j), scaled to counts

33: 353    34: 660    43: 660    44: 1234
31: 562    32: 303    41: 1050   42: 568
13: 562    14: 1050   23: 304    24: 568
11: 894    12: 484    21: 484    22: 262

Right: observed N(ij)

33: 330    34: 632    43: 647    44: 1252
31: 623    32: 294    41: 1016   42: 597
13: 588    14: 1063   23: 314    24: 566
11: 860    12: 479    21: 490    22: 248
Although the corresponding numbers are relatively close, they are not identical.
Nevertheless, these data points are independent, so the corresponding values do not differ significantly; the discrepancies are sampling variation. How much they must differ before the difference is statistically significant is a more delicate point.
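One common way to make this precise, offered here as an assumption rather than as the method this page has in mind, is Pearson's chi-square test of independence. The sketch below applies it to the observed counts N(ij) listed above; the names `observed` and `chi_square_statistic` are hypothetical. For a 4×4 table the statistic has (4−1)⋅(4−1) = 9 degrees of freedom, and the standard tabulated 5% critical value for 9 degrees of freedom is about 16.92. A statistic below that value would not be judged significant at the 5% level.

```python
# Observed counts N(ij) from the table above; key "ij" = current bin i, previous bin j.
observed = {
    "11": 860, "12": 479, "13": 588, "14": 1063,
    "21": 490, "22": 248, "23": 314, "24": 566,
    "31": 623, "32": 294, "33": 330, "34": 632,
    "41": 1016, "42": 597, "43": 647, "44": 1252,
}

def chi_square_statistic(counts, labels="1234"):
    """Pearson chi-square statistic for independence of the two address digits."""
    total = sum(counts.values())
    # Marginal counts: first digit (current bin) and second digit (previous bin).
    first = {i: sum(counts[i + j] for j in labels) for i in labels}
    second = {j: sum(counts[i + j] for i in labels) for j in labels}
    stat = 0.0
    for i in labels:
        for j in labels:
            # Count expected in address ij if the two digits were independent.
            expected = first[i] * second[j] / total
            stat += (counts[i + j] - expected) ** 2 / expected
    return stat

print(round(chi_square_statistic(observed), 2))
```

Here the expected counts are built from the marginal totals of the observed table itself, so no separately rounded table of products is needed.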
