Computing Anomalies
The corresponding source file is available online on GitHub.
Preamble
An “anomaly” is a concept that was introduced in [AC:BonPerTia19] (though its principle was already present in [C:BirPer15]). It can be positive or negative, and it quantifies how good the properties of a given S-box are compared to those of a permutation picked uniformly at random from the relevant set. If an S-box has a high positive anomaly for a given property, then this property is much better than should be expected from a random S-box. Conversely, if its negative anomaly is high, then this property is much worse than should be expected.
Formally, consider a property \(P\) defined for any S-box \(S\), such that its outputs can be strictly ordered (for instance, if it’s a number then we can simply compare two outputs). The positive anomaly of \(S\) with respect to \(P\) is then \(-\log_2\) of the probability that a permutation picked uniformly at random has a property at least as good as that of \(S\); the negative anomaly is defined in the same way with “at least as bad”.
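This notion can be illustrated empirically by sampling random permutations. The following pure-Python sketch estimates both anomalies for a toy property (the number of fixed points); the helper names and the property are ours for illustration, not part of sboxU's API.

```python
import math
import random

def fixed_points(perm):
    """Toy property P: the number of fixed points of a permutation."""
    return sum(1 for x, y in enumerate(perm) if x == y)

def empirical_anomalies(s, prop, n_samples=2000, seed=0):
    """Estimate the (positive, negative) anomalies of `prop` at `s`
    by sampling permutations of the same size uniformly at random."""
    rng = random.Random(seed)
    identity = list(range(len(s)))
    p_s = prop(s)
    at_most = at_least = 0
    for _ in range(n_samples):
        f = identity[:]
        rng.shuffle(f)
        p_f = prop(f)
        at_most += (p_f <= p_s)
        at_least += (p_f >= p_s)
    # -log2 of the proportion of random permutations whose property is
    # at most (resp. at least) P(S); `max(..., 1)` avoids log2(0).
    return (-math.log2(max(at_most, 1) / n_samples),
            -math.log2(max(at_least, 1) / n_samples))

# The identity on 8 points has 8 fixed points, while a random
# permutation has about 1 on average: almost no random permutation has
# at least as many, so the second (negative) anomaly is high.
pos, neg = empirical_anomalies(list(range(8)), fixed_points)
```

Here a large second value means that having that many fixed points is much worse (less random-looking) than expected; the dedicated sboxU functions used below follow the same principle, with exact probabilities instead of sampling.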
Let’s see how this notion behaves. We will need both the S-boxes from
the literature that are bundled with SAGE, and the very useful
defaultdict class.
from sage.crypto.sboxes import sboxes
from collections import defaultdict
Computing some anomalies
We will look at some S-boxes from the literature. To this end, we grab
the sboxes dictionary shipped with SAGE which contains many of them.
Skipjack
Let’s start with the S-box of Skipjack and look at the anomalies corresponding to its main tables:
skipjack = get_sbox(sboxes["Skipjack"])
for table in ["LAT", "BCT", "DDT"]:
    print("{} {:10.5f} {:10.5f}".format(
        table,
        table_anomaly(skipjack, table),
        table_negative_anomaly(skipjack, table)
    ))
As we can see, the positive anomaly for the LAT is very high (remember that it is a logarithm): the S-box has been somehow optimized to have good linear properties (see [C:BirPer15] and [PhD:Perrin17] for more details). Another way to see this property is to plot its absolute Walsh spectrum along with the one expected from a random permutation. This is done using the following function, which will open a new window displaying this graph. The blue area shows the discrepancy between the expected behaviour of a random S-box and that of Skipjack.
(you must close the graph window for the program to continue)
interactive_distribution_comparison_lat(skipjack)
Other S-boxes
We can look at these values for other S-boxes: those of the AES, Fantomas, and Kuznyechik.
for k in ["AES", "Fantomas", "Kuznyechik"]:
    s = get_sbox(sboxes[k])
    print(k)
    for table in ["LAT", "BCT", "DDT"]:
        print("{} {:10.5f} {:10.5f}".format(
            table,
            table_anomaly(s, table),
            table_negative_anomaly(s, table)
        ))
All the properties of the AES are extremely good, perhaps even optimal. Thus, the positive anomalies are essentially infinite (there are \(256! \approx 2^{1684}\) 8-bit permutations, meaning an anomaly above 1684 doesn’t really make sense). For the S-box of Fantomas, whose BCT properties are pretty bad, we observe the opposite. Finally, for the S-box of Kuznyechik, we can see that all the properties are too good to be random, especially the differential ones.
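The 1684-bit figure quoted above can be checked directly: it is simply \(\log_2(256!)\), the number of bits needed to single out one 8-bit permutation among all of them.

```python
import math

# Number of bits needed to index all 8-bit permutations: log2(256!).
log2_perm_count = sum(math.log2(k) for k in range(2, 257))
# This is about 1684, hence the cap on meaningful anomalies.
```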
Looking at the maximum value
sboxU provides tools to compute the expected value for the maximum
coefficient in the LAT, DDT and BCT (based on [JMC:DaeRij07] and
[AC:BonPerTia19]). Let’s try them out in a new experiment.
First, we set the global parameters. Statistical tests lose their
relevance when the input size becomes too small, so we fix n=8.
Running a couple thousand tests is sufficient to gather enough
information, hence n_tested=2**11.
n = 8
n_tested = 2**11
LAT
Experimental
We first compute experimental data by generating n_tested random
permutations. For each, we compute the linearity and increase a
corresponding counter in a dictionary called dis. We then display
the proportion of each linearity encountered.
dis = defaultdict(int)
for t in range(0, n_tested):
    dis[linearity(random_permutation_S_box(n))] += 1
for k in sorted(dis.keys()):
    if dis[k] != 0:
        print("{:3d}: {:.4f}".format(k, float(dis[k])/n_tested))
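For reference, the linearity statistic counted above can be written out in a few lines of plain Python. This naive sketch is ours (sboxU's implementation is much faster), but it makes the definition explicit: the maximum absolute value of a non-trivial Walsh coefficient \(\sum_x (-1)^{a \cdot x \oplus b \cdot S(x)}\).

```python
def dot(a, x):
    """Inner product over GF(2): parity of the bitwise AND."""
    return bin(a & x).count("1") & 1

def walsh_coefficient(s, a, b):
    """Sum over x of (-1)^(a.x xor b.S(x))."""
    return sum((-1) ** (dot(a, x) ^ dot(b, s[x])) for x in range(len(s)))

def naive_linearity(s):
    """Maximum absolute Walsh coefficient with b != 0."""
    size = len(s)
    return max(abs(walsh_coefficient(s, a, b))
               for a in range(size) for b in range(1, size))

# Example: the 4-bit PRESENT S-box, which reaches the optimal
# linearity (8) for a 4-bit permutation.
present = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
           0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
lin = naive_linearity(present)
```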
Theoretical
For comparison, we then generate the expected distribution using the
relevant function (expected_linearity_distribution_permutation), and
then display the result.
dis = expected_linearity_distribution_permutation(n, n)
for k in sorted(dis.keys()):
    if dis[k] != 0.0:
        print("{:3d}: {:.4f}".format(k, dis[k]))
DDT
Our experiment for the DDT is essentially the same as for the LAT and
uses the same variables. We just need to replace linearity with
differential_uniformity, and
expected_linearity_distribution_permutation with
expected_differential_uniformity_distribution_permutation.
print("Experimental")
dis = defaultdict(int)
for t in range(0, n_tested):
    dis[differential_uniformity(random_permutation_S_box(n))] += 1
for k in sorted(dis.keys()):
    if dis[k] != 0:
        print("{:3d}: {:.4f}".format(k, float(dis[k])/n_tested))
print("Theoretical")
dis = expected_differential_uniformity_distribution_permutation(n, n)
for k in sorted(dis.keys()):
    if dis[k] != 0.0:
        print("{:3d}: {:.4f}".format(k, dis[k]))
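As with the linearity, the differential uniformity measured here can be spelled out in plain Python. This sketch of ours builds the full DDT, where DDT[a][b] counts the solutions x of S(x xor a) xor S(x) = b, and takes the maximum over the rows with a != 0.

```python
def ddt(s):
    """DDT[a][b] = #{x : S(x xor a) xor S(x) = b}."""
    size = len(s)
    table = [[0] * size for _ in range(size)]
    for a in range(size):
        for x in range(size):
            table[a][s[x ^ a] ^ s[x]] += 1
    return table

def naive_differential_uniformity(s):
    """Maximum DDT coefficient over the rows with a != 0."""
    return max(max(row) for row in ddt(s)[1:])

# Example: the 4-bit PRESENT S-box is differentially 4-uniform, the
# best possible for a 4-bit permutation.
present = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
           0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
du = naive_differential_uniformity(present)
```

Note that each row of the DDT sums to 2^n, since every x contributes to exactly one entry of its row.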
BCT
For the BCT, it is again the same, except that we consider
boomerang_uniformity and
expected_boomerang_uniformity_distribution_permutation.
print("Experimental")
dis = defaultdict(int)
for t in range(0, n_tested):
    dis[boomerang_uniformity(random_permutation_S_box(n))] += 1
for k in sorted(dis.keys()):
    if dis[k] != 0:
        print("{:3d}: {:.4f}".format(k, float(dis[k])/n_tested))
print("Theoretical")
dis = expected_boomerang_uniformity_distribution_permutation(n, n)
for k in sorted(dis.keys()):
    if dis[k] != 0.0:
        print("{:3d}: {:.4f}".format(k, dis[k]))
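The boomerang uniformity can also be written out naively. For a permutation S, the BCT entry at (a, b) counts the x such that S^-1(S(x) xor b) xor S^-1(S(x xor a) xor b) = a, and the boomerang uniformity is the maximum over a != 0 and b != 0. Again, this is our own readable sketch rather than sboxU's implementation.

```python
def bct(s):
    """BCT[a][b] = #{x : S^-1(S(x)^b) ^ S^-1(S(x^a)^b) = a},
    for a permutation s given as a lookup table."""
    size = len(s)
    inverse = [0] * size
    for x, y in enumerate(s):
        inverse[y] = x
    table = [[0] * size for _ in range(size)]
    for a in range(size):
        for b in range(size):
            for x in range(size):
                if inverse[s[x] ^ b] ^ inverse[s[x ^ a] ^ b] == a:
                    table[a][b] += 1
    return table

def naive_boomerang_uniformity(s):
    """Maximum BCT coefficient with a != 0 and b != 0."""
    size = len(s)
    table = bct(s)
    return max(table[a][b]
               for a in range(1, size) for b in range(1, size))

present = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
           0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
bu = naive_boomerang_uniformity(present)
```

The row and column at index 0 are always equal to 2^n, and for a, b != 0 the solutions pair up as {x, x xor a}, so every non-trivial BCT coefficient is even; it is also at least as large as the corresponding DDT coefficient.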
Comments
As you can see by running this code, modeling the table coefficients as independent and identically distributed random variables works fairly well, though the results in the case of the expected linearity are not as good as the others. This is most likely due to the independence hypothesis being too strong in this context.