My interpretation of the observed anomaly is this:

Background: The compiler uses so-called fingerprints to compare types for equality efficiently.
For technical reasons (avoiding endless recursion and providing stable results
independent of the evaluation order) it even uses multiple fingerprints per type:
an identifier fingerprint idfp (stops on named types),
a public fingerprint pbfp (everything exported), and a private fingerprint pvfp (everything).
A fingerprint is a 32-bit hash code that results from
applying a function (DevCPM.FPrint) to each structural property of a type
and its elements. Because the value has only 32 bits, collisions are possible,
i.e. two different types can produce the same fingerprint. This is a rare situation,
but the original ETH Oberon compiler had some potential for it, and this was well known to ominc (Oberon microsystems).
Therefore they used an improved FPrint function (CRC32, see comment in FPrint)
instead of the original simpler function. This eliminated many collisions, but not all.

In our example we do not just have two types that happen to collide,
but a pattern in which every instance produces a collision. The effect of the array length
on the fingerprint is canceled out by the sequence of fingerprint update steps, because the length is effectively
applied twice. That this cancellation occurs is purely accidental.
The compiler does not have a real bug. It just happens that there is a pattern that systematically leads to a collision.
Even more surprisingly, the same pattern also produces a collision with the original FPrint function of the
ETH Oberon compiler. But not every FPrint function necessarily behaves this way. A very simple function,
INC(fp, val), would not collide in this example. A much more complex function such as
SHA would probably not collide either.
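
To illustrate the "applied twice" effect in isolation, here is a toy sketch in Component Pascal. It is not the DevCPM or DevCPT code: the module FpToy and the procedures XorFold and AddFold are hypothetical names, and the pure XOR fold is a deliberately simplified stand-in for an XOR-based mixing function. It only shows that a value folded in twice vanishes under XOR, whereas with INC(fp, val) it does not.

```component-pascal
MODULE FpToy;
	(* Toy model only (assumption); this is NOT the compiler's FPrint. *)
	IMPORT StdLog;

	(* Hypothetical pure-XOR fold: folding the same value twice cancels it. *)
	PROCEDURE XorFold (VAR fp: INTEGER; val: INTEGER);
	BEGIN
		fp := ORD(BITS(fp) / BITS(val))	(* "/" on sets is symmetric difference, i.e. XOR *)
	END XorFold;

	(* Additive fold, i.e. INC(fp, val): folding twice does not cancel. *)
	PROCEDURE AddFold (VAR fp: INTEGER; val: INTEGER);
	BEGIN
		INC(fp, val)
	END AddFold;

	PROCEDURE Do*;
		VAR n, x, a: INTEGER;
	BEGIN
		FOR n := 1 TO 4 DO	(* n plays the role of an array length *)
			x := 0; XorFold(x, n); XorFold(x, n);	(* x is 0 again, independent of n *)
			a := 0; AddFold(a, n); AddFold(a, n);	(* a is 2 * n, still depends on n *)
			StdLog.Int(n); StdLog.Int(x); StdLog.Int(a); StdLog.Ln
		END
	END Do;

END FpToy.
```

Running FpToy.Do should log x = 0 for every n, while a = 2 * n. How the length ends up being applied twice inside the real sequence of FPrint calls is not modeled here.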

A simple fix that destroys the pattern is to add a constant to btyp.pbfp before applying it to pbfp.
A constant like 12345 affects a lot of bits and seems to destroy the pattern reliably.
Adding btyp.n instead also seems to work, but it has the disadvantage that it suggests to a reader
of the code that the number of array elements matters here, when in fact it is irrelevant.
The number of array elements is already fingerprinted in both pbfp and btyp.pbfp.
It is only the combination of a particular bit pattern and the FPrint function that cancels the effect of the length in
the result of FPrint(pbfp, btyp.pbfp), unless btyp.pbfp is modified somehow.
With the constant, the code reads:
FPrintStr(btyp);
IF (btyp.comp = Array) & ((bstrobj = NIL) OR (bstrobj.name = null)) THEN
	(* array base type without a name: perturb btyp.pbfp to break the collision pattern *)
	DevCPM.FPrint(pbfp, btyp.pbfp + 12345)
ELSE
	DevCPM.FPrint(pbfp, btyp.pbfp)
END;
pvfp := pbfp
- Josef