issue-#59 IN function with SETs

Robert · Post by **Robert** » Tue Jun 09, 2015 11:07 am

Josef Templ wrote: The behavior of using MOD 32 ... is not a surprise.

Does this comment mean that you think "k IN s" and "(k MOD 32) IN s" are equivalent? They are not. "k IN s" returns a random (or at least unpredictable) result.

One can argue that this is not a bug, but one can also argue that it is not desirable.

Josef Templ · Post by **Josef Templ** » Tue Jun 09, 2015 12:13 pm

Robert wrote: Does this comment mean that you think "k IN s" and "(k MOD 32) IN s" are equivalent? They are not. "k IN s" returns a random (or at least unpredictable) result.

One can argue that this is not a bug, but one can also argue that it is not desirable.

It means that k IN s is obviously the same as (k MOD 32) IN s,
and that this is no surprise if you know the inner workings of a 32-bit CPU.
What should it do except ignore the higher bits? This is exactly what happens with k MOD 32.
It cuts off the higher bits because internally k is represented as a binary number.
There is no range checking included with bit operations on any CPU I know.
It would need extra code emitted by the compiler.

It is not a bug because the behavior of k IN s is undefined if k > MAX(SET).
So you should not rely on it if you want to be on the safe side.

I am 100% sure that this was known to ominc but they decided to opt for the
maximum efficiency in this case. The reason is simple. Bit operations are often used for
bitmap manipulations and if you work on the bit level of pixels of a bitmap, you need
the maximum speed. A 1000 pixel by 1000 pixel bitmap has 1 million pixels and every pixel has
8 or 16 or even 32 bits. So you may end up doing a huge number of bit operations.

- Josef
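
A minimal C sketch of the masking behavior described above, assuming a 32-bit set held in a register. The function name `in_mod32` is hypothetical and not part of BlackBox; it only models an index reduced modulo 32, so index 32 tests the same bit as index 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: models "(k MOD 32) IN s" for a 32-bit set s.
   Only the low five bits of k select the bit that is tested. */
bool in_mod32(uint32_t s, uint32_t k)
{
    return ((s >> (k & 31u)) & 1u) != 0u;
}
```

With this model, `in_mod32(s, 32)` always gives the same answer as `in_mod32(s, 0)`.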

Robert · Post by **Robert** » Tue Jun 09, 2015 5:18 pm

Josef Templ wrote: It means that obviously k IN s is the same as (k MOD 32) IN s

We are now disagreeing on a simple matter of fact.

About 5 posts ago I printed out the results of luowy's test example: 32 IN {0..31}.
Sometimes it returns TRUE, sometimes FALSE.

I have no reason to suspect (32 MOD 32) IN {0..31} is indeterminate.

luowy · Post by **luowy** » Wed Jun 10, 2015 2:16 am

Josef Templ wrote: It is not a bug because the behavior of k IN s is undefined if k > MAX(SET).
So you should not rely on it if you want to be on the safe side.

I am 100% sure that this was known to ominc but they decided to opt for the
maximum efficiency in this case. The reason is simple. Bit operations are often used for
bitmap manipulations and if you work on the bit level of pixels of a bitmap, you need
the maximum speed. A 1000 pixel by 1000 pixel bitmap has 1 million pixels and every pixel has
8 or 16 or even 32 bits. So you may end up doing a huge number of bit operations.

IN is a language feature, so it should be high-level and safe. Returning FALSE when k is out of range is intuitive,
rather than returning an undefined result or, as Chris pointed out, raising a runtime trap for code such as:

Code:

 IF k IN s  ....

"(k MOD 32) IN s" is not the same as "k IN s", as Robert pointed out.

For "maximum efficiency", a code procedure such as

Code:

bitin(k,address)

would be better than a compiler feature.

If IN is translated to C (for example with your Ofront), the equivalent is (x & (1 << k)) != 0 or ((x >> k) & 1) != 0, as Oleg pointed out;
the translation takes no account of this "efficiency feature".

The fix adds only 2 instructions (5 bytes of code), which is efficient enough.
It is like the array index check, which is applied even when we know the index is not out of range.

luowy
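
The variant that returns FALSE for an out-of-range element could look like this in C. It is a sketch only: `in_or_false` is a hypothetical name, and it does not reproduce the code the BlackBox compiler would emit. In C the explicit check is needed in any case, because shifting a 32-bit value by 32 or more is undefined.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: "k IN s" that yields false for k outside 0..31
   instead of an unspecified result. */
bool in_or_false(uint32_t s, uint32_t k)
{
    if (k > 31u) {
        return false;                 /* out of range: not an element */
    }
    return ((s >> k) & 1u) != 0u;     /* in range: plain bit test */
}
```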

Josef Templ · Post by **Josef Templ** » Wed Jun 10, 2015 4:53 am

luowy wrote: IN is a language feature, so it should be high-level and safe. Returning FALSE when k is out of range is intuitive,
rather than returning an undefined result or, as Chris pointed out, raising a runtime trap for code such as:

Chris is right. The only option you have is to generate a TRAP.
Otherwise there would be an inconsistency with the compile time error message in case of

Code:

IF 32 IN {} THEN

But I am not proposing to make any change at all. It is not worth the effort
and it must be seen in a bigger context. There are more such operations
that would need a check. Adding it to IN is too specific.

We have more important things to do. Look at the CPC changes list. We still have
about 40 unresolved items on it. We can come back to this topic in 1.8 if there is any need.
But then it must be discussed in a more general context, i.e. not only for IN.

The runtime check for array index bounds is of a completely different nature.
It guarantees that the memory is not corrupted which could otherwise lead to fatal
system crashes in the garbage collector or heap management.
The same applies to type guards. They are REQUIRED for technical reasons.
SET operations cannot destroy the memory outside the variable.
Memory integrity is the reason that forces runtime checks.

- Josef

Josef Templ · Post by **Josef Templ** » Wed Jun 10, 2015 5:23 am

Robert wrote: We are now disagreeing on a simple matter of fact.

About 5 posts ago I printed out the results of luowy's test example: 32 IN {0..31}.
Sometimes it returns TRUE, sometimes FALSE.

I have no reason to suspect (32 MOD 32) IN {0..31} is indeterminate.

It is also deterministic but in a different sense.
When the set is not in a CPU register, a memory bit test operation is used
that takes the start address of the variable and adds the bit counter
without the MOD 32 operation. So you end up outside that variable in the adjacent
memory location. The INCL operation, on the other hand, does not write outside the set
but has the MOD 32 behavior. This is important for preserving memory integrity.

In any case, don't rely on the behavior of an unspecified operation.

- Josef
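
The memory case can be modeled in C as follows. This is a sketch under the assumption that the bit test addresses word `k / 32` relative to the set's address; the names `bit_test_mem` and `test_32_in_full_set` are hypothetical, and `neighbour` stands in for whatever data happens to follow the set in memory.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a memory-operand bit test: the index first selects
   a 32-bit word relative to base, then a bit inside that word.
   The caller must make sure base[k / 32] is valid storage. */
bool bit_test_mem(const uint32_t *base, uint32_t k)
{
    return ((base[k / 32u] >> (k % 32u)) & 1u) != 0u;
}

/* Evaluates "32 IN {0..31}" under this model. */
bool test_32_in_full_set(uint32_t neighbour)
{
    /* mem[0] is the set {0..31}; mem[1] is the adjacent memory word. */
    uint32_t mem[2] = { 0xFFFFFFFFu, neighbour };
    return bit_test_mem(mem, 32u);
}
```

Under this model, 32 IN {0..31} yields bit 0 of the neighbouring word rather than any bit of the set, which would be consistent with a result that is sometimes TRUE and sometimes FALSE.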

luowy · Post by **luowy** » Wed Jun 10, 2015 2:20 pm

Hi Josef,
I understand what you are saying and agree with you.
The language definitions say:
1) Oberon (Oberon-07): x IN s stands for "x is an element of s". x must be of type INTEGER, and s of type SET.
2) CP: x IN s stands for "x is an element of s". x must be an integer in the range 0..MAX(SET), and s of type SET.

"x must be an integer in the range 0..MAX(SET)" is the point: it must trap when x is out of range.

Our BB compiler already provides the CheckRange() procedure, which traps when x is out of range; that is right.
But ranchk is off by default.

I turn the range check on when debugging:

Code:

   PROCEDURE In* (VAR x, y: DevCPL486.Item);
      VAR c: DevCPL486.Item;
   BEGIN
      (*IF y.form = Set THEN CheckRange(x, 0, 31) END;*)
      IF x.mode = Con THEN
         DevCPL486.GenBitOp(BT, x, y);
      ELSE
         DevCPL486.MakeConst(c, 31, x.form); DevCPL486.GenComp(c, x);
         DevCPL486.GenAssert(ccBE, ranTrap);
         DevCPL486.GenBitOp(BT, x, y);
      END;
      Free(x); Free(y);
      SetCC(x, lss, FALSE, FALSE); (* carry set *)
   END In;

luowy

cfbsoftware · Post by **cfbsoftware** » Wed Jun 10, 2015 10:42 pm

luowy wrote: I would like to make the range check the default, as a personal option, to find bugs as early as possible.
Code:
IF y.form = Set THEN check:=ranchk; CheckRange(x, 0, 31); ranchk:= check; END; 

The problem with that proposal is that it removes the capability for a programmer to disable the range check for IN expressions if that is what he wants. Relying on compiler-generated range checks to cater for bad data is like using a sledgehammer to crack a walnut. They can be useful to temporarily flush out programmer errors in experimental code but there are more efficient ways of doing this in production code. Consider the following (hypothetical) code.

Code:

ASSERT((i >= MIN(SET)) & (i <= MAX(SET)));
ok := TRUE;
FOR j := 0 TO 1000000 DO
 IF i IN set[j] THEN ok := FALSE END
END;

I have ensured that the IN expression is valid and would not want the compiler to insert redundant range checks in every iteration of that loop.
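
The same idea in C, as a sketch: the range of `i` is validated once before the loop, and the loop body uses an unchecked bit test. The function name `absent_from_all` and the array-of-words representation of the sets are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true if bit i is set in none of the n sets.
   The range of i is checked once, so the loop body needs no check. */
bool absent_from_all(const uint32_t *sets, size_t n, uint32_t i)
{
    assert(i <= 31u);                 /* single check, outside the loop */
    bool ok = true;
    for (size_t j = 0; j < n; j++) {
        if ((sets[j] >> i) & 1u) {    /* unchecked test, valid because i <= 31 */
            ok = false;
        }
    }
    return ok;
}
```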

luowy · Post by **luowy** » Thu Jun 11, 2015 6:11 am

Hi Chris,
Thanks for your example. You have a good reason for not checking the range in this case, and I agree with you.

But a lot of code, such as mine, Robert's, and even the BB framework, is not guarded at all, for various reasons.
If such code has a bug, there is no compiler error and (most of the time) no runtime error, just random behavior.
That is hard to find and hard to debug, which is why we want this range check built in.

Most of our code is not time-sensitive; we want it to run correctly first and efficiently second.
The array index check works the same way: it is applied even when we know the index is never out of range.

Code:

VAR a: ARRAY 1024 OF CHAR;
FOR i := 0 TO 1023 DO a[i] := 0X END;

After this discussion, I think triggering a trap is more useful and closer to the language definition than returning FALSE:

Code:

   PROCEDURE In* (VAR x, y: DevCPL486.Item);
      VAR c:DevCPL486.Item;
   BEGIN
      (*IF y.form = Set THEN CheckRange(x, 0, 31) END;*)
      IF x.mode = Con THEN
         DevCPL486.GenBitOp(BT, x, y);
      ELSE
         DevCPL486.MakeConst(c, 31, x.form); DevCPL486.GenComp(c, x);
         DevCPL486.GenAssert(ccBE,ranTrap);
         DevCPL486.GenBitOp(BT, x, y);
      END;
      Free(x); Free(y);
      SetCC(x, lss, FALSE, FALSE); (* carry set *)
   END In;

luowy

Josef Templ · Post by **Josef Templ** » Fri Jun 12, 2015 7:45 am

As I mentioned before, index checks have a very different importance.
They exist to guarantee memory integrity and thereby to avoid fatal errors
that cannot be debugged because they appear at a later time and in a different place, anywhere
in the program.
This is the reason why they are turned on by default.

It does not make sense to add range checks for IN but ignore the switched-off range checks in
other situations. This is too specific. You must see the complete picture.

One approach would be to turn on all runtime checks by default in a future release or to
have an option for turning it on. This does not require any separate change for IN.
I would strongly propose to leave this to a future release and to discuss it later.
The current behavior served us well for the last 20 years. It was only a surprising observation to
some of the center members who did not know this behavior.

- Josef

BlackBox Framework Center
