Using variable parameters for efficiency

Discussion:

(too old to reply)

August

2004-10-04 01:46:13 UTC

Hi folks,

For the case of efficiency, Wirth himself among others encourages the use of
reference parameters with arrays and records even if they are not being
modified. I have never studied the subject of compilers in great detail, but
shouldn't an optimizing compiler be able to deal with this.

As the actual parameter in these cases is either a variable or a string
literal (which have an address), the compiler could translate the formal
parameter into a reference parameter if it can conclude that the (value)
parameter is never modified. Using reference parameters for intended
efficiency would instead have negative effects as the compiler, without
access to the procedure implementation, must assume that the procedure may
modify these paramterers. This means that it will not be able to perform the
optimization described.

Any comments?

-- August

Michael van Acken

2004-10-04 09:16:05 UTC

Permalink

Finding out that a value parameter and its source is unchanged across
the a procedure call must prove that

a) that the called procedure does not write to its local copy of the
value parameter, and

b) that within the called procedure no piece of code modifies the
source variable that holds the _original_ copy of the data in the
value parameter.

Only if both is true, the value parameter can be changed to VAR. The
first issue would cause unintended side effects, and the second issue
would cause a piece of data to mutate that was previously a private
and immutable copy.

As an aside, changing value to variable parameters may also have an
effect on the calling convention, changing the binary interface
exposed by a procedure to external clients.

--mva

August

2004-10-05 00:48:20 UTC

Permalink

The following quotation is from "The Oakwood Guidelines for Oberon-2
Compiler Developers" section 5.13:

"There has been many requests to make ARRAY and RECORD parameters read-only
to achieve the efficiency of passing by reference without the associated
possibility for corruption of the calling parameter. An attempt to make an
assignment to any component of such a read-only parameter is a compile-time
error.... Discussions with ETH suggests this is really a compiler code
optimization issue and on this basis it is recommended that this extension
is not implemented."

The last statement suggests that an optimization can be performed.

You mention two conditions, (a) and (b), that has to be satisfied to make
the optimization possible. To check condition (a) is no more difficult than
checking modification of constants, right? Condition (b) means that in the
body of the procedure

PROCEDURE P(x: T);

the compiler has to check every designator of type T (which could be used by
a client as an actual parameter to P) for modification. Is this feasable?

If it's the case that clients in some situations need to know the binary
interface of a procedure, as you mention, then ETH is wrong---it's *not* a
compiler code optimization issue, and there is a real need for a read-only
option.

--August

Post by Michael van Acken

Finding out that a value parameter and its source is unchanged across
the a procedure call must prove that
a) that the called procedure does not write to its local copy of the
value parameter, and
b) that within the called procedure no piece of code modifies the
source variable that holds the _original_ copy of the data in the
value parameter.
Only if both is true, the value parameter can be changed to VAR. The
first issue would cause unintended side effects, and the second issue
would cause a piece of data to mutate that was previously a private
and immutable copy.
As an aside, changing value to variable parameters may also have an
effect on the calling convention, changing the binary interface
exposed by a procedure to external clients.
--mva

Ondrej Hrabal

2004-10-05 09:41:32 UTC

Permalink

Condition (b) means that in the body of the procedure
PROCEDURE P(x: T);
the compiler has to check every designator of type T (which could be used by
a client as an actual parameter to P) for modification. Is this feasable?

It's surely not that easy. Consider the following:

VAR
counter: INTEGER;

PROCEDURE Increment;
BEGIN
INC(counter)
END Increment;

PROCEDURE IncrBy (n: INTEGER);
BEGIN
WHILE n > 0 DO
Increment;
DEC(n)
END
END IncrBy;

If "counter" is passed to IncrBy as "n", then good luck...

Andy

August

2004-10-06 00:10:19 UTC

Permalink

To summarize the discussion, maybe the following changes to Oberon-2 would
improve the language (no new syntax):

1. Make all array and record parameters read-only reference parameters by
default (no special keyword). Is there any situation where an array or
record value parameter is useful?

2. For consistency and to discourage "bad" programming style---make all
value parameters read-only.

-- August

Post by Ondrej Hrabal

VAR
counter: INTEGER;
PROCEDURE Increment;
BEGIN
INC(counter)
END Increment;
PROCEDURE IncrBy (n: INTEGER);
BEGIN
WHILE n > 0 DO
Increment;
DEC(n)
END
END IncrBy;
If "counter" is passed to IncrBy as "n", then good luck...
Andy

Chris Burrows

2004-10-06 00:19:58 UTC

Permalink

"August" <***@tele2.se> wrote in message news:OJG8d.3885$***@nntpserver.swip.net...
Is there any situation where an array or

Post by August
record value parameter is useful?

yes - string constants

Chris Burrows
CFB Software
http://www.cfbsoftware.com

August

2004-10-06 00:25:10 UTC

Permalink

Post by August
Is there any situation where an array or

Post by August
record value parameter is useful?

yes - string constants

Explain why a read-only reference array parameter is not good enough!

Ondrej Hrabal

2004-10-06 13:32:20 UTC

Permalink

Post by August
To summarize the discussion, maybe the following changes to Oberon-2 would
1. Make all array and record parameters read-only reference parameters by
default (no special keyword). Is there any situation where an array or
record value parameter is useful?
2. For consistency and to discourage "bad" programming style---make all
value parameters read-only.
-- August

Seems to be quite reasonable.
(But record value parameters can still be useful, e.g. complex numbers
or RGB triples.)

August

2004-10-06 17:06:49 UTC

Permalink

Post by Ondrej Hrabal
(But record value parameters can still be useful, e.g. complex numbers
or RGB triples.)

It may be so, but please exemplify.

Ondrej Hrabal

2004-10-06 23:07:26 UTC

Permalink

Post by August

Post by Ondrej Hrabal
(But record value parameters can still be useful, e.g. complex numbers
or RGB triples.)

It may be so, but please exemplify.

The problem is that when the record size is only 4 bytes, then passing
by reference saves nothing. Instead, an extra pointer dereferencing is
needed each time the parameter is used. But, to be honest, that will
hardly be a significant slowdown.
Your suggestion is good. I can't see any reason to pass big structures
or strings by value (copy them explicitly if needed)...

Andy

Chris Burrows

2004-10-07 00:51:55 UTC

Permalink

Have you taken into account the effect this would have on the use of such
parameters in recursive procedure calls?

Chris Burrows
CFB Software
http://www.cfbsoftware.com

Ondrej Hrabal

2004-10-07 09:18:05 UTC

Permalink

Post by Chris Burrows
Have you taken into account the effect this would have on the use of such
parameters in recursive procedure calls?

There can't be any bad effect as long as the parameters are read-only.
If you need to modify them, copy them to local variables. Again I see
no problem. And the writable ref parameters remain unchanged.

Andy

Chris Burrows

2004-10-07 11:51:45 UTC

Permalink

Post by Ondrej Hrabal

Post by Chris Burrows
Have you taken into account the effect this would have on the use of such
parameters in recursive procedure calls?

The problem is that existing code is potentially broken - possibly without
warning. As in my earlier post - one simple solution is the use of an
additional keyword to make it clear that read-only parameters are intended
e.g. IN as used in Component Pascal. This has been tried and tested for some
years now. I'm not aware of any problems.

Chris Burrows
CFB Software
http://www.cfbsoftware.com

Ondrej Hrabal

2004-10-07 22:00:02 UTC

Permalink

Post by Chris Burrows
The problem is that existing code is potentially broken - possibly without
warning.

It should never go without a warning. The new rules would just be more
restrictive and incompatible uses would be reported as errors. Anyway,
I don't think it's extremely important whether read-only parameters
have an IN modifier or no modifier. I like the former for symmetry and
readability, but it's quite an unimportant problem.

Andy

Chris Burrows

2004-10-08 00:05:17 UTC

Permalink

Post by Ondrej Hrabal

Post by Chris Burrows
Have you taken into account the effect this would have on the use of such
parameters in recursive procedure calls?

There can't be any bad effect as long as the parameters are read-only.
If you need to modify them, copy them to local variables. Again I see
no problem.

Note that recursive procedures can have open arrays as parameters. The
complexity of the consequences of this proposal is increasing.

Chris Burrows
CFB Software
http://www.cfbsoftware.com

Ondrej Hrabal

2004-10-08 11:50:55 UTC

Permalink

Post by Chris Burrows
Note that recursive procedures can have open arrays as parameters. The
complexity of the consequences of this proposal is increasing.
Chris Burrows
CFB Software
http://www.cfbsoftware.com

Sorry, can't still see the problem. Open arrays are always passed by
ref already in the current practice.
Haven't we already got lost in this message thread? To make sure that
we talk about the same, I'd resume the proposal:

1. Read-only parameters (with or without the IN modifier), implemented
by-val or by-ref depending on some well known criteria.

2. Read-write and write-only parameters just like in Component Pascal.

3. No value parameters acting like local variables.

Anyone is welcome to show an example when this doesn't work.

Regards, Andy

Chris Burrows

2004-10-09 03:40:37 UTC

Permalink

Post by Ondrej Hrabal
1. Read-only parameters (with or without the IN modifier), implemented
by-val or by-ref depending on some well known criteria.
2. Read-write and write-only parameters just like in Component Pascal.
3. No value parameters acting like local variables.
Anyone is welcome to show an example when this doesn't work.

I am not concerned about proposal 1. (as long as it includes the IN
modifier) or proposal 2.

If you were considering creating a brand new language then 3 might be OK.
However, if it were to be introduced as a modification to an existing
language like Oberon-2 then it would cause more problems than it is intended
to prevent.

Consider a function which compares two arbitrary strings. It is required to
return true if

a) the two strings are identical

or

b) the first character of each string is a letter and the strings only
differ by the case of that letter. e.g.

'Chris' = 'chris'
'12345' = '12345'
'cHris' # 'chris'
empty string = empty string
'abc' # 'def'
' Chris' # 'Chris' (* leading blank character *)
etc.

After the return from a call to the procedure the input values must be the
same as they were prior to the call.

A simple solution that

a) would work for all Oberon-2 implementations that conform to the existing
Language Report, and
b) a novice should not have too much difficulty comprehending

would be:

PROCEDURE Smatched(n1, n2: ARRAY OF CHAR): BOOLEAN;
BEGIN
n1[0] := CAP(n1[0]);
n2[0] := CAP(n2[0]);
RETURN n1 = n2;
END Smatched;

Obviously it is possible to rewrite this even if n1 and n2 cannot be
considered as local variables. However, I would argue that the exercise is
not quite as trivial as one might first think, and the end result is less
comprehensible and / or maintainable. Try it for yourself.

Chris Burrows
CFB Software
http://www.cfbsoftware.com

Ondrej Hrabal

2004-10-09 18:08:12 UTC

Permalink

Post by Chris Burrows
I am not concerned about proposal 1. (as long as it includes the IN
modifier) or proposal 2.
If you were considering creating a brand new language then 3 might be OK.
However, if it were to be introduced as a modification to an existing
language like Oberon-2 then it would cause more problems than it is intended
to prevent.

If I get down to the task of designing a programming language some
time in the future, then it'll be a brand new one. :-) But even if
some existing Oberon code was to be ported to suit the mentioned
rules, it wouldn't be such a big disaster as every inconsistency would
be discovered by the compiler and would be very easy to correct.

Post by Chris Burrows
Consider a function which compares two arbitrary strings. It is required to
return true if
a) the two strings are identical
or
b) the first character of each string is a letter and the strings only
differ by the case of that letter. e.g.

...

Post by Chris Burrows
PROCEDURE Smatched(n1, n2: ARRAY OF CHAR): BOOLEAN;
BEGIN
n1[0] := CAP(n1[0]);
n2[0] := CAP(n2[0]);
RETURN n1 = n2;
END Smatched;
Obviously it is possible to rewrite this even if n1 and n2 cannot be
considered as local variables. However, I would argue that the exercise is
not quite as trivial as one might first think, and the end result is less
comprehensible and / or maintainable. Try it for yourself.
Chris Burrows
CFB Software
http://www.cfbsoftware.com

Well, such examples can be found. To a novice I'd always recommend the
solution with read-only reference parameters and explicit local
variables. The code would be a bit longer, but much much cleaner.
Consider that the novice gets an error message that open arrays cannot
be used for value parameters - as it will necessarily happen. What he
or she will probably do is add the VAR modifier.
Wow, the function works! But it adjusts the compared strings according
to its taste... :-(

It's all just a matter of choice:

Pros: clean parameter semantics reflecting the intended use instead of
implementation, no hidden expensive copying, no chance of confusing a
local variable with an in-out paramenter;

Cons - yet in rare cases: some more variable declarations and explicit
assignments plus a few more bytes allocated on the stack for the
pointer parameters;

Andy

Michael van Acken

2004-10-06 08:49:10 UTC

Permalink

Post by August
[...]
You mention two conditions, (a) and (b), that has to be satisfied to make
the optimization possible. To check condition (a) is no more difficult than
checking modification of constants, right?

Checking an invariant of a piece of code like "is this variable
modified?" is more difficult than enforcing a declared property of an
object. The former is an analysis, the latter a check against the
rules of the language.

Post by August
Condition (b) means that in the
body of the procedure
PROCEDURE P(x: T);
the compiler has to check every designator of type T (which could be used by
a client as an actual parameter to P) for modification. Is this feasable?

It is not sufficient to check only the body of procedure P. The
analysis must include all procedures called from P that might modify
the argument currently passed to `x'. This widens the scope of the
analysis to all _potential_ callers of P, present and future.

-- mva

Chris Burrows

2004-10-04 09:09:33 UTC

Permalink

... if (* Thats a big IF *) it can conclude that the (value) parameter is
never modified - not as easy as you might think. Consider for example the
case where the parameter is passed on as a parameter to another procedure
call, where it is then passed on as a parameter to another procedure call
...

I like the solution used in Component Pascal. Additional keywords IN and OUT
are used to qualify parameters as follows:

"Variable parameters can be used for input only (keyword IN), output only
(keyword OUT), or input and output (keyword VAR). IN can only be used for
array and record parameters. Inside the procedure, input parameters are
read-only."

As well as allowing the compiler to use the most efficient means of passing
the parameter without the overhead and complexity of optimisation
techniques, it gives the reader of the program a better idea of the
intentions of the programmer, and also gives the compiler a better chance to
catch unintentional errors.

Chris Burrows
CFB Software
http://www.cfbsoftware.com/gpcp

Ondrej Hrabal

2004-10-04 16:07:10 UTC

Permalink

Post by August
Hi folks,
For the case of efficiency, Wirth himself among others encourages the use of
reference parameters with arrays and records even if they are not being
modified. I have never studied the subject of compilers in great detail, but
shouldn't an optimizing compiler be able to deal with this.
As the actual parameter in these cases is either a variable or a string
literal (which have an address), the compiler could translate the formal
parameter into a reference parameter if it can conclude that the (value)
parameter is never modified. Using reference parameters for intended
efficiency would instead have negative effects as the compiler, without
access to the procedure implementation, must assume that the procedure may
modify these paramterers. This means that it will not be able to perform the
optimization described.
Any comments?
-- August

Hi August,

you're right that handling of parameters is sometimes quite awkward.
Actually, if only value and VAR parameters are supported (which is the
case of standard Oberon), then there is a big problems with string
constants: passing by value is pretty inefficient, passing by ref is
impossible, as they're not writable.
Compilers like XDS support read-only ref parameters as an extension of
the language and Component Pascal already has three kinds of reference
parameters - VAR (read-write), IN (read-only), OUT (read-write, but
initially undefined).
This works well, but it's still quite low-level: the programmer is
forced to think in terms of the underlying implementation, not
semantics.

My solution would be to have just 3 kinds of parameters:

IN - read-only, the compiler would choose the implementation depending
on the size of the parameter (say, upto 8 bytes => by value, more or
variable => by ref)

OUT - read-write, initially undefined, changes propagate to the caller
=> by ref

IN OUT - read-write, initialized by the caller, changes propagate to
the caller => by ref

I would completely discard the writable value parameters, as they are
IMO a very ugly concept. Why should a function modify its INPUT
values? Moreover, with this feature present, the compiler is unable to
detect a forgotten VAR/OUT modifier as an error! I know they are
retained for efficiency reasons, as they can save some additional
local variables, but think that's a rather rare situation and the
effect is negligible. I've restricted myself to always treat value
parameters as constants...

Regards, Andy