Line breaks in strings

DGDanforth
Posts: 1061
Joined: Tue Sep 17, 2013 1:16 am
Location: Palo Alto, California, USA

Re: Line breaks in strings

Post by DGDanforth »

Ivan Denisov wrote: Doug, you can do something like this:

Code:

mat := StrToMat(
"1, 0, 0, 0, 0" + 0DX +
"1, 1, 0, 0, 0" + 0DX +
"1, 0, 1, 0, 0" + 0DX +
"1, 0, 0, 1, 0" + 0DX +
"1, 0, 0, 0, 1"
)
I like that.
Thanks, Ivan.
Robert
Posts: 1024
Joined: Sat Sep 28, 2013 11:04 am
Location: Edinburgh, Scotland

Re: Line breaks in strings

Post by Robert »

I sympathise with both of Doug's desires:
- To have a simple way to textually specify a small matrix
- And to make the corresponding source code easy to read.

I already had such a procedure for Vectors (REAL & INTEGER versions). This thread has inspired me to write a matrix version (see below).

In this thread line feeds have been used for both purposes, but I have chosen to use them for only one: source layout. I use "|" in the string, not line feed, as the matrix row separator. That is an advantage of keeping this kind of functionality in private libraries: the only person you have to agree the syntax with is yourself!

The first box shows the before & after source code syntax, the second the new procedure in Lib, which I intend to republish very soon.

Code:

    NEW (mat, 4);
    mat [0]  :=  IVec.SetStr ('1, 1, 1, 0, 0, 0');
    mat [1]  :=  IVec.SetStr ('1, 0, 1, 0, 0 +0');
    mat [2]  :=  IVec.SetStr ('1  1  1  0  0 -0');
    mat [3]  :=  IVec.SetStr ('0  0 -1  0 +1, 5');

    mat  :=  IMat.SetStr (
               '1, 1, 1, 0, 0, 0  |' + 
               '1, 0, 1, 0, 0 +0  |' + 
               '1  1  1  0  0 -0  |' + 
               '0  0 -1  0 +1, 5');

Code:

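PROCEDURE  SetStr* (IN str : ARRAY OF CHAR) : Matrix;
  (* Two passes over str: the first (mat = NIL) counts the '|'-separated rows and allocates mat, the second fills each row. *)
  VAR
   a, b, r  :  INTEGER;
   chr      :  CHAR;
   txt      :  ARRAY  128  OF  CHAR;  (* buffer for one row, so a row may hold at most 127 characters *)
   mat      :  Matrix;
  BEGIN
    LOOP
      a  :=  0; b  :=  0; r  :=  0;
      LOOP
        chr  :=  str [b]; txt [a]  :=  chr;
        IF  (chr = '|')  OR  (chr = 0X)  THEN
          IF  mat  #  NIL  THEN
            txt [a]  :=  0X; mat [r]  :=  IVec.SetStr (txt)
          END;
          a  :=  -1;  (* restart the row buffer in both passes, so the counting pass cannot overrun txt; INC (a) below makes it 0 *)
          INC (r);
          IF  chr  =  0X  THEN
            IF  (mat = NIL) & (b > 0)  THEN  NEW (mat, r); EXIT  ELSE  RETURN  mat  END
          END
        END;
        INC (a); INC (b)
      END
    END
  END  SetStr;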
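
For readers who do not use Component Pascal, the row-splitting idea can be sketched in Python. This is only an illustration, not Robert's Lib code: the function name parse_matrix is made up, and the rule that numbers within a row may be separated by commas and/or white space is an assumption inferred from the IVec.SetStr examples above.

```python
import re

def parse_matrix(text):
    """Parse '|'-separated rows of integers into a list of lists.

    Assumption: within a row, numbers are separated by commas and/or
    white space, and may carry an explicit '+' or '-' sign.
    """
    matrix = []
    for row in text.split("|"):
        # Split on runs of commas/white space and drop empty tokens.
        tokens = [t for t in re.split(r"[,\s]+", row.strip()) if t]
        matrix.append([int(t) for t in tokens])
    return matrix

# The literal is broken across source lines purely for readability;
# the line breaks are not part of the string.
mat = parse_matrix(
    "1, 1, 1, 0, 0, 0  |"
    "1, 0, 1, 0, 0 +0  |"
    "1  1  1  0  0 -0  |"
    "0  0 -1  0 +1, 5"
)
```

The point of the "|" separator shows up in the call: the source layout is free to use line breaks, while the string itself never contains one.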

Re: Line breaks in strings

Post by DGDanforth »

Robert wrote: I sympathise with both of Doug's desires:
- To have a simple way to textually specify a small matrix
- And to make the corresponding source code easy to read.
I have implemented Ivan's suggestion (it took a little more code than I expected)
and it works just fine.

Your use of "|" is interesting. It avoids the need for the double plus " + 0DX +".
Good job.

-Doug

Re: Line breaks in strings

Post by DGDanforth »

Final comment.
One nice thing about having source code be a document is that
one can apply a ruler to align columns of a matrix.
[Attachment: RuledMatrix.jpg]
-Doug
Josef Templ
Posts: 2047
Joined: Tue Sep 17, 2013 6:50 am

Re: Line breaks in strings

Post by Josef Templ »

DGDanforth wrote:
Robert wrote: I'm not sure this is so cumbersome & ugly that it justifies a language change.
You are probably right.

So let me shift the discussion to "why did Wirth think it necessary to exclude line breaks?"

Helmut's comment about CR vs CR+LF may be the reason, but each Host would have its
convention that could easily be incorporated into the compiler.
I think Helmut pointed out the main reason.
The compiler doesn't know how to handle line feeds
because it depends on the intended usage.
When the string is inserted into a BlackBox TextModels.Model it is always a CR,
when output to the console or an ASCII text file it is either a CR, CR-LF or a LF, for example.

Another reason is the error reporting in the compiler.
An unclosed string may lead to skipping parts of the source text and
to strange follow up errors further below and you don't always see where the
erroneous symbol started.

Another problem is source code indentation, which leads to leading white space in the lines of the string,
or if not acceptable, you must give up source code indentation.
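
The indentation problem is easy to reproduce in any language that does allow line breaks inside string literals. Here is a small sketch in Python (chosen only because it has multi-line literals; the function names are made up for the illustration):

```python
def matrix_text():
    # The continuation lines are indented to match the surrounding code,
    # so that indentation becomes part of the string value.
    return """1, 0, 0
    0, 1, 0
    0, 0, 1"""

def strip_indentation(text):
    # One workaround: remove leading and trailing white space per line.
    # This is only valid when that white space carries no meaning.
    return "\n".join(line.strip() for line in text.splitlines())

raw = matrix_text()
clean = strip_indentation(raw)
```

Here the second and third lines of raw begin with four spaces that the programmer typed only to keep the source tidy. Either the consumer of the string has to tolerate them, or the literal has to start at column zero and break the indentation of the code around it.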

Note, a constant expression like line1 + CR + line2 is evaluated at compile time.
There is no runtime overhead.

- Josef

Re: Line breaks in strings

Post by DGDanforth »

Josef Templ wrote: I think Helmut pointed out the main reason.
The compiler doesn't know how to handle line feeds
Doug wrote: Why not?
because it depends on the intended usage.
Doug wrote: How so?
When the string is inserted into a BlackBox TextModels.Model it is always a CR,
Doug wrote: When a line is inserted
when output to the console or an ASCII text file it is either a CR, CR-LF or a LF, for example.
Doug wrote: Why 3 different forms?
Another reason is the error reporting in the compiler.
An unclosed string may lead to skipping parts of the source text and
to strange follow up errors further below and you don't always see where the
erroneous symbol started.
Doug wrote: Yes, and how does that affect line breaks in strings? If one forgets to close a string then you simply get, as it is now, a compiler warning.
Another problem is source code indentation, which leads to leading white space in the lines of the string,
or if not acceptable, you must give up source code indentation.
Doug wrote: For the example considered below with matrices, white space is considered a delimiter and removed.
Note, a constant expression like line1 + CR + line2 is evaluated at compile time.
There is no runtime overhead.

- Josef
cfbsoftware
Posts: 204
Joined: Wed Sep 18, 2013 10:06 pm

Re: Line breaks in strings

Post by cfbsoftware »

Josef Templ wrote: Another reason is the error reporting in the compiler.
An unclosed string may lead to skipping parts of the source text and
to strange follow up errors further below and you don't always see where the
erroneous symbol started.
Doug wrote: Yes, and how does that affect line breaks in strings? If one forgets to close a string then you simply get, as it is now, a compiler warning.
There wouldn't be a compiler warning as it wouldn't know the string wasn't closed if strings can span several lines. It would swallow everything until the opening quote of the beginning of the next string, which it would think was the closing quote of this string. It would then believe that everything in the next string was code etc. etc. Aaargh!!!
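
The runaway effect can be shown with a toy scanner. The sketch below is in Python and is not how the BlackBox compiler is written; scan_strings and its allow_multiline switch are hypothetical names used only to contrast the two rules.

```python
def scan_strings(source, allow_multiline):
    """Toy scanner: return (string literals found, lines with an unclosed string)."""
    literals, errors = [], []
    i, line = 0, 1
    while i < len(source):
        ch = source[i]
        if ch == "\n":
            line += 1
            i += 1
        elif ch == '"':
            start_line = line
            j = i + 1
            while j < len(source) and source[j] != '"':
                if source[j] == "\n":
                    if not allow_multiline:
                        break  # a string must end on its own line
                    line += 1
                j += 1
            if j < len(source) and source[j] == '"':
                literals.append(source[i + 1:j])
                i = j + 1
            else:
                errors.append(start_line)
                i = j  # resume at the line break (or at end of text)
        else:
            i += 1
    return literals, errors

# The closing quote after "first" is missing on line 1.
SOURCE = 'a := "first;\nb := "second";\nc := 3\n'

single = scan_strings(SOURCE, allow_multiline=False)
multi = scan_strings(SOURCE, allow_multiline=True)
```

With the single-line rule the scanner reports the unclosed string on line 1 and still recognises "second" correctly. With multi-line strings allowed it takes the text from first; up to the opening quote of the next string as one literal, treats second as code, and only complains about an unclosed string starting on line 2, away from the real mistake.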

Re: Line breaks in strings

Post by DGDanforth »

cfbsoftware wrote:
Josef Templ wrote: Another reason is the error reporting in the compiler.
An unclosed string may lead to skipping parts of the source text and
to strange follow up errors further below and you don't always see where the
erroneous symbol started.
Doug wrote: Yes, and how does that affect line breaks in strings? If one forgets to close a string then you simply get, as it is now, a compiler warning.
There wouldn't be a compiler warning as it wouldn't know the string wasn't closed if strings can span several lines. It would swallow everything until the opening quote of the beginning of the next string, which it would think was the closing quote of this string. It would then believe that everything in the next string was code etc. etc. Aaargh!!!
Chris,
OK that's a good point.
But we have fudged string processing by forbidding line breaks in strings, simply so that forgetting to close a string doesn't cause the catastrophe you describe. That feels like an "engineering" solution to a theoretical question: what is a "string"?
-Doug

Re: Line breaks in strings

Post by Robert »

Josef Templ wrote: Note, a constant expression like line1 + CR + line2 is evaluated at compile time.
There is no runtime overhead.
A useful reminder.
A string "+", evaluated at run-time, adds about 30 - 40 bytes of code, so seems quite expensive (I guess it has to do quite a bit of work), and I try to avoid them. I tend to be quite mean about these things - I forget just how cheap modern memory is.

Likewise the string assignment a := b uses much less code than a := b$. I don't know which is faster.
Doug wrote: Final comment.
That was never likely!

Re: Line breaks in strings

Post by Robert »

DGDanforth wrote: That feels like an "engineering" solution to a theory. What is a "string"?
I don't think we are talking about strings, and whether they can contain line-feeds (they can).
I also suspect that your example does not require line-feeds; would this work?

Code:

int train_X[6][6] = {{1, 1, 1, 0, 0, 0},{1, 0, 1, 0, 0, 0},{1, 1, 1, 0, 0, 0},{0, 0, 1, 1, 1, 0},{0, 0, 1, 0, 1, 0},{0, 0, 1, 1, 1, 0}};
What we are talking about is how we represent strings in CP source code.
The current way of making the representation multi-line (which requires breaking the string and using a "+" symbol) is not really too onerous, it seems to me.
The direct alternative has several difficulties as already pointed out.

Incidentally, if you use a TextField Control to input a string, it has a Multi Line option that allows you to input line breaks directly.