brainstorming Strings extensions

Post by Josef Templ » Tue Mar 20, 2018 1:26 pm

The BlackBox Strings module is currently rather minimal, and possible extensions
have been discussed from time to time.
This topic is intended for collecting ideas and proposals for such extensions
and for discussing their pros and cons.
The goal is not to implement any extension immediately but rather to trigger some discussion.

- Josef
Josef Templ
 
Posts: 1952
Joined: Tue Sep 17, 2013 6:50 am

Re: brainstorming Strings extensions

Post by Zinn » Thu Mar 22, 2018 9:31 am

We have already added some extensions:

Code:
   PROCEDURE IsAlpha (ch: CHAR): BOOLEAN;
   PROCEDURE IsAlphaNumeric (ch: CHAR): BOOLEAN;
   PROCEDURE IsIdent (ch: CHAR): BOOLEAN;
   PROCEDURE IsIdentStart (ch: CHAR): BOOLEAN;
   PROCEDURE IsLower (ch: CHAR): BOOLEAN;
   PROCEDURE IsNumeric (ch: CHAR): BOOLEAN;
   PROCEDURE IsUpper (ch: CHAR): BOOLEAN;

   PROCEDURE SetToString (x: SET; OUT str: ARRAY OF CHAR);
   PROCEDURE StringToSet (IN s: ARRAY OF CHAR; OUT x: SET; OUT res: INTEGER);

   PROCEDURE StringToUtf8 (IN in: ARRAY OF CHAR; OUT out: ARRAY OF SHORTCHAR; OUT res: INTEGER);
   PROCEDURE Utf8ToString (IN in: ARRAY OF SHORTCHAR; OUT out: ARRAY OF CHAR; OUT res: INTEGER);


and we should keep further extensions to a minimum. Currently I miss the following procedures:

Code:
   PROCEDURE IndexOf (IN s: ARRAY OF CHAR; c: CHAR): INTEGER;
   PROCEDURE LastIndexOf (IN s: ARRAY OF CHAR; c: CHAR): INTEGER;

   PROCEDURE Trim (VAR s: ARRAY OF CHAR);
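
For illustration, a minimal sketch of how these three procedures might look. The module name SketchStrings, the result -1 for "not found", and treating every character <= " " as white space are assumptions, not agreed behavior.

```componentpascal
MODULE SketchStrings; (* hypothetical module name, for illustration only *)

   (* Index of the first occurrence of c in s, or -1 if c does not occur (assumed convention). *)
   PROCEDURE IndexOf* (IN s: ARRAY OF CHAR; c: CHAR): INTEGER;
      VAR i: INTEGER;
   BEGIN
      i := 0;
      WHILE (s[i] # 0X) & (s[i] # c) DO INC(i) END;
      IF s[i] # 0X THEN RETURN i ELSE RETURN -1 END
   END IndexOf;

   (* Index of the last occurrence of c in s, or -1 if c does not occur. *)
   PROCEDURE LastIndexOf* (IN s: ARRAY OF CHAR; c: CHAR): INTEGER;
      VAR i: INTEGER;
   BEGIN
      i := LEN(s$) - 1;
      WHILE (i >= 0) & (s[i] # c) DO DEC(i) END;
      RETURN i
   END LastIndexOf;

   (* Removes leading and trailing characters <= " " in place (assumed definition of white space). *)
   PROCEDURE Trim* (VAR s: ARRAY OF CHAR);
      VAR first, last, i: INTEGER;
   BEGIN
      last := LEN(s$) - 1;
      WHILE (last >= 0) & (s[last] <= " ") DO DEC(last) END;
      first := 0;
      WHILE (first <= last) & (s[first] <= " ") DO INC(first) END;
      (* shift the remaining characters to the front; i never overtakes first *)
      i := 0;
      WHILE first <= last DO s[i] := s[first]; INC(i); INC(first) END;
      s[i] := 0X
   END Trim;

END SketchStrings.
```

With these definitions, IndexOf("a.b.c", ".") would yield 1 and LastIndexOf("a.b.c", ".") would yield 3.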

- Helmut
Zinn
 
Posts: 463
Joined: Tue Mar 25, 2014 5:56 pm
Location: Frankfurt am Main

Re: brainstorming Strings extensions

Post by Josef Templ » Thu Mar 22, 2018 11:59 am

I had a look at the latest ETH Oberon A2 version of the Strings module.
(see https://trac.inf.ethz.ch/trac/lecturers/a2/browser/trunk/source/Strings.Mod).

It also has Trim, IndexOf, and LastIndexOf operations.
In addition, it has TrimLeft, TrimRight and much more.

It has, for example, a very nice and small "Match" function that can do string comparison
with wildcards "*" (a sequence of arbitrary characters) and "?" (a single arbitrary character).
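
To illustrate how compact such a matcher can be, here is a sketch in Component Pascal. It is not the A2 code but a common greedy formulation with backtracking to the most recent "*"; the module name SketchMatch and the parameter order (pattern first) are assumptions.

```componentpascal
MODULE SketchMatch; (* hypothetical module name, for illustration only *)

   (* TRUE if s matches pat, where "*" stands for any sequence of characters
      (including the empty one) and "?" for exactly one character. *)
   PROCEDURE Match* (IN pat, s: ARRAY OF CHAR): BOOLEAN;
      VAR p, i, starP, starI: INTEGER;
   BEGIN
      p := 0; i := 0;
      starP := -1; starI := 0; (* position of the most recent "*" and where it started in s *)
      WHILE s[i] # 0X DO
         IF (pat[p] = "?") OR ((pat[p] # "*") & (pat[p] = s[i])) THEN
            INC(p); INC(i) (* single character matches *)
         ELSIF pat[p] = "*" THEN
            starP := p; starI := i; INC(p) (* first try to let "*" match nothing *)
         ELSIF starP >= 0 THEN
            (* mismatch: let the most recent "*" absorb one more character and retry *)
            p := starP + 1; INC(starI); i := starI
         ELSE
            RETURN FALSE
         END
      END;
      (* s is exhausted: only trailing "*" may remain in the pattern *)
      WHILE pat[p] = "*" DO INC(p) END;
      RETURN pat[p] = 0X
   END Match;

END SketchMatch.
```

For example, Match("*.odc", "Strings.odc") and Match("a?c", "abc") would be TRUE, while Match("a?c", "ac") would be FALSE.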

In addition to Strings, it has a module named DynamicStrings which provides
dynamically growing string buffers and hash table based string pools.
DynamicStrings is used e.g. for XML and HTML parsing.

- Josef

Re: brainstorming Strings extensions

Post by DGDanforth » Tue May 01, 2018 4:05 am

   TYPE
      SString = POINTER TO ARRAY OF SHORTCHAR;
      String = POINTER TO ARRAY OF CHAR;

   PROCEDURE New (IN x: ARRAY OF CHAR): String;
   PROCEDURE SNew (IN x: ARRAY OF SHORTCHAR): SString;

DGDanforth
 
Posts: 1060
Joined: Tue Sep 17, 2013 1:16 am
Location: Palo Alto, California, USA

Re: brainstorming Strings extensions

Post by Josef Templ » Wed May 02, 2018 5:04 am

DGDanforth wrote:
   TYPE
      SString = POINTER TO ARRAY OF SHORTCHAR;
      String = POINTER TO ARRAY OF CHAR;

   PROCEDURE New (IN x: ARRAY OF CHAR): String;
   PROCEDURE SNew (IN x: ARRAY OF SHORTCHAR): SString;

I agree. The types Strings.String and Strings.SString would be good candidates.
The constructor function is named 'NewString' in A2 ('New' + type name).
This naming convention would also give a good name for the SString
constructor (NewSString).

If String (SString) is introduced then the question is if there are any meaningful
operations on it. The only one that comes to my mind is a substring function like

PROCEDURE Substring* (IN s: ARRAY OF CHAR; offset, len: INTEGER): String;

similar to Strings.Extract but returning a String object.
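
As a rough sketch, the type and the two procedures could look as follows. The module name SketchStringObjs, the trap numbers, and the clipping of offset and len to the length of s are assumptions; the SString variants would be analogous with SHORTCHAR.

```componentpascal
MODULE SketchStringObjs; (* hypothetical module name, for illustration only *)

   TYPE
      String* = POINTER TO ARRAY OF CHAR;

   (* Returns a newly allocated copy of x, sized exactly for its contents plus 0X. *)
   PROCEDURE NewString* (IN x: ARRAY OF CHAR): String;
      VAR s: String;
   BEGIN
      NEW(s, LEN(x$) + 1);
      s^ := x$;
      RETURN s
   END NewString;

   (* Returns a newly allocated copy of at most len characters of s starting at offset.
      Out-of-range values are clipped to the length of s (assumed behavior). *)
   PROCEDURE Substring* (IN s: ARRAY OF CHAR; offset, len: INTEGER): String;
      VAR res: String; i, n: INTEGER;
   BEGIN
      ASSERT(offset >= 0, 20); ASSERT(len >= 0, 21);
      n := LEN(s$);
      IF offset > n THEN offset := n END;
      IF len > n - offset THEN len := n - offset END;
      NEW(res, len + 1);
      i := 0;
      WHILE i < len DO res[i] := s[offset + i]; INC(i) END;
      res[len] := 0X;
      RETURN res
   END Substring;

END SketchStringObjs.
```

Both procedures allocate exactly as many characters as needed, which is what makes them cheap to provide.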

- Josef
User avatar
Josef Templ
 
Posts: 1952
Joined: Tue Sep 17, 2013 6:50 am

Re: brainstorming Strings extensions

Post by DGDanforth » Wed May 02, 2018 6:42 pm

Josef Templ wrote:
If String (SString) is introduced then the question is if there are any meaningful
operations on it.

I frequently return a string from a function.

Re: brainstorming Strings extensions

Post by Ivan Denisov » Thu May 03, 2018 8:28 am

There is a Strings module by Ivan Goryachev.

https://blackbox.obertone.ru/extension/Strings

Code:
DEFINITION StringsXml;

   CONST
      dataSize = 64;
      strSize = 256;

   TYPE
      String = POINTER TO LIMITED RECORD
         len-: INTEGER
      END;

   VAR
      null-: StrPtr;

   PROCEDURE AppendChar (s: String; ch: CHAR);
   PROCEDURE AppendStr (s: String; IN str: ARRAY OF CHAR);
   PROCEDURE CloneStr (s: String): StrPtr;
   PROCEDURE Create (IN str: ARRAY OF CHAR): String;
   PROCEDURE ExtractStr (s: String; start, len: INTEGER): StrPtr;
   PROCEDURE GetChar (s: String; pos: INTEGER): CHAR;
   PROCEDURE GetStr (s: String; VAR str: ARRAY OF CHAR);
   PROCEDURE SetLength (s: String; len: INTEGER);

END StringsXml.


and its object-oriented wrapper:
Code:
DEFINITION StringsDyn;

   TYPE
      DynString = POINTER TO RECORD
         (ds: DynString) AddChar (ch: CHAR), NEW;
         (ds: DynString) AddString (str: ARRAY OF CHAR), NEW;
         (ds: DynString) Char (idx: INTEGER): CHAR, NEW;
         (ds: DynString) Clear, NEW;
         (ds: DynString) Length (): INTEGER, NEW;
         (ds: DynString) PartAsString (from, to: INTEGER): String, NEW;
         (ds: DynString) SetLength (len: INTEGER), NEW;
         (ds: DynString) String (): String, NEW
      END;

      String = POINTER TO ARRAY OF CHAR;

   PROCEDURE Create (str: ARRAY OF CHAR): DynString;

END StringsDyn.


It is good to keep the original Strings module as simple as possible, but to develop this or a similar Strings extension with all your experience.
Ivan Denisov
 
Posts: 1690
Joined: Tue Sep 17, 2013 12:21 am
Location: Russia

Re: brainstorming Strings extensions

Post by Josef Templ » Fri May 04, 2018 2:09 pm

In my Xml subsystem I also use a similar module for dynamically growing strings.
I called it XmlDStrings. It has dynamically growing strings, string pools, and a string splitter.

All of this is certainly not appropriate for the base module Strings, but nevertheless
I agree that it is very general purpose. The foundation actually comes from ETH Oberon (Aos, A2),
but it needed to be adapted to BlackBox. I am trying to gather some experience in using it in
the Xml subsystem.

Any extensions to Strings itself must be more basic,
working on 'ARRAY OF CHAR', not on objects.

My current favorites are:
TYPEs String and SString, with NewString/NewSString, and Substring; very cheap.
PROCEDUREs Trim, TrimLeft, TrimRight on ARRAY OF CHAR; very common in all other String libraries in the world.
PROCEDUREs StartsWith, EndsWith on ARRAY OF CHAR; very common in all other String libraries in the world.
PROCEDURE Match for wildcard comparison; because there is a very compact and efficient solution that is
extremely hard to come up with yourself.
Something like LastIndexOf (for searching backwards) would be useful for extracting path and file names, for example,
but under Windows there are two separator characters (/ and \), which makes it complicated.
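
To make the last two points concrete, here is a possible sketch of StartsWith and EndsWith, plus one conceivable way around the two-separator problem: a backwards search that accepts a set of characters. The module name SketchPrefix and the procedure LastIndexOfAny are hypothetical and not taken from any existing library.

```componentpascal
MODULE SketchPrefix; (* hypothetical module name, for illustration only *)

   (* TRUE if s begins with prefix; the empty prefix always matches. *)
   PROCEDURE StartsWith* (IN s, prefix: ARRAY OF CHAR): BOOLEAN;
      VAR i: INTEGER;
   BEGIN
      i := 0;
      WHILE (prefix[i] # 0X) & (s[i] = prefix[i]) DO INC(i) END;
      RETURN prefix[i] = 0X
   END StartsWith;

   (* TRUE if s ends with suffix; the empty suffix always matches. *)
   PROCEDURE EndsWith* (IN s, suffix: ARRAY OF CHAR): BOOLEAN;
      VAR i, d: INTEGER;
   BEGIN
      d := LEN(s$) - LEN(suffix$);
      IF d < 0 THEN RETURN FALSE END;
      i := 0;
      WHILE (suffix[i] # 0X) & (s[d + i] = suffix[i]) DO INC(i) END;
      RETURN suffix[i] = 0X
   END EndsWith;

   (* Index of the last character of s that occurs in chars, or -1 if there is none. *)
   PROCEDURE LastIndexOfAny* (IN s, chars: ARRAY OF CHAR): INTEGER;
      VAR i, j: INTEGER;
   BEGIN
      i := LEN(s$) - 1;
      WHILE i >= 0 DO
         j := 0;
         WHILE (chars[j] # 0X) & (chars[j] # s[i]) DO INC(j) END;
         IF chars[j] # 0X THEN RETURN i END;
         DEC(i)
      END;
      RETURN -1
   END LastIndexOfAny;

END SketchPrefix.
```

A call such as LastIndexOfAny(path, "/\") would then find the last separator regardless of which of the two characters is used.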

- Josef

Re: brainstorming Strings extensions

Post by Josef Templ » Mon May 14, 2018 8:06 pm

I just read in an Oracle newsletter about new string functions in Java.
They seem to be required because the existing functions are not
defined precisely enough for all Unicode characters.

http://app.response.oracle-mail.com/e/er?elq_mid=113114&sh=171282221722141115150114224&cmid=WWMK170418P00047&s=1973398186&lid=292124&elqTrackId=b638bb6c745448dd98e6d68ccb85ef05&elq=847fbd82be214cfa902f1e884b664ce1&elqaid=113114&elqat=1.


New Methods on Java String with JDK 11
It appears likely that Java's String class will be gaining some new methods with JDK 11, expected to be released in September 2018.

Bug number and bug title, followed by each new String method and its description:

JDK-8200425, String::lines
   lines(): "String instance method that uses a specialized Spliterator to lazily provide lines from the source string."
JDK-8200378, String::strip, String::stripLeading, String::stripTrailing
   strip(): "Unicode-aware" evolution of trim()
   stripLeading(): "removal of Unicode white space from the beginning"
   stripTrailing(): "removal of Unicode white space from the end"
JDK-8200437, String::isBlank
   isBlank(): "instance method that returns true if the string is empty or contains only white space"

