brainstorming Strings extensions
- Josef Templ
- Posts: 2047
- Joined: Tue Sep 17, 2013 6:50 am
brainstorming Strings extensions
The BlackBox Strings module is rather minimal as it is now and there has been some discussion
about possible extensions from time to time.
This topic is intended for collecting ideas and proposals for possible extensions
and to discuss the pros and cons of such extensions.
It is not the goal to do any such extension immediately but rather to trigger some discussion.
- Josef
Re: brainstorming Strings extensions
We have already done some extensions:
Code: Select all
PROCEDURE IsAlpha (ch: CHAR): BOOLEAN;
PROCEDURE IsAlphaNumeric (ch: CHAR): BOOLEAN;
PROCEDURE IsIdent (ch: CHAR): BOOLEAN;
PROCEDURE IsIdentStart (ch: CHAR): BOOLEAN;
PROCEDURE IsLower (ch: CHAR): BOOLEAN;
PROCEDURE IsNumeric (ch: CHAR): BOOLEAN;
PROCEDURE IsUpper (ch: CHAR): BOOLEAN;
PROCEDURE SetToString (x: SET; OUT str: ARRAY OF CHAR);
PROCEDURE StringToSet (IN s: ARRAY OF CHAR; OUT x: SET; OUT res: INTEGER);
PROCEDURE StringToUtf8 (IN in: ARRAY OF CHAR; OUT out: ARRAY OF SHORTCHAR; OUT res: INTEGER);
PROCEDURE Utf8ToString (IN in: ARRAY OF SHORTCHAR; OUT out: ARRAY OF CHAR; OUT res: INTEGER);
and should minimize further extensions. Currently I miss the following extensions:
Code: Select all
PROCEDURE IndexOf (IN s: ARRAY OF CHAR; c: CHAR): INTEGER;
PROCEDURE LastIndexOf (IN s: ARRAY OF CHAR; c: CHAR): INTEGER;
PROCEDURE Trim (VAR s: ARRAY OF CHAR);
- Helmut
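For discussion, the three missing procedures (IndexOf, LastIndexOf, Trim) could look roughly like the sketch below. This is not existing BlackBox code: the module name StringsSketch is hypothetical, returning -1 for "not found" is an assumption, and so is treating every character up to and including the blank as white space in Trim. All strings are assumed to be 0X-terminated.

```componentpascal
MODULE StringsSketch; (* hypothetical module name, not part of BlackBox *)

  (* Index of the first occurrence of c in s, or -1 if there is none (assumed convention).
     Searching for 0X also yields -1. *)
  PROCEDURE IndexOf* (IN s: ARRAY OF CHAR; c: CHAR): INTEGER;
    VAR i: INTEGER;
  BEGIN
    i := 0;
    WHILE (s[i] # 0X) & (s[i] # c) DO INC(i) END;
    IF s[i] = 0X THEN i := -1 END;
    RETURN i
  END IndexOf;

  (* Index of the last occurrence of c in s, or -1 if there is none. *)
  PROCEDURE LastIndexOf* (IN s: ARRAY OF CHAR; c: CHAR): INTEGER;
    VAR i: INTEGER;
  BEGIN
    i := LEN(s$) - 1;
    WHILE (i >= 0) & (s[i] # c) DO DEC(i) END;
    RETURN i
  END LastIndexOf;

  (* Removes leading and trailing characters <= " " in place (assumed notion of white space). *)
  PROCEDURE Trim* (VAR s: ARRAY OF CHAR);
    VAR first, last, i: INTEGER;
  BEGIN
    first := 0;
    WHILE (s[first] # 0X) & (s[first] <= " ") DO INC(first) END;
    last := LEN(s$) - 1;
    WHILE (last >= first) & (s[last] <= " ") DO DEC(last) END;
    (* shift the remaining characters to the front and terminate *)
    i := 0;
    WHILE first <= last DO s[i] := s[first]; INC(i); INC(first) END;
    s[i] := 0X
  END Trim;

END StringsSketch.
```

With these definitions, Trim turns "  ab  " into "ab", and a string consisting only of blanks becomes the empty string.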
- Josef Templ
- Posts: 2047
- Joined: Tue Sep 17, 2013 6:50 am
Re: brainstorming Strings extensions
I had a look at the latest ETH Oberon A2 version of the Strings module.
(see https://trac.inf.ethz.ch/trac/lecturers ... trings.Mod).
It also has Trim, IndexOf, and LastIndexOf operations.
In addition, it has TrimLeft, TrimRight and much more.
It has, for example, a very nice and small "Match" function that can do string comparison
with wildcards "*" (a sequence of arbitrary characters) and "?" (a single arbitrary character).
In addition to Strings, it has a module named DynamicStrings which provides
dynamically growing string buffers and hash table based string pools.
DynamicStrings is used e.g. for XML and HTML parsing.
- Josef
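To illustrate what such a wildcard comparison does, here is a minimal recursive sketch in Component Pascal. It is not the A2 implementation (which is more compact); the module name MatchSketch, the helper MatchAt, and the parameter names mask and name are assumptions made for this example. It can become slow for masks with many "*" characters.

```componentpascal
MODULE MatchSketch; (* hypothetical module name *)

  (* TRUE if mask[m..] matches name[n..]; "*" matches any sequence of characters
     (including the empty one), "?" matches exactly one character. *)
  PROCEDURE MatchAt (IN mask, name: ARRAY OF CHAR; m, n: INTEGER): BOOLEAN;
    VAR found: BOOLEAN;
  BEGIN
    (* consume everything up to the next "*" character by character *)
    WHILE (mask[m] # 0X) & (mask[m] # "*") DO
      IF (name[n] = 0X) OR ((mask[m] # "?") & (mask[m] # name[n])) THEN RETURN FALSE END;
      INC(m); INC(n)
    END;
    IF mask[m] = 0X THEN RETURN name[n] = 0X END;
    (* mask[m] = "*": skip it and try every possible tail of name, starting with the current one *)
    INC(m);
    found := MatchAt(mask, name, m, n);
    WHILE ~found & (name[n] # 0X) DO
      INC(n); found := MatchAt(mask, name, m, n)
    END;
    RETURN found
  END MatchAt;

  PROCEDURE Match* (IN mask, name: ARRAY OF CHAR): BOOLEAN;
  BEGIN
    RETURN MatchAt(mask, name, 0, 0)
  END Match;

END MatchSketch.
```

For example, Match("*.Mod", "Strings.Mod") and Match("S?rings*", "Strings.Mod") both hold, while Match("?", "") does not.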
- DGDanforth
- Posts: 1061
- Joined: Tue Sep 17, 2013 1:16 am
- Location: Palo Alto, California, USA
Re: brainstorming Strings extensions
TYPE
SString = POINTER TO ARRAY OF SHORTCHAR;
String = POINTER TO ARRAY OF CHAR;
PROCEDURE New (IN x: ARRAY OF CHAR): String;
PROCEDURE SNew (IN x: ARRAY OF SHORTCHAR): SString;
- Josef Templ
- Posts: 2047
- Joined: Tue Sep 17, 2013 6:50 am
Re: brainstorming Strings extensions
DGDanforth wrote:
TYPE
SString = POINTER TO ARRAY OF SHORTCHAR;
String = POINTER TO ARRAY OF CHAR;
PROCEDURE New (IN x: ARRAY OF CHAR): String;
PROCEDURE SNew (IN x: ARRAY OF SHORTCHAR): SString;
I agree. The types Strings.String and Strings.SString would be good candidates.
The constructor function is named 'NewString' in A2 ('New' + type name).
This naming convention would also give a good name for the SString
constructor (NewSString).
If String (SString) is introduced then the question is if there are any meaningful
operations on it. The only one that comes to my mind is a substring function like
PROCEDURE Substring* (IN s: ARRAY OF CHAR; offset, len: INTEGER): String;
similar to Strings.Extract but returning a String object.
- Josef
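As a rough sketch of how cheap these additions would be, NewString and Substring could be written as follows. The module name StringTypesSketch is hypothetical, and clipping offset and len to the actual string length (instead of trapping) is an assumption of this example, not something decided in the thread. NewSString would follow the same pattern with SHORTCHAR.

```componentpascal
MODULE StringTypesSketch; (* hypothetical module name *)

  TYPE
    String* = POINTER TO ARRAY OF CHAR;

  (* Allocates a heap string that holds exactly the characters of x plus the terminating 0X. *)
  PROCEDURE NewString* (IN x: ARRAY OF CHAR): String;
    VAR s: String;
  BEGIN
    NEW(s, LEN(x$) + 1);
    s^ := x$;
    RETURN s
  END NewString;

  (* Returns a new string with up to len characters of s, starting at offset.
     Values beyond the end of s are clipped (assumed behavior). *)
  PROCEDURE Substring* (IN s: ARRAY OF CHAR; offset, len: INTEGER): String;
    VAR res: String; n, i: INTEGER;
  BEGIN
    ASSERT(offset >= 0, 20); ASSERT(len >= 0, 21);
    n := LEN(s$);
    IF offset > n THEN offset := n END;
    IF len > n - offset THEN len := n - offset END;
    NEW(res, len + 1);
    i := 0;
    WHILE i < len DO res[i] := s[offset + i]; INC(i) END;
    res[len] := 0X;
    RETURN res
  END Substring;

END StringTypesSketch.
```

With this sketch, Substring("BlackBox", 5, 3) yields a String containing "Box".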
- DGDanforth
- Posts: 1061
- Joined: Tue Sep 17, 2013 1:16 am
- Location: Palo Alto, California, USA
Re: brainstorming Strings extensions
Josef Templ wrote:
If String (SString) is introduced then the question is if there are any meaningful
operations on it.
I frequently return a string from a function.
-
- Posts: 1700
- Joined: Tue Sep 17, 2013 12:21 am
- Location: Russia
Re: brainstorming Strings extensions
There is a Strings module by Ivan Goryachev:
https://blackbox.obertone.ru/extension/Strings
Code: Select all
DEFINITION StringsXml;
  CONST
    dataSize = 64;
    strSize = 256;
  TYPE
    String = POINTER TO LIMITED RECORD
      len-: INTEGER
    END;
  VAR
    null-: StrPtr;
  PROCEDURE AppendChar (s: String; ch: CHAR);
  PROCEDURE AppendStr (s: String; IN str: ARRAY OF CHAR);
  PROCEDURE CloneStr (s: String): StrPtr;
  PROCEDURE Create (IN str: ARRAY OF CHAR): String;
  PROCEDURE ExtractStr (s: String; start, len: INTEGER): StrPtr;
  PROCEDURE GetChar (s: String; pos: INTEGER): CHAR;
  PROCEDURE GetStr (s: String; VAR str: ARRAY OF CHAR);
  PROCEDURE SetLength (s: String; len: INTEGER);
END StringsXml.
and its object-oriented wrapper:
Code: Select all
DEFINITION StringsDyn;
  TYPE
    DynString = POINTER TO RECORD
      (ds: DynString) AddChar (ch: CHAR), NEW;
      (ds: DynString) AddString (str: ARRAY OF CHAR), NEW;
      (ds: DynString) Char (idx: INTEGER): CHAR, NEW;
      (ds: DynString) Clear, NEW;
      (ds: DynString) Length (): INTEGER, NEW;
      (ds: DynString) PartAsString (from, to: INTEGER): String, NEW;
      (ds: DynString) SetLength (len: INTEGER), NEW;
      (ds: DynString) String (): String, NEW
    END;
    String = POINTER TO ARRAY OF CHAR;
  PROCEDURE Create (str: ARRAY OF CHAR): DynString;
END StringsDyn.
It is good to keep the original Strings module as simple as possible, and to develop this or a similar Strings extension with all your experience.
- Josef Templ
- Posts: 2047
- Joined: Tue Sep 17, 2013 6:50 am
Re: brainstorming Strings extensions
In my Xml subsystem I also use a similar module for dynamically growing strings.
I called it XmlDStrings. It has dynamically growing strings, string pools, and a string splitter.
All of this is certainly not appropriate for the base module Strings but nevertheless
I agree that it is very general purpose. The foundation is actually coming from ETH Oberon (Aos, A2)
but it needed to be adapted to BlackBox. I am trying to gather some experience in using it in
the Xml subsystem.
The extensions, if any, in Strings must be more basic.
Working on 'ARRAY OF CHAR', not objects.
My current favorites are:
TYPEs String and SString, with NewString/NewSString, and Substring; very cheap.
PROCEDUREs Trim, TrimLeft, TrimRight on ARRAY OF CHAR; very common in all other String libraries in the world.
PROCEDUREs StartsWith, EndsWith on ARRAY OF CHAR; very common in all other String libraries in the world.
PROCEDURE Match for wildcard comparison; because there is a very compact and efficient solution that is
extremely hard to find out yourself.
Something like LastIndexOf (for searching backwards) would be useful for extracting path and file names, for example,
but under Windows there are two separator characters (/ and \), which makes it complicated.
- Josef
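To make the favorites above more concrete, here is a minimal sketch of StartsWith and EndsWith on ARRAY OF CHAR, plus one possible way around the two-separator problem: a dedicated backward search that accepts both "/" and "\". The module name PrefixSketch, the procedure name LastPathSeparator, and the parameter names are assumptions for illustration only.

```componentpascal
MODULE PrefixSketch; (* hypothetical module name *)

  (* TRUE if s begins with prefix; the empty prefix matches every string. *)
  PROCEDURE StartsWith* (IN s, prefix: ARRAY OF CHAR): BOOLEAN;
    VAR i: INTEGER;
  BEGIN
    i := 0;
    WHILE (prefix[i] # 0X) & (s[i] = prefix[i]) DO INC(i) END;
    RETURN prefix[i] = 0X
  END StartsWith;

  (* TRUE if s ends with suffix; the empty suffix matches every string. *)
  PROCEDURE EndsWith* (IN s, suffix: ARRAY OF CHAR): BOOLEAN;
    VAR i, d: INTEGER;
  BEGIN
    d := LEN(s$) - LEN(suffix$); (* start of the candidate suffix in s *)
    IF d < 0 THEN RETURN FALSE END;
    i := 0;
    WHILE (suffix[i] # 0X) & (s[d + i] = suffix[i]) DO INC(i) END;
    RETURN suffix[i] = 0X
  END EndsWith;

  (* Index of the last "/" or "\" in path, or -1 if there is none (hypothetical helper). *)
  PROCEDURE LastPathSeparator* (IN path: ARRAY OF CHAR): INTEGER;
    VAR i: INTEGER;
  BEGIN
    i := LEN(path$) - 1;
    WHILE (i >= 0) & (path[i] # "/") & (path[i] # "\") DO DEC(i) END;
    RETURN i
  END LastPathSeparator;

END PrefixSketch.
```

A more general alternative would be a LastIndexOf variant that takes a set of characters, at the cost of a less obvious signature.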
- Josef Templ
- Posts: 2047
- Joined: Tue Sep 17, 2013 6:50 am
Re: brainstorming Strings extensions
I just read in an Oracle newsletter about new string functions in Java.
They seem to be required because the existing functions are not
defined precisely enough for all Unicode characters.
http://app.response.oracle-mail.com/e/e ... 14&elqat=1.
New Methods on Java String with JDK 11
It appears likely that Java's String class will be gaining some new methods with JDK 11, expected to be released in September 2018.
Each entry gives the bug number and bug title, followed by the new String method(s) and their description:
JDK-8200425, String::lines
  lines(): "String instance method that uses a specialized Spliterator to lazily provide lines from the source string."
JDK-8200378, String::strip, String::stripLeading, String::stripTrailing
  strip(): "Unicode-aware" evolution of trim()
  stripLeading(): "removal of Unicode white space from the beginning"
  stripTrailing(): "removal of Unicode white space from the end"
JDK-8200437, String::isBlank
  isBlank(): "instance method that returns true if the string is empty or contains only white space"