brainstorming Strings extensions

Josef Templ
Posts: 2047
Joined: Tue Sep 17, 2013 6:50 am

Post by Josef Templ »

The BlackBox Strings module is rather minimal as it is now and there has been some discussion
about possible extensions from time to time.
This topic is intended for collecting ideas and proposals for possible extensions
and to discuss the pros and cons of such extensions.
It is not the goal to do any such extension immediately but rather to trigger some discussion.

- Josef
Zinn
Posts: 476
Joined: Tue Mar 25, 2014 5:56 pm
Location: Frankfurt am Main

Re: brainstorming Strings extensions

Post by Zinn »

We have already added some extensions:

	PROCEDURE IsAlpha (ch: CHAR): BOOLEAN;
	PROCEDURE IsAlphaNumeric (ch: CHAR): BOOLEAN;
	PROCEDURE IsIdent (ch: CHAR): BOOLEAN;
	PROCEDURE IsIdentStart (ch: CHAR): BOOLEAN;
	PROCEDURE IsLower (ch: CHAR): BOOLEAN;
	PROCEDURE IsNumeric (ch: CHAR): BOOLEAN;
	PROCEDURE IsUpper (ch: CHAR): BOOLEAN;

	PROCEDURE SetToString (x: SET; OUT str: ARRAY OF CHAR);
	PROCEDURE StringToSet (IN s: ARRAY OF CHAR; OUT x: SET; OUT res: INTEGER);

	PROCEDURE StringToUtf8 (IN in: ARRAY OF CHAR; OUT out: ARRAY OF SHORTCHAR; OUT res: INTEGER);
	PROCEDURE Utf8ToString (IN in: ARRAY OF SHORTCHAR; OUT out: ARRAY OF CHAR; OUT res: INTEGER);

and we should keep further extensions to a minimum. Currently I miss the following:

	PROCEDURE IndexOf (IN s: ARRAY OF CHAR; c: CHAR): INTEGER;
	PROCEDURE LastIndexOf (IN s: ARRAY OF CHAR; c: CHAR): INTEGER;

	PROCEDURE Trim (VAR s: ARRAY OF CHAR);
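
A minimal sketch of how these three procedures might look, assuming 0X-terminated strings, a result of -1 when the character is not found, and "white space" meaning any character up to and including the blank. The module name StringsSketch is hypothetical, and none of this is taken from an existing BlackBox module.

```componentpascal
MODULE StringsSketch;
	(* hypothetical module; a sketch only, not part of BlackBox *)

	(* index of the first occurrence of c in s, or -1 if there is none (assumed convention) *)
	PROCEDURE IndexOf* (IN s: ARRAY OF CHAR; c: CHAR): INTEGER;
		VAR i: INTEGER;
	BEGIN
		i := 0;
		WHILE (s[i] # 0X) & (s[i] # c) DO INC(i) END;
		IF s[i] = 0X THEN i := -1 END;
		RETURN i
	END IndexOf;

	(* index of the last occurrence of c in s, or -1 if there is none *)
	PROCEDURE LastIndexOf* (IN s: ARRAY OF CHAR; c: CHAR): INTEGER;
		VAR i, last: INTEGER;
	BEGIN
		i := 0; last := -1;
		WHILE s[i] # 0X DO
			IF s[i] = c THEN last := i END;
			INC(i)
		END;
		RETURN last
	END LastIndexOf;

	(* removes leading and trailing characters <= " " in place *)
	PROCEDURE Trim* (VAR s: ARRAY OF CHAR);
		VAR first, last, i: INTEGER;
	BEGIN
		first := 0;
		WHILE (s[first] # 0X) & (s[first] <= " ") DO INC(first) END;
		last := first;
		WHILE s[last] # 0X DO INC(last) END;	(* last is now the index of the terminator *)
		WHILE (last > first) & (s[last - 1] <= " ") DO DEC(last) END;
		i := 0;
		WHILE first < last DO s[i] := s[first]; INC(i); INC(first) END;
		s[i] := 0X
	END Trim;

END StringsSketch.
```

With these definitions, IndexOf("a.b.c", ".") yields 1, LastIndexOf("a.b.c", ".") yields 3, and Trim turns "  ab  " into "ab".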
- Helmut
Josef Templ
Posts: 2047
Joined: Tue Sep 17, 2013 6:50 am

Re: brainstorming Strings extensions

Post by Josef Templ »

I had a look at the latest ETH Oberon A2 version of the Strings module.
(see https://trac.inf.ethz.ch/trac/lecturers ... trings.Mod).

It also has Trim, IndexOf, and LastIndexOf operations.
In addition, it has TrimLeft, TrimRight and much more.

It has, for example, a very nice and small "Match" function that can do string comparison
with wildcards "*" (a sequence of arbitrary characters) and "?" (a single arbitrary character).
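
To illustrate the idea, here is a minimal sketch of wildcard matching with "*" and "?". It is not the A2 code; it uses the common technique of remembering the last "*" and backtracking to it on a mismatch. The module name, procedure name and parameter order are assumptions.

```componentpascal
MODULE MatchSketch;
	(* hypothetical module; not the A2 implementation *)

	(* TRUE if s matches pat, where "*" stands for any sequence of characters
		(including the empty one) and "?" for exactly one character *)
	PROCEDURE Match* (IN pat, s: ARRAY OF CHAR): BOOLEAN;
		VAR p, i, starP, starI: INTEGER;
	BEGIN
		p := 0; i := 0; starP := -1; starI := 0;
		WHILE s[i] # 0X DO
			IF pat[p] = "*" THEN
				(* remember the star; first try to let it match nothing *)
				starP := p; starI := i; INC(p)
			ELSIF (pat[p] = "?") OR (pat[p] = s[i]) THEN
				INC(p); INC(i)
			ELSIF starP >= 0 THEN
				(* mismatch: let the last star absorb one more character *)
				p := starP + 1; INC(starI); i := starI
			ELSE
				RETURN FALSE
			END
		END;
		(* s is exhausted; only trailing stars may remain in the pattern *)
		WHILE pat[p] = "*" DO INC(p) END;
		RETURN pat[p] = 0X
	END Match;

END MatchSketch.
```

For example, Match("*.Mod", "Strings.Mod") and Match("a?c", "abc") are TRUE, while Match("a?c", "ac") is FALSE.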

In addition to Strings, it has a module named DynamicStrings which provides
dynamically growing string buffers and hash table based string pools.
DynamicStrings is used e.g. for XML and HTML parsing.

- Josef
DGDanforth
Posts: 1061
Joined: Tue Sep 17, 2013 1:16 am
Location: Palo Alto, California, USA

Re: brainstorming Strings extensions

Post by DGDanforth »

TYPE
SString = POINTER TO ARRAY OF SHORTCHAR;
String = POINTER TO ARRAY OF CHAR;
PROCEDURE New (IN x: ARRAY OF CHAR): String;
PROCEDURE SNew (IN x: ARRAY OF SHORTCHAR): SString;
Josef Templ
Posts: 2047
Joined: Tue Sep 17, 2013 6:50 am

Re: brainstorming Strings extensions

Post by Josef Templ »

DGDanforth wrote:
TYPE
SString = POINTER TO ARRAY OF SHORTCHAR;
String = POINTER TO ARRAY OF CHAR;
PROCEDURE New (IN x: ARRAY OF CHAR): String;
PROCEDURE SNew (IN x: ARRAY OF SHORTCHAR): SString;
I agree. The types Strings.String and Strings.SString would be good candidates.
The constructor function is named 'NewString' in A2 ('New' + type name).
This naming convention would also give a good name for the SString
constructor (NewSString).

If String (SString) is introduced then the question is if there are any meaningful
operations on it. The only one that comes to my mind is a substring function like

PROCEDURE Substring* (IN s: ARRAY OF CHAR; offset, len: INTEGER): String;

similar to Strings.Extract but returning a String object.
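
A minimal sketch of what the constructor and Substring might look like. The module name is hypothetical, and clamping offset and len to the actual string length (instead of trapping) is an assumption.

```componentpascal
MODULE StringObjSketch;
	(* hypothetical module; a sketch only *)

	TYPE
		String* = POINTER TO ARRAY OF CHAR;

	(* allocates a heap copy of x, including the terminating 0X *)
	PROCEDURE NewString* (IN x: ARRAY OF CHAR): String;
		VAR s: String;
	BEGIN
		NEW(s, LEN(x$) + 1);
		s^ := x$;
		RETURN s
	END NewString;

	(* returns up to len characters of s starting at offset as a new String;
		out-of-range values are clamped to the string length (assumption) *)
	PROCEDURE Substring* (IN s: ARRAY OF CHAR; offset, len: INTEGER): String;
		VAR res: String; n, i: INTEGER;
	BEGIN
		ASSERT(offset >= 0, 20); ASSERT(len >= 0, 21);
		n := LEN(s$);
		IF offset > n THEN offset := n END;
		IF len > n - offset THEN len := n - offset END;
		NEW(res, len + 1);
		i := 0;
		WHILE i < len DO res[i] := s[offset + i]; INC(i) END;
		res[len] := 0X;
		RETURN res
	END Substring;

END StringObjSketch.
```

With this sketch, Substring("BlackBox", 5, 3) returns a String containing "Box".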

- Josef
DGDanforth
Posts: 1061
Joined: Tue Sep 17, 2013 1:16 am
Location: Palo Alto, California, USA

Re: brainstorming Strings extensions

Post by DGDanforth »

Josef Templ wrote:
If String (SString) is introduced then the question is if there are any meaningful
operations on it.
I frequently return a string from a function.
Ivan Denisov
Posts: 1700
Joined: Tue Sep 17, 2013 12:21 am
Location: Russia

Re: brainstorming Strings extensions

Post by Ivan Denisov »

There is a Strings module by Ivan Goryachev.

https://blackbox.obertone.ru/extension/Strings

DEFINITION StringsXml;

	CONST
		dataSize = 64;
		strSize = 256;

	TYPE
		String = POINTER TO LIMITED RECORD 
			len-: INTEGER
		END;

	VAR
		null-: StrPtr;

	PROCEDURE AppendChar (s: String; ch: CHAR);
	PROCEDURE AppendStr (s: String; IN str: ARRAY OF CHAR);
	PROCEDURE CloneStr (s: String): StrPtr;
	PROCEDURE Create (IN str: ARRAY OF CHAR): String;
	PROCEDURE ExtractStr (s: String; start, len: INTEGER): StrPtr;
	PROCEDURE GetChar (s: String; pos: INTEGER): CHAR;
	PROCEDURE GetStr (s: String; VAR str: ARRAY OF CHAR);
	PROCEDURE SetLength (s: String; len: INTEGER);

END StringsXml.

and its object-oriented wrapper:

DEFINITION StringsDyn;

	TYPE
		DynString = POINTER TO RECORD 
			(ds: DynString) AddChar (ch: CHAR), NEW;
			(ds: DynString) AddString (str: ARRAY OF CHAR), NEW;
			(ds: DynString) Char (idx: INTEGER): CHAR, NEW;
			(ds: DynString) Clear, NEW;
			(ds: DynString) Length (): INTEGER, NEW;
			(ds: DynString) PartAsString (from, to: INTEGER): String, NEW;
			(ds: DynString) SetLength (len: INTEGER), NEW;
			(ds: DynString) String (): String, NEW
		END;

		String = POINTER TO ARRAY OF CHAR;

	PROCEDURE Create (str: ARRAY OF CHAR): DynString;

END StringsDyn.

It is good to keep the original Strings module as simple as possible, and instead to develop this or a similar Strings extension drawing on all of your experience.
Josef Templ
Posts: 2047
Joined: Tue Sep 17, 2013 6:50 am

Re: brainstorming Strings extensions

Post by Josef Templ »

In my Xml subsystem I also use a similar module for dynamically growing strings.
I called it XmlDStrings. It has dynamically growing strings, string pools, and a string splitter.

All of this is certainly not appropriate for the base module Strings, but nevertheless
I agree that it is very general-purpose. The foundation actually comes from ETH Oberon (Aos, A2),
but it needed to be adapted to BlackBox. I am trying to gather some experience by using it in
the Xml subsystem.

Any extensions to Strings itself must be more basic,
working on 'ARRAY OF CHAR', not on objects.

My current favorites are:
TYPEs String and SString, with NewString/NewSString, and Substring; very cheap.
PROCEDUREs Trim, TrimLeft, TrimRight on ARRAY OF CHAR; very common in all other String libraries in the world.
PROCEDUREs StartsWith, EndsWith on ARRAY OF CHAR; very common in all other String libraries in the world.
PROCEDURE Match for wildcard comparison; because there is a very compact and efficient solution that is
extremely hard to work out yourself.
Something like LastIndexOf (for searching backwards) would be useful for extracting path and file names, for example,
but under Windows there are two separator characters (/ and \), which makes it complicated.
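
A minimal sketch of StartsWith and EndsWith, plus one possible way around the two-separator problem: a backward search that accepts any character from a set. The name LastIndexOfAny and the module name are hypothetical; nothing here is taken from an existing library.

```componentpascal
MODULE PrefixSketch;
	(* hypothetical module; a sketch only *)

	PROCEDURE StartsWith* (IN s, prefix: ARRAY OF CHAR): BOOLEAN;
		VAR i: INTEGER;
	BEGIN
		i := 0;
		WHILE (prefix[i] # 0X) & (s[i] = prefix[i]) DO INC(i) END;
		RETURN prefix[i] = 0X
	END StartsWith;

	PROCEDURE EndsWith* (IN s, suffix: ARRAY OF CHAR): BOOLEAN;
		VAR i, d: INTEGER;
	BEGIN
		d := LEN(s$) - LEN(suffix$);	(* start of the candidate suffix in s *)
		IF d < 0 THEN RETURN FALSE END;
		i := 0;
		WHILE (suffix[i] # 0X) & (s[d + i] = suffix[i]) DO INC(i) END;
		RETURN suffix[i] = 0X
	END EndsWith;

	(* index of the last character of s that occurs in chars, or -1 if there is none *)
	PROCEDURE LastIndexOfAny* (IN s, chars: ARRAY OF CHAR): INTEGER;
		VAR i, j, last: INTEGER;
	BEGIN
		i := 0; last := -1;
		WHILE s[i] # 0X DO
			j := 0;
			WHILE (chars[j] # 0X) & (chars[j] # s[i]) DO INC(j) END;
			IF chars[j] # 0X THEN last := i END;
			INC(i)
		END;
		RETURN last
	END LastIndexOfAny;

END PrefixSketch.
```

For example, LastIndexOfAny("C:\dir/file.txt", "/\") yields 6, the position of the last separator regardless of which of the two characters it is.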

- Josef
Josef Templ
Posts: 2047
Joined: Tue Sep 17, 2013 6:50 am

Re: brainstorming Strings extensions

Post by Josef Templ »

I just read in an Oracle newsletter about new string functions in Java.
They seem to be required because the existing functions are not
defined precisely enough for all Unicode characters.

http://app.response.oracle-mail.com/e/e ... 14&elqat=1.


New Methods on Java String with JDK 11
It appears likely that Java's String class will be gaining some new methods with JDK 11, expected to be released in September 2018.

BUG # | BUG TITLE | NEW String METHOD | DESCRIPTION
JDK-8200425 | String::lines | lines() | "String instance method that uses a specialized Spliterator to lazily provide lines from the source string."
JDK-8200378 | String::strip, String::stripLeading, String::stripTrailing | strip() | "Unicode-aware" evolution of trim()
JDK-8200378 | (as above) | stripLeading() | "removal of Unicode white space from the beginning"
JDK-8200378 | (as above) | stripTrailing() | "removal of Unicode white space from the end"
JDK-8200437 | String::isBlank | isBlank() | "instance method that returns true if the string is empty or contains only white space"