issue-#75 fixing text search with 'Word Begins/Ends With'

Josef Templ
Posts: 2047
Joined: Tue Sep 17, 2013 6:50 am

Post by Josef Templ »

This issue is about several fixes for text search with the option 'Word Begins/Ends With' specified.
For the issue see http://redmine.blackboxframework.org/issues/75.

A proposal exists in CPC 1.7 rc5, but it addresses only the handling of non-ASCII characters, and in a questionable way.
It simply reverses the previous handling of non-ASCII characters as non-terminators, which results in finding fewer occurrences than before. A precise solution, in my opinion, should take into account whether Strings.IsIdent holds for a character.

The proposal also ignores the anomaly described in the issue:
"searching for 'pattern' in the text ')pattern' does not find it."
This anomaly results from using different rules for left and right terminators.
')' is not a left terminator but only a right terminator.
To me this separation does not make any sense.
Is there any reason for having different left and right terminators?

A possible (untested) replacement for LeftTerminator and RightTerminator may be:

Code:

PROCEDURE IsTerminator (ch: CHAR): BOOLEAN;
BEGIN
	(* a character terminates a word if it cannot be part of an identifier *)
	RETURN ~Strings.IsIdent(ch)
END IsTerminator;

- Josef
Last edited by Josef Templ on Wed Oct 14, 2015 2:52 pm, edited 1 time in total.
Ivan Denisov
Posts: 1700
Joined: Tue Sep 17, 2013 12:21 am
Location: Russia

Post by Ivan Denisov »

I think this is one more example of a side effect of a redundant optimisation.

We can simply remove this condition from LeftTerminator and RightTerminator:

Code:

IF ch < 100X THEN

Then everything will work fine for Unicode.

In my build I am using:

Code:

	PROCEDURE LeftTerminator (ch: CHAR): BOOLEAN;
	BEGIN
		CASE ch OF
			viewcode, tab, line, para, " ",
			"(", "[", "{", "=",
			hyphen, softhyphen: RETURN TRUE
		ELSE
			RETURN FALSE
		END
	END LeftTerminator;

	PROCEDURE RightTerminator (ch: CHAR): BOOLEAN;
	BEGIN
		CASE ch OF
			0X, viewcode, tab, line, para, " ",
			"!", "(", ")", ",", ".", ":", ";", "?", "[", "]", "{", "}",
			hyphen, softhyphen: RETURN TRUE
		ELSE
			RETURN FALSE			
		END
	END RightTerminator;
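
The effect of keeping two different sets can be tried out with a small Python model of the CASE labels above. This is only a sketch, not BlackBox code: the rule "a word begins at the start of the text or after a left terminator" is an assumption about how the search uses these procedures, the helper names are hypothetical, and the BlackBox-specific characters (viewcode, line, para, hyphen, softhyphen) are approximated or left out because their codes are not given in this thread.

```python
# Printable characters are copied from the CASE labels above.
# "\t" and "\n" stand in for the control codes (tab, line, para);
# viewcode, hyphen and softhyphen are omitted (codes not given here).
LEFT_TERMINATORS = set("\t\n ([{=")
RIGHT_TERMINATORS = set("\0\t\n !(),.:;?[]{}")


def word_begins_at(text: str, pos: int) -> bool:
    """Assumed rule: start of text, or preceded by a left terminator."""
    return pos == 0 or text[pos - 1] in LEFT_TERMINATORS


def word_ends_at(text: str, end: int) -> bool:
    """Assumed rule: end of text, or followed by a right terminator."""
    return end == len(text) or text[end] in RIGHT_TERMINATORS


def find_word_begins(text: str, pattern: str) -> list:
    """Positions where pattern occurs and a word begins (hypothetical helper)."""
    hits = []
    pos = text.find(pattern)
    while pos != -1:
        if word_begins_at(text, pos):
            hits.append(pos)
        pos = text.find(pattern, pos + 1)
    return hits


print(find_word_begins(")pattern", "pattern"))            # [] since ")" is only a right terminator
print(find_word_begins("(pattern", "pattern"))            # [1] since "(" is a left terminator
print(find_word_begins("Log.String Strings", "String"))   # [11] since "." is only a right terminator
print(word_ends_at("pattern)", 7))                        # True since ")" is a right terminator
```

In this model 'pattern' is found after "(" but not after ")", and only the module name Strings is reported for the pattern String, not Log.String.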

Post by Josef Templ »

Ivan, your comment does not answer the question why we need different left and right terminators.
Unless there is any reason for that it should be removed. The example I have shown should be convincing.

What side effect do you mean and are you talking about my proposal or about yours?

- Josef

Post by Ivan Denisov »

Josef Templ wrote:
Ivan, your comment does not answer the question why we need different left and right terminators.
Unless there is any reason for that it should be removed. The example I have shown should be convincing.

What side effect do you mean and are you talking about my proposal or about yours?

For example, to find the module name Strings in the following code, you can search for the pattern String with the option 'Word Begins With'.

Code:

Log.String

Strings

The dot here is not a left terminator, so Log.String does not match the pattern.

You cannot remove "." from the set of terminators, because that would break the logic of searching in texts that contain sentences.

Post by Josef Templ »

Sorry Ivan, but your argument is wrong.
I do not remove "." from the set of terminators, but I do add ")" to the set of left terminators.

- Josef

Post by Ivan Denisov »

I will try to be more clear. Dot should be left terminator, but should be right terminator. So it is impossible to joint this in one procedure.

Post by Josef Templ »

Ivan Denisov wrote:
I will try to be more clear. Dot should be left terminator, but should be right terminator. So it is impossible to joint this in one procedure.

It is still not clear. Please look at the wording of your comment.
Do you mean "should NOT be left terminator"?

Please clarify.

- Josef

Post by Ivan Denisov »

Josef Templ wrote:
Ivan Denisov wrote:
I will try to be more clear. Dot should be left terminator, but should be right terminator. So it is impossible to joint this in one procedure.

It is still not clear. Please look at the wording of your comment.
Do you mean "should NOT be left terminator" ?

Please clarify.

- Josef

Right.

The dot should NOT be a left terminator, but it should be a right terminator.

But I am not against adding ")" to the set of left terminators.

Post by Josef Templ »

The dot should of course also be a left terminator.
Text search is not only for CP programs, and even
for those the current behavior does not make much sense.
Is this strange behavior documented somewhere?

What we need is a simple rule that also works well for CP texts.
That's why I would use IsIdent.

This corresponds nicely with the double-click behavior, which does a "word"-selection.
Otherwise the meaning of 'word' is defined in different ways depending on the command.

Ivan, your proposal also has a side effect which you may not be aware of.
The terminator behavior of non-ASCII characters is different from the BB1.6 version.
This can lead to big surprises because many non-ASCII characters must be treated
as terminators (graphics, arrows, smileys, etc.).
The IsAlpha property defines that and that is also used in IsIdent.
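
A single IsIdent-based rule can be sketched in Python as follows. This is a model under stated assumptions, not the BlackBox implementation: `is_ident` is a hypothetical stand-in that approximates Strings.IsIdent with Python's Unicode-aware `str.isalnum()` plus the underscore, and the word-begin rule is the same assumed rule for every character.

```python
def is_ident(ch: str) -> bool:
    """Rough stand-in for Strings.IsIdent: letter, digit or underscore (assumption)."""
    return ch.isalnum() or ch == "_"


def is_terminator(ch: str) -> bool:
    """One rule for both sides: anything that cannot be part of an identifier."""
    return not is_ident(ch)


def find_word_begins(text: str, pattern: str) -> list:
    """Positions where pattern occurs at the start of the text or after a terminator."""
    hits = []
    pos = text.find(pattern)
    while pos != -1:
        if pos == 0 or is_terminator(text[pos - 1]):
            hits.append(pos)
        pos = text.find(pattern, pos + 1)
    return hits


print(find_word_begins(")pattern", "pattern"))            # [1] since ")" is a terminator
print(find_word_begins("a..b", "b"))                      # [3] since "." is a terminator
print(find_word_begins("\u2192word", "word"))             # [1] since the arrow is not a letter
print(find_word_begins("\u00e9word", "word"))             # [] since "é" is a letter
print(find_word_begins("Log.String Strings", "String"))   # [4, 11]
```

Under this rule an arrow terminates a word while an accented letter does not, and Log.String is also reported for the pattern String, since the dot then terminates on both sides.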

- Josef

Post by Josef Templ »

Here is another interesting observation:

Search the text 'a..b' for 'b' with the option 'Word Begins With'.
It is not found because it is preceded by a dot.

- Josef