issue-#75 fixing text search with 'Word Begins/Ends With'

Posted: Fri Oct 09, 2015 10:05 am
by Josef Templ
This issue is about several fixes for text search with the option 'Word Begins/Ends With' specified.
For the issue see http://redmine.blackboxframework.org/issues/75.

A proposal exists in CPC 1.7 rc5, but it addresses only the handling of non-ASCII characters, and in a questionable way.
It simply reverses the previous handling of non-ASCII characters, so that they are now treated as non-terminators, which results in finding fewer occurrences than before. A precise solution, in my opinion, should take into account whether Strings.IsIdent holds for a character.

The proposal also ignores the anomaly described in the issue:
"searching for 'pattern' in the text ')pattern' does not find it."
This anomaly results from using different rules for left and right terminators.
')' is not a left terminator but only a right terminator.
To me, this separation does not make any sense.
Is there any reason for having different left and right terminators?

A possible (untested) replacement for LeftTerminator and RightTerminator may be:

Code: Select all

PROCEDURE IsTerminator (ch: CHAR): BOOLEAN;
BEGIN RETURN ~Strings.IsIdent(ch)
END IsTerminator;
- Josef
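
Note: a minimal sketch of how one predicate could serve both options. It works on a plain ARRAY OF CHAR rather than on the text reader used by the real search command, and the names SketchTerminators, WordBegins and WordEnds are hypothetical. Only Strings.IsIdent is taken from the post above. Like the proposal itself, it is untested.

Code: Select all

MODULE SketchTerminators;	(* hypothetical module, for illustration only *)

	IMPORT Strings;

	PROCEDURE IsTerminator (ch: CHAR): BOOLEAN;
	BEGIN
		RETURN ~Strings.IsIdent(ch)
	END IsTerminator;

	(* 'Word Begins With': the match starts at index pos of text *)
	PROCEDURE WordBegins* (IN text: ARRAY OF CHAR; pos: INTEGER): BOOLEAN;
	BEGIN
		(* assumption: the start of the text counts as a word boundary *)
		RETURN (pos = 0) OR IsTerminator(text[pos - 1])
	END WordBegins;

	(* 'Word Ends With': next is the index just after the last matched character *)
	PROCEDURE WordEnds* (IN text: ARRAY OF CHAR; next: INTEGER): BOOLEAN;
	BEGIN
		(* assumption: text is 0X-terminated and 0X is not an identifier character *)
		RETURN IsTerminator(text[next])
	END WordEnds;

END SketchTerminators.

With the text ')pattern' and pos = 1, WordBegins looks at ')', which is not an identifier character, so the occurrence would be accepted.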

Re: issue-#75 fixing text search with 'Word Begins/Ends With'

Posted: Fri Oct 09, 2015 10:50 am
by Ivan Denisov
I think we have here one more example of a side effect of a redundant optimisation.

We can simply remove this condition from LeftTerminator and RightTerminator:

Code: Select all

IF ch < 100X THEN
Then everything will work fine for Unicode.

In my assembly I am using:

Code: Select all

	PROCEDURE LeftTerminator (ch: CHAR): BOOLEAN;
	BEGIN
		CASE ch OF
			viewcode, tab, line, para, " ",
			"(", "[", "{", "=",
			hyphen, softhyphen: RETURN TRUE
		ELSE
			RETURN FALSE
		END
	END LeftTerminator;

	PROCEDURE RightTerminator (ch: CHAR): BOOLEAN;
	BEGIN
		CASE ch OF
			0X, viewcode, tab, line, para, " ",
			"!", "(", ")", ",", ".", ":", ";", "?", "[", "]", "{", "}",
			hyphen, softhyphen: RETURN TRUE
		ELSE
			RETURN FALSE
		END
	END RightTerminator;

Re: issue-#75 fixing text search with 'Word Begins/Ends With'

Posted: Fri Oct 09, 2015 11:09 am
by Josef Templ
Ivan, your comment does not answer the question why we need different left and right terminators.
Unless there is any reason for that it should be removed. The example I have shown should be convincing.

What side effect do you mean and are you talking about my proposal or about yours?

- Josef

Re: issue-#75 fixing text search with 'Word Begins/Ends With'

Posted: Fri Oct 09, 2015 11:48 am
by Ivan Denisov
Josef Templ wrote:Ivan, your comment does not answer the question why we need different left and right terminators.
Unless there is any reason for that it should be removed. The example I have shown should be convincing.

What side effect do you mean and are you talking about my proposal or about yours?
For example, if you need to find the module Strings in this code, you can use 'Word Begins With' and the pattern String.

Code: Select all

Log.String

Strings
The dot here is not a left terminator, so Log.String will not match the pattern.

You cannot remove "." from the set of terminators, because that would break the logic of searching in texts with sentences.
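
Note: a small sketch of the check behind this example. LeftTerminator below is a reduced copy of the left set posted earlier in the thread, with the control-character and hyphen constants left out. The module name SketchLeftDot and the procedure Demo are hypothetical.

Code: Select all

MODULE SketchLeftDot;	(* hypothetical module, for illustration only *)

	(* reduced copy of the left terminator set; control characters and hyphens are omitted *)
	PROCEDURE LeftTerminator (ch: CHAR): BOOLEAN;
	BEGIN
		CASE ch OF
			" ", "(", "[", "{", "=": RETURN TRUE
		ELSE
			RETURN FALSE
		END
	END LeftTerminator;

	PROCEDURE Demo* (OUT afterDot, afterBlank: BOOLEAN);
	BEGIN
		afterDot := LeftTerminator(".");	(* character before "String" in "Log.String": FALSE *)
		afterBlank := LeftTerminator(" ")	(* a blank before "Strings": TRUE *)
	END Demo;

END SketchLeftDot.

Here Strings is assumed to follow a blank. In the posted example it follows a line break, which the full set also treats as a left terminator.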

Re: issue-#75 fixing text search with 'Word Begins/Ends With'

Posted: Fri Oct 09, 2015 11:57 am
by Josef Templ
Sorry Ivan, but your argument is wrong.
I do not remove "." from the set of terminators; I add ")" to the set of left terminators.

- Josef

Re: issue-#75 fixing text search with 'Word Begins/Ends With'

Posted: Fri Oct 09, 2015 12:32 pm
by Ivan Denisov
I will try to be more clear. Dot should be left terminator, but should be right terminator. So it is impossible to joint this in one procedure.

Re: issue-#75 fixing text search with 'Word Begins/Ends With'

Posted: Fri Oct 09, 2015 12:37 pm
by Josef Templ
Ivan Denisov wrote:I will try to be more clear. Dot should be left terminator, but should be right terminator. So it is impossible to joint this in one procedure.
It is still not clear. Please look at the wording of your comment.
Do you mean "should NOT be left terminator" ?

Please clarify.

- Josef

Re: issue-#75 fixing text search with 'Word Begins/Ends With'

Posted: Fri Oct 09, 2015 12:41 pm
by Ivan Denisov
Josef Templ wrote:
Ivan Denisov wrote:I will try to be more clear. Dot should be left terminator, but should be right terminator. So it is impossible to joint this in one procedure.
It is still not clear. Please look at the wording of your comment.
Do you mean "should NOT be left terminator" ?

Please clarify.

- Josef
Right.

The dot should NOT be a left terminator, but it should be a right terminator.

But I am not against adding ")" to the set of left terminators.

Re: issue-#75 fixing text search with 'Word Begins/Ends With'

Posted: Fri Oct 09, 2015 4:29 pm
by Josef Templ
The dot should of course also be a left terminator.
Text search is not only for CP programs, and even
for those it does not make much sense.
Is this strange behavior documented somewhere?

What we need is a simple rule that also works well for CP texts.
That's why I would use IsIdent.

This corresponds nicely with the double-click behavior, which does a "word"-selection.
Otherwise the meaning of 'word' is defined in different ways depending on the command.

Ivan, your proposal also has a side effect that you may not be aware of.
The terminator behavior of non-ASCII characters is different from the BB1.6 version.
This can lead to big surprises because many non-ASCII characters must be treated
as terminators (graphics, arrows, smileys, etc.).
The IsAlpha property captures this distinction, and IsIdent uses it as well.

- Josef

Re: issue-#75 fixing text search with 'Word Begins/Ends With'

Posted: Fri Oct 09, 2015 8:31 pm
by Josef Templ
Here is another interesting observation:

Search the text 'a..b' for 'b' and choose the option 'Word Begins With'.
You don't find it because it is preceded by a dot.

- Josef
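
Note: a sketch of a 'Word Begins With' search under the IsIdent rule, for comparison with the observation above. It searches a plain ARRAY OF CHAR instead of a text model. The names SketchFind and FindWordBegin are hypothetical, and Strings.Find is assumed to set pos to -1 when nothing is found. It is untested.

Code: Select all

MODULE SketchFind;	(* hypothetical module, for illustration only *)

	IMPORT Strings;

	(* position of the first occurrence of pat in text that begins a word, or -1 *)
	PROCEDURE FindWordBegin* (IN text, pat: ARRAY OF CHAR): INTEGER;
		VAR pos: INTEGER;
	BEGIN
		Strings.Find(text, pat, 0, pos);
		(* skip occurrences that are preceded by an identifier character *)
		WHILE (pos > 0) & Strings.IsIdent(text[pos - 1]) DO
			Strings.Find(text, pat, pos + 1, pos)
		END;
		RETURN pos
	END FindWordBegin;

END SketchFind.

For the text 'a..b' and the pattern 'b', the character before the occurrence is '.', which is not an identifier character, so the loop stops at that occurrence instead of skipping it.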