issue-#75 fixing text search with 'Word Begins/Ends With'
Posted: Fri Oct 09, 2015 10:05 am
by Josef Templ
This issue is about several fixes for text search with the option 'Word Begins/Ends With' specified.
For the issue see
http://redmine.blackboxframework.org/issues/75.
A proposal exists in CPC 1.7 rc5, but it addresses only the handling of non-ASCII characters, and in a questionable way.
It simply reverses the previous handling of non-ASCII characters as non-terminators, which results in finding fewer occurrences than before. A precise solution, in my opinion, should take into account whether a character satisfies Strings.IsIdent.
The proposal also ignores the anomaly described in the issue:
"searching for 'pattern' in the text ')pattern' does not find it."
This anomaly results from using different rules for left and right terminators.
')' is not a left terminator but only a right terminator.
To me this separation does not make any sense.
Is there any reason for having different left and right terminators?
A possible (untested) replacement for LeftTerminator and RightTerminator may be:
Code:
PROCEDURE IsTerminator (ch: CHAR): BOOLEAN;
BEGIN
    RETURN ~Strings.IsIdent(ch)
END IsTerminator;
- Josef
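To make the idea of a single terminator rule concrete, here is a minimal, untested sketch of how one predicate could serve both the 'Word Begins With' and the 'Word Ends With' check. The module name and the helpers WordBeginsAt and WordEndsAt are hypothetical and are not taken from the BlackBox sources. It assumes that Strings.IsIdent is available as referenced above and that it rejects punctuation such as ")" as well as 0X.

```
MODULE ObxWordBounds; (* hypothetical module name, for illustration only *)

    IMPORT Strings, StdLog;

    (* One rule for both sides: any character that cannot be part of an
       identifier terminates a word. *)
    PROCEDURE IsTerminator (ch: CHAR): BOOLEAN;
    BEGIN
        RETURN ~Strings.IsIdent(ch)
    END IsTerminator;

    (* TRUE if a match starting at index pos in s starts at a word boundary.
       The start of the string counts as a boundary. *)
    PROCEDURE WordBeginsAt (IN s: ARRAY OF CHAR; pos: INTEGER): BOOLEAN;
    BEGIN
        RETURN (pos = 0) OR IsTerminator(s[pos - 1])
    END WordBeginsAt;

    (* TRUE if a match occupying s[pos .. pos + len - 1] ends at a word boundary.
       Assumes pos + len does not exceed the string length, so that s[pos + len]
       is at worst the terminating 0X. *)
    PROCEDURE WordEndsAt (IN s: ARRAY OF CHAR; pos, len: INTEGER): BOOLEAN;
    BEGIN
        RETURN IsTerminator(s[pos + len])
    END WordEndsAt;

    (* Logs the two checks for 'pattern' inside ')pattern'. *)
    PROCEDURE Demo*;
    BEGIN
        StdLog.Bool(WordBeginsAt(")pattern", 1)); StdLog.Ln;
        StdLog.Bool(WordEndsAt(")pattern", 1, 7)); StdLog.Ln
    END Demo;

END ObxWordBounds.
```

Under these assumptions both checks should succeed for ')pattern', because ")" and the terminating 0X are both outside the identifier character set.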
Re: issue-#75 fixing text search with 'Word Begins/Ends With'
Posted: Fri Oct 09, 2015 10:50 am
by Ivan Denisov
I think we have here one more example of a side effect of a redundant optimisation.
We can simply remove this condition from LeftTerminator and RightTerminator:
Then everything will work fine for Unicode.
In my assembly I am using:
Code:
PROCEDURE LeftTerminator (ch: CHAR): BOOLEAN;
BEGIN
    CASE ch OF
        viewcode, tab, line, para, " ",
        "(", "[", "{", "=",
        hyphen, softhyphen: RETURN TRUE
    ELSE
        RETURN FALSE
    END
END LeftTerminator;

PROCEDURE RightTerminator (ch: CHAR): BOOLEAN;
BEGIN
    CASE ch OF
        0X, viewcode, tab, line, para, " ",
        "!", "(", ")", ",", ".", ":", ";", "?", "[", "]", "{", "}",
        hyphen, softhyphen: RETURN TRUE
    ELSE
        RETURN FALSE
    END
END RightTerminator;
Re: issue-#75 fixing text search with 'Word Begins/Ends With'
Posted: Fri Oct 09, 2015 11:09 am
by Josef Templ
Ivan, your comment does not answer the question of why we need different left and right terminators.
Unless there is a reason for it, the distinction should be removed. The example I have shown should be convincing.
What side effect do you mean and are you talking about my proposal or about yours?
- Josef
Re: issue-#75 fixing text search with 'Word Begins/Ends With'
Posted: Fri Oct 09, 2015 11:48 am
by Ivan Denisov
Josef Templ wrote: Ivan, your comment does not answer the question of why we need different left and right terminators.
Unless there is a reason for it, the distinction should be removed. The example I have shown should be convincing.
What side effect do you mean and are you talking about my proposal or about yours?
For example, if you need to find the module Strings in this code, you can use Word Begins with the pattern String.
The dot is not a left terminator here, so Log.String will not match the pattern.
You cannot remove "." from the set of terminators, because that would break the logic of searching in texts with sentences.
Re: issue-#75 fixing text search with 'Word Begins/Ends With'
Posted: Fri Oct 09, 2015 11:57 am
by Josef Templ
Sorry Ivan, but your argument is wrong.
I do not remove "." from the set of terminators, but I do add ")" to the set of left terminators.
- Josef
Re: issue-#75 fixing text search with 'Word Begins/Ends With'
Posted: Fri Oct 09, 2015 12:32 pm
by Ivan Denisov
I will try to be clearer. Dot should be left terminator, but should be right terminator. So it is impossible to join these in one procedure.
Re: issue-#75 fixing text search with 'Word Begins/Ends With'
Posted: Fri Oct 09, 2015 12:37 pm
by Josef Templ
Ivan Denisov wrote: I will try to be clearer. Dot should be left terminator, but should be right terminator. So it is impossible to join these in one procedure.
It is still not clear. Please look at the wording of your comment.
Do you mean "should NOT be left terminator" ?
Please clarify.
- Josef
Re: issue-#75 fixing text search with 'Word Begins/Ends With'
Posted: Fri Oct 09, 2015 12:41 pm
by Ivan Denisov
Josef Templ wrote: Ivan Denisov wrote: I will try to be clearer. Dot should be left terminator, but should be right terminator. So it is impossible to join these in one procedure.
It is still not clear. Please look at the wording of your comment.
Do you mean "should NOT be left terminator" ?
Please clarify.
- Josef
Right.
Dot should NOT be left terminator, but should be right terminator.
But I am not against adding ")" to the set of left terminators.
Re: issue-#75 fixing text search with 'Word Begins/Ends With'
Posted: Fri Oct 09, 2015 4:29 pm
by Josef Templ
Dot should of course also be a left terminator.
Text search is not only for CP programs, and even
for those the current rule does not make much sense.
Is this strange behavior documented somewhere?
What we need is a simple rule that also works well for CP texts.
That's why I would use IsIdent.
This corresponds nicely with the double-click behavior, which does a "word"-selection.
Otherwise the meaning of 'word' is defined in different ways depending on the command.
Ivan, your proposal also has a side effect that you may not be aware of.
The terminator behavior of non-ASCII characters is different from the BB1.6 version.
This can lead to big surprises because many non-ASCII characters must be treated
as terminators (graphics, arrows, smileys, etc.).
The IsAlpha property captures this distinction, and it is also used in IsIdent.
- Josef
Re: issue-#75 fixing text search with 'Word Begins/Ends With'
Posted: Fri Oct 09, 2015 8:31 pm
by Josef Templ
Here is another interesting observation:
Search the text 'a..b' for 'b' with the option 'Word Begins With'.
It is not found, because it is preceded by a dot.
- Josef
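The cases discussed in this thread can be compared side by side with a small, untested sketch. It contrasts a set-based left terminator check with the IsIdent-based rule for the character that precedes a match. The module name and the Compare helper are hypothetical. The set-based procedure keeps only the printable characters of the left terminator set posted earlier (the control codes and hyphen constants are left out for brevity), and Strings.IsIdent is assumed to reject "." and ")".

```
MODULE ObxLeftTermDemo; (* hypothetical module name, for illustration only *)

    IMPORT Strings, StdLog;

    (* Printable subset of the set-based left terminator rule from the thread;
       viewcode, tab, line, para, hyphen and softhyphen are omitted here. *)
    PROCEDURE LeftTerminator (ch: CHAR): BOOLEAN;
    BEGIN
        CASE ch OF
            " ", "(", "[", "{", "=": RETURN TRUE
        ELSE
            RETURN FALSE
        END
    END LeftTerminator;

    (* Unified rule: every non-identifier character is a terminator. *)
    PROCEDURE IsTerminator (ch: CHAR): BOOLEAN;
    BEGIN
        RETURN ~Strings.IsIdent(ch)
    END IsTerminator;

    (* Logs how both rules classify the character just before index pos. *)
    PROCEDURE Compare (IN s: ARRAY OF CHAR; pos: INTEGER);
    BEGIN
        ASSERT(pos > 0, 20);
        StdLog.String(s); StdLog.String(": set-based ");
        StdLog.Bool(LeftTerminator(s[pos - 1]));
        StdLog.String(", IsIdent-based ");
        StdLog.Bool(IsTerminator(s[pos - 1]));
        StdLog.Ln
    END Compare;

    PROCEDURE Demo*;
    BEGIN
        Compare("a..b", 3);        (* 'b' is preceded by "." *)
        Compare("Log.String", 4);  (* 'String' is preceded by "." *)
        Compare(")pattern", 1)     (* 'pattern' is preceded by ")" *)
    END Demo;

END ObxLeftTermDemo.
```

Under these assumptions the set-based rule should reject all three positions as word beginnings, while the IsIdent-based rule should accept all three. Note that this includes 'String' in 'Log.String', which is exactly the match Ivan's set is meant to exclude, so the choice between the two rules is a design decision rather than a pure bug fix.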