issue-#201 Importer and Exporter for UTF-8 texts

Post by Josef Templ » Fri Aug 09, 2019 3:04 pm

I have created an issue for the proposal of Helmut, see https://redmine.blackboxframework.org/issues/201.

One detail question is:
What is the effect of specifying {Converters.importAll}? I cannot see any difference in the behavior.

- Josef

Re: issue-#201 Importer and Exporter for UTF-8 texts

Post by luowy » Fri Aug 09, 2019 5:09 pm

Josef Templ wrote: What is the effect of specifying {Converters.importAll}? I cannot see any difference in the behavior.

Read the code of Converters.Import.

Re: issue-#201 Importer and Exporter for UTF-8 texts

Post by Josef Templ » Fri Aug 09, 2019 8:03 pm

OK, in the file open dialog there is no difference because this flag is not used outside Converters.

Inside Converters it is only used for getting a converter for the rare case that there is no
converter specified and no converter registered for the file extension.
Then the very first converter that has this option is chosen.

As such, does it make sense to have multiple converters marked as importAll?
Only the very first in the list is ever taken.
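
The lookup order described above can be sketched as follows. This is a hedged illustration in Python (chosen so it can be run outside BlackBox), not the actual code of Converters.Import. The names Converter, choose_importer and the "Demo.*" commands are hypothetical; only the order of the three steps is taken from the description in this post.

```python
from dataclasses import dataclass
from typing import List, Optional

IMPORT_ALL = "importAll"  # stands in for the Converters.importAll option


@dataclass
class Converter:
    imp: str                       # name of the import command
    file_type: str                 # file extension it is registered for
    opts: frozenset = frozenset()  # registration options


def choose_importer(registry: List[Converter], file_type: str,
                    explicit: Optional[Converter] = None) -> Optional[Converter]:
    """Pick an importer in the order described in the post (assumed)."""
    # 1. A converter that was specified explicitly always wins.
    if explicit is not None:
        return explicit
    # 2. Otherwise use the converter registered for the file extension.
    for conv in registry:
        if conv.file_type == file_type:
            return conv
    # 3. Rare case: take the first converter in registration order
    #    that carries importAll. Later importAll entries are never reached.
    for conv in registry:
        if IMPORT_ALL in conv.opts:
            return conv
    return None


# Hypothetical registry: two converters are marked importAll.
registry = [
    Converter("Demo.ImportSym", "osf"),
    Converter("Demo.ImportText", "txt", frozenset({IMPORT_ALL})),
    Converter("Demo.ImportUtf8", "utf8", frozenset({IMPORT_ALL})),
]
fallback = choose_importer(registry, "xyz")  # no converter for "xyz"
```

In this sketch the second importAll entry can only be chosen if it is registered before the first one, which is why the registration order in Config matters.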

I am asking this because I want a clear picture of possible side effects from the proposed change in the module Config.

- Josef

Re: issue-#201 Importer and Exporter for UTF-8 texts

Post by luowy » Sat Aug 10, 2019 3:45 am

Only the first registered converter with {importAll} takes effect, so the patch for Config should be placed at the top of "Setup" to have any effect.

Please also check this code in the Config file:
Code:
      Converters.Register("DevBrowser.ImportSymFile", "", "TextViews.View", "osf", {});
      Converters.Register("DevBrowser.ImportCodeFile", "", "TextViews.View", "ocf", {});

The "TextViews.View" should be changed to "Documents.Document", I think.

Re: issue-#201 Importer and Exporter for UTF-8 texts

Post by Josef Templ » Sat Aug 10, 2019 8:46 pm

> The "TextViews.View" should be changed to "Documents.Document", I think.

Documents.Document is, ironically, undocumented.
How can it be used if it does not exist officially?
In addition, since TextViews.View works well, why should we change that?
Furthermore, why only for ImportSymFile and ImportCodeFile?

- Josef

Re: issue-#201 Importer and Exporter for UTF-8 texts

Post by Josef Templ » Thu Aug 15, 2019 8:46 am

A first draft version is in the branch.
For diffs see https://redmine.blackboxframework.org/projects/blackbox/repository/diff?utf8=%E2%9C%93&rev=2d2cb269c073b96429fd823701d764dcf5bb0d83&rev_to=cd3c020e77648cc9b37ad4cea7677cec172f4dc6

It is largely based on Helmut's proposal with some refinements in ImportUtf8:

- If the optional byte order mark (BOM) is found at the beginning of the file, it is skipped.
This is because the BlackBox text editor gets confused when the BOM is imported.

- Characters that do not fit into 16 bits are imported as '?'.
This is expected to be very rare but has been added for the sake of completeness.

- If an illegal encoding is found, the importer falls back to importing the file as Windows text.
This is an experimental feature that would, I think, allow us to use the Utf8 importer for XML or HTML texts as well.
HTML and XML are often encoded in UTF-8, but sometimes it is not known in advance whether a given file is.
Please think about this and give feedback on whether it is a good idea.
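
The three refinements listed above can be sketched roughly as follows. This is a Python illustration (so that it runs outside BlackBox), not the ImportUtf8 code from the branch. The function name import_utf8 is hypothetical, and two points are assumptions: that "Windows text" corresponds to code page 1252, and that the fallback re-reads the whole file from the start.

```python
import codecs


def import_utf8(data: bytes) -> str:
    """Decode file contents along the lines described in the post (assumed)."""
    payload = data
    # Skip the optional byte order mark at the beginning of the file.
    if payload.startswith(codecs.BOM_UTF8):
        payload = payload[len(codecs.BOM_UTF8):]
    try:
        # Strict decoding raises UnicodeDecodeError on an illegal encoding.
        text = payload.decode("utf-8")
    except UnicodeDecodeError:
        # Fallback: treat the original bytes as Windows text.
        # cp1252 leaves a few byte values undefined, hence errors="replace".
        return data.decode("cp1252", errors="replace")
    # Characters that need more than 16 bits are reported as '?'.
    return "".join(ch if ord(ch) <= 0xFFFF else "?" for ch in text)


with_bom = import_utf8(b"\xef\xbb\xbfabc")    # BOM is dropped
not_utf8 = import_utf8(b"caf\xe9")            # illegal UTF-8, decoded as cp1252
```

One consequence of the fallback is that a file in a single-byte encoding that happens to be valid UTF-8 is still imported as UTF-8, so the guess is only as good as the strictness of the decoder.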

- Josef

Re: issue-#201 Importer and Exporter for UTF-8 texts

Post by Zinn » Fri Aug 16, 2019 6:16 am

Thank you, Josef. Your improvement is a perfect solution. Falling back to text format on error is a great idea, so I do not have to open the same file a second time when it is not UTF-8.
- Helmut

Re: issue-#201 Importer and Exporter for UTF-8 texts

Post by luowy » Fri Aug 16, 2019 2:59 pm

In ExportUtf8, how about skipping views?
Code:
  (* skip embedded views and paragraph characters; write CR as LF *)
  IF (ch # view) & (ch # para) THEN
    IF ch = CR THEN WriteChar(wr, LF) ELSE WriteChar(wr, ch) END
  END;
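
For readers without BlackBox at hand, the effect of this filter can be sketched in Python. It is only an illustration, not the ExportUtf8 code: the function name export_utf8 is hypothetical, and the character values used for view and para are assumed to match TextModels.viewcode (2X) and TextModels.para (0FX).

```python
VIEW = "\x02"  # assumed value of TextModels.viewcode
PARA = "\x0f"  # assumed value of TextModels.para
CR, LF = "\r", "\n"


def export_utf8(text: str) -> bytes:
    """Drop view and paragraph characters, map CR to LF, encode as UTF-8."""
    out = []
    for ch in text:
        if ch == VIEW or ch == PARA:
            continue  # embedded views and paragraph markers are skipped
        out.append(LF if ch == CR else ch)
    return "".join(out).encode("utf-8")


exported = export_utf8("a\x02b\rc\x0fd")  # view and para dropped, CR -> LF
```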

Re: issue-#201 Importer and Exporter for UTF-8 texts

Post by Josef Templ » Fri Aug 16, 2019 6:51 pm

Good point. It should be included. I have to look at how this is handled in the other text importers/exporters.

- Josef

Re: issue-#201 Importer and Exporter for UTF-8 texts

Post by Josef Templ » Sat Aug 17, 2019 7:15 pm

ExportText also treats TextModels.viewcode and TextModels.para specially, i.e. they are skipped.
But why is TextModels.para skipped? Shouldn't it rather be treated as a newline character, or even two?

- Josef
