issue-#201 Importer and Exporter for UTF-8 texts
- Josef Templ
- Posts: 2047
- Joined: Tue Sep 17, 2013 6:50 am
I have created an issue for the proposal of Helmut, see https://redmine.blackboxframework.org/issues/201.
One detail question is:
What is the effect of specifying {Converters.importAll}? I cannot see any difference in the behavior.
- Josef
Re: issue-#201 Importer and Exporter for UTF-8 texts
Josef Templ wrote: What is the effect of specifying {Converters.importAll}? I cannot see any difference in the behavior.
Read the code of Converters.Import.
- Josef Templ
Re: issue-#201 Importer and Exporter for UTF-8 texts
OK, in the file open dialog there is no difference because this flag is not used outside Converters.
Inside Converters it is only used for getting a converter for the rare case that there is no
converter specified and no converter registered for the file extension.
Then the very first converter that has this option is chosen.
As such, does it make sense to have multiple converters marked as importAll?
Only the very first in the list is ever taken.
I am asking this because I want a clear picture of possible side effects from the proposed change in the module Config.
- Josef
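The lookup described in this post can be sketched as follows. This is a simplified, self-contained model and not the code of Converters: the module name, the Conv type, and the constant value are assumptions made for illustration only.

```componentpascal
MODULE SketchConvSelect;
    (* Simplified model of the fallback described above; NOT the code of Converters. *)

    CONST importAll* = 0;   (* hypothetical stand-in for Converters.importAll *)

    TYPE
        Conv* = POINTER TO RECORD
            next*: Conv;
            fileType*: ARRAY 16 OF CHAR;    (* file extension, e.g. "txt" *)
            opts*: SET
        END;

    (* Returns the first converter registered for fileType. If there is none,
        returns the first converter in the list that has importAll set, or NIL. *)
    PROCEDURE Select* (list: Conv; IN fileType: ARRAY OF CHAR): Conv;
        VAR c: Conv;
    BEGIN
        c := list;
        WHILE (c # NIL) & (c.fileType$ # fileType$) DO c := c.next END;
        IF c = NIL THEN (* no converter for this extension: fall back *)
            c := list;
            WHILE (c # NIL) & ~(importAll IN c.opts) DO c := c.next END
        END;
        RETURN c
    END Select;

END SketchConvSelect.
```

In this model a second converter marked importAll is never reached by the fallback loop, so the registration order alone decides which one is used.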
Re: issue-#201 Importer and Exporter for UTF-8 texts
Only the first registered converter with {importAll} is used, so the patch for Config should be placed at the top of "Setup" to take effect.
please check this code in the Config file:
Code:
Converters.Register("DevBrowser.ImportSymFile", "", "TextViews.View", "osf", {});
Converters.Register("DevBrowser.ImportCodeFile", "", "TextViews.View", "ocf", {});
The "TextViews.View" should be changed to "Documents.Document", I think.
- Josef Templ
Re: issue-#201 Importer and Exporter for UTF-8 texts
> The "TextViews.View" should be changed to "Documents.Document", I think.
Documents.Document is, ironically, undocumented.
How can it be used if it does not exist officially?
In addition, since TextViews.View works well, why should we change that?
Furthermore, why only for ImportSymFile and ImportCodeFile?
- Josef
- Josef Templ
Re: issue-#201 Importer and Exporter for UTF-8 texts
A first draft version is in the branch.
For diffs see https://redmine.blackboxframework.org/p ... ec172f4dc6
It is largely based on Helmut's proposal with some refinements in ImportUtf8:
- if the optional byte order mark (BOM) is found at the beginning of the file, it is skipped.
This is because the BlackBox text editor gets confused when the BOM is imported.
- characters that do not fit into 16 bits (code points above 0FFFFH) are reported as '?'.
This is expected to be very rare but has been added for the sake of completeness.
- if an illegal encoding is found, it falls back to importing the file as a Windows text.
This is an experimental feature that would allow us to use the Utf8 importer also for XML or HTML texts, I think.
HTML and XML files are often encoded in UTF-8, but sometimes it is not known in advance whether a given file is.
Please think about this and give feedback if this is a good idea or not.
- Josef
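The three rules listed in this post (skip the BOM, report large code points as '?', signal an illegal encoding so the caller can fall back) can be sketched as below. This is not the code in the branch: the module name, the procedure signature, and the use of in-memory arrays instead of a file reader are assumptions for illustration. The sketch rejects bad lead and continuation bytes but, for brevity, does not reject overlong 3- and 4-byte forms or surrogates.

```componentpascal
MODULE SketchUtf8Import;
    (* Sketch of the decoding rules listed above; not the code in the branch. *)

    (* Decodes src[0 .. len-1] as UTF-8 into dst and sets n to the number of
        characters written. ok = FALSE means that an illegal encoding was found
        and the caller should fall back to a plain (Windows) text import.
        Precondition: dst has room for at least len characters. *)
    PROCEDURE Decode* (IN src: ARRAY OF SHORTCHAR; len: INTEGER;
                                    OUT dst: ARRAY OF CHAR; OUT n: INTEGER; OUT ok: BOOLEAN);
        VAR i, b, cp, more: INTEGER;
    BEGIN
        i := 0; n := 0; ok := TRUE;
        (* skip the optional byte order mark EF BB BF *)
        IF (len >= 3) & (src[0] = 0EFX) & (src[1] = 0BBX) & (src[2] = 0BFX) THEN i := 3 END;
        WHILE ok & (i < len) DO
            b := ORD(src[i]); INC(i);
            IF b < 80H THEN cp := b; more := 0
            ELSIF (b >= 0C2H) & (b < 0E0H) THEN cp := b MOD 20H; more := 1
            ELSIF (b >= 0E0H) & (b < 0F0H) THEN cp := b MOD 10H; more := 2
            ELSIF (b >= 0F0H) & (b < 0F5H) THEN cp := b MOD 8; more := 3
            ELSE ok := FALSE; more := 0 (* illegal lead byte *)
            END;
            WHILE ok & (more > 0) DO
                IF (i < len) & (ORD(src[i]) DIV 40H = 2) THEN   (* continuation byte 10xxxxxx *)
                    cp := cp * 40H + ORD(src[i]) MOD 40H; INC(i); DEC(more)
                ELSE ok := FALSE    (* truncated or malformed sequence *)
                END
            END;
            IF ok THEN
                (* code points beyond 16 bits cannot be stored in a CHAR *)
                IF cp > 0FFFFH THEN dst[n] := "?" ELSE dst[n] := CHR(cp) END;
                INC(n)
            END
        END
    END Decode;

END SketchUtf8Import.
```

A caller would check ok after the call and, if it is FALSE, discard dst and read the same bytes again as a Windows text.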
Re: issue-#201 Importer and Exporter for UTF-8 texts
Thank you Josef. Your improvement is a perfect solution. Falling back to the text format on error is a great idea. That way I do not have to open the same file a second time when it is not UTF-8.
- Helmut
Re: issue-#201 Importer and Exporter for UTF-8 texts
In ExportUtf8, how about skipping views?
Code:
IF (ch # view) & (ch # para) THEN (* skip view and para characters *)
    IF ch = CR THEN WriteChar(wr, LF) ELSE WriteChar(wr, ch) END (* CR is exported as LF *)
END;
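The WriteChar procedure used here is not shown in the thread. As a rough sketch of what the UTF-8 encoding step for one 16-bit character could look like, under the assumption that surrogates are not combined and that the bytes are collected in a small buffer (module name, procedure name, and signature are hypothetical):

```componentpascal
MODULE SketchUtf8Export;
    (* Sketch only; not the WriteChar of the proposed exporter. *)

    (* Encodes one 16-bit character as 1 to 3 UTF-8 bytes in buf and sets n
        to the number of bytes used. Precondition: LEN(buf) >= 3. *)
    PROCEDURE Encode* (ch: CHAR; OUT buf: ARRAY OF SHORTCHAR; OUT n: INTEGER);
        VAR c: INTEGER;
    BEGIN
        c := ORD(ch);
        IF c < 80H THEN (* 0xxxxxxx *)
            buf[0] := SHORT(CHR(c)); n := 1
        ELSIF c < 800H THEN (* 110xxxxx 10xxxxxx *)
            buf[0] := SHORT(CHR(0C0H + c DIV 40H));
            buf[1] := SHORT(CHR(80H + c MOD 40H)); n := 2
        ELSE    (* 1110xxxx 10xxxxxx 10xxxxxx *)
            buf[0] := SHORT(CHR(0E0H + c DIV 1000H));
            buf[1] := SHORT(CHR(80H + c DIV 40H MOD 40H));
            buf[2] := SHORT(CHR(80H + c MOD 40H)); n := 3
        END
    END Encode;

END SketchUtf8Export.
```

Since a CHAR holds at most 0FFFFH, three bytes are always enough in this sketch.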
- Josef Templ
Re: issue-#201 Importer and Exporter for UTF-8 texts
Good point. It should be included. I have to look at how this is handled in the other text importers/exporters.
- Josef
- Josef Templ
Re: issue-#201 Importer and Exporter for UTF-8 texts
ExportText also treats TextModels.viewcode and TextModels.para specially, i.e. they are skipped.
But why is TextModels.para skipped? Wouldn't it be better to treat it as a newline character, or even two?
- Josef