issue-#201 Importer and Exporter for UTF-8 texts

Merged to the master branch
luowy
Posts: 234
Joined: Mon Oct 20, 2014 12:52 pm

Re: issue-#201 Importer and Exporter for UTF-8 texts

Post by luowy »

For views, there is no need. Consider rtf2odc: a lot of rulers will be added, e.g. bolding a word will add a begin ruler and an end ruler. If they convert them to lines, ....

For para, I rarely use it and am not familiar with it.
Josef Templ
Posts: 2047
Joined: Tue Sep 17, 2013 6:50 am

Re: issue-#201 Importer and Exporter for UTF-8 texts

Post by Josef Templ »

added as in other exporters:

IF (ch # TextModels.viewcode) & (ch # TextModels.para) THEN

Improving the treatment of para or rulers is a cross-cutting change that also applies to the other exporters.
It is not a special problem of the UTF-8 text exporter, so it is not optimized in any way.
This could be another issue, but so far nobody has found this to be a relevant topic.
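
As a rough illustration of what that guard does, here is a minimal Python model (not BlackBox code). It assumes the code point 02X for viewcode, as mentioned later in this thread, and 0EX for para; the function name export_filtered is hypothetical.

```python
VIEWCODE = "\x02"  # 02X, placeholder character for an embedded view
PARA = "\x0e"      # assumed value of TextModels.para (0EX)


def export_filtered(text: str) -> bytes:
    """Drop viewcode and para characters, then encode the rest as UTF-8."""
    kept = [ch for ch in text if ch != VIEWCODE and ch != PARA]
    return "".join(kept).encode("utf-8")
```

With this filter both characters simply vanish from the output, so a para leaves no line break behind.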

For the diffs see https://redmine.blackboxframework.org/p ... dcf5bb0d83.

- Josef
Zinn
Posts: 476
Joined: Tue Mar 25, 2014 5:56 pm
Location: Frankfurt am Main

Re: issue-#201 Importer and Exporter for UTF-8 texts

Post by Zinn »

Josef Templ wrote: added as in other exporters:

IF (ch # TextModels.viewcode) & (ch # TextModels.para) THEN

Josef,
yes, it is another topic. To get the same behaviour as the text exporter we need this line.

But what happens without this line?
The function of para is kept. It is not lost. Do we need a translation of para to an empty line?
The viewcode is still lost. You see that there was a view (02X). It may be a problem to have the control character in the result.

Normally this question is not relevant, because you can edit the text as you like before you save it.
- Helmut
Josef Templ
Posts: 2047
Joined: Tue Sep 17, 2013 6:50 am

Re: issue-#201 Importer and Exporter for UTF-8 texts

Post by Josef Templ »

Zinn wrote: But what happens without this line?
The function of para is kept. It is not lost. Do we need a translation of para to an empty line?
The viewcode is still lost. You see that there was a view (02X). It may be a problem to have the control character in the result.

Normally this question is not relevant, because you can edit the text as you like before you save it.
- Helmut

I assume this question refers to that other issue, i.e. changing the export of viewcode and para in ALL text exporters, right?

A simple strategy would be to treat para as newline and to treat rulers also as newline.
Other views would probably have to be ignored as it is done now.
Inserting viewcode into the output text is not OK, I think, because it is a very special control character
that only causes problems with text editors.
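
To make that strategy concrete, here is a minimal Python sketch (a model, not BlackBox code). The View class, its kind field, and the function name export_with_newlines are hypothetical, and 0EX is assumed as the value of para.

```python
from dataclasses import dataclass

PARA = "\x0e"  # assumed value of TextModels.para (0EX)


@dataclass
class View:
    """Hypothetical stand-in for an embedded view, e.g. kind 'ruler' or 'picture'."""
    kind: str


def export_with_newlines(items) -> str:
    """Map para and rulers to a newline, drop other views, keep plain characters."""
    out = []
    for item in items:
        if isinstance(item, View):
            if item.kind == "ruler":
                out.append("\n")
            # any other view is ignored, as it is done now
        elif item == PARA:
            out.append("\n")
        else:
            out.append(item)
    return "".join(out)
```

Under this mapping every ruler contributes one newline, so a text with many rulers would gain as many line breaks.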

Using utf8 for the xml and html importers is definitely worth looking at, I think.
I just tried to export a text with extended ASCII characters to xhtml and it ends up as utf8.
When opening it with the text importer, as it is the default now, the extended characters are garbage.
When opening it as utf8, everything works fine. So this would remove an asymmetry between exporting
and importing an html file.
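
The effect can be reproduced outside BlackBox with a few lines of Python. This is only an illustration: it assumes the plain text importer reads one byte per character, which is modelled here with Latin-1.

```python
text = "caf\u00e9"                     # "café", one character beyond 7-bit ASCII

exported = text.encode("utf-8")        # what the xhtml export writes
as_plain = exported.decode("latin-1")  # read back one byte per character
as_utf8 = exported.decode("utf-8")     # read back as UTF-8

print(as_plain)  # cafÃ©
print(as_utf8)   # café
```

The two bytes that encode the accented letter are shown as two separate characters when read byte by byte, and only the UTF-8 reading round-trips.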

I have changed this in the Config module.
For the changes see https://redmine.blackboxframework.org/p ... 9841f6448b.

- Josef
Josef Templ
Posts: 2047
Joined: Tue Sep 17, 2013 6:50 am

Re: issue-#201 Importer and Exporter for UTF-8 texts

Post by Josef Templ »

If there are no more remarks, I think we can vote on this issue.

Here is the link to all the changes:
https://redmine.blackboxframework.org/p ... ec172f4dc6

- Josef
Zinn
Posts: 476
Joined: Tue Mar 25, 2014 5:56 pm
Location: Frankfurt am Main

Re: issue-#201 Importer and Exporter for UTF-8 texts

Post by Zinn »

Thank you Josef, your last change in config is a great improvement.
Josef Templ wrote: A simple strategy would be to treat para as newline and to treat rulers also as newline.
Other views would probably have to be ignored as it is done now.
Inserting viewcode into the output text is not OK, I think, because it is a very special control character
that only causes problems with text editors.

Why don't you add these changes to the UTF8 Exporter? There is no reason to obey the rules of the txt Exporter.

- Helmut
Josef Templ
Posts: 2047
Joined: Tue Sep 17, 2013 6:50 am

Re: issue-#201 Importer and Exporter for UTF-8 texts

Post by Josef Templ »

Zinn wrote: Thank you Josef, your last change in config is a great improvement.

Why don't you add these changes to the UTF8 Exporter? There is no reason to obey the rules of the txt Exporter.

- Helmut

The para/view/ruler treatment is an orthogonal issue that applies to all text exporters, not only to utf8.
Adding it to #201 would mix unrelated changes.

But nothing speaks against adding that other issue before we make the release.

- Josef