issue-#113 XhtmlExporter bug with StdLinks.Target

Josef Templ
Posts: 2047
Joined: Tue Sep 17, 2013 6:50 am

Re: issue-#113 XhtmlExporter bug with StdLinks.Target

Post by Josef Templ »

Here is a list of open issues. All of them are simple to fix.

1. According to the W3C validator any id (incl. an anchor id) should start with a letter in HTML 4.
So we have to prepend something like "id" or "anchor" or "StdLinks.Target" to the
ids generated for StdLinks.Target views.

2. The document title is currently unknown and a dummy title is hard-coded instead.
The English localization in Xhtml/Strings.odc results in a dummy page title of "New Page".
Instead of hard-coding a meaningless title we should remove the title altogether.
Then any browser uses the file name as the page title.
As an example see the converted change list under http://blackboxframework.org/unstable/m ... anges.html, which is shown as "New Page".

3. The changes of XhtmlStdFileWriters.String/Char need to be reverted or improved.
Currently they can produce a TRAP if a string consists of too many 3-byte Unicode characters.
While such a case is hard to find, it is also hard to prove that it cannot occur.
In addition, and even more importantly, String does not need to allocate space by means of NEW but can work on the stack.
In contrast, the current version allocates a large number of small strings on the heap.
Actually, for every single character in the BB document, an auxiliary string object is allocated!
Either leave the UTF-8 conversion as it was (most efficient) or use Strings.StringToUtf8 per character
with a result string as a local variable VAR utf8Str: ARRAY 4 OF CHAR (i.e. on the stack).
This is still much more efficient than the current solution. As an optimization it would of course be possible
to call Strings.StringToUtf8 only if ch > 7FX.

4. The open issues listed under the issues fold in the header of the module XhtmlExporter need to be checked.
Open issues such as the missing extension mechanism should not be removed from this list.
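To make point 3 concrete, here is a minimal sketch of per-character UTF-8 conversion with a small reused scratch buffer and an ASCII fast path. It is written in Python only so that it can be run as-is; it is not the XhtmlStdFileWriters code, and the names utf8_encode_char and write_string are hypothetical. It assumes 16-bit characters (code points up to 0xFFFF), as with CHAR in BlackBox.

```python
def utf8_encode_char(code_point: int, buf: bytearray) -> int:
    """Write the UTF-8 bytes of one 16-bit code point into buf.

    Returns the number of bytes written (1 to 3). buf must hold at least
    3 bytes; it plays the role of the small local array from the post.
    """
    if not 0 <= code_point <= 0xFFFF:
        raise ValueError("only 16-bit code points are handled in this sketch")
    if code_point <= 0x7F:
        buf[0] = code_point
        return 1
    if code_point <= 0x7FF:
        buf[0] = 0xC0 | (code_point >> 6)
        buf[1] = 0x80 | (code_point & 0x3F)
        return 2
    buf[0] = 0xE0 | (code_point >> 12)
    buf[1] = 0x80 | ((code_point >> 6) & 0x3F)
    buf[2] = 0x80 | (code_point & 0x3F)
    return 3


def write_string(text: str, out: bytearray) -> None:
    """Append text to out as UTF-8, one character at a time."""
    scratch = bytearray(3)  # allocated once per call, reused for every character
    for ch in text:
        cp = ord(ch)
        if cp <= 0x7F:
            # Fast path: ASCII needs no conversion (the "ch > 7FX" check).
            out.append(cp)
        else:
            n = utf8_encode_char(cp, scratch)
            out += scratch[:n]
```

Because each character needs at most 3 bytes here, the scratch buffer has a fixed size and no per-character allocation depends on the length of the input string.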

- Josef
Ivan Denisov
Posts: 1700
Joined: Tue Sep 17, 2013 12:21 am
Location: Russia

Post by Ivan Denisov »

I agree with 1 and 2 without comments. About the 3rd, I think that we can use a global pointer to some big string and allocate a bigger one if required.

VAR
utf8str: POINTER TO ARRAY OF SHORTCHAR;
...
IF (LEN(utf8str) < LEN(str$)) THEN NEW(utf8str, LEN(utf8str) * 3 DIV 2) END;
Please do not bring back the previous WriteChar procedure; we should not duplicate functions where we can avoid it.
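The grow-on-demand buffer idea can be sketched as follows. This is an illustration in Python, not BlackBox code; the class Utf8Buffer and its members are hypothetical. One point the sketch makes explicit is an assumption of mine rather than something stated in the post: the new size is derived from the required size, because growing the old length by a fixed factor alone may still be too small for a long string.

```python
class Utf8Buffer:
    """A reusable byte buffer that is reallocated only when it is too small."""

    def __init__(self, initial_size: int = 16) -> None:
        self.data = bytearray(initial_size)
        self.allocations = 1  # counts buffer allocations, for illustration only

    def ensure(self, needed: int) -> None:
        """Make sure the buffer can hold at least `needed` bytes."""
        if len(self.data) < needed:
            # Grow by 1.5x, but never to less than what is actually required.
            new_size = max(needed, len(self.data) * 3 // 2)
            self.data = bytearray(new_size)
            self.allocations += 1

    def encode(self, text: str) -> bytes:
        """Encode text as UTF-8 through the shared buffer."""
        # Worst case in Python is 4 bytes per character; with 16-bit
        # characters, as in BlackBox, 3 bytes per character would suffice.
        self.ensure(4 * len(text))
        encoded = text.encode("utf-8")
        self.data[:len(encoded)] = encoded  # same-length slice, size unchanged
        return bytes(self.data[:len(encoded)])
```

A short usage example: after one large string has been converted, later shorter strings reuse the same buffer without a further allocation.

```python
buf = Utf8Buffer(initial_size=4)
buf.encode("abc")           # forces one reallocation
buf.encode("\u20ac\u20ac")  # fits, so the buffer is reused
print(buf.allocations)
```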

Josef, feel free to add these changes to the branch. I think it would be better if you finished the branch that you started.
Josef Templ
Posts: 2047
Joined: Tue Sep 17, 2013 6:50 am

Post by Josef Templ »

I will try it.

Is it OK if I also clean up the branch?
I mean, there have been many small commits to this branch leading to a long history of changes.
Cleaning up would mean creating a fresh branch and putting everything into a single commit.

There is also the open issue of encoding external links.

> - may em occur outside of p (as now in table fields)?
Has this been fixed?

- Josef
Ivan Denisov
Posts: 1700
Joined: Tue Sep 17, 2013 12:21 am
Location: Russia

Post by Ivan Denisov »

Josef Templ wrote:Is it OK if I also clean up the branch?
I mean, there have been many small commits to this branch leading to a long history of changes.
Cleaning up would mean creating a fresh branch and putting everything into a single commit.
Yes, I think you can clean up the branch. The many small commits were necessary while testing. Now these steps are not required.
Josef Templ wrote:There is also the open issue of encoding external links.
What is this about? I do not understand.
Josef Templ wrote:> - may em occur outside of p (as now in table fields)?
Has this been fixed?
There was no problem with that. However, <em> should not be used. We are using italic <i>, as in the BlackBox menu.
Josef Templ
Posts: 2047
Joined: Tue Sep 17, 2013 6:50 am

Post by Josef Templ »

> What is this about? I do not understand.

The question was whether it is required to include some form of encoding
for the href attribute of an external link.
My experiments show that it is working well without such an encoding.

I am close to a version to commit.
It was even more work than I had feared.
The module XhtmlExporter is a complicated beast, not something
that one can understand easily. I found another problem with a link
as the very first element in the document. Fixing this led to other
subtle problems and brought me close to a total mess.
In order to get out of that mess I had to redesign/simplify some
procedures and that really helped. The module uses too much code for
more or less trivial stuff. Everything is full of ASSERTs and it is really hard to
see the structure behind it. Ivan, this is of course not your fault but the 1.6 heritage.

I also looked deeply into the undocumented table support and did some tests
after the cleanups.

- Josef
Josef Templ
Posts: 2047
Joined: Tue Sep 17, 2013 6:50 am

Post by Josef Templ »

I have recreated the branch issue-#113 and committed all the changes in a single step.

This was an awful lot of work but I think the code of XhtmlExporter is
a bit more readable and maintainable now.
It is also a bit more tolerant of pairing/nesting errors in the input text
but it can still produce a TRAP in some cases.

I have of course a backup of the previous branch locally.

- Josef
Ivan Denisov
Posts: 1700
Joined: Tue Sep 17, 2013 12:21 am
Location: Russia

Post by Ivan Denisov »

http://validator.w3.org/ says that <title> is an obligatory part of <head>. Maybe there is a way to get the window title from the TextView? There should be!
Josef Templ
Posts: 2047
Joined: Tue Sep 17, 2013 6:50 am

Post by Josef Templ »

Right, we do need a title (omitting it would only be allowed in HTML 5).
But where to get it from?

In the change list generator, for example, there is not even an open text view.
Also the exporter's f:Files.File parameter does not have a name yet.

Here are some weird ideas first:

1. Pass a name as a global variable (or setter procedure) of XhtmlExporter which
is set to #Xhtml:New Page by default. This would allow passing arbitrary
titles independent of the file name, but it would only work for cases
where the exporter is called directly from a program and not from the File->Save As menu item.

2. Another option would be to provide access to the name parameter of Converters.Export
by means of an exported global variable or a getter function.
This would mean using the file name as the title, but it would work for all cases.

3. Use the first line of text in the document as the title.
If there is no text use #Xhtml:New Page. This may work for some cases.
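The three ideas above are alternatives, but they could also be read as a fallback chain. The following Python sketch shows that reading only; the function choose_title, its parameters, and the priority order are assumptions for illustration and do not describe XhtmlExporter or Converters.Export. It also skips blank lines when looking for the first line of text, which the list above does not specify.

```python
DEFAULT_TITLE = "New Page"  # stands in for the #Xhtml:New Page string resource


def choose_title(explicit_title=None, file_name=None, document_text=""):
    """Pick a page title from the most specific source available.

    Order (an assumption of this sketch): explicitly passed title (idea 1),
    then the export file name (idea 2), then the first non-blank line of
    the document text (idea 3), then the default.
    """
    if explicit_title:
        return explicit_title
    if file_name:
        return file_name
    for line in document_text.splitlines():
        stripped = line.strip()
        if stripped:
            return stripped
    return DEFAULT_TITLE
```

Whichever source is chosen, the result is arbitrary text and still has to be encoded before it is written between the title tags.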


A less weird and more natural approach would be to use the window title as Ivan suggested.
Therefore I need to open a text view in the change list generator and
thereby set a title which can be used in the exporter.
I have tried it out and it works well.
Note: since a title may consist of arbitrary characters, it needs to be HTML-encoded.
Otherwise a part of the title could be treated as html markup.
This also holds for attributes such as ids of anchors (StdLinks.Target).
I think that wr.Attr should be changed such that it performs an html encoding
for everything inside the double quotes.
The html encoding is exactly what happens at the level of normal text characters
by means of the entity mapping (XhtmlEntitySets.MapCharToEntity).
The html encoding of attributes is not yet committed. Setting the title from the window
is committed.
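As an illustration of why the encoding matters for both the title and attribute values, here is a small Python sketch using the standard library's html.escape. The helper names render_title, render_attr and make_anchor_id are hypothetical; they are not the wr.Attr or XhtmlEntitySets.MapCharToEntity procedures. The "id" prefix follows the earlier suggestion that generated ids should start with a letter.

```python
from html import escape


def render_title(title: str) -> str:
    """Return a title element with markup characters in the text encoded."""
    return "<title>" + escape(title, quote=False) + "</title>"


def render_attr(name: str, value: str) -> str:
    """Return name="value" with everything inside the double quotes encoded."""
    return name + '="' + escape(value, quote=True) + '"'


def make_anchor_id(raw: str, prefix: str = "id") -> str:
    """Prefix a generated anchor id so that it starts with a letter."""
    return prefix + raw
```

For example, a window title containing markup characters and a numeric anchor id would be written like this:

```python
print(render_title("Changes <draft> & notes"))
print(render_attr("id", make_anchor_id("42")))
```

Without the escaping step, the text <draft> in the title would be read by a browser as a tag rather than as part of the title.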

- Josef
Last edited by Josef Templ on Tue May 31, 2016 2:29 pm, edited 1 time in total.
Ivan Denisov
Posts: 1700
Joined: Tue Sep 17, 2013 12:21 am
Location: Russia

Post by Ivan Denisov »

Josef Templ wrote:A less weird and probably more realistic approach would be to open a text view in
the change list generator and thereby set a title which can be used in the exporter.
I have tried it out and it works well.
This is the best way. How did you get the window title from the TextView?
Ivan Denisov
Posts: 1700
Joined: Tue Sep 17, 2013 12:21 am
Location: Russia

Post by Ivan Denisov »

I checked this version:
http://blackboxframework.org/unstable/i ... b1.547.zip
Now it works fine and passes validation tests.