issue-#133 Encoding/Decoding unicode filenames & paths

Ivan Denisov
Posts: 1700
Joined: Tue Sep 17, 2013 12:21 am
Location: Russia

Re: issue-#133 Encoding/Decoding unicode filenames & paths

Post by Ivan Denisov »

Josef Templ wrote:Ivan, did you try to decode my test file under BB1.6 with Russian Windows?

What are the file names that you get?

This test would show if full Latin-1 support was available in 1.6 and 1.7.
If we want to retain Latin-1 support as it was before, we would have to do a little
more than what Helmut proposed, I think.

I just tried BB1.6 with your example, and everything looks fine. The decoded file names are:

xxxxxÄÖÜyyyy
xxxxxäöüyyyy2
Josef Templ
Posts: 2047
Joined: Tue Sep 17, 2013 6:50 am

Re: issue-#133 Encoding/Decoding unicode filenames & paths

Post by Josef Templ »

Thanks Ivan, this means that there was full Latin-1 (8-bit) support in BB1.6 but not 16-bit Unicode support.

With the current approach, if we encode a filename that uses 8-bit characters, it will no
longer be possible to decode it with BB1.6, because the name is converted to UTF-8.
UTF-8 encodes every character above 7-bit ASCII as a multi-byte sequence, so those bytes no longer match Latin-1.
Filenames that use only 7-bit ASCII characters remain fully compatible with BB1.6.
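
The compatibility point above can be illustrated outside BlackBox. The following is only a minimal Python sketch, not BlackBox code: the helper encode_both and the sample names are assumptions made for illustration (the second name mirrors the test file name quoted earlier in the thread). It compares the Latin-1 and UTF-8 byte sequences of a pure ASCII name and of a name with 8-bit characters.

```python
def encode_both(name):
    """Return (latin1_bytes, utf8_bytes) for a file name (hypothetical helper)."""
    return name.encode("latin-1"), name.encode("utf-8")


ascii_name = "xxxxxyyyy"
latin1_name = "xxxxx\u00c4\u00d6\u00dcyyyy"  # xxxxxÄÖÜyyyy

ascii_l1, ascii_u8 = encode_both(ascii_name)
name_l1, name_u8 = encode_both(latin1_name)

# Pure 7-bit ASCII: both encodings produce identical bytes.
ascii_identical = ascii_l1 == ascii_u8

# 8-bit characters: Latin-1 uses one byte each, UTF-8 uses two.
latin1_len = len(name_l1)  # 12 bytes
utf8_len = len(name_u8)    # 15 bytes

# What a reader that only understands Latin-1 would see
# when handed the UTF-8 bytes: each umlaut turns into two characters.
misread = name_u8.decode("latin-1")

print(ascii_identical, latin1_len, utf8_len, misread == latin1_name)
```

A reader that interprets each byte as one Latin-1 character therefore gets the ASCII name back unchanged, but gets a longer, garbled name whenever the original contained characters above 7-bit ASCII.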

I don't know if this is a problem but we should be aware of this fact and make an informed decision.

The latest commit includes the changes proposed by Helmut with minor adaptations:
- initialization of name in ReadHeader
- conversion refactored into separate procedure Utf8ToString
- Utf8ToString conversion as late as possible.
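
For readers unfamiliar with what such a conversion involves, here is a minimal Python sketch of a UTF-8 decoder. It is not the Utf8ToString procedure from the commit: the function name utf8_to_string, its error handling, and the restriction to 1- to 3-byte sequences (enough for 16-bit characters) are all assumptions made for illustration.

```python
def utf8_to_string(data):
    """Decode UTF-8 bytes into a string (illustrative sketch only).

    Handles 1- to 3-byte sequences, i.e. code points up to 0xFFFF.
    Does not reject overlong encodings; a real decoder should.
    """
    chars = []
    i = 0
    n = len(data)
    while i < n:
        b0 = data[i]
        if b0 < 0x80:                # 0xxxxxxx: 7-bit ASCII, one byte
            code, extra = b0, 0
        elif 0xC0 <= b0 < 0xE0:      # 110xxxxx: one continuation byte follows
            code, extra = b0 & 0x1F, 1
        elif 0xE0 <= b0 < 0xF0:      # 1110xxxx: two continuation bytes follow
            code, extra = b0 & 0x0F, 2
        else:
            raise ValueError("unsupported lead byte at offset %d" % i)
        if i + extra >= n:
            raise ValueError("truncated sequence at offset %d" % i)
        for k in range(1, extra + 1):
            b = data[i + k]
            if b & 0xC0 != 0x80:     # continuation bytes look like 10xxxxxx
                raise ValueError("bad continuation byte at offset %d" % (i + k))
            code = (code << 6) | (b & 0x3F)
        chars.append(chr(code))
        i += extra + 1
    return "".join(chars)


raw_name = "xxxxx\u00e4\u00f6\u00fcyyyy2".encode("utf-8")  # bytes as stored
print(utf8_to_string(raw_name) == "xxxxx\u00e4\u00f6\u00fcyyyy2")
```

In this sketch the name stays a plain byte sequence until the decoded string is actually needed, which is one way to read "conversion as late as possible".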

See the diffs at http://redmine.blackboxframework.org/pr ... 8&type=sbs.

- Josef
Robert
Posts: 1024
Joined: Sat Sep 28, 2013 11:04 am
Location: Edinburgh, Scotland

Re: issue-#133 Encoding/Decoding unicode filenames & paths

Post by Robert »

Josef Templ wrote:With the current approach if we encode a filename which uses 8-bit characters it will no
longer be possible to decode it with BB1.6 ...

I don't know if this is a problem but we should be aware of this fact and make an informed decision.

I think this is the best approach. Always insisting on both forward and backward compatibility limits progress.
Zinn
Posts: 476
Joined: Tue Mar 25, 2014 5:56 pm
Location: Frankfurt am Main

Re: issue-#133 Encoding/Decoding unicode filenames & paths

Post by Zinn »

Josef’s solution is ingenious. He always improves it in the right way.
- Helmut
Josef Templ
Posts: 2047
Joined: Tue Sep 17, 2013 6:50 am

Re: issue-#133 Encoding/Decoding unicode filenames & paths

Post by Josef Templ »

For me, the current solution is compatible enough.
Anything more compatible would require a new version tag, I think, and would be
more complicated.

So I am ready to vote on this issue.

- Josef
DGDanforth
Posts: 1061
Joined: Tue Sep 17, 2013 1:16 am
Location: Palo Alto, California, USA

Re: issue-#133 Encoding/Decoding unicode filenames & paths

Post by DGDanforth »

Vote has been created.