[wx-dev] UTF-8 development plans

Julian Smart julian at anthemion.co.uk
Mon Mar 5 09:52:34 PST 2007


Hi Vadim,

Vadim Zeitlin wrote:
>  Hello,
>
>  I've updated
>
> http://www.wxwidgets.org/wiki/index.php/Development:_UTF-8_Support
>   
Many thanks for the notes, and best of luck with the implementation!
> with the latest thoughts about what we intend to do and how and I hope that
> this is really the final version of the plan. So it's time to start
> implementing it. We can start with some minor changes but relatively
> quickly big changes (e.g. changing the return type of c_str()) would need
> to be done and from then there probably won't be any way to make another
> release until this work is fully finished. To be honest, I don't think we
> are going to make such a release any time soon anyhow but I still propose to
> create a WX_2_10_BRANCH right now just in case.
>   
Right.
>  I also think that the changes in the wx API will be so extensive that it
> will make sense to call the next wx version 3.0 and not 2.10 or 2.12. What
> do you think?
>   
Sounds fine, we've been on version 2.x for so long anyway...

A few questions about the Plan:

(1) Will we be able to compile under Unix using old-style Unicode, as a
fallback for legacy apps? Or would that be too hard to maintain? Similarly,
could we compile in UTF-8 mode on Windows so we can replicate and debug
UTF-8-related bugs on that platform?

(2) What about code that uses a wxString to store arbitrary binary data?
Presumably the UTF-8 processing would get very confused, especially when
using the [] operator and iterators? (I'm not sure I have any code like
this but it's not impossible.)

(3) Some string-manipulation code may look ahead or behind a few
characters, e.g. s[i+1]. I'm not sure if your optimizations will cope with
that, in which case, how about having a simple wxString-like class that
allows you to quickly adapt existing code, e.g. change:

wxString s(somestring);
for (size_t i = 0; i < s.Length(); i++)
{
    wxChar ch = s[i];
    ...
}

to:

wxIndexedString s(somestring);
for (size_t i = 0; i < s.Length(); i++)
{
    wxChar ch = s[i];
    ...
}

wxIndexedString would store Unicode as an array of wxChar and might have a
few commonly-used methods for character access, but not much else.
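
To make the idea a bit more concrete, here is a rough standalone sketch of
what such a class might look like. It is only an illustration: the name
IndexedString, the use of std::vector<char32_t> in place of an array of
wxChar, and the policy of mapping invalid bytes to U+FFFD are all my
assumptions, not anything in the Plan or in wx. It decodes the UTF-8 once
in the constructor so that s[i] and s[i+1] are plain array accesses
afterwards. It does not reject overlong encodings or surrogates.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch, not a wxWidgets class: decodes UTF-8 once into
// fixed-width code points so that s[i] and s[i + 1] are O(1).
class IndexedString
{
public:
    explicit IndexedString(const std::string& utf8)
    {
        std::size_t i = 0;
        while (i < utf8.size())
        {
            const unsigned char lead = static_cast<unsigned char>(utf8[i]);
            std::size_t len = 1;
            char32_t cp = 0xFFFD; // replacement character for invalid input

            // The lead byte tells us how many bytes the sequence occupies.
            if (lead < 0x80)                { len = 1; cp = lead; }
            else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; }
            else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; }
            else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; }

            // Each continuation byte must look like 10xxxxxx.
            bool ok = (i + len <= utf8.size());
            for (std::size_t k = 1; ok && k < len; ++k)
            {
                const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
                if ((c & 0xC0) != 0x80)
                    ok = false;
                else
                    cp = (cp << 6) | (c & 0x3F);
            }

            // Assumption: a bad or truncated sequence becomes one U+FFFD
            // and we resynchronise on the next byte.
            if (!ok) { cp = 0xFFFD; len = 1; }

            m_chars.push_back(cp);
            i += len;
        }
    }

    // Number of decoded characters, not bytes.
    std::size_t Length() const { return m_chars.size(); }

    // Constant-time access, so look-ahead such as s[i + 1] is cheap.
    char32_t operator[](std::size_t i) const
    {
        assert(i < m_chars.size());
        return m_chars[i];
    }

private:
    std::vector<char32_t> m_chars;
};
```

The cost is one full decode plus up to four bytes per character, which is
why I'd only expect it to be used locally in code that really needs random
access. Note also that feeding it arbitrary binary data, as in question
(2), does not round-trip: invalid bytes are replaced rather than preserved.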

Best regards,

Julian



