[wx-dev] Re: #9672: wx 2.9 Ansi/Unicode Combi-wxString breaks compatibility with std::[w]string and introduces subtle errors

wxTrac noreply at wxsite.net
Tue Jul 1 07:34:28 PDT 2008


Ticket URL: <http://trac.wxwidgets.org/ticket/9672#comment:3>

#9672: wx 2.9 Ansi/Unicode Combi-wxString breaks compatibility with std::[w]string
and introduces subtle errors
----------------------------+-----------------------------------------------
  Reporter:  hajokirchhoff  |       Owner:                                               
      Type:  defect         |      Status:  new                                          
  Priority:  normal         |   Milestone:                                               
 Component:  GUI-all        |     Version:  2.9-svn                                      
Resolution:                 |    Keywords:  wxString Ansi Unicode stl boost compatibility
 Blockedby:                 |       Patch:  0                                            
  Blocking:                 |  
----------------------------+-----------------------------------------------
Changes (by hajokirchhoff):

  * status:  infoneeded_new => new


Comment:

 Replying to [comment:2 vadz]:
 > Replying to [ticket:9672 hajokirchhoff]:
 >
 > I see no compelling reason to do it but I do (continue to) see a
 > '''lot''' of posts to wx mailing lists, forums &c from people who don't
 > understand why doesn't
 > {{{
 > wxString foo("foo");
 > }}}
 > What justification can be there for using wxT() around 7 bit ASCII
 > characters? How can this be useful? IMO the answer clearly is that it
 > can't.

 The justification for this is the C++ standard. What better justification
 can there be? In standard C++ a wide string literal must begin with L, so
 wxString foo(L"foo") will compile. In a Unicode build, wxT() is just a more
 complicated way of saying L"".
 If you use --enable-unicode, you are telling wxWidgets that you want to
 use wchar_t instead of char. Consequently you must say L"" instead of "".
 If you don't like this, then use --disable-unicode. Where is the problem?
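
 To make the distinction concrete, here is a minimal sketch, assuming a
 C++11 compiler (for static_assert and decltype). MY_T is a hypothetical
 macro that only mimics what wxT() does in a Unicode build, and make_wide
 is a hypothetical helper; neither is part of wxWidgets.

```cpp
#include <string>
#include <type_traits>

// Hypothetical macro: pastes the L prefix onto a literal, which is what
// wxT() amounts to in a Unicode build.
#define MY_T(x) L##x

// A narrow literal is an array of const char ...
static_assert(std::is_same<decltype("foo"), const char (&)[4]>::value,
              "narrow literal has element type char");
// ... while a wide literal is an array of const wchar_t.
static_assert(std::is_same<decltype(L"foo"), const wchar_t (&)[4]>::value,
              "wide literal has element type wchar_t");
// The macro yields exactly the same type as writing L"foo" by hand.
static_assert(std::is_same<decltype(MY_T("foo")), decltype(L"foo")>::value,
              "MY_T(\"foo\") is the same as L\"foo\"");

// std::wstring accepts the wide literal; std::wstring("foo") would not
// compile, because char and wchar_t are different types.
inline std::wstring make_wide() { return std::wstring(MY_T("foo")); }
```

 The standard string classes make no attempt to hide this difference: the
 narrow and the wide literal are simply two unrelated types.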

 It has been this way for a long time. Why change that? Just because people
 have not yet understood the difference between a wide string literal and a
 string literal?

 If this really is the only reason for this change, then I am all the more
 against it. I think it is a valiant attempt, but it will cause endless
 problems in the long run.

 Also consider this: all other libraries have the same problem. Yet no one
 else has tried to solve it the way wxWidgets does. char and wchar_t are two
 different data types.

 >
 > > The recurring problem is that your new wxString implementation
 > > confuses template mechanics.

 > >
 > > wxString left(L"ä");
 > > wstring right(L"ä");
 > > assert(boost::algorithm::istarts_with(left, right)==true);
 > >
 > > This assert fails even when the locale is set to German, because
 > > istarts_with uses to_upper which uses ctype<_Elem>::to_upper which has a
 > > template specialization for wchar_t but not for wxUniChar.
 >
 > How does it compile then?

 It compiles fine. boost::algorithm::istarts_with is a template function.
 Here is a simplified version:

 template<typename Range1T, typename Range2T>
   inline bool istarts_with(
        const Range1T& Input,
        const Range2T& Test,
        const std::locale& Loc=std::locale())
 {
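    // typename is required because const_iterator is a dependent type
    typename Range1T::const_iterator i=Input.begin();
    typename Range2T::const_iterator j=Test.begin();
    // Compare element by element, ignoring case, until either range ends
    for (; i!=Input.end() && j!=Test.end(); ++i, ++j) {
       if (!is_iequal(Loc)(*i, *j))
          return false;
    }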
    return j==Test.end();
 }

 is_iequal is a function object whose operator() is also a template. Again,
 a simplified version:

 // Member of is_iequal; m_Loc is the locale passed to its constructor
 template <typename T1, typename T2 >
   bool operator()(const T1& Arg1, const T2& Arg2) const
 {
    return std::toupper<T1>(Arg1,m_Loc)==std::toupper<T2>(Arg2,m_Loc);
 }

 As you can see, the template arguments are deduced automatically.
 wxString has a const_iterator, so it compiles fine. Dereferencing a
 wxString::const_iterator yields a wxUniChar. Dereferencing a
 wstring::const_iterator yields a wchar_t.

 So is_iequal compares
    std::toupper<wxUniChar>(Arg1, m_Loc) == std::toupper<wchar_t>(Arg2, m_Loc)

 This comparison fails for "ä": as noted above, ctype has a template
 specialization for wchar_t but not for wxUniChar.
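
 The effect of independent template argument deduction can be reproduced
 without wxWidgets or Boost. What follows is only a sketch under stated
 assumptions: UniChar is a hypothetical stand-in for wxUniChar, upper<T> is
 a hypothetical stand-in for the per-character-type case conversion, and
 its primary template simply returns its argument unchanged. That is a
 simplification chosen for illustration, not a description of what
 std::toupper really does for a type without a ctype specialization.

```cpp
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical stand-in for wxUniChar: wraps one character and converts
// implicitly to and from wchar_t.
struct UniChar {
    wchar_t value;
    UniChar(wchar_t v) : value(v) {}
    operator wchar_t() const { return value; }
};

// Hypothetical case conversion. The primary template knows nothing about
// T and returns the character unchanged (an assumption for this sketch).
template <typename T>
T upper(const T& c) { return c; }

// Only wchar_t has a specialization that actually converts.
template <>
wchar_t upper<wchar_t>(const wchar_t& c) {
    return static_cast<wchar_t>(std::towupper(static_cast<std::wint_t>(c)));
}

// T1 and T2 are deduced independently from the two arguments.
template <typename T1, typename T2>
bool iequal(const T1& a, const T2& b) {
    return upper<T1>(a) == upper<T2>(b);
}

// Same shape as the simplified istarts_with above.
template <typename Range1T, typename Range2T>
bool istarts_with_sketch(const Range1T& input, const Range2T& test) {
    typename Range1T::const_iterator i = input.begin();
    typename Range2T::const_iterator j = test.begin();
    for (; i != input.end() && j != test.end(); ++i, ++j) {
        if (!iequal(*i, *j))
            return false;
    }
    return j == test.end();
}
```

 With these definitions, istarts_with_sketch(std::wstring(L"a"),
 std::wstring(L"A")) is true, while the same call with a
 std::vector<UniChar> holding L'a' on the left is false. Both calls compile
 without any diagnostic, which is exactly what makes the bug hard to spot.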

 > Anyhow, before spending time on debugging this myself, I'd really
 > appreciate if you could explain what the problem really is and how does
 > toupper<wxUniChar> ends up being called. Thanks!

 I hope I have explained it properly this time.


 I know you are trying to make wxWidgets easier to use for people who are
 confused about why wxString("foo") fails to compile. But I think your
 solution is going in the wrong direction. I think it would be better to
 educate people to use wxString(L"foo"), because what you are trying here
 will cause a lot more problems in the long run.

 a) I think there is a reason why other libraries do not attempt to do
 this.
 b) I also think interoperability with other libraries is more important
 than not having to explain why wxString(L"abcd") is the proper C++
 standard way of coding.

 L"abcd" is the new unicode way of doing things. It's C++ standard.

 You are right, wxString no longer inherits from std::[w]string; it only
 used to. Now it is trying to be compatible with std::[w]string but fails
 to do so in subtle and not so subtle ways.

 You are helping C++ beginners who do not (yet) understand the difference
 between a char string literal and a wchar_t string literal. In the process
 you are breaking compatibility with the C++ standard (!) library and with
 one of the most advanced C++ libraries around, Boost.

 I think this is the wrong decision. Seeing that you have put a lot of work
 into the new wxString class already, I can imagine it will not be easy for
 you to reconsider this decision.

 I could live with a version where wxString and string are entirely
 different entities, although I'd really prefer wxString == string. But the
 current implementation is just plain wrong, IMHO.

 If you cannot achieve 100% compatibility with std::[w]string, you should
 have 0% compatibility. Otherwise wxString objects look as if they could be
 used interchangeably with string when they actually cannot. This results
 in bugs that are extremely difficult to find. In my case, a program that
 was working fine suddenly stopped working for German umlauts, even though
 it compiled without a problem.
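
 As an illustration of the "0% compatibility" option, here is a rough
 sketch, not a proposal for the actual wxString interface. OpaqueString,
 to_wstring, istarts_with_w and starts_with_prefix are all hypothetical
 names invented for this example.

```cpp
#include <cwctype>
#include <string>

// Hypothetical string class that is deliberately NOT range-compatible:
// no begin()/end(), no const_iterator, only an explicit conversion.
class OpaqueString {
public:
    explicit OpaqueString(const std::wstring& s) : data_(s) {}
    std::wstring to_wstring() const { return data_; }
private:
    std::wstring data_;
};

// Case-insensitive prefix test on two std::wstring arguments, so both
// sides always have the same element type.
inline bool istarts_with_w(const std::wstring& input, const std::wstring& test) {
    std::wstring::const_iterator i = input.begin();
    std::wstring::const_iterator j = test.begin();
    for (; i != input.end() && j != test.end(); ++i, ++j) {
        if (std::towupper(static_cast<std::wint_t>(*i)) !=
            std::towupper(static_cast<std::wint_t>(*j)))
            return false;
    }
    return j == test.end();
}

inline bool starts_with_prefix(const OpaqueString& text, const std::wstring& prefix) {
    // Passing text directly would not compile, because OpaqueString is not
    // convertible to std::wstring. The conversion has to be spelled out.
    return istarts_with_w(text.to_wstring(), prefix);
}
```

 With such a design the mismatch shows up as a compile error at the call
 site instead of as a silently failing comparison at run time.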



