[wx-dev] Re: #9672: wx 2.9 Ansi/Unicode Combi-wxString breaks
compatibility with std::[w]string and introduces subtle errors
wxTrac
noreply at wxsite.net
Tue Jul 1 07:34:28 PDT 2008
Ticket URL: <http://trac.wxwidgets.org/ticket/9672#comment:3>
#9672: wx 2.9 Ansi/Unicode Combi-wxString breaks compatibility with std::[w]string
and introduces subtle errors
----------------------------+-----------------------------------------------
Reporter: hajokirchhoff | Owner:
Type: defect | Status: new
Priority: normal | Milestone:
Component: GUI-all | Version: 2.9-svn
Resolution: | Keywords: wxString Ansi Unicode stl boost compatibility
Blockedby: | Patch: 0
Blocking: |
----------------------------+-----------------------------------------------
Changes (by hajokirchhoff):
* status: infoneeded_new => new
Comment:
Replying to [comment:2 vadz]:
> Replying to [ticket:9672 hajokirchhoff]:
>
> I see no compelling reason to do it but I do (continue to) see a
> '''lot''' of posts to wx mailing lists, forums &c from people who don't
> understand why doesn't
> {{{
> wxString foo("foo");
> }}}
> What justification can be there for using wxT() around 7 bit ASCII
> characters? How can this be useful? IMO the answer clearly is that it
> can't.
The justification for this is the C++ standard. What better justification
can there be? Wide (Unicode) string literals must begin with L, so wxString
foo(L"foo") will compile. wxT() is just a more complicated way of saying
L"".
If you use --enable-unicode, you are telling wxWidgets that you want to
use wchar_t instead of char. Consequently you must say L"" instead of "".
If you don't like this, then use --disable-unicode. Where is the problem?
It has been this way for a long time. Why change that? Just because people
have not yet understood the difference between a wide string literal and a
string literal?
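To make that difference concrete, here is a minimal sketch in plain standard C++ (no wxWidgets involved). The helper names literal_kind and std_strings_accept_matching_literals are made up for illustration only. It shows that overload resolution already treats the two literal kinds as unrelated types:

```cpp
#include <cassert>
#include <string>

// Overload resolution picks a function based on the literal's element type.
// Both helpers are hypothetical and exist only for this illustration.
inline char literal_kind(const char*)    { return 'n'; }  // narrow literal
inline char literal_kind(const wchar_t*) { return 'w'; }  // wide literal

// Each standard string type accepts only its own kind of literal.
inline bool std_strings_accept_matching_literals()
{
    std::string  narrow("foo");   // std::string(L"foo") would not compile
    std::wstring wide(L"foo");    // std::wstring("foo") would not compile
    return narrow.size() == 3 && wide.size() == 3;
}
```

Calling literal_kind("foo") selects the char overload and literal_kind(L"foo") selects the wchar_t overload; there is no implicit conversion between the two.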
If this really is the only reason for this change, then I am all the more
against it. I think it is a valiant attempt, but will cause problems
without end in the long run.
Also consider this: all other libraries have the same problem. Yet no one
tried to solve it the way wxWidgets did. char and wchar_t are two
different data types.
>
> > The recurring problem is that your new wxString implementation
> > confuses template mechanics.
> >
> > wxString left(L"ä");
> > wstring right(L"ä");
> > assert(boost::algorithm::istarts_with(left, right)==true);
> >
> > This assert fails even when the locale is set to German, because
> > istarts_with uses to_upper which uses ctype<_Elem>::to_upper which has a
> > template specialization for wchar_t but not for wxUniChar.
>
> How does it compile then?
It compiles fine. boost::algorithm::istarts_with is a template function.
Here is a simplified version:
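template<typename Range1T, typename Range2T>
inline bool istarts_with(
    const Range1T& Input,
    const Range2T& Test,
    const std::locale& Loc=std::locale())
{
    // typename is required because const_iterator is a dependent type
    typename Range1T::const_iterator i=Input.begin();
    typename Range2T::const_iterator j=Test.begin();
    // walk both ranges in step, comparing element by element
    for (; i!=Input.end() && j!=Test.end(); ++i, ++j) {
        if (!is_iequal(Loc)(*i, *j))
            return false;
    }
    // true only if all of Test was matched
    return j==Test.end();
}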
is_iequal is a function object whose operator() is also a template. Again,
a simplified version:
template <typename T1, typename T2>
bool operator()(const T1& Arg1, const T2& Arg2)
{
    return std::toupper<T1>(Arg1,m_Loc)==std::toupper<T2>(Arg2,m_Loc);
}
As you can see, the template arguments are deduced automatically.
wxString has a const_iterator, so it compiles fine. Dereferencing a
wxString::const_iterator yields a wxUniChar. Dereferencing a
wstring::const_iterator yields a wchar_t.
So is_iequal compares
    std::toupper<wxUniChar>(Arg1, m_Loc) == std::toupper<wchar_t>(Arg2, m_Loc)
This fails.
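The mechanism can be reproduced without wxWidgets or Boost. What follows is only a sketch under stated assumptions: UniCharLike is a hypothetical stand-in for a wrapper class such as wxUniChar, and upper() and iequal() are made-up names. In particular, the primary upper() template simply returns its argument unchanged, which is an assumption chosen to keep the example small and is not how the standard library or Boost behave internally. The point is that each argument type is deduced separately, so only the wchar_t side gets case mapping:

```cpp
#include <cassert>
#include <cwctype>

// Hypothetical stand-in for a character wrapper class such as wxUniChar.
struct UniCharLike {
    wchar_t value;
    explicit UniCharLike(wchar_t v) : value(v) {}
};

// Primary template (assumption for this sketch): nothing is known about
// the case rules of T, so the argument is returned unchanged.
template <typename T>
T upper(const T& c) { return c; }

// Specialization for wchar_t only: uses the C library's case mapping.
template <>
inline wchar_t upper<wchar_t>(const wchar_t& c)
{
    return static_cast<wchar_t>(std::towupper(static_cast<std::wint_t>(c)));
}

// Lets the wrapper be compared with a plain wchar_t, so mixed code compiles.
inline bool operator==(const UniCharLike& a, wchar_t b) { return a.value == b; }

// Generic comparison: T1 and T2 are deduced independently from the arguments.
template <typename T1, typename T2>
bool iequal(const T1& a, const T2& b)
{
    return upper<T1>(a) == upper<T2>(b);
}
```

With two wchar_t arguments, iequal(L'a', L'A') takes the specialized path on both sides and reports a match. With iequal(UniCharLike(L'a'), L'A') the code still compiles, but the left side goes through the unspecialized template and the comparison reports a mismatch.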
> Anyhow, before spending time on debugging this myself, I'd really
> appreciate if you could explain what the problem really is and how does
> toupper<wxUniChar> ends up being called. Thanks!
I hope I could explain it this time.
I know you are trying to make wxWidgets easier to use for people who get
confused why wxString("foo") fails to compile. But I think your solution
is going in the wrong direction. I think it would be better to educate
people to use wxString(L"foo"), because what you are trying here will
cause a lot more problems in the long run.
a) I think there is a reason why other libraries do not attempt to do
this.
b) I also think interoperability with other libraries is more important
than not having to explain why wxString(L"abcd") is the proper C++
standard way of coding.
L"abcd" is the Unicode way of doing things. It's C++ standard.
You are right, wxString no longer inherits from std::[w]string, it
only used to. Now it is trying to be compatible with std::[w]string but
fails to do so in subtle and not so subtle ways.
You are helping C++ beginners that do not (yet) understand the difference
between a char string literal and a wchar_t string literal. In the process
you are breaking compatibility with the C++ standard (!) library and with
one of the most advanced C++ libraries around, boost.
I think this is the wrong decision. Seeing that you have put a lot of work
into the new wxString class already, I can imagine it will not be easy for
you to reconsider this decision.
I could live with a version where wxString and string are entirely
different entities, although I'd really prefer wxString == string. But the
current implementation is just plain wrong, IMHO.
If you cannot achieve 100% compatibility with std::[w]string, you should
have 0% compatibility. Otherwise wxString objects look as if they could be
used interchangeably with string when they actually cannot. This results
in bugs that are extremely difficult to find. In my case, a program that
was working fine suddenly stopped working for German umlauts, even though
it compiled without a problem.