[wxPython-users] ANN: GUI2Exe for wxPython :-D
Robin Dunn
robin at alldunn.com
Thu Apr 5 17:52:06 PDT 2007
Andrea Gavana wrote:
>> For example, if you have a Unicode build of wxPython, and pass a string
>> object to textctrl.SetValue, then it will use the
>> wx.GetDefaultPyEncoding() encoding to convert it to a Unicode object to
>> pass to the C++ SetValue. The opposite is also true, if you have an
>> ansi build of wxPython and pass a Unicode object then that encoding will
>> be used to convert it to a string first.
>
> Well, let's assume I have a Unicode build of wxPython. Then Roee
> sends me his database of GUI2Exe projects, and he has put in there
> project names with Hebrew letters (or Cyrillic or whatever). My
> database pre-processing will not blink, but on my machines
> wx.GetDefaultPyEncoding() returns either 'cp1251' (here at work) or
> 'ascii' at home. Will it harm if I receive Roee's database?
If you are storing unicode values in the database then no, it won't be a
problem. If you are storing encoded ansi strings in the database then
you should either also store which encoding is being used, or
standardize on something like utf-8 for all database values, so when you
read them from the database later you can use that same encoding to
convert them back to unicode objects for display and manipulation.
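
As a rough illustration of the "standardize on utf-8" option, here is a minimal sketch. A plain dict stands in for the database, and the names STORAGE_ENCODING, to_db and from_db are hypothetical helpers invented for this example, not part of bsddb or wxPython:

```python
# Sketch only: a plain dict stands in for the real database.
STORAGE_ENCODING = "utf-8"  # assumption: one encoding for every stored value


def to_db(text):
    """Encode a unicode object to bytes just before writing it."""
    return text.encode(STORAGE_ENCODING)


def from_db(raw):
    """Decode stored bytes back to a unicode object after reading."""
    return raw.decode(STORAGE_ENCODING)


fake_db = {}
# Hebrew letters plus an accented Latin letter, as in the project names discussed.
project_name = u"\u05e9\u05dc\u05d5\u05dd caf\u00e9"
fake_db["project"] = to_db(project_name)

# Reading back with the same encoding restores the original text.
restored = from_db(fake_db["project"])

# Reading back with a different encoding (here cp1251) does not.
mangled = fake_db["project"].decode("cp1251", "replace")
```

Because the encoding is fixed in one place, the value of wx.GetDefaultPyEncoding() on the machine that reads the data no longer matters for the stored content.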
>
>> For other automatic ansi/unicode conversions that Python does then it
>> uses the return value of sys.getdefaultencoding() for the encoding.
>> Python also provides the sys.getfilesystemencoding() value which
>> specifies what should be used for encoding unicode values into ansi
>> strings to be used for path and file names on the current system.
>
> sys.getdefaultencoding() seems not enough to work with bsddb, also
> Roee reported the same problem. And I found exactly the same when
> assigning project names with accented letters (French, German and
> similar). I saved a project in the database with this strange name,
> closed GUI2Exe, reopened it and the project name was completely
> screwed up :-(
Use sys.getfilesystemencoding() for the filename of the database. For
the content you store, follow the guidelines above: either store unicode
objects, or remember which encoding is used for the string objects.
Here is some playing in PyShell to show it all in action. I used a
DBShelve object so it automatically pickles the objects stored, allowing
me to store a unicode object (pickled to a string) with no extra effort
on my part:
>>>
>>> import bsddb
>>> import bsddb.dbshelve
>>> name = u"/tmp/\u20ac.db"
>>> print name
/tmp/€.db
>>> import sys
>>>
>>> db = bsddb.dbshelve.open(name)
Traceback (most recent call last):
  File "<input>", line 1, in ?
  File "bsddb/dbshelve.py", line 73, in open
UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 5: ordinal not in range(128)
It doesn't like the unicode for the filename, as expected.
>>> db = bsddb.dbshelve.open(name.encode(sys.getfilesystemencoding()))
So converting it to a string using the filesystemencoding makes it
happy, and if I do a listing of that dir, I can see that the OS knows
what the name is too:
$ ll /tmp/*.db
-rw-r----- 1 robind robind 49152 Apr 5 16:48 /tmp/€.db
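
The failure and the fix above can be reproduced without touching the file system. The sketch below assumes a hypothetical helper, encode_path, that defaults to sys.getfilesystemencoding() but accepts an explicit encoding so the result does not depend on the machine it runs on:

```python
import sys


def encode_path(name, encoding=None):
    """Return the byte-string form of a unicode path.

    Hypothetical helper: falls back to sys.getfilesystemencoding()
    when no encoding is given.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    return name.encode(encoding)


name = u"/tmp/\u20ac.db"

# 'ascii' cannot represent the euro sign, which is the error in the traceback.
try:
    encode_path(name, "ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# An encoding that covers the character succeeds; utf-8 is used here
# only so the example is deterministic.
utf8_path = encode_path(name, "utf-8")
```

In real use you would call encode_path(name) with no second argument, so the bytes match whatever the current system expects for file names.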
Now I'll store and retrieve a Unicode object. You can also store any
picklable Python object just as easily:
>>> db['filename'] = name
>>> db.close()
>>> del db
>>>
>>>
>>> db = bsddb.dbshelve.open(name.encode(sys.getfilesystemencoding()))
>>> print db['filename']
/tmp/€.db
>>>
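
For readers without bsddb at hand, the same store-and-retrieve round trip can be sketched with the standard library's shelve module, which also pickles values automatically. This is a stand-in chosen for the example, not the DBShelve class used in the session, and the file name is kept ASCII-only so that the filename-encoding issue stays out of the picture:

```python
import os
import shelve
import shutil
import tempfile

value = u"/tmp/\u20ac.db"  # the unicode object to store

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "demo_shelf")  # ASCII-only name on purpose

# Store the unicode object; shelve pickles it behind the scenes.
db = shelve.open(path)
db["filename"] = value
db.close()

# Reopen the shelf and read the value back.
db = shelve.open(path)
restored = db["filename"]
db.close()

shutil.rmtree(tmpdir)  # remove the temporary files
```

The value comes back as a unicode object equal to the one stored, with no manual encoding step for the content.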
--
Robin Dunn
Software Craftsman
http://wxPython.org Java give you jitters? Relax with wxPython!