[wxPython-users] Mystery: wx.grid, a filter function, and Unicode
Bob Klahn
bobstones at comcast.net
Thu Jan 3 17:18:24 PST 2008
Thanks, Jean-Michel, cp850 was just what I needed. Recently it's become
clearer than ever that trying to use anything but Unicode with wxPython
would continue to give me untold grief. So now, after quite a number of
coding changes this evening, my application, which wasn't using explicit
Unicode anywhere, is using Unicode throughout.
BTW, one of the coding changes was to replace my makefilter function
with this CharFilter class:
class CharFilter(object):
    """
    Given a string of Unicode characters to (a) keep or (b) delete,
    build a filtering function that, applied to any string s,
    returns a copy of s containing
    (a) only the characters to be kept, or
    (b) all but the characters to be deleted.
    """
    def __init__(self, chars, delete=True):
        self.chars = set(map(ord, chars))
        self.delete = delete
    def __getitem__(self, n):
        if self.delete:
            if n in self.chars: return None
        else:
            if n not in self.chars: return None
        return unichr(n)
    def __call__(self, s):
        return unicode(s).translate(self)
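For anyone reading this on Python 3, where unichr and unicode no longer
exist, a minimal sketch of the same idea might look like the following.
It relies on str.translate accepting any object with __getitem__; the
names strip_vowels and digits_only are made up for illustration and are
not part of my application.

```python
class CharFilter:
    """Python 3 sketch of the class above: keep or delete a set of characters."""

    def __init__(self, chars, delete=True):
        self.chars = set(map(ord, chars))  # code points to act on
        self.delete = delete

    def __getitem__(self, n):
        # str.translate() looks up each code point here;
        # returning None drops that character from the result.
        if (n in self.chars) == self.delete:
            return None
        return chr(n)

    def __call__(self, s):
        return str(s).translate(self)


# Hypothetical filters, for illustration only.
strip_vowels = CharFilter("aeiou")                    # delete mode
digits_only = CharFilter("0123456789", delete=False)  # keep mode

print(strip_vowels("Déjà vu"))     # Déjà v
print(digits_only("Jan 3, 2008"))  # 32008
```

Accented characters pass through the delete-mode filter untouched because
only the listed code points are dropped.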
Thanks again for your help! I wasn't aware of cp850.
Bob
At 05:46 PM 1/3/2008, Jean-Michel wrote:
>Bob Klahn wrote:
>
>How should I handle phrases such as "Déjà
>vu"? Externally, the é and à are recorded as hex
>82 and hex 85 respectively; internally, they
>should presumably be hex E9 and hex E0
>respectively. How do I get "Déjà vu" back from
>Unicode to extended ASCII?
>
>---
>
>From what I read, é: hex82, à: hex85, it *probably* comes from the
>table cp850, used in the DOS world. The vertical bar, hexB3 is
>not a char, but is a "drawing character" used in the DOS world.
>See http://fr.wikipedia.org/wiki/Page_de_code_850
>
>However, I should say cp850 does not fit exactly with your proposed
>"extended ASCII" table.
>
>In a DOS box on my win platform setup for Western European
>Languages (cp850), Python yields this (one recognizes 82 and 85):
>
> >>> s = 'Déjà vu'
> >>> s
>'D\x82j\x85 vu'
> >>>
>
>Now on Windows using cp1252 as table, I can mimic the
>'Déjà vu' string and convert it to a unicode.
>
> >>> s = 'Déjà vu'
> >>> s
>'D\xe9j\xe0 vu'
> >>> isinstance(s, str)
>True
> >>> u = s.decode('cp1252')
> >>> isinstance(u, unicode)
>True
>
>Once the unicode is created, it should be possible to
>convert it into something else.
>
> >>> s = u.encode('cp850')
> >>> s
>'D\x82j\x85 vu'
> >>> isinstance(s, str)
>True
>
>82 and 85 again!
>
>or
>
> >>> u.encode('cp1252')
>'D\xe9j\xe0 vu'
> >>> u.encode('iso-8859-1')
>'D\xe9j\xe0 vu'
> >>> u.encode('utf-8')
>'D\xc3\xa9j\xc3\xa0 vu'
> >>> u.encode('utf-16')
>'\xff\xfeD\x00\xe9\x00j\x00\xe0\x00 \x00v\x00u\x00'
> >>> u.encode('raw_unicode_escape')
>'D\xe9j\xe0 vu'
>
>A side note: this encoding/decoding job is done at the Python
>level and has nothing to do with whether the wxPython build is
>ANSI or unicode. It is up to you whether you prefer to work with
>the ANSI or the unicode build.
>
>Hope that helps.
>
>Jean-Michel Fauth, Switzerland
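
The interpreter sessions quoted above are Python 2, where str holds bytes
and unicode holds text. As a rough sketch of the same round trip on
Python 3, assuming the external data really is cp850 as suggested, the
bytes are decoded once on the way in and encoded again only on the way
out. The variable names raw and text are illustrative.

```python
# Bytes as a DOS/cp850 program would have written "Déjà vu".
raw = b"D\x82j\x85 vu"

# Decode once at the boundary: bytes -> str (Unicode text).
text = raw.decode("cp850")

# Re-encode the same text for whichever target is needed.
print(text.encode("cp850"))   # b'D\x82j\x85 vu'
print(text.encode("cp1252"))  # b'D\xe9j\xe0 vu'
print(text.encode("utf-8"))   # b'D\xc3\xa9j\xc3\xa0 vu'
```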