[wxPython-users] unicode handling

Tim Roberts timr at probo.com
Thu Aug 3 09:35:53 PDT 2006


On Thu, 3 Aug 2006 17:04:08 +1200, "Thomas Thomas" <thomas at mindz-i.co.nz> wrote:

>
> >If you have control over writing data, I would
> >suggest writing to utf-8
>  
> No, the file is something I receive externally; it is dropped into a
> specific location from where the application reads it.


Then you need to figure out what encoding the file is using.
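
One way to start narrowing that down is sketched below. It is only a rough
heuristic under stated assumptions: guess_encoding is a hypothetical helper
(not part of Python or wxPython), and it only recognises a byte-order mark,
pure ASCII, or valid UTF-8. If none of those match, the file is in some 8-bit
code page, and the bytes alone cannot tell you which one; you have to ask
whoever produced the file.

```python
import codecs


def guess_encoding(raw):
    """Return a likely encoding name for the bytes in raw, or None.

    Hypothetical helper: a heuristic, not a reliable detector.
    """
    # A byte-order mark at the start is the only explicit signal available.
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if raw.startswith(codecs.BOM_UTF16_LE) or raw.startswith(codecs.BOM_UTF16_BE):
        return "utf-16"
    # Pure ASCII decodes the same way under almost every encoding.
    try:
        raw.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    # Text from an 8-bit code page is rarely valid UTF-8 by accident.
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return None  # some 8-bit code page; which one is not knowable here
```

Calling guess_encoding(open(filename, 'rb').read()) on a file saved from
notepad with a £ in it would be expected to return None, which is the cue to
pick the code page yourself.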

>  
> it will be as straightforward as copying and pasting the content below
> into notepad
> ------------------------
> string MetaDataPrompt = "Discovery No";
> string MetaDataFieldName = "Discovery No";
> string MetaDataType = "string";
> string MetaDataValue = "£500";
> string MetaDataPrompt = "comments";
> string MetaDataFieldName = "Comments";
> string MetaDataType = "string";
> string MetaDataValue = "Energy Scope £500";
> -----------------------------------------------------
>  
> and try reading it from that file..


If you do that, you will have an 8-bit file encoded with whatever your
system's default encoding is.  It is NOT a Unicode file.  The issue is
that the £ sign is not in the same place in every encoding.  If you
wrote that file, put it on a floppy, moved it to a Chinese system and
tried to read it, it would show something very different.
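
To make that concrete, here is a small sketch that decodes one and the same
byte under three 8-bit code pages. The byte value 0xA3 is where iso-8859-1
puts the £ sign; the other two code pages are just examples I picked, and they
assign that byte to entirely different characters (ú in cp437, the Cyrillic
letter Ј in cp1251).

```python
# The same single byte, interpreted under three different 8-bit code pages.
POUND_BYTE = b"\xa3"

decoded = {}
for encoding in ("iso-8859-1", "cp437", "cp1251"):
    char = POUND_BYTE.decode(encoding)
    decoded[encoding] = char
    # Print the code point rather than the character itself, so the
    # output is safe on any console.
    print("%-10s -> U+%04X" % (encoding, ord(char)))
```

Only the first line of output shows U+00A3, the pound sign. The bytes on disk
never changed; only the assumption about what they mean did.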

Because of that, Python does not guess at an encoding.  Its default is
plain ASCII, so when it has to convert a byte outside the ASCII range
(0-127) to Unicode, it raises a UnicodeDecodeError.
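
A minimal sketch of that failure, using one line from the sample file. I am
assuming the £ was stored as the single byte 0xA3, and the explicit
decode("ascii") call stands in for the implicit conversion Python attempts
whenever a byte string meets a Unicode string.

```python
# One line of the sample file as raw bytes, with the pound sign as 0xA3.
raw = b'string MetaDataValue = "\xa3500";'

error = None
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc
    # The exception records exactly which bytes could not be decoded.
    print("cannot decode bytes %d-%d as ASCII" % (exc.start, exc.end))

# Naming the encoding explicitly makes the same bytes decode cleanly.
text = raw.decode("iso-8859-1")
```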

It is quite likely that your file is iso-8859-1.  Try:
   import codecs
   inifile = codecs.open(filename, 'r', encoding='iso-8859-1')
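
If you want to convince yourself before pointing it at the real file, the
following self-contained sketch writes one sample line to a temporary file as
iso-8859-1 bytes and reads it back through codecs.open. The temporary file and
the sample text are stand-ins of mine for the file you actually receive.

```python
import codecs
import os
import tempfile

# Stand-in for one line of the externally supplied file.
sample = u'string MetaDataValue = "\u00a3500";\n'

fd, path = tempfile.mkstemp(suffix=".ini")
os.close(fd)
try:
    # Write raw 8-bit bytes, as notepad would on a Western European system.
    with open(path, "wb") as out:
        out.write(sample.encode("iso-8859-1"))

    # Read it back, telling Python which encoding the bytes are in.
    inifile = codecs.open(path, "r", encoding="iso-8859-1")
    try:
        lines = inifile.readlines()
    finally:
        inifile.close()
finally:
    os.remove(path)
```

Every element of lines is a Unicode string, so the £ survives intact and can
be handed to wxPython without further conversion, assuming a Unicode build.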

-- 
Tim Roberts, timr at probo.com
Providenza & Boekelheide, Inc.




