python - Getting the unicode characters of a string
I'm getting a string from a Qt widget, and I'm trying to convert non-ASCII characters (e.g. €) to their hex Unicode code points (e.g. x20ac).
Currently I'm doing this to see the Unicode character:
unicode_text = self.rich_text_edit.toPlainText()  # string containing the € symbol
print("unicode char is: {0}".format(unicode_text))
This gives me an error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 0: ordinal not in range(128)
That's what I want, right there: the 20ac.
How do I get at that?
If I do this:
unicode_text = str(unicode_text).encode('string_escape')
print unicode_text  # returns \xe2\x82\xac
It returns 3 characters, all of them wrong. I'm going round in circles :)
I know it's a basic question, but I've never had to worry about Unicode before.
Many thanks in advance, Ian
\xe2\x82\xac is the UTF-8 encoding of the Unicode code point U+20AC (written u'\u20ac' in Python).
Think of it as follows: Unicode is a one-to-one mapping between integers and characters, similar to ASCII, except that Unicode goes much, much higher in the number of integer-to-character mappings.
Your € symbol has an integer value of 8364 (or 0x20ac in hex), which is far too big to fit in a single 8-bit byte (which can hold only 256 values, 0 to 255), so in UTF-8 the code point 0x20ac is broken down into the 3 individual bytes \xe2\x82\xac.
That is only a high-level overview; I'd recommend you take a look at this first-class explanation by Scott Hanselman: Why the #AskObama Tweet was Garbled on Screen.
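The relationship between the code point and its UTF-8 bytes can be checked directly in the interpreter. The following is a minimal sketch, assuming Python 3 (where every str is already Unicode) rather than the Python 2 used in the question; the variable names are illustrative only.

```python
# Assumes Python 3, where str holds Unicode code points.
euro = "\u20ac"  # the euro sign, written as an escape to avoid source-encoding issues

code_point = ord(euro)             # integer value of the character
utf8_bytes = euro.encode("utf-8")  # the same character encoded as UTF-8 bytes

print(code_point)        # 8364
print(hex(code_point))   # 0x20ac
print(utf8_bytes)        # b'\xe2\x82\xac'
print(len(utf8_bytes))   # 3

# Decoding the three bytes gives back the single original character.
print(utf8_bytes.decode("utf-8") == euro)  # True
```

So 20ac and e2 82 ac are not competing answers: the first is the character's number, the second is one particular way of storing that number as bytes.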
As for your question, you can do:
>>> print "unicode code point is: {0}".format(hex(ord(unicode_text)))
unicode code point is: 0x20ac
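Note that ord() accepts exactly one character, so the line above works only when unicode_text holds a single symbol. For text of arbitrary length coming from the widget, something along these lines might work. It is a sketch under the assumption of Python 3, and hex_code_points and escape_non_ascii are hypothetical helper names, not part of Python or PySide.

```python
def hex_code_points(text):
    """Return the hex code point of every non-ASCII character in text."""
    return [format(ord(ch), "x") for ch in text if ord(ch) > 127]


def escape_non_ascii(text):
    """Replace each non-ASCII character with a backslash-u style escape.

    Code points above U+FFFF produce more than four hex digits here,
    so the result is for display, not a guaranteed Python literal.
    """
    return "".join(
        ch if ord(ch) < 128 else "\\u{0:04x}".format(ord(ch))
        for ch in text
    )


sample = "price: 5\u20ac"  # hypothetical widget text ending in a euro sign
print(hex_code_points(sample))   # ['20ac']
print(escape_non_ascii(sample))  # price: 5\u20ac
```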
python unicode pyside