Sunday, 15 March 2015

python - Getting the unicode characters of a string




I'm getting a string from a Qt widget, and I'm trying to convert non-ASCII characters (e.g. €) to hex Unicode code points (e.g. x20ac).

Currently I'm doing this to see the Unicode character:

    unicode_text = self.rich_text_edit.toPlainText()  # this string is the € symbol
    print("unicode char is: {0}".format(unicode_text))

This gives me the error:

    UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 0: ordinal not in range(128)

That's what I want, right there: 20ac.

How do I get at that?

If I do this:

    unicode_text = str(unicode_text).encode('string_escape')
    print unicode_text  # returns \xe2\x82\xac

It returns 3 characters, all of them wrong, so I'm going round in circles :)

I know it's a basic question, but I've never had to worry about Unicode before.

Many thanks in advance, Ian

\xe2\x82\xac is the UTF-8 encoding of the Unicode code point \x20ac (U+20AC).

Think of it as follows: Unicode is a one-to-one mapping between an integer and a character, similar to ASCII, except that Unicode goes much, much higher in the number of integer-to-character mappings.
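To make that mapping concrete, here is a minimal sketch of the round trip between a character and its integer. It is written for Python 3; on Python 2, as used in the question, `unichr` and a `u"..."` literal would be needed instead. The variable names are just for illustration.

```python
# Character <-> integer round trip (Python 3 syntax).
euro = "\u20ac"              # the € character, written as an escape

code_point = ord(euro)       # character -> integer
print(code_point)            # 8364
print(hex(code_point))       # 0x20ac

# integer -> character brings us back to the same symbol
print(chr(code_point) == euro)  # True

# ASCII characters follow the same scheme, just with small numbers.
print(ord("A"))              # 65
```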

Your symbol has an integer value of 8364 (or \x20ac in hex), which is far too big to fit in an 8-bit value (0 to 255), so \x20ac is broken down into 3 individual bytes: \xe2\x82\xac. That is only a high-level overview; I'd recommend taking a look at this first-class explanation by Scott Hanselman:

Why the #AskObama Tweet was Garbled on Screen.
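If it helps to see where the three bytes come from, the following sketch (Python 3 syntax, names chosen for illustration) encodes the character with the built-in codec and then rebuilds the same bytes by hand using the standard three-byte UTF-8 layout `1110xxxx 10xxxxxx 10xxxxxx`. It only covers code points in the three-byte range, so treat it as a demonstration rather than a general encoder.

```python
euro = "\u20ac"

# Let Python do the encoding: one character becomes three bytes.
encoded = euro.encode("utf-8")
print(len(encoded))                  # 3
print([hex(b) for b in encoded])     # ['0xe2', '0x82', '0xac']

# Rebuild the same bytes manually from the code point's bits.
cp = ord(euro)                       # 0x20ac
b1 = 0xE0 | (cp >> 12)               # top 4 bits behind the 1110 prefix
b2 = 0x80 | ((cp >> 6) & 0x3F)       # middle 6 bits behind the 10 prefix
b3 = 0x80 | (cp & 0x3F)              # low 6 bits behind the 10 prefix
manual = bytes([b1, b2, b3])

print(manual == encoded)             # True
print(encoded.decode("utf-8") == euro)  # True
```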

As for your question, you can do:

    >>> print "unicode code point is: {0}".format(hex(ord(unicode_text)))
    unicode code point is: 0x20ac
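Note that `ord` only accepts a single character, so the line above works because the text is just the € symbol. For a longer string from the widget, something like the sketch below could be used. `escape_non_ascii` is a hypothetical helper name, not part of Python or Qt, and the code is written for Python 3.

```python
def escape_non_ascii(text):
    """Replace each non-ASCII character with \\x plus its hex code point."""
    parts = []
    for ch in text:
        if ord(ch) < 128:
            parts.append(ch)  # plain ASCII is kept as-is
        else:
            parts.append("\\x{0:04x}".format(ord(ch)))
    return "".join(parts)


sample = "cost \u20ac 5"
escaped = escape_non_ascii(sample)
print(escaped)   # cost \x20ac 5

# The built-in "backslashreplace" error handler does something similar,
# but it emits \uXXXX escapes instead of the \x form used in the question.
builtin = sample.encode("ascii", "backslashreplace").decode("ascii")
print(builtin)   # cost \u20ac 5
```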

python unicode pyside
