
【Solved】In Python, after a string is turned into a dict with eval or ast.literal_eval, its Unicode characters show up in \xXX format

StringEncoding crifan 5648 views 0 comments


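【Background】

After converting a dict-like string into the expected dict variable with eval or ast.literal_eval, the value of one of its keys, content, which was originally UTF-8 encoded Unicode text, came out as \xXX-style hexadecimal escapes.

For example: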

 {“err_no”:0,”err_msg”:”success”,”total_count”:”4″,”response_count”:4,”err_desc”:”success”,”body”:{“total_count”:4,”real_ret_count”:4,”data”:[{“reply_count”:0,”score”:0,”favor”:0,”is_top”:0,”like_count”:0,”dislike_count”:0,”create_time”:”1323940382″,”user_id”:”39390080″,”user_name”:”againinput6″,”user_ip”:”58.240.236.19″,”area”:””,”title”:””,”content”:”此评论只是测试而已。。。”,”reserved1″:0,”reserved2″:0,”mdatetime”:”1323940382″,”cdatetime”:1323940382,”un”:”againinput6″,”reply_id_enc”:”8db1cb13d6590233dd540179″,”thread_id_enc”:”b48f8c541640ec1a3a2935d7″,”parent_id_enc”:”ab64034f78f0f736afc3ab64″,”portrait”:”800b616761696e696e707574365902″,”sexy_time”:”15小时前”},{“reply_count”:0,”score”:0,”favor”:0,”is_top”:0,”like_count”:0,”dislike_count”:0,”create_time”:”1323941023″,”user_id”:”39390080″,”user_name”:”againinput6″,”user_ip”:”58.240.236.19″,”area”:””,”title”:””,”content”:”测试添加一个双引号,cn=“中文双引号”,en=“英文双引号“,测试完毕。”,”reserved1″:0,”reserved2″:0,”mdatetime”:”1323941023″,”cdatetime”:1323941023,”un”:”againinput6″,”reply_id_enc”:”78f0f73658cd72210a55a97b”,”thread_id_enc”:”b48f8c541640ec1a3a2935d7″,”parent_id_enc”:”ab64034f78f0f736afc3ab64″,”portrait”:”800b616761696e696e707574365902″,”sexy_time”:”15小时前”},{“reply_count”:0,”score”:0,”favor”:0,”is_top”:0,”like_count”:0,”dislike_count”:0,”create_time”:”1323944383″,”user_id”:”39390080″,”user_name”:”againinput6″,”user_ip”:”58.240.236.19″,”area”:””,”title”:””,”content”:”再次测试一下英文的单引号,enSQ=’English Single 
Quote'”,”reserved1″:0,”reserved2″:0,”mdatetime”:”1323944383″,”cdatetime”:1323944383,”un”:”againinput6″,”reply_id_enc”:”9b504fc250748e25e5dd3b42″,”thread_id_enc”:”b48f8c541640ec1a3a2935d7″,”parent_id_enc”:”ab64034f78f0f736afc3ab64″,”portrait”:”800b616761696e696e707574365902″,”sexy_time”:”14小时前”},{“reply_count”:0,”score”:0,”favor”:0,”is_top”:0,”like_count”:0,”dislike_count”:0,”create_time”:”1323944826″,”user_id”:”39390080″,”user_name”:”againinput6″,”user_ip”:”58.240.236.19″,”area”:””,”title”:””,”content”:”一早进入你的空间就听到了你的背景歌曲,的确不错,我也是喜欢《空位》更多一些,也许跟之前喜欢许巍的歌曲有关系吧,呵呵,一种可以透过肌肤,沁入心灵的旋律和声音,很不错。谢谢你推荐好歌给大家,原来你写了这么多的东西,经营着这么多的园地,辛苦了。“我一直在寻找寻找那个空位”,很不错,一如自己的状态,寻找那个空位,寻找属于自己的空位,一直一直在寻找……22222222222222222222222222″,”reserved1″:0,”reserved2″:0,”mdatetime”:”1323944826″,”cdatetime”:1323944826,”un”:”againinput6″,”reply_id_enc”:”2edda3cc04ce860001e92843″,”thread_id_enc”:”b48f8c541640ec1a3a2935d7″,”parent_id_enc”:”ab64034f78f0f736afc3ab64″,”portrait”:”800b616761696e696e707574365902″,”sexy_time”:”14小时前”}]}}

   became:

 {‘body’: {‘total_count’: 4, ‘real_ret_count’: 4, ‘data’: [{‘thread_id_enc’: ‘b48f8c541640ec1a3a2935d7’, ‘reply_count’: 0, ‘reply_id_enc’: ‘8db1cb13d6590233dd540179’, ‘sexy_time’: ’15xe5xb0x8fxe6x97xb6xe5x89x8d’, ‘like_count’: 0, ‘mdatetime’: ‘1323940382’, ‘portrait’: ‘800b616761696e696e707574365902’, ‘reserved1’: 0, ‘reserved2’: 0, ‘dislike_count’: 0, ‘user_id’: ‘39390080’, ‘area’: ”, ‘user_ip’: ‘58.240.236.19’, ‘favor’: 0, ‘score’: 0, ‘create_time’: ‘1323940382’, ‘content’: ‘xe6xadxa4xe8xafx84xe8xaexbaxe5x8fxaaxe6x98xafxe6xb5x8bxe8xafx95xe8x80x8cxe5xb7xb2xe3x80x82xe3x80x82xe3x80x82’, ‘user_name’: ‘againinput6’, ‘is_top’: 0, ‘parent_id_enc’: ‘ab64034f78f0f736afc3ab64’, ‘cdatetime’: 1323940382, ‘title’: ”, ‘un’: ‘againinput6’}, {‘thread_id_enc’: ‘b48f8c541640ec1a3a2935d7’, ‘reply_count’: 0, ‘reply_id_enc’: ’78f0f73658cd72210a55a97b’, ‘sexy_time’: ’15xe5xb0x8fxe6x97xb6xe5x89x8d’, ‘like_count’: 0, ‘mdatetime’: ‘1323941023’, ‘portrait’: ‘800b616761696e696e707574365902’, ‘reserved1’: 0, ‘reserved2’: 0, ‘dislike_count’: 0, ‘user_id’: ‘39390080’, ‘area’: ”, ‘user_ip’: ‘58.240.236.19’, ‘favor’: 0, ‘score’: 0, ‘create_time’: ‘1323941023’, ‘content’: ‘xe6xb5x8bxe8xafx95xe6xb7xbbxe5x8axa0xe4xb8x80xe4xb8xaaxe5x8fx8cxe5xbcx95xe5x8fxb7xefxbcx8ccn=xe2x80x9cxe4xb8xadxe6x96x87xe5x8fx8cxe5xbcx95xe5x8fxb7xe2x80x9dxefxbcx8cen=xe2x80x9cxe8x8bxb1xe6x96x87xe5x8fx8cxe5xbcx95xe5x8fxb7xe2x80x9cxefxbcx8cxe6xb5x8bxe8xafx95xe5xaex8cxe6xafx95xe3x80x82’, ‘user_name’: ‘againinput6’, ‘is_top’: 0, ‘parent_id_enc’: ‘ab64034f78f0f736afc3ab64’, ‘cdatetime’: 1323941023, ‘title’: ”, ‘un’: ‘againinput6’}, {‘thread_id_enc’: ‘b48f8c541640ec1a3a2935d7’, ‘reply_count’: 0, ‘reply_id_enc’: ‘9b504fc250748e25e5dd3b42’, ‘sexy_time’: ’14xe5xb0x8fxe6x97xb6xe5x89x8d’, ‘like_count’: 0, ‘mdatetime’: ‘1323944383’, ‘portrait’: ‘800b616761696e696e707574365902’, ‘reserved1’: 0, ‘reserved2’: 0, ‘dislike_count’: 0, ‘user_id’: ‘39390080’, ‘area’: ”, ‘user_ip’: ‘58.240.236.19’, ‘favor’: 0, ‘score’: 0, ‘create_time’: 
‘1323944383’, ‘content’: “xe5x86x8dxe6xacxa1xe6xb5x8bxe8xafx95xe4xb8x80xe4xb8x8bxe8x8bxb1xe6x96x87xe7x9ax84xe5x8dx95xe5xbcx95xe5x8fxb7xefxbcx8cenSQ=’English Single Quote'”, ‘user_name’: ‘againinput6’, ‘is_top’: 0, ‘parent_id_enc’: ‘ab64034f78f0f736afc3ab64’, ‘cdatetime’: 1323944383, ‘title’: ”, ‘un’: ‘againinput6’}, {‘thread_id_enc’: ‘b48f8c541640ec1a3a2935d7’, ‘reply_count’: 0, ‘reply_id_enc’: ‘2edda3cc04ce860001e92843’, ‘sexy_time’: ’14xe5xb0x8fxe6x97xb6xe5x89x8d’, ‘like_count’: 0, ‘mdatetime’: ‘1323944826’, ‘portrait’: ‘800b616761696e696e707574365902’, ‘reserved1’: 0, ‘reserved2’: 0, ‘dislike_count’: 0, ‘user_id’: ‘39390080’, ‘area’: ”, ‘user_ip’: ‘58.240.236.19’, ‘favor’: 0, ‘score’: 0, ‘create_time’: ‘1323944826’, ‘content’: ‘xe4xb8x80xe6x97xa9xe8xbfx9bxe5x85xa5xe4xbdxa0xe7x9ax84xe7xa9xbaxe9x97xb4xe5xb0xb1xe5x90xacxe5x88xb0xe4xbax86xe4xbdxa0xe7x9ax84xe8x83x8cxe6x99xafxe6xadx8cxe6x9bxb2xefxbcx8cxe7x9ax84xe7xa1xaexe4xb8x8dxe9x94x99xefxbcx8cxe6x88x91xe4xb9x9fxe6x98xafxe5x96x9cxe6xacxa2xe3x80x8axe7xa9xbaxe4xbdx8dxe3x80x8bxe6x9bxb4xe5xa4x9axe4xb8x80xe4xbax9bxefxbcx8cxe4xb9x9fxe8xaexb8xe8xb7x9fxe4xb9x8bxe5x89x8dxe5x96x9cxe6xacxa2xe8xaexb8xe5xb7x8dxe7x9ax84xe6xadx8cxe6x9bxb2xe6x9cx89xe5x85xb3xe7xb3xbbxe5x90xa7xefxbcx8cxe5x91xb5xe5x91xb5xefxbcx8cxe4xb8x80xe7xa7x8dxe5x8fxafxe4xbbxa5xe9x80x8fxe8xbfx87xe8x82x8cxe8x82xa4xefxbcx8cxe6xb2x81xe5x85xa5xe5xbfx83xe7x81xb5xe7x9ax84xe6x97x8bxe5xbex8bxe5x92x8cxe5xa3xb0xe9x9fxb3xefxbcx8cxe5xbex88xe4xb8x8dxe9x94x99xe3x80x82xe8xb0xa2xe8xb0xa2xe4xbdxa0xe6x8exa8xe8x8dx90xe5xa5xbdxe6xadx8cxe7xbbx99xe5xa4xa7xe5xaexb6xefxbcx8cxe5x8ex9fxe6x9dxa5xe4xbdxa0xe5x86x99xe4xbax86xe8xbfx99xe4xb9x88xe5xa4x9axe7x9ax84xe4xb8x9cxe8xa5xbfxefxbcx8cxe7xbbx8fxe8x90xa5xe7x9dx80xe8xbfx99xe4xb9x88xe5xa4x9axe7x9ax84xe5x9bxadxe5x9cxb0xefxbcx8cxe8xbex9bxe8x8bxa6xe4xbax86xe3x80x82xe2x80x9cxe6x88x91xe4xb8x80xe7x9bxb4xe5x9cxa8xe5xafxbbxe6x89xbexe5xafxbbxe6x89xbexe9x82xa3xe4xb8xaaxe7xa9xbaxe4xbdx8dxe2x80x9dxefxbcx8cxe5xbex88xe4xb8x8dxe9x94x99xefxbcx8cxe4xb8x80xe5xa6x8
2xe8x87xaaxe5xb7xb1xe7x9ax84xe7x8axb6xe6x80x81xefxbcx8cxe5xafxbbxe6x89xbexe9x82xa3xe4xb8xaaxe7xa9xbaxe4xbdx8dxefxbcx8cxe5xafxbbxe6x89xbexe5xb1x9exe4xbax8exe8x87xaaxe5xb7xb1xe7x9ax84xe7xa9xbaxe4xbdx8dxefxbcx8cxe4xb8x80xe7x9bxb4xe4xb8x80xe7x9bxb4xe5x9cxa8xe5xafxbbxe6x89xbexe2x80xa6xe2x80xa622222222222222222222222222’, ‘user_name’: ‘againinput6’, ‘is_top’: 0, ‘parent_id_enc’: ‘ab64034f78f0f736afc3ab64’, ‘cdatetime’: 1323944826, ‘title’: ”, ‘un’: ‘againinput6’}]}, ‘err_desc’: ‘success’, ‘total_count’: ‘4’, ‘response_count’: 4, ‘err_msg’: ‘success’, ‘err_no’: 0}

   

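Note that this was on Python 2.7.2.

Also, because eval and ast.literal_eval do not accept null as a value, I defined a global variable:

null = None

Then, before calling eval, declare global null so that the name resolves and the string parses normally. Note that this only helps eval: ast.literal_eval does not look up variable names, so it still rejects null.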
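The null handling described above can be sketched as follows. This is a Python 3 illustration rather than the original Python 2.7 code; the sample string `raw` and the `namespace` dict are made up for the example. It also shows `json.loads` as an alternative that is not part of the original post but seems a natural fit, since the input looks like JSON.

```python
import ast
import json

# Hypothetical sample input: a JSON-style string containing null.
raw = '{"err_no": 0, "area": null}'

# ast.literal_eval only accepts literals; the bare name null is rejected.
try:
    ast.literal_eval(raw)
    literal_eval_ok = True
except ValueError:
    literal_eval_ok = False

# eval can resolve null if the name is supplied in its namespace.
# eval runs arbitrary code, so this is only reasonable for trusted input.
namespace = {"__builtins__": {}, "null": None, "true": True, "false": False}
via_eval = eval(raw, namespace)

# json.loads understands null/true/false natively.
via_json = json.loads(raw)

print(literal_eval_ok, via_eval, via_json)
```

Passing the names through an explicit namespace avoids defining a module-level null just for the sake of eval.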

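【Solution process】

After some debugging, I found that the parsing above is behaving as it should; there is no error. The \xXX values are simply how Python 2 displays the UTF-8 bytes of a str when the dict is printed; the underlying data is unchanged.

What I actually wanted was for the content field in the example above to keep its \uXXXX-style Unicode escapes. That is possible by first taking the content with its escapes not yet decoded, for example: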
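To see why the \xXX output is not an error, here is a minimal Python 3 sketch (the original post used Python 2.7.2, where a plain str holds bytes). The three characters are the first three of the first comment in the example; the variable names are illustrative only.

```python
# The first three characters of the first example comment, written as escapes.
text = "\u6b64\u8bc4\u8bba"

# Encoding to UTF-8 gives the raw bytes that a Python 2 str would hold.
utf8_bytes = text.encode("utf-8")

# repr() shows non-ASCII bytes as \xXX escapes; the data itself is intact.
print(repr(utf8_bytes))

# Decoding the same bytes as UTF-8 gives the original text back.
print(utf8_bytes.decode("utf-8") == text)
```

The escaped form is only a display representation: nothing was lost or converted by the parsing step.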

 {“err_no”:0,”err_msg”:”success”,”total_count”:”4″,”response_count”:4,”err_desc”:”success”,”body”:{“total_count”:4,”real_ret_count”:4,”data”:[{“reply_count”:0,”score”:0,”favor”:0,”is_top”:0,”like_count”:0,”dislike_count”:0,”create_time”:”1323940382″,”user_id”:”39390080″,”user_name”:”againinput6″,”user_ip”:”58.240.236.19″,”area”:””,”title”:””,”content”:”此评论只是测试而已。。。”,”reserved1″:0,”reserved2″:0,”mdatetime”:”1323940382″,”cdatetime”:1323940382,”un”:”againinput6″,”reply_id_enc”:”8db1cb13d6590233dd540179″,”thread_id_enc”:”b48f8c541640ec1a3a2935d7″,”parent_id_enc”:”ab64034f78f0f736afc3ab64″,”portrait”:”800b616761696e696e707574365902″,”sexy_time”:”16小时前”},{“reply_count”:0,”score”:0,”favor”:0,”is_top”:0,”like_count”:0,”dislike_count”:0,”create_time”:”1323941023″,”user_id”:”39390080″,”user_name”:”againinput6″,”user_ip”:”58.240.236.19″,”area”:””,”title”:””,”content”:”测试添加一个双引号?n=“中文双引号”?n=”英文双引号”,测试完毕。”,”reserved1″:0,”reserved2″:0,”mdatetime”:”1323941023″,”cdatetime”:1323941023,”un”:”againinput6″,”reply_id_enc”:”78f0f73658cd72210a55a97b”,”thread_id_enc”:”b48f8c541640ec1a3a2935d7″,”parent_id_enc”:”ab64034f78f0f736afc3ab64″,”portrait”:”800b616761696e696e707574365902″,”sexy_time”:”16小时前”},{“reply_count”:0,”score”:0,”favor”:0,”is_top”:0,”like_count”:0,”dislike_count”:0,”create_time”:”1323944383″,”user_id”:”39390080″,”user_name”:”againinput6″,”user_ip”:”58.240.236.19″,”area”:””,”title”:””,”content”:”再次测试一下英文的单引号?nSQ=’English Single 
Quote'”,”reserved1″:0,”reserved2″:0,”mdatetime”:”1323944383″,”cdatetime”:1323944383,”un”:”againinput6″,”reply_id_enc”:”9b504fc250748e25e5dd3b42″,”thread_id_enc”:”b48f8c541640ec1a3a2935d7″,”parent_id_enc”:”ab64034f78f0f736afc3ab64″,”portrait”:”800b616761696e696e707574365902″,”sexy_time”:”15小时前”},{“reply_count”:0,”score”:0,”favor”:0,”is_top”:0,”like_count”:0,”dislike_count”:0,”create_time”:”1323944826″,”user_id”:”39390080″,”user_name”:”againinput6″,”user_ip”:”58.240.236.19″,”area”:””,”title”:””,”content”:”一早进入你的空间就听到了你的背景歌曲,的确不错,我也是喜欢《空位》更多一些,也许跟之前喜欢许巍的歌曲有关系吧,呵呵,一种可以透过肌肤,沁入心灵的旋律和声音,很不错。谢谢你推荐好歌给大家,原来你写了这么多的东西,经营着这么多的园地,辛苦了。<br>“我一直在寻找寻找那个空位”,很不错,一如自己的状态,寻找那个空位,寻找属于自己的空位,一直一直在寻找……n<br>22222222222222222222222222″,”reserved1″:0,”reserved2″:0,”mdatetime”:”1323944826″,”cdatetime”:1323944826,”un”:”againinput6″,”reply_id_enc”:”2edda3cc04ce860001e92843″,”thread_id_enc”:”b48f8c541640ec1a3a2935d7″,”parent_id_enc”:”ab64034f78f0f736afc3ab64″,”portrait”:”800b616761696e696e707574365902″,”sexy_time”:”15小时前”}]}}

   and passing it to eval or ast.literal_eval to turn it into a dict variable, which is then parsed as:

 {‘body’: {‘total_count’: 4, ‘real_ret_count’: 4, ‘data’: [{‘thread_id_enc’: ‘b48f8c541640ec1a3a2935d7’, ‘reply_count’: 0, ‘reply_id_enc’: ‘8db1cb13d6590233dd540179’, ‘sexy_time’: ’16\小\时\前’, ‘like_count’: 0, ‘mdatetime’: ‘1323940382’, ‘portrait’: ‘800b616761696e696e707574365902’, ‘reserved1’: 0, ‘reserved2’: 0, ‘dislike_count’: 0, ‘user_id’: ‘39390080’, ‘area’: ”, ‘user_ip’: ‘58.240.236.19’, ‘favor’: 0, ‘score’: 0, ‘create_time’: ‘1323940382’, ‘content’: ‘\此\评\论\只\是\测\试\而\已\。\。\。’, ‘user_name’: ‘againinput6’, ‘is_top’: 0, ‘parent_id_enc’: ‘ab64034f78f0f736afc3ab64’, ‘cdatetime’: 1323940382, ‘title’: ”, ‘un’: ‘againinput6’}, {‘thread_id_enc’: ‘b48f8c541640ec1a3a2935d7’, ‘reply_count’: 0, ‘reply_id_enc’: ’78f0f73658cd72210a55a97b’, ‘sexy_time’: ’16\小\时\前’, ‘like_count’: 0, ‘mdatetime’: ‘1323941023’, ‘portrait’: ‘800b616761696e696e707574365902’, ‘reserved1’: 0, ‘reserved2’: 0, ‘dislike_count’: 0, ‘user_id’: ‘39390080’, ‘area’: ”, ‘user_ip’: ‘58.240.236.19’, ‘favor’: 0, ‘score’: 0, ‘create_time’: ‘1323941023’, ‘content’: ‘\测\试\添\加\一\个\双\引\号\?n=\“\中\文\双\引\号\”\?n=”\英\文\双\引\号”\,\测\试\完\毕\。’, ‘user_name’: ‘againinput6’, ‘is_top’: 0, ‘parent_id_enc’: ‘ab64034f78f0f736afc3ab64’, ‘cdatetime’: 1323941023, ‘title’: ”, ‘un’: ‘againinput6’}, {‘thread_id_enc’: ‘b48f8c541640ec1a3a2935d7’, ‘reply_count’: 0, ‘reply_id_enc’: ‘9b504fc250748e25e5dd3b42’, ‘sexy_time’: ’15\小\时\前’, ‘like_count’: 0, ‘mdatetime’: ‘1323944383’, ‘portrait’: ‘800b616761696e696e707574365902’, ‘reserved1’: 0, ‘reserved2’: 0, ‘dislike_count’: 0, ‘user_id’: ‘39390080’, ‘area’: ”, ‘user_ip’: ‘58.240.236.19’, ‘favor’: 0, ‘score’: 0, ‘create_time’: ‘1323944383’, ‘content’: “\再\次\测\试\一\下\英\文\的\单\引\号\?nSQ=’English Single Quote'”, ‘user_name’: ‘againinput6’, ‘is_top’: 0, ‘parent_id_enc’: ‘ab64034f78f0f736afc3ab64’, ‘cdatetime’: 1323944383, ‘title’: ”, ‘un’: ‘againinput6’}, {‘thread_id_enc’: ‘b48f8c541640ec1a3a2935d7’, ‘reply_count’: 0, ‘reply_id_enc’: ‘2edda3cc04ce860001e92843’, ‘sexy_time’: ’15\小\时\前’, ‘like_count’: 0, 
‘mdatetime’: ‘1323944826’, ‘portrait’: ‘800b616761696e696e707574365902’, ‘reserved1’: 0, ‘reserved2’: 0, ‘dislike_count’: 0, ‘user_id’: ‘39390080’, ‘area’: ”, ‘user_ip’: ‘58.240.236.19’, ‘favor’: 0, ‘score’: 0, ‘create_time’: ‘1323944826’, ‘content’: ‘\一\早\进\入\你\的\空\间\就\听\到\了\你\的\背\景\歌\曲\,\的\确\不\错\,\我\也\是\喜\欢\《\空\位\》\更\多\一\些\,\也\许\跟\之\前\喜\欢\许\巍\的\歌\曲\有\关\系\吧\,\呵\呵\,\一\种\可\以\透\过\肌\肤\,\沁\入\心\灵\的\旋\律\和\声\音\,\很\不\错\。\谢\谢\你\推\荐\好\歌\给\大\家\,\原\来\你\写\了\这\么\多\的\东\西\,\经\营\着\这\么\多\的\园\地\,\辛\苦\了\。<br>\“\我\一\直\在\寻\找\寻\找\那\个\空\位\”\,\很\不\错\,\一\如\自\己\的\状\态\,\寻\找\那\个\空\位\,\寻\找\属\于\自\己\的\空\位\,\一\直\一\直\在\寻\找\…\…n<br>22222222222222222222222222’, ‘user_name’: ‘againinput6’, ‘is_top’: 0, ‘parent_id_enc’: ‘ab64034f78f0f736afc3ab64’, ‘cdatetime’: 1323944826, ‘title’: ”, ‘un’: ‘againinput6’}]}, ‘err_desc’: ‘success’, ‘total_count’: ‘4’, ‘response_count’: 4, ‘err_msg’: ‘success’, ‘err_no’: 0}

Then take out the corresponding content field and decode it:

 cmt_content = cmt_resp['body']['data'][0]['content']

cmt_decoded_char = cmt_content.decode('unicode-escape')
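In Python 3, str has no decode method, so the same step looks slightly different. The following is a sketch under that assumption; `cmt_content` here is a hand-written stand-in for a parsed field that still contains literal backslash-u escapes.

```python
# Stand-in for a parsed content value holding literal \uXXXX escapes
# (a raw string keeps the backslashes as ordinary characters).
cmt_content = r"\u6b64\u8bc4\u8bba"

# Turn the ASCII-only text into bytes, then let the unicode_escape codec
# interpret the escapes.
cmt_decoded_char = cmt_content.encode("ascii").decode("unicode_escape")

print(cmt_decoded_char)
```

The `encode("ascii")` step raises an error if the value already contains non-ASCII characters, which is a useful signal that the escapes were decoded earlier and this step is not needed.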

   

This yields the decoded characters. The comments parsed out this way are:

 LINE 245   DEBUG    Total comments for this blog item: 4

LINE 251   DEBUG    comment
—————————-
此评论只是测试而已。。。
LINE 251   DEBUG    comment
—————————-
测试添加一个双引号,cn=“中文双引号”,en=”英文双引号”,测试完毕。
LINE 251   DEBUG    comment
—————————-
再次测试一下英文的单引号,enSQ=’English Single Quote’
LINE 251   DEBUG    comment
—————————-
一早进入你的空间就听到了你的背景歌曲,的确不错,我也是喜欢《空位》更多一些,也许跟之前喜欢许巍的歌曲有关系吧,呵呵,一种可以透过肌肤,沁入心灵的旋律和声音,很不错。谢谢你推荐好歌给大家,原来你写了这么多的东西,经营着这么多的园地,辛苦了。<br>“我一直在寻找寻找那个空位”,很不错,一如自己的状态,寻找那个空位,寻找属于自己的空位,一直一直在寻找……
<br>22222222222222222222222222

   

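【Summary】

For a string that looks like a dict literal, if some of its field values are originally Unicode characters encoded as UTF-8,

then after eval or ast.literal_eval (on Python 2) parses that string into a dict variable, those field values are displayed as \xXX escapes.

Please credit the source when reposting: 在路上 » 【Solved】In Python, after a string is turned into a dict with eval or ast.literal_eval, its Unicode characters show up in \xXX format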
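If readable text is wanted instead of the escaped bytes, one option is to decode every byte string in the parsed result as UTF-8. The sketch below is Python 3 and uses bytes to imitate what a Python 2 str holds; the helper `decode_utf8_values` and the sample `parsed` dict are hypothetical and not taken from the original code.

```python
def decode_utf8_values(obj):
    """Recursively turn UTF-8 byte strings inside dicts and lists into text."""
    if isinstance(obj, bytes):
        return obj.decode("utf-8")
    if isinstance(obj, dict):
        return {decode_utf8_values(k): decode_utf8_values(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [decode_utf8_values(item) for item in obj]
    # Numbers, None and already-decoded text pass through unchanged.
    return obj


# Imitation of a parsed result whose strings are UTF-8 bytes.
parsed = {b"body": {b"data": [{b"content": b"\xe6\xad\xa4\xe8\xaf\x84\xe8\xae\xba", b"score": 0}]}}

clean = decode_utf8_values(parsed)
print(clean["body"]["data"][0]["content"])
```

This assumes every byte string in the structure really is UTF-8; data in another encoding would raise UnicodeDecodeError.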
