2.2.2. Replacing numeric character entities with characters: repUniNumEntToChar


#------------------------------------------------------------------------------
# replace each &#N; (N is one or more decimal digits) with the unicode char
# eg: replace "&#39;" with "'" in "Creepin&#39; up on you"
import re

def repUniNumEntToChar(text):
    unicodeP = re.compile('&#[0-9]+;')
    def transToUniChr(match): # translate the matched string to a unicode char
        numStr = match.group(0)[2:-1] # remove '&#' and ';'
        num = int(numStr)
        # unichr exists only in Python 2; in Python 3 use chr instead
        return unichr(num)
    return unicodeP.sub(transToUniChr, text)

        

Example 2.6. Usage example for repUniNumEntToChar

infoDict['title'] = repUniNumEntToChar(infoDict['title'])
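
The listing in this section targets Python 2. As a minimal sketch, assuming Python 3, the same substitution might look like the following. The name rep_uni_num_ent_to_char and the sample info_dict are hypothetical and chosen only for illustration; they are not part of the original library.

```python
import re

# Matches decimal numeric character references such as &#39;
NUM_ENTITY_PATTERN = re.compile(r'&#([0-9]+);')

def rep_uni_num_ent_to_char(text):
    """Replace each &#N; reference with the character whose code point is N."""
    def to_char(match):
        # group(1) holds only the digits between '&#' and ';'
        return chr(int(match.group(1)))
    return NUM_ENTITY_PATTERN.sub(to_char, text)

# Hypothetical sample data mirroring the usage example above
info_dict = {'title': 'Creepin&#39; up on you'}
info_dict['title'] = rep_uni_num_ent_to_char(info_dict['title'])
print(info_dict['title'])  # Creepin' up on you
```

Note that chr raises ValueError for numbers above 0x10FFFF, so untrusted input may need a guard. Python 3's standard library also provides html.unescape, which additionally handles named and hexadecimal references.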