video: Is there any way to force ipython to interpret utf-8 symbols?

Sunday, 1 March 2015

I'm using IPython notebook.

What I want to do is search a string for any Spanish accented letters (ñ, á, é, í, ó, ú, Ñ, Á, É, Í, Ó, Ú) and replace each with its closest equivalent in the English alphabet.

I decided to write a simple function and give it a go:


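def remove_accent(n):
    # Work on a list so individual characters can be replaced
    listn = list(n)
    for i in range(len(listn)):
        if listn[i] == 'ó':
            listn[i] = 'o'
    # Return only after the whole string has been scanned
    return listn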

It seemed simple: check whether the accented character is there and change it to its closest representation. I went ahead and tested it, getting the following output:


in []: remove_accent('whatever !@# ó')
out[]: ['w',
        'h',
        'a',
        't',
        'e',
        'v',
        'e',
        'r',
        ' ',
        '!',
        '@',
        '#',
        ' ',
        '\xc3',
        '\xb3']

I've tried to change the default encoding from ASCII (I presume, since I'm getting two positions for the accented character, '\xc3' and '\xb3', instead of one) to UTF-8, but this didn't work. What I would like to get is:


in []: remove_accent('whatever !@# ó')
out[]: ['w',
        'h',
        'a',
        't',
        'e',
        'v',
        'e',
        'r',
        ' ',
        '!',
        '@',
        '#',
        ' ',
        'o']
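
For reference, the two entries '\xc3' and '\xb3' in the first output are the two bytes that UTF-8 uses to encode 'ó'. The sketch below assumes Python 3, where str holds text (code points) and bytes holds encoded data. The output above suggests Python 2, where a plain '...' literal is a byte string and a u'...' literal would be the text type, so treat this as an illustration rather than a drop-in for that session.

```python
# Assumes Python 3: str is text (code points), bytes is encoded data.
text = 'ó'
encoded = text.encode('utf-8')

print(len(text))       # number of code points in the text
print(len(encoded))    # number of bytes after UTF-8 encoding
print(list(encoded))   # the individual byte values as integers
```

The text has one position and its encoded form has two, which is why splitting a byte string with list() yields two entries for one accented letter.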

PS: this wouldn't be so bad if the accented character yielded just one position instead of two; I would only need to change the if condition. But I haven't found a way to do that either.
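
One possible approach, sketched under the assumption of Python 3 and the standard unicodedata module, is to decompose each accented letter into a base letter plus combining marks and then drop the marks. This swaps the per-character comparison for Unicode normalization, so it is a different technique from the function above. In Python 2 the same calls would need a unicode string (for example u'whatever !@# ó') rather than a byte string.

```python
import unicodedata

def remove_accent(text):
    """Return the characters of text as a list, with combining accents stripped."""
    # NFD splits an accented letter into a base letter plus combining marks,
    # e.g. 'ó' becomes 'o' followed by U+0301 (combining acute accent).
    decomposed = unicodedata.normalize('NFD', text)
    # unicodedata.combining() is nonzero for combining marks; keep the rest.
    return [ch for ch in decomposed if not unicodedata.combining(ch)]

print(remove_accent('whatever !@# ó'))
```

Because 'ñ' decomposes into 'n' plus a combining tilde, the same function covers all twelve letters listed in the question without a separate condition for each.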
