Python dataframes
I have a DataFrame (df) and I'm trying to append information to specific rows.
    index  fruit   rank
    0      banana  1
    1      apple   2
    2      mango   3
    3      melon   4
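The goal is to compare the fruit at rank 1 with the fruit at each other rank and append the resulting value. I'm using difflib.SequenceMatcher to create the comparison. Right now I'm able to append to df, but I end up appending the same value to each row. I'm struggling with how to loop and append. Any pointers would be much appreciated.
Here is some of the code: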
    new_entry = df[df['rank'] == 1]
    new_fruit = new_entry['fruit']
    prev_entry = df[df['rank'] == 2]
    prev_fruit = prev_entry['fruit']
    similarity_score = difflib.SequenceMatcher(None, str(new_fruit).lower(), str(prev_fruit).lower()).ratio()
    df['similarity_score'] = similarity_score
The result is this:
    index  fruit   rank  similarity_score
    0      banana  1     0.3
    1      apple   2     0.3
    2      mango   3     0.3
    3      melon   4     0.3
The desired result is:
    index  fruit   rank  similarity_score
    0      banana  1     N/A
    1      apple   2     0.4
    2      mango   3     0.5
    3      melon   4     0.6
Thanks.
This doesn't give the similarity scores in the order you want, but it calculates the SequenceMatcher ratio between the rank 1 value ('banana') and each row, and adds it as a column.
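    import difflib
    import pandas as pd

    df = pd.DataFrame({'fruit': ['banana', 'apple', 'mango', 'melon'],
                       'rank': [1, 2, 3, 4]})
    # Use df['rank'] rather than df.rank, which is the DataFrame.rank method.
    top = df.loc[df['rank'] == 1, 'fruit'].iloc[0]
    # Compare the rank 1 fruit with the fruit in every row.
    df['similarity_score'] = df['fruit'].apply(
        lambda x: difflib.SequenceMatcher(None, top, x).ratio())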
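The desired output in the question leaves the rank 1 row as N/A instead of scoring 'banana' against itself. Here is a minimal sketch of one way to get that, assuming a missing value (NaN) is an acceptable stand-in for N/A. The helper name score_against_top is made up for illustration and is not part of pandas or difflib.

```python
import difflib

import pandas as pd

df = pd.DataFrame({'fruit': ['banana', 'apple', 'mango', 'melon'],
                   'rank': [1, 2, 3, 4]})

# Pull out the rank 1 fruit as a plain string (not a one-element Series).
top = df.loc[df['rank'] == 1, 'fruit'].iloc[0]


def score_against_top(fruit):
    # Hypothetical helper: case-insensitive ratio against the rank 1 fruit.
    return difflib.SequenceMatcher(None, top.lower(), fruit.lower()).ratio()


# One score per row, computed row by row instead of assigning a single scalar.
df['similarity_score'] = df['fruit'].apply(score_against_top)

# Assumption: the rank 1 row should show a missing value rather than 1.0.
df.loc[df['rank'] == 1, 'similarity_score'] = float('nan')

print(df)
```

Assigning a single number to df['similarity_score'], as in the question, broadcasts that number to every row, which is why every row ended up with the same score. Using apply computes a separate ratio for each fruit.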
python python-2.7 pandas difflib