Monday, 15 March 2010

Basic data formatting in Python


I want to format a text file with 7000 entries, and I use the code below to sort things out. For the last couple of weeks I have been stuck on this problem. The input looks like this:

user_protein_id = p25358

smart_protein_id = uniprot|p25358|elo2_yeast

number_of_features_found=8

domain=pfam:elo

start=63

end=307

evalue=2.4e-64

type=pfam

code.py

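    import re

    file = open('r.txt').readlines()
    for line in file:
        line = line.rstrip()
        # keep only the lines that mention a user ID, a domain, or a status
        if re.search('user|domain|status=visible|ok', line):
            # strip the field labels and all whitespace
            line = re.sub(r'user_protein_id = |domain=pfam:|\s', '', line)
            print(''.join(line))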

What I am getting is:

p53242 vac_importdeg status=visible|ok p40850 domain=xpgn status=visible|ok xpg_n domain=xpgi status=visible|ok xpg_i mkt1_n status=visible|ok mkt1_c status=visible|ok

But I want each result to start on a new line at its entry ID (e.g. p53242), with the fields separated by tabs (sep=\t), in the following shape:

p53242 vac_importdeg status=visible|ok p40850 domain=xpgn status=visible|ok xpg_n domain=xpgi status=visible|ok xpg_i mkt1_n

without the rest of the file's content.

use:

    print(re.sub(r'(p\d+)', r'\n\1 ', re.sub(r'\n', '', line)))

instead of:

print(''.join(line))
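As an alternative to patching the printed string with a regex, the records could be grouped before printing. The following is only a minimal sketch, not the approach from the answer above. It assumes every record starts with a `user_protein_id` line, that `domain=` lines may or may not carry a `pfam:` prefix, and that `status=` lines (which appear in the output but not in the input excerpt) should be kept verbatim. The function name `group_by_protein` and the inline `sample` data are hypothetical, introduced for illustration.

```python
import re


def group_by_protein(lines):
    """Collect the selected fields of each record into one tab-separated row."""
    rows = []
    for line in lines:
        line = line.strip()
        if line.startswith('user_protein_id'):
            # A new record starts here: open a new row with the protein ID.
            rows.append([line.split('=', 1)[1].strip()])
        elif rows and line.startswith('domain='):
            # Keep only the domain name, dropping an optional "pfam:" prefix.
            value = line.split('=', 1)[1].strip()
            rows[-1].append(re.sub(r'^pfam:', '', value))
        elif rows and line.startswith('status='):
            # Assumption: status lines are kept unchanged.
            rows[-1].append(line)
    return ['\t'.join(row) for row in rows]


# Sample taken from the input excerpt in the question.
sample = """user_protein_id = p25358
smart_protein_id = uniprot|p25358|elo2_yeast
number_of_features_found=8
domain=pfam:elo
start=63
end=307
evalue=2.4e-64
type=pfam
""".splitlines()

for row in group_by_protein(sample):
    print(row)
```

To run this on the real file, pass `open('r.txt')` instead of `sample`. Each protein ID then starts its own output line, and the fields after it are separated by tabs.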

