Thursday, 15 May 2014

python - How can I efficiently merge these two datasets? -




So I have two lists of data that look like this (shortened):

[[1.0, 1403603100], [0.0, 1403603400], [2.0, 1403603700], [0.0, 1403604000], [None, 1403604300]]
[[1.0, 1403603100], [0.0, 1403603400], [1.0, 1403603700], [None, 1403604000], [5.0, 1403604300]]

What I want is to merge them, summing the first elements of each dataset, or using 0.0 if either counter value is None. The example above would become this:

[[2.0, 1403603100], [0.0, 1403603400], [3.0, 1403603700], [0.0, 1403604000], [0.0, 1403604300]]

This is what I've come up with so far; apologies if it's a bit kludgy.

def emit_datum(datapoints):
    # Yield the data points one at a time.
    for datum in datapoints:
        yield datum

def merge_data(data_set1, data_set2):
    assert len(data_set1) == len(data_set2)
    data_length = len(data_set1)
    data_gen1 = emit_datum(data_set1)
    data_gen2 = emit_datum(data_set2)
    merged_data = []
    for _ in range(data_length):
        datum1 = next(data_gen1)
        datum2 = next(data_gen2)
        # A missing counter on either side makes the merged value 0.0.
        if datum1[0] is None or datum2[0] is None:
            merged_data.append([0.0, datum1[1]])
            continue
        count = datum1[0] + datum2[0]
        merged_data.append([count, datum1[1]])
    return merged_data

I can only hope/assume there's something cunning I can do with itertools or collections?

If you are just making the value 0.0 whenever either counter is None, all you need is a simple loop:

l1 = [[1.0, 1403603100], [0.0, 1403603400], [2.0, 1403603700], [0.0, 1403604000], [None, 1403604300]]
l2 = [[1.0, 1403603100], [0.0, 1403603400], [1.0, 1403603700], [None, 1403604000], [5.0, 1403604300]]

final = []
assert len(l1) == len(l2)
for x, y in zip(l1, l2):
    if x[0] is None or y[0] is None:
        # Note: this overwrites the counter inside l2 itself.
        y[0] = 0.0
        final.append(y)
    else:
        final.append([x[0] + y[0], x[-1]])
print(final)
[[2.0, 1403603100], [0.0, 1403603400], [3.0, 1403603700], [0.0, 1403604000], [0.0, 1403604300]]

In [51]: %timeit merge_data(l1,l2)
100000 loops, best of 3: 5.76 µs per loop

In [52]: %%timeit
   ....: final = []
   ....: assert len(l1)==len(l2)
   ....: for x, y in zip(l1, l2):
   ....:     if x[0] is None or y[0] is None:
   ....:         y[0] = 0.0
   ....:         final.append(y)
   ....:     else:
   ....:         final.append([x[0] + y[0], x[-1]])
   ....:
100000 loops, best of 3: 2.64 µs per loop
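If you would rather not modify either input list, the same zip-based idea can probably be written as a list comprehension that builds a new row for every pair. This is only a rough sketch: the name merge_counts is made up for illustration, and it assumes both lists are the same length and have matching timestamps at each position, as in the question.

```python
def merge_counts(data_set1, data_set2):
    """Sum the counters pairwise; use 0.0 when either counter is None.

    Hypothetical helper: builds new rows, so the inputs are left untouched.
    """
    assert len(data_set1) == len(data_set2)
    return [
        # Keep the timestamp from the first dataset.
        [0.0 if a is None or b is None else a + b, ts]
        for (a, ts), (b, _) in zip(data_set1, data_set2)
    ]


l1 = [[1.0, 1403603100], [0.0, 1403603400], [2.0, 1403603700],
      [0.0, 1403604000], [None, 1403604300]]
l2 = [[1.0, 1403603100], [0.0, 1403603400], [1.0, 1403603700],
      [None, 1403604000], [5.0, 1403604300]]

merged = merge_counts(l1, l2)
print(merged)
# [[2.0, 1403603100], [0.0, 1403603400], [3.0, 1403603700], [0.0, 1403604000], [0.0, 1403604300]]
```

Because nothing is assigned back into l1 or l2, the None entries are still there afterwards, which matters if you time the code repeatedly or reuse the lists elsewhere.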

python arrays merge
