Wednesday, 15 February 2012

pandas - failing simple groupby example from "Python for Data Analysis" text




I have just started learning Python (mostly as an open-source replacement for MATLAB, using "ipython --pylab") and am going through the examples in the "Python for Data Analysis" text. On page 253, a simple example is shown using 'groupby' (passing a list of arrays). I repeat it exactly as in the text, but I get the error: "TypeError: 'Series' objects are mutable, thus they cannot be hashed"

import pandas as pd
from pandas import DataFrame
df = DataFrame({'key1' : ['a','a','b','b','a'],
                'key2' : ['one','two','one','two','one'],
                'data1' : np.random.randn(5),
                'data2' : np.random.randn(5)})
grouped = df['data1'].groupby(df['key1'])
means = df['data1'].groupby(df['key1'],df['key2']).mean()

-----Details of the TypeError-------

TypeError                                 Traceback (most recent call last)
<ipython-input-7-0412f2897849> in <module>()
----> 1 means = df['data1'].groupby(df['key1'],df['key2']).mean()

/home/joeblow/enthought/canopy_64bit/user/lib/python2.7/site-packages/pandas/core/generic.pyc in groupby(self, by, axis, level, as_index, sort, group_keys, squeeze)
   2725
   2726         from pandas.core.groupby import groupby
-> 2727         axis = self._get_axis_number(axis)
   2728         return groupby(self, by, axis=axis, level=level, as_index=as_index,
   2729                        sort=sort, group_keys=group_keys, squeeze=squeeze)

/home/joeblow/enthought/canopy_64bit/user/lib/python2.7/site-packages/pandas/core/generic.pyc in _get_axis_number(self, axis)
    283
    284     def _get_axis_number(self, axis):
--> 285         axis = self._axis_aliases.get(axis, axis)
    286         if com.is_integer(axis):
    287             if axis in self._axis_names:

/home/joeblow/enthought/canopy_64bit/user/lib/python2.7/site-packages/pandas/core/generic.pyc in __hash__(self)
    639     def __hash__(self):
    640         raise TypeError('{0!r} objects are mutable, thus they cannot be'
--> 641                         ' hashed'.format(self.__class__.__name__))
    642
    643     def __iter__(self):

TypeError: 'Series' objects are mutable, thus they cannot be hashed

What simple thing am I missing here?

You didn't do it as in the text. :^)

>>> means = df['data1'].groupby([df['key1'],df['key2']]).mean()
>>> means
key1  key2
a     one     1.127536
      two     1.220386
b     one     0.402765
      two    -0.058255
dtype: float64

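If you're grouping by two arrays, you need to pass a list of arrays. Instead, you passed two separate arguments, (df['key1'], df['key2']), which are being interpreted as by and axis. That is why the traceback ends in _get_axis_number: pandas tries to look up df['key2'] as an axis name, which requires hashing it, and a Series cannot be hashed.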
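For reference, here is a minimal, self-contained sketch of the difference. It assumes a reasonably recent pandas and NumPy rather than the Python 2.7 setup in the question. The seeded generator and the names means_by_name and exc are only there for illustration and are not from the book. The exact exception raised by the two-argument call depends on the pandas version, so the sketch catches it broadly instead of relying on a particular message.

```python
import numpy as np
import pandas as pd

# Seeded generator so the illustration is repeatable (an assumption, not from the book).
rng = np.random.default_rng(0)

df = pd.DataFrame({
    'key1': ['a', 'a', 'b', 'b', 'a'],
    'key2': ['one', 'two', 'one', 'two', 'one'],
    'data1': rng.standard_normal(5),
    'data2': rng.standard_normal(5),
})

# Correct: both key arrays go inside ONE list, which is the single `by` argument.
means = df['data1'].groupby([df['key1'], df['key2']]).mean()
print(means)

# Equivalent spelling: group the DataFrame by column names, then select the column.
means_by_name = df.groupby(['key1', 'key2'])['data1'].mean()

# Incorrect: the second positional argument is not treated as another grouping key.
# The exception type and message vary between pandas versions.
try:
    df['data1'].groupby(df['key1'], df['key2']).mean()
except Exception as exc:
    print(type(exc).__name__, exc)
```

Both correct forms return a Series with a two-level index (key1, key2), so means.unstack() will pivot key2 into columns if you want a table.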

python pandas
