Python: Pandas generating downward filling variables in DataFrame

- July 15, 2014

I have the following DataFrame df:

            s
2011-01-26  1
2011-01-27  0
2011-01-28  0
2011-01-29  0
2011-01-30  0
2011-01-31  0
2011-02-01  0
2011-02-02  0
2011-02-03  0
2011-02-04  0
2011-02-05  0
2011-02-06  0
2011-02-07  0
2011-02-08  0
2011-02-09  0

I am trying to generate the following DataFrame df:

            s  s1  s2  s3
2011-01-26  1   0   0   0
2011-01-27  0   1   0   0
2011-01-28  0   1   0   0
2011-01-29  0   0   1   0
2011-01-30  0   0   1   0
2011-01-31  0   0   1   0
2011-02-01  0   0   1   0
2011-02-02  0   0   0   1
2011-02-03  0   0   0   1
2011-02-04  0   0   0   1
2011-02-05  0   0   0   1
2011-02-06  0   0   0   1
2011-02-07  0   0   0   1
2011-02-08  0   0   0   1
2011-02-09  0   0   0   1

As you can see, the number of 1s doubles from one column to the next going downward: 1 in s, 2 in s1, 4 in s2, 8 in s3. Is there a pandas function, something like fillna, where I can specify filling downwards for x rows?

Update: in fact, I have a more complicated task.

If df is:

            s
2011-01-26  1
2011-01-27  0
2011-01-28  0
2011-01-29  0
2011-01-30  0
2011-01-31  0
2011-02-01  0
2011-02-02  0
2011-02-03  0
2011-02-04  0
2011-02-05  0
2011-02-06  0
2011-02-07  0
2011-02-08  0
2011-02-09  0
...         (all zeros)
            s
2011-04-26  1
2011-04-27  0
2011-04-28  0
2011-04-29  0
2011-04-30  0
2011-04-31  0
2011-05-01  0
2011-05-02  0
2011-05-03  0
2011-05-04  0
2011-05-05  0
2011-05-06  0
2011-05-07  0
2011-05-08  0
2011-05-09  0

and I need this:

            s  s1  s2  s3
2011-01-26  1   0   0   0
2011-01-27  0   1   0   0
2011-01-28  0   1   0   0
2011-01-29  0   0   1   0
2011-01-30  0   0   1   0
2011-01-31  0   0   1   0
2011-02-01  0   0   1   0
2011-02-02  0   0   0   1
2011-02-03  0   0   0   1
2011-02-04  0   0   0   1
2011-02-05  0   0   0   1
2011-02-06  0   0   0   1
2011-02-07  0   0   0   1
2011-02-08  0   0   0   1
2011-02-09  0   0   0   1
...         (zeros in every column)
            s  s1  s2  s3
2011-04-26  1   0   0   0
2011-04-27  0   1   0   0
2011-04-28  0   1   0   0
2011-04-29  0   0   1   0
2011-04-30  0   0   1   0
2011-04-31  0   0   1   0
2011-05-01  0   0   1   0
2011-05-02  0   0   0   1
2011-05-03  0   0   0   1
2011-05-04  0   0   0   1
2011-05-05  0   0   0   1
2011-05-06  0   0   0   1
2011-05-07  0   0   0   1
2011-05-08  0   0   0   1
2011-05-09  0   0   0   1

To the best of my knowledge, there is no ready-made function for this. You can use the following trick to do something similar.

import pandas as pd
import numpy as np

# data
# ========================================
df = pd.DataFrame(0, index=pd.date_range('2015-01-01', periods=100, freq='D'), columns=['col'])
df.iloc[[0, 71], 0] = 1

# every 1 in col starts a new group, so the running sum serves as the group key
grouped = df.groupby(df.col.cumsum())

grouped.get_group(1)

Out[275]:
            col
2015-01-01    1
2015-01-02    0
2015-01-03    0
2015-01-04    0
2015-01-05    0
2015-01-06    0
2015-01-07    0
2015-01-08    0
...         ...
2015-03-05    0
2015-03-06    0
2015-03-07    0
2015-03-08    0
2015-03-09    0
2015-03-10    0
2015-03-11    0
2015-03-12    0

[71 rows x 1 columns]

grouped.get_group(2)

Out[276]:
            col
2015-03-13    1
2015-03-14    0
2015-03-15    0
2015-03-16    0
2015-03-17    0
2015-03-18    0
2015-03-19    0
2015-03-20    0
...         ...
2015-04-03    0
2015-04-04    0
2015-04-05    0
2015-04-06    0
2015-04-07    0
2015-04-08    0
2015-04-09    0
2015-04-10    0

[29 rows x 1 columns]

# processing
# ==================================

def func(group):
    group['temp'] = 0
    # mark positions 0, 1, 3, 7, ... (2**k - 1) inside the group
    marks = 2 ** np.arange(int(np.log2(len(group))) + 1) - 1
    # set by position in a single .iloc call to avoid chained assignment
    group.iloc[marks, group.columns.get_loc('temp')] = 1
    # the running sum labels each stretch 1, 2, 3, ...; stretch k spans up to 2**(k-1) rows
    group['new_col'] = group.temp.cumsum()
    return pd.get_dummies(group.new_col)


grouped.apply(func)

Out[281]:
            1  2  3  4  5    6    7
2015-01-01  1  0  0  0  0    0    0
2015-01-02  0  1  0  0  0    0    0
2015-01-03  0  1  0  0  0    0    0
2015-01-04  0  0  1  0  0    0    0
2015-01-05  0  0  1  0  0    0    0
2015-01-06  0  0  1  0  0    0    0
2015-01-07  0  0  1  0  0    0    0
2015-01-08  0  0  0  1  0    0    0
...        .. .. .. .. ..   ..   ..
2015-04-03  0  0  0  0  1  NaN  NaN
2015-04-04  0  0  0  0  1  NaN  NaN
2015-04-05  0  0  0  0  1  NaN  NaN
2015-04-06  0  0  0  0  1  NaN  NaN
2015-04-07  0  0  0  0  1  NaN  NaN
2015-04-08  0  0  0  0  1  NaN  NaN
2015-04-09  0  0  0  0  1  NaN  NaN
2015-04-10  0  0  0  0  1  NaN  NaN
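If the goal is exactly the layout from the question (the original s column plus s1, s2, s3, with zeros once the last column has been filled), the same grouping idea can be written without apply. The following is only a sketch under assumptions, not the method used above: the function name doubling_indicators and the parameter n_levels are made up for illustration, and the input is assumed to be a 0/1 Series where each 1 starts a new block.

```python
import numpy as np
import pandas as pd


def doubling_indicators(s, n_levels=3):
    """Hypothetical helper: return s plus s1..s{n_levels}, where column
    s{k} flags the 2**k rows at offsets 2**k - 1 .. 2**(k+1) - 2 after each 1."""
    s = s.astype(int)
    block = s.cumsum()                  # block id, increases at every 1 in s
    pos = s.groupby(block).cumcount()   # 0-based position of each row in its block
    # floor(log2(pos + 1)) computed exactly from the binary exponent:
    # pos 0 -> 0, pos 1..2 -> 1, pos 3..6 -> 2, pos 7..14 -> 3, ...
    _, exponent = np.frexp((pos + 1).to_numpy(dtype=float))
    level = exponent - 1
    # rows before the first 1 belong to no block, so they get no level
    level[(block == 0).to_numpy()] = -1
    out = pd.DataFrame({"s": s})
    for k in range(1, n_levels + 1):
        out[f"s{k}"] = (level == k).astype(int)
    return out


# Usage with the 15-row example from the question
idx = pd.date_range("2011-01-26", periods=15, freq="D")
example = pd.Series([1] + [0] * 14, index=idx)
print(doubling_indicators(example))
```

Rows that lie more than 2**(n_levels + 1) - 2 positions after a 1 fall into a level above n_levels and therefore stay zero in every column, which is what the updated question asks for between the two blocks.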
