Efficiently Creating A Pandas DataFrame From A Numpy 3d array


Suppose we start with

import numpy as np
a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

How can I efficiently make a pandas DataFrame equivalent to

import pandas as pd
>>> pd.DataFrame({'a': [0, 0, 1, 1], 'b': [1, 3, 5, 7], 'c': [2, 4, 6, 8]})
   a  b  c
0  0  1  2
1  0  3  4
2  1  5  6
3  1  7  8

The idea is to have the a column hold the index along the first dimension of the original array, and the rest of the columns be a vertical concatenation of the 2D arrays in the latter two dimensions of the original array.

(This is easy with loops; the question is how to do it without them.)
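
For reference, the loop-based version alluded to above might look something like the following minimal sketch. The names frames and df_loop are illustrative and not part of the original post.

```python
import numpy as np
import pandas as pd

a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# Loop-based baseline: build one small DataFrame per 2D block, tag it with
# its index along the first axis, then concatenate the pieces vertically.
frames = []
for i, block in enumerate(a):
    frame = pd.DataFrame(block, columns=['b', 'c'])
    frame.insert(0, 'a', i)  # scalar i is broadcast down the new column
    frames.append(frame)

df_loop = pd.concat(frames, ignore_index=True)
print(df_loop)
```

This builds one intermediate DataFrame per block, which is the overhead the question is trying to avoid for large arrays.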


Longer example

Using @divakar's excellent suggestion:

>>> np.random.randint(0,9,(4,3,2))
array([[[0, 6],
        [6, 4],
        [3, 4]],

       [[5, 1],
        [1, 3],
        [6, 4]],

       [[8, 0],
        [2, 3],
        [3, 1]],

       [[2, 2],
        [0, 0],
        [6, 3]]])

This should be made into something like:

>>> pd.DataFrame({
    'a': [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3],
    'b': [0, 6, 3, 5, 1, 6, 8, 2, 3, 2, 0, 6],
    'c': [6, 4, 4, 1, 3, 4, 0, 3, 1, 2, 0, 3]})
    a  b  c
0   0  0  6
1   0  6  4
2   0  3  4
3   1  5  1
4   1  1  3
5   1  6  4
6   2  8  0
7   2  2  3
8   2  3  1
9   3  2  2
10  3  0  0
11  3  6  3

Using Panel:

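import numpy as np
import pandas as pd  # pd.Panel only exists in pandas < 0.25
a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
# Move the last axis to the front so each Panel item becomes one value column
b = pd.Panel(np.rollaxis(a, 2)).to_frame()
# Replace the (major, minor) MultiIndex with the major-axis labels, then make it a column
c = b.set_index(b.index.labels[0]).reset_index()
c.columns = list('abc')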

Then a:

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]

b:

             0  1
major minor
0     0      1  2
      1      3  4
1     0      5  6
      1      7  8

and c:

   a  b  c
0  0  1  2
1  0  3  4
2  1  5  6
3  1  7  8
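
Since pd.Panel was removed in pandas 0.25, the snippet above only runs on older versions. A possible Panel-free sketch that produces the same frame with np.repeat and reshape is shown below. The names m, n, r, first and stacked are illustrative, and the column labels assume the last dimension has length 2, as in the question.

```python
import numpy as np
import pandas as pd

a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
m, n, r = a.shape

# Index along the first axis, repeated once for every row of each 2D block.
first = np.repeat(np.arange(m), n)
# Stack the 2D blocks vertically: shape (m, n, r) -> (m * n, r).
stacked = a.reshape(m * n, r)

# Assumes r == 2 so that the three labels 'a', 'b', 'c' fit.
df = pd.DataFrame(np.column_stack((first, stacked)), columns=list('abc'))
print(df)
```

No Python-level loop is involved here; the repeat, reshape and column stacking are all done by NumPy on whole arrays.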
