python - Calculate mean for selected rows for selected columns in pandas data frame


I have a pandas df with, say, 100 rows and 10 columns (the actual data is huge). I also have a row_index list which contains the rows to be considered when taking the mean. I want to calculate the mean on, say, columns 2, 5, 6, 7 and 8. Can we do it with some function of the dataframe object?

What I know is to do a for loop, get the value of the row for each element in row_index, and keep doing the mean. Is there some direct function where I can pass row_list, column_list and axis, e.g. df.meanadvance(row_list, column_list, axis=0)?

I have seen DataFrame.mean(), but I couldn't work out how to use it for this.

   a  b  c  d  q
0  1  2  3  0  5
1  1  2  3  4  5
2  1  1  1  6  1
3  1  0  0  0  0

I want the mean of rows 0, 2 and 3 for each of the a, b and d columns:

   a  b  d
0  1  1  2

To select the rows of your dataframe you can use iloc; you can then select the columns you want using square brackets.

For example:

import pandas as pd
df = pd.DataFrame(data=[[1, 2, 3]] * 5, index=range(3, 8), columns=['a', 'b', 'c'])

This gives the following dataframe:

   a  b  c
3  1  2  3
4  1  2  3
5  1  2  3
6  1  2  3
7  1  2  3

To select only the third and fifth rows (by position, not by index label) you can do:

df.iloc[[2,4]] 

which returns:

   a  b  c
5  1  2  3
7  1  2  3

If you want to select only columns b and c, use the following command:

df[['b', 'c']].iloc[[2,4]] 

which yields:

   b  c
5  2  3
7  2  3
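Because iloc is purely positional, it may help to compare it with the label-based loc on this dataframe, whose index starts at 3. The following is only a sketch using the same example df; the names by_position and by_label are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(data=[[1, 2, 3]] * 5, index=range(3, 8), columns=['a', 'b', 'c'])

# Positional: the third and fifth rows, whatever their index labels are.
by_position = df[['b', 'c']].iloc[[2, 4]]

# Label-based: the rows whose index labels are 5 and 7 (the same two rows here).
by_label = df.loc[[5, 7], ['b', 'c']]

# Both selections pick the same rows and columns.
print(by_position.equals(by_label))
```

If your row list holds index labels rather than positions, loc is probably the one you want.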

To get the mean of a subset of your dataframe you can use the df.mean function. If you want the means of the columns, specify axis=0; if you want the means of the rows, specify axis=1.

Thus:

df[['b', 'c']].iloc[[2,4]].mean(axis=0) 

returns:

b    2
c    3

As you should expect from the input dataframe.
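To make the difference between the two axis values concrete, here is a small sketch on the same subset. The variable names subset, col_means and row_means are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(data=[[1, 2, 3]] * 5, index=range(3, 8), columns=['a', 'b', 'c'])
subset = df[['b', 'c']].iloc[[2, 4]]

# axis=0 collapses the rows: one mean per column, indexed by 'b' and 'c'.
col_means = subset.mean(axis=0)

# axis=1 collapses the columns: one mean per row, indexed by 5 and 7.
row_means = subset.mean(axis=1)

print(col_means)
print(row_means)
```

Note that mean returns floating-point values even when the input columns hold integers.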

For your code you can do:

 df[column_list].iloc[row_index_list].mean(axis=0) 
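Applied to the small table from the question, this pattern could look like the sketch below. It assumes row_index_list holds row positions (which coincide with the labels here, since the index is the default 0..3).

```python
import pandas as pd

# The example data from the question.
df = pd.DataFrame(
    [[1, 2, 3, 0, 5],
     [1, 2, 3, 4, 5],
     [1, 1, 1, 6, 1],
     [1, 0, 0, 0, 0]],
    columns=['a', 'b', 'c', 'd', 'q'],
)

row_index_list = [0, 2, 3]
column_list = ['a', 'b', 'd']

# Select the columns, then the rows by position, then average down each column.
means = df[column_list].iloc[row_index_list].mean(axis=0)
print(means)
```

This should give 1, 1 and 2 for columns a, b and d, matching the result asked for in the question.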

Edit after comment. New question from the comment: I have to store these means in another df/matrix. I have lists l1, l2, l3, l4 ... lx, each of which tells me the row indices whose mean I need for columns c[1, 2, 3]. For example, l1 = [0, 2, 3] means I need the mean of rows 0, 2 and 3, stored in the 1st row of a new df/matrix. Then l2 = [1, 4], for which I again calculate the mean and store it in the 2nd row of the new df/matrix. This continues until lx, so I want the new df to have x rows and len(c) columns. The columns for l1..lx remain the same. Could you help me with this?

Answer:

If I understand correctly, the following code should do the trick (same df as above; as columns I took 'a' and 'b'):

First loop over the lists of rows, collecting the means as pd.Series, then concatenate the resulting list of Series on axis=1, followed by taking the transpose to get it in the right format.

dfs = list()
for l in L:  # L is assumed to be the list of row lists [l1, l2, ..., lx]
    dfs.append(df[['a', 'b']].iloc[l].mean(axis=0))

mean_matrix = pd.concat(dfs, axis=1).T
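As a self-contained variation on the loop above (not the answer's own code), the same matrix can be built by passing a list of Series straight to the DataFrame constructor, so that each Series of means becomes one row. The row lists in L below are hypothetical stand-ins for l1 ... lx.

```python
import pandas as pd

df = pd.DataFrame(data=[[1, 2, 3]] * 5, index=range(3, 8), columns=['a', 'b', 'c'])

# Hypothetical row-position lists standing in for l1, l2, ..., lx.
L = [[0, 2, 3], [1, 4]]
columns = ['a', 'b']

# Each Series of column means becomes one row of the result,
# giving len(L) rows and len(columns) columns.
mean_matrix = pd.DataFrame([df[columns].iloc[rows].mean(axis=0) for rows in L])
print(mean_matrix)
```

Since every row of this example df is identical, each row of the result is simply 1 and 2; with real data the rows would differ.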
