I have a situation that I'm curious to know how R can handle efficiently. Let's say I have a data set that has 2 columns, v1 and v2. Now, I want a way to evaluate column v1 and check it 3 rows at a time (i.e. rows 1 to 3, 4 to 6, and so on) for the following 2 conditions: a) any of the 3 rows of v1 contains 0; b) any of the 3 rows of v1 contains a 3-digit number.
If the conditions are met, swap the 3 values in the v1 column with the values in v2.
I am struggling to find a way to do this in R. It has to be done on 500,000 rows and 5 columns, so efficiency is important.
Thanks!
I'm not yet proficient in R, but here's one way:
    # sample data
    df = data.frame(col1=c(1,0,3,4,5,6,107,8,9), col2=c(9,8,7,6,5,4,3,2,1))
    #   col1 col2
    # 1    1    9
    # 2    0    8
    # 3    3    7
    # 4    4    6
    # 5    5    5
    # 6    6    4
    # 7  107    3
    # 8    8    2
    # 9    9    1

    # function to evaluate the condition: ==0 or 3 digits
    cond <- function(x) {
      any(x==0 | (x>=100 & x<=999))
    }

    # add a column telling whether to swap, by running the cond function on groups
    # of three. expand the groups again by repeating each value 3 times to match
    # the rows. sapply returns a plain logical vector rather than a list.
    df$swap = rep(sapply(split(df$col1, ceiling(seq_along(df$col1)/3)), FUN=cond), each=3)

    # swap the indicated rows
    df[df$swap, c('col1', 'col2')] = df[df$swap, c('col2', 'col1')]

    # remove the swap column
    df <- within(df, rm(swap))
    #   col1 col2
    # 1    9    1
    # 2    8    0
    # 3    7    3
    # 4    4    6
    # 5    5    5
    # 6    6    4
    # 7    3  107
    # 8    2    8
    # 9    1    9
I'm not fond of adding the swap column to the data frame, but I'm not quite sure how to avoid it.
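One possible way to avoid the extra column, and to stay vectorized for something like 500,000 rows, is to keep the swap flags in a plain vector instead of in the data frame. The sketch below is only an illustration under a few assumptions: `swap_blocks` is a made-up helper name, the number of rows is a multiple of 3, the column contains no NA values, and "3-digit number" means 100 to 999 as in the code above. It reshapes the per-row test into a matrix with one column per block of three rows, so no `split` or per-group function call is needed.

```r
# Same sample data as in the answer above
df <- data.frame(col1 = c(1, 0, 3, 4, 5, 6, 107, 8, 9),
                 col2 = c(9, 8, 7, 6, 5, 4, 3, 2, 1))

# swap_blocks is a hypothetical helper, not part of base R.
# Assumes nrow(df) is a multiple of `size` and column `a` has no NA values.
swap_blocks <- function(df, a = "col1", b = "col2", size = 3) {
  stopifnot(nrow(df) %% size == 0)
  x <- df[[a]]

  # per-row test: zero, or a 3-digit number (100..999)
  hit <- x == 0 | (x >= 100 & x <= 999)

  # matrix() fills column-wise, so each column holds one block of `size` rows;
  # a block qualifies if at least one of its rows matched
  block_hit <- colSums(matrix(hit, nrow = size)) > 0

  # expand back to one flag per row, kept in a local vector (no helper column)
  swap <- rep(block_hit, each = size)

  # exchange the two columns on the flagged rows only
  tmp <- df[[a]][swap]
  df[[a]][swap] <- df[[b]][swap]
  df[[b]][swap] <- tmp
  df
}

result <- swap_blocks(df)
```

On the sample data this should give the same result as the version with the swap column. If the row count is not a multiple of 3, the last partial block would need separate handling, which this sketch does not attempt.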