python - How to parallelize the numpy operations in cython


I am trying to parallelize the following code, which includes numerous numpy array operations:

    #fft_fit.pyx
    import cython
    import numpy as np
    cimport numpy as np
    from cython.parallel cimport prange
    from libc.stdlib cimport malloc, free

    dat1 = np.genfromtxt('/home/bagchilab/sumanta_files/fourier_ecology_sample_data_set.csv',delimiter=',')
    dat = np.delete(dat1, 0, 0)
    yr = np.unique(dat[:,0])
    fit_dat = np.empty([1,2])


    def fft_fit_yr(np.ndarray[double, ndim=1] yr, np.ndarray[double, ndim=2] dat, int yr_idx, int pix_idx):
        cdef np.ndarray[double, ndim=2] yr_dat1
        cdef np.ndarray[double, ndim=2] yr_dat
        cdef np.ndarray[double, ndim=2] fft_dat
        cdef np.ndarray[double, ndim=2] fft_imp_dat
        cdef int len_yr = len(yr)
        for i in prange(len_yr ,nogil=True):
            with gil:
                yr_dat1 = dat[dat[:,yr_idx]==yr[i]]
                yr_dat = yr_dat1[~np.isnan(yr_dat1).any(axis=1)]
                print "index" ,i
                y_fft = np.fft.fft(yr_dat[:,pix_idx])
                y_fft_abs = np.abs(y_fft)
                y_fft_freq = np.fft.fftfreq(len(y_fft), 1)
                x_fft = range(len(y_fft))
                fft_dat = np.column_stack((y_fft, y_fft_abs))
                cut_off_freq = np.percentile(y_fft_abs, 25)
                imp_freq =  np.array(y_fft_abs[y_fft_abs > cut_off_freq])
                fft_imp_dat = np.empty((1,2))
                for j in range(len(imp_freq)):
                    freq_dat = fft_dat[fft_dat[:, 1]==imp_freq[j]]
                    fft_imp_dat  = np.vstack((fft_imp_dat , freq_dat[0,:]))
                fft_imp_dat = np.delete(fft_imp_dat, 0, 0)
                fit_dat1 = np.fft.ifft(fft_imp_dat[:,0])
                fit_dat2 = np.column_stack((fit_dat1.real, [yr[i]] * len(fit_dat1)))
                fit_dat = np.concatenate((fit_dat, fit_dat2), axis = 0)

I have used the following code for setup.py:

    ####setup.py
    from distutils.core import setup
    from distutils.extension import Extension
    from Cython.Distutils import build_ext

    setup(
        cmdclass = {'build_ext': build_ext},
        # The OpenMP flags belong to the Extension, not to setup().
        ext_modules = [Extension("fft_fit_yr", ["fft_fit.pyx"],
                                 extra_compile_args=['-fopenmp'],
                                 extra_link_args=['-fopenmp'])]
    )

But I am getting the following error when I compile fft_fit.pyx in Cython:

    for i in prange(len_yr ,nogil=True):
    target may not be a Python object as we don't have the GIL

Please let me know where I am going wrong while using the prange function. Thanks.

You can't (at least not using Cython).

Numpy functions operate on Python objects and therefore require the GIL, which prevents multiple native threads from executing in parallel. If you compile your code using cython -a, you will get an annotated HTML file that shows where Python C-API calls are being made (and therefore where the GIL can't be released).

Cython is most useful where you have a specific bottleneck in your code that cannot be sped up using vectorization. If your code is already spending most of its time in numpy function calls, then calling those exact same functions from Cython is not going to result in any significant performance improvement. In order to see a noticeable difference you would need to write some or all of your array operations as explicit for loops. However, it looks to me as though there are much simpler optimizations that could be made to your code.
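As one illustration of the kind of simpler optimization meant here, the inner `for j` loop in the question, which grows `fft_imp_dat` one row at a time with `np.vstack`, can probably be replaced by a single boolean mask. The sketch below assumes the intent is to keep only the FFT coefficients whose magnitude exceeds the 25th percentile. The name `filter_fft` and the sample signal are hypothetical, and the result can differ from the original loop when two coefficients have exactly the same magnitude (the loop always takes the first match).

```python
import numpy as np


def filter_fft(signal, percentile=25):
    """Keep FFT coefficients above a magnitude percentile, then invert.

    Hypothetical helper: a vectorized stand-in for the question's inner
    loop, not the asker's exact code.
    """
    y_fft = np.fft.fft(signal)
    y_fft_abs = np.abs(y_fft)
    cut_off = np.percentile(y_fft_abs, percentile)
    # One boolean mask replaces the repeated np.vstack calls.
    kept = y_fft[y_fft_abs > cut_off]
    return np.fft.ifft(kept).real


# Made-up sample data, for illustration only.
signal = np.array([1.0, 4.0, 2.0, 8.0, 5.0, 7.0, 3.0, 6.0])
fit = filter_fft(signal)
```

Avoiding repeated `np.vstack` calls matters because each call copies the whole array built so far.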

I would suggest that you do the following:

  1. Profile your original Python code (e.g. using line_profiler) to see where the bottlenecks are.
  2. Focus your attention on speeding up these bottlenecks in the single-threaded version. You should ask a separate question if you want help with this.
  3. If the optimized single-threaded version is still too slow for your needs, try parallelizing it using joblib or multiprocessing. Parallelization is the last tool to reach for once you've tried everything else you can think of.
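The last step could look roughly like the sketch below, which uses `multiprocessing.Pool` to process each year in a separate worker process. All names here (`fit_one_year`, `fit_all_years`) and the synthetic data are hypothetical; `fit_one_year` is only a placeholder for whatever the optimized single-threaded per-year routine turns out to be, and it assumes the values sit in column 1.

```python
import numpy as np
from multiprocessing import Pool


def fit_one_year(yr_dat):
    """Placeholder per-year work: a round trip through the FFT (assumed column 1)."""
    clean = yr_dat[~np.isnan(yr_dat).any(axis=1)]  # drop rows containing NaN
    return np.fft.ifft(np.fft.fft(clean[:, 1])).real


def fit_all_years(dat, yr_idx=0, processes=1):
    """Split `dat` by year and process each chunk, optionally in parallel."""
    years = np.unique(dat[:, yr_idx])
    chunks = [dat[dat[:, yr_idx] == y] for y in years]
    if processes == 1:
        # Serial path: easiest to profile and debug.
        results = [fit_one_year(c) for c in chunks]
    else:
        # Each worker is a separate process with its own interpreter and GIL.
        with Pool(processes) as pool:
            results = pool.map(fit_one_year, chunks)
    return dict(zip(years.tolist(), results))


# Tiny synthetic data set: column 0 is the year, column 1 the value.
dat = np.array([[2000.0, 1.0], [2000.0, 2.0], [2000.0, np.nan],
                [2001.0, 3.0], [2001.0, 4.0], [2001.0, 5.0]])
fits = fit_all_years(dat)  # serial by default
```

When passing `processes` greater than 1, call `fit_all_years` from under an `if __name__ == "__main__":` guard, which multiprocessing needs on platforms that start workers by spawning a fresh interpreter. Whether this pays off depends on how much work each year involves compared with the cost of sending the chunks to the workers.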
