OpenCL: multiple in-order command queues vs. a single out-of-order one


I have a number of jobs to execute. Each job consists of a buffer write, a kernel execution, and a buffer read, and these operations must of course be executed in order. The various jobs are independent, however, and can therefore be executed concurrently.

Is there a performance difference between using multiple in-order command queues (like CUDA streams) and a single out-of-order one with equivalent synchronization? Which is better?

Some implementations don't support out-of-order command queues.

Based on your description, I'd use multiple in-order queues. Using a single out-of-order queue would require you to use events to synchronize within each virtual queue, which is more work for you.

