multithreading - OpenCL program running on CPU
I want to compare the performance of a single-core CPU and a multi-core CPU. I wrote a program and let it iterate 1000 times on a single core to measure the running time. For the multi-core case, I used OpenCL to launch a kernel whose code is the same as the body of the iteration in the first case.
I assumed the multi-core version runs 8 concurrent threads, so theoretically the running time of the multi-core case should be no lower than t(single-core)/8. However, the results show that t(multi-core) is 1/20 of t(single-core).
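To make that expectation concrete, here is a minimal sketch in C of the arithmetic. The function names are made up for illustration, and it assumes the work divides perfectly across the threads with no scheduling or launch overhead.

```c
#include <assert.h>

/* Best-case multi-core time if the work splits perfectly across `threads`
   and every thread runs the same scalar code as the single-core version.
   (Hypothetical helper, not part of OpenCL.) */
double ideal_parallel_time(double t_single, int threads) {
    assert(threads > 0);
    return t_single / threads;
}

/* Observed speedup of the multi-core run relative to the single-core run. */
double observed_speedup(double t_single, double t_multi) {
    assert(t_multi > 0.0);
    return t_single / t_multi;
}
```

With 8 threads the best thread-only time is t(single-core)/8, which is a speedup of 8. A measured time of t(single-core)/20 is a speedup of 20, so threading alone cannot account for it.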
I wonder why this happens. Did the OpenCL compiler optimize the code for the multi-core CPU?
If the single-core code is scalar, chances are that the OpenCL runtime used SSE or AVX, which adds a further multiplier on top of the thread count.
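A rough way to see how the two factors combine, as a sketch in C. The helper names are hypothetical, and it assumes 32-bit float data and perfect vectorization; the question does not say which data type or CPU was used.

```c
#include <assert.h>

/* Number of elements that fit in one SIMD register.
   Assumption: 128-bit SSE registers, 256-bit AVX registers. */
int simd_lanes(int register_bits, int element_bits) {
    assert(element_bits > 0 && register_bits % element_bits == 0);
    return register_bits / element_bits;
}

/* Upper bound on speedup over scalar single-core code when the runtime
   uses both multiple threads and SIMD lanes within each thread. */
int speedup_ceiling(int threads, int lanes) {
    assert(threads > 0 && lanes > 0);
    return threads * lanes;
}
```

Under these assumptions, `speedup_ceiling(8, simd_lanes(128, 32))` gives 32 for SSE and `speedup_ceiling(8, simd_lanes(256, 32))` gives 64 for AVX, so a measured speedup of 20 sits comfortably between the thread-only limit of 8 and the combined ceiling.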