I tried using Modin to read a CSV file (about 5 GB). This is the code I used:
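
    import modin.pandas as pd_modin  # alias must match the name used below

    # Single run; raise the range to repeat the measurement
    for run in range(0, 1):
        df = pd_modin.read_csv("DM_ALUNO.CSV")
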
I get a good speedup compared to pandas on an 80-core Intel Skylake CPU. What confuses me is that when I profiled my code with Intel VTune Profiler (one of the most standard tools for profiling CPU usage), the CPU usage histogram was almost the same for pandas and Modin. The attached image shows the CPU histogram collected for Modin (the average core utilisation is very low for an 80-core CPU).
Could you shed some light on why the core usage has not increased with Modin? And how does Modin provide the speedup, if not by using more cores?