Marrying Haskell and Hyper-Threading (Post Scriptum)
After writing the previous blog post, I got some interesting feedback from our work chat and r/haskell. I want to highlight some of it explicitly. The order of the feedback is arbitrary.
The first item is a discussion of explicitly pinning capabilities to cores. It's possible with the +RTS -qa flag, as nh2 mentioned on Reddit. As I mentioned in the previous blog post, my approach will not work correctly with this option (for some reason I wrote -xm instead of -qa in that post, I'm sorry), and I would need to redefine more functions. But in general, pinning capabilities to cores may work on all possible CPU layouts. I have not looked deeply into the issue, because in most of our cases the -qa flag gave me worse performance, so your program needs some special properties to benefit from hard pinning. I think it's possible to use /proc/cpuinfo to make the best effort when pinning capabilities.
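As a rough illustration of the /proc/cpuinfo idea, here is a minimal sketch that groups logical processors by the physical core they belong to. It assumes the x86 Linux layout, where each processor record carries "processor", "physical id" and "core id" fields; other architectures may omit them. The names coreSiblings and onePerCore and the sample text are made up for this example and are not part of the wrapper from the previous post.

```haskell
module Main where

import Data.Char (isSpace)
import Data.Function (on)
import Data.List (dropWhileEnd, groupBy, sortOn)
import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

-- | Strip leading and trailing whitespace (the real file uses tabs before ':').
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- | Split "key : value" into a trimmed pair; lines without ':' are dropped.
parseField :: String -> Maybe (String, String)
parseField l = case break (== ':') l of
  (k, ':' : v) -> Just (trim k, trim v)
  _            -> Nothing

-- | Split the text into per-processor records, separated by blank lines.
records :: String -> [[(String, String)]]
records = filter (not . null) . map (mapMaybe parseField) . splitOnBlank . lines
  where
    splitOnBlank [] = []
    splitOnBlank ls =
      let (block, rest) = break (all isSpace) ls
      in block : splitOnBlank (drop 1 rest)

-- | For every (physical id, core id) pair, the logical processors on that core.
-- Assumption: all three fields are present, as on x86 Linux.
coreSiblings :: String -> [((Int, Int), [Int])]
coreSiblings txt =
  [ (key, map snd g)
  | g@((key, _) : _) <- groupBy ((==) `on` fst) (sortOn fst entries) ]
  where
    entries = mapMaybe toEntry (records txt)
    toEntry fields = do
      cpu  <- lookup "processor" fields >>= readMaybe
      pkg  <- lookup "physical id" fields >>= readMaybe
      core <- lookup "core id" fields >>= readMaybe
      pure ((pkg, core), cpu)

-- | One logical processor per physical core: candidates for pinning.
onePerCore :: String -> [Int]
onePerCore txt = [cpu | (_, cpu : _) <- coreSiblings txt]

-- | Made-up sample: one package, two cores, two hyper-threads per core.
sample :: String
sample = unlines
  [ "processor : 0", "physical id : 0", "core id : 0", ""
  , "processor : 1", "physical id : 0", "core id : 1", ""
  , "processor : 2", "physical id : 0", "core id : 0", ""
  , "processor : 3", "physical id : 0", "core id : 1"
  ]

main :: IO ()
main = do
  print (coreSiblings sample)
  print (onePerCore sample)
```

On a real machine the sample would be replaced by the result of readFile "/proc/cpuinfo". How the resulting list is then fed into the pinning logic depends on the wrapper and is left out here.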
The entire thread is very entertaining, and if you are interested in the topic I recommend checking out the other comments as well.
Secondly, there was a question whether my reasoning was incorrect and whether it's enough to leave one thread off and still get better performance. We used this approach in some projects; however, for one particular case the results with N-1 threads were very depressing:
Cumulative quantiles per tag (N7)

          99%      98%      95%      90%      85%      80%      75%      50%
Overall   4600ms   4380ms   3980ms   3540ms   3400ms   3280ms   3210ms   1105ms
get       4600ms   4390ms   3980ms   3550ms   3410ms   3290ms   3210ms   1145ms
put       4600ms   4380ms   3980ms   3540ms   3400ms   3280ms   3210ms   1100ms

Cumulative quantiles per tag (N4)

          99%      98%      95%      90%      85%      80%      75%      50%
Overall   139ms    105ms    37ms     17ms     12ms     8ms      6ms      2ms
get       139ms    104ms    37ms     18ms     12ms     9ms      7ms      2ms
put       139ms    105ms    37ms     17ms     12ms     8ms      6ms      2ms
There is a difference of 1 to 3 orders of magnitude in response times. Without going deeper, I have decided to stick with -N4 for now.
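For anyone who wants to try the "leave one thread off" variant without touching the command line, the number of capabilities can also be set from inside the program. The sketch below uses getNumProcessors from GHC.Conc and setNumCapabilities from Control.Concurrent; the helper capsFor is a name invented for this example. It assumes the program is linked with -threaded, since the non-threaded RTS runs on a single capability.

```haskell
module Main where

import Control.Concurrent (getNumCapabilities, setNumCapabilities)
import GHC.Conc (getNumProcessors)

-- | Capabilities to use when leaving `spare` logical processors free.
-- Never goes below one, so the program can still run on a single-CPU box.
capsFor :: Int -> Int -> Int
capsFor spare procs = max 1 (procs - spare)

main :: IO ()
main = do
  procs <- getNumProcessors              -- logical processors seen by the RTS
  setNumCapabilities (capsFor 1 procs)   -- the "N-1" setup discussed above
  caps <- getNumCapabilities
  putStrLn ("processors: " ++ show procs ++ ", capabilities: " ++ show caps)
```

On a machine with 8 logical processors, capsFor 1 8 gives the same 7 capabilities as -N7, while a fixed setNumCapabilities 4 would mirror -N4. Neither variant pins capabilities to specific cores.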
Third, @TerrorJack advised me to improve the teardown procedure in wrapper.c, as it should check whether the RTS was stopped and report its status. So I have rechecked the sources and introduced a few updates that report the status of the running Haskell command (the same way the RTS does) and that do not require using the FFI extension in the Haskell code.