Random high CPU load using transmission 1.50
Posted: Tue Feb 24, 2009 12:22 pm
This is mostly for the Transmission developers: I think I have found what triggers it.
(You are doing a great job, thanks for this fine client!)
I am running transmission 1.50 (7887) on Arch Linux, downloading one ~30G torrent with 4M pieces (which means around 7000 pieces) and 16k blocks. There are 3-6 seeds and 10-40 leechers. Normal CPU load is 6-8%. Upload is limited. Download is not, because limiting it has its own problems. (I will write another post about that later.)
At random times the CPU load climbs to 50%, 60%, 75% or 99% and stays there almost constantly for minutes, sometimes for half an hour or even hours. The upload drops to half; the download is hard to measure, but I think it drops too.
Oprofile pointed to the refillPulse and blockIteratorNext functions in peer-mgr.c. Reviewing the code, I found a fairly likely worst-case scenario. Suppose there is a leecher that is still downloading, has only a small number of completed pieces, and has nothing important to us. In refillPulse this will most probably be the last peer left in the peer list. Now we have to find something we can download from it. What it has may be, say, piece number 3000 in our list. For every piece before that, blockIteratorNext builds a block list (looping 256 times) and refillPulse loops 256 times again, only to find that the piece is no good for this peer. That can add up to millions of wasted cycles.
As a proof of concept I inserted one line at the end of the while loop in refillPulse:
    if( !handled ) blockIterator->blockIndex = blockIterator->blockCount;
I know this is not a proper solution, but it skips the wasted refillPulse iterations.
Yesterday the high CPU load was triggered almost every time, at around 50%. So I tried the modified refillPulse, and the CPU load dropped to 16%.
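To put rough numbers on the scenario, here is a minimal sketch in C. It is a simplified model, not the actual peer-mgr.c code: the function count_block_checks, its parameters and the peer_has array are all hypothetical names made up for illustration. It only counts how many block checks one peer costs in one pass, with and without giving up on a piece after the first block that cannot be requested.

```c
#include <stddef.h>

/* Hypothetical model, NOT Transmission code: count the block checks that
 * one peer costs during a single refill pass.
 *
 * peer_has         - peer_has[i] != 0 if the peer has the i-th wanted piece
 * wanted_pieces    - number of pieces in our request list
 * blocks_per_piece - e.g. 4M piece / 16k block = 256
 * skip_on_miss     - if nonzero, abandon a piece after its first failed
 *                    block (the idea behind the one-line change)
 */
long count_block_checks(const unsigned char *peer_has, size_t wanted_pieces,
                        size_t blocks_per_piece, int skip_on_miss)
{
    long checks = 0;

    for (size_t piece = 0; piece < wanted_pieces; ++piece) {
        for (size_t block = 0; block < blocks_per_piece; ++block) {
            ++checks;
            if (peer_has[piece])
                return checks;  /* found a block this peer can serve */
            if (skip_on_miss)
                break;          /* the rest of this piece is useless too */
        }
    }
    return checks;              /* nothing usable was found */
}
```

In this model, if the only usable piece sits at position 3000 and each piece has 256 blocks, the count is 768,001 checks without the skip and 3,001 with it, for one peer in one pass. A few such peers, or repeated passes, get into the millions quickly.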
I hope this helps.