System: Core2Duo E4500, 4GB memory, 1Gbit Intel NIC, 64-bit system. A dedicated router handles the ISP connection (so this system doesn't perform any NAT or firewalling). Nominally a 25Mbit/s internet connection, but it is faster in practice. No other ``heavy'' processes on the system -- only essential system processes like cron, plus transmission-daemon.
Torrents: 357 torrents, all 100% complete (seeding), 692.1GiB according to ``transmission-remote --list''. Upload speed is between 2000KiB/s and 4500KiB/s, with an average of about 3000KiB/s. No limits are set via the config file.
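As a quick sanity check on the units, the upload speeds above can be converted to Mbit/s and compared with the nominal 25Mbit/s line. This is only a small sketch; the function name is mine, and it assumes KiB means 1024 bytes and Mbit means 10^6 bits.

```python
def kib_per_s_to_mbit_per_s(kib_per_s):
    """Convert KiB/s (1024-byte units) to Mbit/s (10**6 bits)."""
    return kib_per_s * 1024 * 8 / 1_000_000

# Minimum, average and maximum upload speeds quoted in the post.
for speed in (2000, 3000, 4500):
    print(f"{speed} KiB/s = {kib_per_s_to_mbit_per_s(speed):.1f} Mbit/s")
```

The 3000KiB/s average works out to roughly 24.6Mbit/s, and the 4500KiB/s peaks to roughly 36.9Mbit/s, which matches the remark that the line is faster in practice than its nominal rate.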
transmission-daemon, compiled with standard options (no profiling, optimizations), consumes 12-14% of CPU (that is, of one core, so idle is ~185% out of 200%). It is very responsive while upload speed is about 2000KiB/s.
When the speed is higher than 2000KiB/s, it becomes unresponsive: a simple ``transmission-remote --list'' can take 30 seconds, or even end with a timeout or ``no answer''; Transmission Remote GUI loses its connection and cannot reconnect, etc. Please note that there is plenty of CPU available at this moment: one core is completely free, and the other is loaded only to 15%.
transmission-daemon has 4 threads, but only 1 of them really does any work. Two others don't consume CPU at all.
Profiling with gprof (gcc -pg -g) shows that most of the time (about 50%) is consumed by bandwidthPulse(), most of THAT time is consumed by reconnectPulse(), and, further down the tree, by tr_bandwidthAllocate(). NB: there are NO bandwidth limits in the configuration file! makeNewPeerConnections() consumes a lot of CPU too.
The functions comparePeerCandidates() and addValToKey() spend a lot of time in themselves (excluding child calls and syscalls): 9% and 4.2% respectively.
Ok. Now to syscall analysis with kdump. I traced the daemon for 20 seconds and got these statistics by number of calls (top 20):
gettimeofday 87054
writev 31118
clock_gettime 16124
kevent 8378
sendto 7150
readv 4577
ioctl 4577
fstat 3888
pread 3883
recvfrom 3831
poll 2288
getpid 1384
madvise 951
fcntl 945
setsockopt 908
close 503
socket 461
connect 461
bind 436
stat 190
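The per-second rates quoted in the questions can be reproduced from these 20-second counts. A minimal sketch; the dictionary just copies a few rows of the table, and the names `counts`, `TRACE_SECONDS` and `per_second` are mine.

```python
# Counts copied from the kdump summary above (20-second trace).
TRACE_SECONDS = 20

counts = {
    "gettimeofday": 87054,
    "writev": 31118,
    "clock_gettime": 16124,
    "kevent": 8378,
    "poll": 2288,
}

def per_second(calls, seconds=TRACE_SECONDS):
    """Average call rate over the trace window."""
    return calls / seconds

for name, calls in counts.items():
    print(f"{name:14s} {per_second(calls):8.1f} calls/s")
```

That gives roughly 4350 gettimeofday() calls/s, about 806 clock_gettime() calls/s and about 114 poll() calls/s, which is where the ``4000+'', ``800'' and ``100'' figures come from.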
Ok. Enough data. Now I have some questions for the authors :)
(1) Why is every RPC so sluggish when there is a lot of CPU available?
(2) Why are there 4000+ calls per second to gettimeofday(), an expensive syscall? The standard resolution of the system timer on FreeBSD is coarser than that, and I think Linux doesn't use such a high-resolution timer for this call either. And, please note, this is when there is no need for bandwidth allocation at all.
(2a) Even with bandwidth allocation -- do we need to call gettimeofday() SO often? What precision and guarantees are we trying to achieve with this?
(3) There are another 800 calls/second to a second time-related syscall, clock_gettime(). WHY?
(4) What do the 3 other threads of transmission-daemon do? It seems that one is DHT and another is "web" (RPC?), but then why is RPC so sluggish?
(5) Why are there poll() calls when libevent uses kevent()? (100 polls/s should not affect performance, but I ask for completeness :))
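To illustrate what questions (2) and (2a) are getting at: a common way to cut the number of time syscalls is to read the clock once per event-loop iteration and let everything in that iteration use the cached value. The following is a sketch of that general technique in Python, not Transmission's actual code; `CachedClock`, `refresh()`, `now()` and `run_iterations()` are hypothetical names, and the loop merely stands in for a libevent-style loop.

```python
import time

class CachedClock:
    """Reads the real clock only in refresh(); now() returns the cached value."""

    def __init__(self, source=time.time):
        self._source = source   # the real (comparatively expensive) clock read
        self.reads = 0          # number of real clock reads so far
        self._cached = 0.0
        self.refresh()

    def refresh(self):
        """Hit the real clock once and remember the result."""
        self._cached = self._source()
        self.reads += 1

    def now(self):
        """Cheap lookup: no clock read, just the cached timestamp."""
        return self._cached


def run_iterations(clock, iterations, lookups_per_iteration):
    """Simulate an event loop: one real clock read per iteration, many lookups."""
    lookups = 0
    for _ in range(iterations):
        clock.refresh()
        for _ in range(lookups_per_iteration):
            clock.now()
            lookups += 1
    return lookups


clock = CachedClock()
lookups = run_iterations(clock, iterations=10, lookups_per_iteration=400)
print(f"{lookups} time lookups served by {clock.reads} real clock reads")
```

The trade-off is precision: a timestamp is only as fresh as the last loop iteration, so this only fits code that does not need sub-iteration accuracy.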
Really, I'm satisfied with transmission-daemon's performance, as it looks like I will never have an Internet connection faster than my server can serve. But this GUI sluggishness is annoying.