Porter wrote:
Jordan wrote:Under the approach you're describing, the steps would be:
* Transmission prefetches 15 blocks for torrent A, peer A1 because it knows it will be sending them soon.
* Transmission prefetches 15 blocks for torrent A, peer A2 because it knows it will be sending them soon.
* Transmission prefetches 15 blocks for torrent A, peer A3 because it knows it will be sending them soon.
* Repeat the previous steps for the remaining torrents
* User hits "verify local data" for torrent X
* Verify loads all of torrent X into the cache at the same priority as the prefetched blocks. If X is large, the prefetched blocks get evicted.
* The OS then has to hit the disk again to reload the previously prefetched blocks we're uploading in every torrent except X
* In exchange, some or all of torrent X is kept in the cache even if there are 0 downloading peers
The problem with this approach isn't the kernel, it's that we've told the kernel to give equal weight to blocks loaded by verify and blocks prefetched due to peer requests. The latter are ones we'll definitely need soon, and the former are ones we may or may not need.
Huh, it looks like we're talking about 2 different things?! And it looks like you're trying to implement another I/O caching layer, on top of the existing kernel one, with different rules.
When I talk about Transmission prefetching blocks, I'm referring to the use of POSIX_FADV_WILLNEED and F_RDADVISE to tell the OS about upcoming reads. We know in advance that we'll need these blocks because we can peek the request lists that the peers have sent us:
Code:
int
tr_prefetch (int fd UNUSED, off_t offset UNUSED, size_t count UNUSED)
{
#ifdef HAVE_POSIX_FADVISE
  /* Linux and other POSIX systems: hint that this byte range
     will be read soon, so the kernel can start readahead now. */
  return posix_fadvise (fd, offset, count, POSIX_FADV_WILLNEED);
#elif defined (SYS_DARWIN)
  /* macOS equivalent: issue an advisory read via fcntl(). */
  struct radvisory radv;
  radv.ra_offset = offset;
  radv.ra_count = count;
  return fcntl (fd, F_RDADVISE, &radv);
#else
  /* No prefetch hint available on this platform; the arguments
     go unused, hence the UNUSED markers above. */
  return 0;
#endif
}
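For context, here's a hedged sketch of what a call site might look like. The peer-request struct and the callback name are illustrative, not Transmission's actual code:
Code:
/* Illustrative only: when a peer's request is queued for upload,
 * hint the kernel so the read is warm by the time we send it.
 * struct peer_request and on_peer_request_queued are hypothetical;
 * off_t/size_t come from <sys/types.h>. */
struct peer_request { off_t offset; size_t length; };

static void
on_peer_request_queued (int fd, const struct peer_request * req)
{
    /* Best-effort hint; if it fails we simply read cold later. */
    tr_prefetch (fd, req->offset, req->length);
}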
Porter wrote:I'm afraid it just can't work well. How much memory are you controlling with "your" cache? Do you take care of all 6-7 GB that I have available? Of course not, at least not in an efficient way. You're actually killing gigabytes of cached content to protect a few blocks in "your" cache. Do you really think that's fair?
None of this is relevant to my question, which is about OS-level prefetching.
It's interesting, though, that you chose to "quote" me "twice" on something that I wasn't talking about.
Porter wrote:First of all, what do you mean when you say "transmission prefetches"? In all modern OSes it's the kernel that fetches blocks from disk, and if an application needs them, it asks the kernel via a system call to provide the block. The kernel then takes care of allocating memory, reading the block, doing readahead (prefetching the following blocks, anticipating they will be needed soon), caching the block(s) and eventually freeing them. The kernel also takes care to keep cached (in all available memory!) what's needed often, and to throw out what is not.
Yes.
Porter wrote:And it has the best view overall, because transmission is not the only app running on the OS, typically, right?
This is true to a point, but the app has a role. If it has specific knowledge about upcoming I/O, it can give hints to the OS to optimize for it. This is why things like posix_fadvise() exist.
This paragraph isn't relevant to the topic of prefetching, but as an aside, there's also a place for an app-level write cache in BitTorrent clients. The client has unique knowledge about what disk writes are upcoming because it knows (1) which blocks it's requested (2) from which peers and (3) how fast each peer is. A smart torrent client can lower the total number of disk writes with even a small in-memory write cache.
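To make that concrete, here's a minimal sketch of a write-coalescing cache, assuming blocks arrive in order within a piece. All names here are hypothetical, not Transmission's actual cache:
Code:
/* Hypothetical sketch: buffer incoming blocks per piece and flush
 * the whole piece with one pwrite() instead of one write per 16 KiB
 * block. Assumes in-order arrival; a real cache must also handle
 * gaps, memory pressure, and partial pieces at shutdown. */
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

struct piece_buf
{
    off_t     file_offset;  /* where this piece lives in the file */
    size_t    piece_size;   /* total size of the piece */
    size_t    bytes_held;   /* contiguous bytes buffered so far */
    uint8_t * data;         /* piece_size bytes, allocated elsewhere */
};

/* returns 0 on success, -1 on write error */
static int
cache_add_block (int fd, struct piece_buf * p,
                 const uint8_t * block, size_t len)
{
    memcpy (p->data + p->bytes_held, block, len);
    p->bytes_held += len;

    if (p->bytes_held == p->piece_size)
    {
        /* one large sequential write instead of many small ones */
        ssize_t n = pwrite (fd, p->data, p->piece_size, p->file_offset);
        p->bytes_held = 0;
        return n == (ssize_t) p->piece_size ? 0 : -1;
    }

    return 0;
}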
Porter wrote:Lastly, the quote "it's that we've told the kernel to give equal weight to blocks loaded by verify and blocks prefetched due to peer requests" just doesn't make sense, because there is no such system call, at least not in Linux. Actually, the kernel starts by giving equal weight to every page of memory, but it soon starts to prefer the pages that are accessed often, using multiple LRU lists, hardware page-table bits (was the page accessed or not?) and various complicated clock algorithms.
This finally gets to the question I asked. Let's say ${OS} is using LRU. Transmission prefetches the blocks it knows it's going to need soon. Then "verify local data" loads a torrent larger than the cache, causing the prefetched blocks to fall off the end of the LRU list. The end result is we lose blocks we know we're going to need, to make room for a torrent that may or may not have any peers.
That's the question I'm asking, anyway. Ideally I'd like to see some testing of how this plays out in practice on different OSes and with different cache sizes.
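One mitigation worth including in that testing, sketched here under the assumption that HAVE_POSIX_FADVISE is set: have verify tell the kernel it's done with each chunk right after hashing it, so verify's pages become eviction candidates before the prefetched upload blocks do. POSIX_FADV_DONTNEED only drops clean pages, which is fine for verify's read-only workload. The helper below is hypothetical, not existing Transmission code:
Code:
/* Sketch: read a chunk for "verify local data", then advise the
 * kernel that we won't reread it, so these pages are evicted
 * ahead of the prefetched upload blocks. */
static ssize_t
verify_read (int fd, void * buf, size_t len, off_t offset)
{
    ssize_t n = pread (fd, buf, len, offset);

#ifdef HAVE_POSIX_FADVISE
    if (n > 0)
        posix_fadvise (fd, offset, n, POSIX_FADV_DONTNEED);
#endif

    return n;
}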
Porter wrote:That is stuff that you just CAN'T implement in user space. So it's better not to even start, because you can just make things worse, and actually, you are.
How can I help you understand that the extra I/O layer can only slow things down? Help me to help you.
I hope this response helps you to understand the question I'm asking.