Describe the bug
Cyberduck currently attempts to assemble the .cyberducksegment files for multiple downloads concurrently. This results in excessive disk thrashing (and doubled writes), since the segments are also currently hardcoded to be stored in a folder adjacent to the requested download location (#11841).
This is made worse by the fact that no status updates are shown within Cyberduck itself other than "Disconnecting" (#13610).
To Reproduce
Attempt to download several folders containing sufficiently large files (see the example below)
Watch as the transfer gradually becomes an order of magnitude slower than a non-segmented download
Attempted Transfers (Actual Example)
Transfer3 on its own would have been more than sufficient to cause thrashing. Any scenario involving segmented downloads does; a single file being concatenated might simply go unnoticed, provided it isn't larger than a few gigabytes.
There was seemingly no rhyme or reason to the order in which files were actually "done": File3-3.mp4 was finished and reassembled before Transfer1 & Transfer2 had even finished downloading.
Add'l Notes on Reproduction (Server, Settings, etc.)
Bandwidth: Unlimited
Connections: 20 Connections
Preferences -> Transfers -> General
Transfers - Transfer Files: Open multiple connections
Downloads - Segmented downloads with multiple connections per file: enabled
Checksum - "Verify checksum": disabled for both downloads and uploads
I was connected to an FTP-SSL server
1 Gbps Internet connection
360 Mbps in Cyberduck downloading 2 files without segmented downloads, from the same server to the same disk
240 Mbps in aria2c (5 connections per file, 1 file)
466 Mbps in wget (single connection, 1 file)
(aria2c and wget were also pulling from the same FTP server and saving to the same disk)
stanky old 7200 RPM hard drive, see Screenshots
Expected behavior
I am not entirely sure what the ideal expected behavior is, but as a bare-minimum fix there should be some kind of queueing more sophisticated than "first-come, first-served, everybody at the same time"; Cyberduck is currently attempting to concatenate multiple files simultaneously.
This makes the process take an order of magnitude longer on any single mechanical hard drive, likely on any SSD "worse" than an MLC drive with a sufficient DRAM cache, and probably on most RAID arrays with parity. Something like SSDs in RAID, or even hard disks in RAID 10, will probably be fine, but you shouldn't need storage tuned for maximum SQL database performance to handle a segmented download. I'm not sure how a CoW filesystem would fare, as I don't currently have a ReFS partition or ZFS pool to test with (where I have permission to install Cyberduck and download large files willy-nilly), nor do I have the slightest idea how macOS handles low-level file operations.
A DRAM-less QLC SSD fares just as badly as my hard drive did, even before the concatenating starts; I've tested that.
I can really only see the current implementation being faster on something like an NVMe drive, or on something obscene like an Optane drive (RIP) or a RAM disk, and only when the underlying filesystem isn't garbage. Something Really Crappy™ like an exFAT-formatted 2.5" SMR hard disk would probably just keel over.
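As a rough illustration of the bare-minimum queueing idea above, here is a minimal sketch (hypothetical class and method names, not Cyberduck's actual code) of funneling all reassembly work through a single-threaded executor, so at most one file is being concatenated at a time while downloads keep running on their own pool:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not Cyberduck's actual classes: all reassembly work is
// funneled through one single-threaded executor, so at most one file is being
// concatenated at any given time while downloads continue on their own pool.
public final class SerializedConcatenation {
    private static final ExecutorService CONCAT_QUEUE = Executors.newSingleThreadExecutor();

    public static Future<?> submit(final Runnable concatenateOneFile) {
        // Downloads that finish early simply wait their turn here instead of
        // interleaving their reads/writes with every other reassembly job.
        return CONCAT_QUEUE.submit(concatenateOneFile);
    }
}
```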
How necessary are the segments in the first place?
In aria2c, for example, I can set falloc as the file allocation method (--file-allocation=falloc), which appears to work and instantly creates a "sparse file" on NTFS drives in Windows, and then download a 30 GiB file over 5 simultaneous connections without having to worry about aria2c "putting it back together" later. wget sidesteps the question entirely, as far as I can tell, since it sticks to a single connection and writes the file sequentially. Standalone torrent clients can similarly download files in "pieces" using sparse files (on NTFS, and presumably their equivalents elsewhere).
You can make either aria2c or e.g. qBittorrent take significantly longer if you enable or force "pre-allocation" and have it write zeros for the entire file, but at least that completes at or close to the drive's sequential write speed, "bad" SSDs or SMR hard drives notwithstanding.
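For comparison, here is a minimal sketch (hypothetical helper, assuming each segment's offset in the final file is known up front) of writing segments directly into the final file with java.nio instead of into separate .cyberducksegment files; on filesystems that support it, the unwritten gaps stay sparse:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: write a downloaded segment straight into the final file
// at its known offset. SPARSE is only a hint, honored when the file is first
// created and only on filesystems that support sparse files (e.g. NTFS).
public final class OffsetSegmentWriter {
    public static void writeSegment(final Path target, final long offset, final ByteBuffer data) throws IOException {
        try (FileChannel channel = FileChannel.open(target,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.SPARSE)) {
            // Positional write; no separate .cyberducksegment file and no
            // concatenation pass afterwards.
            channel.write(data, offset);
        }
    }
}
```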
Screenshots
don't mind the censor blocks, nobody needs to know about these ISO(BMFF) files
Note the Date Created/Modified and remaining sizes in the segment folders. There doesn't really appear to be any method to the madness; some of the later-started downloads had seemingly finished concatenating their files before the very first file to start downloading was done. Which is about what you'd expect when throwing what essentially becomes random I/O at a hard drive.
Performance improved as the queue shrank from 3 files remaining to 1; I regret not getting a screenshot while it was trying to put six back together at the same time.
If, for example, only a single file could be concatenated at a time, performance would still improve over the current setup, even if other downloads were still transferring. The current implementation seems to be the slowest way multiple segmented downloads can possibly be handled.
default CrystalDiskMark results for stinky old hard drive
ignore the sequential read speed, but that random I/O performance seems to be close to how Cyberduck behaves at the moment
It's an older 512-byte-sector drive with the default 4 KiB NTFS allocation unit size. I'm not sure how Cyberduck handles I/O directly, but a simple fix might also be to use larger reads/writes when concatenating, or to give the user the option to set a (RAM) cache for reading .cyberducksegment files (or to increase its size, if one already exists). I've never handled disk I/O at this kind of low level before.
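On the "larger reads/writes" idea, a minimal sketch (hypothetical class, assuming the segment paths are already available in order) of concatenating segments with FileChannel.transferTo, which lets the OS move the data in large sequential chunks instead of many small user-space copies:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical sketch: append each segment to the target file via
// FileChannel.transferTo, so the copy is handled by the OS in large chunks.
public final class SegmentConcatenator {
    public static void concatenate(final List<Path> segmentsInOrder, final Path target) throws IOException {
        try (FileChannel out = FileChannel.open(target,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            for (final Path segment : segmentsInOrder) {
                try (FileChannel in = FileChannel.open(segment, StandardOpenOption.READ)) {
                    long position = 0;
                    final long size = in.size();
                    while (position < size) {
                        // transferTo may copy less than requested, so loop until done.
                        position += in.transferTo(position, size - position, out);
                    }
                }
            }
        }
    }
}
```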
Specs
Windows 10 Version 22H2, 19045.4046
Cyberduck 9.0.3 (42112)
1.5 TB 7200 RPM hard drive, SATA2, not SMR, old and dusty but otherwise fine
Log Files
Oops. Sorry, but I am not repeating a download that was done transferring nearly an hour before the files were actually "done," and I have already disabled segmented downloads for the time being.
Additional context
If one downloads a single moderately large file, e.g. 1-2 GiB, this might be hardly noticeable: a hard drive would only spend a minute or so concatenating the file, and even the crappiest of cheap SSDs doesn't start to choke until somewhat past the 1-2 GiB mark.
A 10 GiB folder with five 2 GiB files may not be immediately noticeable either, since Cyberduck seems to begin concatenating each file as soon as it finishes downloading, and depending on connection limits in Cyberduck and on the FTP server itself, it might only be downloading 1-2 files at a time. I'm not sure whether it prioritizes remaining downloads over concatenation jobs or vice versa, if at all.
This might be even less noticeable if the last file(s) in the download order are smaller, since there's less "catching up" I/O to do.
TL;DR
The problems begin when a single large (>5 GiB) file has to be concatenated, and they get drastically worse as the number of (simultaneously) downloaded files and/or the number of "Transfer" jobs increases.
I don't think this qualifies as a duplicate of #10961 - I'm sure the improvements made by #13000 were significant, but there are still performance issues.
I do also understand that segmented downloads are probably best suited to scenarios where individual connections might be limited in speed (S3, SFTP?) and are (at least currently) really only usable on SSDs. I am aware that the segmented download itself only resulted in marginally faster transfer speeds over FTPS.
Throwing this in here regarding the sparse file usage: the segment files currently allow validating whether a given segment has already been downloaded completely, regardless of whether the backend storage supports segmented downloads.
In the case of FTP, a file is downloaded sequentially into segments, and completely downloaded segments are not retried later should the download be interrupted.
In the case of S3, multiple segments are written concurrently to fill as much bandwidth as possible.
Since transfers in Cyberduck don't know anything about the progress of data at rest, there would be no recoverability without the segments; a complete restart would be the only other option.
I don't know whether FileChannels can pre-allocate their storage.
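For what it's worth, a minimal sketch of one option (untested against Cyberduck's code, hypothetical helper): the JDK has no direct fallocate() equivalent, but plain java.io can extend a file to its expected final size up front with RandomAccessFile.setLength, and FileChannel can then write at arbitrary offsets; whether the unwritten region ends up sparse or physically allocated is up to the filesystem.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical sketch: pre-extend the target file to its expected final size.
// setLength() grows the file logically; whether the unwritten region is stored
// sparsely or physically allocated depends on the underlying filesystem.
public final class Preallocate {
    public static void toSize(final String path, final long expectedLength) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "rw")) {
            file.setLength(expectedLength);
        }
    }
}
```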