Add Creation With Upload extension #88
The creation extension is getting quite large here, and since it says:
Should we not rather place this paragraph under its own |
@kvz Since the Creation With Chunk only applies when the Creation extension is supported, I dislike the idea of splitting these two. |
How about solving that with something along the lines of:
|
Apart from the extension's size, is there any feedback on its content?
As for the content, I think it could use a
Where it says
Otherwise it makes sense to me, but for some reason I do have to try hard to grasp it. But I can't think of a better way to rephrase. Maybe we can let AJ take a look though? |
What about @cjhenck and @MMasterson? |
I think it might make sense to make it a separate extension that requires Creation because we are adding a second advertisement to the extensions list. I have some suggested wording changes that I will add to the diff, and personally think it would make sense to call it "Creation with Data" since it is more general. |
@@ -351,6 +351,20 @@ If the length of the upload exceeds the maximum, which MAY be specified using
the `Tus-Max-Size` header, the Server MUST respond with the
`413 Request Entity Too Large` status.

The Client MAY include the entire or a chunk of the data, which is meant to be
uploaded, in the body of the `POST` request. In this case, similar rules as for
How about this wording?
The Client MAY include the entirety or a chunk of the data to be uploaded in the body of the POST
request.
@cjhenck Thank you for the feedback! I mostly agree with your findings and will address them accordingly. Since you also suggested moving this into a separate extension, I will follow this path. However, instead of naming it Creation With Data, I now lean towards Creation With Upload. While "data" is a better fit than "chunk", it's still a bit too general. Data could be anything, maybe even metadata or additional information which is totally unrelated to this specific upload. In my opinion, the name Creation With Upload suits best, since it indicates not only that we are creating a new upload resource but also that we directly provide the entire upload data or just a part of it.
See tus/tus-resumable-upload-protocol#88 for the current proposal
Looks great to me! Thanks! |
When can we expect this to be merged and shipped in the next 1.1 release?
@cjhenck Great! |
### Creation With Upload

The Client MAY want to include parts of the upload in the initial Creation
request. This MAY be achieved using the Creation With Upload extension.
The Client MAY want to
The Client MAY
This MAY be achieved
This can be achieved
The reason: doing this is optional. But once a Client opts in, we can expect it to use this mechanism and nothing else, so I feel the second MAY is superfluous.
offer the Creation extension, it MUST NOT offer the Creation With Upload
extension either.

The Client MAY include the entire upload data or a chunk of it in the body of
MAY include either the entirety or a chunk of the upload data in the body
Damn, that sounds eloquent 👍
Thank you, @kvz and @AJvanLoon, for your suggestions. ❤️ |
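For readers following the thread, the request this extension describes can be sketched roughly as below. This is only an illustration, not text from the proposal: the `HttpRequest` type and the `creation_with_upload_request` helper are hypothetical, and the `Upload-Length` and `Content-Type: application/offset+octet-stream` headers are assumed from the published tus protocol, on the basis that the diff says rules similar to those for `PATCH` apply.

```python
from dataclasses import dataclass, field


@dataclass
class HttpRequest:
    """Hypothetical container for the parts of a request we care about."""
    method: str
    path: str
    headers: dict = field(default_factory=dict)
    body: bytes = b""


def creation_with_upload_request(path: str, data: bytes, total_length: int) -> HttpRequest:
    """Build a POST that creates the upload and already carries its first bytes."""
    if len(data) > total_length:
        raise ValueError("body is longer than the declared upload length")
    headers = {
        "Tus-Resumable": "1.0.0",
        "Upload-Length": str(total_length),
        # Assumed: same content type as a PATCH request, since the proposal
        # says similar rules apply to the body of the POST.
        "Content-Type": "application/offset+octet-stream",
        "Content-Length": str(len(data)),
    }
    return HttpRequest("POST", path, headers, data)


# Create a 100-byte upload and send its first 5 bytes in the same request.
request = creation_with_upload_request("/files", b"hello", total_length=100)
```

The body may be the whole upload or only a first chunk; the remaining bytes would then follow in ordinary `PATCH` requests.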
@Acconut @kvz @bhstahl @MMasterson Apologies for chiming in late, I'm just reading through this thread now, figured I would comment here as opposed to on #82. If my comments are irrelevant at this point that's ok, I'm thinking better to add a few thoughts even if they're old news. Also, apologies if I'm writing stuff you already know.
I implemented video upload in Vimeo's iOS application and many of the challenges described in #82 are familiar.
On iOS, and mobile in general, video upload in the foreground doesn't make a ton of sense from a user perspective. Videos tend to be large and are getting larger as device cameras improve. Users expect to be able to kick off an upload and put the device back in their pocket, it's not really acceptable to expect them to keep the app in the foreground until the upload completes. Unless we're talking about a really short / small video that can be uploaded in a few seconds (e.g. video upload from within the Instagram app).
Uploading in the background requires using the NSURLSession APIs. The NSURLSession APIs for background upload are very limited. They only allow you to pass in a local file URL. No option to pass in a stream or even an offset. This means that if the upload fails midway you have to start from 0 when you retry. You cannot resume from an offset. As the video files become larger this limitation becomes more and more cumbersome.
This means if you want to be able to resume uploads, you have to manually split the file into chunks (smaller individual files on disk), and pass the URLs to those files to the NSURLSession. But this is quite problematic from a UX perspective also. To my knowledge the only way to manually chunk a file is to use an AVAssetExportSession to do so. And this API requires that the chunking be performed when the app is in the foreground. It's also a very time consuming process. Chunking a 2GB file will take a while, getting back to the problem of the user having to keep the app in the foreground while the operation completes. Not to mention complexities of chunking precisely, and handling chunking failures etc. It also means you're literally making a copy of the file and thereby blowing up the user's disk space (at least until the upload completes).
You could create chunks as needed (i.e. only create a chunk once the previous chunk has completed), but this is not a solution I am confident in because: if chunk A finishes uploading in the background and you want to create the next chunk (chunk B) and initiate the background upload of that chunk, the time window you have between the moment upload A completes and when you must have initiated upload B is small and unpredictable (undocumented). In other words, I'm not confident that there is enough time to create a chunk in that time (also not sure AVAssetExportSession is usable at this moment in the app lifecycle since the app is technically still in the background).
Anyway, just wanted to throw some ideas / experiences out there. Hopefully something in here is valuable.
Our upload library is open source here: https://github.com/vimeo/VimeoUpload
It's not yet consumable as a Cocoapod (coming soon) but the code is there and is operational in our iOS app.
Thanks 👍
Thank you very much for this feedback, @alfiehanssen. It is invaluable to us. The actual reason we considered adding this extension was to provide a basis on which resumable uploads could also be achieved on the iOS platform. @MMasterson told us that the situation is pretty unfortunate, which your comment confirms. However, it seems that this extension still does not solve the problem entirely.
From what you are describing, does it even make sense to attempt this chunking approach? Your comment paints a rather bleak picture of the upload capabilities on iOS. Does the VimeoUpload library use the chunking method (AVAssetExportSession) and then upload in the background (NSURLSession)? If not, are there any alternative approaches?
@Acconut, phew! I'm glad this info was helpful. The VimeoUpload library does not use the chunking method. It just retries the file and in doing so starts from 0 again. I opted for this approach for v1.0 of the library and figured that, depending on usage and product goals, we'd decide later whether to prototype the chunking approach. Without having prototyped the chunking approach I can't really say whether it will work or not. My exploration though does make me skeptical. I think it will hinge on: (1) Whether a chunk can be created when the app is in the background (or do the iOS APIs involved in creating the chunk only work in the foreground), and (2) whether a chunk can be created within the window of time available, i.e. when the app is awakened in the background in response to the previous chunk upload having completed. That's a mouthful, but there's basically a callback that the background upload session invokes when a network task completes. I.e. the NSURLSession process (separate from your app's process) wakes your app up briefly. You can use this time to "respond to completion of the network task" (e.g. by updating persisted state etc.). Can you create a chunk and start uploading it before the app is killed again? As far as I know the duration of this time period is undocumented. And the time to create a chunk probably varies by device / OS profile. Let me know if you have other questions but that's basically where I ended up when considering chunking. |
Hi all, sorry to chime in late. @alfiehanssen thanks for adding your expertise to the discussion! If I recall, upload-with-create was a bit of a compromise: when I had considered using chunking, the problem I ran into was that there is no way to tell the server that it must either accept or reject a whole chunk. When a chunk fails, iOS will automatically retry it from the beginning, but the offsets will no longer match and all future requests will fail, and the upload is essentially stuck in an endless retry loop (hence both #82 and #83). #82 was meant to be a way to allow the application to upload individual chunks using the concatenation extension, while #83 would have allowed for sequential chunking. I believe that this extension will actually allow us to use chunking with concatenation, but with the side-effect of having unused data on the server until it expires. |
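The stuck retry loop described above can be reproduced with a toy model. This is a sketch under stated assumptions, not anyone's actual implementation: `AppendOnlyUpload` is a hypothetical in-memory stand-in for a server-side upload, and the `409` and `204` status codes are assumed from the published tus protocol's rules for `PATCH` requests.

```python
class AppendOnlyUpload:
    """Toy in-memory stand-in for a server-side upload resource (hypothetical)."""

    def __init__(self) -> None:
        self.data = bytearray()

    def patch(self, offset: int, chunk: bytes, fail_after=None) -> int:
        """Append chunk at offset and return an HTTP-like status code.

        fail_after simulates a connection drop after that many bytes of the
        chunk have already been stored by the server.
        """
        if offset != len(self.data):
            return 409  # offset mismatch: nothing is written
        if fail_after is not None:
            self.data += chunk[:fail_after]
            return 0  # no response ever reached the client
        self.data += chunk
        return 204


upload = AppendOnlyUpload()
chunk = b"0123456789"

# The first attempt drops after 4 bytes; the server keeps those 4 bytes.
first = upload.patch(0, chunk, fail_after=4)

# The OS-level retry replays the identical request: same offset, same body.
retries = [upload.patch(0, chunk) for _ in range(3)]

# An app-level resume would first ask for the current offset (as a HEAD
# request would) and send only the remainder.
offset = len(upload.data)
resumed = upload.patch(offset, chunk[offset:])
```

Every replayed request is rejected because the server has already accepted part of the chunk, while a resume that first looks up the offset succeeds. The background session can only do the former, which is the situation the comment describes.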
Thank you for clarifying this.
While reading about the functionality behind NSURLSession, I stumbled upon some numbers on the following site: https://krumelur.me/2015/11/25/ios-background-transfer-what-about-uploads/
From my personal experience, these numbers seem to be enough to divide a medium-sized file into smaller chunks and prepare them for being uploaded in the background. However, I may be absolutely wrong in my assumptions, and only trying it out could shed light on this.
Thank you again for your advice.
Exactly, that's what this extension tries to solve. Were you able to try this approach in practice? The tusd server actually supports this extension, which may be relevant for testing.
@alfiehanssen I was wondering if you had any input on this topic from a macOS programming perspective. I am currently making a few changes to make TUSKit work in the macOS environment. Obviously the factors on the iOS side don't all apply 100% to macOS, but I'm curious whether you ran into any of these issues or different ones on macOS.
@Acconut most of our files are small enough that using the background tasks in TUSKit and restarting any cancelled uploads during significant location changes has been sufficient for our needs. It seems like this is what DropBox, etc. do as well. I would consider trying the chunking approach again, but I'm constrained by wanting to reduce my users' data use. The current technique is clearly the most data-efficient, whereas chunks uploading in the background will retry indefinitely. The latter also presents an interesting optimization problem because you want to find a sweet spot where you maximize chunk success rates and also minimize the HTTP overhead by having as few chunks as possible. |
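The chunk-size trade-off mentioned at the end of that comment can be made concrete with a deliberately crude cost model. Everything here is an assumption for illustration only: failures arrive at a constant rate per byte, a failed chunk is re-sent from its first byte, every attempt is charged the full chunk plus a fixed per-request overhead, and the numbers are made up rather than measured.

```python
import math


def expected_bytes_sent(file_size, chunk_size, overhead, failure_rate):
    """Rough expected number of bytes on the wire for a chunked upload.

    Assumptions (all hypothetical): failures arrive at a constant rate per
    byte, a failed chunk is re-sent from its first byte, and every attempt is
    charged the full chunk plus per-request overhead (an upper bound).
    """
    chunks = math.ceil(file_size / chunk_size)
    success_probability = math.exp(-failure_rate * chunk_size)
    expected_attempts = 1.0 / success_probability
    return chunks * expected_attempts * (chunk_size + overhead)


# Made-up inputs: a 2 GiB file, 2000 bytes of HTTP overhead per request, and
# on average one failure per 100 MB transferred.
candidates = [64 * 1024, 1024**2, 16 * 1024**2, 256 * 1024**2]
best = min(
    candidates,
    key=lambda size: expected_bytes_sent(
        2 * 1024**3, size, overhead=2000, failure_rate=1e-8
    ),
)
```

With these inputs neither the smallest nor the largest candidate comes out cheapest: small chunks pay for overhead on many requests, large chunks pay for repeated retries. Real failure patterns on mobile networks are unlikely to be this well behaved, so the model only shows the shape of the problem.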
That's good to know, thanks.
Yes, but only if the upload works as intended and is not interrupted, I assume.
Absolutely correct. After thinking and reading more about the situation on iOS, I still do not see a viable alternative to using this extension for making uploads work on Apple's mobile platform. The question of whether chunking should be used to support uploading bigger files reliably does not matter here. We need this extension because we have to supply NSURLSession with a single request that combines upload creation and file data, and this extension is currently the only way forward in this matter. Is this correct or am I still missing something crucial?
I don't think that's true - in terms of data use, the current implementation doesn't use any retrying tasks, so as soon as an upload is interrupted, it makes a new HEAD request and restarts from the existing offset. The downside is that the upload only restarts if your app is somehow reawakened or never left the foreground.
Well... I think we need either this extension or one of the "PATCH with reset" options (#82 or #83). If we use this extension with background URLs then we have to wait until every chunk has been successfully uploaded before we can submit the concatenation request, and we'll (likely) have lots of dead uploads. I believe that once all of the chunks are successfully uploaded that the application would be reawakened with a "success" callback. If we used #82 or #83 then we could create each chunk's upload, get the URLs, submit a concatenation request, and then create background tasks for the upload that would run entirely in the background without any additional intervention. The latter approach is cleaner to me but presents opportunities for potential data corruption (in the case of #82). The former approach has the advantage of minimizing overhead for TUS in all use cases by eliminating one request-reply cycle. |
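The first flow in that comment (create every chunk with this extension, then concatenate) might look roughly like the sketch below. The `plan_chunked_upload` helper and the plain-dict request shape are hypothetical, and the `Upload-Concat: partial` and `Upload-Concat: final;...` header values are assumed from the published Concatenation extension rather than taken from this thread.

```python
def plan_chunked_upload(chunks, create_path="/files"):
    """Return (partial_requests, build_final) for chunking with concatenation."""
    partial_requests = [
        {
            "method": "POST",
            "path": create_path,
            "headers": {
                "Tus-Resumable": "1.0.0",
                "Upload-Concat": "partial",
                "Upload-Length": str(len(chunk)),
                "Content-Type": "application/offset+octet-stream",
            },
            # Creation With Upload: the chunk travels in the creating POST.
            "body": chunk,
        }
        for chunk in chunks
    ]

    def build_final(partial_urls):
        # Only possible once every partial upload has been created, because
        # the server assigns the URLs in its responses.
        if len(partial_urls) != len(partial_requests):
            raise ValueError("every chunk needs a URL before concatenating")
        return {
            "method": "POST",
            "path": create_path,
            "headers": {
                "Tus-Resumable": "1.0.0",
                "Upload-Concat": "final;" + " ".join(partial_urls),
            },
            "body": b"",
        }

    return partial_requests, build_final


partials, build_final = plan_chunked_upload([b"abc", b"defg"])
# Hypothetical URLs, standing in for the Location headers of the responses.
final = build_final(["/files/a", "/files/b"])
```

The sketch makes the ordering constraint visible: the final request cannot be built until each background `POST` has returned its URL. A retried chunk also creates a fresh upload each time, which is where the dead uploads mentioned above would come from.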
Shouldn't |
I don't think so. The chunk will be put at the offset 0 which is implicitly defined as the upload is newly created. Therefore the content type
I generally like the idea of seeing an upload as an append-only file. It provides you with the guarantee that chunks which have already been received will not change. This makes it possible and easy to process a file while it's being uploaded.
Correct. If there is nothing else to be said about this extension, I think we can finally move to merge this. |
@Acconut Any word on when this will be merged in? |
Sorry about the inactivity, this fell off my radar. I am going to try to publish a new version of the protocol soon.
See #82 for the underlying discussion.