Commit

fixup: rephrase external dependecy section
pedro-psb committed Dec 4, 2024
1 parent 8ddbd7f commit 1fe8ef4
Showing 1 changed file with 17 additions and 18 deletions.
35 changes: 17 additions & 18 deletions docs/user/learn/on-demand-downloading.md
@@ -1,4 +1,4 @@
# On-Demand Download/Sync
# On-Demand Download and Sync

## Overview

@@ -65,30 +65,29 @@ made available in multiple places.
!!! warning "Deleting a Remote"
Learn about the dangers of [deleting a Remote](#remote-deletion-and-content-sharing) in the context of on-demand content.

## On-Demand/Streamed limitations
## On-Demand and Streamed limitations

On-demand/streamed content can be very useful, but it comes with some caveats.
On-demand and streamed content can be useful, but they come with some problems.

### External dependency and error handling

The content might become unavailable or corrupted on the remote server.
This makes it hard for Pulp to provide an accurate error message.
There are two different types of errors that can occur with on-demand streaming:

Here are some scenarios involving remote failure:
1. Pre-response: Pulp can't get any data from the remote server (e.g., because of connectivity errors), so a response is never started.
2. Post-response: Pulp gets data from the remote and starts streaming the response, but in the end the data doesn't match the expected digest.

* Unreachable
* Given all remote sources for the content are unavailable/corrupted
* When the user requests that content through a distribution
* Then it fails to deliver the content and it is effectively unreachable
* Reachable after failure(s)
* Given there is more than one remote for the content and at least one of them is good.
* When the user requests that content through a distribution
* Then some requests for the content might fail with close connection errors* and future requests will try the next ones, eventually reaching the good remote.
Even though the content might be reachable, the failures can be confusing.
In the first case, Pulp will try all the available remote sources for the requested content and will return a 404 if all of them fail *with this same type of error*.

!!! note "* Why do we close the connection?"
The connection close happens because Pulp streams content directly from the remote.
If the content is bad (and we can only know that after streaming everything) we prefer to close the connection over finalizing a bad response.
In the second case, Pulp has already sent the corrupted data to the client and can't recover from it, so it closes the connection to prevent the client from consolidating the file.
When this happens, the content-app ignores that remote source for a certain amount of time, which lets future requests select a different remote source.
A 404 is returned if no remote source remains available (e.g., they're all ignored).
Pulp doesn't permanently invalidate the remote because it can't know whether the error is transient.
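
The two failure types can be illustrated with a small simulation. This is only a sketch of the behavior described above, not Pulp's implementation: `serve_on_demand`, `ConnectionClosed`, `NotFound`, the `ignored_until` mapping and the 60-second cooldown are all hypothetical names and values chosen for illustration, and the sketch buffers the payload instead of streaming it.

```python
import hashlib
import time

COOLDOWN_SECONDS = 60.0  # hypothetical value, not a real Pulp setting


class ConnectionClosed(Exception):
    """Stands in for closing the connection after a digest mismatch."""


class NotFound(Exception):
    """Stands in for an HTTP 404 response."""


def serve_on_demand(sources, expected_sha256, ignored_until, now=None):
    """Try each remote source in order (illustrative only).

    sources: list of (name, fetch) pairs; fetch() returns bytes or
        raises ConnectionError.
    ignored_until: dict mapping a source name to the time until which
        that source is skipped.
    """
    now = time.monotonic() if now is None else now
    for name, fetch in sources:
        if ignored_until.get(name, 0.0) > now:
            continue  # this source failed a digest check recently
        try:
            data = fetch()
        except ConnectionError:
            # Pre-response failure: nothing was sent, so try the next source.
            continue
        if hashlib.sha256(data).hexdigest() != expected_sha256:
            # Post-response failure: the data is assumed to be already on
            # the wire, so ignore the source for a while and give up.
            ignored_until[name] = now + COOLDOWN_SECONDS
            raise ConnectionClosed(name)
        return data
    # Every source was unreachable or is currently ignored.
    raise NotFound()
```

In this sketch, a request that hits a corrupted source fails with `ConnectionClosed`, while a later request made within the cooldown skips that source and can succeed from another one.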

The second case is complex and can be confusing to the user.
The core reason for this complexity lies in the very nature of on-demand serving: Pulp must fetch and stream the content at request time, and it has no way to know anything about the remote before that.
This constraint greatly limits the range of actions Pulp can take to properly satisfy the request.

If this behavior is prohibitive, consider using the immediate sync policy.

Context: <https://github.com/pulp/pulpcore/issues/5012>.
