Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add gateway to http over libp2p #10108

Merged
merged 16 commits into from
Sep 20, 2023
Merged

feat: add gateway to http over libp2p #10108

merged 16 commits into from
Sep 20, 2023

Conversation

Jorropo
Copy link
Contributor

@Jorropo Jorropo commented Aug 30, 2023

Closes #10049

@Jorropo Jorropo force-pushed the gateway-http-over-libp2p branch from d9380f9 to f7aa120 Compare August 30, 2023 16:51
@aschmahmann aschmahmann force-pushed the gateway-http-over-libp2p branch from 5725ad8 to d2ced05 Compare August 31, 2023 08:38
@aschmahmann aschmahmann force-pushed the gateway-http-over-libp2p branch 2 times, most recently from 679643a to c6e28ed Compare August 31, 2023 09:19
Comment on lines +283 to +289
// os.Interrupt does not support interrupts on Windows https://github.com/golang/go/issues/46345
if runtime.GOOS == "windows" {
if n.signalAndWait(watch, syscall.SIGKILL, 5*time.Second) {
return n
}
log.Panicf("timed out stopping node %d with peer ID %s", n.ID, n.PeerID())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can split these harness changes out into another PR and rebase if needed. Just something I was running into running harness tests locally.

Ideally, we'd start running these tests in CI too now that they work. For another time though 😄

Comment on lines 635 to 644
Notes:
- This feature currently is only about serving the gateway requests over libp2p, not about fetching data this way using
[Trustless Gateway Specification](https://specs.ipfs.tech/http-gateways/trustless-gateway/).
- While kubo currently mounts the gateway API at the root (i.e. `/`) of the libp2p `/http/1.1` protocol that is subject to
change. The way to reliably discover where a given HTTP protocol is mounted on a libp2p endpoint is via the `.well-known/libp2p`
resource specified in the [http+libp2p specification](https://github.com/libp2p/specs/pull/508)
- Kubo currently hard codes the gateway-over-libp2p behavior to:
- Only operate on `/ipfs` resources
- Only satisfy the Trustless Gateway API
- Only serve data that is already local to the node (i.e. similar to a `NoFetch` gateway)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lidel @Jorropo I'd like review here to make sure we're happy and aligned with this being what we're serving. Also, this might be obvious but there's no way to use a subdomain gateway here, only paths unless we want to define (for us or as a recommendation for the http-over-libp2p specs) some top level domain to use.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm, make some clarifications in 8d28507 (made it clear we only support block and car content types)

Modify your ipfs config:

```
ipfs config --json Experimental.GatewayOverLibp2p true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name changes welcome, but figured this was supposed to be an experiment. A concern with a boolean here is that we will almost certainly break this in the not so distant future whether by enabling things like configurable gateways (e.g. turn off NoFetch if you want, or serve deserialized responses) or adding GatewayOverLibp2p as a retrieval protocol.

I don't necessarily have a big problem breaking the experimental flags for this going forward, but wondering what others think.

Copy link
Member

@lidel lidel Sep 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's good for now.

Long term, we will need to figure out UX around "HTTP data transfer" story.
Maybe have explicit DataTransfer section with both Bitswap and HTTP, and ability to enable/disable/prioritize.

💭 Ecosystem-wide, do we want the same "NoFetch trustless gateway" over both libp2p and regular TCP?
Do we want peers to be able to discover and leverage both? I imagine for libp2p we have identify + /http/1.1 protocol +.well-known thing. For pure TCP we could announce additional multiaddr with /http.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe have explicit DataTransfer section with both Bitswap and HTTP, and ability to enable/disable/prioritize.

We don't even have client support, I don't think it's important here.


### Road to being a real feature

- [ ] Needs more people to use and report on how well it works
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything else we want to explicitly call out here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some follow-up tasks in 8d28507

t.Run("WillNotServeDeserializedResponses", func(t *testing.T) {
resp, err := http.Get(fmt.Sprintf("http://%s/ipfs/%s", p2pProxyNodeHTTPListenAddr, cidDataOnGatewayNode))
require.NoError(t, err)
require.Equal(t, http.StatusNotAcceptable, resp.StatusCode)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 406 code is what's currently returned from kubo. This isn't a spec thing, but given these are kubo tests this seems fine.

t.Run("WillNotServeRemoteContent", func(t *testing.T) {
resp, err := http.Get(fmt.Sprintf("http://%s/ipfs/%s?format=raw", p2pProxyNodeHTTPListenAddr, cidDataNotOnGatewayNode))
require.NoError(t, err)
require.Equal(t, 500, resp.StatusCode)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A 500 is what's currently returned from kubo. It seems like it could very well be something else, if this changes in the future I think we're fine changing the test to accommodate that.

Comment on lines +942 to +944
tmpProtocol := protocol.ID("/kubo/delete-me")
h.SetHTTPHandler(tmpProtocol, http.NotFoundHandler())
h.WellKnownHandler.RemoveProtocolMeta(tmpProtocol)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an unfortunate hack as a result of a go-libp2p bug which will be linked shortly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@lidel lidel Sep 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aschmahmann @Jorropo do we still need this after libp2p/go-libp2p#2546 being merged?
if not, maybe add TODO to remove it once libp2p is upgraded?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes because libp2p/go-libp2p@9f14eb7 is not released.

Comment on lines +946 to +948
h.WellKnownHandler.AddProtocolMeta(gatewayProtocolID, p2phttp.ProtocolMeta{Path: "/"})
h.ServeMux = http.NewServeMux()
h.ServeMux.Handle("/", handler)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT mounting at / makes testing easier for us (since we can just use the p2p forward command) and lowers the complexity of things like redirects. This means we can have a proxy that listens on localhost:1234 and forwards to the http-over-libp2p endpoint without needing to worry about path rewrites that could be annoying.

I think we should feel free to move this going forward though since the point of .well-known/libp2p is to make the mountpoint flexible by selecting on the protocol ID.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: we have to do the work here a little more explicitly than if we weren't mounting at / due to libp2p/go-libp2p#2545

config/experiments.go Outdated Show resolved Hide resolved
@aschmahmann aschmahmann force-pushed the gateway-http-over-libp2p branch 10 times, most recently from 6adf6ee to 0728205 Compare September 1, 2023 20:18
@aschmahmann aschmahmann force-pushed the gateway-http-over-libp2p branch from 0728205 to df66dac Compare September 1, 2023 20:53
Comment on lines 92 to 112
endpoint="http://127.0.0.1:5001/api/v0/version"
max_retries=5
retry_interval=3

check_endpoint() {
curl -X POST --silent --fail "$endpoint" > /dev/null
return $?
}

retries=0
while ! check_endpoint; do
retries=$((retries+1))

if [ $retries -ge $max_retries ]; then
echo "daemon took too long to start"
exit 1
fi

sleep $retry_interval
done
echo "daemon started and ready to receive API calls"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this here and below. Open to alternatives if people have them, but it seems a little dicey to run the daemon in the background and then start using it later without some kind of a "wait" function.

Maybe we can remove this one if the gateway-conformance binary will do the waiting for us, if so feel free to remove it but we may still need the one for the p2p-proxy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aschmahmann aschmahmann marked this pull request as ready for review September 1, 2023 21:03
@aschmahmann aschmahmann requested review from lidel and a team as code owners September 1, 2023 21:03
@BigLep BigLep mentioned this pull request Sep 5, 2023
14 tasks
@lidel lidel force-pushed the gateway-http-over-libp2p branch 2 times, most recently from 5497023 to 4b87e4b Compare September 5, 2023 23:48
this ensures the libp2p experiment runs independently
and its failure does not impact the result of job
that tests stable features on http port
@lidel lidel force-pushed the gateway-http-over-libp2p branch from 4b87e4b to 1efd9d4 Compare September 5, 2023 23:58
seems we were testing regular gateway instead of proxied one
Copy link
Member

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sgtm, should be good to go as an experiment.

Main benefit today is that this enables

  • reuse of trustless gateway as a HTTP data transfer protocol over webtransport
  • experimentation with CAR transfers of dag-scope=entity instead of bitswap

ps. I've pushed small changes around CI tests (separate job for this feature, actually testing the right port) and docs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ℹ️ I've separated action for testing HTTP Gateway from experimental subset over libp2p (in 1efd9d4)

This way experiment does not impact existing CI and branch protection rule (which requires full conformance on the HTTP port to pass).

Comment on lines +942 to +944
tmpProtocol := protocol.ID("/kubo/delete-me")
h.SetHTTPHandler(tmpProtocol, http.NotFoundHandler())
h.WellKnownHandler.RemoveProtocolMeta(tmpProtocol)
Copy link
Member

@lidel lidel Sep 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aschmahmann @Jorropo do we still need this after libp2p/go-libp2p#2546 being merged?
if not, maybe add TODO to remove it once libp2p is upgraded?

Comment on lines 92 to 112
endpoint="http://127.0.0.1:5001/api/v0/version"
max_retries=5
retry_interval=3

check_endpoint() {
curl -X POST --silent --fail "$endpoint" > /dev/null
return $?
}

retries=0
while ! check_endpoint; do
retries=$((retries+1))

if [ $retries -ge $max_retries ]; then
echo "daemon took too long to start"
exit 1
fi

sleep $retry_interval
done
echo "daemon started and ready to receive API calls"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


### Road to being a real feature

- [ ] Needs more people to use and report on how well it works
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some follow-up tasks in 8d28507

Comment on lines 635 to 644
Notes:
- This feature currently is only about serving the gateway requests over libp2p, not about fetching data this way using
[Trustless Gateway Specification](https://specs.ipfs.tech/http-gateways/trustless-gateway/).
- While kubo currently mounts the gateway API at the root (i.e. `/`) of the libp2p `/http/1.1` protocol that is subject to
change. The way to reliably discover where a given HTTP protocol is mounted on a libp2p endpoint is via the `.well-known/libp2p`
resource specified in the [http+libp2p specification](https://github.com/libp2p/specs/pull/508)
- Kubo currently hard codes the gateway-over-libp2p behavior to:
- Only operate on `/ipfs` resources
- Only satisfy the Trustless Gateway API
- Only serve data that is already local to the node (i.e. similar to a `NoFetch` gateway)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm, make some clarifications in 8d28507 (made it clear we only support block and car content types)

Comment on lines 200 to 207
./ipfs --api=/ip4/127.0.0.1/tcp/5002 p2p forward --allow-custom-protocol /http/1.1 /ip4/127.0.0.1/tcp/8082 /p2p/$gatewayNodeId
working-directory: kubo-gateway/cmd/ipfs

# 9. Run the gateway-conformance tests over libp2p
- name: Run gateway-conformance tests over libp2p
uses: ipfs/gateway-conformance/.github/actions/[email protected]
with:
gateway-url: http://127.0.0.1:8081
Copy link
Member

@lidel lidel Sep 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a bug: iiuc we should be testing port opened via p2p forward and not plain http gateway from 8081
Fixed in 3fa7ef8

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤣 thanks, that's why we always ask for reviews 🙏. Tested locally with the right ports, but can't run workflows locally.

@Jorropo
Copy link
Contributor Author

Jorropo commented Sep 6, 2023

do we still need this after libp2p/go-libp2p#2546 being merged?
if not, maybe add TODO to remove it once libp2p is upgraded?

Yes, this hasn't been released yet. https://filecoinproject.slack.com/archives/C03FW2RKK5Y/p1693838383249159?thread_ts=1693838185.597839&cid=C03FW2RKK5Y

@BigLep
Copy link
Contributor

BigLep commented Sep 7, 2023

2023-09-07 conversation: @aschmahmann will create the followup issue for taking this out of experimental. We can link to that from the experimental docs.

@Jorropo will give a quick pass and merge.

- [ ] Needs more people to use and report on how well it works
- [ ] Needs UX work for exposing non-recursive "HTTP transport" (NoFetch) over both libp2p and plain TCP (and sharing the configuration)
- [ ] Needs a mechanism for HTTP handler to signal supported features ([IPIP-425](https://github.com/ipfs/specs/pull/425))
- [ ] Needs an option for Kubo to detect peers that have it enabled and prefer HTTP transport before falling back to bitswap (and use CAR if peer supports dag-scope=entity from [IPIP-402](https://github.com/ipfs/specs/pull/402))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is needed to be a real feature for the server side, I agree we need client support but that doesn't seem strictly needed here.

Copy link
Contributor Author

@Jorropo Jorropo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

@@ -17,7 +17,24 @@ defaults:
run:
shell: bash

env:
# hostnames expected by https://github.com/ipfs/gateway-conformance
GATEWAY_PUBLIC_GATEWAYS: |
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this had to be moved to the global scope ? This should do nothing for the libp2p gateway.

.github/workflows/gateway-conformance.yml Show resolved Hide resolved
@Jorropo Jorropo enabled auto-merge September 20, 2023 08:48
@Jorropo Jorropo merged commit 62a14d9 into master Sep 20, 2023
17 checks passed
@Jorropo Jorropo deleted the gateway-http-over-libp2p branch September 20, 2023 08:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Serve Trustless IPFS HTTP Gateway API via libp2p (experimental)
4 participants