Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEAT: parameterised headers rest_api_source #2084

Open
wants to merge 11 commits into
base: devel
Choose a base branch
from

Conversation

ArneDePeuter
Copy link

@ArneDePeuter ArneDePeuter commented Nov 21, 2024

Description

This pull request addresses the feature request discussed in issue #2071. It introduces dynamic parameter resolution for headers, allowing both keys and values to be resolved dynamically.

Key Features

  • Dynamic Header Resolution: Supports resolving both header keys and values at runtime using specified parameters.
  • Header params can also be bound to provided params.
  • You can still define headers in the client, these will always be used. Endpoint specific parameters extend this dict.

Usage

The feature is demonstrated with the following example:

{
    "name": "chicken",
    "endpoint": {
        "path": "chicken",
        "headers": {"{token}": "{token}", "num": "2"},
        "params": {
            "token": {
                "type": "resolve",
                "field": "token",
                "resource": "authenticate",
            },
        },
    },
}

Tests

Comprehensive tests have been added to validate the implementation and ensure its correctness.

Copy link

netlify bot commented Nov 21, 2024

Deploy Preview for dlt-hub-docs canceled.

Name Link
🔨 Latest commit 5396496
🔍 Latest deploy log https://app.netlify.com/sites/dlt-hub-docs/deploys/67478a34a66b61000868821c

@ArneDePeuter
Copy link
Author

Excuse me for the failed checks, I'll have a look at it!

@ArneDePeuter
Copy link
Author

All checks should pass now, also added an extra test!

@burnash burnash self-assigned this Nov 24, 2024
@burnash
Copy link
Collaborator

burnash commented Nov 24, 2024

Hey @ArneDePeuter thanks for the PR and for adding tests! I'll review it soon.

"headers": {
"{token}": "{token}",
"num": "2",
"nested": {"nested": "{token}", "{token}": "other"},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ArneDePeuter could you elaborate on the use case the nested headers?

Copy link
Author

@ArneDePeuter ArneDePeuter Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this for the sake of supporting every use case. After some research tough... headers are mostly flat key value pairs. So I also now question wether this is needed. 😬 I'll remove this functionality!

@ArneDePeuter
Copy link
Author

@burnash I removed the recursive parameter resolution for headers, as this was not useful.

@ArneDePeuter
Copy link
Author

ArneDePeuter commented Nov 25, 2024

I now have following question. In: dlt/sources/rest_api/__init__.py:366 would it be useful that resolved params gets extended with the incremental params? This way incrementals can be in the url as in the headers.

Implemented as this for example:

# dlt/sources/rest_api/__init__.py:366 
if incremental_object:
  params = _set_incremental_params(
      params,
      incremental_object,
      incremental_param,
      incremental_cursor_transform,
  )
  resolved_params.extend(params)

for item in items:
  formatted_path, formatted_headers, parent_record = process_parent_data_item(
      path, item, resolved_params, include_from_parent, headers
  )

Secondly, I find it confusing that parameter definitions are intertwined with query params. Is this something that has been brought up before?

@burnash
Copy link
Collaborator

burnash commented Nov 27, 2024

@burnash I removed the recursive parameter resolution for headers, as this was not useful.

Thanks for the update, @ArneDePeuter

I now have following question. In: dlt/sources/rest_api/init.py:366 would it be useful that resolved params gets extended with the incremental params? This way incrementals can be in the url as in the headers.

I don't see a need to implement incremental in headers. I haven't seen a use case for incremental parametes in headers yet. If such a case exists, I would suggest implementing a dlt source using RESTClient rather than the rest_api source.

Secondly, I find it confusing that parameter definitions are intertwined with query params. Is this something that has been brought up before?

Good point. I have already received this feedback from someone. The way the rest_api source defines parameters is loosely based on how parameters are defined in the OpenAPI specification. I don't find the current implementation too confusing, as the end user has control over how path parameters are named and resolved. They are also excluded from the query parameters. However, I'm open to suggestions on how to improve this.

If we extend params to include header params (as in this PR), it might become confusing. I was thinking about different ways to address this. I was considering different ways to address this. One option is to take inspiration from the suggested configuration for query string parameters in #1978 and define the location of the parameter in a location key. For example, from your scenario:

"endpoint": {
    "path": "issues/comments",
    "headers": {"token": "{token}", "static_param":  region}
    "params": {
        "token": {
            "type": "resolve",
            "resource": "authenticate",
            "field": "token",
            "location": "header"
        }
    },
},

What do you think? Would you be willing to update this PR to handle the location key?

@ArneDePeuter
Copy link
Author

ArneDePeuter commented Nov 27, 2024

Thanks for your response!

Good suggestion for the location field, ill have a look at it in the upcoming days.

@ArneDePeuter
Copy link
Author

ArneDePeuter commented Nov 27, 2024

I added the location field. The current implementation isn't very clean and could use further refinement.

That said, I believe this feature does not enhance clarity and might be better reverted. I preferred the previous approach, where the resolve parameter was not tied to a specific location and applied universally to matching names.

If we decide to proceed with this change, I suggest making location a list instead of a singular value. This would help avoid code duplication if someone needs to reuse a resolvedParam across multiple locations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants