Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wip] Replace raw.py with flyteidl_rust.RawSynchronousFlyteClient #2561

Merged
merged 83 commits into from
Oct 4, 2024

Conversation

austin362667
Copy link
Collaborator

@austin362667 austin362667 commented Jul 4, 2024

Tracking issue

flyteorg/flyte#5344

Why are the changes needed?

The end game is replacing raw.py with fully functioned, auth-included, Pyhton binding exposed from Rust implementation of RawSynchronousFlyteClient through PyO3.

With the helps of works in flyteorg/flyte#5525, this PR eliminates the protobuf objects serialization overhead while calling gRPC services. Unlike the original approach in flyrs, this method optimizes performance and reduces the maintenance cost.

With flyteidl-rust(test release) Python bindings, Flytekit can get rid of not only grpc but also protobuf Python dependencies.

What changes were proposed in this pull request?

Retain model layer overhead from this PR #2536, and make current SynchronousFlyteClient directly inherit class RawSynchronousFlyteClient from binding https://test.pypi.org/project/flyteidl-rust/.

  1. Phase 0
  • [model] Remove minimal flyteidl from model files
  • [integration tests] Tests on flyte cluster with public flytekit image
  • [grpc] Support secured channel in Rust
  • [grpc] Catch gRPC error in Python
  1. Phase 1
  • [model] Remove all flyteidl from model files
  • [unit tests] Add Python and Rust tests for ParseFromString, SerializeToString and ParseStruct ParseValue
  • [docker] Add flytekit base image with flyteidl-rust
  • [unit tests] Add @pytest.mark.remote_task to skip tests
  • [auth] Oauth in Rust
    • pkce
    • client credentials
    • device flow
  • [integration tests] Tests on flyte cluster with flyteidl-rust image
  1. Phase 2
  • [CI] Publish flyteidl-rust PyPI package
  • [union] Test flyteidl-rust in union sdk
  • [project.toml] Remove grpcio and grpcio-status
  • [unit tests] Update that use flyteidl
  1. Phase 3 (optional)
  • [grpc] Add retry middleware
    Interceptors are a bit more constrained in what they can do but are implemented as middleware and requests are nit clonable in the retry loop.
  • [perf] Upgrade deprecated gil-refs

How was this patch tested?

Two ways to install flyteidl-rust:

  1. Local Generate Protobuf & Build Python Wheel
    1. git clone -b flyrs https://github.com/austin362667/flyte.git
    2. cd flyte/flyteidl/
    3. rm -rf gen/pb_rust/* && cargo run --bin gen_flyteidl
    4. maturin dev --release -m Cargo.toml
  2. Install flyteidl-rust from pypi test registry
    1. pip install -i https://test.pypi.org/simple/ flyteidl-rust

Use flytekit with flyteidl-rust:

  1. Checkout to dev branch
    1. git clone -b austin362667/flyrs-v2 https://github.com/austin362667/flytekit.git
  2. Build flytekit image with flyteidl-rust
    1. docker build -f Dockerfile.rust --build-arg PYTHON_VERSION=3.12 -t localhost:30000/flytekit:rust .
    2. docker push localhost:30000/flytekit:rust
  3. Tests
    1. export FLYTEKIT_IMAGE=localhost:30000/flytekit:rust
    2. make integration_test

Setup process

  1. Checkout to branch austin362667:austin362667/flyrs-v2
  2. Remove gRPC dependencies
    pip uninstall grpcio grpcio-status
  3. Install flyteidl-rust
    pip install -i https://test.pypi.org/simple/ flyteidl-rust
  4. Test unauthenticated remote client with gRPC and flyteidl from Rust bindings.
    pytest ./tests/flytekit/integration/remote/test_remote.py

Screenshots

  1. PyPi (test): Screenshot 2024-07-24 at 2 01 13 AM

  2. Integration Test: Screenshot 2024-07-24 at 2 01 35 AM

Follow-up PRs

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

#2536

Docs link

add_content_md5_metadata=add_content_md5_metadata,
org="",
Copy link
Collaborator Author

@austin362667 austin362667 Jul 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initialize org as empty string, can be assigned later by union serverless sdk etc.

@@ -326,12 +326,12 @@ def literal_type_to_click_type(lt: LiteralType, python_type: typing.Type) -> cli
Converts a Flyte LiteralType given a python_type to a click.ParamType
"""
if lt.simple:
if lt.simple == SimpleType.STRUCT:
if int(str(lt.simple)) == SimpleType.STRUCT:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can int(str(lt.simple_type)) because we've add __repr__ magic function for the Rust structure.

        #[pymethods]
        impl literal_type::Type {
            fn __repr__(&self) -> String {
                match self {
                    Self::Simple(value) => format!("{}", value),
                    _ => todo!(),
                }
            }
        }

Why not just int(lt.simple_type)? Cause, unlike most enum in Rust, literal_type.Type is complex without int variants.

/// Nested message and enum types in `LiteralType`.
pub mod literal_type {
    #[::pyo3_macro::with_pyclass]
    #[derive(::pyo3_macro::WithNew)]
    #[allow(clippy::derive_partial_eq_without_eq)]
    #[derive(Clone, PartialEq, ::prost::Oneof)]
    pub enum Type {
        /// A simple type that can be compared one-to-one with another.
        #[prost(enumeration = "super::SimpleType", tag = "1")]
        Simple(i32),
        /// A complex type that requires matching of inner fields.
        #[prost(message, tag = "2")]
        Schema(super::SchemaType),
        /// Defines the type of the value of a collection. Only homogeneous collections are allowed.
        #[prost(message, tag = "3")]
        CollectionType(::prost::alloc::boxed::Box<super::LiteralType>),
        /// Defines the type of the value of a map type. The type of the key is always a string.
        #[prost(message, tag = "4")]
        MapValueType(::prost::alloc::boxed::Box<super::LiteralType>),
        /// A blob might have specialized implementation details depending on associated metadata.
        #[prost(message, tag = "5")]
        Blob(super::BlobType),
        /// Defines an enum with pre-defined string values.
        #[prost(message, tag = "7")]
        EnumType(super::EnumType),
        /// Generalized schema support
        #[prost(message, tag = "8")]
        StructuredDatasetType(super::StructuredDatasetType),
        /// Defines an union type with pre-defined LiteralTypes.
        #[prost(message, tag = "10")]
        UnionType(super::UnionType),
    }
}

flytekit/models/task.py Outdated Show resolved Hide resolved
flytekit/models/types.py Outdated Show resolved Hide resolved
flytekit/models/types.py Outdated Show resolved Hide resolved
@austin362667 austin362667 force-pushed the austin362667/flyrs-v2 branch 7 times, most recently from 0019800 to 8b884e5 Compare August 3, 2024 18:49
@austin362667 austin362667 force-pushed the austin362667/flyrs-v2 branch 2 times, most recently from 1d0bef5 to 076beb8 Compare August 11, 2024 19:26
Copy link

codecov bot commented Aug 11, 2024

Codecov Report

Attention: Patch coverage is 58.17175% with 302 lines in your changes missing coverage. Please review.

Please upload report for BASE (flyrs-v2@6ababc9). Learn more about missing BASE report.

Files with missing lines Patch % Lines
flytekit/models/literals.py 46.57% 31 Missing and 8 partials ⚠️
flytekit/core/artifact.py 30.76% 33 Missing and 3 partials ⚠️
flytekit/models/core/workflow.py 68.65% 17 Missing and 4 partials ⚠️
flytekit/models/types.py 70.96% 16 Missing and 2 partials ⚠️
flytekit/models/common.py 34.61% 17 Missing ⚠️
flytekit/models/execution.py 27.27% 13 Missing and 3 partials ⚠️
flytekit/models/task.py 70.58% 14 Missing and 1 partial ⚠️
flytekit/models/utils.py 47.61% 11 Missing ⚠️
flytekit/models/schedule.py 28.57% 10 Missing ⚠️
flytekit/models/core/identifier.py 75.67% 6 Missing and 3 partials ⚠️
... and 39 more
Additional details and impacted files
@@             Coverage Diff             @@
##             flyrs-v2    #2561   +/-   ##
===========================================
  Coverage            ?   49.03%           
===========================================
  Files               ?      183           
  Lines               ?    19009           
  Branches            ?     4036           
===========================================
  Hits                ?     9322           
  Misses              ?     9171           
  Partials            ?      516           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: Austin Liu <[email protected]>
Signed-off-by: Austin Liu <[email protected]>
Signed-off-by: Austin Liu <[email protected]>
Signed-off-by: Austin Liu <[email protected]>
Signed-off-by: Austin Liu <[email protected]>
Signed-off-by: Austin Liu <[email protected]>
Signed-off-by: Austin Liu <[email protected]>

rebase

Signed-off-by: Austin Liu <[email protected]>
Signed-off-by: Austin Liu <[email protected]>
Signed-off-by: Austin Liu <[email protected]>
@pingsutw pingsutw merged commit 5c8f1e7 into flyteorg:flyrs-v2 Oct 4, 2024
5 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants