-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add an 'official' caller for batch UDFs #632
Open
spencerseale
wants to merge
7
commits into
main
Choose a base branch
from
spencerseale/sc-52502/registered_as_batch
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from 4 commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
466fb6d
basics on exec_batch_udf
spencerseale 3dec287
cleaned and added args
spencerseale 12d77d4
unittests for dag.exec_batch_udf
spencerseale a55e364
clean a doc str
spencerseale 8a1f567
Merge branch 'main' into spencerseale/sc-52502/registered_as_batch
spencerseale 66fde9a
clean
spencerseale d17517c
clean2
spencerseale File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,116 @@ | ||
"""Pytest-based tests for tiledb.cloud.dag.dag""" | ||
|
||
from unittest.mock import MagicMock | ||
from unittest.mock import patch | ||
from webbrowser import Error | ||
|
||
from attrs import define | ||
|
||
import tiledb.cloud | ||
from tiledb.cloud.dag.dag import exec_batch_udf | ||
from tiledb.cloud.dag.mode import Mode | ||
|
||
_TASK_NAME = "unittest-test-dag-exec-batch-udf" | ||
_NAMESPACE = tiledb.cloud.client.default_user().username | ||
|
||
|
||
@define | ||
class ExecutableLoader: | ||
arg: str = "arg1" | ||
registered_udf: str = "TileDB-Inc/ls_uri" | ||
|
||
def in_memory(self, arg: str): | ||
return arg | ||
|
||
@property | ||
def registered(self): | ||
return self.registered_udf | ||
|
||
def all_exec(self): | ||
return ( | ||
self.in_memory, | ||
self.registered, | ||
) | ||
|
||
|
||
@patch("tiledb.cloud.dag.dag.webbrowser.open_new_tab") | ||
@patch("tiledb.cloud.dag.dag.DAG") | ||
def test_exec_batch_udf_mock(mock_dag: MagicMock, mock_open_new_tab: MagicMock) -> None: | ||
"""Test procedure of exec_batch_udf. | ||
|
||
This test is concerned only if proper logic is engaged based on args and exceptions. | ||
|
||
Additionally by passing both an in-memory callable and a str referencing | ||
a registered UDF, checks that the name of the submitted node is set properly | ||
and no AttributeError thrown. | ||
""" | ||
|
||
mock_dag_inst = mock_dag.return_value | ||
|
||
loader = ExecutableLoader() | ||
|
||
expected_submit_call_count = 0 | ||
for callable_to_test in loader.all_exec(): | ||
expected_submit_call_count += 1 | ||
|
||
graph = exec_batch_udf( | ||
callable_to_test, | ||
loader.arg, | ||
compute=False, | ||
) | ||
|
||
assert mock_dag_inst.submit.call_count == expected_submit_call_count | ||
assert mock_dag_inst.compute.call_count == 0 | ||
assert isinstance(graph, MagicMock) | ||
|
||
# checking logic associated with open_browser == True | ||
# ensure 'except' block hits when trying to open webbrowser | ||
mock_open_new_tab.side_effect = Error() | ||
|
||
graph = exec_batch_udf( | ||
loader.in_memory, | ||
loader.arg, | ||
compute=True, | ||
open_browser=True, | ||
) | ||
|
||
assert mock_dag_inst.submit.call_count == expected_submit_call_count + 1 | ||
assert mock_dag_inst.compute.call_count == 1 | ||
assert mock_open_new_tab.called # test webbrowser attempted to open | ||
|
||
|
||
def test_exec_batch_udf() -> None: | ||
"""Test actual loading of DAG. | ||
|
||
Concerned primarily that DAG is instantiated appropriately as | ||
specified by exec_batch_udf. | ||
|
||
Previous unit test for exec_batch_udf already tested compute method is called | ||
when the 'compute' arg is True. So not actually executing batch UDF, as | ||
DAG.compute is tested elsewhere, outside of scope of these tests. | ||
|
||
Does not test whether the registered UDF actually exists, just that | ||
a str passed is acceptable. It is on the user to ensure registered | ||
UDF exists. | ||
""" | ||
|
||
loader = ExecutableLoader() | ||
|
||
# test multiple retry limit settings | ||
for retry_count, callable_to_test in enumerate(loader.all_exec()): | ||
graph = exec_batch_udf( | ||
callable_to_test, | ||
loader.arg, | ||
name=_TASK_NAME, | ||
namespace=_NAMESPACE, | ||
retry_limit=retry_count, | ||
compute=False, | ||
) | ||
|
||
assert graph.name == f"batch->{_TASK_NAME}" | ||
assert graph.namespace == _NAMESPACE | ||
assert graph.mode == Mode.BATCH | ||
assert graph.retry_strategy.retry_policy.lower() == "always" | ||
assert graph.retry_strategy.limit == retry_count | ||
assert len(graph.nodes) == 1 | ||
assert graph.status.name.lower() == "not_started" |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For uniformity reasons, I prefer that we use
tiledb.cloud.utilities.get_logger_wrapper
for logging everywhere.The method also allows for a verbosity level (
verbose=True/False
) that should be set from the UDF's arguments (as is the case with theas_batch
method) so thelogger
can be declared inside theexec_batch_udf
and theverbose
should be "grabbed" from thekwargs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JohnMoutafis I agree, but we have a circular import problem because
run_dag
is in the utilities module.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@spencerseale @JohnMoutafis are we still stuck here? Do I understand correctly that we don't have a circular import yet, but will when
run_dag()
calls this new function?If we refactored and moved
get_logger_wrapper()
to, for example,tiledb.cloud.logging
, that would eliminate the potential circular import, yes? I'm willing to do that work.