[pull] master from ray-project:master #2312

Merged
merged 8 commits into from
Aug 16, 2023

@pull pull bot commented Aug 16, 2023

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

shrekris-anyscale and others added 8 commits August 15, 2023 17:19
`test_http_state` has been failing on Windows due to timeout recently:

<img width="1335" alt="Screen Shot 2023-08-14 at 3 58 38 PM" src="https://github.com/ray-project/ray/assets/92341594/493323ce-b535-4dd2-b5bc-a85563912912">

Sample traceback ([link](https://buildkite.com/ray-project/oss-ci-build-branch/builds/5432#0189e839-c78b-4561-aab6-8de1253f65e1/6-12179)):

```console
//python/ray/serve:test_http_state                                      TIMEOUT in 3 out of 3 in 60.1s
--
  | Stats over 3 runs: max = 60.1s, min = 60.0s, avg = 60.1s, dev = 0.0s
  | C:/tmp/4lhdprva/execroot/com_github_ray_project_ray/bazel-out/x64_windows-opt/testlogs/python/ray/serve/test_http_state/test.log
  | C:/tmp/4lhdprva/execroot/com_github_ray_project_ray/bazel-out/x64_windows-opt/testlogs/python/ray/serve/test_http_state/test_attempts/attempt_1.log
  | C:/tmp/4lhdprva/execroot/com_github_ray_project_ray/bazel-out/x64_windows-opt/testlogs/python/ray/serve/test_http_state/test_attempts/attempt_2.log
```

This change updates `test_http_state`'s duration to medium, so it doesn't time out.
Use the new public APIs `serve.status()`, `serve.get_app_handle()`, and `serve.get_deployment_handle()` in tests where appropriate.
…ts (#38475)

Change the logic for choosing which replicas to stop: instead of selecting the node with the fewest replicas of the same deployment, select the node with the fewest replicas across all deployments.

This doesn't completely avoid fragmentation, which is unavoidable in general and requires compaction to fix, but it should behave better in many cases with multiple deployments. For example, take two deployments A and B, where one node hosts A1 and B1 and the other hosts A2 and B2. If we scale down A and B by 1 each, the new policy should be able to free a node.
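The two-node example above can be sketched in plain Python. This is a minimal illustration of the described policy, not the Ray Serve implementation: the `Replica` tuple, `pick_replica_to_stop`, `scale_down`, and the tie-breaking by node and replica ID are all assumptions made for the example.

```python
from collections import Counter
from typing import List, NamedTuple


class Replica(NamedTuple):
    # Hypothetical record type for this sketch; not a Ray Serve class.
    replica_id: str
    deployment: str
    node_id: str


def pick_replica_to_stop(replicas: List[Replica], deployment: str) -> Replica:
    """Pick a replica of `deployment` on the node hosting the fewest
    replicas across ALL deployments (the policy described above)."""
    per_node = Counter(r.node_id for r in replicas)
    candidates = [r for r in replicas if r.deployment == deployment]
    # Ties are broken by node ID, then replica ID, purely to keep the
    # sketch deterministic. Raises ValueError if there are no candidates.
    return min(
        candidates,
        key=lambda r: (per_node[r.node_id], r.node_id, r.replica_id),
    )


def scale_down(replicas: List[Replica], deployment: str) -> List[Replica]:
    """Return the replica list with one replica of `deployment` removed."""
    victim = pick_replica_to_stop(replicas, deployment)
    return [r for r in replicas if r != victim]


# One node hosts A1 and B1, the other hosts A2 and B2.
cluster = [
    Replica("A1", "A", "node1"),
    Replica("B1", "B", "node1"),
    Replica("A2", "A", "node2"),
    Replica("B2", "B", "node2"),
]

cluster = scale_down(cluster, "A")  # both nodes tie, so A1 on node1 goes
cluster = scale_down(cluster, "B")  # node1 now hosts fewer replicas, so B1 goes
nodes_in_use = {r.node_id for r in cluster}
```

After the first scale-down, node1 hosts fewer replicas overall, so the second scale-down also targets node1 and leaves it empty. A policy that counts only replicas of the same deployment would see a tie for B at that point and could just as well pick node2, leaving both nodes partially occupied.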

Signed-off-by: Jiajun Yao <[email protected]>
Signed-off-by: woshiyyya <[email protected]>
Signed-off-by: Yunxuan Xiao <[email protected]>
Co-authored-by: matthewdeng <[email protected]>
This PR postpones `reader.get_read_tasks()` (i.e. generating the `List[ReadTask]`) until the Dataset is executed. It also introduces a hook for post-processing input files inside `reader.get_read_tasks()`, so custom logic can run on the input files before the `List[ReadTask]` is returned.
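The deferral pattern described here can be sketched without Ray. Everything below is hypothetical and only mirrors the idea: `SketchReader`, `SketchDataset`, the `post_process` hook, and the `task_generation_calls` counter are illustrative names, not part of the Ray Data API.

```python
from typing import Callable, List, Optional

# In this sketch a "read task" is just a zero-argument callable.
ReadTask = Callable[[], str]


class SketchReader:
    """Hypothetical reader that builds read tasks only when asked."""

    def __init__(
        self,
        paths: List[str],
        post_process: Optional[Callable[[List[str]], List[str]]] = None,
    ) -> None:
        self._paths = list(paths)
        self._post_process = post_process
        self.task_generation_calls = 0  # lets us observe when tasks are built

    def get_read_tasks(self) -> List[ReadTask]:
        self.task_generation_calls += 1
        paths = self._paths
        if self._post_process is not None:
            # Hook: custom logic runs on the input files before tasks are built.
            paths = self._post_process(paths)
        # Bind each path via a default argument so every task keeps its own path.
        return [lambda p=p: f"read:{p}" for p in paths]


class SketchDataset:
    """Hypothetical dataset that defers task generation until execution."""

    def __init__(self, reader: SketchReader) -> None:
        self._reader = reader  # construction does not call get_read_tasks()

    def execute(self) -> List[str]:
        return [task() for task in self._reader.get_read_tasks()]


reader = SketchReader(
    ["b.csv", "a.csv", "notes.txt"],
    post_process=lambda paths: sorted(p for p in paths if p.endswith(".csv")),
)
dataset = SketchDataset(reader)
calls_before_execute = reader.task_generation_calls
results = dataset.execute()
calls_after_execute = reader.task_generation_calls
```

Constructing the dataset generates no tasks. The task list is built, and the hook applied, only when `execute()` runs.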

Signed-off-by: Cheng Su <[email protected]>
…t parallel pg submission instead of serial. (#38437)

The unit test `test_placement_group_parallel_submission` has become flaky on macOS. The test schedules a task that creates a placement group and schedules another task within the placement group, then waits for that task and deletes the placement group. The unit test is designed to run 20 such tasks in parallel, but it appears to be misconfigured, as the 20 tasks run serially.

This PR fixes the unit test by making the task that creates placement groups require `num_cpus=0`. The test then runs in parallel instead of serially and exercises parallel scheduling of placement groups as intended.

Signed-off-by: Cade Daniel <[email protected]>
@pull pull bot added the ⤵️ pull label Aug 16, 2023
@pull pull bot merged commit 2fbd5ff into miqdigital:master Aug 16, 2023
8 participants