Enable GitHub CI/CD runners #30

Merged
merged 28 commits into from
Nov 13, 2023
Changes from 24 commits
Commits
aaadad7
Testing CI/CD
Nov 1, 2023
2f68956
Testing CI
Nov 1, 2023
878d59b
Update self-push-amd.yml
AdrianAbeyta Nov 6, 2023
39c3676
Remove CI Tests
Nov 6, 2023
df6d2be
Added runner check
Nov 6, 2023
06deb15
Remove dependency on status
Nov 6, 2023
2622d0a
Update self-push-amd.yml
AdrianAbeyta Nov 6, 2023
865fcab
Update self-push-amd.yml
AdrianAbeyta Nov 6, 2023
528320a
Update self-push-amd.yml
AdrianAbeyta Nov 6, 2023
9dbb299
Update self-push-amd.yml
AdrianAbeyta Nov 6, 2023
af40277
Update self-push-amd-mi250-caller.yml
okakarpa Nov 7, 2023
67499a0
Update self-push-amd.yml
AdrianAbeyta Nov 7, 2023
705a709
Update self-push-amd-mi250-caller.yml
AdrianAbeyta Nov 7, 2023
c1e49d2
Update self-push-amd.yml
AdrianAbeyta Nov 7, 2023
5a22b23
Update self-push-amd.yml
AdrianAbeyta Nov 7, 2023
f4fef57
Update self-push-amd.yml
AdrianAbeyta Nov 7, 2023
228ec6f
Update self-push-amd.yml
AdrianAbeyta Nov 7, 2023
af55eb3
Update self-push-amd.yml
AdrianAbeyta Nov 7, 2023
257f0a6
Modified CI to use internal transformers
AdrianAbeyta Nov 7, 2023
cb34248
Update self-push-amd.yml
AdrianAbeyta Nov 8, 2023
c12093a
Change syntax to fit internal transformers runs
Nov 8, 2023
6a1a7d5
Modify file to test ci
Nov 8, 2023
4702912
Modify test file
Nov 8, 2023
cbfa403
Clean up test cases
Nov 9, 2023
3b73ca4
Update MI250 caller to run jobs on PR.
AdrianAbeyta Nov 9, 2023
6439468
Update mi210 caller to test on PR.
AdrianAbeyta Nov 9, 2023
d2a2a60
Revert to upstream syntax
Nov 9, 2023
ca05884
Merge branch 'run_amd_push_ci_caller' of https://github.com/ROCmSoftw…
Nov 9, 2023
73 changes: 23 additions & 50 deletions .github/workflows/self-push-amd.yml
@@ -17,25 +17,12 @@ env:
RUN_PT_TF_CROSS_TESTS: 1

jobs:
- check_runner_status:
- name: Check Runner Status
- runs-on: ubuntu-latest
- steps:
- - name: Checkout transformers
- uses: actions/checkout@v3
- with:
- fetch-depth: 2
-
- - name: Check Runner Status
- run: python utils/check_self_hosted_runner.py --target_runners amd-mi210-single-gpu-ci-runner-docker --token ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
-
check_runners:
name: Check Runners
- needs: check_runner_status
strategy:
matrix:
machine_type: [single-gpu, multi-gpu]
- runs-on: [self-hosted, docker-gpu, amd-gpu, '${{ matrix.machine_type }}', '${{ inputs.gpu_flavor }}']
+ runs-on: rocm
container:
image: huggingface/transformers-pytorch-amd-gpu-push-ci # <--- We test only for PyTorch for now
options: --device /dev/kfd --device /dev/dri --env HIP_VISIBLE_DEVICES --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
@@ -54,14 +41,21 @@ jobs:
strategy:
matrix:
machine_type: [single-gpu, multi-gpu]
- runs-on: [self-hosted, docker-gpu, amd-gpu, '${{ matrix.machine_type }}', '${{ inputs.gpu_flavor }}']
+ runs-on: rocm
container:
image: huggingface/transformers-pytorch-amd-gpu-push-ci # <--- We test only for PyTorch for now
options: --device /dev/kfd --device /dev/dri --env HIP_VISIBLE_DEVICES --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
outputs:
matrix: ${{ steps.set-matrix.outputs.matrix }}
test_map: ${{ steps.set-matrix.outputs.test_map }}
steps:
+ - name: Remove transformers repository (installed during docker image build)
+ working-directory: /
+ shell: bash
+ run: |
+ rm -r transformers
+ git clone https://github.com/ROCmSoftwarePlatform/transformers.git
+
# Necessary to get the correct branch name and commit SHA for `workflow_run` event
# We also take into account the `push` event (we might want to test some changes in a branch)
- name: Prepare custom environment variables
@@ -152,11 +146,23 @@ jobs:
matrix:
folders: ${{ fromJson(needs.setup_gpu.outputs.matrix) }}
machine_type: [single-gpu, multi-gpu]
- runs-on: [self-hosted, docker-gpu, amd-gpu, '${{ matrix.machine_type }}', '${{ inputs.gpu_flavor }}']
+ runs-on: rocm
container:
image: huggingface/transformers-pytorch-amd-gpu-push-ci # <--- We test only for PyTorch for now
options: --device /dev/kfd --device /dev/dri --env HIP_VISIBLE_DEVICES --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
+
+ - name: Remove transformers repository (installed during docker image build)
+ working-directory: /
+ shell: bash
+ run: |
+ rm -r transformers
+ git clone https://github.com/ROCmSoftwarePlatform/transformers.git
+
+ - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
+ working-directory: /transformers
+ run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
+
# Necessary to get the correct branch name and commit SHA for `workflow_run` event
# We also take into account the `push` event (we might want to test some changes in a branch)
- name: Prepare custom environment variables
@@ -189,10 +195,6 @@ jobs:
git checkout ${{ env.CI_SHA }}
echo "log = $(git log -n 1)"

- - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
- working-directory: /transformers
- run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
-
- name: Echo folder ${{ matrix.folders }}
shell: bash
# For folders like `models/bert`, set an env. var. (`matrix_folders`) to `models_bert`, which will be used to
@@ -244,19 +246,15 @@ jobs:
runs-on: ubuntu-latest
if: always()
needs: [
- check_runner_status,
check_runners,
setup_gpu,
- run_tests_amdgpu,
- # run_tests_torch_cuda_extensions_single_gpu,
- # run_tests_torch_cuda_extensions_multi_gpu
+ run_tests_amdgpu
]
steps:
- name: Preliminary job status
shell: bash
# For the meaning of these environment variables, see the job `Setup`
run: |
- echo "Runner availability: ${{ needs.check_runner_status.result }}"
echo "Setup status: ${{ needs.setup_gpu.result }}"
echo "Runner status: ${{ needs.check_runners.result }}"

@@ -297,28 +295,3 @@ jobs:
echo "updated branch = $(git branch --show-current)"
git checkout ${{ env.CI_SHA }}
echo "log = $(git log -n 1)"
-
- - uses: actions/download-artifact@v3
- - name: Send message to Slack
- env:
- CI_SLACK_BOT_TOKEN: ${{ secrets.CI_SLACK_BOT_TOKEN }}
- CI_SLACK_CHANNEL_ID: ${{ secrets.CI_SLACK_CHANNEL_ID }}
- CI_SLACK_CHANNEL_ID_DAILY: ${{ secrets.CI_SLACK_CHANNEL_ID_DAILY }}
- CI_SLACK_CHANNEL_ID_AMD: ${{ secrets.CI_SLACK_CHANNEL_ID_AMD }}
- CI_SLACK_CHANNEL_DUMMY_TESTS: ${{ secrets.CI_SLACK_CHANNEL_DUMMY_TESTS }}
- CI_SLACK_REPORT_CHANNEL_ID: ${{ secrets.CI_SLACK_CHANNEL_ID_AMD }}
- ACCESS_REPO_INFO_TOKEN: ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
- CI_EVENT: Push CI (AMD) - ${{ inputs.gpu_flavor }}
- CI_TITLE_PUSH: ${{ github.event.head_commit.message }}
- CI_TITLE_WORKFLOW_RUN: ${{ github.event.workflow_run.head_commit.message }}
- CI_SHA: ${{ env.CI_SHA }}
- RUNNER_STATUS: ${{ needs.check_runner_status.result }}
- RUNNER_ENV_STATUS: ${{ needs.check_runners.result }}
- SETUP_STATUS: ${{ needs.setup_gpu.result }}
-
- # We pass `needs.setup_gpu.outputs.matrix` as the argument. A processing in `notification_service.py` to change
- # `models/bert` to `models_bert` is required, as the artifact names use `_` instead of `/`.
- run: |
- pip install slack_sdk
- pip show slack_sdk
- python utils/notification_service.py "${{ needs.setup_gpu.outputs.matrix }}"
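
The workflow comment above says artifact names use `_` where test folder names use `/`. A minimal sketch of that mapping follows. The helper name `folder_to_artifact_name` is hypothetical; the actual handling inside `notification_service.py` is not shown in this diff.

```python
def folder_to_artifact_name(folder: str) -> str:
    """Map a test folder such as `models/bert` to `models_bert`.

    Hypothetical helper for illustration only; it mirrors the convention
    described in the workflow comment, not the real implementation.
    """
    return folder.replace("/", "_")


if __name__ == "__main__":
    # Folders without a slash are returned unchanged.
    for folder in ["models/bert", "utils"]:
        print(folder, "->", folder_to_artifact_name(folder))
```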
36 changes: 18 additions & 18 deletions utils/tests_fetcher.py
@@ -19,7 +19,7 @@
This util is designed to fetch tests to run on a PR so that only the tests impacted by the modifications are run, and
when too many models are being impacted, only run the tests of a subset of core models. It works like this.

- Stage 1: Identify the modified files. For jobs that run on the main branch, it's just the diff with the last commit.
+ Stage 1: Identify the modified files. For jobs that run on the master branch, it's just the diff with the last commit.
On a PR, this takes all the files from the branching point to the current commit (so all modifications in a PR, not
just the last commit) but excludes modifications that are on docstrings or comments only.

@@ -42,7 +42,7 @@
python utils/tests_fetcher.py
```

- Base use to fetch the tests on a the main branch (with diff from the last commit):
+ Base use to fetch the tests on a the master branch (with diff from the last commit):

```bash
python utils/tests_fetcher.py --diff_with_last_commit
@@ -300,7 +300,7 @@ def get_modified_python_files(diff_with_last_commit: bool = False) -> List[str]:
"""
Return a list of python files that have been modified between:

- - the current head and the main branch if `diff_with_last_commit=False` (default)
+ - the current head and the master branch if `diff_with_last_commit=False` (default)
- the current head and its parent commit otherwise.

Returns:
@@ -311,15 +311,15 @@ def get_modified_python_files(diff_with_last_commit: bool = False) -> List[str]:
repo = Repo(PATH_TO_REPO)

if not diff_with_last_commit:
- print(f"main is at {repo.refs.main.commit}")
+ print(f"master is at {repo.refs.master.commit}")
print(f"Current head is at {repo.head.commit}")

- branching_commits = repo.merge_base(repo.refs.main, repo.head)
+ branching_commits = repo.merge_base(repo.refs.master, repo.head)
for commit in branching_commits:
print(f"Branching commit: {commit}")
return get_diff(repo, repo.head.commit, branching_commits)
else:
- print(f"main is at {repo.head.commit}")
+ print(f"master is at {repo.head.commit}")
parent_commits = repo.head.commit.parents
for commit in parent_commits:
print(f"Parent commit: {commit}")
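
This hunk hard-codes `master` where upstream uses `main`. One possible alternative is to resolve the base branch name once and reuse it, so a fork with a different default branch needs no edits at each call site. The sketch below rests on assumptions: `pick_base_branch` and its candidate order are hypothetical and not part of `tests_fetcher.py`. With GitPython, the available names could be collected from `repo.heads`.

```python
from typing import Iterable, Sequence


def pick_base_branch(
    available: Iterable[str],
    candidates: Sequence[str] = ("main", "master"),
) -> str:
    """Return the first candidate branch name that exists locally.

    Hypothetical helper: `available` is expected to hold local branch
    names, e.g. `[h.name for h in repo.heads]` when using GitPython.
    """
    names = set(available)
    for candidate in candidates:
        if candidate in names:
            return candidate
    raise ValueError(f"None of {list(candidates)} found among local branches")


if __name__ == "__main__":
    # A fork whose default branch is `master`.
    print(pick_base_branch(["master", "run_amd_push_ci_caller"]))
```

The resolved name could then be used to look up the ref, for example through `repo.refs[name]`, instead of the attribute form `repo.refs.master`.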
@@ -424,7 +424,7 @@ def get_doctest_files(diff_with_last_commit: bool = False) -> List[str]:
"""
Return a list of python and Markdown files where doc example have been modified between:

- - the current head and the main branch if `diff_with_last_commit=False` (default)
+ - the current head and the master branch if `diff_with_last_commit=False` (default)
- the current head and its parent commit otherwise.

Returns:
@@ -435,15 +435,15 @@ def get_doctest_files(diff_with_last_commit: bool = False) -> List[str]:

test_files_to_run = [] # noqa
if not diff_with_last_commit:
- print(f"main is at {repo.refs.main.commit}")
+ print(f"master is at {repo.refs.master.commit}")
print(f"Current head is at {repo.head.commit}")

- branching_commits = repo.merge_base(repo.refs.main, repo.head)
+ branching_commits = repo.merge_base(repo.refs.master, repo.head)
for commit in branching_commits:
print(f"Branching commit: {commit}")
test_files_to_run = get_diff_for_doctesting(repo, repo.head.commit, branching_commits)
else:
- print(f"main is at {repo.head.commit}")
+ print(f"master is at {repo.head.commit}")
parent_commits = repo.head.commit.parents
for commit in parent_commits:
print(f"Parent commit: {commit}")
@@ -452,7 +452,7 @@ def get_doctest_files(diff_with_last_commit: bool = False) -> List[str]:
all_test_files_to_run = get_all_doctest_files()

# Add to the test files to run any removed entry from "utils/not_doctested.txt".
- new_test_files = get_new_doctest_files(repo, repo.head.commit, repo.refs.main.commit)
+ new_test_files = get_new_doctest_files(repo, repo.head.commit, repo.refs.master.commit)
test_files_to_run = list(set(test_files_to_run + new_test_files))

# Do not run slow doctest tests on CircleCI
@@ -766,8 +766,8 @@ def create_reverse_dependency_map() -> Dict[str, List[str]]:
something_changed = False
for m in all_modules:
for d in direct_deps[m]:
- # We stop recursing at an init (cause we always end up in the main init and we don't want to add all
- # files which the main init imports)
+ # We stop recursing at an init (cause we always end up in the master init and we don't want to add all
+ # files which the master init imports)
if d.endswith("__init__.py"):
continue
if d not in direct_deps:
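
The loop in this hunk expands direct dependencies into transitive ones and stops at `__init__.py` files. The following is a self-contained sketch of that idea on a plain dictionary. The function name `expand_dependencies` and the sample data are assumptions for illustration, and the real `create_reverse_dependency_map` contains steps that are not visible in this diff.

```python
from typing import Dict, List


def expand_dependencies(direct_deps: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Expand direct dependencies to transitive ones until nothing changes.

    Dependencies reached through an `__init__.py` are not followed, so a
    package init does not pull in every file it imports.
    """
    deps = {module: list(ds) for module, ds in direct_deps.items()}
    something_changed = True
    while something_changed:
        something_changed = False
        for module in deps:
            # Iterate over a snapshot because the list grows inside the loop.
            for dep in list(deps[module]):
                if dep.endswith("__init__.py"):
                    continue
                for extra in deps.get(dep, []):
                    if extra not in deps[module]:
                        deps[module].append(extra)
                        something_changed = True
    return deps


if __name__ == "__main__":
    sample = {
        "a.py": ["b.py"],
        "b.py": ["c.py", "pkg/__init__.py"],
        "c.py": [],
        "pkg/__init__.py": ["z.py"],
    }
    print(expand_dependencies(sample)["a.py"])
```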
@@ -910,7 +910,7 @@ def infer_tests_to_run(
json_output_file: Optional[str] = None,
):
"""
- The main function called by the test fetcher. Determines the tests to run from the diff.
+ The master function called by the test fetcher. Determines the tests to run from the diff.

Args:
output_file (`str`):
@@ -922,8 +922,8 @@ def infer_tests_to_run(
- doctest_list.txt: The list of doctests to run.

diff_with_last_commit (`bool`, *optional*, defaults to `False`):
- Whether to analyze the diff with the last commit (for use on the main branch after a PR is merged) or with
- the branching point from main (for use on each PR).
+ Whether to analyze the diff with the last commit (for use on the master branch after a PR is merged) or with
+ the branching point from master (for use on each PR).
filter_models (`bool`, *optional*, defaults to `True`):
Whether or not to filter the tests to core models only, when a file modified results in a lot of model
tests.
@@ -1112,8 +1112,8 @@ def parse_commit_message(commit_message: str) -> Dict[str, bool]:
print("Force-launching all tests")

diff_with_last_commit = args.diff_with_last_commit
- if not diff_with_last_commit and not repo.head.is_detached and repo.head.ref == repo.refs.main:
- print("main branch detected, fetching tests against last commit.")
+ if not diff_with_last_commit and not repo.head.is_detached and repo.head.ref == repo.refs.master:
+ print("master branch detected, fetching tests against last commit.")
diff_with_last_commit = True

if not commit_flags["test_all"]:
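
The branch check changed in this hunk decides whether tests are fetched against the last commit or against the branching point. A sketch of that decision as a pure function is shown here, under assumptions: `use_last_commit_diff` and its parameters are hypothetical stand-ins for `args.diff_with_last_commit`, `repo.head.is_detached`, and the name of `repo.head.ref`.

```python
from typing import Optional


def use_last_commit_diff(
    requested: bool,
    head_is_detached: bool,
    head_branch: Optional[str],
    base_branch: str = "master",
) -> bool:
    """Decide whether to diff against the last commit.

    Returns True when explicitly requested, or when HEAD is attached to
    the base branch itself (there is no branching point to diff against).
    """
    if requested:
        return True
    return (not head_is_detached) and head_branch == base_branch


if __name__ == "__main__":
    # On a feature branch without the flag: diff against the branching point.
    print(use_last_commit_diff(False, False, "run_amd_push_ci_caller"))
    # On the base branch itself: fall back to the last-commit diff.
    print(use_last_commit_diff(False, False, "master"))
```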