Add non-blocking wait #766

Merged (7 commits into JuliaParallel:master, Sep 13, 2023)

Conversation

@luraess (Contributor) commented Sep 6, 2023

A first implementation of non-blocking wait as discussed in #762
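
For reference, the core idea is to poll the request and yield to the Julia scheduler rather than blocking the OS thread in MPI_Wait. A minimal sketch of the idea (the merged code in src/nonblocking.jl may differ in detail):

function Base.wait(req::MPI.Request)
    # MPI.Test returns true once the operation is complete
    while !MPI.Test(req)
        yield()  # let other Julia tasks run while polling
    end
    return nothing
end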

@luraess luraess requested a review from simonbyrne September 6, 2023 20:27
@vchuravy (Member) commented Sep 6, 2023

Should we also add an MPI.wait(t::Task)?

function wait(task::Task)
    while !Base.istaskdone(task)
        # poke the MPI progress engine while the task is still running
        MPI.Iprobe(MPI.ANY_SOURCE, MPI.ANY_TAG, MPI.COMM_WORLD)
        yield()
    end
    # the task is done; Base.wait rethrows any exception it raised
    Base.wait(task)
end

@luraess (Contributor, Author) commented Sep 7, 2023

Should we also add an MPI.wait(t::Task)?

If so, should it be added in a similar location to Base.wait?

@simonbyrne (Member) commented:

I think leave that for a separate PR.

@simonbyrne (Member) commented:

Can you add it to the docs, and add a test?

@luraess (Contributor, Author) commented Sep 8, 2023

Can you add it to the docs, and add a test?

Done. Feel free to update the docstring @simonbyrne

src/nonblocking.jl (review thread; outdated)
Comment on lines 18 to 30
Threads.@spawn begin
    # send to self
    recv_req = MPI.Irecv!(recv_arr[i], MPI.COMM_WORLD; source=myrank, tag=i)
    send_req = MPI.Isend(send_arr[i], MPI.COMM_WORLD; dest=myrank, tag=i)

    wait(recv_req)
    @test MPI.isnull(recv_req)
    recv_check[i] += 1

    wait(send_req)
    @test MPI.isnull(send_req)
    send_check[i] += 1
end
@simonbyrne (Member):

I'm not sure that this is correct, in that it seems to assume that you can wait on the receive before the send is complete (this is typically true for short messages, but not guaranteed by the spec).

How about something like:

Suggested change (replacing the single task above with two, one per operation):

Threads.@spawn begin
    recv_req = MPI.Irecv!(recv_arr[i], MPI.COMM_WORLD; source=myrank, tag=i)
    wait(recv_req)
    @test MPI.isnull(recv_req)
    recv_check[i] += 1
end
Threads.@spawn begin
    send_req = MPI.Isend(send_arr[i], MPI.COMM_WORLD; dest=myrank, tag=i)
    wait(send_req)
    @test MPI.isnull(send_req)
    send_check[i] += 1
end

@luraess (Contributor, Author):

I am not sure I follow your point. Also, how would having two threads guarantee the following to be true?

you can wait on the receive before the send is complete

Why would the semantics differ from the usual non-blocking MPI point-to-point communication, which has the initially proposed workflow (as in here)? As initially suggested, the non-blocking workflow is done sequentially per MPI process, with one thread per process.

@simonbyrne (Member):

In your original proposal, the Isend and Irecv! are in one Julia task, and you wait on the receive first: this means that the receive has to complete before it can wait on the send. This is typically fine, but there is no guarantee that it actually will complete. From https://stackoverflow.com/a/67099566/392585

MPI_Isend does not necessarily progress in the background but only when the MPI implementation is given the chance to progress it. MPI_Wait progresses the operation and guarantees its completion. Some MPI implementations can progress operations in the background using progression threads. Some cannot. It is implementation-dependent and one should never rely on one or another specific behaviour.

My proposal splits the send and receive into two Julia tasks: if one task is waiting on the receive, the scheduler can switch to the send task and complete that.

@luraess (Contributor, Author):

Ok. Thanks for the details.

@lcw (Member):

@simonbyrne I don't quite understand the need for threads. Page 75 of the MPI 4.0 standard (section 3.7 Nonblocking Communication) states:

Progress: A call to MPI_WAIT that completes a receive will eventually terminate and return if a matching send has been started, unless the send is satisfied by another receive. In particular, if the matching send is nonblocking, then the receive should complete even if no call is executed by the sender to complete the send. Similarly, a call to MPI_WAIT that completes a send will eventually return if a matching receive has been started, unless the receive is satisfied by another send, and even if no call is executed to complete the receive.

Is there something that I am missing? The above indicates to me that the wait on the receive should return even if the send has not been waited on.

p.s. This statement is also on page 45 of the MPI 1.3 standard.
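
For concreteness, a minimal self-send sketch of that guarantee (a hypothetical standalone script, not part of this PR's tests):

using MPI
MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)
send_buf = Int[rank]
recv_buf = similar(send_buf)
recv_req = MPI.Irecv!(recv_buf, comm; source=rank, tag=0)
send_req = MPI.Isend(send_buf, comm; dest=rank, tag=0)
# Per the Progress rule quoted above, this wait must return once the
# matching nonblocking send has been started, even though no call has
# yet been made to complete the send.
MPI.Wait(recv_req)
MPI.Wait(send_req)
MPI.Finalize()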

@simonbyrne (Member):

Ah, TIL! Thanks @lcw.

@lcw (Member):

You're welcome. I was worried I had been relying on an invalid assumption for 20 years, so I had to look it up.

@luraess (Contributor, Author) commented Sep 14, 2023

I don't quite understand the need for threads

The usage of threads here emulates a use case in which one would use a TLS approach to perform some async work on GPUs, which requires launching kernels on different streams from different Julia tasks/threads; this in turn requires the "cooperative" wait. But indeed, I was also confused by the fact that non-blocking comm would stall in the single-thread spawn implementation.
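
For illustration, the kind of pattern being emulated might look like the following (a hypothetical sketch: nneighbors, neighbors, recv_bufs, and comm are placeholders, and the GPU kernel launches are elided):

# one Julia task per neighbor; each drives its own communication and
# would launch GPU work on a task-local stream once its data arrives
@sync for i in 1:nneighbors
    Threads.@spawn begin
        req = MPI.Irecv!(recv_bufs[i], comm; source=neighbors[i], tag=i)
        wait(req)  # cooperative wait: yields so sibling tasks can progress
        # ... launch a kernel on this task's stream to process recv_bufs[i] ...
    end
end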

@luraess (Contributor, Author) commented Sep 13, 2023

Could this be reviewed and finalised? 🙏

@simonbyrne simonbyrne closed this Sep 13, 2023
@simonbyrne simonbyrne reopened this Sep 13, 2023
@simonbyrne simonbyrne merged commit 846c79d into JuliaParallel:master Sep 13, 2023
39 of 47 checks passed
@luraess (Contributor, Author) commented Sep 15, 2023

Given #766 (comment), shall we revert the test to spawn a single task that performs both the non-blocking send and receive, as originally proposed?
