
reads and writes crashing with EAGAIN #165

sdm900 opened this issue Sep 1, 2017 · 3 comments

sdm900 commented Sep 1, 2017

I have an issue where we get errors when sending large files over long fast links.

I instrumented the code (so I could see what the errno was) and got

E0901 04:06:28.172408 38457 SenderThread.cpp:389] wdt>       Thread[5, port: 38185]  Write error -1 (262144). fd = 8. file = randfile. port = 38185  error 11  Resource temporarily unavailable

which is telling me that the write should be "try again"... but WDT just crashes out.

Lines like

    written = socket_->write(buffer, size, /* retry writes */ true);

are not protected against EAGAIN...

Why? Is there something in the design that is meant to handle this?

Why aren't the error messages printing the errno and its string version? e.g.

<< ". port = " << socket_->getPort() << "  error " << errno << "  " << std::strerror(errno) ;
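
One detail that matters when logging this way: `errno` can be overwritten by any later library call, so it is safer to save it right after the failing call and format the saved value. A small sketch, where `describeErrno` is a hypothetical helper name and not something in WDT:

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper (not part of WDT): formats a saved errno value as
// "error <number> <message>", matching the suffix used in the log line above.
std::string describeErrno(int savedErrno) {
  return "error " + std::to_string(savedErrno) + " " +
         std::strerror(savedErrno);
}
```

Usage would be along the lines of `int saved = errno;` immediately after the failed write, followed by streaming `describeErrno(saved)` into the log message. The exact message text comes from the platform's `strerror`, so it can differ between systems.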

Thanks.

sdm900 changed the title from "short reads and writes" to "reads and writes crashing with EAGAIN" on Sep 1, 2017

sdm900 commented Sep 1, 2017

This patch appears to fix it...

diff --git a/util/WdtSocket.cpp b/util/WdtSocket.cpp
index f8043d24d4..354ec1c765 100644
--- a/util/WdtSocket.cpp
+++ b/util/WdtSocket.cpp
@@ -149,6 +149,7 @@ int WdtSocket::writeInternal(const char *buf, int nbyte, int timeoutMs,
     int w = writeWithAbortCheck(buf + written, nbyte - written, timeoutMs,
                                 /* always try to write everything */ true);
     if (w <= 0) {
+      if (errno == EAGAIN || errno == EINTR) continue;
       break;
     }
     written += w;
@@ -158,7 +159,7 @@ int WdtSocket::writeInternal(const char *buf, int nbyte, int timeoutMs,
     }
   }
   if (written != nbyte) {
-    WLOG(ERROR) << "Socket write failure " << written << " " << nbyte;
+    WLOG(ERROR) << "Socket write failure " << written << " " << nbyte << " error " << errno << " " << std::strerror(errno) ;
     writeErrorCode_ = SOCKET_WRITE_ERROR;
     return -1;
   }


sdm900 commented Sep 1, 2017

What I don't understand is why you have a loop in WdtSocket::writeInternal at all. It is calling writeWithAbortCheck which to the best of my understanding is doing the same loop and handling the same errors...


sdm900 commented Sep 1, 2017

I mean, I understand that I'm basically setting an infinite timeout on the socket (by ignoring the check in WdtSocket::ioWithAbortCheck), but I'm setting the read and write timeouts to 20s, so they shouldn't really ever be hit.
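
To make the timeout concern concrete, here is a rough sketch of retrying on EAGAIN/EINTR while still honouring an overall deadline, so the retry cannot turn into an infinite wait. Everything here is an assumption for illustration: `retryUntilDeadline` is a hypothetical name, and `attempt` stands in for a single write call that returns the byte count or a value <= 0 with `errno` set. It is not how WDT structures its loops.

```cpp
#include <cerrno>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical sketch (not WDT code): calls `attempt` until it succeeds,
// fails with a non-transient error, or `maxWait` has elapsed overall.
// `attempt` returns > 0 on success, or <= 0 with errno set on failure.
int retryUntilDeadline(const std::function<int()> &attempt,
                       std::chrono::milliseconds maxWait) {
  const auto deadline = std::chrono::steady_clock::now() + maxWait;
  while (true) {
    errno = 0;  // avoid acting on a stale errno from an earlier call
    int n = attempt();
    if (n > 0) {
      return n;
    }
    const bool transient = (n < 0) && (errno == EAGAIN || errno == EINTR);
    if (!transient) {
      return n;  // hard error or zero-byte result: report it as-is
    }
    if (std::chrono::steady_clock::now() >= deadline) {
      errno = ETIMEDOUT;  // transient errors persisted past the deadline
      return -1;
    }
    // Brief pause so the loop does not spin; real code would more likely
    // wait on the socket with poll/select instead of sleeping.
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
}
```

Clearing `errno` before each attempt also guards against the case where the call returns 0 without setting `errno`, which a bare `errno == EAGAIN` check after `w <= 0` would otherwise misread.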
