Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Service stops if shelly is not reachable #44

Open
jgracio opened this issue Apr 16, 2023 · 15 comments
Open

Service stops if shelly is not reachable #44

jgracio opened this issue Apr 16, 2023 · 15 comments

Comments

@jgracio
Copy link

jgracio commented Apr 16, 2023

Although there is not a persistent problem with the network nor with shelly 3em, the service stops after some minutes with the following message on current.log "CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0".
After I restart the service manually and it starts working again for more some minutes/hour.

There is a way to make this access more resilient not being necessary o keep restarting it?

Thanks

@fabian-lauer
Copy link
Owner

Hi @jgracio ,

thanks for your feedback - I think there are more possibilities to gain more resilience, but can you give more details about the situation?

Is the device not reachable via network (WIFI lost, etc).
I am asking as I would like to reproduce the situation and work on solutions.

Because based on code the error should result in one miss measurement and a retry after ~500ms.
Can you confirm or report anything different?

@jgracio
Copy link
Author

jgracio commented Apr 18, 2023

Thanks @fabian-lauer for your prompt reply

I do not have any further information about connection failures, but I can say I'm using a wifi connection on Shelly and also on Cerbo GX. As I wrote before my Unify system is not detecting any grid failure.

Where is set the retry after ~500ms?
I notice on line 134 that the access to Shelly has "meter_r = requests.get(url = URL, timeout=5)". Shall I increase the timeout here?

Thanks again

@fabian-lauer
Copy link
Owner

The 500ms is the time where the main "loop" for getting data is running.
The "timeout=5" in the command is just if the http response is not received within that time.....I mean you can extend this, but if no response is received within the timeframe, you will just wait for the next "loop" run.

Hmmm... difficult

@jgracio
Copy link
Author

jgracio commented Apr 20, 2023

Thanks, @fabian-lauer

I did increase the timeout of the HTTP response from 5 to 10. It seems the failure are decreasing but still failing, but I notice a pattern with daily failures @ 00:01:
_

"2023-04-17 00:01:03,946 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-19 00:01:09,829 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-20 00:01:10,558 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0"

_

Do you have a view of why the failure is happening with this regularity?

@fabian-lauer
Copy link
Owner

Hi @jgracio - not sure what the reason is really, but looks like a problem with network - maybe regular router restart configures or something else?

But will check what happens when Shelly Connection goes away like while rebooting

@fabian-lauer
Copy link
Owner

Hmmm...just realizing that my log file looks the same:

2023-04-16 00:01:15,830 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-16 00:01:20,865 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-16 00:01:25,905 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 00:01:05,303 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 00:01:10,357 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 00:01:15,419 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 00:01:20,455 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 00:01:25,494 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 01:00:45,609 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 01:00:50,670 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 01:00:55,706 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 01:01:00,740 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 01:01:05,775 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 01:01:10,818 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 01:01:15,855 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 01:01:15,885 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 01:01:16,386 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 01:01:16,883 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 01:01:20,515 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 01:01:23,639 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 01:01:26,758 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 01:01:29,878 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 01:01:32,997 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-17 01:01:36,116 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-18 00:01:05,392 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-18 00:01:10,437 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-18 00:01:15,472 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-18 00:01:20,507 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-18 00:01:25,552 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-18 11:36:41,599 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-19 00:01:06,318 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-19 00:01:11,386 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-19 00:01:16,424 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-19 00:01:21,463 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-19 00:01:26,499 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-20 00:01:04,982 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-20 00:01:10,40 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-20 00:01:15,72 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-20 00:01:20,106 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0
2023-04-20 00:01:25,164 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0

But I have no problem with the script - never do a restart either of Cebro nor script

@jgracio
Copy link
Author

jgracio commented Apr 21, 2023

Hello @fabian-lauer

After the exception, I include the following line to have more information about it:

logging.critical("ValueError: %s, ConnectionError: %s , Timeout: %s, ConnectionError2: %s" % (ValueError, requests.exceptions.ConnectionError, requests.exceptions.Timeout, ConnectionError))

My knowledge of Python is null, so the result was not worth as I got

2023-04-21 00:01:09,810 root CRITICAL ValueError: <class 'ValueError'>, ConnectionError: <class 'requests.exceptions.ConnectionError'> , Timeout: <class 'requests.exceptions.Timeout'>, ConnectionError2: <class 'ConnectionError'>
2023-04-21 00:01:09,811 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0

Can you please help me better understand why this error (or errors) happens?

@fabian-lauer
Copy link
Owner

Hi @jgracio ,

that style of logging is not correct - please change it like this:

except (ValueError, requests.exceptions.ConnectionError, requests.exceptions.Timeout, ConnectionError) as e:
logging.critical('Error getting data from Shelly - check network or Shelly status. Setting power values to 0. Details: %s', e, exc_info=e)
self._dbusservice['/Ac/L1/Power'] = 0
self._dbusservice['/Ac/L2/Power'] = 0
self._dbusservice['/Ac/L3/Power'] = 0
self._dbusservice['/Ac/Power'] = 0
self._dbusservice['/UpdateIndex'] = (self._dbusservice['/UpdateIndex'] + 1 ) % 256

After that and after making Shelly unavailable - error msg look like this:

2023-04-21 07:17:25,239 urllib3.connectionpool DEBUG Starting new HTTP connection (1): 192.168.2.13:80
2023-04-21 07:17:25,314 urllib3.connectionpool DEBUG http://192.168.2.13:80 "GET /status HTTP/1.1" 401 None
2023-04-21 07:17:25,341 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0. Details: No response from Shelly 3EM - http://flauer:[email protected]/status
Traceback (most recent call last):
File "/data/dbus-shelly-3em-smartmeter/dbus-shelly-3em-smartmeter.py", line 160, in _update
meter_data = self._getShellyData()
File "/data/dbus-shelly-3em-smartmeter/dbus-shelly-3em-smartmeter.py", line 138, in _getShellyData
raise ConnectionError("No response from Shelly 3EM - %s" % (URL))
ConnectionError: No response from Shelly 3EM - http://flauer:[email protected]/status

But when checking the details - you will see that the error is thrown by own code based on "if not meter_r:".
So the script is unable to get response result....based on what ever.

@jgracio
Copy link
Author

jgracio commented Apr 21, 2023

@fabian-lauer , I used your script and the error log stopped with the following

2023-04-21 22:07:26,974 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0. Details: HTTPConnectionPool(host='192.168.1.128', port=80): Max retries exceeded with url: /status (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0xb5a517d8>, 'Connection to 192.168.1.128 timed out. (connect timeout=10)'))
Traceback (most recent call last):
File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 159, in _new_conn
conn = connection.create_connection(
File "/usr/lib/python3.8/site-packages/urllib3/util/connection.py", line 84, in create_connection
raise err
File "/usr/lib/python3.8/site-packages/urllib3/util/connection.py", line 74, in create_connection
sock.connect(sa)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 665, in urlopen
httplib_response = self._make_request(
File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 387, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python3.8/http/client.py", line 1256, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib/python3.8/http/client.py", line 1302, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib/python3.8/http/client.py", line 1251, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.8/http/client.py", line 1011, in _send_output
self.send(msg)
File "/usr/lib/python3.8/http/client.py", line 951, in send
self.connect()
File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 187, in connect
conn = self._new_conn()
File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 164, in _new_conn
raise ConnectTimeoutError(
urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPConnection object at 0xb5a517d8>, 'Connection to 192.168.1.128 timed out. (connect timeout=10)')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
resp = conn.urlopen(
File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 719, in urlopen
retries = retries.increment(
File "/usr/lib/python3.8/site-packages/urllib3/util/retry.py", line 436, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='192.168.1.128', port=80): Max retries exceeded with url: /status (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0xb5a517d8>, 'Connection to 192.168.1.128 timed out. (connect timeout=10)'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/data/dbus-shelly-3em-smartmeter/dbus-shelly-3em-smartmeter.py", line 150, in _update
meter_data = self._getShellyData()
File "/data/dbus-shelly-3em-smartmeter/dbus-shelly-3em-smartmeter.py", line 124, in _getShellyData
meter_r = requests.get(url = URL, timeout=10)
File "/usr/lib/python3.8/site-packages/requests/api.py", line 76, in get
return request('get', url, params=params, **kwargs)
File "/usr/lib/python3.8/site-packages/requests/api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/lib/python3.8/site-packages/requests/sessions.py", line 530, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python3.8/site-packages/requests/sessions.py", line 643, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python3.8/site-packages/requests/adapters.py", line 504, in send
raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='192.168.1.128', port=80): Max retries exceeded with url: /status (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0xb5a517d8>, 'Connection to 192.168.1.128 timed out. (connect timeout=10)'))

Thats a long message and very different from the one you got.

Any clue what's wrong?

@fabian-lauer
Copy link
Owner

Puhh...still hard to rate for me - looks like a "network" issue - but can also be a problem with the endpoint on Shelly device.

Let's focus more on the fact that the device is not reachable for a few seconds (why - no idea at the moment) and focus on the fact that it should recover automatically - this is still not happening right?

@jgracio
Copy link
Author

jgracio commented Apr 24, 2023

Agreed.
The service keeps failing and it restarts for more some hours after being restarted.
It would be great if the service would not stop.

@fabian-lauer
Copy link
Owner

Hi @jgracio,

no idea so far why the script is crashing - will try to reproduce in a few minutes by changing the IP address of my Shelly during normal script run.....

@jgracio
Copy link
Author

jgracio commented Apr 25, 2023

Thanks @fabian-lauer

@fabian-lauer
Copy link
Owner

Hi @jgracio,

this doing fell into oblivion, but I finaly did the test.
BUT without any new outcome - for me it is fine - when chaning the IP the script gives in logging mode "DEBUG" the following error msg:

2023-05-26 11:43:10,423 root CRITICAL Error getting data from Shelly - check network or Shelly status. Setting power values to 0. Details: HTTPConnectionPool(host='192.168.2.13', port=80): Max retries exceeded with url: /status (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0xb5abcfd0>: Failed to establish a new connection: [Errno 113] No route to host'))
Traceback (most recent call last):
File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 159, in _new_conn
conn = connection.create_connection(
File "/usr/lib/python3.8/site-packages/urllib3/util/connection.py", line 84, in create_connection
raise err
File "/usr/lib/python3.8/site-packages/urllib3/util/connection.py", line 74, in create_connection
sock.connect(sa)
OSError: [Errno 113] No route to host

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 665, in urlopen
httplib_response = self._make_request(
File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 387, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python3.8/http/client.py", line 1256, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib/python3.8/http/client.py", line 1302, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib/python3.8/http/client.py", line 1251, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.8/http/client.py", line 1011, in _send_output
self.send(msg)
File "/usr/lib/python3.8/http/client.py", line 951, in send
self.connect()
File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 187, in connect
conn = self._new_conn()
File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 171, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0xb5abcfd0>: Failed to establish a new connection: [Errno 113] No route to host

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
resp = conn.urlopen(
File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 719, in urlopen
retries = retries.increment(
File "/usr/lib/python3.8/site-packages/urllib3/util/retry.py", line 436, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='192.168.2.13', port=80): Max retries exceeded with url: /status (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0xb5abcfd0>: Failed to establish a new connection: [Errno 113] No route to host'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/data/dbus-shelly-3em-smartmeter/dbus-shelly-3em-smartmeter.py", line 160, in _update
meter_data = self._getShellyData()
File "/data/dbus-shelly-3em-smartmeter/dbus-shelly-3em-smartmeter.py", line 134, in _getShellyData
meter_r = requests.get(url = URL, timeout=5)
File "/usr/lib/python3.8/site-packages/requests/api.py", line 76, in get
return request('get', url, params=params, **kwargs)
File "/usr/lib/python3.8/site-packages/requests/api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/lib/python3.8/site-packages/requests/sessions.py", line 530, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python3.8/site-packages/requests/sessions.py", line 643, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python3.8/site-packages/requests/adapters.py", line 516, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='192.168.2.13', port=80): Max retries exceeded with url: /status (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0xb5abcfd0>: Failed to establish a new connection: [Errno 113] No route to host'))
2023-05-26 11:43:10,458 urllib3.connectionpool DEBUG Starting new HTTP connection (1): 192.168.2.13:80

And when changing the IP back to the normal value it starts working again:

2023-05-26 11:43:13,582 urllib3.connectionpool DEBUG Starting new HTTP connection (1): 192.168.2.13:80
2023-05-26 11:43:16,17 urllib3.connectionpool DEBUG http://192.168.2.13:80 "GET /status HTTP/1.1" 200 1183
2023-05-26 11:43:16,85 root DEBUG House Consumption (/Ac/Power): -6464.66
2023-05-26 11:43:16,100 root DEBUG House Forward (/Ac/Energy/Forward): 0
2023-05-26 11:43:16,105 root DEBUG House Reverse (/Ac/Energy/Revers): 0.1522681833333333
2023-05-26 11:43:16,111 root DEBUG ---

So I have no good idea at the moment what may cause your problem.
Could only imagine that the Pyhton libary handels an unreachable host differntly compared to a timeout.
But I don't know how to reproduce a timeout :-(

@svenbluege
Copy link

svenbluege commented Oct 7, 2023

I encounter the same issue: after a socket.timeout: timed out, the server can't fix itself.

To reproduce this, one could add a server with a simple "I sleep and answer after 60s" page to get the timeout. Maybe this is caused by a hick up in the EM3 or the Wifi. Only an idea... I did not do that. This only produces a Read timed out which looks recoverable (i switched between two hosts in config.ini for this test)

Since nobody has a clue on how to reproduce this until now, what do you think about simply exiting the program and let the service do a restart? We at least know that a manual restart of the server helps. That could be automated. I know, turning it off and on is not a desirable fix ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants