Bug #913
Modular input keeps failing
100%
Description
The modular input keeps failing for some reason. arg.
Associated revisions
Adding support for keep-alives for the TCP PLM connection. References #913.
Adding select.select to detect connection failures. Reference #913.
History
#1 Updated by Luke Murphey about 10 years ago
No relevant logs. The input just seems to stop. The connection to the hub appears down too.
#2 Updated by Luke Murphey about 10 years ago
Theories:
1. The socket times out due to no activity
I might be able to handle this with a TCP keep-alive. I would expect an exception to have been generated, though.
2. An exception is being generated but not handled
I would expect the exception to show up in the output.
3. The input is in an infinite loop
CPU usage seems fine though.
#3 Updated by Luke Murphey about 10 years ago
- Status changed from New to In Progress
#4 Updated by Luke Murphey about 10 years ago
- Priority changed from Normal to Urgent
#5 Updated by Luke Murphey about 10 years ago
I suspect the issue is in the TCP interface read function that swallows socket exceptions.
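To illustrate the suspicion above, here is a minimal sketch of a read function that does not swallow socket errors. It is not the app's actual code: `read_bytes` and `ConnectionLost` are hypothetical names, and the only assumption is that the interface reads from a plain Python socket with a timeout set.

```python
import socket


class ConnectionLost(Exception):
    """Hypothetical exception raised when the link to the PLM is gone."""


def read_bytes(sock, size):
    """Read up to `size` bytes, surfacing failures instead of hiding them."""
    try:
        data = sock.recv(size)
    except socket.timeout:
        # No data arrived within the socket's timeout; not a failure by itself.
        return None
    except OSError as exc:
        # Reset, broken pipe, etc. Re-raise so the caller can reconnect.
        raise ConnectionLost(str(exc)) from exc
    if data == b"":
        # An empty read on a stream socket means the peer closed the connection.
        raise ConnectionLost("peer closed the connection")
    return data
```

The caller can then catch `ConnectionLost` in one place and decide whether to log, reconnect, or exit.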
#6 Updated by Luke Murphey about 10 years ago
- % Done changed from 0 to 90
#7 Updated by Luke Murphey about 10 years ago
- Status changed from In Progress to Closed
- % Done changed from 90 to 100
#8 Updated by Luke Murphey almost 10 years ago
- Status changed from Closed to In Progress
#9 Updated by Luke Murphey almost 10 years ago
Still happens.
#10 Updated by Luke Murphey almost 10 years ago
Trying this method to detect down sockets: http://stackoverflow.com/questions/17386487/python-detect-when-a-socket-disconnects-for-any-reason
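A rough sketch of the select-based check, assuming a plain stream socket; `peer_has_closed` is a hypothetical helper name, not something from the app. Note that this only notices a close or reset that actually reached this host. A peer that vanishes silently leaves nothing to read, so the function returns False in that case.

```python
import select
import socket


def peer_has_closed(sock, timeout=0.0):
    """Return True if the peer closed or reset the connection."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        # Nothing pending: either the link is idle or it died silently.
        return False
    try:
        # Peek so that real data stays in the buffer for the normal reader.
        data = sock.recv(1, socket.MSG_PEEK)
    except (ConnectionResetError, ConnectionAbortedError):
        return True
    # A readable socket that yields no bytes has reached end-of-stream.
    return data == b""
```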
#12 Updated by Luke Murphey almost 10 years ago
2014-12-27 17:12:03,006 ERROR Execution failed: Traceback (most recent call last):
  File "/Library/Splunk/splunk_sp/etc/apps/insteon/bin/insteon_app/modular_input.py", line 1127, in execute
    self.do_run(in_stream, log_exception_and_continue=True)
  File "/Library/Splunk/splunk_sp/etc/apps/insteon/bin/insteon_app/modular_input.py", line 1027, in do_run
    input_config)
  File "/Library/Splunk/splunk_sp/etc/apps/insteon/bin/insteon_plm.py", line 189, in run
    ready_to_read, ready_to_write, in_error = select.select([self.interface.__s,], [self.interface.__s,], [], 5)
AttributeError: 'TCP' object has no attribute '_InsteonPLMInput__s'
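The AttributeError above is Python's private name mangling at work: inside a class body, an identifier such as `__s` is rewritten to `_ClassName__s` using the class where the code appears, not the class of the object being accessed. The sketch below reproduces the effect. The class names come from the traceback, but the bodies are assumptions, and the `sock` property is a hypothetical way to expose the socket.

```python
class TCP:
    def __init__(self):
        self.__s = "socket"  # stored on the instance as _TCP__s

    @property
    def sock(self):
        # Hypothetical public accessor for the underlying socket.
        return self.__s


class InsteonPLMInput:
    def __init__(self, interface):
        self.interface = interface

    def broken(self):
        # Mangled to self.interface._InsteonPLMInput__s, which does not exist.
        return self.interface.__s

    def fixed(self):
        return self.interface.sock
```

Calling `InsteonPLMInput(TCP()).broken()` raises the same AttributeError as in the log, while `fixed()` returns the stored value.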
#13 Updated by Luke Murphey almost 10 years ago
Monitoring the connection with this:
netstat -n | grep 9761
#14 Updated by Luke Murphey almost 10 years ago
To test, I disconnected from the network for several minutes. I see that a connection is still established but no packets are being sent back and forth. It seems like the connection is broken from the hub-side and isn't being detected by Python.
Re-establishing a connection from another host succeeds, which indicates that the connection really was dead. I'm going to leave the input running to see if it eventually detects the connection failure using the select.select method.
#15 Updated by Luke Murphey almost 10 years ago
It's been 4 hours and my host still shows a connection to the PLM. I'm going to look into using a keep alive to help detect connection failures.
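A sketch of what enabling keep-alives could look like, assuming the PLM connection is a standard TCP socket. `enable_keepalive` and its default values are illustrative, not the app's settings. The tuning constants are platform-specific, so each one is only applied if the local `socket` module defines it.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keep-alive probes so a dead peer eventually raises an error."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Seconds of idleness before the first probe is sent.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    # Seconds between successive probes.
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    # Number of unanswered probes before the connection is dropped.
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock
```

Without the tuning options, the operating system defaults apply, and on many systems the first probe is only sent after about two hours of idleness.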
#18 Updated by Luke Murphey almost 10 years ago
- % Done changed from 100 to 80
#19 Updated by Luke Murphey almost 10 years ago
Implemented a method that bounces the connection if no activity is observed within a given time frame.
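The idea can be sketched as a small idle watchdog. This is not the committed implementation: `IdleWatchdog`, `bounce_if_idle`, and the `reconnect` callback are hypothetical names, and the timeout value is left to the caller.

```python
import time


class IdleWatchdog:
    """Tracks the time of the last observed activity on the connection."""

    def __init__(self, max_idle_seconds, clock=time.monotonic):
        self.max_idle_seconds = max_idle_seconds
        self._clock = clock
        self._last_activity = clock()

    def record_activity(self):
        # Call this whenever data is received from the PLM.
        self._last_activity = self._clock()

    def is_stale(self):
        return self._clock() - self._last_activity > self.max_idle_seconds


def bounce_if_idle(watchdog, reconnect):
    """Reconnect when the link has been silent too long; return True if bounced."""
    if watchdog.is_stale():
        reconnect()
        # Restart the idle timer so the new connection gets a full window.
        watchdog.record_activity()
        return True
    return False
```

The main loop would call `record_activity()` on every successful read and `bounce_if_idle()` once per iteration. Using a monotonic clock avoids false triggers when the system time is adjusted.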
#20 Updated by Luke Murphey almost 10 years ago
- % Done changed from 80 to 100
#21 Updated by Luke Murphey almost 10 years ago
- Status changed from In Progress to Closed