Bug #913

Modular input keeps failing

Added by Luke Murphey about 10 years ago. Updated almost 10 years ago.

Status: Closed
Priority: Urgent
Assignee:
Target version:
Start date: 11/24/2014
Due date:
% Done: 100%


Description

The modular input keeps failing for some reason. arg.

Associated revisions

Revision 109 (diff)
Added by luke.murphey almost 10 years ago

Adding support for keep-alives for the TCP PLM connection. References #913.

Revision 110 (diff)
Added by luke.murphey almost 10 years ago

Adding select.select to detect connection failures. Reference #913.

History

#1 Updated by Luke Murphey about 10 years ago

No relevant logs. The input just seems to stop. The connection to the hub appears down too.

#2 Updated by Luke Murphey about 10 years ago

Theories:

1. The socket times out due to no activity
I might be able to handle this with a TCP keep-alive. I would expect an exception to have been raised, though.

2. An exception is being raised but not handled
I would expect the exception to show up in the output, though.

3. The input is in an infinite loop
CPU usage seems fine, though.

#3 Updated by Luke Murphey about 10 years ago

  • Status changed from New to In Progress

#4 Updated by Luke Murphey about 10 years ago

  • Priority changed from Normal to Urgent

#5 Updated by Luke Murphey about 10 years ago

I suspect the issue is in the TCP interface read function that swallows socket exceptions.
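
A minimal sketch of the suspected failure mode, using hypothetical function names rather than the app's actual code: a read function that catches every socket error and returns an empty result makes a dead connection look the same as an idle one. The stricter variant treats only a timeout as "idle" and surfaces everything else.

```python
import socket


class ConnectionLost(Exception):
    """Raised when the peer closed the connection or the socket failed."""


def read_swallowing(sock, size=1024):
    # Anti-pattern: every socket error is reported as "no data yet".
    try:
        return sock.recv(size)
    except socket.error:
        return b""


def read_strict(sock, size=1024):
    # A timeout only means the link is idle; other errors go to the caller.
    try:
        data = sock.recv(size)
    except socket.timeout:
        return b""
    except socket.error as e:
        raise ConnectionLost(str(e))
    if data == b"":
        # recv() returning empty bytes means the peer closed the connection.
        raise ConnectionLost("peer closed the connection")
    return data
```

With the strict variant, the caller can log the failure and reconnect instead of looping silently on empty reads.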

#6 Updated by Luke Murphey about 10 years ago

  • % Done changed from 0 to 90

#7 Updated by Luke Murphey about 10 years ago

  • Status changed from In Progress to Closed
  • % Done changed from 90 to 100

#8 Updated by Luke Murphey almost 10 years ago

  • Status changed from Closed to In Progress

#9 Updated by Luke Murphey almost 10 years ago

Still happens.

#12 Updated by Luke Murphey almost 10 years ago

2014-12-27 17:12:03,006 ERROR Execution failed: Traceback (most recent call last):
  File "/Library/Splunk/splunk_sp/etc/apps/insteon/bin/insteon_app/modular_input.py", line 1127, in execute
    self.do_run(in_stream, log_exception_and_continue=True)
  File "/Library/Splunk/splunk_sp/etc/apps/insteon/bin/insteon_app/modular_input.py", line 1027, in do_run
    input_config)
  File "/Library/Splunk/splunk_sp/etc/apps/insteon/bin/insteon_plm.py", line 189, in run
    ready_to_read, ready_to_write, in_error = select.select([self.interface.__s,], [self.interface.__s,], [], 5)
AttributeError: 'TCP' object has no attribute '_InsteonPLMInput__s'
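
The AttributeError above is consistent with Python's name mangling: inside a class body, an identifier such as `__s` is rewritten to `_ClassName__s`. Code in `InsteonPLMInput` that reads `self.interface.__s` therefore looks up `_InsteonPLMInput__s`, while an attribute assigned as `self.__s` inside `TCP` is stored as `_TCP__s`. The sketch below reproduces this with stand-in classes. It assumes that `TCP` keeps its socket in `self.__s`, and the `get_socket` accessor is hypothetical.

```python
class TCP:
    def __init__(self):
        # Inside TCP this name is mangled and stored as _TCP__s.
        self.__s = "socket-placeholder"

    def get_socket(self):
        # Hypothetical accessor; exposes the socket without relying on mangling.
        return self.__s


class InsteonPLMInput:
    def __init__(self):
        self.interface = TCP()

    def broken(self):
        # Mangled to self.interface._InsteonPLMInput__s, which does not exist.
        return self.interface.__s

    def working(self):
        return self.interface.get_socket()
```

Calling `broken()` raises the same AttributeError as in the log, while `working()` returns the stored value.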

#13 Updated by Luke Murphey almost 10 years ago

Monitoring the connection with this:

netstat -n | grep 9761

#14 Updated by Luke Murphey almost 10 years ago

To test, I disconnected from the network for several minutes. I see that a connection is still established but no packets are being sent back and forth. It seems like the connection is broken from the hub-side and isn't being detected by Python.

Re-establishing a connection from another host succeeds, which indicates that the connection really was dead. I'm going to leave the input running to see if it eventually detects the connection failure using the select.select method.
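
For context, select.select only reports what the local kernel already knows about the socket. An orderly close from the peer makes the socket readable with an empty read, but a peer that vanishes without sending anything leaves the socket looking merely idle. A rough sketch, where `connection_state` is a hypothetical helper and not part of the app:

```python
import select
import socket


def connection_state(sock, timeout=0.1):
    """Classify a socket as 'idle', 'data', or 'closed' without consuming data."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        # Nothing pending. A silently dead peer looks exactly like this.
        return "idle"
    try:
        # MSG_PEEK leaves any pending bytes in the receive buffer.
        data = sock.recv(1, socket.MSG_PEEK)
    except socket.error:
        return "closed"
    return "closed" if data == b"" else "data"
```

This is why a dropped network path can go unnoticed: the "idle" result does not distinguish a quiet hub from an unreachable one.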

#15 Updated by Luke Murphey almost 10 years ago

It's been 4 hours and my host still shows a connection to the PLM. I'm going to look into using a TCP keep-alive to help detect connection failures.
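
A minimal sketch of enabling TCP keep-alives on a socket. The `enable_keepalive` helper and its default timings are assumptions for illustration, not the values used in the app. The tuning constants differ by platform, so each one is applied only if the `socket` module exposes it.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on keep-alive probes; timings are illustrative defaults."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Seconds of idle time before the first probe is sent.
    if hasattr(socket, "TCP_KEEPIDLE"):  # Linux
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    elif hasattr(socket, "TCP_KEEPALIVE"):  # macOS, on newer Python versions
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPALIVE, idle)
    # Seconds between probes once probing has started.
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    # Unanswered probes before the connection is reported as failed.
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
```

Without the tuning options, the operating system defaults apply, and the first probe may not be sent for a couple of hours on many systems.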

#18 Updated by Luke Murphey almost 10 years ago

  • % Done changed from 100 to 80

#19 Updated by Luke Murphey almost 10 years ago

Implemented a method that bounces the connection if no activity is observed within a given time frame.
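
The idea can be sketched as an idle watchdog. This is an illustration under assumed names (`IdleWatchdog`, `reconnect`, `max_idle`), not the code that was committed: the read loop records each time data arrives, and a periodic check bounces the connection once the idle period exceeds the limit.

```python
import time


class IdleWatchdog:
    def __init__(self, reconnect, max_idle=300.0, clock=time.monotonic):
        self.reconnect = reconnect  # callable that closes and reopens the link
        self.max_idle = max_idle    # seconds of silence tolerated
        self.clock = clock          # injectable so the logic can be tested
        self.last_activity = clock()

    def note_activity(self):
        # Call whenever data is received from the hub.
        self.last_activity = self.clock()

    def check(self):
        # Returns True if the connection was bounced on this call.
        if self.clock() - self.last_activity >= self.max_idle:
            self.reconnect()
            self.last_activity = self.clock()
            return True
        return False
```

A run loop would call `note_activity()` after each successful read and `check()` once per iteration. This approach does not depend on the operating system noticing the failure.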

#20 Updated by Luke Murphey almost 10 years ago

  • % Done changed from 80 to 100

#21 Updated by Luke Murphey almost 10 years ago

  • Status changed from In Progress to Closed
