Using NX-1200 for a "ping" server - lockups
cmatkin
Posts: 86
Hi Guys,
I have an NX-1200 performing routine pings on a network using the following:
Sequence:
1. NX-1200 boots up, waits 5 minutes and starts a 5 minute timer.
2a. When the timer is activated (5 min delay), the NX then telnets into itself and pings 85 IP addresses. Each ping request waits 5 seconds after the previous response.
2b. Once all 85 have been completed, the telnet session is closed and a 5 minute timer is activated.
3a. When the timer is activated (5 min delay), the NX then telnets into itself and pings the IP addresses that reported offline in the previous test. Each ping request waits 5 seconds after the previous response.
3b. Once all offline pings have been completed, the telnet session is closed and a 5 minute timer is activated.
4a. When the timer is activated (5 min delay), the NX then telnets into itself and pings the IP addresses that reported offline in the previous test. Each ping request waits 5 seconds after the previous response.
4b. Once all offline pings have been completed, the telnet session is closed and a 5 minute timer is activated.
The sequence loops through steps 2-4, essentially with a minute delay between pinging IP addresses.
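For anyone wanting to reproduce the logic outside NetLinx, the steps above can be sketched roughly as follows. This is only an illustration in Python, not the code running on the NX: `run_cycle` and its `ping` argument are hypothetical names, and the 5 second and 5 minute delays are left out so the retry logic stays visible.

```python
from typing import Callable, Iterable, List


def run_cycle(addresses: Iterable[str],
              ping: Callable[[str], bool],
              retry_passes: int = 2) -> List[str]:
    """One full sweep (step 2) followed by retry passes (steps 3 and 4).

    `ping` is any callable that returns True when the address answered.
    Returns the addresses still offline after the last pass.
    """
    # Step 2: ping every address and remember the ones that did not answer.
    offline = [ip for ip in addresses if not ping(ip)]

    # Steps 3 and 4: re-ping only the addresses that were offline last time.
    for _ in range(retry_passes):
        if not offline:
            break  # nothing left to retry
        offline = [ip for ip in offline if not ping(ip)]

    return offline
```

Passing the ping function in as an argument means the same loop can be driven by a telnet session, an OS-level ping, or a fake function for testing.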
The issue I am experiencing is that the NX processor locks up after a while: it will not accept any IP connections (telnet, HTTP, ICSLAN, NetLinx...) and requires a hard reboot.
I have a couple of variables which were logged as follows:
numberOfConnections-26
numberOfPings-935
numberOfErrors-0
numberOfConnections is the number of telnet sessions that connected.
numberOfPings is the total number of pings that were requested.
numberOfErrors is the IP connection error count.
Has anyone experienced issues with NX processors performing pings? Are there any alternative ways to perform a ping instead of telnetting into itself?
We have been running the current firmware on the processors.
Kind regards
Craig
I have rebooted the NX processor multiple times and it only manages about 930 (+/- 10) pings before it locks up.
I ran the exact same code on an NI3100 and an NI700, and both are currently at around 2600 pings without a single error or lockup.
Paul
I've tried a few variations, one much like yours, one keeping an open socket and a few others, and mine seem to lock up after about 7 hours... with a 15 second gap in between each poll. I'm also using NX masters, DVXs, NX1200s...
This was a last resort for me as I have a largish project with about 80-odd AMX Acendo Room Booking Panels, which I monitor via SSH. However, the SSH server locks up after a while on all of them; it could be an hour, could be a week. So my program reverts from SSH to a ping after the SSH locks up, just to see if they are on the network for RMS. Then the bloody NX locks up...
Did you get anywhere with this?
I was just about to start looking into something else to do the pinging for me. Raspberry Pi is a good idea.
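If the pinging does get moved to a Raspberry Pi or another Linux box, a minimal sketch could look like the one below. It assumes the standard Linux `ping` binary (iputils) is on the path; `build_ping_command` and `is_online` are hypothetical helper names, and getting the result back to the master or RMS is not covered.

```python
import subprocess
from typing import List


def build_ping_command(ip: str, timeout_s: int = 2) -> List[str]:
    # Linux iputils flags: -c 1 sends a single echo request,
    # -W sets how many seconds to wait for the reply.
    return ["ping", "-c", "1", "-W", str(timeout_s), ip]


def is_online(ip: str, timeout_s: int = 2) -> bool:
    # ping exits with status 0 when a reply was received.
    result = subprocess.run(build_ping_command(ip, timeout_s),
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0
```

Calling `is_online` once per panel address, with a pause between calls, would mirror the polling described in this thread without involving the NX kernel at all.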
Thanks for the post though Craig, I was at a dead end.
Should be locked up by the morning.
excited to hear the results.
Anything starting with a > string getting sent to the telnet server.
The polling loop with 15 second gap:
cpu usage, show mem, show buffers, mem, msg stats, 9 pings.
Had a quick look through the diagnostics; the only thing I could see is that the "Total free memory is xxxxxxxxx" figure drops steadily, but not by much.
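To put a number on that drop, the captured telnet output could be run through something like the sketch below. It assumes the free-memory lines look like "Total free memory is" followed by a plain integer; the exact output format is not shown in this thread, so the pattern may need adjusting. The function names are hypothetical.

```python
import re
from typing import List

# Assumed line format: "Total free memory is <integer>"
FREE_MEM = re.compile(r"Total free memory is\s+(\d+)")


def free_memory_samples(log_text: str) -> List[int]:
    # Collect every free-memory figure in the order it appears in the log.
    return [int(m.group(1)) for m in FREE_MEM.finditer(log_text)]


def drop_per_sample(samples: List[int]) -> float:
    # Average decrease between consecutive samples (0.0 if fewer than two).
    if len(samples) < 2:
        return 0.0
    return (samples[0] - samples[-1]) / (len(samples) - 1)
```

A steady positive value from `drop_per_sample` across a long capture would support the leak theory; a value hovering around zero would point elsewhere.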
In most cases of memory leak it involves either the underlying programming (firmware) or your program declaring and allocating memory for a new variable or stored value. While you may not necessarily run out of memory, you might hit the maximum number of variables allowed; thus causing a lock up. It's a wild shot in the dark, but perhaps there's something wrong with the ping routine that never gets hit because the vast majority of folks using a telnet session might ping something just a few times and then be done. Perhaps you might try disconnecting the telnet session after each attempt and see if it is then able to keep going.
I was talking to our AMX distributors here in Australia about something else and mentioned this issue. They jumped on it, were able to replicate it quite easily and have passed it back to AMX HQ. We'll see what happens.
I know this is digging up an old thread, but I was curious if anyone had confirmed a fix for this. I was advised that this is a known bug in the kernel which cannot be fixed.
It doesn't matter how you perform the pings; they all get performed from the kernel.
Only solution for me was to use an old NI700.
Used the same code and tested with all variants of debug code from testing with the NX.
NI has been pinging continuously for months without fail.
kind regards
Craig
Might have some good news for you (and me) on this one. Our AMX vendor has told us AMX HQ have replicated the issue and have fixed it in firmware. I assume it is the 1_5_87 hotfix release AMX_Chris is talking about but I haven't had time to try it. Have had a baby and several new projects in between this one and now... so have had -ve time to muck around on this one.
I might try and get the hotfix and check it out. Will let you know how I go...