
Using NX-1200 for a "ping" server - lockups

Hi Guys,

I have an NX-1200 performing routine pings on a network using the following:

Sequence:
1. NX-1200 boots up, waits 5 minutes, and starts a 5-minute timer.

2a. When the timer fires (5 min delay), the NX telnets into itself and pings 85 IP addresses. Each ping request waits 5 seconds after the previous response.
2b. Once all 85 have been completed, the telnet session is closed and a 5-minute timer is started.

3a. When the timer fires (5 min delay), the NX telnets into itself and pings the IP addresses that reported offline in the previous test. Each ping request waits 5 seconds after the previous response.
3b. Once all offline pings have been completed, the telnet session is closed and a 5-minute timer is started.

4a. When the timer fires (5 min delay), the NX telnets into itself and pings the IP addresses that reported offline in the previous test. Each ping request waits 5 seconds after the previous response.
4b. Once all offline pings have been completed, the telnet session is closed and a 5-minute timer is started.

This sequence loops through steps 2-4, essentially with a 5-minute delay between each round of pings.
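
For readers following along, the sequence above can be sketched roughly as below. This is only an illustration in Python, not the NetLinx code running on the NX-1200 (which was not posted). The names `sweep`, `run_cycle`, `ping` and `sleep` are hypothetical; only the 5-second gap, the 5-minute timer and the "re-check what was offline" logic come from the description.

```python
import time
from typing import Callable, List, Sequence

PING_GAP_S = 5           # wait after each ping response (from the post)
SWEEP_DELAY_S = 5 * 60   # timer before each sweep (from the post)


def sweep(hosts: Sequence[str],
          ping: Callable[[str], bool],
          sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Ping each host in turn and return the ones that did not answer."""
    offline = []
    for host in hosts:
        if not ping(host):
            offline.append(host)
        sleep(PING_GAP_S)
    return offline


def run_cycle(all_hosts: Sequence[str],
              ping: Callable[[str], bool],
              sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Steps 2-4: one full sweep, then two re-checks of the offline hosts."""
    sleep(SWEEP_DELAY_S)
    offline = sweep(all_hosts, ping, sleep)   # step 2: all addresses
    for _ in range(2):                        # steps 3 and 4: offline only
        sleep(SWEEP_DELAY_S)
        offline = sweep(offline, ping, sleep)
    return offline
```

Calling `run_cycle` in a loop reproduces the 2-4 cycle. Passing `ping` and `sleep` in as parameters is a choice made for this sketch so the logic can be exercised without a network or real delays.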

The issue I am experiencing is that the NX processor locks up after a while. The NX will not accept any IP connections (telnet, HTTP, ICSLAN, NetLinx...) and requires a hard reboot.

I have a couple of variables which were logged as follows:
numberOfConnections = 26
numberOfPings = 935
numberOfErrors = 0

numberOfConnections is the number of telnet sessions that connected.
numberOfPings is the total number of pings that were requested.
numberOfErrors is the IP connection error count.

Has anyone experienced issues with NX processors performing pings? Are there any alternative ways to perform a ping instead of telnetting into itself?
We have been running the current firmware on the processors.

Kind regards
Craig

Comments

  •
    cmatkin Posts: 86
    Just an update,
    I have rebooted the NX processor multiple times and it can only complete about 930 (+/- 10) pings before it locks up.
    Ran the exact same code on an NI3100 and an NI700 and both are currently at around 2,600 pings without a single error or lockup.
  •
    a_riot42 Posts: 1,624
    Odd. Personally, this is what I use a Raspberry Pi for. My guess is it's a memory management issue if it locks up, so you might want to do a "show mem" between pings or something. Post your code though, since it's impossible to troubleshoot something like that on a forum with no code, no FW version, etc. I have an NX1200 I can try it on.
    Paul
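
As a rough idea of what pinging from a separate Linux box such as a Raspberry Pi could look like, here is a minimal Python sketch. It shells out to the system `ping` command and assumes the Linux iputils flags `-c` (count) and `-W` (reply timeout in seconds). The function names, the 2-second timeout and the address in the usage note are placeholders for illustration, not anything from this thread.

```python
import subprocess
from typing import List


def build_ping_cmd(host: str, timeout_s: int = 2) -> List[str]:
    """Build a one-shot ping command (Linux iputils flags assumed)."""
    return ["ping", "-c", "1", "-W", str(timeout_s), host]


def is_online(host: str, timeout_s: int = 2, run=subprocess.run) -> bool:
    """Return True if the host answered a single ping (exit code 0)."""
    result = run(
        build_ping_cmd(host, timeout_s),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

For example, `is_online("192.0.2.10")` would send one echo request and report whether a reply came back. The result could then be pushed to the master over whatever channel suits the project.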
  •
    Wow, that's crazy. This is exactly what I have done and exactly what I came here to see if anyone else had found. I've been ripping my code apart trying to figure out where I snuck in a program loop or something...

    I've tried a few variations: one much like yours, one keeping an open socket, and a few others, and mine seem to lock up after about 7 hours, with a 15-second gap between each poll. I'm also using NX masters, DVXs, NX1200s...

    This was a last resort for me, as I have a largish project with about 80-odd AMX Acendo Room Booking Panels, which I monitor via SSH. However, the SSH server locks up after a while on all of them; it could be an hour, could be a week. So my program reverts from SSH to a ping after the SSH locks up, just to see if they are on the network for RMS. Then the bloody NX locks up...

    Did you get anywhere with this?

    I was just about to start looking into something else to do the pinging for me. Raspberry Pi is a good idea.

    Thanks for the post though Craig, I was at a dead end.
  •
    ericmedley Posts: 4,177
    There was a known issue with the NX series early on that had to do with a variable used in firmware needing to be 64-bit rather than 32-bit. The result was that the processor would lock up after a period of 51 days (I think). As always, double-check that your firmware version is up to date. But it does sound like a memory leak or something. If you telnet into the master and do a "msg on all" you might see some runtime errors. If you see a continual stream of "Memory xxxxxxxx" messages, then it is likely a memory leak.
  •
    Thanks Eric, will do. I just put together a little logger program, will let it go overnight.

    Should be locked up by the morning.
  •
    ericmedley Posts: 4,177
    Garthvader wrote: »
    Thanks Eric, will do. I just put together a little logger program, will let it go overnight.

    Should be locked up by the morning.

    Excited to hear the results.
  •
    So my magic number of pings is about 940... Pretty much matched Craig's results. See attached diagnostics.

    Anything starting with a > is a string being sent to the telnet server.

    The polling loop with a 15-second gap:
    cpu usage, show mem, show buffers, mem, msg stats, 9 pings.

    Had a quick look through the diagnostics. The only thing I could see is that the "Total free memory is xxxxxxxxx" value drops steadily, but not by much.
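
One way to put a number on that drop is to pull every free-memory reading out of the captured log and average the change between samples. The sketch below is Python and assumes only that the log contains lines with the text "Total free memory is" followed by a decimal number, as quoted above. The function names are hypothetical and the rest of the "show mem" output format is not assumed.

```python
import re
from typing import List

# Matches the phrase quoted in the post, followed by a decimal byte count.
FREE_MEM_RE = re.compile(r"Total free memory is (\d+)")


def free_memory_samples(log_text: str) -> List[int]:
    """Return every free-memory reading in the order it appears in the log."""
    return [int(m.group(1)) for m in FREE_MEM_RE.finditer(log_text)]


def average_drop_per_sample(samples: List[int]) -> float:
    """Average decrease between consecutive readings (0.0 if under 2 samples)."""
    if len(samples) < 2:
        return 0.0
    return (samples[0] - samples[-1]) / (len(samples) - 1)
```

Dividing the last reading by the average drop gives a crude estimate of how many more polls would fit before memory ran out, which can be compared against the roughly 940-ping lockup point.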

  •
    ericmedley Posts: 4,177
    Garthvader wrote: »
    So my magic number of pings is about 940... Pretty much matched Craig's results. See attached diagnostics.

    Anything starting with a > is a string being sent to the telnet server.

    The polling loop with a 15-second gap:
    cpu usage, show mem, show buffers, mem, msg stats, 9 pings.

    Had a quick look through the diagnostics. The only thing I could see is that the "Total free memory is xxxxxxxxx" value drops steadily, but not by much.

    Most memory leaks involve either the underlying programming (firmware) or your program declaring and allocating memory for a new variable or stored value. While you may not necessarily run out of memory, you might hit the maximum number of variables allowed, thus causing a lockup. It's a wild shot in the dark, but perhaps there's something wrong with the ping routine that never gets hit, because the vast majority of folks using a telnet session might ping something just a few times and then be done. You might try disconnecting the telnet session after each attempt and see if it is then able to keep going.
  •
    Thanks for the added detail. A member of VIP raised this issue with support in July and NX Master Hotfix 1.5.87 resolves this memory leak issue pertaining to IP communications.
  •
    When I get 2 seconds I'll have a play with logging off periodically or after each ping and see what happens.

    I was talking to our AMX distributors here in Australia about something else and mentioned this issue. They jumped on it, were able to replicate it quite easily, and have passed it back to AMX HQ. See what happens.
  •
    Hi guys,
    I know this is digging up an old thread, but I was curious if anyone had confirmed a fix for this. I was advised that this is a known bug in the kernel which cannot be fixed.
    It doesn't matter how you perform the pings; they all get performed by the kernel.

    Only solution for me was to use an old NI700.
    Used the same code and tested with all variants of debug code from testing with the NX.
    NI has been pinging continuously for months without fail.

    kind regards
    Craig
  •
    G'day Craig,

    Might have some good news for you (and me) on this one. Our AMX vendor has told us AMX HQ have replicated the issue and fixed it in firmware. I assume it is the 1_5_87 hotfix release AMX_Chris is talking about, but I haven't had time to try it. Have had a baby and several new projects between then and now, so have had -ve time to muck around on this one.

    I might try and get the hotfix and check it out. Will let you know how I go...




  •
    MLaletas Posts: 226
    Congrats on the baby, good luck on getting some sleep :(