
Strange lockup of NI-3100

It took me some time to track the problem down, but I think I have now found the trigger. I still have some NI-x100 controllers running at our customers' sites, and sometimes they lock up. All of the controllers are running the latest firmware, and on all of them I have set the time sync to "network time". This works as long as the controller can reach the time server. Once the internet connection goes down or the time server becomes unreachable, it takes between 1 and 2 hours, depending on the re-sync period, until the controller locks up. Then I have to pull the power to restart the controller. This error is reproducible.
As long as the controller is set to stand-alone time sync, it does not crash and is independent of the internet. But then the system time will be wrong after some time.

Is this a known error, and is there something I can do to avoid the crashes (besides setting time sync to stand-alone)?

A.T.

Comments

  • ericmedley Posts: 4,177
    A couple of things...
    First, yes, there is a known issue right now with the time servers. Most network time servers have changed their behavior to deal with hacking issues. While most major manufacturers quietly modified their software to cope, it kind of slipped under our radar. I first noticed it at the Daylight Saving Time change, when all my clocks were wrong. The short answer is that there is a hot fix for newer masters. For older ones you'll need to find one of the several time servers still using the old protocol.

    I'm currently using: time-a.nist.gov IP: 129.6.15.28

    It still seems to be working.

    Now the second issue: if fixing the time server does fix your problem, then there is no second issue. But if the lockups continue afterward, it sounds like you might have a memory leak. You can test this by opening a telnet session to the master and running the MSG ON command. This will show all the internal diagnostic messages as they happen.

    If, after the program has been up and running for a while, you occasionally see several "Memory" notices coming through over time, this means something is wrong with the code, the modules, or the under-the-hood stuff. It could be anywhere.

    I'm guessing that, since you only seem to have noticed it lately, it's probably just the time server call not working and the errors piling up.
  • Thanks for your answer.
    ericmedley wrote: »
    If, after the program has been up and running for a while, you occasionally see several "Memory" notices coming through over time, this means something is wrong with the code, the modules, or the under-the-hood stuff. It could be anywhere.

    I'm guessing that, since you only seem to have noticed it lately, it's probably just the time server call not working and the errors piling up.

    The AMX controller had been running flawlessly for nearly a year, using a local Linux machine as its time server. Recently a hard disk in that Linux machine died and I had to take it down for repair. After about 2 hours the AMX controller locked up. I rebooted it, but after a few hours it locked up again. Setting the time sync to local time fixed the issue.
    In the meantime the Linux machine is running again and I have set the controller back to network time. Since then it has been working. However, I checked the command line as you suggested and found no memory notices at all. Because of that, and because the controller ran for nearly a year without problems, I think there must be an error in the firmware's time server handling. Besides the problem with the protocol, there must be a basic problem with the network connection to a time server: as soon as the time server is unreachable, for whatever reason, the controller locks up.

    A.T.
  • I saw something similar when I found an NI-3100 that had shipped without the timekeeper battery (it's a yellow battery fixed on top of one of the ICs on the board). It would drop offline in RMS every 4 hours (the same time period I had set in the NTP dialog).
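
The lockups described in this thread follow the time server becoming unreachable, so it can help to check from a PC on the same network whether a given server still answers at all. The following is only a minimal sketch of such a check, not anything taken from AMX: it assumes the server speaks SNTP on UDP port 123, which the thread does not confirm for the master's own time client, and the function names are made up for illustration.

```python
import socket
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def build_sntp_request() -> bytes:
    """Build a 48-byte SNTP client request (LI=0, version 3, mode 3)."""
    return b"\x1b" + 47 * b"\x00"


def parse_transmit_time(packet: bytes) -> float:
    """Extract the server's transmit timestamp as Unix time."""
    if len(packet) < 48:
        raise ValueError("NTP reply is shorter than 48 bytes")
    # Bytes 40..47 hold the transmit timestamp: 32-bit seconds, 32-bit fraction.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def query_time_server(host: str, port: int = 123, timeout: float = 5.0):
    """Return the server's Unix time, or None if it does not answer in time.

    Assumption: the server speaks SNTP on UDP port 123. This may not be the
    protocol the master itself uses.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_sntp_request(), (host, port))
            data, _ = sock.recvfrom(512)
            return parse_transmit_time(data)
        except (OSError, ValueError):
            # Timeout, DNS failure, network error, or malformed reply.
            return None


if __name__ == "__main__":
    # Host name taken from the thread; replace with your own time server.
    result = query_time_server("time-a.nist.gov")
    print("no answer" if result is None else f"server time (Unix): {result:.0f}")
```

A result of None only says that this PC got no SNTP answer within the timeout. It does not prove the master would fail in the same way, but it is a quick way to confirm that a server has gone away before the controller's next re-sync.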