AMXForums Archive > AMX Hardware

NXR-ZGW Requiring Reboots

TurnipTruck Junior Member Posts: 1,485
Greetings,

I just updated an R4 and ZGW from version 1.xx to the latest V3. I followed the update procedures as directed and all went well. The R4 and gateway are communicating on the latest firmware.

However, now the R4 will lose communication with the system occasionally. A reboot of the gateway will bring it back until next time.

I am quite sure that the problem is with the gateway as I cannot get to its webpage once the fault happens.

Any ideas? Thanks.

Gateway Firmware is 3.01.11

Gateway Zigbee Firmware is 3.01.06

Remote Firmware Version is 3.01.05

Remote Zigbee Version is 3.01.05

Comments

  • rhargrave
    The only thing that I know of is a DHCP failure. If the gateway fails to DHCP, it will drop into link-local mode and get a 169.254.X.X address. The gateway will try for 30 seconds before it gives up on DHCP. This can also happen when the lease is renewed.
  • ericmedley Senior Member Posts: 4,177
    rhargrave wrote: »
    The only thing that I know of is a DHCP failure. If the gateway fails to DHCP, it will drop into link-local mode and get a 169.254.X.X address. The gateway will try for 30 seconds before it gives up on DHCP. This can also happen when the lease is renewed.

    Does the unit hook up again if the DHCP Server comes back online? There are many enterprise level switches that take longer than 30 seconds to come to life.
  • TurnipTruck Junior Member Posts: 1,485
    Gateway is set for static IP. Once it drops offline, it never comes back without a power cycle.
  • rhargrave
    ericmedley wrote: »
    Does the unit hook up again if the DHCP Server comes back online? There are many enterprise level switches that take longer than 30 seconds to come to life.
    Unfortunately, no it doesn't. It is a technical problem with the OS and link-local. Once it goes into link-local, there is no way to get it back out besides reboot. The workaround is to use static IPs.

    Either way, that is not the problem in this case, since the gateways are set to static.
  • ericmedley Senior Member Posts: 4,177
    rhargrave wrote: »
    Unfortunately, no it doesn't. It is a technical problem with the OS and link-local. Once it goes into link-local, there is no way to get it back out besides reboot. The workaround is to use static IPs.

    Either way, that is not the problem in this case, since the gateways are set to static.

    I do run all mine static, so I guess it's not an issue. But I was just curious. Thanx!
    e
  • TurnipTruck Junior Member Posts: 1,485
    Bad gateway?
  • rhargrave
    TurnipTruck wrote: »
    Bad gateway?

    I don't know. Looking into it.
  • TurnipTruck Junior Member Posts: 1,485
    rhargrave wrote: »
    I don't know. Looking into it.

    Thanks. I have another gateway that I could swap to test. The site is quite a distance, so I won't be able to do it immediately.
  • Rich Abel Junior Member Posts: 104
    Frequent re-boot requirements for ZGW

    I've got a situation that sounds similar: an NXR-ZGW (3.01.07) with a static IP that requires reboots 1-2 times per week.

    TurnipTruck: Were you able to resolve the reboot problem you had?

    Thanks
    Rich Abel
  • DHawthorne Junior Member Posts: 4,584
    I've had this happen too, and I always use static IPs for all AMX equipment (too many troubles with DHCP issues on networks I have no control over). It's not common for me, but it happens. In case it matters, I have only seen it on jobs that have multiple gateways and multiple R4's per gateway. Any job with just one of each has been perfectly stable.
  • Mkn Member Posts: 11
    Hi Guys,
    Is there any solution to this problem yet? I have exactly the same problem with the latest firmware as well. The gateway has a static IP, it is on channel 26 per all the recommendations, and it runs on a power supply (not PoE) through a UPS. :)
    It happens randomly. The gateway was replaced once and worked well for a couple of months, but recently it happened again. When it locks up, I cannot even log in through the web interface, and it is not in the device tree.
    Any ideas, please.
  • ericmedley Senior Member Posts: 4,177
    Mkn wrote: »
    Hi Guys,
    Is there any solution to this problem yet? I have exactly the same problem with the latest firmware as well. The gateway has a static IP, it is on channel 26 per all the recommendations, and it runs on a power supply (not PoE) through a UPS. :)
    It happens randomly. The gateway was replaced once and worked well for a couple of months, but recently it happened again. When it locks up, I cannot even log in through the web interface, and it is not in the device tree.
    Any ideas, please.
    Put one of these on it.
    http://www.dataprobe.com/iboot-g2.php

    PM if you need one.
  • the8thst Junior Member Posts: 470
    ericmedley wrote: »
    Put one of these on it.
    http://www.dataprobe.com/iboot-g2.php

    PM if you need one.

    It would be cheaper to run the power to the gateway through a relay on the AMX master, then let the master document the offline event and reboot the gateway automagically.

    P.S. Is the gateway an NXR-ZGW or NXR-ZGW-PRO? My problem was with horrible signal range from an NXR-ZGW; AMX had me swap it for the newer NXR-ZGW-PRO and things have been rock solid ever since. The Pro version has a LOT better signal range, and it seemed to respond faster in the web browser too (maybe more memory or better hardware?).
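    The relay-reboot idea could be sketched in NetLinx roughly as below. The device numbers, relay port/channel, and timings are assumptions for illustration, not a tested implementation; check your master's port map before wiring anything.

    ```
    (* Watchdog sketch: if the gateway stays offline for 2 minutes, cycle
       the relay feeding its power supply. All device numbers are examples. *)
    DEFINE_DEVICE
    dvGateway = 20002:1:0   // ZigBee gateway (example device number)
    dvRelays  = 5001:8:0    // master relay port (port number varies by model)

    DEFINE_CONSTANT
    INTEGER nGW_RELAY = 1   // relay channel wired in-line with gateway power
    LONG    TL_REBOOT = 1   // timeline id for the watchdog

    DEFINE_VARIABLE
    LONG lRebootTimes[] = {120000, 3000}  // 2 min offline grace, then 3 s power-off

    DEFINE_START
    ON[dvRelays, nGW_RELAY]   // relay closed = gateway powered

    DEFINE_EVENT
    DATA_EVENT[dvGateway]
    {
        ONLINE:  { TIMELINE_KILL(TL_REBOOT) }   // gateway is back; cancel watchdog
        OFFLINE:
        {
            SEND_STRING 0, "'Gateway offline at ', DATE, ' ', TIME"  // master console log
            TIMELINE_CREATE(TL_REBOOT, lRebootTimes, 2, TIMELINE_RELATIVE, TIMELINE_ONCE)
        }
    }

    TIMELINE_EVENT[TL_REBOOT]
    {
        SWITCH (TIMELINE.SEQUENCE)
        {
            CASE 1: { OFF[dvRelays, nGW_RELAY] }  // cut gateway power
            CASE 2: { ON[dvRelays, nGW_RELAY] }   // restore power; gateway reboots
        }
    }
    ```

    The two-step timeline gives the gateway a grace period to reconnect on its own before pulling power, which matches how the posters above describe using it.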
  • a_riot42 AMX Wizard Posts: 1,619
    the8thst wrote: »
    It would be cheaper to run the power to the gateway through a relay on the AMX master, then let the master document the offline event and reboot the gateway automagically.

    P.S. Is the gateway an NXR-ZGW or NXR-ZGW-PRO? My problem was with horrible signal range from an NXR-ZGW; AMX had me swap it for the newer NXR-ZGW-PRO and things have been rock solid ever since. The Pro version has a LOT better signal range, and it seemed to respond faster in the web browser too (maybe more memory or better hardware?).

    I wonder why there is so little debugging available with these devices. In the settings there is something called debug mode, but I don't know if it actually does anything. When problems start to happen, there isn't much to go on unless it's obvious, like buffer overflows and the like. It would be nice to be able to log what it's doing, so that anything that happened right before it went offline could be logged and examined. This would likely have been helpful when it was discovered that all the gateways go offline when a time server action happens.
    Paul
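    At least the transition logging wished for here can be approximated on the master side in NetLinx; a minimal sketch, assuming the device numbers from this thread and using AMX_LOG to write to the master's log:

    ```
    (* Record every gateway/R4 online-offline transition with a timestamp.
       Device numbers here are illustrative. *)
    DEFINE_DEVICE
    dvGateway = 20002:1:0   // ZigBee gateway
    dvR4_A    = 10023:1:0   // first R4 remote
    dvR4_B    = 10024:1:0   // second R4 remote

    DEFINE_VARIABLE
    DEV dvZigBee[] = {dvGateway, dvR4_A, dvR4_B}

    DEFINE_EVENT
    DATA_EVENT[dvZigBee]
    {
        ONLINE:
        {
            AMX_LOG(AMX_INFO, "'ZigBee dev ', ITOA(DATA.DEVICE.NUMBER), ' online at ', DATE, ' ', TIME")
        }
        OFFLINE:
        {
            AMX_LOG(AMX_WARNING, "'ZigBee dev ', ITOA(DATA.DEVICE.NUMBER), ' offline at ', DATE, ' ', TIME")
        }
    }
    ```

    That only captures what the master sees, not what the gateway itself was doing before it locked up, but it at least timestamps the failures for correlation.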
  • vining X Member Posts: 4,364
    I've been having a similar issue on a job that has 2 gateways, and one of them periodically requires a reboot. Each gateway handles 2 R4's, and I figure they'll almost never be used at the same time, so the gateway burden should be minimal. The gateway that locks up requiring the reboot is on channel 15; the other, which has never had an issue, is on 25. For now I've set it up to cycle power if it falls offline for over 2 minutes, but right now they're powered from the master's power supply and that will need to be changed, probably to individual power supplies and separate reboot-controlled outlets. So if I throw more money at it I can make the problem go away, or at least make it somewhat invisible to my clients.

    We still have the R4's/gateways falling offline as a result of the Clock Manager setting the time, which I would have hoped had been fixed by now. I reduced my update cycle to every 4 hours to reduce the frequency that these fall offline, but it still needs fixin'. Another odd thing I've noticed is that this only occurs on one gateway, the one on channel 25. So is it a coincidence that the one that doesn't lock up is the one the Clock Manager causes to drop offline, while the one that does lock up isn't affected by the Clock Manager? Very screwy.

    Here's what's printed in my log every 4 hours when the Clock Manager does its thing. The gateway is dev 20002/IP 192.168.9.129, and the associated R4's DEVs are 10023 & 10024.
    (0057704381) ClockMgr: Setting system time to - SAT JUN 04 04:33:49 2011
    
    (0057712329) Closing connection to 192.168.9.129 due to duplicate (0:1:1 already exists)
    (0057712329) ICSPTCPRx1::CloseSocket: Closing Socket
    (0057712330) CICSPTCP Rx connection to 192.168.9.129 has been closed locally or by peer
    (0057712413) CIpEvent::OffLine 10023:1:1
    (0057712415) CIpEvent::OffLine 10023:2:1
    (0057712415) CIpEvent::OffLine 10023:3:1
    (0057712416) CIpEvent::OffLine 10023:4:1
    (0057712416) CIpEvent::OffLine 10023:5:1
    (0057712417) CIpEvent::OffLine 10023:6:1
    (0057712418) CIpEvent::OffLine 10023:7:1
    (0057712419) CIpEvent::OffLine 10023:8:1
    (0057712420) CIpEvent::OffLine 10023:9:1
    (0057712421) CIpEvent::OffLine 10023:10:1
    (0057712422) CIpEvent::OffLine 10023:11:1
    (0057712423) CIpEvent::OffLine 10023:12:1
    (0057712424) CIpEvent::OffLine 10023:13:1
    (0057712425) CIpEvent::OffLine 10023:14:1
    (0057712425) CIpEvent::OffLine 10023:15:1
    (0057712430) CIpEvent::OffLine 10023:16:1
    (0057712434) CIpEvent::OffLine 10023:17:1
    (0057712436) CIpEvent::OffLine 10023:18:1
    (0057712441) CIpEvent::OffLine 10023:19:1
    (0057712445) CIpEvent::OffLine 10023:20:1
    (0057712448) CIpEvent::OffLine 10023:21:1
    (0057712452) CIpEvent::OffLine 10023:22:1
    (0057712456) CIpEvent::OffLine 10023:23:1
    (0057712460) CIpEvent::OffLine 10023:24:1
    (0057712465) CIpEvent::OffLine 10023:25:1
    (0057712467) CIpEvent::OffLine 10023:26:1
    (0057712472) CIpEvent::OffLine 10023:27:1
    (0057712512) CIpEvent::OffLine 10023:28:1
    (0057712512) CIpEvent::OffLine 10023:29:1
    (0057712513) CIpEvent::OffLine 10023:30:1
    (0057712514) CIpEvent::OffLine 10023:31:1
    (0057712514) CIpEvent::OffLine 20002:1:1
    (0057712515) CIpEvent::OffLine 10024:1:1
    (0057712517) CIpEvent::OffLine 10024:2:1
    (0057712518) CIpEvent::OffLine 10024:3:1
    (0057712518) CIpEvent::OffLine 10024:4:1
    (0057712519) CIpEvent::OffLine 10024:5:1
    (0057712520) CIpEvent::OffLine 10024:6:1
    (0057712520) CIpEvent::OffLine 10024:7:1
    (0057712522) CIpEvent::OffLine 10024:8:1
    (0057712526) CIpEvent::OffLine 10024:9:1
    (0057712530) CIpEvent::OffLine 10024:10:1
    (0057712533) CIpEvent::OffLine 10024:11:1
    (0057712536) CIpEvent::OffLine 10024:12:1
    (0057712541) CIpEvent::OffLine 10024:13:1
    (0057712546) CIpEvent::OffLine 10024:14:1
    (0057712550) CIpEvent::OffLine 10024:15:1
    (0057712552) CIpEvent::OffLine 10024:16:1
    (0057712557) CIpEvent::OffLine 10024:17:1
    (0057712560) CIpEvent::OffLine 10024:18:1
    (0057712565) CIpEvent::OffLine 10024:19:1
    (0057712569) CIpEvent::OffLine 10024:20:1
    (0057712610) CIpEvent::OffLine 10024:21:1
    (0057712611) CIpEvent::OffLine 10024:22:1
    (0057712612) CIpEvent::OffLine 10024:23:1
    (0057712612) CIpEvent::OffLine 10024:24:1
    (0057712613) CIpEvent::OffLine 10024:25:1
    (0057712614) CIpEvent::OffLine 10024:26:1
    (0057712614) CIpEvent::OffLine 10024:27:1
    (0057712615) CIpEvent::OffLine 10024:28:1
    (0057712615) CIpEvent::OffLine 10024:29:1
    (0057712616) CIpEvent::OffLine 10024:30:1
    (0057712617) CIpEvent::OffLine 10024:31:1
    (0057713556) CIpEvent::OnLine 10023:1:1
    (0057713559) CIpEvent::OnLine 10024:1:1
    (0057713562) CIpEvent::OnLine 20002:1:1
    (0057714362) Invalid message received @ CDMDeviceManager (00A8)
    (0057714364) Invalid message received @ CDMDeviceManager (00A8)
    (0057714364) Invalid message received @ CDMDeviceManager (00A8)
    (0057714364) CIpEvent::OnLine 10023:2:1
    (0057714365) CIpEvent::OnLine 10023:3:1
    (0057714366) CIpEvent::OnLine 10023:4:1
    (0057714367) CIpEvent::OnLine 10023:5:1
    (0057714368) CIpEvent::OnLine 10023:6:1
    (0057714368) CIpEvent::OnLine 10023:7:1
    (0057714370) CIpEvent::OnLine 10023:8:1
    (0057714371) CIpEvent::OnLine 10023:9:1
    (0057714376) CIpEvent::OnLine 10023:10:1
    (0057714377) CIpEvent::OnLine 10023:11:1
    (0057714378) CIpEvent::OnLine 10023:12:1
    (0057714379) CIpEvent::OnLine 10023:13:1
    (0057714380) CIpEvent::OnLine 10023:14:1
    (0057714381) CIpEvent::OnLine 10023:15:1
    (0057714381) CIpEvent::OnLine 10023:16:1
    (0057714382) CIpEvent::OnLine 10023:17:1
    (0057714383) CIpEvent::OnLine 10023:18:1
    (0057714388) CIpEvent::OnLine 10023:19:1
    (0057714390) CIpEvent::OnLine 10023:20:1
    (0057714395) CIpEvent::OnLine 10023:21:1
    (0057714399) CIpEvent::OnLine 10023:22:1
    (0057714440) CIpEvent::OnLine 10023:23:1
    (0057714441) CIpEvent::OnLine 10023:24:1
    (0057714442) CIpEvent::OnLine 10023:25:1
    (0057714442) CIpEvent::OnLine 10023:26:1
    (0057714443) CIpEvent::OnLine 10023:27:1
    (0057714443) CIpEvent::OnLine 10023:28:1
    (0057714444) CIpEvent::OnLine 10023:29:1
    (0057714445) CIpEvent::OnLine 10023:30:1
    (0057714445) CIpEvent::OnLine 10023:31:1
    (0057714446) CIpEvent::OnLine 10024:2:1
    (0057714447) CIpEvent::OnLine 10024:3:1
    (0057714448) CIpEvent::OnLine 10024:4:1
    (0057714449) CIpEvent::OnLine 10024:5:1
    (0057714451) CIpEvent::OnLine 10024:6:1
    (0057714457) CIpEvent::OnLine 10024:7:1
    (0057714460) CIpEvent::OnLine 10024:8:1
    (0057714462) CIpEvent::OnLine 10024:9:1
    (0057714467) CIpEvent::OnLine 10024:10:1
    (0057714472) CIpEvent::OnLine 10024:11:1
    (0057714474) CIpEvent::OnLine 10024:12:1
    (0057714478) CIpEvent::OnLine 10024:13:1
    (0057714483) CIpEvent::OnLine 10024:14:1
    (0057714485) CIpEvent::OnLine 10024:15:1
    (0057714489) CIpEvent::OnLine 10024:16:1
    (0057714494) CIpEvent::OnLine 10024:17:1
    (0057714531) CIpEvent::OnLine 10024:18:1
    (0057714531) CIpEvent::OnLine 10024:19:1
    (0057714534) CIpEvent::OnLine 10024:20:1
    (0057714535) CIpEvent::OnLine 10024:21:1
    (0057714535) CIpEvent::OnLine 10024:22:1
    (0057714537) CIpEvent::OnLine 10024:23:1
    (0057714537) CIpEvent::OnLine 10024:24:1
    (0057714538) CIpEvent::OnLine 10024:25:1
    (0057714539) CIpEvent::OnLine 10024:26:1
    (0057714540) CIpEvent::OnLine 10024:27:1
    (0057714540) CIpEvent::OnLine 10024:28:1
    (0057714541) CIpEvent::OnLine 10024:29:1
    (0057714543) CIpEvent::OnLine 10024:30:1
    (0057714548) CIpEvent::OnLine 10024:31:1
    
  • John Nagy CineTouch Product Manager Posts: 1,555
    Have you tried trading the two gateways, physically and by address? See if the problem follows the hardware, the location, the address, or if it changes based on the R4's it talks to. Lots of permutations, but you can best pin it down to the device or location by substitution.
  • vining X Member Posts: 4,364
    John Nagy wrote: »
    Have you tried trading the two gateways, physically and by address? See if the problem follows the hardware, the location, the address, or if it changes based on the R4's it talks to. Lots of permutations, but you can best pin it down to the device or location by substitution.
    No, I haven't had a chance to do anything yet, since I just realized it was happening during my last 2 visits, Thursday & Friday. Fortunately I caught it and not the client, but I informed them of the issue and that I'll be evaluating the situation. Since I have them set to auto-reboot if they go offline, I watch the daily logs that this system sends me to see how often it actually occurs. The log will list the reason for the reboot. Most of the time the system sits idle since no one is normally there, so I don't believe it's traffic related. Even when they are used, they only get minimal traffic for the page they're on if they're online.
  • John Nagy CineTouch Product Manager Posts: 1,555
    Traffic hasn't been responsible for -any- gateway crashes I've witnessed yet; it can take down the link to the R4, but not the gate. I just checked, and my gate log goes back to January... many an R4 disconnect for over-traffic in that time, as I have torture-tested our projects as we go along, but it never made the gate reboot, or the log would be gone. Which of course destroys the evidence that might have told you what was happening.