R4 Related Error Msg
vining
Posts: 4,368
I went to a job Friday to finally swap out the 5200i's due to the battery issue, so while I was there I also updated the ZigBee gateway and R4 firmware. Needless to say the process was as aggravating as usual, but it got done: everything showed up online with the correct updated firmware and the remotes worked.
For some reason this morning I decided to open up the daily sitrep log that the system emails me and, much to my surprise, I got this every hour:
<<< 00:00 >>>
show mem
Volatile Free : 9145280 (largest free block in bytes)
NonVolatile Free: 934890 (bytes free)
(0021610256) ICSPTCPRx25: Ignoring msg from foreign device (same DPS) (IP=C0A8088C) Cmd!3 DSys=DDev=TCP Socket=25 IP=192.168.8.140 Index
(0021610257) Routing Loop detected. Killing route to system 1.
(0021614159) ICSPTCPRx22: Ignoring msg from foreign device (same DPS) (IP=C0A8088D) Cmd!3 DSys=DDev=TCP Socket=22 IP=192.168.8.141 Index
(0021614160) Routing Loop detected. Killing route to system 1.
(0021614186) ICSPTCPRx22: Ignoring msg from foreign device (same DPS) (IP=C0A8088D) Cmd!3 DSys=DDev=TCP Socket=22 IP=192.168.8.141 Index
(0021614187) Routing Loop detected. Killing route to system 1.
(0021614389) ICSPTCPRx22: Ignoring msg from foreign device (same DPS) (IP=C0A8088D) Cmd!3 DSys=DDev=TCP Socket=22 IP=192.168.8.141 Index
(0021614389) Routing Loop detected. Killing route to system 1.
(0021730098) CIpEvent::OnLine 0:2:1
(0021730187) Exiting UDP SNMP Read thread - closing this socket for local port 2
(0021730206) CIpEvent::OffLine 0:2:1
(0021730451) ICSPTCPRx22: Ignoring msg from foreign device (same DPS) (IP=C0A8088D) Cmd!3 DSys=DDev=TCP Socket=22 IP=192.168.8.141 Index
(0021730452) Routing Loop detected. Killing route to system 1.
(0021730648) ICSPTCPRx25: Ignoring msg from foreign device (same DPS) (IP=C0A8088C) Cmd!3 DSys=DDev=TCP Socket=25 IP=192.168.8.140 Index
(0021730648) Routing Loop detected. Killing route to system 1.
(0021774558) Connected Successfully
(0021774619) CIpEvent::OnLine 0:3:1
(0021775152) Exiting TCP Read thread - closing this socket for local port 3
(0021775189) CIpEvent::OffLine 0:3:1
<<< 01:00 >>>

Now the IP addresses 192.168.8.140 & .141 are my gateways. The system has 2 gateways & 2 R4s. The gateways are set up with device numbers 20001 & 20002 respectively and the R4s are 10022 & 10023 respectively. They show in my device tree in NS3 just fine so what the F is this crap?
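For anyone matching these log entries to hardware: the (IP=C0A8088C) field looks like the sender's IPv4 address written as eight hex digits, which lines up with the gateway addresses above. A minimal Python sketch of that conversion, assuming the hex is in network byte order (the function name is only for illustration):

```python
import ipaddress

def hex_to_dotted(hex_ip: str) -> str:
    """Convert an 8-digit hex IPv4 value (as seen in an IP=... log field) to dotted form."""
    return str(ipaddress.IPv4Address(int(hex_ip, 16)))

print(hex_to_dotted("C0A8088C"))  # 192.168.8.140
print(hex_to_dotted("C0A8088D"))  # 192.168.8.141
```

So both "foreign device" entries are coming from the two gateways themselves.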
Why duplicate DPS? What's routing loop detected?
Now despite the 3 hour round trip to replace the faulty TPs, plus the couple of hours loading TP files and reconfiguring my router's static binding table for the new MACs, and then the R4 firmware upgrade (which I guess is considered an essential upgrade), a bit of testing, etc., basically I wasted an entire day.
Now I guess on Monday I could call Tech Support, oh but wait, that will cost me $1,200.00 in addition to the $1,000.00 they already cost me on Friday. Hmmm, I guess them wasting my precious time is alright but for me to take up 15 minutes or so of their time isn't. Oh and 1 of the 5 replacement TPs didn't work so I get to make the trip all over again, and if this error message is related to the firmware I'll have to make the trip for that too.
According to the log this started just after the ZigBee upgrades and I'm basically getting the same log entries every hour. Has anyone seen this before or have a clue as to what it's caused by, and how do I resolve it without calling tech support?
Both gateways have different ext. PAN & PAN IDs and one is on channel 26 while the other is on ch 15, although they're probably so far apart that they wouldn't see and/or step on each other anyway.
Comments
Paul
I have no redundant loops or spanning trees enabled on this switch or anywhere else in this network, so what settings in the switch could be causing these odd errors that came about as a result of the firmware upgrade? Should I just kick myself in the a$$ for breaking my rule and upgrading firmware on a working system?
Up until now I've called TS probably 4 or 5 times in the last 6 years and I damn sure ain't going to call them now with this new BS.... I mean Bulls Eye program in effect.
It doesn't seem to cause any problems, but I like to at least know why random error messages are popping up.
The message above is normal. It is a Parametric Info EOT message that is used by the gateway but meaningless (and ignored) by the master. It is sourced by each ZigBee device when coming online to help the gateway determine when to report the device as online with the master.
Do you have network time updates enabled on your master?
I didn't realize you were ever un-blacklisted.
rhargrave wrote:
If you mean in the web interface > Clock Manager, the radio button for "Network Time" is checked with a 1 hour re-sync. I just changed the re-sync period to every 2 hours to see if my errors follow.
So is this another one of those basically harmless error messages? I'd like it to go away and stop filling up my logs though so is there a remedy for this?
TIA,
Dan
Sorry, I only answered the question that I knew how to answer. Had to wait for Rhargrave to get here to help answer yours.
No, it is not. There is a known issue with network time updates and the ZigBee Gateway. Every time the master updates its time, the gateway and every device connected to it have a chance to go offline. The system recovers in under 10 seconds, but it is still a problem.
It was happening every hour, and 1 hour is one of the network time update options, and the log messages look similar to some of the other logs I have seen from tech support on this issue.
If so for the time being should I max my re-sync time to every 4 hours so the Gateways are only offline around 10 seconds every 4 hours as opposed to offline around 10 seconds every hour?
Now is this an issue for everyone who updated to the current firmware or is it random? Is it strictly an AMX hardware issue or an issue with AMX gear and how it plays with other network gear, as alluded to in a previous post by another member?
Inquiring minds would like to know and thanks for responses.
sjohnson wrote: Understood, but I'm totally happy with an "I don't know, I'll get back to ya." At least that way I feel the love, and recently it's been seemingly in short supply. Although not necessarily from you guys in the functional department (engineering).
Thx
Guess it's time to pull out the time manager mod and start using the astro clock function so I can reduce the amount of system time syncs, at least until the gateway firmware gets fixed. It will help clean up my code too.
Since you guys aren't sure what the cause is yet I guess we won't get an estimated time for a fix, huh. At least I now know the reason for the errors and the ramifications, so I'm cool with that.
Thx
Yes, the "routing loop detected" messages are not harmless error messages. At best, they show that you have some problem with how your master to master is configured (not the case here). At worst, they show that you have a device that is confusing the master by acting like another master (this is the case).
If I remember correctly, it is not a good idea to have the time manager module, and the clock manager both running. There is some overlap in their functionality that can cause issues.
As for the device going offline because of the time updates, it is a firmware issue.
Does putting the built-in time manager into standalone mode turn it off? Which is better to use, the built-in time manager or iTime Manager?
Paul
I think the built-in time manager still has to be modified if the government decides to change the date of the change over. So, if you have 50-100 systems out there, that's going to be quite a chore.
I don't know about iTime Manager.
I run the built-in time manager on my main mothership master and all my systems check in with it for time. That way I only have to have one Netlinx Master to fiddle with.
BTW I do have 2 masters on this job using M2M but it's set up correctly, with master 2 in 1's URL list and nothing in 2's. Besides, these log entries never appeared until the gateway firmware was upgraded.
a_riot42 wrote: I think the master will still do its sync of all the AMX gear, which is where the problem seems to be, and it just uses internal time tracking instead of retrieving network time. So the problem should remain with either method.
I think the built-in method would be a lot cleaner for our code and provide more flexibility, but functionally no different. Of course I've never used it so what the F do I know.
ericmedley wrote: Wasn't there a patch for iTimeManager a few years back when they changed the DST dates? So I'm sure if the dates change again we'll be updating all systems again. Hopefully you and I will be retired by then.
i!-TimeManager and i!-Schedule use the same clock code, and they have always had a single line you can edit to change when Daylight Savings adjusts. You can set it to whatever you like. It requires a re-compile and reload to activate, but that's not the worst thing in the world.
I don't know what's involved with the "resync" but isn't the default value of 60 minutes a little too frequent? Wouldn't most systems, especially large ones be better off set to the max of 240 minutes? Wouldn't once or twice a day be sufficient? Wouldn't less == more better?
Once a day, and on a reboot. Anything more than that is of no value, unless the clock chip is really off.
NTP is done in a way that both keeps the clock synced up nicely and helps the computer figure out the fastness or slowness of its internal clock.
The NTP conversation is a series of calls and responses to determine the average time a packet takes to travel from host to client and back. The number of times this happens is adjustable by the software. Most OSs don't give this control to the user. From what I've read most home computers do 5-10 bounces to get a good network latency reading.
Once this value is established, the actual time is passed and adjusted for the determined average latency. The computer will then determine the slowness or fastness of its clock. Some chips will tweak to compensate for this.
The time chip may be checking NTP every 20 minutes or so. In most cases, it's out of the user's hands.
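The averaging described above can be sketched in a few lines of Python. This uses the standard NTP offset and round-trip formulas on four timestamps per exchange. The sample timestamps are made up for illustration, and it says nothing about how the AMX master actually does it.

```python
from statistics import mean

def offset_and_delay(t0, t1, t2, t3):
    """t0: client send, t1: server receive, t2: server send, t3: client receive (seconds)."""
    delay = (t3 - t0) - (t2 - t1)         # time spent on the network, both directions
    offset = ((t1 - t0) + (t2 - t3)) / 2  # positive means the client clock is behind
    return offset, delay

# Hypothetical exchanges: a client clock 0.5 s behind the server, with varying latency.
samples = [
    (10.000, 10.520, 10.521, 10.041),
    (20.000, 20.530, 20.532, 20.062),
    (30.000, 30.510, 30.511, 30.021),
]

results = [offset_and_delay(*s) for s in samples]
avg_offset = mean(r[0] for r in results)
avg_delay = mean(r[1] for r in results)
print(f"average offset {avg_offset:.3f} s, average round trip {avg_delay:.3f} s")
```

With symmetric latency the one-way delays cancel out of the offset, which is why a handful of bounces is enough to get a stable correction.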
I wonder how AMX's works?
Also I think the re-sync is mainly a command that is sent to all AMX gear to sync those devices to the master's clock. It also first checks the network time if that option is selected, which it usually is if there's a gateway to the world available.
The options for the built in network time manager are 5 min, 1 hour, 2 hours, and 4 hours. Most crystals are 20 PPM, so in 4 hours you are looking at a maximum drift of around 288 ms. Setting the network time manager to standalone mode will turn the updates off entirely.
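The 288 ms figure is just the PPM rating times the interval. A quick Python sketch of the arithmetic, assuming a 20 PPM crystal as stated above (the function name and the extra 24 hour row are only for illustration):

```python
def max_drift_ms(ppm: int, interval_seconds: int) -> float:
    """Worst-case drift in milliseconds over one interval for a crystal rated at +/- ppm."""
    # ppm is microseconds of error per second; divide by 1000 to get milliseconds.
    return ppm * interval_seconds / 1000

intervals = [("5 min", 300), ("1 hour", 3600), ("2 hours", 7200),
             ("4 hours", 14400), ("24 hours", 86400)]

for label, seconds in intervals:
    print(f"{label}: up to {max_drift_ms(20, seconds):g} ms")
```

Even a once-a-day sync stays under 2 seconds of worst-case drift at that rating.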
If you are seeing R4/Gateway offlines from having the network time turned on, there is a Hotfix available.