R4 Related Error Msg

vining Posts: 4,368

February 2010 in AMX Control Products

I went to a job Friday to finally swap out the 5200i's due to the battery issue, so while I was there I also updated the ZigBee gateway and R4 firmware. Needless to say, the process was as aggravating as usual, but it got done: everything showed up online with the correct updated firmware and the remotes worked.

For some reason this morning I decided to open up the daily sitrep log that the system emails me and, much to my surprise, I got this every hour:

<<< 00:00 >>>

show mem
Volatile Free   :  9145280 (largest free block in bytes)
NonVolatile Free:   934890 (bytes free)
(0021610256) ICSPTCPRx25: Ignoring msg from foreign device (same DPS) (IP=C0A8088C) Cmd!3 DSys=DDev=TCP Socket% IP2.168.8.140 Index
(0021610257) Routing Loop detected. Killing route to system 1.
(0021614159) ICSPTCPRx22: Ignoring msg from foreign device (same DPS) (IP=C0A8088D) Cmd!3 DSys=DDev=TCP Socket" IP2.168.8.141 Index
(0021614160) Routing Loop detected. Killing route to system 1.
(0021614186) ICSPTCPRx22: Ignoring msg from foreign device (same DPS) (IP=C0A8088D) Cmd!3 DSys=DDev=TCP Socket" IP2.168.8.141 Index
(0021614187) Routing Loop detected. Killing route to system 1.
(0021614389) ICSPTCPRx22: Ignoring msg from foreign device (same DPS) (IP=C0A8088D) Cmd!3 DSys=DDev=TCP Socket" IP2.168.8.141 Index
(0021614389) Routing Loop detected. Killing route to system 1.
(0021730098) CIpEvent::OnLine 0:2:1
(0021730187) Exiting UDP SNMP Read thread - closing this socket for local port 2
(0021730206) CIpEvent::OffLine 0:2:1
(0021730451) ICSPTCPRx22: Ignoring msg from foreign device (same DPS) (IP=C0A8088D) Cmd!3 DSys=DDev=TCP Socket" IP2.168.8.141 Index
(0021730452) Routing Loop detected. Killing route to system 1.
(0021730648) ICSPTCPRx25: Ignoring msg from foreign device (same DPS) (IP=C0A8088C) Cmd!3 DSys=DDev=TCP Socket% IP2.168.8.140 Index
(0021730648) Routing Loop detected. Killing route to system 1.
(0021774558) Connected Successfully
(0021774619) CIpEvent::OnLine 0:3:1
(0021775152) Exiting TCP Read thread - closing this socket for local port 3
(0021775189) CIpEvent::OffLine 0:3:1

<<< 01:00 >>>

Now the IP addresses 192.168.8.140 & .141 are my gateways. The system has 2 gateways & 2 R4s. The gateways are set up with device numbers 20001 & 20002 respectively and the R4's are 10022 & 10023 respectively. They show in my device tree in NS3 just fine, so what the F is this crap?
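For anyone matching the hex `IP=` field in the log against a device, here is a minimal Python sketch that decodes it to dotted-quad form. The helper name `hex_to_ipv4` is made up for illustration; the two hex values are the ones from the log above.

```python
import ipaddress

def hex_to_ipv4(hex_ip: str) -> str:
    """Convert an 8-digit hex string such as 'C0A8088C' to dotted-quad form."""
    return str(ipaddress.IPv4Address(int(hex_ip, 16)))

# The two "foreign device" addresses reported in the log
for raw in ("C0A8088C", "C0A8088D"):
    print(raw, "->", hex_to_ipv4(raw))
# C0A8088C -> 192.168.8.140
# C0A8088D -> 192.168.8.141
```

Both values decode to the gateway addresses, which is consistent with the gateways being the devices the master is complaining about.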

Why duplicate DPS? What's routing loop detected?

Now despite the 3 hours round trip to replace the faulty TP's, plus the couple of hours loading TP files and reconfiguring my router's static binding table for the new MACs and then the R4 firmware upgrade which I guess is considered an essential upgrade, a bit of testing, etc, basically I wasted an entire day.

Now I guess on Monday I could call Tech Support, oh but wait, that will cost me $1,200.00 in addition to the $1,000.00 they already cost me on Friday. Hmmm, I guess them wasting my precious time is alright but for me to take up 15 minutes or so of their time isn't. Oh and 1 of the 5 replacement TPs didn't work so I get to make the trip all over again, and if this error message is related to the firmware I'll have to make the trip for that too.

According to the log this started just after the ZigBee upgrades and I'm basically getting the same log entries every hour. Has anyone seen this before or have a clue as to what it's caused by, and how do I resolve it without calling tech support?
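To check the "same entries every hour" pattern without eyeballing the whole sitrep, a rough Python sketch like the one below could tally the routing-loop lines per hourly section. It assumes the log keeps the `<<< HH:MM >>>` section markers and the `(IP=XXXXXXXX)` field shown above; the function name and the trimmed sample lines are only for illustration.

```python
import re
from collections import Counter

HOUR_MARKER = re.compile(r"^<<< (\d{2}:\d{2}) >>>$")
FOREIGN_IP = re.compile(r"Ignoring msg from foreign device .*\(IP=([0-9A-F]{8})\)")

def summarize(log_text):
    """Count routing-loop messages per hourly section and per foreign IP."""
    loops_per_hour = Counter()
    foreign_ips = Counter()
    hour = None
    for line in log_text.splitlines():
        line = line.strip()
        marker = HOUR_MARKER.match(line)
        if marker:
            hour = marker.group(1)  # start of a new hourly section
            continue
        if "Routing Loop detected" in line:
            loops_per_hour[hour] += 1
        ip = FOREIGN_IP.search(line)
        if ip:
            foreign_ips[ip.group(1)] += 1
    return loops_per_hour, foreign_ips

# Sample lines trimmed from the log above (tail of each line omitted)
SAMPLE = """\
<<< 00:00 >>>
(0021610256) ICSPTCPRx25: Ignoring msg from foreign device (same DPS) (IP=C0A8088C)
(0021610257) Routing Loop detected. Killing route to system 1.
(0021614159) ICSPTCPRx22: Ignoring msg from foreign device (same DPS) (IP=C0A8088D)
(0021614160) Routing Loop detected. Killing route to system 1.
<<< 01:00 >>>
"""

loops, ips = summarize(SAMPLE)
print(dict(loops))
print(dict(ips))
```

Running it over a full day's log would show whether the counts line up with any scheduled task on the master.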

Both gateways have different ext. pan & pan ids and one is on channel 26 while the other is on ch 15, although they're probably so far apart that they wouldn't see and/or step on each other.

Online_Tree.jpg 106.6K

Comments

a_riot42 Posts: 1,624

February 2010

I have encountered it before and I think I know the solution, but you should call tech support and let them know what is going on so they fix this and don't keep assuming it isn't their fault. Are you using Cisco switches on this job?
Paul

vining Posts: 4,368

February 2010

The rack switch is a new Linksys managed switch so it probably has some of the lower end components and firmware that the Cisco Catalyst Express 500 series switches have.

I have no redundant loops or spanning trees enabled on this switch or anywhere else in this network, so what settings in the switch could be causing these odd errors that came about as a result of the firmware upgrade? Should I just kick myself in the a$$ for breaking my rule and upgrading firmware on a working system?

Up until now I've called TS probably 4 or 5 times in the last 6 years and I damn sure ain't going to call them now with this new BS.... I mean Bulls Eye program in effect.

vining Posts: 4,368

February 2010

Any of you fellas from engineering got any ideas?

the8thst Posts: 470

February 2010
I don't have an answer for you, but I did start to receive another odd message in the telnet extended messages console that also appears to be related to the R4's coming on/off line.
Invalid message received @ CDMDeviceManager (00A8)

It doesn't seem to cause any problems, but I like to at least know why random error messages are popping up.
[Deleted User] Posts: 0

February 2010
the8thst wrote: »

I don't have an answer for you, but I did start to receive another odd message in the telnet extended messages console that also appears to be related to the R4's coming on/off line.
Invalid message received @ CDMDeviceManager (00A8)

It doesn't seem to cause any problems, but I like to at least know why random error messages are popping up.

The message above is normal. It is a Parametric Info EOT message that is used by the gateway but meaningless (and ignored) by the master. It is sourced by each ZigBee device when coming online to help the gateway determine when to report the device as online with the master.
vining Posts: 4,368

February 2010

sjohnson wrote:

The message above is normal. It is a Parametric Info EOT message that is used by the gateway but meaningless (and ignored) by the master. It is sourced by each ZigBee device when coming online to help the gateway determine when to report the device as online with the master.

Hey, what about my problem which initiated this thread? What, am I blacklisted again?

[Deleted User] Posts: 0

February 2010

vining wrote: »

sjohnson wrote:

Hey, what about my problem which initiated this thread? What, am I blacklisted again?

Do you have network time updates enabled on your master?

the8thst Posts: 470

February 2010

vining wrote: »

sjohnson wrote:

Hey, what about my problem which initiated this thread? What, am I blacklisted again?

I didn't realize you were ever un-blacklisted.

vining Posts: 4,368

February 2010

the8thst wrote:

I didn't realize you were ever un-blacklisted.

Oh, that hurts!

rhargrave wrote:

Do you have network time updates enabled on your master?

If you mean in the web interface>clock manager the radio button for "Network Time" is checked with a 1 hour re-sync. I just changed the resync period to every 2 hours to see if my errors follow.

So is this another one of those basically harmless error messages? I'd like it to go away and stop filling up my logs though so is there a remedy for this?

TIA,

Dan

[Deleted User] Posts: 0

February 2010

vining wrote: »

sjohnson wrote:

Hey, what about my problem which initiated this thread? What, am I blacklisted again?

Sorry, I only answered the question that I knew how to answer. Had to wait for Rhargrave to get here to help answer yours.

[Deleted User] Posts: 0

February 2010

vining wrote: »

the8thst wrote:

Oh, that hurts!

rhargrave wrote:

If you mean in the web interface>clock manager the radio button for "Network Time" is checked with a 1 hour re-sync. I just changed the resync period to every 2 hours to see if my errors follow.

So is this another one of those basically harmless error messages? I'd like it to go away and stop filling up my logs though so is there a remedy for this?

TIA,

Dan

No, it is not. There is a known issue with network time updates and the ZigBee Gateway. Every time the master updates its time, the gateway and every device connected to it have a chance to go offline. The system recovers in under 10 seconds, but it is still a problem.

It was happening every hour, and 1 hour is one of the network time update options, and the log messages look similar to some of the other logs I have seen from tech support on this issue.

vining Posts: 4,368

February 2010

rhargrave wrote:

No, it is not.

Is that in response to "So is this another one of those basically harmless error messages?" ? Meaning it isn't harmless.

If so for the time being should I max my re-sync time to every 4 hours so the Gateways are only offline around 10 seconds every 4 hours as opposed to offline around 10 seconds every hour?

Now is this an issue for everyone who updated to the current firmware or is it random? Is it strictly an AMX hardware issue or an issue with AMX gear and how it plays with other network gear, as alluded to in a previous post by another member?

Inquiring minds would like to know and thanks for responses.

sjohnson wrote:

Sorry, I only answered the question that I knew how to answer. Had to wait for Rhargrave to get here to help answer yours.

Understood, but I'm totally happy with an "I don't know, I'll get back to ya." At least that way I feel the love, and recently it's been seemingly in short supply. Although not necessarily from you guys in the functional department (engineering).

Thx

vining Posts: 4,368

February 2010

I just realized I'm running time manager too which is why (I assume) these errors are doubled up every 2 hours and why, when I changed the re-sync period to every 4 hours on the master, I was still getting the errors every 2 hours and every 4th hour they would be doubled up. According to my log the time manager mod connects every 2 hours and I don't know if I can change that and I really don't want to read up and find out.
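The doubling pattern can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, assuming both time sources start counting from the same hour; the source labels and the function name `sync_schedule` are made up for illustration.

```python
def sync_schedule(periods, span_hours):
    """Map each hour in the span to the time sources that sync during it."""
    schedule = {}
    for hour in range(1, span_hours + 1):
        sources = [name for name, period in periods.items() if hour % period == 0]
        if sources:
            schedule[hour] = sources
    return schedule

# Assumed periods: time manager module every 2 hours, master clock manager every 4
periods = {"time manager module": 2, "master clock manager": 4}

for hour, sources in sync_schedule(periods, 8).items():
    print(f"hour {hour}: {len(sources)} sync(s) from {', '.join(sources)}")
```

With those periods there is one sync at hours 2 and 6 and two at hours 4 and 8, which matches errors every 2 hours that double up every 4th hour.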

Guess it's time to pull out the time manager mod and start using the astro clock function so I can reduce the amount of system time syncs, at least until the gateway firmware gets fixed. It will help clean up my code too.

Since you guys aren't sure what the cause is yet I guess we won't get an estimated time for a fix, huh. At least I now know the reason for the errors and the ramifications so I'm cool with that.

Thx

[Deleted User] Posts: 0

February 2010

vining wrote: »

I just realized I'm running time manager too which is why (I assume) these errors are doubled up every 2 hours and why, when I changed the re-sync period to every 4 hours on the master, I was still getting the errors every 2 hours and every 4th hour they would be doubled up. According to my log the time manager mod connects every 2 hours and I don't know if I can change that and I really don't want to read up and find out.

Guess it's time to pull out the time manager mod and start using the astro clock function so I can reduce the amount of system time syncs, at least until the gateway firmware gets fixed. It will help clean up my code too.

Since you guys aren't sure what the cause is yet I guess we won't get an estimated time for a fix, huh. At least I now know the reason for the errors and the ramifications so I'm cool with that.

Thx

Yes, the "routing loop detected" messages are not harmless error messages. At best, they show that you have some problem with how your master-to-master is configured (not the case here). At worst, they show that you have a device that is confusing the master by acting like another master (this is the case).

If I remember correctly, it is not a good idea to have the time manager module, and the clock manager both running. There is some overlap in their functionality that can cause issues.

As for the device going offline because of the time updates, it is a firmware issue.

a_riot42 Posts: 1,624

February 2010

rhargrave wrote: »

Yes, the "routing loop detected" messages are not harmless error messages. At best, they show that you have some problem with how your master-to-master is configured (not the case here). At worst, they show that you have a device that is confusing the master by acting like another master (this is the case).

If I remember correctly, it is not a good idea to have the time manager module, and the clock manager both running. There is some overlap in their functionality that can cause issues.

As for the device going offline because of the time updates, it is a firmware issue.

Does putting the built-in time manager into standalone mode turn it off? Which is better to use: the built-in time manager or iTime Manager?
Paul

ericmedley Posts: 4,177

February 2010

a_riot42 wrote: »

Does putting the built-in time manager into standalone mode turn it off? Which is better to use: the built-in time manager or iTime Manager?
Paul

I think the built-in time manager still has to be modified if the government decides to change the date of the change over. So, if you have 50-100 systems out there, that's going to be quite a chore.

I don't know about iTime Manager.

I run the built-in time manager on my main mothership master and all my systems check in with it for time. That way I only have to have one Netlinx Master to fiddle with.

vining Posts: 4,368

February 2010

rhargrave wrote:

it is not a good idea to have the time manager module, and the clock manager both running. There is some overlap in their functionality that can cause issues.

Until yesterday I didn't realize I was duplicating processes. I knew the new masters provided time manager functionality and there was the ASTROCLOCK function, but I figured I'd leave things the way I've always done them and incorrectly thought I would have to do something to invoke the time manager functionality. Obviously that isn't the case.

BTW I do have 2 masters on this job using M2M but it's set up correctly with master 2 in 1's URL list and nothing in 2's. Besides, these log entries never appeared until the gateway firmware was upgraded.

a_riot42 wrote:

Does putting the built-in time manager into standalone mode turn it off? Which is better to use: the built-in time manager or iTime Manager?

I think the master will still do its sync of all the AMX gear, which is where the problem seems to be, and it just uses internal time tracking instead of retrieving network time. So the problem should remain with either method.

I think the built-in method would be a lot cleaner for our code and provide more flexibility but functionally no different. Of course I've never used it so what the F do I know.

ericmedley wrote:

I don't know about iTime Manager.

Wasn't there a patch for iTimeManager a few years back when they changed the DST dates? I'm sure if dates change again we'll be updating all systems again. Hopefully you and I will be retired by then.

DHawthorne Posts: 4,584

February 2010

I'll be watching out for this one. I have two projects coming up that use a fair number of R4's. I have never used the built in time clock in the masters ... I have always used i!-TimeManager or i!-Schedule, and have never seen a conflict before. Good to know where to look when it crops up ...

i!-TimeManager and i!-Schedule use the same clock code, and they have always had a single line you can edit to change when Daylight Savings adjusts. You can set it to whatever you like. It requires a re-compile and reload to activate, but that's not the worst thing in the world.

vining Posts: 4,368

February 2010

I think the next firmware release for the masters should include an option to disable the internal time manager function so existing systems running i!-TimeManager don't have to be modified to avoid duplicating processes and whatever errors or problems may result.

I don't know what's involved with the "resync" but isn't the default value of 60 minutes a little too frequent? Wouldn't most systems, especially large ones be better off set to the max of 240 minutes? Wouldn't once or twice a day be sufficient? Wouldn't less == more better?

DHawthorne Posts: 4,584

March 2010

vining wrote: »

I think the next firmware release for the masters should include an option to disable the internal time manager function so existing systems running i!-TimeManager don't have to be modified to avoid duplicating processes and whatever errors or problems may result.

I don't know what's involved with the "resync" but isn't the default value of 60 minutes a little too frequent? Wouldn't most systems, especially large ones be better off set to the max of 240 minutes? Wouldn't once or twice a day be sufficient? Wouldn't less == more better?

Once a day, and on a reboot. Anything more than that is of no value, unless the clock chip is really off.

ericmedley Posts: 4,177

March 2010

DHawthorne wrote: »

Once a day, and on a reboot. Anything more than that is of no value, unless the clock chip is really off.

NTP is done in a way that both keeps the clock synced up nicely and helps the computer figure out the fastness or slowness of its internal clock.

The NTP conversation is a series of calls and responses to determine the average time a packet travels from host to client and back. The number of times this happens is adjustable by the software. Most OSs don't give this control to the user. From what I've read most home computers do 5-10 bounces to get a good network latency reading.

Once this value is established, the actual time is passed and adjusted for the determined average latency. The computer will then determine the slowness or fastness of its clock. Some chips will tweak to compensate for this.

The time chip may be checking NTP every 20 minutes or so. In most cases, it's out of the user's hands.
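As a rough illustration of the latency correction described above, the standard NTP calculation for a single exchange uses four timestamps. The Python sketch below shows that formula only; the function name and the example numbers are made up, and nothing here is a claim about how the AMX master implements it.

```python
def ntp_offset_and_delay(t0, t1, t2, t3):
    """Estimate clock offset and round-trip delay from one NTP exchange.

    t0: client send time (client clock)
    t1: server receive time (server clock)
    t2: server send time (server clock)
    t3: client receive time (client clock)
    """
    delay = (t3 - t0) - (t2 - t1)         # time spent on the network, both ways
    offset = ((t1 - t0) + (t2 - t3)) / 2  # how far the client is behind the server
    return offset, delay

# Hypothetical exchange: client is 0.5 s behind, 20 ms each way, 1 ms server turnaround
offset, delay = ntp_offset_and_delay(100.000, 100.520, 100.521, 100.041)
print(f"offset {offset:.3f} s, round-trip delay {delay:.3f} s")
```

Repeating the exchange several times and averaging or filtering the results is what the "bounces" above refer to.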

I wonder how AMX's works?

vining Posts: 4,368

March 2010

DHawthorne wrote:

Once a day, and on a reboot. Anything more than that is of no value, unless the clock chip is really off.

That's what I would think, but the longest period between re-syncs that you can set on the master is 480 minutes (8 hours) and it can't be disabled, at least to my knowledge. I think through the web interface the max you can set is 240 minutes but according to the help file for "CLKMGR_SET_RESYNC_PERIOD (CONSTANT INTEGER PERIOD)" you can set it to 480 minutes.

Also I think the re-sync is mainly a command that is sent to all AMX gear to sync those devices to the master's clock. It also first checks the network time if that option is selected, which it usually is if there's a gateway to the world available.

[Deleted User] Posts: 0

March 2010

vining wrote: »

DHawthorne wrote:

That's what I would think, but the longest period between re-syncs that you can set on the master is 480 minutes (8 hours) and it can't be disabled, at least to my knowledge. I think through the web interface the max you can set is 240 minutes but according to the help file for "CLKMGR_SET_RESYNC_PERIOD (CONSTANT INTEGER PERIOD)" you can set it to 480 minutes.

Also I think the re-sync is mainly a command that is sent to all AMX gear to sync those devices to the master's clock. It also first checks the network time if that option is selected, which it usually is if there's a gateway to the world available.

The options for the built in network time manager are 5 min, 1 hour, 2 hours, and 4 hours. Most crystals are 20 PPM, so in 4 hours you are looking at a maximum drift of around 288 ms. Setting the network time manager to standalone mode will turn the updates off entirely.
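The 288 ms figure follows directly from the crystal tolerance. A small Python sketch of the arithmetic, using the 20 PPM value and the four re-sync options quoted above (the function name is just for illustration):

```python
def max_drift_ms(ppm, interval_seconds):
    """Worst-case clock drift in milliseconds over an interval.

    ppm * 1e-6 gives seconds of drift per second; multiplying by 1000 converts
    to milliseconds, so the factors reduce to ppm * seconds / 1000.
    """
    return ppm * interval_seconds / 1000

# Re-sync options listed for the built-in network time manager
for label, seconds in (("5 min", 300), ("1 hour", 3600), ("2 hours", 7200), ("4 hours", 14400)):
    print(f"{label}: up to {max_drift_ms(20, seconds):g} ms")
# 5 min: up to 6 ms
# 1 hour: up to 72 ms
# 2 hours: up to 144 ms
# 4 hours: up to 288 ms
```

At that tolerance even a once-a-day sync (86,400 seconds) would drift by at most about 1.7 seconds.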

If you are seeing R4/Gateway offlines from having the network time turned on, there is a Hotfix available.
