System log message interpretation request

Joe Hebert · September 2006

Does anyone know what the following system log messages mean? I get 100s and 100s of them within a few seconds. For all I know it might be 1000s at a time because I never see both the start and end of the message burst (the system log holds 1000 max.) I?ve never seen the message burst in real-time, I only noticed it when I telneted in and paged through the system log. I?ve seen this a few times now on the same job and I don?t recall seeing these on other jobs.

194: 09-08-2006 FRI 19:52:03 Unknown
Optimizing tDeviceMgr messages with 999 messages in queue.
195: 09-08-2006 FRI 19:52:03 Unknown
Optimization tDeviceMgr complete with 1000 messages now in queue. 0 messages avoided.
196: 09-08-2006 FRI 19:52:03 Unknown
Optimizing tDeviceMgr messages with 1000 messages in queue.
197: 09-08-2006 FRI 19:52:03 Unknown
Optimization tDeviceMgr complete with 1000 messages now in queue. 0 messages avoided

It says 0 messages avoided so I?m hoping it?s benign but I do want to track down the meaning and what?s causing it. I keep my own log file that monitors most user interaction and best I can tell there was no major activity at that time. (I have nothing in my log that matches the time frame but I don?t log every button push so I can?t be 100% positive)

The code is running in an NI-4000 with the latest firmware. The compiled code is just over 1 MB.

Thanks in advance to anyone who can shed some light on this for me.

vining · September 2006

I think this just released tech note might have something to do with it.

http://www.amx.com/techsupport/technote.asp?id=809

Although it list the NI-4100 not NI-4000. But it sounds like it might do the trick.

Joe Hebert · September 2006

Thanks for the response but I don?t think TN809 is the answer since I don?t exhibit any of those symptoms. The program already has the Queue_and_Threshold_Sizes.axi included.

I just telneted into the job again and this time the log file contains a tDeviceMgr messages burst from line 11 thru line 1000.

11: 09-09-2006 SAT 15:04:03 Unknown
Optimization tDeviceMgr complete with 999 messages now in queue. 0 messages avoided.

?

1000: 09-09-2006 SAT 15:03:55 Unknown
Optimizing tDeviceMgr messages with 1000 messages in queue.

Lines 1-10 look normal.

DHawthorne · September 2006

I have no idea what the Unknown line could possibly be; in the world of unhelpful error messages, it ranks pretty high I'd say

.

The queue message I am familiar with. It essentially means there are ### (in your case 999) messages stacked up that the master has not yet processed because it is busy (more than likely, for the "Unknown" reason). The real question here is if the queue backup is causing the "Unknown" or vica versa. You need to find out what was happening just before the errors occurred. My guess is there is a device problem that is causing the message backlog, and the Unknown is a memory fault; it's just a guess though. You are only going to find this if you can get that log from just before it happened ... and if that log segment has any useful clues.

A message queue overrun is not by any means harmless. Eventually, the master will start throwing messages away (to the end user, it will appear that button presses don't work, or are extremely sluggish). It can lock up the master as well.

alexanbo · September 2006

I would guess that there's a problem with one of the devices connected to the system, perhaps a wire that's been shorted or some such.

Joe Hebert · September 2006

Thanks for the feedback guys. I haven?t seen a rash of messages in 48 hours now. I have no idea what caused the breakout to begin with and I don?t know why they stopped. I feel like I?m sitting on a ticking time bomb.

One thing that is different than most systems I?ve dealt with is that I?m borrowing a port on one TP that is connected to another system. This is a multiple master-master setup (1-NI-4000, 2-NI-3000, 1-NI-700) and there is also an Axcent3 in slave mode that is attached to one of the NI-3000s.

All the code runs in an NI-4000 except for some code that runs in a NI-700 to control a small media room. All the TPs in the system point to the NI-4000. There is one TP (MVP-8400) that has pages with buttons on port 3 (for the media room) so I have something like this:

Main NI-4000 system 100

dvTP1 = 10001:1:0
dvTP2 = 10002:1:0
...
dvTP8 = 10008:1:0

Media Room NI-700 system 101

dvTP1 = 10008:3:100

Everything seems to work fine but I did notice the feedback for the port 3 media system is dreadfully slow. Buttons that should light up within an instant take about 5 seconds or so to kick on.

I assume what I?m doing is legal since it?s working but I?m wondering if it?s bad practice to do so. And I wonder if this might be causing my intermittent problems.

Any thoughts appreciated.

DHawthorne · September 2006

It's a perfectly legit way to control two systems from the same panel. I've done it many times and with some very large systems without a hitch. There is nothing inherent to it that would cause a response slowdown; something else is going on.

There may be an issue with the masters connecting to each other, and the problem might be the network itself. I had a funky router cause all sorts of slowdown and disconnect problems on a NetLinx system. Is this a wireless panel? I would run a packet sniffer and take a look at what kind of traffic you have.

alexanbo · September 2006

Now that I think about it I think I may have seen this message before. What I had done was setup some feedback that was using the second port of the virtual device. I had a timeline that checked the status of something every 1/2 second or so and then would update a channel based on the status. Now usually the device that it updates is a real device so using the second port of it isn't a problem, but with the virtual device, when I used the second port it would update it every 1/2 second instead of only updating when the channel needed to be updated. The system then would bog down and generate log messages, now I can't recall if it was exactly the same as your message or not. Once I changed the device I was updating to the first port of a new virtual device all the problems and messages disappeared.

So maybe there is something in your system reporting feedback that turns the channel on the second port of a virtual device sporadically.

I've also used the method of having one port of a TP talk to a different Master then the other ports without problems. My delay wasn't that long though on feedback so perhaps your issue is related to the network.

System log message interpretation request

Comments