Touch Panel Control System Errors

Dave_UK · August 2010

Has anyone experienced high numbers of what would appear to be random touch panel errors being reported in RMS ?

By high, I mean around 1,000+ events over a couple of days and by random, I mean from different venues / different times / durations.

The problem appears to come and go - one week there will be no errors logged, then the next there will be a burst of errors, then it behaves normally again. Next week it could be a different venue, some weeks multiple venues, then nothing for a month.

What I have established is that when the problem appears, the panels also fail to respond to ping requests. (I have a machine pinging all the panels and the netlinx masters and logging the results.) The panels fail to respond to a ping whenever RMS logs a control system error, so it looks as if RMS is correctly reporting the control system error.

When a panel fails to respond to a ping request, it will usually respond to the next one (15s sampling), so it's not as if the panel is going off-line for long periods of time. It just seems to go on and off-line repeatedly, hence the high number of log entries in RMS.
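
For anyone wanting to line up a ping log like this against the RMS entries, here is a minimal sketch of grouping consecutive failed pings into outages. It assumes the log has already been parsed into (timestamp, ok) pairs; the function name `find_outages` and the sample data are hypothetical, not taken from the actual logs.

```python
from datetime import datetime, timedelta


def find_outages(samples):
    """Group consecutive failed pings into outages.

    samples: list of (timestamp, ok) tuples, sorted by time.
    Returns a list of (first_failure, first_success_after, missed_count).
    The second item is None if the log ends while pings are still failing.
    """
    outages = []
    start = None   # timestamp of the first failed ping in the current run
    missed = 0     # number of failed pings in the current run
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts
            missed += 1
        elif start is not None:
            # First successful ping after a run of failures closes the outage.
            outages.append((start, ts, missed))
            start, missed = None, 0
    if start is not None:
        outages.append((start, None, missed))
    return outages


# Illustrative data only: one sample every 15 seconds, made-up start time.
t0 = datetime(2010, 8, 2, 9, 0, 0)
step = timedelta(seconds=15)
results = [True, False, True, True, False, False, True]
samples = [(t0 + i * step, ok) for i, ok in enumerate(results)]

for start, end, missed in find_outages(samples):
    print(start.time(), end.time() if end else "ongoing", missed)
```

With 15s sampling, an outage with a single missed ping only says the panel was unreachable at that one instant and back by the next sample, so the missed count is a coarse guide to duration rather than a measurement.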

The touch panels and netlinx master in each case are connected to the same Ethernet switch, on the same VLANs. The netlinx master always responds to ping requests, so whatever is going on is just affecting the touch panels.

Our networking team can't see any faults with the network connectivity, and our AMX contract programmer swears it's not their programming. This has been going on for almost a year now with no obvious pattern emerging.

Anyone experienced anything like this ?

Dave

annuello · August 2010

We haven't seen this on our wired panels (CV7 & 700vi, all running firmware v2.86.24). The wireless panels (MVP-8400 with firmware v2.86.50) drop offline frequently, but that is probably due to the wireless lease being limited to X hours, after which the panel renegotiates a connection.

It doesn't sound like a programming issue to me, if the panel can't be pinged. The only exception would be if there is code on the master to reboot the panel under a special condition. The NIC would come online before the reboot finishes, but the panel would still take 40 seconds to re-establish comms with the master. Check your RMS history for the panels to see how long they are offline for. If it is just 10 or 15 seconds then you know the panel isn't rebooting, which points to the network infrastructure. I'd expect that rebooting panels would also attract large complaints from the end users.

Silly question: Are your patch leads crimped "in house" or are they commercially purchased pre-moulded leads? We use pre-moulded patch leads at both the panel and switch wherever possible, since crimped cables are often crimped poorly by people who are in a rush. Slight vibrations, bumps, or thermal changes can affect such leads.

Yours,
Roger McLean
Swinburne University

Dave_UK · August 2010

Thanks Roger,

The panels in question are CV5, 500vi, & 1000vi

At times, RMS is showing errors as close as 5 seconds apart, so I had ruled out rebooting.

Most of the cabling I have seen so far has used pre-molded connectors. However, if this was the problem, then I'd expect it to affect the masters as well as the touch panels. The last spate saw six touch panels affected with 1000+ log entries and in that same period, the masters responded to every ping request. If it was cables, then I'd also expect that the networking team would be seeing a high number of link drops on the switch ports.

I've just recently repatched seven systems with new cables, and added them to RMS, so it will be interesting to see how long they take for errors to start appearing.

Dave

HARMAN_rgelling · August 2010

"The wireless panels (MVP-8400 with firmware v2.86.50) drop offline frequently, but that is probably due to the wireless lease being limited to X hours, after which the panel renegotiates a connection."

This isn't (shouldn't be) true. The panels should be renewing their leases about half-way through the lease period. If a panel cannot renew at that time, it tries again after half of the remaining time, then half of what remains after that, and so on until it can renew. If a wireless panel is falling offline it is most likely due to loss of signal to the AP preventing it from reaching the master.
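
To make the halving schedule described above concrete, here is a rough sketch that lists when renewal attempts would fall if every attempt failed. It models only the simplified scheme as described in this post; the function name `renewal_attempts` and the 60-second floor on the retry interval are assumptions of the sketch, and a real DHCP client's timers may differ.

```python
def renewal_attempts(lease_seconds, min_wait=60):
    """Return renewal attempt times in seconds after the lease starts.

    Assumes every attempt fails. The first attempt is half-way through the
    lease; each retry waits half of the time then remaining. Retries stop
    once the next wait would be shorter than min_wait (an assumed floor).
    """
    attempts = []
    t = lease_seconds / 2
    while True:
        attempts.append(t)
        wait = (lease_seconds - t) / 2
        if wait < min_wait:
            break
        t += wait
    return attempts


# A one-hour lease, chosen purely for illustration.
print(renewal_attempts(3600))
# → [1800.0, 2700.0, 3150.0, 3375.0, 3487.5]
```

The point of the schedule is that a panel gets several chances to renew before the lease runs out, so an expiring lease on its own should not take a panel offline.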

With regard to the wired panels, I have not seen this behavior and wouldn't expect it at all. Are you running the latest firmware? It does sound like a networking problem, especially with so many panels all showing the same symptom. If you can do it, it would be interesting to put a switch between the AMX equipment and the network. Just put an 8-port switch between the network switch/router and the master and all panels. Let that run for a few days and see what you find. I've seen some network equipment where the electrical signal from the switch is too hot/weak for the panel or vice versa (yes, even Cisco). Sometimes switches and equipment don't play nice together.

Dave_UK · August 2010

Now wouldn't it be nice if RMS had a system report that gave me that information for all the touch panels...

(Is there documentation on how to roll your own reports in RMS ?)

I will have to check them all, however as you've not seen this problem with firmware 'x', I suspect it's unlikely to be the cause.

(The latest fault occurred on a NXD-500i with old firmware 2.3.22. However the same fault has occurred on another NXD-500i with the latest firmware 2.4.14)

Ironically the systems were on 4-port switches prior to joining them to the network to work with RMS. However, network policy here does not permit switches to be attached to data points, so they were all removed and the equipment patched directly into the switches in the comms rooms.

Part of the problem here is not knowing where or when the fault is going to occur next to know which ones to experiment with. Today I've just had another 1,000+ touch panel entries for a different venue from the last time... sometimes it can go for 6 weeks between errors.

Ultimately I think having a small router in the presentation desk and the touch panel and master on the private side, with the connection to the comms room on the WAN might be the way to go. That way if the building comms goes down then the touch panel can still connect to the master, and it satisfies the no switch rule. (Ideally the touch panels would have an optional AX Link port to allow them to connect directly to the master over a control network...)

If the switch was too hot / weak for the panel, wouldn't the same apply for the master ? Also, if that was the case then I would expect to see a lot more errors and more frequently.

Dave

Dave_UK · August 2010

I think I may have just stumbled across the random element...

The systems are programmed with an automatic shutdown (powers off the projector) which kicks in if the system is left idle for several hours. I suspect that the touch panel problem occurs in relation to this automatic shutdown as opposed to a user-initiated shutdown. Hence the random factor, as some users of the system will manually shut down whereas others will just walk away and the automatic shutdown will kick in.

By chance, the touch panel of a system which was currently sending touch panel alerts to RMS, was found to be in an abnormal state this afternoon - apparently displaying parts of two menu screens simultaneously. The panel however was responsive enough to be able to return to the start screen and initiate a manual system shutdown. However the alerts to RMS / ping failures continued.

A few hours later, a system reboot was initiated and the alerts being sent to RMS stopped.

I suspect that the automatic shutdown somehow manages to partially crash the system, leaving the touch panel in an abnormal state and causing the network issues, which is why the netlinx master reports touch panel errors to RMS.

Dave

Dave_UK · October 2010

narrowing it down

Looks like it's related to the volume control. This weekend I was able to do some testing whilst the condition was present, and pulling out the power to the AXB-VOL3 controller stopped it.

Looking at some of the networking logs, there are large amounts of data flowing on the network to/from the touch panel and Netlinx controller. Around 2GB over a weekend.
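
As a rough sense of scale for that figure, the arithmetic below converts a total volume into an average rate. The 48-hour window and the decimal reading of 2GB (2 x 10^9 bytes) are assumptions; the logs quoted here give neither the exact period nor the unit definition.

```python
def average_rate(total_bytes, seconds):
    """Return the average rate as (bytes per second, kilobits per second)."""
    bytes_per_s = total_bytes / seconds
    return bytes_per_s, bytes_per_s * 8 / 1000


total = 2 * 10**9        # assumed: 2GB taken as 2e9 bytes
weekend = 48 * 3600      # assumed: a 48-hour weekend, in seconds

bps, kbit = average_rate(total, weekend)
print(f"{bps:.0f} bytes/s, {kbit:.1f} kbit/s")
# → 11574 bytes/s, 92.6 kbit/s
```

This is only an average. The traffic may have arrived in much heavier bursts, or as a very large number of small messages, and neither would show up in a weekend total.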

I suspect that it's something to do with an endless loop updating the volume slider on the touch panel, which eventually overloads the network interface causing it to become unresponsive to ping requests.

Can't be that simple though, as I can't seem to reproduce it on demand. Rebooting the touch panel / Netlinx controller isn't enough to clear it. Not all our installs have AXB-VOL3s either.

Dave
