System lockup(s)
bano
Posts: 173
in AMX Hardware
I have an ongoing issue with 2 different clients. Their systems are locking up periodically. The commonalities between the 2 systems are NI-3000s, wireless Modero panels, and Request Multimedia music servers, for which I am using the provided modules downloaded from ARQ's website. When I open diagnostics, I get the following:
Line 8 :: ICSPTCPRx9::ProcessICSPPacketRX checksum error MsgID=009F (rx=24, expected=D6) Cmd=0581 - 18:14:33
Line 1 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=0F5D (rx=F1, expected=A3) Cmd=0581 - 01:01:17
Line 2 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=1BE5 (rx=85, expected=37) Cmd=0581 - 11:31:14
Line 41 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=38D4 (rx=91, expected=43) Cmd=0581 - 22:22:59
Line 44 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=4F13 (rx=E7, expected=99) Cmd=0581 - 14:35:47
Line 47 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=5E3D (rx=20, expected=D2) Cmd=0581 - 20:45:00
I also receive these messages:
Line 1 :: Memory Available = 24477248 <41088> - 13:38:17
Line 2 :: Memory Available = 24466688 <10560> - 13:40:18
Line 3 :: Memory Available = 24456128 <10560> - 13:42:18
Line 4 :: Memory Available = 24445568 <10560> - 13:44:18
Line 5 :: Memory Available = 24435008 <10560> - 13:46:17
Line 6 :: Memory Available = 24424448 <10560> - 13:48:17
Line 7 :: Memory Available = 24413888 <10560> - 13:50:18
Line 8 :: Memory Available = 24403328 <10560> - 13:52:18
Line 9 :: Memory Available = 24392768 <10560> - 13:54:18
Line 10 :: Memory Available = 24382208 <10560> - 13:56:18
Line 11 :: Memory Available = 24371648 <10560> - 13:58:18
Line 12 :: Memory Available = 24360896 <10752> - 14:00:18
Line 13 :: Memory Available = 24309208 <51688> - 14:00:23
Line 14 :: Memory Available = 24298648 <10560> - 14:02:22
Line 15 :: Memory Available = 24288088 <10560> - 14:04:22
I have engaged AMX tech support on both of these systems and both controllers were replaced, but the above messages persist. Can anyone shed any light on my problem(s) and provide suggestions for solutions?
Comments
Also, try commenting out the modules one at a time and see if the memory loss stops. If you organize your code in include files, you could also start commenting out that code and see if you can find something that is causing the memory loss.
Jeff
The Audio Request module 5.0? is a system hog. It wasn't designed very well: it creates many structures that cause problems in systems with many touch panels. You can get the module code from Audio Request; it is a good starting point to rebuild from. I have talked to them about the problem, but I don't know if they are going to fix it. It is a shame, because the unit is very good.
Also:
Line 1 :: Memory Available = 24477248 <41088> - 13:38:17
These lines are not errors; the system always reports Memory Available.
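Even though the reports themselves are not errors, the trend in them can be worth measuring. Below is a minimal Python sketch that parses the Memory Available lines from the first post and estimates how fast free memory is falling. It assumes the number in angle brackets is the drop since the previous report (this matches the arithmetic in the log, but it is an inference, not something documented in this thread) and that all timestamps fall on the same day. The function names are made up for illustration.

```python
import re
from datetime import datetime

# Memory reports copied from the diagnostics output in the first post.
LOG = """\
Line 1 :: Memory Available = 24477248 <41088> - 13:38:17
Line 2 :: Memory Available = 24466688 <10560> - 13:40:18
Line 3 :: Memory Available = 24456128 <10560> - 13:42:18
Line 4 :: Memory Available = 24445568 <10560> - 13:44:18
Line 5 :: Memory Available = 24435008 <10560> - 13:46:17
Line 6 :: Memory Available = 24424448 <10560> - 13:48:17
Line 7 :: Memory Available = 24413888 <10560> - 13:50:18
Line 8 :: Memory Available = 24403328 <10560> - 13:52:18
Line 9 :: Memory Available = 24392768 <10560> - 13:54:18
Line 10 :: Memory Available = 24382208 <10560> - 13:56:18
Line 11 :: Memory Available = 24371648 <10560> - 13:58:18
Line 12 :: Memory Available = 24360896 <10752> - 14:00:18
Line 13 :: Memory Available = 24309208 <51688> - 14:00:23
Line 14 :: Memory Available = 24298648 <10560> - 14:02:22
Line 15 :: Memory Available = 24288088 <10560> - 14:04:22
"""

REPORT = re.compile(r"Memory Available = (\d+) <(\d+)> - (\d\d:\d\d:\d\d)")


def parse_reports(log):
    """Return (timestamp, free_bytes, reported_drop) for each report line."""
    return [
        (datetime.strptime(clock, "%H:%M:%S"), int(free), int(drop))
        for free, drop, clock in REPORT.findall(log)
    ]


def bytes_lost_per_second(reports):
    """Average decline in free memory between the first and last report."""
    (start, free_start, _), (end, free_end, _) = reports[0], reports[-1]
    return (free_start - free_end) / (end - start).total_seconds()


reports = parse_reports(LOG)
rate = bytes_lost_per_second(reports)
hours_left = reports[-1][1] / rate / 3600

print(f"average loss: {rate:.0f} bytes/s")
print(f"free memory would reach zero in about {hours_left:.0f} hours at that rate")
```

On this excerpt the loss works out to roughly 120 bytes per second, which would use up the remaining memory in a little over two days if nothing is ever freed. That is only an extrapolation from 26 minutes of data, but it is in line with symptoms that take hours or days to show up.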
Yes, all devices are set at 10 half, along with the UDP BC rate at 0. I am using Linksys WAPs and switches. I do use include files, so commenting out sections of code is not a problem. The real frustration is that these symptoms take hours, sometimes days, to start manifesting themselves, long after I've left the client's house. With one system I've gone so far as to define all variables as persistent and program the system to reboot every 4 hours. That's just wrong, but I'm running out of options and of the clients' patience.
I'm wondering if using the RS-232 option instead of IP might help?
I'm using three MVP-8400s on one system; the other system uses one MVP-7500 and one CV7.
Jeff
P.S.
Has anyone used the AMX APs and how do they compare to other commercial grade APs?
I don't think the number of panels is the problem. I would start removing code until the trouble goes away.
It is available as TN737.
I have used this before to solve memory issues.
I use the AMX APs and they have been great since firmware 5.4.P2.556.bin
Before that the devices were useless.
My guess is something is falling off line, and efforts are being made to reconnect it before the prior connection was properly cleaned up. I believe this to be an inherent flaw with NetLinx devices and the current firmware; connection management is not as robust as is necessary when there are any irregularities in the connections. The only current solution, if this is in fact the case, is to clean up the connection irregularities.
I have sworn off the use of any Linksys product with NetLinx. I have simply had too many issues with them. I prefer Netgear these days (and their WAP blows away the AMX offering for a fifth of the cost). Of course, you have to look at all the usual suspects in terms of interference, adjacent or rogue access points, all that. I would also recommend assigning static IPs to all your NetLinx TCP connections to eliminate any potential DHCP server wonkiness.
By the time any of us have attained expert status with NetLinx, we will be fair IT persons as well.
1. the channels are non-overlapping (e.g. 1, 6, 11, and 14)
2. there is no notable wi-fi interference
You may use NetStumbler & Wi-Spy for checking...
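For the channel check, a rough Python sketch of the arithmetic follows. It uses the standard 2.4 GHz centre frequencies and assumes a 22 MHz channel width (the usual 802.11b figure); with wider channels the result would differ. The helper names are hypothetical. Note that channel 14 is not permitted in most regulatory domains, so in practice the usable set is often just 1, 6 and 11.

```python
from itertools import combinations

CHANNEL_WIDTH_MHZ = 22  # assumption: 802.11b-style 22 MHz channels


def center_mhz(channel):
    """Centre frequency of a 2.4 GHz Wi-Fi channel in MHz."""
    if channel == 14:
        return 2484  # channel 14 sits apart from the regular 5 MHz spacing
    if 1 <= channel <= 13:
        return 2407 + 5 * channel
    raise ValueError(f"not a 2.4 GHz channel: {channel}")


def overlapping_pairs(channels):
    """Return the channel pairs whose centres are closer than one channel width."""
    return [
        (a, b)
        for a, b in combinations(sorted(channels), 2)
        if abs(center_mhz(a) - center_mhz(b)) < CHANNEL_WIDTH_MHZ
    ]


print(overlapping_pairs([1, 6, 11, 14]))  # prints []
print(overlapping_pairs([1, 3, 6, 11]))   # prints [(1, 3), (3, 6)]
```

This only tells you whether your own access points step on each other. Neighbouring networks and non-Wi-Fi interference still need a site survey with tools like the ones mentioned above.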
With the aid of tech support I was finally able to identify and fix the lockup problems. First, the checksum errors (Line 1 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=0F5D (rx=F1, expected=A3) Cmd=0581 - 01:01:17) were generated by some anomaly between the diagnostics program and the NetLinx master and had nothing to do with the lockups. I discovered this by having both diagnostics and telnet open simultaneously. The error would break the link between diagnostics and the master, which I could observe in the telnet session.
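One detail in the checksum errors from the first post fits this explanation. The short Python sketch below pulls the rx and expected values out of those six lines and computes the difference modulo 256. Treating the checksum as a single byte where a modulo-256 difference is meaningful is an assumption here; how ICSP actually computes it is not described in this thread.

```python
import re

# Checksum errors copied from the diagnostics output in the first post.
LOG = """\
Line 8 :: ICSPTCPRx9::ProcessICSPPacketRX checksum error MsgID=009F (rx=24, expected=D6) Cmd=0581 - 18:14:33
Line 1 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=0F5D (rx=F1, expected=A3) Cmd=0581 - 01:01:17
Line 2 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=1BE5 (rx=85, expected=37) Cmd=0581 - 11:31:14
Line 41 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=38D4 (rx=91, expected=43) Cmd=0581 - 22:22:59
Line 44 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=4F13 (rx=E7, expected=99) Cmd=0581 - 14:35:47
Line 47 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=5E3D (rx=20, expected=D2) Cmd=0581 - 20:45:00
"""

CHECKSUMS = re.compile(r"rx=([0-9A-F]{2}), expected=([0-9A-F]{2})")


def checksum_offsets(log):
    """Return (rx - expected) mod 256 for every checksum error in the log."""
    return [
        (int(rx, 16) - int(expected, 16)) % 256
        for rx, expected in CHECKSUMS.findall(log)
    ]


offsets = checksum_offsets(LOG)
print([f"0x{o:02X}" for o in offsets])
# prints ['0x4E', '0x4E', '0x4E', '0x4E', '0x4E', '0x4E']
```

Every error is off by the same amount, and every one carries the same Cmd=0581. Random corruption on the wire would be expected to scatter the differences, so a fixed offset looks more like one particular message being handled consistently wrong by one side of the link than like a noisy network.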
I also had a memory leak that was identified as the cause of the system lockups. For those who use the "0" or wildcard value in button events, be careful. I use them primarily when dedicating a touch panel port to a device or subsystem control, as opposed to writing large integer arrays. After reviewing my source code, tech support suggested that I replace these wildcard button events with integer arrays. It worked! The memory leaks that led to the system lockups have disappeared and all is well. I am, however, still troubled by the fact that I cannot replicate the problem on my NI-2000 at home. What I don't have at home is a wireless Modero panel that I can use for testing. This leads me to suspect that the problem lies between the panel and the controller, perhaps in firmware. I've tried different combinations of master/device firmware to no avail. I could not roll back the firmware on my client's Moderos without disabling their functionality. Currently I have several systems using wireless Modero panels and NIs, both with older firmware, running source code with wildcard button events, and they work fine. Any feedback or suggestions would be greatly appreciated. I want my wildcard back!