
System lockup(s)

I have an ongoing issue with 2 different clients. Their systems are locking up periodically. The commonalities between the 2 systems are NI3000's, wireless Modero panels and Request Multimedia music servers, for which I am using the provided modules downloaded from ARQ's website. When I open diagnostics, I get the following:

Line 8 :: ICSPTCPRx9::ProcessICSPPacketRX checksum error MsgID=009F (rx=24, expected=D6) Cmd=0581 - 18:14:33
Line 1 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=0F5D (rx=F1, expected=A3) Cmd=0581 - 01:01:17
Line 2 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=1BE5 (rx=85, expected=37) Cmd=0581 - 11:31:14
Line 41 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=38D4 (rx=91, expected=43) Cmd=0581 - 22:22:59
Line 44 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=4F13 (rx=E7, expected=99) Cmd=0581 - 14:35:47
Line 47 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=5E3D (rx=20, expected=D2) Cmd=0581 - 20:45:00

I also receive these messages:

Line 1 :: Memory Available = 24477248 <41088> - 13:38:17
Line 2 :: Memory Available = 24466688 <10560> - 13:40:18
Line 3 :: Memory Available = 24456128 <10560> - 13:42:18
Line 4 :: Memory Available = 24445568 <10560> - 13:44:18
Line 5 :: Memory Available = 24435008 <10560> - 13:46:17
Line 6 :: Memory Available = 24424448 <10560> - 13:48:17
Line 7 :: Memory Available = 24413888 <10560> - 13:50:18
Line 8 :: Memory Available = 24403328 <10560> - 13:52:18
Line 9 :: Memory Available = 24392768 <10560> - 13:54:18
Line 10 :: Memory Available = 24382208 <10560> - 13:56:18
Line 11 :: Memory Available = 24371648 <10560> - 13:58:18
Line 12 :: Memory Available = 24360896 <10752> - 14:00:18
Line 13 :: Memory Available = 24309208 <51688> - 14:00:23
Line 14 :: Memory Available = 24298648 <10560> - 14:02:22
Line 15 :: Memory Available = 24288088 <10560> - 14:04:22

I have engaged AMX tech support on both of these systems and both controllers were replaced, but the above messages persist. Can anyone shed any light on my problem(s) and provide suggestions for solutions?

Comments

  • Spire_Jeff Posts: 1,917
    It looks to me like a network issue. Have you set all of the devices to 10 Half Duplex speed? Also, what brand of network switch/AP are you using?

    Also, try commenting out the modules one at a time and see if the memory loss stops. If you organize your code in include files, you could also start commenting out that code and see if you can find something that is causing the memory loss.

    Jeff
  • GSLogic Posts: 562
    How many touch panels are connected to the system?

    The Audio Request module 5.0? is a system hog, it wasn't designed very well because it creates many structures that cause problems in systems with many touch panels. You can get the module code from Audio Request, it is a good starting point to rebuild from. I have talked to them about the problem but I don't know if they are going to fix it. It is a shame because the unit is very good.

    Also:
    Line 1 :: Memory Available = 24477248 <41088> - 13:38:17
    These lines are not errors, the system is always reporting Memory Available.
  • bano Posts: 173
    Spire_Jeff wrote:
    It looks to me like a network issue. Have you set all of the devices to 10 Half Duplex speed? Also, what brand of network switch/AP are you using?

    Also, try commenting out the modules one at a time and see if the memory loss stops. If you organize your code in include files, you could also start commenting out that code and see if you can find something that is causing the memory loss.

    Jeff

    Yes, all devices are set to 10 half duplex, with the UDP broadcast (bc) rate at 0. I am using Linksys WAPs and switches. I do use include files, so commenting out sections of code is not a problem. The real frustration is that these symptoms take hours, sometimes days, to start manifesting themselves, long after I've left the client's house. With one system I've gone so far as to define all variables as persistent and program the system to reboot every 4 hours. That's just wrong, but I'm running out of options and my clients' patience.
  • bano Posts: 173
    GSLogic wrote:
    How many touch panels are connected to the system?

    The Audio Request module 5.0? is a system hog, it wasn't designed very well because it creates many structures that cause problems in systems with many touch panels. You can get the module code from Audio Request, it is a good starting point to rebuild from. I have talked to them about the problem but I don't know if they are going to fix it. It is a shame because the unit is very good.

    Also:
    Line 1 :: Memory Available = 24477248 <41088> - 13:38:17
    These lines are not errors, the system is always reporting Memory Available.

    I'm wondering if using the rs-232 option instead of ip might help?
  • bano Posts: 173
    bano wrote:
    I'm wondering if using the rs-232 option instead of ip might help?

    I'm using 3-MVP-8400's on one system, the other system uses 1-MVP-7500 and 1-CV7.
  • Spire_Jeff Posts: 1,917
    Not sure if anyone else has had any experience with Linksys, but my experience with Linksys has not been good when there are two or more separate Linksys units on the same physical network. Running a Linksys router and a separate AP would be almost usable. Putting 2 Linksys APs on the same network would only work for maybe a day before a reboot was necessary. My suggestion would be to try a different brand of router and AP and see if the problems clear up. I have had no problems with D-Link and I know there are some people here that use Netgear. Just a suggestion you might try.

    Jeff

    P.S.
    Has anyone used the AMX APs and how do they compare to other commercial grade APs?
  • GSLogic Posts: 562
    bano wrote:
    I'm using 3-MVP-8400's on one system, the other system uses 1-MVP-7500 and 1-CV7.

    I don't think the number of panels is the problem. I would start removing code until the trouble goes away.
  • champ Posts: 261
    Try adding the Queue_and_threshold_sizes.axi to your code.
    It is available as TN737.
    I have used this before to solve memory issues.

    I use the AMX AP's and they are great since firmware 5.4.P2.556.bin
    Before that the devices were useless.
  • DHawthorne Posts: 4,584
    The memory available messages are showing a memory allocation taking place, and the size of the memory block allocated is in the angle brackets. Since so many are the same size, that might be a clue that something is being re-allocated too frequently, before prior memory use was released. The other messages point to the TCP stack.
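The reading above can be checked against the posted log. The following is a minimal Python sketch, not an AMX tool: it assumes the log format is exactly as pasted in the original post, that all readings fall on the same day, and that the loss continues linearly. The names `parse`, `bracket_matches_drop` and `leak_rate` are made up for this illustration.

```python
import re
from datetime import datetime

# Diagnostics output copied from the original post.
LOG = """\
Line 1 :: Memory Available = 24477248 <41088> - 13:38:17
Line 2 :: Memory Available = 24466688 <10560> - 13:40:18
Line 3 :: Memory Available = 24456128 <10560> - 13:42:18
Line 4 :: Memory Available = 24445568 <10560> - 13:44:18
Line 5 :: Memory Available = 24435008 <10560> - 13:46:17
Line 6 :: Memory Available = 24424448 <10560> - 13:48:17
Line 7 :: Memory Available = 24413888 <10560> - 13:50:18
Line 8 :: Memory Available = 24403328 <10560> - 13:52:18
Line 9 :: Memory Available = 24392768 <10560> - 13:54:18
Line 10 :: Memory Available = 24382208 <10560> - 13:56:18
Line 11 :: Memory Available = 24371648 <10560> - 13:58:18
Line 12 :: Memory Available = 24360896 <10752> - 14:00:18
Line 13 :: Memory Available = 24309208 <51688> - 14:00:23
Line 14 :: Memory Available = 24298648 <10560> - 14:02:22
Line 15 :: Memory Available = 24288088 <10560> - 14:04:22
"""

PATTERN = re.compile(r"Memory Available = (\d+) <(\d+)> - (\d{2}:\d{2}:\d{2})")


def parse(log):
    """Return a list of (bytes_available, bracket_value, timestamp) tuples."""
    return [
        (int(m.group(1)), int(m.group(2)), datetime.strptime(m.group(3), "%H:%M:%S"))
        for m in PATTERN.finditer(log)
    ]


def bracket_matches_drop(rows):
    """True if each bracketed value equals the drop since the previous reading."""
    return all(prev[0] - cur[0] == cur[1] for prev, cur in zip(rows, rows[1:]))


def leak_rate(rows):
    """Average bytes lost per second between the first and last reading."""
    elapsed = (rows[-1][2] - rows[0][2]).total_seconds()
    return (rows[0][0] - rows[-1][0]) / elapsed


rows = parse(LOG)
rate = leak_rate(rows)
hours_left = rows[-1][0] / rate / 3600  # linear extrapolation, an assumption

print("bracket equals drop:", bracket_matches_drop(rows))
print(f"average loss: {rate:.1f} bytes/s")
print(f"hours until exhaustion at this rate: {hours_left:.1f}")
```

On this sample every bracketed value equals the drop from the previous reading, and the average loss is on the order of 120 bytes per second. If that rate held, free memory would run out in a couple of days, which is at least consistent with lockups that take hours or days to appear.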

    My guess is something is falling off line, and efforts are being made to reconnect it before the prior connection was properly cleaned up. I believe this to be an inherent flaw with NetLinx devices and the current firmware; connection management is not as robust as is necessary when there are any irregularities in the connections. The only current solution, if this is in fact the case, is to clean up the connection irregularities.

    I have sworn off the use of any Linksys product with NetLinx. I have simply had too many issues with them. I prefer Netgear these days (and their WAP blows away the AMX offering for a fifth of the cost). Of course, you have to look at all the usual suspects in terms of interference, adjacent or rogue access points, all that. I would also recommend assigning static IPs to all your NetLinx TCP connections to eliminate any potential DHCP server wonkiness.

    By the time any of us have attained expert status with NetLinx, we will be fair IT persons as well.
  • maxifox Posts: 209
    To me it also looks like a network issue. Since you said you use a few wireless APs, could you please make sure that

    1. the channels are non-overlapping (e.g. 1, 6, 11, and 14)
    2. there is no notable wi-fi interference

    You may use NetStumbler & Wi-Spy for checking...
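The channel check in item 1 can be sketched in Python. This assumes the 2.4 GHz band with the standard channel centre frequencies and a 22 MHz channel width (the 802.11b figure, an assumption here; 802.11g/n channels are nominally 20 MHz). The function names are made up for this example, and channel 14 is only permitted in Japan.

```python
from itertools import combinations


def center_mhz(channel):
    """Centre frequency of a 2.4 GHz Wi-Fi channel in MHz."""
    if channel == 14:
        return 2484  # channel 14 sits off the regular 5 MHz grid
    if 1 <= channel <= 13:
        return 2407 + 5 * channel
    raise ValueError(f"not a 2.4 GHz channel: {channel}")


def overlaps(a, b, width_mhz=22):
    """True if two channels of the given width share any spectrum."""
    return abs(center_mhz(a) - center_mhz(b)) < width_mhz


def conflicts(channels, width_mhz=22):
    """List every pair of configured channels that overlap."""
    return [
        (a, b)
        for a, b in combinations(channels, 2)
        if overlaps(a, b, width_mhz)
    ]


print(conflicts([1, 6, 11, 14]))  # the plan suggested above
print(conflicts([1, 3, 6]))       # a plan with neighbouring channels
```

An empty list means the configured APs do not overlap each other. It says nothing about neighbouring networks, which is what the survey tools mentioned above are for.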
  • bano Posts: 173
    With the aid of tech support I was finally able to identify and fix the lockup problems. First, the checksum errors (Line 1 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=0F5D (rx=F1, expected=A3) Cmd=0581 - 01:01:17) were generated by some anomaly between the diagnostics program and the NetLinx master, and had nothing to do with the lockups. I was able to discover this by having both diagnostics and telnet open simultaneously. The error would break the link between diagnostics and the master, which I could observe in the telnet session.
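One way to look for structure in the six checksum errors from the original post is to compare the received and expected bytes. The Python sketch below only does arithmetic on the logged values. It assumes nothing about how ICSP actually computes its checksum, and `checksum_offsets` is a name invented for this example.

```python
import re

# Checksum errors copied from the original post.
LOG = """\
Line 8 :: ICSPTCPRx9::ProcessICSPPacketRX checksum error MsgID=009F (rx=24, expected=D6) Cmd=0581 - 18:14:33
Line 1 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=0F5D (rx=F1, expected=A3) Cmd=0581 - 01:01:17
Line 2 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=1BE5 (rx=85, expected=37) Cmd=0581 - 11:31:14
Line 41 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=38D4 (rx=91, expected=43) Cmd=0581 - 22:22:59
Line 44 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=4F13 (rx=E7, expected=99) Cmd=0581 - 14:35:47
Line 47 :: ICSPTCPRx17::ProcessICSPPacketRX checksum error MsgID=5E3D (rx=20, expected=D2) Cmd=0581 - 20:45:00
"""

PAIR = re.compile(r"rx=([0-9A-F]{2}), expected=([0-9A-F]{2})")


def checksum_offsets(log):
    """Return (rx - expected) modulo 256 for every checksum error line."""
    return [
        (int(rx, 16) - int(expected, 16)) % 256
        for rx, expected in PAIR.findall(log)
    ]


offsets = checksum_offsets(LOG)
print([f"0x{offset:02X}" for offset in offsets])
print("constant offset" if len(set(offsets)) == 1 else "varying offsets")
```

In this sample the received byte is always the expected byte plus 0x4E (modulo 256), and every error carries the same Cmd=0581. A fixed offset would be unusual for random corruption on a noisy link, so it fits the explanation that one particular exchange with the diagnostics program was at fault. It is only a heuristic, though.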

    I also had a memory leak that was identified as the cause of the system lockups. For those that use the '0' or wildcard value in button events, be careful. I use them primarily when dedicating a touch panel port to a device or subsystem control, as opposed to writing large integer arrays. After reviewing my source code, tech support suggested that I replace these wildcard button events with integer arrays. It worked! The memory leaks that led to the system lockups have disappeared and all is well.

    I am, however, still troubled by the fact that I cannot replicate the problem on my NI-2000 at home. What I don't have at home is a wireless Modero panel that I can use for testing. This leads me to suspect that the problem lies between the panel and the controller, perhaps in firmware. I've tried different combinations of master/device firmware to no avail. I could not roll back the firmware on my client's Moderos without disabling their functionality. I currently have several systems with wireless Modero panels and NIs, both on older firmware, whose source code uses wildcard button events and works fine. Any feedback or suggestions would be greatly appreciated. I want my wildcard back!