
Event log

Hi everyone
I've been monitoring that NetLinx box (from another thread I started some time ago) that would lock up for no apparent reason, and every now and then I see that everything drops offline and comes right back online 1-2 seconds later. Has anyone else noticed this in the event log before? The box has now been up and running for about 2-3 months, since I added the UPS.

Comments

  • DHawthorne Posts: 4,584
    I've not seen everything drop offline and come back, but I have had sporadic individual items go offline and then come right back, mainly touch panels. It doesn't matter if they are wired or wireless. Sometimes there is a message that more memory has been allocated, as if the memory space used by the connection before the panel dropped offline was not released. If this happens enough, the entire system comes to a screeching halt.

    I've had my attention diverted by another project with a close deadline, but I have been planning on chasing this down more aggressively on the project of mine that is having all the difficulties. My current theory is that if a panel drops offline in the middle of a communications session (message passing), the buffers aren't cleared properly, and are sometimes even left in a wait state, even though the panel is offline. I believe it to be related to the other issue I have observed, with masters not always recognizing that a connection has been dropped.

    This doesn't speak to why the panels drop offline in the first place. When I saw it mostly in MVP-8400's, I chalked it up to wireless issues. But I installed a very rudimentary logging module that simply tracked online and offline events, and I saw it was happening with CV-12's on the job as well. On the project of mine that is exhibiting this behavior, there are multiple masters, and most of the panels are declared on more than one master; I don't know if that has any bearing on it, but I haven't seen the problem on a single-master system. However, if I am doing a single-master system, it's just a one-room theater anyway, and very straightforward.
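
    For anyone who wants to crunch a log like that offline, here is a rough Python sketch (not NetLinx, and not the actual logging module) of how online/offline events could be paired up to get the length of each dropout. The `(timestamp, device, state)` tuple layout, the function name `offline_durations`, and the sample data are all made up for illustration.

```python
from collections import defaultdict


def offline_durations(events):
    """Pair each 'offline' event with the next 'online' event per device.

    events: iterable of (timestamp_seconds, device, state) tuples, where
    state is 'online' or 'offline'. This layout is an assumption made for
    the sketch, not a real AMX log format.
    Returns {device: [seconds_offline, ...]} for completed dropouts only.
    """
    went_offline = {}                # device -> time it dropped offline
    durations = defaultdict(list)
    for ts, device, state in sorted(events):
        if state == "offline":
            # Keep the first offline time if duplicates are logged.
            went_offline.setdefault(device, ts)
        elif state == "online" and device in went_offline:
            durations[device].append(ts - went_offline.pop(device))
    return dict(durations)


# Hypothetical sample data, timestamps in seconds.
sample = [
    (0, "MVP-8400", "online"),
    (100, "MVP-8400", "offline"),
    (102, "MVP-8400", "online"),
    (300, "CV-12", "offline"),
    (312, "CV-12", "online"),
]
print(offline_durations(sample))
```

    Sorting the per-device durations makes it easier to tell quick 1-2 second blips from long outages.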

    The bottom line is I think AMX needs to examine the robustness of their IP routines. Most of the time, and under most circumstances, there is no problem, but every now and then, when problems do crop up, it's a complete nightmare.
  • Thomas Hayes Posts: 1,164
    Thanks Dave, let me know if you track it down. I attached a copy of the event log in question for anyone to view and perhaps answer.
  • maxifox Posts: 209
    We had a similar problem with Axlink devices, but it was caused mainly by wiring, power and some programming issues (your situation is different, since yours is a Netlinx controller).

    I guess it is better to ask AMX support about that...

    In any case, I would kindly ask you to share with us what the problem turned out to be...
  • pdabrowski Posts: 184
    DHawthorne wrote:
    and I saw it was happening with CV-12's on the job as well. On the project of mine that is exhibiting this behavior, there are multiple masters, and most of the panels are declared on more than one master; I don't know if that has any bearing on it, but I haven't seen the problem on a single-master system.
    I have 9 single-master ME-260 / CV-15 installs here (upgraded to NetLinx from AXCESS), all upgraded by the same contractor, and all of them exhibit the problem you are describing. We also have one install that we upgraded ourselves.

    The 9 from the contractor are showing TP offlines intermittently ranging from 12 seconds to 15 minutes duration coupled with random lockups from the master. The one we upgraded ourselves has in the past 12 months only locked up once, with no offlines noticed.

    I know the law of averages may be at work here, but I have taken one of the contractor's installs and removed some code that we don't include with our upgrades, and I haven't seen a repeat of the master lockup. The TP offlines are still happening, but less often now. In my opinion, different programming styles are a factor to consider, as well as hardware configuration.

    The cause of the offlines is very hard to find. We have tried everything we could think of to remedy the issue (crossover cable, local switch with and without a connection to the building LAN). The only thing that reduced the frequency of the offlines was setting the CV-15 network adapter to 10FullDuplex instead of AUTO, as advised here a while ago.
  • Been on this road

    When I first started coding AMX I started seeing these kinds of things and was just pulling my hair out. My mentor, who had been using AMX from basically its inception, told me to do the following troubleshooting. I was reluctant, if not a little annoyed, at his suggestions. But after successfully troubleshooting multiple installations with intermittent connection issues, I subscribe to this way of troubleshooting almost exclusively. Now, this is not a foolproof method, and I'm guessing from previous threads that all this has been done before, but looking at the bitmap you provided just brings back memories. My suggestion is to try these steps and see what works and what breaks:

    1. PRD (program run disable): this is DIP switch 1 on the master. With this active, the program in the processor will not run, which eliminates coding as a potential issue. If the system is just sitting there and STILL getting online: offline: events, then you can be certain that it is not entirely code related.

    2. Load blank code. Let it run and see if the events still occur. Once again, this can either rule out coding or point to it as the potential culprit.

    3. Remove AXLINK and let the processor run without any AXLINK connections. A short or reversed wiring in AXLINK will make things go batty! If you find that you get these online: offline: events with AXLINK connected, while without it you are running a stable system, then you can start to pick apart the AXLINK devices and hunt for the bad device. One of my favorite things that I've found on multiple jobs is installers crimping down AXLINK BEHIND the phoenix pins. This provides an intermittent connection to devices and can bring down the entire AXLINK bus.

    4. If you do find that AXLINK is a potential problem, DO THIS EVERY TIME: connect each AXLINK device one at a time. Wait a good minute or 2 before connecting another device, and don't connect a whole bunch of devices at once. This may seem infuriating, but believe me, you'll have a huge grin on your face when you connect a single AXB-TC and watch the entire AXLINK bus go down the toilet.

    5. Steps 3 and 4 can be used for ICSNET/HUB devices as well.
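
    The one-at-a-time routine in steps 3 and 4 boils down to a simple loop. Here it is as a small Python sketch, purely to show the logic. The function name, the `bus_is_stable` callback, and the placeholder device names are hypothetical; on a real job the "check" is a person watching the event log for a minute or two after each connection.

```python
def find_first_bad_device(devices, bus_is_stable):
    """Reconnect devices one at a time and stop at the first one that
    destabilizes the bus.

    devices: device labels, in the order they will be reconnected.
    bus_is_stable(connected): hypothetical check that returns True if the
        bus stays up with the given list of devices attached.
    Returns the offending device, or None if every device connects cleanly.
    """
    connected = []
    for device in devices:
        connected.append(device)          # plug in exactly one more device
        if not bus_is_stable(connected):  # then wait and watch the log
            return device
    return None


# Simulated bus in which one AXB-TC takes everything down.
devices = ["device-A", "AXB-TC", "device-B"]


def simulated_check(connected):
    return "AXB-TC" not in connected


print(find_first_bad_device(devices, simulated_check))
```

    This only finds the first offender. If more than one device is bad, set the culprit aside and run through the remaining devices again.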

    I've had tremendous success stabilizing systems where, in the past, everyone was pointing at the programmer and saying the immortal words "it's GOTTA be AMX programming". I vividly remember one job where we were seeing some peculiar behaviour from an HAI security system. I was sent in to examine the code and see what AMX was doing to cause it. The project manager was there showing me the steps and going "see! see! now why would AMX tell the security system to do that?" to which I replied "hmmm, that's really something, because guess what? I have the 232 connection physically unplugged from the HAI. So it must be communicating telepathically, or maybe over wireless 802.11Z or something."
  • DHawthorne Posts: 4,584
    Sadly, not many of those options are viable if you have a system that only acts up once a week, or every few weeks.

    I'm convinced it's an IP communications issue. A connection drops, and the master doesn't recognize the drop right away. Messages pile up in the queue, locking up the system, or crippling it dramatically, causing other devices to drop off, and making the problem continuously worse. I don't think network dropoffs can be completely avoided when you are sharing the network with other, non-AMX equipment (and maybe not even if it is dedicated, though I'm sure you can improve things). Add 802.11 to the mix, and you have a lot of unknowns. It's the robustness of the connection manager that needs looking at, how well it recognizes lost connections, and how well it recovers.
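
    To illustrate the kind of behavior I mean (this is only a sketch of the theory, not how the AMX connection manager actually works), here is a minimal Python example of a connection that keeps a bounded queue and discards pending messages once the peer has been silent past a timeout. The class name, the timeout, and the queue size are all assumptions picked for illustration.

```python
from collections import deque


class Connection:
    """Toy model of one panel connection with a bounded outgoing queue."""

    def __init__(self, timeout=5.0, max_queued=100):
        self.timeout = timeout                 # seconds of silence tolerated
        self.queue = deque(maxlen=max_queued)  # oldest entry dropped when full
        self.last_heard = 0.0                  # time of last traffic from peer

    def heard_from_peer(self, now):
        """Record that the peer sent something (data or a keepalive)."""
        self.last_heard = now

    def is_alive(self, now):
        return now - self.last_heard <= self.timeout

    def send(self, message, now):
        """Queue a message, or flush the queue if the peer looks dead."""
        if not self.is_alive(now):
            self.queue.clear()   # do not let messages pile up for a dead peer
            return False
        self.queue.append(message)
        return True


conn = Connection(timeout=5.0, max_queued=3)
conn.heard_from_peer(0.0)
for t, msg in [(1.0, "a"), (2.0, "b"), (3.0, "c"), (4.0, "d")]:
    conn.send(msg, t)
print(list(conn.queue))          # queue is capped at three entries
print(conn.send("e", 10.0))      # peer silent for 10 s, queue is flushed
```

    The point of the sketch is that both the queue size and the time to notice a dead peer are bounded, so one lost panel cannot starve the rest of the system.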

    I have greatly reduced the problems I have had by cutting way back on the amount of network traffic, but I can't seem to make them go entirely away. You can only optimize feedback so much before it isn't really feedback anymore.
  • Thomas Hayes Posts: 1,164
    The master in question has been replaced, and the old one has been running fine in my office. I've also replaced the TP, PS (6.5 amp), and VOL3, and re-ran the wiring. The code in question is running in about 20 other rooms with no problems. I have also added a UPS to the system. Basically, I'm at a loss as to where to look next.
  • Thomas,

    I have not read the entire thread, so forgive me if I am repeating something:

    Have you considered sources outside AMX? A bad RS-232 device may introduce voltages back on pin 5; I/O ports and relays can do the same. I have seen a control system brought down by a bad switcher before.

    Are there multiple power supplies in the system? Keep ground loops in mind if devices are plugged into different outlets. I once measured almost 50VAC difference between the ground pins on two adjacent outlets (obviously fed from a different transformer).

    What other devices are connected to those power groups? An HVAC motor tied to the same group as the outlet can cause some real funky power on that outlet, and typically this only happens during its startup.

    If you have not already done so, try running it without the complete LAN attached, with just the absolute minimum on your own router/switch. Use pre-made CAT-5 cables where possible.

    Just a few thoughts - after all it is a process of elimination.