On following your own advice ...
DHawthorne
Posts: 4,584
I don't know how many times I've advised others here: when you get random lockups, run a full notifications session for a while, run through the system functions, and make sure nothing is sending excessive updates.
Well, I did that once with the system in question, but a module or two were updated along the line. Then the lockups started happening. I changed WAPs and network switches, updated firmware, and generally pulled out what little remains of my hair. Still, every 2-3 weeks, I'd get a call from an irate customer that his system had locked up again. I had two controllers in the house running practically identical programs; only one had this issue, and they shared network equipment, so I was pretty well convinced I had a bad master.
Just before calling it in for a replacement, I thought, "Hm, when was the last time I checked notifications?" I had frequently checked the diagnostic messages via a telnet log, and there were no clues there. But notifications aren't errors ... so I ran it. And, very much to my embarrassment, I found the labels on my volume preset page were updating continuously. The receiver module was one of the modules I had updated since the first time I ran the test. The touch panel was getting hundreds of send commands a minute. Things were actually fine until the panel went offline (this customer tends to let it go to sleep between charges). When the panel dropped and a few of those messages started backing up, the whole controller locked up. The other system never had the problem because its preset labels were still at their default values. The changed ones just kept getting changed, again and again and again.
So, I reiterate my advice: unknown random lockups? Check your notifications. I really wish I had followed it better myself.
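For anyone wondering what the fix for this kind of flood looks like in principle, here is a minimal sketch of the usual guard: remember the last text sent to each label and skip the send when nothing has changed. This is Python for illustration only, not NetLinx and not the actual module code; `LabelSender`, the `send` callback, the address 101, and the label texts are all hypothetical names made up for the example.

```python
class LabelSender:
    """Forwards a label update to the panel only when the text actually changed."""

    def __init__(self, send):
        self._send = send   # hypothetical transport: callable(address, text)
        self._last = {}     # address -> last text actually sent

    def update(self, address, text):
        if self._last.get(address) == text:
            return False    # unchanged: suppress the redundant send
        self._last[address] = text
        self._send(address, text)
        return True


sent = []
sender = LabelSender(lambda address, text: sent.append((address, text)))

# Simulate a module that re-reports the same preset label on every poll.
for _ in range(300):
    sender.update(101, "Movie Night")
sender.update(101, "Late Night")

print(len(sent))  # 2
```

With a guard like this, 300 identical reports turn into a single send, so there is nothing to back up when the panel drops offline.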
Comments
Jeff
Not in this case - notifications are not errors, so the logger doesn't catch them. It's just that there were too darn many of them. At the time the master choked there were no error messages either; it just stopped cold because, well, it choked.
Yes, I caught on to that much. However, I sometimes can't resist responding anyway. And just so you know, the module makes a log file on the master that will capture whatever is seen in a telnet session with the "msg on" command, so you can go back to it long after such messages would normally have expired or even after a reboot.
I finally got around to installing your module and I must say that it is very, very nice. Well done and thanks for sharing! I've modified it to send a nightly email (sitrep) of the entire log when it creates the file for the new day and it's working great.
Happy Thanksgiving, y'all.