Initializing VARs in Loops
vining
Posts: 4,368
Given these 2 options, which would you prefer?
Initialize the VARs in the beginning so that they don't have to be created on every pass of the for loop. (No scope issue and VARs are set to value prior to use anyway)
DEFINE_FUNCTION fnHWI_FB_DL_One(INTEGER iOutIndx)
{
    STACK_VAR INTEGER n ;
    STACK_VAR INTEGER nPageStart ;
    STACK_VAR INTEGER nOutOffset ;

    for(n = 1 ; n <= HWI_NUM_UIs ; n++)
    {
        nPageStart = (sHWI_UI[n].nPage * 48) + 1 ;
        if(iOutIndx >= nPageStart && iOutIndx <= nPageStart + 48)
        {
            nOutOffset = iOutIndx - (sHWI_UI[n].nPage * 48) ;
            fnHWI_FB_SendCHNL(n,nOutOffset + 100,sHWI.sOutput[iOutIndx].nDLevel > 0) ;
            fnHWI_FB_SendLVL(n,(HWI_LVL_OUTPUTS - 1) + nOutOffset,sHWI.sOutput[iOutIndx].nDLevel) ;
            if(nOutOffset == sHWI_UI[n].nSelOut)
            {
                fnHWI_FB_SendLVL(n,HWI_LVL_MAIN_BARG,sHWI.sOutput[iOutIndx].nDLevel) ;
            }
        }
    }
    RETURN ;
}
Or.. what I would normally do and keep the scope as small as possible which is really just a force of habit.
DEFINE_FUNCTION fnHWI_FB_DL_One(INTEGER iOutIndx)
{
    STACK_VAR INTEGER n ;

    for(n = 1 ; n <= HWI_NUM_UIs ; n++)
    {
        STACK_VAR INTEGER nPageStart ;

        nPageStart = (sHWI_UI[n].nPage * 48) + 1 ;
        if(iOutIndx >= nPageStart && iOutIndx <= nPageStart + 48)
        {
            STACK_VAR INTEGER nOutOffset ;

            nOutOffset = iOutIndx - (sHWI_UI[n].nPage * 48) ;
            fnHWI_FB_SendCHNL(n,nOutOffset + 100,sHWI.sOutput[iOutIndx].nDLevel > 0) ;
            fnHWI_FB_SendLVL(n,(HWI_LVL_OUTPUTS - 1) + nOutOffset,sHWI.sOutput[iOutIndx].nDLevel) ;
            if(nOutOffset == sHWI_UI[n].nSelOut)
            {
                fnHWI_FB_SendLVL(n,HWI_LVL_MAIN_BARG,sHWI.sOutput[iOutIndx].nDLevel) ;
            }
        }
    }
    RETURN ;
}

I would think the 2nd method would be a little harder on the processor, since the VAR has to be created on every pass of the loop, while in the 1st method the VARs are created once and only the values change on each pass.
I'm thinking I should do the 1st method and I realize it's not a real big deal either way but I'm just curious to know what others would do or what they think.
Comments
This question wasn't about scope but whether it's better to initialize prior to the loop (assuming scope isn't an issue). How much harder is it for the processor to do one way versus the other?
Yep. From the NetLinx Language Reference Guide:
As stack memory is just drawn from the heap, I'd imagine it would be a relatively trivial pointer assignment for each iteration; whether this has any noticeable effect on performance would be interesting to find out. I'm currently putting together some benchmarking tools for the NetLinx Common Libraries project, so I'll compare the two alternatives through them when that's done.
In most modern programming languages you want to limit scope as much as possible, but it's for style reasons, not for any processing reason. To decide which way to write your function, I would use style criteria long before worrying about what the processor is doing. For a function of that size, I would have them all at the top, like your first example.
Paul
+1
Agreed.
In this case I think minimizing scope to an anal degree should lose out to the minuscule performance gain of method 1. So we agree on method 1, but I think for different reasons.
Anyway, it's not really about style, since most folks looking at my code would say I have none or that it's all wrong, but about what impact declaring a stack var in the loop has rather than declaring it before. Sort of like using <= LENGTH_ARRAY(ARRAY) in the loop condition instead of creating a VAR beforehand to hold that value, so each iteration of the loop doesn't have to run that function again and again and again. So I would think, regardless of "style" concepts or "scope", caching the value before the loop would be better than calling the function on every pass, since we don't repeat that work on every iteration (a sketch of both follows below). Again, it's anal and I would think of minuscule performance consequence or gain, but I would like to know how much performance is gained or lost just for the sake of knowing.
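Something along these lines (illustrative only; nDevs and the function names here are made up for the example):

DEFINE_VARIABLE
VOLATILE INTEGER nDevs[10]    // hypothetical array for the example

// Option A: call LENGTH_ARRAY() on every pass of the loop
DEFINE_FUNCTION fnLoopCallEveryPass()
{
    STACK_VAR INTEGER n ;

    for(n = 1 ; n <= LENGTH_ARRAY(nDevs) ; n++)    // function call runs each iteration
    {
        // do something with nDevs[n]
    }
}

// Option B: cache the length in a VAR before the loop
DEFINE_FUNCTION fnLoopCacheLength()
{
    STACK_VAR INTEGER n ;
    STACK_VAR INTEGER nLen ;

    nLen = LENGTH_ARRAY(nDevs) ;    // function call runs once
    for(n = 1 ; n <= nLen ; n++)
    {
        // do something with nDevs[n]
    }
}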
I was under the impression that NetLinx forced you to declare STACK_VAR and LOCAL_VAR before any other lines in a function/event. I just moved a LOCAL_VAR in one of my programs to the middle of a function and got the error C10567: Unrecognized node type [423] in compileStatement()
I.e. - declaring the var at the top of the function works; moving it below other statements does not.
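A minimal reconstruction of the sort of thing being compared (the variable names here are made up):

// Works: the LOCAL_VAR is declared before any other statement
DEFINE_FUNCTION fnDeclareFirst()
{
    LOCAL_VAR INTEGER nCount ;

    nCount = nCount + 1 ;
}

// Fails with C10567: the LOCAL_VAR appears after an executable statement
DEFINE_FUNCTION fnDeclareMidway()
{
    STACK_VAR INTEGER n ;

    n = 1 ;
    LOCAL_VAR INTEGER nCount ;    // compiler error here
    nCount = nCount + n ;
}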
Actually that's not as expensive as you may think. In NetLinx, arrays are stored as a header which contains the length, followed by the data (at least that's what we were told in P3), so length_array() / length_string() just look up that header field. Although, as you said, it's a lot nicer not to have to call a function (regardless of how spendy) in the check condition of each loop iteration.
@mpullin
From the line beneath that last quote in the ref guide:
But no, as jj pointed out 'a block' apparently means anywhere there are curly braces, not just an event or function. So you can in fact declare your stack vars right before they are needed, as long as you have a structure with curly braces in play. You can even inject a set of curly braces with no condition to enter the block. Not that I'd be tempted to do this...
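For instance, something like this (illustrative only):

DEFINE_FUNCTION fnBareBraces()
{
    STACK_VAR INTEGER n ;

    n = 1 ;
    {    // bare braces: no if/for/while needed to open a block
        STACK_VAR INTEGER nTemp ;    // legal, since it's at the start of the new block
        nTemp = n * 2 ;
    }
}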
Over 1,000,000 loop iterations, initializing the stack_var externally works out to be approximately 0.027ms per run. When the stack_var is initialized within the loop, each iteration takes approximately 0.106ms. With a blazing 79μs performance boost I highly doubt it's going to be the cause of bottlenecks in anyone's code.
For reference, here's the gist of the test code.
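A rough sketch of how such a benchmark can be structured (a sketch, not the exact code; it assumes a simple assignment as the loop body and uses GET_TIMER, which counts in 1/10 second ticks, so per-iteration time is roughly ticks * 100 / NUM_RUNS milliseconds):

DEFINE_CONSTANT
LONG NUM_RUNS = 1000000

// Variant 1: stack_var declared outside the loop
DEFINE_FUNCTION fnBenchOutside()
{
    STACK_VAR LONG i ;
    STACK_VAR LONG nStart ;
    STACK_VAR INTEGER nTmp ;

    nStart = GET_TIMER ;
    for(i = 1 ; i <= NUM_RUNS ; i++)
    {
        nTmp = 1 ;
    }
    SEND_STRING 0,"'outside: ',ITOA(GET_TIMER - nStart),' ticks'" ;
}

// Variant 2: stack_var declared inside the loop body
DEFINE_FUNCTION fnBenchInside()
{
    STACK_VAR LONG i ;
    STACK_VAR LONG nStart ;

    nStart = GET_TIMER ;
    for(i = 1 ; i <= NUM_RUNS ; i++)
    {
        STACK_VAR INTEGER nTmp ;    // created fresh on every pass
        nTmp = 1 ;
    }
    SEND_STRING 0,"'inside: ',ITOA(GET_TIMER - nStart),' ticks'" ;
}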
Paul
I got bored for 5 minutes at lunch and just tried it out of curiosity.
[edit]
Just noticed the decimal point, so it's an even less dramatic change, but still nearly 4x faster nonetheless.
[/edit]
Thxs for the test!
Speaking on my motivations only: sometimes I explore code concepts for enjoyment. I often fish for big fish on one of the larger lakes around here. This means prepping my boat, launching the boat, driving to a location, and then spending countless hours trying to catch fish of the appropriate size using all sorts of lures. Sometimes I just like to go to some little inland lake or pond that simply requires a piece of hotdog, a hook, 10 feet of line, and a stick to catch countless little fish.
I enjoy both. It is fun when I finally land one of the big fish, but sometimes I just want to catch ANY fish without the time expenditure and frustration.
Now back to code. Sometimes the never-ending work-related code crunch gets monotonous or frustrating. Sometimes I just need to explore a problem that doesn't require resolution. If I hit a wall, I can just drop it until I feel like exploring it again... which may be never. If something beneficial on the work side results from this playing, even better, but the real reward is my sanity and a recharged attitude toward coding. One other benefit is that as I am working on something like this, occasionally I will have a moment of clarity or simply blindly stumble across a method that fixes or enhances my work-related code.
To summarize, I am not doing this because of the extra processor cycles gained, but I am not going to ignore the gain unless it makes the code completely obfuscated and unusable. I might also try to find a balance between performance and usability.
Jeff
There are many good reasons to optimize code, and I am doing that right now in a module I am writing, but chasing 79 microseconds over a million loop iterations just doesn't seem worth the time to think about to me. I will suffer obfuscatory code if it achieves O(n) performance as opposed to O(n²), but to save a few microseconds in a loop with 1,000,000 iterations? It just seems crazy to me to worry about that when there are so many other pressing optimizations to achieve in NetLinx.
If you want a good optimization challenge, you might try writing a function that parses large XML files that are too big to fit in memory, or a faster way to replace characters in a string (the AMX tech note function is terribly slow), or come up with a way to make hash tables in NetLinx for fast searching and command response parsing. NetLinx has a ton of areas where optimization is needed, so that is why I was surprised you were spending brain cycles on a stack_var as opposed to a global variable, where any "gain" achieved is simply dwarfed by the other inefficiencies going on, to the point of being infinitesimal.
I am not sure whether calling length_array every iteration of a loop is that much slower than storing the value first, but storing it is something I just do: it can't be slower, may be faster (though likely not appreciably so), and makes the code look better.
Paul
P.S. I've looked into a native hash table implementation a bit but find it hard to imagine a design that will work within the limitations of the NetLinx language. If you've got any ideas, though, there's always the NetLinx Common Libraries project if you want some more brains on it.
P.P.S. it was 79μs per loop iteration
It's AMX programmer mentality. Worry about "performance" and "optimizing" and such without actually testing said optimizations, all the while prefixing all of your functions with fn_ and copy & pasting large segments of code. Just smile and nod.
This is one of the reasons I have vehemently opposed (in my own shop, not publicly) tools like Visual Architect and AMX.home. They are horribly inefficient. Sure, the code runs, but it doesn't really run well. They turn nice, crisp systems into pigs. They have a place, but frankly, not for anything I do. Optimization was the last thing on anyone's mind when they were developed.
Duet is another one. The API is bloated, and requires more than is needed to make things function. I won't use it if I have a choice. Yes, there are some Duet modules that are efficient and work well. But more often, if there is a NetLinx version, it just runs better. I'm not, per se, against Duet ... but many of the published implementations are just bad.
Bottom line: disregard for efficiency and resource conservation will eventually bite you in the nethers.
Don't forget that programmer time is a much more expensive resource than the CPU. If you spend a day trying to save a few microseconds from a loop, making it less readable and making no real difference to your program's efficiency, that isn't conserving resources, it's wasting them.
Paul
Likewise, it's probably easier and quicker for a programmer to put all feedback in define_program. So do we disregard processor resources and consider only the programmer's resources, or do we consider both and find a happy balance between the two? I would think we'd all agree on the balanced approach, and where we declare stack vars is just one of those things to consider. If you think their placement makes the code less readable, then figure out which is more important for that situation and keep writing. But if you don't know how your decisions impact the system, you should stop, ask, or find out for yourself. It may prove to be a big deal; often it won't. But at least you will then know, and you can make informed decisions.
First, I'm not sure how a few minutes running a simple test turned into a day? Second, 90% of the testing and exploring I do is on my own time. Lastly, the couple of minutes/hours spent figuring this stuff out in controlled settings is way more efficient than hitting that processor wall when everything starts running too slow and you have to go back and start optimizing things after the fact.
I also find that sharing the results with the community can spur discussions that lead me to learn about different programming techniques that help me better my programming capabilities.
One other thought that just popped into my head... until the tests are run, how can you know how beneficial a change is? Sure, you can speculate that you will only pick up a couple of microseconds over a million iterations, but what if the processor doesn't execute code the way you think it does? One cannot trust that the code methods put forth by AMX are the most efficient either. I looked at their sample buffering code and it is functional, but HIGHLY inefficient in some cases. At the same time, as I recall, it was very easy to read and understand.
Jeff
"Isn't it ironic?" - No it ain't.
But I agree, it seems the discussion does take up more time than the little test itself. Also, a little test like this is a fun thing to do; you don't always necessarily find something of much importance. It's a good thing Kim shares it with us. In this case he didn't find something that really has an impact on things, but it's possible that in the future one of his 'silly little tests' finds something that really does. If this were 2-3ms it would have an impact on larger systems.
Things like this are always good to know. Regardless of the impact they have. It would be nice if more people shared their knowledge like Kim does.