Initializing VARs in Loops

vining · May 2010

Given these 2 options, which would you prefer?

Initialize the VARs in the beginning so that they don't have to be created on every pass of the for loop. (No scope issue and VARs are set to value prior to use anyway)

DEFINE_FUNCTION fnHWI_FB_DL_One(INTEGER iOutIndx)

     {
     STACK_VAR INTEGER n ;
     STACK_VAR INTEGER nPageStart ;
     STACK_VAR INTEGER nOutOffset ;
     
     for(n = 1 ; n <= HWI_NUM_UIs ; n++)
	  {
	  nPageStart = (sHWI_UI[n].nPage * 48) + 1 ;
	  if(iOutIndx >= nPageStart && iOutIndx <= nPageStart + 48)
	       {
	       nOutOffset = iOutIndx - (sHWI_UI[n].nPage * 48) ; 
	       fnHWI_FB_SendCHNL(n,nOutOffset + 100,sHWI.sOutput[iOutIndx].nDLevel > 0) ;
	       fnHWI_FB_SendLVL(n,(HWI_LVL_OUTPUTS - 1) + nOutOffset,sHWI.sOutput[iOutIndx].nDLevel) ;
	       if(nOutOffset == sHWI_UI[n].nSelOut)
		    {
		    fnHWI_FB_SendLVL(n,HWI_LVL_MAIN_BARG,sHWI.sOutput[iOutIndx].nDLevel) ;
		    }
	       }
	  }
     
     RETURN ;
     }

Or, what I would normally do: keep the scope as small as possible, which is really just force of habit.

DEFINE_FUNCTION fnHWI_FB_DL_One(INTEGER iOutIndx)

     {
     STACK_VAR INTEGER n ;
          
     for(n = 1 ; n <= HWI_NUM_UIs ; n++)
	  {
	  STACK_VAR INTEGER nPageStart ;
	  
	  nPageStart = (sHWI_UI[n].nPage * 48) + 1 ;
	  if(iOutIndx >= nPageStart && iOutIndx <= nPageStart + 48)
	       {
	       STACK_VAR INTEGER nOutOffset ;
	       
	       nOutOffset = iOutIndx - (sHWI_UI[n].nPage * 48) ; 
	       fnHWI_FB_SendCHNL(n,nOutOffset + 100,sHWI.sOutput[iOutIndx].nDLevel > 0) ;
	       fnHWI_FB_SendLVL(n,(HWI_LVL_OUTPUTS - 1) + nOutOffset,sHWI.sOutput[iOutIndx].nDLevel) ;
	       if(nOutOffset == sHWI_UI[n].nSelOut)
		    {
		    fnHWI_FB_SendLVL(n,HWI_LVL_MAIN_BARG,sHWI.sOutput[iOutIndx].nDLevel) ;
		    }
	       }
	  }
     
     RETURN ;
     }

I would think the 2nd method would be a little harder on the processor since the VAR has to be created on every pass of the loop while in the 1st method the VARs are created once and only the values change on each pass.

I'm thinking I should do the 1st method and I realize it's not a real big deal either way but I'm just curious to know what others would do or what they think.

PhreaK · May 2010

I may be completely wrong here.... but I thought that NetLinx variables can only have their scope limited by functions/events, not 'lesser' code blocks (e.g. loops etc).

vining · May 2010

PhreaK wrote:

I may be completely wrong here.... but I thought that NetLinx variables can only have their scope limited by functions/events, not 'lesser' code blocks (e.g. loops etc).

I believe the scope of STACKs & LOCALs is limited to the braces enclosing the point where they were declared. Of course I could be wrong too.

This question wasn't about scope but whether it's better to initialize prior to the loop (assuming scope isn't an issue). How much harder is it for the processor to do one way versus the other?

Joe Hebert · May 2010

PhreaK wrote: »

I may be completely wrong here.... but I thought that NetLinx variables can only have their scope limited by functions/events, not 'lesser' code blocks (e.g. loops etc).

I'm not sure if there are exceptions but I believe any variable declaration within braces should work.

PhreaK · May 2010

PhreaK wrote: »

I may be completely wrong here....

Yep. From the NetLinx Language Reference Guide:

Local variables are restricted in scope to the statement block in which they are declared. A statement block is
one or more NetLinx statements enclosed in a pair of braces, like the blocks following subroutines, functions,
conditionals, loops, waits, and so on.

As stack memory is just drawn from the heap I'd imagine it would be a relatively trivial pointer assignment for each iteration; whether this has any noticeable effect on performance would be interesting to find out. I'm currently putting together some benchmarking tools for the NetLinx Common Libraries project so I'll compare the two alternatives through them when it's done.

vining · May 2010

Phreak wrote:

As stack memory is just drawn from the heap I'd imagine it would be a relatively trivial pointer assignment for each iteration; whether this has any noticeable effect on performance would be interesting to find out.

I agree in human time it would be trivial and in computer time it might be just a hiccup, especially with only a few iterations of a loop. But a little here and a little there and it might add up to a few seconds off a mainline pass, although most of my code is strictly event driven so it really wouldn't matter much at all. I'd still like to know the difference over, say, 100,000 iterations. I'd run the tests if I had the time or if it was more than a curiosity.

a_riot42 · May 2010

vining wrote: »

I'm thinking I should do the 1st method and I realize it's not a real big deal either way but I'm just curious to know what others would do or what they think.

In most modern programming languages you want to limit scope as much as possible, but it's for style reasons, not for any processing reason. To determine which way to write your function, I would use style criteria long before worrying about what the processor is doing. For that size of function, I would have them all at the top of the function like your first example.
Paul

jjames · May 2010

a_riot42 wrote: »

In most modern programming languages you want to limit scope as much as possible, but it's for style reasons, not for any processing reason. To determine which way to write your function, I would use style criteria long before worrying about what the processor is doing. For that size of function, I would have them all at the top of the function like your first example.
Paul

+1

Agreed.

vining · May 2010

a-riot42 wrote:

In most modern programming languages you want to limit scope as much as possible, but it's for style reasons, not for any processing reason.

Yes but method 2 achieves the smallest scope possible for the VARs so there has to be a balance of "style" and "function" (performance). If style doesn't affect function go with style but if function doesn't affect style, go with function with function taking the higher precedence. Again any gains here would be extremely trivial so this is really about the concept in general and not this particular piece of code.

In this case I think minimizing scope to an anal degree should lose out in order to achieve a minuscule performance gain as in method 1. So we agree on method 1 but I think for different reasons.

Anyway it's not really about style, since most folks looking at my code would say I have none or that it's all wrong, but really about what impact declaring a stack var in the loop has rather than declaring it before. Sort of like using <= LENGTH_ARRAY(ARRAY)

 for(i=1;i<= LENGTH_ARRAY(ARRAY);i++)

in the loop instead of creating a VAR beforehand to hold that value so each iteration of the loop doesn't have to run that function again and again and again........ Like:

STACK_VAR INTEGER nLen ;

nLen = LENGTH_ARRAY(ARRAY)  ;
for(i=1;i<=nLen;i++) 
        {
        //do something
        }

So I would think regardless of "style" concepts or "scope" this:

STACK_VAR INTEGER nLen ;
STACK_VAR INTEGER nVar ;

nLen = LENGTH_ARRAY(ARRAY)  ;
for(i=1;i<=nLen;i++) 
        {
        nVar = i ;
        //do something with nVar ;
        }

would be better than this:

STACK_VAR INTEGER nLen ;

nLen = LENGTH_ARRAY(ARRAY)  ;
for(i=1;i<=nLen;i++) 
        {
        STACK_VAR INTEGER nVar ;

        nVar = i ;
        //do something with nVar ;
        }

since we don't repeat creating the var on every iteration of the loop. Again it's anal and I would think of minuscule performance consequence, but I would like to know how much performance is gained or lost just for the sake of knowing.

mpullin · May 2010

Does the second example even compile?

I was under the impression that NetLinx forced you to declare STACK_VAR and LOCAL_VAR before any other lines in a function/event. I just moved a LOCAL_VAR in one of my programs to the middle of a function and got the error C10567: Unrecognized node type [423] in compileStatement()

jjames · May 2010

mpullin wrote: »

I was under the impression that NetLinx forced you to declare STACK_VAR and LOCAL_VAR before any other lines in a function/event. I just moved a LOCAL_VAR in one of my programs to the middle of a function and got the error C10567: Unrecognized node type [423] in compileStatement()

Right, I believe that if you declare any variables, the declarations need to come first within the braces. You can't just throw them anywhere.

I.e. - this works:

BUTTON_EVENT[dv_TP,nDISPLAY_ON_BTN] 		// TV POWER ON
{
   PUSH:
   {
		STACK_VAR INTEGER nPNL
		nPNL = GET_LAST(dv_TP)
		fnDISPLAY_POWER_ON(nPNL_AV[nPNL])
		
		{
			local_var integer test
			test = 1
		}
   }
}

This does not.

BUTTON_EVENT[dv_TP,nDISPLAY_ON_BTN] 		// TV POWER ON
{
   PUSH:
   {
		STACK_VAR INTEGER nPNL
		nPNL = GET_LAST(dv_TP)
		fnDISPLAY_POWER_ON(nPNL_AV[nPNL])
		

		local_var integer test
		test = 1
   }
}

PhreaK · May 2010

vining wrote: »
Anyway it's not really about style, since most folks looking at my code would say I have none or that it's all wrong, but really about what impact declaring a stack var in the loop has rather than declaring it before. Sort of like using <= LENGTH_ARRAY(ARRAY)

 for(i=1;i<= LENGTH_ARRAY(ARRAY);i++)

in the loop instead of creating a VAR beforehand to hold that value so each iteration of the loop doesn't have to run that function again and again and again........ Like:

STACK_VAR INTEGER nLen ;

nLen = LENGTH_ARRAY(ARRAY)  ;
for(i=1;i<=nLen;i++) 
        {
        //do something
        }

Actually that's not as expensive as you may think. In NetLinx, arrays are stored as a header containing the length, followed by the data (at least that's what we were told in P3), so length_array() / length_string() just look this up. Although as you said, it's a lot nicer to not have to call a function (regardless of how spendy) in the check condition of each loop iteration.

@mpullin
From the line beneath that last quote in the ref guide:

Local variables must be declared immediately after the opening brace of a block but before the first executable statement.

mpullin · May 2010

PhreaK wrote: »

@mpullin
From the line beneath that last quote in the ref guide:

Yeah I was aware of that, but before this thread I was not aware that you could even put local vars anywhere other than at the very beginning. If that were true then we have been arguing about two options one of which is illegal.

But no, as jj pointed out 'a block' apparently means anywhere there are curly braces, not just an event or function. So you can in fact declare your stack vars right before they are needed, as long as you have a structure with curly braces in play. You can even inject a set of curly braces with no condition to enter the block. Not that I'd be tempted to do this...

PhreaK · May 2010

Just gave the two options a run through the benchmarking utils in the common libraries project.

Over 1000000 loop iterations, initializing the stack_var externally works out to be approximately 0.027ms per iteration. When the stack_var is initialized within the loop, each iteration takes approximately 0.106ms. With a blazing 79µs performance boost I highly doubt it's going to be the cause of bottlenecks in anyone's code.

For reference here's the test code

define_function test_stack_out_loop()
{
	stack_var long i
	stack_var integer tmp

	test_start('stack_var initialized externally')

	test_timer_start()
	for (i = 1000000; i; i--) {
		tmp = 1
	}
	test_timer_stop(1000000)

	test_end()
}

define_function test_stack_in_loop()
{
	stack_var long i

	test_start('stack_var initialized internally')

	test_timer_start()
	for (i = 1000000; i; i--) {
		stack_var integer tmp
		tmp = 1
	}
	test_timer_stop(1000000)

	test_end()
}
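
As a quick sanity check on the figures quoted above, here is the arithmetic worked through. This is only a sketch: it takes the two averages at face value as per-iteration times, it says nothing about run-to-run variance (none was reported), and the 100-iteration loop is a hypothetical example rather than anything measured.

```latex
\begin{align*}
\Delta t &= 0.106\,\text{ms} - 0.027\,\text{ms} = 0.079\,\text{ms} = 79\,\mu\text{s per iteration} \\
\frac{0.106\,\text{ms}}{0.027\,\text{ms}} &\approx 3.9 \quad \text{(the in-loop declaration is roughly 4 times slower per iteration)} \\
N \cdot \Delta t &= 100 \times 79\,\mu\text{s} = 7.9\,\text{ms} \quad \text{(extra cost for a hypothetical 100-iteration loop)}
\end{align*}
```

Both per-iteration figures include the loop overhead and the assignment itself, so the ratio describes this particular micro-benchmark rather than loops in general.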

a_riot42 · May 2010

How many times did you run the test? I would guess that the difference is so small that you will get a different result each time. There are so many bigger fish to fry, why do you worry about minutiae like this? Has it been that slow?
Paul

PhreaK · May 2010

The tests were run for 1000000 iterations each. The times are an average speed per loop iteration.

I got bored for 5 minutes on lunch and just tried it out of curiosity.

vining · May 2010

Phreak wrote:

With a blazing 79µs performance boost I highly doubt it's going to be the cause of bottlenecks in anyones code.

Running code in define program, or any code for that matter that isn't event driven, is obviously the bigger fish to fry but this little tidbit of knowledge is useful. Just moving the stack declaration saved you 80ms off the time the processor takes to complete a single pass when this code is called to execute. Of course that's a million iterations so most of our loops will barely see a 2-3 ms boost, but if more stacks were initialized in the loop, times however many loops are in the system, and god forbid they're in define program or a fast repeating timeline, it will have an increased negative effect. We'll probably never notice it but if we can avoid unnecessary overhead wherever we can it can only make our code run better.

[edit]
Just noticed the decimal point so it's even less dramatic a change but still nearly 4x faster nonetheless.
[/edit]

Thxs for the test!

Spire_Jeff · May 2010

a_riot42 wrote: »

There are so many bigger fish to fry, why do you worry about minutiae like this? Paul

Speaking on my motivations only: Sometimes I explore code concepts for enjoyment. I often fish for big fish on one of the larger lakes around here. This means prepping my boat, launching the boat, driving to a location, and then spending countless hours trying to catch fish of the appropriate size using all sorts of lures. Sometimes I just like to go to some little inland lake/pond that simply requires a piece of hotdog, a hook 10 feet of line and a stick to catch countless little fish.

I enjoy both. It is fun when I finally land one of the big fish, but sometimes I just want to catch ANY fish without the time expenditure and frustration.

Now back to code. Sometimes the never-ending work related code crunch gets monotonous or frustrating. Sometimes I just need to explore a problem that doesn't require resolution. If I hit a wall, I can just drop it until I feel like exploring it again... which may be never.

If something beneficial on the work side results from this playing, even better, but the real reward is my sanity and recharged attitude regarding coding. One other benefit is that as I am working on something like this, occasionally I will have a moment of clarity or simply blindly stumble across a method that fixes or enhances my work related code.

To summarize, I am not doing this because of the extra processor cycles gained, but I am not going to ignore the gain unless it makes the code completely obfuscated and unusable. I might also try to find a balance between performance and usability.

Jeff

a_riot42 · May 2010

Spire_Jeff wrote: »

To summarize, I am not doing this because of the extra processor cycles gained, but I am not going to ignore the gain unless it makes the code completely obfuscated and unusable. I might also try to find a balance between performance and usability.
Jeff

There are many good reasons to optimize code and I am doing that right now in a module I am writing, but chasing 79 microseconds over a million loop iterations just doesn't seem worth the time to think about it to me. I will suffer with obfuscatory code if it achieves O(n) performance as opposed to O(n²) but to save a few microseconds in a loop with 1,000,000 iterations? It just seems crazy to me to worry about that when there are so many other pressing optimizations to achieve in Netlinx.

If you want a good optimization challenge, you might try writing a function that parses large XML files that are too big to fit in memory, or a faster way to replace characters in a string (the AMX tech note function is terribly slow), or come up with a way to make hash tables in Netlinx for fast searching and command response parsing. Netlinx has a ton of areas where optimization is needed, so that is why I was surprised you were spending brain cycles on a stack_var as opposed to global variable where any "gain" achieved is simply dwarfed by the other inefficiencies going on to the point of being infinitesimal.

I am not sure whether calling length_array every iteration of a loop is that much slower than storing it first, but it is something I just do because it can't be slower, and may be faster, but likely not appreciably so, and makes the code look better.
Paul

PhreaK · May 2010

Paul, I agree with what you're saying, especially with optimising algorithm structure to bring things down from quadratic (or worse) complexity before even considering looking at minuscule things like this. However, as Jeff mentioned this was more of just a 'I wonder what would happen if...' type thing. The brain cycles used on it were only stolen from my Idle() process.

P.S. I've looked into a native hash table implementation a bit but find it hard to imagine a design that will work within the limitations of the NetLinx language. If you've got any ideas though there's always the NetLinx Common Libraries project if you want some more brains on it.

P.P.S. it was 79µs per loop iteration

true · May 2010

a_riot42 wrote: »

How many times did you run the test? I would guess that the difference is so small that you will get a different result each time. There are so many bigger fish to fry, why do you worry about minutiae like this? Has it been that slow?
Paul

It's AMX programmer mentality. Worry about "performance" and "optimizing" and such without actually testing said optimizations, all the while prefixing all of your functions with fn_ and copy&pasting large segments of code. Just smile and nod.

DHawthorne · May 2010

When NetLinx first came out, and especially right after the first big processor improvements (NXC-ME to ME260 then 260/64, which is what the current models are based on) we had a glut of processor capacity. After dealing with Axcent3 for so long, it was a wonderful thing. But I have always considered that sooner or later, the addition of features and the size of jobs would eat all that processor capacity up; and we are beginning to see it happen. It's the nature of the game ... increase capacity, add features to use it, run out of capacity. Because of this it is never a bad idea to always be in the habit of conserving resources when you can, and optimizing wherever possible. I'm not anal about it. I'd rather waste a few processor ticks than write code I can't figure out a month later. But it really isn't a bad thing to know the most efficient way to do some things, in case you are ever in need of more of those dwindling resources. And if it doesn't matter in terms of readability and modularity, it's a good habit to be in.

This is one of the reasons I have vehemently opposed (in my own shop, not publicly) tools like Visual Architect and AMX.home. They are horribly inefficient. Sure, the code runs, but it doesn't really run well. They turn nice, crisp systems into pigs. They have a place, but frankly, not for anything I do. Optimization was the last thing on anyone's mind when they were developed.

Duet is another one. The API is bloated, and requires more than is needed to make things function. I won't use it if I have a choice. Yes, there are some Duet modules that are efficient and work well. But more often, if there is a NetLinx version, it just runs better. I'm not, per se, against Duet ... but many of the published implementations are just bad.

Bottom line: disregard for efficiency and resource conservation will eventually bite you in the nethers.

a_riot42 · May 2010

DHawthorne wrote: »

Bottom line: disregard for efficiency and resource conservation will eventually bite you in the nethers.

Don't forget that programmer time is a much more expensive resource than the CPU. If you spend a day trying to save a few microseconds from a loop, making it less readable, and making no real difference to your program's efficiency, that isn't conserving resources, it's wasting them.
Paul

mpullin · May 2010

a_riot42 wrote: »

Don't forget that programmer time is a much more expensive resource than the CPU. If you spend a day trying to save a few microseconds from a loop, making it less readable, and making no real difference to your program's efficiency, that isn't conserving resources, it's wasting them.
Paul

Unless that day would otherwise be spent playing Pac-man on Google.

vining · May 2010

a_riot42 wrote:

Don't forget that programmer time is a much more expensive resource than the CPU. If you spend a day trying to save a few microseconds from a loop, making it less readable, and making no real difference to your program's efficiency, that isn't conserving resources, it's wasting them.

Yes but now that we know there's a difference we can make the determination of how to structure our code as we're writing it. Whether to increase readability, performance or minimize scope are all factors that we should be considering as we write, and since we now know this little tidbit we can consider it as we write and it shouldn't take any longer. It may make it less readable and that's another fork in the road to consider.

Likewise it's probably easier and quicker for a programmer to put all feedback in define_program, so do we ignore processor resources and just consider the programmer's resources, or do we consider both and find a happy balance between the two? I would think we'd all agree on the balanced approach, and where we declare stack vars is just one of those things to consider. If you think their placement makes the code less readable, then figure out which is more important for that situation and just keep writing. But if you don't know how your decisions impact the system then you should stop and ask, or consider finding out for yourself. It may prove to be a big deal, but often it won't; at least you will then know and can make informed decisions.

Spire_Jeff · May 2010

a_riot42 wrote: »

Don't forget that programmer time is a much more expensive resource than the CPU. If you spend a day trying to save a few microseconds from a loop, making it less readable, and making no real difference to your program's efficiency, that isn't conserving resources, it's wasting them.
Paul

First, I'm not sure how a few minutes running a simple test turned into a day? Second, 90% of the testing and exploring I do is on my own time. Lastly, the couple of minutes/hours figuring this stuff out in controlled settings is way more efficient than hitting that processor wall when everything starts running too slow and you have to go back and start optimizing things after the fact.

I also find that sharing the results with the community can spur discussions that lead me to learn about different programming techniques that help me better my programming capabilities.

One other thought that just popped into my head... until the tests are run, how can you know how beneficial a change is? Sure, you can speculate that you will only pick up a couple of microseconds over a million iterations, but what if the processor doesn't execute code the way you think it does? One cannot trust that the code methods put forth by AMX are the most efficient either. I looked at their sample buffering code and it is functional, but HIGHLY inefficient in some cases. At the same time, as I recall, it was very easy to read and understand.

Jeff

PhreaK · May 2010

I find it ironic that the time spent debating whether this benchmarking was a waste of time _far_ outweighs the time it took to actually do the test.

Jorde_V · May 2010

PhreaK wrote: »

I find it ironic that the time spent debating whether this benchmarking was a waste of time _far_ outweighs the time it took to actually do the test.

"Isn't it ironic?" - No it ain't.

But I agree, it seems the discussion does take up more time than your little test. Also your little test is a fun thing to do; you don't always necessarily find something of much importance. It's a good thing Kim shares it with us. In this case he didn't find something that really has an impact on things, but it might be possible that in the future one of his 'silly little tests' finds something that really does have an impact on things. If this was 2-3ms it would have an impact on larger systems.

Things like this are always good to know. Regardless of the impact they have. It would be nice if more people shared their knowledge like Kim does.
