Which is faster vs proper
vining
Posts: 4,368
Which incarnation of the function below would be considered proper, and which would be more efficient or faster? I understand that changing things via function arguments passed by reference is frowned upon, but in most cases it just seems to make more sense and might be faster. Is it?
What I consider the preferred method:
sMyFB.cFB_Tag[nIndx] = fnJSON_TrimLeadChars('"{}[],', sMyFB.cFB_Tag[nIndx]);

DEFINE_FUNCTION CHAR[MAX_15999] fnJSON_TrimLeadChars(CHAR iTrimChars[], CHAR iStr[MAX_15999])
{ // pass in a string of all the chars you want trimmed from the beginning of a string, and the string itself
    STACK_VAR INTEGER nCheck;
    STACK_VAR INTEGER nLenT;
    STACK_VAR INTEGER nLen;
    STACK_VAR INTEGER i;

    nCheck++;
    nLen  = LENGTH_STRING(iStr);
    nLenT = LENGTH_STRING(iTrimChars);

    WHILE(nCheck && nLen)
    {
        nCheck = 0;
        for(i = nLenT; i; i--)
        {
            if(iTrimChars[i] == iStr[1])
            {
                GET_BUFFER_CHAR(iStr);
                nCheck++;
                nLen--;
                BREAK;
            }
        }
    }
    RETURN iStr;
}
The simpler and possibly faster method:
fnJSON_TrimLeadChars('"{}[],', sMyFB.cFB_Tag[nIndx]);

DEFINE_FUNCTION fnJSON_TrimLeadChars(CHAR iTrimChars[], CHAR iStr[MAX_15999])
{ // pass in a string of all the chars you want trimmed from the beginning of a string, and the string itself
    STACK_VAR INTEGER nCheck;
    STACK_VAR INTEGER nLenT;
    STACK_VAR INTEGER nLen;
    STACK_VAR INTEGER i;

    nCheck++;
    nLen  = LENGTH_STRING(iStr);
    nLenT = LENGTH_STRING(iTrimChars);

    WHILE(nCheck && nLen)
    {
        nCheck = 0;
        for(i = nLenT; i; i--)
        {
            if(iTrimChars[i] == iStr[1])
            {
                GET_BUFFER_CHAR(iStr);
                nCheck++;
                nLen--;
                BREAK;
            }
        }
    }
}

Is it even worth thinking about? I'm using the first method, but I'm thinking: what's the point of returning the value of itself to itself, other than some proper coding etiquette BS?
Comments
Think about the following scenario: the same operation written once as a function that returns its result, versus once as a function that quietly modifies its argument in place.
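For illustration, the same operation exposed both ways (the names here are made up):

// 1) Return the result; the caller decides what gets overwritten.
DEFINE_FUNCTION CHAR[100] fnToUpperCopy(CHAR iStr[100])
{
    RETURN UPPER_STRING(iStr)
}

// 2) Modify the argument in place; nothing at the call site shows the mutation.
DEFINE_FUNCTION fnToUpperInPlace(CHAR iStr[100])
{
    iStr = UPPER_STRING(iStr)
}

// Call sites:
// cLabel = fnToUpperCopy(cLabel)    // the assignment makes the change obvious
// fnToUpperInPlace(cLabel)          // the change is invisible unless you read the function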
Obviously, this is an oversimplified example. There will be use cases where each is applicable (e.g. if you are attempting to return a complex type, which is not allowed in NetLinx). As a rule of thumb, I'll try to make sure that the function or parameter name, as well as the documentation, makes it clear that it is modifying external state. One thing I religiously adhere to, though, is that if I am dicking with external state, anything being affected is passed in as a parameter. Modifying global state is just plain evil.
In general, avoiding side effects wherever possible makes it easier to reason about your logic and to construct bug-free code more quickly.
Paul
edit: nCheck only increments in the for loop if there's a char match, in which case it removes the char and the while loop repeats to check the next char. Looking at it this morning, it might benefit from a break after the nCheck++ and nLen-- so that it stops the for loop and goes back to the while to run the for loop from the beginning on the next char; otherwise it looks like it will finish out the for loop against the next char. Of course, letting it finish the for loop could be beneficial and more efficient if it does find something, since it won't have gone through the entire array of chars to look for, and it will repeat anyway when it goes back to the while. I think a break would be proper.
I also have a similar function that goes like this:
What did you do for its counterpart, stripping from the end of a string? Mine looks almost identical to my TrimLeadChar function:
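Roughly this shape, dropping matching chars off the end with set_length_string instead of pulling them off the front (a sketch, not the exact code; the names and the MAX_15999 constant from the first post are assumed):

DEFINE_FUNCTION fnJSON_TrimTrailChars(CHAR iTrimChars[], CHAR iStr[MAX_15999])
{ // drop matching chars off the end instead of the beginning
    STACK_VAR INTEGER nCheck;
    STACK_VAR INTEGER nLenT;
    STACK_VAR INTEGER nLen;
    STACK_VAR INTEGER i;

    nCheck++;
    nLen  = LENGTH_STRING(iStr);
    nLenT = LENGTH_STRING(iTrimChars);

    WHILE(nCheck && nLen)
    {
        nCheck = 0;
        for(i = nLenT; i; i--)
        {
            if(iTrimChars[i] == iStr[nLen])
            {
                nLen--;
                SET_LENGTH_STRING(iStr, nLen);  // chop the last char off
                nCheck++;
                BREAK;
            }
        }
    }
}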
I think GregG's implementation is likely faster and much more readable. The only changes I would make are to do less copying and use less memory where possible, i.e.:
cInput should be cString, and it might be faster since it's offloading the bulk of the work to a system function; those just do things faster, whereas my function does all the work without calling any other functions. They're also not really the same function, since mine only trims chars at the beginning of the string while Greg's strips chars out of the entire string. I could use a while loop with find_string and eliminate the for loop, but that's just because find_string would be running one internally instead.
My main concern/question was whether to change the var by reference or return the string to itself. Even NetLinx functions do it both ways: set_length_string(var, length) isn't var = set_length_string(var, length), so that changes by reference too.
Yes, it should be:
set_length_string alters the string length, so it has to modify the string to do so; passing the array back to the calling function doesn't really make much sense there. I don't think there is a hard and fast rule regarding whether to modify arguments or not. It depends on what the function does, among other things. For this type of function, modifying the argument rather than creating a new string saves 16000 bytes of memory, so it can be done that way as well, leaving it to the caller whether or not an original copy is needed.
Paul
I've decided that if you're going to return the value, you should always copy the incoming string first so that you maintain the original; otherwise returning the value is pointless and it should be done like this, similar to set_length_string, which is fairly similar in purpose.
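A rough sketch of the two shapes (the trimming itself is done here with the short find_string loop just to keep it brief; names are assumed, and MAX_15999 is the constant from the first post):

// Copy-and-return: the caller's original survives unless they assign the result back over it
DEFINE_FUNCTION CHAR[MAX_15999] fnJSON_TrimLeadCharsCopy(CHAR iTrimChars[], CHAR iStr[MAX_15999])
{
    STACK_VAR CHAR cCopy[MAX_15999]

    cCopy = iStr                     // work on a copy so the original is preserved
    WHILE(LENGTH_STRING(cCopy) && FIND_STRING(iTrimChars, "cCopy[1]", 1))
    {
        GET_BUFFER_CHAR(cCopy)
    }
    RETURN cCopy
}

// In place, set_length_string style: modify the argument, return nothing
DEFINE_FUNCTION fnJSON_TrimLeadChars(CHAR iTrimChars[], CHAR iStr[MAX_15999])
{
    WHILE(LENGTH_STRING(iStr) && FIND_STRING(iTrimChars, "iStr[1]", 1))
    {
        GET_BUFFER_CHAR(iStr)
    }
}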
I prefer the last version. It's just easier to use, and since it's usually accompanied by set_length_string throughout my parser it just looks better.
I had a summer job once at a materials testing lab, and one of the test machines had a sticker on the side that I have remembered for 25+ years: "One test is worth a thousand expert opinions."
So on that track, I just had to write up a short test program, which is included as an attachment here.
I checked the time on 200 loops parsing both normal and worst case strings through the last 2 sets of head and tail functions we posted.
The "normal" string for the test was '"]},{["String Data","More Data"]}'
The "worst case" string was '"]},{[""]},{[""]},{[""]},{["Data"]},{[""]},{[""]},{[""]},{["'
And the character list to trim was '}],"{['
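The measurement itself is just a GET_TIMER delta around the loop; a sketch of one timing pass (GET_TIMER counts in 1/10-second ticks, and the actual trim call is left as a placeholder):

DEFINE_FUNCTION fnTimeHeadTrim()
{
    STACK_VAR LONG lStart
    STACK_VAR INTEGER i
    STACK_VAR CHAR cWork[100]

    lStart = GET_TIMER
    FOR(i = 1; i <= 200; i++)
    {
        cWork = '"]},{["String Data","More Data"]}'    // reset the test string each pass
        // call the head-trim function under test on cWork here
    }
    SEND_STRING 0, "'head trim, normal data: ', ITOA(GET_TIMER - lStart)"
}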
The results (using GET_TIMER deltas) are:
**NI700**
fnhead on normal data using character compares: 173
fntail on normal data using character compares: 165
fnhead on worst case data using character compares: 178
fntail on worst case data using character compares: 172
fnhead on normal data using string function calls: 60
fntail on normal data using string function calls: 87
fnhead on worst case data using string function calls: 63
fntail on worst case data using string function calls: 93
**NX1200**
fnhead on normal data using character compares: 45
fntail on normal data using character compares: 42
fnhead on worst case data using character compares: 46
fntail on worst case data using character compares: 45
fnhead on normal data using string function calls: 55
fntail on normal data using string function calls: 60
fnhead on worst case data using string function calls: 59
fntail on worst case data using string function calls: 65
So your concept runs faster on the new masters, and mine runs faster on the old ones, hehe.
Overall, I'll probably stick with readability, since it will make future maintenance easier (also makes it easier for me to copy and modify into other functions).
Even though I was trying to avoid using find_string, it is optimized, as your test on the NI-700 showed. A few years ago, after someone mentioned this optimization, I ran a test between NetLinx's find_string and a version I wrote that I thought would be comparable to how find_string works internally, and the NetLinx find_string blew my find_string out of the water. Granted, my code probably wasn't as efficient as it could be, so I kinda want to find it just to see if it could be done better, or should I say if I could do it better, and run the test again if I find some significant shortcuts.
I am surprised by your test comparing find_string on the NI vs the NX; there isn't a big difference, and I would have thought the NX would be much faster. The difference between the char compare and find_string on the NI was huge, which makes me wonder what they do to optimize those NetLinx functions on the NIs and why it doesn't seem to be a factor on the NX processor.
I think it was Spire_Jeff who first posted "for(i=MAX;i;i--)" with a lot of test data. That was back in the day when the forum was fun and there were a lot of folks wanting to learn, share and willing to test their own or someone else's theories. I was planning on running some tests too, just because sometimes what you think might be faster isn't. I also think my latest version might throw an error if the string only contains chars to trim, so nothing remains in the string. That might happen 1 in 1000 times, so I want to verify and fix it if that's the case.
Like you said, it's nice to see another person's approach. I never would have thought about using find_string in what I would consider a backwards fashion, putting the chars to remove in as the string to search in. In my brain that's backwards, but I guess, as Spire_Jeff posted several years ago, sometimes backwards is the better way.
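In other words, something roughly like this, with the trim list as the string being searched and the string's first char as the needle (names assumed):

DEFINE_FUNCTION fnTrimHead(CHAR iTrimChars[], CHAR iStr[])
{
    // find_string looks for the current first char anywhere in the trim list
    WHILE(LENGTH_STRING(iStr) && FIND_STRING(iTrimChars, "iStr[1]", 1))
    {
        GET_BUFFER_CHAR(iStr)
    }
}

// e.g. fnTrimHead('"{}[],', sMyFB.cFB_Tag[nIndx])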
First, pre-populate a 255-char boolean flag array with the characters you want to kill using a prep function, then do the character compare via table lookups. Multiple prepped tables can be kept around in global vars. The test program is included, but the prep and trim functions look like this:
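That is, roughly this shape (the names here are assumed):

DEFINE_VARIABLE
VOLATILE CHAR cJsonTrimFlags[255]    // one of possibly several prepped tables kept in globals

// Build a flag table: iTable[c] = 1 for every character value c that should be stripped
DEFINE_FUNCTION fnPrepTrimTable(CHAR iTable[255], CHAR iTrimChars[])
{
    STACK_VAR INTEGER i

    SET_LENGTH_STRING(iTable, 255)
    FOR(i = 255; i; i--)
    {
        iTable[i] = 0                       // clear any previous flags
    }
    FOR(i = LENGTH_STRING(iTrimChars); i; i--)
    {
        iTable[iTrimChars[i]] = 1           // flag each character to kill
    }
}

// Trim the head with a single table lookup per character (assumes no $00 bytes in the data)
DEFINE_FUNCTION fnHeadTrimLookup(CHAR iTable[255], CHAR iStr[])
{
    WHILE(LENGTH_STRING(iStr) && iTable[iStr[1]])
    {
        GET_BUFFER_CHAR(iStr)
    }
}

// e.g. prep once:   fnPrepTrimTable(cJsonTrimFlags, '}],"{[')
// then trim often:  fnHeadTrimLookup(cJsonTrimFlags, cBuffer)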
The results using this method are:
** NI700 **
Normal tests on head total time: 63
Normal tests on tail total time: 63
Worst case tests on head total time: 64
Worst case tests on tail total time: 64
** NX1200 **
Normal tests on head total time: 25
Normal tests on tail total time: 25
Worst case tests on head total time: 25
Worst case tests on tail total time: 25
This really emphasizes the poor string function speed on the NX.
Ok, I'm feeling kinda stupid right now, and I'm hoping it's your code and not me, but I'm not following this logic. I don't see how it's doing anything, so maybe I just need a good night's sleep and I can figure it out in the morning.
Now that I've just admitted to being stupid, I think I got it, and it's pretty cool the way it works. Now I just have to reason out what's making it so much faster than my version; visually it appears to be pretty much the same workload, except I'm doing an extra comparison each pass and an additional simple math assignment. I never would have thought about making a comparison in this manner, very cool! I just hope others look at it and don't get it at first either.
We have 2 inputs to the algorithm:
The list of strip characters, which we'll say has 'm' items,
and
The actual input string which has 'n' items.
In your loop you go through each character in the strip list comparing it to the current character in the input. In a worst case scenario, where the whole input string contains nothing but the last character in the strip list, your algorithm would have to make (n*m) passes in the for loop. This is called 'polynomial time' in algorithmic analysis: https://en.wikipedia.org/wiki/Time_complexity#Polynomial_time
In the version where I pre-populated a lookup table, there is only ever one comparison made for any given input character, so at most it would need to make (n) passes through the loop. Which is 'linear time': https://en.wikipedia.org/wiki/Time_complexity#Linear_time
https://en.wikipedia.org/wiki/Lookup_table
To test and change, just drop these vars into the debug window and set nRunTest to 1.
I like the replace-character idea. I often replace the CR,LFs from HTTP results with spaces (then strip the extra spaces) so that I can reformat the text into a nicer display block.
I remember seeing that NI comparison; I think my short-term memory is failing. Looking at the results, changing to a lookup table can be anywhere from 2 to 8 times faster depending on what's presently being used, so it's worth updating to this method in any code that has to do a lot of string manipulation.
Edit: a 255-char table works unless you want to replace a char with a decimal 0, in which case you'd need the 510-char table with the flag set to 1 at the char's index and the value of 0 at that index + 255, but I don't ever see a need to change a char to a decimal 0.
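A guess at what that 510-entry version would look like, tied to the CR/LF-to-space example above (names assumed; find and replace lists are parallel, and the input is assumed to contain no $00 bytes):

DEFINE_VARIABLE
VOLATILE CHAR cReplTable[510]    // [c] = flag for char value c, [c + 255] = its replacement

// Each char in iFind gets swapped for the char at the same position in iRepl
DEFINE_FUNCTION fnPrepReplaceTable(CHAR iFind[], CHAR iRepl[])
{
    STACK_VAR INTEGER i

    SET_LENGTH_STRING(cReplTable, 510)
    FOR(i = 510; i; i--)
    {
        cReplTable[i] = 0                         // clear any previous table
    }
    FOR(i = LENGTH_STRING(iFind); i; i--)
    {
        cReplTable[iFind[i]] = 1                  // flag: this char gets replaced
        cReplTable[iFind[i] + 255] = iRepl[i]     // replacement value, which may legitimately be $00
    }
}

DEFINE_FUNCTION fnReplaceChars(CHAR iStr[])
{
    STACK_VAR INTEGER i

    FOR(i = LENGTH_STRING(iStr); i; i--)
    {
        IF(cReplTable[iStr[i]])
        {
            iStr[i] = cReplTable[iStr[i] + 255]
        }
    }
}

// e.g. swap CR/LF for spaces in an HTTP result:
// fnPrepReplaceTable("$0D,$0A", '  ')  once, then  fnReplaceChars(cHttpResult)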
Results:
TestCode: (I just dropped this into a running timeline or define_program behind a var flag that resets to zero when run, and I change that var flag to 1 in NS's debug window to run it.)
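The trigger part of that pattern looks something like this (the timing body itself would be the GET_TIMER loop sketched earlier; the flag name is assumed):

DEFINE_VARIABLE
VOLATILE INTEGER nRunTest        // flip this to 1 in NetLinx Studio's debug window to kick off a run

DEFINE_PROGRAM

IF(nRunTest)
{
    nRunTest = 0                 // clear the flag first so only one run fires per trigger
    // call the timing routine here, e.g. fnTimeHeadTrim() from the sketch earlier in the thread
}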