Sorting text alphabetically
mhermanns
Posts: 32
Has anyone a good (fast) function for sorting text (min. 100 names) alphabetically?
0
Comments
http://www.iti.fh-flensburg.de/lang/algorithmen/sortieren/shell/shellen.htm
Here is a Bubble Sort routine. I?m sure there are faster sorts but this is the only one I?ve ever learned to write and it?s been fast enough for my purposes. Perhaps it will be fast enough for you.
The sort function will alphabetize by last name and then by first name.
-> http://www.iti.fh-flensburg.de/lang...ell/shellen.htm
This is definitely a very fast algorithm, but i can't test it right now (no hardware). I will post it, when i'am sure that it works fine (next week?).
In case i forget it, please send me a mail...
Here is an implementation of Quicksort (see http://linux.wku.edu/~lamonml/algor/sort/quick.html for info). It is recursive so don't use it on Netlinx to sort 10'000 items...
The function DoSort sorts an array of integers (myArray). The function Sort is the one doing the work.
Note that if you're sorting "large" data sets, swapping elements can be CPU intensive so you're better off sorting an index (and the reason why the sort routine below operates on an integer array). That is, the data is in an array of large structs Data with elements 1..N. You create an integer array DataSorted giving the position of the element in Data (initially, the indexes are the same so DataSorted = i), and you "sort this array" (and access the data in sorted fashion using Data[DataSorted]); that is, the sorted array gives you the sequences of indexes in Data to get Data in sorted fashion (another advantage is that you can maintain multiple sort orders in parallel).
If you want to do that you need to change in Sort the two comparisons: aInt[iHi] > tempInt AND aInt[iLo] <= tempInt with equivalent comparisons on Data (f.e. Data[aInt[iHi]].last_name > Data[tempInt].last_name), or create functions to compare elements together if you want to sort by last, first name and then by area code and phone number.
Very fast. On my old NE master I need 6 ticks (GET_TIMER units) to sort 200 elements.
HTH
Fred
Looks good the only thing I would change, to speed it up, is to not check the sorted array elements again and again. See the "y=x + 1" in the second for loop.
Instead of your proposed second loop of:
FOR (y=x + 1; y<nMaxProfiles; y++) {
I think it should be:
FOR (y=1; y<=nMaxProfiles-x; y++) {
And yes that will speed things up since it doesn?t need to check the sorted elements more than once. Thanks for pointing it out.
I suppose I could have made it potentially event quicker if I added a flag to test if a swap was made in the second loop because if no swaps were made then we can break out since the sort must be complete.
I also bumped up the number of records to 100 and randomized the population of the data.
I then added a timeline to track execution time as closely as I could come up with in the last couple minutes. These are the results I got from an NI-3000 running nothing more than this code.
Run Number\ Indexed | String
1 | 1989ms | 4029ms
2 | 1978ms | 4121ms
3 | 1951ms | 3896ms
4 | 1967ms | 4038ms
I also checked a completely sorted list and got these numbers:
Indexed : 1606ms
String: 2670ms
I would appreciate somebody checking my time measurements as well as the methods I'm using for timing. As I type this up, I could have probably set the timeline times to be very large to minimize timeline events from happening during the sort, but I'm lazy, so someone else can do that if they like
Have fun,
Jeff
DO NOT USE LONG_WHILE in Netlinx code. I think they talk about it in the help file stating that it should not be used in an event. Based on this experience...DO NOT EVER USE IT
Anyway, here are the new times:
250 data elements
String Sort: 28369ms
Indexed Sort: 12275ms
Shell (Pratt): 3577ms
Shell (mhermanns): 2370ms
There wasn't a big difference between running the Papernov-Stasevic, Sedgewick, and the mhermanns column number break down. The Pratt column numbers did take a little longer.
Here are the time on a presorted list:
250 data elements
String Sort: 17322ms
Indexed Sort: 10036ms
Shell (Pratt): 1733ms
Shell (mhermanns): 591ms
I think it's clear which algorithm is the fastest
Here is the newest version of code I was using that has all three functions. They are not what I would call polished and complete, but they should give a fairly good jump to anyone that needs to use them.
You can use the debug window to change between the 4 different column number sets I have in the code if you want to play around with 4 different sets of numbers for testing purposes.
Have fun,
Jeff
I'd still like to be able to make a decent simple web browser for the touch panels. But I can't.
Anyone know if any of the other processors are supposed to be faster or slower?
Jeff
Not undertanding this. Everyday I curse Netlinx for not being object oriented. Trying to create complex systems without being able to say, pass structures to modules, or use polymorphism or abstraction is just a huge drag, and would be hugely beneficial in this kind of work where there are many devices that do basically the same thing (power on , power off, play, stop, pause etc). I would like to be able to do this:
dvCD.play()
dvDVD.play()
dvIpod.play()
and have the right thing happen regardless of what the device is.
I also miss not being able to have functions that take variable arguments like Java or C++ like this:
define_function switch(integer input, integer output) {}
define_function switch(integer input[], integer output) {}
so that I don't have to rename the functions if the argument is slightly different.
If they had at least given us pointers then we would have a fighting chance, but no ponters either, sheesh.
Oh how I pine for OO.
You can write in Java if you want. That's what DUET is all about. No?
Seriously, I started down the Duet route but realized that it would require AMX training to really understand how they set things up. Plus the Java IDE they make you use I don't like (Eclipsed). I have paid for my own Java IDE (Jetbrains) and can't use it for Duet which really burns me. I wish that they had set it up so that you could add Java code into a netlinx program by calling a lib function similar to calling native functions from Java.
Paul
From memory, AMX suggest that the NIx100 series run code about three times faster than the NIx000's. From your figures it seems like the reality is closer to 2.5 times.
So of the currently available controllers, the 2100, 3100 and 4100's will run code about 2.5x faster than the 700's and 900's.
Merry Christmas peoples *<:-)