I don't follow what the problem with data.text is. It's a buffer like any other, and it's up to the programmer to manipulate it to function correctly, isn't it? If it overflows and data is lost, then that is the programmer's fault.
Someone said that create_buffer is a FIFO buffer. But data.text is as well if you read it as it comes in. The docs say use either data.text or create_buffer so I wonder if they are implemented exactly the same and create_buffer is just there for legacy reasons.
Paul
It is not totally accurate to say CREATE_BUFFER is there for legacy. It does work and act differently and was left on purpose.
And it is possible (and in the case of a web page, probable) to get a hunk larger than 16K in DATA.TEXT before the event technically fires. In other words, DATA.TEXT could already have overrun by the time you get 'round to parsing the data. So, the hunk of data you're trying to retrieve from the website might already be lost before you have time to go sifting through it.
I've run into this myself many times. I just did my own testing to determine that CREATE_BUFFER seems to work better for web page scraping. It was confirmed in our Prog II class a couple weeks back. I didn't know the size limit of DATA.TEXT nor the way it acted when full, and that explained the problem.
You are correct in saying that it's up to the programmer and situation. If the web page you're scraping is less than 16K you have no problems.
To me CREATE_BUFFER has always been kind of a creepy command. You have to put it in the START section and somehow that's never made much sense to me.
I think the reason for this is that the START section isn't processed until all the memory allocations are made, but also is guaranteed to happen before mainline starts running. Anything else would leave you in a position of not being certain everything is in place that needs to be in place when stuff starts happening.
It is not totally accurate to say CREATE_BUFFER is there for legacy. It does work and act differently and was left on purpose.
And it is possible (and in the case of a web page, probable) to get a hunk larger than 16K in DATA.TEXT before the event technically fires. In other words, DATA.TEXT could already have overrun by the time you get 'round to parsing the data. So, the hunk of data you're trying to retrieve from the website might already be lost before you have time to go sifting through it.
I don't see how this can happen if you are emptying the buffer at appropriate times. Of course if your code in the string event is while (find_string(data.text, '</html>', 1)) then yes you can overrun the buffer but that is true under any circumstances with any size buffer. If you don't read any buffer before new writes you will lose data.
How big is the buffer that create_buffer creates? Unless it's infinite, there will always be a webpage out there that is bigger and will cause problems if you try and slurp entire webpages at a time. I have written FTP programs that can transmit huge files, and done it all with a 256kb buffer, so I don't see how a 16kb buffer limits you in any way.
Thanks,
Paul
I wrote a quick webpage on my server that was of known length and hit it using both methods.
The buffer is created by you when you declare it in DEFINE_VARIABLE.
So, you can make it 100K if you want to, or whatever.
DATA.TEXT is not created by you. As I typed this I looked in the Netlinx.axi file and saw that in my case it was actually set to 2K, not even 16K. I suppose one could go in and modify the file and make it bigger. I don't know how this would affect operation at run time, but you could try it and see. If it didn't break it, you could set DATA.TEXT to be whatever you needed it to be as well. I'm not one of those who goes in and does a lot of mods on the Netlinx.axi file. I have my own .axi file for such things.
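For anyone following along, here is a minimal sketch of what "created by you" means in practice. The names dvIPSocket and cIPSocket_Buff and the 65535 size are borrowed from the code listing posted later in this thread; treat this as an illustration, not tested code.

```netlinx
DEFINE_DEVICE
dvIPSocket = 0:4:0                       // local IP port, same numbering as the later listing

DEFINE_VARIABLE
VOLATILE CHAR cIPSocket_Buff[65535]      // the size is whatever you declare here

DEFINE_START
CREATE_BUFFER dvIPSocket,cIPSocket_Buff  // incoming strings from dvIPSocket get appended to the array
```

The point is only that the capacity is set by your own declaration, whereas the size of DATA.TEXT comes from Netlinx.axi.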
I also seem to remember that since DATA.TEXT is in the Netlinx.axi it can be a little more 'processor spendy' than CREATE_BUFFER which happens outside the main runtime thread.
I sound like I'm arguing. I'm not intending to. Both methods are just fine as far as I'm concerned. I guess I was just defending CREATE_BUFFER from the accusation that it is archaic and not to be used.
One of the things I've seen over the years that NetLinx has been around is that when it started there seemed to be a general disdain for more Axcess type programming methods when migrating to NetLinx. You'd hear things like, "don't do that in NetLinx. The new method is way better." Well, over time we've seen that some of those legacy keywords were not as bad as we were told they were and are even a benefit over the more modern NetLinx counterparts. The difference in how the processor handles them was in many cases better overall than the newer method for certain applications.
I just like to use them as they benefit performance. They each have their benefits and downsides. I like having both (or more in some cases) ways of getting from A to B. It offers a lot more flexibility in programming, IMHO...
a_riot42 wrote:
I don't follow what the problem with data.text is. It's a buffer like any other and it's up to the programmer to manipulate it to function correctly, isn't it?
It's not like a buffer, doesn't behave like a buffer and doesn't have the same capacity as a buffer. I was merely trying to state that since the max string size that DATA.TEXT can handle is only 2048, there's a chance that scraping a web page may return a single string in one or more string_events longer than DATA.TEXT can handle, and the incoming data would be truncated.
a_riot42 wrote:
If it overflows and data is lost, then that is the programmer's fault.
Well duh! That was the point, and if you rely solely on DATA.TEXT w/o appending it to a large VAR or buffer that's one potential problem, but even if you do there is the potential of losing data if the incoming string is longer than 2048 because of this DATA.TEXT limitation.
This all may be moot since I ran tests on several different websites as follows:
I started off with Google weather, then RSSweather, then NOAA Weather, MSN.com Homepage and a few others.
Line 4 (20:41:53):: STRING_EVENT DATA.TEXT-1, STRING LENGTH = 1418
Line 5 (20:41:53):: STRING_EVENT BUFFER-1, STRING LENGTH = 1418
Line 6 (20:41:53):: STRING_EVENT DATA.TEXT-2, STRING LENGTH = 1418
Line 7 (20:41:53):: STRING_EVENT BUFFER-2, STRING LENGTH = 1418
Line 8 (20:41:53):: STRING_EVENT DATA.TEXT-3, STRING LENGTH = 1418
Line 9 (20:41:53):: STRING_EVENT BUFFER-3, STRING LENGTH = 1418
Line 10 (20:41:53):: STRING_EVENT DATA.TEXT-4, STRING LENGTH = 1418
Line 11 (20:41:53):: STRING_EVENT BUFFER-4, STRING LENGTH = 1418
Line 12 (20:41:53):: STRING_EVENT DATA.TEXT-5, STRING LENGTH = 1418
Line 13 (20:41:53):: STRING_EVENT BUFFER-5, STRING LENGTH = 1418
Line 14 (20:41:53):: STRING_EVENT DATA.TEXT-6, STRING LENGTH = 1418
Line 15 (20:41:53):: STRING_EVENT BUFFER-6, STRING LENGTH = 1418
Line 16 (20:41:53):: STRING_EVENT DATA.TEXT-7, STRING LENGTH = 1418
Line 17 (20:41:53):: STRING_EVENT BUFFER-7, STRING LENGTH = 1418
Line 18 (20:41:53):: STRING_EVENT DATA.TEXT-8, STRING LENGTH = 1418
Line 19 (20:41:53):: STRING_EVENT BUFFER-8, STRING LENGTH = 1418
Line 20 (20:41:53):: STRING_EVENT DATA.TEXT-9, STRING LENGTH = 1418
Line 21 (20:41:53):: STRING_EVENT BUFFER-9, STRING LENGTH = 1418
Line 22 (20:41:53):: STRING_EVENT DATA.TEXT-10, STRING LENGTH = 1418
Line 23 (20:41:53):: STRING_EVENT BUFFER-10, STRING LENGTH = 1418
Line 24 (20:41:53):: STRING_EVENT DATA.TEXT-11, STRING LENGTH = 1418
Line 25 (20:41:53):: STRING_EVENT BUFFER-11, STRING LENGTH = 1418
Line 26 (20:41:53):: STRING_EVENT DATA.TEXT-12, STRING LENGTH = 1418
Line 27 (20:41:53):: STRING_EVENT BUFFER-12, STRING LENGTH = 1418
Line 28 (20:41:53):: STRING_EVENT DATA.TEXT-13, STRING LENGTH = 1418
Line 29 (20:41:53):: STRING_EVENT BUFFER-13, STRING LENGTH = 1418
Line 30 (20:41:53):: STRING_EVENT DATA.TEXT-14, STRING LENGTH = 1418
Line 31 (20:41:53):: STRING_EVENT BUFFER-14, STRING LENGTH = 1418
Line 32 (20:41:53):: STRING_EVENT DATA.TEXT-15, STRING LENGTH = 1418
Line 33 (20:41:53):: STRING_EVENT BUFFER-15, STRING LENGTH = 1418
Line 34 (20:41:53):: STRING_EVENT DATA.TEXT-16, STRING LENGTH = 1418
Line 35 (20:41:53):: STRING_EVENT BUFFER-16, STRING LENGTH = 1418
Line 36 (20:41:53):: STRING_EVENT DATA.TEXT-17, STRING LENGTH = 1418
Line 37 (20:41:53):: STRING_EVENT BUFFER-17, STRING LENGTH = 1418
Line 38 (20:41:53):: STRING_EVENT DATA.TEXT-18, STRING LENGTH = 1418
Line 39 (20:41:53):: STRING_EVENT BUFFER-18, STRING LENGTH = 1418
Line 40 (20:41:53):: STRING_EVENT DATA.TEXT-19, STRING LENGTH = 1418
Line 41 (20:41:53):: STRING_EVENT BUFFER-19, STRING LENGTH = 1418
Line 42 (20:41:53):: STRING_EVENT DATA.TEXT-20, STRING LENGTH = 1418
Line 43 (20:41:53):: STRING_EVENT BUFFER-20, STRING LENGTH = 1418
Line 44 (20:41:53):: STRING_EVENT DATA.TEXT-21, STRING LENGTH = 1418
Line 45 (20:41:53):: STRING_EVENT BUFFER-21, STRING LENGTH = 1418
Line 46 (20:41:53):: STRING_EVENT DATA.TEXT-22, STRING LENGTH = 1418
Line 47 (20:41:53):: STRING_EVENT BUFFER-22, STRING LENGTH = 1418
Line 48 (20:41:53):: STRING_EVENT DATA.TEXT-23, STRING LENGTH = 1418
Line 49 (20:41:53):: STRING_EVENT BUFFER-23, STRING LENGTH = 1418
Line 50 (20:41:53):: STRING_EVENT DATA.TEXT-24, STRING LENGTH = 1418
Line 51 (20:41:53):: STRING_EVENT BUFFER-24, STRING LENGTH = 1418
Line 52 (20:41:53):: STRING_EVENT DATA.TEXT-25, STRING LENGTH = 1418
Line 53 (20:41:53):: STRING_EVENT BUFFER-25, STRING LENGTH = 1418
Line 54 (20:41:53):: STRING_EVENT DATA.TEXT-26, STRING LENGTH = 1418
Line 55 (20:41:53):: STRING_EVENT BUFFER-26, STRING LENGTH = 1418
Line 56 (20:41:53):: STRING_EVENT DATA.TEXT-27, STRING LENGTH = 1418
Line 57 (20:41:53):: STRING_EVENT BUFFER-27, STRING LENGTH = 1418
Line 58 (20:41:53):: STRING_EVENT DATA.TEXT-28, STRING LENGTH = 1418
Line 59 (20:41:53):: STRING_EVENT BUFFER-28, STRING LENGTH = 1418
Line 60 (20:41:53):: STRING_EVENT DATA.TEXT-29, STRING LENGTH = 1418
Line 61 (20:41:53):: STRING_EVENT BUFFER-29, STRING LENGTH = 1418
Line 62 (20:41:53):: STRING_EVENT DATA.TEXT-30, STRING LENGTH = 1418
Line 63 (20:41:53):: STRING_EVENT BUFFER-30, STRING LENGTH = 1418
Line 64 (20:41:53):: STRING_EVENT DATA.TEXT-31, STRING LENGTH = 1418
Line 65 (20:41:53):: STRING_EVENT BUFFER-31, STRING LENGTH = 1418
Line 66 (20:41:53):: STRING_EVENT DATA.TEXT-32, STRING LENGTH = 1418
Line 67 (20:41:53):: STRING_EVENT BUFFER-32, STRING LENGTH = 1418
Line 68 (20:41:53):: STRING_EVENT DATA.TEXT-33, STRING LENGTH = 1418
Line 69 (20:41:53):: STRING_EVENT BUFFER-33, STRING LENGTH = 1418
Line 70 (20:41:53):: STRING_EVENT DATA.TEXT-34, STRING LENGTH = 1418
Line 71 (20:41:53):: STRING_EVENT BUFFER-34, STRING LENGTH = 1418
Line 72 (20:41:53):: STRING_EVENT DATA.TEXT-35, STRING LENGTH = 1418
Line 73 (20:41:53):: STRING_EVENT BUFFER-35, STRING LENGTH = 1418
Line 74 (20:41:53):: STRING_EVENT DATA.TEXT-36, STRING LENGTH = 1418
Line 75 (20:41:53):: STRING_EVENT BUFFER-36, STRING LENGTH = 1418
Line 76 (20:41:53):: STRING_EVENT DATA.TEXT-37, STRING LENGTH = 1045
Line 77 (20:41:53):: STRING_EVENT BUFFER-37, STRING LENGTH = 1045
Line 4 (20:23:13):: STRING_EVENT DATA.TEXT-1, STRING LENGTH = 1440
Line 5 (20:23:13):: STRING_EVENT BUFFER-1, STRING LENGTH = 1440
Line 6 (20:23:13):: STRING_EVENT DATA.TEXT-2, STRING LENGTH = 1440
Line 7 (20:23:13):: STRING_EVENT BUFFER-2, STRING LENGTH = 1440
Line 8 (20:23:13):: STRING_EVENT DATA.TEXT-3, STRING LENGTH = 1440
Line 9 (20:23:13):: STRING_EVENT BUFFER-3, STRING LENGTH = 1440
Line 10 (20:23:13):: STRING_EVENT DATA.TEXT-4, STRING LENGTH = 1440
Line 11 (20:23:13):: STRING_EVENT BUFFER-4, STRING LENGTH = 1440
Line 12 (20:23:13):: STRING_EVENT DATA.TEXT-5, STRING LENGTH = 1440
Line 13 (20:23:13):: STRING_EVENT BUFFER-5, STRING LENGTH = 1440
Line 14 (20:23:13):: STRING_EVENT DATA.TEXT-6, STRING LENGTH = 1440
Line 15 (20:23:13):: STRING_EVENT BUFFER-6, STRING LENGTH = 1440
Line 16 (20:23:13):: STRING_EVENT DATA.TEXT-7, STRING LENGTH = 1440
Line 17 (20:23:13):: STRING_EVENT BUFFER-7, STRING LENGTH = 1440
Line 18 (20:23:13):: STRING_EVENT DATA.TEXT-8, STRING LENGTH = 1440
Line 19 (20:23:13):: STRING_EVENT BUFFER-8, STRING LENGTH = 1440
Line 20 (20:23:13):: STRING_EVENT DATA.TEXT-9, STRING LENGTH = 1395
Line 21 (20:23:13):: STRING_EVENT BUFFER-9, STRING LENGTH = 1395
Line 22 (20:23:20):: Exiting TCP Read thread - closing this socket for local port 4
Line 4 (20:19:25):: STRING_EVENT DATA.TEXT-1, STRING LENGTH = 1368
Line 5 (20:19:25):: STRING_EVENT BUFFER-1, STRING LENGTH = 1368
Line 6 (20:19:25):: STRING_EVENT DATA.TEXT-2, STRING LENGTH = 1368
Line 7 (20:19:25):: STRING_EVENT BUFFER-2, STRING LENGTH = 1368
Line 8 (20:19:25):: STRING_EVENT DATA.TEXT-3, STRING LENGTH = 1368
Line 9 (20:19:25):: STRING_EVENT BUFFER-3, STRING LENGTH = 1368
Line 10 (20:19:25):: STRING_EVENT DATA.TEXT-4, STRING LENGTH = 1368
Line 11 (20:19:25):: STRING_EVENT BUFFER-4, STRING LENGTH = 1368
Line 12 (20:19:25):: STRING_EVENT DATA.TEXT-5, STRING LENGTH = 1368
Line 13 (20:19:25):: STRING_EVENT BUFFER-5, STRING LENGTH = 1368
Line 14 (20:19:25):: STRING_EVENT DATA.TEXT-6, STRING LENGTH = 1368
Line 15 (20:19:25):: STRING_EVENT BUFFER-6, STRING LENGTH = 1368
Line 16 (20:19:25):: STRING_EVENT DATA.TEXT-7, STRING LENGTH = 1368
Line 17 (20:19:25):: STRING_EVENT BUFFER-7, STRING LENGTH = 1368
Line 18 (20:19:25):: STRING_EVENT DATA.TEXT-8, STRING LENGTH = 1368
Line 19 (20:19:25):: STRING_EVENT BUFFER-8, STRING LENGTH = 1368
Line 20 (20:19:26):: STRING_EVENT DATA.TEXT-9, STRING LENGTH = 1368
Line 21 (20:19:26):: STRING_EVENT BUFFER-9, STRING LENGTH = 1368
Line 22 (20:19:26):: STRING_EVENT DATA.TEXT-10, STRING LENGTH = 788
Line 23 (20:19:26):: STRING_EVENT BUFFER-10, STRING LENGTH = 788
Line 30 (20:11:32):: STRING_EVENT DATA.TEXT-1, STRING LENGTH = 1418
Line 31 (20:11:32):: STRING_EVENT BUFFER-1, STRING LENGTH = 1418
Line 32 (20:11:32):: STRING_EVENT DATA.TEXT-2, STRING LENGTH = 1418
Line 33 (20:11:32):: STRING_EVENT BUFFER-2, STRING LENGTH = 1418
Line 34 (20:11:32):: STRING_EVENT DATA.TEXT-3, STRING LENGTH = 1994
Line 35 (20:11:32):: STRING_EVENT BUFFER-3, STRING LENGTH = 1994
Line 36 (20:11:32):: STRING_EVENT DATA.TEXT-4, STRING LENGTH = 54
Line 37 (20:11:32):: STRING_EVENT BUFFER-4, STRING LENGTH = 54
Line 38 (20:11:32):: STRING_EVENT DATA.TEXT-5, STRING LENGTH = 1994
Line 39 (20:11:32):: STRING_EVENT BUFFER-5, STRING LENGTH = 1994
Line 40 (20:11:32):: STRING_EVENT DATA.TEXT-6, STRING LENGTH = 54
Line 41 (20:11:32):: STRING_EVENT BUFFER-6, STRING LENGTH = 54
Line 42 (20:11:32):: STRING_EVENT DATA.TEXT-7, STRING LENGTH = 1994
Line 43 (20:11:32):: STRING_EVENT BUFFER-7, STRING LENGTH = 1994
Line 44 (20:11:32):: STRING_EVENT DATA.TEXT-8, STRING LENGTH = 54
Line 45 (20:11:32):: STRING_EVENT BUFFER-8, STRING LENGTH = 54
Line 46 (20:11:32):: STRING_EVENT DATA.TEXT-9, STRING LENGTH = 339
Line 47 (20:11:32):: STRING_EVENT BUFFER-9, STRING LENGTH = 339
Line 48 (20:11:32):: STRING_EVENT DATA.TEXT-10, STRING LENGTH = 1418
Line 49 (20:11:32):: STRING_EVENT BUFFER-10, STRING LENGTH = 1418
Line 50 (20:11:32):: STRING_EVENT DATA.TEXT-11, STRING LENGTH = 1418
Line 51 (20:11:32):: STRING_EVENT BUFFER-11, STRING LENGTH = 1418
Line 52 (20:11:32):: STRING_EVENT DATA.TEXT-12, STRING LENGTH = 1418
Line 53 (20:11:32):: STRING_EVENT BUFFER-12, STRING LENGTH = 1418
Line 54 (20:11:32):: STRING_EVENT DATA.TEXT-13, STRING LENGTH = 1418
Line 55 (20:11:32):: STRING_EVENT BUFFER-13, STRING LENGTH = 1418
Line 56 (20:11:32):: STRING_EVENT DATA.TEXT-14, STRING LENGTH = 1418
Line 57 (20:11:32):: STRING_EVENT BUFFER-14, STRING LENGTH = 1418
Line 58 (20:11:32):: STRING_EVENT DATA.TEXT-15, STRING LENGTH = 1418
Line 59 (20:11:32):: STRING_EVENT BUFFER-15, STRING LENGTH = 1418
Line 60 (20:11:32):: STRING_EVENT DATA.TEXT-16, STRING LENGTH = 1418
Line 61 (20:11:32):: STRING_EVENT BUFFER-16, STRING LENGTH = 1418
Line 62 (20:11:32):: STRING_EVENT DATA.TEXT-17, STRING LENGTH = 1418
Line 63 (20:11:32):: STRING_EVENT BUFFER-17, STRING LENGTH = 1418
Line 64 (20:11:32):: STRING_EVENT DATA.TEXT-18, STRING LENGTH = 1418
Line 65 (20:11:32):: STRING_EVENT BUFFER-18, STRING LENGTH = 1418
Line 66 (20:11:32):: STRING_EVENT DATA.TEXT-19, STRING LENGTH = 1418
Line 67 (20:11:32):: STRING_EVENT BUFFER-19, STRING LENGTH = 1418
Line 68 (20:11:32):: STRING_EVENT DATA.TEXT-20, STRING LENGTH = 961
Line 69 (20:11:32):: STRING_EVENT BUFFER-20, STRING LENGTH = 961
As you can see, at no time did an incoming string go over the length that DATA.TEXT can handle. So maybe as long as you append it to a large buffer or VAR you will never have any issues. Is it possible for something to return a single chunk of data bigger than 2048 in one triggered event? I don't know, and the point was to try and find out.
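One way to keep watching for that case in a running system is to flag any string event where DATA.TEXT arrives completely full. This is only a sketch: DATA_TEXT_MAX is a made-up constant set to the 2048 size discussed in this thread, dvIPSocket is the device name from the later code listing, and a full DATA.TEXT only suggests truncation; it does not prove it.

```netlinx
DEFINE_DEVICE
dvIPSocket = 0:4:0

DEFINE_CONSTANT
INTEGER DATA_TEXT_MAX = 2048   // assumed size of DATA.TEXT, per the 2K figure in this thread

DEFINE_EVENT
DATA_EVENT[dvIPSocket]
{
    STRING:
    {
        // a chunk that exactly fills DATA.TEXT may have been cut off
        IF (LENGTH_STRING(DATA.TEXT) >= DATA_TEXT_MAX)
        {
            SEND_STRING 0,"'DATA.TEXT arrived full (',ITOA(LENGTH_STRING(DATA.TEXT)),' bytes), possible truncation'"
        }
    }
}
```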
It's not like a buffer, doesn't behave like a buffer and doesn't have the same capacity as a buffer. I was merely trying to state that since the max string size that DATA.TEXT can handle is only 2048, there's a chance that scraping a web page may return a single string in one or more string_events longer than DATA.TEXT can handle, and the incoming data would be truncated.
Mysterious. It seemed like a buffer to me, considering it's declared like this: CHAR TEXT[2048].
Well duh! That was the point, and if you rely solely on DATA.TEXT w/o appending it to a large VAR or buffer that's one potential problem, but even if you do there is the potential of losing data if the incoming string is longer than 2048 because of this DATA.TEXT limitation.
Sorry for being obvious. When I use data.text, it is only used as a buffer to transfer data from it to another device, another array, or some other data consumer. Since internet packets will be smaller than 2048 bytes, and the firmware likely doesn't send an ACK until the buffer is empty, and that over IP dropped packets are resent, I really can't see how you could lose data using data.text with all the flow control. If you can rig up an example I would love to see it.
Paul
Based on VAV's testing, and knowing that a TCP/IP packet can't be larger than 1500 bytes, I'm guessing that the OS is doing a good amount of work for us, making sure that each TCP/IP packet that comes in generates an event. Since a packet will never be larger than 1500 bytes, a 2k size for DATA.TEXT is fine. DATA.TEXT will never overflow; if it did, it would mean that AMX screwed up some part of their OS.
As far as buffers of any kind in Netlinx - while they are FIFO when you do things like REMOVE_STRING to them, if they fill with data and aren't cleared out, I've been told the same thing - new data coming in destined for a buffer that is already full gets chucked, unlike how it worked in Axcess.
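To illustrate the FIFO point, here is a rough sketch of draining a CREATE_BUFFER array from the front so it never sits full. The device and buffer names are borrowed from the code listing later in this thread; fnHandleLine is a hypothetical placeholder and the CR/LF delimiter is an assumption about the incoming data.

```netlinx
DEFINE_DEVICE
dvIPSocket = 0:4:0

DEFINE_VARIABLE
VOLATILE CHAR cIPSocket_Buff[65535]

// hypothetical consumer, here it just echoes the line to diagnostics
DEFINE_FUNCTION fnHandleLine(CHAR cLine[])
{
    SEND_STRING 0,cLine
}

DEFINE_START
CREATE_BUFFER dvIPSocket,cIPSocket_Buff

DEFINE_EVENT
DATA_EVENT[dvIPSocket]
{
    STRING:
    {
        STACK_VAR CHAR cLine[2048]   // a longer line would be truncated here
        // REMOVE_STRING takes everything up to and including the delimiter
        // off the front of the buffer, oldest data first
        WHILE (FIND_STRING(cIPSocket_Buff,"$0D,$0A",1))
        {
            cLine = REMOVE_STRING(cIPSocket_Buff,"$0D,$0A",1)
            fnHandleLine(cLine)
        }
    }
}
```

Whatever is left after the loop is a partial line that stays in the buffer until the rest of it arrives.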
and knowing that a TCP/IP packet can't be larger than 1500 bytes
I wasn't sure whether Chip's statement above was just an observation from the previous test or an actual fact, so I googled "max TCP packet length".
Result:
Definition: The MTU is the maximum size of a single data unit (e.g., a frame) of digital communications. MTU sizes are inherent properties of physical network interfaces, normally measured in bytes. The MTU for Ethernet, for instance, is 1500 bytes. Some types of networks (like Token Ring) have larger MTUs, and some types have smaller MTUs, but the values are fixed for each physical technology.
I also found this reference for TCP and other media MTU's.
Below is a list of Default MTU size for different media.
Network                    MTU (Bytes)
--------------------------------------
16 Mbit/Sec Token Ring     17914
4 Mbits/Sec Token Ring     4464
FDDI                       4352
Ethernet                   1500
IEEE 802.3/802.2           1492
X.25                       576
*****Added for future reference (Cisco 2960 series switch):
Configurable maximum transmission unit (MTU) of up to 9000 bytes, with a maximum Ethernet frame size of 9018 bytes (Jumbo frames) for bridging on Gigabit Ethernet ports, and up to 1998 bytes for bridging of Multiprotocol Label Switching (MPLS) tagged frames on both 10/100 and 10/100/1000 ports
Now there were 3 instances in the tests where a string event received 1994 chars in the array.
a_riot42 wrote:
Mysterious. It seemed like a buffer to me, considering it's declared like this: CHAR TEXT[2048].
Actually it's an array, but as an array there are certain limits and requirements to collecting data. First of all, as a plain ordinary array you must concatenate the data yourself as it comes in.
Now if you took that same array and then created a buffer, you wouldn't need to concatenate, as that is automatically done for you, and I believe it could then hold up to 65,535 bytes of data.
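A minimal sketch of that manual concatenation, for illustration only. cMyArray is a made-up name, dvIPSocket is the device name from the code listing later in this thread, and the array is deliberately kept under the 15999 concatenation limit mentioned further down.

```netlinx
DEFINE_DEVICE
dvIPSocket = 0:4:0

DEFINE_VARIABLE
VOLATILE CHAR cMyArray[10000]    // plain array, no CREATE_BUFFER on it

DEFINE_EVENT
DATA_EVENT[dvIPSocket]
{
    STRING:
    {
        // append this event's chunk to whatever was collected so far;
        // anything past the declared size of cMyArray is lost
        cMyArray = "cMyArray,DATA.TEXT"
    }
}
```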
So an array isn't a buffer but a buffer is an array with special properties.
Since I have a newfound respect for DATA.TEXT as a result of this thread, I wrote some code for the DATA "STRING_EVENT" that I feel maximizes its potential. This code allows you to set a beginning search string and an ending search string and then collect only the data found between and including the two. Before using DATA.TEXT I would always create a buffer big enough to hold the entire return and then parse when the last of the data was received. I also chose to use a local var array to hold the collected data instead of a buffer, just to minimize the scope and avoid the need to size the buffer big enough to contain the entire return, since I usually only need a small portion of the return to do my parsing. In order to use the local var and still concatenate strings over the 15999 limit if required, I modified the function "ConcatString" posted by AMX_Jeff a while back to work in this code, and it will automatically get used when the data in DATA.TEXT and the previously collected data will total more than 15999.
There are a lot of debug functions commented out in the code, but when initially setting this up to work on a webpage they should be uncommented and the debug var set for testing.
This will likely need additional flags depending on the webpage and the number of times identical strings are found.
This code is complete, so even if you haven't played with web scraping or IP comms before you can have this set up and running on a master in no time at all.
PROGRAM_NAME='IP_Socket'
(***********************************************************)
(* DEVICE NUMBER DEFINITIONS GO BELOW *)
(***********************************************************)
DEFINE_DEVICE
dvMaster = 0:1:0 ;
dvIPSocket = 0:4:0 ;
dvTP = 10001:1:0 ;
(***********************************************************)
(* CONSTANT DEFINITIONS GO BELOW *)
(***********************************************************)
DEFINE_CONSTANT
TCP = 1 ;
CRLF[2] = {$0D,$0A} ;
DEFINE_CONSTANT //SPECIFIC TO THIS SOCKET CONNECTION!!
MAX_RCV_DATA_LENGTH = 30000 ; //set to the max amount of data to be collected.
CHAR SOCKET_IP_ADDRESS[] = 'www.google.com' ;
CHAR BEGINNING_STRING[] = '<!doctype html>' ;
CHAR ENDING_STRING[] = '>About Google<' ;
DEFINE_VARIABLE //GENERAL VARS
VOLATILE INTEGER nIPSocket_Online = 0 ;
//VOLATILE INTEGER nStrEvtCnt = 1 ; //required only when running debug send_string 0's .
NON_VOLATILE INTEGER nIPSocket_DeBug = 0 ;//non volatile to allow debug to work!
//VOLATILE CHAR cIPSocket_Buff[65535] ; //use for testing .
DEFINE_VARIABLE //CH ARRAY
VOLATILE INTEGER nBtnArry[] =
{
1,//open socket
2,
3,
4,
5
}
DEFINE_FUNCTION fnCONNECT_IPSocket()
{
if(!nIPSocket_Online)
{
ip_client_open (dvIPSocket.Port,SOCKET_IP_ADDRESS,80,TCP) ;
fnIPSocket_DeBug("'IP SOCKET "fnCONNECT_IPSocket" - *Opening IP Socket Port* - line-<',ITOA(__LINE__),'>',crlf") ;
}
RETURN ;
}
DEFINE_FUNCTION fnCLOSE_IPSocket()
{
if(nIPSocket_Online)
{
ip_client_close (dvIPSocket.Port) ;
fnIPSocket_DeBug("'IP SOCKET "fnCLOSE_IPSocket" - *Closing IP Socket Port* - line-<',ITOA(__LINE__),'>',crlf") ;
}
RETURN ;
}
(*MODIFIED*)
DEFINE_FUNCTION CHAR[MAX_RCV_DATA_LENGTH]fnConcatString(CHAR sConcatFinal[],CHAR sConcatNew[])
//modified from original
{
STACK_VAR CHAR sPieces[3];
STACK_VAR CHAR sBuffer[MAX_RCV_DATA_LENGTH];
STACK_VAR LONG lPos, lOldPos;
lpos = 1;
VARIABLE_TO_STRING(sConcatFinal,sBuffer,lPos);
sPieces[1] = sBuffer[lPos-3];
sPieces[2] = sBuffer[lPos-2];
sPieces[3] = sBuffer[lPos-1];
lOldPos = lpos;
lpos = lpos - 3;
VARIABLE_TO_STRING(sConcatNew,sBuffer,lPos);
sBuffer[lOldPos-3] = sPieces[1];
sBuffer[lOldPos-2] = sPieces[2];
sBuffer[lOldPos-1] = sPieces[3];
GET_BUFFER_STRING(sBuffer,3)
RETURN sBuffer ;
}
DEFINE_FUNCTION fnParseRcvdData(CHAR iData[]) //FUNCTION IS EMPTY!! NEEDS CODE!!
{
RETURN ;
}
DEFINE_FUNCTION fnIPSocket_DeBug(CHAR iMsg[])
{
if(nIPSocket_DeBug)
{
SEND_STRING dvMaster,iMsg ;
}
}
DEFINE_FUNCTION fnGetWebPage()
{
SEND_STRING dvIPSocket,"'GET /search?hl=en&q=new+york%2C+ny+weather HTTP/1.1',CRLF" ;
SEND_STRING dvIPSocket,"'Host: ',SOCKET_IP_ADDRESS,CRLF" ;
SEND_STRING dvIPSocket,"CRLF" ;
RETURN ;
}
DEFINE_START
//CREATE_BUFFER dvIPSocket,cIPSocket_Buff ;
DEFINE_EVENT
DATA_EVENT [dvIPSocket]
{
ONLINE:
{
fnGetWebPage() ;
nIPSocket_Online = 1 ;
fnIPSocket_DeBug("'IP SOCKET ONLINE_EVENT - *ONLINE* - line-<',ITOA(__LINE__),'>',crlf") ;
}
STRING:
{
LOCAL_VAR CHAR cRcvdData[MAX_RCV_DATA_LENGTH] ;
LOCAL_VAR INTEGER nCollectData ;
STACK_VAR INTEGER nCombStrLen ;
//fnIPSocket_DeBug("'STRING_EVENT DATA.TEXT-',ITOA(nStrEvtCnt),', STRING LENGTH = ',ITOA(LENGTH_STRING(DATA.TEXT))") ;
SELECT
{(*may need further flags if multiple beginning strings are in the returned data!!*)
ACTIVE(FIND_STRING(DATA.TEXT,BEGINNING_STRING,1) && !nCollectData)://just in case there are duplicate strings
{
STACK_VAR INTEGER nFoundEndStr ;
//fnIPSocket_DeBug("'IP SOCKET STRING_EVENT - *BEGINNING STRING FOUND* - line-<',ITOA(__LINE__),'>',crlf") ;
nCollectData = FIND_STRING(DATA.TEXT,BEGINNING_STRING,1) ;
GET_BUFFER_STRING(DATA.TEXT,nCollectData - 1) ;//remove everything before the beginning string.
cRcvdData = DATA.TEXT ;//add remaining data to array.
(*should not execute unless it's a small section of the HTML being collected*)
if(FIND_STRING(cRcvdData,ENDING_STRING,1)) //make sure you didn't get the ending string too!
{
//fnIPSocket_DeBug("'IP SOCKET STRING_EVENT - *ENDING STRING FOUND* - line-<',ITOA(__LINE__),'>',crlf") ;
cRcvdData = REMOVE_STRING(DATA.TEXT,ENDING_STRING,1) ;//just add up to and including the ending_string.
fnParseRcvdData(cRcvdData) ;//data collection complete, start parsing!
fnCLOSE_IPSocket() ;//got what we need so close the port! could just wait for server side shutdown!
nCollectData = 0 ; //we're done so reset to prevent further collections!
cRcvdData = '' ;//we're done so empty and get ready for next time!
nFoundEndStr = 1 ;
}
if(!nFoundEndStr)(*just in case we already got the ending string, don't run*)
{
(*just in case we don't find the ending string later*)
WAIT 100 'RCV DATA TIMEOUT' //timeout in case ending string isn't found.
{
nCollectData = 0 ;//didn't find ENDING_STRING so reset flag!
cRcvdData = '' ;//didn't find last string so dump collected data!
//fnIPSocket_DeBug("'IP SOCKET STRING_EVENT - *ENDING STRING NOT FOUND* - line-<',ITOA(__LINE__),'>',crlf") ;
//fnIPSocket_DeBug("'IP SOCKET STRING_EVENT - *FLAG RESET & COLLECTED DATA DUMPED* - line-<',ITOA(__LINE__),'>',crlf") ;
}
}
}
ACTIVE(FIND_STRING(DATA.TEXT,ENDING_STRING,1) && nCollectData):
{
STACK_VAR CHAR cTmpStr[2048] ;//set above max TCP MTU size (1500). Same max size as DATA.TEXT!
CANCEL_WAIT 'RCV DATA TIMEOUT' ;//ending string found so cancel timeout.
//fnIPSocket_DeBug("'IP SOCKET STRING_EVENT - *ENDING STRING FOUND* - line-<',ITOA(__LINE__),'>',crlf") ;
cTmpStr = REMOVE_STRING(DATA.TEXT,ENDING_STRING,1) ;
nCombStrLen = (length_string(cRcvdData) + length_string(cTmpStr)) ;
if(nCombStrLen < 16000)
{
cRcvdData = "cRcvdData,cTmpStr" ;//add up to and include the ending string to array.
}
else
{
cRcvdData = fnConcatString(cRcvdData,cTmpStr) ;
}
fnParseRcvdData(cRcvdData) ;//data collection complete, start parsing!
fnCLOSE_IPSocket() ;//got what we need so close the port! could just wait for server side shutdown!
nCollectData = 0 ; //we're done so reset to prevent further collections!
cRcvdData = '' ;//we're done so empty and get ready for next time!
}
ACTIVE(nCollectData)://collect data between beginning and ending found strings.
{
nCombStrLen = (length_string(cRcvdData) + length_string(DATA.TEXT)) ;
if(nCombStrLen < 16000)
{
cRcvdData = "cRcvdData,DATA.TEXT" ;//continue adding to array.
}
else
{
cRcvdData = fnConcatString(cRcvdData,DATA.TEXT) ;
}
}
}
//fnIPSocket_DeBug("'STRING_EVENT RCV''D DATA-',ITOA(nStrEvtCnt),', STRING LENGTH = ',ITOA(LENGTH_STRING(cRcvdData))") ;
//nStrEvtCnt ++ ; //only required when running fnIPSocket_DeBug() function in this code block. uncomment in define_var also.
}
OFFLINE:
{
nIPSocket_Online = 0 ;
fnIPSocket_DeBug("'IP SOCKET OFFLINE_EVENT - *OFFLINE* - line-<',ITOA(__LINE__),'>',crlf") ;
}
}
BUTTON_EVENT[dvTP,nBtnArry]
{
PUSH:
{
STACK_VAR INTEGER nBTN ;
nBTN = GET_LAST(nBtnArry) ;
SWITCH(nBTN)
{
CASE 1:
{
fnCONNECT_IPSocket() ;
}
CASE 2:
CASE 3:
CASE 4:
CASE 5:
{
}
}
}
}
(***********************************************************)
(* THE ACTUAL PROGRAM GOES BELOW *)
(***********************************************************)
DEFINE_PROGRAM
(***********************************************************)
(* END OF PROGRAM *)
(* DO NOT PUT ANY CODE BELOW THIS COMMENT *)
(***********************************************************)
I started this post and I am glad of all the responses. I go after data from a website, and the size of that site is 110K. How long do you think it should take to parse out a page of this size? Let's say I was searching for a string of data, "The quick brown fox jumps over the lazy dog", and this string was in the middle of this 110K page. How long should it take to process this page and find this data? I tried many different buffer sizes, from 110K, to 65K, to 2K, and the speed difference is negligible. How would you approach this? Here is how I do it:
You created a buffer size of 110k and it compiled? I thought the cutoff was 65k...
That's what I thought as well. In fact I believe that's what the instructor told me at P3 (NYC) a month or so ago, but then I got an email for this tech note a few days ago:
In my experience, I only use data.text and append the data to a variable if needed. I haven't used CREATE_BUFFER in years and I've not experienced losing any data. I guess for some reason I feel better about having control over when to buffer and when not to buffer.
It's the IP_CLIENT_CLOSE in your function fnConnectToTwitter(). Calling the close function throws an error initially cuz it's already closed. Error 9 = "already closed". Having that there will definitely screw things up cuz you attempt to open by first closing which will throw the error while the connection is connecting but because it threw the error it's going to launch that function again which starts by closing the connection.
I suppose after a certain amount of time you happen to call the function (which calls the close) while the port is still open; then it's happy, doesn't throw an error, and has a chance to complete the opening.
After another look at your code, it looks like all those errors are being caused before your wait 20 executes to open the port. It's telling you it's closed, and then you tell it to close again and again until the wait executes and opens the port. Then it's a matter of timing, and hopefully it won't open right after an error.
I would get rid of the wait 20 too, but add a wait (2-3 minutes) in define_start to call the function. Maybe set a tracking variable in the online event, reset it in the offline event, and then check the variable in the connect function: if not online, connect, that type of thing.
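For what it's worth, here is a rough sketch of that tracking-variable idea. The device name dvIPClient, the port 0:3:0, the host 'example.com' and both flag names are made up for illustration and are not from the original program. The second "connecting" flag is my own addition so that repeated calls don't stack up opens while one is still pending.

```netlinx
DEFINE_DEVICE
dvIPClient = 0:3:0                    // hypothetical local IP socket

DEFINE_VARIABLE
VOLATILE INTEGER nClientOnline        // 1 while the socket is connected
VOLATILE INTEGER nClientConnecting    // 1 while an open is pending (my addition)

DEFINE_FUNCTION fnConnect()
{
    // Only open when nothing is connected or pending; never close first
    IF (!nClientOnline && !nClientConnecting)
    {
        nClientConnecting = 1
        IP_CLIENT_OPEN(dvIPClient.PORT, 'example.com', 80, IP_TCP)
    }
}

DEFINE_EVENT
DATA_EVENT[dvIPClient]
{
    ONLINE:
    {
        nClientOnline = 1
        nClientConnecting = 0
    }
    OFFLINE:
    {
        nClientOnline = 0
        nClientConnecting = 0
    }
    ONERROR:
    {
        // DATA.NUMBER holds the error code; clear the pending flag so a later retry can run
        nClientConnecting = 0
    }
}
```

With this in place the connect function can be called from a button or a timed retry as often as you like, and it only opens the socket when it is actually needed.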
I've been trying to follow this post and get my first web page scraping to work for the yahoo weather, but you guys are lightyears ahead of me! Can we back up a little bit (I mean all the way to the beginning...)? I am looking for the basic functions of an IP connection/web page scrape. This is what I believe I need:
I need devices to be Identified:
dvIPWeather = 0:2:0 //Yahoo Weather API IP Connection
dvTP = 10001:1:0
I need variables:
Optional - a var to set a flag indicating the status of my IP connection
volatile integer nWeatherOnline
Required - a var (buffer) to place the page scrapings into
CONSTANT CHAR dvWeather[10000]
A function or call to start the IP connection
DEFINE_FUNCTION fcnWeather() //Weather polling function
{
IP_CLIENT_OPEN (dvIPWeather.PORT,'weather.yahooapis.com',80,1) ;//yahoo site
}
A Data Event to handle the connect, disconnect, and strings:
I'm not sure why I decided to put a constant variable under define_variable...I think I saw a previous post about doing this. That fixed the compiling errors, but is there a way for me to debug this step by step? I tried viewing the string char in the watch tab, but its length doesn't change from zero. Is there another item I can view in the watch bar for this?
So what about my notifications/debug windows? Is there something I can do to see whether or not an IP connection is actually made/disconnected? When I execute the function I created to run the IP_CLIENT_OPEN, I see no notifications other than the button press and release. I have added the DPS (0:2:0 to start, now at 0:3:0) to device notifications.
I have started to figure out the telnet window in netlinx studio, and turned on extended diagnostics. (Man it sure makes it tough when everyone's eating dinner or enjoying themselves in the evening! ;-/ ) I see the following info when I run the IP_Client_Open:
>
(0000390084) Connected Successfully
(0000390087) CIpEvent::OnLine 0:3:1
(0000390154) STRING dvIPWeatherBuff----------
(0000390158) Exiting TCP Read thread - closing this socket for local port 3
(0000390158) STRING dvIPWeatherBuff----------
(0000390159) CIpEvent::OffLine 0:3:1
What is 0:3:1? I specified 0:3:0 in my programming.
It appears that my data event is partially running - I see the STRING portion, but not the ONLINE. If I add the send string data from my online event:
I get the following result in my telnet window when executing:
(0000100735) Connected Successfully
(0000100737) CIpEvent::OnLine 0:3:1
(0000100816) STRING dvIPWeatherBuff----------
(0000100819) Exiting TCP Read thread - closing this socket for local port 3
(0000100820) STRING dvIPWeatherBuff----------
(0000100820) CIpEvent::OffLine 0:3:1
(0000101170) SendString to socket-local port (3) invalid
(0000101171) CIpEvent::OnError 0:3:1
The OnError I added just to see the word error pop up on my touchscreen. I'm assuming that happened because the socket was already closed when I tried to send the string after the initial connection, probably from the double CRLFs.
If I am off-base here and the online is indeed running, shouldn't I be able to print out the info in my buffer? I assume I need to set my buffer equal to data.text.
Ok, so it appears that my GET request was incorrect, where I needed to include the full URL length. Is the only way to determine this going to be just "trying" different ways to connect to the website? Can I test this in the Netlinx telnet session at all?
Now that I actually have a buffer full of data (1053 characters to be precise), I am trying to parse it. I have a select..active statement within my STRING: event:
And I want to print the resulting temperature number into my touchpanel's button. Once I have found the string I want (which I am successfully doing), I want to set another char variable (or in the case of temperature, I guess an integer would work better) to the text found. How can I do that without knowing the exact starting location in the string?
FIND_STRING will return an integer specifying the location of the first character of the sequence within the string you are searching; you can then use that to do the rest of your logic.
In this case, I'm looking for temp=, and specifically the two or three numbers behind the equal sign. So if Find_String returns an integer (say 155) which is the start of the item temp=98, do I use Remove_String to then remove the 98? (that would put the string number starting at character 160, right?)
You could do a remove string to get rid of everything before it and then grab a couple of characters, or you could use MID_STRING to extract the characters you are interested in.
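Here is a minimal sketch of the MID_STRING route, assuming the data really contains the literal text temp= followed by the digits. The function name fnGetTemp and its parameter are hypothetical. With FIND_STRING returning 155, the value would start at 155 + 5 = 160, which matches the arithmetic in the question.

```netlinx
DEFINE_FUNCTION SINTEGER fnGetTemp(CHAR sData[])
{
    STACK_VAR INTEGER nPos
    STACK_VAR INTEGER nCount
    STACK_VAR CHAR sChunk[8]

    nPos = FIND_STRING(sData, 'temp=', 1)        // index of the 't' in temp=
    IF (nPos == 0)
    {
        RETURN 0                                 // marker not found
    }

    // Characters left after 'temp=' (the marker is 5 characters long)
    nCount = LENGTH_STRING(sData) - nPos - 4
    IF (nCount == 0)
    {
        RETURN 0                                 // nothing follows the marker
    }
    IF (nCount > 8)
    {
        nCount = 8                               // a few characters are plenty
    }

    sChunk = MID_STRING(sData, nPos + 5, nCount)

    // ATOI skips leading non-digits (such as a quote) and stops after the digits
    RETURN ATOI(sChunk)
}
```

Note that a return of 0 is ambiguous here (not found versus a real reading of 0), so a separate "found" flag may be worth adding. Unlike REMOVE_STRING, this leaves the buffer untouched, so the same data can be searched again for other fields.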
I needed a quick break from what I was working on so here's a present:
(**
* Gets the value of the first instance of an XML element attribute within the
* passed string.
*
* @param sString the string to search
* @param sAttribute a string specifying the attribute to look up
* @return a string containing the value of the attribute
*)
DEFINE_FUNCTION CHAR[32] fnGetXMLAttributeValue(CHAR sString[], CHAR sAttribute[]) {
    STACK_VAR INTEGER nAttrLocation
    STACK_VAR INTEGER nStartIndex
    STACK_VAR INTEGER nEndIndex
    STACK_VAR CHAR sValue[32]

    // Find the location of the attribute (leading space, then name, then =")
    nAttrLocation = FIND_STRING(sString, "' ',sAttribute,'="'", 1)
    IF (nAttrLocation == 0) {
        RETURN ''   // attribute not present
    }

    // First character of the value: skip the space, the name, the = and the opening quote
    nStartIndex = nAttrLocation + LENGTH_STRING(sAttribute) + 3
    // Index of the closing quote
    nEndIndex = FIND_STRING(sString, "'"'", nStartIndex)
    IF (nEndIndex == 0) {
        RETURN ''   // value is not terminated
    }

    // Extract the value (everything between the two quotes)
    sValue = MID_STRING(sString, nStartIndex, nEndIndex - nStartIndex)
    RETURN sValue
}
It's completely untested but it should help you out. Just call it like so...
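The call example did not make it into this copy of the thread. A hypothetical call might look something like the sketch below, assuming strWeatherBuff is the receive buffer, dvTP is the touch panel, and address 1 is the button that shows the temperature; none of those are confirmed by the original post, and it relies on fnGetXMLAttributeValue from above.

```netlinx
DEFINE_FUNCTION fnShowTemp()
{
    STACK_VAR CHAR sTemp[32]

    // Look up the value of the first temp="..." attribute in the buffer
    sTemp = fnGetXMLAttributeValue(strWeatherBuff, 'temp')

    // Only update the panel when something was actually found
    IF (LENGTH_STRING(sTemp))
    {
        SEND_COMMAND dvTP, "'^TXT-1,0,', sTemp"
    }
}
```

If you need the number rather than the text, ATOI(sTemp) should give you an integer to work with.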
Here's an example of find_string that uses the starting position returned by the find_string call. In this example the returned format is always the same, so I use the Pac-Man approach and gobble up and spit out the data as I parse through it.
stack_var integer nFBS
stack_var integer n
stack_var integer i

if (find_string(iRSSBuf,"'<title>',sCity.city",1))
{   // Kill the function around (time) if you prefer 24 hour time format
    sRSS.lastupdate = fn24TimeTo12(time)
    nFBS = find_string(iRSSBuf,"'<title>',sCity.city",1)
    if (nFBS)
    {
        cRSSTrash = get_buffer_string(iRSSBuf, nFBS + 6)
        nFBS = find_string(iRSSBuf,'</title>',1)
        sRSS.title = get_buffer_string(iRSSBuf, nFBS - 1)
        i ++
    }
    else
    {
        sRSS.title = 'N/A'
    }
    nFBS = find_string(iRSSBuf,'/zipcode/',1)
    if (nFBS)
    {
        cRSSTrash = remove_string(iRSSBuf,'/zipcode/',1)
        nFBS = find_string(iRSSBuf,'/',1)
        sRSS.rcvdzip = get_buffer_string(iRSSBuf, nFBS - 1)
        i ++
    }
    else
    {
        sRSS.rcvdzip = 'N/A'
    }
    nFBS = find_string(iRSSBuf,"'<title>',sCity.city",1)
    if (nFBS)
    {
        cRSSTrash = get_buffer_string(iRSSBuf, nFBS + 6)
        nFBS = find_string(iRSSBuf,'</title>',1)
        sRSS.overview = get_buffer_string(iRSSBuf, nFBS - 1)
        i ++
    }
    else
    {
        sRSS.overview = 'N/A'
    }
Here's another example using find_string that I use for returns whose order of info may vary depending on the browser used. Here I leave the return intact & copy out the desired data because data didn't always appear in the same order or didn't always contain all the same info. So if the data's there I'll get it, if not I clear the previous returned values and move on.
nFBS = find_string(iHTML_Head,"' HTTP/'",1) ;
if(nFBS)
{
    cFileName = GET_BUFFER_STRING(iHTML_Head,nFBS -1) ;
    nFBS = find_string(iHTML_Head,"CRLF",1) ;
    if(nFBS)
    {
        cHTTPVersion = GET_BUFFER_STRING(iHTML_Head,nFBS -1) ;
        if(cHTTPVersion[1] == "' '")
        {
            GET_BUFFER_CHAR(cHTTPVersion) ;
        }
        ////////////////////////////////////////////////////////////////////////////////////////////////////
        /// FROM HERE ON WE LEAVE THE STRING INTACT AND JUST COPY WHAT WE'RE LOOKING FOR SINCE THE ORDER //
        /// OF THE ITEMS WE'RE LOOKING FOR MAY CHANGE DEPENDING ON BROWSER CONNECTED !!!! //
        ////////////////////////////////////////////////////////////////////////////////////////////////////
        nFBS = find_string(iHTML_Head,"'User-Agent: '",1) ;
        if(nFBS)
        {
            nFBS_2 = find_string(iHTML_Head,"CRLF",nFBS + 12) ;
            nWebAgent = fnFindUserAgent(MID_STRING(iHTML_Head,nFBS + 12,nFBS_2 - (nFBS + 12))) ;//add one
        }
        else//leave in in case we set above variable to "LOCAL_VAR" for testing.
        {
            nWebAgent = WINGENERIC ;//use as default
        }
        nFBS = find_string(iHTML_Head,"'Referer: '",1) ;
        if(nFBS)
        {
            nFBS_2 = find_string(iHTML_Head,"CRLF",nFBS + 9) ;
            cReferer = MID_STRING(iHTML_Head,nFBS + 9,nFBS_2 - (nFBS + 9)) ;//add one
        }
        else//leave in in case we set above variable to "LOCAL_VAR" for testing.
        {
            cReferer = '' ;
        }
        nFBS = find_string(iHTML_Head,"'Content-Type: '",1) ;
        if(nFBS)
        {
            nFBS_2 = find_string(iHTML_Head,"CRLF",nFBS + 14) ;
            cContentType = MID_STRING(iHTML_Head,nFBS + 14,nFBS_2 - (nFBS + 14)) ;//add one
        }
        else//leave in in case we set above variable to "LOCAL_VAR" for testing.
        {
            cContentType = '' ;
        }
        nFBS = find_string(iHTML_Head,"'Connection: '",1) ;
        if(nFBS)
        {
            nFBS_2 = find_string(iHTML_Head,"CRLF",nFBS + 12) ;
            cConnection = MID_STRING(iHTML_Head,nFBS + 12,nFBS_2 - (nFBS + 12)) ;//add one
        }
        else//leave in in case we set above variable to "LOCAL_VAR" for testing.
        {
            cConnection = '' ;
        }
        nFBS = find_string(iHTML_Head,"'Cache-Control: '",1) ;
Comments
I don't see how this can happen if you are emptying the buffer at appropriate times. Of course if your code in the string event is while (find_string(data.text, '</html>', 1)) then yes you can overrun the buffer but that is true under any circumstances with any size buffer. If you don't read any buffer before new writes you will lose data.
How did you determine this?
How big is the buffer that create_buffer creates? Unless it's infinite, there will always be a webpage out there that is bigger and will cause problems if you try to slurp entire webpages at a time. I have written FTP programs that can transmit huge files, and done it all with a 256kb buffer, so I don't see how a 16kb buffer limits you in any way.
Thanks,
Paul
I wrote a quick webpage on my server that was of known length and hit it using both methods.
The buffer is created by you when you declare the variable in DEFINE_VARIABLE, so you can make it 100K or whatever you want.
DATA.TEXT is not created by you. As I typed this I looked in the Netlinx.axi file and saw that in my case it was actually set to 2K, not even 16K. I suppose one could go in and modify the file and make it bigger. I don't know how this would affect operation at run time, but you could try it and see. If it didn't break anything, you could set DATA.TEXT to be whatever you needed it to be as well. I'm not one of those who goes in and does a lot of mods on the Netlinx.axi file. I have my own .axi file for such things.
I also seem to remember that since DATA.TEXT is in the Netlinx.axi it can be a little more 'processor spendy' than CREATE_BUFFER which happens outside the main runtime thread.
I sound like I'm arguing. I'm not intending to. Both methods are just fine as far as I'm concerned. I guess I was just defending CREATE_BUFFER from the accusation that it is archaic and not to be used.
One of the things I've seen over the years that NetLinx has been around is that when it started there seemed to be a general disdain for more Axcess-type programming methods when migrating to NetLinx. You'd hear things like, "don't do that in NetLinx. The new method is way better." Well, over time we've seen that some of those legacy keywords were not as bad as we were told they were and are even a benefit over the more modern NetLinx counterparts. The difference in how the processor handles them was in many cases better overall than the newer method for certain applications.
I just like to use them as they benefit performance. They each have their benefits and downsides. I like having both (or more in some cases) ways of getting from A to B. It offers a lot more flexibility in programming, IMHO...
a_riot42 wrote: Well duh! That was the point, and if you rely solely on DATA.TEXT without appending it to a large VAR or buffer, that's one potential problem; but even if you do, there is the potential of losing data if the incoming string is longer than 2048 because of this DATA.TEXT limitation.
This all may be moot since I ran tests on several different websites as follows:
I started off with Google weather, then RSSweather, then NOAA Weather, the MSN.com homepage and a few others. As you can see, at no time did an incoming string go over the length that DATA.TEXT can handle. So maybe as long as you append it to a large buffer or VAR you will never have any issues. Is it possible for something to return a single chunk of data bigger than 2048 in one triggered event? I don't know, and the point was to try and find out.
Which brings up the question, has anyone ever been screwed by incoming data being truncated because of the DATA.TEXT initialized size?
Mysterious. It seemed like a buffer to me, considering it's declared like this: CHAR TEXT[2048].
Sorry for being obvious. When I use data.text, it is only used as a buffer to transfer data from it to another device, another array, or some other data consumer. Since internet packets will be smaller than 2048 bytes, and the firmware likely doesn't send an ACK until the buffer is empty, and that over IP dropped packets are resent, I really can't see how you could lose data using data.text with all the flow control. If you can rig up an example I would love to see it.
Paul
As far as buffers of any kind in Netlinx - while they are FIFO when you do things like REMOVE_STRING to them, if they fill with data and aren't cleared out, I've been told the same thing - new data coming in destined for a buffer that is already full gets chucked, unlike how it worked in Axcess.
- Chip
Result: I also found this reference for TCP and other media MTUs. There were 3 instances in the tests where a string event received 1994 chars in the array.
http://amxforums.com/showthread.php?t=3410&highlight=15999
Now if you took that same array and then created a buffer, you wouldn't need to concatenate, as that is automatically done for you, and I believe it could then hold up to 65,535 bytes of data.
So an array isn't a buffer but a buffer is an array with special properties.
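To make the difference concrete, here is a small sketch showing both approaches side by side. The device dvIPWeb, the port 0:4:0, the variable names and the 16384 size are all placeholders of mine; in a real program you would pick one approach, not both. It also assumes the server closes the socket when the response is complete (as the telnet logs in this thread suggest for HTTP/1.0 requests), so OFFLINE is used as the "all data received" signal.

```netlinx
DEFINE_DEVICE
dvIPWeb = 0:4:0                       // hypothetical local IP socket

DEFINE_VARIABLE
VOLATILE CHAR sWebBuf[16384]          // filled automatically via CREATE_BUFFER
VOLATILE CHAR sWebCopy[16384]         // filled by hand from DATA.TEXT

DEFINE_START
CREATE_BUFFER dvIPWeb, sWebBuf        // incoming strings get appended to sWebBuf

DEFINE_EVENT
DATA_EVENT[dvIPWeb]
{
    STRING:
    {
        // Manual alternative: append each chunk yourself as it arrives
        sWebCopy = "sWebCopy, DATA.TEXT"
        // sWebBuf has already received the same bytes through CREATE_BUFFER
    }
    OFFLINE:
    {
        // The server has closed the socket, so the reply should be complete.
        // Parse sWebBuf (or sWebCopy) here, then empty both for the next request.
        CLEAR_BUFFER sWebBuf
        CLEAR_BUFFER sWebCopy
    }
}
```

Either way, the array has to be emptied at some point, otherwise it eventually fills up and new data gets dropped, as discussed above.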
There are a lot of debug functions commented out in the code, but when initially setting this up to work on a webpage they should be uncommented and the debug var set for testing.
This will likely need additional flags depending on the webpage and the number of times identical strings are found.
This code is complete, so even if you haven't played with web scraping or IP comms before you can have this set up and running on a master in no time at all.
volatile char strWeatherBuff[16384]

IP_CLIENT_OPEN(dvIP_Weather.PORT,'www.weather.com',80,IP_TCP);

send_string dvIP_Weather,"'GET http://www.weather.com/weather/wxclimatology/monthly/graph/48316?from=36hr_bottomnav_undeclared HTTP/1.0',13,10,13,10";

DEFINE_FUNCTION parseWeather()
{
    nFindLoc = find_string(strTemp,'<title>',1);
    if(nFindLoc > 0)
    {
        // ... string found, search completed
    }
}

STRING:
{
    if(length_string(strWeatherBuff) > 0)
    {
        parseWeather();
    }
}
- Chip
That's what I thought as well. In fact I believe that's what the instructor told me at P3 (NYC) a month or so ago, but then I got an email for this tech note a few days ago:
http://www.amx.com/techsupport/technote.asp?id=886
which has a line that says:
So I'm just confused again!
I made it so that if the connection fails it will retry. When the system starts up it fails about 15 times and then connects. Does anyone know why this is?
Here is my telnet log:
And a button press to initiate the function:
After all this, I am getting an error in Netlinx;
Starting NetLinx Compile - Version[2.5.2.20] [09-28-2009 22:42:38]
C:\laptop backup\RobsAMXFiles\JVC\JVC LT37X898 Main.axs
ERROR: C:\laptop backup\RobsAMXFiles\JVC\JVC LT37X898 Main.axs(581): C10540: Illegal expression, node type [-1]
ERROR: C:\laptop backup\RobsAMXFiles\JVC\JVC LT37X898 Main.axs(0): C10541: Illegal operator in expression, node type [-1]
ERROR: (0): C10580: Internal Error: Major system error occurred during code generation
C:\laptop backup\RobsAMXFiles\JVC\JVC LT37X898 Main.axs - 3 error(s), 0 warning(s)
NetLinx Compile Complete [09-28-2009 22:42:39]
What am I missing? I have tried to drag my constant array into my debug window, but its length always stays at zero.
change to:
volatile char dvWeather[10000] ;
Have a look at this code, and then play around. You will be able to build up on this logic framework, and, maybe, spot your mistakes.
Now just call "reload" at a button press or time-based event... Of course, you will have to develop a parsing algorithm (I will leave that to you), and replace the city code (this is for Cape Town) with your own.