connecting to a web site
samos
How do you connect to a web site so that you can retrieve HTML from it? If I wanted to connect to, say, www.twitter.com, do I call ip_client_open on www.twitter.com, and then what? Where can I find information on making these types of connections?
Comments
There's a lot of discussion on the forum about this. Just search for web page scraping or HTML and I'm sure you'll find what you need.
The basics are that you use IP_CLIENT_OPEN and, upon the online event, send the web server a request that spoofs a web browser. The connection closes right after the reply.
Replies come back as data events on whichever NetLinx port you set up for the comm.
Some things to know.
You probably need to use CREATE_BUFFER instead of DATA.TEXT for this, since DATA.TEXT has a size limit that can catch you out on some web pages. The built-in limit is 2K. With CREATE_BUFFER you can set the size quite a bit larger, depending on the web site. There are ways you can use DATA.TEXT; I'm sure someone with a burr under their saddle will mention how.
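To illustrate why a buffer helps: a reply usually arrives in several chunks, so you collect them and only parse once the blank line that ends the HTTP headers has shown up. Here is a rough sketch in Python (not NetLinx, just to show the idea). The `ResponseBuffer` class and its `max_size` cap are made up for this example; the cap merely stands in for the size you would declare for the buffer.

```python
class ResponseBuffer:
    """Collects chunks of a reply up to a fixed cap (hypothetical helper)."""

    def __init__(self, max_size=16384):
        self.max_size = max_size
        self.data = bytearray()

    def feed(self, chunk):
        # Keep only what still fits; anything past the cap is dropped.
        room = self.max_size - len(self.data)
        self.data.extend(chunk[:room])

    def split(self):
        # HTTP headers end at the first blank line (CR LF CR LF).
        head, sep, body = bytes(self.data).partition(b"\r\n\r\n")
        if not sep:
            return None  # headers not complete yet
        return head, body


buf = ResponseBuffer(max_size=64)
buf.feed(b"HTTP/1.1 200 OK\r\nContent-")      # first chunk
buf.feed(b"Type: text/html\r\n\r\n<html>")     # second chunk
print(buf.split())
```

The point is only that parsing happens on the accumulated data, not on each individual chunk.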
You cannot do some of the things a web browser can, like Flash animation, DirectX, SSL, etc. Also, websites with a lot of frames can goof things up. If a website changes its format often, you'll drive yourself mad trying to keep up.
Scraping data from a website is just good old-fashioned string parsing. (Hashing can work too.)
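As a small example of that kind of string parsing, here is a sketch in Python (again only for illustration) that pulls out the text sitting between two known markers. The function name `text_between` and the sample page are invented for this example.

```python
def text_between(page, start, end):
    """Return the text between the first `start` marker and the next `end`."""
    i = page.find(start)
    if i == -1:
        return None          # start marker not on the page
    i += len(start)
    j = page.find(end, i)
    if j == -1:
        return None          # end marker missing after the start marker
    return page[i:j].strip()


page = '<div><span id="temp"> 72 F </span></div>'
print(text_between(page, '<span id="temp">', '</span>'))
```

This is also why a site that changes its layout often is painful: the markers you search for stop matching.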
If you're trying to get raw data from a site (like local time, temperature, stock quotes or whatever), you might look into scraping RSS feeds instead. They're built a bit more for our kind of use.
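To show why feeds are easier to deal with, here is a sketch in Python that reads the item titles out of an RSS 2.0 document. The feed content is invented sample data and `item_titles` is a name made up for this example; a real feed would come back from the server as the body of the reply.

```python
import xml.etree.ElementTree as ET

# Invented sample feed, standing in for what a server would return.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example weather</title>
    <item><title>Now: 72 F</title><link>http://example.com/now</link></item>
    <item><title>Tonight: 58 F</title><link>http://example.com/tonight</link></item>
  </channel>
</rss>"""


def item_titles(feed_xml):
    """Return the <title> text of every <item> in the feed."""
    root = ET.fromstring(feed_xml)
    return [item.findtext("title", default="") for item in root.iter("item")]


print(item_titles(SAMPLE_FEED))
```

Because every item has the same tags, the parsing does not depend on how the site's pages happen to be laid out.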
Hope that helps.
Thanks for all of the information. I know how to do just about everything you talked about except how to send a spoof of a web browser to the web server. Does anyone have some example code showing how to accomplish it?
In the case of twitter you will probably be interested in having a look around here: http://apiwiki.twitter.com/.
Step 1: ip_client_open to www.twitter.com.
What do I send to the web server besides the URL request? How do I spoof a browser?
Also http://www.amxforums.com/showthread.php?t=4406 may be of interest to you.
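As for what to send besides the URL: "spoofing a browser" mostly means sending a plain HTTP GET request with a few of the headers a browser would send. Below is a sketch in Python that only builds the bytes. The function name, the User-Agent string and the header selection are assumptions made for this example, not something a particular site requires. In NetLinx you would send the same text with SEND_STRING from the online event, with each line ended by bytes 13,10 and a blank line at the very end.

```python
def build_get_request(host, path="/",
                      user_agent="Mozilla/5.0 (compatible; ExampleClient/1.0)"):
    """Build a minimal HTTP/1.1 GET request as bytes (illustrative only)."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",              # required by HTTP/1.1
        f"User-Agent: {user_agent}",  # the "browser spoof" part
        "Accept: text/html",
        "Connection: close",          # ask the server to close after replying
    ]
    # Every header line ends in CR LF; an empty line ends the request.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


print(build_get_request("www.example.com", "/index.html").decode("ascii"))
```

With `Connection: close` the server hangs up once the reply is sent, which matches the behaviour described earlier in the thread.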
I made it so that if the connection fails it will retry. When the system starts up it fails about 15 times and then connects. Does anyone know why this is?
Here is my telnet log
I was not trying to leave it open. I just tried to open it, but it returned an error code in the onerror event, so I wrote code to try to open it again if it failed.
When I ran the code, the ip_client_open function failed about 30 times (triggering the onerror event each time) before it finally worked and connected. I just wanted to know why it failed to open the connection so many times and then finally worked and fired the online event.
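For what it's worth, the retry logic described above can be sketched like this in Python. It does not explain the failures, but recording the error code on every attempt would at least show which code is coming back, and a pause between attempts avoids hammering the connection. `open_with_retry`, `open_fn`, and the attempt count and delay are all invented for this example; `open_fn` stands in for whatever call opens the connection and is assumed to return 0 on success or an error code otherwise.

```python
import time


def open_with_retry(open_fn, attempts=5, delay=2.0, sleep=time.sleep):
    """Call open_fn until it returns 0, pausing between failed attempts.

    Returns (connected, errors), where errors lists the codes seen.
    """
    errors = []
    for _ in range(attempts):
        code = open_fn()
        if code == 0:
            return True, errors
        errors.append(code)   # keep the code so the failure can be diagnosed
        sleep(delay)          # wait before trying again
    return False, errors


# Simulated opener: fails twice with code 7, then succeeds.
results = iter([7, 7, 0])
print(open_with_retry(lambda: next(results), sleep=lambda s: None))
```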
You can find the error codes spread throughout the documentation in NetLinx Studio, but to help out here's a list of all of them:
-3: unable to open communication port
-2: invalid value for protocol
-1: invalid server port
2: general failure (out of memory)
4: unknown host
6: connection refused
7: connection timed out
8: unknown connection error
9: port already closed
10: binding error
11: listening error
14: local port already in use
15: UDP socket already listening
16: too many open sockets
17: local port not open
Obviously some of these will never be caused by opening a connection as a client.
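If you want readable messages in your log instead of bare numbers, the list above drops straight into a lookup table. Here it is sketched in Python for illustration; the names `IP_ERRORS` and `describe_ip_error` are made up for this example, and the descriptions are copied from the list.

```python
# Error codes and descriptions as listed above.
IP_ERRORS = {
    -3: "unable to open communication port",
    -2: "invalid value for protocol",
    -1: "invalid server port",
    2: "general failure (out of memory)",
    4: "unknown host",
    6: "connection refused",
    7: "connection timed out",
    8: "unknown connection error",
    9: "port already closed",
    10: "binding error",
    11: "listening error",
    14: "local port already in use",
    15: "UDP socket already listening",
    16: "too many open sockets",
    17: "local port not open",
}


def describe_ip_error(code):
    """Return the description for a code, or a fallback for unlisted ones."""
    return IP_ERRORS.get(code, f"unrecognized error code {code}")


print(describe_ip_error(7))
```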