connecting to a web site

How do you connect to a web site so that you can retrieve HTML from it? Say I wanted to connect to www.twitter.com: do I open an ip_client_open to www.twitter.com, and then what? Where can I find information on making these types of connections?

Comments

  • ericmedley Posts: 4,177
    samos wrote: »
    How do you connect to a web site so that you can retrieve HTML from it? Say I wanted to connect to www.twitter.com: do I open an ip_client_open to www.twitter.com, and then what? Where can I find information on making these types of connections?

    There's a lot of discussion on the forum about it. Just search for web page scraping or HTML and I'm sure you'll find what you need.

    The basics are that you use IP_CLIENT_OPEN and, upon the online event, send the web server a request that mimics a web browser. The connection closes right after the reply.

    Replies come back as DATA_EVENTs on whichever NetLinx port you set up for comm.

    Some things to know.

    You probably need to use CREATE_BUFFER instead of DATA.TEXT for this, since DATA.TEXT has a size limit that can catch you out on some web pages. The built-in limit is 2K. With CREATE_BUFFER you can set the size quite a bit larger, depending on the web site. There are ways you can use DATA.TEXT. I'm sure someone with a burr under their saddle will mention how. :D
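
    As a rough illustration of the flow described above (open, send on the online event, collect the reply in a buffer), here is a minimal sketch. The device number, the host www.example.com, the buffer size, and the function name fnFetchPage are all placeholders chosen for the example, not anything a particular site requires.

    ```netlinx
    PROGRAM_NAME='web_buffer_sketch'

    DEFINE_DEVICE
    dvWeb = 0:3:0                        // hypothetical local IP port

    DEFINE_CONSTANT
    CHAR WEB_HOST[] = 'www.example.com'  // placeholder host, not from this thread

    DEFINE_VARIABLE
    VOLATILE CHAR cWebBuffer[16384]      // assumed size; raise it for larger pages

    DEFINE_FUNCTION fnFetchPage()
    {
        CLEAR_BUFFER cWebBuffer          // discard any previous reply
        IP_CLIENT_OPEN(dvWeb.PORT, WEB_HOST, 80, IP_TCP)
    }

    DEFINE_START
    CREATE_BUFFER dvWeb, cWebBuffer      // incoming strings are appended here

    DEFINE_EVENT
    DATA_EVENT[dvWeb]
    {
        ONLINE:
        {
            // HTTP/1.0 request; the blank line (second CR/LF) ends the headers
            SEND_STRING dvWeb, "'GET / HTTP/1.0',13,10,'Host: ',WEB_HOST,13,10,13,10"
        }
        OFFLINE:
        {
            // The socket has closed, so whatever arrived is now in the buffer
            SEND_STRING 0, "'received ',ITOA(LENGTH_STRING(cWebBuffer)),' bytes'"
        }
    }
    ```

    Call fnFetchPage() from a button push or a timer. Because the buffer is filled by the master, no STRING handler is needed just to collect the data.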

    You cannot do some of the things a web browser can, like Flash animation, DirectX, SSL, etc. Also, websites with a lot of frames can goof things up. If a website changes its format often, you'll drive yourself mad trying to keep up.

    Scraping data from a website is just good old-fashioned string parsing. (Hashing can work too.)

    If you're trying to get raw data from a site (like local time, temperature, stock quotes, or whatever), you might look into scraping RSS feeds instead. They're built a bit more for our kind of use.

    Hope that helps.
  • samos Posts: 106
    Eric,

    Thanks for all of the information. I know how to do just about everything you talked about except how to send a spoof of a web browser to the web server. Does anyone have some example code showing how to accomplish it?
  • PhreaK Posts: 966
    Rather than scraping pages that are presented to the great unwashed, you will find that a lot of higher-profile 'community' sites (Twitter, Flickr, Facebook, etc.) have APIs for nicer communication with computers. You will still have to use IP sockets within your AMX system to communicate, but all the unnecessary crud will already be filtered out and you will have much nicer and more efficient communication. You will also be protected against changes to the site UI, as APIs by nature (should) remain consistent.

    In the case of twitter you will probably be interested in having a look around here: http://apiwiki.twitter.com/.
  • samos Posts: 106
    I have read the Twitter API and even wrote some C++ code to get data from Twitter. I just want to know how to connect to the site with AMX and send the URL requests.

    Step 1: ip_client_open to www.twitter.com.

    What do I send to the web server besides the URL request? How do I spoof a browser?
  • PhreaK Posts: 966
    To 'spoof' a browser you need to set your 'User-Agent' string in your request header. Within HTTP all communication is just ASCII strings. Check out http://www.httpviewer.net to help visualize what communication actually takes place with different sites.

    Also http://www.amxforums.com/showthread.php?t=4406 may be of interest to you.
  • DHawthorne Posts: 4,584
    There isn't a simple answer to this question, because what you need to do varies with the site you are connecting to. The simplest of sites only require you to connect, then send a string with GET and two CR/LF pairs. Other sites require header information, like the aforementioned User-Agent; you will likely need login credentials as well. Your best bet is to look up the HTTP protocol and get the basics there, then run a packet sniffer (Wireshark is a decent free one) while connecting with a browser to catch what specifics your site requires. Twitter may have a published API so you can forgo the tedious packet sniffing stage (I would be surprised if they didn't, actually, but finding it may be another matter).
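
    To make the "GET plus headers" idea concrete, here is a hedged sketch of a helper that assembles a request with a User-Agent header and ends it with the blank line. The function name fnBuildGet, the host, the path, and the agent string are illustrative assumptions; what a given site actually requires is best confirmed with a packet capture as described above.

    ```netlinx
    PROGRAM_NAME='http_request_sketch'

    DEFINE_DEVICE
    dvWeb = 0:3:0                        // hypothetical local IP port

    DEFINE_CONSTANT
    CHAR WEB_HOST[]  = 'www.example.com'                    // placeholder host
    CHAR WEB_PATH[]  = '/index.html'                        // placeholder path
    CHAR WEB_AGENT[] = 'Mozilla/5.0 (compatible; NetLinx)'  // made-up agent string

    // Hypothetical helper: builds the request line, a few headers, and the
    // terminating blank line. Each header line ends with CR/LF (13,10).
    DEFINE_FUNCTION CHAR[512] fnBuildGet(CHAR cHost[], CHAR cPath[], CHAR cAgent[])
    {
        STACK_VAR CHAR cReq[512]

        cReq = "'GET ',cPath,' HTTP/1.0',13,10"
        cReq = "cReq,'Host: ',cHost,13,10"
        cReq = "cReq,'User-Agent: ',cAgent,13,10"
        cReq = "cReq,'Connection: close',13,10"
        cReq = "cReq,13,10"              // empty line marks the end of the headers
        RETURN cReq
    }

    DEFINE_EVENT
    DATA_EVENT[dvWeb]
    {
        ONLINE:
        {
            // Send the whole request as soon as the socket is up
            SEND_STRING dvWeb, fnBuildGet(WEB_HOST, WEB_PATH, WEB_AGENT)
        }
    }
    ```

    Login credentials or cookies, where a site needs them, would be added as further header lines in the same way before the final blank line.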
  • samos Posts: 106
    OK, here is my code for just a connection.

    I made it so that if the connection fails it will retry. When the system starts up it fails about 15 times and then connects. Does anyone know why this is?


    PROGRAM_NAME='temp'
    (***********************************************************)
    (*  FILE_LAST_MODIFIED_ON: 09/08/2009  AT: 13:45:56        *)
    (***********************************************************)
    
    DEFINE_DEVICE
    dvTwitter     = 0:3:0
    vdvTwitter    = 0:4:0
    
    dvTP          = 10001:1:0
    DEFINE_FUNCTION integer fnConnectToTwitter()
    {
        SEND_STRING 0, 'GET TWITTER FEED'
        ip_client_close(dvTwitter.Port)
        wait 20
        {	
    	ip_client_open(dvTwitter.Port,'www.twitter.com',80,IP_TCP)
        }
    }
    (* EXAMPLE: DEFINE_FUNCTION <RETURN_TYPE> <NAME> (<PARAMETERS>) *)
    (* EXAMPLE: DEFINE_CALL '<NAME>' (<PARAMETERS>) *)
    
    (***********************************************************)
    (*                STARTUP CODE GOES BELOW                  *)
    (***********************************************************)
    DEFINE_START
    send_string 0, 'START'
    fnConnectToTwitter()
    (***********************************************************)
    (*                THE EVENTS GO BELOW                      *)
    (***********************************************************)
    DEFINE_EVENT
    BUTTON_EVENT[dvTP,1]
    {
        push:
        {
    	fnConnectToTwitter()
        }
    }
    DATA_EVENT[dvTwitter]
    {
        onerror:
        {
    	send_string 0,"'error: client=',ITOA(Data.Number)"
    	fnConnectToTwitter()
        }
        online:
        {
    	send_string 0,"'online: client'"
        }
        offline:
        {
    	send_string 0,"'offline: client'"
        }
        string:
        {    
    	send_string 0,"'string: client=',Data.Text"
        }
    }
    (***********************************************************)
    (*            THE ACTUAL PROGRAM GOES BELOW                *)
    (***********************************************************)
    DEFINE_PROGRAM
    
    (***********************************************************)
    (*                     END OF PROGRAM                      *)
    (*        DO NOT PUT ANY CODE BELOW THIS COMMENT           *)
    (***********************************************************)
    


    Here is my telnet log:

    (0000053409) CIpEvent::OnError 0:3:1
    (0000053410) error: client=9
    (0000053410) GET TWITTER FEED
    (0000053411) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053412) CIpEvent::OnError 0:3:1
    (0000053413) error: client=9
    (0000053414) GET TWITTER FEED
    (0000053415) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053416) CIpEvent::OnError 0:3:1
    (0000053417) error: client=9
    (0000053418) GET TWITTER FEED
    (0000053419) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053422) CIpEvent::OnError 0:3:1
    (0000053423) error: client=9
    (0000053424) GET TWITTER FEED
    (0000053425) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053426) CIpEvent::OnError 0:3:1
    (0000053427) error: client=9
    (0000053427) GET TWITTER FEED
    (0000053428) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053429) CIpEvent::OnError 0:3:1
    (0000053430) error: client=9
    (0000053431) GET TWITTER FEED
    (0000053432) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053433) CIpEvent::OnError 0:3:1
    (0000053434) error: client=9
    (0000053435) GET TWITTER FEED
    (0000053436) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053437) CIpEvent::OnError 0:3:1
    (0000053438) error: client=9
    (0000053438) GET TWITTER FEED
    (0000053439) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053440) CIpEvent::OnError 0:3:1
    (0000053441) error: client=9
    (0000053442) GET TWITTER FEED
    (0000053443) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053444) CIpEvent::OnError 0:3:1
    (0000053445) error: client=9
    (0000053446) GET TWITTER FEED
    (0000053446) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053448) CIpEvent::OnError 0:3:1
    (0000053449) error: client=9
    (0000053449) GET TWITTER FEED
    (0000053450) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053451) CIpEvent::OnError 0:3:1
    (0000053452) error: client=9
    (0000053453) GET TWITTER FEED
    (0000053454) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053455) CIpEvent::OnError 0:3:1
    (0000053457) error: client=9
    (0000053458) GET TWITTER FEED
    (0000053459) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053460) CIpEvent::OnError 0:3:1
    (0000053461) error: client=9
    (0000053461) GET TWITTER FEED
    (0000053462) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053464) CIpEvent::OnError 0:3:1
    (0000053465) error: client=9
    (0000053465) GET TWITTER FEED
    (0000053466) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053467) CIpEvent::OnError 0:3:1
    (0000053468) error: client=9
    (0000053469) GET TWITTER FEED
    (0000053470) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053471) CIpEvent::OnError 0:3:1
    (0000053472) error: client=9
    (0000053472) GET TWITTER FEED
    (0000053473) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053474) CIpEvent::OnError 0:3:1
    (0000053475) error: client=9
    (0000053476) GET TWITTER FEED
    (0000053477) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053478) CIpEvent::OnError 0:3:1
    (0000053479) error: client=9
    (0000053480) GET TWITTER FEED
    (0000053481) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053482) CIpEvent::OnError 0:3:1
    (0000053483) error: client=9
    (0000053483) GET TWITTER FEED
    (0000053484) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053486) CIpEvent::OnError 0:3:1
    (0000053487) error: client=9
    (0000053488) GET TWITTER FEED
    (0000053489) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053490) CIpEvent::OnError 0:3:1
    (0000053491) error: client=9
    (0000053492) GET TWITTER FEED
    (0000053492) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053494) CIpEvent::OnError 0:3:1
    (0000053494) error: client=9
    (0000053495) GET TWITTER FEED
    (0000053496) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053497) CIpEvent::OnError 0:3:1
    (0000053498) error: client=9
    (0000053499) GET TWITTER FEED
    (0000053500) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053501) CIpEvent::OnError 0:3:1
    (0000053502) error: client=9
    (0000053503) GET TWITTER FEED
    (0000053503) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053505) CIpEvent::OnError 0:3:1
    (0000053506) error: client=9
    (0000053507) GET TWITTER FEED
    (0000053508) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053509) CIpEvent::OnError 0:3:1
    (0000053510) error: client=9
    (0000053511) GET TWITTER FEED
    (0000053512) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053513) CIpEvent::OnError 0:3:1
    (0000053514) error: client=9
    (0000053514) GET TWITTER FEED
    (0000053515) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053516) CIpEvent::OnError 0:3:1
    (0000053517) error: client=9
    (0000053518) GET TWITTER FEED
    (0000053519) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053520) CIpEvent::OnError 0:3:1
    (0000053521) error: client=9
    (0000053522) GET TWITTER FEED
    (0000053523) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053524) CIpEvent::OnError 0:3:1
    (0000053525) error: client=9
    (0000053525) GET TWITTER FEED
    (0000053526) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053527) CIpEvent::OnError 0:3:1
    (0000053528) error: client=9
    (0000053529) GET TWITTER FEED
    (0000053530) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053531) CIpEvent::OnError 0:3:1
    (0000053532) error: client=9
    (0000053533) GET TWITTER FEED
    (0000053534) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053535) CIpEvent::OnError 0:3:1
    (0000053536) error: client=9
    (0000053537) GET TWITTER FEED
    (0000053538) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053539) CIpEvent::OnError 0:3:1
    (0000053540) error: client=9
    (0000053541) GET TWITTER FEED
    (0000053542) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053543) CIpEvent::OnError 0:3:1
    (0000053544) error: client=9
    (0000053544) GET TWITTER FEED
    (0000053545) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053546) CIpEvent::OnError 0:3:1
    (0000053547) error: client=9
    (0000053548) GET TWITTER FEED
    (0000053549) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053550) CIpEvent::OnError 0:3:1
    (0000053551) error: client=9
    (0000053552) GET TWITTER FEED
    (0000053553) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053554) CIpEvent::OnError 0:3:1
    (0000053555) error: client=9
    (0000053556) GET TWITTER FEED
    (0000053556) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053558) CIpEvent::OnError 0:3:1
    (0000053559) error: client=9
    (0000053559) GET TWITTER FEED
    (0000053560) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053561) CIpEvent::OnError 0:3:1
    (0000053562) error: client=9
    (0000053563) GET TWITTER FEED
    (0000053564) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053565) CIpEvent::OnError 0:3:1
    (0000053566) error: client=9
    (0000053567) GET TWITTER FEED
    (0000053568) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053569) CIpEvent::OnError 0:3:1
    (0000053570) error: client=9
    (0000053571) GET TWITTER FEED
    (0000053572) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053573) CIpEvent::OnError 0:3:1
    (0000053574) error: client=9
    (0000053575) GET TWITTER FEED
    (0000053576) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053577) CIpEvent::OnError 0:3:1
    (0000053578) error: client=9
    (0000053579) GET TWITTER FEED
    (0000053580) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053581) CIpEvent::OnError 0:3:1
    (0000053582) error: client=9
    (0000053582) GET TWITTER FEED
    (0000053583) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053584) CIpEvent::OnError 0:3:1
    (0000053585) error: client=9
    (0000053586) GET TWITTER FEED
    (0000053587) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053588) CIpEvent::OnError 0:3:1
    (0000053589) error: client=9
    (0000053590) GET TWITTER FEED
    (0000053591) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053592) CIpEvent::OnError 0:3:1
    (0000053593) error: client=9
    (0000053594) GET TWITTER FEED
    (0000053595) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053596) CIpEvent::OnError 0:3:1
    (0000053597) error: client=9
    (0000053597) GET TWITTER FEED
    (0000053598) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053599) CIpEvent::OnError 0:3:1
    (0000053600) error: client=9
    (0000053601) GET TWITTER FEED
    (0000053602) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053604) CIpEvent::OnError 0:3:1
    (0000053605) error: client=9
    (0000053606) GET TWITTER FEED
    (0000053606) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053608) CIpEvent::OnError 0:3:1
    (0000053609) error: client=9
    (0000053609) GET TWITTER FEED
    (0000053610) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053611) CIpEvent::OnError 0:3:1
    (0000053612) error: client=9
    (0000053613) GET TWITTER FEED
    (0000053614) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053615) CIpEvent::OnError 0:3:1
    (0000053616) error: client=9
    (0000053617) GET TWITTER FEED
    (0000053618) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053619) CIpEvent::OnError 0:3:1
    (0000053620) error: client=9
    (0000053621) GET TWITTER FEED
    (0000053622) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053623) CIpEvent::OnError 0:3:1
    (0000053625) error: client=9
    (0000053625) GET TWITTER FEED
    (0000053626) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053627) CIpEvent::OnError 0:3:1
    (0000053628) error: client=9
    (0000053629) GET TWITTER FEED
    (0000053630) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053631) CIpEvent::OnError 0:3:1
    (0000053632) error: client=9
    (0000053633) GET TWITTER FEED
    (0000053634) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053635) CIpEvent::OnError 0:3:1
    (0000053636) error: client=9
    (0000053636) GET TWITTER FEED
    (0000053638) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053640) CIpEvent::OnError 0:3:1
    (0000053641) error: client=9
    (0000053641) GET TWITTER FEED
    (0000053642) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053643) CIpEvent::OnError 0:3:1
    (0000053644) error: client=9
    (0000053645) GET TWITTER FEED
    (0000053646) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053647) CIpEvent::OnError 0:3:1
    (0000053648) error: client=9
    (0000053649) GET TWITTER FEED
    (0000053650) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053651) CIpEvent::OnError 0:3:1
    (0000053652) error: client=9
    (0000053652) GET TWITTER FEED
    (0000053653) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053656) CIpEvent::OnError 0:3:1
    (0000053657) error: client=9
    (0000053658) GET TWITTER FEED
    (0000053665) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053717) CIpEvent::OnError 0:3:1
    (0000053727) error: client=9
    (0000053727) GET TWITTER FEED
    (0000053728) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053729) CIpEvent::OnError 0:3:1
    (0000053730) error: client=9
    (0000053743) GET TWITTER FEED
    (0000053744) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053745) CIpEvent::OnError 0:3:1
    (0000053746) error: client=9
    (0000053747) GET TWITTER FEED
    (0000053748) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053750) CIpEvent::OnError 0:3:1
    (0000053751) error: client=9
    (0000053752) GET TWITTER FEED
    (0000053753) CIpSocketMan::ProcessPLPacket - Socket Already Closed
    (0000053755) Connected Successfully
    (0000053757) CIpEvent::OnError 0:3:1
    (0000053758) error: client=9
    (0000053759) GET TWITTER FEED
    (0000053761) Exiting TCP Read thread - closing this socket for local port 3
    (0000053762) CIpEvent::OnLine 0:3:1
    (0000053764) online: client
    (0000053765) CIpEvent::OffLine 0:3:1
    (0000053767) offline: client
    (0000053914) IPDeviceDetector.run(): joined multicast group
    (0000054364) Memory Available = 5572148 <18812>
    (0000055764) Connected Successfully
    (0000055766) CIpEvent::OnLine 0:3:1
    (0000055767) online: client
    (0000056364) Memory Available = 5551284 <20864>
    (0000295650) Exiting TCP Read thread - closing this socket for local port 3
    (0000295651) CIpEvent::OffLine 0:3:1
    (0000295652) offline: client
    
  • DHawthorne Posts: 4,584
    HTTP connections are designed to open, send your data, get a response, then immediately disconnect. Remember, it was designed for browsers, where someone might open a page and let it sit for unknown periods of time before clicking a link. You can't tie up the server for slow readers or people who left it open to answer the phone or go on vacation. You have to send all your login information on the online event, and parse what was returned in the offline event. You don't open a connection and leave it open.
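
    A minimal sketch of the "parse in the offline event" step, assuming the reply has been collected with CREATE_BUFFER. The names dvWeb and cWebBuffer and the buffer sizes are assumptions for the example. It splits the reply at the first blank line, which separates the HTTP headers from the body.

    ```netlinx
    PROGRAM_NAME='http_parse_sketch'

    DEFINE_DEVICE
    dvWeb = 0:3:0                        // hypothetical local IP port

    DEFINE_VARIABLE
    VOLATILE CHAR cWebBuffer[16384]      // assumed size; filled by the master

    DEFINE_START
    CREATE_BUFFER dvWeb, cWebBuffer

    DEFINE_EVENT
    DATA_EVENT[dvWeb]
    {
        OFFLINE:
        {
            STACK_VAR CHAR cHeaders[2048]
            STACK_VAR CHAR cStatus[64]
            STACK_VAR INTEGER nEol

            // Pull everything up to and including the blank line out of the
            // buffer; what remains in cWebBuffer afterwards is the body.
            cHeaders = REMOVE_STRING(cWebBuffer, "13,10,13,10", 1)

            IF (LENGTH_STRING(cHeaders))
            {
                // The first header line is the status line
                nEol = FIND_STRING(cHeaders, "13,10", 1)
                cStatus = LEFT_STRING(cHeaders, nEol - 1)
                SEND_STRING 0, "'status: ',cStatus"
                SEND_STRING 0, "'body bytes: ',ITOA(LENGTH_STRING(cWebBuffer))"
            }
            ELSE
            {
                SEND_STRING 0, "'no complete HTTP header received'"
            }

            CLEAR_BUFFER cWebBuffer      // ready for the next request
        }
    }
    ```

    In a real program the body would be parsed before the CLEAR_BUFFER call; the sketch only reports its length.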
  • samos Posts: 106
    Dave

    I was not trying to leave it open. I just tried to open it, but it returned an error code in the onerror event, so I wrote code to try to open it again if it failed.

    When I ran the code, the ip_client_open function failed about 30 times (triggering the onerror event each time) before it finally fired and connected. I just wanted to know why it failed to open the connection so many times and then finally worked and fired the online event.
  • PhreaK Posts: 966
    The error you are getting (error 9) is coming from your IP_CLIENT_CLOSE statement. Error 9 is 'port already closed'. Because your onerror event calls fnConnectToTwitter(), each error calls your connect function again, which closes the already-closed port and creates another error. To combat this you will need a bit of logic to make sure you only try to reconnect on certain errors. I've also found in the past that it can help to give the master a second or two to collapse the connection before re-opening it.

    You can find the error codes spread throughout the documentation in NetLinx Studio, but to help out here's a list of all of them:
    -3: unable to open communication port
    -2: invalid value for protocol
    -1: invalid server port
    2: general failure (out of memory)
    4: unknown host
    6: connection refused
    7: connection timed out
    8: unknown connection error
    9: port already closed
    10: binding error
    11: listening error
    14: local port already in use
    15: UDP socket already listening
    16: too many open sockets
    17: local port not open

    Obviously some of these will never be caused by opening a connection as a client.
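
    One possible shape for that reconnect logic, as a sketch only: track the socket state so IP_CLIENT_CLOSE is never called on a closed port, and retry only for the errors that plausibly clear up on their own. Which codes are worth retrying (6, 7 and 8 here), the five-second delay, and all names are assumptions for the example.

    ```netlinx
    PROGRAM_NAME='reconnect_sketch'

    DEFINE_DEVICE
    dvWeb = 0:3:0                        // hypothetical local IP port

    DEFINE_CONSTANT
    CHAR WEB_HOST[] = 'www.example.com'  // placeholder host
    INTEGER ERR_REFUSED   = 6            // values from the list above
    INTEGER ERR_TIMED_OUT = 7
    INTEGER ERR_UNKNOWN   = 8

    DEFINE_VARIABLE
    VOLATILE INTEGER nConnected          // 1 while the socket is online
    VOLATILE INTEGER nOpening            // 1 while an open is in progress

    DEFINE_FUNCTION fnConnect()
    {
        // Only open when nothing is open or pending
        IF (!nConnected && !nOpening)
        {
            nOpening = 1
            IP_CLIENT_OPEN(dvWeb.PORT, WEB_HOST, 80, IP_TCP)
        }
    }

    DEFINE_FUNCTION fnDisconnect()
    {
        // Closing an already-closed port is what raises error 9
        IF (nConnected)
        {
            IP_CLIENT_CLOSE(dvWeb.PORT)
        }
    }

    DEFINE_EVENT
    DATA_EVENT[dvWeb]
    {
        ONLINE:
        {
            nOpening = 0
            nConnected = 1
        }
        OFFLINE:
        {
            nConnected = 0
        }
        ONERROR:
        {
            nOpening = 0
            SEND_STRING 0, "'socket error ',ITOA(DATA.NUMBER)"
            IF (DATA.NUMBER == ERR_REFUSED || DATA.NUMBER == ERR_TIMED_OUT || DATA.NUMBER == ERR_UNKNOWN)
            {
                // Give the master a few seconds before trying again
                WAIT 50 'web retry'
                {
                    fnConnect()
                }
            }
        }
    }
    ```

    All other error codes are only logged, so a programming mistake such as error 9 shows up once in the log instead of looping.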
  • vining Posts: 4,368
    Ah, so this is what Dave was talking about on this other thread. http://www.amxforums.com/showthread.php?t=4406