Saving a file from inet - TCP-IP

  1. Saving a file from inet

    Hi all,
    Given a file URL (an Internet URL), I need to save it to my hard
    disk using C++ code. Is there any way I can do this in pure C++?
    The code needs to be portable across both Windows and Linux. Also, I do
    not wish to use system calls such as system("wget") etc. Please, guys,
    could someone help me out? I'm not even able to find a starting point.
    Can it be done using sockets?

    Regards,
    Venky

  2. Re: Saving a file from inet

    On Jun 17, 3:22 am, venky wrote:
    > Hi all,
    > Given a file URL (an Internet URL), I need to save it to my hard
    > disk using C++ code. Is there any way I can do this in pure C++?
    > The code needs to be portable across both Windows and Linux. Also, I do
    > not wish to use system calls such as system("wget") etc. Please, guys,
    > could someone help me out? I'm not even able to find a starting point.
    > Can it be done using sockets?


    It can be done using sockets. It cannot be done through pure C++.

    Your best bet is to find a library (like 'libcurl') that provides this
    capability. Otherwise, you'll have to deal with a large number of
    things you're just not ready to deal with. There is no reason for you
    to build a bridge when you just want to drive over one.

    http://curl.haxx.se/libcurl/

    DS

  3. Re: Saving a file from inet

    In article ,
    venky wrote:
    >Hi all,
    >Given a file URL (an Internet URL), I need to save it to my hard
    >disk using C++ code. Is there any way I can do this in pure C++?
    >The code needs to be portable across both Windows and Linux. Also, I do
    >not wish to use system calls such as system("wget") etc. Please, guys,
    >could someone help me out? I'm not even able to find a starting point.
    >Can it be done using sockets?


    Also have a look at wget...

    http://en.wikipedia.org/wiki/Wget

    alan

  4. Re: Saving a file from inet

    On Tue, 17 Jun 2008 07:04:48 -0700 (PDT), David Schwartz wrote:
    > On Jun 17, 3:22 am, venky wrote:
    >> Hi all,
    >> Given a file URL (an Internet URL), I need to save it to my hard
    >> disk using C++ code. Is there any way I can do this in pure C++?
    >> The code needs to be portable across both Windows and Linux. Also, I do
    >> not wish to use system calls such as system("wget") etc. Please, guys,
    >> could someone help me out? I'm not even able to find a starting point.
    >> Can it be done using sockets?

    >
    > It can be done using sockets. It cannot be done through pure C++.


    Depends on your (or rather "Venky's") definition of "pure". C++ is one
    of those languages which do not aim to include interfaces to
    everything under the sun. A C++ program which uses sockets is still a
    C++ program.

    > you'll have to deal with a large number of things you're just not
    > ready to deal with.


    Oh yes. And he can read RFC 2616 (HTTP 1.1) to find out why.

    /Jorgen

    --
    // Jorgen Grahn \X/ snipabacken.se> R'lyeh wgah'nagl fhtagn!

  5. Re: Saving a file from inet

    On Jun 17, 11:19 pm, Jorgen Grahn wrote:
    > On Tue, 17 Jun 2008 07:04:48 -0700 (PDT), David Schwartz wrote:
    > > On Jun 17, 3:22 am, venky wrote:
    > >> Hi all,
    > >> Given a file URL (an Internet URL), I need to save it to my hard
    > >> disk using C++ code. Is there any way I can do this in pure C++?
    > >> The code needs to be portable across both Windows and Linux. Also, I do
    > >> not wish to use system calls such as system("wget") etc. Please, guys,
    > >> could someone help me out? I'm not even able to find a starting point.
    > >> Can it be done using sockets?

    >
    > > It can be done using sockets. It cannot be done through pure C++.

    >
    > Depends on your (or rather "Venky's") definition of "pure". C++ is one
    > of those languages which do not aim to include interfaces to
    > everything under the sun. A C++ program which uses sockets is still a
    > C++ program.
    >
    > > you'll have to deal with a large number of things you're just not
    > > ready to deal with.

    >
    > Oh yes. And he can read RFC 2616 (HTTP 1.1) to find out why.
    >
    > /Jorgen
    >
    > --
    > // Jorgen Grahn > \X/ snipabacken.se> R'lyeh wgah'nagl fhtagn!


    Hi Venky,

    I don't think understanding the whole HTTP RFC is needed for a simple
    download of a URL. ;o)
    (You are not writing a browser, dude! No hassle of running JavaScript
    and POST.)

    If you are familiar with socket programming, you can do the following.

    - Open a TCP socket.
    - Break the URL into host and URI,
    e.g. if your URL is http://www.techpulp.com/articles/lin...pen-socket.php
    parse for host (www.techpulp.com) and URI
    (/articles/linux/tip-identify-process-with-open-socket.php).

    - Connect to the remote host (in this case www.techpulp.com).

    - Frame the GET request and send it to the server.
    ---------------
    GET <URI> HTTP/1.0\r\n
    \r\n
    ---------------

    In this case, just write the following string to the server. (Remember
    to have only one space character between words in the string.)
    "GET /articles/linux/tip-identify-process-with-open-socket.php HTTP/1.0\r\n\r\n"

    - Then start reading the data sent by the server until the server closes
    the connection. The response will be in the following format.
    -----------
    HTTP/1.1 200 OK
    Date: Sat, 12 Jul 2008 04:14:22 GMT
    Server: Apache/2.2.3 (Fedora)
    X-Powered-By: PHP/5.1.6
    Content-Length: 2078
    Connection: close
    Content-Type: text/html; charset=UTF-8

    [XHTML page body, titled "Techpulp Technologies - Welcome"]
    . . . . .

    -----------

    - In the response you received from the server, first check whether you
    have "200" in the first line. If you see it, that means you have
    received a valid response.

    - Then search for the first occurrence of "\r\n\r\n" in the response you
    received from the server. All the data following the first occurrence of
    "\r\n\r\n" is the content of the file you requested.

    You can save it to a local file or use it within your program
    according to your needs.
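
    The last two steps might look roughly like this in C++. It is only a
    sketch: split_response and HttpResponseParts are names made up for
    illustration, and it assumes the whole response has already been read
    into one std::string (that is, you kept reading until the server closed
    the connection). The socket code itself is left out.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type: the pieces of a raw HTTP response.
struct HttpResponseParts {
    bool ok;                  // false if no blank line ("\r\n\r\n") was found
    std::string status_line;  // e.g. "HTTP/1.1 200 OK"
    std::string headers;      // header lines, without the status line
    std::string body;         // everything after the first "\r\n\r\n"
};

// Splits a complete response at the first "\r\n\r\n".
HttpResponseParts split_response(const std::string& raw) {
    HttpResponseParts parts;
    parts.ok = false;

    const std::string separator = "\r\n\r\n";
    const std::string::size_type header_end = raw.find(separator);
    if (header_end == std::string::npos) return parts;  // headers incomplete

    const std::string head = raw.substr(0, header_end);
    const std::string::size_type line_end = head.find("\r\n");
    if (line_end == std::string::npos) {
        parts.status_line = head;  // a status line with no headers after it
    } else {
        parts.status_line = head.substr(0, line_end);
        parts.headers = head.substr(line_end + 2);
    }
    parts.body = raw.substr(header_end + separator.size());
    parts.ok = true;
    return parts;
}

// Small usage check for the sketch above.
void split_response_example() {
    const HttpResponseParts p =
        split_response("HTTP/1.1 200 OK\r\nConnection: close\r\n\r\nhello");
    assert(p.ok);
    assert(p.status_line == "HTTP/1.1 200 OK");
    assert(p.headers == "Connection: close");
    assert(p.body == "hello");
}
```

    Only the first "\r\n\r\n" counts as the separator, so a body that
    happens to contain a blank line of its own is kept intact.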

    Hope this helps.

    - Neo
    Techpulp Technologies
    http://www.techpulp.com/

  6. Re: Saving a file from inet

    On Sat, 12 Jul 2008 11:43:09 -0700 (PDT), Neo - Techpulp wrote:
    > On Jun 17, 11:19 pm, Jorgen Grahn wrote:
    >> On Tue, 17 Jun 2008 07:04:48 -0700 (PDT), David Schwartz wrote:
    >> > On Jun 17, 3:22 am, venky wrote:
    >> >> Hi all,
    >> >> Given a file URL (an Internet URL), I need to save it to my hard
    >> >> disk using C++ code. Is there any way I can do this in pure C++?
    >> >> The code needs to be portable across both Windows and Linux. Also, I do
    >> >> not wish to use system calls such as system("wget") etc. Please, guys,
    >> >> could someone help me out? I'm not even able to find a starting point.


    [...]

    >> > you'll have to deal with a large number of things you're just not
    >> > ready to deal with.

    >>
    >> Oh yes. And he can read RFC 2616 (HTTP 1.1) to find out why.


    ....

    > Hi Venky,
    >
    > I don't think understanding the whole HTTP RFC is needed for a simple
    > download of a URL. ;o)
    > (You are not writing a browser, dude! No hassle of running JavaScript
    > and POST.)


    Javascript is not part of HTTP. POST is fairly often needed to get
    something useful (so often that wget(1) has options to set it) but he
    might not need it for his purposes.

    > If you are familiar with socket programming, you can do the following.
    >
    > - Open a TCP socket.
    > - Break the URL into host and URI,
    > e.g. if your URL is http://www.techpulp.com/articles/lin...pen-socket.php
    > parse for host (www.techpulp.com) and URI (/articles/linux/tip-identify-process-with-open-socket.php).


    I think you use the term "URI" incorrectly here, and you also do not
    take into account many variations, for example http://example.com/foo#bar.
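
    To make that concrete, here is a rough sketch of a slightly more careful
    split. The names split_http_url and HttpUrl are invented for this
    example, and it still covers only part of the URL syntax: plain
    lower-case "http://" URLs, an optional port, and a fragment that gets
    dropped because it is never sent to the server. Userinfo ("user@host")
    and IPv6 literals are not handled.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type for this sketch.
struct HttpUrl {
    bool ok;           // false if the URL is not a plain http:// URL
    std::string host;  // e.g. "example.com"
    std::string port;  // "80" unless the URL names another port
    std::string path;  // path plus query, always starting with '/'
};

HttpUrl split_http_url(const std::string& url) {
    HttpUrl out;
    out.ok = false;
    out.port = "80";

    const std::string scheme = "http://";
    if (url.compare(0, scheme.size(), scheme) != 0) return out;

    // The fragment ("#bar") is for the client only and is never sent.
    std::string rest = url.substr(scheme.size());
    const std::string::size_type hash = rest.find('#');
    if (hash != std::string::npos) rest.erase(hash);

    // The host[:port] part ends at the first '/' or '?'.
    const std::string::size_type end = rest.find_first_of("/?");
    std::string authority = rest.substr(0, end);
    std::string target = (end == std::string::npos) ? "" : rest.substr(end);
    if (target.empty() || target[0] != '/') target = "/" + target;

    const std::string::size_type colon = authority.find(':');
    if (colon != std::string::npos) {
        out.port = authority.substr(colon + 1);
        authority.erase(colon);
    }
    if (authority.empty() || out.port.empty()) return out;

    out.host = authority;
    out.path = target;
    out.ok = true;
    return out;
}

// Small usage check for the sketch above.
void split_http_url_example() {
    const HttpUrl u = split_http_url("http://example.com/foo#bar");
    assert(u.ok && u.host == "example.com" && u.port == "80");
    assert(u.path == "/foo");  // the "#bar" fragment is dropped
}
```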

    > - Connect to the remote host (in this case www.techpulp.com).
    >
    > - Frame the GET request and send it to the server.
    > ---------------
    > GET <URI> HTTP/1.0\r\n
    > \r\n
    > ---------------
    >
    > In this case, just write the following string to the server. (Remember
    > to have only one space character between words in the string.)
    > "GET /articles/linux/tip-identify-process-with-open-socket.php HTTP/1.0\r\n\r\n"


    Surely that will break if the server uses virtual hosting, e.g. if the
    IP address you talk to hosts both http://foo.example.com/articles/ and
    http://bar.example.com/articles/?
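
    The usual remedy is a Host header naming the site you want. HTTP/1.1
    makes it mandatory, and servers commonly honour it on HTTP/1.0 requests
    too. A minimal sketch follows; build_get_request is a hypothetical
    helper, and host, port and path are assumed to have been pulled out of
    the URL already.

```cpp
#include <cassert>
#include <string>

// Builds a GET request that still works with name-based virtual hosting.
// Staying on HTTP/1.0 keeps the "read until the server closes" approach
// usable, since the server should not answer with chunked encoding.
std::string build_get_request(const std::string& host,
                              const std::string& port,
                              const std::string& path) {
    std::string request = "GET " + path + " HTTP/1.0\r\n";
    request += "Host: " + host;
    if (port != "80") request += ":" + port;  // default port is left out
    request += "\r\n";
    request += "Connection: close\r\n";
    request += "\r\n";  // the blank line ends the request headers
    return request;
}

// Small usage check for the sketch above.
void build_get_request_example() {
    assert(build_get_request("foo.example.com", "80", "/articles/") ==
           "GET /articles/ HTTP/1.0\r\n"
           "Host: foo.example.com\r\n"
           "Connection: close\r\n"
           "\r\n");
}
```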

    > - Then start reading the data sent by the server until the server closes
    > the connection. The response will be in the following format.

    [snip response example]

    Lots of things can vary here, like character set and transfer
    encoding, and probably half a dozen things I do not know about.
    (You can get a 3xx redirection response which you might want to
    handle, and so on ...)
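
    To react to any of that, the client first has to look at the response
    headers, whose names are case-insensitive. A rough sketch of such a
    lookup, with header_value as a made-up helper; it assumes the header
    block is one string with "\r\n" between lines, and it ignores folded
    (continued) lines and repeated headers.

```cpp
#include <cassert>
#include <cctype>
#include <string>

static std::string to_lower(std::string s) {
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        s[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(s[i])));
    }
    return s;
}

// Returns the value of the first header called `name` (compared without
// regard to case), or an empty string if there is no such header.
std::string header_value(const std::string& headers, const std::string& name) {
    const std::string wanted = to_lower(name);
    std::string::size_type pos = 0;
    while (pos < headers.size()) {
        std::string::size_type eol = headers.find("\r\n", pos);
        if (eol == std::string::npos) eol = headers.size();
        const std::string line = headers.substr(pos, eol - pos);

        const std::string::size_type colon = line.find(':');
        if (colon != std::string::npos &&
            to_lower(line.substr(0, colon)) == wanted) {
            // Trim spaces and tabs around the value.
            const std::string::size_type first =
                line.find_first_not_of(" \t", colon + 1);
            if (first == std::string::npos) return "";
            const std::string::size_type last = line.find_last_not_of(" \t");
            return line.substr(first, last - first + 1);
        }
        pos = eol + 2;  // skip past "\r\n"
    }
    return "";
}

// Small usage check for the sketch above.
void header_value_example() {
    const std::string headers =
        "Content-Length: 2078\r\nlocation: http://example.com/new";
    assert(header_value(headers, "Location") == "http://example.com/new");
    assert(header_value(headers, "Transfer-Encoding") == "");
}
```

    With that, a 3xx response can be followed via
    header_value(headers, "Location"), and a "Transfer-Encoding: chunked"
    response can at least be detected instead of being saved verbatim.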

    > - In the response you received from the server, first check whether you
    > have "200" in the first line. If you see it, that means you have
    > received a valid response.


    This breaks if a server decides to use any other code in 200--299,
    which also mark a successful response.
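
    Parsing the status code as a number and testing the range is not much
    more work than searching for "200". A sketch, with parse_status_code,
    is_success and is_redirect as illustrative names only; the input is
    assumed to be the first line of the response without its trailing
    "\r\n".

```cpp
#include <cassert>
#include <string>

// Returns the three-digit status code from a line such as
// "HTTP/1.1 200 OK", or -1 if the line does not have that shape.
int parse_status_code(const std::string& status_line) {
    if (status_line.compare(0, 5, "HTTP/") != 0) return -1;

    const std::string::size_type space = status_line.find(' ');
    if (space == std::string::npos || status_line.size() < space + 4) return -1;

    int code = 0;
    for (std::string::size_type i = space + 1; i < space + 4; ++i) {
        const char c = status_line[i];
        if (c < '0' || c > '9') return -1;
        code = code * 10 + (c - '0');
    }
    // The code must be followed by a space or by the end of the line.
    if (status_line.size() > space + 4 && status_line[space + 4] != ' ') {
        return -1;
    }
    return code;
}

bool is_success(int code) { return code >= 200 && code <= 299; }
bool is_redirect(int code) { return code >= 300 && code <= 399; }

// Small usage check for the sketch above.
void parse_status_code_example() {
    assert(parse_status_code("HTTP/1.1 200 OK") == 200);
    assert(is_success(parse_status_code("HTTP/1.0 206 Partial Content")));
    assert(!is_success(parse_status_code("HTTP/1.1 404 Not Found")));
}
```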

    > - Then search for the first occurrence of "\r\n\r\n" in the response you
    > received from the server. All the data following the first occurrence of
    > "\r\n\r\n" is the content of the file you requested.
    >
    > You can save it to a local file or use it within your program
    > according to your needs.
    >
    > Hope this helps.


    It *does* help, probably -- it shows that it's easy to throw something
    together quickly, and have it work some/most of the time.

    But in my humble opinion, you are doing "venky" a disservice by
    pointing him/her past the RFC, without pointing out that this is only
    useful for quick hacks or use in limited situations. "venky" didn't
    really specify how robust the solution needed to be.

    When a specification is freely available and readable, like the HTTP
    spec, it is a waste not to read it when implementing it -- if not to
    implement it correctly, then at least to know which parts one has
    botched!

    /Jorgen

    --
    // Jorgen Grahn \X/ snipabacken.se> R'lyeh wgah'nagl fhtagn!

  7. Re: Saving a file from inet

    On Jul 12, 11:43 am, Neo - Techpulp wrote:

    > Hi Venky,
    >
    > I don't think understanding the whole HTTP RFC is needed for a simple
    > download of a URL. ;o)
    > (You are not writing a browser, dude! No hassle of running JavaScript
    > and POST.)
    >
    > If you are familiar with socket programming, you can do the following.

    [snip]

    Except that won't work with any server that uses name-based virtual
    hosting. Oh well, hope that's not important.

    DS
