Apache2 VHosts and AWStats within a clustered environment? - Debian


  1. Apache2 VHosts and AWStats within a clustered environment?

    Hi all,

    I'm not sure if "clustered" is exactly the phrase I'm looking for here,
    but nonetheless, here is my issue:

    We have a "cluster" of web-servers configured using heartbeat as
    follows (only two servers illustrated for clarity's sake!):

    -----------         -----------
    |Director1|         |Director2|
    -----------         -----------
         |  \             /  |
         |    \         /    |
         |      \     /      |
         |        \ /        |
         |         X         |
         |       /   \       |
         |     /       \     |
    ------------       ------------
    |WebServer1|       |WebServer2|
    ------------       ------------


    The servers share a number of vhosts, many of which are dynamic sites
    using PHP/MySQL, for which we need to provide AWStats info for our
    customers.

    Our concern is that as we have individual log files for each VHost on
    each individual webserver, the AWStats information cannot be guaranteed
    to be accurate.

    One of the options that we have discussed is logging all the vhosts'
    log files to a central log server via NFS (in order to keep the log
    format) and then having each AWStats instance on each server read the
    logs from the central NFS share when it comes to updating the graphs
    etc. The issue I have found is that a number of people (including Tony
    Mobily in the "Hardening Apache" book) appear to recommend against this.

    The other option that we have spoken about is copying all the log files
    to a central server each day, merging them into a single file, sorting
    the file and then running AWStats against that file before copying all
    the graphs etc back to the webServers. I can see two distinct
    disadvantages to this:

    1) There will be a huge amount of network traffic as the logs are
    shuttled back and forth between the servers
    2) If AWStats is updated from the browser interface, it will only be
    updated on the server that holds the current connection

    I'm sure that someone out there must have been in this situation
    before, so how did you do it? :P

    Unfortunately, switching from AWStats to another package is not an
    option. Fortunately, everything else is fair game! :)

    Thanks in advance,

    Matt
    --
    Matthew Macdonald-Wallace
    matthew@truthisfreedom.org.uk
    http://www.truthisfreedom.org.uk


    --
    To UNSUBSCRIBE, email to debian-isp-REQUEST@lists.debian.org
    with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org

  2. Re: Apache2 VHosts and AWStats within a clustered environment?

    This one time, at band camp, Matthew Macdonald-Wallace said:
    > I'm sure that someone out there must have been in this situation
    > before, so how did you do it? :P


    Switch your log backend to either SQL or something that can export to a
    remote host (syslog, etc) and aggregate the log messages as they come
    in.
    --
    -----------------------------------------------------------------
    | ,''`. Stephen Gran |
    | : :' : sgran@debian.org |
    | `. `' Debian user, admin, and developer |
    | `- http://www.debian.org |
    -----------------------------------------------------------------



  3. Re: Apache2 VHosts and AWStats within a clustered environment?

    Hi Matthew,

    forgot to mention:
    http://mod-log-spread2.alioth.debian.org/

    Cheers, Norbert



  4. Re: Apache2 VHosts and AWStats within a clustered environment?

    Hi Matthew,

    We tried both variants, NFS and copying logfiles nightly.
    NFS was a performance killer and somewhat unstable back then (think
    concurrent writes). Copying, merging and sorting several hundred GB of
    log data nightly ran into problems once the volume was too large to be
    handled within 24 hours ("daily stats"). Having logs on an NFS host and
    processing them via NFS (from the web servers, as you suggested) at
    least doubles network traffic, so this was never an option for us even
    then. But ymmv, it could work for small businesses ;-)

    Did you check out the log-pipe feature of Apache?
    http://httpd.apache.org/docs/2.2/logs.html#piped
    You might be able to distribute logging to one or many central log
    hosts this way. We use a homemade daemon on each web server to
    streamline the piping somewhat (caching, some pre-processing etc.)
    Additional plus: You can accumulate data into near-realtime stats for
    your customers, set up traffic limits, accounting, you name it. Just
    make sure you have sufficient bandwidth so you do not stuff up the lines
    with log traffic. Maybe just add an extra hardware interface for logging
    (and a switch, if this is the bottleneck).
    Then set up as many stats-servers as needed to process the logfiles in
    time and for your customers to grab their graphs.
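
    For illustration, a piped-log forwarder can be very small. The sketch
    below is not the homemade daemon mentioned above; it only shows the idea
    of reading log lines from stdin (where Apache's piped logs deliver them)
    and pushing them to a central host. The collector address, the port and
    the script path in the comment are assumptions, and plain UDP is used
    only to keep the example short (it can drop lines under load).

```python
import socket
import sys

# Hypothetical central collector; neither value comes from the thread.
LOG_HOST = "loghost.example"
LOG_PORT = 5140


def forward_lines(lines, sock, address, node_name):
    """Send each non-empty log line as one datagram, prefixed with the node name.

    Returns the number of lines sent.
    """
    sent = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        payload = f"{node_name} {line}".encode("utf-8")
        sock.sendto(payload, address)
        sent += 1
    return sent


def main():
    # Apache would start this script via a piped log, for example:
    #   CustomLog "|/usr/local/bin/forward_logs.py" combined
    # (the script path is an assumption) and write log lines to its stdin.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        forward_lines(sys.stdin, sock, (LOG_HOST, LOG_PORT), socket.gethostname())
```

    To run it as a piped logger, the script would call main() at the bottom.
    Prefixing each line with the node name lets the collector tell the web
    servers apart before the lines are merged.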

    Additional tips: Set up your own Apache logfile format to include the
    vhost name ;-) If your customers are used to finding their stats at, say,
    theirdomain.com/stats/, just set up a system-wide redirect to your
    stats-server(s). Ah, and: We do not offer on-the-fly awstats generation,
    so that eases the pain somewhat ;-)
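
    As a rough sketch of why the vhost name helps: if every line starts with
    the vhost (Apache's %v format specifier), a stats server can split one
    combined stream back into per-vhost logs before handing them to AWStats.
    The example assumes the vhost is the first space-separated field and the
    rest of the line is an ordinary combined-format entry; the function name
    and the sample hostnames are made up.

```python
from collections import defaultdict


def split_by_vhost(lines):
    """Group log lines by their leading vhost field.

    Assumes each line looks like "<vhost> <rest of the log line>", for
    example as written by a LogFormat that begins with "%v ".
    Returns {vhost: [line without the vhost prefix, ...]}.
    """
    per_vhost = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        vhost, sep, rest = line.partition(" ")
        if not sep:
            continue  # blank or malformed line without any space
        per_vhost[vhost.lower()].append(rest)
    return dict(per_vhost)


sample = [
    'www.example.com 10.0.0.1 - - [26/Jun/2008:15:21:08 +0100] "GET / HTTP/1.1" 200 512 "-" "curl"',
    'shop.example.com 10.0.0.2 - - [26/Jun/2008:15:21:09 +0100] "GET /cart HTTP/1.1" 200 90 "-" "curl"',
    'www.example.com 10.0.0.3 - - [26/Jun/2008:15:21:10 +0100] "GET /a HTTP/1.1" 404 0 "-" "curl"',
]
logs = split_by_vhost(sample)
```

    Each list in logs can then be written to its own file and fed to the
    AWStats config for that vhost.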


    Cheers, Norbert



  5. Re: Apache2 VHosts and AWStats within a clustered environment?

    On Thu, Jun 26, 2008 at 03:21:08PM +0100, Matthew Macdonald-Wallace wrote:

    [...]

    > Our concern is that as we have individual log files for each VHost on
    > each individual webserver, the AWStats information cannot be guaranteed
    > to be accurate.
    >
    > One of the options that we have discussed is logging all the vhosts'
    > log files to a central log server via NFS (in order to keep the log
    > format) and then having each AWStats instance on each server read the
    > logs from the central NFS share when it comes to updating the graphs
    > etc. The issue I have found is that a number of people (including Tony
    > Mobily in the "Hardening Apache" book) appear to recommend against this.


    Ouch. Yes, not something I'd do.

    > The other option that we have spoken about is copying all the log files
    > to a central server each day, merging them into a single file, sorting
    > the file and then running AWStats against that file before copying all
    > the graphs etc back to the webServers. I can see two distinct
    > disadvantages to this:
    >
    > 1) There will be a huge amount of network traffic as the logs are
    > shuttled back and forth between the servers


    Are you aware of logresolvemerge?
    http://awstats.sourceforge.net/docs/...ogresolvemerge
    It's quite useful for this sort of thing, ime.

    Another way might be to rotate off the logfiles hourly, or something,
    and rsync those over (to a central repo), merge, and analyse; that's
    still a bit messy, imo.
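
    The merge step itself is simple in principle. The sketch below is not
    how logresolvemerge is implemented; it only illustrates merging logs
    from several servers into one chronological stream, assuming each input
    is already in time order and uses the usual [dd/Mon/yyyy:hh:mm:ss +zzzz]
    timestamp. The function names are made up.

```python
import heapq
import re
from datetime import datetime

# Timestamp field of the common/combined log formats,
# e.g. [26/Jun/2008:15:21:08 +0100]
TIMESTAMP_RE = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")


def log_time(line):
    """Return the timezone-aware timestamp of one access-log line."""
    match = TIMESTAMP_RE.search(line)
    if match is None:
        raise ValueError(f"no timestamp in line: {line!r}")
    return datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")


def merge_logs(*sources):
    """Lazily merge iterables of log lines into one chronological stream.

    Each source is assumed to be in time order already; heapq.merge then
    only ever holds one pending line per source in memory.
    """
    return heapq.merge(*sources, key=log_time)


web1 = [
    '10.0.0.1 - - [26/Jun/2008:15:21:08 +0100] "GET / HTTP/1.1" 200 512',
    '10.0.0.1 - - [26/Jun/2008:15:21:10 +0100] "GET /b HTTP/1.1" 200 512',
]
web2 = [
    '10.0.0.2 - - [26/Jun/2008:15:21:09 +0100] "GET /a HTTP/1.1" 200 512',
]
merged = list(merge_logs(web1, web2))
```

    Because the merge is lazy, open file objects can be passed in place of
    the lists, so large logs never need to be loaded whole.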

    > 2) If AWStats is updated from the browser interface, it will only be
    > updated on the server that holds the current connection


    Disable that "feature", and update, say, hourly via cron.



  6. Re: Apache2 VHosts and AWStats within a clustered environment?

    Thanks for all your replies.

    After careful consideration, we've gone with mod_log_mysql with a cron
    job to pull the data onto an NFS share once a day from which the nodes
    in the cluster read the log-files and generate the awstats.
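
    A sketch of what such an export step might look like, for anyone trying
    something similar. This is not the actual cron job: the table and
    column names are hypothetical (the real mod_log_mysql schema may
    differ), and sqlite3 stands in for MySQL so that the example is
    self-contained.

```python
import sqlite3
from datetime import datetime, timezone


def to_combined(row):
    """Format one database row as a combined-format log line.

    row is (remote_host, epoch_seconds, request_line, status, bytes_sent,
    referer, agent); this column layout is an assumption.
    """
    host, epoch, request, status, size, referer, agent = row
    stamp = datetime.fromtimestamp(epoch, tz=timezone.utc).strftime(
        "%d/%b/%Y:%H:%M:%S %z"
    )
    return (
        f'{host} - - [{stamp}] "{request}" {status} {size} '
        f'"{referer or "-"}" "{agent or "-"}"'
    )


def export_window(conn, vhost, start_epoch, end_epoch):
    """Yield log lines for one vhost and time window, oldest first.

    conn.execute() is a sqlite3 shortcut; with a MySQL driver one would
    use a cursor and that driver's own parameter style.
    """
    cursor = conn.execute(
        "SELECT remote_host, time_stamp, request_line, status, bytes_sent,"
        " referer, agent FROM access_log"
        " WHERE virtual_host = ? AND time_stamp >= ? AND time_stamp < ?"
        " ORDER BY time_stamp",
        (vhost, start_epoch, end_epoch),
    )
    for row in cursor:
        yield to_combined(row)


# Tiny in-memory demo with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE access_log (virtual_host TEXT, remote_host TEXT,"
    " time_stamp INTEGER, request_line TEXT, status INTEGER,"
    " bytes_sent INTEGER, referer TEXT, agent TEXT)"
)
conn.executemany(
    "INSERT INTO access_log VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("www.example.com", "10.0.0.2", 1214490070, "GET /a HTTP/1.1", 200, 90, None, "curl"),
        ("www.example.com", "10.0.0.1", 1214490068, "GET / HTTP/1.1", 200, 512, None, "curl"),
        ("shop.example.com", "10.0.0.3", 1214490069, "GET /cart HTTP/1.1", 200, 10, None, "curl"),
    ],
)
lines = list(export_window(conn, "www.example.com", 1214438400, 1214524800))
```

    Ordering by timestamp in the query means the exported file is already
    sorted across all nodes, which is what AWStats expects.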

    At the moment, it's working fine in the labs; we're going to run
    apache-benchmark on it over the weekend to see if we can break it! :)

    Thanks again,

    M.
    --
    Matthew Macdonald-Wallace
    matthew@truthisfreedom.org.uk
    http://www.truthisfreedom.org.uk


