Wireshark - post-processing capture files. - Security

  1. Wireshark - post-processing capture files.

    Hi All,

    This may be off-topic. If it is please accept my apologies and point me
    in the right direction!

    I have been using Wireshark for packet capture to detect abuse of my
    employer's systems. I can quite easily scroll through the capture files
    in Wireshark to view the information I want, primarily requested URLs
    and email contents.

    Unfortunately, this is a slow process.

    Does anybody know of any tools that will parse the capture files? I have
    been limiting capture to 20MB files.

    What I would ideally like to do is create a list of websites visited,
    format of searches, submitted information, etc. Also, if at all
    possible, dump each individual email to a separate text file for each one...

    Does anybody know of a tool or tools to achieve this?

    Many thanks,

    Boggy.

  2. Re: Wireshark - post-processing capture files.

    On Sat, 26 May 2007, in the Usenet newsgroup comp.os.linux.security,
    Bogwitch wrote:

    >This may be off-topic. If it is please accept my apologies and point me
    >in the right direction!


    Might be more appropriate in comp.security.misc - but why not answer it
    here.

    >I have been using Wireshark for packet capture to detect abuse of my
    >employers systems. I can quite easily scroll through the capture files
    >in Wireshark to view the information I want, primarily requested URLs
    >and email contents.


    Whoa...

    >NNTP-Posting-Host: 86.29.54.240


    Is this _legal_ to do so in the UK? Be sure to check with the company
    legal advisors. In most cases I'm aware of, the employees need to be
    aware BEFOREHAND that monitoring may/will occur, and that the network
    is only to be used for company business. Once that warning is well
    known, you are usually allowed to monitor and take actions based on
    the discovered content. But check with that legal advisor first.

    That said, I usually just use simple scripts to find the data needed.
    You have the entire packets? That depends on the options you passed to
    Wireshark (I actually use LBL 'tcpdump' to capture the entire packets).
    I use 'cut' to snarf the IP headers to get some indication of the
    destination IPs, then 'sort +N' (where N is the field containing the
    destination IP). This tends to identify where the employees are going,
    and which company systems are being used.
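
    The cut-and-sort tally described above can also be sketched in Python.
    This is only an illustration, not the poster's script: it assumes the
    capture has already been printed as text with something like
    'tcpdump -nn -r capture.pcap' (the file name is hypothetical) and that
    each IPv4 line has the usual "IP src.port > dst.port:" layout. The
    function name count_destinations is made up for this example.

```python
import re
from collections import Counter

# Matches the address part of an IPv4 line printed by `tcpdump -nn`, e.g.
#   12:00:01.000001 IP 192.0.2.10.51234 > 198.51.100.5.80: Flags [S], ...
# The port suffix is optional so that ICMP lines (no ports) also match.
LINE_RE = re.compile(
    r"\bIP (\d+\.\d+\.\d+\.\d+)(?:\.(\d+))? > (\d+\.\d+\.\d+\.\d+)(?:\.(\d+))?:"
)


def count_destinations(lines):
    """Count how many printed packets were sent to each destination IP."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:  # skip ARP, IPv6 and anything else that does not match
            counts[match.group(3)] += 1
    return counts
```

    Calling count_destinations(open("dump.txt")) and then .most_common(20)
    on the result would list the busiest destinations first, which is the
    same information the 'sort' step is after.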

    >Unfortunately, this is a slow process.


    That's what computers are for

    >Does anybody know of any tools that will parse the capture files? I have
    >been limiting capture to 20MB files.


    No - I'm not joking. awk, cut, grep, sed, tr, and a bunch of pipes to
    connect output to input. If you aren't used to using these tools, see
    the Grendel's fine LDP Guide at http://tldp.org/guides.html

    * Advanced Bash-Scripting Guide
    version: 4.3
    author: Mendel Cooper,
    last update: Apr 2007

    This document is both a tutorial and a reference on shell scripting
    with Bash. It assumes no previous knowledge of scripting or
    programming, but progresses rapidly toward an intermediate/advanced
    level of instruction. The exercises and heavily-commented examples
    invite active reader participation. Still, it is a work in progress.
    The intention is to add much supplementary material in future
    updates to this document, as it evolves into a comprehensive book
    that matches or surpasses any of the shell scripting manuals in
    print.

    >What I would ideally like to do is create a list of websites visited,


    easy enough

    >format of searches, submitted information, etc.


    a bit more scripting, but still pretty easy

    >Also, if at all possible, dump each individual email to a seperate text
    >file for each one...


    Again, relatively easy - BUT watch the legal problems.

    Old guy

  3. Re: Wireshark - post-processing capture files.

    Hey Moe,

    Moe Trin wrote:
    > On Sat, 26 May 2007, in the Usenet newsgroup comp.os.linux.security,
    > Bogwitch wrote:
    >
    >> This may be off-topic. If it is please accept my apologies and point me
    >> in the right direction!

    >
    > Might be more appropriate in comp.security.misc - but why not answer it
    > here.


    :-) Thanks!

    >> I have been using Wireshark for packet capture to detect abuse of my
    >> employers systems. I can quite easily scroll through the capture files
    >> in Wireshark to view the information I want, primarily requested URLs
    >> and email contents.

    >
    > Whoa...
    >
    >> NNTP-Posting-Host: 86.29.54.240

    >
    > Is this _legal_ to do so in the UK? Be sure to check with the company
    > legal advisors. In most cases I'm aware of, the employees need to be
    > aware BEFORE HAND that monitoring may/will occur, and that the network
    > is only to be used for company business. Once that warning is well
    > known, you are usually allowed to monitor and take actions based on
    > the discovered content. But check with that legal advisor first.


    All users on the LAN sign to say they have read and agree to the User
    Security Operating Procedures. Monitoring is covered in there. The LAN
    is for business use only - there is a separate Internet network that is
    allowed to be used for personal use and I monitor on that, too. However,
    there is a different level of logging as I am checking for different
    things. User agreement is required for that LAN, too!

    > That said, I usually just use simple scripts to find the data needed.
    > You have the entire packets? Depending on the options you passed to
    > Wireshark (I actually use LBL 'tcpdump' to capture the entire packets)
    > I use 'cut' to snarf the IP headers to get some indication of the
    > destination IPs, then 'sort +N' (where N is the field containing the
    > destination IP. This tends to identify where the employees are going,
    > and which company systems are being used.


    I need the entire packets. I am checking to ensure certain types of
    confidential data are not being emailed off site. Sure, I could drop the
    logging of inbound http traffic but that's pretty much all....
    I've got so used to the flashy GUI that wireshark presents, I'd not even
    considered tcpdump. Very remiss of me!

    >> Unfortunately, this is a slow process.

    >
    > That's what computers are for


    Which is why I wanted to automate the process....

    >> Does anybody know of any tools that will parse the capture files? I have
    >> been limiting capture to 20MB files.

    >
    > No - I'm not joking. awk, cut, grep, sed, tr, and a bunch of pipes to
    > connect output to input. If you aren't used to using these tools, see
    > the Grendel's fine LDP Guide at http://tldp.org/guides.html


    I'm not an experienced user of awk and sed, I have never really got my
    head round regex. I'm guessing, from what I've read, they would be the
    most effective of the tools listed - time for some man page reading!

    >> What I would ideally like to do is create a list of websites visited,

    >
    > easy enough


    >> format of searches, submitted information, etc.

    >
    > a bit more scripting, but still pretty easy


    Any hints? Examples anywhere?

    >> Also, if at all possible, dump each individual email to a seperate text
    >> file for each one...

    >
    > Again, relatively easy - BUT watch the legal problems.


    Like I said, I'm not worried about the legal problems of capture/ audit.
    We could have much bigger legal problems if inappropriate information
    leaks out of our network!

    Thanks for your input, it has made me think about what I'm capturing and
    I suspect the overall quantity of data that I'm capturing is excessive
    and would reduce the size of the problem if I planned it a little better!

    Thanks,

    Bogwitch.

  4. Re: Wireshark - post-processing capture files.

    Bogwitch wrote:

    > All users on the LAN sign to say they have read and agree to the User
    > Security Operating Procedures. Monitoring is covered in there. The LAN
    > is for business use only - there is a separate Internet network that is
    > allowed to be used for personal use and I monitor on that, too. However,
    > there is a different level of logging as I am checking for different
    > things. User agreement is required for that LAN, too!


    Fair enough.

    >> That said, I usually just use simple scripts to find the data needed.
    >> You have the entire packets? Depending on the options you passed to
    >> Wireshark (I actually use LBL 'tcpdump' to capture the entire packets)
    >> I use 'cut' to snarf the IP headers to get some indication of the
    >> destination IPs, then 'sort +N' (where N is the field containing the
    >> destination IP. This tends to identify where the employees are going,
    >> and which company systems are being used.

    >
    > I need the entire packets. I am checking to ensure certain types of
    > confidential data are not being emailed off site. Sure, I could drop the
    > logging of inbound http traffic but that's pretty much all....
    > I've got so used to the flashy GUI that wireshark presents, I'd not even
    > considered tcpdump. Very remiss of me!


    Fly-In-Ointment: How do you know that they won't email it offsite via a
    TLS-SMTP session - or via an SSL webmail service? Or just walk out with it
    on a keyring or printout?

    However: What my last place of work did was:

    We had a linux box capturing the first 300-odd bytes of every ingress and
    egress packet via a mirror tap on the main feed into the department. That
    was done with two tcpdumps and a script for rotation and it worked very
    reliably.
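
    For anyone post-processing truncated captures like these, a minimal
    Python sketch of reading the capture file itself may help. It assumes
    the classic libpcap file format (24-byte file header, 16-byte record
    headers, microsecond timestamps), not the newer pcapng format, and the
    function name read_pcap is invented for this example. Each record
    stores both the captured length and the original on-the-wire length,
    so a snaplen of 300 shows up as a captured length of at most 300.

```python
import struct


def read_pcap(data: bytes):
    """Return (snaplen, records) from the bytes of a classic pcap file.

    Each record is (ts_sec, captured_len, original_len, packet_bytes).
    """
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":      # 0xa1b2c3d4 written little-endian
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":    # 0xa1b2c3d4 written big-endian
        endian = ">"
    else:
        raise ValueError("not a classic (microsecond) pcap file")

    # snaplen sits at offset 16 of the 24-byte file header
    snaplen = struct.unpack(endian + "I", data[16:20])[0]

    records = []
    offset = 24
    while offset + 16 <= len(data):
        ts_sec, _ts_usec, incl_len, orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16]
        )
        offset += 16
        records.append((ts_sec, incl_len, orig_len, data[offset:offset + incl_len]))
        offset += incl_len
    return snaplen, records
```

    Usage would be along the lines of
    read_pcap(open("capture.pcap", "rb").read()), where the file name is a
    placeholder. The packet bytes start with the link-layer header (for
    Ethernet, 14 bytes before the IP header).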

    In the case of problems, that was generally enough data to prove badness. It
    was also possible to log it for a couple of weeks (gig link).

    The box was also a suitable place to have network load monitoring (eg iftop)
    installed.

    Cheers

    Tim

  5. Re: Wireshark - post-processing capture files.

    On Sun, 27 May 2007, in the Usenet newsgroup comp.os.linux.security, in article
    <9Ze6i.2113$qD.433@newsfe6-win.ntli.net>, Bogwitch wrote:

    >Moe Trin wrote:


    >> But check with that legal advisor first.


    >All users on the LAN sign to say they have read and agree to the User
    >Security Operating Procedures. Monitoring is covered in there. The LAN
    >is for business use only - there is a separate Internet network that is
    >allowed to be used for personal use and I monitor on that, too. However,
    >there is a different level of logging as I am checking for different
    >things. User agreement is required for that LAN, too!


    OK - as you may be well aware, there are all kinds of nasty legal issues
    that _can_ be involved. Good on the separate network - we've had that
    one in place since the early 90s exactly for the same reason. The systems
    are owned by the employee association, and are located in break areas.
    Monitoring is less needed due to peer monitoring by fellow employees. I
    haven't heard of any abuse issues resulting. Those systems also lack
    removable media capabilities (no floppy, CD, DVD, USB, etc.) to further
    quell temptations.

    >I need the entire packets. I am checking to ensure certain types of
    >confidential data are not being emailed off site.


    Yes on the "whole packet" - evidentiary requirements. If the data is
    too sensitive, then there shouldn't be an Internet connection. Someone
    has posted "The best firewall is two inches of air." and I know of some
    installations that use them - along with armed guards who are ordered
    to shoot _WITHOUT_ warning. Yet they have had their little (some say
    "not so little") security lapses where critical information has been
    lost, stolen and/or destroyed.

    If the main stuff they are going after is web pages, another solution
    is to use a web proxy server, and monitor its logs. Your firewall
    needs then to block _all_ web access (some say all access) except by
    the proxy server. This also can be sold as "a good point" by the
    increase in average download speeds (and you can also wipe all the
    advertisements - giving another increase in speed).
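
    As an illustration of monitoring proxy logs, here is a small Python
    sketch that tallies which hosts each client asked for. It assumes
    Squid's native access.log layout (timestamp, elapsed time, client
    address, result code, size, method, URL, ...), which is a common
    default but should be checked against the local configuration. The
    function name hosts_by_client is made up for this example.

```python
from collections import Counter
from urllib.parse import urlsplit


def hosts_by_client(log_lines):
    """Count requests per (client IP, requested host) pair."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # blank or truncated line
        client, url = fields[2], fields[6]
        if "://" in url:
            host = urlsplit(url).hostname or ""
        else:
            # CONNECT requests are logged as "host:port" with no scheme
            host = url.rsplit(":", 1)[0]
        counts[(client, host)] += 1
    return counts
```

    Sorting the result with .most_common() gives a quick "who goes where"
    summary without touching the packet captures at all.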

    >Sure, I could drop the logging of inbound http traffic but that's
    >pretty much all.... I've got so used to the flashy GUI that wireshark
    >presents, I'd not even considered tcpdump. Very remiss of me!


    I can't speak to "Wireshark" - I had looked to the earlier text versions
    called "Ethereal" and "Tethereal" - but these pre-parsed the packets too
    much for a simple scripting type of process.

    >> No - I'm not joking. awk, cut, grep, sed, tr, and a bunch of pipes to
    >> connect output to input. If you aren't used to using these tools, see
    >> the Grendel's fine LDP Guide at http://tldp.org/guides.html

    >
    >I'm not an experienced user of awk and sed, I have never really got my
    >head round regex. I'm guessing, from what I've read, they would be the
    >most effective of the tools listed - time for some man page reading!


    That 'Advanced Bash Scripting Guide' is a jewel of a book. If you want to
    look at what might be an easier starting point, there is also the

    -rw-rw-r-- 1 gferg ldp 31540 Jul 27 2000 Bash-Prog-Intro-HOWTO

    which should be hidden in someplace like /usr/share/HOWTO/ (or your
    favorite search engine can find it).

    No matter which capture tool you use, you should have a basic knowledge
    of the format of the packets. For this, the RFCs are a good start, though
    a text book like W. Richard Stevens' classic "TCP/IP Illustrated Volume 1"
    (Addison Wesley, ISBN 0-201-63346-9, 1994, 1996 at least, 576 pages,
    $EXPENSIVE - see if your lending library has a copy) may be easier going.

    0768 User Datagram Protocol. J. Postel. August 1980. (Format: TXT=5896
    bytes) (Also STD0006) (Status: STANDARD)

    0791 Internet Protocol. J. Postel. September 1981. (Format: TXT=97779
    bytes) (Obsoletes RFC0760) (Updated by RFC1349) (Also STD0005)
    (Status: STANDARD)

    0792 Internet Control Message Protocol. J. Postel. September 1981.
    (Format: TXT=30404 bytes) (Obsoletes RFC0777) (Updated by RFC0950)
    (Also STD0005) (Status: STANDARD)

    0793 Transmission Control Protocol. J. Postel. September 1981.
    (Format: TXT=172710 bytes) (Updated by RFC3168) (Also STD0007)
    (Status: STANDARD)

    1349 Type of Service in the Internet Protocol Suite. P. Almquist. July
    1992. (Format: TXT=68949 bytes) (Obsoleted by RFC2474) (Updates
    RFC1248, RFC1247, RFC1195, RFC1123, RFC1122, RFC1060, RFC0791)
    (Status: PROPOSED STANDARD)

    3168 The Addition of Explicit Congestion Notification (ECN) to IP. K.
    Ramakrishnan, S. Floyd, D. Black. September 2001. (Format: TXT=170966
    bytes) (Obsoletes RFC2481) (Updates RFC2474, RFC2401, RFC0793)
    (Status: PROPOSED STANDARD)

    Those RFCs are available via any search engine.

    A minor problem is that the header sizes are not "fixed" (both TCP and
    IP have a 20 octet header, but there may be up to 40 extra octets [in
    steps of four]) for optional parameters, like timestamps and so on. The
    individual header size is given in the second nibble of the IP header,
    and in the first nibble of the 13th octet of the TCP header (nibble =
    half a byte = 4 bits), but the header length is sized in 32 bit words
    (legal values 0x5 to 0xF), which is cause for some fun in scripting.
    Ethereal/wireshark can make this a lot simpler, at the cost of
    increasing the size of the data to be parsed through. But using the -A
    and -B options to 'grep' can clean that up pretty quickly. Hmmm, have
    you looked at 'editcap'?
    http://www.wireshark.org/docs/man-pages/editcap.html Thinking a bit
    further, I can see indications on google of Ethereal and Wireshark log
    parsing applications. Some time spent at google may also be useful.
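
    The header-length arithmetic is easy to get wrong, so here is a short
    Python sketch of it. It assumes the bytes passed in start at the first
    octet of an IPv4 header carrying TCP (any link-layer header already
    stripped, protocol already checked by the caller). The function names
    are invented for this example.

```python
def ip_header_len(packet: bytes) -> int:
    """IPv4 header length in bytes: low nibble of octet 0, in 32-bit words."""
    ihl = packet[0] & 0x0F
    if ihl < 5:
        raise ValueError("IHL below 5 is not a valid IPv4 header")
    return ihl * 4


def tcp_header_len(segment: bytes) -> int:
    """TCP header length in bytes: high nibble of octet 12, in 32-bit words."""
    data_offset = segment[12] >> 4
    if data_offset < 5:
        raise ValueError("data offset below 5 is not a valid TCP header")
    return data_offset * 4


def tcp_payload(packet: bytes) -> bytes:
    """Return the TCP payload of an IPv4 packet, skipping both headers."""
    ip_len = ip_header_len(packet)
    total_len = int.from_bytes(packet[2:4], "big")  # IP total length field
    segment = packet[ip_len:total_len]
    return segment[tcp_header_len(segment):]
```

    With no options both functions return 20; a header word count of 0xF
    gives the 60 octet maximum.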

    >Any hints? Examples anywhere?


    Regular expressions aren't _as_ important in this case. I tried to
    find some Ethereal/Wireshark output (I don't have either installed
    here), but recall that it parses it out for you, putting convenient
    (searchable) headers like "Source Address" and "Source Port" (as well
    as "Destination*"). I'd grep for those words, using the -A and -B options
    to get the "needed" lines before/after that point, piping the result to
    'grep -v' to eliminate unneeded lines. I should say that I would be
    making multiple passes through the data - first to see who is talking
    to who (source and destination IP and destination port number) and
    perhaps a time hack. Another pass would look only for specific time hacks
    and parsing more of the packet to see what (in the event of a 'GET'
    command to a web server) pages or what-ever.
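
    For the second pass, a minimal Python sketch of pulling the requested
    page out of an unencrypted HTTP request might look like this. It
    assumes the whole request head sits at the start of one TCP payload
    (no reassembly across packets) and only recognises GET, POST and HEAD.
    The function name request_url is hypothetical.

```python
def request_url(payload: bytes):
    """Return (method, host + path) for an HTTP request payload, else None."""
    # Only the request head matters; latin-1 decoding never raises.
    head = payload.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    lines = head.split("\r\n")

    parts = lines[0].split(" ")
    if (len(parts) != 3
            or parts[0] not in ("GET", "POST", "HEAD")
            or not parts[2].startswith("HTTP/")):
        return None  # not the start of an HTTP request
    method, target, _version = parts

    host = ""
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            host = value.strip()
            break
    return method, host + target
```

    Search terms and other submitted form data show up in the query string
    of GET requests, so the returned path also covers the "format of
    searches" part of the original question. POST bodies would need the
    bytes after the blank line as well.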

    >Like I said, I'm not worried about the legal problems of capture/ audit.
    >We could have much bigger legal problems if inappropriate information
    >leaks out of our network!


    Some of this is a "trust" issue (if you don't trust your employees, then
    they shouldn't _be_ your employees). You talk of legal problems if there
    were data leaks - then this really goes back to perhaps there shouldn't be
    Internet access (perhaps other than isolated public PCs).

    >Thanks for your input, it has made me think about what I'm capturing and
    >I suspect the overall quantity of data that I'm capturing is excessive
    >and would reduce the size of the problem if I planned it a little better!


    For _legal_ reasons, you really do need to capture the entire connection.
    That's standard evidence requirements. On the _other_ hand, there is no
    need to inspect every bit of the datastream. If 192.0.2.10 is constantly
    hitting the pr0n sites, that's one problem that can be quickly cured. If
    192.0.2.22 is occasionally sending encrypted packets to an IP address
    allocated to the Peruvian Defense Intelligence Agency, that might be
    another. If 192.0.2.45 is frequently hitting http://www.bbc.co.uk, that
    may be of little interest and can be ignored. But you can use common
    search tools to see what the interesting packets contain _after_ you
    identify that this (or that) traffic is worth looking at more closely.

    Old guy

  6. Re: Wireshark - post-processing capture files.

    On Sun, 27 May 2007, in the Usenet newsgroup comp.os.linux.security, in article
    <4659d68d$0$647$5a6aecb4@news.aaisp.net.uk>, Tim S wrote:

    >Bogwitch wrote:
    >
    >> All users on the LAN sign to say they have read and agree to the User
    >> Security Operating Procedures. Monitoring is covered in there. The LAN
    >> is for business use only - there is a separate Internet network that is
    >> allowed to be used for personal use and I monitor on that, too.


    >Fly-In-Ointment: How do you know that they won't email it offsite via a
    >TLS-SMTP session - or via an SSL webmail service?


    Because use of encrypted traffic is _relatively_ rare, and is going to
    sites that aren't so suspicious. Personal mail? Sorry, that's banned,
    and encrypted traffic to such sites raises flags, if not outright alarms.
    For example, we block access to known "residential" address blocks, so
    that you can't ssh into your home system from work.

    >Or just walk out with it on a keyring or printout?


    That is harder to detect or prevent (although this is facility
    dependent - the employee has agreed to random searches as part of the
    general employment agreement here).

    >We had a linux box capturing the first 300-odd bytes of every ingress and
    >egress packet via a mirror tap on the main feed into the department. That
    >was done with two tcpdumps and a script for rotation and it worked very
    >reliably.


    Depends on the situation. We're monitoring traffic in/out of the entire
    facility which is around 2000 employees. Corporate also does monitoring,
    though I'm not sure how much. We're a bit on the paranoid side, as this
    is an R&D facility.

    >In the case of problems, that was generally enough data to prove badness.
    >It was also possible to log it for a couple of weeks (gig link).


    Disk space is cheap - that's not been a problem for several years now.

    Old guy

  7. Re: Wireshark - post-processing capture files.

    On Mon, 28 May 2007, in the Usenet newsgroup comp.os.linux.security, in article
    <465a7463$0$647$5a6aecb4@news.aaisp.net.uk>, Tim S wrote:

    >Moe Trin wrote:


    >> Tim S wrote:


    >>> Bogwitch wrote:


    >>>> The LAN is for business use only - there is a separate Internet
    >>>> network that is allowed to be used for personal use and I monitor
    >>>> on that, too.


    >>> Fly-In-Ointment: How do you know that they won't email it offsite via a
    >>> TLS-SMTP session - or via an SSL webmail service?


    >> Because use of encrypted traffic is _relatively_ rare,

    >
    >You sure about that? What about every webmail account like yahoo, gmail etc.


    Maybe your company depends on webmail - ours sure as hell does not, and
    anyone attempting to do company business using yahoo, gmail, or similar
    is going to have a new employment opportunity in minutes after such
    idiocy is detected.

    >> Personal mail? Sorry, that's banned, and encrypted traffic to such
    >> sites raise flags, if not outright alarms. For example, we block
    >> access to known "residential" address blocks, so that you can't ssh
    >> into your home system from work.

    >
    >Fair enough


    That's the key - yahoo/gmail/et al. may use encrypted traffic, but they
    are not something one would/should be using for business.

    >We had 1000 workstations - and as I say a 1 gig link. The machine which
    >was IIRC a dual Xeon box with two gig nics for sniffing plus a nic for
    >its "normal presence" wasn't strained. You shouldn't have much trouble
    >up-scaling that to do whole packets if you really wanted.


    Like many, we've got mixed media from 10 megabit up through gigabit fiber,
    but the choke points, such as the link to other divisions, are slow
    enough that even a classic Pentium is adequate.

    Old guy

  8. Re: Wireshark - post-processing capture files.

    On May 26, 4:57 am, Bogwitch wrote:
    > Hi All,
    >
    > This may be off-topic. If it is please accept my apologies and point me
    > in the right direction!
    >
    > I have been using Wireshark for packet capture to detect abuse of my
    > employers systems. I can quite easily scroll through the capture files
    > in Wireshark to view the information I want, primarily requested URLs
    > and email contents.
    >
    > Unfortunately, this is a slow process.
    >
    > Does anybody know of any tools that will parse the capture files? I have
    > been limiting capture to 20MB files.
    >
    > What I would ideally like to do is create a list of websites visited,
    > format of searches, submitted information, etc. Also, if at all
    > possible, dump each individual email to a seperate text file for each one...
    >


    > Does anybody know of a tool or tools to achieve this?
    >


    You are definitely using the wrong tool. Wireshark is too low level.
    You should install a proxy web server (squid was written by the same
    folks that gave us the Internet). Not only can you monitor web usage
    but set policies, block sites, etc.

    -Ramon


