Newbie gawk (?) question - Help

  1. Newbie gawk (?) question

    New Linux user question follows:
    I have a large file that I need to break into segments that can be
    processed independently. Each segment is marked by a beginning string
    and an ending string. I observe that I can use gawk to list each line
    that begins a segment by:
    gawk '/beginstring/ {print NR}' inputfile
    Looking at gawk it would appear that I could use it to output each
    segment into individual files, but the syntax to do this eludes me.
    And, I am not really certain that gawk is the optimum way to do this…
    Can someone offer suggestions / examples?

  2. Re: Newbie gawk (?) question

    On 2006-05-03, FatFree wrote:
    > New Linux user question follows:
    > I have a large file that I need to break into segments that can be
    > processed independently. Each segment is marked by a beginning string
    > and an ending string. I observe that I can use gawk to list each line
    > that begins a segment by:
    > gawk '/beginstring/ {print NR}' inputfile
    > Looking at gawk it would appear that I could use it to output each
    > segment into individual files, but the syntax to do this eludes me.
    > And, I am not really certain that gawk is the optimum way to do this…
    > Can someone offer suggestions / examples?


    You probably want csplit, but you can use gawk, e.g.:

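    awk 'BEGIN { file = "filename_" n++ }   # text before the first marker goes to filename_0
    /beginstring/ {
        close(file)                         # release the finished segment before opening the next
        file = "filename_" n++ }
    { print > file }
    ' inputfile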
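
    For comparison, a minimal csplit sketch of the same split. It assumes
    GNU csplit (the -z option and the '{*}' repeat count are GNU features);
    the sample input and the segment_ prefix are made up for illustration.

```shell
# Work in a scratch directory so the generated files are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: two segments, each opened by "beginstring".
printf '%s\n' 'beginstring A' 'line a1' 'endstring' \
              'beginstring B' 'line b1' 'endstring' > inputfile

# Split in front of every line matching /beginstring/.
#   '{*}'  repeat the pattern as many times as the input allows
#   -z     drop empty pieces (such as the one before a match on line 1)
#   -f     prefix for the output file names
#   -s     do not print the size of each piece
csplit -s -z -f segment_ inputfile '/beginstring/' '{*}'

ls segment_*
```

    Like the awk version, this splits only on the beginning string, so the
    ending string and anything after it stay in the same piece.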

    --
    Chris F.A. Johnson, author
    Shell Scripting Recipes: A Problem-Solution Approach (2005, Apress)
    ===== My code in this post, if any, assumes the POSIX locale
    ===== and is released under the GNU General Public Licence

  3. Re: Newbie gawk (?) question

    Yes, I found csplit shortly after I hit post, but had to wait a while
    for my post to show up in order to post my "reply".
    In any case, thank you VERY much for the assistance with awk / gawk; I
    need to know how to do the sort of thing you illustrate anyway because I
    will undoubtedly need to do more complex things than a split later. A
    real example goes much further in helping me see how it's done than the
    man pages.
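
    As one example of a slightly more complex split, the sketch below also
    honours the ending string from the original question, so text outside a
    beginstring/endstring pair is left out. It is an illustration, not the
    method posted above; the sample input, the segment_ prefix and the
    variable names (inside, out, n) are assumptions.

```shell
# Work in a scratch directory so the generated files are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input with text outside the marked segments.
printf '%s\n' 'preamble' \
              'beginstring' 'first body' 'endstring' \
              'between segments' \
              'beginstring' 'second body' 'endstring' > inputfile

awk '
/beginstring/ { out = "segment_" (++n); inside = 1 }    # start a new numbered file
inside        { print > out }                           # copy lines while in a segment
/endstring/   { if (inside) close(out); inside = 0 }    # finish the current segment
' inputfile

ls segment_*
```

    The marker lines themselves are written to each segment file; moving
    the print rule below the endstring rule, and the beginstring rule below
    the print rule, would exclude them instead.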
