sed/awk question - SCO

  1. sed/awk question

    Hello SCO folks,

    Is there a way to take the following lines:
    100001 ASCII_STRING 1000
    999999 ASCII STRING 99999
    and have output:
    100001 1000
    999999 99999
           ^^^^^------ 1000 - 99999 Acceptable 4 to 5 digits only
    ^^^^^^------------- 100001 - 999999 Acceptable 6 digits only

    *ALL* non-numeric characters must be ignored:
    123456 ASCII STRING[* space " , # ~ ! etc ]55555[* space " , # ~ ! etc ]
    should end up with:
    123456 55555

    PS:
    123456 ASCII_STRING, 555.222.1111 (phone numbers should be ignored)
    123456 ASCII STRING 555-222-1111 (phone numbers should be ignored)

    I got it as far as the top example, but trying to 'set' the line is
    not reliable with all the scenarios. Any feedback is appreciated.

    - Jeff H

  2. Re: sed/awk question

    On Sun, Feb 24, 2008, Jeff Hyman wrote:
    >Hello SCO folks,
    >
    >Is there a way to take the following lines:
    >100001 ASCII_STRING 1000
    >999999 ASCII STRING 99999
    >and have output:
    >100001 1000
    >999999 99999
    > ^^^^^------ 1000 - 99999 Acceptable 4 to 5 digits only
    >^^^^^^------------- 100001 - 999999 Acceptable Only 6 digits only
    >
    >*ALL* non-numeric characters must be ignored:
    >123456 ASCII STRING[* space " , # ~ ! etc ]55555[* space " , # ~ ! etc ]
    >should end up with:
    >123456 55555
    >
    >PS:
    >123456 ASCII_STRING, 555.222.1111 (phone numbers should be ignored)
    >123456 ASCII STRING 555-222-1111 (phone numbers should be ignored)


    I would use Python (or perhaps Perl). This should work:

    #!/usr/bin/env python
    import re, fileinput, sys

    # This pattern matches 6 digits, anything, then 4 or 5 digits,
    # optionally followed by trailing whitespace.
    matchPattern = re.compile(r'([0-9]{6})\s.*\s([0-9]{4,5})\s*$')

    for line in fileinput.input():
        # Strip only the newline so trailing blanks are still seen.
        R = matchPattern.match(line.rstrip('\n'))
        if R:
            print('\t'.join(R.groups()))
    sys.exit(0)
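
The pattern above expects the second number at the end of the line, set off by whitespace. A looser sketch follows for the "ignore all non-numeric characters" and phone-number cases. It rests on assumptions not stated in the thread: phone numbers always look like 555.222.1111 or 555-222-1111, a line whose only other digits are a phone number is dropped entirely, and only digit counts are checked (not the exact 100001 - 999999 and 1000 - 99999 ranges). The names `PHONE`, `FIRST`, `SECOND` and `extract` are made up for illustration.

```python
import re

# Assumption: phone numbers look like 555.222.1111 or 555-222-1111.
PHONE = re.compile(r'\d{3}[.-]\d{3}[.-]\d{4}')
# The line must start with exactly six digits (leading blanks allowed).
FIRST = re.compile(r'\s*(\d{6})(?!\d)')
# A standalone run of four or five digits, whatever surrounds it.
SECOND = re.compile(r'(?<!\d)\d{4,5}(?!\d)')


def extract(line):
    """Return (first, second) as strings, or None if the line does not qualify."""
    m = FIRST.match(line)
    if not m:
        return None
    # Blank out phone numbers so their last four digits are not picked up.
    rest = PHONE.sub(' ', line[m.end():])
    found = SECOND.findall(rest)
    if not found:
        return None
    # Take the last qualifying run, matching the examples in the question.
    return m.group(1), found[-1]


if __name__ == '__main__':
    for sample in ('100001 ASCII_STRING 1000',
                   '123456 ASCII STRING 555-222-1111'):
        print(extract(sample))
```

Removing phone-shaped tokens before searching keeps the digit pattern simple; if the real data has other phone formats, `PHONE` would need to be widened.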

    Bill
    --
    INTERNET: bill@celestial.com Bill Campbell; Celestial Software LLC
    URL: http://www.celestial.com/ PO Box 820; 6641 E. Mercer Way
    FAX: (206) 232-9186 Mercer Island, WA 98040-0820; (206) 236-1676

    Now if there is one thing that we do worse than any other nation, it is
    try and manage somebody else's affairs.
    Will Rogers

  3. Re: sed/awk question

    Jeff Hyman wrote:
    > Hello SCO folks,
    >
    > Is there a way to take the following lines:
    > 100001 ASCII_STRING 1000
    > 999999 ASCII STRING 99999
    > and have output:
    > 100001 1000
    > 999999 99999
    > ^^^^^------ 1000 - 99999 Acceptable 4 to 5 digits only
    > ^^^^^^------------- 100001 - 999999 Acceptable Only 6 digits only


    Jeff,

    Not really enough information to give you a complete answer.

    On the surface the following should do what you want:

    sed 's/  *.*  */ /' input_file > output_file

    From the examples you give, "ASCII_STRING" is separated from the first
    block and the last block of numbers by one or more spaces, with no
    spaces after the last block of numbers:

    100001 ASCII_STRING 1000$ <-- The $ marks the end of the line

    The search string above is /space_space_*.*space_space_*/, which catches
    all cases of single or multiple spaces between the first block of numbers,
    "ASCII_STRING", and the last block of numbers.

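The search string described above (one or more spaces, anything, one or more spaces) can be tried out with Python's `re.sub` as a rough stand-in for sed. This is only a sketch: the function name `squeeze` is made up, and `count=1` is used to mirror sed replacing just the first match when no `g` flag is given.

```python
import re


def squeeze(line):
    """Replace 'spaces, anything, spaces' with a single space, once per line."""
    # ' +' means one or more spaces, the same as "space space *" in sed.
    # count=1 mirrors sed's default of replacing only the first match.
    return re.sub(r' +.* +', ' ', line, count=1)


if __name__ == '__main__':
    print(squeeze('100001 ASCII_STRING 1000'))
    print(squeeze('999999 ASCII STRING 99999'))
```

Two limits show up quickly: a trailing space after the last number makes the match swallow that number, and a phone number at the end of the line is passed through untouched.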

    >
    > *ALL* non-numeric characters must be ignored:
    > 123456 ASCII STRING[* space " , # ~ ! etc ]55555[* space " , # ~ ! etc ]
    > should end up with:
    > 123456 55555
    >
    > PS:
    > 123456 ASCII_STRING, 555.222.1111 (phone numbers should be ignored)
    > 123456 ASCII STRING 555-222-1111 (phone numbers should be ignored)
    >
    > I got it as far as the top example, but trying to 'set' the line is
    > not reliable with all the scenarios. Any feedback is appreciated.
    >
    > - Jeff H
    >
    >


    --
    Steve Fabac
    S.M. Fabac & Associates
    816/765-1670

  4. Re: sed/awk question

    On Feb 24, 3:23 pm, "Steve M. Fabac, Jr." wrote:
    > Jeff Hyman wrote:
    > > Hello SCO folks,

    >
    > > Is there a way to take the following lines:
    > > 100001 ASCII_STRING 1000
    > > 999999 ASCII STRING 99999
    > > and have output:
    > > 100001 1000
    > > 999999 99999
    > > ^^^^^------ 1000 - 99999 Acceptable 4 to 5 digits only
    > > ^^^^^^------------- 100001 - 999999 Acceptable Only 6 digits only

    >
    > Jeff,
    >
    > Not really enough information to give you a complete answer.
    >
    > On the surface the following should do what you want:
    >
    > sed 's/  *.*  */ /' input_file > output_file
    >
    > From the examples you give "ASCII_STRING" is separated from the first
    > block and last block of numbers with one or more spaces with no
    > spaces after the last block of numbers:
    >
    > 100001 ASCII_STRING 1000$ <-- The $ marks the end of the line
    >
    > The search string above is /space_space_*.*space_space_*/ to get all
    > cases of single or multiple spaces between the first block of numbers
    > "ASCII_STRING" and the last block of numbers.
    >
    >
    >
    >
    >
    > > *ALL* non-numeric characters must be ignored:
    > > 123456 ASCII STRING[* space " , # ~ ! etc ]55555[* space " , # ~ ! etc ]
    > > should end up with:
    > > 123456 55555

    >
    > > PS:
    > > 123456 ASCII_STRING, 555.222.1111 (phone numbers should be ignored)
    > > 123456 ASCII STRING 555-222-1111 (phone numbers should be ignored)

    >
    > > I got it as far as the top example, but trying to 'set' the line is
    > > not reliable with all the scenarios. Any feedback is appreciated.

    >
    > > - Jeff H

    >
    > --
    > Steve Fabac
    > S.M. Fabac & Associates
    > 816/765-1670


    Your constraints aren't well defined.

    But if the input is regular and you want to print the warnings shown
    on an unspecified SCO system without installing new software, then awk
    is the thing to use, perhaps preprocessed with sed as Steve suggests.
    E.g.,
    sed 's/  *.*  */ /' input_file | awk '{ print $1 " " $2
        if ($2 < 1000 || $2 > 99999) print "4 or 5 digits only" }'
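
For comparison with the Python reply earlier in the thread, the same kind of range warning can be sketched in Python. The function name `check_pair` and the exact warning wording are made up; the ranges are the ones given in the question (100001 - 999999 and 1000 - 99999), and both arguments are assumed to be strings of digits.

```python
def check_pair(first, second):
    """Return a list of warnings for one 'first second' output pair."""
    warnings = []
    # First field: six digits, 100001 through 999999.
    if not 100001 <= int(first) <= 999999:
        warnings.append('first field: 6 digits only (100001 - 999999)')
    # Second field: four or five digits, 1000 through 99999.
    if not 1000 <= int(second) <= 99999:
        warnings.append('second field: 4 or 5 digits only (1000 - 99999)')
    return warnings


if __name__ == '__main__':
    print(check_pair('100001', '1000'))
    print(check_pair('100001', '999'))
```

Comparing as integers rather than counting characters also rejects values such as 100000 or 0999 that have the right length but fall outside the stated ranges.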

    --RLR
