Please help me in writing script - Unix


  1. Please help me in writing script

    Hi,
    I have the following two files. Characters 14-18 of each line in file
    input2 match the 2nd field (delimited by ~) in file input1.
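
To make the matching rule concrete, here is a minimal sketch using `cut` on one sample line from each file; the variable names are only illustrative.

```shell
#!/bin/sh
# Sample lines taken from input1 and input2.
line2='000134900017400027B-ABATPNB050062184TPNB050063880'
line1='P00000012~00027'

# Characters 14-18 of the input2 line.
key=$(printf '%s\n' "$line2" | cut -c14-18)
# Second ~-delimited field of the input1 line.
field=$(printf '%s\n' "$line1" | cut -d'~' -f2)

echo "$key $field"   # prints: 00027 00027
```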

    swadmin@tb142:/rangedoms1/working/
    CRST_OVERLAY_ENHANCE_Analysis_RNGCTRL_DEV> cat input1
    P00000012~00027
    P00000027~00061
    P00000270~00417
    P00000271~00418
    P00000272~00419
    P00000273~00420
    P00000274~00422
    P00000275~00424
    P00000276~00428
    P00000277~00429
    P00000278~00431
    P00000279~00432
    P00000329~00483
    P60000329~00483
    P50000329~00483
    P40000329~00483
    P30000329~00483
    P20000329~00483
    P10000329~00483
    P00000483~00639
    P01000079~00178
    P11000079~00178
    P90000079~00178
    P80000079~00178
    P70000079~00178
    P60000079~00178
    P50000079~00178
    P40000079~00178
    P30000079~00178
    P20000079~00178
    P10000079~00178
    P00000178~00306
    swadmin@tb142:/rangedoms1/working/
    CRST_OVERLAY_ENHANCE_Analysis_RNGCTRL_DEV> cat input2
    000134900017400027B-ABATPNB050062184TPNB050063880
    000134900017400483B-ABATPNB050062184TPNB050063880
    000134900017400178C-BCBTPNB050562934TPNB050446531
    000134900017400178B-ABATPNB050062184TPNB050063880
    000134900017400483C-ACATPNB050064199TPNB050064268
    000134900017400178C-ACATPNB050064199TPNB050064268
    Now I want an output file that contains every line of input2, with
    characters 14-18 replaced by the matching first field (delimited by ~)
    from input1. When several input1 lines share the same second field,
    each first field should be used only once, not repeated.
    I have used the following script, which displays:
    swadmin@tb142:/rangedoms1/working/
    CRST_OVERLAY_ENHANCE_Analysis_RNGCTRL_DEV> awk -F'~' \
      'NR==FNR{a[$2]=$1;next}{print substr($0,1,13),a[substr($0,14,5)],substr($0,19)}' \
      input1 input2
    0001349000174 P00000012 B-ABATPNB050062184TPNB050063880
    0001349000174 P10000329 B-ABATPNB050062184TPNB050063880
    0001349000174 P10000079 C-BCBTPNB050562934TPNB050446531
    0001349000174 P10000079 B-ABATPNB050062184TPNB050063880
    0001349000174 P10000329 C-ACATPNB050064199TPNB050064268
    0001349000174 P10000079 C-ACATPNB050064199TPNB050064268

    But P10000079 and P10000329 are repeated, which we don't want.
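
A likely reason for the repeats, as a small sketch: `a[$2]=$1` stores only one value per key, so each later input1 line with the same second field overwrites the earlier one and only the last value survives. The snippet uses three of the `00483` lines from input1 to show this; the function name `last_value_per_key` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical demo helper: prints the single value awk keeps for each key
# when the array is filled with a[$2]=$1, as in the one-liner above.
last_value_per_key() {
    awk -F'~' '{ a[$2] = $1 } END { for (k in a) print k, a[k] }'
}

printf '%s\n' 'P00000329~00483' 'P60000329~00483' 'P10000329~00483' |
    last_value_per_key
# prints: 00483 P10000329
```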

    The resulting file should look like the following:

    0001349000174 P00000012B-ABATPNB050062184TPNB050063880
    0001349000174 P00000329B-ABATPNB050062184TPNB050063880
    0001349000174 P01000079C-BCBTPNB050562934TPNB050446531
    0001349000174 P11000079B-ABATPNB050062184TPNB050063880
    0001349000174 P60000329C-ACATPNB050064199TPNB050064268
    0001349000174 P90000079C-ACATPNB050064199TPNB050064268

    Many thanks in advance for your help.

    Regards
    Injam

  2. Re: Please help me in writing script

    Injam wrote:
    >
    > I have the following two files. Characters 14-18 of each line in file
    > input2 match the 2nd field (delimited by ~) in file input1.
    >
    > swadmin@tb142:/rangedoms1/working/
    > CRST_OVERLAY_ENHANCE_Analysis_RNGCTRL_DEV> cat input1
    > P00000012~00027
    > P00000027~00061


    [ snip ]

    > P10000079~00178
    > P00000178~00306
    > swadmin@tb142:/rangedoms1/working/
    > CRST_OVERLAY_ENHANCE_Analysis_RNGCTRL_DEV> cat input2
    > 000134900017400027B-ABATPNB050062184TPNB050063880
    > 000134900017400483B-ABATPNB050062184TPNB050063880
    > 000134900017400178C-BCBTPNB050562934TPNB050446531
    > 000134900017400178B-ABATPNB050062184TPNB050063880
    > 000134900017400483C-ACATPNB050064199TPNB050064268
    > 000134900017400178C-ACATPNB050064199TPNB050064268
    > Now I want an output file that contains every line of input2, with
    > characters 14-18 replaced by the matching first field (delimited by ~)
    > from input1. When several input1 lines share the same second field,
    > each first field should be used only once, not repeated.
    > I have used the following script, which displays:
    > swadmin@tb142:/rangedoms1/working/
    > CRST_OVERLAY_ENHANCE_Analysis_RNGCTRL_DEV> awk -F'~' \
    >   'NR==FNR{a[$2]=$1;next}{print substr($0,1,13),a[substr($0,14,5)],substr($0,19)}' \
    >   input1 input2
    > 0001349000174 P00000012 B-ABATPNB050062184TPNB050063880
    > 0001349000174 P10000329 B-ABATPNB050062184TPNB050063880
    > 0001349000174 P10000079 C-BCBTPNB050562934TPNB050446531
    > 0001349000174 P10000079 B-ABATPNB050062184TPNB050063880
    > 0001349000174 P10000329 C-ACATPNB050064199TPNB050064268
    > 0001349000174 P10000079 C-ACATPNB050064199TPNB050064268
    >
    > But P10000079 and P10000329 are repeated, which we don't want.
    >
    > resulting file should look like the following
    >
    > 0001349000174 P00000012B-ABATPNB050062184TPNB050063880
    > 0001349000174 P00000329B-ABATPNB050062184TPNB050063880
    > 0001349000174 P01000079C-BCBTPNB050562934TPNB050446531
    > 0001349000174 P11000079B-ABATPNB050062184TPNB050063880
    > 0001349000174 P60000329C-ACATPNB050064199TPNB050064268
    > 0001349000174 P90000079C-ACATPNB050064199TPNB050064268


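    #!/usr/bin/perl
    use strict;
    use warnings;

    my %data;

    # Pass 1: queue the field-1 values of input1 under their field-2 key,
    # in file order.
    @ARGV = 'input1';
    while ( <> ) {
        /^([^~]+)~(.+)$/ && push @{ $data{ $2 } }, $1;
    }

    # Pass 2: replace characters 14-18 of each input2 line with a space plus
    # the next unused value for that key; lines with no value left are
    # printed unchanged.
    @ARGV = 'input2';
    while ( <> ) {
        my $key = substr $_, 13, 5;
        substr $_, 13, 5, ' ' . shift @{ $data{ $key } }
            if $data{ $key } && @{ $data{ $key } };
        print;
    }

    __END__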


    John
    --
    Perl isn't a toolbox, but a small machine shop where you
    can special-order certain sorts of tools at low cost and
    in short order. -- Larry Wall

  3. Re: Please help me in writing script

    On Jul 2, 11:04 pm, "John W. Krahn" wrote:
    > [ snip ]


    Hi John,

    It is working fine. Many thanks for the help.

    Regards
    Injam

  4. Re: Please help me in writing script

    On Jul 3, 8:49 am, Injam wrote:
    > Hi John,
    >
    > It is working fine. Many thanks for the help.
    >
    > Regards
    > Injam


    Hi John,
    The script you have given is working fine, but we want a shell script.
    We would really appreciate it if you could write a shell script.

    Many thanks for your help.


    Thanks
    Injam

  5. Re: Please help me in writing script

    On 2008-07-03, Injam wrote:
    > On Jul 3, 8:49 am, Injam wrote:
    >> On Jul 2, 11:04 pm, "John W. Krahn" wrote:
    >> ...
    >
    > Hi John,
    > The script you have given is working fine, but we want a shell script.
    > We would really appreciate it if you could write a shell script.


    It may help if you define what you mean by "shell script".
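
For what it's worth, if "shell script" just means something runnable from sh without Perl, the same queue-per-key idea can be sketched with awk inside a shell function. This is only a sketch under assumptions: the function name `replace_keys` is made up for illustration, the files are assumed to be laid out exactly as in the original post, and a line whose key has no unused value left is passed through unchanged.

```shell
#!/bin/sh
# replace_keys MAPFILE DATAFILE  (hypothetical helper name, not from the thread)
# MAPFILE lines look like P00000012~00027; DATAFILE holds the key in
# characters 14-18 of each line.
replace_keys() {
    awk -F'~' '
        # First file: queue each field-1 value under its field-2 key, in file order.
        NR == FNR { count[$2]++; queue[$2, count[$2]] = $1; next }
        # Second file: take the next unused value for the key, if any is left.
        {
            key = substr($0, 14, 5)
            if (used[key] < count[key]) {
                used[key]++
                $0 = substr($0, 1, 13) " " queue[key, used[key]] substr($0, 19)
            }
            print
        }
    ' "$1" "$2"
}
```

Assuming the file names from the original post, it could be run as `replace_keys input1 input2 > output`. Keeping a counter per key, rather than a single `a[$2]` slot, is what lets each value be handed out once in input1 order.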

