Please help me in writing script - Unix
-
Please help me in writing script
Hi,
I have the following two files. Characters 14-18 of each line in input2
match the 2nd field (delimited by ~) of a line in input1.
swadmin@tb142:/rangedoms1/working/
CRST_OVERLAY_ENHANCE_Analysis_RNGCTRL_DEV> cat input1
P00000012~00027
P00000027~00061
P00000270~00417
P00000271~00418
P00000272~00419
P00000273~00420
P00000274~00422
P00000275~00424
P00000276~00428
P00000277~00429
P00000278~00431
P00000279~00432
P00000329~00483
P60000329~00483
P50000329~00483
P40000329~00483
P30000329~00483
P20000329~00483
P10000329~00483
P00000483~00639
P01000079~00178
P11000079~00178
P90000079~00178
P80000079~00178
P70000079~00178
P60000079~00178
P50000079~00178
P40000079~00178
P30000079~00178
P20000079~00178
P10000079~00178
P00000178~00306
swadmin@tb142:/rangedoms1/working/
CRST_OVERLAY_ENHANCE_Analysis_RNGCTRL_DEV> cat input2
000134900017400027B-ABATPNB050062184TPNB050063880
000134900017400483B-ABATPNB050062184TPNB050063880
000134900017400178C-BCBTPNB050562934TPNB050446531
000134900017400178B-ABATPNB050062184TPNB050063880
000134900017400483C-ACATPNB050064199TPNB050064268
000134900017400178C-ACATPNB050064199TPNB050064268
Now I want a file that contains every line of input2, but with
characters 14-18 replaced by the matching field 1 (delimited by ~) from
input1, and no field 1 value should be used more than once.
I have used the following script which displays
swadmin@tb142:/rangedoms1/working/
CRST_OVERLAY_ENHANCE_Analysis_RNGCTRL_DEV> awk -F'~' 'NR==FNR{a[$2]=$1;next}
{print substr($0,1,13),a[substr($0,14,5)],substr($0,19)}' input1 input2
0001349000174 P00000012 B-ABATPNB050062184TPNB050063880
0001349000174 P10000329 B-ABATPNB050062184TPNB050063880
0001349000174 P10000079 C-BCBTPNB050562934TPNB050446531
0001349000174 P10000079 B-ABATPNB050062184TPNB050063880
0001349000174 P10000329 C-ACATPNB050064199TPNB050064268
0001349000174 P10000079 C-ACATPNB050064199TPNB050064268
But P10000079 and P10000329 are repeated, which we don’t want.
The resulting file should look like the following:
0001349000174 P00000012B-ABATPNB050062184TPNB050063880
0001349000174 P00000329B-ABATPNB050062184TPNB050063880
0001349000174 P01000079C-BCBTPNB050562934TPNB050446531
0001349000174 P11000079B-ABATPNB050062184TPNB050063880
0001349000174 P60000329C-ACATPNB050064199TPNB050064268
0001349000174 P90000079C-ACATPNB050064199TPNB050064268
Many thanks in advance for your help.
Regards
Injam
-
Re: Please help me in writing script
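#!/usr/bin/perl
# Pass 1: queue each field 1 of input1 under its field 2 key, in file order.
@ARGV = 'input1';
while ( <> ) {
    /^([^~]+)~(.+)$/ && push @{ $data{ $2 } }, $1;
}
# Pass 2: swap characters 14-18 of each input2 line for a space plus the
# next unused field 1 queued under that key, then print the line.
@ARGV = 'input2';
while ( <> ) {
    exists $data{ substr $_, 13, 5 } && substr $_, 13, 5, ' ' . shift @{ $data{ substr $_, 13, 5 } };
    print;
}
__END__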
John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall
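The same queue-per-key idea can probably be expressed in awk and wrapped in a shell function. The following is a minimal sketch, not a drop-in replacement: the function name remap_fields and the array names queue, head and tail are hypothetical, and unlike the Perl script it leaves a line unchanged once the replacements for its key have run out.

```shell
#!/bin/sh
# Sketch: awk port of the queue-per-key approach from the Perl script.
# remap_fields is a hypothetical name; $1 = mapping file, $2 = data file.
remap_fields() {
    awk -F'~' '
        NR == FNR {
            # First file: append field 1 to the queue kept for field 2.
            queue[$2, ++tail[$2]] = $1
            next
        }
        {
            # Second file: characters 14-18 are the lookup key.
            key = substr($0, 14, 5)
            if (head[key] < tail[key])   # an unused replacement remains
                $0 = substr($0, 1, 13) " " queue[key, ++head[key]] substr($0, 19)
            print
        }
    ' "$1" "$2"
}
```

With the file names from the post it could be called as remap_fields input1 input2 > output.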
-
Re: Please help me in writing script
On Jul 2, 11:04 pm, "John W. Krahn" wrote:
> [ snip ]
Hi John,
It is working fine....many thanks for the help
Regards
Injam
-
Re: Please help me in writing script
On Jul 3, 8:49 am, Injam wrote:
> [ snip ]
>
> Hi John,
>
> It is working fine....many thanks for the help
Hi John,
The script you have given works fine, but we want a shell script.
I would really appreciate it if you could write a shell script.
Many thanks for your help.
Thanks
Injam
-
Re: Please help me in writing script
On 2008-07-03, Injam wrote:
> Hi John,
> The script you have given works fine, but we want a shell script.
> I would really appreciate it if you could write a shell script.
It may help if you define what you mean by "shell script".
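If "shell script" is taken to mean a script that calls neither awk nor perl, one possible reading is sketched here using only POSIX sh builtins. This is an illustration under assumptions rather than a recommended solution: the function name remap_plain_sh and all variable names are hypothetical, field 1 values are assumed to contain no spaces, the variables are global because POSIX sh has no local, and it will be much slower than awk on large files.

```shell
#!/bin/sh
# Sketch only: the same remapping using nothing but POSIX sh builtins.
# remap_plain_sh is a hypothetical name; $1 = mapping file, $2 = data file.
remap_plain_sh() {
    queue=' '    # entries look like "KEY~CODE ", kept in file order
    while IFS='~' read -r code key; do
        [ -n "$key" ] && queue="$queue$key~$code "
    done < "$1"

    while IFS= read -r line; do
        key=
        rest=${line#?????????????}         # line minus its first 13 characters
        if [ "$rest" != "$line" ]; then    # only true when line has 13+ characters
            suffix=${rest#?????}           # character 19 onwards
            key=${rest%"$suffix"}          # characters 14-18
        fi
        case $queue in
            *" $key~"*)
                after=${queue#*" $key~"}   # text following the first match
                code=${after%% *}          # oldest unused code for this key
                queue=${queue%%" $key~"*}${after#"$code"}
                line="${line%"$rest"} $code$suffix"
                ;;
        esac
        printf '%s\n' "$line"
    done < "$2"
}
```

Each matched entry is cut out of the queue string, so a later line with the same key picks up the next field 1 in input1 order, and a line whose key has no entries left is printed unchanged.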