random substitution with awk
Dear all,
I am a new user of the forum and of awk.
I have the same problem:
I have a file in the following format:
Si -7.87760000 6.16710000 16.80090000
Si -2.20190000 8.96480000 18.80290000
Si -3.91340000 6.18110000 16.79500000
Si -5.89320000 3.44010000 18.81540000
Si -5.89980000 7.45760000 17.18100000
H -7.84830000 0.67980000 16.84620000
H -7.86760000 4.81220000 18.78920000
Si 2.20250000 8.96490000 18.80280000
Si -0.00010000 6.26880000 16.82620000
Si -1.94640000 3.46010000 18.83720000
H -1.99140000 7.62480000 16.81710000
Si -3.91150000 0.68930000 16.84840000
Si -3.92500000 4.81880000 18.78390000
Si -5.88640000 -2.06440000 18.85090000
H -5.88640000 2.06580000 16.83280000
Si -7.86620000 -4.81050000 16.89460000
Si -7.84800000 -0.67730000 18.83550000
Si 3.91310000 6.18150000 16.79530000
Si 1.94640000 3.46000000 18.83770000
Si 1.99110000 7.62510000 16.81700000
I would like to randomly replace the symbol "Si" with the symbol "Ge" in the first column, on a chosen number of lines. Lines whose symbol is H must be left unchanged. I started with this script, but it doesn't work:
#!/usr/bin/awk -f
#
# Usage:
# ./impurity_gen.awk -v NIMP=12 -v SYMB=Ge
#
BEGIN{
natom=16;
### this is the total number of lines containing the symbol "Si"
nimp=NIMP;
### this is the number of lines I would like to substitute
symb=SYMB;
### this is the symbol with which I'd like to replace Si
srand()
for (j = 1; j <= nimp; ++j) {
# loop to find a not-yet-seen selection
do {
select = 1 + int(rand() * natom)
} while (select in pick)
pick[j] = select
}
}
NF != 4 { next }
which_Si = 0
symb_tmp = $1
if ( /Si/ ) {
which_Si += 1
do {
symb_tmp=symb
} while ( which_Si in pick )
x=$2; y=$3; z=$4;
printf "%5s %15.9f %15.9f %15.9f \n", symb, x, y, z
}
Could you please give me any suggestions or solutions to this problem?
Thank you very much in advance
Re: random substitution with awk
I am not quite sure why you have made such a complicated script. If you just want to change every leading "Si" to "Ge", you can do the following:
$ awk -v var="Ge" '{gsub(/^Si/,var); print}' impurity.txt
Ge -7.87760000 6.16710000 16.80090000
Ge -2.20190000 8.96480000 18.80290000
Ge -3.91340000 6.18110000 16.79500000
Ge -5.89320000 3.44010000 18.81540000
Ge -5.89980000 7.45760000 17.18100000
H -7.84830000 0.67980000 16.84620000
H -7.86760000 4.81220000 18.78920000
.......
impurity.txt is the file with all of the text.
Or, if you really want to use srand(), you could do something like this:
$ awk '{gsub(/^Si/,srand()); print}' impurity.txt
0 -7.87760000 6.16710000 16.80090000
1316152466 -2.20190000 8.96480000 18.80290000
1316152466 -3.91340000 6.18110000 16.79500000
1316152466 -5.89320000 3.44010000 18.81540000
1316152466 -5.89980000 7.45760000 17.18100000
H -7.84830000 0.67980000 16.84620000
H -7.86760000 4.81220000 18.78920000
........
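Of course, that is not really random: srand() returns the previous seed rather than a random number, so every "Si" gets replaced by the same value. That also explains the first line: the initial seed was 0, and each later call returns the time-based seed set by the call before it.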
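For the random substitution actually asked for, here is a minimal sketch. It is not the original poster's method but one possible approach: read all lines into memory, remember which line numbers carry "Si", then pick NIMP of them without repetition using a partial Fisher-Yates shuffle. Two things that trip up the original script are avoided here: `select in pick` tests array indices while the chosen numbers were stored as values, and the `if` statement sat outside any `{ ... }` action. The names NIMP and SYMB come from the usage line of the original script; the shortened sample file and the `result` variable are only for illustration, and the sketch assumes the symbol is the first field on each line.

```shell
#!/bin/sh
# Shortened sample input (a few lines from the file in the question).
cat > impurity.txt <<'EOF'
Si -7.87760000 6.16710000 16.80090000
Si -2.20190000 8.96480000 18.80290000
Si -3.91340000 6.18110000 16.79500000
H -7.84830000 0.67980000 16.84620000
H -7.86760000 4.81220000 18.78920000
Si 2.20250000 8.96490000 18.80280000
EOF

# NIMP = number of Si lines to replace, SYMB = replacement symbol.
result=$(awk -v NIMP=2 -v SYMB=Ge '
BEGIN { srand() }                      # seed rand() from the time of day
{
    line[NR] = $0                      # keep every line, H lines included
    if ($1 == "Si") si[++nsi] = NR     # remember where the Si lines are
}
END {
    nimp = NIMP + 0
    if (nimp > nsi) nimp = nsi         # cannot replace more than exist
    # Partial Fisher-Yates shuffle: after step i, si[1..i] holds
    # i distinct, uniformly chosen Si line numbers.
    for (i = 1; i <= nimp; i++) {
        j = i + int(rand() * (nsi - i + 1))
        tmp = si[i]; si[i] = si[j]; si[j] = tmp
        chosen[si[i]] = 1
    }
    for (n = 1; n <= NR; n++) {
        # First "Si" on a chosen line is the symbol in column one.
        if (n in chosen) sub(/Si/, SYMB, line[n])
        print line[n]
    }
}' impurity.txt)

printf '%s\n' "$result"
```

With the full file from the question, the same awk program would be called with -v NIMP=12 -v SYMB=Ge. Because srand() without an argument seeds from the clock, two runs within the same second will usually pick the same lines; passing an explicit seed, e.g. srand(SEED) with a hypothetical -v SEED=..., would make a run reproducible.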