random substitution with awk - Unix


Thread: random substitution with awk

  1. random substitution with awk

    Dear all,
    I am a new user of the forum and of awk.
    I have the same problem:
    I have a file in the following format:

    Si -7.87760000 6.16710000 16.80090000
    Si -2.20190000 8.96480000 18.80290000
    Si -3.91340000 6.18110000 16.79500000
    Si -5.89320000 3.44010000 18.81540000
    Si -5.89980000 7.45760000 17.18100000
    H -7.84830000 0.67980000 16.84620000
    H -7.86760000 4.81220000 18.78920000
    Si 2.20250000 8.96490000 18.80280000
    Si -0.00010000 6.26880000 16.82620000
    Si -1.94640000 3.46010000 18.83720000
    H -1.99140000 7.62480000 16.81710000
    Si -3.91150000 0.68930000 16.84840000
    Si -3.92500000 4.81880000 18.78390000
    Si -5.88640000 -2.06440000 18.85090000
    H -5.88640000 2.06580000 16.83280000
    Si -7.86620000 -4.81050000 16.89460000
    Si -7.84800000 -0.67730000 18.83550000
    Si 3.91310000 6.18150000 16.79530000
    Si 1.94640000 3.46000000 18.83770000
    Si 1.99110000 7.62510000 16.81700000

    I would like to randomly replace the symbol "Si" with the symbol "Ge" in the first column. Lines whose symbol is H must not be changed. I have tried to start with this script, but it doesn't work:

    #!/usr/bin/awk -f
    #
    # Usage:
    # ./impurity_gen.awk -v NIMP=12 -v SYMB=Ge
    #

    BEGIN{
        natom=16;
        ### this is the total number of lines containing the symbol "Si"
        nimp=NIMP;
        ### this is the number of lines I would like to substitute
        symb=SYMB;
        ### this is the symbol with which I'd like to replace Si
        srand()
        for (j = 1; j <= nimp; ++j) {
            # loop to find a not-yet-seen selection
            do {
                select = 1 + int(rand() * natom)
            } while (select in pick)
            pick[j] = select
        }
    }

    NF != 4 { next }

    which_Si = 0
    symb_tmp = $1

    if ( /Si/ ) {
        which_Si += 1
        do {
            symb_tmp=symb
        } while ( which_Si in pick )

        x=$2; y=$3; z=$4;
        printf "%5s %15.9f %15.9f %15.9f \n", symb, x, y, z
    }

    Could you please suggest a fix or a solution to this problem?
    Thank you very much in advance.

  2. Re: random substitution with awk

    In reply to neppolo's post above:

    I am not quite sure why you have made such a complicated script. You can simply change "Si" to "Ge" by doing the following:

    $ awk -v var="Ge" '{gsub(/^Si/,var); print}' impurity.txt
    Ge -7.87760000 6.16710000 16.80090000
    Ge -2.20190000 8.96480000 18.80290000
    Ge -3.91340000 6.18110000 16.79500000
    Ge -5.89320000 3.44010000 18.81540000
    Ge -5.89980000 7.45760000 17.18100000
    H -7.84830000 0.67980000 16.84620000
    H -7.86760000 4.81220000 18.78920000
    .......

    Here impurity.txt is the file containing your data. Note that this replaces every "Si" at the start of a line, not a random subset.
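
    If the goal is to replace exactly NIMP randomly chosen "Si" lines, as the usage comment in the question suggests, one possible approach is to read the whole file, remember which lines are "Si", and pick a random sample of them in the END block. The following is only a sketch under that assumption; the function name random_subst, the variables nimp, symb and seed, and the optional seed argument are illustrative and do not come from the original script.

```shell
# Sketch: replace exactly nimp randomly chosen "Si" lines with symb.
# Arguments (all hypothetical names): $1 = nimp, $2 = symb, $3 = file, $4 = optional seed
random_subst() {
    awk -v nimp="$1" -v symb="$2" -v seed="$4" '
    BEGIN {
        # Seed once; a fixed seed makes the selection reproducible.
        if (seed != "") srand(seed); else srand()
    }
    {
        line[NR] = $0
        if ($1 == "Si") si[++nsi] = NR   # record numbers of the Si lines
    }
    END {
        if (nimp > nsi) nimp = nsi       # cannot replace more Si lines than exist
        # Partial Fisher-Yates shuffle: after step i, si[1..i] is a random sample.
        for (i = 1; i <= nimp; i++) {
            j = i + int(rand() * (nsi - i + 1))
            tmp = si[i]; si[i] = si[j]; si[j] = tmp
            chosen[si[i]] = 1
        }
        for (n = 1; n <= NR; n++) {
            # $1 was "Si", so the first "Si" in the line is the symbol column.
            if (n in chosen) sub(/Si/, symb, line[n])
            print line[n]
        }
    }' "$3"
}
```

    With the data from the question, random_subst 12 Ge impurity.txt should print the file with 12 of the 16 "Si" lines changed to "Ge" and the "H" lines untouched. The shuffle avoids the retry loop of the original script, which also tested "select in pick" against the array indices rather than the stored values.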

    Or, if you really want to use srand(), you could do something like this:

    $ awk '{gsub(/^Si/,srand()); print}' impurity.txt
    0 -7.87760000 6.16710000 16.80090000
    1316152466 -2.20190000 8.96480000 18.80290000
    1316152466 -3.91340000 6.18110000 16.79500000
    1316152466 -5.89320000 3.44010000 18.81540000
    1316152466 -5.89980000 7.45760000 17.18100000
    H -7.84830000 0.67980000 16.84620000
    H -7.86760000 4.81220000 18.78920000
    ........

    Of course, that does not give a random symbol: srand() returns the previous seed, not a random number. That is why the first line shows 0 (the initial seed) and every later line shows the time-based seed set by the call before it.
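
    For random choices in awk, the usual pattern is to call srand() once in BEGIN and then draw numbers with rand(), which returns a value in [0, 1). As a rough sketch, each "Si" line could be replaced independently with some probability p. The names prob_subst, p, symb and seed are made up for this example.

```shell
# Sketch: replace each "Si" line independently with probability p.
# Arguments (all hypothetical names): $1 = p, $2 = symb, $3 = file, $4 = optional seed
prob_subst() {
    awk -v p="$1" -v symb="$2" -v seed="$4" '
    BEGIN {
        # Seed the generator once, not once per line.
        if (seed != "") srand(seed); else srand()
    }
    # rand() is only drawn for Si lines; H lines pass through unchanged.
    $1 == "Si" && rand() < p { sub(/Si/, symb) }
    { print }' "$3"
}
```

    For example, prob_subst 0.5 Ge impurity.txt changes roughly half of the "Si" lines on average, but the exact count varies from run to run. If a fixed number of substitutions is required, the lines have to be counted first.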
