Some math tricks in GNU awk

Prerequisite: check whether your awk (gawk) was built with the GNU MPFR and GNU MP (GMP) libraries, which the arbitrary-precision option -M used in item 3 requires:
awk --version

GNU Awk 4.1.4, API: 1.1 (GNU MPFR 4.0.1, GNU MP 6.1.2)
Copyright (C) 1989, 1991-2016 Free Software Foundation.

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see http://www.gnu.org/licenses/.


Contents of N2O5.txt:
 0           0.0165
10           0.0124
20           0.0093
30           0.0071
40           0.0053
50           0.0039
60           0.0029
  1. Sum of the 1st column:
    awk '{ sum += $1} ; END {print sum}' N2O5.txt
    210
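    The one-liner can be generalized to any column by passing the column number in with -v. A minimal sketch, assuming whitespace-separated numeric input on stdin; the function name col_sum_mean and the variable col are hypothetical names chosen for illustration:

```shell
# Hypothetical helper: sum and mean of one column (default: column 1), read from stdin.
col_sum_mean() {
    awk -v col="${1:-1}" '
        { sum += $col; n++ }                      # $col selects the requested field
        END { if (n) printf "sum=%g mean=%g\n", sum, sum / n }'
}
```

    Usage: col_sum_mean 1 < N2O5.txt (column 1), or col_sum_mean 2 < N2O5.txt (column 2).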

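  2. Math inside awk: natural logarithm and reciprocal of the 2nd column
    awk -F' ' '{print $1, $2, log($2), 1/($2)}' N2O5.txt
    Or, with named intermediate variables:
    awk -F' ' '{natlog=log($2); inv=1/$2; print $1, $2, natlog, inv}' N2O5.txt
    print formats computed numbers with OFMT (default "%.6g"). Compare the output with an explicit printf format: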
    awk -F' ' '{natlog=log($2); inv=1/$2; printf "%2d %.8f %.8f %.8f\n", $1, $2, natlog, inv}' N2O5.txt
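    Both log($2) and 1/$2 are undefined when the 2nd column is zero, and awk aborts on a division by zero. A guarded sketch, assuming that rows with a non-positive 2nd column should simply be reported and skipped; the function name log_inv is hypothetical:

```shell
# Hypothetical helper: print x, y, ln(y), 1/y for rows whose 2nd column is positive.
log_inv() {
    awk '$2 + 0 > 0 { printf "%2d %.4f %.6f %.4f\n", $1, $2, log($2), 1 / $2; next }
         { printf "line %d skipped: 2nd column must be > 0\n", NR > "/dev/stderr" }'
}
```

    Usage: log_inv < N2O5.txt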


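  3. Floating-point (FP) precision under IEEE 754 arithmetic. The first command uses native double precision; -M switches gawk to MPFR arbitrary-precision arithmetic, with the working precision set by PREC ("quad" corresponds to 113 bits):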
    awk 'BEGIN { printf "%.17f\n", 99.15-20.85 }'
    awk -M -v PREC="double" 'BEGIN { printf "%.17f\n", 99.15-20.85 }'
    awk -M -v PREC="quad" 'BEGIN { printf "%.17f\n", 99.15-20.85 }'
    awk -M -v PREC=113 'BEGIN { printf "%.17f\n", 99.15-20.85 }'
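    The same rounding effect matters whenever two computed values are compared. A minimal sketch, assuming native IEEE 754 double arithmetic with round-to-nearest (no -M); the function name fp_demo and the tolerance 1e-9 are illustrative choices:

```shell
# Hypothetical helper: compare 99.15-20.85 with 78.3 exactly and within a tolerance.
fp_demo() {
    awk 'BEGIN {
        d = 99.15 - 20.85
        err = d - 78.3
        if (err < 0) err = -err                   # absolute difference
        printf "exact: %s\n", (d == 78.3) ? "equal" : "differ"
        printf "within 1e-9: %s\n", (err < 1e-9) ? "equal" : "differ"
    }'
}
```

    In double precision the exact comparison reports "differ" while the tolerance comparison reports "equal", so tests on computed FP values are safer with a tolerance than with ==.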


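  4. awk loops to calculate the successive differences $y_2-y_1$, $y_3-y_2$, $y_4-y_3$, ..., or the corresponding ratios. The first awk joins column 2 into a single line; the second loops over its fields: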

    awk -F' ' '{printf " %s", $2}' N2O5.txt | awk -F' ' '{for (i=1;i<NF;++i) print $(i+1)"-"$i}' | bc -l
    awk -F' ' '{printf " %s", $2}' N2O5.txt | awk -F' ' '{for (i=1;i<NF;++i) print $(i+1)-$i}'
    awk -F' ' '{printf " %s", $2}' N2O5.txt | awk -F' ' '{for (i=1;i<NF;++i) printf "%.8f\n", $(i+1)-$i}'
    awk -F' ' '{printf " %s", $2}' N2O5.txt | awk -F' ' '{for (i=1;i<NF;++i) print "scale=8; "$(i+1)"/"$i}' | bc -l
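    The two-stage pipelines above can also be collapsed into one awk process by buffering the column in an array and looping in END. A sketch under the assumption that the whole column fits in memory; the function name succ_diff_ratio is hypothetical:

```shell
# Hypothetical helper: successive differences and ratios of column 2, read from stdin.
succ_diff_ratio() {
    awk '{ y[NR] = $2 }                            # buffer column 2, indexed by row
         END {
             for (i = 1; i < NR; i++) {
                 if (y[i] == 0) {                  # avoid a fatal division by zero
                     printf "%.8f n/a\n", y[i+1] - y[i]
                     continue
                 }
                 printf "%.8f %.8f\n", y[i+1] - y[i], y[i+1] / y[i]
             }
         }'
}
```

    Usage: succ_diff_ratio < N2O5.txt. Note that printf rounds to 8 decimals, whereas bc with scale=8 truncates.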


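  5. Calculate deviations and finite-difference derivatives by carrying the previous row's values in awk variables:
    awk '{if (NR>1) printf "%3d\t %s\t %.8f\n", $1, $2, ($3-y); y=$3 }' ~jsyu/atom.3 | grep -B1 -e '-'
    awk '{if (NR>1) printf "%.8f\n", ($2-y)/($1-x); x=$1; y=$2 }' N2O5.txt
    awk '{if (NR>1) print "("$2"-y)/("$1"-x)"; x=$1; y=$2 }' N2O5.txt | bc -l
    ↑↑↑ This last form does NOT work: x and y sit inside the quoted strings, so bc receives the literal names x and y (which it treats as zero-valued variables) instead of the previous row's values. awk variables are not visible outside awk; close the quotes around them so that their values are printed:
    awk '{if (NR>1) print "("$2"-"y")/("$1"-"x")"; x=$1; y=$2 }' N2O5.txt | bc -l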


  6. Least-squares line fitting using awk, wrapped in a bash script
    #!/bin/bash
    # Fit y = slope*x + inter to two-column data read from stdin.
    exec awk '
    BEGIN { FS = "[ ,\t]+" }

    # Strip leading and trailing blanks; otherwise a line such as " 0  0.0165"
    # splits into an empty first field plus two data fields and is skipped.
    { gsub(/^[ \t]+|[ \t]+$/, "") }

    NF == 2 { num += 1
              x[num] = $1          # index by num, not NR, so skipped lines leave no gaps
              y[num] = $2
              x_sum += $1
              y_sum += $2
              xy_sum += $1*$2
              x2_sum += $1*$1
            }

    END { mean_x = x_sum / num
          mean_y = y_sum / num
          mean_xy = xy_sum / num
          mean_x2 = x2_sum / num
          slope = (mean_xy - (mean_x*mean_y)) / (mean_x2 - (mean_x*mean_x))
          inter = mean_y - slope * mean_x
          for (i = num; i > 0; i--) {
              ss_total += (y[i] - mean_y)^2
              ss_residual += (y[i] - (slope * x[i] + inter))^2
          }
          r2 = 1 - (ss_residual / ss_total)
          printf("Slope      :  %g\n", slope)
          printf("Intercept  :  %g\n", inter)
          printf("R-Squared  :  %g\n", r2)
        }'

    Save the script as linereg_awk.bash, make it executable with chmod +x linereg_awk.bash, and run it:
    ./linereg_awk.bash < N2O5.txt
    Slope      :  -0.000220714
    Intercept  :  0.0148214
    R-Squared  :  0.947627
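    For reference, the quantities accumulated in the script correspond to the closed-form least-squares expressions below, where an overbar denotes the mean over the num data points. The symbols $m$ and $b$ are notation introduced here for the script variables slope and inter:

```latex
\[
m = \frac{\overline{xy} - \bar{x}\,\bar{y}}{\overline{x^{2}} - \bar{x}^{2}},
\qquad
b = \bar{y} - m\,\bar{x},
\qquad
R^{2} = 1 - \frac{\sum_{i}\bigl(y_i - (m x_i + b)\bigr)^{2}}{\sum_{i}\bigl(y_i - \bar{y}\bigr)^{2}}
\]
```

    For a least-squares line with an intercept, $R^2$ lies between 0 and 1 on the fitted data, so a value outside that range signals a bookkeeping problem such as a skipped or misindexed data point.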


  7. Transpose a matrix with awk:
    Save the following as transpose.bash:
    #!/bin/bash
    # Transpose the row and column for a matrix.
    # Written by Geoff Clare
    # Published in "sed & awk" O'Reilly, page 432, ISBN:9781565922259 (Mandarin Chinese version).
    
    exec awk '
    NR == 1 {
       n = NF
       for ( i=1 ; i <= NF ; i++ )
           row[i] = $i
       next
    }
    {
       if ( NF > n )
          n = NF
       for ( i=1 ; i <= NF ; i++ )
          row[i] = row[i] " " $i
    }
    END {
        for ( i=1 ; i <= n ; i++ )
          print row[i]
        }' ${1+"$@"}
    

    For an ASCII file matrix.txt containing an 8×8 matrix:
    101 102 103 104 105 106 107 108
    109 110 111 112 113 114 115 116
    117 118 119 120 121 122 123 124
    125 126 127 128 129 130 131 132
    133 134 135 136 137 138 139 140
    141 142 143 144 145 146 147 148
    149 150 151 152 153 154 155 156
    157 158 159 160 161 162 163 164

    The output of ./transpose.bash < matrix.txt gives:
    101 109 117 125 133 141 149 157
    102 110 118 126 134 142 150 158
    103 111 119 127 135 143 151 159
    104 112 120 128 136 144 152 160
    105 113 121 129 137 145 153 161
    106 114 122 130 138 146 154 162
    107 115 123 131 139 147 155 163
    108 116 124 132 140 148 156 164
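    The script also handles non-square input. As a variant sketch (not the published script), the cells can be stored in an awk array indexed by (row, column), which makes it easy to pad ragged rows. The function name transpose and the "-" placeholder for missing cells are assumptions made for illustration:

```shell
# Hypothetical helper: transpose a whitespace-separated matrix read from stdin.
transpose() {
    awk '{
             if (NF > ncol) ncol = NF              # track the widest row
             for (c = 1; c <= NF; c++) cell[NR, c] = $c
         }
         END {
             for (c = 1; c <= ncol; c++) {
                 line = ""
                 for (r = 1; r <= NR; r++) {
                     v = ((r, c) in cell) ? cell[r, c] : "-"   # pad missing cells
                     line = line (r > 1 ? " " : "") v
                 }
                 print line
             }
         }'
}
```

    Usage: transpose < matrix.txt. Unlike the string-appending approach, this keeps every output row the same length when input rows have different numbers of fields.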