symbols, atomic mass and atomic numbers, respectively.
Data source: http://www.csudh.edu/oliver/chemdata/atmass.htm
Sample loop:
for i in $(cut -d : -f 1 /etc/passwd)
do echo "$i"
finger "$i"
done
head and tail:
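The two commands are often chained to pick out a single line: head -n N keeps the first N lines, and tail -n 1 then keeps only the last of those. A minimal sketch, assuming a throwaway three-line input rather than a real data file:

```shell
# Build a small three-line sample (illustrative data only).
sample=$(printf 'first\nsecond\nthird\n')

# head -n 2 keeps lines 1-2; tail -n 1 then keeps only the last of them.
second=$(printf '%s\n' "$sample" | head -n 2 | tail -n 1)
echo "$second"

# tail -n 1 on its own returns the final line of the input.
last=$(printf '%s\n' "$sample" | tail -n 1)
echo "$last"
```

The same head/tail pairing is what selects the n-th FORMUL record in the PDB commands later in these notes.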
Array in bash
Assign the content of "Lin Dong Tsai Hsieh Yu" to an array of five elements, then print the array contents. Array indices in bash start at 0.
LASTNAME="Lin Dong Tsai Hsieh Yu"       # string input
arr=($LASTNAME)                         # assign the string to the array named arr (split on whitespace)
echo ${arr[0]} ${arr[3]}                # print the values of arr[0] and arr[3]
echo ${arr[*]}                          # print all the values in array arr
echo ${!arr[*]}                         # print the indices of array arr
echo ${#arr[*]}                         # print the number of elements in array arr
echo ${#arr[@]}                         # also prints the number of elements in array arr
b=`seq 1 9`                             # b is a string
brr=($b)                                # from string variable $b to array ${brr[*]}
brr=(`seq 1 9`)                         # brr is an array
arr=(`echo "Lin Dong Tsai Hsieh Yu"`)   # assign array via stdout
The "@" sign can be used instead of "*" in constructs such as ${arr[*]}. The result is the same except when the expansion appears inside double quotes, where the behavior mirrors that of "$*" and "$@": "${arr[*]}" returns all the items as a single word, whereas "${arr[@]}" returns each item as a separate word. For further explanation, see "Bash Arrays", Linux Journal, Jun 19 2008, by Mitch Frazier.
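The difference between the two quoted forms is easiest to see by counting the words each expansion produces. The sketch below assumes bash (arrays are not available in plain POSIX sh); count_words is a hypothetical helper introduced only for this illustration, and the array deliberately contains one element with a space in it.

```shell
# Requires bash. The third element contains a space on purpose.
arr=(Lin Dong "Tsai Hsieh" Yu)

# count_words is a hypothetical helper: it reports how many arguments it received.
count_words() { echo "$#"; }

star=$(count_words "${arr[*]}")     # quoted *: all items joined into one word
at=$(count_words "${arr[@]}")       # quoted @: one word per array element
unquoted=$(count_words ${arr[@]})   # unquoted: "Tsai Hsieh" is split again on the space
echo "$star $at $unquoted"
```

In loops over array items, the quoted "${arr[@]}" form is usually the one that keeps elements with embedded spaces intact.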
Arithmetic operations in bash
Arithmetic operations in bash are performed inside $(( ... )) and apply only to integers!
echo $(( 7*8 ))                 # 56
a="13"
b="5"
c="8.9"
echo $(($a+$b))                 # 18
echo $(($a*$b))                 # 65
echo $(($a/$b))                 # 2 (integer division)
echo $(($a* $(($b-$c))))        # bash: 5-8.9: syntax error in expression (error token is ".9")
echo "$a*($b-$c)" | bc -l       # -50.7
echo $(($a%$b))                 # 3
echo ${c/./,}                   # 8,9
echo -n $c                      # prints 8.9 with no trailing newline
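Because $(( ... )) handles integers only, any expression involving 8.9 has to be handed to an external tool. The notes use bc -l for this; the sketch below shows the same calculation with awk as an alternative, in case bc is not installed. fcalc is a hypothetical helper name, and the one-decimal output format is an arbitrary choice for this example.

```shell
a=13
b=5
c=8.9

# Integer arithmetic stays inside $(( )); division truncates.
quotient=$(( a / b ))
remainder=$(( a % b ))

# fcalc is a hypothetical helper: it passes an expression to awk,
# which evaluates it in floating point and prints one decimal place.
fcalc() { awk "BEGIN { printf \"%.1f\n\", $1 }"; }

product=$(fcalc "$a*($b-$c)")
echo "$quotient $remainder $product"
```

The helper interpolates its argument straight into the awk program, so it should only be given expressions you wrote yourself.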
End of Week 07-08
Delete blank lines (empty or containing only spaces and tabs) from a file:
sed '/^[[:space:]]*$/d' file
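A quick way to check a blank-line filter is to feed it a small input containing both an empty line and a whitespace-only line. The sketch below uses the POSIX character class [[:space:]], which matches spaces and tabs in any POSIX sed; the two-word input is made up for the test.

```shell
# Illustrative input: two text lines separated by an empty line
# and a line holding only a space, a tab and a space.
cleaned=$(printf 'alpha\n\n \t \nbeta\n' | sed '/^[[:space:]]*$/d')
echo "$cleaned"
```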
Exercise 1: Write a shell script to convert the 1-letter code to the 3-letter code for amino acids. Example website. You may find tables in the Wiki or use the file. Also calculate the molecular weight in units of kDa and g/mol.
Exercise 2: Write a shell script to convert the 3-letter code to the 1-letter code for amino acids.
Exercise 3: Calculate the GC contents for an input sequence of DNA.
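As a starting point for Exercise 3, one possible approach is to count the G and C characters with tr and divide by the sequence length. This is only a sketch under stated assumptions: gc_content is a hypothetical function name, the sequence is passed as a single argument, lowercase letters are folded to uppercase, and ambiguity codes such as N are counted in the total length but not as G or C.

```shell
# gc_content is a hypothetical helper: prints the GC percentage of a DNA sequence.
gc_content() {
    # Strip whitespace and fold lowercase bases to uppercase.
    seq=$(printf '%s' "$1" | tr -d '[:space:]' | tr 'acgt' 'ACGT')
    total=${#seq}
    if [ "$total" -eq 0 ]; then
        echo "empty sequence" >&2
        return 1
    fi
    # Delete every character that is not G or C, then count what is left.
    gc_only=$(printf '%s' "$seq" | tr -cd 'GC')
    gc=${#gc_only}
    # The shell has no floating point, so the division is done in awk.
    awk -v gc="$gc" -v total="$total" 'BEGIN { printf "%.1f\n", 100 * gc / total }'
}

result=$(gc_content "ATGCGCTA")   # 4 of the 8 bases are G or C
echo "$result"
```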
Internet references:
Convert PDB file to other formats:
Protein Data Bank
PDB file format
Crystal structure of chitosanase from Bacillus circulans MH-K1 at 1.6 Å resolution and its substrate recognition mechanism.
Original PDB file 1QGI.pdb before conversion.
After conversion: 1QGI.gjf in Gaussian format.
and: 1QGI_3.gjf in Gaussian ONIOM 3-layer format.
grep "^ATOM " 1QGI.pdb | awk -F " " '{ print $NF$2, $7, $8, $9}'
grep "^ATOM " 1QGI.pdb | awk -F " " '{ printf "%s, %-2.4f, %-2.4f, %-2.4f \n", $NF$2, $7, $8, $9}'
grep "^ATOM " 1QGI.pdb | awk -F " " '{ print $NF$2, $7, $8, $9}'|sed 's/ /\(Fragment\=1\)\ /'
grep "^FORMUL " 1QGI.pdb
grep "^FORMUL " 1QGI.pdb | wc
grep "^FORMUL " 1QGI.pdb | wc -l
grep "^FORMUL " 1QGI.pdb | awk -F " " '{print $3}'
grep "^FORMUL " 1QGI.pdb | awk -F " " '{printf $3}'
grep "^FORMUL " 1QGI.pdb | head -1 | awk -F " " '{printf $3}'
grep "^FORMUL " 1QGI.pdb | head -1 | tail -1 | awk -F " " '{printf $3}'
grep "^FORMUL " 1QGI.pdb | head -2 | tail -1 | awk -F " " '{printf $3}'
grep "^HETATM " 1QGI.pdb | grep GCS
grep "^HETATM " 1QGI.pdb | grep `grep "^FORMUL " 1QGI.pdb | head -1 | tail -1 | awk -F " " '{printf $3}'`
grep "^HETATM " 1QGI.pdb | grep `grep "^FORMUL " 1QGI.pdb | head -1 | tail -1 | awk -F " " '{printf $3}'` | awk -F " " '{ print $NF$2, $7, $8, $9}'|sed 's/ /\(Fragment\=2\)\ /'
Set variable from commands:
NUM_HETGRPS=`grep "^FORMUL " 1QGI.pdb | wc -l`
Loop in bash
NUM_HETGRPS=`grep "^FORMUL " 1QGI.pdb | wc -l`
HETGRPS=1
while (("$HETGRPS" <= $NUM_HETGRPS))
do grep "^HETATM " 1QGI.pdb | grep `grep "^FORMUL " 1QGI.pdb | head -$HETGRPS | tail -1 | awk -F " " '{printf $3}'` | awk -F " " '{ print $NF$2"(Fragment="2")", $7, $8, $9}'
HETGRPS=$((HETGRPS+1))
done
NUM_HETGRPS=`grep "^FORMUL " 1QGI.pdb | wc -l`
HETGRPS=1
while (("$HETGRPS" <= $NUM_HETGRPS))
do grep "^HETATM " 1QGI.pdb | grep `grep "^FORMUL " 1QGI.pdb | head -$HETGRPS | tail -1 | awk -F " " '{printf $3}'` | awk -F " " '{ print $NF$2"(Fragment="HETGRPS")", $7, $8, $9}' HETGRPS=$HETGRPS
HETGRPS=$((HETGRPS+1))
done
NUM_HETGRPS=`grep "^FORMUL " 1QGI.pdb | wc -l`
HETGRPS=1
while (("$HETGRPS" <= $NUM_HETGRPS))
do grep "^HETATM " 1QGI.pdb | grep `grep "^FORMUL " 1QGI.pdb | head -$HETGRPS | tail -1 | awk -F " " '{printf $3}'` | awk -F " " '{ print $NF$2"(Fragment="HETGRPS")", $7, $8, $9}' HETGRPS=$((HETGRPS+1))
HETGRPS=$((HETGRPS+1))
done
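The loops above rerun grep on the FORMUL records on every pass and match the residue name anywhere in the HETATM line. A possible tidier variant, offered only as a sketch, reads the het group names once and lets awk match them against field 4 (the residue name), passing the fragment number with -v. The miniature PDB file below is made up for illustration and is not taken from 1QGI.pdb; the field positions assume whitespace-separated columns, as in the commands above.

```shell
# Made-up miniature PDB file, for illustration only (not taken from 1QGI.pdb).
pdb=$(mktemp)
cat > "$pdb" <<'EOF'
FORMUL   2  GCS    C6 H13 N O5
FORMUL   3  HOH   *2(H2 O)
HETATM    1  C1  GCS B   1      10.000  20.000  30.000  1.00 15.00           C
HETATM    2  O   HOH A 301       1.000   2.000   3.000  1.00 20.00           O
EOF

result=$(
    n=1                                   # fragment 1 is reserved for the ATOM records
    for het in $(awk '$1 == "FORMUL" { print $3 }' "$pdb"); do
        n=$((n + 1))                      # the first het group becomes fragment 2
        awk -v id="$het" -v frag="$n" \
            '$1 == "HETATM" && $4 == id { print $NF $2 "(Fragment=" frag ")", $7, $8, $9 }' "$pdb"
    done
)
echo "$result"
rm -f "$pdb"
```

Matching on field 4 avoids accidental hits when the het group name also appears elsewhere in a line, at the cost of relying on the columns being cleanly separated by whitespace.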
Gaussian header: .gjf
# PM6
(blank line)
Title
(blank line)
0,1
< xyz format follows >
(blank line)
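The layout above can be written out with a here-document, which preserves the required blank lines exactly. This is a minimal sketch, not a complete converter: the title text and the three coordinate lines (an approximate water geometry) are placeholders chosen for illustration, and in practice the coordinate block would come from the awk pipelines shown earlier.

```shell
gjf=$(mktemp)
title="Title"                    # placeholder title line

# Route section, blank line, title, blank line, charge and multiplicity,
# coordinates, and a final blank line to terminate the molecule specification.
cat > "$gjf" <<EOF
# PM6

$title

0,1
O 0.000 0.000 0.117
H 0.000 0.757 -0.469
H 0.000 -0.757 -0.469

EOF

route=$(sed -n '1p' "$gjf")               # first line: the route section
charge=$(sed -n '5p' "$gjf")              # fifth line: charge and multiplicity
nlines=$(wc -l < "$gjf" | tr -d ' ')      # total number of lines written
echo "$route | $charge | $nlines"
rm -f "$gjf"
```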
Gaussian header for ONIOM layer format: .gjf
# ONIOM(MP2/6-31G:HF/6-31G:PM6)
(blank line)
Title
(blank line)
0,1
< xyz format follows with an extra L, M or H flag >
(blank line)