Advanced Bash-Scripting Guide: An in-depth exploration of the art of shell scripting
Chapter 9. Variables Revisited
Bash supports a surprising number of string manipulation operations. Unfortunately, these tools lack a unified focus. Some are a subset of parameter substitution, and others fall under the functionality of the UNIX expr command. This results in inconsistent command syntax and overlap of functionality, not to mention confusion.
String Length
stringZ=abcABC123ABCabc

echo ${#stringZ}                 # 15
echo `expr length $stringZ`      # 15
echo `expr "$stringZ" : '.*'`    # 15
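The expr length form is not specified by POSIX, so it may be missing on some systems, and an unquoted variable breaks as soon as the value contains whitespace. A minimal sketch of the two more portable forms follows. The variable names and the sample value are illustrative and do not come from the guide.

```shell
#!/bin/bash
# Illustrative value: a string containing a space.
phrase="hello world"

len_param=${#phrase}                 # Parameter expansion; no external process.
len_expr=$(expr "$phrase" : '.*')    #  POSIX expr form; the quotes keep the
                                     #+ string together as a single argument.

echo "$len_param"    # 11
echo "$len_expr"     # 11
```

Where only the length is needed, ${#phrase} is usually the simpler choice, since it avoids starting a separate process.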
Length of Matching Substring at Beginning of String
In both the expr match "$string" '$substring' and expr "$string" : '$substring' forms, $substring is a regular expression.
stringZ=abcABC123ABCabc
#       |------|

echo `expr match "$stringZ" 'abc[A-Z]*.2'`   # 8
echo `expr "$stringZ" : 'abc[A-Z]*.2'`       # 8
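Recent versions of Bash can obtain the same number without calling expr, by way of the =~ operator and the BASH_REMATCH array. The following is a sketch rather than a drop-in replacement: expr uses basic regular expressions and anchors the pattern at the start of the string implicitly, whereas =~ uses extended regular expressions and needs an explicit '^'. The variable names re and match_len are illustrative.

```shell
#!/bin/bash
stringZ=abcABC123ABCabc

#  Keeping the pattern in a variable sidesteps quoting differences
#+ between Bash versions.
re='^abc[A-Z]*.2'

if [[ $stringZ =~ $re ]]
then
  match_len=${#BASH_REMATCH[0]}   # Length of the text matched by the whole pattern.
else
  match_len=0                     # expr also reports 0 when nothing matches.
fi

echo "$match_len"   # 8
```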
Index
expr index $string $substring returns the numerical (1-based) position in $string of the first character that matches any character in $substring.
stringZ=abcABC123ABCabc

echo `expr index "$stringZ" C12`    # 6
                                    # C position.

echo `expr index "$stringZ" 1c`     # 3
# 'c' (in #3 position) matches before '1'.
This is the near equivalent of strchr() in C.
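The same position can be computed inside the shell with parameter expansion alone. The sketch below assumes that the character set contains no characters that are special inside a bracket expression (such as ']', '^' or '-'). The function name str_index is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the 1-based position in $1 of the first
# character that belongs to the set $2, or 0 if there is none.
str_index () {
  local string=$1 chars=$2
  #  Remove the longest trailing part that begins with a character
  #+ from the set; what remains is everything before the first hit.
  local prefix="${string%%[$chars]*}"

  if [ "$prefix" = "$string" ]
  then
    echo 0                        # Nothing was removed: no match.
  else
    echo $(( ${#prefix} + 1 ))
  fi
}

stringZ=abcABC123ABCabc
str_index "$stringZ" C12    # 6
str_index "$stringZ" 1c     # 3
```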
Substring Extraction
${string:position} extracts the substring of $string that starts at $position.
If the string parameter is "*" or "@", then this extracts the positional parameters, [1] starting at position.
${string:position:length} extracts $length characters of substring from $string, starting at $position.
stringZ=abcABC123ABCabc
#       0123456789.....
#       0-based indexing.

echo ${stringZ:0}                            # abcABC123ABCabc
echo ${stringZ:1}                            # bcABC123ABCabc
echo ${stringZ:7}                            # 23ABCabc

echo ${stringZ:7:3}                          # 23A
                                             # Three characters of substring.
If the string parameter is "*" or "@", then this extracts a maximum of length positional parameters, starting at position.
echo ${*:2}          # Echoes second and following positional parameters.
echo ${@:2}          # Same as above.

echo ${*:2:3}        # Echoes three positional parameters, starting at second.
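When the parameter is an ordinary string, the position may also be negative, in which case it counts back from the end of the string. The minus sign must be separated from the colon by a space or enclosed in parentheses. A short sketch, with illustrative variable names:

```shell
#!/bin/bash
stringZ=abcABC123ABCabc

tail4=${stringZ: -4}           # Cabc  (note the space before the minus sign)
tail4_paren=${stringZ:(-4)}    # Cabc  (parentheses work as well)
two=${stringZ: -4:2}           # Ca
whole=${stringZ:-4}            # abcABC123ABCabc
#  Without the space, ':-' is the "use default value" operator,
#+ and since stringZ is set, the expansion is simply its value.

echo "$tail4" "$tail4_paren" "$two" "$whole"
```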
expr substr $string $position $length extracts $length characters from $string starting at $position.
stringZ=abcABC123ABCabc
#       123456789......
#       1-based indexing.

echo `expr substr $stringZ 1 2`              # ab
echo `expr substr $stringZ 4 3`              # ABC
Both expr match "$string" '\($substring\)' and expr "$string" : '\($substring\)' extract $substring at the beginning of $string, where $substring is a regular expression.
stringZ=abcABC123ABCabc

echo `expr match "$stringZ" '\(.[b-c]*[A-Z]..[0-9]\)'`   # abcABC1
echo `expr "$stringZ" : '\(.[b-c]*[A-Z]..[0-9]\)'`       # abcABC1
# Both of the above forms are equivalent.
Substring Removal
${string#substring} strips the shortest match of $substring from the front of $string.
${string##substring} strips the longest match of $substring from the front of $string.
stringZ=abcABC123ABCabc
#       |----|
#       |----------|

echo ${stringZ#a*C}      # 123ABCabc
# Strip out shortest match between 'a' and 'C'.

echo ${stringZ##a*C}     # abc
# Strip out longest match between 'a' and 'C'.
${string%substring} strips the shortest match of $substring from the back of $string.
${string%%substring} strips the longest match of $substring from the back of $string.
stringZ=abcABC123ABCabc
#                    ||
#        |------------|

echo ${stringZ%b*c}      # abcABC123ABCa
# Strip out shortest match between 'b' and 'c', from back of $stringZ.

echo ${stringZ%%b*c}     # a
# Strip out longest match between 'b' and 'c', from back of $stringZ.
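A common use of these four operators is taking a pathname apart without calling basename or dirname. The sketch below assumes a pathname that contains at least one '/' and a filename that contains at least one '.'. The path itself is made up for illustration.

```shell
#!/bin/bash
path=/home/user/archive.tar.gz   # Hypothetical pathname.

file=${path##*/}    # archive.tar.gz   Longest  '*/' removed from the front.
dir=${path%/*}      # /home/user       Shortest '/*' removed from the back.
ext=${file##*.}     # gz               Longest  '*.' removed from the front.
stem=${file%%.*}    # archive          Longest  '.*' removed from the back.
base=${file%.*}     # archive.tar      Shortest '.*' removed from the back.

echo "$dir" "$file" "$stem" "$base" "$ext"
```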
Example 9-10. Converting graphic file formats, with filename change
#!/bin/bash
# cvt.sh:
# Converts all the MacPaint image files in a directory to "pbm" format.
# Uses the "macptopbm" binary from the "netpbm" package,
#+ which is maintained by Brian Henderson.
# Netpbm is a standard part of most Linux distros.
OPERATION=macptopbm
SUFFIX=pbm          # New filename suffix.

if [ -n "$1" ]
then
  directory=$1      # If directory name given as a script argument...
else
  directory=$PWD    # Otherwise use current working directory.
fi

#  Assumes all files in the target directory are MacPaint image files,
#+ with a ".mac" suffix.

for file in "$directory"/*    # Filename globbing.
do
  filename=${file%.*c}        #  Strip ".mac" suffix off filename
                              #+ ('.*c' matches everything
                              #+ between '.' and 'c', inclusive).
  $OPERATION "$file" > "$filename.$SUFFIX"
                              # Redirect conversion to new filename.
  rm -f "$file"               # Delete original files after converting.
  echo "$filename.$SUFFIX"    # Log what is happening to stdout.
done                          #  Quoting "$file" and "$filename" keeps names
                              #+ that contain spaces in one piece.

exit 0
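Note that the pattern '.*c' strips any final suffix ending in 'c', not only ".mac". A stricter variant can be sketched as follows, assuming that only files ending in ".mac" are to be renamed. The helper name target_name is hypothetical, and no conversion program is run here.

```shell
#!/bin/bash
SUFFIX=pbm

# Hypothetical helper: print the output filename for one input file.
target_name () {
  local file=$1
  echo "${file%.mac}.$SUFFIX"   # Strip a literal ".mac" only.
}

loose=notes.doc
stripped=${loose%.*c}           # notes   (".doc" also matches '.*c')

target_name picture.mac         # picture.pbm
target_name /tmp/a.b.mac        # /tmp/a.b.pbm
echo "$stripped"                # notes
```

In a loop like the one in the example, globbing on "$directory"/*.mac instead of "$directory"/* would keep unrelated files from being converted and deleted.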
Substring Replacement
${string/substring/replacement} replaces the first match of $substring with $replacement.
${string//substring/replacement} replaces all matches of $substring with $replacement.
stringZ=abcABC123ABCabc

echo ${stringZ/abc/xyz}           # xyzABC123ABCabc
                                  # Replaces first match of 'abc' with 'xyz'.

echo ${stringZ//abc/xyz}          # xyzABC123ABCxyz
                                  # Replaces all matches of 'abc' with 'xyz'.
${string/#substring/replacement}: if $substring matches the front end of $string, substitute $replacement for $substring.
${string/%substring/replacement}: if $substring matches the back end of $string, substitute $replacement for $substring.
stringZ=abcABC123ABCabc

echo ${stringZ/#abc/XYZ}          # XYZABC123ABCabc
                                  # Replaces front-end match of 'abc' with 'XYZ'.

echo ${stringZ/%abc/XYZ}          # abcABC123ABCXYZ
                                  # Replaces back-end match of 'abc' with 'XYZ'.
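In these replacement forms the substring is a shell pattern (as in filename globbing), not a regular expression, and leaving out the replacement deletes the match. A few further cases, sketched with illustrative variable names and a made-up filename:

```shell
#!/bin/bash
stringZ=abcABC123ABCabc
name="my summer photo.jpg"          # Hypothetical filename.

digits_masked=${stringZ//[0-9]/N}   # abcABCNNNABCabc   Bracket pattern.
abc_deleted=${stringZ//abc}         # ABC123ABC         No replacement: delete.
tail_deleted=${stringZ/%abc}        # abcABC123ABC      Delete back-end match only.
underscored=${name// /_}            # my_summer_photo.jpg

echo "$digits_masked" "$abc_deleted" "$tail_deleted" "$underscored"
```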
A Bash script may invoke the string manipulation facilities of awk as an alternative to using its built-in operations.
Example 9-11. Alternate ways of extracting substrings
#!/bin/bash
# substring-extraction.sh

String=23skidoo1
#      012345678    Bash
#      123456789    awk
# Note different string indexing system:
# Bash numbers first character of string as '0'.
# Awk  numbers first character of string as '1'.

echo ${String:2:4} # position 3 (0-1-2), 4 characters long
                                         # skid

# The awk equivalent of ${string:pos:length} is substr(string,pos,length).
echo | awk '
{ print substr("'"${String}"'",3,4)      # skid
}
'
#  Piping an empty "echo" to awk gives it dummy input,
#+ and thus makes it unnecessary to supply a filename.

exit 0
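Splicing the shell variable into the awk program text, as above, works for simple values but can break if the value contains a double quote. One alternative is to hand the value to awk with the -v option and do the work in a BEGIN block, which also removes the need for dummy input. This is a sketch of that approach, not the guide's own method, and the variable names s and result are illustrative. Note that awk interprets backslash escape sequences in values passed with -v.

```shell
#!/bin/bash
String=23skidoo1

#  -v assigns the shell value to the awk variable 's' before the
#+ program starts; a BEGIN block runs without reading any input.
result=$(awk -v s="$String" 'BEGIN { print substr(s, 3, 4) }')

echo "$result"    # skid
```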
For more on string manipulation in scripts, refer to Section 9.3 and the relevant section of the expr command listing. For script examples, see:
[1] This applies to either command line arguments or parameters passed to a function.