Re: The shell and its crappy handling of whitespace
Tuesday, August 1st, 2023
I saw an interesting article on The Universe of Discourse about The shell and its crappy handling of whitespace.
I’m about thirty-five years into Unix shell programming now, and I continue to despise it. The shell’s treatment of whitespace is a constant problem. The fact that
for i in *.jpg; do
  cp $i /tmp
done
doesn’t work is a constant pain. The problem here is that if one of the filenames is bite me.jpg then the cp command will turn into cp bite me.jpg /tmp and fail, saying
cp: cannot stat 'bite': No such file or directory
cp: cannot stat 'me.jpg': No such file or directory
or worse there is a file named bite that is copied even though you did not want to copy it, maybe overwriting /tmp/bite that you wanted to keep.
To make it work properly you have to say
for i in *; do
  cp "$i" /tmp
done
with the quotes around the $i.
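The splitting described in that quote is easy to see without copying anything. The sketch below assumes the default IFS (space, tab, newline); count_args is a hypothetical helper introduced only for this illustration.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

name='bite me.jpg'

# Unquoted: the expansion is split on the space, so two arguments arrive.
unquoted=$(count_args $name)

# Quoted: the whole filename arrives as a single argument.
quoted=$(count_args "$name")

echo "unquoted: $unquoted argument(s)"
echo "quoted: $quoted argument(s)"
```

This is the same thing that happens to cp in the loop: it is handed two names instead of one.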
The article then goes on:
Suppose I want to change the names of all the .jpeg files to the corresponding names with .jpg instead. I can do it like this:
for i in *.jpeg; do
  mv $i $(suf $i).jpg
done
Ha ha, no, some of the files might have spaces in their names. […]
before finally settling on the quote-hell version:
for i in *.jpeg; do
  mv "$i" "$(suf "$i")".jpg    # three sets of quotes
done
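The quoted snippets do not show what suf is. Assuming it simply prints its argument without the final suffix, a hypothetical stand-in could be written with POSIX parameter expansion, which also shows where each pair of quotes ends up:

```shell
# Hypothetical stand-in for suf (an assumption, not the article's own tool):
# print the argument with its last ".something" removed.
suf() { printf '%s\n' "${1%.*}"; }

i='baz quux.jpeg'

# Same quoting pattern as the loop above, applied to a single name.
new="$(suf "$i")".jpg
echo "$new"
```

With a helper like this the expansion ${i%.jpeg} could replace the command substitution altogether, but the quoting question stays the same.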
This sparked some interesting discussions on Lobste.rs and Hacker News, where several people pointed out that other shells handle this properly, implying that there is no proper solution for this in standard shells such as bash.
A proper solution
However, this problem has long been solved and is in fact part of the POSIX standard. That solution is called the IFS, or the Internal Field Separator:
The shell shall treat each character of the IFS as a delimiter and use the delimiters as field terminators to split the results of parameter expansion, command substitution, and arithmetic expansion into fields.
I’m quite surprised that no one in the Hacker News or Lobste.rs discussions mentioned it. You can simply set IFS to an empty value, after which the shell performs no field splitting at all, and things behave much more sanely. For example, to achieve the post’s example:
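As a small illustration of that effect, the sketch below counts the arguments produced by an unquoted expansion, first with the default IFS and then with an empty one. count_args is a hypothetical helper, and the example assumes the shell starts with the default IFS.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

name='baz quux.jpg'

# Default IFS: the unquoted expansion is split on the space.
default_count=$(count_args $name)

# Empty IFS, set only inside the command substitution's subshell:
# no field splitting happens, so the name stays in one piece.
empty_count=$(IFS=''; count_args $name)

echo "default IFS: $default_count argument(s)"
echo "empty IFS: $empty_count argument(s)"
```

Note that an empty IFS only switches off field splitting. Pathname expansion is a separate step, so an unquoted value containing * or ? can still be expanded as a pattern.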
Suppose I want to change the names of all the .jpeg files to the corresponding names with .jpg instead.
You can simply do something like this (the test files here go the other way, from .jpg to .jpeg, but the principle is the same):
# Create some test files
echo "foo" > "foo.jpg"
echo "bar" > "bar.jpg"
echo "baz quux" > "baz cuux.jpg"
# Store old IFS and set it to empty value
OLD_IFS="$IFS"; IFS=""
for I in *.jpg; do
  # No need to quote anything at all!
  mv $I $(basename $I .jpg).jpeg
done
# Reset IFS to previous value
IFS="$OLD_IFS"
ls -l
Which results in:
-rw-r--r-- 1 Gebruiker Geen 0 Aug 1 10:07 bar.jpeg
-rw-r--r-- 1 Gebruiker Geen 0 Aug 1 10:07 'baz cuux.jpeg'
-rw-r--r-- 1 Gebruiker Geen 0 Aug 1 10:07 foo.jpeg
Sure, it feels a little bit uncomfortable, but it’s a perfectly fine solution nonetheless.
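One possible variation, sketched here as an assumption rather than as part of the approach above, is to change IFS inside a subshell so that nothing needs to be saved and restored. That also sidesteps a small wrinkle: if IFS was unset beforehand, restoring it from a saved copy leaves it set to the empty string instead of unset. The example works in a scratch directory created with mktemp -d, which is widely available but not specified by POSIX.

```shell
# Work in a scratch directory so the sketch does not touch real files.
dir=$(mktemp -d)

(
  cd "$dir" || exit 1

  # Create two empty test files, one with a space in its name.
  : > 'foo.jpg'
  : > 'baz cuux.jpg'

  # The empty IFS only lives inside these parentheses.
  IFS=''
  for I in *.jpg; do
    mv $I $(basename $I .jpg).jpeg
  done
)

# Back in the parent shell, IFS is whatever it was before.
result=$(ls "$dir")
echo "$result"

rm -r "$dir"
```

The trade-off is that variables assigned inside the parentheses are not visible afterwards, so this fits best when the loop only has side effects on files.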
