Re: The shell and its crappy handling of whitespace
Tuesday, August 1st, 2023
I’m about thirty-five years into Unix shell programming now, and I continue to despise it. The shell’s treatment of whitespace is a constant problem. The fact that

    for i in *.jpg; do
      cp $i /tmp
    done

doesn’t work is a constant pain. The problem here is that if one of the filenames is bite me.jpg then the cp command will turn into

    cp bite me.jpg /tmp

and fail, saying

    cp: cannot stat 'bite': No such file or directory
    cp: cannot stat 'me.jpg': No such file or directory

or worse there is a file named bite that is copied even though you did not want to copy it, maybe overwriting /tmp/bite that you wanted to keep.
To make it work properly you have to say

    for i in *; do
      cp "$i" /tmp
    done

with the quotes around the $i.
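
To make the difference concrete, here is a minimal sketch that counts how many arguments a command receives with and without the quotes. The helper name count_args and the sample filename are assumptions made up for this illustration, and the sketch assumes a POSIX-style shell such as sh or bash with the default IFS.

```shell
# Hypothetical helper: print the number of arguments it was called with.
count_args() {
  echo "$#"
}

i='bite me.jpg'

# Unquoted: the expansion is split on the space into "bite" and "me.jpg".
unquoted=$(count_args $i)

# Quoted: the expansion is passed through as one argument.
quoted=$(count_args "$i")

echo "unquoted: $unquoted, quoted: $quoted"   # unquoted: 2, quoted: 1
```

This is the same splitting that turns the cp call into two bogus filename arguments.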
The article then goes on:
Suppose I want to change the names of all the .jpeg files to the corresponding names with .jpg instead. I can do it like this:

    for i in *.jpeg; do
      mv $i $(suf $i).jpg
    done

Ha ha, no, some of the files might have spaces in their names. […]
before finally settling on the quote-hell version:

    for i in *.jpeg; do
      mv "$i" "$(suf "$i")".jpg    # three sets of quotes
    done

This sparked some interesting discussions on Lobste.rs and Hacker News, where several people pointed out that other shells handle this properly, implying that there is no proper solution for this in standard shells such as bash.
A proper solution
The shell shall treat each character of the IFS as a delimiter and use the delimiters as field terminators to split the results of parameter expansion, command substitution, and arithmetic expansion into fields.
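
As a rough illustration of the rule quoted above, the following sketch counts the fields produced by an unquoted expansion under three different IFS values. The helper count_args and the sample strings are assumptions invented for this example; each IFS change is made inside a command substitution, so it does not leak into the rest of the script.

```shell
# Hypothetical helper: print the number of arguments it was called with.
count_args() {
  echo "$#"
}

name='baz cuux.jpg'
dirs='/usr/bin:/bin:/opt/my tools'

# Default IFS (space, tab, newline): the space splits the name in two.
default_count=$(count_args $name)

# Empty IFS: no character acts as a delimiter, so nothing is split.
empty_count=$(IFS=''; count_args $name)

# IFS set to a colon: only colons delimit fields, the space is kept.
colon_count=$(IFS=':'; count_args $dirs)

echo "default: $default_count, empty: $empty_count, colon: $colon_count"
# default: 2, empty: 1, colon: 3
```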
I’m quite surprised that no one in the Hacker News or Lobste.rs discussions mentioned IFS. You can simply set IFS to an empty value, and things behave a lot more sanely. For example, to achieve the post’s example:
Suppose I want to change the names of all the .jpeg files to the corresponding names with .jpg instead.
You can simply do something like:

    # Create some test files
    echo "foo" > "foo.jpg"
    echo "bar" > "bar.jpg"
    echo "baz quux" > "baz cuux.jpg"

    # Store old IFS and set it to empty value
    OLD_IFS="$IFS"; IFS=""

    for I in *; do
      # No need to quote anything at all!
      # Strip the .jpg suffix and append .jpeg
      mv $I $(basename $I .jpg).jpeg
    done

    # Reset IFS to previous value
    IFS="$OLD_IFS"

    ls -l

Which results in:

    -rw-r--r-- 1 Gebruiker Geen 0 Aug 1 10:07 bar.jpeg
    -rw-r--r-- 1 Gebruiker Geen 0 Aug 1 10:07 'baz cuux.jpeg'
    -rw-r--r-- 1 Gebruiker Geen 0 Aug 1 10:07 foo.jpeg

Sure, it feels a little bit uncomfortable, but it’s a perfectly fine solution nonetheless.
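
If saving and restoring IFS by hand is the uncomfortable part, one possible variation is to make the change inside a subshell, so it disappears when the subshell exits. The sketch below is only an illustration under assumptions: it works in a scratch directory created with mktemp, the file names are made up, and it renames .jpeg to .jpg as in the quoted example.

```shell
# Work in a scratch directory so the demo does not touch real files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
: > 'foo.jpeg'
: > 'baz cuux.jpeg'

# The parentheses start a subshell; the IFS change is local to it.
(
  IFS=''
  for f in *.jpeg; do
    mv $f $(basename $f .jpeg).jpg
  done
)

# Back in the parent shell, field splitting works as before.
probe='a b'
set -- $probe
parent_fields=$#    # 2, because the default IFS is still in effect

ls
```

Keep in mind that an empty IFS only switches off field splitting. Unquoted expansions still go through pathname expansion, so a filename containing characters such as * or ? could still be expanded unexpectedly.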