Electricmonk

Ferry Boender

Programmer, DevOpper, Open Source enthusiast.


Re: The shell and its crappy handling of whitespace

Tuesday, August 1st, 2023

I saw an interesting article on The Universe of Discourse titled "The shell and its crappy handling of whitespace":

I’m about thirty-five years into Unix shell programming now, and I continue to despise it. The shell’s treatment of whitespace is a constant problem. The fact that

for i in *.jpg; do
  cp $i /tmp
done

doesn’t work is a constant pain. The problem here is that if one of the filenames is bite me.jpg then the cp command will turn into

  cp bite me.jpg /tmp

and fail, saying

  cp: cannot stat 'bite': No such file or directory
  cp: cannot stat 'me.jpg': No such file or directory

or worse there is a file named bite that is copied even though you did not want to copy it, maybe overwriting /tmp/bite that you wanted to keep.

To make it work properly you have to say

for i in *.jpg; do
  cp "$i" /tmp
done

with the quotes around the $i.
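To see the splitting in isolation, here is a minimal sketch, assuming a POSIX shell with the default IFS. The count_args helper is a hypothetical name made up for this illustration; it only reports how many arguments it received.

```shell
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { echo "$#"; }

i="bite me.jpg"

# Unquoted: the value is split at the space into two separate arguments.
unquoted=$(count_args $i)

# Quoted: the value is passed through as one single argument.
quoted=$(count_args "$i")

echo "unquoted: $unquoted"   # unquoted: 2
echo "quoted: $quoted"       # quoted: 1
```

This is exactly why cp sees 'bite' and 'me.jpg' as two different files in the unquoted loop.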

The article then goes on:

Suppose I want to change the names of all the .jpeg files to the corresponding names with .jpg instead. I can do it like this:

for i in *.jpeg; do
  mv $i $(suf $i).jpg
done

Ha ha, no, some of the files might have spaces in their names. […]

before finally settling on the quote-hell version:

for i in *.jpeg; do
  mv "$i" "$(suf "$i")".jpg  # three sets of quotes
done

This sparked some interesting discussions on Lobste.rs and Hacker News. Several people pointed out that other shells handle this properly, implying that there is no proper solution for this in standard shells such as bash.

A proper solution

However, this problem has long been solved, and the solution is in fact part of the POSIX standard. It is the IFS variable, or the Internal Field Separator:

The shell shall treat each character of the IFS as a delimiter and use the delimiters as field terminators to split the results of parameter expansion, command substitution, and arithmetic expansion into fields.
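As a rough illustration of that rule, the sketch below splits the same value under two different IFS settings. It assumes a POSIX shell that starts out with the default IFS (space, tab, newline); count_args and path_like are hypothetical names used only for this example.

```shell
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { echo "$#"; }

path_like="a:b c:d"

# Default IFS: the unquoted value is split on the space.
default_fields=$(count_args $path_like)

# IFS set to a colon: the same value is now split on the colons instead.
OLD_IFS="$IFS"; IFS=":"
colon_fields=$(count_args $path_like)
IFS="$OLD_IFS"

echo "default IFS: $default_fields"   # default IFS: 2
echo "colon IFS: $colon_fields"       # colon IFS: 3
```

With the colon as the only delimiter, the space in "b c" is no longer special, so that field stays in one piece.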

I’m quite surprised that no one in the Hacker News or Lobste.rs discussions mentioned it. You can simply set IFS to an empty value, which turns off field splitting, and things behave a lot more sanely. For example, to achieve the post’s example:

Suppose I want to change the names of all the .jpeg files to the corresponding names with .jpg instead.

You can simply do something like:

# Create some test files
echo "foo" > "foo.jpg"
echo "bar" > "bar.jpg"
echo "baz quux" > "baz cuux.jpg"

# Store old IFS and set it to empty value
OLD_IFS="$IFS"; IFS=""
for I in *.jpg; do
    # No need to quote anything at all!
    # This renames .jpg to .jpeg; swap the two suffixes for the other direction.
    mv $I $(basename $I .jpg).jpeg
done
# Reset IFS to previous value
IFS="$OLD_IFS"

ls -l

Which results in:

-rw-r--r-- 1 Gebruiker Geen   0 Aug  1 10:07  bar.jpeg
-rw-r--r-- 1 Gebruiker Geen   0 Aug  1 10:07 'baz cuux.jpeg'
-rw-r--r-- 1 Gebruiker Geen   0 Aug  1 10:07  foo.jpeg

Sure, it feels a little bit uncomfortable, but it’s a perfectly fine solution nonetheless.
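If saving and restoring IFS by hand is the uncomfortable part, one possible variation is to change it inside a subshell, so the parent shell never sees the new value. This is only a sketch under a few assumptions: mktemp -d is available (it is not part of POSIX, but common on Linux and macOS), and workdir is a hypothetical scratch directory used so that no real files are touched.

```shell
# Hypothetical scratch directory with two test files, one with a space in its name.
workdir=$(mktemp -d)
touch "$workdir/foo.jpg" "$workdir/baz cuux.jpg"

(
    # Everything in these parentheses runs in a subshell, so the cd and the
    # IFS change disappear again when the subshell exits.
    cd "$workdir" || exit 1
    IFS=""
    for I in *.jpg; do
        mv $I $(basename $I .jpg).jpeg
    done
)

# Back in the parent shell: IFS still has its original value.
ls "$workdir"
```

Keep in mind that an empty IFS only switches off field splitting. Unquoted expansions still go through pathname expansion, so filenames containing characters such as * or ? would still want quotes.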
