With a UTF-8 locale, this sed command, which should insert a character at the beginning of each line, instead fails with an error about an illegal byte sequence:
$ echo “hi | LANG=en_US.UTF-8 sed -e s'/^/x/g'
sed: RE error: illegal byte sequence
If you change the ^ to an ordinary character, or drop the g flag, it works fine. I’m guessing the g makes it check the line again, but it gets messed up trying to find the start of the line with the multi-byte character there.
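As a stopgap, here is a minimal sketch of the workaround implied above: since ^ can match at most once per line, the g flag adds nothing to this particular substitution and can simply be dropped. The prefix_x helper name is hypothetical, and I'm only assuming this holds on other macOS versions than the two I tried.

```shell
# Hypothetical helper: prefix every line of stdin with "x".
# The g flag is deliberately omitted; ^ matches at most once per line,
# so the result is the same and the problematic re-scan is avoided.
prefix_x() {
    LANG=en_US.UTF-8 sed -e 's/^/x/'
}

# Same input as the failing command: a line starting with a multi-byte character.
printf '%s\n' '“hi' | prefix_x
```

This only sidesteps the problem for patterns anchored with ^; it doesn't help if you genuinely need g alongside other alternatives in the same expression.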
I’m seeing this on both Ventura on ARM and Monterey on Intel, and haven’t checked further back than that. I know that the macOS sed is BSD-derived, so I tested this on FreeBSD 13.1, and it does not have this bug.
I’ve never filed a bug with Apple before. If I file this as a bug with Feedback Assistant, what on earth do I tag it with in the hopes that the Apple sed maintainer(s) might see it? There’s no ‘sed’ or ‘unix’ or ‘command line tools’ option.