With a UTF-8 locale, this sed command, which should insert a character at the beginning of each line, instead fails with an error about an illegal byte sequence when the line starts with a multi-byte character:
$ echo “hi | LANG=en_US.UTF-8 sed -e s'/^/x/g'
sed: RE error: illegal byte sequence
If you change the ^ to an ordinary character, or drop the g flag, it works fine. I’m guessing the g makes it check the line again, but it gets messed up trying to find the start of the line with the multi-byte character there.
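For anyone who just needs the substitution to work in the meantime, here is a minimal sketch of the workaround described above: since ^ can only match once per line, dropping the g flag should not change the result. The use of printf and the variable name out are only for illustration, and this assumes the same en_US.UTF-8 locale as the failing command.

```shell
# Same substitution as the failing command, but without the g flag.
# ^ matches at most once per line, so g is not needed here.
out=$(printf '%s\n' '“hi' | LANG=en_US.UTF-8 sed -e 's/^/x/')
echo "$out"
```

This only sidesteps the problem for patterns anchored at the start of the line; it says nothing about why the g variant fails.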
I’m seeing this on both Ventura on Arm and Monterey on Intel, and haven’t checked further back than that. I know that Apple’s sed is BSD-derived, so I tested this on FreeBSD 13.1, and it does not have this bug.
I’ve never filed a bug with Apple before. If I file this with Feedback Assistant, what on earth do I tag it with in the hope that the Apple sed maintainer(s) might see it? There’s no ‘sed’ or ‘unix’ or ‘command line tools’ option.