How to strip extended XTerm ANSI Escape sequences with RegEx?


LigH
17th April 2023, 10:32
In general, an ANSI Escape sequence is made of


the ESC character (ASCII 27 = 0x1B)
usually the [ character
optional numeric parameters, separated by ; (semicolons)
usually one alphabetic letter as the final character


Such sequences can be filtered out with a regular expression substitution, e.g. in sed:

s/\x1b\[[0-9;]*[a-zA-Z]//g
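
For anyone who wants to try this quickly, here is a minimal sketch of how the substitution could be wrapped and tested. The function name strip_csi and the sample text are made up for illustration, and the \x1b escape assumes GNU sed.

```shell
# Sketch: strip basic sequences of the form ESC [ params letter.
# strip_csi is a hypothetical helper name; \x1b is a GNU sed extension.
strip_csi() {
  sed 's/\x1b\[[0-9;]*[a-zA-Z]//g'
}

# Made-up sample: "red" wrapped in a colour-on / reset pair.
printf '\033[1;31mred\033[0m plain\n' | strip_csi
```

The sample line should come out as plain text with both colour codes removed.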

But XTerm (in mintty) supports a few more custom extensions that do not start with the common [; some start with ] or ( instead.

I tried several ways to replace the \[ with a list of \[, \] and \(, but I can't get it to work correctly...

If you want to test a solution you would like to recommend, here is a small sample (https://github.com/m-ab-s/media-autobuild_suite/issues/2204#issuecomment-1147911581).
__

PS: Some progress ... I could combine the patterns starting with ESC followed by either [ or ( using

s/\x1b[[(][0-9;]*[a-zA-Z]//g
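
A small sketch of why this works: inside a bracket expression, [ and ( are literal, so [[(] matches either character without any backslashes. The function name and the sample input below are hypothetical, and GNU sed is assumed for \x1b.

```shell
# Sketch: one bracket expression [[(] covers both ESC [ ... and ESC ( ...
# strip_csi_charset is a hypothetical helper name; GNU sed assumed.
strip_csi_charset() {
  sed 's/\x1b[[(][0-9;]*[a-zA-Z]//g'
}

# Made-up sample: ESC ( B followed by ESC [ m, then ordinary text.
printf '\033(B\033[mtext\n' | strip_csi_charset
```

Both sequences are removed in one pass, leaving only the ordinary text.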

But I'm still missing the ESC ]0; pattern as an alternative, which does not end with a letter.
__

PPS: I believe I got it.

The pipe | needs to be escaped (\|) to be recognised as a logical OR inside the sed expression, but only in the live editor; in a shell script it does not need escaping.

s/\x1b[[(][0-9;]*[a-zA-Z]|\x1b\][0-9];//g

Thanks to the GNU sed live editor (https://sed.js.org/) :)
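
A possible explanation for the escaping difference, offered as an assumption rather than something verified against the script in question: GNU sed uses basic regular expressions by default, where alternation is written \|, while sed -E (or -r) switches to extended syntax, where a plain | is the OR. The sketch below shows both spellings; the function names and the sample input are hypothetical.

```shell
# Extended syntax (-E): a plain | is alternation.
strip_ere() {
  sed -E 's/\x1b[[(][0-9;]*[a-zA-Z]|\x1b\][0-9];//g'
}

# Basic syntax (GNU sed default): alternation is spelled \|.
strip_bre() {
  sed 's/\x1b[[(][0-9;]*[a-zA-Z]\|\x1b\][0-9];//g'
}

# Made-up sample: an ESC ]0; prefix followed by coloured text.
printf '\033]0;\033[32mok\033[0m\n' | strip_ere
printf '\033]0;\033[32mok\033[0m\n' | strip_bre
```

Both variants should give the same result on this sample. Note that the second alternative only removes the ESC ] digit ; prefix itself; anything that follows it in the input is left in place.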