Alexander Gromnitsky's Blog

Escape sequences in file names

Air Date: 2025-03-30
Latest update: 2025-03-30 17:58:10

How to annoy folks who use busybox:

$ touch `printf "\033[1;33m\033[44mhello"`
$ ls
''$'\033''[1;33m'$'\033''[44mhello'
$ tar cf 1.tar *hello
$ busybox tar tf 1.tar
hello
$ rpm -q busybox
busybox-1.36.1-8.fc41.x86_64
$

Oopsie-daisy.

A simple ls|cat or ls|less -r produces the same effect.

This won't work with GNU tar & bsdtar, for they both properly escape escape sequences.
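If you are stuck with a tool that prints names verbatim, one defensive option is to filter its output before it reaches the terminal. Below is a minimal sketch, assuming a POSIX sed; sanitize is a made-up helper name, not something busybox or tar provides:

```shell
#!/bin/sh
# sanitize (hypothetical helper): replace each control character,
# ESC included, with '?' so the terminal never gets to interpret it.
sanitize() { LC_ALL=C sed 's/[[:cntrl:]]/?/g'; }

# Stand-in for the output of `busybox tar tf 1.tar`:
printf '\033[1;33m\033[44mhello\n' | sanitize
```

This prints ?[1;33m?[44mhello. The same filter can sit after busybox tar tf 1.tar; the price is that tabs in names turn into ? as well.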

Tags: ойті
Authors: ag

Marc Rochkind on managers vs. programmers

Air Date: 2025-03-27
Latest update: 2025-03-27 21:06:09

This is the guy who wrote SCCS while working at Bell Labs.

Date: Wed, 3 Jul 2024 17:29:26 -0600
From: Marc Rochkind <mrochkind@gmail.com>
Newsgroups: gmane.org.unix-heritage.general
Subject: Re: Anyone ever heard of teaching a case study of Initial Unix?
Message-ID: <CAOkr1zXSefHKOqTCaGE7Zb_T09HD-s2pM9QfTW8PuMkLGioDGg@mail.gmail.com>

On Wed, Jul 3, 2024 at 9:27 AM Vincenzo Nicosia wrote:
> The programmers considered as "fungible workforce" by mainstream
> software engineering and project management theories are *paid* to
> to their programming job, and they mostly have to carry that job
> over working on prescribed objectives and timelines which have been
> decided by somebody else, managers who know nothing at all about
> software development. Personal interest in the project, passion,
> motivation, curiosity, creative power, sense of beauty, the joy of
> belonging to a community of likeminded people, are never part of the
> equation, at any point.

What a cynical take on software development! The logical error is to
assume that if something is sometimes true (e.g., "managers who know
nothing at all about software development") then it is always true.

My experience over many decades is quite different. Most often,
managers know software quite well. Where they fail is in their very
poor understanding of how to manage people.

The bias that operates in software development, and perhaps all
organizations, is that when there is a disagreement between management
and non-management (e.g., programmers), the non-managers usually
assume that they are always right and the managers are wrong.

I have never met a programmer or group of programmers who were always
right. Most often, they are ignorant of financing, regulatory
constraints, product schedules, commitments, staffing issues, and
everything else that isn't coding. (There are exceptions, but they are
uncommon.) Management, by definition, is the art and science of using
resources to reach an objective. Programmers generally are concerned
only with themselves as a resource and with their own personal
programming objective. It is unusual to find a programmer who
understands management.

Tags: quote
Authors: ag

ls colours

Air Date: 2025-03-26
Latest update: 2025-03-26 23:44:06

While reading about Poettering's adventures in determining the best $TERM value for serial & VM terminals in systemd, I accidentally discovered that ls from coreutils doesn't consult terminfo to check whether a terminal emulator (TE) supports colour.

What does it consult then? $TERM? Yes and no.

At first glance, with the --color=auto flag, it ignores $TERM completely, trusting the LS_COLORS environment variable. But if $LS_COLORS is absent, it still prints in colour for some terminals--yet, simultaneously, not for all file types. What is going on?

You may have seen the /etc/DIR_COLORS file. Usually, your distro includes some default sh scripts that invoke the dircolors(1) program with that file as an argument, generating a big, ugly-looking string value for $LS_COLORS.

The interesting thing about /etc/DIR_COLORS is that it's just a copy of a file embedded into ls and dircolors programs (as a single static char const G_line[] variable) during compilation, meaning that in the absence of /etc/DIR_COLORS they technically still have a default colour scheme.

ls doesn't read that (probably modified by a user) file directly--it reads the LS_COLORS environment variable, presumably because, in 1996, dynamically generating $LS_COLORS values was considered too costly. Hence, ls has a separate table (color_indicator[], for the curious) listing "important" file types. That table is consulted in case $LS_COLORS is unset.

But that doesn't explain why with no $LS_COLORS in sight, ls suddenly starts looking at $TERM. The DIR_COLORS file also contains a list of TEs ls considers capable of printing in colour. So why, then, is vt100 on that list when, according to the interwebs, nobody in 1978 in their right mind thought a colour terminal made any sense, unless they were driving a Lamborghini Countach? (I might be exaggerating a little.)

Anyhow, ls employs its embedded DIR_COLORS file to:

/* Check if the content of TERM is a valid name in dircolors.  */
static bool known_term_type(void) {
  char const *term = getenv("TERM");
  if (!term || !*term)
    return false;

  char const *line = G_line;
  while (line - G_line < sizeof(G_line)) {
    if (STRNCMP_LIT(line, "TERM ") == 0) {
      if (fnmatch(line + 5, term, 0) == 0)
        return true;
    }
    line += strlen(line) + 1;
  }

  return false;
}

See? Now it's all come together. Moreover, to please Lennart, in the next release of coreutils, vt220 will be joining the illustrious crew of colour terminals too. What a time to be alive.
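The same check can be approximated from the shell, which is handy for guessing what ls will decide for a given $TERM. This is a rough sketch, not what coreutils ships: known_term is a made-up name, and it reads a DIR_COLORS-style database from stdin instead of the embedded G_line.

```shell
#!/bin/sh
# known_term TERM_VALUE (hypothetical helper): read a DIR_COLORS-style
# database on stdin; succeed if any "TERM pattern" line matches.
known_term() {
    term=$1
    [ -n "$term" ] || return 1
    while read -r key pattern rest; do
        [ "$key" = TERM ] || continue
        # An unquoted $pattern in a case label is treated as a glob,
        # which stands in for the fnmatch() call in the C code.
        case $term in
            $pattern) return 0 ;;
        esac
    done
    return 1
}

# Tiny inline database, for illustration only:
known_term xterm-256color <<'EOF' && echo "xterm-256color: colour"
TERM vt100
TERM xterm*
EOF
```

On a system with coreutils, feeding it the real list should look like dircolors --print-database | known_term "$TERM".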

Tags: ойті
Authors: ag

XTGETTCAP

Air Date: 2025-03-25
Latest update: 2025-03-26 00:05:02

When the authors of a new terminal emulator (TE) decide which capabilities to support, they inevitably end up with a subset of XTerm's features. Then, much like a non-mainstream web browser hiding its true identity behind a fake User-Agent request header, the new TE often claims to be a variant of xterm. This frustrates the current XTerm maintainer & the maintainers of the terminfo database, who insist that every TE should have its own entry in terminfo, and that it's always better to esse quam videri. So far, TE authors have largely ignored this prudent advice.

Sometimes, the ability to deceive programs that rely on the TERM environment variable can be useful. Suppose you want to display an infobox to a user via dialog(1), but you want it to use a monochrome palette. The dialog program doesn't have a flag for this, but we can trick it into thinking it's running in a terminal that doesn't support colours:

$ TERM=vt100 dialog --infobox 'ну шо ти малá' 0 0

This works because the dialog(1) program uses the ncurses library, which in turn relies on the terminfo database. The latter includes an entry for the vt100 terminal, that, according to terminfo, is fairly unsophisticated:

$ TERM=vt100 tput colors
-1

To get a full list of such simpletons, run:

$ find /usr/share/terminfo -type f |
    xargs -n1 basename |
    while read -r line; do [ "$(tput -T "$line" colors 2>/dev/null)" = -1 ] && echo "$line"; done

In general, CLI programs that use terminal features beyond simply deleting the previous character fall into 2 categories--those that:

consult terminfo for supported capabilities;
include their own little database of compatible TE.

Both rely on $TERM, hence, both can be easily led astray.

There is a 3rd way: ask the terminal emulator (TE) to report its capabilities using escape sequences. This mechanism, called XTGETTCAP, is slowly gaining popularity among TEs. Obviously, it "works over ssh" & doesn't require terminfo to be installed on the remote machine. XTGETTCAP was first introduced by XTerm¹ and is now supported by Kitty and iTerm2. With this mechanism, the value of $TERM becomes less relevant:

$ echo $TERM
xterm-256color
$ TERM=vt100 ./XTGETTCAP TN
xterm-256color
$ ./XTGETTCAP colors
256

Alas, there is no widely known CLI utility named XTGETTCAP--in the example above, it's just a ~small shell script. Even more unfortunate is that doing a proper XTGETTCAP query & reading the response is a tricky business. You need to:

Construct a query using an escape sequence & hexify the capability name, e.g., colors becomes 636f6c6f7273.
Switch the TE from canonical to raw mode.
Handle raw mode correctly. In this mode, the TE doesn't assemble characters into lines, so we can't rely on it to return everything up to one line at a time. Instead, we must either request a response of a known size (we don't know the size!) or read an unknown-length response with a timeout. If you royally screw this up, your read operation may block ∞.
Restore canonical mode.
Unpack the reply.

#!/bin/sh

set -e

eh() { echo "Error: $*" 1>&2; trap - 0; exit 1; }
text2hex() { od -A n -t x1 | tr -d ' \n'; }
hex2text() { sed 's/../0x& /g' | xargs printf '\\\\%03o\n' | xargs printf %b; }
stty_orig=`stty -g`
stty_restore() { stty "$stty_orig"; }

[ -n "$1" ] || eh Usage: XTGETTCAP capability

capability=`printf '%s' "$1" | text2hex`
trap stty_restore 0

stty -echo raw time 1 min 0
printf '\033P+q%s\033'\\ "$capability" > /dev/tty
buf=`dd status=none count=1`
stty_restore

[ -n "$buf" ] || eh unsupported terminal
[ "$V" ] && printf %s "$buf" | xxd

buf=`printf %s "$buf" | sed 's/[^a-zA-Z0-9+=]//g'`
[ P1 = "${buf%+*}" ] || eh unknown capabilitah
printf %s "${buf#*=}" | hex2text | xargs

A couple of notes:

the script doesn't work under screen/tmux, even if the underlying TE supports the XTGETTCAP mechanism;
if you decide to fiddle with the time and min settings, think twice; APUE, ch. 18 has a matrix of the four MIN/TIME combinations (the min 0, time 1 pair used here makes a read return as soon as any bytes arrive, or empty-handed after 0.1 s);
hex2text() may look silly, but it works even under FreeBSD (unless the source contains encoded null bytes) & doesn't rely on non-standard utilities like xxd;
in principle, the dd ... call can be replaced with a simple cat.

$ ./XTGETTCAP TN
xterm-kitty
$ V=1 ./XTGETTCAP colors
00000000: 1b50 312b 7236 3336 6636 6336 6637 3237  .P1+r636f6c6f727
00000010: 333d 3332 3335 3336 1b5c                 3=323536.\
256
$ ./XTGETTCAP шо?
Error: unknown capabilitah
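The hexdump above shows the wire format: the query carries the hexified capability name, and a successful reply looks like ESC P 1 + r name=value ESC \, with both halves hex-encoded (a failed lookup starts with P0 instead). The encoding and unpacking steps can be tried without touching a terminal. A small sketch with made-up helper names, hexify and parse_reply, fed a canned reply rather than a live one:

```shell
#!/bin/sh
# hexify NAME (hypothetical helper): print NAME as lowercase hex.
hexify() { printf '%s' "$1" | od -A n -t x1 | tr -d ' \n'; }

# parse_reply (hypothetical helper): read a raw XTGETTCAP reply on
# stdin, print the still hex-encoded value; fail unless it starts
# with "P1+r", i.e. unless the terminal reported success.
parse_reply() {
    reply=$(LC_ALL=C sed 's/[^a-zA-Z0-9+=]//g')
    case $reply in
        P1+r*=*) printf '%s\n' "${reply#*=}" ;;
        *) return 1 ;;
    esac
}

hexify colors; echo
# Canned reply, the same bytes as in the hexdump:
printf '\033P1+r636f6c6f7273=323536\033\\' | parse_reply
```

The two commands print 636f6c6f7273 and 323536; the latter still has to go through something like hex2text() to become 256.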

¹ Requires the allowTcapOps resource to be true. By default, it's off on all IBM distros due to security concerns (e.g., see CVE-2022-45063).

Tags: ойті
Authors: ag