TB 121 Ltnews 28
LaTeX News
Issue 28, April 2018
LaTeX News, and the LaTeX software, are brought to you by the LaTeX3 Project Team; Copyright 2018, all rights reserved.
provide support for multiple encodings. It also allows a file written in one
encoding to be processed correctly on a computer using a different encoding,
and even supports documents where the encoding changes midway.

Since the first release of LaTeX 2ε in 1994, LaTeX documents that used any
characters outside ASCII in the source (i.e., any characters in the range of
128–255) were supposed to load inputenc and specify in which file encoding they
were written and stored. If the inputenc package was not loaded then LaTeX used
a “raw” encoding which essentially took each byte from the input file and
typeset the glyph that happened to be in that position in the current font,
something that sometimes produces the right result but often enough will not.

In 1992 Ken Thompson and Rob Pike developed the UTF-8 encoding scheme which
enables the encoding of all Unicode characters within 8-bit sequences. Over
time this encoding has gradually taken over the world, replacing the legacy
8-bit encodings used before. These days all major computer operating systems
use UTF-8 to store their files and it requires some effort to explicitly store
files in one of the legacy encodings.

As a result, whenever LaTeX users want to use any accented characters from
their keyboard (instead of resorting to \"a and the like) they always have to
use

    \usepackage[utf8]{inputenc}

in the preamble of their documents as otherwise LaTeX will produce gibberish.

The new default

With this release, the default encoding for LaTeX files has been changed from
the “fall through raw” encoding to UTF-8 if used with classic TeX or pdfTeX.
The implementation is essentially the same as the existing UTF-8 support from
\usepackage[utf8]{inputenc}.

The LuaTeX and XeTeX engines always supported the UTF-8 encoding as their
native input encoding, so with these engines inputenc was always a no-op.

This means that with new documents one can assume UTF-8 input and it is no
longer required to always specify \usepackage[utf8]{inputenc}. But if this line
is present it will not hurt either.

Compatibility

For most existing documents this change will be transparent:

• documents using only ASCII in the input file and accessing accented
  characters via commands;

• documents that specified the encoding of their file via an option to the
  inputenc package and then used 8-bit characters in that encoding;

• documents that already had been stored in UTF-8 (whether or not specifying
  this via inputenc).

Only documents that have been stored in a legacy encoding and used accented
letters from the keyboard without loading inputenc (relying on the similarities
between the input used and the T1 font encoding) are affected.

These documents will now generate an error that they contain invalid UTF-8
sequences. However, such documents may be easily processed by adding the new
command \UseRawInputEncoding as the first line of the file. This will reinstate
the previous “raw” encoding default.

\UseRawInputEncoding may also be used on the command line to process existing
files without requiring the file to be edited:

    pdflatex '\UseRawInputEncoding \input' file

will process the file using the previous default encoding.

Possible alternatives are reencoding the file to UTF-8 using a tool (such as
recode or iconv or an editor) or adding the line

    \usepackage[⟨encoding⟩]{inputenc}

to the preamble specifying the ⟨encoding⟩ that fits the file encoding. In many
cases this will be latin1 or cp1252. For other encoding names and their meaning
see the inputenc documentation.

As usual, this change may also be reverted via the more general latexrelease
package mechanism, by specifying a release date earlier than this release.

BOM: byte order mark handling

When using Unicode the first bytes of a file may be a so-called BOM character
(byte order mark) to indicate the byte order used in the file. While this is
not required with UTF-8 encoded files (where the byte order is known) it is
nevertheless allowed by the standard and some editors add that byte sequence to
the beginning of a file. In the past such files would have generated a “Missing
begin document” error or displayed strange characters when loaded at a later
stage.

With the addition of UTF-8 support to the kernel it is now possible to identify
and ignore such BOM characters even before \documentclass so that these issues
will no longer show up.

A general rollback concept for packages and classes

In 2015 a rollback concept for the LaTeX kernel was introduced. Providing this
feature allowed us to make corrections to the software (which more or less
didn’t happen for nearly two decades) while continuing to maintain backward
compatibility to the highest degree.

In this release we have now extended this concept to the world of packages and
classes which was not covered initially. As the classes and the extension packages
have different requirements compared to the kernel, the approach is different
(and simplified). This should make it easy for package developers to apply it
to their packages and for authors to use it when necessary.

The documentation of this new feature is given in an article submitted to
TUGboat and also available from our website [3].

Integration of remreset and chngcntr packages into the kernel

With the optional argument to \newcounter, LaTeX offers a way to automatically
reset counters when some other counter is stepped, e.g., stepping a chapter
counter resets the section counter (and recursively all other heading
counters). However, what was until now missing was a way to undo such a link
between counters or to link two counters after they have been defined.

This can now be done with \counterwithin and \counterwithout, respectively. In
the past one had to load the chngcntr package for this. For the programming
level we also added \@removefromreset as the counterpart of the already
existing \@addtoreset command. Up to now this was offered by the remreset
package.

Testing for undefined commands

LaTeX packages often use the test \@ifundefined to check if a command is
defined. Unfortunately this had the side effect of defining the command to
\relax in the case that it had no definition. The new release uses a modified
definition (using extra testing possibilities available in ε-TeX). The new
definition is more natural; however, code that was relying on the side effect
of the tested command becoming defined if it was previously undefined may have
to add \let\⟨command⟩\relax.

Changes to packages in the tools category

LaTeX table columns with fixed widths

Frank published a short paper in TUGboat [4] on producing tables that have
columns with fixed widths. The outlined approach using column specifiers “w”
and “W” has now been integrated into the array package.

Obscure overprinting with multicol fixed

A rather peculiar bug was reported on StackExchange for multicol. If the
column/page breaking was fully controlled by the user (through \columnbreak)
instead of letting the environment do its job, and if more \columnbreak
commands then showed up on the last page, the balancing algorithm was thrown
off track. As a result some parts of the columns overprinted each other.

The fix required a redesign of the output routines used by multicol and while
it “should” be transparent in other cases (and all tests in the regression test
suite came out fine) there is the off-chance that code that hooked into the
internals of multicol needs adjustment.

Changes to packages in the amsmath category

With this release of LaTeX a few minor issues with amsmath have been corrected.

Updated user’s guide

Furthermore, amsldoc.pdf, the AMS user’s guide for the amsmath package [5], has
been updated from version 2.0 to 2.1 to incorporate changes and corrections
made between 2016 and 2018.

References

[1] Frank Mittelbach: New rules for reporting bugs in the LaTeX core software.
    In: TUGboat, 39#1, 2018.
    https://www.latex-project.org/publications/

[2] Frank Mittelbach: LaTeX 2ε Encoding Interface: Purpose, concepts, and Open
    Problems. Talk given in Brno June 1995.
    https://www.latex-project.org/publications/

[3] Frank Mittelbach: A rollback concept for packages and classes. Submitted to
    TUGboat.
    https://www.latex-project.org/publications/

[4] Frank Mittelbach: LaTeX table columns with fixed widths. In: TUGboat, 38#2,
    2017.
    https://www.latex-project.org/publications/

[5] American Mathematical Society and The LaTeX3 Project: User’s Guide for the
    amsmath package (Version 2.1). April 2018. Available from
    https://www.ctan.org and distributed as part of every LaTeX distribution.
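
Example

As an illustration of three of the changes described in this issue, the following is a minimal sketch of a document that relies on the new UTF-8 default, on \counterwithin and \counterwithout from the kernel, and on the “w” and “W” column specifiers of the array package. The document class, the chapter title, the column widths and the choice of counters are assumptions made for the example only; it presumes a file saved as UTF-8 and processed with pdfTeX under a LaTeX release of April 2018 or later.

```latex
% Assumption: this file is saved in UTF-8 and compiled with pdflatex.
\documentclass{report}
\usepackage[T1]{fontenc}   % font encoding that contains accented glyphs
\usepackage{array}         % provides the w and W column specifiers

% Undo the link between figure and chapter: figures are numbered 1, 2, 3, ...
\counterwithout{figure}{chapter}
% Link equation to section: the equation counter is reset at every \section.
\counterwithin{equation}{section}

\begin{document}

% Accented letters are typed directly; no \usepackage[utf8]{inputenc} line.
\chapter{Résumé}

\section{Fixed-width columns}

% w{c}{3cm}: centred column of width 3cm
% W{r}{2cm}: right-aligned column of width 2cm that warns if the text is too wide
\begin{tabular}{|w{c}{3cm}|W{r}{2cm}|}
  \hline
  centred & right \\
  \hline
\end{tabular}

\begin{equation}
  a^2 + b^2 = c^2
\end{equation}

\end{document}
```

With an older release the same source would need \usepackage[utf8]{inputenc} and \usepackage{chngcntr} in the preamble. Neither line does any harm if it is kept.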