


The hyph-utf8 package and hyphenation with TeX

Maintainers of the hyph-utf8 package and collectors of patterns:


• Mojca Miklavec, Arthur Reutenauer
• With contributions by Khaled Hosny, Manuel Pégourié-Gonnard, Élie Roux

Main description: June 2011
Latest editorial change: 16 March 2018

Abstract:
In 2008, all the existing hyphenation patterns from TeX distributions were collected in a
single package, hyph-utf8, converted to UTF-8 encoding, and adapted for use in different
TeX engines. The patterns can be used directly by Unicode-aware engines such as LuaTeX
and XeTeX, and there is a mechanism to convert the patterns to the appropriate 8-bit encoding
when used with pTeX, pdfTeX or Knuth’s TeX.

Table of Contents:

1 Using hyphenation patterns
  1.1 Plain TeX
  1.2 LaTeX
    1.2.1 LaTeX with Babel
    1.2.2 LaTeX with Polyglossia
    1.2.3 Low-level commands
  1.3 ConTeXt
    1.3.1 ConTeXt MkII
  1.4 Some advanced examples
    1.4.1 Example for Polyglossia
2 List of supported languages

1 Using hyphenation patterns

1.1 Plain TeX

In engines that support ε-TeX, you can select the desired hyphenation patterns with:

\uselanguage{langname}

where langname is the string identifying a particular hyphenation file in language.def (see Section 2).
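
As an illustration, a complete plain TeX file using this command might look like the sketch below. It assumes an ε-TeX-based plain format that reads language.def (in many distributions the default `pdftex` format is one), and it assumes that the name `ngerman` from Section 2 is available in that format; the sample text and the narrow `\hsize` are only there to provoke line breaks.

```tex
% Sketch only: compile with an e-TeX-based plain format, e.g.  pdftex sample.tex
\uselanguage{ngerman}   % a language name from language.def (see Section 2)
\hsize=4cm              % narrow column, so that several words must be broken

Die Silbentrennung folgt nun den deutschen Trennmustern,
auch in einer sehr schmalen Spalte.

% Writes the hyphenation points found for these words to the log file
\showhyphens{Silbentrennung Trennmuster}
\bye
```

The sample text is kept to ASCII on purpose, since an 8-bit plain TeX setup needs extra font and input configuration for accented letters.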

1.2 LaTeX

1.2.1 LaTeX with Babel

You can switch the language in LaTeX with:

\usepackage[languagename]{babel}

In 8-bit engines you also need to load a font encoding that supports all the
characters used in the language of your choice, for example:

\usepackage[T1]{fontenc}

N.B.: Babel can be used with any TeX engine; however, it has never been properly adapted to work well with
Unicode engines. If you are using XeTeX, it is advisable to use Polyglossia instead.
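
For an 8-bit engine such as pdfLaTeX, the two lines above might be combined into a document along the following lines. This is a minimal sketch: the choice of `ngerman` and `english` is arbitrary, and the `inputenc` line is an assumption about the input files (UTF-8), not something hyph-utf8 requires.

```tex
\documentclass{article}
\usepackage[T1]{fontenc}           % font encoding covering the accented letters
\usepackage[utf8]{inputenc}        % assumes the source file is saved as UTF-8
\usepackage[ngerman,english]{babel} % the last language listed is the main one

\begin{document}
Main text in English, hyphenated with the English patterns.

% A longer passage in another language
\begin{otherlanguage}{ngerman}
Ein längerer deutscher Absatz, getrennt nach den deutschen Trennmustern.
\end{otherlanguage}

% A short fragment in another language
A single foreign word: \foreignlanguage{ngerman}{Silbentrennung}.
\end{document}
```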

1.2.2 LaTeX with Polyglossia

Polyglossia should be the preferred choice when using XeLaTeX. At the time of writing it does not yet
support LuaLaTeX, but there are plans to extend it in the future.

\usepackage{polyglossia}
\setmainlanguage[optional settings]{langname}
\setotherlanguages{otherlangname}

\begin{otherlangname}[optional settings] ... \end{otherlangname}

See the Polyglossia manual for an extensive list of options.

1.2.3 Low-level commands


Since Babel’s hyphen.cfg is built into the LaTeX format, hyphenation patterns can be used without even
loading Babel or Polyglossia. At the low level, this simply corresponds to setting

\language=\l@<langname>

The user command is supposed to be

\hyphenrules{langname}

or

\begin{hyphenrules}{langname} ... \end{hyphenrules}

This should work with any flavour of LaTeX; however, we could not make it work.
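
The low-level assignment itself can be tried out directly. Because the name \l@<langname> contains an @, it has to be used between \makeatletter and \makeatother. The sketch below wraps the assignment in a helper called \selecthyphenation, which is a hypothetical name invented for this example and not part of LaTeX, Babel or hyph-utf8. It assumes a format in which the patterns for `ngerman` are preloaded; otherwise only the warning is issued.

```tex
\documentclass{article}
\usepackage[T1]{fontenc}

\makeatletter
% \selecthyphenation is a hypothetical helper defined only for this sketch.
% It sets \language to \l@<langname> if the format knows that language.
\newcommand*{\selecthyphenation}[1]{%
  \@ifundefined{l@#1}%
    {\@latex@warning{No hyphenation patterns preloaded for `#1'}}%
    {\language=\csname l@#1\endcsname\relax}%
}
\makeatother

\begin{document}
% The group keeps the change local
{\selecthyphenation{ngerman}%
 \showhyphens{Silbentrennung Trennmuster}}% result is written to the log file
\end{document}
```

Note that this only switches the patterns; settings that Babel or Polyglossia would normally adjust as well, such as \lefthyphenmin and \righthyphenmin, are left untouched.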

1.3 ConTeXt

ConTeXt doesn’t load patterns for all the languages that hyph-utf8 provides. If a language you need is
missing, please contact the mailing list. The general syntax for supported languages is the following:

% language of the main document
\mainlanguage[language]

% to switch to another language locally
{\language[otherlanguage] language of some short fragment}

You can use either the full language name or the two-letter language code.
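
Put together, a small ConTeXt document might look like the sketch below. The language codes `de` and `fr` are chosen only as examples and are assumed to be among the languages your ConTeXt installation loads; the text is kept to ASCII so that no input regime has to be configured.

```tex
% Sketch only: German main language with a short French fragment
\mainlanguage[de]

\starttext
Deutscher Haupttext, getrennt nach den deutschen Trennmustern.
{\language[fr] un court passage dans une autre langue}
Und wieder deutscher Text.
\stoptext
```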

1.3.1 ConTeXt MkII

ConTeXt MkII already uses the EC/T1 font encoding by default, but you might need to change
the encoding when using Polish, languages written in Cyrillic scripts, etc. For example:

\usetypescript[iwona][qx]
\setupbodyfont[iwona]
\mainlanguage[polish]

ConTeXt loads hyphenation patterns in several encodings. For example, the Czech and Slovak patterns can be
used with both the EC and IL2 font encodings. The right hyphenation patterns are chosen based on the current
font encoding.

1.4 Some advanced examples

1.4.1 Example for Polyglossia

\documentclass{article}
\usepackage{polyglossia}
% the language used for the main document
\setmainlanguage{asturian}
% American English with extended hyphenation patterns
\setotherlanguage[variant=usmax]{english}
% German with experimental patterns "ngerman-x-latest"
\setotherlanguage[spelling=new,latesthyphen=true]{german}
\setotherlanguages{spanish,catalan,french}

\begin{document}

Long Asturian text ... (Hyphenation for Asturian is not available,
but polyglossia automatically falls back on Catalan for now,
which seems to be a reasonable choice.)

\begin{german}
Deutscher Text ... (with the hyphenation patterns selected above:
"ngerman-x-latest")
\end{german}

% options for an environment go after the environment name
\begin{german}[script=fraktur,spelling=old]
Deutſcher Text ... (set in Fraktur, with traditional hyphenation).
\end{german}

\end{document}

2 List of supported languages
For several languages, additional documentation is available in a separate file:

• For German, dehyph-exptl.pdf
• For Spanish, division.pdf

Language         Code            Name             Synonyms
English          -               english          usenglish, USenglish, american
                 en-us           usenglishmax
                 en-gb           ukenglish        british, UKenglish
Afrikaans        af              afrikaans
Ancient Greek    grc             ancientgreek
                 grc-x-ibycus    ibycus
Arabic           ar              arabic
Armenian         hy              armenian
Assamese         as              assamese
Basque           eu              basque
Belarusian       be              belarusian
Bengali          bn              bengali
Bulgarian        bg              bulgarian
Catalan          ca              catalan
Chinese          zh-latn-pinyin  pinyin
Church Slavonic  cu              churchslavonic
Coptic           cop             coptic
Croatian         hr              croatian
Czech            cs              czech
Danish           da              danish
Dutch            nl              dutch
Esperanto        eo              esperanto
Estonian         et              estonian
Ethiopic         mul-ethi        ethiopic         amharic, geez
Farsi            fa              farsi            persian
Finnish          fi              finnish
French           fr              french           patois, francais
Friulan          fur             friulan
Galician         gl              galician
Georgian         ka              georgian
German           de-1901         german
                 de-1996         ngerman
                 de-ch-1901      swissgerman
Greek            el-monoton      monogreek
                 el-polyton      greek            polygreek
Gujarati         gu              gujarati
Hindi            hi              hindi
Hungarian        hu              hungarian
Icelandic        is              icelandic
Indonesian       id              indonesian
Interlingua      ia              interlingua
Irish            ga              irish
Italian          it              italian
Kannada          kn              kannada
Kurmanji         kmr             kurmanji
Latin            la              latin
                 la-x-classic    classiclatin
                 la-x-liturgic   liturgicallatin
Latvian          lv              latvian
Lithuanian       lt              lithuanian
Malayalam        ml              malayalam
Marathi          mr              marathi
Mongolian        mn-cyrl         mongolian
                 mn-cyrl-x-lmc   mongolianlmc
Norwegian        nb              bokmal           norwegian, norsk
                 nn              nynorsk
Occitan          oc              occitan
Oriya            or              oriya
Panjabi          pa              panjabi
Polish           pl              polish
Piedmontese      pms             piedmontese
Portuguese       pt              portuguese       portuges
Romanian         ro              romanian
Romansh          rm              romansh
Russian          ru              russian
Sanskrit         sa              sanskrit
Serbian          sr-latn         serbian
                 sr-cyrl         serbianc
Slovak           sk              slovak
Slovenian        sl              slovenian        slovene
Spanish          es              spanish          espanol
Swedish          sv              swedish
Tamil            ta              tamil
Telugu           te              telugu
Thai             th              thai
Turkish          tr              turkish
Turkmen          tk              turkmen
Ukrainian        uk              ukrainian
Upper Sorbian    hsb             uppersorbian
Welsh            cy              welsh
Babel defines a few more synonyms (which consequently only work in LaTeX):

Name             Additional synonyms
english          canadian
british          australian, newzealand
german           austrian
ngerman          naustrian, nswissgerman
portuguese       brazilian, brazil
