
Commit 9721240

I've sent 3 mails to pgsql-patches. There are two files, one for the doc
directory and one for the src/data directory, and one minor patch for doc/README.locale. Please apply. Oleg.
1 parent c5d0a1b commit 9721240

3 files changed: 138 additions, 1 deletion


doc/README.Charsets

Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
PostgreSQL Charsets README
Josef Balatka, <balatka@email.cz>
Draft v0.1, Tue Jul 20 15:49:07 CEST 1999

This document is a brief overview of the national charset support
implemented in PostgreSQL ver. 6.5. It describes the relevant
compilation options and gives setup tips for particular use cases.

---------------------------------------------------------------------------

Table of Contents

1. Locale awareness

2. Single-byte charsets recoding

3. Multi-byte support/recoding

4. Credits

---------------------------------------------------------------------------

1. Locale awareness

The PostgreSQL server supports both a locale-aware and a non-locale-aware
(default) operational mode. You select the mode during the
configuration stage of the installation with the --enable-locale option.

If you don't use --enable-locale, the multi-language code will not be
compiled and PostgreSQL will behave as an ASCII-compliant application.
This mode is useful for its speed, but only if you don't
have to handle nation-specific characters.

With --enable-locale you get a locale-aware server that uses the LC_*
environment variables to determine how to process national specifics.
In this case strcoll(3) and similar functions are used internally,
so speed is somewhat lower.
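
As an illustration of what strcoll(3)-based comparison means, here is a
minimal Python sketch (Python's locale.strcoll wraps the C function). It
is not PostgreSQL code. The helper name sort_with_locale is hypothetical,
and because the set of installed locales differs from host to host, the
demonstration uses only the "C" locale, which is always available.

```python
import functools
import locale


def sort_with_locale(words, locale_name="C"):
    """Sort words with strcoll under the given LC_COLLATE locale.

    Hypothetical helper for illustration only. Raises locale.Error if
    locale_name is not installed on this system.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # remember current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        return sorted(words, key=functools.cmp_to_key(locale.strcoll))
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)  # restore it


if __name__ == "__main__":
    # In the "C" locale the order follows raw character codes, so
    # uppercase ASCII letters sort before lowercase ones.
    print(sort_with_locale(["b", "a", "B"]))
```

Passing the name of a national locale installed on your system instead
of "C" makes the same call follow that locale's collation rules. Every
comparison then goes through the locale tables, which is the extra work
the paragraph above refers to.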

Note that --enable-locale alone is sufficient when all your clients
use the same single-byte encoding as the database server.

When your clients use an encoding different from the server's, then you
additionally need either the --enable-recode or --with-mb=<encoding> option
on the server side, or a client that does the recoding itself (e.g.
there is a PostgreSQL ODBC driver for Win32 that supports various Cyrillic
encodings). The --with-mb=<encoding> option is required for
multi-byte charset support.


2. Single-byte charsets recoding

You can set up this feature with the --enable-recode option. This option
is described as 'enable Cyrillic recode support', which doesn't express
all its power. It can be used for *any* single-byte charset recoding.

This method uses the charset.conf file located in the $PGDATA directory.
It's a typical configuration text file where spaces and newlines
separate items and records and # starts a comment. Three keywords
with the following syntax are recognized:

    BaseCharset <server_charset>
    RecodeTable <from_charset> <to_charset> <file_name>
    HostCharset <host_spec> <host_charset>
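
The record syntax above can be sketched as a small Python parser. This
is an illustration only, not the server's actual parser. It assumes one
record per line, case-sensitive keywords, and comments running from # to
the end of the line. The function name parse_charset_conf, the SAMPLE
text and the charset names "iso" and "win" are all hypothetical.

```python
def parse_charset_conf(text):
    """Parse charset.conf-style text into a dict of records.

    Assumption: each record sits on its own line, and '#' starts a
    comment that runs to the end of the line.
    """
    config = {"base_charset": None, "recode_tables": [], "host_charsets": []}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        fields = raw.split("#", 1)[0].split()
        if not fields:
            continue  # blank or comment-only line
        keyword, args = fields[0], fields[1:]
        if keyword == "BaseCharset" and len(args) == 1:
            config["base_charset"] = args[0]
        elif keyword == "RecodeTable" and len(args) == 3:
            config["recode_tables"].append(tuple(args))
        elif keyword == "HostCharset" and len(args) == 2:
            config["host_charsets"].append(tuple(args))
        else:
            raise ValueError(f"line {lineno}: unrecognized record {raw!r}")
    return config


# Hypothetical sample, not the example file shipped in src/data.
SAMPLE = """\
# server encoding
BaseCharset   iso
RecodeTable   iso  win  isocz-wincz.tab
HostCharset   192.168.1.20-192.168.1.40  win   # Windows clients
"""

if __name__ == "__main__":
    print(parse_charset_conf(SAMPLE))
```

HostCharset records are kept in file order on purpose, since the order
of rules matters when they are evaluated.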

BaseCharset defines the encoding of the database server. Charset
names are used only for mapping inside charset.conf, so you can
freely choose typing-friendly names.

RecodeTable records specify a translation table between server and client.
The file name is relative to the $PGDATA directory. The table file format
is very simple. There are no keywords, and each line holds one
pair of decimal or hexadecimal (0x-prefixed) character values:

    <char_value> <translated_char_value>
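
A minimal Python sketch of reading such a table and applying it to a
byte string follows. It is not the server's implementation. Two points
are assumptions: bytes not listed in the file map to themselves, and #
starts a comment (as in the src/data/isocz-wincz.tab file added by this
commit). The names parse_recode_table and recode are hypothetical.

```python
def parse_recode_table(text):
    """Build a 256-entry translation table from '<value> <value>' lines.

    Assumptions: bytes the file does not mention map to themselves,
    and '#' starts a comment.
    """
    table = bytearray(range(256))  # identity mapping by default
    for lineno, raw in enumerate(text.splitlines(), start=1):
        fields = raw.split("#", 1)[0].split()
        if not fields:
            continue  # blank or comment-only line
        if len(fields) != 2:
            raise ValueError(f"line {lineno}: expected two values")
        # int(x, 0) accepts decimal ("165") and 0x-prefixed hex ("0xA5").
        src, dst = (int(field, 0) for field in fields)
        if not (0 <= src <= 255 and 0 <= dst <= 255):
            raise ValueError(f"line {lineno}: value out of byte range")
        table[src] = dst
    return bytes(table)


def recode(data, table):
    """Translate a byte string through the 256-entry table."""
    return data.translate(table)


if __name__ == "__main__":
    table = parse_recode_table("165 188\n0xA9 0x8A\n")
    print(list(recode(bytes([165, 169, 65]), table)))
```

A flat 256-entry table keeps the per-character cost to a single lookup,
which suits single-byte charsets.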

HostCharset records map an IP address to a charset. You can use a single
IP address, an IP mask range starting from the given address, or an IP
interval (e.g. 127.0.0.1, 192.168.1.100/24, 192.168.1.20-192.168.1.40).

The charset.conf file is always processed to the end, so you can easily
specify exceptions to earlier rules. In src/data you will
find an example charset.conf and a few recoding tables.
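
The three host_spec forms and the "processed to the end" rule can be
sketched in Python as follows. This is an illustration under stated
assumptions, not the server's code. The mask form is read here as an
ordinary CIDR network, which may differ from the exact range the server
derives, and "processed to the end" is read as "the last matching record
wins". The names host_matches and charset_for, and the charset names in
the example, are hypothetical.

```python
import ipaddress


def host_matches(spec, client_ip):
    """Return True if client_ip falls under a HostCharset <host_spec>."""
    addr = ipaddress.ip_address(client_ip)
    if "-" in spec:  # interval: first-last, inclusive
        first, last = (ipaddress.ip_address(part.strip())
                       for part in spec.split("-", 1))
        return first <= addr <= last
    if "/" in spec:  # mask form, read here as a CIDR network (assumption)
        return addr in ipaddress.ip_network(spec, strict=False)
    return addr == ipaddress.ip_address(spec)  # single address


def charset_for(client_ip, host_charsets, default=None):
    """Scan every record in order; the last matching one wins."""
    chosen = default
    for spec, charset in host_charsets:
        if host_matches(spec, client_ip):
            chosen = charset
    return chosen


if __name__ == "__main__":
    rules = [
        ("192.168.1.100/24", "win"),            # general rule
        ("192.168.1.20-192.168.1.40", "koi"),   # exception for a range
        ("192.168.1.30", "iso"),                # exception for one host
    ]
    for ip in ("192.168.1.200", "192.168.1.25", "192.168.1.30"):
        print(ip, charset_for(ip, rules))
```

Writing the general rule first and the exceptions after it is what the
last-match reading makes possible.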

As this solution is based on mapping the client's IP address to a charset,
there are obviously some restrictions as well. You can't use different
encodings on the same host at the same time. It's also inconvenient when
you boot your client hosts into several operating systems.
Nevertheless, when these restrictions are not limiting and you don't
need multi-byte chars, then it's a simple and effective solution.


3. Multi-byte support/recoding

It's a new generation of charset encoding in PostgreSQL, designed as a
more comprehensive solution supporting both single-byte and multi-byte chars.
You can set up this feature with the --with-mb=<encoding> option.

There is no IP mapping file, and recoding is controlled through new
SQL statements. Recoding tables are included in the code. Many national
charsets are already supported and more will follow.

See doc/README.mb and doc/README.mb.jp for detailed instructions on how
to use the multibyte support. The file doc/README.locale contains
specific instructions on using the multibyte support with Cyrillic.


4. Credits

I'd like to thank the PostgreSQL development team and all contributors
for creating PostgreSQL. Thanks to Oleg Bartunov, Oleg Broytmann and
Tatsuo Ishii for opening the door into the multi-language world.

doc/README.locale

Lines changed: 13 additions & 1 deletion
@@ -1,5 +1,17 @@
 ===========
-14 Apr 1999
+1999 Jul 21
+===========
+
+Josef Balatka, <balatka@email.cz> asked us to remove RECODE and sent me
+a Czech ISO-8859-2 -> WIN-1250 translation table.
+RECODE is no longer limited to Cyrillic recoding and will stay in PostgreSQL.
+
+He also created some bits of documentation, mostly concerning RECODE -
+see README.Charsets.
+
+
+===========
+1999 Apr 14
 ===========
 
 Tatsuo Ishii <t-ishii@sra.co.jp> updated Multibyte support extending it

src/data/isocz-wincz.tab

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
#
# Czech ISO-8859-2 -> WIN-1250 translation table
#
165 188
169 138
171 141
174 142
181 190
185 154
187 157
190 158

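
As a sanity check, the pairs in this table can be compared with the
ISO-8859-2 and Windows-1250 codecs in Python's standard library. The
sketch below is an independent cross-check, not part of the commit. The
helper names via_codecs and check_pairs are hypothetical.

```python
# Pairs copied from src/data/isocz-wincz.tab (ISO-8859-2 -> WIN-1250).
PAIRS = [(165, 188), (169, 138), (171, 141), (174, 142),
         (181, 190), (185, 154), (187, 157), (190, 158)]


def via_codecs(byte_value):
    """Recode one ISO-8859-2 byte to Windows-1250 using Python's codecs."""
    return bytes([byte_value]).decode("iso8859_2").encode("cp1250")[0]


def check_pairs(pairs):
    """Return the pairs that disagree with Python's codec tables."""
    return [(src, dst) for src, dst in pairs if via_codecs(src) != dst]


if __name__ == "__main__":
    print(check_pairs(PAIRS))  # an empty list means every pair agrees
```

The eight entries cover the letters Ľ, Š, Ť, Ž and their lowercase
forms, which sit at different byte positions in the two encodings.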
