Lists: | pgsql-hackers |
---|
From: | Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | WIN32 pg_import_system_collations |
Date: | 2021-12-13 08:41:10 |
Message-ID: | CAC+AXB0WFjJGL1n33bRv8wsnV-3PZD0A7kkjJ2KjPH0dOWqQdg@mail.gmail.com |
I want to propose an implementation of pg_import_system_collations() for
WIN32 using EnumSystemLocalesEx() [1], which is available from Windows
Server 2008 onwards.
The patch includes a test emulating that of collate.linux.utf8, but for
Windows-1252. The main difference is that it doesn't have the tests for
Turkish dotted and undotted 'i', since that locale is WIN1254.
I am opening an item in the commitfest for this.
[1]
https://docs.microsoft.com/en-us/windows/win32/api/winnls/nf-winnls-enumsystemlocalesex
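As a rough illustration of how EnumSystemLocalesEx() hands locale names to a callback, here is a minimal sketch. It is not the code from the attached patch: the names `LocaleCount`, `note_locale`, `enum_locales_cb` and `count_system_locales` are hypothetical, and the choice of the `LOCALE_WINDOWS` flag is an assumption. The Windows-specific part is only compiled on Windows, and it needs headers built with `_WIN32_WINNT >= 0x0600` so that the function is declared.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical accumulator, passed to the callback through lParam. */
typedef struct LocaleCount
{
	size_t		total;			/* every name that was reported */
	size_t		with_hyphen;	/* names containing '-', e.g. L"en-NZ" */
} LocaleCount;

/* Portable part: record one locale name. */
void
note_locale(LocaleCount *lc, const wchar_t *name)
{
	assert(lc != NULL && name != NULL);

	lc->total++;
	if (wcschr(name, L'-') != NULL)
		lc->with_hyphen++;
}

#ifdef _WIN32
/* Callback with the shape EnumSystemLocalesEx() expects. */
static BOOL CALLBACK
enum_locales_cb(LPWSTR name, DWORD flags, LPARAM lparam)
{
	(void) flags;
	note_locale((LocaleCount *) lparam, name);
	return TRUE;				/* TRUE means "keep enumerating" */
}

/* Returns 1 on success, 0 if the enumeration failed. */
int
count_system_locales(LocaleCount *lc)
{
	return EnumSystemLocalesEx(enum_locales_cb, LOCALE_WINDOWS,
							   (LPARAM) lc, NULL) != 0;
}
#endif
```

A real importer would create a collation for each reported name instead of counting them; the counting here only keeps the sketch short.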
Regards,
Juan José Santamaría Flecha
Attachment | Content-Type | Size |
---|---|---|
0001-WIN32-pg_import_system_collations.patch | application/x-patch | 52.0 KB |
From: | Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: WIN32 pg_import_system_collations |
Date: | 2021-12-13 16:28:47 |
Message-ID: | CAC+AXB2Znciv-r9zorGBpTPsTkq1KJY33wrkhsy1QRJ+igUrKg@mail.gmail.com |
On Mon, Dec 13, 2021 at 9:41 AM Juan José Santamaría Flecha <
juanjo(dot)santamaria(at)gmail(dot)com> wrote:
Per patch tester.
> Regards,
>
> Juan José Santamaría Flecha
>
Attachment | Content-Type | Size |
---|---|---|
v2-0001-WIN32-pg_import_system_collations.patch | application/octet-stream | 52.0 KB |
From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: WIN32 pg_import_system_collations |
Date: | 2021-12-13 20:53:27 |
Message-ID: | CA+hUKG+=EYh1C33GTkPfx9OnDRfouneJLuJjr7tQkDVXjNtMhQ@mail.gmail.com |
On Tue, Dec 14, 2021 at 5:29 AM Juan José Santamaría Flecha
<juanjo(dot)santamaria(at)gmail(dot)com> wrote:
> On Mon, Dec 13, 2021 at 9:41 AM Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com> wrote:
> Per patch tester.
Hi Juan José,
I haven't tested yet but +1 for the feature. I guess the API didn't
exist at the time collation support was added.
+ /*
+ * Windows will use hyphens between language and territory, where ANSI
+ * uses an underscore. Simply make it ANSI looking.
+ */
+ hyphen = strchr(localebuf, '-');
+ if (hyphen)
+ *hyphen = '_';
+
This conversion makes sense, to keep the user experience the same
across platforms. Nitpick on the comment: why ANSI? I think we can
call "en_NZ" a POSIX locale identifier[1], and I think we can call
"en-NZ" a BCP 47 language tag.
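The conversion in the quoted hunk can be tried in isolation with a small standalone sketch. The helper name `bcp47_to_posix_style` is hypothetical and not part of the patch. Like the hunk, it replaces only the first hyphen, so a tag that also carries a script subtag, such as "sr-Latn-RS", keeps its second hyphen.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): rewrite the first '-' in a
 * BCP 47 style tag such as "en-NZ" to '_', giving the POSIX-looking
 * "en_NZ".  The buffer is modified in place.
 */
void
bcp47_to_posix_style(char *localebuf)
{
	char	   *hyphen;

	assert(localebuf != NULL);

	hyphen = strchr(localebuf, '-');
	if (hyphen)
		*hyphen = '_';
}
```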
+/*
+ * This test is for Windows/Visual Studio systems and assumes that a full set
+ * of locales is installed. It must be run in a database with WIN1252 encoding,
+ * because of the locales' encondings. We lose some interesting cases from the
+ * UTF-8 version, like Turkish dotted and undotted 'i' or Greek sigma.
+ */
s/encondings/encodings/
When would the full set of locales not be installed on a Windows
system, and why does this need Visual Studio? Wondering if this test
will work with some of the frankenstein/cross tool chains
(not objecting if it doesn't and could be skipped, just trying to
understand the comment).
Slightly related to this, in case you didn't see it, I'd also like to
use BCP 47 tags for the default locale for PostgreSQL 15[2].
[1] https://en.wikipedia.org/wiki/Locale_(computer_software)#POSIX_platforms
[2] https://www.postgresql.org/message-id/flat/CA%2BhUKGJ%3DXThErgAQRoqfCy1bKPxXVuF0%3D2zDbB%2BSxDs59pv7Fw%40mail.gmail.com
From: | Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: WIN32 pg_import_system_collations |
Date: | 2021-12-14 20:13:52 |
Message-ID: | CAC+AXB2Mvdn=2d0Vfgz6_SGmD=yTUEqdKRJ01BVfDAnZhW+CuA@mail.gmail.com |
On Mon, Dec 13, 2021 at 9:54 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> I haven't tested yet but +1 for the feature. I guess the API didn't
> exist at the time collation support was added.
Good to hear.
> This conversion makes sense, to keep the user experience the same
> across platforms. Nitpick on the comment: why ANSI? I think we can
> call "en_NZ" a POSIX locale identifier[1], and I think we can call
> "en-NZ" a BCP 47 language tag.
POSIX also works for me.
> When would the full set of locales not be installed on a Windows
> system, and why does this need Visual Studio? Wondering if this test
> will work with some of the frankenstein/cross tool chains
> (not objecting if it doesn't and could be skipped, just trying to
> understand the comment).
What I meant to say is that to run the test, you need a database that has
successfully run pg_import_system_collations. This would also be possible
with MinGW for _WIN32_WINNT >= 0x0600, but the current value in
src\include\port\win32.h is _WIN32_WINNT = 0x0501 when compiling with
MinGW.
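The version dependency can be pictured with a small sketch. Everything here is illustrative: the `SKETCH_` macros and both helper names are hypothetical, and treating an undefined `_WIN32_WINNT` as 0 is an assumption made only so the example compiles on any platform. The threshold 0x0600 corresponds to Windows Vista / Server 2008, the first release that provides EnumSystemLocalesEx().

```c
#include <assert.h>

/*
 * Assumption for illustration: if the build did not define _WIN32_WINNT
 * (for example off Windows), treat the target version as 0.
 */
#ifndef _WIN32_WINNT
#define SKETCH_WINNT 0
#else
#define SKETCH_WINNT _WIN32_WINNT
#endif

/* Windows Vista / Server 2008, where EnumSystemLocalesEx() first appears. */
#define SKETCH_MIN_WINNT 0x0600

/* Hypothetical helper: 1 if the given target version is new enough. */
int
winnt_supports_locale_enum(unsigned int winnt)
{
	/* _WIN32_WINNT values fit in 16 bits, e.g. 0x0501 or 0x0A00 */
	assert(winnt <= 0xFFFF);

	return winnt >= SKETCH_MIN_WINNT;
}

/* Hypothetical helper: was this file compiled for a new enough target? */
int
build_supports_locale_enum(void)
{
	return winnt_supports_locale_enum(SKETCH_WINNT);
}
```

With the MinGW value mentioned above (0x0501) the check is false, which is why the import function and its test would be unavailable in such a build.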
Regards,
Juan José Santamaría Flecha
From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: WIN32 pg_import_system_collations |
Date: | 2021-12-14 21:45:28 |
Message-ID: | CA+hUKGKHu5M9CzD+oueMfW6pr9g9koLaYio=b57QRjrFd1++zQ@mail.gmail.com |
On Wed, Dec 15, 2021 at 9:14 AM Juan José Santamaría Flecha
<juanjo(dot)santamaria(at)gmail(dot)com> wrote:
> What I meant to say is that to run the test, you need a database that has successfully run pg_import_system_collations. This would also be possible with MinGW for _WIN32_WINNT >= 0x0600, but the current value in src\include\port\win32.h is _WIN32_WINNT = 0x0501 when compiling with MinGW.
Ah, right. I hope we can make the leap to 0x0A00 (Win10) soon and
just stop thinking about these old ghosts, as mentioned by various
people in various threads. Do you happen to know if there are
complications for that, with the non-MSVC tool chains?
From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: WIN32 pg_import_system_collations |
Date: | 2021-12-15 02:52:12 |
Message-ID: | YblYXOuWyLVHopKD@paquier.xyz |
On Wed, Dec 15, 2021 at 10:45:28AM +1300, Thomas Munro wrote:
> Ah, right. I hope we can make the leap to 0x0A00 (Win10) soon and
> just stop thinking about these old ghosts, as mentioned by various
> people in various threads.
Seeing your message here.. My apologies for the short digression.
Would that mean that we could use CreateSymbolicLinkA() as a mapper
for pgreadlink() rather than junction points? I am wondering how much
code in src/port/ such a move could allow us to remove.
--
Michael
From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: WIN32 pg_import_system_collations |
Date: | 2021-12-15 04:03:30 |
Message-ID: | CA+hUKG+q6_wi5cXvc2m6HozdmvD2Xsx1AKerdG2MPSjdHrCbtg@mail.gmail.com |
On Wed, Dec 15, 2021 at 3:52 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> On Wed, Dec 15, 2021 at 10:45:28AM +1300, Thomas Munro wrote:
> > Ah, right. I hope we can make the leap to 0x0A00 (Win10) soon and
> > just stop thinking about these old ghosts, as mentioned by various
> > people in various threads.
>
> Seeing your message here.. My apologies for the short digression.
> Would that mean that we could use CreateSymbolicLinkA() as a mapper
> for pgreadlink() rather than junction points? I am wondering how much
> code in src/port/ such a move could allow us to do.
Sadly, (1) it wouldn't work unless running with a special privilege or
as admin, and (2) it wouldn't work on non-NTFS filesystems. I think
it's mostly intended to allow things like unpacking tarballs, checking
out git repos etc etc etc that came from Unix systems, which is why it
works with 'developer mode' enabled[1], though obviously it wouldn't
be totally impossible for us to require that privilege. Didn't seem
great to me, though, that's why I gave up on it over in
https://commitfest.postgresql.org/36/3090/ where this was recently
discussed.
[1] https://blogs.windows.com/windowsdeveloper/2016/12/02/symlinks-windows-10/
From: | Julien Rouhaud <rjuju123(at)gmail(dot)com> |
---|---|
To: | Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: WIN32 pg_import_system_collations |
Date: | 2022-01-19 09:53:12 |
Message-ID: | 20220119095312.b5ya3fvmmklnopmb@jrouhaud |
Hi,
On Mon, Dec 13, 2021 at 05:28:47PM +0100, Juan José Santamaría Flecha wrote:
> On Mon, Dec 13, 2021 at 9:41 AM Juan José Santamaría Flecha <
> juanjo(dot)santamaria(at)gmail(dot)com> wrote:
>
> Per patch tester.
This version doesn't apply anymore:
http://cfbot.cputube.org/patch_36_3450.log
=== Applying patches on top of PostgreSQL commit ID e0e567a106726f6709601ee7cffe73eb6da8084e ===
=== applying patch ./v2-0001-WIN32-pg_import_system_collations.patch
[...]
patching file src/tools/msvc/vcregress.pl
Hunk #1 succeeded at 153 (offset -1 lines).
Hunk #2 FAILED at 170.
1 out of 2 hunks FAILED -- saving rejects to file src/tools/msvc/vcregress.pl.rej
Could you send a rebased version? In the meantime I will switch the CF entry
to Waiting on Author.
From: | Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com> |
---|---|
To: | Julien Rouhaud <rjuju123(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: WIN32 pg_import_system_collations |
Date: | 2022-01-19 12:24:40 |
Message-ID: | CAC+AXB2zm9aBUJHyXU788DgH--aog8iDbwTzpqVOkBFjcNOWPA@mail.gmail.com |
On Wed, Jan 19, 2022 at 10:53 AM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> This version doesn't apply anymore:
Thanks for the heads up.
Please find attached a rebased patch. I have also rewritten some comments
to address previous reviews; the code and tests remain the same.
Regards,
Juan José Santamaría Flecha
Attachment | Content-Type | Size |
---|---|---|
v3-0001-WIN32-pg_import_system_collations.patch | application/octet-stream | 52.2 KB |
From: | Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru> |
---|---|
To: | Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: WIN32 pg_import_system_collations |
Date: | 2022-01-25 07:56:53 |
Message-ID: | aa8db1b7-b9e8-2ef3-407b-8575ae2e9f99@postgrespro.ru |
Hi Juan José,
I tested this feature a bit and have some small doubts about this block:
+/*
+ * Windows will use hyphens between language and territory, where POSIX
+ * uses an underscore. Simply make it POSIX looking.
+ */
+ hyphen = strchr(localebuf, '-');
+ if (hyphen)
+ *hyphen = '_';
After this block, the modified collation name is passed to
GetNLSVersionEx(COMPARE_STRING, wide_collcollate, &version)
(see the win32_read_locale() -> CollationFromLocale() -> CollationCreate()
call chain). Is it correct to call GetNLSVersionEx() with
(wide_collcollate = "en_NZ") instead of (wide_collcollate = "en-NZ")?
1) Documentation [1], [2], quote:
If it is a neutral locale for which the script is significant,
the pattern is <language>-<Script>.
2) Conversation [3], David Rowley, quote:
Then, since GetNLSVersionEx()
wants yet another variant with a - rather than an _, I've just added a
couple of lines to swap the _ for a -.
On my computer (Windows 10 Pro 21H2 19044.1466, MSVC2019 version
16.11.9) both variants ("en_NZ", "en-NZ") work correctly.
But David Rowley (MSVC2010 and MSVC2017) replaced "_" with "-"
for the same function. Maybe he had a problem with "_" on MSVC2010 or
MSVC2017?
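One way to avoid depending on which spelling a given Windows or MSVC version accepts would be to convert the name back to the hyphenated form just before the API call. The following is only a sketch of that idea under stated assumptions: the helper name `posix_style_to_bcp47` is hypothetical and appears in neither patch, and it changes only the first underscore, mirroring the single-hyphen replacement in the block above.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical helper (not from either patch): copy a collation name into
 * dst, replacing the first '_' with '-', so that L"en_NZ" becomes L"en-NZ".
 * Returns 0 on success, -1 if dst (dstlen wide characters) is too small.
 */
int
posix_style_to_bcp47(const wchar_t *src, wchar_t *dst, size_t dstlen)
{
	size_t		len;
	wchar_t    *underscore;

	assert(src != NULL && dst != NULL);

	len = wcslen(src);
	if (len + 1 > dstlen)
		return -1;

	wmemcpy(dst, src, len + 1);	/* includes the terminating L'\0' */
	underscore = wcschr(dst, L'_');
	if (underscore)
		*underscore = L'-';
	return 0;
}
```

On Windows, the converted buffer could then be passed as the locale name argument of GetNLSVersionEx(COMPARE_STRING, ...), while the catalog keeps the underscore spelling.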
[1]
https://docs.microsoft.com/en-us/windows/win32/api/winnls/nf-winnls-getnlsversionex
[2] https://docs.microsoft.com/en-us/windows/win32/intl/locale-names
[3]
https://www.postgresql.org/message-id/flat/CAApHDvq3FXpH268rt-6sD_Uhe7Ekv9RKXHFvpv%3D%3Duh4c9OeHHQ%40mail.gmail.com
With best regards,
Dmitry Koval.