Wikidata:Property proposal/FVDP Vietnamese dictionary ID

FVDP Vietnamese dictionary ID

Status: Under discussion
Description: entry for a lexeme in the Free Vietnamese Dictionary Project’s monolingual Vietnamese dictionary
Represents: FVDP Vietnamese dictionary (Q130812916)
Data type: External identifier
Example 1: xanh (L705061) → 573415
Example 2: gặp (L1011653) → 104184
Example 3: chuột (L1360864) → 61855
Formatter URL: https://www.informatik.uni-leipzig.de/~duc/TD/td/index.php?bpos=$1&db=vv
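
For illustration, the formatter URL expands by substituting the identifier for $1. The following is a minimal Python sketch of that substitution using the three proposed examples. The helper name `entry_url` is hypothetical, and the numeric-only check is an assumption based on the examples above, not a documented constraint.

```python
# Formatter URL as given in the proposal; "$1" stands for the identifier.
FORMATTER_URL = (
    "https://www.informatik.uni-leipzig.de/~duc/TD/td/index.php?bpos=$1&db=vv"
)

# Example identifiers from the proposal, keyed by lexeme ID.
EXAMPLES = {"L705061": "573415", "L1011653": "104184", "L1360864": "61855"}


def entry_url(identifier: str) -> str:
    """Hypothetical helper: fill the identifier into the formatter URL."""
    # Assumption: identifiers are plain ASCII digits, as in all three examples.
    if not (identifier.isascii() and identifier.isdigit()):
        raise ValueError(f"expected a numeric identifier, got {identifier!r}")
    return FORMATTER_URL.replace("$1", identifier)


for lexeme, identifier in EXAMPLES.items():
    print(lexeme, entry_url(identifier))
```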

Motivation

This property is proposed for use in references on Vietnamese lexemes, linking each lexeme to its entry in the dictionary. -عُثمان (talk) 22:26, 3 November 2024 (UTC)

Discussion

  •  Support Mahir256 (talk) 22:44, 3 November 2024 (UTC)
  •  Oppose I'm hesitant about claiming this ID means anything more than a very specific way to form a particular URL. I would support an external reference property about each of the FVDP dictionaries, but I think this property as proposed should be limited to qualifiers.

    The Free Vietnamese Dictionary Project (FVDP) consists of a DICT (Q977872) Web server and desktop client, software to generate compatible dictionaries, and a collection of precompiled dictionaries assembled by a long-gone group of volunteers. [1] The software is licensed as open source, but I no longer know where to find the source code. The provided dictionaries are available in two formats: StarDict Info (Q105858121) (which can be used with any compatible client and server) and a custom format specific to this client and server. [2]

    This proposal relies on the FVDP Web server's bpos URL query parameter, which indicates the byte offset of the entry within the dictionary's index file (in which each entry is listed alphabetically, separated by 8 bytes). Specifically, it assumes the "DE1" server, one of two DICT servers that the author Hồ Ngọc Đức (Q102291268) runs out of the University of Leipzig. If you plug the same byte offset into a different server, it will likely return a different entry. For example, xanh (L705061) is 573415 on "DE1" but 573297 on "DE3" (which is currently malfunctioning). Another popular instance ("US2") no longer exposes bpos at all.

    As I understand it, the purpose of this property is to durably link to an entry in the dictionary from a lexeme that inherently pertains to a specific word. The differences in byte offsets between servers illustrate that the offset is not an inherent property of a dictionary entry. The offsets have changed over time for a variety of reasons, such as the addition of more "00" front-matter entries and the deletion of duplicate entries. Moreover, the byte offset doesn't seem to be useful for offline distributions of this content. I think a primary external reference should follow wikt:Template:R:FVDP and its translations, which set the word parameter to the word itself. This would be a good way to indicate that the dictionary spells hóa/hoá differently in hóa đơn versus hoá nhi for no particular reason.

     – Minh Nguyễn 💬 18:19, 10 November 2024 (UTC)
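
The byte-offset concern raised in the comment above can be illustrated with a toy model. The sketch below is not the FVDP index format, which is not documented here. It assumes, purely for illustration, an index made of UTF-8 headwords in sorted order with 8 filler bytes after each one, and a hypothetical front-matter entry named `00-about`. It shows only that inserting one entry near the start shifts the offset of every later entry, which is why an offset recorded against one build of the index need not match another.

```python
# Assumption: 8 filler bytes follow each headword in this toy index.
SEPARATOR_LENGTH = 8


def byte_offsets(headwords):
    """Map each headword to its byte offset in a toy, sorted index."""
    offsets = {}
    position = 0
    for word in sorted(headwords):
        offsets[word] = position
        position += len(word.encode("utf-8")) + SEPARATOR_LENGTH
    return offsets


old_index = ["chuột", "gặp", "xanh"]
# Hypothetical front-matter entry; it sorts before every other headword.
new_index = ["00-about"] + old_index

before = byte_offsets(old_index)
after = byte_offsets(new_index)

# Every pre-existing entry moves by the size of the inserted record.
shifts = {word: after[word] - before[word] for word in old_index}
print(shifts)
```

A reference keyed on the headword itself would be unaffected by such an insertion, since the headword does not change when the index is rebuilt.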