
Cleaning up request templates


I went through and made a list of the request templates listed in WT:Templates with current language parameter, and found all the aliases and uses of them. Most of the aliases were obscure, hardly used and undocumented, so I went ahead and orphaned and deleted them. Some, however, need discussion:

Aliased template | Canonical template | #Uses | Comments | Outcome
Template:tea room | Template:tea room | 100 (including the 69 uses of {{rft}}, hence only 31 uses of {{tea room}}) | Maybe replace with {{rft}} or {{tea}} (see also barely used template {{beer}}) | Kept as canonical.
Template:rft | Template:tea room | 69 | Maybe this should be canonical? | Deleted.
Template:tea | Template:tea room | (unused, but mentioned in Template:tea room) | Maybe this should be canonical? Although it's quite short for a not-very-used template. | Deleted.
Template:rfdef | Template:rfdef | 65117 (including the 18481 uses of {{defn}}, hence 46636 uses under this name) | Canonical name. | Kept.
Template:defn | Template:rfdef | 18481 | Clearly not a good name, but has many uses; I propose orphaning and deprecating it, rather than deleting it. | Orphaned in favor of {{rfdef}}.
Template:request for etymology | Template:request for etymology | 34200 (including the 34194 uses of {{rfe}}, hence only 6 uses under the longer name) | Canonical name. I propose deleting this name in favor of {{rfe}}. | Deleted in favor of {{rfe}}.
Template:rfe | Template:request for etymology | 34194 | I propose making this the canonical and only name. | Kept.
Template:request for references | Template:request for references | 21 (including the 20 uses of {{rfr}}, hence only 1 use under the longer name) | Canonical name. I propose deleting this name in favor of {{rfr}}. | Deleted in favor of {{rfref}}.
Template:rfr | Template:request for references | 20 | I propose making this the canonical and only name. | Deleted in favor of {{rfref}}.
Template:rfv-etymology | Template:rfv-etymology | 681 (681 - 46 = 635 uses under this name) | This is more-used than {{rfv-etym}}, but I prefer the latter, shorter name and propose orphaning and eliminating this template in favor of {{rfv-etym}}. | Deleted in favor of {{rfv-etym}}.
Template:rfv-etym | Template:rfv-etymology | 46 | I propose making this the canonical and only name. | Kept.
Template:rfv-pronunciation | Template:rfv-pronunciation | 164 | This is more-used than {{rfv-pron}}, but I prefer the latter, shorter name and propose orphaning and eliminating this template in favor of {{rfv-pron}}. | Deleted in favor of {{rfv-pron}}.
Template:rfv-pron | Template:rfv-pronunciation | 6 | I propose making this the canonical and only name. | Kept.
Template:sense stub | Template:sense stub | 836 (including the 446 uses of {{rfgloss}}, hence 390 uses under this name) | Canonical name. I propose orphaning and eliminating this template in favor of {{rfgloss}}. | Deleted in favor of {{rfclarify}}.
Template:stub-gloss | Template:sense stub | (none; formerly 3) | I already orphaned this in favor of {{rfgloss}} and deleted it. | Deleted.
Template:rfgloss | Template:sense stub | 446 | I propose making this the canonical and only name. | Deleted in favor of {{rfclarify}}.
Template:gloss-stub | Template:sense stub | (none; formerly 122) | I already orphaned this in favor of {{rfgloss}} and deleted it. | Deleted.

There are about 35 or 40 request templates, the vast majority of which begin with rf followed by an abbreviation or short word. I propose bringing the remainder under this scheme. There are two cases where the canonical name is long ({{request for etymology}} and {{request for references}}), but in both cases the long names are rarely used, and the shorter versions {{rfe}} and {{rfr}} are almost always found. I propose eliminating the long names in favor of the short names, consistent with the other request templates. Similarly, I propose eliminating {{sense stub}} (a misnomer in any case, as this template concerns glosses of foreign terms) in favor of {{rfgloss}}. I'm not quite sure what to do with {{tea room}} vs. {{rft}} vs. {{tea}}. Maybe {{rft}} should be canonical for consistency with the other request templates, and because it's the most used.

NOTES:

  1. My threshold for deletion vs. deprecation is usually 1000 uses.
  2. In the table above, the counts for canonical names include uses under all aliases.

Benwing2 (talk) 01:37, 1 October 2019 (UTC)[reply]
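As a rough illustration of the two notes above, here is a minimal Python sketch that backs out the uses under a canonical name from the table's totals and applies the 1000-use threshold. The function names and the exact comparison (fewer than 1000 uses means delete) are assumptions made for illustration; the counts are the ones in the table.

```python
# Illustrative only: helper names and the "< 1000" comparison are assumptions.
DEPRECATION_THRESHOLD = 1000  # note 1: usual cutoff between deleting and deprecating


def uses_under_own_name(total_uses, alias_uses):
    """Note 2: a canonical total includes all aliases, so subtract them."""
    return total_uses - sum(alias_uses)


def suggested_action(uses):
    """Rarely used names are deleted outright; heavily used ones are deprecated."""
    return "delete" if uses < DEPRECATION_THRESHOLD else "deprecate"


# Figures taken from the table above.
rfdef_own = uses_under_own_name(65117, [18481])   # {{rfdef}} minus {{defn}}
tea_room_own = uses_under_own_name(100, [69])     # {{tea room}} minus {{rft}}
sense_stub_own = uses_under_own_name(836, [446])  # {{sense stub}} minus {{rfgloss}}

print(rfdef_own, tea_room_own, sense_stub_own)
print(suggested_action(18481))  # {{defn}}
print(suggested_action(6))      # {{request for etymology}} under its long name
```

The threshold is described as a usual rule of thumb, so a real decision would still be made case by case.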

This seems like a worthwhile effort and the specific proposals seem good. Could consideration be given to having the canonical templates default to a 'lite' display? Compare {{rfelite}} and {{rfe}}. Very few requests really warrant the warning function of the large request display boxes. Requiring more typing (longer alias or switch) to display the larger size seems appropriate to me. DCDuring (talk) 02:34, 1 October 2019 (UTC)[reply]
@DCDuring I agree with you that the big boxes are annoying. How should we proceed? One way is just to change the format of {{rfe}} and related templates that display a big box (e.g. {{rfp}}, {{rfap}}) to use the "lite" display by default, and take a |big=1 param to display a big box instead. This would change the way a lot of pages look though, so I'd want to make sure others are in agreement. Another possibility is to make the same change but also bot-add |big=1 everywhere so that the display doesn't change. Benwing2 (talk) 14:22, 1 October 2019 (UTC)[reply]
Whatever is acceptable. We are still "under construction", but the big boxes don't much help attract contributors at this point. The option of "big=1" will probably have its uses. DCDuring (talk) 14:32, 1 October 2019 (UTC)[reply]
I expect that box=1 will convey the intention more clearly.  --Lambiam 15:32, 1 October 2019 (UTC)[reply]
I also support lite boxes by default. Ultimateria (talk) 17:38, 1 October 2019 (UTC)[reply]
{{rfgloss}} is a bad name to distinguish from {{rfdef}}, because there is no reason why the latter couldn’t be called the former. {{rfclarify}} looks good. The alias {{defn}} for {{rfdef}} is gross.
Request for etymology templates: Note that the text in |2= in {{rfe}} has been displayed on pages so far. It often contains hypotheses or speculations which have been consciously made visible and are better than nothing until the etymology is cleared up by someone who knows better, often after years. Whatever you change, don’t change the visibility of that text, on which editors have relied.
Tea room: Perhaps {{tea}} and {{tea sense}}
References: {{rfref}} better. You know as a programmer how common the clipping “ref” is, so this is catchy.
Request for verification of pronunciation: Not sure if {{rfv-pron}} is good, it’s a bit lewd, and maybe analogy makes {{rfvp}} (or {{rfv-ipa}}?) better. Apart from that, verification of pronunciation meets only limited resources, so I wonder why anyone would want to use the template or what he would gain with it. Fay Freak (talk) 15:35, 1 October 2019 (UTC)[reply]
I agree with most of Benwing's suggestions, and with Fay Freak that {{rfref}} is better than {{rfr}}. "pron" seems fine to me; we use it in plenty of templates and modules already. Ultimateria (talk) 17:38, 1 October 2019 (UTC)[reply]
I think I'll go with {{rfref}} and {{rfclarify}} per Fay Freak, {{rfv-pron}} per Ultimateria and maybe {{rft}} for Tea Room links (but not sure about the last). I agree with Lambiam about |box=1. Benwing2 (talk) 00:46, 2 October 2019 (UTC)[reply]
I vote for {{tea room}}; {{rft}} confuses me because it looks like "request for tea", or for something else, like translation or transliteration. — Eru·tuon 01:29, 2 October 2019 (UTC)[reply]
@Erutuon There's also {{rft-sense}}; what would you call that? Benwing2 (talk) 01:45, 2 October 2019 (UTC)[reply]
The text for {{rft-sense}} says "Discuss this sense" so we could call it {{rfdiscuss}} or something. Note that there are only 17 uses so it could even be transformed into a param of some other template. Benwing2 (talk) 01:47, 2 October 2019 (UTC)[reply]
It isn’t a request for anything specific, but rather an invitation, and gets removed even if no contribution has taken place after some time, or no? Hence I suggested {{tea}} and {{tea sense}}. Maybe call it {{invitation for tea}} and {{invitation for tea-sense}}. Fay Freak (talk) 02:55, 2 October 2019 (UTC)[reply]
Maybe {{tea room}} and {{tea room sense}}? Benwing2 (talk) 04:50, 2 October 2019 (UTC)[reply]
None of the tea room templates get regularly or systematically removed by anyone. DTLHS (talk) 05:18, 2 October 2019 (UTC)[reply]
@Benwing2: Here are my suggestions (I wanted to submit these but forgot to do so):
  1. {{tea room}} - Keep as canonical. Rename rest.
  2. {{rfdef}} - Keep as canonical.
    {{defn}} is used primarily in Han character entries and can be converted to {{rfdef|lang}} without the sort key. {{defn|cmn|sort=}} and {{defn|yue|sort=}} can both be converted to {{rfdef|zh}} due to unified Chinese.
  3. {{request for etymology}} - Convert to {{rfe}}, as suggested.
  4. {{request for references}} - {{rfref}} is much better
  5. {{rfv-etymology}} - Convert to {{rfv-etym}}, as suggested.
  6. {{rfv-pronunciation}} - Convert to {{rfv-pron}}, as suggested.
  7. {{sense stub}} - Convert to {{rfgloss}}, as suggested. KevinUp (talk) 10:22, 2 October 2019 (UTC)[reply]
Updated thoughts:
  1. I've noticed that posting directly at the Beer Parlour and Tea Room attracts more traffic than using templates such as {{tea room}} or {{beer}}. Do we really need these templates? I'm not keen on seeing these boxes because they appear to be slightly intrusive and need to be removed from time to time.
  2. I found only 146 uses of {{defn}} in Latin script entries compared to 13600 uses of {{defn}} in Han script character entries. I noticed that WingerBot has already converted {{defn|cmn|sort=}} and {{defn|yue|sort=}} to {{rfdef|cmn|sort=}} and {{rfdef|yue|sort=}} (with the sort key), but both of these can actually be converted to {{rfdef|zh}} without the sort key because sorting is now done automatically via Module:zh-sortkey/data.
  3. Matter resolved.
  4. Matter resolved.
  5. Matter resolved.
  6. Matter resolved.
  7. {{rfclarify}} is indeed much better compared to {{rfgloss}}. KevinUp (talk) 10:22, 2 October 2019 (UTC)[reply]
@KevinUp I've already converted over half of the {{defn}} templates to {{rfdef}}. I'll let it finish now, and do a separate pass sometime later to clean up the sort keys and unify the Chinese entries, as you suggest. For which languages can the sort key be unilaterally removed? You mention cmn, yue, zh; what about ko, ja, vi, ...? Benwing2 (talk) 11:04, 2 October 2019 (UTC)[reply]
@Benwing2: The sort keys for cmn, yue, ja, ko, vi, zh for {{rfdef}} involving entries in Category:Han script characters can be removed. I've confirmed this in this edit. KevinUp (talk) 21:55, 2 October 2019 (UTC)[reply]
@KevinUp: Just to confirm, is it okay that {{rfdef}} doesn't add a radical–stroke sortkey for cmn, ja, ko, yue if |sort= isn't provided? Here are testcases for this in Special:ExpandTemplates. — Eru·tuon 22:21, 2 October 2019 (UTC)[reply]
Okay. Turns out I was wrong. Vietnamese and Chinese do not require the sort key, but Korean and Japanese require the sort key to work properly. cmn and yue also need the sort key, but they are to be converted into {{rfdef|zh}}, which does not need the sort key. Thanks for pointing this out. KevinUp (talk) 22:40, 2 October 2019 (UTC)[reply]
@KevinUp Are we sure we want to convert occurrences of cmn, yue and hak to zh? See for example , which has occurrences of {{rfdef}} in separate "Mandarin" and "Cantonese" sections. Once we convert, the section headers will disagree with the language code of {{rfdef}}. Following is a table of the occurrences of various languages in {{rfdef}} usages that were converted from {{defn}}:
8508 ja
7216 ko
6689 cmn
5579 vi
4729 yue
27 zh
1 hak
Benwing2 (talk) 02:36, 3 October 2019 (UTC)[reply]
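The cleanup pass discussed above could be sketched roughly as follows in Python. This is not WingerBot's actual code: the regex, the function name, and the sample sort key are assumptions for illustration. It renames {{defn}} to {{rfdef}} and drops |sort= only for the codes said not to need it (vi, zh); cmn and yue are left untouched, since converting them to zh is still being questioned.

```python
import re

# Codes for which the sort key is reported to be unnecessary (vi, zh).
# ja and ko are said to still need it; cmn and yue are left alone here because
# converting them to zh is still under discussion.
SORT_KEY_NOT_NEEDED = {"vi", "zh"}

# Matches {{defn|LANG}} or {{rfdef|LANG}} with an optional |sort=... parameter.
TEMPLATE_RE = re.compile(
    r"\{\{(?:defn|rfdef)\|(?P<lang>[a-z-]+)(?P<sort>\|sort=[^|{}]*)?\}\}"
)


def clean_request(wikitext):
    """Rename {{defn}} to {{rfdef}} and drop sort keys that are not needed."""

    def repl(match):
        lang = match.group("lang")
        sort = match.group("sort") or ""
        if lang in SORT_KEY_NOT_NEEDED:
            sort = ""
        return "{{rfdef|" + lang + sort + "}}"

    return TEMPLATE_RE.sub(repl, wikitext)


# "金08" is a made-up sort key used only for this example.
print(clean_request("# {{defn|vi|sort=金08}}"))
print(clean_request("# {{defn|ja|sort=金08}}"))
```

Keeping the language set in one place makes it easy to adjust once the question about cmn, yue and hak is settled.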

Merging {{rfe}}, {{rfelite}}, {{etystub}}


These three templates do similar things and it's not obvious to me they deserve to be separate. Here's an example (from the Moksha синь (śiń) page):

1. Using {{rfe}}:

Cognates include Erzya сынь (siń), perhaps cognate with Northern Sami sii. (This etymology is missing or incomplete. Please add to it, or discuss it at the Etymology scriptorium.)

2. Using {{rfe|inline=yes}}:

Cognates include Erzya сынь (siń), perhaps cognate with Northern Sami sii.

This etymology is missing or incomplete. Please add to it, or discuss it at the Etymology scriptorium.

3. Using {{rfelite}}:

Cognates include Erzya сынь (siń), perhaps cognate with Northern Sami sii. (This etymology is missing or incomplete. Please add to it, or discuss it at the Etymology scriptorium.)

4. Using {{etystub}}:

Cognates include Erzya сынь (siń), perhaps cognate with Northern Sami sii. This etymology is incomplete. You can help Wiktionary by elaborating on the origins of this term.

User:DCDuring suggests that the default display box of {{rfe}} is too glaring, and I agree. I suggest that we synthesize the three wordings in some fashion and standardize on {{rfe}}, which is changed to display the text inline unless |box=1. Note that currently {{rfe}} has about 34,000 uses while {{rfelite}} and {{etystub}} have around 4,000 each, so standardizing on {{rfe}} will be the least disruptive (as well as the shortest name). Benwing2 (talk) 03:46, 2 October 2019 (UTC)[reply]

The wording for etystub is different because it is used where there is some etymological information, but arguably not a complete one. Also whatever template is used for an incomplete etymology, it should probably appear after the existing, incomplete etymology and a space appearing as follows:
Cognates include Erzya сынь (siń), perhaps cognate with Northern Sami sii.
This etymology is incomplete. You can help Wiktionary by elaborating on the origins of this term.
Otherwise the request is visually lost, IMO. DCDuring (talk) 04:32, 2 October 2019 (UTC)[reply]
@DCDuring I'm fine with putting the request on a separate line. But there's absolutely no consistency in the use of {{rfe}}, {{rfelite}} and {{etystub}}; people aren't observing the distinction between missing and incomplete etymologies. The Moksha page I reference above, for example, uses {{rfe}} with an incomplete etymology. I suggest that we choose some wording that works for both missing and incomplete etymologies, maybe like this:
This etymology is missing or incomplete. Please add to it, or discuss it at the Etymology scriptorium.
Benwing2 (talk) 04:39, 2 October 2019 (UTC)[reply]
It seems to me that there are differences that deserve to be recognized among missing etymology ({{rfe}}), incomplete etymology ({{etystub}}), and one that is being challenged ({{rfv-etym}}). I find that sometimes smaller tasks, like completing an etymology, fit my mood, sharpness, and energy level better than possibly larger tasks like providing a complete etymology. I also note that some users fail to remove these etymologies, even when the etymologies seem pretty good to me. Is it carelessness or are they requesting that someone else review it? I doubt that all such etymologies should go to the Etymology Scriptorium. It also seems like a task that requires a higher level of skill.
If some users use the wrong one for the situation, then someone seeing it can and should correct it. Some such needed corrections could be identified by regex searches.
I guess I am thinking that these tags should fit it into a more refined workflow than has been customary, probably requiring more maintenance categories. The implementation would ideally not require less-frequent users to learn brand-new ways of doing things, but would allow frequent and skilled users to work more effectively. My perceptions may just be wrong about this, but I'd like to hear thoughts of others about workflow. DCDuring (talk) 05:09, 2 October 2019 (UTC)[reply]
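One way such a regex search might look, as a minimal Python sketch: it flags Etymology sections where {{rfe}} sits alongside other text, which suggests the etymology is incomplete rather than missing. The heading pattern, the function name, and the rule that any leftover text counts as an existing etymology are simplifying assumptions.

```python
import re

# An Etymology heading (optionally numbered) and the body up to the next heading.
SECTION_RE = re.compile(
    r"^=+\s*Etymology(?: \d+)?\s*=+\s*\n(?P<body>.*?)(?=^=|\Z)",
    re.MULTILINE | re.DOTALL,
)
RFE_RE = re.compile(r"\{\{rfe\|[^{}]*\}\}")


def rfe_with_existing_text(wikitext):
    """Return bodies of Etymology sections that have {{rfe}} plus other text."""
    hits = []
    for match in SECTION_RE.finditer(wikitext):
        body = match.group("body")
        # Flag the section only if something remains once {{rfe}} is removed.
        if RFE_RE.search(body) and RFE_RE.sub("", body).strip():
            hits.append(body.strip())
    return hits


page = (
    "==Moksha==\n\n===Etymology===\n"
    "Cognates include Erzya сынь. {{rfe|mdf}}\n\n===Pronoun===\nсинь\n"
)
for hit in rfe_with_existing_text(page):
    print(hit)
```

A list produced this way would still need human review, since the leftover text may be a comment or a stray template rather than a partial etymology.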
@DCDuring I see your point, although I still have the feeling that the current usage of {{rfe}} and {{etystub}} is a total mess. If you want to maintain this distinction, we should definitely rename {{etystub}}, as its current name gives no clear indication that it's for partial etymologies (in fact to me, "stub" suggests there's basically nothing there). Perhaps {{rfe-partial}} would be a clearer name. Benwing2 (talk) 06:19, 2 October 2019 (UTC)[reply]
The main difference between {{rfe}} and the other related templates is that an additional line of text can be displayed after the language code (Particularly: “...“) but this is sometimes misused to include possible theories and incorrect information that are more suitable for discussion pages.
As with {{tea room}} and {{beer}}, I don't think {{rfe}} actually attracts user traffic to the Etymology scriptorium. I would prefer for the box format to be replaced by short inline sentences and the particularly: “...“ statements completely hidden or moved to the talk page.
{{rfelite}} displays a short, clean sentence. I prefer this format as it can be added to almost any lemma entry that lacks an etymology header. Whenever I find {{rfe|lang}} without additional "particularly ..." statements, I would convert the template to {{rfelite|lang}} which is less intrusive. KevinUp (talk) 10:22, 2 October 2019 (UTC)[reply]
  1. {{rfe}} - Keep the name, but the current format needs to be revamped.
  2. {{rfe|inline=yes}} - Much better and less intrusive. Can be used without the |inline=yes parameter to replace the current boxed format. However, it is a bit wordy. I prefer the statement from {{rfelite}}.
  3. {{rfelite}} - Possibly delete after merging with {{rfe}}.
  4. {{etystub}} I would say keep. This template can be used to indicate that the information is incomplete, e.g. some intermediate ancestors were skipped. This template categorizes entries into Category:Requests for expansion of etymologies by language. KevinUp (talk) 10:22, 2 October 2019 (UTC)[reply]
@KevinUp If you want to keep {{etystub}}, what do you think about an alternative name like {{rfe-expand}}? It's 3 more letters to type but I think it much more clearly describes the intention (request for expansion of an existing etymology). Benwing2 (talk) 11:08, 2 October 2019 (UTC)[reply]
@Benwing2: Good idea. {{etystub}} was meant to be similar with {{sense stub}}. Using {{rfe-expand}} for categorization into "Category:Requests for expansion" is a fitting choice. KevinUp (talk) 21:55, 2 October 2019 (UTC)[reply]
Could we add a switch to rfe to request a review for an etymology without forcing it to go to WT:ES? Presumably the switch could put the L2 (or L3?) in a maintenance category. DCDuring (talk) 15:00, 2 October 2019 (UTC)[reply]
@KevinUp Another possible name is {{rfe-exp}}, which is the same length as {{etystub}} and is analogous to the existing {{rfexp}} ("request for expansion"). Thoughts? Benwing2 (talk) 06:36, 3 October 2019 (UTC)[reply]
@Benwing2: I think {{rfe-expand}} would be better as the canonical name because "exp" can be interpreted as experimental, experience, etc. Anyway, a shortcut can still be created to redirect {{rfe-exp}} to {{rfe-expand}}. KevinUp (talk) 19:34, 20 October 2019 (UTC)[reply]
@DCDuring Sure, we can add that switch. Although, I don't think there's a problem with mentioning the WT:ES by default as an option (not something forced), using text like this (duplicated above):
This etymology is missing. Please add to it, or discuss it at the Etymology scriptorium.
I think this text is succinct enough to be the replacement text for both {{rfe}} and {{rfelite}}. Benwing2 (talk) 06:39, 3 October 2019 (UTC)[reply]
OK, but it would be helpful if regulars at WT:ES chimed in. DCDuring (talk) 15:10, 3 October 2019 (UTC)[reply]
As a regular user of those templates, I don't have a problem with a complete merge. Canonicalization (talk) 19:29, 3 October 2019 (UTC)[reply]
@DCDuring, KevinUp, Canonicalization I switched {{rfe}} to display inline unless |box=1. If you specify |noes=1, it suppresses the reference to the Etymology scriptorium. You can see the various possibilities at User:Benwing2/test-rfe. If the display is acceptable, I will redirect {{rfelite}} to {{rfe}}. Benwing2 (talk) 18:33, 20 October 2019 (UTC)[reply]
The current display without the box is much better compared to the previous display. I would suggest for the default output of {{rfe|LANG}} to become "You can help Wiktionary by providing a proper etymology." if no additional parameters are specified.
I would also suggest for the displayed message to become "This etymology is missing. Please add to it, or discuss it at the Etymology scriptorium." if |es=1 is specified or "This etymology is missing or incomplete. Please discuss it at the talk page, particularly "..."." if |2= or |talk=1 is specified.
This is because some editors use talk pages to discuss word origins rather than the etymology scriptorium. I'm also good with the current display. Anyway, {{rfelite}} can be deprecated. KevinUp (talk) 19:34, 20 October 2019 (UTC)[reply]
@KevinUp Thanks for your comments. I'm fine with changing the wording but your suggestions seem a bit odd in that using |es=1 completely changes the wording in ways that aren't obviously related to the presence or absence of "Etymology scriptorium". Benwing2 (talk) 21:28, 20 October 2019 (UTC)[reply]
@Benwing2 Could we add "or incomplete" after "This etymology is missing", as you suggested above? Canonicalization (talk) 14:33, 16 November 2019 (UTC)[reply]

말#Etymology 4 is jarring. Can the box be automatically restored if the string is too long? —Suzukaze-c 17:49, 22 October 2019 (UTC)[reply]

@Suzukaze-c Apologies for the delay, I made the box the default if the string is >= 100 chars. Benwing2 (talk) 16:06, 16 November 2019 (UTC)[reply]
It's fine. I hope it isn't too expensive. —Suzukaze-c 16:59, 16 November 2019 (UTC)[reply]
@Suzukaze-c Not expensive, equivalent to one call to mw.ustring.len(). Benwing2 (talk) 04:21, 17 November 2019 (UTC)[reply]
{{rfelite}} is deprecated. Benwing2 (talk) 02:34, 25 November 2019 (UTC)[reply]
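The display rules described in this thread can be summarized in a short Python sketch. It is only an approximation of the behaviour reported above, not the template's actual implementation: the function names, the wording used when |noes=1 is set, and the way the |2= text is appended are assumptions. Python's len() counts code points, much as mw.ustring.len() does.

```python
BOX_LENGTH_THRESHOLD = 100  # the box becomes the default at >= 100 characters


def use_box(text="", box=False):
    """Decide between the boxed and the inline display."""
    return box or len(text) >= BOX_LENGTH_THRESHOLD


def request_message(text="", noes=False):
    """Build the request wording; the no-scriptorium variant is a guess."""
    message = "This etymology is missing or incomplete. Please add to it"
    if not noes:
        message += ", or discuss it at the Etymology scriptorium"
    message += "."
    if text:
        # How the |2= text is shown is assumed here.
        message += ' Particularly: "' + text + '"'
    return message


print(use_box())                   # short or empty text, no |box=1
print(use_box("x" * 100))          # long text restores the box
print(request_message(noes=True))  # |noes=1 drops the scriptorium mention
```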

Templates for coined words


I couldn't find an existing template that expresses that a word was coined by a certain person. Especially for newer vocabulary (neologisms), this information may be well documented and worth presenting under the Etymology section (not just under a quotation). I'm thinking of implementing such a template - perhaps {{coinage}} or {{coined}} that would take

  1. the person who coined it
  2. (optionally) whether to add a Wikipedia link for that person (and the language to add it in)
  3. (optionally) the date of coinage, if known.

In addition, the template could then be used to categorize entries under the person who coined them, which could be particularly useful for some European languages (such as Finnish) where words have been coined in larger numbers by some people, such as Elias Lönnrot, who put together a very sizable Finnish-Swedish dictionary and coined a lot of the Finnish words himself, many of which are still used today.

My initial proposal is {{coinage|fi|Elias Lönnrot|w=en}} showing up as "Coined by Elias Lönnrot." (with a nodot and year/date option available as well) and adding the page into a category called "Category:Finnish words coined by Elias Lönnrot" (which also needs some kind of category structure). Any thoughts? — surjection?13:29, 2 October 2019 (UTC)[reply]

I now have an initial version available in my userspace. Some small changes: the date parameter exists and is called "in", so {{coinage|und|coiner|in=year}}, and w now also accepts the article name if given in the format lang:Article name. — surjection?15:11, 2 October 2019 (UTC)[reply]
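To make the proposal concrete, here is a minimal Python sketch of what such a template might produce. It is not the userspace implementation: the function, the language-name lookup, the link syntax, and the wording when a date is given are all assumptions. The date parameter is called year here because in is a reserved word in Python.

```python
# Hypothetical lookup; a real template would take names from language data.
LANGUAGE_NAMES = {"fi": "Finnish", "en": "English"}


def coinage(lang, coiner, w=None, year=None, nodot=False):
    """Return (display text, category) for a {{coinage}}-style call."""
    if w is None:
        name = coiner
    elif ":" in w:
        # "lang:Article name" form of the w parameter
        wiki_lang, article = w.split(":", 1)
        name = "[[w:%s:%s|%s]]" % (wiki_lang, article, coiner)
    else:
        # bare language code: link to the article named after the coiner
        name = "[[w:%s:%s|%s]]" % (w, coiner, coiner)
    text = "Coined by " + name
    if year:
        text += " in " + str(year)  # assumed wording for the date
    if not nodot:
        text += "."
    category = "Category:%s words coined by %s" % (LANGUAGE_NAMES[lang], coiner)
    return text, category


text, category = coinage("fi", "Elias Lönnrot", w="en")
print(text)
print(category)
```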
See Wiktionary:Beer parlour/2017/December#Template for coinages. —Μετάknowledgediscuss/deeds 15:56, 2 October 2019 (UTC)[reply]
As well as Wiktionary:Tea room/2018/January § Netflix and chill. Canonicalization (talk) 16:01, 2 October 2019 (UTC)[reply]
I've taken a look at both - because of them, I changed the categorization structure a bit, so that there are now both "(language) coinages" and "(language) terms coined by (coiner)" categories, and the latter can be disabled separately. — surjection?16:15, 2 October 2019 (UTC)[reply]
Although I don't feel that the parameter name litecat is the best, so I'm open to suggestions. — surjection?16:30, 2 October 2019 (UTC)[reply]
We'll need a strict definition in our glossary, copied over to the template documentation, and perhaps even have the word "Coined" in the template display link to the glossary entry. Obviously, first attestations are not necessarily coinages, but we also need to clarify the status of words like weeaboo (coined as a nonce word, but used with an unrelated meaning), flan (a slip of the tongue humorously repurposed as a coinage), Imogen (a misprint later accepted as a coinage), and cdesign proponentsist or medireview (misprints humorously repurposed as a coinage). —Μετάknowledgediscuss/deeds 18:42, 2 October 2019 (UTC)[reply]
I have added a link to Appendix:Glossary#coinage, but the entry needs to be defined. I feel the definition should take into account that the word was intentionally created (not just first written down) by a person or another entity in order to describe something definite. — surjection?19:02, 2 October 2019 (UTC)[reply]
I support the creation of this template. I think it can be used to categorize entries in Category:English terms first attested in Shakespeare. My only concern is the categorization of entries into "Category:English words coined by author XX". Will this be done automatically or manually? I've seen coinages that were eventually deleted due to lack of widespread use. I think categorization for specific authors can be manually added to the module/template, rather than appear automatically, to prevent incorrect categories from popping up. KevinUp (talk) 21:55, 2 October 2019 (UTC)[reply]
Categorization for specific coiners is done automatically, but can be disabled with a certain parameter given to the template. I don't see how the "lack of widespread use" really affects anything here, as the creation of this template wouldn't somehow subvert WT:CFI. — surjection?22:23, 2 October 2019 (UTC)[reply]
Just a note that most words first attested in Shakespeare are not thought to have been coined by him, so this would not change that category at all. The idea of having the module know a list of people to make categories for is an appealing way to avoid lots of categories with one entry, though. —Μετάknowledgediscuss/deeds 22:27, 2 October 2019 (UTC)[reply]
A more flexible approach is to occasionally check for such categories with only one entry by using a bot or the like. — surjection?22:36, 2 October 2019 (UTC)[reply]
I will soon be releasing this as {{coinage}}, with {{coin}} as an alias. The final change is renaming |litecat= to something else. This of course doesn't mean the template can't have any further changes done to it, but they should from now on aim to be backwards compatible. — surjection?09:31, 3 October 2019 (UTC)[reply]

Further reading and References at L3 or L4 or L5


The heading level for ===Further reading=== and ===References=== appears to be inconsistent. Statistically, I found the following:

Further reading:

  1. 405 entries (0.24%) with "Further reading" at L5.
  2. 18,421 entries (10.77%) with "Further reading" at L4.
  3. 152,293 entries (89.00%) with "Further reading" at L3.

References:

  1. 1,572 entries (0.59%) with "References" at L5.
  2. 50,178 entries (18.77%) with "References" at L4 - 29,418 entries (11.00%) are from Han script characters.
  3. 215,636 entries (80.65%) with "References" at L3.

Are there official guidelines regarding the heading levels of these two headings? If not, can we start a vote or mini-vote regarding this matter? Previous discussion can be found here. KevinUp (talk) 21:55, 2 October 2019 (UTC)[reply]
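For anyone wanting to reproduce figures like these, a minimal Python sketch of the counting follows. How the original numbers were gathered is not stated, so the regex and function names are assumptions; the percentage helper is checked against the counts listed above.

```python
import re
from collections import Counter

# A heading line such as ===References=== or ====Further reading====.
HEADING_RE = re.compile(
    r"^(=+)\s*(Further reading|References)\s*\1\s*$", re.MULTILINE
)


def heading_levels(wikitext):
    """Count (heading, level) pairs, e.g. ("References", 3) for ===References===."""
    return Counter(
        (m.group(2), len(m.group(1))) for m in HEADING_RE.finditer(wikitext)
    )


def percentages(counts):
    """Turn raw counts per level into percentages rounded to two places."""
    total = sum(counts.values())
    return {level: round(100 * n / total, 2) for level, n in counts.items()}


# Counts from the lists above, keyed by heading level.
print(percentages({5: 405, 4: 18421, 3: 152293}))   # Further reading
print(percentages({5: 1572, 4: 50178, 3: 215636}))  # References
```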

Option 1 - Both headings at L3 by default
Option 2 - Both headings at L4 by default
Option 3 - Headings at L4 if multiple etymologies or POS exist
Comments
In general, I think that ===References=== should always be at L3 and always at the bottom of an entry. Individual references should be inlined using <ref>...</ref>.
I can see use cases for Further reading being at L4 or even L5, depending on the structure of the entry and what the "further reading" is intended to apply to. For instance, a multi-etym entry would have POSes at L4, and a Further reading section intended for that etym would thus also be at L4. A Further reading section intended for a specific POS would then be at L5. (This assumes that Further reading sections are allowed for specific POSes.)
‑‑ Eiríkr Útlendi │Tala við mig 22:04, 2 October 2019 (UTC)[reply]
I treat References as a per-entry section (L3) while Further reading is a per-term section (L4 if POS is at L3, L5 if POS is at L4). —Rua (mew) 14:59, 3 October 2019 (UTC)[reply]
Although I almost always put References at L3, I’d support having both be flexible, so that they can be used per-entry or per-term as needed. — Vorziblix (talk · contribs) 15:57, 3 October 2019 (UTC)[reply]
I agree with this, let's keep them flexible.--So9q (talk) 19:36, 7 October 2019 (UTC)[reply]
It depends on what the further reading or references section refers to, it seems, particularly whether it supports the etymology, or maybe the pronunciations, or gives further information on the senses. And then again there is no gain in unifying, as everything is moved one level down if there are multiple etymologies, and there are references sometimes for separate etymologies and sometimes for all. كپنك started with references for all and then they were separated. There was a discussion started by Rua (talk • contribs) on whether there should be etymology groupings but I can hardly find it.
And actually there is no real distinction between “Further reading” and “References”. It is just what is left over from the past, when there were a lot more headers. Yet I cannot wholly distinguish these two. Fay Freak (talk) 19:48, 7 October 2019 (UTC)[reply]
See Wiktionary:Entry_layout#References. "References" are for verifying specific claims such as a pronunciation or etymology with an outside source. "Further reading" is broader and directs you to reference works for more information (or, I guess, if you don't trust Wiktionary until you see the word is in a "real" dictionary). But this distinction is almost never carried out in practice. Ultimateria (talk) 16:17, 8 October 2019 (UTC)[reply]

Eirikr's clear misbehaviour and breach of rules


Aside from Eirikr's behaviour, shown on User_talk:Eirikr#Nata...

When there is a dispute in regards to a matter, which is therefore being discussed, I would assume that everyone agrees, that people should concentrate on the discussion, and not make any edits, whilst the discussion is underway (unless it is to make edits, that a consensus approves of) Anything else would be quite chaotic, and disorderly, and would make a mockery of the discussion.

鉈 is currently under discussion (at first at User_talk:Eirikr#Nata, which went nowhere, but now at Wiktionary:Requests_for_verification/Non-English#鉈). This discussion is ongoing and has not concluded. It is directly concerned with whether or not 鉈 can or should be defined as hatchet, machete, billhook, and/or froe.
Whilst this is happening, Eirikr added a translation of "鉈" to billhook. (in addition to the existing translation of 鉈鎌, which I put there)
If 鉈 can be defined as billhook (which is what the above mentioned RFV is about), then it is valid to include it as a translation. If, however, 鉈 can't/shouldn't be defined as billhook, then it is clearly not valid to include it as a translation.
Thus Eirikr's edit, in billhook, was an edit in regards to a matter that is currently under discussion. Hence I reverted it, whilst pointing out this issue, in my edit summary.
In answer, Eirikr reverted it back, with the justification that billhook isn't the entry under discussion.
This is a clear case of Wikilawyering. Using technicalities to justify your actions. Using the letter of the law, to pervert the spirit of the law. Going against the point/purpose of the rule ...as I pointed out in my further revert, of this clearly invalid edit ...which was answered with no justification, but just a revert and an indefinite protection, blocking me from further editing.

Now on Wikipedia, this would have had to go differently.
Eirikr made an edit, I reverted it ...and then, given Wikipedia rules, he would not have been allowed to re-revert, as that would constitute edit warring. (a concept that apparently doesn't exist here) He'd have to follow the process of the Wikipedia:BOLD, revert, discuss cycle. In other words, he made an edit (bold), got reverted ...and then would have had no choice but to discuss, to make his case. Because Wikipedia tries to make sure to avoid destructive squabbles, and keep things somewhat civilized and reasonable, preferring communication and clarification, to mere emotion and obstinacy.
...but as this is apparently a lawless anarchy (or rather more of a kratocracy, or the proverbial "law of the jungle"), where even the few rules that exist are ignored and broken, even by admins, and where no discussion is tolerated, it went the way it did.
Is this acceptable behaviour, on Wiktionary?--213.113.50.173 03:49, 3 October 2019 (UTC)[reply]

There is no such rule. Eirikr is an established Japanese editor. You are not. Maybe your edits will be respected one day, but right now they will be reverted. DTLHS (talk) 04:01, 3 October 2019 (UTC)[reply]
There is no such rule? Well no, there are no rules at all ...but is it sensible? Is it not the opposite of constructive?
Also, when you talk about who makes an edit, rather than about the edit itself... Someone who resorts to spouting ad hominem and argument from authority fallacies, thereby reveals that they have no case. That they have zero confidence, in being able to make their case, with honest or rational arguments.--213.113.50.173 04:19, 3 October 2019 (UTC)[reply]
Indeed, argumenta ad hominem show how wrong some people (including some WT admins) are. --2003:F8:13C7:59D1:1DCB:D847:CFD:6A02 06:46, 3 October 2019 (UTC)[reply]
Can you cool it down a bit? One relevant data point: wikidata:Q708852 unites billhook and . The question is really, whether a Japanese speaker, in describing a billhook, or translating an English text that uses the term, would be inclined to use the term “鉈”. A Japanese すり鉢 does not look anything like a Western mortar, but they serve the same function, so it is reasonable to offer one as a translation of the other.  --Lambiam 16:03, 3 October 2019 (UTC)[reply]
"Can you cool it down a bit?"
Everyone else assumes bad faith of me, attacks me (not argues against me or disagrees with me. That's fine. Attacks!), regularly makes personal attacks, etc. etc. All the while refusing to even make any arguments, or to debate anything, at all. As is clear and obvious, to anyone who can see
...and you tell ME to cool down!?
You have no words to say to the others (who are clearly not "cool", though that is the least of their issues), but you tell ME off!?
That is clear proof, that you are an utterly disingenuous hypocrite.
"One relevant data point:"
No, that data point is not relevant, at all. How could/would it be?
It is circular (pointing to Wiki, to defend what is written on Wiki) and also does nothing to demonstrate that it is accurate usage.
"The question is really, whether a Japanese speaker, in describing a billhook, or translating an English text that uses the term, would be inclined to use the term “鉈”."
No it isn't.
First of all, how a Japanese speaker would translate an English word, is completely irrelevant, due to the fact that the Japanese speaker cannot be assumed to have a proper understanding of English. (not to mention the example I like to cite, of how EVERY English-Japanese dictionary in existence, translates "hip" as "尻")
A more sensible question would be if they would, when presented with a billhook (as in be handed a physical specimen, of the tool itself) and be shown how it is made and used, be inclined to use the term “鉈”.
They might, sure, but...
Would an English speaker, in describing a naginata (薙刀), or translating a Japanese text that uses the term, be inclined to use the term "weapon"?
Yes. Yes they would.
Does this mean that "weapon" is a valid term to add as a translation, for the entry "薙刀"?
No, obviously not. Just as using "鉈" to describe a billhook, doesn't mean that it is valid to include as a translation.
Hence your argument is clearly invalid, in multiple ways.
"A Japanese すり鉢 does not look anything like a Western mortar"
...
They look identical. (Japanese すり鉢, "Western" mortar and pestle. Yes that specific mortar is shaped a bit differently, but there is variation in "Western" mortars, including ones that have exactly the same shape. The one my parents have, for example. There is, of course, also variation in Japanese すり鉢. Yet another thing they have in common)
No sane and honest person would deny, that they look identical ...except for one tiny little detail, that many might miss: The grooves in the Japanese すり鉢, which differentiate it from normal 乳鉢.
In other words: a Japanese "すり鉢" does not translate as "mortar and pestle" (that would be "乳鉢"). It is a specific sub-type, of mortar and pestle.
Much the same as "柳刃包丁" doesn't translate as "knife", seeing as it's not merely a knife, but a specific type of Japanese kitchen knife. (with Japanese kitchen knives, being a specific sub-group, of kitchen knife, kitchen knives being a specific sub-group of knives. All of this being known and obvious to pretty much everyone. Not yanagiba, as most people aren't familiar with specific Japanese kitchen knives, but rather the groupings and how all that works)--85.228.52.161 08:16, 5 October 2019 (UTC)[reply]
You wrote: how EVERY English-Japanese dictionary in existence, translates "hip" as "尻" — I don’t know what your definition of every is but that’s clearly wrong: [1], [2]. — TAKASUGI Shinji (talk) 15:29, 8 October 2019 (UTC)[reply]
Okay, so you found two dictionaries that don't technically translate it as "尻", specifically ...but one still gives a translation, that points to the exact same part of the body, whilst the other is kinda close to accurate, translating it as "腰(回り", and has a more in-depth explanation, that is kinda correct. You are technically correct. My statement is, apparently, not exactly correct. Thanks for the (nit-picky) correction ...but my point still stands. My argument is unaffected: Just about EVERY English-Japanese dictionary in existence, translates "hip" as (in Japanese) buttocks. (there is one confirmed exception. A thorough search, among net and book versions, might result in one more) Not that this is necessary, to point out that Japanese dictionaries are not infallible. Dictionaries are not infallible. You're supposed to go with usage, not what a dictionary dictates ...but Japanese dictionaries are especially prone to error, and this is an especially obvious example, of a huge and obvious error.--85.229.234.72 10:43, 10 October 2019 (UTC)[reply]
Actually, I shouldn't have given those answers. All of this is off-topic: The issue is Eirikr's behaviour. Not the facts or validity of any part of any entry, but purely how Eirikr handled things.--85.228.52.161 08:18, 5 October 2019 (UTC)[reply]
I do not see a breach of rules, but I do see a lot of fuss about a relatively unimportant issue. When there is a dispute about a sense of an entry, it is normally upon the person disputing the sense to raise the issue at Requests for verification. Definitions will rarely be perfect, and aiming at perfection may ultimately even defeat the purposes of a dictionary by making definitions incomprehensible.  --Lambiam 11:36, 5 October 2019 (UTC)[reply]
You don't see any problems with the fact that he regularly assumes bad faith (i.e. without there being any evidence of it, of any kind), edit wars, refuses to discuss, and makes edits on things, despite the fact that they are under active discussion? You don't think any of this breaks any of the rules? WT:Civility is just a joke, is it? The notion that one should be constructive, and try to act in ways that improve Wiktionary, is of no importance?--85.228.52.161 12:00, 5 October 2019 (UTC)[reply]
In this very response you are insinuating bad faith. And if you think you were civil while Eirikr was not, you have a different understanding of civility than me. If you get so upset when your precious contributions are reverted, then I suggest for your own sake that you find something else to do.  --Lambiam 12:35, 5 October 2019 (UTC)[reply]
To quote en:WP:ASSUME "Unless there is clear evidence to the contrary, assume that people who work on the project are trying to help it, not hurt it." (or, indeed, take a look at WT:Assume good faith)
Where there is clear and obvious evidence, I have pointed it out.
When you say I insinuated bad faith, whom are you saying I did so against? I would challenge you to show me any instance, where I have done so.
If you mean Eirikr, I have repeatedly and clearly stated that Eirikr has acted in bad faith, many times. Those were not insinuations.
If you mean against you... Again, I have not insinuated that you have bad faith. The statement "That is clear proof, that you are an utterly disingenuous hypocrite", is not an insinuation.
"And if you think you were civil while Eirikr was not, you have a different understanding of civility than me."
So you think assuming bad faith isn't uncivil? Reverting an edit, which, in its edit summary, contains a mention of where the editor has started a discussion about the subject and proceeding to ignore the discussion, isn't uncivil? (and edit warring?) Yes, it would appear that my understanding of civility, is very different from yours, and from that of everyone else on Wiktionary. Most people, in general, as well as Wikipedia as a whole, however...
"If you get so upset when your precious contributions are reverted"
Without any justification, discussion, or any form of a coherent process. Just reverted.
"then I suggest for your own sake that you find something else to do."
So you are saying that Wiktionary is hopeless? That what is said in its Main Page, is a complete lie? Everything in Help:Interacting_with_other_users is a lie? ...and when the Wikimedia Foundation states (emphasis mine) "The Wikimedia Foundation is the nonprofit that hosts Wikipedia and our other free knowledge projects. We want to make it easier for everyone to share what they know. To do this, we keep Wikipedia and Wikimedia sites fast, reliable, and available to all. We protect the values and policies that allow free knowledge to thrive. We build new features and tools to make it easy to read, edit, and share from the Wikimedia sites. Above all, we support the communities of volunteers around the world who edit, improve, and add knowledge across Wikimedia projects", that should not apply to Wiktionary?
Also, in WT:Civility, it says (again, emphasis mine):
"Most of the time, insults are used in the heat of the moment during a longer conflict. They are essentially a way to end the discussion. /.../ In other cases, the offender is doing it on purpose: either to distract the "opponent(s)" from the issue, or simply to drive them away from working on the article or even from the project, or to push them to commit an even greater breach in civility, which might result in ostracism or banning. In those cases, it is far less likely that the offender will have any regrets and apologize."
This sounds like a perfect description of the approach, that people have taken against me. (and, it would seem, any other newcomers as well)--85.228.53.143 08:55, 6 October 2019 (UTC)[reply]
What a waste of everyone's time. Canonicalization (talk) 21:11, 6 October 2019 (UTC)[reply]
Just within the past week or so, I've seen Eirikr say "I'm sorry, I was mistaken- you're right". You, on the other hand, seem to be only interested in winning your argument, and demolishing anyone who contradicts you. You've obviously gone through all of our pages on rules, policies and procedures, but only in search of ammunition, not to try to understand. You have this nasty habit of dismissing anything other people say that doesn't directly address the points you've made in the manner and the place that you deem such points should be addressed. A wiki is a community, not a set of rules. You need to pay attention to what people are trying to tell you, and stop wikilawyering. You don't have to agree with us, but you do have to listen. Chuck Entz (talk) 06:25, 10 October 2019 (UTC)[reply]
You have zero basis, for any of your claims/insinuations, which are purely assumptions of bad faith. As for your accusation of wikilawyering... That's laughable. How could anything I've said, even begin to qualify? Eirikr has engaged in it, certainly, but me?
"A wiki is a community", you say? Usually, that's true, but I've seen no evidence of that, here. Quite the contrary.
"You don't have to agree with us, but you do have to listen."
You've made no arguments, for me to listen to. Just accusations and bitching.
I've made arguments ...that none of you have listened to, addressed or acknowledged.
The degree of projection, in your comment is mind-boggling.--85.229.234.72 10:35, 10 October 2019 (UTC)[reply]
Which is, of course, itself an entirely content-free ad hominem argument- one of many. By the way, I probably won't have access to a computer until Tuesday, so don't think I'm ignoring you. Chuck Entz (talk) 04:47, 11 October 2019 (UTC)[reply]
Which is, of course, itself an entirely content-free ad hominem argument- one of many. (making an accusatorial statement about someone, when that statement itself is an example of the same thing, from you, is never a good idea)
Also, that is a complete misunderstanding of what an ad hominem is. An ad hominem is saying that "Person A made argument X, person A is bad, therefore argument X is wrong". (i.e. the argument/evidence/position is deemed wrong, not because of anything to do with the argument/evidence/position, but purely because of the person stating it)
Saying "Person A made argument X, this counter-argument Y shows why it's wrong, and also person A is bad" (or if the order of the last two bits are flipped) is in no way an ad hominem. Nor is merely badmouthing or insulting someone, unless it is done for the purpose of dismissing that person.
At no point, have I made any argument here, that even approaches being an ad hominem. You may want to check en:Ad_hominem, especially en:Ad_hominem#Non-fallacious_types ...or this (specifically the bit from the timestamp in the link, and 15 seconds on) or this video from PBS.--213.113.51.51 19:03, 13 October 2019 (UTC)[reply]
──────────────────────────────────────────────────────────────────────────────────────────────────── Are you here to do something useful or only to argue with everyone in order to gain points for your tally of "arguments won on the Internet"? — surjection?19:32, 13 October 2019 (UTC)[reply]

Well... Wiktionary needs some guys to do the dirty jobs for it, while it turns a blind eye and gives them the pleasure of abusing power to some extent in return. This is just how politics works in the world... ᾨδή (talk) 11:15, 5 October 2019 (UTC)[reply]

No it isn't. Not entirely. Yes, people with more power often have a bit more of an ability to get away with things, but usually no one is fully above the law. Able to get away with a bit more, be given more of a benefit of the doubt? Sure ...but able to blatantly go against it? No. There is usually some form of accountability. I can certainly say that none of this would/could ever happen on Wikipedia.--85.228.52.161 12:00, 5 October 2019 (UTC)[reply]
Also, you say "Wiktionary needs some guys to do the dirty jobs for it", but if they do it in a way that is more detrimental to Wiktionary, than it is positive...--85.228.53.143 08:55, 6 October 2019 (UTC)[reply]
***Sigh*** Canonicalization is right – what a waste of time and quite derivative (reminds me of a certain "cunning" user). --Robbie SWE (talk) 11:47, 10 October 2019 (UTC)[reply]
Derivative?--85.229.233.209 20:55, 11 October 2019 (UTC)[reply]

Adding |unc=1 and moving labels in {{desc}}

I'd like to propose 1. adding a |unc=1 parameter to {{desc}} to replace manually adding {{q|possibly}} at the end of entries, and 2. moving labels such as (calque) after the borrowing arrow. Please see below.

Current system

Using {{q|possibly}}:

  • English: example (possibly)
  • English: example (possibly)
  • English: example (calque) (possibly)
  • English: example (semantic loan) (possibly)
  • English: example (semi-learned) (possibly)
  • English: example (possibly)
Proposals #1, #2 and #3

Using |unc=1:

  • English: example (possibly)
  • English: example (possibly)
  • English: example (possibly, calque)
  • English: example (possibly, semantic loan)
  • English: example (possibly, semi-learned)
  • English: example (possibly)

Another nice thing about having a parameter is that we could create a category for entries with uncertain descendants. What are people's thoughts? --{{victar|talk}} 04:26, 3 October 2019 (UTC)[reply]
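
As a rough illustration of the text form shown in the lists above, the following sketch builds the qualifier for a descendant line. It is not the actual {{desc}} implementation; the function name `format_descendant` and its parameters are hypothetical, and the arrow symbols discussed in the proposals are left out.

```python
def format_descendant(lang_name, term, labels=(), uncertain=False):
    """Render one descendant line as plain text.

    Hypothetical helper for illustration only; the real {{desc}}
    template is not implemented this way as far as this sketch knows.
    """
    # With the proposed flag, "possibly" leads the qualifier list
    # instead of being appended as a separate {{q|possibly}}.
    qualifiers = (["possibly"] if uncertain else []) + list(labels)
    suffix = f" ({', '.join(qualifiers)})" if qualifiers else ""
    return f"{lang_name}: {term}{suffix}"


print(format_descendant("English", "example", ["calque"], uncertain=True))
print(format_descendant("English", "example", uncertain=True))
```

Having a single flag also gives one obvious place to hook in the category for uncertain descendants mentioned above.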

@Victar I think this is a great idea. Of your possible formattings, I like #2 best as I think the superscript question mark should be fairly self-explanatory. I'm ok with #3 as well, but not so much #1, as abbreviations like "clq", "slb", etc. are fairly obscure. Benwing2 (talk) 06:23, 3 October 2019 (UTC)[reply]
@Benwing2: Glad to hear. The idea is inspired from mathematics, where we actually have Unicode ⩼, but it's rather hard to make out. I would say that most people using en.Wikt wouldn't even know what the c in example c is, which is why the tooltips are crucial. --{{victar|talk}} 06:38, 3 October 2019 (UTC)[reply]
I think it should be unk=1 to match the template {{unk}}. —Rua (mew) 15:31, 3 October 2019 (UTC)[reply]
I disagree. Uncertainty is not the same as something being unknown. —Μετάknowledgediscuss/deeds 16:08, 3 October 2019 (UTC)[reply]
Yeah, I thought about that too because I often find myself doing {{unk|lang|Uncertain}}, but "uncertain/unclear" is more appropriate language than "unknown" in all cases involving {{desc}}. One could also probably argue we need an {{unc}} template. --{{victar|talk}} 16:35, 3 October 2019 (UTC)[reply]
I like 2 and 3, although in 2 the > sign before inherited terms looks very ugly; I’d leave it out and just have the question mark there. — Vorziblix (talk · contribs) 15:52, 3 October 2019 (UTC)[reply]
> is what's used in linguistics to denote a direct inheritance, so it is the appropriate symbol to use. I think just using a question mark looks strange. Maybe they would look better in monospace font > → ⇒ vs. > → ⇒. --{{victar|talk}} 16:35, 3 October 2019 (UTC)[reply]
Ugh, Consolas doesn't support ⇒. --{{victar|talk}} 22:00, 3 October 2019 (UTC)[reply]
@Victar Is it necessary to have Consolas support for that arrow? Also, this is a larger can of worms but I've always thought that {{desc}} should take a list of descendants; that would obviate problems with text on the right ending up in the wrong place. Benwing2 (talk) 18:06, 6 October 2019 (UTC)[reply]
@Benwing2: Consolas is the default monospace font on many Windows machines, so using Consolas doesn't make all the arrows equal width, kinda defeating the point. But anyone can apply their own font using their common.css. You mean using multiple terms? Yeah... that's been suggested before (for {{cog}} too), but I think that would be a fervent battle. --{{victar|talk}} 03:43, 7 October 2019 (UTC)[reply]
I've implemented proposal #2. --{{victar|talk}} 19:41, 7 October 2019 (UTC)[reply]

Unique calque symbol

While I have people's attention, does anyone have any objections to giving calques a unique arrow? They're pretty different from other types of borrowings. Perhaps English: example? --{{victar|talk}} 18:26, 10 October 2019 (UTC)[reply]

I agree with using another symbol, but the two-way arrow doesn't really support the nature of a calque's evolution; it reminds me too much of twice-borrowed terms. I googled Unicode arrows and thought these might be interesting for showing indirectness: ⥤ ➾ ⥲ ⤏. Ultimateria (talk) 05:18, 15 October 2019 (UTC)[reply]
I beg to differ. could connote a borrowing (), that the returned () with a native construct. In math, and are called equivalence (biconditional) arrows. We could also just use =, but that would imply a cognate in my mind. --{{victar|talk}} 06:10, 15 October 2019 (UTC)[reply]
That's interesting. I never learned those logic symbols so I was only looking at the shape. I think the two-way arrow is fitting then. Ultimateria (talk) 07:24, 15 October 2019 (UTC)[reply]
I still like dotted or wavy arrows (↜⇜⇠ or <≈≈ ), though, since most of our readers won't have a math background. Chuck Entz (talk) 08:06, 15 October 2019 (UTC)[reply]
Many symbols in linguistics are based on mathematical ones, and we are a dictionary, after all. --{{victar|talk}} 08:20, 15 October 2019 (UTC)[reply]
To also note, is used in chemical formulas as an equilibrium symbol. --{{victar|talk}} 16:01, 15 October 2019 (UTC)[reply]
If we’re following math symbols, a more closely analogous one might be ⥲, in that it’s used to indicate isomorphism — isomorphism being, to my mind at least, much more like the relation between calques than logical equivalence is. Not sure if it renders clearly for most people, though. — Vorziblix (talk · contribs) 17:22, 15 October 2019 (UTC)[reply]

Obsolete symbol

Would an obsolete parameter be a good idea, something like English: tharf? --{{victar|talk}} 00:04, 2 November 2019 (UTC)[reply]

Symbols are used in printed dictionaries to save space, but Wiktionary is not paper. Besides, symbols require explanations, and although some users might be familiar with a dagger being used in this way in other dictionaries, many won't. The existing tag of "obsolete" is clear, and I think using a symbol would only obfuscate the meaning. — 85.211.40.75 11:38, 24 November 2019 (UTC)[reply]

Cleaning up or deleting Template:jump

I recently came across {{jump}}. This is a weirdly named template with strange syntax and I'm not sure it's needed at all. It's used on about 700 pages (which BTW is below the 1000-use deletion vs. deprecation threshold that I've been using). The vast majority of uses appear to be in Icelandic entries. I'd like to see what people think about the following:

  1. First, is it needed at all? The vast majority of uses were put there long ago (e.g. about the 600th chronological usage, on page stafli, was put there in 2011). The stated purpose of the template is to assist in connecting synonyms/antonyms/etc. to definitions on long entries, but (1) this is better achieved by using {{syn}} etc. to place the synonym directly below the definition line, and (2) many (perhaps most) of the uses are on small pages, e.g. stafli, where three lines separate the cross-references between synonym and definition.
  2. Secondly, the name {{jump}} is fairly opaque. The purpose appears to be a cross-reference; hence I think it should be named something like {{xref}} or maybe {{seclink}}; or better yet, since it serves two purposes (cross references to a section like synonyms, and cross references from such a section back to the definition), split it into two templates {{xref-to}} and {{xref-from}}.
  3. Also, the syntax is very weird. On stafli, for example, the cross-reference from defn to synonym is formatted as {{jump|is|computing stack|s}}, while the cross-reference from synonym to definition is formatted as {{jump|s|is|computing stack}}: Same params, just different order. By convention, when there's a language code it should normally go in the first param and be mandatory, but here it (a) floats around, and (b) is optional; if nothing that looks like a valid language code is seen, English is assumed.
  4. Finally, the section codes are non-obvious: for example, s = synonyms, which is better named syn.
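
To make the ambiguity in point 3 concrete, here is a minimal sketch of how a parser would have to guess at the parameters. It is not the actual {{jump}} logic; `KNOWN_LANG_CODES`, `SECTION_CODES` and `parse_jump` are hypothetical names, and the code sets are tiny stand-ins (only s = synonyms is mentioned above).

```python
# Assumption: small stand-in sets, not the real language or section lists.
KNOWN_LANG_CODES = {"en", "is", "fi"}
SECTION_CODES = {"s"}  # s = synonyms


def parse_jump(args):
    """Guess the meaning of positional {{jump}} arguments (illustrative only)."""
    rest = list(args)
    lang = "en"  # English is assumed when no language code is found
    for i, value in enumerate(rest):
        if value in KNOWN_LANG_CODES:
            lang = rest.pop(i)  # the language code may sit in any position
            break
    if len(rest) != 2:
        raise ValueError("expected a sense id and a section code")
    first, second = rest
    if first in SECTION_CODES:
        # Section code first: link from the section back to the definition.
        return {"lang": lang, "direction": "section-to-definition",
                "section": first, "id": second}
    if second in SECTION_CODES:
        # Section code last: link from the definition to the section.
        return {"lang": lang, "direction": "definition-to-section",
                "section": second, "id": first}
    raise ValueError("no recognizable section code")


print(parse_jump(["is", "computing stack", "s"]))
print(parse_jump(["s", "is", "computing stack"]))
```

A sense id that happens to look like a language code or a section code would be misread by this kind of guessing, which is one argument for a fixed, mandatory first parameter.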

My instinct is to deprecate this with an abuse filter to prevent new entries that use it, and to eventually rewrite uses to use {{syn}} etc. under the definition. Alternatively, we could just remove these template calls entirely, since they don't really seem to be used the way they should be in most cases. As a last resort, change the syntax as described above.

Comments?

Benwing2 (talk) 17:50, 6 October 2019 (UTC)[reply]

Deprecate and delete. Canonicalization (talk) 22:21, 7 October 2019 (UTC)[reply]
I vaguely remember something about it being used at one time to provide an anchor for offwiki links to land on as a kludgy alternative to {{sense}}. If that's the case, you might need to search via Google to see if there are links to the anchors. Chuck Entz (talk) 04:19, 8 October 2019 (UTC)[reply]
I saw that template, it’s almost only used in Icelandic entries, and yes, delete. Those superscript links are more confusing than anything, and yes, the syntax is terrible. There must be few who ever click them, and if they do it is mostly without technical effect, at least when there aren’t many senses or the screen is large (the screens have become larger since!). Fay Freak (talk) 16:57, 8 October 2019 (UTC)[reply]
Delete. The few times I encountered it it was often pointing to invalid targets. – Jberkel 19:40, 8 October 2019 (UTC)[reply]

What does "retrospective tense", referred to in the entry for 've, mean? --Backinstadiums (talk) 18:51, 6 October 2019 (UTC)[reply]

Ancient Greek in translation tables

Hi, I suggest we change to the language name we have in module:languages. Do we need a vote for this change?--So9q (talk) 19:20, 7 October 2019 (UTC)[reply]

To clarify, this means nesting Ancient Greek in translations as
* Greek:
*: Ancient Greek:
rather than the current
* Greek:
*: Ancient:
Eru·tuon 19:26, 7 October 2019 (UTC)[reply]
Funny, I have already written it like that and have seen people change such usages to Ancient:, and I still do not know why. I don’t see what is related in Module:languages or its data. I get that it has something to do with the translation adder adding Ancient:, but has there been a reason not to add nested names that the translation adder does not add? What I have also done is nest Modern Turkish: and Ottoman Turkish: under “Turkish”, and then come the endless Aramaic lects, as in bed. Fay Freak (talk) 19:35, 7 October 2019 (UTC)[reply]
The line grc: 'Greek/Ancient', in MediaWiki:Gadget-TranslationAdder-Data.js (part of the nesting variable) makes the TranslationAdder gadget use the nesting "Greek: Ancient:". That's what User:So9q's proposal would change. As far as I know, that's the only place where nestings are enforced automatically, though, as you noted, people have been imposing the same nesting for consistency's sake. — Eru·tuon 19:46, 7 October 2019 (UTC)[reply]
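
A minimal sketch of how a slash-separated nesting value such as the one quoted above could map onto translation-table lines. This is not the gadget's actual code; `nested_lines` is a hypothetical helper, and the value 'Greek/Ancient Greek' is only an assumption about how the proposed change might be written.

```python
def nested_lines(nesting):
    """Turn a nesting string like 'Greek/Ancient' into wikitext list lines.

    Illustrative only; the TranslationAdder gadget itself is JavaScript
    and may build its output differently.
    """
    lines = []
    for depth, name in enumerate(nesting.split("/")):
        # Top level is "* Name:", each nested level adds one ":" after "*".
        lines.append("*" + ":" * depth + " " + name + ":")
    return lines


print("\n".join(nested_lines("Greek/Ancient")))        # current nesting
print("\n".join(nested_lines("Greek/Ancient Greek")))  # assumed proposed value
```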
It would make sense if the gadget were flexible enough to use the name “Greek” when there is only modern Greek and “Modern Greek” when the same Greek is nested when there is also Ancient Greek, same for Turkish. Fay Freak (talk) 19:51, 7 October 2019 (UTC)[reply]
I support changing from Ancient to Ancient Greek in this circumstance. Benwing2 (talk) 03:28, 8 October 2019 (UTC)[reply]

Jutish or Jutlandic?

jut is the iso code for a collection of dialects of Danish. Translation-adder defaults to Jutish but maybe Jutlandic is more correct? WDYT?--So9q (talk) 21:24, 7 October 2019 (UTC)[reply]

Both terms are found in the literature. Wikipedia uses Jutlandic, but according to Google Ngram Viewer Jutish is far more common.  --Lambiam 22:17, 7 October 2019 (UTC)[reply]
Thanks for taking the time to look.--So9q (talk) 22:15, 8 October 2019 (UTC)[reply]
There is also the erstwhile Jutlandish, which is today as rare as Jutlandic. Leasnam (talk) 00:06, 10 October 2019 (UTC)[reply]
I looked through a few search pages for "Jutish", and most of them are references to the historical people called the Jutes.__Gamren (talk) 20:01, 10 October 2019 (UTC)[reply]

Nest Jutish under Danish

WDYT?--So9q (talk) 06:31, 8 October 2019 (UTC)[reply]

Merging translations of poop and poo

I went ahead and did it without discussing first. I see now that there is a slight difference in meaning namely that poop is also informal. WDYT?--So9q (talk) 09:33, 8 October 2019 (UTC)[reply]

That's fine, the difference is more geographic than semantic. One might even be an alternative form of the other. Ultimateria (talk) 16:08, 8 October 2019 (UTC)[reply]

I did good work

Just thought you'd like to know that I did some good work. I have gone through all the entries of Category:Spanish idioms and linked their component parts to them. It took 18 days to do. Obviously, it's not as awesome as So9q's Epic 2019 Poo/Poop Translation Merge, but you're welcome in any case. What have been your main wiki-achievements in the last 12 months? --Vealhurl (talk) 07:04, 9 October 2019 (UTC)[reply]

Wow, this WF flew under the radar. I didn't realise it was you. Achievements? Nearly finished with Chambers 1908. Down to about 5,000 entries to review (mostly short but annoying or obscure ones to deal with). w00t. Equinox 08:48, 9 October 2019 (UTC)[reply]
Achievements:- Added hundreds of words that nobody is ever going to look up - but it keeps me busy. Keep taking the tablets. SemperBlotto (talk) 09:46, 9 October 2019 (UTC)[reply]
I deleted a few hundred old IP talk pages. This doesn't actually accomplish anything of benefit to the project but it annoys me that they exist. - TheDaveRoss 12:19, 9 October 2019 (UTC)[reply]
User:Ashley Pomeroy would have words with you... Equinox 15:50, 13 October 2019 (UTC)[reply]
I suppose helping build up Category:en:Rail transportation to 686 entries (currently) is an achievement, not to mention adding to Category:nb:Rail transportation. DonnanZ (talk) 14:00, 9 October 2019 (UTC)[reply]
Same old same old. Up past 25,400 pages with {{taxlink}}, 15,700 with {{taxoninfl}}, and 12,400 with {{vern}}; just a wee bit short of the estimated 1.3 million described species. DCDuring (talk) 00:40, 10 October 2019 (UTC)[reply]
BTW, I'd like to thank all of those who have used these templates to build up the total that Wiktionary has. There are quite a few now. DCDuring (talk) 23:15, 10 October 2019 (UTC)[reply]
We love our Sisyphean labors, don't we. My big thing has been adding etymologies, derived terms, related terms, and further reading to many hundreds of Romance entries, then adding descendants to their Latin ancestors. Have I inspired Wonderfool to do something tedious with my interlinking mania? :o Ultimateria (talk) 16:01, 10 October 2019 (UTC)[reply]
Ult, you always inspire me. --Vealhurl (talk) 11:42, 11 October 2019 (UTC)[reply]
Creating and maintaining wanted entries lists. Nice to watch older lists slowly turning blue. – Jberkel 20:17, 10 October 2019 (UTC)[reply]

Template:blend categorization

Currently, {{blend}} adds the entry under "...terms borrowed from..." if the language of one of the two components is different, such as with romppu having {{blend|fi|ROM|korppu|lang1=en}}; the term is under "Finnish terms borrowed from English", as opposed to just "Finnish terms derived from English", even though a blend is a form of derivation and the word wasn't directly borrowed in this case. Is there a reason the categorization is as it is? — surjection?10:12, 9 October 2019 (UTC)[reply]
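
The two category names at issue differ only in one word, which this small sketch makes explicit. It is not the code in Module:compound; `blend_category` and its parameters are hypothetical names used for illustration.

```python
def blend_category(entry_lang, source_lang, treat_as_borrowing):
    """Build the category name for a blend component from another language.

    Illustrative only; the real module logic is not reproduced here.
    """
    relation = "borrowed" if treat_as_borrowing else "derived"
    return f"{entry_lang} terms {relation} from {source_lang}"


# Categorization described above for romppu:
print(blend_category("Finnish", "English", treat_as_borrowing=True))
# Categorization suggested as more accurate for a blend:
print(blend_category("Finnish", "English", treat_as_borrowing=False))
```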

Hmm, {{affix}}, {{compound}}, and the other etymology templates handled by Module:compound do this as well. It probably dates from this edit by Rua in May 2016. In March of this year {{blend}} followed suit with this edit by Benwing2. — Eru·tuon 18:36, 9 October 2019 (UTC)[reply]
I agree, this should be changed. If no one objects, I'll change it. Benwing2 (talk) 03:39, 10 October 2019 (UTC)[reply]
@Erutuon, Surjection Changed. Benwing2 (talk) 07:59, 15 October 2019 (UTC)[reply]
Thank you, the new categorization makes a lot more sense. — surjection?08:41, 15 October 2019 (UTC)[reply]
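For readers following the thread, the categorization rule discussed above can be sketched roughly as follows. This is only an illustration in Python, not the actual Lua code in Module:compound; the function name and its arguments are hypothetical.

```python
def blend_categories(entry_lang, component_langs):
    """Return "derived from" category names for a blend entry.

    entry_lang: language name of the entry, e.g. "Finnish".
    component_langs: language name of each component; None means the
    component is in the entry's own language (an assumption of this sketch).
    """
    categories = []
    for source in component_langs:
        if source is None or source == entry_lang:
            continue  # same-language component: no derivation category
        name = f"{entry_lang} terms derived from {source}"
        if name not in categories:
            categories.append(name)
    return categories


# romppu: a blend of English ROM and Finnish korppu
romppu_categories = blend_categories("Finnish", ["English", None])
```

Under this rule romppu lands in "Finnish terms derived from English" rather than a "borrowed from" category.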

Please see Wiktionary:Requests_for_deletion/Others#Deletion_of_rel-top,_der-top_and_related_templates and join the discussion.

As for deleting templates with more than 1000 transclusions, would it be a better idea to deprecate them instead? In that case {{rel-top}} and {{rel-bottom}} are the only ones that qualify.--So9q (talk) 11:41, 9 October 2019 (UTC)[reply]

label "(rare)"

Why isn't the oft-added label "(rare)" added to Appendix:Glossary? --Backinstadiums (talk) 16:04, 9 October 2019 (UTC)[reply]

Don't think there's a reason (other than no one has bothered to do so). How exactly "rare" is defined in terms of lexicography, I don't know. Maybe a certain percentage threshold in a corpus? – Jberkel 09:00, 19 October 2019 (UTC)[reply]
Each word should have some kind of meter template, or 'graph over time' of how common it is. See also Talk:in spades#Commonness. I'd like to know if I am about to use a word that only a percentage (graph should show what %) of English speakers know. Jidanni (talk) 21:43, 15 November 2019 (UTC)[reply]
The problem is that there's no objective way of doing that. Google n-Grams, for instance, only looks at published texts, whereas we try to document slang and other terms that are common in speech but not in writing. Andrew Sheedy (talk) 01:39, 16 November 2019 (UTC)[reply]

Suppressing verb inflections

The auto-generated verb inflections for call one on one's shit (presently nominated for deletion) are haywire. Obviously this can be fixed by hand-typing "calls one on one's shit", "calling one on one's shit", "called one on one's shit", but to me including the inflections at all looks highly silly, like way too much information. What is the recommended way to suppress the inflections altogether while still using the correct template? Mihia (talk) 14:08, 10 October 2019 (UTC)[reply]

One simple way is to use {{head|en|verb}}. DCDuring (talk) 16:10, 10 October 2019 (UTC)[reply]
Thank you! Mihia (talk) 17:05, 10 October 2019 (UTC)[reply]

What is the standard of 当て字?

The use of kanji chosen primarily for their phonetic (narrow sense) "OR" semantic (broad sense) value to represent foreign or native Japanese words, or the kanji so used.
Ateji (当て字) – use of kanji for phonetic value (sound) "RATHER THAN" semantic value (meaning), such as 寿司 (すし, sushi, “sushi”). Opposite of jukujikun.
  • Which sense does Wiktionary take? Is 秋刀魚(さんま) (sanma) ateji or jukujikun (or both or neither)? Why?

ᾨδή (talk) 11:42, 11 October 2019 (UTC)[reply]

Lack of macrons on Classical Nahuatl pages.

We should change the current practice of having no macrons in Classical Nahuatl page titles. I think it's just an inconvenience to leave out a part of the language's phonology like that on the site. --Jaydreams (talk) 17:46, 10 October 2019 (EDT)

I dunno, we omit vowel length markings in titles for several other languages. For instance, Latin (well, Classical Latin) has phonemic vowel length, but is usually written without macrons or breves, so we don't include macrons and breves in titles. Ancient Greek similarly. In both cases though, not all varieties of the languages had vowel length. I don't know if this is true of Classical Nahuatl. — Eru·tuon 21:53, 11 October 2019 (UTC)[reply]
While I'm pretty sure some varieties of Modern Nahuatl have lost vowel length, Classical Nahuatl was pretty centralized phonologically; you don't worry about dialects when writing for it. I am aware of the lack of length marking on Classical Latin, but it has a standardized script that explicitly did not mark length; Greek has η and ω that do. Nahuatl, meanwhile, has several divergent orthographies. The only one which marks both vowel length and the glottal stop seems to be the orthography shared between Frances Karttunen (An Analytical Dictionary of Nahuatl) and J. Richard Andrews (An Introduction to Classical Nahuatl). I think it would be best to just go for phonological accuracy with page titles, since there isn't any standardization to go off of. Also, maybe not as relevant, but Classical Nahuatl is phonologically pretty simple. Japanese and Hawaiian don't get that treatment when it comes to macrons, and Nahuatl definitely shouldn't either. --Jaydreams (talk) 20:12, 10 October 2019 (EDT)
[In regard to Ancient Greek, I was referring not to η (ē) and ω (ō), but to monophthongal α (a), ι (i), υ (u). Length is not typically marked for those except in some grammars or textbooks, and it was lost sometime in the Koine Greek period, so is not present in all varieties of what Wiktionary calls Ancient Greek.] — Eru·tuon 00:21, 12 October 2019 (UTC)[reply]
Why do you need it in the page title? In the header it is enough. Also, for rarer words it is not known, as in Latin, so it is better to omit it there; in addition, it is harder to type if diacritics are present in the page title. I can well write macrons with my keyboard, but do you imagine that people always do that to look up words? No, people won’t look up words with macrons. There is nothing to improve here. By speaking a language without long vowels and using a script without long vowels Spaniards have permanently poisoned the tradition of Nahuatl. Perhaps one can fix that for the Modern dialects by using Devanāgarī or Kashmiri Arabic script to write Nahuatl as it deserves by breaking with the tradition but Classical Nahuatl has its order petrified. Fay Freak (talk) 00:27, 12 October 2019 (UTC)[reply]
  • Technical consideration and curious query: are there any contrastive pairs of words in Classical Nahuatl that would differ in spelling only by the presence or lack of macrons? That might be one factor in determining how important it is to include macrons in page titles. In Japanese, for instance, 三度 (sando, three times) and 参道 (sandō, pilgrim's path) are very different things. Likewise in Hawaiian, where we have lolo (brains; marrow) contrasting with lōlō (paralyzed, numb; crazy). ‑‑ Eiríkr Útlendi │Tala við mig 00:49, 12 October 2019 (UTC)[reply]
    By looking through {{head}}, {{nci-noun}}, {{nci-proper noun}}, and {{nci-phrase}} from the last dump, I found cua and cuā, patla and pātla, āhuatl and āhuātl, metztōntli and mētztōntli. — Eru·tuon 05:28, 12 October 2019 (UTC)[reply]
    There's also tōloa (bend) vs. toloa (swallow), piloa (hang) vs. pīloa (shorten), -pān (flag) vs. -pan (on top), māca (negative imperative) vs. maca (give) etc. Like Eiríkr Útlendi showed with the languages I mentioned, it's just too important a feature. Also, I don't agree with Fay Freak at all. People are going to care about having entries be actually accurate, and all you have to do is click on the See Also section to find similarly spelled words. Classical Nahuatl's "order" is not without macrons. As I said earlier, there's no standard orthography, so for main entries we should use the Karttunen form as it's the most accurate and most used. — User:Jaydreams 02:06, 12 October 2019 (UTC)[reply]
@Jaydreams IMO it could go either way. Latvian, for example, includes length marks in page titles, and various languages (e.g. Ancient Greek, Spanish, Portuguese) include accents in page titles. But many other languages don't. The general principle followed is to go by the standard orthography (although a small exception is made for Russian, which includes ё in page titles instead of е even though the diaeresis is normally omitted in standard writing; note that Russian page titles don't include stress, even though it's highly unpredictable and often lexically relevant and found in the header). In this case, my impression is that the original orthography used by the Spanish lacked macrons, so we should probably do the same. The "orthography shared between Frances Karttunen (An Analytical Dictionary of Nahuatl) and J. Richard Andrews (An Introduction to Classical Nahuatl)" that you mention is exactly parallel to the situation in Latin, where dictionaries and introductory books include macrons but the standard orthography doesn't. Note that we also don't include macrons in Old English entries even though nearly all critical editions these days do include them, again based on the argument from standard orthography. Benwing2 (talk) 15:12, 12 October 2019 (UTC)[reply]
  • I cannot agree with the suggestion to use an orthography invented centuries ago by non-English-speakers. We don't use the original Portuguese orthography from the late-1500s / early 1600s for Japanese romanization for similar reasons -- this is the English Wiktionary for one, and we have different ideas about spelling now than we would have had hundreds of years ago. Perhaps more so for orthographies coming from Spanish or Portuguese speakers, in light of sound changes that have occurred in both languages. Another aspect is that the Spaniards that invented the first Latin-alphabet orthography for Nahuatl were also likely not fluent Nahuatl speakers, and they may not have understood the importance of marking vowel length when inventing their orthography.
It also seems to me that the Classical Nahuatl situation is closer to Hawaiian or Japanese than it is to Latin with regard to macrons, as they appear to be contrastive in Nahuatl in a way they aren't in Latin (NB: I am no Latinist, however, and I could very well be wrong about vowel-length contrastiveness in Latin). Moreover, if what I've read above is correct, there isn't a standard orthography for Nahuatl, whereas there is a very long-standing orthography for Latin.
2p from the sidelines, in the hopes of being helpful.  :) ‑‑ Eiríkr Útlendi │Tala við mig 18:13, 14 October 2019 (UTC)[reply]
Actually, macrons are very contrastive in Latin; compare e.g. legō (I choose) vs. lēgō (I dispatch), levis (light, quick) vs. lēvis (smooth), etc. Benwing2 (talk) 03:56, 15 October 2019 (UTC)[reply]
Aha, thank you for setting me straight on that.  :) Considering then that we have the option now of determining an orthography for Classical Nahuatl and we are not saddled with millennia of convention, why would we not want to distinguish such a contrastive element? ‑‑ Eiríkr Útlendi │Tala við mig 19:24, 15 October 2019 (UTC)[reply]

Change {{cat}} to mean {{categorize}} not {{topics}}

{{topics}} has an overabundance of short forms and aliases, including {{top}}, {{topic}}, {{catlangcode}}, {{cat}}, {{C}} and {{c}}, the latter three weirdly named (cat/c = topics???). Meanwhile {{categorize}} has no short forms. Logically, {{cat}} especially should mean {{categorize}}, and there are less than 1000 current uses, so I'd like to bot-replace {{cat}} with {{top}} and then repurpose {{cat}} as a short form of {{categorize}}. Thoughts? Benwing2 (talk) 14:46, 12 October 2019 (UTC)[reply]
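A bot replacement of this kind could look roughly like the sketch below. It is a hypothetical Python illustration, not the bot actually used; the function name is made up, and the pattern only handles plain `{{name|...}}` and `{{name}}` calls. Matching is case-sensitive because {{C}} and {{c}} are treated as distinct aliases in this discussion.

```python
import re


def rename_template(wikitext, old, new):
    """Rename calls to template `old` to `new` in a piece of wikitext.

    The lookahead requires "|" or "}}" right after the name, so that
    renaming {{cat}} does not touch {{catlangcode}} or {{categorize}}.
    """
    pattern = re.compile(r"\{\{\s*" + re.escape(old) + r"\s*(?=\||\}\})")
    return pattern.sub("{{" + new, wikitext)


sample = "{{cat|en|Fruits}} {{catlangcode|en|Fruits}} {{cat}}"
converted = rename_template(sample, "cat", "top")
```

Only the first and last calls in `sample` are rewritten; {{catlangcode}} is left alone.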

  Support —Rua (mew) 13:18, 13 October 2019 (UTC)[reply]
On a slightly separate note, do we really need a three-letter abbreviation for topic, already a short word? It seems to lend itself to confusion with concepts like "top of page". Equinox 13:36, 13 October 2019 (UTC)[reply]
@Equinox I agree, it's quite confusing. If others agree, we can deprecate this alias. Benwing2 (talk) 16:18, 13 October 2019 (UTC)[reply]
A common user of {{top}} reporting in, yeah we need it, I use it to add topics on definition lines like so:
  • #{{top|foo|Barring}} {{lb|foo|colloquial}} to [[bar]]
Expanding it to {{topics}} could crowd the line further Crom daba (talk) 20:16, 13 October 2019 (UTC)[reply]
Can we get more input and a consensus? If "crowding the line" is a problem then I feel that's more of a problem with our layout than with the choice of words. Templates and computer programming can be redesigned if needed. See for example the Python programming language which got rid of the famous {...} curly brackets and used indentation instead. Equinox 22:23, 13 October 2019 (UTC)[reply]
@Crom daba There's also {{topic}}, which is only two chars more than {{top}}. Benwing2 (talk) 22:47, 13 October 2019 (UTC)[reply]
BTW here's a table of all uses of {{topics}} and aliases:
Aliased template Canonical template #Uses
Template:topics Template:topics 86729 (47644 not including the aliases)
Template:catlangcode Template:topics 2365
Template:C Template:topics 25455
Template:c Template:topics 7946
Template:topic Template:topics 481
Template:cat Template:topics 597
Template:top Template:topics 2241
I actually propose making {{topic}} be the canonical name, even though it's not so commonly used now. It's shorter and (to me) more logical than the plural {{topics}}. Benwing2 (talk) 23:32, 13 October 2019 (UTC)[reply]
I have deprecated {{cat}} as a shortcut for {{topics}} and redirected it instead to {{categorize}}. Now, the three categorization templates ({{categorize}}, {{catlangname}} and {{topics}}) each have three-letter short forms. I haven't done anything with {{top}} because there appears to be some objection from User:Crom daba to deprecating it (what do you think of using {{topic}} or {{C}} instead?). I actually think the canonical name of {{topics}} should be {{topic}}, with {{topics}} and {{T}} as the only aliases; all the other aliases are too confusing. Benwing2 (talk) 05:49, 17 October 2019 (UTC)[reply]
I would prefer keeping {{topics}} as the proper name. The template can take multiple topics as arguments after all, not just a single one. The name should hint at this usage to avoid people doing stuff like {{topic|xx|a}}{{topic|xx|b}}{{topic|xx|c}}. —Rua (mew) 20:47, 17 October 2019 (UTC)[reply]
I agree that "topics" is a good name, and I think we should definitely deprecate "top". Ultimateria (talk) 05:03, 31 October 2019 (UTC)[reply]

IMO, "learnedly borrowed" sounds extremely awkward. Google shows only 8 hits, of which 6 are to Wiktionary. I want to create a {{semi-learned borrowing}} template, and "semi-learnedly borrowed" sounds even worse. Benwing2 (talk) 18:45, 12 October 2019 (UTC)[reply]

I support the renaming; "learned borrowings" sounds much better. — Eru·tuon 17:04, 13 October 2019 (UTC)[reply]
I support the renaming too (see Wiktionary:Requests for moves, mergers and splits § Template:semantic loan and Template:learned borrowing). Canonicalization (talk) 17:40, 13 October 2019 (UTC)[reply]
I support anything that doesn't have learnedly in it. NES do not use that word. Equinox 02:05, 14 October 2019 (UTC)[reply]
Done. Benwing2 (talk) 08:21, 15 October 2019 (UTC)[reply]

I just created this entry. At the bottom where the categories are there is this redlink: "English terms spelled with". Does anyone know what's going on here? Cheers. ---> Tooironic (talk) 04:51, 13 October 2019 (UTC)[reply]

@Tooironic: Yeah, the title has two invisible characters at the beginning, the zero-width no-break space or byte order mark (U+FEFF). It is not a standard character in English, so the "spelled with" category is added. There's already an entry for matchless without ZWNBSP characters.
Probably I should come up with an abuse filter to warn editors if there are invisible characters in the title, except where they're wanted, as in فارسی‌زبان (fârsi-zabân, Persian speaker). (The invisible character is after فارسی, though there it prevents the joining of the letters.) — Eru·tuon 06:06, 13 October 2019 (UTC)[reply]
Actually, for this character, only administrators can create the entries because the character is on the title blacklist, so a filter wouldn't be very useful. But tagging edits that add the character to wikitext might be helpful. — Eru·tuon 06:18, 13 October 2019 (UTC)[reply]
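The kind of check described above can be sketched in Python as follows. This is an illustration only, not an actual abuse filter rule; the function name is hypothetical, and treating every Unicode "Cf" (format) character as invisible is an assumption of this sketch. The allowed set is there for wanted characters such as the zero-width non-joiner (U+200C), which is what the Persian example appears to contain.

```python
import unicodedata


def invisible_chars(title, allowed=frozenset()):
    """List (index, code point) for format characters found in a title.

    Characters in `allowed` are skipped, for cases where an invisible
    character is a legitimate part of the spelling.
    """
    found = []
    for index, char in enumerate(title):
        if unicodedata.category(char) == "Cf" and char not in allowed:
            found.append((index, f"U+{ord(char):04X}"))
    return found


bad_title = "\ufeff\ufeffmatchless"          # two leading U+FEFF characters
persian = "فارسی" + "\u200c" + "زبان"        # ZWNJ after فارسی

flagged = invisible_chars(bad_title)
persian_ok = invisible_chars(persian, allowed={"\u200c"})
```

Here `flagged` reports the two leading U+FEFF characters, while the Persian title passes once U+200C is allowed.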

Deprecate {{docparam}}

I don't really understand why we have {{docparam}} as well as {{para}}. They display the same; the only difference is that {{para}} supports specifying the value of the param, whereas {{docparam}} supports indicating whether the param is required, optional, etc. I propose adding an argument to {{para}} to support the required/optional/etc. use case and converting cases of {{docparam}}. There aren't very many uses of {{docparam}} anyway, only about 300-400. What I'm thinking of is adding a third numbered param to specify arbitrary info, and also adding boolean params |req= and |opt= for the common use cases of specifying required and optional params. Benwing2 (talk) 20:35, 13 October 2019 (UTC)[reply]

Can we rename one of {{para}} and {{param}} while we're at it? —Rua (mew) 20:37, 13 October 2019 (UTC)[reply]
@Rua Yes, we can rename {{param}} to something like {{pararef}} or {{paramref}}. It's used on < 100 pages in any case, while {{para}} is used on ~ 3000 pages. Benwing2 (talk) 21:55, 13 October 2019 (UTC)[reply]
@Rua I renamed {{param}} to {{paramref}} ({{pararef}} sounds like it could refer to paragraphs). I haven't deprecated {{param}}; I'll wait somewhat longer for comments on this. Benwing2 (talk) 00:12, 14 October 2019 (UTC)[reply]
The reason Template:docparam exists was the very large number of existing {{para}} calls on talk pages. I wanted a different formatting for use in template documentation and did not want to change any existing {{para}} calls. Only later was the template changed to actually call {{para}} instead of using its own formatting. At that point it became redundant. - dcljr (talk) 02:33, 14 October 2019 (UTC)[reply]
Deprecated and deleted {{docparam}}. Benwing2 (talk) 05:44, 17 October 2019 (UTC)[reply]
Deprecated and deleted {{param}} in favor of {{paramref}}. Benwing2 (talk) 05:58, 17 October 2019 (UTC)[reply]

Please avoid using template calls in section headings

I'd like to request that people here (and on other discussion pages, of course) try to avoid using template calls in section headings. It makes wikilinking to those sections from other pages unnecessarily difficult (for some users, anyway). For example, if I "naively" tried to link to the above section using any standard linking method, it wouldn't work:

  • normal wikilink, visible section title:
    [[Wiktionary:Beer parlour/2019/October#Deprecate {{docparam}}]]
    → [[Wiktionary:Beer parlour/2019/October#Deprecate Template:docparam]]
  • normal wikilink, actual section title:
    [[Wiktionary:Beer parlour/2019/October#Deprecate {{temp|docparam}}]]
    → [[Wiktionary:Beer parlour/2019/October#Deprecate {{docparam}}]]
  • external-style link, visible title:
    [https://en.wiktionary.org/wiki/Wiktionary:Beer_parlour/2019/October#Deprecate_{{docparam}}]
    Template:docparam
  • external-style link, actual title:
    [https://en.wiktionary.org/wiki/Wiktionary:Beer_parlour/2019/October#Deprecate_{{temp|docparam}}]
    {{docparam}}

None of these can be fixed by using nowiki tags, BTW. (Note that browsers do successfully follow the appropriate link shown in the Table of Contents at the top of this page, but trying to copy/paste that link location into your own wikilink results in the problems shown above.)

These URL-encoded versions do work:

but the encoding apparently needs to be done "manually" by the user (it does on my browser, anyway: FF 69.0.2), so it's not a practical solution for most users. (I'm not sure if this changed in a recent browser upgrade, because I know FF used to URL-encode text copied from the location bar, when necessary.)
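For anyone who wants to do the URL-encoding without relying on the browser, a minimal Python sketch follows. The helper name is hypothetical, and the sketch only shows plain percent-encoding with the standard library; which of the heading texts MediaWiki uses for the anchor is not something it settles.

```python
from urllib.parse import quote

BASE = "https://en.wiktionary.org/wiki/"


def encode_section_link(page, section):
    """Build a URL to a section, percent-encoding braces and pipes."""
    page_part = quote(page.replace(" ", "_"), safe="/:")
    # safe="" forces "{", "}" and "|" to be encoded in the anchor
    anchor = quote(section.replace(" ", "_"), safe="")
    return BASE + page_part + "#" + anchor


link = encode_section_link(
    "Wiktionary:Beer parlour/2019/October",
    "Deprecate {{docparam}}",
)
```

The braces come out as %7B and %7D, so the resulting link no longer contains anything that wikitext would parse as a template call.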

Before I remembered to try the URL-encoding solution, I added a 'span' tag to allow for an alternate link target to that particular section (so I could link to it from a talk page), but the better solution is to simply avoid creating such section headings in the future. Thanks. - dcljr (talk) 06:15, 14 October 2019 (UTC)[reply]

Do you get the same problems with a manual link, e.g. Deprecate [[Template:docparam]] ? Benwing2 (talk) 06:33, 14 October 2019 (UTC)[reply]
No, because of the way MediaWiki handles such headings. Witness: Wiktionary:Beer parlour/2019/October#Cleaning up or deleting Template:jump (and the shorter version, when used on the same page: #Cleaning up or deleting Template:jump). Those links work fine for everyone. - dcljr (talk) 04:59, 15 October 2019 (UTC)[reply]
Sounds more like a problem that should be fixed, than a habit changed. --{{victar|talk}} 06:13, 15 October 2019 (UTC)[reply]
@Victar Funny… I see it the other way 'round. In any case, do you have a suggestion as to how it could be fixed? - dcljr (talk) 03:28, 19 October 2019 (UTC)[reply]
Sounds like the work of a sanitation filter. --{{victar|talk}} 04:23, 19 October 2019 (UTC)[reply]

"Wikipedia has articles on" + list

Can anyone think of a good reason why we should have a fake Wikipedia box at Orange County with links to eight Wikipedia articles? All of them are covered by Wikipedia's disambiguation page, which is now linked to by a real Wikipedia box that was added last year. I was tempted to just remove it, but it's been there for 13 years. Chuck Entz (talk) 09:57, 15 October 2019 (UTC)[reply]

I say delete it. Wikipedia should handle disambiguation and linking themselves.--So9q (talk) 10:36, 15 October 2019 (UTC)[reply]
Agreed, remove that. We could also adjust the {{wikipedia}} template to accommodate more links, right now it only handles two. - TheDaveRoss 12:03, 15 October 2019 (UTC)[reply]
Sometimes the WP dab page has so many items that the ones relevant to our entry are hard to find in the clutter. That said, I prefer the use of the in-line template for such links. DCDuring (talk) 15:46, 15 October 2019 (UTC)[reply]

Idiomaticity of Danish terms

I'm unsure how to label these 2:

What about idiomatic?--So9q (talk) 05:17, 16 October 2019 (UTC)[reply]

Idiomatic, just like middle of nowhere. Ultimateria (talk) 18:08, 16 October 2019 (UTC)[reply]
Please don't use the idiomatic tag, it's completely pointless. Canonicalization (talk) 19:26, 16 October 2019 (UTC)[reply]
Can you elaborate? --So9q (talk) 05:11, 18 October 2019 (UTC)[reply]
Using our typical definition of idiomatic, all phrases included in Wiktionary should be idiomatic – that is, not sum-of-parts (WT:SOP) – unless they're translation hubs (WT:THUB). You're probably thinking of some other definition of idiomatic for these phrases though. — Eru·tuon 05:32, 18 October 2019 (UTC)[reply]

Feedback wanted on Desktop Improvements project

07:15, 16 October 2019 (UTC)

I just discovered {{unreferenced}}, which seems to serve exactly the same purpose as {{rfref}} but categorizes differently. It's used on maybe 10-15 pages; I propose rewriting {{unreferenced}} to {{rfref}} and deleting the former. Benwing2 (talk) 14:44, 16 October 2019 (UTC)[reply]

Merge. Canonicalization (talk) 21:50, 16 October 2019 (UTC)[reply]
Done, and deleted Category:Entries lacking sources, which was populated solely by that template (see Category:Requests for references by language). Benwing2 (talk) 19:29, 19 October 2019 (UTC)[reply]

Phrasal verbs

Can anyone explain to me how phrasal verbs work on English Wiktionary? First of all, I see that there is both Category:English verbs and Category:English phrasal verbs. However, most entries in the latter seem to use the header Verb. Should they not use Phrasal verb, or is this not used as a header? Does this mean that the phrasal verbs are only marked as such in the categorization, and this is done through the normal verb templates? I notice for example for blåsa upp that this word is included in both Category:Swedish verbs and Category:Swedish phrasal verbs, and I suspect this is done by the parameter particle= in the verb templates. Is this done correctly?

Second of all, what is the preferred way of linking to phrasal verbs from the main verb entry? I wrote it under Usage notes in styra upp, does this work, or should it be under Derived terms?

Third of all, since the parts are blue linked in the entry title, I suspect no Etymology section is needed? I.e. the one I used in styra upp is superfluous and might as well be removed? If not, is there a preferred etymology template to be used?

Thanks for taking the time to have a look at my questions, I’d really like to clean up among Swedish phrasal verbs. --Lundgren8 (t · c) 21:32, 16 October 2019 (UTC)[reply]

To answer just one of your questions: WT:POS gives a limited list of parts of speech allowed as headers. “Verb” is on the list; “Phrasal verb” is not. With the exception of “Prepositional phrase” and “Proper noun”, headers of the form “(attribute) (POS)” are even explicitly disallowed.  --Lambiam 22:19, 16 October 2019 (UTC)[reply]
It would seem that {{sv-verb}} works differently from {{en-verb}}, so questions that can't be answered in the documentation (see WT:ASV) should be directed to active Swedish admins (e.g. User:Mike, User:Robbie SWE) or other active contributors to Swedish entries, e.g. User:LA2. DCDuring (talk) 03:07, 17 October 2019 (UTC)[reply]

Past participles

I’ll post this as a separate question. I was wondering whether there is a policy on past participles. My question is whether they are to be treated as verb forms, adjectives or both. Where do we draw the line? I created e.g. blankspolad and uppstyrd today, where I defined it under adjective and used the declension templates there, but then added the verb form under a separate verb heading. It seems a bit tautological, but is this the way it’s done? --Lundgren8 (t · c) 21:51, 16 October 2019 (UTC)[reply]

An (imperfect) adjectivality test for English is whether the term can be used attributively, and whether it allows comparative and superlative, or more in general gradation (hardly, very, too, ...). You can say, “his time is come”, but not *“is it his come time already?” or *“his time is very come, but her time is even more come”. But you can say, “the disappointed candidate is open to a mandamus”, and “she was bitterly disappointed”. So the past participle “come” cannot fill the role of an adjective, but “disappointed” can; hence it is listed twice. (Aside: it would make sense to me to allow combining parts of speech in a heading. In many languages, most adjectives can also be used as adverbs, but they are typically only listed now as being adjectives. Why can’t we say ==Adjective/Adverb==?)  --Lambiam 22:43, 16 October 2019 (UTC)[reply]
The OED tends to have adjectives for "most" of them (haha, I haven't read most of the OED, but they do frequently add adjectives for what were only verb forms before). Equinox 05:53, 17 October 2019 (UTC)[reply]
Adding an adj sense with "the obvious meaning" to a verb pa.p is, at any rate, no stupider than creating entirely separate entries for noun plurals. Equinox 05:54, 17 October 2019 (UTC)[reply]

@Lundgren8 I do the same for Hungarian: adjective with declension and verb form. See akkreditált. Panda10 (talk) 18:20, 17 October 2019 (UTC)[reply]

Wikidata integration

Hi, I just visited our page and the Wikidata page for atomic clock. They are not linked. What would it take to link them? Has anybody worked on this? — This unsigned comment was added by So9q (talk • contribs) at 04:06, 18 October 2019 (UTC).[reply]

@So9q: You may want to take a look at WT:Wikidata. — justin(r)leung { (t...) | c=› } 04:13, 18 October 2019 (UTC)[reply]
Thanks for the link. That page mostly talks about lexemes in wikidata. I saw now that Yurik in RU:WT added linking but only to the lexemes his bot extracted and imported there. Example. --So9q (talk) 05:08, 18 October 2019 (UTC)[reply]
I had previously linked several Wiktionary entries to their corresponding items in Wikidata but these were removed by bot. I had also attempted to link Wikidata items to corresponding pages in Wiktionary but this did not go well. It seems like both projects do not favor one another. Nevertheless, I have created Template:wikidatalite as a replacement of Template:wikidata based on the discussion here. KevinUp (talk) 09:12, 18 October 2019 (UTC)[reply]
So we can create a one-way link from here to Wikidata, which will look like “Q227467 on Wikidata”. But even if someone adds the lexeme atomic clock to the Wikidata Lexeme space, I don’t see how there would be a (possibly indirect) two-way link with Wikidata:Q227467.  --Lambiam 10:28, 18 October 2019 (UTC)[reply]
I'm happy to hear you are positive to this Lambiam. We were only discussing links to non-lexemes in the Q-namespace. I see no reason why a link to wiktionary articles in different languages for "atomic clock" could not be added to Wikidata:Q227467, but that is a discussion best done there.--So9q (talk) 18:12, 18 October 2019 (UTC)[reply]
I tried adding the link to our atomic clock and it was refused on notability grounds. The guidelines state: "On Wiktionary, items for citation pages are not allowed. Main namespace is also excluded because interlanguage links are automatically provided by Cognate." source.--So9q (talk) 18:35, 18 October 2019 (UTC)[reply]
Thanks KevinUp for these templates. That was exactly what I meant. Do you know the rationale for removing one-way wikidata-links? Seems weird to me seeing that we have links to multiple other sources including non-WMF ones. I think the more linking the better.--So9q (talk) 18:12, 18 October 2019 (UTC)[reply]
The reason why Wikidata does not link to Wiktionary is because Wikidata items represent "concepts" rather than words. A single word can have multiple meanings, e.g. orange can refer to both the color and the fruit, so linking the orange fruit (Q13191 on Wikidata) to wikt:orange is not ideal.
If you're interested in linking Wiktionary senses to the Wikidata Q-namespace, please use {{senseid}} instead. I don't think it is a good idea to use {{wikidatalite}} to link Wiktionary lemmas to Wikidata, because lemmas can have multiple senses whereas concepts tend to be more precise. I think {{wikidatalite}} can be used for scientific names and Unicode characters, but not words in general. KevinUp (talk) 08:42, 19 October 2019 (UTC)[reply]
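The word-versus-concept mismatch described above can be pictured with a small data model. This is a hypothetical Python sketch (the class and field names are made up and do not reflect how {{senseid}} or Wikidata store anything); it only shows why the Q-id belongs on a sense rather than on the lemma.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sense:
    gloss: str
    wikidata_id: Optional[str] = None  # Q-id of the concept, if known


@dataclass
class Lemma:
    word: str
    senses: List[Sense] = field(default_factory=list)

    def concept_ids(self):
        """Q-ids attached to individual senses of this lemma."""
        return [s.wikidata_id for s in self.senses if s.wikidata_id]


# One lemma, two senses; only the fruit sense is linked here
# (Q13191 is the item mentioned in the discussion above).
orange = Lemma("orange", [Sense("the fruit", "Q13191"), Sense("the color")])
linked = orange.concept_ids()
```

A single lemma-level link would have to pick one of these senses arbitrarily, which is the problem being pointed out.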

Request for wikibase for Wiktionary

Hi, recently Jura proposed that we request a wikibase instance from WMF to integrate our cross language senses and other stuff into. If I understand correctly this avoids the licensing CC0 issue completely. WDYT--So9q (talk) 04:58, 18 October 2019 (UTC)[reply]

I am not sure what the CC0 issue is; who does not like which license for which project, and what license would be proposed for the new “wikibase” – whatever that means –, and what would be its remit? An ontological database of semantemes?  --Lambiam 10:44, 18 October 2019 (UTC)[reply]
The CC0 issue is that Wikidata is CC0 and that license is not compatible with Wiktionary's current licenses, so we can't just put Wiktionary data into Wikidata. A new Wikibase instance could have a compatible license so that data could be legally moved to that structure. - TheDaveRoss 12:45, 18 October 2019 (UTC)[reply]
And our license is not compatible how? Does it have to do with our citations, our external links? DCDuring (talk) 20:20, 18 October 2019 (UTC)[reply]
How is it that a new instance can ignore the existing CC-BY-SA 3.0 License? Doesn't it still have to have hyperlinks or URLs to Wiktionary for each element of information? DCDuring (talk) 20:32, 18 October 2019 (UTC)[reply]
The new instance would have a compatible default license (CC-BY-SA), and not CC0, as used on Wikidata. I'm in favor of Jura's proposal: It's not just the lexicographical data, we also have language and category data stored in Lua modules (with ugly split hacks to avoid memory issues). And we could do more radical UI improvements/experiments, when the data is no longer stored/tied to specific templates. – Jberkel 21:33, 18 October 2019 (UTC)[reply]
And the wikibase instance is to be automagically kept in sync with current Wiktionary? Will any syncing degrade performance for contributors or passive users? DCDuring (talk) 00:39, 19 October 2019 (UTC)[reply]
We would probably slowly move more and more data into the wikibase and generate entries partly from there. No need for synching in other words. This would also mean editing some things directly in the wikibase.--So9q (talk) 19:40, 21 October 2019 (UTC)[reply]

If English Wiktionary supports this idea, I think that the French Wiktionary may be interested as well (and probably others). Just to clarify, you would like to request a Wikibase instance from WMF to integrate the English Wiktionary only or you are open to include all interested Wiktionaries? For example, I guess that the entries in French on the French Wiktionary are better quality than the French ones on the English Wiktionary. Pamputt (talk) 15:24, 22 October 2019 (UTC)[reply]

This would give access to interesting tools. But what's the most important point in the project? Contribution is vital, not tools. I'm afraid that this idea would make things more complex (cf. above: This would also mean editing some things directly in the wikibase), and thus reduce contribution. I think that Wikidata is one of the reasons explaining why Wikipedia loses editors (w:Wikipedia:Why is Wikipedia losing contributors - Thinking about remedies lists some additional reasons). Lmaltier (talk) 18:14, 5 November 2019 (UTC)[reply]
I completely disagree. As a counter-example I give you the OpenStreetMap wiki, which recently installed Wikibase. This has made it much easier for external tools like the editor iD to query e.g. descriptions and other metadata to show to the user during editing, essentially integrating the two. The text part of the wiki page remains, and the metadata from the Wikibase items, as on Wikipedia, has a small edit button at the end. As on Wikipedia, it should be clearly stated below the box that the data is stored in the Wikibase item. In the beginning we would have information both in the wiki page and in a special sidebox like on
OSM wiki introduced these data items with little or no resistance, as they currently complement the wiki and only very slowly replace it. See https://wiki.openstreetmap.org/wiki/Data_items for an explanation of how it works. The Wikibase items are automatically updated by this bot (source code in Python) and this all seems to work very well.
Once we have installed a Wikibase and parsed the wiki into it, we could very easily e.g. create a set of flashcards for Anki with any front and back side. We could also handle translations much more easily and possibly avoid the current restrictions on Lua memory. Others could also much more easily extract definitions and translations of definitions, which is now pretty hard because it requires parsing custom syntax (we have nothing linking a certain definition to its translations).
Wiktionary in its current structure is IMO inferior to the Wikidata approach and therefore becoming increasingly obsolete.--So9q (talk) 10:44, 18 November 2019 (UTC)[reply]
@Pamputt: A local one is the only thing that makes some sense. A dictionary is very close to the structure of a database, and we could also consider whether a future license change and integration with Wikidata might be a good idea. I will start a separate discussion of a license change and deprecation of the Wiktionary project in favor of Wikidata lexemes.
Adding a Wikibase to en.wiktionary can only improve the current situation IMO, and a later integration with any other project will be helped by having most of the data parsed into data items (complementary or not). The idea of having internationally spanning definitions of concepts is interesting, but I'm not sure it maps well to words in the individual languages. Right now the whole Wiktionary project is divided by language and tied to words (not definitions), which might be two fundamental design errors in retrospect. Both seem to have been solved in the Wikidata lexeme subproject.--So9q (talk) 10:44, 18 November 2019 (UTC)[reply]
I think having a Wikibase instance specific to the English Wiktionary can work well, because it is subject to the same rules and "jurisdiction" so to say. I don't see a multi-language project work as well for the same reason. —Rua (mew) 10:56, 18 November 2019 (UTC)[reply]

Place and given names in other languages

In light of discussions such as this and this, I realized that in practice we have followed the principle that place names are acceptable in other languages too, while given names are acceptable only if they are used for people who speak that language specifically, and generally not when talking about people of other nationalities. But is this actually codified in any existing policy? I couldn't find it in the CFI, for instance, and I'm fine with codifying the system as it is now. — surjection?07:36, 18 October 2019 (UTC)[reply]

It seems like WT:CFI#Given and family names did not elaborate much on names that were borrowed or romanized from other languages. I don't think it is a good idea to consider all romanized forms of given names and surnames as English lemmas unless it is backed up by statistical evidence.
Currently, there are many false positives in Category:English surnames from Japanese and Category:Portuguese surnames from Japanese. If these are allowed, then we will be seeing lots of similar entries for Dutch, French, German, Italian, Portuguese, Spanish, Tagalog, etc. with the same spelling. KevinUp (talk) 08:56, 18 October 2019 (UTC)[reply]
I think we should consider policy and practice for personal names of non-historical figures and for toponyms separately. Toponyms are usually subject to translation; Turkish İstanbul (with a dotted ⟨İ⟩) becomes dotless English Istanbul and even Dutch Istanboel. These names have the same referent. Except for historical figures (English John Lackland, German Johann Ohneland, Azeri Torpaqsız İoann), a given name like John may be transliterated, but is not translated; “John F. Kennedy” does not become “Jean F. Kennedie” in French or “Иван Ф. Кеннедий” in Russian; it remains “John F. Kennedy” in French and German alike, and becomes “Τζον Φ. Κέννεντυ” in Greek and “Джон Ф. Кеннеди” in Russian. However, “John” has become a common given name or nickname for (e.g.) Dutch men, such as Dutch reality-TV tycoon John de Mol, which makes it reasonable to also list it as a Dutch given name. We can list transliterations of Language X names under the L2 of Language X, using “A transliteration of the name”. For toponyms I’d recommend only listing a name from a non-Anglophone country under an L2 of English if (next to attestability) it was originally not in Latin script, or has historical variant names in other languages using Latin script (like Dutch Parijs for the French capital).  --Lambiam 10:10, 18 October 2019 (UTC)[reply]
Well, I have suggested to not sort nomina propria under language headers altogether. People begin to see why. Fay Freak (talk) 14:36, 18 October 2019 (UTC)[reply]
I agree that separate consideration will be needed for personal names and toponyms. I don't have many issues with toponyms because toponyms tend to have nativized spelling, such as Istanbul losing the dotted İ as mentioned above. Moreover, the English entry of the toponym can still serve as a translation hub if it does not pass RFV/RFD.
However, different considerations will be needed for personal names, because they tend to preserve the same spelling among different languages that share the same script, as mentioned above. Proper guidelines will be needed to identify what qualifies as a borrowing and what qualifies as a transliteration. For personal names in English, statistical evidence of non-Anglophone names being used among citizens of Anglophone countries is a good indication of borrowing into English. See entries such as Nguyen, Tamura, Abdulla for example. KevinUp (talk) 14:54, 18 October 2019 (UTC)[reply]
If they are spelled the same or not is of little relevance. If a Pakistani moves to England his name will just continue being used, English or not. This is not a borrowing. There will never be borrowing. In the 7th generation there will not have been a borrowing, and if by family reunification he lets his whole inbred village follow there will still be no borrowing even if there are now thousand bearers of his name in England. It does not depend on how “English” his descendants are since we do not want to get into these nation questions which aren’t linguistic and names and languages are not as a principle bound to areas. Editors just have to bid farewell to the notion that usage is that whereby a word is assigned to a language. Fay Freak (talk) 15:06, 18 October 2019 (UTC)[reply]
inbred......? —Suzukaze-c 21:46, 18 October 2019 (UTC)[reply]
If there will never be borrowing, I wonder how names like John (Greek) and Isaac (Hebrew) got listed as English names. The number of names that can trace their history back to the Germanic substrate of English through Old English is pretty small. These continue; Liam is now the second most popular name in the US, and I'm pretty sure that's mainly among English speakers.
It doesn't matter how “English” his descendants are... then what? What names can be English, and how are you making that distinction? The only thing you actually said was that a Pakistani's name will never be English, with no reason I can see.--Prosfilaes (talk) 20:47, 19 October 2019 (UTC)[reply]
I explained the reason and you missed the point. Personal names and place names are not used like other words, they travel independently of languages and aren’t members in the sets of individual languages, albeit part of the phenomenon of language in general. Isaac isn’t an English name, nothing is an English name. When you say “Isaac is part of the English language” you use “language” in a different way than I imagine it, in a way that is not necessary and problematic here because of the Sorites paradox. People only ever come up with arbitrary criteria like Anglicized spelling or ”no Romanizations” to exclude names, and “Englishness” is also one, especially if tied to extralinguistic factors. I have suggested to restrict the assignment of personal and place names to languages altogether. I can very well forgo German sections of Isaac and Isaak, as well as Hermann and Heinrich, and whether pronunciations in each language should be listed then under a translingual header is also debatable, there are fields where a dictionary should not get into or it will stay a kludge. No 100 sections for Srebrenica. It is a problem traditional dictionaries did not have because they were restricted by language and personnel and could arbitrarily include but few personal names they deemed interesting. Now you won’t find criteria except “muh it has a macron so it is not English”, “only Asian migrants bear this name”. Nothing convincing people argue for the names, all start from wrong premises. Why not have one section for all languages, talking about frequencies of spellings by country or what data there is? Other pages will point to such a page saying things like “predominantly German spelling of”, “Persian encoding of”, usage notes with particular frequency information, etc. whatever, it’s the details. This can be introduced for some names to start. But calling every name ever used (durably …) in an English-language context English by whatever rubbery restriction is false. 
It wasn’t false when market demand and limited manpower regulated the inclusion of names in dictionaries but now that the whole world can spread every name to the whole world for free it is wrong, has been wrong to begin with, for the concept of a Wiki dictionary, but people were not farsighted enough to see it. This problem is peculiar to names as words in languages are naturally limited, names are not, are more arbitrary than the common nouns, with looser relation to any individual language. Fay Freak (talk) 22:16, 19 October 2019 (UTC)[reply]
A message which gets missed when you start talking about inbred Pakistanis.
On the one hand, if you have a different model, then demo it in user space. Show us what you want to do.
On the other, throwing up our hands on doing what traditional dictionaries are doing because it's too hard is lazy and not helping our users. We need a section for Srebrenica in every language where people talk about Srebrenica and need to know how it's pronounced in their tongue. The line between names and other words in the language is more blurry, but not absolutely different. Isaac, as the name of Abraham's son, is as clearly English as any name for any other thing. Isaac as an English name may be more complex, but there is a tight association. Names spelled the same way are frequently pronounced differently in different languages. It's a mess and needs to be handled differently.--Prosfilaes (talk) 20:10, 20 October 2019 (UTC)[reply]
Oof, Fay, that was super racist. I agree with Prosfilaes on all points. --{{victar|talk}} 04:17, 2 November 2019 (UTC)[reply]
I think pronunciation needs to be taken into account too. Paris or Trois-Rivières should be listed as English words even though the spelling doesn't vary from the original, because they have well-established English pronunciations and are frequently used in English texts. Often, foreign place names have non-intuitive pronunciations because they aren't spelled like typical English words, so people often look to dictionaries to see how to pronounce them. I would argue that the same attestation rule should apply for toponyms as for regular words (perhaps excluding atlases and maps), since obscure places are unlikely to be ever mentioned outside of the dominant language in the area anyway. But I wouldn't be opposed to a stricter citation requirement. Andrew Sheedy (talk) 21:15, 19 October 2019 (UTC)[reply]
I think it should also be kept in mind that historically, people's last names were often modified when they came to English-speaking countries, because census-takers, their parish priests, etc. would often record their name phonetically rather than the way it would be spelled in the original language. There are a half dozen spellings of my mother's maiden name, but only one spelling in the original Polish. At that point, those last names are obviously English words, because they certainly aren't Polish anymore.
This no longer happens as much, just because of the nature of modern-day documentation (the exception being when transliteration is necessary), so it's a lot more difficult to pinpoint when an immigrant family's name becomes English. However, I would suggest that it occurs when there are people with that last name who no longer speak the language it is from, since they can't be considered to be code-switching, and because they will almost certainly pronounce it differently than in the source language. It will then be valuable to record how the name is typically pronounced in English. Especially with names like Nguyen, which have non-intuitive pronunciations, (often loosely) based on the original. Another indication is when names are stripped of their diacritics. Andrew Sheedy (talk) 21:20, 19 October 2019 (UTC)[reply]
Category:Navajo surnames might be of consideration: they are all Anglicized. —Suzukaze-c 21:49, 18 October 2019 (UTC)[reply]
See Wiktionary:About given names and surnames#The language statement of a name. In my opinion given names and surnames should have slightly different criteria. Toponyms are another matter. --Makaokalani (talk) 12:04, 19 October 2019 (UTC)[reply]
What I would propose is that the reasoning there be honed and then included into WT:CFI. — surjection?08:38, 22 October 2019 (UTC)[reply]

I think it's very simple: a section for a language should be allowed if the word is used in texts written in this language. This does not give the word any particular status. It's difficult to consider autoroute as an English word, or highway as a French word. Nonetheless, an English section for autoroute and a French section for highway (as a feminine or masculine noun) are fully justifiable. It's the same for John: this word is used in texts written in French, it should get a French section, with a definition such as An English given name (see John). This gives the opportunity to provide quotations in French, pronunciations used in French (certainly not /dʒɒn/ nor /dʒɑn/, it should be /dʒɔn/...), anagrams for French, etc. Lmaltier (talk) 21:41, 31 October 2019 (UTC)[reply]

Agreed. Pronunciation is a big part of language and some last names have well-established pronunciations in languages they don't naturally belong to. Andrew Sheedy (talk) 01:21, 2 November 2019 (UTC)[reply]

This is now under Wiktionary:Votes/pl-2019-11/CFI policy for foreign given names and surnames. — surjection?14:05, 4 November 2019 (UTC)[reply]

I would rather add the simple principle I mention above. Things should not be made complex but simple (KISS principle). Lmaltier (talk) 17:59, 4 November 2019 (UTC)[reply]
The principle above makes no sense to me, because as User:Lambiam pointed out here, it would permit entries like Alistair for Finnish, even though it doesn't have a consistent pronunciation (perhaps as many as five different ones depending on how competent the speaker is at English). It's basically code-switching. Quotations aren't really useful enough to justify this, nor are acronyms. — surjection?19:00, 6 November 2019 (UTC)[reply]
As Lmaltier's sole lexicographic principle is basically "add everything indiscriminately", I would advise not to pay attention to what he says, here or elsewhere. Canonicalization (talk) 19:46, 6 November 2019 (UTC)[reply]

Cleaning up request categories

There was a vote a while ago, spearheaded by User:Daniel Carrero, to clean up the names of request categories so they more-or-less consistently begin with "Requests for X". This was very helpful and much better than the old names (e.g. Category:English entries needing definition or Category:English terms needing attention or Category:English requests for example sentences); now, you can use autocomplete to search easily for request categories. But there are still some inconsistencies. This became apparent to me when I went through and documented all the request templates I could locate; the results of this can be seen at Category:Request templates. Most of the categories have the form "Requests for X in LANG entries" but there are several that don't, e.g.:

What do people think about renaming some of these categories to be more consistent?

Oops, need signature. Pinging User:Daniel Carrero again. Benwing2 (talk) 19:25, 19 October 2019 (UTC)[reply]
Agree. What was it with unhiding them? There is little gain to have well-named request categories if no one finds em. Fay Freak (talk) 22:33, 19 October 2019 (UTC)[reply]
  Support Consistency is nice.--So9q (talk) 04:03, 20 October 2019 (UTC)[reply]
So far I've changed all the Category:LANG terms with incomplete gender categories into Category:Requests for gender in LANG entries. Similarly we now have Category:Requests for tone in LANG entries, Category:Requests for accents in LANG entries, Category:Requests for aspect in LANG entries and corresponding POS-specific variants (e.g. Category:Requests for tone in Zulu noun entries, replacing Category:Zulu nouns needing tone). Each type of requested info has a corresponding request template ({{rfgender}}, {{rftone}}, {{rfaccents}}, {{rfaspect}}), and all the categories are recognized by {{auto cat}}.
I also cleaned up the situation where some noun and adjective declension templates were categorized under Category:LANG POS inflection-table templates but others were categorized under Category:LANG declension-table templates. All of them now use the former scheme, and I deleted all the Category:LANG declension-table templates categories along with {{rfdecl}} (which was used on only ~ 100 pages and was identical to {{rfinfl}} but linked to a "declension-table templates" category instead of a "POS inflection-table templates" category). Benwing2 (talk) 19:40, 27 October 2019 (UTC)[reply]
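For readers following the renames, the naming scheme described above can be sketched roughly as follows. This is only an illustration of the "Requests for X in LANG entries" pattern in Python; the function name `request_category` is hypothetical and this is not how {{auto cat}} or any Wiktionary module actually builds or recognizes these names.

```python
def request_category(info, lang, pos=None):
    """Build a category name of the form 'Requests for X in LANG entries'.

    Hypothetical helper for illustration only; not part of any Wiktionary module.
    `pos` is an optional part of speech for the POS-specific variants.
    """
    scope = f"{lang} {pos}" if pos else lang
    return f"Category:Requests for {info} in {scope} entries"


# The POS-specific example from the post above.
print(request_category("tone", "Zulu", "noun"))
# The general, language-wide form.
print(request_category("gender", "Albanian"))
```

With a single pattern like this, every request category shares the prefix "Requests for", which is what makes the autocomplete search mentioned earlier in the thread work.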

@Rua I notice these categories still exist, and are redundant to Category:Requests for inflections by language (e.g. Category:Requests for inflections in Albanian entries). They are populated by the undocumented |fNrequest= parameter to {{head}}. Anyone object to renaming the "Entries needing inflection" categories to "Requests for inflections ..." and then documenting the |fNrequest= param? Benwing2 (talk) 19:48, 19 October 2019 (UTC)[reply]

BTW this is done. Benwing2 (talk) 19:09, 27 October 2019 (UTC)[reply]

Both templates appear to do exactly the same thing except that {{term-context}} allows the language code in |lang= (but displays a deprecation warning for this) whereas {{term-label}} doesn't allow this. Consistent with {{context}} vs. {{label}}, I propose deprecating {{term-context}} in favor of {{term-label}}. Benwing2 (talk) 05:01, 20 October 2019 (UTC)[reply]

Upsilon with tilde

In the etymology of rheumatism Greek letter "ῦ" is automatically transliterated as "û". Wouldn't it be better using "ũ" instead? --188.76.241.115 11:02, 20 October 2019 (UTC)[reply]

The diacritic on ῦ is a Greek circumflex, so we transcribe it with a Latin circumflex. In some fonts it somewhat resembles a Latin circumflex (more precisely, an inverted breve), but in your browser's font and mine it looks like a tilde. We transliterate it according to its meaning and don't try to match the exact form (which would be impossible, because it varies). — Eru·tuon 17:22, 20 October 2019 (UTC)[reply]
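The point about transliterating by meaning rather than by glyph shape can be sketched in Python using only the standard `unicodedata` module. This is a minimal illustration, not the actual Wiktionary transliteration module; the two mapping tables are assumptions that cover only the single letter under discussion.

```python
import unicodedata

# Illustrative tables only (assumed for this sketch): one base letter, one diacritic.
BASE_LETTERS = {"\u03c5": "u"}      # Greek small upsilon -> Latin u
DIACRITICS = {"\u0342": "\u0302"}   # combining Greek perispomeni -> combining circumflex


def translit(text):
    """Transliterate by code point identity, independent of how a font draws it."""
    decomposed = unicodedata.normalize("NFD", text)  # split letter and accent
    mapped = []
    for ch in decomposed:
        if ch in DIACRITICS:
            mapped.append(DIACRITICS[ch])
        else:
            mapped.append(BASE_LETTERS.get(ch, ch))
    return unicodedata.normalize("NFC", "".join(mapped))  # recompose


print(translit("\u1fe6"))  # upsilon with perispomeni (circumflex)
```

The perispomeni is one code point however a font renders it (tilde-like or breve-like), so the mapping to a Latin circumflex stays stable across fonts.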

@-sche, Sgconlaw, msh210, DTLHS, Pious Eterino The horribly-named {{1}} creates a link to a lowercase term, with the display form being the uppercase form. It currently takes a language code in |lang=, defaulting to en, which is widely misused, mostly by being forgotten. In several Albanian entries, {{1}} by itself is misused to refer to the current pagename (which happens to be uppercase already), without properly specifying the language (and hence you get an #English link). Most uses are inside of other templates such as {{initialism of}}, but some are in lists or running text. I propose to split this into three templates:

  1. {{uc}} takes no lang code and directly generates a link for use inside another template, e.g. {{uc|foobar}} expands to [[foobar|Foobar]]. A longer example is found on NJQSAC (initialism of "New Jersey Quality Single Accountability Continuum"), which could be defined as
    # {{initialism of|en|[[New Jersey]] {{uc|quality}} {{uc|single}} {{uc|accountability}} {{uc|continuum}}}}
  2. {{ucl}} takes a lang code and generates a link like {{l}}; hence e.g. {{ucl|en|association}} is equivalent to {{l|en|association|Association}}.
  3. {{ucm}} takes a lang code and generates a link like {{m}}; hence e.g. {{ucm|en|association}} is equivalent to {{m|en|association|Association}}.
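The three proposed expansions can be mimicked in a few lines of Python, purely to make the intended outputs concrete. The function names mirror the proposed template names, but this is a sketch under the assumption that "uppercase form" means capitalizing the first letter only, as in the examples above; the real templates would be written in wikitext or Lua.

```python
def ucfirst(term):
    """Capitalize only the first character, leaving the rest untouched."""
    return term[:1].upper() + term[1:]


def uc(term):
    """Sketch of {{uc|term}}: a bare piped wikilink with no language code."""
    return "[[" + term + "|" + ucfirst(term) + "]]"


def ucl(lang, term):
    """Sketch of {{ucl|lang|term}}: the equivalent {{l}} call."""
    return "{{l|" + lang + "|" + term + "|" + ucfirst(term) + "}}"


def ucm(lang, term):
    """Sketch of {{ucm|lang|term}}: the equivalent {{m}} call."""
    return "{{m|" + lang + "|" + term + "|" + ucfirst(term) + "}}"


print(uc("foobar"))
print(ucl("en", "association"))
print(ucm("en", "association"))
```

Because `uc` takes no language code, it cannot silently default to English, which is the misuse described above for {{1}}.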

Thoughts? Benwing2 (talk) 04:03, 21 October 2019 (UTC)[reply]

Most of the time I see {{1}} used, it's in imitation of neighboring uses of {{l}} by new users who aren't paying attention. That probably explains your Albanian examples. Chuck Entz (talk) 05:28, 21 October 2019 (UTC)[reply]
The template documentation says it must always be substed. So (contrary to the claim that it's "widely misused") we really don't know how it's usually used: presumably as intended, by substing, for editors' convenience instead of repeating typing a word. I see no problem with renaming it "uc" with a redirect from "1" so people can continue to use it (and with backward compatibility).​—msh210 (talk) 22:24, 21 October 2019 (UTC)[reply]

The issue with Westrobothnian and Scanian on Wiktionary

Previous discussions:

I am very skeptical regarding the inclusion of certain Swedish linguistic varieties on English Wiktionary, primarily Category:Scanian language and Category:Westrobothnian language. Westrobothnian has been discussed on Wiktionary before, e.g. here in a discussion which resulted in many entries being deleted, but since then @Knyȝt has added new entries, and for the last three years or so this hasn’t been discussed further.

My issues can be summarized in a few points:

  • The orthographies used in Westrobothnian and Scanian entries are not in any way established.
  • The orthography for Westrobothnian is inconsistent.
  • The entries generally do not cite any sources, and the sources used do not use the same orthography.
  • Entries have previously been deleted but readded.

1. The orthographies used in Westrobothnian and Scanian entries are not in any way established.

Westrobothnian is written by Knyȝt using a self-developed orthography which is not found in any literature or elsewhere on the internet. Wiktionary is not the forum for original research or personal inventions (see also WP:FORUM), and we should not have entries in an orthography which is not found elsewhere.

Similarly, all the Scanian entries stem from a single source, a personal proposal for a Scanian orthography by a local enthusiast Mikael Lucazin from 2010, otherwise not used elsewhere. Surely this single proposal by one enthusiast can’t form the basis of Wiktionary inclusion? The Scanian orthography is very etymological, romantic, and archaistic, it uses etymological graphemes borrowed from Old Norse such as ⟨þ⟩ and ⟨ð⟩ and looks very peculiar overall. It looks very similar to the Focurc project on the internet, which is a Scots dialect with a very non-English inspired orthography in order to highlight its uniqueness.

2. The orthography for Westrobothnian is inconsistent

It is clear that Wiktionary is being used as a platform to launch a personal orthography, as the orthography is inconsistent and has changed over time.

The entry vâtn cites the definite form as vâtne, but here an earlier spelling vætnĕð is used, on Swedish Wiktionary vætne, and in the example sentence in spūt the spelling vattnä is used. The Westrobothnian project originally started on Swedish Wiktionary, where several entries were moved after the orthography changed, e.g. hähjänna to heþ hérna to heðhérnă, which on English Wiktionary corresponds to a fourth form: he + hjänna. Similarly, the Westrobothnian spellings used under garðr#Descendants are gål~gɑl, but gárþ here and gǫ́rð on Swedish Wiktionary. The entry auge has seven different spellings, with variations used in various example sentences and no sources for any of the forms or example sentences. The original orthography can be found on sv:Användare:Knyȝt#Vokaler.

Obviously a language can have many spelling variations and alternative forms; that’s not the issue. But it seems that Wiktionary is being used as a sandbox, with spellings changing over time as the personal orthography changes. Again, Wiktionary is not the platform for this; we are to use attested spellings found in reliable sources.

3. The entries generally do not cite any sources, and the sources used do not use the same orthography.

Another issue is the lack of sources. Sure, not every English entry contains a source, but I would argue it is more important when a variety is not as attested and easily double-checked. Many entries (such as frȯijen) contain a definition and information about the etymology and the pronunciation, but with no sources, and I don’t think there is a published etymological dictionary for Westrobothnian which means that many of the etymology sections are original research. Where do all the example sentences come from with different orthographies under e.g. eye? What are the sources for the regional pronunciation in frööys? These questions are relevant for a majority of the entries.

Sometimes certain sources are used such as Svenskt dialektlexikon (1862–1867) by Johan Ernst Rietz, but these do not contain the same orthography used in the entries.

If there were established orthographies and dictionaries for Westrobothnian and Scanian, I would not think as much of it. We can compare them to other Nordic varieties like Elfdalian, Gutnish and Jämtlandic (Jamtish is by the way not used in published works), which have orthographies that are attested outside of Wiktionary as well as published dictionaries. These three varieties have more standardized orthographies than e.g. Scanian and Westrobothnian, which lack orthographies altogether. In my opinion this would be enough for inclusion, as one could use these published materials as sources. The problem with Scanian and Westrobothnian on Wiktionary is that, rather than using existing published materials, they are an attempt to launch a personal project, which should be done elsewhere than on Wiktionary.

Variety | Dictionary | Orthography | Organization behind orthography
Elfdalian | Dictionary | Orthography | Råðdjärum
Gutnish | Dictionary | Orthography | Gutamålsgillet
Jämtlandic | Dictionary | Orthography | Jamtamot

4. Entries have previously been deleted but readded

Westrobothnian has been up for one WT:RFD and one WT:RFV almost 3 years ago, which resulted in all entries being deleted, as the orthography was considered to be idiosyncratic and unattestable. This was ignored and entries were readded, and there are now 2761 Westrobothnian entries. I find it odd that the problems I bring up in this post have already been brought up before, which resulted in actions (@Korn, @Mahagaja), yet the problem was not resolved.

In summary, I don’t think that Wiktionary is the platform to launch a personal project such as for Westrobothnian or Scanian. I find the project impressive and it’s a good cause, but it can be done on a personal website, as the orthography cannot be attested outside of Wiktionary. --Lundgren8 (t · c) 17:25, 21 October 2019 (UTC)[reply]

Most of the things you have stated are false. Anyone who has actually read more than a handful of my articles (or any of the cited works) would know this. I find it very bothersome to have to respond to all these accusations you make out of either ignorance or inability to understand what happens on Wiktionary and how any random post in time doesn’t necessarily directly and literally relate to some other action you randomly come across somewhere else at some other point. Maybe I will go into each thing in detail if you persist with this nonsense. Also, I assume we can restrict this discussion to the version of Wiktionary we actually are on, since Swedish Wiktionary would have their own beer parlour or somesuch where I’m sure you can post as much as you want about whatever you find troublesome there. Now, back to English Wiktionary again:
Let me ask you, am I the author of the cited works, the orthographies whereof I actually use in the articles (you were reading an obsolete post on my talk page):
  • Rietz, Johan Ernst Svenskt dialektlexikon: ordbok öfver svenska allmogespråket
  • Larsson, Evert, Söderström, Sven Hössjömålet : ordbok över en sydvästerbottnisk dialekt
  • Stenberg, Pehr, Gusten, Widmark Ordbok över Umemålet
  • Fältskytt, Gunnar, 2007, Ordbok över Lövångersmålet
  • Valfrid Lindgren, Jonas, Ordbok över Burträskmålet
  • Marklund, Thorsten, 1986, Skelleftemålet: grammatik och ordlista : för lekmän - av lekman
  • Nyström, Jan-Olov, 1993, Ordbok över lulemålet
  • Sandberg Herny, Sandberg Ingrid, ed., I åol leist: ordlista på kalixmål, sådant det talades på 1990-talet
I will not be your servant any further and link you a bunch of articles citing these sources. You can yourself simply type these titles in the search bar and find many of my articles easily. And I can assure you, most of my articles can be sourced through several of these works, but it seems a bit daft that I should have to type out the sources for every single little article I write, when it is the same sources again and again, and I am not coming up with some new strange information that cannot be derived therefrom. Easily, if you familiarise yourself even slightly with a couple of these works, you will start to recognise their orthographies as used in my articles, and can stop making up lies about it.
Also, here is the orthographic reference for Scanian: http://docplayer.se/10011690-M-lucazin-utkast-till-ortografi-over-skanska-spraket-med-morfologi-och-ordlista-mmx.html
Do I have to explain every single detail you mention, how you are reading something out of date or not actually checking what the facts are, or can you look around a bit yourself and think? — Knyȝt 19:04, 21 October 2019 (UTC)[reply]
Hi, I agree with Lundgren we need at least 1 quotation for each article or else it should be deleted. I welcome you to put the content somewhere else e.g. on wikibooks. See details in the relevant policy.--So9q (talk) 19:31, 21 October 2019 (UTC)[reply]
See Wiktionary:Criteria for inclusion/Well documented languages. They don't need quotations but they do need references. DTLHS (talk) 19:33, 21 October 2019 (UTC)[reply]
1. I agree with Lundgren on every point; 2. By "article" Kneyt apparently means 'entry'. Allahverdi Verdizade (talk) 14:26, 25 October 2019 (UTC)[reply]

If everyone has a different orthography and there are tens, it does not harm any more whether you use any one of them or invent your own; also one should use the same one if one word is only found in sources spelt in another orthography; this has also been stated for Middle Dutch to make the dictionary more useful and less unpredictable. Remember also that the template {{Template:normalized}} can be used to indicate disregard for the attested spelling, which in principle can be used also in English and German. The exact orthographies do not need to be attested, I deny this, the words must be, but they aren’t their spellings alone, nothing else is said by the criteria for inclusion, and the claim on a dictionary entry isn’t necessarily that the spelling is; if someone adds a word the claim is that there is an entity in the language that needs representation; the less often the language is written, the less is there a perception of the reader that the spelling here is what is documented. The user is to be imagined as someone who hears a word and then types it in by the phonemes he heard, which he wants to do predictably; for languages like French and English the dictionary is also a dictionary of spelling, and for those dialects there is no dictionary of spelling, and even for the former languages the dictionary is a spelling dictionary as far as a word is spelled at all, that is apperceived by the users through spelling. The basis for any language remains that words only exist as sounds, however well-documented. Remember that one can attest from audio records, and if a word exists in audio records and has not been written there is even more reason to write it into the dictionary, particularly also in English, where if a word is not written then the spelling varies.
There are quite some words that one does not dare to write or cannot reach print for various reasons but appear in music, and value is added to the dictionary if one can cover some of this which is left unwritten. This is confirmed by the notion that the more well-documented a language is the more its importance in unwritten speech is. Yet again it is this simple rule:
If you already know a word exists it is frivolous to demand the obliteration of its mention. Orthography idiosyncratic and unattestable? Shouldn’t faze anyone, as the entries just have to be distinctive, that is, transmit information about a form, which might be unwritten. You people fall victim to a common paralogism by asserting that “word”, “form” etc. is the representation in the chosen signs, which is however if it ever occurs in rules only a metonymy. It is just generally desirable for literary languages, to add what is written, as we strive to have all in its original spelling, but non sequitur that representations that are new are undesirable. Fay Freak (talk) 23:14, 25 October 2019 (UTC)[reply]

While I generally share this attitude for Low German (which should goddammit be normalised), I think it is general consensus in this community that normalisation should be done in one single scheme which is either widely spread or accepted by the Wiktionary community. In other words, it should be based in some consensus or other. It seems the entries in question here do not have this consensus. Korn [kʰũːɘ̃n] (talk) 15:47, 1 November 2019 (UTC)[reply]
It is not possible to have a consensus in a group of people who are ignorant of the facts to begin with. What’s the consensus on when my birthday is? If there is no established consensus on when my birthday is then I guess I have to be hanged for feigning existence. — Knyȝt 21:09, 6 November 2019 (UTC)[reply]

I wonder why Lundgren8 never simply messaged me, asking for more sources to be written out in the newer articles. I could then ask if an article like jår is lacking something, or if I could use it as a model. Will we find out? If Lundgren8 was at all interested in this, we could simply talk about it. But he rather writes this wall of "my feeling is"-type crap on the public space, so that other twits like So9q and Allahverdi Verdizade who themselves have no interest in the truth, simply agree, and as such mock up some "consensus" about supposed facts that they do not even bother to establish.

For example, the belief that I am using my own orthography, which apparently is something to "agree" upon before even checking whether it is true or not. And then users like Korn can appeal to this "established consensus" and claim the language just cannot be included. The answer nobody bothered to ask for is that I do not use my own orthography. Crazy world, is it not? This fact alone makes most of Lundgren8's post irrelevant. What if there are other things you are ignorant of? Will you bother to check? Are you actually interested in the facts?

The only thing we actually can establish thus far is that it seems some people like to think that they can come to the right conclusions while actually not at all knowing enough to do so.

So is there actually any problem here, other than the fake ones? Do I just have to make some hundreds or thousands of edits to add citations to each entry? — Knyȝt 21:09, 6 November 2019 (UTC)[reply]

Thanks for taking the time to respond. I cannot comment on the orthography because I know nothing about the subject. What I do know is that every single lemma in WT needs references or quotations to merit inclusion. If you provide that I'm happy. I suggest you mark any lemmas where you cannot find that for deletion or move them out of mainspace 0.
As an aside I sense in the tone of your reply that you seem bothered by the way Lundgren8 brought this up and maybe take this personally? In that case I invite you to take a break and come back when you are ok.--So9q (talk) 04:57, 7 November 2019 (UTC)[reply]

Quantum for English language entries


Hello group, may I suggest we have a separate aggregate total for English words entered into Wiktionary? Currently, there are over six million entries from 4,000 languages. I would be interested to know the quantum for English words only. Thank you Aquataste (talk) 12:30, 22 October 2019 (UTC)[reply]

@Aquataste: You may be interested in the page Wiktionary:Statistics. - TheDaveRoss 12:33, 22 October 2019 (UTC)[reply]

Community Wishlist Survey 2020


Hi fellow wiktionarians!

Here we are, Community Wishlist Survey 2020 is open and this year, it’s all for non-encyclopedic projects. More specifically, wishes have to be for all projects but Wikipedia, Wikidata and Commons.

So, it is our best chance to see some development for Wiktionaries! The main challenge with this survey is to write our ideas as proposals understandable for techies. So, let’s find some way to help each other to make our voice understandable and in classy English. Everyone can open a proposal, but with a limit of three per author. I already wrote two, so I am eager to have some help for the next ones to write!

There is so much to do to improve Wiktionary, I hope we can write good proposals and have support for those!   Noé 15:11, 22 October 2019 (UTC)[reply]

Excellent! One of the issues we are having on English Wiktionary is the lack of Lua memory for entries that are basic words. I wonder if the tech team would be able to look into this matter. Affected entries are at CAT:E. The number of entries stuck in that category has been slowly increasing. Would anyone be interested to submit a proposal? KevinUp (talk) 13:42, 24 October 2019 (UTC)[reply]

Nasalization in Old French


According to Wikipedia, nasalization of Old French vowels is only allophonic – it occurs before nasal consonants. But it is commonly marked in phonemic transcriptions on Wiktionary (with a tilde). If Wikipedia is correct, nasalization should only be marked in phonetic transcriptions, if anywhere.

I propose removing nasalization from all Old French transcriptions on Wiktionary: for instance, in blunt, {{IPA|fro|/blõnt/|[blũnt]}} would be replaced with {{IPA|fro|/blont/|[blũnt]}}. — Eru·tuon 18:52, 25 October 2019 (UTC)[reply]

@Erutuon Sounds good to me. Benwing2 (talk) 19:09, 27 October 2019 (UTC)[reply]
@Benwing2: Okay, I've finally gotten to this task and the edits can be seen here. [Edit: I also fixed the rhymes pages that had tildes.] — Eru·tuon 22:25, 1 January 2020 (UTC)[reply]
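
The replacement proposed in this thread can be illustrated with a minimal sketch. This is not the script that was actually run for the edits; the function names are hypothetical, and it assumes the transcriptions appear as positional parameters of `{{IPA|fro|...}}` exactly as in the blunt example, with phonemic forms wrapped in slashes and phonetic forms in square brackets. It strips the combining tilde only from the phonemic parameters.

```python
import re
import unicodedata

COMBINING_TILDE = "\u0303"

# Assumption: Old French transcriptions look like {{IPA|fro|/.../|[...]}}
# with no nested templates inside the parameters.
IPA_FRO = re.compile(r"\{\{IPA\|fro\|([^{}]*)\}\}")


def strip_nasalization(transcription: str) -> str:
    """Remove the nasalization tilde from every vowel in a transcription."""
    # Decompose so that precomposed letters such as õ become o + U+0303.
    decomposed = unicodedata.normalize("NFD", transcription)
    return unicodedata.normalize("NFC", decomposed.replace(COMBINING_TILDE, ""))


def denasalize_phonemic(wikitext: str) -> str:
    """Strip tildes from /phonemic/ parameters, leaving [phonetic] ones alone."""

    def fix_template(match: re.Match) -> str:
        params = match.group(1).split("|")
        fixed = [
            strip_nasalization(p) if p.startswith("/") and p.endswith("/") else p
            for p in params
        ]
        return "{{IPA|fro|" + "|".join(fixed) + "}}"

    return IPA_FRO.sub(fix_template, wikitext)


if __name__ == "__main__":
    print(denasalize_phonemic("{{IPA|fro|/blõnt/|[blũnt]}}"))
```

Restricting the change to slash-delimited parameters keeps the allophonic nasalization visible in the phonetic transcription, which is the distinction the proposal relies on. Templates for other language codes are left untouched.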

Inclusion of hyphenated compounds


I would like to work towards developing CFI rules for hyphenated compounds. Different people seem to have different ideas about how to treat these, and it seems unsatisfactory that the CFI does not once mention "hyphen" or "hyphenated". A while ago a vote was passed to exclude hyphenated compounds where the definition was no more than "attributive form of unhyphenated phrase". I would like to propose that this is extended so as to exclude, essentially, all unidiomatic SoP hyphenated compounds that are created according to predictable rules of English grammar or orthography. For example, we would by this means exclude "harsh-sounding" and "moisture-resistant", as these employ reusable patterns that can be applied to arbitrary combinations, but retain "fig-leaf" as there is no predictable rule that tells us that the words "fig" and "leaf" should be joined by a hyphen, or whether it should be a solid word or two words (in fact, in this case all three exist, but you get the idea).

Unfortunately I suspect that the wording to achieve this in a watertight and sensible way may be tricky to formulate. We don't want to accidentally exclude compounds that, probably, most people would want kept. "good-looking" is an example; we would need some "sufficiently idiomatic" (or other) criterion to retain compounds such as this. We also have the prefixes/suffixes to consider, such as the "ex-" words that have been the subject of one or two RFDs lately, I think.

I was wondering too about the many cases such as "white-throated", or "plump-buttocked", where to be SoP we presumably need an entry at "throated", "buttocked", etc. However, this may not be an issue since there seems an established practice of already including such entries. After trying a few, I couldn't in fact find a missing one; even e.g. "eyebrowed" exists.

Please comment, especially to point out pitfalls in the above approach that may need addressing in any CFI wording. If there is sufficient interest in something along the above lines, then I will hopefully at some point get round to trying to propose a proper vote on specific CFI wording. Mihia (talk) 23:23, 26 October 2019 (UTC)[reply]

First off, I’m all in favour of such an exclusion rule. The basic principle is that the existing WT:SOP rule equally applies when the components are separated by a hyphen. A practical test is whether some of the components can be arbitrarily swapped (“ex-bachelor”, “ex-best friend”, “ex-colonial clerk”, “ex-colored man”, “ex-conservative”, “ex-crown princess”, “ex-German ship”, “ex-hero”, “ex-human”, “ex-Muslim”, “ex-offender”, “ex-officer”, “ex-parrot”, “ex-prisoner”, “ex-prom queen”, “ex-SAS soldier”, “ex-serviceman”, “ex-smoker”, “ex-sultan”).
As to the case of “good-looking”, when someone asks your opinion about a rewrite and you think, “looks good to me”, you wouldn’t say, “it is good-looking”. In contrast to, say, “horrible-looking”, which can be said of a salad, the meaning of that adjective has come to be confined to the visual appearance of a person, which makes it idiomatic. There is no iron-clad criterion for idiomaticity, so the issue is often non-trivial, but that equally applies to collocations that are not hyphenated (such as good looking). I don’t think this calls for specific consideration beyond the existing WT:SOP rule.
For consideration: which of the following should be excluded by the new rule, or if not, why not? flat-chested, foul-mouthed, L-shaped, pain-relieving, star-struck (WT:COALMINE?), yellow-green, zero-rated, σ-compact.  --Lambiam 08:33, 27 October 2019 (UTC)[reply]
As far as I can gather, some people do not believe that the existing WT:SOP rule applies to hyphenated compounds because they consider them "single words". Because the CFI is unclear about how hyphenated compounds should be treated in this respect, people can make their own individual interpretations. This doesn't seem satisfactory. On the other hand, the problem with a blanket application of SOP rule to hyphenated compounds is that it would result in us deleting e.g. "fig-leaf" (ignoring "coalmine" for the purposes of the example), which I wouldn't support. This is why I am proposing that hyphenated compounds should be subject to SOP rule only when created according to a "predictable rule", if this can be defined satisfactorily. (A "predictable rule" would include the hyphenation of attributive terms that we have already voted on.) Mihia (talk) 21:14, 27 October 2019 (UTC)[reply]
Sorry, it occurred to me after I wrote it that "fig-leaf" is just not a good example anyway, as it has a figurative meaning too. What I mean for an example is some non-figurative hyphenated noun-noun compound that is hyphenated by convention but is not really predictable as such. Mihia (talk) 23:47, 27 October 2019 (UTC)[reply]
Do any such exist, that is compounds we would delete as SOP if written with a space but a hyphen is conventional?  --Lambiam 09:13, 28 October 2019 (UTC)[reply]
Good question. In the "not hyphenated by predictable rule" category, the answer seems to be "surely there should be", but it is surprisingly hard to find good examples. In older writing, numerous SOP compounds that would now be written as two words are probably attestable. For example, "church-roof" [4] or "garden-wall" [5]. However, these kinds of examples rather argue against my original point. Do we really want to allow e.g. "garden-wall" on the basis of a few old or marginal citations? If no "sensible" examples of what you are asking for exist, then perhaps this is not such an issue after all. Somehow, though, I still feel nervous about a blanket application of SOP rules to hyphenated compounds. I just feel that it could result in us disallowing some entries that ought to be kept. On the other hand, if we can't actually identify any ...
In the "hyphenated by predictable rule" category, there are cases, e.g. "wine-lover", where the unhyphenated form would presumably be disallowed as SoP. However, I am arguing that hyphenated "wine-lover" ought to be disallowed anyway, because it is based on a predictable rule that allows arbitrary possibilities. Mihia (talk) 17:10, 28 October 2019 (UTC)[reply]
Whatever happens here, it shouldn't be seen as a carte blanche for overzealous deletionists to RFD every hyphenated term they can find. That would be detrimental to the project. Common sense should prevail, and it's not even a "rule" yet. I have my reservations, we don't need over-regulation by wannabe dictators. DonnanZ (talk) 10:49, 27 October 2019 (UTC)[reply]
I think we'd want to include ex-con, no matter how many arbitrary ex-[noun] combinations we'd want to exclude.
I had hoped that dvandva compounds (like Schleswig-Holstein) would be includable per se, which would cover yellow-green, but one can have common hyphenated dvandva-type collocations that seem to me to be combinations of arbitrary elements, e.g., "a police-antifa clash", "a bluegrass-rap mash-up".
I strongly suspect that we will not be able to find any bright-line rule that works. DCDuring (talk) 13:56, 27 October 2019 (UTC)[reply]
There is nobody who tries to delete "every hyphenated term they can find". Equinox 14:58, 27 October 2019 (UTC)[reply]
The goal is not to delete "every hyphenated term" but to try to develop criteria that allow us to forgo e.g. "plump-buttocked" or "indistinct-looking" on the basis that dictionary users should understand how to interpret any such arbitrary combinations that they may encounter, not expect to be able to individually look up every single one of the thousands, maybe tens of thousands, of such compounds that might possibly exist. Mihia (talk) 00:41, 28 October 2019 (UTC)[reply]
If you've got something better than COALMINE then we are all ears. Equinox 00:43, 28 October 2019 (UTC)[reply]
My feelings about "coalmine" are well known, but in this case I don't understand how it is very relevant, generally speaking. AFAIK "plumpbuttocked" and "indistinctlooking" do not exist. Mihia (talk) 00:49, 28 October 2019 (UTC)[reply]
(My off-the-cuff opinion without looking at the above discussion) Option 1 and Option 2 both seem a little extreme for me. There are hyphenated words that really seem like legitimate words that need to be documented in this dictionary that I would never consider to be solely a sum of parts even though they pretty much basically are, and there are times when people are combining two words with a hyphen but it's not intended as a word or idiom even if used rather commonly. I would say that there's got to be a standard for these situations that has been used in other dictionaries historically and I would (hypothetically) like to see a comparison between Option 1, Option 2 and the policies of other dictionaries historically before I would cast a vote in favor of either proposal. We can't be the first people to have encountered this problem. --Geographyinitiative (talk) 10:12, 5 November 2019 (UTC)[reply]

Unifying accent names for Singapore and Malaysia pronunciations of English words


I'm proposing / asking permission of the community to unify "Singapore" and "Malaysia" as the accent names for 21 entries.

There is a prepared list showing the affected pages at User:Excellentia_in_Absentia/SingaporeMalaysia. These edits would be made manually, by myself, if approved.

The variance in accent descriptor may be due to the events noted at Wiktionary:Beer_parlour/2016/April#Singapore_English_entries.

Excellentia in Absentia (talk) 04:58, 28 October 2019 (UTC)[reply]


Should the "Related terms" section include words that have a common etymology, but divergent semantics? For instance infant and infantry. SpinningSpark 23:06, 29 October 2019 (UTC)[reply]

Yes. DTLHS (talk) 23:08, 29 October 2019 (UTC)[reply]
Thanks. Done that one now. SpinningSpark 23:24, 29 October 2019 (UTC)[reply]

t:zh-pron takes too much space on the mobile site and cannot be collapsed


Mobile site readers may need to navigate a long distance to reach the definitions. I think this is a problem. -- Huhu9001 (talk) 04:50, 30 October 2019 (UTC)[reply]

I've made the change because it seems that nobody cares. -- Huhu9001 (talk) 15:36, 3 November 2019 (UTC)[reply]

2010's slang vs Internet slang


Reading the page aesthetic I noticed that the third definition under the noun section is marked as Internet slang. I am not sure, but I think I have heard it used with this definition, or something close to it, in speech. Is there any policy for determining whether a definition is Internet slang or just general 2010's/early 2000's slang not specifically restricted to the Internet? I foresee this becoming more of a problem as the Internet continues to be integrated into human interaction and language. — This unsigned comment was added by Nemoanon (talk • contribs).

Sense referred to is:
(Internet slang) The artistic motifs defining a collection of things, especially works of art; more broadly, their vibe
This is not "Internet slang", in my opinion. I would just remove the label. By the way, may I encourage you, and everyone else who posts on these discussion pages, to quote the definitions that they are referring to, rather than relying on numbers. The numbers can change at any time as senses are added, deleted or reordered. Mihia (talk) 00:01, 31 October 2019 (UTC)[reply]
Good question really. This is language I strongly associate with Internet users (crazy Redditors who make plunderphonic videos out of 1990s multimedia CDs) but I'm almost 40 so why would I know how real-world kids actually talk? lol. Even if something originated online, if it spreads beyond, then we are only talking about etymology. There are some things we will probably never say aloud (like BTW) but the lines are blurry. In the 1990s the Internet was a separate "nerdy" space that had its own language. Now it's just another way to talk (David Crystal called it the "fourth medium", after, IIRC, speech, writing and signing). Equinox 05:01, 2 November 2019 (UTC)[reply]
Even BTW is something I've heard my peers use in regular speech. I often hear LOL (/lɔl/) and AF, LMAO, and FYI (as initialisms), among others, and I've definitely heard aesthetic in the sense described above. So the distinction between regular slang and Internet slang is fast fading, and probably not overly useful, except perhaps as chiefly Internet or chiefly texting slang. Andrew Sheedy (talk) 17:13, 2 November 2019 (UTC)[reply]
My perception is that this use of "aesthetic" is mainstream now, however it started. Maybe it is a bit jargony, but I do not perceive it as "slang" of any sort. For example, Google News Search for "this aesthetic" throws up a bunch of relevant hits from a range of publications, including this lot from The Times, occurring in what appear to be regular prose articles. Perhaps the label could say "originally Internet slang" if that is the case. I think that would be of interest; certainly I wasn't aware of this. Mihia (talk) 23:34, 4 November 2019 (UTC)[reply]