Wrong section numbering if Parsoid is used and wikitext is invalid
Open, Needs Triage | Public | BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

What happens?:

  • The sections are numbered differently by the two renderers:
  • Parsoid rendering: section = 19
  • non-Parsoid rendering: section = 4
  • After clicking "edit source" in Parsoid rendering mode, the error message "Cannot find section" appears. With the non-Parsoid renderer, everything works.

What should have happened instead?:

  • The Parsoid renderer should also use the correct section numbers.

Note: This only happened with the [[Portland (Oregon)|Portland],] typo before https://en.wikivoyage.org/w/index.php?title=Emeryville&diff=4922006&oldid=4921380

Event Timeline

The severity should be set to "High" because editing articles may fail.

Aklapper raised the priority of this task from High to Needs Triage. (Aug 13 2024, 10:39 AM)

@RolandUnger Did cscott agree to work on this?

Aklapper renamed this task from "Wrong section numbering if Parsoid is used" to "Wrong section numbering if Parsoid is used and wikitext is invalid". (Aug 13 2024, 10:41 AM)
Aklapper updated the task description. (Show Details)

cscott is working on the Parsoid project, and I hope he can help. At the least, he should be a subscriber.

Aklapper removed cscott as the assignee of this task. (Edited, Aug 13 2024, 10:45 AM)
Aklapper added a subscriber: cscott.

Please don't assign folks without their consent as it's up to every individual what they (don't) plan to work on. Thanks!

Adding Content-Transform-Team to the set of tags (or, indeed, Parsoid and/or Parsoid-Read-Views) is sufficient to ensure our team sees it and triages it to someone who can work on it.

For context during triage: https://www.mediawiki.org/wiki/Parsing/Notes/Section_Wrapping is how parsoid section numbering should work. There are corner cases where parsoid will disagree with the legacy parser wrt section boundaries, but parsoid *should* use section-id=-1 for these cases.

I looked into that yesterday (and put my notes in T222419#10058673); I am not *convinced* that it requires the wikitext to be invalid for it to trigger. The PEG parser might backtrack on valid-wikitext-that's-just-ambiguous-enough, without the wikitext itself being entirely at fault. I think.

Just recording this for debugging purposes, since it is not reproducible on the current version of the page: if you 'view source' on https://en.wikivoyage.org/w/index.php?title=Emeryville&oldid=4921380 , you can see the bad section ids on the section wrapper tags.

This snippet below is sufficient to reproduce the id-assignment issue:

=S1=
[[Foo|Bar],]

=S2=
x

See Parsoid's output below -- S2 gets id 3 instead of 2.

$ php bin/parse.php --wrapSections < /tmp/wt
<section data-mw-section-id="0" data-parsoid="{}"></section><section data-mw-section-id="1" data-parsoid="{}"><h1 id="S1" data-parsoid='{"dsr":[0,4,1,1]}'>S1</h1>
<p data-parsoid='{"dsr":[5,17,0,0]}'>[[Foo|Bar],]</p>

</section><section data-mw-section-id="3" data-parsoid="{}"><h1 id="S2" data-parsoid='{"dsr":[19,23,1,1]}'>S2</h1>
<p data-parsoid='{"dsr":[24,25,0,0]}'>x</p>
</section>
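
For anyone who wants to check other pages for the same symptom, here is a rough Python sketch that scans section-wrapped HTML for skipped ids. It is not part of Parsoid; the helper names (section_ids, find_gaps) are made up for illustration, and it assumes that non-negative data-mw-section-id values should increase by exactly one in document order, while negative ids (such as the -1 mentioned above) are ignored. The sample HTML is a trimmed copy of the output above.

```python
import re

# Matches the data-mw-section-id attribute on <section> wrapper tags.
SECTION_ID_RE = re.compile(r'<section\b[^>]*\bdata-mw-section-id="(-?\d+)"')


def section_ids(html):
    """Return all data-mw-section-id values in document order."""
    return [int(value) for value in SECTION_ID_RE.findall(html)]


def find_gaps(ids):
    """Return (previous, current) pairs where non-negative ids skip a number.

    Assumption: negative ids mark sections that are not editable by
    number, so they are left out of the sequence check.
    """
    real = [i for i in ids if i >= 0]
    return [(a, b) for a, b in zip(real, real[1:]) if b != a + 1]


# Trimmed version of the Parsoid output shown in this task.
html = (
    '<section data-mw-section-id="0" data-parsoid="{}"></section>'
    '<section data-mw-section-id="1" data-parsoid="{}"><h1 id="S1">S1</h1></section>'
    '<section data-mw-section-id="3" data-parsoid="{}"><h1 id="S2">S2</h1></section>'
)

ids = section_ids(html)
gaps = find_gaps(ids)
print(ids)   # [0, 1, 3]
print(gaps)  # [(1, 3)]
```

A non-empty result from find_gaps means some section id was skipped, which is the condition that leads to the "Cannot find section" error when the edit link passes that id along.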