
Parsoid doesn't properly handle double-underscore magic words
Open, Needs Triage, Public

Description

For the double-underscore magic words __HIDDENCAT__, __NOINDEX__, and __INDEX__, Parsoid's output is missing the tracking category that corresponds to each magic word. This suggests that Parsoid is not invoking the legacy parser to handle these magic words, so the resulting ParserOutput is likely omitting the proper index policy values as well.

The __NOGALLERY__ and __NOTOC__ magic words are handled in the same place in the legacy parser, and might also be missing some necessary behavior flags.
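
To make the expected behavior concrete, here is a minimal Python sketch of what handling these switches should leave behind on the output: a tracking category per switch and, for the index switches, an index policy. Everything in it is hypothetical and for illustration only: `OutputSketch`, `SWITCH_EFFECTS`, and `apply_behavior_switches` are not Parsoid or core APIs, and the tracking category keys are assumptions about what the legacy parser uses. It also leaves out any per-page conditions the legacy parser may apply (for example, namespace checks).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical mapping: switch name -> (tracking category key, index policy).
# The category keys are assumed, not taken from this task.
SWITCH_EFFECTS: Dict[str, Tuple[str, Optional[str]]] = {
    "hiddencat": ("hidden-category-category", None),
    "noindex": ("noindex-category", "noindex"),
    "index": ("index-category", "index"),
}


@dataclass
class OutputSketch:
    """Stand-in for the parts of a ParserOutput this task is about."""
    tracking_categories: List[str] = field(default_factory=list)
    index_policy: str = ""


def apply_behavior_switches(switches: List[str], output: OutputSketch) -> OutputSketch:
    """Record the side effects of each recognized switch on the output."""
    for name in switches:
        effect = SWITCH_EFFECTS.get(name)
        if effect is None:
            # Switches such as "notoc" have no tracking category in this sketch.
            continue
        category, policy = effect
        if category not in output.tracking_categories:
            output.tracking_categories.append(category)
        if policy is not None:
            output.index_policy = policy
    return output


result = apply_behavior_switches(["noindex", "notoc"], OutputSketch())
print(result.tracking_categories, result.index_policy)
```

A check along these lines, comparing Parsoid's output against the legacy parser's for the same switches, would show whether the tracking category and index policy are both missing.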

Event Timeline

We have Env::{get,set}BehaviorSwitch in Parsoid. Interrogating the state of a number of these behavior switches in src/Wt2Html/DOM/Processors/WrapSectionsState potentially violates fragment independence as well.

We should probably be invoking some code in SiteConfig to handle the switches, to ensure that we don't drift from the core implementation, although we could probably handle it within Parsoid in src/Wt2Html/TT/BehaviorSwitchHandler.