
"RuntimeException: PCRE failure" displaying oldest revisions on eowiki
Open, Needs Triage | Public | PRODUCTION ERROR

Description

When I try to display some of the oldest revisions on Esperanto Wikipedia, I am getting an error message:

Fatal exception of type "RuntimeException"

This applies to most revisions of "Main Page", listed at https://eo.wikipedia.org/w/index.php?title=Main_Page&action=history – in particular revisions 4062 through 4097 (but there might be more with the same issue).

Is the issue possibly related to the page texts missing from the database? If that is the case, is there any chance of uploading them if I manage to recover them from early tarballs (dumps)?


Error
normalized_message
[{reqId}] {exception_url}   RuntimeException: PCRE failure
exception.trace
from /srv/mediawiki/php-1.42.0-wmf.5/includes/parser/Parser.php(2119)
#0 /srv/mediawiki/php-1.42.0-wmf.5/includes/parser/Parser.php(1574): Parser->handleExternalLinks(string)
#1 /srv/mediawiki/php-1.42.0-wmf.5/includes/parser/Parser.php(651): Parser->internalParse(string)
#2 /srv/mediawiki/php-1.42.0-wmf.5/includes/content/WikitextContentHandler.php(420): Parser->parse(string, MediaWiki\Title\Title, ParserOptions, boolean, boolean, integer)
#3 /srv/mediawiki/php-1.42.0-wmf.5/includes/content/ContentHandler.php(1759): WikitextContentHandler->fillParserOutput(WikitextContent, MediaWiki\Content\Renderer\ContentParseParams, ParserOutput)
#4 /srv/mediawiki/php-1.42.0-wmf.5/includes/content/Renderer/ContentRenderer.php(47): ContentHandler->getParserOutput(WikitextContent, MediaWiki\Content\Renderer\ContentParseParams)
#5 /srv/mediawiki/php-1.42.0-wmf.5/includes/Revision/RenderedRevision.php(260): MediaWiki\Content\Renderer\ContentRenderer->getParserOutput(WikitextContent, MediaWiki\Title\Title, integer, ParserOptions, boolean)
#6 /srv/mediawiki/php-1.42.0-wmf.5/includes/Revision/RenderedRevision.php(232): MediaWiki\Revision\RenderedRevision->getSlotParserOutputUncached(WikitextContent, boolean)
#7 /srv/mediawiki/php-1.42.0-wmf.5/includes/Revision/RevisionRenderer.php(226): MediaWiki\Revision\RenderedRevision->getSlotParserOutput(string, array)
#8 /srv/mediawiki/php-1.42.0-wmf.5/includes/Revision/RevisionRenderer.php(164): MediaWiki\Revision\RevisionRenderer->combineSlotOutput(MediaWiki\Revision\RenderedRevision, ParserOptions, array)
#9 [internal function]: MediaWiki\Revision\RevisionRenderer->MediaWiki\Revision\{closure}(MediaWiki\Revision\RenderedRevision, array)
#10 /srv/mediawiki/php-1.42.0-wmf.5/includes/Revision/RenderedRevision.php(199): call_user_func(Closure, MediaWiki\Revision\RenderedRevision, array)
#11 /srv/mediawiki/php-1.42.0-wmf.5/includes/poolcounter/PoolWorkArticleView.php(84): MediaWiki\Revision\RenderedRevision->getRevisionParserOutput()
#12 /srv/mediawiki/php-1.42.0-wmf.5/includes/poolcounter/PoolWorkArticleViewOld.php(66): PoolWorkArticleView->renderRevision()
#13 /srv/mediawiki/php-1.42.0-wmf.5/includes/poolcounter/PoolCounterWork.php(167): PoolWorkArticleViewOld->doWork()
#14 /srv/mediawiki/php-1.42.0-wmf.5/includes/page/ParserOutputAccess.php(307): PoolCounterWork->execute()
#15 /srv/mediawiki/php-1.42.0-wmf.5/includes/page/Article.php(756): MediaWiki\Page\ParserOutputAccess->getParserOutput(WikiPage, ParserOptions, MediaWiki\Revision\RevisionStoreRecord, integer)
#16 /srv/mediawiki/php-1.42.0-wmf.5/includes/page/Article.php(559): Article->generateContentOutput(MediaWiki\User\User, ParserOptions, integer, MediaWiki\Output\OutputPage, array)
#17 /srv/mediawiki/php-1.42.0-wmf.5/includes/actions/ViewAction.php(78): Article->view()
#18 /srv/mediawiki/php-1.42.0-wmf.5/includes/MediaWiki.php(583): ViewAction->show()
#19 /srv/mediawiki/php-1.42.0-wmf.5/includes/MediaWiki.php(363): MediaWiki->performAction(Article, MediaWiki\Title\Title)
#20 /srv/mediawiki/php-1.42.0-wmf.5/includes/MediaWiki.php(960): MediaWiki->performRequest()
#21 /srv/mediawiki/php-1.42.0-wmf.5/includes/MediaWiki.php(613): MediaWiki->main()
#22 /srv/mediawiki/php-1.42.0-wmf.5/index.php(50): MediaWiki->run()
#23 /srv/mediawiki/php-1.42.0-wmf.5/index.php(46): wfIndexMain()
#24 /srv/mediawiki/w/index.php(3): require(string)
#25 {main}

Details

Request URL
https://eo.wikipedia.org/w/index.php?oldid=4081&title=Main_Page

Event Timeline

Aklapper renamed this task from Fatal exception when displaying oldest revisions on eowiki to "RuntimeException: PCRE failure" displaying oldest revisions on eowiki. (Nov 25 2023, 9:36 PM)
Aklapper changed the subtype of this task from "Bug Report" to "Production Error".
Aklapper set Request URL to https://eo.wikipedia.org/w/index.php?oldid=4081&title=Main_Page.
Aklapper updated the task description.
Reedy subscribed.

> Is the issue possibly related to the page texts missing from the database?

Which missing texts?

If we look at https://eo.wikipedia.org/w/index.php?oldid=4081&title=Main_Page&action=edit there seems to be page text...

Bonvenon al la [[Vikipedio]] de [http://www.esperanto.net/ Esperanto], reta enciklopedio, kiu estas parto de la [[Internacia Vikipedio]]. Kiel [[vikio]], <em>vi mem</em> povas  [[Komencu Tie Ĉi|redakti kaj aldoni paĝojn]]. Same kiel [[Linukso]] kaj multe de la programoj de la [[Interreto]], la Vikipedio estas tute la kreaĵo de retanoj tra la mondo. 



Por redakti la Vikipedion, bonvole vidu '''[[Komencu Tie Ĉi]]'''.



Ankaŭ, vidu: [[Nomoj de titoloj]], [[Oftaj Demandoj]].



La [[Esperanta Vikipedio]] estis fondita je novembro 2001. Ĝi estas sub la [[GFDL]] (la Permesilo de GNU por Liberaj Dokumentoj). La GFDL estas simila al la [[GPL]] sed ĝi estas por dokumentoj, ne programoj.



----

[[About This Page]] (klarigo por angleparolantoj)



---- 



Unue, ni volas havi bazon por encikopedio kaj mi pensis ke la [[Enciklopedio Kalblanda]] estis tre bona modelo por ni.  Mi demandis al li, kaj li diris ke ni povas uzi liajn artikolojn, ĉu ni donas krediton al li ([[Stefano KALB]]).  Do, se vi volas, vi povas kopii liajn artikolojn el tie al ĉi tie.  Ni dankegas Stefano por ĉi tio!  --[[Chuck SMITH]]



Ni uzos la [[X-sistemo|X-sistemon]] kaj esperas ke iu el Vikipedio ŝanĝos ĉion al Unikodo, sed ĝis tiam, ni volas la x-sistemon, ĉar ĝi plibonas por redaktemi.



----

:'''Filozofio, Matematiko, kaj Naturscienco'''

:[[Astronomio kaj Astrofiziko]] - [[Biologio]] - [[Filozofio]] - [[Fiziko]] - [[Ĥemio]] - [[Matematiko]] - [[Statistiko]] - [[Tersciencoj]] 



:'''Sociasciencoj'''

:[[Anthropologio]] - [[Arkaelogio]] - [[Ekonomio]] - [[Geografio]] - [[Historio]] -  [[Historio de Scienco kaj Teknologio]] - [[Lingvistiko]] - [[Lingvo]] - [[Parapsikologio]] - [[Politikscienco]] - [[Psikologio]] - [[Sociologio]]



:'''Applied Artoj kaj Sciencoj'''

:[[Agrikulturo]] - [[Arkitekturo]] - [[Biblioteka kaj Informatika Scienco]] - [[Edukado]] - [[Familia kaj Konsumanta Scienco]] - [[Inĝenierarto]] - [[Juro]] - [[Komerco kaj Industrio]] - [[Komputilscienco]] - [[Komunikaĵo]] - [[Publikaferoj]] - [[Sansciencoj]] - [[Sciencistoj]] - [[Teknologio]] - [[Transporto]]



:'''Kulturo'''

:[[Belarto]] - [[Danzo]] - [[Distriĝo]] - [[Hobioj]] -  [[Kino]] - [[Klasikoj]] - [[Kriza Teorio]] - [[Kuireco]] - [[Literaturo]] - [[Ludoj]] - [[Musiko]] - [[Opero]] - [[Recreation]] - [[Religio]] - [[Skulptarto]] - [[Sportoj]] - [[Teatro]] - [[Turismo]] - [[Visual Arts and Design]]



----



:'''Vikipedioj en aliaj lingvoj'''

:[http://af.wikipedia.com/ Afrikansa (Afrikaans)] - [http://www.wikipedia.com/ Angla (English)] - [http://ar.wikipedia.com/ Araba (Araby)] -  - [http://zh.wikipedia.com/ Ĉina (Hanyu)] - [http://dk.wikipedia.com/ Dana (Dansk)] - [http://eo.wikipedia.com/ Esperanto] - [http://fr.wikipedia.com/ Franca (Fran�ais)] - [http://de.wikipedia.com/ Germana (Deutsch)] - [http://he.wikipedia.com/ Hebrea (Ivrit)] - [http://es.wikipedia.com/ Hispana (Castellano)] - [http://hu.wikipedia.com/ Hungara (Magyar)] - [http://it.wikipedia.com/ Itala (Italiano)] - [http://ja.wikipedia.com/ Japana (Nihongo)] - [http://ca.wikipedia.com/ Kataluna (Catalan)] - [http://nl.wikipedia.com/ Nederlanda (Nederlands)] - [http://no.wikipedia.com/ Norvega (Norsk)] - [http://pl.wikipedia.com/ Pola (Polska)] - [http://pt.wikipedia.com/ Portugala (Portugu�s)] - [http://ru.wikipedia.com/ Rusa (Russkiy)] - [http://sv.wikipedia.com/ Sveda (Svensk)]

I'm seeing some � characters that I don't think should be appearing...

Encoding issues wrt (some) old revisions?

Yeah, it really looks like it might be a Windows-1252 encoding issue. It's not super easy to fix, but not impossible either; we just need to collect all revisions that have this issue.
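To make the Windows-1252 hypothesis concrete, here is a rough Python sketch (not MediaWiki code; `classify_encoding` is a made-up helper name) of how one might triage a revision's raw bytes. Note that a successful cp1252 decode only shows the bytes are *consistent* with cp1252, it does not prove that was the original encoding.

```python
def classify_encoding(raw: bytes) -> str:
    """Return 'utf-8', 'cp1252', or 'unknown' for a revision's raw bytes.

    Hypothetical triage helper: valid UTF-8 wins; otherwise we check
    whether the bytes at least decode cleanly as Windows-1252.
    """
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("cp1252")
        return "cp1252"
    except UnicodeDecodeError:
        # Python's cp1252 codec rejects the unassigned bytes
        # 0x81, 0x8D, 0x8F, 0x90 and 0x9D.
        return "unknown"


# A lone 0xE7 ("ç" in cp1252) followed by ASCII is not valid UTF-8.
sample = b"Franca (Fran\xe7ais)"
print(classify_encoding(sample))   # cp1252
print(sample.decode("cp1252"))     # Franca (Français)
```

Running something like this over the affected revisions would give a quick count of how many are plausibly cp1252 versus damaged in some other way.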

It does feel like something we should be trying to harden (ie handle the error better) at least in MW too. AFAIK we get a few of this sort of thing now and again, and the error message isn't very helpful.

> It does feel like something we should be trying to harden (ie handle the error better) at least in MW too. AFAIK we get a few of this sort of thing now and again, and the error message isn't very helpful.

The problem is that they usually blow up somewhere higher in the stack, and people have been adding try/catch blocks, after which it simply blows up in another code path, turning this into a fun game of whack-a-mole. We can do a simple trick, though: during retrieval in SqlBlobStore, check for errors (open question: how, especially in a way that's fast?) and then return an empty string (or an error text) to the layers above, emitting a proper warning in the logs, of course.
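As an illustration only, the "check at retrieval, log, and return a safe value" idea could look roughly like this in Python. The function name `checked_blob` and the logger name are invented for the sketch; the real change would live in PHP inside SqlBlobStore, and whether a full decode is fast enough is exactly the open question above.

```python
import logging

logger = logging.getLogger("blobstore")  # hypothetical logger name


def checked_blob(address: str, raw: bytes) -> str:
    """Decode a fetched blob, or log a warning and return an empty string.

    Sketch of validating once at the storage layer so that callers
    higher in the stack never see invalid UTF-8.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first offending byte.
        logger.warning(
            "invalid UTF-8 in blob %s at byte offset %d", address, err.start
        )
        return ""


print(repr(checked_blob("tt:4081", b"Bonvenon")))      # 'Bonvenon'
print(repr(checked_blob("tt:4081", b"Fran\xe7ais")))   # ''
```

Returning an error text instead of the empty string would be a one-line change in the `except` branch.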

Another similar case on https://en.wikibooks.org/w/index.php?oldid=88955&title=UK_Constitution_and_Government/Judiciary which also fails in Parser->handleExternalLinks(). The revision is from January 2005

Then if I edit it with https://en.wikibooks.org/w/index.php?title=UK_Constitution_and_Government/Judiciary&action=edit&oldid=88955 and press preview, it renders properly...

Error
labels.normalized_message
[{reqId}] {exception_url}   RuntimeException: PCRE failure
error.stack_trace
from /srv/mediawiki/php-1.42.0-wmf.7/includes/parser/Parser.php(2122)
#0 /srv/mediawiki/php-1.42.0-wmf.7/includes/parser/Parser.php(1577): Parser->handleExternalLinks(string)
#1 /srv/mediawiki/php-1.42.0-wmf.7/includes/parser/Parser.php(654): Parser->internalParse(string)
#2 /srv/mediawiki/php-1.42.0-wmf.7/includes/content/WikitextContentHandler.php(420): Parser->parse(string, MediaWiki\Title\Title, ParserOptions, boolean, boolean, integer)
#3 /srv/mediawiki/php-1.42.0-wmf.7/includes/content/ContentHandler.php(1760): WikitextContentHandler->fillParserOutput(WikitextContent, MediaWiki\Content\Renderer\ContentParseParams, ParserOutput)
#4 /srv/mediawiki/php-1.42.0-wmf.7/includes/content/Renderer/ContentRenderer.php(47): ContentHandler->getParserOutput(WikitextContent, MediaWiki\Content\Renderer\ContentParseParams)
#5 /srv/mediawiki/php-1.42.0-wmf.7/includes/Revision/RenderedRevision.php(260): MediaWiki\Content\Renderer\ContentRenderer->getParserOutput(WikitextContent, MediaWiki\Title\Title, integer, ParserOptions, boolean)
#6 /srv/mediawiki/php-1.42.0-wmf.7/includes/Revision/RenderedRevision.php(232): MediaWiki\Revision\RenderedRevision->getSlotParserOutputUncached(WikitextContent, boolean)
#7 /srv/mediawiki/php-1.42.0-wmf.7/includes/Revision/RevisionRenderer.php(226): MediaWiki\Revision\RenderedRevision->getSlotParserOutput(string, array)
#8 /srv/mediawiki/php-1.42.0-wmf.7/includes/Revision/RevisionRenderer.php(164): MediaWiki\Revision\RevisionRenderer->combineSlotOutput(MediaWiki\Revision\RenderedRevision, ParserOptions, array)
#9 [internal function]: MediaWiki\Revision\RevisionRenderer->MediaWiki\Revision\{closure}(MediaWiki\Revision\RenderedRevision, array)
#10 /srv/mediawiki/php-1.42.0-wmf.7/includes/Revision/RenderedRevision.php(199): call_user_func(Closure, MediaWiki\Revision\RenderedRevision, array)
#11 /srv/mediawiki/php-1.42.0-wmf.7/includes/poolcounter/PoolWorkArticleView.php(84): MediaWiki\Revision\RenderedRevision->getRevisionParserOutput()
#12 /srv/mediawiki/php-1.42.0-wmf.7/includes/poolcounter/PoolWorkArticleViewOld.php(66): PoolWorkArticleView->renderRevision()
#13 /srv/mediawiki/php-1.42.0-wmf.7/includes/poolcounter/PoolCounterWork.php(157): PoolWorkArticleViewOld->doWork()
#14 /srv/mediawiki/php-1.42.0-wmf.7/includes/page/ParserOutputAccess.php(307): PoolCounterWork->execute()
#15 /srv/mediawiki/php-1.42.0-wmf.7/includes/page/Article.php(756): MediaWiki\Page\ParserOutputAccess->getParserOutput(WikiPage, ParserOptions, MediaWiki\Revision\RevisionStoreRecord, integer)
#16 /srv/mediawiki/php-1.42.0-wmf.7/includes/page/Article.php(559): Article->generateContentOutput(MediaWiki\User\User, ParserOptions, integer, MediaWiki\Output\OutputPage, array)
#17 /srv/mediawiki/php-1.42.0-wmf.7/includes/actions/ViewAction.php(78): Article->view()
#18 /srv/mediawiki/php-1.42.0-wmf.7/includes/MediaWiki.php(585): ViewAction->show()
#19 /srv/mediawiki/php-1.42.0-wmf.7/includes/MediaWiki.php(365): MediaWiki->performAction(Article, MediaWiki\Title\Title)
#20 /srv/mediawiki/php-1.42.0-wmf.7/includes/MediaWiki.php(962): MediaWiki->performRequest()
#21 /srv/mediawiki/php-1.42.0-wmf.7/includes/MediaWiki.php(615): MediaWiki->main()
#22 /srv/mediawiki/php-1.42.0-wmf.7/index.php(50): MediaWiki->run()
#23 /srv/mediawiki/php-1.42.0-wmf.7/index.php(46): wfIndexMain()
#24 /srv/mediawiki/w/index.php(3): require(string)
#25 {main}

This is generally invalid UTF-8; I wonder if we can just write a maintenance script to run preg_match('//u', $articleText) on all articles to pull this out.

We have a variety of protections that keep bad UTF-8 from being *written*, but /historically/ these didn't exist and/or were buggy, so the bad UTF-8 got into the database in Olden Times. There's no indication that this is a bug in our *current* code; we just need to write some jobs/run some tasks to clean up old articles in the database. Usually the cleanup is just replacing bad UTF-8 with the unicode replacement character.

In theory we could probably do that replacement when you pull the article from the database as well, but that leads to this "whack-a-mole" where different codepaths may or may not have the appropriate normalization applied.
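For reference, a small Python analogue of the two steps described above: finding revisions whose text is not valid UTF-8 (the role `preg_match('//u', $articleText)` would play in PHP) and replacing the bad bytes with U+FFFD. The helper names and the `(rev_id, bytes)` input shape are assumptions made for this sketch, not an existing maintenance script.

```python
from typing import Iterable, Iterator, Tuple


def find_invalid(revisions: Iterable[Tuple[int, bytes]]) -> Iterator[int]:
    """Yield the IDs of revisions whose raw text is not valid UTF-8."""
    for rev_id, raw in revisions:
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            yield rev_id


def scrub(raw: bytes) -> str:
    """Replace each invalid byte sequence with U+FFFD (the replacement character)."""
    return raw.decode("utf-8", errors="replace")


revs = [
    (4080, "Ĉina (Hanyu)".encode("utf-8")),  # valid UTF-8
    (4081, b"Franca (Fran\xe7ais)"),         # stray cp1252 byte
]
print(list(find_invalid(revs)))   # [4081]
print(scrub(revs[1][1]))          # Franca (Fran�ais)
```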

> This is generally invalid UTF-8; I wonder if we can just write a maintenance script to run preg_match('//u', $articleText) on all articles to pull this out.

All articles is one large corpus, all revisions for all articles is an even larger one!

It'll be roughly 60 TB of compressed text; uncompressed, probably half a PB. So yeah, we can't do that :( Maybe only revisions older than 2010?

Does the text blob have a flag indicating that it is not UTF-8? In that case it could be fixable in the context of T128150.

> Does the text blob have a flag indicating that it is not UTF-8? In that case it could be fixable in the context of T128150.

There are, but we migrated the content of every wiki that had legacy encoding enabled, and eowiki wasn't among them (I'm not even sure it's a cp1252 encoding issue). As the cherry on top, these flags are sometimes inaccurate.

First, we need to find out whether this is cp1252-encoded text being treated as UTF-8. Let me dig.

In this specific case, it's even already flagged as utf-8 despite not containing valid UTF-8:

wikiadmin2023@10.192.16.180(eowiki)> select * from slots where slot_revision_id = 4081;
+------------------+--------------+-----------------+-------------+
| slot_revision_id | slot_role_id | slot_content_id | slot_origin |
+------------------+--------------+-----------------+-------------+
|             4081 |            1 |            8400 |        4081 |
+------------------+--------------+-----------------+-------------+
1 row in set (0.003 sec)

wikiadmin2023@10.192.16.180(eowiki)> select * from content where content_id = 8400;
+------------+--------------+---------------------------------+---------------+-----------------+
| content_id | content_size | content_sha1                    | content_model | content_address |
+------------+--------------+---------------------------------+---------------+-----------------+
|       8400 |         3408 | mvubyjvdwez7jxxg56rx8hynw2pxymq |             1 | tt:4081         |
+------------+--------------+---------------------------------+---------------+-----------------+
1 row in set (0.002 sec)

wikiadmin2023@10.192.16.180(eowiki)> select * from text where old_id = 4081;
+--------+-----------------------------------------------------+----------------+
| old_id | old_text                                            | old_flags      |
+--------+-----------------------------------------------------+----------------+
|   4081 | DB://cluster5/5320/366668a2c170284c7732757f7f2fee62 | external,utf-8 |
+--------+-----------------------------------------------------+----------------+
1 row in set (0.002 sec)

It's not from recent runs either (it's on cluster5 which has been RO for years now)

Maybe I'm querying it incorrectly [1], but cluster5 in eowiki with blobs_id = 5320 is an object, not text (the hex value of it: P54038):

>>> b.decode('iso-8859-1')
'O:27:"ConcatenatedGzipHistoryBlob":4:{s:8:"mVersion";i:0;s:11:"mCompressed";b:1;s:6:"mItems";s:2876:"íÝÛnÜÆ\x19\...

It doesn't decode as UTF-8 or cp1252, only as latin-1 (i.e. iso-8859-1), but it's gzip-compressed and needs to be inflated first anyway.

[1] My query: sudo db-mysql es1029 eowiki -e "select hex(blob_text) from blobs_cluster5 where blob_id = 5320 limit 5". ExternalStoreDB::fetchFromURL() says I'm doing it correctly.
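A hedged sketch of the "inflate first" step in Python. It assumes the `mItems` payload is a raw DEFLATE stream (what PHP's `gzdeflate()` emits, with no zlib or gzip header), which is my reading of `ConcatenatedGzipHistoryBlob` and should be double-checked. The payload below is a stand-in built locally, not the real blob, and parsing the surrounding PHP-serialized object is left out.

```python
import zlib


def inflate_raw(data: bytes) -> bytes:
    """Inflate a raw DEFLATE stream (negative wbits means no header/trailer)."""
    return zlib.decompress(data, -15)


# Stand-in payload: compress some text as raw DEFLATE so the round trip
# can be shown end to end.
original = b"Bonvenon al la [[Vikipedio]] de Esperanto"
compressor = zlib.compressobj(6, zlib.DEFLATED, -15)
packed = compressor.compress(original) + compressor.flush()

text = inflate_raw(packed)
print(text == original)   # True
```

Only after this step does it make sense to ask whether the resulting bytes are UTF-8, cp1252, or something else.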

I guess we could have a new configuration flag $wgValidateUtf8ForOldRevisions and run the preg_match('//u', $articleText) every time we fetch an old revision from the database. I don't think we'd want to turn that on for most wikis, but we could certainly afford the small performance hit of enabling it for the Esperanto wiki.

There are so many regular expressions in core, that hardening all of them against bad UTF-8 seems like an exercise in frustration. Much better just to assert that all of our articles are good.

Change 981354 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/core@master] POC: fix UTF-8 in old revisions

https://gerrit.wikimedia.org/r/981354

The config variable focuses on the wrong aspect. If we end up seeing old revisions on enwiki with similar issues, we can't set that variable there, and someone who doesn't know the performance penalty of this variable might accidentally enable it on Wikidata, bringing down everything.

It should focus on the date of the edit and check for UTF-8 correctness only if the edit is older than 2010. That's harder to build but much safer.

Well, it could still be a config variable, just one that sets a target date. The revision date of the blob isn't readily available at this point however. And if you try to convert the content at a higher level in the stack, closer to where the revision metadata lives, you run the risk of the bad utf-8 getting out into PHP land via different access routes.
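To illustrate the target-date variant, here is a minimal Python sketch. `VALIDATE_BEFORE` and `sanitize` are hypothetical names, the 2010 cutoff is just the figure floated above, and the choice to validate when no timestamp is available is my own conservative assumption for the case where the blob layer cannot see the revision date.

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical config value: only revisions older than this get validated.
VALIDATE_BEFORE = datetime(2010, 1, 1, tzinfo=timezone.utc)


def sanitize(raw: bytes, rev_timestamp: Optional[datetime]) -> bytes:
    """Return raw unchanged for recent revisions; scrub older ones.

    If the timestamp is unknown (None), validate anyway, on the
    assumption that correctness matters more than the extra cost.
    """
    if rev_timestamp is not None and rev_timestamp >= VALIDATE_BEFORE:
        return raw  # trusted: skip the validation cost entirely
    # Invalid sequences become U+FFFD; valid input round-trips unchanged.
    return raw.decode("utf-8", errors="replace").encode("utf-8")


old = datetime(2002, 1, 15, tzinfo=timezone.utc)
new = datetime(2023, 11, 25, tzinfo=timezone.utc)
print(sanitize(b"Fran\xe7ais", old))   # b'Fran\xef\xbf\xbdais'
print(sanitize(b"Fran\xe7ais", new))   # b'Fran\xe7ais' (not checked)
```

The second call shows the trade-off: anything newer than the cutoff is passed through untouched, so the cutoff has to be chosen so that no bad data postdates it.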

Fresh stack trace in case that's helpful:

Error
labels.normalized_message
[{reqId}] {exception_url}   RuntimeException: PCRE failure
error.stack_trace
from /srv/mediawiki/php-1.43.0-wmf.4/includes/parser/Parser.php(2192)
#0 /srv/mediawiki/php-1.43.0-wmf.4/includes/parser/Parser.php(1647): MediaWiki\Parser\Parser->handleExternalLinks(string)
#1 /srv/mediawiki/php-1.43.0-wmf.4/includes/parser/Parser.php(725): MediaWiki\Parser\Parser->internalParse(string)
#2 /srv/mediawiki/php-1.43.0-wmf.4/includes/content/WikitextContentHandler.php(376): MediaWiki\Parser\Parser->parse(string, MediaWiki\Title\Title, ParserOptions, boolean, boolean, integer)
#3 /srv/mediawiki/php-1.43.0-wmf.4/includes/content/ContentHandler.php(1674): WikitextContentHandler->fillParserOutput(WikitextContent, MediaWiki\Content\Renderer\ContentParseParams, MediaWiki\Parser\ParserOutput)
#4 /srv/mediawiki/php-1.43.0-wmf.4/includes/content/Renderer/ContentRenderer.php(67): ContentHandler->getParserOutput(WikitextContent, MediaWiki\Content\Renderer\ContentParseParams)
#5 /srv/mediawiki/php-1.43.0-wmf.4/includes/Revision/RenderedRevision.php(260): MediaWiki\Content\Renderer\ContentRenderer->getParserOutput(WikitextContent, MediaWiki\Title\Title, MediaWiki\Revision\RevisionStoreRecord, ParserOptions, boolean)
#6 /srv/mediawiki/php-1.43.0-wmf.4/includes/Revision/RenderedRevision.php(232): MediaWiki\Revision\RenderedRevision->getSlotParserOutputUncached(WikitextContent, boolean)
#7 /srv/mediawiki/php-1.43.0-wmf.4/includes/Revision/RevisionRenderer.php(226): MediaWiki\Revision\RenderedRevision->getSlotParserOutput(string, array)
#8 /srv/mediawiki/php-1.43.0-wmf.4/includes/Revision/RevisionRenderer.php(164): MediaWiki\Revision\RevisionRenderer->combineSlotOutput(MediaWiki\Revision\RenderedRevision, ParserOptions, array)
#9 [internal function]: MediaWiki\Revision\RevisionRenderer->MediaWiki\Revision\{closure}(MediaWiki\Revision\RenderedRevision, array)
#10 /srv/mediawiki/php-1.43.0-wmf.4/includes/Revision/RenderedRevision.php(199): call_user_func(Closure, MediaWiki\Revision\RenderedRevision, array)
#11 /srv/mediawiki/php-1.43.0-wmf.4/includes/page/ParserOutputAccess.php(406): MediaWiki\Revision\RenderedRevision->getRevisionParserOutput()
#12 /srv/mediawiki/php-1.43.0-wmf.4/includes/page/ParserOutputAccess.php(357): MediaWiki\Page\ParserOutputAccess->renderRevision(WikiPage, ParserOptions, MediaWiki\Revision\RevisionStoreRecord, integer)
#13 /srv/mediawiki/php-1.43.0-wmf.4/includes/diff/DifferenceEngine.php(1245): MediaWiki\Page\ParserOutputAccess->getParserOutput(WikiPage, ParserOptions, MediaWiki\Revision\RevisionStoreRecord, integer)
#14 /srv/mediawiki/php-1.43.0-wmf.4/includes/diff/DifferenceEngine.php(1032): DifferenceEngine->renderNewRevision()
#15 /srv/mediawiki/php-1.43.0-wmf.4/includes/page/Article.php(1004): DifferenceEngine->showDiffPage(boolean)
#16 /srv/mediawiki/php-1.43.0-wmf.4/includes/page/Article.php(503): Article->showDiffPage()
#17 /srv/mediawiki/php-1.43.0-wmf.4/includes/actions/ViewAction.php(78): Article->view()
#18 /srv/mediawiki/php-1.43.0-wmf.4/includes/actions/ActionEntryPoint.php(731): ViewAction->show()
#19 /srv/mediawiki/php-1.43.0-wmf.4/includes/actions/ActionEntryPoint.php(508): MediaWiki\Actions\ActionEntryPoint->performAction(Article, MediaWiki\Title\Title)
#20 /srv/mediawiki/php-1.43.0-wmf.4/includes/actions/ActionEntryPoint.php(145): MediaWiki\Actions\ActionEntryPoint->performRequest()
#21 /srv/mediawiki/php-1.43.0-wmf.4/includes/MediaWikiEntryPoint.php(199): MediaWiki\Actions\ActionEntryPoint->execute()
#22 /srv/mediawiki/php-1.43.0-wmf.4/index.php(58): MediaWiki\MediaWikiEntryPoint->run()
#23 /srv/mediawiki/w/index.php(3): require(string)
#24 {main}

Still an issue in 1.43.0-wmf.12, updated trace:

labels.normalized_message
[{reqId}] {exception_url}   RuntimeException: PCRE failure
error.stack_trace
from /srv/mediawiki/php-1.43.0-wmf.12/includes/parser/Parser.php(2199)
#0 /srv/mediawiki/php-1.43.0-wmf.12/includes/parser/Parser.php(1656): MediaWiki\Parser\Parser->handleExternalLinks(string)
#1 /srv/mediawiki/php-1.43.0-wmf.12/includes/parser/Parser.php(728): MediaWiki\Parser\Parser->internalParse(string)
#2 /srv/mediawiki/php-1.43.0-wmf.12/includes/content/WikitextContentHandler.php(377): MediaWiki\Parser\Parser->parse(string, MediaWiki\Title\Title, ParserOptions, boolean, boolean, integer)
#3 /srv/mediawiki/php-1.43.0-wmf.12/includes/content/ContentHandler.php(1673): WikitextContentHandler->fillParserOutput(WikitextContent, MediaWiki\Content\Renderer\ContentParseParams, MediaWiki\Parser\ParserOutput)
#4 /srv/mediawiki/php-1.43.0-wmf.12/includes/content/Renderer/ContentRenderer.php(67): ContentHandler->getParserOutput(WikitextContent, MediaWiki\Content\Renderer\ContentParseParams)
#5 /srv/mediawiki/php-1.43.0-wmf.12/includes/Revision/RenderedRevision.php(260): MediaWiki\Content\Renderer\ContentRenderer->getParserOutput(WikitextContent, MediaWiki\Title\Title, MediaWiki\Revision\RevisionStoreRecord, ParserOptions, boolean)
#6 /srv/mediawiki/php-1.43.0-wmf.12/includes/Revision/RenderedRevision.php(232): MediaWiki\Revision\RenderedRevision->getSlotParserOutputUncached(WikitextContent, boolean)
#7 /srv/mediawiki/php-1.43.0-wmf.12/includes/Revision/RevisionRenderer.php(226): MediaWiki\Revision\RenderedRevision->getSlotParserOutput(string, array)
#8 /srv/mediawiki/php-1.43.0-wmf.12/includes/Revision/RevisionRenderer.php(164): MediaWiki\Revision\RevisionRenderer->combineSlotOutput(MediaWiki\Revision\RenderedRevision, ParserOptions, array)
#9 [internal function]: MediaWiki\Revision\RevisionRenderer->MediaWiki\Revision\{closure}(MediaWiki\Revision\RenderedRevision, array)
#10 /srv/mediawiki/php-1.43.0-wmf.12/includes/Revision/RenderedRevision.php(199): call_user_func(Closure, MediaWiki\Revision\RenderedRevision, array)
#11 /srv/mediawiki/php-1.43.0-wmf.12/includes/page/ParserOutputAccess.php(381): MediaWiki\Revision\RenderedRevision->getRevisionParserOutput()
#12 /srv/mediawiki/php-1.43.0-wmf.12/includes/page/ParserOutputAccess.php(332): MediaWiki\Page\ParserOutputAccess->renderRevision(WikiPage, ParserOptions, MediaWiki\Revision\RevisionStoreRecord, integer)
#13 /srv/mediawiki/php-1.43.0-wmf.12/includes/diff/DifferenceEngine.php(1246): MediaWiki\Page\ParserOutputAccess->getParserOutput(WikiPage, ParserOptions, MediaWiki\Revision\RevisionStoreRecord, integer)
#14 /srv/mediawiki/php-1.43.0-wmf.12/includes/diff/DifferenceEngine.php(1033): DifferenceEngine->renderNewRevision()
#15 /srv/mediawiki/php-1.43.0-wmf.12/includes/page/Article.php(1069): DifferenceEngine->showDiffPage(boolean)
#16 /srv/mediawiki/php-1.43.0-wmf.12/includes/page/Article.php(483): Article->showDiffPage()
#17 /srv/mediawiki/php-1.43.0-wmf.12/includes/actions/ViewAction.php(78): Article->view()
#18 /srv/mediawiki/php-1.43.0-wmf.12/includes/actions/ActionEntryPoint.php(731): ViewAction->show()
#19 /srv/mediawiki/php-1.43.0-wmf.12/includes/actions/ActionEntryPoint.php(508): MediaWiki\Actions\ActionEntryPoint->performAction(Article, MediaWiki\Title\Title)
#20 /srv/mediawiki/php-1.43.0-wmf.12/includes/actions/ActionEntryPoint.php(145): MediaWiki\Actions\ActionEntryPoint->performRequest()
#21 /srv/mediawiki/php-1.43.0-wmf.12/includes/MediaWikiEntryPoint.php(200): MediaWiki\Actions\ActionEntryPoint->execute()
#22 /srv/mediawiki/php-1.43.0-wmf.12/index.php(58): MediaWiki\MediaWikiEntryPoint->run()
#23 /srv/mediawiki/w/index.php(3): require(string)
#24 {main}

Updated the https://gerrit.wikimedia.org/r/c/mediawiki/core/+/981354 patch. If that is merged, we could turn on validation on enwikibooks and eowiki. Folks who add logstash traces: it would be helpful if you could also add the prefix of the wiki that shows the error.

In case it's helpful, here's a search (with too many filters) that will give you a list of pages that cause this: https://logstash.wikimedia.org/goto/246f0d53f76e1bba9a19d527448514dd