- #46 seems to detect a few internal links inside image descriptions: there were 5 articles on frwiki in the list with that problem. --NicoV (Talk on frwiki) 16:31, 30 August 2013 (UTC)[reply]
- Can I see the examples? Bgwhite (talk) 17:13, 30 August 2013 (UTC)[reply]
- Forgot to put the link... The 5 articles done in the list: fr:Billet de banque (: au premier plan le [[5000 francs Flameng]]), fr:Lyricon (à vent [[Computone]]), fr:Moteur avec cylindres en ligne (en ligne de la [[Honda CBX|Honda CBX 1000]]), fr:Multi One Design (Environnement (MOD 70)|Veolia Environnement]]), and fr:Pollution sonore ([[circulation automobile]]). --NicoV (Talk on frwiki) 17:20, 30 August 2013 (UTC)[reply]
- Also in itwiki. Both when the "true" link is at the bottom it:Ampelio eremita and when it isn't it:Ascari del Cielo. Thanks! --AlessioMela (talk) 21:08, 4 November 2013 (UTC)[reply]
- Both cases follow the pattern. There is an image tag at the very beginning. There is a bracket error in the articles, but checkwiki shows the error in the image tag. I know why it is happening in the code, but I haven't found a way around it. Bgwhite (talk) 21:26, 4 November 2013 (UTC)[reply]
It seems to be happening again on frwiki (fr:Antihéros, fr:Insulte, ...) but I don't find anything wrong in the articles, even somewhere else. --NicoV (Talk on frwiki) 09:21, 5 April 2014 (UTC)[reply]
- Hmmm, this shouldn't be happening. Looks like it is counting ]]] as having two ]] possibilities. Will look into it. Bgwhite (talk) 20:25, 21 September 2013 (UTC)[reply]
- Matěj Suchánek, interesting case. The problem going on is: [[metr za sekundu|[m/s]]] was followed by a statement with a broken bracket, [ran/[[minuta|min]]. If it wasn't followed by a broken bracket, checkwiki would not say this was an error. Normally I'd say this is a rare case and checkwiki correctly said there is an error on the page, thus this is a real low priority. However, the problem in the code is similar to the problem in the code for #46 error report above this report. So, a solution in one probably fixes the other error. Problem is, I've yet to figure out the #46 error after many hours. Bgwhite (talk) 08:52, 29 October 2013 (UTC)[reply]
NicoV Magioladitis
After looking at some of the articles in a list of #39 errors not fixed by a bot, I've noticed some "false positives". I use quotation marks because the problem is actually caused by errors in MediaWiki.
Newlines don't function in <blockquote> , {{quote}}, {{cquote}} and {{quotation}}. I have the checkwiki code skip these for error #39. After looking at the new list of articles, <ref> , [[Image: and {{bq}} also don't work.
<skip several hours>
I have the bug bookmarked and brought it up. Lo and behold, the patch that was submitted in December 2011 was finally accepted. Final changes were made today on enwiki. Turns out Visual Editor was assuming newlines worked the same everywhere... silly VE. So, VE started the move to finally fix the problem. Hey, who knew, VE was actually helpful for the first time ever. According to the log, it only took 8 1/2 years to fix.
I've verified that {{quote}}, {{cquote}} and {{quotation}}, <blockquote> and {{bq}} now treat newlines correctly.
I've verified that <ref> and [[Image: still barf on newlines.
I need to add the ref and various image tags to #39's code and remove the currently skipped templates in #39's code. Bgwhite (talk) 05:22, 16 October 2013 (UTC)[reply]
- Bgwhite this means that now AWB can replace p tags inside blockquote with newlines? -- Magioladitis (talk) 06:12, 16 October 2013 (UTC)[reply]
- Magioladitis. I'm confused. It doesn't work on Aristotle#Geology, but it does work below.
a
b
c
- Bgwhite (talk) 06:42, 16 October 2013 (UTC)[reply]
- Asked a question at User talk:Kaldari#bug 6200 and quote templates. Bgwhite (talk) 06:55, 16 October 2013 (UTC)[reply]
- Comment was made at bugzilla bug 6200 about the problem. Also bug 55674 for newlines in ref tags. Bgwhite (talk) 09:05, 16 October 2013 (UTC)[reply]
- 6200 marked as fixed. -- Magioladitis (talk) 14:45, 28 October 2013 (UTC)[reply]
Hi, I know that you're always looking for more work since it's so easy to use Labs ;-)
I'd like to suggest adding some statistics for Check Wiki to give us some information on how errors evolve on each wiki.
Would it be possible to add a table with the following information?
- One line for each error
- Several columns for each day of a month: number of articles detected for the error after the daily scan, number of articles marked as fixed for the error during the day, and possibly the number of articles marked as fixed during the day that were then detected again
--NicoV (Talk on frwiki) 10:21, 6 November 2013 (UTC)[reply]
- Great idea. Though I am not sure if Bgwhite has enough time to implement it. --Meno25 (talk) 16:25, 9 November 2013 (UTC)[reply]
- I know, it's just wishful thinking, no emergency and no problem if it's not implemented. --NicoV (Talk on frwiki) 12:08, 14 November 2013 (UTC)[reply]
Please include pages in namespace "ملحق" (NS:104) on Arabic Wikipedia (arwiki) in the lists generated by Checkwiki script. This namespace contains lists and years pages. Pages in that namespace are counted in the number of articles (magic word: {{NUMBEROFARTICLES}}) and AWB's Auto-Tagger already tags articles in that namespace. --Meno25 (talk) 12:11, 23 November 2013 (UTC)[reply]
- Meno25, I'm going to wait on this for a bit. I've held off on 104, commonswiki and the File namespace. I'm using code optimized to grab only the Article namespace from the dump file. Changing that will cause a severe decrease in speed. I'll have to do some other changing around to insert the code, but everything else is set up for it. For example, there are if statements that say only the Article and 104 namespaces can check certain errors. Bgwhite (talk) 08:29, 24 November 2013 (UTC)[reply]
- @Bgwhite: Thank you for the explanation. Take your time. We are not in a hurry. --Meno25 (talk) 12:21, 24 November 2013 (UTC)[reply]
Time to start thinking about what new errors should be added to Checkwiki.
Ping: Magioladitis, NicoV, Meno25, Crazy1880, LindsayH, GoingBatty, Matěj Suchánek, Josve05a, ChrisGualtieri, Graham87. I think that is everybody. If not, add them to the list.
What should or should not be added will be determined by several factors:
- How easy is it to code up?
- Is it something that AWB or WPCleaner can already find?
- Is it something that AWB, WPCleaner or a bot can currently fix?
- Is it an accessibility issue?
- Is it a serious issue? Are the errors on the high or medium lists?
- High priority: error corrupts or distorts the content posted in Article
- Medium priority: improving the encyclopedic content or readability of the article
- Low priority: improving maintenance or MOS fixes
Some examples:
- Replacing <strike> with <s> . It would take a copy/paste to code up. WPCleaner finds and fixes the problem. It would be Low priority.
- Finding cases of url=http://http:// This is a common error I see. It would be High priority. It is fixed by AWB.
- Blank lines in bulleted vertical lists. This is an accessibility issue per Wikipedia:Accessibility#Blocked elements. This causes problems for screen readers.
- No blank space after the comma in DEFAULTSORT values. An example would be: Bush,George. The article would be sorted first for all surnames beginning with Bush. Currently not fixed by AWB or WPCleaner. Probably medium priority.
Bgwhite (talk) 01:34, 26 November 2013 (UTC)[reply]
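As a rough illustration of how cheap two of the examples above would be to detect, here is a minimal Python sketch. It is not the checkwiki code (which is Perl); the pattern names and helper functions are hypothetical and only cover the simple cases quoted in the comment (url=http://http:// and DEFAULTSORT values such as Bush,George).

```python
import re

# Hypothetical patterns for illustration only; not taken from checkwiki.
DOUBLE_HTTP = re.compile(r"url\s*=\s*https?://https?://", re.IGNORECASE)
DEFAULTSORT = re.compile(r"\{\{\s*DEFAULTSORT\s*:([^{}]*)\}\}")


def find_doubled_http(text):
    """Return every 'url=http://http://' style fragment found in the wikitext."""
    return [m.group(0) for m in DOUBLE_HTTP.finditer(text)]


def defaultsort_missing_space(text):
    """Return DEFAULTSORT values that have a comma directly followed by a non-space."""
    hits = []
    for m in DEFAULTSORT.finditer(text):
        value = m.group(1)
        if re.search(r",\S", value):
            hits.append(value)
    return hits


sample = (
    "{{cite web|url=http://http://example.org|title=X}}\n"
    "{{DEFAULTSORT:Bush,George}}\n"
)
print(find_doubled_http(sample))
print(defaultsort_missing_space(sample))
```

A real check would also need to decide how to treat templates nested inside the DEFAULTSORT value, which this sketch simply skips.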
- How about putting the TOC in the standard position in the wiki-markup, which is also an accessibility issue? Not sure about automating this though. Graham87 01:39, 26 November 2013 (UTC)[reply]
- Seems that there are lots of new citation style errors, some of which appear in red text in the references section. Those might be something worth exploring. GoingBatty (talk) 02:08, 26 November 2013 (UTC)[reply]
A few suggestions:
- An error to detect non-existent files (red linked files). We have a bot on Arabic Wikipedia that removes such links. However, the bot works on all pages of the Main namespace. Generating a list of pages for the bot to work on would be a good idea. See Wikipedia:Database reports/Articles containing deleted files.
- Detecting user signatures in articles (articles containing links to user pages). To be worked on manually. See Wikipedia:Database reports/Articles containing links to the user space.
- Detecting fat redirects (redirects obscuring page content). To be checked manually. See Wikipedia:Database reports/Redirects obscuring page content. --Meno25 (talk) 06:43, 26 November 2013 (UTC)[reply]
The errors I suggested are covered by the Database reports on English Wikipedia. Database reports are updated regularly only on enwiki, Commons and Meta. Moving the errors to checkwiki means that the reports would get generated for other wikis too. So, maybe disable those errors for enwiki and enable them for other wikis. --Meno25 (talk) 06:43, 26 November 2013 (UTC)[reply]
- Meno25 you can request similar databases for other wikis. -- Magioladitis (talk) 09:19, 26 November 2013 (UTC)[reply]
CHECKWIKI is more about common syntax errors. We need to focus on that. If lists are already generated by other bots/projects we do not need to duplicate the job. Bgwhite's idea of unspaced DEFAULTSORT is a great example of what we are after. WPC's extended list is another good example. I have some minor suggestions:
I don't know if these are errors or maybe already monitored, but:
- {{cite web}} without access dates.
- When only <ref>http://exemple.com/</ref> is used without a title/description. This is to prevent link rot.
- When two (or more) refs with the same information have different ref names.
- When the time (e.g. 08:45 or 8 am) or the day (e.g. Monday or Saturday) is used inside |accessdate= .
(t) Josve05a (c) 11:52, 26 November 2013 (UTC)[reply]
Hi, I think new errors should be generic enough to work on most wikis, so avoid very specific errors (for example: {{cite web}} without access dates should be dealt by the template itself: put the page in a maintenance category if access dates are missing). Otherwise, some of WPCleaner errors in the #5xx numbers:
- #502: Useless "Template" in {{Template:...}} (low)
- #508: Non-existent templates (medium ?)
- #511: Internal link written as an external link (medium ?)
- #512: Interwiki link written as an external link (low ?)
- #513: Internal link inside an external link (medium ?)
- #517: <strike>...</strike>
- #519: <a>...</a>
- I like some of the previous proposals: missing space after a comma in a DEFAULTSORT, doubled http, blank lines in bulleted lists, non-existent files, ...
Some of them are probably hard to develop or require access to a lot more information, so they will be difficult to add (non-existent templates / files, ...)
--NicoV (Talk on frwiki) 12:57, 26 November 2013 (UTC)[reply]
- NicoV I agree with you. My first suggestion is not good either. I think the most suitable new additions are the WPCleaner errors in the #5xx numbers. For non-existent files etc. I disagree that we should do something about them. There are databases for those already. -- Magioladitis (talk) 13:39, 26 November 2013 (UTC)[reply]
- NicoV: #508 is already listed, see Special:WantedTemplates, the files are in Category:Pages with missing files.
- I have once suggested a link to a year which has another description ([[2012|2013]], medium or high).
- Some inspiration: de:Benutzer:Stefan Kühn/Check Wikipedia#Next features. Matt S. (talk | cont. | cs) 15:16, 26 November 2013 (UTC)[reply]
A few more:
- More than one blue link per * on a disambig-page. (Per WP:MOSDAB)
- Refs and reflist on a disambig-page. (per WP:MOSDAB)
- When an article does not have "nbsp" between a number and its unit (e.g. 15 km, 2,5 miles and 3 cm)
-(t) Josve05a (c) 16:12, 26 November 2013 (UTC)[reply]
- Moin, like a free space in a category as medium. Example: right: "[[category:xyz]]" and wrong "[[categorie: xyz]]". Stefan Kühn had had a code for the persondata-script. Regards --Crazy1880 (talk) 09:09, 29 November 2013 (UTC)[reply]
- Crazy1880, Error 22 should be picking those up. Bgwhite (talk) 02:20, 3 December 2013 (UTC)[reply]
- Bgwhite, oh, yes it did. On the German Wikipedia the question came up whether ID 69 will check for "ISBN:", because the linked site only uses ISBN. (example: ISBN: 978-3-7657-2781-8 > ISBN 978-3-7657-2781-8) Regards --Crazy1880 (talk) 20:19, 4 January 2014 (UTC)[reply]
- Crazy1880, #69 checks for ISBN: and ISBN- Bgwhite (talk) 22:21, 4 January 2014 (UTC)[reply]
Round 2
Ping: Magioladitis, NicoV, Meno25, Crazy1880, LindsayH, GoingBatty, Matěj Suchánek, Josve05a, ChrisGualtieri, Graham87.
Following is a list of errors that I think could be added. Some notes:
- The English database reports that Meno25 mentioned are not being ported to other languages unless somebody is willing to take on the task. Very few have been ported. So, if a report meets the "standard", I see no reason not to add it to checkwiki.
- Most citation style errors would be a pain in the butt to code, too many articles that take too long to correct and are not really syntax errors. The one exception that I can think of is missing "url=" when the web address is given.
- NicoV and Magioladitis, could you add WPCleaner or AWB to the appropriate errors and columns?
- Any errors not in the list that you think should be added? Any other comments?
| Description | Priority | Coding | Tools to detect | Tools to fix | Other |
| Useless "Template" in {{Template:...}} | low | Done | WPC, AWB | WPC, AWB | #1 (#502) |
| Internal link written as an external link | medium | Done | WPC | WPC & Frescobot | #90 (#511) |
| Interwiki link written as an external link | low | Done | WPC | WPC | #91 (#512) |
| Internal link inside an external link | medium |  | WPC (#513) | WPC |  |
| <strike>...</strike> | low | Done | WPC, AWB | WPC, AWB* | #42 (#517). Obsolete in HTML5. Use <s>...</s> instead |
| <a>...</a> | low | Done | WPC | WPC | #4 (#519) |
| URL without http:// | high | Done | WPC, AWB | WPC, AWB | #62 |
| Finding cases of url=http://http:// | medium | Done | WPC, AWB | WPC, AWB | #93 |
| Blank lines in bulleted vertical lists | medium |  |  |  | Accessibility issue per Wikipedia:Accessibility#Blocked elements |
| Putting the TOC in the standard position | medium | Done | WPC |  | #96 and #97. Accessibility issue per MOS Elements of the lead |
| No blank space after the comma in DEFAULTSORT | low | Done | WPC, AWB | WPC, AWB | #89 |
| Unbalanced ref tags | medium | Done | WPC, AWB | WPC, AWB | #94 |
| Detecting user signatures in articles | low | Done | WPC, AWB | WPC, AWB | #95 |
| Detecting fat redirects (redirects obscuring page content) | low |  |  |  |  |
| <span class="plainlinks"> in articles | low |  |  |  |  |
| Pipe in external link [http:/www.wikipedia.org|Wikipedia] | low |  |  |  |  |
| Link to a year which has another description ([[2012|2013]]) | low |  |  |  | This error is often caused by VE. |
| Cases of {{cite web|http://www.wikipedia.org| title= | medium |  |  |  |  |
| Move anchor in front title in heading |  |  |  |  |  |
| Detect non-existent files (red linked files) |  |  |  |  |  |
| Detect non-existent templates |  |  | WPC (#508) |  |  |
| Detect refs <ref name=> | low | easy |  |  | often detected as #56 |
| Category with double colon |  | easy | AWB |  |  |
| More same parameters in template | medium | medium |  |  |  |
- Good :-). I've added the information about what WPCleaner can currently detect and fix (automatic or bot, at least partially). For errors I've already coded with a #5xx number, feel free to use an error number following what CW currently manages or keep the #5xx number. For other errors, I don't see any problem for implementing them in WPCleaner, but it will probably have to wait 2 months, as I will be almost completely unavailable for several weeks. --NicoV (Talk on frwiki) 20:05, 3 December 2013 (UTC)[reply]
Errors added
Magioladitis, NicoV, Meno25, GoingBatty, Matěj Suchánek, Josve05a, ChrisGualtieri
- #01 - Template with the useless word "template"
- #04 - HTML text style element <a>
- #42 - HTML text style element <strike>
- #62 - URL containing no http://
- #89 - DEFAULTSORT with no space after the comma
- #90 - Internal link written as an external link
- #91 - Interwiki link written as an external link
- #93 - External link with double http://
- #94 - Reference tags with no correct match
- #95 - Editor's signature or link to user space
- #96 - TOC after first headline
- #97 - Material between TOC and first headline
Notes
- Only turned on for enwiki for right now. Will start to expand after NicoV's return.
- Just added #90 and #91. So, there will probably be some problems.
- For #90 and #91, it will only search for articles written as an external link. Talk pages or special pages will not be searched. History of Wikipedia has examples of why it is done this way.
- The description of #91 must be changed to:
The script found an external link that should be replaced with an interwiki link. An example would be on enwiki [http://fr.wikipedia.org/wiki/Larry Wall] should be written as [[:fr:Larry Wall]] so it says fr.wikipedia.org in the external link and not en.wikipedia.org. -(t) Josve05a (c) 21:07, 24 December 2013 (UTC)[reply]
- And #90 must be changed to en.wikipedia.org. -(t) Josve05a (c) 21:11, 24 December 2013 (UTC)[reply]
- Another thing is that it should not say [...]/wiki/Larry Wall]. It should say [...]/wiki/Larry_Wall Larry Wall].(t) Josve05a (c) 21:14, 24 December 2013 (UTC)[reply]
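To make the #90/#91 discussion concrete, here is a small Python sketch of the rewrite being described: an external link to a Wikipedia article becomes an internal link (same language) or an interwiki link (other language). It is only an illustration, not the checkwiki or WPCleaner implementation; the regex, the function name and the rule of skipping any title that contains a colon (a crude stand-in for "talk pages or special pages are not searched") are all assumptions.

```python
import re
from urllib.parse import unquote

# Hypothetical pattern: [http://LANG.wikipedia.org/wiki/TITLE optional label]
WIKI_EXTERNAL = re.compile(
    r"\[https?://([a-z\-]+)\.wikipedia\.org/wiki/([^\s\]]+)(?:\s+([^\]]*))?\]"
)


def externals_to_wikilinks(text, local_lang="en"):
    """Rewrite external links to Wikipedia articles as [[...]] links."""

    def repl(m):
        lang, raw_title, label = m.group(1), m.group(2), m.group(3)
        title = unquote(raw_title).replace("_", " ")
        if ":" in title:
            # Assumption: a colon means a namespace (Talk:, Special:, ...); leave as is.
            return m.group(0)
        prefix = "" if lang == local_lang else ":" + lang + ":"
        label = (label or "").strip()
        if label and label != title:
            return "[[" + prefix + title + "|" + label + "]]"
        return "[[" + prefix + title + "]]"

    return WIKI_EXTERNAL.sub(repl, text)


print(externals_to_wikilinks("[http://fr.wikipedia.org/wiki/Larry_Wall Larry Wall]"))
print(externals_to_wikilinks("[http://en.wikipedia.org/wiki/Perl the Perl article]"))
```

Note that the sketch expects the underscore form of the URL (Larry_Wall), in line with the remark above that the example should not contain a space inside the URL.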
Errors modified
- #22 - Finds more cases of a space in a category
- #19 - Finds headlines that start with one "=" anywhere in the article instead of only at the start of the article.
WPCleaner
Bgwhite I've updated WPCleaner (version 1.31) for the following errors for all wikis: #1 (previously #502), #4 (previously #519), #42 (previously #517), #90 (previously #511), #91 (previously #512). Still have to do: #62, #89, #93, #94. Old #62 and #89 have been disabled. --NicoV (Talk on frwiki) 21:51, 22 January 2014 (UTC)[reply]
- Thank you Nico. Do you want me to turn on the new errors for frwiki or wait? I'm sure Josve05a will have found a bug before I write this. :). New error #95 will be an editor's signature found in an article. Bgwhite (talk) 22:49, 22 January 2014 (UTC)[reply]
- Bgwhite, will 95 include UTC, CET, CEST etc.? Since this error might not only be used on enwp? (t) Josve05a (c) 22:55, 22 January 2014 (UTC)[reply]
- (BTW Bgwhite I'm 16 in 5...4...3...2...1...HAPPY BIRTHDAY TO ME!) (t) Josve05a (c) 23:00, 22 January 2014 (UTC)[reply]
- Hey, I already wished you a happy birthday, which you already complained about. Now you want another... pfffft. :) Time is irrelevant for #95 as I'm only looking for a signature. Bgwhite (talk) 23:09, 22 January 2014 (UTC)[reply]
- @Bgwhite Yes, I think you can turn the new errors on for frwiki, I'll check what has to be changed in the translation file. --NicoV (Talk on frwiki) 08:47, 23 January 2014 (UTC)[reply]
- Bgwhite, NicoV, (#91) WPCleaner changes [http://www.imdb.com/name/nm0403424/ Hurley on the [[Internet Movie Database]]] to [[:imdbname:0403424|Hurley on the]][[Internet Movie Database]]] . I see multiple issues with this: it removes the blank space, and it leaves 3 brackets at the end without WPCleaner reporting it. (Found on Colin Hurley.) (t) Josve05a (c) 10:52, 23 January 2014 (UTC)[reply]
- @Josve05a: It will be fixed in a next version. It's due to the incorrect syntax of having an internal link inside an external link. It can be reported by WPC if #513 is activated. --NicoV (Talk on frwiki) 19:54, 23 January 2014 (UTC)[reply]
- @Bgwhite: If possible, start please the new checks for cswiki, too. I will modify the configuration file. Within a week, you can also enable skwiki.
- @Josve05a: Happy birthday! You are now same aged as I am (for next 10 months). Matt S. (talk | cont. | cs) 18:28, 23 January 2014 (UTC)[reply]
NicoV and Matt S., in theory frwiki and cswiki should start seeing the new errors at the next 0z run.... if the database is up. Today's outage was caused by a disc getting full. Bgwhite (talk) 07:38, 24 January 2014 (UTC)[reply]
- Hi! It's strange: I've modified the translation file in frwiki 4 days ago, but the old description is still displayed in WMFLabs for #1, #4, #42, #62. No errors are found. --NicoV (Talk on frwiki) 12:15, 27 January 2014 (UTC)[reply]
- @Bgwhite For frwiki, I've changed the translation file a few days ago: descriptions for new error numbers (#93, ...) have been taken into account on WMF Labs, but not the modified descriptions for old error numbers that have been recycled (#1, #4, #42, #62, #89, #90, #91). Is it a problem to have kept the old descriptions as comments? --NicoV (Talk on frwiki) 09:12, 29 January 2014 (UTC)[reply]
- NicoV, hmmm, I didn't see the message above this one. Sorry for that. The translation file and every other program has been bombing lately, so that was probably why you didn't see it right away. Between database problems and mounting problems, I'm ready to go screaming into the night. The frwiki dump processing is still running. Which is very amazing that it hasn't died yet.
- Do you mean as comments in the translation file as you have done for the French one? I see no problems.
- For #96 and #97 I've thrown in a little regex in the English translation file to account for templates being used with a space and no space.
- For #95, I only have English "User" and "User talk". I'll get individual wiki's words in a bit. I'll get them thru the API. Bgwhite (talk) 09:31, 29 January 2014 (UTC)[reply]
- Bgwhite For example, on WMFLabs, #1 is still displayed as "Pas de texte en gras" (the old description, which is commented out in the translation file) when the translation file has been changed 6 days ago; whereas the translations for the new errors (#93 and so on) are correctly used even if they have been changed later (only 2 days ago). --NicoV (Talk on frwiki) 16:47, 29 January 2014 (UTC)[reply]
- NicoV, looking at the code, it grabs the first variable, commented or not. So, putting the commented lines second does the trick. Bgwhite (talk) 22:32, 29 January 2014 (UTC)[reply]
- Ok, thanks a lot! --NicoV (Talk on frwiki) 11:08, 30 January 2014 (UTC)[reply]
If a website is called "www.news.de" for example something like this is valid in the German Wikipedia:
<ref>www.news.de: [http://www.news.de/article Article].</ref>
<ref>www.news.de: ''[http://www.news.de/article Article]''.</ref>
This shouldn't be reported as an error. Would be nice to have this excluded somehow. Disabling the check would also disable the check for url= which would be a shame. Here is an idea for an extended regex (not tested).
/(?:<ref\b[^<>]*>|url\s*=)\s*www\w*\.(?![^<>[\]{|}]*\[\w*:?\/\/)/i
--TMg 17:10, 19 January 2014 (UTC)[reply]
- I had to drop checking for cases with |url= . There were infoboxes which required external links not to have http://. That should make the regex a little easier. I currently have:
- I'm not yet catching named refs, which you do. Bgwhite (talk) 06:10, 20 January 2014 (UTC)[reply]
- Unfortunately this will cause the same false positives. Here is my regex again without the url= option.
/<ref\b[^<>]*>\s*\[?www\w*\.(?![^<>[\]{|}]*\[\w*:?\/\/)/i
- --TMg 09:51, 20 January 2014 (UTC)[reply]
- Yes, I know it will cause the same false positives. I was only giving the reasons why for the current status of the regex, including dropping url. Bgwhite (talk) 22:17, 20 January 2014 (UTC)[reply]
- It does work, but it has a hitch. For example, it does find an error in Central Philippine University, Ciclosporin and Gravity Rush. However, it reports the error at the end of the article. The hitch happens with the entire regex or just /(<ref\b[^<>]*>\s*\[?www\.)/. I'm off to bed Bgwhite (talk) 09:10, 21 January 2014 (UTC)[reply]
- Not sure what you mean with "hitch". Maybe it's because I removed the brackets but you are relying on them? Let's re-add them:
/(<ref\b[^<>]*>\s*\[?www\w*\.)(?![^<>[\]{|}]*\[\w*:?\/\/)/i
This matches:
- But it does not match my two examples above. I'm happy. :-) --TMg 21:24, 21 January 2014 (UTC)[reply]
- The "hitch"... for some articles, the regex does not tell where the error is found. It just reports the last bracket in the article. See [1] and look at the notice column. Bgwhite (talk) 21:52, 21 January 2014 (UTC)[reply]
- I see. That's an upper/lowercase problem. The index() call is case-sensitive but gets $1 lowercased. --TMg 22:09, 21 January 2014 (UTC)[reply]
Current:
    my $test_text = $lc_text;
    if ( $test_text =~ /(<ref\b[^<>]*>\s*\[?www\.)/ ) {
        my $pos = index( $text, $1 );
        error_register( $error_code, substr( $text, $pos, 40 ) );
    }
Suggestion:
    if ( $text =~ /<ref\b[^<>]*>\s*\[?www\w*\.(?![^<>[\]{|}]*\[\w*:?\/\/)/i ) {
        my $pos = $-[0];
        error_register( $error_code, substr( $text, $pos, 40 ) );
    }
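The two points made in this thread (what TMg's pattern matches, and why the position lookup goes wrong) can be reproduced outside checkwiki. The following is a Python transliteration for illustration only, not the Perl code itself; the function names are made up. It relies on the fact that Perl's index() returns -1 when nothing is found and that substr() with a negative offset counts from the end of the string, which would explain why the notice shows the tail of the article; Python's str.find() also returns -1 in that situation.

```python
import re

# TMg's pattern, with the '[' inside the character class escaped for Python.
WWW_REF = re.compile(
    r"(<ref\b[^<>]*>\s*\[?www\w*\.)(?![^<>\[\]{|}]*\[\w*:?//)",
    re.IGNORECASE,
)


def position_via_lowercased_capture(text):
    """Mimic the 'Current' snippet: match a lowercased copy, then search the original."""
    m = re.search(r"(<ref\b[^<>]*>\s*\[?www\.)", text.lower())
    if m is None:
        return None
    # Case-sensitive lookup of a lowercased capture: -1 if the original differs in case.
    return text.find(m.group(1))


def position_via_match_offset(text):
    """Mimic the 'Suggestion' snippet: take the offset from the match itself."""
    m = WWW_REF.search(text)
    return None if m is None else m.start()


article = "Intro. <REF>www.example.com</REF> [[Last link]]"
print(position_via_lowercased_capture(article))
print(position_via_match_offset(article))

german_1 = "<ref>www.news.de: [http://www.news.de/article Article].</ref>"
german_2 = "<ref>www.news.de: ''[http://www.news.de/article Article]''.</ref>"
print(WWW_REF.search(german_1), WWW_REF.search(german_2))
```

Taking the offset from the match avoids keeping a second, lowercased copy of the text in step with the original.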
Hi all. In Demons (novel) the section headed "Characters" employs paragraphs within a bulleted list. This has been coded per the advice given here, but Yobot (and, I think, other AWB-based robots) persists in making "corrections": [2] [3] [4] [5] [6] [7] and so on. Aside from destroying the logical structure of the section, this is also contrary to accessibility guidelines.
I note that the detection of error #39 has already been modified to accept the use of <p> s within certain tags, such as <blockquote> . Can this tolerance be extended to include <p> s within lists?
(I was uncertain whether to raise this concern here, with Yobot, or with AWB. If I've chosen the wrong place, could you please let me know, and I'll try again.) In the meantime, thanks for your collective good work with checkwiki: fighting the good fight, and at scale! — Simon the Likable (talk) 13:49, 10 February 2014 (UTC)[reply]
- Simon the Likable hi. Thanks for starting the discussion. I was not aware of this problem. Bots tend to revisit a page unless something is changed. -- Magioladitis (talk) 13:54, 10 February 2014 (UTC)[reply]
- Simon the Likable can you please check if you like my version? -- Magioladitis (talk) 13:57, 10 February 2014 (UTC)[reply]
- This is both a Checkwiki and AWB issue, so having a discussion at either spot is just fine.
- @Graham87: As this is also an accessibility issue, Graham is the one to ask. Current version of Demons (novel) uses * and : to create paragraphs inside lists. This version uses * and standard html paragraph tag. Is the current version acceptable or should the older version be used? Bgwhite (talk) 18:24, 10 February 2014 (UTC)[reply]
- @Simon the Likable, Magioladitis, and Bgwhite: The older version is better, but even there, the gaps between the list items would need to be removed. In the newer version, the HTML lists finish at the end of each paragraph (as can be seen by checking the HTML source). It might be easier to use HTML rather than wiki-markup to create the lists. Graham87 01:13, 11 February 2014 (UTC)[reply]
- Sorry guys but on my laptop, both versions have the same visual result. I must be semi-blind or something. This happens to me after working on my laptop for several hours. Can someone explain me what are the visual differences? Thanks, Magioladitis (talk) 06:59, 11 February 2014 (UTC)[reply]
- Visually they are the same. On a screen reader, it breaks up the list. The first item on the list, the one with the <p> tags, appears as a one-item list to a screen reader when the : is used instead. Bgwhite (talk) 07:12, 11 February 2014 (UTC)[reply]
- Thanks Magioladitis. As Bgwhite has outlined, your solution is impeccable visually, but will not allow visually impaired readers good access using a screen reader. I have therefore reverted your change (reinstating the <p> s), but have also taken on board Graham87's point and removed the blank lines between list items. Thus, I think the current version covers both visual and accessibility requirements, and follows recommended coding practices in Help:List#Paragraphs_in_lists and now WP:LISTGAP.
- This leaves open my original issue: checkwiki and AWB both regard this recommended markup as an error. Can checks for error #39 be modified to accept the use of <p> s within lists? (Or perhaps there is some other solution?) — Simon the Likable (talk) 13:59, 11 February 2014 (UTC)[reply]
- Thanks Simon; sounds good here now! Graham87 14:03, 11 February 2014 (UTC)[reply]
- Hey guys. Any chance that this is a Mediawiki bug and we should report it? -- Magioladitis (talk) 14:06, 11 February 2014 (UTC)[reply]
- I looked at the source code for the latest version and Magioladitis' version. It does not appear to be a bug. In the latest version of Demons (novel), it is one long list made up of <li> tags. If a blank line happens, the list ends. In Magioladitis' version, it starts as a list. When the first : happens, the list is ended. The HTML tags to produce the layout for the : consist of <dl> and <dd> tags. The use of the dl and dd tags is standard HTML practice when text needs varying indentation. The source for this talk page is full of dl and dd tags. Bgwhite (talk) 06:23, 12 February 2014 (UTC)[reply]
- Yes, both Checkwiki and AWB should not call this an error. Finding a solution is another matter. My brain isn't coming up with an answer. For the time being, I've added the article to a whitelist, so Checkwiki will not find a <p> error in the article. Bgwhite (talk) 06:23, 12 February 2014 (UTC)[reply]
It would be very helpful if the check could recognize and ignore
- Example: de:Erich Burgener - {{Literatur | Autor=Bertrand Zimmermann | Titel=Erich Burgener | Verlag= Editions de la Thèle| Ort=Yverdon-les-Bains | Jahr=1987 | ISBN=2-8283-0024 | ISBNistFormalFalsch=J }}
- http://xxxxx/isbn/282830024
--Tsor (talk) 09:09, 2 March 2014 (UTC)[reply]
- Tsor, as usual, I'm confused. Why give a bad ISBN in the first place? I did a Google search and only two non-Wikipedia derived websites give this number and one of them is Wikipedia. Bgwhite (talk) 23:43, 2 March 2014 (UTC)[reply]
- Hello Bgwhite, this is just a (bad) example. Sometimes we find in a book an ISBN which is formally wrong. Some people use the template Vorlage:Literatur, where they can mark such invalid ISBNs with "ISBNistFormalFalsch=J". There is another template, Vorlage:Falsche ISBN, which can mark such invalid ISBNs: {{Falsche ISBN|3-123-45678-9}} leads to "ISBN 3-123-45678-9 (formal falsche ISBN)". This template is used very often: https://de.wikipedia.org/wiki/Spezial:Linkliste/Vorlage:Falsche_ISBN
- I will look for a better example for an invalid ISBN. --Tsor (talk) 10:10, 3 March 2014 (UTC)[reply]
- PS: An additional column in the error-list "marked as invalid" would help. --Tsor (talk) 10:18, 3 March 2014 (UTC)[reply]
- Tsor, I'm slow, but I still fail to see what is wrong. Wouldn't it be best to use a correct ISBN? A better example would help me understand. TMg, could you help me out?
- There are whitelists to which articles can be added so they won't be raised as an error again. Too many things can go wrong with a "marked as invalid" button... There is already a problem of vandalism by people clicking done when they have no intention of fixing errors. Bgwhite (talk) 3 March 2014 (UTC)
- Here are 349 examples. --Tsor (talk) 11:10, 3 March 2014 (UTC)[reply]
- I just looked at the first one in the list, de:Charles de Melun and I don't understand why the ISBN is qualified as bad: the checksum is correct. Is it normal to have "ISBNistFormalFalsch=J" with an ISBN that seems correct? Edit: idem for second example de:Bussard (Einheit). --NicoV (Talk on frwiki) 12:26, 3 March 2014 (UTC)[reply]
- Hmm, you are right, in de:Charles de Melun the ISBN is marked as bad but it is OK. Same for your second example. I will have a closer look. --Tsor (talk) 13:26, 3 March 2014 (UTC)[reply]
- Please repeat your calculation. The checksum digit is wrong: if the first 9 digits are correct, the checksum digit at the end should be a 1, so the ISBN should be 2902091311 and not 2902091312. --Cepheiden (talk) 19:15, 5 March 2014 (UTC)[reply]
- Well, you're just not looking at the version I was looking at. The page has been modified since my comment and the ISBN was changed completely: an ISBN-13 with a coherent checksum was replaced by an ISBN-10 with a non-coherent checksum. --NicoV (Talk on frwiki) 20:21, 5 March 2014 (UTC)[reply]
- I'm sorry, you are right i didn't notice the edit. --Cepheiden (talk) 17:48, 8 March 2014 (UTC)[reply]
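The check-digit arithmetic discussed above can be sketched in Python as follows. This is a minimal illustration of the standard ISBN-10 rule (weights 10 down to 2, modulo 11) and ISBN-13 rule (alternating weights 1 and 3, modulo 10); the function names are hypothetical and are not taken from Check Wiki or WPCleaner.

```python
def isbn10_check_digit(first9: str) -> str:
    """Return the ISBN-10 check character for the first nine digits."""
    # Weights run from 10 (first digit) down to 2 (ninth digit).
    total = sum(int(d) * w for d, w in zip(first9, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def isbn13_check_digit(first12: str) -> str:
    """Return the ISBN-13 check digit for the first twelve digits."""
    # Weights alternate 1, 3, 1, 3, ... starting with the first digit.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def has_valid_checksum(isbn: str) -> bool:
    """Check an ISBN-10 or ISBN-13 written with or without hyphens/spaces."""
    digits = isbn.replace("-", "").replace(" ", "").upper()
    if len(digits) == 10 and digits[:9].isdigit():
        return isbn10_check_digit(digits[:9]) == digits[9]
    if len(digits) == 13 and digits.isdigit():
        return isbn13_check_digit(digits[:12]) == digits[12]
    return False
```

For the nine digits 290209131 the weighted sum is 175, and 175 mod 11 = 10, so the check digit is 1, which agrees with the calculation given in the thread.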
- I also looked at others, and a lot seem to be in the same situation. There are also cases where the ISBN has indeed a wrong checksum, but the book can be found with the correct ISBN on the internet: de:Mare Imbrium and the corresponding book on google. I've spent quite some time on frwiki to fix ISBN reported by CW (still quite some work to do), but I've found very few situations where the ISBN with the incorrect checksum was confirmed as being the ISBN (it's usually fixed at some point). --NicoV (Talk on frwiki) 15:51, 3 March 2014 (UTC)[reply]
- Yes, there are cases of ISBN's with false checksum digits used as the original ISBN (printed in book and listed in databases of libraries etc.). If someone cites this book with this ISBN we mark them as "formally false" like some libraries do. So what's the point here? --Cepheiden (talk) 19:15, 5 March 2014 (UTC)[reply]
- My point was that I was surprised by the size of the list (349 pages), because as I said, I fixed a lot of ISBN on frwiki, and didn't find so many situations where the ISBN with the non-coherent checksum had to be kept. Given that the first hits in the search seemed to be mistakes, I was wondering if it was normal that you have so many pages with ISBN tagged as formally false. --NicoV (Talk on frwiki) 20:26, 5 March 2014 (UTC)[reply]
- This was more a reply to Bgwhite (like Tsor already did). --Cepheiden (talk) 17:48, 8 March 2014 (UTC)[reply]
Just an example for the second point: http://www.randomhouse.ca/catalog/display.pperl?isbn=9780676978223 found in de:28 Stories über Aids in Afrika. --Tsor (talk) 22:08, 3 March 2014 (UTC)[reply]
- It links to "Page not found", the correct link seems to be at http://www.randomhouse.ca/catalog/display.pperl?isbn=9780676978230 (different last 2 digits ISBN). --NicoV (Talk on frwiki) 22:29, 3 March 2014 (UTC)[reply]
Hi,
Don't worry, not a request for more work to do, just an announcement to make. I'm happy to announce WPCleaner v1.32, with the main addition being the ability to add/update/remove a warning about ISBN errors (#70, #71, #72, #73) on article talk page. This can work either on a given article (from the full analysis window), or on a big bunch of articles as a bot tool (members of Category:Pages with ISBN errors, articles listed in #70-73, articles with the warning on their talk page).
Some configuration is required before being able to use it on a wiki. I've configured it for frwiki, and used it this weekend :
With the addition of the automatic detection of ISBN errors in cite templates on frwiki, I hope that it will help reduce the number of ISBN errors.
If you wish to configure this for an other wiki, please check what WPC is doing on one article before trying the bot tool on large scale. --NicoV (Talk on frwiki) 21:28, 27 April 2014 (UTC)[reply]
- And also the possibility to create a list of all ISBN errors: for each invalid ISBN, it gives a list of articles containing it. This allows working on all the articles that contain the same invalid ISBN. I'm currently running WPCleaner to create it for enwiki, you can see an example at frwiki (showing a record of the same invalid ISBN used 297 times). This function requires a lot less configuration (todo templates, and preferably a category for pages with ISBN errors). --NicoV (Talk on frwiki) 20:55, 28 April 2014 (UTC)[reply]
- List generated... big... but bad rendering... I thought the {{ISBN}} template would create an ISBN, not messages... --NicoV (Talk on frwiki) 21:55, 28 April 2014 (UTC)[reply]
Given that I was just working on ISBN errors last night, I feel entitled to spout my two halers worth...
On the page "→ Homepage → enwiki → middle priority → ISBN with wrong length", I wish the table contained an additional indication if the error occurs multiple times in the article. Surely, if the script can find the error once in an article, it can also find the error more than once and tell us rather than hoarding such information for itself.
--LukasMatt (talk) 01:48, 29 April 2014 (UTC)[reply]
- Ok, will add it to the generated list. --NicoV (Talk on frwiki) 06:41, 29 April 2014 (UTC)[reply]
- List updated: list of all ISBN errors --NicoV (Talk on frwiki) 15:38, 29 April 2014 (UTC)[reply]
- I'll contact Bgwhite as you suggested. I looked at "list of all ISBN errors"; it's not exactly what I had in mind for my first request. Sometimes, in one article, a person will cite the same source 10 times and not use a "ref name". Thus, the same incorrectly formatted ISBN occurs 10 times in the article. I need something in "→ Homepage → enwiki → middle priority → ISBN with wrong length" that tells me "This bad ISBN occurs 10 times in the article".
- --LukasMatt (talk) 16:30, 29 April 2014 (UTC)[reply]
- Lists on Labs only show the first error in each article (no information if the same error is happening several times, or there are other errors), and it's probably not going to change. I would suggest to use a tool that will show how many times each error occurs. WPCleaner does this, AWB probably also.
- On frwiki, I configured WPCleaner to be able to put a message on article talk page listing all ISBN errors (see fr:Modèle:Avertissement ISBN). --NicoV (Talk on frwiki) 16:59, 29 April 2014 (UTC)[reply]
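As a rough illustration of the per-article count requested above, here is a Python sketch that counts how often each wrong-length ISBN occurs in a piece of wikitext. The regular expression and the function name are assumptions made for this example; it is not how Check Wiki, WPCleaner or AWB actually detect ISBNs.

```python
import re
from collections import Counter

# Hypothetical pattern: "ISBN" followed by digits, X, hyphens or spaces.
ISBN_RE = re.compile(r"ISBN[\s=:]*([0-9Xx][0-9Xx\- ]*[0-9Xx])")


def wrong_length_isbn_counts(wikitext: str) -> Counter:
    """Count occurrences of each ISBN whose length is neither 10 nor 13."""
    counts = Counter()
    for match in ISBN_RE.finditer(wikitext):
        digits = re.sub(r"[\- ]", "", match.group(1))
        if len(digits) not in (10, 13):
            counts[digits] += 1
    return counts
```

A source cited ten times without a "ref name" would then show up with a count of 10 for the same malformed ISBN.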
Thanks, NicoV. One more request, please. In "→ Homepage → enwiki → middle priority → ISBN with wrong length", instead of only showing 25 articles per page, can we have something like
- View (previous 50) (next 50) (20 | 50 | 100 | 250 | 500)
--LukasMatt (talk) 12:33, 29 April 2014 (UTC)[reply]
- This is more a request for Bgwhite probably, I'm only updating WPCleaner, not the scripts that work on WMF Labs (probably the same for the previous request, I can only add the count to the list WPCleaner generates). It's already possible manually by adding &limit=50 to the URL like https://tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi?project=frwiki&view=only&id=12&limit=50 --NicoV (Talk on frwiki) 13:43, 29 April 2014 (UTC)[reply]
- Yep, it works. Thanks. (Still, a simple mouse click would be nicer. I'll contact Bgwhite.)
--LukasMatt (talk) 16:30, 29 April 2014 (UTC)[reply]
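The manual workaround can be scripted. The following Python sketch builds the list URL with the query parameters that appear in the example URL above (project, view, id, limit); the helper name is hypothetical and no other parameters are assumed to exist.

```python
from urllib.parse import urlencode

BASE_URL = "https://tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi"


def list_url(project: str, error_id: int, limit: int) -> str:
    """Build a Check Wiki list URL with an explicit page size."""
    query = urlencode({
        "project": project,
        "view": "only",
        "id": error_id,
        "limit": limit,
    })
    return f"{BASE_URL}?{query}"
```

Calling list_url("frwiki", 12, 50) reproduces the URL quoted in the thread.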
"List of all ISBN errors" is not going to happen. That information isn't stored in the database by design.
As for "View (previous 50) (next 50)", that is a good idea. Will add it to the list of things to do. Bgwhite (talk) 16:48, 29 April 2014 (UTC)[reply]
@NicoV: I am very interested in this feature, thanks for it! Will be working on assimilating this with cswiki. Matěj Suchánek (talk | cont.) 15:06, 30 April 2014 (UTC)[reply]
- Happy to know that it's going to be used on an other wiki. Keep me posted! --NicoV (Talk on frwiki) 09:35, 2 May 2014 (UTC)[reply]
- @Matěj Suchánek: Any luck using it with cswiki? The page containing the list of ISBN errors can now be updated automatically by WPCleaner (see frwiki). --NicoV (Talk on frwiki) 13:27, 6 May 2014 (UTC)[reply]
- @NicoV: wikt:dočkej času, jako husa klasu... actually, I have already created the template and updated some configuration, so it only depends on when I start using this feature or when someone finds this feature since I didn't write anywhere about it. Matěj Suchánek (talk | cont.) 17:21, 7 May 2014 (UTC)[reply]
- Ok, no rush ;-) Luckily, the only thing that is done completely automatically is updating the warning (but not creating it) when you save a page where you fixed some ISBN errors, so nothing should happen before someone tries to use it. --NicoV (Talk on frwiki) 17:47, 7 May 2014 (UTC)[reply]
Done
Copied from the section "Showing ISBN errors to other editors"
Thanks, NicoV. One more request, please. In "→ Homepage → enwiki → middle priority → ISBN with wrong length", instead of only showing 25 articles per page, can we have something like
- View (previous 50) (next 50) (20 | 50 | 100 | 250 | 500)
--LukasMatt (talk) 12:33, 29 April 2014 (UTC)[reply]
- "List of all ISBN errors" is not going to happen. That information isn't stored in the database by design.
- As for "View (previous 50) (next 50)", that is a good idea. Will add it to the list of things to do. Bgwhite (talk) 16:48, 29 April 2014 (UTC)[reply]
- LukasMatt, Done Bgwhite (talk) 06:38, 18 May 2014 (UTC)[reply]
- I just noticed it. Sweet! Thanks. --LukasMatt (talk) 15:41, 21 May 2014 (UTC)[reply]
Bgwhite, would it be possible to do the same for the list of "done" articles ? Thanks --NicoV (Talk on frwiki) 09:43, 25 May 2014 (UTC)[reply]
- NicoV Done Bgwhite (talk) 07:37, 22 August 2014 (UTC)[reply]
Done
Moin Moin @Bgwhite:, since today there is a problem with "more" in every ID. If an article title has a special character, you can't open "more". If there is no special character, there is no problem. Tip: Is this a bug from #Homepage → enwiki? Regards --Crazy1880 (talk) 08:41, 10 May 2014 (UTC)[reply]
- Crazy1880, could you give me a link where you see it because I can't find it. It would not be related to the previous feature addition. Different parts of the code. Bgwhite (talk) 06:53, 11 May 2014 (UTC)[reply]
- Moin Moin Bgwhite, I checked some more around this problem. I normally use Opera but yesterday I used IE. This morning I used Opera and saw no problem. So I used IE 11, too, and there it is.
- Link one: //tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi?project=enwiki&view=detail&title=Ahmed Sékou Touré
- Link two: //tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi?project=enwiki&view=detail&title=Air Livonia
- It seems that the underlines at special characters link at "title" are the riddle solution. Regards --Crazy1880 (talk) 09:20, 11 May 2014 (UTC)[reply]
- Crazy1880, well that is strange. It works fine in Chrome and Firefox, but dies in IE. The edit and Article columns work fine in all browsers. I don't want to test the done column. I'll look at the code to see if it does anything different between the columns. Otherwise, I'll need to get an expert on IE. Bgwhite (talk) 05:12, 12 May 2014 (UTC)[reply]
- Crazy1880, with the help of Redrose64, the problem is now fixed. Bgwhite (talk) 05:58, 15 May 2014 (UTC)[reply]
- Moin, thank you Bgwhite and Redrose64. Regards --Crazy1880 (talk) 17:22, 15 May 2014 (UTC)[reply]
Moin Moin and sorry Bgwhite and Redrose64, but the problem is not done. Now I have the problem in every browser, that under "more" when there is a special character you couldn't click on "done" and set it as done.
- Link one: //tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi?project=enwiki&view=detail&title=Al-Qusayr,%20Syria
- Link two: //tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi?project=enwiki&view=detail&title=Air%20Command%20Tandem
And in the IE there is the problem, that I am not able to open "more" by articles with special character. Please check there again, thanks --Crazy1880 (talk) 05:43, 16 May 2014 (UTC)[reply]
- Crazy1880, it is the same exact problem, but in a different part of the code. I'll get to within the next hour. Bgwhite (talk) 05:48, 16 May 2014 (UTC)[reply]
- First part is fixed. Could you give me an example link for the second (IE) part. Bgwhite (talk) 06:03, 16 May 2014 (UTC)[reply]
- Moin Bgwhite, here is the link to the English CheckWikipedia; see the article "Ahmed Sékou Touré" or "Ajumako/Enyan/Essiam District". Regards --Crazy1880 (talk) 16:56, 16 May 2014 (UTC)[reply]
- Crazy1880, it does work for me with IE. I'm using IE 11 and I have a feeling you are using another version. What version are you using? Bgwhite (talk) 00:17, 17 May 2014 (UTC)[reply]
- Moin Bgwhite, true, I use multiple versions of Internet Explorer in my work, but primarily the FF and Opera. I now looked again to the problem and I found that in my version of IE now everything looks ok. Thanks. --Crazy1880 (talk) 14:40, 17 May 2014 (UTC)[reply]
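The thread does not say what the actual fix in the Check Wiki code was. As a general illustration of why titles with special characters are fragile in links, this Python sketch percent-encodes a title before placing it in the detail URL seen above; the helper name is hypothetical.

```python
from urllib.parse import quote

BASE_URL = "https://tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi"


def detail_url(project: str, title: str) -> str:
    """Build a detail-view URL with the article title percent-encoded as UTF-8."""
    # safe="" also encodes "/" so titles such as "A/B" stay inside one parameter.
    return f"{BASE_URL}?project={quote(project)}&view=detail&title={quote(title, safe='')}"
```

With this encoding, spaces become %20 and "é" becomes %C3%A9, so a browser does not have to guess how to encode the raw characters.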
Hi, what do you think of adding a detection for adjacent references, like <ref>...</ref><ref>...</ref><ref>...</ref> ? This error probably won't be of any interest for enwiki because reference numbers are put between square brackets [1][2][3]. But on frwiki reference numbers are displayed without any decoration so adjacent references may look like only one reference 123, so we're generally using a template {{,}} between references. --NicoV (Talk on frwiki) 22:14, 27 May 2014 (UTC)[reply]
- NicoV, could you get me some articles with the problem as test subjects. <maniacal laugh> Test Subjects </maniacal laugh> I take it I need to look for cases of: </ref><ref> and <ref name=ack /><ref ? I also saw your message above about adding to the done pages. Bgwhite (talk) 05:29, 28 May 2014 (UTC)[reply]
- Ok, will try to find some... The subject was brought on WPCleaner's talk page for this modification, but the page is fixed now. --NicoV (Talk on frwiki) 07:17, 28 May 2014 (UTC)[reply]
- Bgwhite, I checked a lot of articles but I haven't found an other example yet... --NicoV (Talk on frwiki) 12:16, 28 May 2014 (UTC)[reply]
- fr:Utilisateur:Zetud/Pb Ref should have a list. --NicoV
- Bgwhite, fr:Leetchi, with at least 2 problems in the introduction. --NicoV (Talk on frwiki) 07:34, 2 July 2014 (UTC)[reply]
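The two cases named above (a closing </ref> or a self-closing <ref name=... /> immediately followed by another <ref) can be sketched as a single regular expression in Python. This is only an illustration under that assumption; the pattern and function name are not taken from the Check Wiki source.

```python
import re

# Either a closing </ref> or a self-closing <ref ... />, directly followed by <ref.
ADJACENT_REFS = re.compile(r"(?:</ref\s*>|<ref\b[^<>]*/>)(?=<ref\b)", re.IGNORECASE)


def adjacent_ref_positions(wikitext: str) -> list:
    """Return the character offsets where one reference directly abuts the next."""
    return [m.end() for m in ADJACENT_REFS.finditer(wikitext)]
```

References separated by a template such as {{,}} are not reported, because the lookahead requires the next <ref to start immediately.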
Discussion in User_talk:Frietjes#Infoboxes_to_take_of revealed that most probably Error #31 needs expansion to cover more HTML table tags. -- Magioladitis (talk) 22:45, 31 May 2014 (UTC)[reply]
- @Frietjes and Magioladitis:. #31 only checks for the case of <table . There are legitimate cases where <td> can be used. Will first check the upcoming June dump file to see the lay of the land for tr and td tags. Bgwhite (talk) 06:47, 1 June 2014 (UTC)[reply]
Hello! I'd like to propose detecting a new error type: sometimes in-page interlanguage links are written as regular interlanguage links, i.e. without a starting colon. But they are obviously in-page links since they contain a pipe symbol. For example, this situation occurred on the page 男同性恋免疫缺乏症 of Chinese Wiki (I don't know of such examples in En.Wiki), which contained two such links: [[en:Kaposi's sarcoma|卡波西氏肉瘤]] and [[en:Pneumocystis pneumonia|卡氏肺囊虫肺炎]]. The link part after the pipe symbol is obviously useless for regular interwikis, so this situation is undoubtedly an error. --Emaus (talk) 14:35, 2 June 2014 (UTC)[reply]
- Emaus @Magioladitis:. In theory, error #31, interwiki before last heading, should catch these situations. Since interwiki use should be minimal now, renaming this error would be a good thing. Maybe "interlanguage link with incorrect syntax"? Bgwhite (talk) 20:12, 2 June 2014 (UTC)[reply]
- @Bgwhite and Emaus: AWB will react by moving the interwiki at the bottom unless the interwiki matches the project code. -- Magioladitis (talk) 08:05, 3 June 2014 (UTC)[reply]
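The proposed detection (an interlanguage prefix without a leading colon, combined with a pipe) could look roughly like this in Python. The set of language codes is a small hypothetical sample; a real check would need the full list of interwiki prefixes for the wiki in question.

```python
import re

# [[prefix:target|label]] where the link does NOT start with a colon.
PIPED_LINK = re.compile(r"\[\[([a-z-]+):([^\[\]|]+)\|([^\[\]]+)\]\]")

# Hypothetical sample of language codes, for illustration only.
SAMPLE_LANG_CODES = frozenset({"en", "fr", "de", "it", "zh", "cs"})


def piped_interlanguage_links(wikitext: str, lang_codes=SAMPLE_LANG_CODES) -> list:
    """Return (language, target, label) for interlanguage links that carry a label."""
    return [
        (m.group(1), m.group(2), m.group(3))
        for m in PIPED_LINK.finditer(wikitext)
        if m.group(1) in lang_codes
    ]
```

Links written as [[:en:Target|label]] are skipped, since the leading colon marks them as intentional in-page links.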
@Bgwhite and NicoV: [[[[foo]]]] is caught as #64 by CHECKWIKI but as #10 by WPCleaner. It is not fixed by AWB. -- Magioladitis (talk) 06:51, 18 June 2014 (UTC)[reply]
- Hi Magioladitis. What do you think we should do ? I don't see why it's detected as #64 (link equal to link text): do you mean #46 (Square brackets not correct begin)? WPCleaner should detect both #10 and #46. --NicoV (Talk on frwiki) 13:05, 20 June 2014 (UTC)[reply]
OK. I am getting rusty. Sorry again. This one show that AWB did not fix 64. but this is maybe due to the order of how stuff is done. Same here. -- Magioladitis (talk) 13:14, 20 June 2014 (UTC)[reply]
- Ok, I understand better, especially with the next modification. Maybe internal link is not correctly recognized by AWB due to the extra brackets? WPCleaner edit seems fine (#10, #46 and #64), except for the automatic comment ("null"...), I have to fix this one. NicoV (Talk on frwiki) 13:28, 20 June 2014 (UTC)[reply]
@Bgwhite: After the last dump I realised that the whitelist for #48 never works. Same for the #101 whitelist. -- Magioladitis (talk) 08:09, 18 June 2014 (UTC)[reply]
We should exclude anything inside timeline tags. -- Magioladitis (talk) 07:10, 19 June 2014 (UTC)[reply]
We should exclude search inside {{Not a typo}}. -- Magioladitis (talk) 07:49, 20 June 2014 (UTC)[reply]
Please update the arwiki Last scanned dump 2014-04-07 (80 days old). --Zaher talk 23:19, 26 June 2014 (UTC)[reply]
- Zaher, the good news is that the daily update is still running, so new errors in articles are being caught. Looking at the logs, it appears that a page is so badly borked that it causes the checkwiki program to die. This does happen every once in awhile. Last happened with svwiki around 8 months ago. I'll have to work on this on my home computer to find the article... it's not easy to find. I'll try and have the majority of a dump processed and up on the webpage by this weekend. Bgwhite (talk) 00:03, 27 June 2014 (UTC)[reply]
- @Magioladitis: Zaher. If you look at all of the languages, you would see that none of them are updating. WMFLabs' disk space for the dump files is full and they are currently not doing anything about it.
- Me reporting problem. Template:Bugzilla
- Others reporting the problem Template:Bugzilla
- Them saying it is known and will be fixed soon (July 11) Template:Bugzilla.
- Bgwhite (talk) 20:58, 4 August 2014 (UTC)[reply]
- Thanks for the clarification and for your efforts. --Zaher talk 17:44, 5 August 2014 (UTC)[reply]
I think WPCleaner catches the list found at el:Βικιπαίδεια:WikiProject_Check_Wikipedia/Μετάφραση while CHECKWIKI script does not. -- Magioladitis (talk) 16:48, 27 June 2014 (UTC)[reply]
- I think the problem is only with the last line. Now that I updated the code, I noticed that all errors shown are connected to the last line. -- Magioladitis (talk) 05:26, 8 July 2014 (UTC)[reply]
Hi! I can't find where are double small tags here. There are 90k entries so I thought it's something in a template but I haven't found anything. Thanks for your help! --AlessioMela (talk) 08:40, 1 July 2014 (UTC)[reply]
- AlessioMela, you are correct. I didn't see anything either. There is also something fishy with links as they goto the main page and not to the article. I will look into what is wrong. Bgwhite (talk) 22:45, 1 July 2014 (UTC)[reply]
People here might be interested in the thread Wikipedia:Village_pump_(technical)#Parsoid_Based_Linter.--Salix alba (talk): 02:38, 9 July 2014 (UTC)[reply]
- Now archived at Wikipedia:Village pump (technical)/Archive 128#Parsoid Based Linter. EdJohnston (talk) 23:08, 18 July 2014 (UTC)[reply]
- rev 10273 Double quotation marks covered (errors 6 and 37)
- rev 10296 A first try to expand MultipleHttp fixing inside url templates (error 93)
- rev 10301, rev 10302 Fix for lj and nj in sortkey (errors 6 and 37)
- rev 10319 moves punctuation in more cases. (error 61)
- rev 10334 move refs after question and exclamation mark (error 61)
- rev 10390 recognises more footnotes (error 61)
-- Magioladitis (talk) 20:47, 19 August 2014 (UTC)[reply]
Hi, it seems that false positives are detected when the closing ref tag is </ref > (with the space at the end). For Spaghettification, CheckWiki reports the error being at <ref> pour une corde du même type de 8 m . --NicoV (Talk on frwiki) 05:27, 10 July 2014 (UTC)[reply]
- NicoV I just fix them. -- Magioladitis (talk) 06:18, 10 July 2014 (UTC)[reply]
- I am very happy. I have forgotten this was a mistake some people do. I just fixed 17 pages in the English Wikipedia. -- Magioladitis (talk) 06:40, 10 July 2014 (UTC)[reply]
- This is done by design. Yea, it is minor, but fixable. Besides it makes Magioladitis happy. Bgwhite (talk) 07:40, 10 July 2014 (UTC)[reply]
I did not remember that, but AWB fixes the spacing inside the closing ref tag! -- Magioladitis (talk) 07:52, 10 July 2014 (UTC)[reply]
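A minimal Python sketch of the spacing fix discussed above: it finds closing ref tags with whitespace before the ">" and normalises them. The names are hypothetical and this is not AWB's or Check Wiki's actual implementation.

```python
import re

# A closing ref tag with at least one whitespace character before ">".
SPACED_CLOSE_REF = re.compile(r"</ref\s+>", re.IGNORECASE)


def has_spaced_close_ref(wikitext: str) -> bool:
    """Report whether any closing ref tag is written as '</ref >'."""
    return SPACED_CLOSE_REF.search(wikitext) is not None


def fix_close_ref_spacing(wikitext: str) -> str:
    """Rewrite '</ref >' as '</ref>' and leave everything else untouched."""
    return SPACED_CLOSE_REF.sub("</ref>", wikitext)
```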
Would you please activate the fa translation? I want to start translating this tool into Farsi but it doesn't have any page for Farsi. Yamaha5 (talk) 05:26, 11 July 2014 (UTC)[reply]
- Yamaha5, so you are the poor sucker that Ladsgroup rounded up. :)
- If you want to set up the Persian Checkwiki, you need to create a translation file. If you go here and click on any language, there will be a translation file towards the top. The Arabic, French, German, Swedish, Czech, Slovenian, Slovak, Greek and English translation files are the ones being actively updated, so it is best to use one of those as a template. Place it somewhere on fawiki and tell me the location. This way, fawiki is in control of what errors should be checked. For example, some errors are only applicable to Latin script.
- There are sections in the translation file for a whitelist (which articles create a false positive) and templates. Every wiki has its own names for templates.
- WPCleaner also uses the same file for its use. If you set up the translation file, WPCleaner can be used on fawiki. Towards the end of the file, errors #500 and above are WPCleaner only. Everything else is WPCleaner and CheckWiki. Bgwhite (talk) 05:49, 11 July 2014 (UTC)[reply]
- Thank you for your fast answer :)
- I made fa:ویکیپدیا:ویکیپروژه تصحیح ویکیپدیا/ترجمه and I will start translating. Yamaha5 (talk) 05:58, 11 July 2014 (UTC)[reply]
- Hi Yamaha5, I've added fawiki to WPCleaner if you're interested. WPCleaner configuration is available at fa:کاربر:NicoV/WikiCleanerConfiguration. --NicoV (Talk on frwiki) 21:15, 13 July 2014 (UTC)[reply]
- NicoV Thank you for your edit.Yamaha5 (talk) 22:31, 13 July 2014 (UTC)[reply]
Hi, with the latest full dump, there seems to be a lot of false positives for #87 (HTML entities without ;). Examples from the 25 first pages reported:
- Inside an URL: fr:2011 au Mali (&intr ), fr:Association malienne des droits de l'homme (&intr ), fr:California Love (&interval ), fr:Chaunac (homonymie) (&geocode )
- Inside the attribute "name" of a <ref>...</ref> tag: fr:Avahi (&geissmann2000 ), fr:Avahi du Sambirano (&geissmann2000 ), fr:Ayurveda (&Rhodes ), fr:Baryonyx (&Newsbury2004 ), fr:Biochar (&Lehmann2008 ), fr:Caraka Saṃhitā (&Rhodes ), fr:Carnotaurus (&chiarelli2009 )
- Inside a <timeline>...</timeline> tag: fr:Canton de Steenvoorde (id:Blancs&Nuls )
- Inside an image name: fr:Cathédrale de Bangor (&CentralTower )
- Text ending with a ; (but not an HTML entity): fr:48 Persei (&phis; )
- Uppercase/lowercase problem (entity names are supposed to be case sensitive): fr:Andrea Cagnetti - Akelo (&Gem , probably matching &ge ), fr:Aldo Cibic (&Partners , probably matching &part )
--NicoV (Talk on frwiki) 20:54, 21 July 2014 (UTC)[reply]
- NicoV We turned off #87 on enwiki because of the false positives. I'm not sure how to fix this. The hard part is there can be letters or numbers after an entity. Any ideas? Bgwhite (talk) 22:32, 21 July 2014 (UTC)[reply]
- Bgwhite Apart from the last 2, I think the only thing that could be done is filtering out the errors when they are found in special places (URL, attribute of a tag, timeline, image, ...). For the last one, I only see doing a case-sensitive comparison. And for the &phis;, I don't know... Not very helpful, sorry. --NicoV (Talk on frwiki) 06:00, 22 July 2014 (UTC)[reply]
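The case-sensitive comparison suggested above can be sketched in Python with the standard library's table of HTML 4 entity names. This sketch only reports a candidate when the whole name after "&" is exactly a known entity, so it deliberately does not handle the hard case mentioned in the thread (letters or digits glued onto an entity), and it does not filter URLs, tag attributes, timelines or image names. The function name is hypothetical.

```python
import re
from html.entities import name2codepoint

# "&name" that is not followed by more name characters or by ";".
CANDIDATE = re.compile(r"&([A-Za-z][A-Za-z0-9]*)(?![A-Za-z0-9;])")


def entities_missing_semicolon(text: str) -> list:
    """Return '&name' occurrences whose name is, case-sensitively, an HTML 4 entity."""
    return [
        m.group(0)
        for m in CANDIDATE.finditer(text)
        if m.group(1) in name2codepoint
    ]
```

Because the lookup is an exact, case-sensitive match, &Gem and &Partners are no longer confused with &ge and &part.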
Hi, on frwiki, fr:Fièvre hémorragique Ebola is detected with the following notice </ref>. | width = 225 | icd1 . The notice is related to text in the infobox, but I don't see any problem there: there's a opening ref tag before. --NicoV (Talk on frwiki) 16:36, 22 July 2014 (UTC)[reply]
- NicoV check now. I fixed some spacing. -- Magioladitis (talk) 19:01, 22 July 2014 (UTC)[reply]
- Bgwhite, Magioladitis, you both modified the article to remove carriage return inside the refs text, but I don't think that should trigger #94. --NicoV (Talk on frwiki) 19:10, 22 July 2014 (UTC)[reply]
- Magioladitis, NicoV, it isn't fixed. I was thinking a hidden character might be the problem, so I re-typed out the ref. But, that wasn't the problem. Bgwhite (talk) 20:25, 22 July 2014 (UTC)[reply]
Hi Bgwhite, fr:Fièvre hémorragique Ebola is popping up almost daily, and there's also a false positive with fr:Multiplicateur de tension, with the following notice <ref name="yuan">{{Harvnb|Yuan|2010|pp=1 , where I don't see any problem. --NicoV (Talk on frwiki) 09:36, 8 August 2014 (UTC)[reply]
- NicoV. It isn't a false positive, but checkwiki is showing the wrong location. Ref names should not contain < or >.
- In Fièvre hémorragique Ebola, the error was at: <ref name="10.1002/(SICI)1096-9071(199911)59:3<341::AID-JMV14"> . I removed the offending <. Now for the sad part. AWB did pick up the error and the correct spot. Crap.
- For Multiplicateur de tension, it is showing the correct spot, but it is the space before > that is issuing the error. </ref > should be </ref>. This was talked about a few months back. Bgwhite (talk) 06:14, 9 August 2014 (UTC)[reply]
- Ok, thanks, I will try to add this to WPCleaner. --NicoV (Talk on frwiki) 08:59, 9 August 2014 (UTC)[reply]
- Forgot to say that it's added in WPCleaner. --NicoV (Talk on frwiki) 08:01, 22 August 2014 (UTC)[reply]
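The rule stated above (ref names should not contain < or >) can be illustrated with a short Python sketch. It only looks at double-quoted name attributes, and the names used are hypothetical rather than taken from Check Wiki, AWB or WPCleaner.

```python
import re

# Opening ref tag with a double-quoted name attribute; the name may contain < or >.
REF_NAME = re.compile(r'<ref\s+name\s*=\s*"([^"]*)"', re.IGNORECASE)


def ref_names_with_angle_brackets(wikitext: str) -> list:
    """Return the quoted ref names that contain '<' or '>'."""
    return [
        name
        for name in REF_NAME.findall(wikitext)
        if "<" in name or ">" in name
    ]
```

Reading the name up to the closing quote, rather than up to the first ">", is what lets the sketch point at the offending name itself.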
Hi @Bgwhite:, I was wondering if we could enhance the integration between Check Wiki and tools like WPCleaner, by providing access to the direct analysis of an article in Check Wiki: I'd like to be able to send a request to Check Wiki script checkwiki_bots.cgi (with the following parameters: wiki, article title, article text) and receive an answer telling me which errors are still detected and where (character position ?). I don't know how much work that would be on your side, but that could be very helpful to users when WPCleaner doesn't detect the problem CW detected: we would know if CW thinks that the problem is still present and where, so I could tell the user where it is on their current version of the article. --NicoV (Talk on frwiki) 20:01, 10 August 2014 (UTC)[reply]
Hi, I was thinking about a new error for detecting empty titles, like the ones VE is creating on a regular basis (== <nowiki /> == ). --NicoV (Talk on frwiki) 18:10, 13 August 2014 (UTC)[reply]
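One possible shape for such a check, sketched in Python: find heading lines and report those whose content is empty once <nowiki /> markers and whitespace are removed. The patterns and the function name are assumptions for illustration only.

```python
import re

# A heading line: the same run of "=" on both sides, e.g. "== Title ==".
HEADING = re.compile(r"^(={1,6})(.*?)\1[ \t]*$", re.MULTILINE)
# Empty nowiki markers, self-closing or as an empty pair.
NOWIKI = re.compile(r"<nowiki\s*/>|<nowiki>\s*</nowiki>", re.IGNORECASE)


def empty_headings(wikitext: str) -> list:
    """Return heading lines that contain nothing but nowiki markers and whitespace."""
    found = []
    for match in HEADING.finditer(wikitext):
        if NOWIKI.sub("", match.group(2)).strip() == "":
            found.append(match.group(0))
    return found
```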
Hi, I just found out that there were several Check Wiki main pages:
--NicoV (Talk on frwiki) 08:13, 14 August 2014 (UTC)[reply]
Hi, fr:Élément meta is reported by #92 with the notice "=== L'attribut ===". It seems that it's because there are several titles in the form L'attribut <code>something</code> . I think contents of <code>...</code> should be kept for analyzing #92. --NicoV (Talk on frwiki) 10:36, 14 August 2014 (UTC)[reply]
- NicoV, I'm not sure how to get around this. I've got headings inside code tags. Not sure how to remove one without the other. Bgwhite (talk) 21:34, 20 August 2014 (UTC)[reply]
- Bgwhite Ok, seems difficult. Throwing idea: keep the text inside the code tags, but somehow encode it internally so that it doesn't looks like other things (base 64, ...). Not sure. If it's too difficult, forget about it, we'll end up using the white list. --NicoV (Talk on frwiki) 21:57, 20 August 2014 (UTC)[reply]
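The encoding idea floated above can be sketched in Python: replace each <code>...</code> span with an opaque token derived from its content, so two headings that differ only inside the code tags stay different without exposing any wiki markup. The CODE prefix and the function name are hypothetical, and this is not how Check Wiki currently works.

```python
import base64
import re

CODE_TAG = re.compile(r"<code>(.*?)</code>", re.IGNORECASE | re.DOTALL)


def protect_code(wikitext: str) -> str:
    """Replace the content of code tags with a markup-free base64 token."""
    def encode(match):
        raw = match.group(1).encode("utf-8")
        # URL-safe base64 without "=" padding, so the token cannot look like a heading.
        token = base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")
        return f"CODE{token}"
    return CODE_TAG.sub(encode, wikitext)
```

After this substitution, "=== L'attribut <code>name</code> ===" and "=== L'attribut <code>content</code> ===" compare as distinct headings.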
Hi, when clicking on "Done", the list is displayed again and at the beginning of the page, there's the name of the article that has been marked as done. If this name contains accented characters, they are badly displayed. For example, in the list for #96, I clicked on Done for Liste des députés de la treizième législature par circonscription, the page is displayed with "Liste des députés de la treizième législature par circonscription" just after the Check Wikipedia title. --NicoV (Talk on frwiki) 12:09, 19 August 2014 (UTC)[reply]
Done
Hi, a suggestion for a prettier notice for #25 errors: instead of displaying a <br> between the two titles, maybe put a real line break so that the two titles are one above an other. Just a suggestion to have a better display. --NicoV (Talk on frwiki) 22:00, 20 August 2014 (UTC)[reply]
- NicoV Done Bgwhite (talk) 07:35, 22 August 2014 (UTC)[reply]
Done
Moin Moin Bgwhite, this morning I tried to open Check Wikipedia and got the following message: Could not connect to database: Host '10.68.17.174' is blocked because of many connection errors; unblock with 'mysqladmin flush-hosts'. Could you have a look? Thanks --Crazy1880 (talk) 04:58, 21 August 2014 (UTC)[reply]
- Crazy1880, WMFLabs database went down about 1/2 hour ago. Nothing I can do on my end. Also, the dump directory has been down for almost two months, which is the reason for no updates. Bgwhite (talk) 05:04, 21 August 2014 (UTC)[reply]
- Bgwhite, yes, I heard about this and I saw the bugzilla alert from user Merlissimo, who uses it for the bot MerlBot. He has the same problems. Thanks and kind regards. --Crazy1880 (talk) 06:45, 21 August 2014 (UTC)[reply]
This edit [8] breaks the formatting, because (contrary to popular belief) a blank line is not always equivalent to <p>. Please fix your tools to operate only where you understand the effects of what you're doing and, ideally, stop "fixing" things that aren't broken in pursuit of some perfectionist ideal of what markup should look like. Thanks. EEng (talk) 00:53, 8 August 2014 (UTC)[reply]
- I think it would be great if at least as much attention was given to not breaking things as is given to fixing not-broken things. Could I please have a response on this? EEng (talk) 13:08, 22 August 2014 (UTC)[reply]
- EEng, This edit has been made manually by Sfan00 IMG, not automatically by any tool. --NicoV (Talk on frwiki) 13:12, 22 August 2014 (UTC)[reply]
- Then why does the edit summary say WPCleaner v1.33 - Fixed using WP:WCW, with a link to this very page? EEng (talk) 13:14, 22 August 2014 (UTC)[reply]
- Hi EEng. Sfan00 IMG was using WPCleaner as the tool for editing. WPCleaner detects the same things that WP:WCW, and shows to the user what it has detected: in this case, as enwiki WP:WCW is configured to detect use of <p>, WPCleaner highlighted the <p> in the text. Then, the user decided to remove it. At the end, WPCleaner knew that there was a <p> in the original version, and that <p> has been removed, so it suggested an automatic comment. --NicoV (Talk on frwiki) 13:20, 22 August 2014 (UTC)[reply]
- OK, we're making some progress. So please tell me: why does WCW highlight < p>? EEng (talk) 13:29, 22 August 2014 (UTC)[reply]
- Technically, because error #39 (HTML text style element <p>) is activated in WCW configuration file. --NicoV (Talk on frwiki) 14:35, 22 August 2014 (UTC)[reply]
- What purpose is served by activating it? Please answer in terms of how articles are improved by highlighting < p>, not in terms of the mechanisms of operation of these tools. EEng (talk) 15:33, 22 August 2014 (UTC)[reply]
- We've been thru this before. You do not like anything about Checkwiki. You've told us to fuck off. You've called us MOS Nazis. We show where in MOS, but you've used MOS is just a guideline/policy and IAR. The funny thing is, one of the reasons Phineas Gage is not a GA is because of your idiosyncratic formatting. The very thing we've been preaching is one of things holding back your GA nomination. Eleanor Elkins Widener is already on the whitelist and won't be checked for <p> again. Bgwhite (talk) 17:35, 22 August 2014 (UTC)[reply]