Public wiki
Description
Details
Status | Subtype | Assigned | Task | ||
---|---|---|---|---|---|
Resolved | Ladsgroup | T218155 Create Wikisource Hindi | |||
Resolved | • Bstorm | T219374 Prepare and check storage layer for hi.wikisource |
Event Timeline
Removing myself as the wiki isn't created yet. Assign to me once you've created the tables, so I can sanitize it before sending it to the WMCS Team
Adding Analytics as they are interested in knowing when this wiki finally gets created so they can sqoop data from it T227030: hi.wikisource added to labs replicas?
Mentioned in SAL (#wikimedia-operations) [2019-09-19T06:18:44Z] <marostegui> Sanitize hiwikisource on db1124:3313 and db2094:3313 T219374
I have sanitized hiwikisource on sanitarium hosts (db1124:3313 and db2094:3313) and I have checked that the users were sanitized correctly, and my user, which I created right after placing the triggers was created with the redacted fields already.
Now I am running a check private data on the sanitarium hosts as well as on labsdb1009 and labsdb1012 to be double sure.
This is ready for cloud-services-team - the views can be created on labsdb1009-labsdb1012.
The check data scripts have finished and everything is fine.
I have also changed the grants for labsdbuser role.
root@cumin1001:/home/marostegui/git/tendril/bin# for i in `seq 1009 1012`; do echo labsdb$i; mysql.py -hlabsdb$i -e " show grants for labsdbuser" | grep hiwikisource ;done labsdb1009 GRANT SELECT, SHOW VIEW ON `hiwikisource\\_p`.* TO 'labsdbuser' labsdb1010 GRANT SELECT, SHOW VIEW ON `hiwikisource\\_p`.* TO 'labsdbuser' labsdb1011 GRANT SELECT, SHOW VIEW ON `hiwikisource\\_p`.* TO 'labsdbuser' labsdb1012 GRANT SELECT, SHOW VIEW ON `hiwikisource\\_p`.* TO 'labsdbuser'
Getting an error when trying to create the views. Maybe similar to what we saw in T193187?
labsdb1009:~$ sudo /usr/local/sbin/maintain-views --database hiwikisource --debug 2019-09-19 13:10:02,117 DEBUG Removing 0 dbs as sensitive 2019-09-19 13:10:02,139 DEBUG SQL: GRANT SELECT, SHOW VIEW ON `hiwikisource\_p`.* TO 'labsdbuser'; Traceback (most recent call last): File "/usr/local/sbin/maintain-views", line 637, in <module> sys.exit(main()) File "/usr/local/sbin/maintain-views", line 614, in main ops.execute(fullviews, customviews) File "/usr/local/sbin/maintain-views", line 397, in execute self.db_p.replace("_", "\\_") File "/usr/local/sbin/maintain-views", line 54, in write_execute self.cursor.execute(query) File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 166, in execute result = self._query(query) File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 322, in _query conn.query(q) File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 852, in query self._affected_rows = self._read_query_result(unbuffered=unbuffered) File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1053, in _read_query_result result.read() File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1336, in read first_packet = self.connection._read_packet() File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1010, in _read_packet packet.check_error() File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 393, in check_error err.raise_mysql_exception(self._data) File "/usr/lib/python3/dist-packages/pymysql/err.py", line 107, in raise_mysql_exception raise errorclass(errno, errval) pymysql.err.OperationalError: (1044, "Access denied for user 'maintainviews'@'localhost' to database 'hiwikisource\\_p'")
Excellent, you can proceed on labsdb1010 and labsdb1011 too. I have created the DB there too.
DNS and all the other operations done too?
I still cannot access the database from the tools bastion:
marostegui@tools-sgebastion-07:~$ sql hiwikisource_p Could not find requested database marostegui@tools-sgebastion-07:~$ sql enwiki_p Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Welcome to the MariaDB monitor. Commands end with ; or \g. Your MariaDB connection id is 614321315 Server version: 10.1.39-MariaDB MariaDB Server Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. MariaDB [enwiki_p]>
OK, the replica DNS entries are setup for hiwikisource now.
I went through the full instructions for cloud services https://wikitech.wikimedia.org/wiki/Add_a_wiki#Cloud_Services
Creating the replica indexes failed with
/usr/local/sbin/maintain-replica-indexes --database hiwikisource --debug 2019-09-19 13:35:22,973 DEBUG SELECT table_schema FROM information_schema.tables WHERE table_name='archive' and table_schema like 'hiwikisource' and table_type='BASE TABLE' 2019-09-19 13:35:22,987 DEBUG SHOW INDEX from hiwikisource.archive 2019-09-19 13:35:22,989 INFO ALTER TABLE hiwikisource.archive ADD KEY user_timestamp (ar_user, ar_timestamp) Traceback (most recent call last): File "/usr/local/sbin/maintain-replica-indexes", line 159, in <module> main() File "/usr/local/sbin/maintain-replica-indexes", line 150, in main write_index(change_cursor, db_name, index, args.dryrun, args.debug) File "/usr/local/sbin/maintain-replica-indexes", line 93, in write_index cursor.execute(query) File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 166, in execute result = self._query(query) File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 322, in _query conn.query(q) File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 852, in query self._affected_rows = self._read_query_result(unbuffered=unbuffered) File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1053, in _read_query_result result.read() File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1336, in read first_packet = self.connection._read_packet() File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1010, in _read_packet packet.check_error() File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 393, in check_error err.raise_mysql_exception(self._data) File "/usr/lib/python3/dist-packages/pymysql/err.py", line 107, in raise_mysql_exception raise errorclass(errno, errval) pymysql.err.InternalError: (1072, "Key column 'ar_user' doesn't exist in table")
and inserting a new row in meta_p.wiki failed with
/usr/local/sbin/maintain-meta_p --databases hiwikisource /usr/lib/python2.7/dist-packages/pymysql/cursors.py:166: Warning: (1007, u"Can't create database 'meta_p'; database exists") result = self._query(query) /usr/lib/python2.7/dist-packages/pymysql/cursors.py:166: Warning: (1050, u"Table 'wiki' already exists") result = self._query(query) /usr/lib/python2.7/dist-packages/pymysql/cursors.py:166: Warning: (1050, u"Table 'properties_anon_whitelist' already exists") result = self._query(query) Traceback (most recent call last): File "/usr/local/sbin/maintain-meta_p", line 369, in <module> main() File "/usr/local/sbin/maintain-meta_p", line 255, in main canonical = parse_php_canonical(initialise_settings) File "/usr/local/sbin/maintain-meta_p", line 68, in parse_php_canonical for line in rawCanonicalServer.split('\n'): UnboundLocalError: local variable 'rawCanonicalServer' referenced before assignment
Change 538030 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] index-conf.yaml: Remove unused index
This might be related to T233135: Schema change for refactored actor and comment storage cc @Anomie , please check this patch: https://gerrit.wikimedia.org/r/#/c/538030/
Regarding:
/usr/local/sbin/maintain-meta_p --databases hiwikisource /usr/lib/python2.7/dist-packages/pymysql/cursors.py:166: Warning: (1007, u"Can't create database 'meta_p'; database exists") result = self._query(query) /usr/lib/python2.7/dist-packages/pymysql/cursors.py:166: Warning: (1050, u"Table 'wiki' already exists") result = self._query(query) /usr/lib/python2.7/dist-packages/pymysql/cursors.py:166: Warning: (1050, u"Table 'properties_anon_whitelist' already exists") result = self._query(query) Traceback (most recent call last): File "/usr/local/sbin/maintain-meta_p", line 369, in <module> main() File "/usr/local/sbin/maintain-meta_p", line 255, in main canonical = parse_php_canonical(initialise_settings) File "/usr/local/sbin/maintain-meta_p", line 68, in parse_php_canonical for line in rawCanonicalServer.split('\n'): UnboundLocalError: local variable 'rawCanonicalServer' referenced before assignment
I guess it is because we created the _p database manually? I don't know how the insertion into meta_p works, so I am just guessing. Maybe @Bstorm knows?
Change 538030 merged by Marostegui:
[operations/puppet@production] index-conf.yaml: Remove unused index
@JHedden after your +1, I have merged the change. Can you run puppet and try again to see if that index error is no more?
For what is worth, I can access the hosts from bastion with -- cluster analytics and --cluster web and I can select fine from the tables, so most of the work is done successfully! Just that small error pending to fix :-)
This is working now, maintain-replica-indexes has been ran on all the replica servers now.
I've completed all the Cloud Services tasks defined at https://wikitech.wikimedia.org/wiki/Add_a_wiki#Cloud_Services
Other than the question around meta_p, I think everything is done.
Yes, it would be related. New wiki creations will use the new schema.
It looks like that user_timestamp index there was created to parallel the [ar_]usertext_timestamp (ar_user_text,ar_timestamp) index that had been defined in MediaWiki. It should be removed from all the replicas, matching the removal of [ar_]usertext_timestamp. MediaWiki provides ar_actor_timestamp (ar_actor,ar_timestamp) as the new equivalent.
I note the linked patch didn't just remove the obsolete index, it seems it also added one on (ar_timestamp). I don't know if that was intentional.
That meta_p and maintain-indexes thing will make the wiki non-functional for some purposes to users. The meta_p bit will link in tooling and the maintain-indexes breakage will make some queries horribly slow (it's for the joins).
I'll take a look into that today.
To provide context, we did a lot to prevent breakage of the tables during that refactor, but I didn't even think to check the indexes. The meta_p thing looks like an error that wasn't tested since it was merged? That'll take more digging.
Thanks for providing the context Brooke!
I will remove the temporary index tomorrow and do a --replace for that specific table in that specific new wiki, which shouldn't block you anyways
Looks like https://gerrit.wikimedia.org/r/c/operations/puppet/+/538030 fixes it, but I don't know if we need that timestamp index.
I'm not seeing anything changing in the meta_p script. Wonder if it talks to a server that's moved/down/etc.
The problem is apparently a lot of settings being moved from InitializeSettings.php to VariantSettings.php. I'm going to make sure the function will correctly parse the new file and, if so, document a command line that will use that instead.
I'm also making a subtask to use the sitematrix api in the future for this instead.
Change 538075 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wiki replicas: Switch to using VariantSettings.php for now
Change 538075 merged by Bstorm:
[operations/puppet@production] wiki replicas: Switch to using VariantSettings.php for now
Ok, I've run maintain-meta_p on all 4 replicas. Docs don't need an update because what I changed is hardcoded anyway.
This should be good to go.
Change 538140 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] index-conf.yaml: Remove temporary index timestamp on archive
Change 538140 merged by Marostegui:
[operations/puppet@production] index-conf.yaml: Remove temporary index timestamp on archive
Mentioned in SAL (#wikimedia-operations) [2019-09-20T05:07:29Z] <marostegui> Remove temporary index on hiwikisource views T219374
After merging puppet I ran the following on the 4 labsdb hosts to remove the temporary index on the hiwikisource view for archive table
/usr/local/sbin/maintain-views --databases hiwikisource --replace-all --clean