Track share button usage
Closed, Declined (Public)

Description

Let's instrument the share button (T181195) to verify users need this functionality.

We can share the following metrics:

  • clickthrough rate of the Share button
  • abandonment rate (defined as the percentage of share button clicks, i.e. openings of the share menu, that are not followed by the selection of a sharing app from that menu)
  • the number of pageviews from shared links (by adding a wprov URL parameter)
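The first two metrics can be derived from the logged events. The following is only a minimal sketch: it assumes each event is a dict with an `action` and a `pageToken` field (the action names are the ones used in the QA steps), and that rates are counted once per pageview. The function name and input shape are hypothetical.

```python
from collections import defaultdict


def share_button_rates(events):
    """Return (clickthrough_rate, abandonment_rate) for a list of event dicts.

    Assumption: rates are computed per pageview (pageToken), so repeated
    clicks on the same page count once. A rate is None when its
    denominator is zero.
    """
    actions_by_token = defaultdict(set)
    for event in events:
        actions_by_token[event["pageToken"]].add(event["action"])

    seen = actions_by_token.values()
    shown = sum("shownShareButton" in actions for actions in seen)
    clicked = sum("clickShareButton" in actions for actions in seen)
    shared = sum(
        "clickShareButton" in actions and "SharedToApp" in actions
        for actions in seen
    )

    clickthrough = clicked / shown if shown else None
    # Abandonment: share menu opened, but no app selected afterwards.
    abandonment = (clicked - shared) / clicked if clicked else None
    return clickthrough, abandonment
```

With four pageviews that showed the button, two that clicked it and one that completed a share, both rates come out as 0.5.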

Schema: https://meta.wikimedia.org/wiki/Schema:MobileWebShareButton

QA Steps

Verify EventLogging events

  • Visit the beta cluster on a phone that supports sharing (e.g. Android + Chrome)
  • Enable beta mode
  • Visit https://en.wikipedia.beta.wmflabs.org/wiki/Spain
    • Verify an event is sent on pageload with action="shownShareButton", pageTitle="Spain", namespaceId=0. Note the value of pageToken
  • Click share link
    • Verify an event is sent with action="clickShareButton" and the same values for pageTitle, namespaceId and pageToken as above.
  • Select an app from the sharing menu (e.g. the clipboard), to have the browser pass the link to that app
    • Verify an event is sent with action="SharedToApp" and the same values for pageTitle, namespaceId and pageToken as above.
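The checks above could also be scripted against events captured during a test session. This is a rough sketch, assuming the three events have been collected in order as dicts carrying the field names from the steps above; the helper name and the capture mechanism are hypothetical.

```python
EXPECTED_ACTIONS = ["shownShareButton", "clickShareButton", "SharedToApp"]
SHARED_FIELDS = ("pageTitle", "namespaceId", "pageToken")


def check_share_events(events):
    """Return a list of problems found in the captured events (empty if none)."""
    problems = []

    # The three actions must appear exactly once each, in this order.
    actions = [event.get("action") for event in events]
    if actions != EXPECTED_ACTIONS:
        problems.append(f"expected actions {EXPECTED_ACTIONS}, got {actions}")

    # Later events must repeat the values of the first (pageload) event.
    if events:
        first = events[0]
        for event in events[1:]:
            for field in SHARED_FIELDS:
                if event.get(field) != first.get(field):
                    problems.append(
                        f"{field} differs in {event.get('action')}: "
                        f"{event.get(field)!r} != {first.get(field)!r}"
                    )
    return problems
```

An empty result means the captured events match the steps; otherwise each entry names the field or action that deviated.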

Verify provenance parameter

Check that the link passed to the sharing app looks like this: https://en.wikipedia.beta.wmflabs.org/wiki/Spain?wprov=mfsw1
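A small sketch of that check, using only the Python standard library. The expected value `mfsw1` is taken from the example URL above; the function name is made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit

EXPECTED_WPROV = "mfsw1"  # value from the example URL in the QA step


def has_expected_wprov(url, expected=EXPECTED_WPROV):
    """True if the query string carries exactly one wprov with the expected value."""
    query = parse_qs(urlsplit(url).query)
    return query.get("wprov") == [expected]
```

Note that this deliberately rejects a URL that merely "contains wprov", such as one with a different value or with the parameter placed in the hash fragment.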

Event Timeline

@phuedx @Jdlrobson you might know the answer to the open question.

Tracking republication of Wikimedia content came up on analytics@ recently: https://lists.wikimedia.org/pipermail/analytics/2018-October/006460.html. Obviously, sharing content is different but the conversation seems relevant.

I think that a better way is to track it on the backend by using ?share_link, but I'm afraid there might be some performance impact (requests not hitting cache).
Are there any performance issues related to adding a parameter to the URL and tracking it on the backend?

Yes. As you've identified, the request would have to be served by the origin server, bypassing the edge cache(s).

This isn't a concern right now as the feature is being enabled for Mobile Beta™ users whose browsers support the Web Share API. By measuring both the number of shares and the number of clicks on those shares, you'll be able to estimate whether this will be a performance concern as your cohort of users grows (you promote the feature to 20% of all users, say).

Are we able to easily check the number of requests with an additional query parameter?

Yes. Since the share button code lives in Minerva, my recommendation would be to add a distinct BeforePageDisplay hook handler dedicated to checking for the existence of a parameter and then incrementing a counter:

public static function onBeforePageDisplay( OutputPage $out /* , ... */ ) {
  $request = $out->getRequest();

  if ( $request->getFuzzyBool( 'is_shared' ) ) {
    $statsd = MediaWikiServices::getInstance()->getStatsdDataFactory();

    // Note well that bucket names are prefixed by "MediaWiki." by default, so this
    // actually increments the "MediaWiki.minerva.shares.viewed" counter.
    $statsd->increment( 'minerva.shares.viewed' );
  }
}

Could the page view API be used for checking links back?

ovasileva triaged this task as Medium priority. Oct 19 2018, 6:26 PM

Could the page view API be used for checking links back?

Yes, if we knew which pages were being shared.

If we only count the number of pages being shared (which is the initial suggestion), then the most we can know from the Pageview API is that a page became a little more popular but not why.


Relatedly, we could query the wmf.webrequest table for requests with a specific query parameter rather than include the code that I wrote in T207280#4676470.

Pro: We wouldn't have to write any server-side code like in my comment above.
Con: We wouldn't be able to plot a graph in Grafana without any server-side code.

Both this solution and the solution in my comment above would be impossible if we were to use hash fragments, as they aren't sent to the server.
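To illustrate the difference, here is a sketch that counts request URIs carrying a given query parameter, roughly what a query over request logs would do. The sample URIs and the function name are invented for the example; this is not the actual wmf.webrequest query.

```python
from urllib.parse import parse_qs, urlsplit


def count_requests_with_param(uris, param):
    """Count request URIs whose query string contains the given parameter."""
    return sum(
        param in parse_qs(urlsplit(uri).query, keep_blank_values=True)
        for uri in uris
    )


# A hash fragment is parsed separately from the query string; browsers do
# not include it in the request, so the server never gets to log it.
parts = urlsplit("https://en.wikipedia.org/wiki/Spain#wprov=mfsw1")
print(parts.query)     # empty string
print(parts.fragment)  # wprov=mfsw1
```

Only the query-string variant leaves anything in server-side data to count.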

Assigning to Piotr to do the analysis and marking as low as this is currently his 10% project.

Jdlrobson lowered the priority of this task from Medium to Low. Oct 31 2018, 10:07 PM

@pmiazga Let's talk about this and avoid rolling our own here - e.g. there is already an existing mechanism that avoids cache fragmentation: https://wikitech.wikimedia.org/wiki/Provenance

(Update: we discussed this earlier this week and decided that a simple EventLogging schema would be useful. I'm going to create one, but there are still some relevant questions open in the other task, see T181195#4982077 ff.)

...

Relatedly, we could query the wmf.webrequest table for requests with a specific query parameter rather than include the code that I wrote in T207280#4676470.

Pro: We wouldn't have to write any server-side code like in my comment above.
Con: We wouldn't be able to plot a graph in Grafana without any server-side code.

Both this solution and the solution in my comment above would be impossible if we were to use hash fragments, as they aren't sent to the server.

BTW, one can actually get a graph in Turnilo when using a wprov query parameter, using the (albeit heavily sampled) webrequest dataset there. See e.g. this example for the existing "Share a link to an article (from lead image toolbar, or link preview)" Android app feature. With the usual benefits ;)

I started to draft a schema at https://meta.wikimedia.org/wiki/Schema:MobileWebShareButton , per the discussion at T181195#4982077 ff. It is partly modeled after the Print schema, e.g. includes an event for when the button is shown (triggered during a normal page load) so we can calculate the button's clickthrough rate directly, which is a more useful success metric than the absolute number of clicks.

I did not add a sampling method and ratio, as this is only meant to be deployed on Mobile beta for now, where instrumenting all (applicable) views seems appropriate. If and when we want to get data on the use in production, we should limit the data by adding sampling.
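For reference, one common way to sample is to decide per pageview, so that all events of a sampled view are kept together. The sketch below is a generic illustration only: it is not how EventLogging implements sampling, and the function name and the use of SHA-256 over the page token are assumptions.

```python
import hashlib


def in_sample(page_token, rate):
    """Deterministically decide whether a pageview is in the sample.

    The token is hashed to a 64-bit integer and compared against the rate,
    so the same token always yields the same decision.
    """
    if not 0 <= rate <= 1:
        raise ValueError("rate must be between 0 and 1")
    digest = hashlib.sha256(page_token.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # uniform in [0, 2**64)
    return bucket < rate * 2**64
```

Because the decision depends only on the token, the shown, click and share events of one pageview are either all logged or all dropped, which keeps ratios such as the clickthrough rate meaningful under sampling.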

Change 496995 had a related patch set uploaded (by Pmiazga; owner: Pmiazga):
[mediawiki/skins/MinervaNeue@master] Track links shared by Share feature

https://gerrit.wikimedia.org/r/496995

Change 496996 had a related patch set uploaded (by Pmiazga; owner: Pmiazga):
[mediawiki/skins/MinervaNeue@master] Track share button usage

https://gerrit.wikimedia.org/r/496996

Change 496995 merged by jenkins-bot:
[mediawiki/skins/MinervaNeue@master] Track links shared by Share feature

https://gerrit.wikimedia.org/r/496995

Change 496996 merged by jenkins-bot:
[mediawiki/skins/MinervaNeue@master] Track share button usage

https://gerrit.wikimedia.org/r/496996

How should this be QA'd? What are the steps that should be taken to prove that this works?

@pmiazga I have added the mandatory schema documentation to the talk page (feel free to rope in other maintainers, or fill out the project name): https://meta.wikimedia.org/wiki/Schema_talk:MobileWebShareButton
It also needs a whitelisting decision. I suggest following the example of Schema:Print here too (code).

I've QAed and verified events are shown on share clicks.
As for click through links, I can verify that the URL shared contains wprov - is that enough to call this done?

I've QAed and verified events are shown on share clicks.

The schema has three different events (actions), and several fields. For QA, all these need to be checked. I have updated the task description with steps that actually achieve this.

As for click through links, I can verify that the URL shared contains wprov - is that enough to call this done?

"contains wprov" is not sufficient, it needs to use the parameter value we picked for this, in the format required by Varnish. (On the other hand, I'm not quite sure what you meant by "check if it shows up in a page view table" - wprov parameters don't show up there, only in webrequest. In any case we don't need to debug the Varnish / refinery pipeline here, just make sure we format the URL as specified in its documentation.) I have updated the task description in that regard too.

The schema has three different events (actions), and several fields. For QA, all these need to be checked. I have updated the task description with steps that actually achieve this

I can confirm I have verified all of these.

"contains wprov" is not sufficient, it needs to use the parameter value we picked for this, in the format required by Varnish

The value is also there. This I can confirm

I'm not quite sure what you meant by "check if it shows up in a page view table" - wprov parameters don't show up there, only in webrequest

Webrequest is what I'm referring to. My understanding is that there is no webrequest table for the beta cluster so we won't be able to test this unless the code is enabled in production.

The schema has three different events (actions), and several fields. For QA, all these need to be checked. I have updated the task description with steps that actually achieve this

I can confirm I have verified all of these.

OK thanks! Feel free to mark the new QA steps as passed if we have indeed verified the fields for all three actions.

"contains wprov" is not sufficient, it needs to use the parameter value we picked for this, in the format required by Varnish

The value is also there. This I can confirm

I'm not quite sure what you meant by "check if it shows up in a page view table" - wprov parameters don't show up there, only in webrequest

Webrequest is what I'm referring to. My understanding is that there is no webrequest table for the beta cluster so we won't be able to test this unless the code is enabled in production.

Again though, there is no need to test this. The provenance mechanism was established four years ago already and has been relied upon by numerous projects since then (see the list of reserved values). I appreciate a healthy dose of paranoia about data quality, and some double-checking can't hurt, but we don't really need to spend time here to verify that Varnish / the webrequest refinery still processes the wprov parameter as documented.

Change 501715 had a related patch set uploaded (by HaeB; owner: HaeB):
[analytics/refinery@master] Add MobileWebShareButton schema to EventLogging whitelist

https://gerrit.wikimedia.org/r/501715

Filed a whitelist patch. Rather than the Print schema I mentioned above, the ReadingDepth schema turned out to be a better example to follow here (note that in this case we don't track session IDs to begin with).

Change 501715 merged by Milimetric:
[analytics/refinery@master] Add MobileWebShareButton schema to EventLogging whitelist

https://gerrit.wikimedia.org/r/501715

Share feature has been removed :(