
Add geolocation information to EditAttemptStep
Closed, Resolved (Public)

Description

The Editing team would like to add geolocation data to the EditAttemptStep schema. (This was removed as part of the migration process for legacy schemas in T262626 and never added to the Event Platform schemas.)

This data is needed to complete several planned analyses for the upcoming fiscal year where we would like to know the geolocation information (country and region) of editors that attempt but never complete an edit.

Implementation details

Per details I found in a similar task (T287121#7229889), this will require adding the http.client_ip field to the schema. EventGate will then automatically populate it. See more details in https://wikitech.wikimedia.org/wiki/Event_Platform/Schemas/Guidelines#http_information
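As a rough illustration of the behaviour described above, here is a minimal Python sketch. It is not EventGate's actual code: the function name, the shape of the schema dictionary, and the example values are all hypothetical. It only shows the idea that the client IP is filled in on the server side when, and only when, the schema declares an http.client_ip field.

```python
from copy import deepcopy


def populate_client_ip(event: dict, schema_properties: dict, request_ip: str) -> dict:
    """Return a copy of `event` with http.client_ip set, if the schema declares it.

    Hypothetical helper for illustration; `schema_properties` is assumed to be
    the "properties" mapping of a JSON schema.
    """
    enriched = deepcopy(event)
    http_props = schema_properties.get("http", {}).get("properties", {})
    if "client_ip" not in http_props:
        # The schema does not declare the field, so nothing is added.
        return enriched
    # The value comes from the incoming request, not from the client payload.
    enriched.setdefault("http", {})["client_ip"] = request_ip
    return enriched


# Assumed, simplified schema fragment declaring http.client_ip.
schema_with_ip = {"http": {"properties": {"client_ip": {"type": "string"}}}}
enriched_event = populate_client_ip({"action": "init"}, schema_with_ip, "203.0.113.7")
```

Under these assumptions, an event validated against a schema without the field is passed through unchanged, which matches why the field has to be added to EditAttemptStep first.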

Done

  • Patch is created and merged to add http.client_ip field back to EditAttemptStep
  • @MNeisler reviews aggregate data in EditAttemptStep once deployed to confirm it is working

Event Timeline

I wound up doing this for a different schema in T310390, so assuming that data comes out correctly I do know what's needed now.

Change 842452 had a related patch set uploaded (by DLynch; author: DLynch):

[schemas/event/secondary@master] Include client_ip in EditAttemptStep schema

https://gerrit.wikimedia.org/r/842452

Change 842490 had a related patch set uploaded (by DLynch; author: DLynch):

[mediawiki/extensions/WikimediaEvents@master] Bump EditAttemptStep schema version

https://gerrit.wikimedia.org/r/842490

Change 842452 merged by jenkins-bot:

[schemas/event/secondary@master] Include client_ip in EditAttemptStep schema

https://gerrit.wikimedia.org/r/842452

Change 842490 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@master] Bump EditAttemptStep schema version

https://gerrit.wikimedia.org/r/842490

DLynch moved this task from Code Review to QA on the Editing-team (Kanban Board) board.

This needs to be checked by Megan, since what remains is entirely a question of whether the correct data is being stored now.

MNeisler triaged this task as Medium priority. Oct 21 2022, 2:24 PM
MNeisler edited projects, added Product-Analytics (Kanban); removed Product-Analytics.
MNeisler added a subscriber: ppelberg.

I reviewed the aggregate data and confirmed that the recently added geolocation data is logged and stored as expected in EditAttemptStep. All edit attempts now have an associated country, continent, and country code that can be accessed using the geocoded_data field.

We started fully logging this data on 18 October 2022.
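For anyone picking up the planned analyses, the following is a minimal Python sketch of counting edit attempts that were started but never completed, grouped by country. It assumes rows have already been pulled out as dictionaries; the key names (`editing_session_id`, `action`, and `country_code` inside `geocoded_data`) and the action values `init` and `saveSuccess` are assumptions about the stored data, and the sample rows are made up.

```python
from collections import Counter


def incomplete_attempts_by_country(rows):
    """Count sessions with an 'init' event but no 'saveSuccess', per country code.

    All field names and action values here are assumed for illustration.
    """
    started = {}       # session id -> country code recorded at init
    completed = set()  # session ids that reached saveSuccess
    for row in rows:
        session_id = row["editing_session_id"]
        if row["action"] == "init":
            started[session_id] = row["geocoded_data"].get("country_code", "unknown")
        elif row["action"] == "saveSuccess":
            completed.add(session_id)
    return Counter(
        country for session_id, country in started.items()
        if session_id not in completed
    )


# Made-up sample rows, not real data.
sample_rows = [
    {"editing_session_id": "a", "action": "init", "geocoded_data": {"country_code": "FR"}},
    {"editing_session_id": "a", "action": "saveSuccess", "geocoded_data": {"country_code": "FR"}},
    {"editing_session_id": "b", "action": "init", "geocoded_data": {"country_code": "FR"}},
    {"editing_session_id": "c", "action": "init", "geocoded_data": {"country_code": "IN"}},
]
counts = incomplete_attempts_by_country(sample_rows)
```

In practice this aggregation would more likely be run as a query against the stored table; the sketch is only meant to show the shape of the computation.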

@ppelberg - Reassigning to you for sign-off