Learn from user corrections to avoid editing the same term again and again
Open, Medium, Public

Assigned To

None

Authored By

	Pginer-WMF
	Apr 15 2015, 5:18 PM

Description

When translating, users have to correct terms that the Machine Translation (MT) service translates incorrectly. For example, when translating the John Carpenter article, the director's surname may be rendered as whatever word the target language uses for the "carpenter" profession. Since an article is about a specific topic, the same mistake is likely to recur throughout it, and users have to fix it again and again.

In our user testing sessions we observed that, while fixing a term the first time seemed reasonable to users, they were negatively surprised that the system had not learned the lesson for the next time.

While improving MT services is probably out of scope for the project, it may be worth thinking about ways CX can save the user time during correction. Some of these mechanisms could also be useful when no MT is available at all, acting as a very basic (perhaps word-level) MT-like system based on what the user has already translated.

Proposed solution

Keep track of user corrections to MT output that happen repeatedly. We need to decide how many times a change must occur, how many words it may involve, and how long it should be, for it to count as a correction.
Apply previous corrections when a paragraph is added, if the corrected term is found in it.
Provide a way for users to switch among the alternatives (which include the term proposed by MT and the ones used in previous corrections).
Learn from the use of the alternatives to decide whether to apply corrections automatically or just suggest them.
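
As a rough illustration of the first step, corrections could be detected by comparing the MT output with the text the user left in place. The sketch below is only an assumption about how this might work, not the CX implementation: `extract_corrections`, `CorrectionLog`, `MAX_WORDS` and `MIN_REPEATS` are hypothetical names, and the thresholds are placeholders for the values the task says still need to be decided. It splits on whitespace only, so punctuation attached to a word is treated as part of that word.

```python
from collections import Counter
from difflib import SequenceMatcher

MAX_WORDS = 3    # hypothetical limit on how many words a correction may span
MIN_REPEATS = 1  # hypothetical number of times a correction must be observed


def extract_corrections(mt_text, edited_text, max_words=MAX_WORDS):
    """Return (mt_phrase, user_phrase) pairs for short runs of replaced words."""
    mt_words = mt_text.split()
    edited_words = edited_text.split()
    matcher = SequenceMatcher(a=mt_words, b=edited_words, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "replace":
            continue  # pure insertions and deletions are ignored in this sketch
        if i2 - i1 > max_words or j2 - j1 > max_words:
            continue  # long rewrites are unlikely to be term corrections
        pairs.append((" ".join(mt_words[i1:i2]), " ".join(edited_words[j1:j2])))
    return pairs


class CorrectionLog:
    """Counts how often each MT phrase was replaced by each user phrase."""

    def __init__(self, min_repeats=MIN_REPEATS):
        self.min_repeats = min_repeats
        self.counts = Counter()

    def record(self, mt_text, edited_text):
        for pair in extract_corrections(mt_text, edited_text):
            self.counts[pair] += 1

    def learned(self):
        """Corrections seen often enough to be offered on later paragraphs."""
        return {mt: user for (mt, user), n in self.counts.items()
                if n >= self.min_repeats}


log = CorrectionLog()
log.record("The Angels is a city in California",
           "Los Angeles is a city in California")
print(log.learned())  # {'The Angels': 'Los Angeles'}
```

Raising `MIN_REPEATS` would make the system wait for the user to repeat a fix before reusing it, which trades fewer false positives for more manual work.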

We'll illustrate the idea with the example of translating the Los Angeles article from Spanish to English. Since "angeles" means "angels" in Spanish, we'll assume that the MT service is going to translate the name of the city too literally, and the user corrects those in the first paragraph:

Initial translation with errors	User corrects the first paragraph

When the second paragraph is added (where the name of the city appears again), the system automatically replaces the name in the text proposed by MT, based on the fact that the user corrected it in previous paragraphs. In addition, the replaced text is highlighted to communicate to the user that a correction was applied automatically. This allows the user to undo the automatic change.

CX-corrections-applied.png (720×1 px, 264 KB)

CX-corrections-options.png (720×1 px, 271 KB)
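
The replace-and-highlight behaviour described above could be sketched as follows. This is a minimal illustration under assumptions, not the CX implementation: `apply_corrections`, `AppliedCorrection` and `undo` are hypothetical names, matching is case-sensitive and limited to whole words, and the recorded offsets are what a UI could use to highlight the replaced text and offer the undo.

```python
import re
from dataclasses import dataclass


@dataclass
class AppliedCorrection:
    start: int        # offset of the replacement in the corrected text
    end: int
    mt_phrase: str    # original MT wording, kept so the change can be undone
    user_phrase: str  # wording taken from the user's earlier correction


def apply_corrections(mt_text, corrections):
    """Replace known MT phrases; return the new text and the replaced spans."""
    if not corrections:
        return mt_text, []
    # Longest phrases first, so overlapping entries prefer the longer match.
    phrases = sorted(corrections, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(p) for p in phrases) + r")(?!\w)"
    )
    parts, applied = [], []
    cursor = 0  # read position in mt_text
    length = 0  # length of the output built so far
    for match in pattern.finditer(mt_text):
        before = mt_text[cursor:match.start()]
        replacement = corrections[match.group(0)]
        start = length + len(before)
        applied.append(AppliedCorrection(
            start, start + len(replacement), match.group(0), replacement))
        parts += [before, replacement]
        length = start + len(replacement)
        cursor = match.end()
    parts.append(mt_text[cursor:])
    return "".join(parts), applied


def undo(text, change):
    """Restore the MT wording for one span. Offsets of later spans are not
    re-indexed here, so undo several changes from last to first."""
    return text[:change.start] + change.mt_phrase + text[change.end:]


paragraph = "Tourists visit The Angels every year."
new_text, changes = apply_corrections(paragraph, {"The Angels": "Los Angeles"})
print(new_text)                    # Tourists visit Los Angeles every year.
print(undo(new_text, changes[0]))  # Tourists visit The Angels every year.
```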

Whether to apply the correction automatically or let the user do it will depend on previous decisions of the user for that word. If the user undoes an alternative that was applied on a paragraph, we can consider not using the replacement word in following paragraphs, and just highlight the MT version to let the user know they can pick an alternative manually. In the example below, the correction is not applied automatically, only suggested for the user to apply:

CX-corrections-not-applied.png (720×1 px, 264 KB)

CX-corrections-not-applied-options.png (720×1 px, 271 KB)
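
One simple way to model the choice between applying and suggesting is to remember, per term, whether the user's most recent decision was to undo the correction. The sketch below is an assumption for illustration only: `CorrectionPolicy` and `Mode` are hypothetical names, and the rule that manually picking the correction again re-enables automatic application is not stated in this task.

```python
from enum import Enum


class Mode(Enum):
    APPLY = "apply"      # replace automatically and highlight the change
    SUGGEST = "suggest"  # keep the MT text and highlight that alternatives exist


class CorrectionPolicy:
    """Tracks, per MT phrase, whether the user last rejected the correction."""

    def __init__(self):
        self._undone = set()

    def record_undo(self, mt_phrase):
        # The user reverted an automatic replacement for this phrase.
        self._undone.add(mt_phrase)

    def record_pick(self, mt_phrase):
        # The user chose the correction manually (assumed to re-enable APPLY).
        self._undone.discard(mt_phrase)

    def mode_for(self, mt_phrase):
        return Mode.SUGGEST if mt_phrase in self._undone else Mode.APPLY


policy = CorrectionPolicy()
policy.record_undo("The Angels")
print(policy.mode_for("The Angels"))  # Mode.SUGGEST
```

A counter-based variant (for example, suggesting only after several undos) would be less sensitive to a single accidental undo, at the cost of more state per term.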

Additional considerations

Based on input from T339907, it may be relevant to consider:

User-defined corrections. Manually introduce new corrections or edit existing ones.
Support for applying corrections at a broader scope. Users may want to keep some corrections for their following translations, or propose them as general rules for the community.

A first step in this direction is providing alternatives for link labels based on the target article title (T197662).
This is also related to the notion of Translation Memory. From experience using the tool in class, we received input that it may be useful for corrections to be shareable across articles in the same category or topic area, or across a group of users in the same program, campaign or event.

Related Objects

Status	Assigned	Task
Open	None	T76456 Language Engineering tracker of trackers (tracking)
Open	None	T127695 Design polishing for Content Translation
Resolved	dchen	T161188 Editing-Lang CX translation tool + templates hybrid testing sessions
Resolved	dchen	T161189 Fix issues found during Content Translation templates research
Open	None	T105923 Fix issues found during Wikimania 2015 Hackathon and Workshops
Open	None	T96165 Learn from user corrections to avoid editing the same term again and again

Event Timeline

Pginer-WMF created this task.Apr 15 2015, 5:18 PM

Pginer-WMF raised the priority of this task from to Needs Triage.

Pginer-WMF updated the task description. (Show Details)

Pginer-WMF added projects: ContentTranslation, Design.

Pginer-WMF subscribed.

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptApr 15 2015, 5:18 PM

This is something that I'd really love to have in some way, as a collaboration with dictionary and MT builders.

Somewhat related issues: T91748, T95886, T92243.

Amire80 triaged this task as Low priority.May 4 2015, 4:51 PM

Amire80 raised the priority of this task from Low to Medium.Jul 2 2015, 5:08 PM

Amire80 moved this task from Needs Triage to Bugs on the ContentTranslation board.

Amire80 set Security to None.

Amire80 lowered the priority of this task from Medium to Low.Oct 15 2015, 10:14 AM

Amire80 added a project: OKR-Work.

A related case of quick correction for what MT proposes is link translation (T145009). When automatic translation fails for links, the linked article title may be the intended text (or a good-enough approximation). Suggesting that as a quick correction can be helpful.

Pginer-WMF mentioned this in T145009: Unnecessary piped link in en->fr translation.Oct 12 2016, 8:46 AM

Pginer-WMF added a parent task: T127695: Design polishing for Content Translation.Oct 12 2016, 8:50 AM

The issue of repeatedly fixing "errors, typos and mistakes" of MT was mentioned in this comment.

Framawiki awarded a token.Jun 20 2017, 4:12 PM

Framawiki subscribed.

Pginer-WMF claimed this task.Jun 27 2017, 8:15 AM

Pginer-WMF updated the task description. (Show Details)

Pginer-WMF added a project: User-Petar.petkovic.

I added mockups and illustrated an example to show how the feature could work.

Pginer-WMF removed Pginer-WMF as the assignee of this task.Jul 4 2017, 11:58 AM

Pginer-WMF raised the priority of this task from Low to Medium.

Pginer-WMF removed a project: Design.

Pginer-WMF unsubscribed.

Pginer-WMF subscribed.

Pginer-WMF mentioned this in T106602: Remember user corrections on MT and suggest them later in similar situations.Jan 11 2018, 1:15 PM

Pginer-WMF merged a task: T106602: Remember user corrections on MT and suggest them later in similar situations.

Pginer-WMF added a parent task: T161189: Fix issues found during Content Translation templates research.

Pginer-WMF added a parent task: T105923: Fix issues found during Wikimania 2015 Hackathon and Workshops.

Pginer-WMF added subscribers: He7d3r, • Nirzar, santhosh, Arrbee.

Would the proposed solution work with accented letters (e.g. the case listed at T152905, which is very common in math articles)?

In T96165#3894817, @He7d3r wrote:

Would the proposed solution work with accented letters (e.g. the case listed at T152905, which is very common in math articles)?

The initial idea is to treat any modification made to the text as a correction, which should work with accented letters and other symbols. However, we may want to consider certain thresholds as we start working on this ticket. For example, if the user changes one character from lowercase to uppercase, is that a correction we want to apply automatically the next time, or is it likely to fail when applied in the next grammatical context?
In any case, since the approach makes it easy to undo the changes, I think it should be OK to start with a basic approach and learn from the different situations we observe in the different languages.
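
To make the threshold question concrete, a correction could be classified by the kind of change before deciding how to treat it. This is only a sketch under assumptions: `classify_change` and `strip_marks` are hypothetical helpers, and treating case-only or accent-only changes differently from other changes is one possible policy, not a decision recorded in this task.

```python
import unicodedata


def strip_marks(text):
    """Remove combining marks (accents) after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def classify_change(mt_phrase, user_phrase):
    """Label a correction as 'none', 'case-only', 'accent-only' or 'other'."""
    nfc_mt = unicodedata.normalize("NFC", mt_phrase)
    nfc_user = unicodedata.normalize("NFC", user_phrase)
    if nfc_mt == nfc_user:
        return "none"
    if nfc_mt.casefold() == nfc_user.casefold():
        return "case-only"
    if strip_marks(nfc_mt) == strip_marks(nfc_user):
        return "accent-only"
    return "other"


print(classify_change("angeles", "Angeles"))         # case-only
print(classify_change("angeles", "ángeles"))         # accent-only
print(classify_change("The Angels", "Los Angeles"))  # other
```

A system could, for example, only suggest "case-only" corrections while applying "other" corrections automatically, since capitalization often depends on the position in the sentence.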

• Petar.petkovic removed a project: User-Petar.petkovic.Apr 13 2018, 10:05 PM

Pginer-WMF mentioned this in T197662: CX2: Quickly switch between alternative link label translations.Jun 19 2018, 10:29 AM

Pginer-WMF updated the task description. (Show Details)Jun 19 2018, 10:36 AM

Arrbee moved this task from Bugs to Enhancements on the ContentTranslation board.Jun 22 2018, 1:41 PM

Pginer-WMF updated the task description. (Show Details)Jul 16 2018, 10:16 AM

Pginer-WMF mentioned this in T197688: CX2: Control whether references are translated or not.Jul 16 2018, 11:03 AM

Pginer-WMF mentioned this in T206512: CX2: Convert quotation marks automatically to the appropriate ones in the target language.Oct 9 2018, 9:58 AM

Pginer-WMF merged a task: T214259: Provide a Find and Replace button / functionality (to fix wrong machine translations in several places) .Jan 21 2019, 2:56 PM

Pginer-WMF mentioned this in T214259: Provide a Find and Replace button / functionality (to fix wrong machine translations in several places) .

Pginer-WMF added subscribers: Sushant_savla, KartikMistry.

Pginer-WMF mentioned this in T220440: GSoC 2019 Proposal: Integrate SVG Translate with Content Translation.Apr 29 2019, 10:59 AM

Pginer-WMF mentioned this in T225298: Language Annual Plan 2019-2020.Jun 11 2019, 10:12 AM

Pginer-WMF mentioned this in T137450: SI-units is translated with capital letters between Nynorsk and Bokmål.Jun 24 2019, 2:36 PM

Pginer-WMF edited projects, added CX-boost; removed ContentTranslation.Jul 26 2019, 10:29 AM

Pginer-WMF added a project: Language-Team (Language-2020-January-March).Dec 12 2019, 11:41 AM

Pginer-WMF mentioned this in T137401: Detection of entity names seems to fail quite often.Jan 20 2020, 11:52 AM

Pginer-WMF mentioned this in T215083: Change "." to "।" in Content Translation into Punjabi Wikipedia.Feb 17 2020, 12:46 PM

Pginer-WMF mentioned this in T279899: Learn from previous corrections on Section Translation editor.Apr 12 2021, 10:02 AM

A user recently reported this kind of issue when translating the article about the emperor Basil I: MT translated the name as the herb of the same name in every instance in the article.

Pginer-WMF updated the task description. (Show Details)Sep 22 2021, 1:01 PM

Pginer-WMF mentioned this in T339907: Manual translation correction book (personal or communal).Jul 4 2023, 12:54 PM

Pginer-WMF updated the task description. (Show Details)Sep 4 2023, 10:37 AM

Pginer-WMF merged a task: T339907: Manual translation correction book (personal or communal).Sep 4 2023, 10:39 AM

Pginer-WMF added a subscriber: BartTerpstra.

Pginer-WMF mentioned this in T352785: Compile requirements for a general system to make the most of community-provided translations.Dec 5 2023, 4:16 PM

	F23795356: CX-corrections-not-applied.png
	Jul 16 2018, 10:16 AM

	F23794464: CX-corrections-initial-add.png
	Jul 16 2018, 10:16 AM

	F23795387: CX-corrections-not-applied-options.png
	Jul 16 2018, 10:16 AM

	F8615694: CX-corrections-initial-add.png
	Jul 4 2017, 11:57 AM
