Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

feat(validation): accept Spanish normative keywords in validator#806

Open
santif wants to merge 3 commits intoFission-AI:mainfrom
santif:feat/i18n-validator-keywords
Open

feat(validation): accept Spanish normative keywords in validator#806
santif wants to merge 3 commits intoFission-AI:mainfrom
santif:feat/i18n-validator-keywords

Conversation

@santif
Copy link

@santif santif commented Mar 6, 2026

Summary

  • Add centralized NORMATIVE_KEYWORDS registry supporting English (SHALL, MUST) and Spanish (DEBE, DEBERA/DEBERÁ) RFC 2119 keywords
  • Replace hardcoded keyword checks in base.schema.ts and validator.ts with shared containsNormativeKeyword() function
  • Update validation error messages to list all accepted keywords across supported languages

Changes

  • src/core/validation/constants.ts — New NORMATIVE_KEYWORDS constant and containsNormativeKeyword() function with word-boundary matching
  • src/core/schemas/base.schema.tsRequirementSchema refinement now uses containsNormativeKeyword()
  • src/core/validation/validator.tscontainsShallOrMust() delegates to containsNormativeKeyword(); delta validation errors reference the shared message
  • test/core/validation.test.ts — Unit tests for English, Spanish, accented variants, lowercase rejection, substring rejection, and delta spec validation with Spanish keywords
  • docs/multi-language.md — Document accepted normative keywords per language
  • openspec/specs/ — New i18n-validator-keywords spec and updated cli-validate spec with multi-language hint scenario
  • openspec/changes/archive/ — Archived completed change

Test plan

  • containsNormativeKeyword() accepts SHALL, MUST, DEBE, DEBERA, DEBERÁ
  • Lowercase variants are rejected
  • Substring matches (e.g., INDEBTED, MUSTERING) are rejected
  • RequirementSchema passes with Spanish keywords
  • Delta validation passes ADDED/MODIFIED requirements with Spanish keywords
  • Existing English-keyword tests still pass (no regressions)

Summary by CodeRabbit

  • New Features

    • Multi-language normative keyword validation: accepts Spanish keywords (DEBE, DEBERA) in addition to English (SHALL, MUST)
    • Validation now reports the full set of accepted uppercase keywords in error messages
  • Documentation

    • Added guidance and English–Spanish mappings for RFC 2119 keywords and uppercase usage
  • Tests

    • Expanded validation tests to cover English, Spanish, and accented variants, plus negative cases

… validator

Add multi-language support for RFC 2119 normative keywords. Introduce a
centralized NORMATIVE_KEYWORDS registry in constants.ts with English
(SHALL, MUST) and Spanish (DEBE, DEBERA/DEBERÁ) keywords, replacing
hardcoded checks in base.schema.ts and validator.ts. Update error
messages to list all accepted keywords, add tests for all keyword
variants including substring rejection, and document accepted keywords
in multi-language.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 6, 2026

📝 Walkthrough

Walkthrough

Adds language-aware RFC 2119 keyword validation: introduces a centralized NORMATIVE_KEYWORDS registry and containsNormativeKeyword() helper (English + Spanish), and updates schema and validator checks, tests, specs, and docs to use the centralized, accent-aware matcher.

Changes

Cohort / File(s) Summary
Core Validation Constants
src/core/validation/constants.ts
Added NORMATIVE_KEYWORDS (en/es), MATCH_KEYWORDS (includes accented forms), and exported containsNormativeKeyword(text: string): boolean; updated requirement-missing message to reference the combined keywords.
Schema & Validator Integration
src/core/schemas/base.schema.ts, src/core/validation/validator.ts
Replaced inline English-only checks with containsNormativeKeyword() in RequirementSchema refinement and containsShallOrMust(); updated validation messages to use centralized message constant.
Tests
test/core/validation.test.ts
Added unit and integration tests for English, Spanish, accented (DEBERÁ), negative cases (lowercase, substrings), and ADDED/MODIFIED delta scenarios verifying multi-language behavior.
Documentation
docs/multi-language.md
Appended "Normative Keywords" section documenting accepted English and Spanish RFC2119 keywords and uppercase usage guidance.
Specification & Design
openspec/specs/i18n-validator-keywords/spec.md, openspec/specs/cli-validate/spec.md, openspec/changes/archive/2026-03-06-i18n-validator-keywords/...
New/updated spec and design docs describing centralized keyword registry, matching rules (word-boundary, uppercase, accented forms), scenarios, error message requirements, and tasks.
Metadata
openspec/changes/archive/2026-03-06-i18n-validator-keywords/.openspec.yaml
Added spec metadata file with schema and creation date.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Suggested reviewers

  • TabishB

Poem

🐰 I hopped through code to find a clue,
SHALL and DEBE now share the view,
One little helper, tidy and bright,
Makes keywords sing in Spanish and light,
A carrot for tests — validation true! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(validation): accept Spanish normative keywords in validator' accurately describes the main change—adding support for Spanish RFC 2119 keywords in the validation system.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps
Copy link

greptile-apps bot commented Mar 6, 2026

Greptile Summary

This PR adds multi-language normative keyword support to the validator, centralizing the English (SHALL, MUST) and Spanish (DEBE, DEBERA/DEBERÁ) RFC 2119 keyword checks into a shared containsNormativeKeyword() function and NORMATIVE_KEYWORDS registry in constants.ts, eliminating prior duplication between base.schema.ts and validator.ts.

Key changes:

  • src/core/validation/constants.ts — new NORMATIVE_KEYWORDS map and containsNormativeKeyword() helper; REQUIREMENT_NO_SHALL message now lists all accepted keywords
  • src/core/schemas/base.schema.tsRequirementSchema refinement delegates to containsNormativeKeyword() instead of hardcoded includes('SHALL') || includes('MUST')
  • src/core/validation/validator.tscontainsShallOrMust() delegates to containsNormativeKeyword(); delta error messages reference shared VALIDATION_MESSAGES constant
  • test/core/validation.test.ts — unit tests for English, Spanish, accented, lowercase rejection, and substring rejection cases; integration tests for Spanish ADDED/MODIFIED delta specs
  • docs/multi-language.md — documents accepted normative keywords per language

Issues to address before merge:

  1. The regex pattern (?<=\s|^)${kw}(?=\s|$) uses strict whitespace boundaries instead of \b word boundaries, causing false negatives when keywords are adjacent to punctuation (e.g., SHALL,, DEBE:, (SHALL) — a realistic pattern in natural-language requirement prose. No test covers this edge case.
  2. new RegExp(...) objects are created inside the some() callback on every invocation — patterns should be precompiled at module initialization as a best practice for hot-path validation.

Confidence Score: 2/5

  • The regex pattern regression can silently reject valid requirements with punctuation-adjacent normative keywords, creating false negatives in real-world use that would go undetected until end users encounter them.
  • The architecture (centralized registry, shared helper, delegation pattern) is sound and well-tested for the happy path. However, the regex pattern uses whitespace-only boundaries instead of \b, which is stricter than both the original validator.ts implementation and the design doc specification. This regression could cause valid requirement text like "The system SHALL, handle errors" or "System DEBE: authenticate" to fail validation silently. Without a test covering punctuation-adjacent keywords, this bug could go unnoticed in production until users file reports. The RegExp precompilation issue is a minor performance concern but less critical than the regex correctness issue.
  • src/core/validation/constants.ts — the regex pattern and RegExp allocation need fixes before merge; test/core/validation.test.ts — missing test coverage for keywords adjacent to punctuation.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Requirement text input] --> B{containsNormativeKeyword}
    B --> C[Iterate MATCH_KEYWORDS<br/>SHALL, MUST, DEBE, DEBERA, DEBERÁ]
    C --> D{"(?<=\\s|^)KW(?=\\s|$)<br/>matches?"}
    D -- yes --> E[return true ✓]
    D -- no, next keyword --> C
    C -- exhausted --> F[return false ✗]
    
    E --> G1[base.schema.ts<br/>RequirementSchema<br/>.refine passes]
    E --> G2[validator.ts<br/>containsShallOrMust<br/>returns true]
    F --> H1[Zod error:<br/>Requirement must contain<br/>a normative keyword<br/>SHALL, MUST, DEBE, DEBERA]
    F --> H2[ValidationIssue ERROR:<br/>ADDED/MODIFIED block<br/>must contain normative keyword]
Loading

Last reviewed commit: 812592d

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (1)
openspec/specs/i18n-validator-keywords/spec.md (1)

34-36: Avoid baking a private method name into the spec.

containsShallOrMust is just an internal helper name, and it's already misleading now that Spanish keywords are supported. Keeping this requirement at the shared-matcher behavior level will let the implementation rename or inline that method without forcing a spec change.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openspec/specs/i18n-validator-keywords/spec.md` around lines 34 - 36, The
spec currently references a private helper name (containsShallOrMust); update
the requirement to avoid naming internal methods: reword the scenario so that
when a new language keyword is added to NORMATIVE_KEYWORDS both the
RequirementSchema refinement and the validator (or shared matcher) that checks
for "shall"/"must"-style keywords rely on that constant and therefore recognize
the new keyword without additional code changes—do not mention or mandate the
internal method name containsShallOrMust so implementations can rename/inline it
freely.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@openspec/changes/archive/2026-03-06-i18n-validator-keywords/design.md`:
- Around line 39-63: The design and code disagree: update the code to match the
design by centralizing keyword matching into a single helper and using
word-boundary regex with unaccented canonical Spanish forms. Add a
containsNormativeKeyword(text: string): boolean in constants.ts (or keywords.ts)
that builds allKeywords = Object.values(NORMATIVE_KEYWORDS).flat() and returns
allKeywords.some(kw => new RegExp(`\\b${kw}\\b`, 'i').test(text)); replace the
existing hard-coded Spanish "DEBERÁ" and the (?<=\s|^)…(?=\s|$) lookaround
usages with calls to containsNormativeKeyword from both base.schema.ts and
validator.ts, and ensure NORMATIVE_KEYWORDS lists unaccented Spanish variants
(e.g., DEBERA) so accented variants match via word boundaries.

In `@openspec/changes/archive/2026-03-06-i18n-validator-keywords/proposal.md`:
- Around line 7-10: The archive entry overstates functionality that wasn't
implemented: rather than claiming config-driven language selection and localized
error output, update the proposal text to accurately reflect the shipped
behavior — that a fixed English/Spanish RFC2119 keyword map was added and that
RequirementSchema (base.schema.ts) and containsShallOrMust (validator.ts) accept
those keywords, but validation messages remain English; remove or reword the
bullets about configurable language selection and bilingual error messages so
the entry only documents the actual changes implemented.

In `@src/core/validation/constants.ts`:
- Line 35: Update the shared error message constant REQUIREMENT_NO_SHALL to
append an uppercase remediation hint so callers (RequirementSchema and the delta
validation used by cli-validate) clearly state that lowercase normative keywords
are rejected; modify the message built from NORMATIVE_KEYWORDS
(Object.values(NORMATIVE_KEYWORDS).flat().join(', ')) to include a short clause
like "use UPPERCASE" or equivalent guidance so both RequirementSchema and delta
validation show the remediation to use uppercase keywords.
- Around line 26-27: The containsNormativeKeyword function currently uses
whitespace-only lookarounds; change its regex to use word boundaries (e.g., new
RegExp(`\\b${kw}\\b`)) so keywords like "MUST:" or "**DEBE**" match correctly
(reference MATCH_KEYWORDS and containsNormativeKeyword). Also update the
validation/remediation error message that lists the accepted keywords to
explicitly state that keywords must be in UPPERCASE (use the MATCH_KEYWORDS
array to build the list so the message stays in sync).

In `@src/core/validation/validator.ts`:
- Around line 434-435: containsShallOrMust calls containsNormativeKeyword which
currently only matches keywords with ASCII word boundaries, so tokens like
"SHALL." or "DEBE," are missed; update the regex in containsNormativeKeyword (in
constants.ts) to use Unicode-aware boundaries by using lookarounds (e.g.,
(?<!\p{L})KEYWORD(?!\p{L})) with the Unicode flag so keywords adjacent to
punctuation still match, and add unit tests covering punctuation-delimited cases
for containsShallOrMust/containsNormativeKeyword to validate ADDED/MODIFIED
delta detection.

---

Nitpick comments:
In `@openspec/specs/i18n-validator-keywords/spec.md`:
- Around line 34-36: The spec currently references a private helper name
(containsShallOrMust); update the requirement to avoid naming internal methods:
reword the scenario so that when a new language keyword is added to
NORMATIVE_KEYWORDS both the RequirementSchema refinement and the validator (or
shared matcher) that checks for "shall"/"must"-style keywords rely on that
constant and therefore recognize the new keyword without additional code
changes—do not mention or mandate the internal method name containsShallOrMust
so implementations can rename/inline it freely.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 006cd6b6-7ad1-49aa-b947-c8e15cadbb32

📥 Commits

Reviewing files that changed from the base of the PR and between afdca0d and 812592d.

📒 Files selected for processing (13)
  • docs/multi-language.md
  • openspec/changes/archive/2026-03-06-i18n-validator-keywords/.openspec.yaml
  • openspec/changes/archive/2026-03-06-i18n-validator-keywords/design.md
  • openspec/changes/archive/2026-03-06-i18n-validator-keywords/proposal.md
  • openspec/changes/archive/2026-03-06-i18n-validator-keywords/specs/cli-validate/spec.md
  • openspec/changes/archive/2026-03-06-i18n-validator-keywords/specs/i18n-validator-keywords/spec.md
  • openspec/changes/archive/2026-03-06-i18n-validator-keywords/tasks.md
  • openspec/specs/cli-validate/spec.md
  • openspec/specs/i18n-validator-keywords/spec.md
  • src/core/schemas/base.schema.ts
  • src/core/validation/constants.ts
  • src/core/validation/validator.ts
  • test/core/validation.test.ts

Switch containsNormativeKeyword regex from whitespace-only lookarounds
to \b…(?=\W|$) so keywords adjacent to punctuation (MUST:, DEBE,) and
markdown formatting (**SHALL**) match correctly while still supporting
accented DEBERÁ. Update REQUIREMENT_NO_SHALL to list all MATCH_KEYWORDS
including DEBERÁ and state that keywords must be UPPERCASE. Align
archived proposal and design with shipped behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@santif santif requested a review from TabishB as a code owner March 6, 2026 06:39
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
openspec/changes/archive/2026-03-06-i18n-validator-keywords/design.md (1)

39-41: ⚠️ Potential issue | 🟡 Minor

Align this rationale with the matcher that actually shipped.

Lines 39-41 still say the solution avoids regex and relies on Array.includes(), but the implementation in src/core/validation/constants.ts and the very next section both describe a regex-based helper. Keeping the archive internally consistent will make future changes much easier to reason about.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openspec/changes/archive/2026-03-06-i18n-validator-keywords/design.md` around
lines 39 - 41, Update the design note to match the shipped implementation:
replace the statement that validation uses Array.includes() with an explanation
of the regex-based matcher actually implemented (referencing the regex helper
described in src/core/validation/constants.ts and the "matcher" utility in the
next section), explain why the team chose a regex helper over Array.includes()
for this case, and remove or reframe the “no regex patterns” rationale so the
archive is internally consistent with the shipped regex-based helper.
openspec/changes/archive/2026-03-06-i18n-validator-keywords/proposal.md (1)

25-27: ⚠️ Potential issue | 🟡 Minor

Make the Impact section describe shipped behavior, not a future possibility.

Line 27 says constants.ts "may need" a keyword map, but this change already adds NORMATIVE_KEYWORDS and containsNormativeKeyword(). Rewording this in present tense will keep the archived proposal aligned with the implementation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openspec/changes/archive/2026-03-06-i18n-validator-keywords/proposal.md`
around lines 25 - 27, The Impact section is written as future-possibility but
the implementation already added the new constants and function; update the
wording to present tense and state the shipped behavior: replace "may need new
keyword constants or a keyword map" with a definitive statement that
src/core/validation/constants.ts now exposes NORMATIVE_KEYWORDS and
src/core/validation/validator.ts provides containsNormativeKeyword() (and that
RequirementSchema now uses multi-language keyword checks) so the archived
proposal matches the implemented changes.
🧹 Nitpick comments (1)
test/core/validation.test.ts (1)

84-114: Add regression cases for punctuation- and Markdown-adjacent keywords.

The matcher change was made to accept forms like MUST: and **DEBE**, but this suite still only proves whitespace-delimited matches. A couple of positive cases here would lock in the exact behavior that motivated the regex update.

Suggested test addition
   it('should accept accented Spanish keyword', () => {
     expect(containsNormativeKeyword('El sistema DEBERÁ hacer X')).toBe(true);
   });
+
+  it('should accept keywords adjacent to punctuation and Markdown formatting', () => {
+    expect(containsNormativeKeyword('The system MUST: do X')).toBe(true);
+    expect(containsNormativeKeyword('**DEBE** es obligatorio')).toBe(true);
+    expect(containsNormativeKeyword('El sistema DEBERÁ, si falla, reintenta')).toBe(true);
+  });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/core/validation.test.ts` around lines 84 - 114, Add regression tests to
cover punctuation- and Markdown-adjacent matches for containsNormativeKeyword:
update the containsNormativeKeyword test suite to include positive cases such as
a keyword followed by punctuation ("The system MUST: do X" or "MUST: do X") and
a Markdown-emphasized keyword ("**DEBE** hacer X" or "El sistema **DEBERÁ**
hacer X") to ensure the matcher accepts those forms; add these expectations
alongside the existing positive cases referencing the containsNormativeKeyword
function so the test suite locks in the regex behavior for punctuation- and
Markdown-adjacent keywords.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@openspec/changes/archive/2026-03-06-i18n-validator-keywords/design.md`:
- Around line 39-41: Update the design note to match the shipped implementation:
replace the statement that validation uses Array.includes() with an explanation
of the regex-based matcher actually implemented (referencing the regex helper
described in src/core/validation/constants.ts and the "matcher" utility in the
next section), explain why the team chose a regex helper over Array.includes()
for this case, and remove or reframe the “no regex patterns” rationale so the
archive is internally consistent with the shipped regex-based helper.

In `@openspec/changes/archive/2026-03-06-i18n-validator-keywords/proposal.md`:
- Around line 25-27: The Impact section is written as future-possibility but the
implementation already added the new constants and function; update the wording
to present tense and state the shipped behavior: replace "may need new keyword
constants or a keyword map" with a definitive statement that
src/core/validation/constants.ts now exposes NORMATIVE_KEYWORDS and
src/core/validation/validator.ts provides containsNormativeKeyword() (and that
RequirementSchema now uses multi-language keyword checks) so the archived
proposal matches the implemented changes.

---

Nitpick comments:
In `@test/core/validation.test.ts`:
- Around line 84-114: Add regression tests to cover punctuation- and
Markdown-adjacent matches for containsNormativeKeyword: update the
containsNormativeKeyword test suite to include positive cases such as a keyword
followed by punctuation ("The system MUST: do X" or "MUST: do X") and a
Markdown-emphasized keyword ("**DEBE** hacer X" or "El sistema **DEBERÁ** hacer
X") to ensure the matcher accepts those forms; add these expectations alongside
the existing positive cases referencing the containsNormativeKeyword function so
the test suite locks in the regex behavior for punctuation- and
Markdown-adjacent keywords.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c7471d63-6f86-4ed6-9602-8b127d6629cd

📥 Commits

Reviewing files that changed from the base of the PR and between 812592d and 2552e9b.

📒 Files selected for processing (4)
  • openspec/changes/archive/2026-03-06-i18n-validator-keywords/design.md
  • openspec/changes/archive/2026-03-06-i18n-validator-keywords/proposal.md
  • src/core/validation/constants.ts
  • test/core/validation.test.ts

…e load

Avoid creating new RegExp objects on every containsNormativeKeyword call
by compiling patterns once. Switch from (?=\W|$) to equivalent (?!\w)
lookahead. Add test coverage for keywords adjacent to punctuation and
update spec/archive docs accordingly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@santif
Copy link
Author

santif commented Mar 6, 2026

Summary of recent changes

1. Hybrid word-boundary regex (2552e9b)

  • Replaced \b…\b pattern with \b…(?=\W|$) to fix detection of accented keywords (DEBERÁ), since JavaScript doesn't treat Á as a word char and \b was silently failing.
  • Aligned archive docs (design.md, proposal.md, spec.md) with the actual regex implementation.
  • Added test for the accented keyword DEBERÁ at end of line (no trailing space).

2. Precompile regex patterns (c070f2c)

  • RegExp objects are now created once at module load (COMPILED_PATTERNS) instead of being instantiated on every containsNormativeKeyword call.
  • Simplified lookahead from (?=\W|$) to (?!\w) (equivalent, more concise).
  • Added test for keywords adjacent to punctuation (SHALL,, (SHALL, DEBE:).
  • Updated spec and archive docs to reflect precompilation and the new punctuation scenario.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant