Silicon Team S3E07: Between Role Contributions and an Extension Ecosystem, There's a Missing Governance Layer


EP02 said all five role PRs followed the four-section format. EP03 said three signals made the contribution threshold low. But one question went unasked: were all five roles actually good enough?

The Review Reality of Five Role PRs

Look back at the PR records for #24-28. Here’s how these five role PRs were reviewed:

No unified review checklist. No “role quality standard” document. No criteria for judging “whether this Expertise list is complete.”

As the sole maintainer, I kept the review standard in my head:

Four-section format correct? Mechanical check, all five passed
Identity description clear? Mostly, but granularity varied
Expertise coverage reasonable? Hard to judge — Performance’s “algorithmic complexity analysis” is precise, but SRE’s “availability, alerting, capacity planning, incident response” spans too widely. Can one role really cover all that?
When to Include specific enough? Some were specific (“code touching hot paths”), some vague (“any production-related changes”)
Anti-Patterns useful? Some were real anti-patterns (“premature optimization without profiling evidence”), some were generic (“ignoring best practices”)

The problem isn’t that the five roles are bad — it’s that there’s no standard defining “good.”

Five Symptoms of Missing Governance

Role PRs are just the entry point. The deeper question: when external contributions start arriving, what happens without a governance layer?

Symptom 1: Namespace Conflicts

The five roles happened not to share names. But if #29 also submitted a performance.md focusing on a different performance dimension — one on algorithmic complexity, another on frontend rendering performance — whose performance wins?

The role filename is the namespace. No naming registry means: first come first served, or maintainer decides. The former is unfair; the latter doesn’t scale. Maintainer-decides means every naming conflict becomes a manual arbitration task — understanding the differences between two roles, judging which deserves the name, then explaining the decision to the rejected contributor. A single maintainer can handle this once or twice, but as role count grows from single digits to double digits, case-by-case rulings become a bottleneck. And the rulings aren’t transparent — future contributors can’t predict whether their chosen name will collide.

Symptom 2: Quality Variance

EP03 said role files are “safe to break” — worst case is one more noisy perspective. True, but “noisy perspectives” accumulate.

If the roles/ directory has 50 role files, 30 written precisely and 20 written vaguely, what happens to the review pipeline? OPC’s role selection algorithm (tag matching + relevance judgment) pulls vague roles into reviews too. Vague roles produce vague suggestions. The signal-to-noise ratio drops. Maintainers and users both spend time distinguishing valuable suggestions from noise.

Worse still, this erodes trust in the entire system. After receiving generic advice like “follow best practices” or “consider code quality” a few times, users start dismissing role-based review suggestions altogether — including the precise, high-value ones from well-written roles. This is Gresham’s Law applied to code review: bad roles drive out good ones, not by replacing them, but by training users to ignore all of them. Once users develop the habit of skipping role suggestions, the system is effectively dead.

One role’s noise cost is near zero. N roles’ cumulative noise cost is O(N). But the real price isn’t N itself — it’s that beyond a certain threshold of N, user trust in the entire system drops to zero.

Symptom 3: No Retirement Process

Tech domains change fast. A “React Class Component Expert” role written in 2024 might be obsolete by 2026, because the React ecosystem has largely shifted to function components and hooks. But there’s no mechanism to mark a role as “outdated” or “deprecated.” It sits in the roles/ directory indefinitely, occasionally gets selected, and gives outdated advice.

The deeper problem is information asymmetry: users can’t distinguish a role that’s “validated against the current tech stack” from one that’s “written two years ago and never touched since.” All roles in the roles/ directory look identical — same markdown files, same four-section format, no visual signal distinguishing new from old. An outdated role doesn’t just undermine its own reliability — it makes users doubt the currency of every role in the directory.

Symptom 4: No Overlap Detection

#27 Data Engineer and #28 SRE have overlapping responsibilities — data pipeline reliability belongs to both data engineering and site reliability engineering. If both roles are selected to review the same code, they might give contradictory advice — Data Engineer says “add a data validation layer,” SRE says “reduce processing steps to lower latency.”

Currently no mechanism detects overlap and potential conflicts between roles.

Symptom 5: Contributor Expectation Management

After #24-28 were submitted, contributors’ reasonable expectation is: if the PR gets merged, their role officially becomes part of OPC. But “merged” and “maintained” are two different things. If two months later the SRE role’s Anti-Patterns need updating, who updates them? The original contributor is gone. The maintainer (me) understands SRE less well than the contributor did.

This is the classic open source dilemma: merging a PR is a one-time decision; maintaining a feature is an ongoing commitment. Contributors have finite attention windows — they may change jobs, switch tech stacks, or simply lose interest. Role files differ from code contributions in a crucial way: code has tests as a safety net — break something and CI tells you. But a role’s “correctness” depends on domain knowledge that lives in the contributor’s head. When the person leaves, the knowledge leaves with them.

From “Accepting” to “Governing”

These five symptoms point to one conclusion: accepting contributions and governing contributions are two entirely different capabilities.

Accepting contributions requires: a clear interface (four-section format) + an open attitude (CONTRIBUTING.md says “welcome”) + a merge button. Any GitHub repository has these three things — the ability to accept contributions exists from day one. That’s why “accepting contributions” looks so easy: its threshold is at the tooling level, not the capability level.

Governing contributions requires a complete set of mechanisms:

Mechanism 1: Quality Standards Document

Role files need a verifiable quality checklist, not subjective “looks good to me”:

## Role Quality Checklist

### Identity (required)
- [ ] Starts with "You are a..."
- [ ] Specifies the review perspective (not generic "code quality")
- [ ] Doesn't overlap >50% with existing role Identity

### Expertise (required)
- [ ] 3-7 specific domains (not catch-alls like "best practices")
- [ ] Each domain maps to a concrete code review action

### When to Include (required)
- [ ] Trigger conditions are specific ("code touching database queries")
  not vague ("any backend changes")

### Anti-Patterns (recommended)
- [ ] At least 2, each with a concrete identification signal

This checklist transforms review standards from “feeling in the maintainer’s head” to “mechanically checkable rules.” New contributors can self-check before submitting PRs. Reviewers can check against the list item by item.

Mechanism 2: Name Registry

A simple roles/REGISTRY.md file:

| Name | Maintainer | Status | Added |
|------|-----------|--------|-------|
| frontend | @maintainer | active | 2025-01-15 |
| performance | @contributor-24 | active | 2026-05-15 |
| ...

Check the registry before submitting a new role. Same name = either merge or rename. The registry also records maintainers — who’s responsible for ongoing updates.
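The "check the registry before submitting" step could be automated with a few lines. The sketch below assumes the registry keeps the four-column table shown above; `parse_registry` and `find_conflict` are hypothetical helper names, not part of OPC.

```python
from pathlib import PurePosixPath


def parse_registry(markdown):
    """Parse the registry table into a dict keyed by lowercased role name."""
    rows, header = {}, None
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        if header is None:
            header = [c.lower() for c in cells]  # first table row is the header
            continue
        if all(set(c) <= set("-: ") for c in cells):
            continue  # separator row such as |------|------|
        if len(cells) != len(header):
            continue  # skip malformed or elided rows
        row = dict(zip(header, cells))
        rows[row["name"].lower()] = row
    return rows


def find_conflict(registry, role_filename):
    """Return the registry row that already owns this name, or None."""
    name = PurePosixPath(role_filename).stem.lower()
    return registry.get(name)
```

A PR check could call `find_conflict(registry, "roles/performance.md")` and, when a row comes back, point the contributor at the listed maintainer to discuss merging or renaming.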

Mechanism 3: Lifecycle Management

Each role has a status label:

active: Normal use
deprecated: Marked as outdated, replacement suggested, deleted after 6 months
experimental: Newly submitted, in trial period, excluded from official review flows

Status transitions have rules: an active role with no updates for 12+ months is automatically flagged as review-needed. The maintainer then decides whether to update it or deprecate it.
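These transition rules are simple enough to express as a function. The following is a sketch under stated assumptions: months are approximated in days, the function and constant names are hypothetical, and how the harness would store `last_updated` or `deprecated_on` is not specified in this series.

```python
from datetime import date, timedelta

STALE_AFTER = timedelta(days=365)   # "12+ months", approximated as 365 days
DELETE_AFTER = timedelta(days=183)  # "6 months", approximated as 183 days


def next_action(status, last_updated, today, deprecated_on=None):
    """Return the action the lifecycle rules suggest for one role."""
    if status == "active" and today - last_updated >= STALE_AFTER:
        return "mark review-needed"
    if status == "deprecated" and deprecated_on is not None:
        if today - deprecated_on >= DELETE_AFTER:
            return "delete"
    return "none"


def in_official_flow(status):
    """Experimental roles are excluded from official review flows.

    Treating every other status as included is an assumption of this sketch.
    """
    return status != "experimental"
```

For example, `next_action("active", date(2025, 1, 1), date(2026, 1, 1))` returns `"mark review-needed"`, which hands the update-or-deprecate decision back to the maintainer rather than changing the role automatically.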

Mechanism 4: Conflict Detection

An automated script scans all roles’ Expertise lists, using semantic similarity to detect overlaps. Overlap exceeding threshold → automatic warning flag during PR review. Doesn’t auto-reject — overlap isn’t necessarily bad (different angles on the same domain), but needs human confirmation.
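A rough version of such a scan fits in a short script. Note the substitution: instead of true semantic similarity (which would need an embedding model), this sketch uses Jaccard similarity over word tokens as a cheap stand-in, so it only catches overlaps that share vocabulary. The threshold value and all function names are hypothetical.

```python
import re
from itertools import combinations


def tokens(expertise):
    """Lowercased word set across a role's Expertise items."""
    return set(re.findall(r"[a-z]+", " ".join(expertise).lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_warnings(roles, threshold=0.3):
    """Return (role_a, role_b, score) for pairs at or above the threshold.

    `roles` maps a role name to its list of Expertise strings.
    The result is a list of warnings for a human to confirm, not rejections.
    """
    token_sets = {name: tokens(items) for name, items in roles.items()}
    warnings = []
    for a, b in combinations(sorted(token_sets), 2):
        score = jaccard(token_sets[a], token_sets[b])
        if score >= threshold:
            warnings.append((a, b, score))
    return warnings
```

Returning warnings rather than failing the PR keeps the behavior described above: overlap gets flagged during review, and a person decides whether two roles are redundant or simply approach the same domain from different angles.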

The Cost of Governance

Let’s be direct: for a 163-star project, the above four mechanisms might be over-engineering.

Five role PRs. Not fifty, not five hundred. Building a full governance system to manage five markdown files has a low ROI.

But governance needs aren’t judged by current count — they’re judged by growth trends and the cost of losing control. If role PRs continue at a similar pace, that could mean 15-20 roles in six months. If governance starts then, existing roles need retroactive review, already-merged low-quality roles need deprecation, and contributors will feel the rules changed suddenly.

Retroactive governance is psychologically harder than upfront governance. Upfront governance hands new contributors a checklist before they’ve invested time — the psychological cost of acceptance is near zero. Retroactive governance tells existing contributors “your role doesn’t meet standards, please revise or we’ll deprecate it” — they’ve already invested time and identity, and it feels like a betrayal. This isn’t hypothetical — nearly every open source project that transitions from “submit whatever” to “we have standards” goes through this friction.

The best time for governance is before problems appear, but not before contributions appear. Building governance at zero contributions is waste. Five contributions with no governance in place is exactly the moment to start.

Minimum Viable Governance for Open Source

Given the project’s scale (1 maintainer, 5 external role contributions), a pragmatic minimum viable governance plan looks like this:

Do now:

Add quality checklist to CONTRIBUTING.md (10 minutes of work, highest ROI)
Name registry roles/REGISTRY.md (30 minutes of work)

Do at 10 roles:

Lifecycle labels (requires changing harness role loading logic)
Automated conflict detection (requires semantic similarity tooling)

Do at 3+ external maintainers:

Role maintainer assignment mechanism
Role deprecation voting process

This is progressive governance: not designing a perfect system once, but adding rules incrementally as contribution volume grows. It mirrors S2’s core thesis that products expose the toolchain’s blind spots and blind spots drive the toolchain’s evolution. Here, contributions expose governance gaps, and gaps drive governance construction.
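The staged plan can be written down as data plus one function, which makes the thresholds explicit and easy to revisit. This is only an illustration of the tiers listed above; `due_mechanisms` is a hypothetical name.

```python
def due_mechanisms(role_count, external_maintainers):
    """List the governance mechanisms the staged plan says should exist by now."""
    # "Do now" tier: applies regardless of scale.
    due = ["quality checklist in CONTRIBUTING.md", "roles/REGISTRY.md name registry"]
    if role_count >= 10:
        due += ["lifecycle labels", "automated conflict detection"]
    if external_maintainers >= 3:
        due += ["role maintainer assignment", "role deprecation voting"]
    return due
```

At the project's current scale, `due_mechanisms(5, 0)` returns only the two "do now" items.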

Open Source Without Governance Isn’t an Ecosystem — It’s Accumulation

This episode’s core argument:

Accepting contributions ≠ the ability to govern contributions.

A GitHub repository with PR permissions enabled can accept contributions. But accepting isn’t digesting. Five role PRs were merged — but who’s responsible for their quality? Who’s responsible for keeping them updated? Who arbitrates when two roles conflict?

Without these answers, the roles/ directory isn’t an ecosystem — it’s a dumping ground. New roles come in, old roles never leave. Good roles and mediocre roles mix together. Nobody knows which roles are verified and which are just “someone submitted it so I merged it.”

From a trust perspective: between the contribution layer (layer 3) and the resilience layer (layer 5), what’s missing is precisely this governance layer. Without governance, more contributions make the system more fragile — because every unscrutinized contribution is a potential quality debt. And quality debt, like technical debt, compounds: the longer you wait to pay it down, the higher the cleanup cost. Five unreviewed roles can still be fixed one by one at manageable cost. But let it grow to fifty, and the cleanup effort becomes so daunting that nobody wants to touch the directory — it becomes “legacy” that everyone acknowledges is problematic but nobody is willing to address.

Next episode looks back across the full season: from infrastructure to core, the five-layer trust transfer ledger.
