Curation Pipeline

Accepted

The Curation Pipeline is the continuous process that keeps the Component Library and Module Library current, accurate, and trustworthy. It is the mechanism by which Almathal improves over time: not by waiting for the next foundation model, but by curating the building blocks the system composes from.

Pipeline Stages

Discovery → Evaluation → Ratification → Release

Each stage has both agent automation and human oversight, in proportions that evolve with system maturity.

Stage 1: Discovery

Goal: Identify candidates for admission, update, or deprecation.

Discovery sources

Source	Watches for
Public repos & registries	New releases of curated libraries, new high-quality libraries in target categories
Security advisories	CVEs against curated libraries, vulnerable transitive dependencies
Generated app usage patterns	Adapters that appear together frequently (Module candidates), unmatched user requests (Archetype candidates)
Curated trend feeds	New language ecosystem developments, framework migrations
Manual proposals	Platform team or customer requests for specific additions

Discovery is primarily agent-driven. Agents continuously scan and queue candidates for evaluation.
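
One way an agent might surface Module candidates from usage patterns is to count how often pairs of Adapters appear together in generated apps. The sketch below is illustrative only: the function name, the adapter names, and the `min_support` threshold are assumptions, not part of the platform.

```python
from collections import Counter
from itertools import combinations


def module_candidates(apps: list[set[str]], min_support: int) -> list[tuple[str, str]]:
    """Return Adapter pairs that co-occur in at least `min_support` apps."""
    counts: Counter = Counter()
    for adapters in apps:
        # Sort so that each pair is counted under one canonical ordering.
        counts.update(combinations(sorted(adapters), 2))
    return sorted(pair for pair, n in counts.items() if n >= min_support)
```

Pairs above the threshold would be queued for evaluation like any other candidate; the count grants no trust on its own.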

Discovery cadence

Daily: CVE feeds, security advisories for curated libraries
Weekly: New releases of curated libraries
Weekly: GitHub trending and high-velocity repos in target categories
Monthly: Usage pattern analysis for Module and Archetype candidates
Ad hoc: Manual proposals
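
The cadence above could be expressed as a small schedule table that a scanning agent consults on each run. This is a minimal sketch: the source keys are hypothetical, and "monthly" is approximated as 30 days.

```python
from datetime import date, timedelta

# Hypothetical source names; intervals mirror the cadence listed above.
CADENCE_DAYS = {
    "cve_feeds": 1,
    "curated_releases": 7,
    "trending_repos": 7,
    "usage_patterns": 30,  # assumption: "monthly" treated as 30 days
}


def due_sources(last_run: dict[str, date], today: date) -> list[str]:
    """Return the sources whose scan interval has elapsed, sorted by name."""
    due = []
    for source, interval in CADENCE_DAYS.items():
        last = last_run.get(source)
        # A source that has never run is always due.
        if last is None or today - last >= timedelta(days=interval):
            due.append(source)
    return sorted(due)
```

Manual proposals are ad hoc and so sit outside the schedule.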

Stage 2: Evaluation

Goal: Determine whether a candidate is admission-worthy and what its Manifest should declare.

Automated evaluation

Check	Question
License scan	Is the license compatible with enterprise distribution?
CVE scan	Are there open vulnerabilities?
Maintenance signals	Is the project actively maintained (commits, release cadence, issue resolution, contributor count)?
API stability	Has the surface stayed stable across recent versions?
Adoption signals	Is it widely adopted (GitHub stars, package download counts, presence in successful projects)?
Compatibility check	Does this candidate compose cleanly with existing library entries?
Regression test	If admitting an update, do existing Manifests/Archetypes still resolve?
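
The checks in the table can be thought of as independent functions that each return a structured result. The sketch below shows two of them under stated assumptions: the candidate is a plain dict, the field names are hypothetical, and the license allowlist is purely illustrative.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CheckResult:
    name: str
    passed: bool
    detail: str = ""


# A check takes candidate metadata and returns a result.
Check = Callable[[dict], CheckResult]


def license_scan(candidate: dict) -> CheckResult:
    allowed = {"MIT", "Apache-2.0", "BSD-3-Clause"}  # illustrative allowlist
    lic = candidate.get("license", "unknown")
    return CheckResult("license_scan", lic in allowed, lic)


def cve_scan(candidate: dict) -> CheckResult:
    open_cves = candidate.get("open_cves", [])
    return CheckResult("cve_scan", not open_cves, ", ".join(open_cves))


def evaluate(candidate: dict, checks: list[Check]) -> list[CheckResult]:
    """Run every check and collect the results without deciding admission."""
    return [check(candidate) for check in checks]
```

The function returns all results rather than a single pass/fail verdict, which keeps the admission decision in Ratification.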

Drafted Manifest

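Based on the evaluation results, an agent (or, in Stage 1 of Manifest authoring, a human-directed LLM; see ADR-0018) drafts the candidate’s Manifest. "Stage 1" here refers to the authoring stage, not to the Discovery stage of this pipeline. The draft is the input to Ratification.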

Risk Classification

Candidates are classified for ratification routing:

Low risk: Minor version updates to active Adapters in mature categories with no Capability or Contract changes
Medium risk: New Adapter additions in mature categories, major version updates to active Adapters
High risk: New categories, security-critical Adapters (cryptography, auth, payments), Capability namespace changes, deprecations
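
The classification rules above could be sketched as a single function. The field names and change labels are hypothetical, and the fallback to medium risk for cases the rules do not cover (for example, a minor update in an immature category) is an assumption of this sketch.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class Candidate:
    change: str  # "minor_update", "major_update", "new_adapter", "deprecation"
    mature_category: bool = True
    new_category: bool = False
    security_critical: bool = False
    touches_capability_namespace: bool = False
    changes_capability_or_contract: bool = False


def classify(c: Candidate) -> Risk:
    # High-risk conditions win over everything else.
    if (c.new_category or c.security_critical
            or c.touches_capability_namespace or c.change == "deprecation"):
        return Risk.HIGH
    if (c.change == "minor_update" and c.mature_category
            and not c.changes_capability_or_contract):
        return Risk.LOW
    # Assumption: anything not clearly low or high is treated as medium.
    return Risk.MEDIUM
```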

Stage 3: Ratification

Goal: Decide whether to admit the candidate.

Routing by Risk

Risk	Ratifier
Low (post-MVP, with Stage 3 enabled)	Agent ratifies; human samples post-hoc
Medium	Human Reviewer
High	Human Reviewer, plus Security Reviewer for security-critical candidates
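
The routing table can be read as a lookup from risk level to the set of ratifiers. In the sketch below the role names are hypothetical, and sending low-risk candidates to a human reviewer while agent ratification is not yet enabled is an assumption; the table only specifies the enabled case.

```python
def ratifiers(risk: str, security_critical: bool = False,
              agent_ratification_enabled: bool = False) -> list[str]:
    """Return the roles that must ratify a candidate of the given risk."""
    if risk == "low":
        if agent_ratification_enabled:
            return ["agent", "human_post_hoc_sample"]
        # Assumption: without agent ratification, a human reviews instead.
        return ["human_reviewer"]
    if risk == "medium":
        return ["human_reviewer"]
    if risk == "high":
        reviewers = ["human_reviewer"]
        if security_critical:
            reviewers.append("security_reviewer")
        return reviewers
    raise ValueError(f"unknown risk level: {risk!r}")
```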

Decision Record

Every ratification decision (admit, reject, or defer) is logged with:

Candidate identity
Decision and reasoning
Reviewer identity (or “agent” with timestamp)
Manifest version admitted (or rejected)

The decision log is part of the platform’s audit trail.
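
A decision record with the fields listed above might look like the following. This is a sketch only: the field names, the JSON-lines encoding, and the in-memory list standing in for the log are assumptions.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone

DECISIONS = {"admit", "reject", "defer"}


@dataclass(frozen=True)
class DecisionRecord:
    candidate_id: str
    decision: str          # one of DECISIONS
    reasoning: str
    reviewer: str          # reviewer identity, or "agent"
    manifest_version: str
    timestamp: str         # ISO 8601, UTC

    def __post_init__(self) -> None:
        if self.decision not in DECISIONS:
            raise ValueError(f"invalid decision: {self.decision!r}")


def record_decision(log: list[str], **fields: str) -> DecisionRecord:
    """Validate a decision and append it to the log as one JSON line."""
    fields.setdefault("timestamp", datetime.now(timezone.utc).isoformat())
    record = DecisionRecord(**fields)
    log.append(json.dumps(asdict(record), sort_keys=True))
    return record
```

Every record carries a timestamp here, which covers the agent case and keeps human decisions equally traceable.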

What admission means

When a candidate is admitted:

A UUID is assigned (if new) or preserved (if updating)
The Manifest is committed to the central registry
Existing dependent Manifests are checked for compatibility
The change is queued for the next platform release
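
The admission steps above can be sketched against an in-memory registry. The dict-based registry, the `depends_on` field, and the function name are hypothetical. The compatibility check itself is left out: the sketch only identifies which dependents need rechecking.

```python
import uuid
from typing import Optional


def admit(registry: dict[str, dict], release_queue: list[str],
          manifest: dict, existing_id: Optional[str] = None) -> tuple[str, list[str]]:
    """Commit a Manifest and return its ID plus the dependents to recheck."""
    if existing_id is not None and existing_id not in registry:
        raise KeyError(f"unknown component: {existing_id}")
    # Assign a new UUID for new entries; preserve the ID on updates.
    component_id = existing_id if existing_id is not None else str(uuid.uuid4())
    registry[component_id] = manifest
    # Dependent Manifests are those that declare this component in depends_on.
    dependents = sorted(
        cid for cid, m in registry.items()
        if component_id in m.get("depends_on", [])
    )
    release_queue.append(component_id)
    return component_id, dependents
```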

Stage 4: Release

Goal: Ship admitted changes to customers in versioned releases.

Release Cadence

Patch releases: Weekly, including security-critical Manifest updates
Minor releases: Monthly, including new Adapters and Modules
Major releases: As needed, including schema changes and breaking improvements
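
One reading of the cadence is that a batch of admitted changes ships in the smallest release type that can carry all of them. The mapping below follows the list above, but the change labels and the "highest level wins" rule are assumptions of this sketch.

```python
LEVELS = ["patch", "minor", "major"]

# Hypothetical change labels mapped to the release type that carries them.
CHANGE_TO_RELEASE = {
    "security_manifest_update": "patch",
    "new_adapter": "minor",
    "new_module": "minor",
    "schema_change": "major",
    "breaking_change": "major",
}


def release_type(changes: list[str]) -> str:
    """Return the highest release level required by any change in the batch."""
    if not changes:
        raise ValueError("no changes to release")
    return max((CHANGE_TO_RELEASE[c] for c in changes), key=LEVELS.index)
```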

What a Release Contains

A release bundles:

New Manifests admitted since last release
Updated Manifests for existing Adapters and Modules
Deprecation notifications
New Capabilities and Contracts admitted to the namespaces
New Archetypes
Compatibility matrix updates
Migration tooling for any breaking changes

Customer Impact

Customers see versioned releases and upgrade on their own schedule. Generated apps reference the platform version at build time, so an app built against v0.5 keeps referencing v0.5’s library contents until the customer regenerates it against a newer version.
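
Version pinning of this kind can be illustrated with a lookup keyed by platform version. The release contents, component names, and version numbers below are made up for the example.

```python
# Hypothetical snapshot of library contents per platform release.
RELEASES = {
    "v0.5": {"adapter-x": "1.2.0"},
    "v0.6": {"adapter-x": "1.3.0", "adapter-y": "0.9.0"},
}


def resolve(app: dict, component: str) -> str:
    """Look up a component version in the release the app was built against."""
    return RELEASES[app["platform_version"]][component]


def regenerate(app: dict, new_version: str) -> dict:
    """Return a copy of the app pinned to a newer platform version."""
    if new_version not in RELEASES:
        raise KeyError(f"unknown platform version: {new_version}")
    return {**app, "platform_version": new_version}
```

Publishing v0.6 leaves the v0.5 app's resolution unchanged; only regeneration moves the pin.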

Trust Boundaries

The Curation Pipeline embeds trust decisions at every stage:

Discovery is broad and inclusive; no trust granted yet
Evaluation is mechanical; results are inputs to ratification, not trust assertions
Ratification is where trust is granted; human review applies wherever risk is non-trivial
Release is the contract: customers can rely on what was admitted in their installed version

The model combines Anthropic-style ML safety with enterprise change management: agents do volume, humans do judgment, and mechanical checks gate both.

Customer Visibility

Customers have read access to:

The current Manifest registry (what they have access to in their installed version)
Audit trails for the components in their generated apps
Upcoming changes (release previews)
The decision log for changes affecting them (e.g., why an Adapter was deprecated)

Customers do not have write access to the registry. Customer-specific extensions are scoped to their own template library (ADR-0014).

When the Pipeline Fails

Failure modes and mitigations:

Failure	Mitigation
Bad Manifest admitted by mistake	Rollback in the next release; root-cause analysis; reviewer training
Library upstream takedown	Verified artifact mirror (ADR-0017) preserves access
License change in upstream library	Curation team alerted; replacement candidate evaluated; deprecation if no replacement
CVE discovered post-admission	Patch release with updated version or deprecation; customer notification
Agent ratification of low-risk update introduces regression	Stage 3 agent ratification paused for that category; category downgraded to Stage 2; root-cause analysis
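
The last mitigation in the table, pausing agent ratification for a single category, could be modeled as a per-category stage map. The category names and the integer stage encoding are assumptions of this sketch.

```python
def pause_agent_ratification(stages: dict[str, int], category: str) -> dict[str, int]:
    """Return a copy of the stage map with `category` moved from Stage 3 to Stage 2."""
    updated = dict(stages)
    # Only categories currently at Stage 3 are affected.
    if updated.get(category) == 3:
        updated[category] = 2
    return updated
```

Other categories keep their stage, so a regression in one area does not halt automation everywhere.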

Governance → Manifest Authoring
Governance → Capability Namespace
Architecture Overview
ADR-0001: Retrieval-First Architecture
ADR-0018: LLM-Drafted Human-Directed Manifest Authoring