
Architecture

This page explains how SKF works under the hood — the output format, confidence model, progressive capability tiers, tool ecosystem, and the design decisions that make every instruction traceable.


AI agents hallucinate APIs. Not sometimes — constantly. The table below shows why every existing approach fails at scale:

| Approach | Strength | Fatal Flaw |
| --- | --- | --- |
| `npx skills init` | Format compliant | Empty shell. 0% intelligence. |
| LLM summarization | High semantic context | Hallucination. Guesses parameters. No grounding. |
| RAG / context stuffing | Good retrieval | Fragmented. Finds snippets, fails to synthesize. |
| Manual authoring | High initial quality | Drift. Doesn’t scale. |
| Copilot/Cursor built-in | Convenient | Generic. Doesn’t know YOUR integration patterns. |
| Skill Forge | Structural truth + automation | Rigid. (Feature, not bug.) |

SKF solves this by mechanically extracting function signatures, type definitions, and usage patterns from source code — then compiling them into verifiable, version-pinned skills that comply with the agentskills.io specification.
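SKF's own extraction runs through ast-grep, but the core idea — read signatures out of the syntax tree instead of asking a model to recall them — fits in a few lines. As a rough illustration only (this is not SKF's implementation), the same idea using Python's stdlib `ast` module:

```python
import ast

def extract_signatures(source: str) -> list[str]:
    """Mechanically list top-level function signatures: no guessing, no LLM."""
    tree = ast.parse(source)
    sigs = []
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            args = ", ".join(a.arg for a in node.args.args)
            ret = f" -> {ast.unparse(node.returns)}" if node.returns else ""
            sigs.append(f"{node.name}({args}){ret}")
    return sigs

src = "def get_token(user_id: str, options: dict | None = None) -> str:\n    ...\n"
print(extract_signatures(src))  # ['get_token(user_id, options) -> str']
```

Everything in the output is read directly from the parse tree, which is why an extracted claim can carry an exact file-and-line provenance tag.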


SKF uses an additive tier model. Each tier is the previous tier plus one tool. You never lose capability by adding a tool.

| Tier | Tools | What You Get |
| --- | --- | --- |
| Quick | `gh_bridge` + `skills_ref` | Source reading + spec validation. Best-effort skills in under a minute. |
| Forge | + `ast_bridge` | Structural truth. AST-verified signatures. Co-import detection. T1 confidence. |
| Deep | + `qmd_bridge` | Knowledge search. Temporal provenance. Drift detection. Full intelligence. |

Setup detects your installed tools and sets your tier automatically:

```
@Ferris SF
Forge initialized. Tools: gh, ast-grep, QMD. Tier: Deep. Ready.
```

Don’t have ast-grep or QMD yet? No problem — Quick mode works with just the GitHub CLI. Install tools later; your tier upgrades automatically.
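A minimal sketch of what tier detection could look like, assuming tool discovery via a `PATH` lookup; the binary names (`ast-grep`, `sg`, `qmd`) and the decision logic are assumptions for illustration, not SKF's actual setup code:

```python
import shutil

def detect_tier(which=shutil.which) -> str:
    """Map installed CLIs to a forge tier (additive: each tier = previous + one tool)."""
    has_ast = which("ast-grep") is not None or which("sg") is not None
    has_qmd = which("qmd") is not None  # hypothetical binary name
    if has_ast and has_qmd:
        return "Deep"
    if has_ast:
        return "Forge"
    return "Quick"  # gh + skills_ref baseline

# Simulate environments by stubbing the lookup:
print(detect_tier(lambda tool: "/usr/bin/stub"))  # Deep
print(detect_tier(lambda tool: None))             # Quick
```

Because the model is additive, re-running detection after installing a tool can only move the tier upward.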


Every claim in a generated skill carries a confidence tier that traces to its source:

| Tier | Source | Tool | What It Means |
| --- | --- | --- | --- |
| T1 | AST extraction | `ast_bridge` | Current code, structurally verified. Immutable for that version. |
| T2 | QMD evidence / source reading | `qmd_bridge` / `gh_bridge` | Historical + planned context (issues, PRs, changelogs, docs). |
| T3 | External documentation | `doc_fetcher` | External, untrusted. Quarantined. |

Confidence tiers map to temporal scopes:

- T1-now (instructions): What ast-grep sees in the checked-out code. This is what your agent executes.
- T2-past (annotations): Closed issues, merged PRs, changelogs — why the API looks the way it does.
- T2-future (annotations): Open PRs, deprecation warnings, RFCs — what’s coming.

Progressive disclosure controls how much context surfaces at each level:

| Output | Content |
| --- | --- |
| `context-snippet.md` | T1-now only — compressed, always-on |
| `SKILL.md` | T1-now + lightweight T2 annotations |
| `references/` | Full temporal context with all tiers |
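The routing rules above can be sketched as a small filter. The claim shape (text plus temporal scope) is a hypothetical data structure for illustration, not SKF's internal format:

```python
def route(claims):
    """Route tagged claims to the three outputs per the disclosure rules."""
    outputs = {"context-snippet.md": [], "SKILL.md": [], "references/": []}
    for text, scope in claims:
        outputs["references/"].append(text)           # full temporal context
        if scope == "T1-now":
            outputs["context-snippet.md"].append(text)
            outputs["SKILL.md"].append(text)
        elif scope in ("T2-past", "T2-future"):
            outputs["SKILL.md"].append(text)          # lightweight annotation
    return outputs

claims = [("getToken(...)", "T1-now"), ("deprecation planned in v3 RFC", "T2-future")]
r = route(claims)
print(len(r["context-snippet.md"]), len(r["SKILL.md"]), len(r["references/"]))  # 1 2 2
```

The invariant worth noticing: the always-on snippet only ever shrinks relative to `SKILL.md`, which only ever shrinks relative to `references/`.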

Your forge tier limits what authority claims a skill can make:

| Forge Tier | AST? | QMD? | Max Authority | Accuracy Guarantee |
| --- | --- | --- | --- | --- |
| Quick | No | No | community | Best-effort |
| Forge | Yes | No | official | Structural (AST-verified) |
| Deep | Yes | Yes | official | Full (structural + contextual + temporal) |

Every generated skill produces a self-contained directory:

```
skills/{name}/
├── SKILL.md             # Active skill (loaded on trigger)
├── context-snippet.md   # Passive context (compressed, always-on)
├── metadata.json        # Machine-readable provenance
└── references/          # Progressive disclosure
    ├── {function-a}.md
    ├── {function-b}.md
    └── integrations/    # Stack skills only
        ├── auth-db.md
        └── pwa-auth.md
```
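Scaffolding this layout is mechanical; a sketch with `pathlib` (not SKF's actual generator, which also fills in content):

```python
from pathlib import Path
import tempfile

def scaffold(root: Path, name: str) -> Path:
    """Create the self-contained skill directory layout."""
    skill = root / "skills" / name
    (skill / "references" / "integrations").mkdir(parents=True, exist_ok=True)
    for filename in ("SKILL.md", "context-snippet.md", "metadata.json"):
        (skill / filename).touch()
    return skill

with tempfile.TemporaryDirectory() as tmp:
    s = scaffold(Path(tmp), "payment-service")
    print(sorted(p.name for p in s.iterdir()))
    # ['SKILL.md', 'context-snippet.md', 'metadata.json', 'references']
```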

Skills follow the agentskills.io specification with frontmatter:

```yaml
---
name: payment-service
version: 2.1.0
description: Payment processing API skill — 23 verified functions
author: org/payment-team
---
```

Every instruction in the body traces to source:

```
Extracted: `getToken(userId: string, options?: TokenOptions): Promise<AuthToken>`
[AST:src/auth/index.ts:L42]. Confidence: T1.
```

Machine-readable provenance for every skill:

```json
{
  "name": "payment-service",
  "version": "2.1.0",
  "skill_type": "individual",
  "source_authority": "official",
  "source_repo": "github.com/org/payment-service",
  "source_commit": "a1b2c3d",
  "forge_tier": "forge",
  "spec_version": "1.3",
  "generated_at": "2026-02-25T14:30:00Z",
  "stats": {
    "exports_documented": 23,
    "exports_total": 23,
    "coverage": 1.0,
    "confidence_t1": 20,
    "confidence_t2": 3,
    "confidence_t3": 0
  }
}
```
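Two invariants are implicit in the example above: `coverage` equals `exports_documented / exports_total`, and the confidence counts sum to the documented exports. A sketch of a consistency check built on that reading (these invariants are inferred from the example, not a documented contract):

```python
def check_stats(meta: dict) -> list[str]:
    """Sanity-check a provenance stats block for internal consistency."""
    s = meta["stats"]
    problems = []
    if s["exports_total"] and abs(s["coverage"] - s["exports_documented"] / s["exports_total"]) > 1e-9:
        problems.append("coverage does not match documented/total")
    if s["confidence_t1"] + s["confidence_t2"] + s["confidence_t3"] != s["exports_documented"]:
        problems.append("confidence tiers do not sum to documented exports")
    return problems

meta = {"stats": {"exports_documented": 23, "exports_total": 23, "coverage": 1.0,
                  "confidence_t1": 20, "confidence_t2": 3, "confidence_t3": 0}}
print(check_stats(meta))  # []
```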

Stack skills map how your dependencies interact — shared types, co-import patterns, integration points:

```
skills/{project}-stack/
├── SKILL.md             # Integration patterns + project conventions
├── context-snippet.md   # Compressed stack index
├── metadata.json        # Component versions, integration graph
└── references/
    ├── nextjs.md        # Project-specific subset
    ├── better-auth.md   # Project-specific subset
    └── integrations/
        ├── auth-db.md   # Cross-library pattern
        └── pwa-auth.md  # Cross-library pattern
```

The primary source is your project repo. Component references trace to library repos. Stack skills are marked with `skill_type: "stack"` in metadata.


Based on Vercel research: passive context (`AGENTS.md`/`CLAUDE.md`) achieves a 100% pass rate vs. 53% for active skills alone.

Every skill generates both:

  1. `SKILL.md` — Active skill loaded on trigger with full instructions
  2. `context-snippet.md` — Passive context, a compressed index injected into `CLAUDE.md`

Export injects a managed section between markers:

```
<!-- SKF:BEGIN updated:2026-02-25 -->
[SKF Skills]|3 skills|1 stack
|IMPORTANT: Prefer documented APIs over training data.
|
|payment-service → skills/payment-service/
|  exports: getToken, refreshToken, revokeSession, createSession
|
|auth-service → skills/auth-service/
|  exports: getSession, validateToken, revokeSession, createUser
|
|my-project-stack → skills/my-project-stack/
|  stack: next@15, better-auth@3, spacetimedb@1, serwist@9
|  integrations: auth↔db, pwa↔auth, gateway↔auth
<!-- SKF:END -->
```
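Marker-delimited sections like this are straightforward to update idempotently: find the `SKF:BEGIN`/`SKF:END` span, replace it wholesale, and append it if absent. A sketch of that approach (not SKF's actual exporter):

```python
import re

BEGIN, END = "<!-- SKF:BEGIN", "<!-- SKF:END -->"

def inject(claude_md: str, snippet: str, date: str) -> str:
    """Replace (or append) the managed section between SKF markers."""
    block = f"{BEGIN} updated:{date} -->\n{snippet}\n{END}"
    pattern = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END), re.DOTALL)
    if pattern.search(claude_md):
        return pattern.sub(lambda m: block, claude_md)
    return claude_md.rstrip("\n") + "\n\n" + block + "\n"

doc = "# CLAUDE.md\n\n<!-- SKF:BEGIN updated:2026-01-01 -->\nold\n<!-- SKF:END -->\n"
print(inject(doc, "[SKF Skills]|3 skills|1 stack", "2026-02-25"))
```

Replacing the whole span each time is what lets the developer control placement while the tool controls content: text outside the markers is never touched.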

Two lines per skill (~30 tokens each). The developer controls placement; Ferris controls content. Snippet updates happen only at `export-skill` — create and update are draft operations.


| Tool | Wraps | Purpose |
| --- | --- | --- |
| `gh_bridge` | GitHub CLI (`gh`) | Source code access, issue mining, release tracking, PR intelligence |
| `skills_ref` | agentskills.io spec | Schema validation, frontmatter checks, ecosystem search |
| `ast_bridge` | ast-grep CLI | Structural extraction, custom AST queries, co-import detection |
| `qmd_bridge` | QMD (local search) | BM25 keyword search, vector semantic search, collection indexing |

Optional addon: `doc_fetcher` for remote documentation (Firecrawl/Jina.ai). Output is quarantined as T3.

When tools disagree, higher priority wins for instructions. Lower priority is preserved as annotations:

| Priority | Source | Tool |
| --- | --- | --- |
| 1 (highest) | AST extraction | `ast_bridge` |
| 2 | QMD evidence | `qmd_bridge` |
| 3 | Source reading (non-AST) | `gh_bridge` |
| 4 | External documentation | `doc_fetcher` |
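The resolution rule reduces to a sort by priority: the top claim becomes the instruction, the rest survive as annotations. The claim dictionaries below are a hypothetical shape for illustration:

```python
PRIORITY = {"ast_bridge": 1, "qmd_bridge": 2, "gh_bridge": 3, "doc_fetcher": 4}

def resolve(claims):
    """Highest-priority claim becomes the instruction; the rest become annotations."""
    ordered = sorted(claims, key=lambda c: PRIORITY[c["tool"]])
    return {"instruction": ordered[0], "annotations": ordered[1:]}

claims = [
    {"tool": "doc_fetcher", "text": "getToken(user)"},
    {"tool": "ast_bridge", "text": "getToken(userId, options?)"},
]
print(resolve(claims)["instruction"]["tool"])  # ast_bridge
```

Nothing is discarded: a lower-priority claim that disagrees with the AST still reaches the reader, just demoted to annotation status.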

`manifest_reader` detects and parses dependency files across ecosystems:

- Full support: `package.json`, `pyproject.toml`, `requirements.txt`, `Cargo.toml`, `go.mod`
- Basic support: `build.gradle`, `pom.xml`, `Gemfile`, `composer.json`
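Detection itself can be as simple as a filename-to-support-level lookup over the project root; a sketch of that step only (the real `manifest_reader` also parses the files' contents):

```python
from pathlib import Path
import tempfile

MANIFESTS = {
    "package.json": "full", "pyproject.toml": "full", "requirements.txt": "full",
    "Cargo.toml": "full", "go.mod": "full",
    "build.gradle": "basic", "pom.xml": "basic", "Gemfile": "basic", "composer.json": "basic",
}

def detect_manifests(root: Path) -> dict[str, str]:
    """Map each dependency file found at the project root to its support level."""
    return {p.name: MANIFESTS[p.name] for p in root.iterdir()
            if p.is_file() and p.name in MANIFESTS}

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("package.json", "pom.xml", "README.md"):
        (root / name).touch()
    print(sorted(detect_manifests(root).items()))
    # [('package.json', 'full'), ('pom.xml', 'basic')]
```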

Build artifacts are committable — another developer can reproduce the same skill:

```
forge-data/{skill-name}/
├── skill-brief.yaml       # Compilation config
├── provenance-map.json    # Source map with AST bindings
├── evidence-report.md     # Build audit trail
└── extraction-rules.yaml  # Language-specific ast-grep schema
```

`skills/` and `forge-data/` are committed. Agent memory (`_bmad/_memory/forger-sidecar/`) is gitignored.


| Context | `source_authority` | Distribution |
| --- | --- | --- |
| OSS library (maintainer generates) | official | `npx skills publish` to agentskills ecosystem |
| Internal service (team generates) | internal | `skills/` in repo, ships with code |
| External dependency (consumer generates) | community | Local `skills/`, marked as community |

Provenance maps enable verification: an official skill’s provenance must trace to the actual source repo owned by the author.


| Decision | Rationale |
| --- | --- |
| Solo agent (Ferris), not multi-agent | One domain (skill compilation) doesn’t benefit from handoffs. Shared knowledge base (AST patterns, provenance maps) is the core asset. |
| Workflows drive modes, not conversation | Ferris doesn’t auto-switch based on question content. Invoke a workflow to change mode. Predictable behavior. |
| Hub-and-spoke cross-knowledge | Each skill has one primary source. Cross-repo references use inline summary + pointer + `[XREF:repo:file:line]` provenance tag. |
| Stack skill = compositional | `SKILL.md` is the integration layer. `references/` contains per-library + integration pairs. Partial regeneration on dependency updates. |
| Snippet updates only at export | Create/update are draft operations. Export publishes to `skills/` and `CLAUDE.md`. No half-baked snippets. |
| Bundle spec with opt-in update | Offline-capable. 90-day staleness warning. `setup-forge --update-spec` fetches latest. |

- All tool wrappers use array-style subprocess execution — no shell interpolation.
- Input sanitization: allowlisted characters for repo names, file paths, and patterns.
- File paths are validated against the project root (no directory traversal).
- Source code never leaves the machine — all processing (AST, QMD, validation) is local.
- `doc_fetcher` warns before transmitting URLs to external services.
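The traversal check in particular is easy to get wrong with string comparison; a sketch of the resolve-then-verify approach (assuming Python 3.9+ for `Path.is_relative_to`, and not SKF's actual validator):

```python
from pathlib import Path

def safe_path(project_root: str, user_path: str) -> Path:
    """Resolve a user-supplied path, rejecting anything that escapes the project root."""
    root = Path(project_root).resolve()
    candidate = (root / user_path).resolve()
    if not candidate.is_relative_to(root):  # Python 3.9+
        raise ValueError(f"path escapes project root: {user_path!r}")
    return candidate

print(safe_path(".", "src/auth/index.ts").name)  # index.ts
try:
    safe_path(".", "../../../etc/passwd")
except ValueError as e:
    print("rejected:", e)
```

Resolving before comparing is what defeats `..` segments and symlink tricks that a prefix check on the raw string would miss.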

SKF produces skills compatible with the agentskills.io ecosystem: