POST · 2026-03-06 · 9 min read · Cross-engine analysis

Where engines agree.
Where they don't.

Sixteen scanners, 769 repos, 890 scans. Pairs of security-relevant engines co-flagged only 19–26% of the 890 scans. That is the case for never trusting a single tool's "all clear".

FIG. 03 · ENGINE OVERLAP — 16 SCANNERS, 890 SCANS
01 — Headline

We ran 16 different security scanners against 769 MCP server repositories, 890 scans in total. The engines agreed on almost nothing. Only six of the sixteen produced any findings at all. Among those six, the security-relevant pairs co-flagged the same scan only 19–26% of the time.

  • 25.2%: scans that scored a perfect 10/10 (passed all 16 engines)
  • 6 / 16: engines that produced any findings across the full sample
  • 19–26%: co-flag rate between security-relevant engine pairs

02 — Method

How the comparison was set up.

For every submitted URL, the system clones the repo and dispatches it to engines running three at a time, in isolated Docker containers. No engine has access to another engine's output. Results are combined into a weighted trust score.
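
The dispatch pattern described above (at most three engines in flight, none seeing another's output) can be sketched with a semaphore. This is a minimal illustration, not the scanner's actual code: `run_engine`, `scan_repo`, the engine names, and the repo path are all hypothetical, and the container launch is simulated with a short sleep rather than a real Docker call.

```python
import asyncio

MAX_PARALLEL = 3  # the post says engines run three at a time


async def run_engine(name: str, repo_path: str) -> dict:
    """Hypothetical stand-in for running one engine in its own container."""
    # A real runner would start an isolated container with the cloned repo
    # mounted; here the work is only simulated.
    await asyncio.sleep(0.01)
    return {"engine": name, "findings": []}


async def scan_repo(repo_path, engines, runner=run_engine, max_parallel=MAX_PARALLEL):
    """Run every engine against the repo, at most `max_parallel` at once."""
    gate = asyncio.Semaphore(max_parallel)

    async def guarded(name):
        async with gate:  # blocks while max_parallel engines are running
            # Each engine receives only the repo path, never another
            # engine's result.
            return await runner(name, repo_path)

    # gather() returns results in the same order as `engines`.
    return await asyncio.gather(*(guarded(n) for n in engines))


results = asyncio.run(
    scan_repo("/tmp/cloned-repo", ["semgrep", "osv-scanner", "bandit", "syft"])
)
print([r["engine"] for r in results])
```

Results are only combined after every engine has returned, which keeps the engines independent of each other.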

Engine weight buckets

  • 1.0 — High-signal security tools (Semgrep, OSV Scanner)
  • 0.7 — Solid general-purpose tools (Bandit, detect-secrets)
  • 0.5 — Moderate signal (custom YARA)
  • 0.3 — Noisy tools
  • 0.0 — Informational only (Syft, ScanCode) — produce findings, don't reduce score
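
The post lists the weight buckets but not the formula that combines them, so the following is only one plausible shape, under stated assumptions: start at 10, subtract a fixed penalty times the weight of each engine that flagged, and floor at 0. The weights come from the buckets above; the subtractive model, the `PENALTY_PER_WEIGHT` value, and the names `ENGINE_WEIGHTS` and `trust_score` are assumptions made for illustration.

```python
# Weights taken from the buckets above. Engines the post does not place
# in a bucket (for example MCP Guardian) are deliberately left out.
ENGINE_WEIGHTS = {
    "semgrep": 1.0,
    "osv-scanner": 1.0,
    "bandit": 0.7,
    "detect-secrets": 0.7,
    "custom-yara": 0.5,
    "syft": 0.0,
    "scancode": 0.0,
}

# ASSUMPTION: points deducted per unit of weight. Not stated in the post.
PENALTY_PER_WEIGHT = 2.5


def trust_score(flagged_engines, weights=ENGINE_WEIGHTS, penalty=PENALTY_PER_WEIGHT):
    """Return a 0-10 score under the assumed subtractive model."""
    flagged = set(flagged_engines)  # an engine counts once per scan
    unknown = flagged - set(weights)
    if unknown:
        raise KeyError(f"no weight known for: {sorted(unknown)}")
    deduction = penalty * sum(weights[e] for e in flagged)
    return max(0.0, round(10.0 - deduction, 1))


print(trust_score(["syft", "scancode"]))  # informational engines only
print(trust_score(["osv-scanner", "detect-secrets", "custom-yara"]))
```

Whatever the real formula is, the 0.0 bucket behaves as in the first call: Syft and ScanCode can report findings without lowering the score.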
03 — Engines

Six engines did all the flagging.

Engine          Scans flagged  What it detects
Syft            671            SBOM (informational, weight 0.0)
OSV Scanner     485            Known CVEs in dependencies
detect-secrets  351            Hardcoded secrets, API keys, tokens
Custom YARA     329            MCP-specific threat patterns
MCP Guardian    296            MCP protocol abuse patterns
Bandit          118            Python security anti-patterns

The remaining 10 engines found nothing. Zero. Not because the repos are clean — because most engines were built for general-purpose code, not for the specific threat model of AI tool-calling systems. The MCP-specific tooling layer is young and shallow.

04 — Overlap

Three categories, partial overlap.

[ FIG. 04 · co-flag overlap, security-relevant engines, n=890 · labels: OSV Scanner, detect-secrets, YARA + Guardian; 22% shared ]

Engine co-occurrence

Engine pair                   Co-flagged scans  % of 890
OSV Scanner + Syft            442               49.7%
detect-secrets + Syft         301               33.8%
Custom YARA + Syft            265               29.8%
MCP Guardian + Syft           254               28.5%
detect-secrets + OSV Scanner  230               25.8%
Custom YARA + MCP Guardian    202               22.7%
Custom YARA + detect-secrets  181               20.3%

Syft appears in the top four pairs because it flags most scans (671 of 890): SBOM generation runs on any repo with dependencies. Remove Syft and the real pattern emerges: the security-relevant engines (OSV Scanner, detect-secrets, Custom YARA, MCP Guardian) co-flag the same scan about 170–230 times out of 890, which is 19 to 26% agreement.
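
A co-occurrence table like the one above can be derived from per-scan sets of flagging engines. The sketch below assumes that data shape; `co_flag_counts` and `co_flag_rates` are hypothetical helpers, and the four-scan input is toy data, not the study's.

```python
from collections import Counter
from itertools import combinations


def co_flag_counts(scans):
    """Count, for every engine pair, the scans in which both engines flagged.

    `scans` is an iterable of sets, one set of engine names per scan.
    """
    pairs = Counter()
    for flagged in scans:
        # sorted() gives each pair one canonical (a, b) ordering
        for a, b in combinations(sorted(flagged), 2):
            pairs[(a, b)] += 1
    return pairs


def co_flag_rates(scans, exclude=()):
    """Pair counts as a percentage of all scans, ignoring excluded engines."""
    cleaned = [set(s) - set(exclude) for s in scans]
    total = len(cleaned)
    if total == 0:
        return {}
    return {
        pair: round(100 * n / total, 1)
        for pair, n in co_flag_counts(cleaned).items()
    }


# Toy data: four scans, one of them flagged by nothing.
toy = [
    {"syft", "osv-scanner", "detect-secrets"},
    {"syft", "osv-scanner"},
    {"syft", "custom-yara", "mcp-guardian"},
    set(),
]
print(co_flag_rates(toy, exclude={"syft"}))
```

Passing `exclude={"syft"}` mirrors the step of removing Syft before reading the pattern, and the denominator stays at all scans, as in the "% of 890" column.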

The Custom YARA + MCP Guardian pair is the most telling. These are the two engines built specifically for MCP threat patterns, yet they co-flag only 22.7% of scans (202 of 890), even though Custom YARA flagged 329 scans and MCP Guardian 296. Wherever else either one fires, it catches what the other misses. That is exactly why multi-engine scanning matters.
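
The 22.7% figure uses all 890 scans as its denominator. As a worked check, the published counts for this pair (Custom YARA 329, MCP Guardian 296, both 202) can be split by inclusion-exclusion into both, only one, and neither. The helper name `overlap_breakdown` and the second percentage (overlap among scans flagged by either engine) are additions for illustration, not figures from the post.

```python
def overlap_breakdown(flagged_a: int, flagged_b: int, flagged_both: int, total: int) -> dict:
    """Split two engines' flags into both / only-A / only-B / neither."""
    only_a = flagged_a - flagged_both
    only_b = flagged_b - flagged_both
    either = only_a + only_b + flagged_both  # inclusion-exclusion
    return {
        "both": flagged_both,
        "only_a": only_a,
        "only_b": only_b,
        "neither": total - either,
        "pct_of_all_scans": round(100 * flagged_both / total, 1),
        "pct_of_flagged_by_either": (
            round(100 * flagged_both / either, 1) if either else 0.0
        ),
    }


# Counts from the tables above: Custom YARA 329, MCP Guardian 296, both 202, n=890.
print(overlap_breakdown(329, 296, 202, 890))
```

With these inputs, 127 scans are flagged by Custom YARA alone and 94 by MCP Guardian alone, so the two engines agree on a little under half (47.8%) of the 423 scans that either one flags.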
05 — Scoreboard

Real scores for widely-deployed tools.

All data from static analysis of public GitHub repositories.

Repository     Score  Engines flagged  Notes
Cline          0.9    5                VS Code AI coding assistant
Zed            2.3    4                Code editor with AI features
Letta AI       2.5    4                Long-term memory for agents
Block's Goose  2.7    5                Open-source agent framework
LiteLLM        3.2    5                LLM API proxy
TruffleHog     3.3    5                Secret-scanning tool
Continue       3.5    4                VS Code AI extension
Trivy          4.5    4                Container security scanner

Two caveats. Larger repos score worse — more code and more dependencies mean more findings. And these scores reflect supply-chain risk as much as code quality: a repo can score 2.0 because three transitive dependencies have unpatched CVEs the maintainers may not even know about.

06 — Limitations

Constraints we'd like to lift.

  1. Static analysis only. No runtime behavior, no protocol interaction, no prompt-injection testing.
  2. Point-in-time snapshots. Scores change as deps get patched and engines improve.
  3. GitHub repos only. A deployed MCP server may have different configurations or network policies.
  4. Engine coverage is uneven. 6 of 16 engines produced findings. As MCP-specific tooling matures, the picture shifts.

Author: Nikita Frikh-Khar · Dresden
Last updated: 2026-03-12
Cite as: Frikh-Khar, N. (2026). Cross-engine analysis. MCPAmpel.