Cross-engine analysis

01 — Headline

We ran 16 security scanners against 769 MCP server repositories. The engines agreed on almost nothing. Only six of the sixteen produced any findings at all. Among those six, the security-relevant pairs co-flagged the same repo only 19–26% of the time.

25.2% of scans scored a perfect 10/10 (passed all 16 engines)

6 / 16 engines produced any findings across the full sample

19–26% co-flag rate between security-relevant engine pairs

02 — Method

How the comparison was set up.

For every submitted URL, the system clones the repo and dispatches it to the engines, three at a time, each in its own isolated Docker container. No engine has access to another engine's output. The results are then combined into a weighted trust score.
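The dispatch step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the image names, and the `/src` mount point are all hypothetical, and the only detail taken from the text is the limit of three engines at a time, each in its own container.

```python
import asyncio
from typing import Awaitable, Callable

MAX_PARALLEL = 3  # from the text: engines run three at a time


def docker_command(image: str, repo_dir: str) -> list[str]:
    """Build a `docker run` call for one engine (image names are hypothetical)."""
    return [
        "docker", "run", "--rm",
        # Assumption: the cloned repo is mounted read-only at /src.
        "-v", f"{repo_dir}:/src:ro",
        image,
    ]


async def run_in_docker(image: str, repo_dir: str) -> str:
    """Run one engine container and return whatever it prints to stdout."""
    proc = await asyncio.create_subprocess_exec(
        *docker_command(image, repo_dir),
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, _ = await proc.communicate()
    return stdout.decode()


Runner = Callable[[str, str], Awaitable[str]]


async def scan_repo(
    repo_dir: str, images: list[str], runner: Runner = run_in_docker
) -> dict[str, str]:
    """Run every engine against the repo, at most MAX_PARALLEL at once."""
    limit = asyncio.Semaphore(MAX_PARALLEL)

    async def guarded(image: str) -> tuple[str, str]:
        async with limit:
            # Each engine only ever sees the repo, never another engine's output.
            return image, await runner(image, repo_dir)

    pairs = await asyncio.gather(*(guarded(image) for image in images))
    return dict(pairs)
```

Passing a stub as `runner` makes it possible to exercise the concurrency limit without Docker installed.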

Engine weight buckets

1.0 — High-signal security tools (Semgrep, OSV Scanner)
0.7 — Solid general-purpose tools (Bandit, detect-secrets)
0.5 — Moderate signal (custom YARA)
0.3 — Noisy tools
0.0 — Informational only (Syft, ScanCode) — produce findings, don't reduce score
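To make the role of the weights concrete, here is a small sketch of how a weighted score could be derived from them. The weights are the ones listed above; the aggregation formula is an assumption made purely for illustration, since this report does not state the real one. Engines whose weight is not given here (for example MCP Guardian) are left out.

```python
# Weights as listed in the buckets above; engines without a stated weight are omitted.
ENGINE_WEIGHTS = {
    "Semgrep": 1.0,
    "OSV Scanner": 1.0,
    "Bandit": 0.7,
    "detect-secrets": 0.7,
    "Custom YARA": 0.5,
    "Syft": 0.0,      # informational only
    "ScanCode": 0.0,  # informational only
}


def trust_score(flagged: set[str], weights: dict[str, float] = ENGINE_WEIGHTS) -> float:
    """ASSUMED formula: 10 minus the flagged share of total weight, on a 0-10 scale.

    The real scoring function is not described in the report; this placeholder
    only demonstrates that higher-weight engines cost more and that
    weight-0.0 engines cost nothing.
    """
    total = sum(weights.values())
    penalty = sum(weights[name] for name in flagged)
    return round(10.0 * (1.0 - penalty / total), 1)


# A scan flagged only by informational engines keeps a perfect score.
print(trust_score({"Syft", "ScanCode"}))  # 10.0
```

Under this placeholder formula, a flag from a 1.0-weight engine lowers the score more than a flag from a 0.5-weight engine, which is the only behavior the buckets themselves imply.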

03 — Engines

Six engines did all the flagging.

Engine	Scans flagged	What it detects
Syft	671	SBOM (informational, weight 0.0)
OSV Scanner	485	Known CVEs in dependencies
detect-secrets	351	Hardcoded secrets, API keys, tokens
Custom YARA	329	MCP-specific threat patterns
MCP Guardian	296	MCP protocol abuse patterns
Bandit	118	Python security anti-patterns

The remaining 10 engines found nothing. Zero. Not because the repos are clean — because most engines were built for general-purpose code, not for the specific threat model of AI tool-calling systems. The MCP-specific tooling layer is young and shallow.

04 — Overlap

Three categories, partial overlap.

[ FIG. 04: co-flag overlap, security-relevant engines, n=890. Three overlapping sets: OSV Scanner, detect-secrets, YARA + Guardian; 22% shared ]

Engine co-occurrence

Engine pair	Co-flagged scans	% of 890
OSV Scanner + Syft	442	49.7%
detect-secrets + Syft	301	33.8%
Custom YARA + Syft	265	29.8%
MCP Guardian + Syft	254	28.5%
detect-secrets + OSV Scanner	230	25.8%
Custom YARA + MCP Guardian	202	22.7%
Custom YARA + detect-secrets	181	20.3%

Syft appears in the top four pairs because it flags nearly everything (SBOM generation runs on any repo with dependencies). Remove Syft and the real pattern emerges: the security-relevant engines (OSV, detect-secrets, YARA, MCP Guardian) co-flag repos about 170–230 times out of 890 scans — 19 to 26% agreement.
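The percentages can be recomputed directly from the co-occurrence table. The sketch below uses only the counts listed there; the variable and function names are illustrative. Once Syft is filtered out, three of the listed pairs remain, and their rates fall inside the 19–26% band quoted above.

```python
TOTAL_SCANS = 890

# Co-flagged scan counts, copied from the co-occurrence table.
CO_FLAGGED = {
    ("OSV Scanner", "Syft"): 442,
    ("detect-secrets", "Syft"): 301,
    ("Custom YARA", "Syft"): 265,
    ("MCP Guardian", "Syft"): 254,
    ("detect-secrets", "OSV Scanner"): 230,
    ("Custom YARA", "MCP Guardian"): 202,
    ("Custom YARA", "detect-secrets"): 181,
}

INFORMATIONAL = {"Syft"}  # weight 0.0, excluded from the security-relevant view


def co_flag_rate(count: int, total: int = TOTAL_SCANS) -> float:
    """Share of all scans in which both engines of a pair raised a finding."""
    return round(100 * count / total, 1)


security_pairs = {
    pair: count
    for pair, count in CO_FLAGGED.items()
    if not INFORMATIONAL & set(pair)
}

for pair, count in security_pairs.items():
    print(" + ".join(pair), f"{co_flag_rate(count)}%")
# detect-secrets + OSV Scanner 25.8%
# Custom YARA + MCP Guardian 22.7%
# Custom YARA + detect-secrets 20.3%
```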

The Custom YARA + MCP Guardian pair is the most telling. These are the two engines built specifically for MCP threat patterns, yet they co-flag only 22.7% of scans. Wherever just one of them fires, it is catching what the other misses. That is exactly why multi-engine scanning matters.
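How much each of the two MCP-specific engines catches alone can be worked out from the tables above. The sketch below assumes the per-engine "Scans flagged" counts (329 and 296) are measured over the same 890 scans as the co-flag count (202); the report does not say so explicitly, so treat the result as indicative. The function name is illustrative.

```python
def overlap_breakdown(flagged_a: int, flagged_b: int, both: int) -> dict[str, float]:
    """Split two engines' flagged scans into shared and exclusive parts."""
    only_a = flagged_a - both
    only_b = flagged_b - both
    either = only_a + only_b + both  # scans flagged by at least one engine
    return {
        "only_a": only_a,
        "only_b": only_b,
        "either": either,
        # Share of the scans flagged by at least one engine that both flagged.
        "shared_pct": round(100 * both / either, 1),
    }


# Custom YARA (329 scans), MCP Guardian (296 scans), co-flagged (202 scans).
# Assumption: all three counts refer to the same set of 890 scans.
result = overlap_breakdown(flagged_a=329, flagged_b=296, both=202)
print(result)
# {'only_a': 127, 'only_b': 94, 'either': 423, 'shared_pct': 47.8}
```

Under that assumption, 127 scans are flagged by Custom YARA alone and 94 by MCP Guardian alone, which is the "one catches what the other misses" portion.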

05 — Scoreboard

Real scores for widely-deployed tools.

All data from static analysis of public GitHub repositories.

Repository	Score	Engines flagged	Notes
Cline	0.9	5	VS Code AI coding assistant
Zed	2.3	4	Code editor with AI features
Letta AI	2.5	4	Long-term memory for agents
Block's Goose	2.7	5	Open-source agent framework
LiteLLM	3.2	5	LLM API proxy
TruffleHog	3.3	5	Secret-scanning tool
Continue	3.5	4	VS Code AI extension
Trivy	4.5	4	Container security scanner

Two caveats. Larger repos score worse — more code and more dependencies mean more findings. And these scores reflect supply-chain risk as much as code quality: a repo can score 2.0 because three transitive dependencies have unpatched CVEs the maintainers may not even know about.

06 — Limitations

Constraints we'd like to lift.

Static analysis only. No runtime behavior, no protocol interaction, no prompt-injection testing.
Point-in-time snapshots. Scores change as deps get patched and engines improve.
GitHub repos only. A deployed MCP server may have different configurations or network policies.
Engine coverage is uneven. 6 of 16 engines produced findings. As MCP-specific tooling matures, the picture shifts.

Author: Nikita Frikh-Khar · Dresden

Last updated: 2026-03-12

Reproduce: github.com/diemoeve/mcpampel

Cite as: Frikh-Khar, N. (2026). Cross-engine analysis. MCPAmpel.