What 16 engines found in 2,900 MCP servers

01 — Headline

91.4% of our scans of MCP server repositories returned at least one security finding. We ran 6,494 scans across 2,896 unique repos with 16 independent engines between March 1 and April 2, 2026. The average trust score was 7.54 / 10. Thirteen repos scored below 5.0. Half landed between 5 and 7.

This is not a malware problem. It is a hygiene problem. Most repos in the yellow zone are maintained, functional, and used in production. They carry known CVEs in their dependency chains, secrets in commit history, or MCP tool permissions broader than what the tool actually needs. One scanner would call most of them clean.

91.4% of repos flagged by at least 1 of 16 engines (5,937 of 6,494 scans)

85% of flagged repos were flagged by 2 or more engines independently

7.54 average trust score · median was 7.6 · 13 repos below 5.0

02 — Method

How the numbers were generated.

Sample. 2,896 unique repositories from public MCP registries, official MCP directories on GitHub, and community submissions. Scanned between March 1 and April 2, 2026. Roughly 13× larger than our first report.

Engines. 16 independent security engines, each in its own Docker container. No engine sees another engine's output. The full engine list covers five categories:

Vulnerability scanning — Trivy, Grype, OSV Scanner, npm audit, pip-audit
Secret detection — detect-secrets, Gitleaks
Static analysis — Bandit, Semgrep, Checkov
MCP-specific — Custom YARA rules, MCP Guardian
Supply chain (informational) — Syft, ScanCode, Cisco AIBOM

Scoring formula

# For each finding f produced by engine e, accumulate that engine's deduction:
deduction[e] += severity_weight[f] * engine_weight[e]
# Cap each engine's total deduction so no single tool dominates:
deduction[e] = min(deduction[e], 3.0)
# Final score is 10.0 minus the capped deductions, clamped to [0.0, 10.0]:
final = max(0.0, min(10.0, 10.0 - sum(deduction.values())))

Three informational engines (Syft, ScanCode, Cisco AIBOM) produce findings but do not reduce the trust score. Full methodology lives at /docs/scoring.
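The formula can be sketched as a small runnable function. This is an illustration under stated assumptions, not the production scorer: the severity and engine weights below are hypothetical placeholders (the real values are at /docs/scoring), and the lowercase engine identifiers are made up for the example. Only the 3.0 per-engine cap, the clamp to [0.0, 10.0], and the exclusion of the three informational engines come from the text above.

```python
from collections import defaultdict

# Hypothetical weights for illustration only; the real values are at /docs/scoring.
SEVERITY_WEIGHT = {"low": 0.1, "medium": 0.3, "high": 0.6, "critical": 1.0}
ENGINE_WEIGHT = {"trivy": 1.0, "gitleaks": 0.8, "syft": 1.0}

# Informational engines report findings but never reduce the score.
INFORMATIONAL = {"syft", "scancode", "cisco-aibom"}
PER_ENGINE_CAP = 3.0


def trust_score(findings):
    """Compute a trust score from an iterable of (engine, severity) pairs."""
    deduction = defaultdict(float)
    for engine, severity in findings:
        if engine in INFORMATIONAL:
            continue
        deduction[engine] += SEVERITY_WEIGHT[severity] * ENGINE_WEIGHT[engine]
    # Cap each engine's total, then clamp the final score to [0.0, 10.0].
    capped_total = sum(min(d, PER_ENGINE_CAP) for d in deduction.values())
    return max(0.0, min(10.0, 10.0 - capped_total))
```

With these placeholder weights, ten critical findings from one engine cost the same 3.0 points as three, which is the point of the cap: a noisy engine cannot drag a repo into the red zone alone.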

03 — Distribution

Where the 2,896 repos landed.

Score bucket	Count
10.0	792
9	419
8	678
7	756
6	536
5	147
4	5
< 4	8

Zone	Score range	Repos	Percentage
Red	0.0 — 4.0	13	0.2%
Yellow	5.0 — 7.0	3,231	49.7%
Green	8.0 — 10.0	3,250	50.1%

"The yellow zone carries the main finding. These 3,231 repos are not abandoned or broken. Most are actively maintained. They carry CVEs with published advisories, secrets committed to source, or MCP-specific configuration issues." (from the dataset notes)

04 — Engines

Detection rates, per engine.

Three vulnerability scanners (Trivy, OSV Scanner, and Grype) have the highest detection rates. Dependency vulnerabilities are common, CVE databases are extensive, and most repos pull in dozens of transitive dependencies.

Engine	Detection rate	Category
Trivy	76.5%	Vulnerability scanning
OSV Scanner	54.9%	Vulnerability scanning
Grype	49.8%	Vulnerability scanning
detect-secrets	43.6%	Secret detection
Custom YARA	31.8%	MCP-specific threats
MCP Guardian	22.6%	MCP-specific checks
Gitleaks	22.3%	Secret detection
Bandit	12.9%	Static analysis (Python)

detect-secrets flagged 43.6% of repos with an average of 99.9 findings per flagged repo. The count is high because detect-secrets uses entropy-based matching, which catches non-standard secret formats but also flags many strings that are not secrets. Gitleaks (22.3%, pattern-based rules) is more precise but misses atypical formats. Running both tells you more than running either alone.
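The trade-off between the two approaches can be seen in a toy sketch. This is not how detect-secrets or Gitleaks are implemented. It only contrasts a Shannon-entropy check with a single rule for the well-known AWS access key ID shape. The threshold of 4.0 bits per character and the function name `flags` are assumptions made for the example.

```python
import math
import re
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Bits per character of the string's empirical character distribution."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


# One rule for one known format; a rule-based scanner carries many of these.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")

# Hypothetical cut-off chosen for this example only.
ENTROPY_THRESHOLD = 4.0


def flags(token: str) -> dict:
    """Report which of the two toy detectors would flag the token."""
    return {
        "entropy": shannon_entropy(token) >= ENTROPY_THRESHOLD,
        "pattern": bool(AWS_KEY_ID.search(token)),
    }
```

A token with a known shape can fall under the entropy cut-off and still match the rule, while a random-looking string with no known prefix trips only the entropy check. That is the sense in which the two engines cover different ground.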

Six engines returned zero findings on this sample. This is expected. npm audit only runs on Node.js projects with a package-lock.json. pip-audit requires Python dependency manifests. Semgrep needs matching code patterns. A zero detection rate means the engine's rules did not match this sample, not that the engine failed.
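One way to keep "not applicable" separate from "no findings" is to check an engine's preconditions before counting its result. The sketch below is illustrative only. The report states that npm audit needs a package-lock.json and that pip-audit needs Python dependency manifests. The specific Python file names and the helper `applicable_engines` are assumptions.

```python
from pathlib import Path

# Assumed manifest names for illustration. The report only says npm audit needs
# package-lock.json and pip-audit needs Python dependency manifests.
REQUIRED_FILES = {
    "npm audit": ("package-lock.json",),
    "pip-audit": ("requirements.txt", "pyproject.toml"),
}


def applicable_engines(repo_root):
    """Return the engines whose required manifest exists at the repo root."""
    root = Path(repo_root)
    return sorted(
        engine
        for engine, names in REQUIRED_FILES.items()
        if any((root / name).is_file() for name in names)
    )
```

An engine missing from this list for a given repo did not run against anything, so its silence says nothing about that repo.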

05 — Agreement

The strongest signal: tools agreeing.

Engines flagging	Scans	% of flagged
1 engine only	891	15.0%
2 engines	791	13.3%
3 engines	1,333	22.5%
4 engines	1,272	21.4%
5 engines	752	12.7%
6 engines	593	10.0%
7 engines	268	4.5%
8–9 engines	40	0.7%

85% of flagged repos are flagged by two or more engines. That is the argument for multi-engine scanning. A single tool catches what it was built to catch. Run sixteen and the blind spots become visible.
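The 85% figure follows directly from the table and the headline count of flagged scans. A minimal check, using only numbers stated in this report (the variable and function names are ours):

```python
# Scans per number of flagging engines, copied from the agreement table.
scans_by_engine_count = {
    "1": 891, "2": 791, "3": 1333, "4": 1272,
    "5": 752, "6": 593, "7": 268, "8-9": 40,
}

# From the headline: 5,937 of 6,494 scans had at least one finding.
FLAGGED_SCANS = 5937


def share(count, total=FLAGGED_SCANS):
    """Percentage of flagged scans, rounded to one decimal place."""
    return round(100 * count / total, 1)


single_engine = scans_by_engine_count["1"]
multi_engine_share = share(FLAGGED_SCANS - single_engine)
```

Here `share(single_engine)` reproduces the 15.0% in the first row, and `multi_engine_share` is the complement.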

Top co-occurring engine pairs: OSV + Trivy (3,167 scans), Grype + Trivy (3,154), Grype + OSV (2,877). These three form the backbone of high-confidence vulnerability detection across the dataset.
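Pair counts like these can be derived from per-scan results by counting every unordered pair of engines that flagged the same scan. The sketch below assumes each scan is represented as a set of engine names. The toy input is hypothetical and is not the report's data.

```python
from collections import Counter
from itertools import combinations


def pair_counts(scans):
    """Count co-occurring engine pairs.

    scans: iterable of sets, each holding the engines that flagged one scan.
    """
    counts = Counter()
    for engines in scans:
        # sorted() makes ("grype", "trivy") and ("trivy", "grype") the same key.
        counts.update(combinations(sorted(engines), 2))
    return counts


# Hypothetical toy input for illustration.
toy = [
    {"trivy", "osv", "grype"},
    {"trivy", "osv"},
    {"trivy"},
    {"gitleaks", "trivy"},
]
top = pair_counts(toy).most_common(1)
```

Scans flagged by a single engine contribute no pairs, so the pair totals only describe the multi-engine share of the dataset.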

06 — Notable

Two repos worth flagging.

Cline · score 3.5

Eight engines flagging independently — the highest engine count in the dataset. Findings span dependency vulnerabilities across multiple package ecosystems, detected secrets in source, and MCP-specific configuration issues. Cline is one of the most popular AI coding assistants. Developers using it daily should be aware.

Apache APISIX MCP · score 3.5

Seven engines flagging. Known CVEs in the dependency chain, configuration patterns that expose internal endpoints, broad permission scopes in MCP tool definitions. APISIX is a widely deployed API gateway; its MCP integration inherits the parent project's large attack surface.

07 — Limitations

What this report does not say.

Static analysis only. We scan source, dependencies, configuration. We do not execute tools or observe runtime behavior.
Sample bias. Repos from public registries tend to be better maintained than internal or ad-hoc MCP servers. The broader ecosystem likely scores lower.
Disabled engines. Three LLM-powered engines (requiring external API keys) were disabled during this scan period. Adding them would likely shift some scores down.
Engine weights are our judgment call. A different weighting produces different scores. The full weights are at /docs/scoring.


Author: Nikita Frikh-Khar · Dresden

Last updated: 2026-04-08 · methodology v1.4

Reproduce: github.com/diemoeve/mcpampel

Cite as: Frikh-Khar, N. (2026). What 16 engines found in 2,900 MCP servers. MCPAmpel.