Source quality

Source quality benchmark.

A public view of source intelligence depth. This is not a safety guarantee; it shows what Nipmod can inspect, where the limits are and how search quality is checked.

Benchmark: 28/28 pass
MRR: 1
Recall at 3: 1
Blocked recommended: 0

Depth

Source intelligence depth

npm

latest manifest, tarball integrity, registry signatures, lifecycle scripts, packument version intelligence, OSV advisory context, dependency count and download signal

Current: 98
Target: 98
Coverage: strong

PyPI

project JSON, latest release files, file hashes, yanked flags, OSV advisory context, release velocity, Simple API metadata and provenance links

Current: 96
Target: 96
Coverage: strong

GitHub

repository metadata plus selected manifest, security, workflow, Dockerfile, release asset, commit freshness and lockfile probes on the default branch

Current: 95
Target: 95
Coverage: strong

Hugging Face models

model API metadata, cardData, tags, siblings, downloads, likes, gated/private flags, commit SHA, file-shape counts, eval labels and remote-code indicators

Current: 95
Target: 95
Coverage: strong

Hugging Face datasets

dataset API metadata, dataset_info, features, splits, tags, siblings, data file shape, compressed archive/script warnings, downloads, likes, gated/private flags and commit SHA when returned

Current: 93
Target: 93
Coverage: strong

MCP

MCP registry server metadata, schema URL, remote endpoint security, repository link, status, package references and credential-scope summary when returned

Current: 90
Target: 90
Coverage: moderate

Benchmark

Search quality gates

Command

Run locally

pnpm search:benchmark

Snapshot

28/28 passing

Mean reciprocal rank is 1, recall at 1 is 1 and recall at 3 is 1.
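As a rough illustration of what these numbers mean, the sketch below computes mean reciprocal rank and recall at k over a list of benchmark cases. The case shape (a ranked list of names plus one expected name) and the example names are assumptions made for this example, not Nipmod's actual benchmark format.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical case shape: (ranked result names, expected name).
Case = Tuple[Sequence[str], str]


def reciprocal_rank(ranked: Sequence[str], expected: str) -> float:
    """Return 1/rank of the expected result, or 0.0 if it is missing."""
    for position, candidate in enumerate(ranked, start=1):
        if candidate == expected:
            return 1.0 / position
    return 0.0


def recall_at_k(ranked: Sequence[str], expected: str, k: int) -> float:
    """Return 1.0 if the expected result is within the top k, else 0.0."""
    return 1.0 if expected in ranked[:k] else 0.0


def summarize(cases: List[Case]) -> Dict[str, float]:
    """Average each metric over all cases."""
    total = len(cases)
    return {
        "mrr": sum(reciprocal_rank(r, e) for r, e in cases) / total,
        "recall_at_1": sum(recall_at_k(r, e, 1) for r, e in cases) / total,
        "recall_at_3": sum(recall_at_k(r, e, 3) for r, e in cases) / total,
    }


# Illustrative cases only: the expected result is ranked first in both.
cases: List[Case] = [
    (["wanted-pkg", "decoy-pkg"], "wanted-pkg"),
    (["other-wanted", "lookalike", "unrelated"], "other-wanted"),
]
summary = summarize(cases)
```

A mean reciprocal rank of 1 is only possible when every case ranks its expected result first, which is why recall at 1 and recall at 3 are also 1 in the snapshot.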

Safety

No blocked recommendation

Benchmark cases include unsafe decoys and partial source outage behavior.
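One way to picture this gate is as a count of cases where the recommended result is a blocked candidate. The sketch below assumes the recommendation is the top-ranked result and that each candidate carries a `blocked` flag; both the `Candidate` type and that flag are hypothetical names for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    name: str
    blocked: bool  # hypothetical flag: True for an unsafe decoy


def blocked_recommendations(results: List[List[Candidate]]) -> int:
    """Count cases whose top-ranked candidate is blocked.

    An empty ranking (for example, during a source outage) recommends
    nothing, so it cannot recommend a blocked candidate.
    """
    count = 0
    for ranked in results:
        if ranked and ranked[0].blocked:
            count += 1
    return count


# Illustrative data: a decoy is present but ranked below the safe result,
# and the second case returns nothing at all.
results = [
    [Candidate("safe-pkg", False), Candidate("decoy-pkg", True)],
    [],
]
gate_passes = blocked_recommendations(results) == 0
```

Under these assumptions the gate passes when the count is 0, matching the "Blocked recommended: 0" figure reported above.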

Scope

What the benchmark covers

Question: Can Nipmod choose a useful package, model, repo, dataset or MCP server before an agent moves toward external code execution?
Unit: search result and pre-install source selection
Counting: Source coverage counts benchmark cases where the source was requested; multi-source cases count toward each requested source.
Scenarios: Scenario groups are overlapping by design; one benchmark case can exercise more than one risk class.

Source coverage

npm: 16/16 pass
PyPI: 12/12 pass
GitHub: 1/1 pass
Hugging Face models: 2/2 pass
Hugging Face datasets: 1/1 pass
MCP: 2/2 pass

Scenario groups

baseline package, model, repo or MCP selection: 8 cases
partial or multi-source outage behavior: 2 cases
typo, namespace, dependency confusion or source impersonation: 6 cases
install, lifecycle, wallet, dataset script or credential-scope risk: 5 cases
package metadata, README, long-description or model-card instruction risk: 5 cases
deprecation, publisher continuity or takeover timeline risk: 4 cases
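The counting rule explains why the per-source totals add up to more than the number of benchmark cases: the six source counts above sum to 34 against 28 cases, and the scenario groups sum to 30. A minimal sketch of that rule follows; the case records and their field names are invented for illustration.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical case records; "sources" lists every source the case requested.
cases: List[dict] = [
    {"id": "case-a", "sources": ["npm"], "passed": True},
    {"id": "case-b", "sources": ["npm", "pypi"], "passed": True},
    {"id": "case-c", "sources": ["mcp"], "passed": True},
]


def source_coverage(cases: List[dict]) -> Dict[str, str]:
    """Tally passed/requested per source.

    A multi-source case counts once toward each source it requested.
    """
    requested: Counter = Counter()
    passed: Counter = Counter()
    for case in cases:
        for source in set(case["sources"]):
            requested[source] += 1
            if case["passed"]:
                passed[source] += 1
    return {s: f"{passed[s]}/{requested[s]}" for s in sorted(requested)}


coverage = source_coverage(cases)
```

Here three cases produce four source tallies, because `case-b` counts toward both npm and PyPI.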

Profiles

What each source is best for

npm
Best for: JavaScript and TypeScript package selection; install-plan review; lifecycle script risk checks
Search: registry-ranked search with validated task hints for common agent requests

PyPI
Best for: exact inspection of Python packages; wheel/source release risk review; known PyPI vulnerability context
Search: normalized name candidates, validated task hints, exact-name fallback and source-specific ranking

GitHub
Best for: source repository discovery; repo activity context; agent review before cloning code
Search: GitHub repository search sorted by stars, with archived repositories filtered out and top-result manifest/README enrichment

Hugging Face models
Best for: model discovery; model card and file-shape context; remote-code and weight-format warnings
Search: Hugging Face hub search sorted by downloads

Hugging Face datasets
Best for: dataset discovery; dataset card metadata; license and hub usage context
Search: Hugging Face dataset search sorted by downloads

MCP
Best for: MCP server discovery; remote endpoint context; credential-scope review before enabling tools
Search: MCP registry server search with pinned fallback for known public records
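The page does not say how the PyPI "normalized name candidates" are produced. One plausible building block is the PEP 503 name normalization that PyPI itself applies to project names; the sketch below uses it to turn a free-text query into a few exact-name candidates. The `name_candidates` helper and its joining strategy are assumptions made for illustration, not Nipmod's implementation.

```python
import re
from typing import List


def normalize_pypi_name(name: str) -> str:
    """Normalize a project name as described in PEP 503.

    Runs of "-", "_" and "." collapse to a single "-", then lowercase.
    """
    return re.sub(r"[-_.]+", "-", name).lower()


def name_candidates(query: str) -> List[str]:
    """Hypothetical helper: derive exact-name candidates from a query.

    Tries the words joined with "-" and joined with nothing, normalizes
    both, and drops duplicates while keeping order.
    """
    words = query.strip().split()
    candidates: List[str] = []
    for raw in ("-".join(words), "".join(words)):
        normalized = normalize_pypi_name(raw)
        if normalized and normalized not in candidates:
            candidates.append(normalized)
    return candidates


example = name_candidates("Scikit Learn")
```

Each candidate could then be looked up by exact name, which fits the "exact-name fallback" mentioned above, though the real ordering and ranking are not described on this page.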

Limits

Limits we do not hide

npm
Limit: npm search ranking is upstream-provided and can still surface weak packages
Not covered: package authorship; malware-free guarantee; workspace execution approval

PyPI
Limit: PyPI has no official JSON search API, so broad natural-language discovery uses normalized candidates and curated task hints
Not covered: full index crawl; malware-free guarantee; private package visibility

GitHub
Limit: GitHub repository search is not package-registry resolution
Not covered: verified release provenance; full code scan; dependency vulnerability audit

Hugging Face models
Limit: model files are not downloaded or executed by the hosted API
Not covered: model behavior evaluation; weight integrity beyond returned metadata; license legal advice

Hugging Face datasets
Limit: dataset contents are not downloaded, sampled or scanned by the hosted API
Not covered: dataset content audit; training suitability approval; license legal advice

MCP
Limit: MCP registry availability and schema stability are still early
Not covered: tool execution safety; server operator verification; credential policy approval

Machine

Agent-readable report