Is the base image in the FROM line of your Dockerfile safe to use? Pulling nginx, python, grafana, or any other image from Docker Hub with the latest tag doesn't mean it's patched, current, or secure. A standard nginx image returns 300+ CVEs. Most are vulnerabilities in OS utilities unrelated to serving HTTP traffic, but not all of them can be ignored.
Some vulnerabilities are enough for an attacker to compromise your application even if your code and infrastructure are flawless. In 2024, python:3.11-slim was found to contain CVE-2024-6345, a critical setuptools vulnerability (CVSS 9.8) that allows arbitrary code execution during package installation. The image stayed vulnerable for several weeks after the official patch shipped.
Even well-maintained images lag. nginx:latest has carried known OpenSSL vulnerabilities for months at a time: the upstream fix exists, but the rebuilt Docker Hub image takes time to catch up. Teams pulling latest and otherwise following best practices were still deploying the vulnerable layers throughout that window.
The interpretation problem
A scanner gives you a long list of vulnerabilities that is hard to interpret and act on:
- Which image is genuinely dangerous, and which is safe to use?
- Which vulnerabilities deserve attention first?
- Of hundreds of CVEs, which actually matter for your deployment?
- How do you justify to an auditor or CTO why you're ignoring 298 out of 300 CVEs?
Solution: a systematic approach
This article covers the first step: reducing scanner noise through automated prioritization based on severity, exploitation probability, and image context, turning a raw CVE list into a manageable set that actually requires engineering attention.
The approach prioritizes vulnerabilities based on:
- Severity, CVSS score, exploitation probability (EPSS), and confirmed active exploitation (CISA KEV)
- Relevance to the specific image's function
In follow-up articles, we'll enrich the results with runtime context (which libraries are loaded, which functions are called) and show whether vulnerabilities can be triggered given your specific configuration.
After completing the full workflow, you'll be able to say with confidence:
- ✓ Which CVEs are actually dangerous for your configuration
- ✓ Whether the image needs an immediate update or can wait
- ✓ Whether to switch to an alternative base image
- ✓ Which risks are acceptable and which require immediate action
- ✓ How to justify those decisions to a security auditor or management
The full workflow can be automated and integrated into a CI/CD pipeline. This series covers how to do that.
The Algorithm
Run a scanner against the image to get a complete list of CVEs, including transitive dependencies you might not know are in the image.
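The report the scanner produces can be flattened into a single CVE list for the later steps. A minimal sketch, assuming Trivy's JSON output format (`Results[].Vulnerabilities[]` with `VulnerabilityID`, `PkgName`, `Severity`), generated with something like `trivy image --format json --output report.json nginx:1.25`; the embedded sample report below is trimmed and illustrative:

```python
import json

# Trimmed, illustrative sample of a Trivy JSON report
# (field names follow Trivy's output schema).
sample_report = """
{
  "Results": [
    {
      "Target": "nginx:1.25 (debian 12.1)",
      "Vulnerabilities": [
        {"VulnerabilityID": "CVE-2023-44487", "PkgName": "nghttp2", "Severity": "HIGH"},
        {"VulnerabilityID": "CVE-2024-2398", "PkgName": "libcurl4", "Severity": "HIGH"}
      ]
    }
  ]
}
"""

def flatten(report: dict) -> list[dict]:
    """Collect every CVE across all scan targets into one flat list."""
    cves = []
    for result in report.get("Results", []):
        # "Vulnerabilities" may be absent or null for clean targets
        for vuln in result.get("Vulnerabilities") or []:
            cves.append({
                "cve": vuln["VulnerabilityID"],
                "package": vuln["PkgName"],
                "severity": vuln["Severity"],
            })
    return cves

cves = flatten(json.loads(sample_report))
print(len(cves))  # 2
```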
For each CVE, add two signals beyond CVSS:
- EPSS: probability of exploitation in the wild within the next 30 days, updated daily by FIRST.org
- CISA KEV: catalog of CVEs with confirmed active exploitation maintained by the US government
CVSS measures theoretical severity. EPSS and KEV tell you whether exploitation is actually happening.
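The enrichment step is a simple join against both data sources. In practice EPSS scores come from the FIRST.org API (`https://api.first.org/data/v1/epss`) and the KEV set from CISA's `known_exploited_vulnerabilities.json` feed; in this sketch both are stubbed with illustrative values so the merge logic is visible:

```python
# Stubbed lookups: real pipelines would populate these from the
# FIRST.org EPSS API and CISA's KEV JSON feed. Values are illustrative.
epss_scores = {"CVE-2023-44487": 0.94, "CVE-2024-2398": 0.02}
kev_catalog = {"CVE-2023-44487"}

def enrich(cve_id: str) -> dict:
    """Attach the two exploitation signals to a scanner finding."""
    return {
        "cve": cve_id,
        "epss": epss_scores.get(cve_id, 0.0),  # P(exploitation within 30 days)
        "kev": cve_id in kev_catalog,          # confirmed active exploitation
    }

print(enrich("CVE-2023-44487"))
```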
Assign each CVE to a bucket based on how the vulnerable package relates to what the container does:
Categorization is done by an LLM using the CVE description, package type, and image purpose. Runtime tracing in later steps validates these decisions against the actual container.
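This triage step (KEV overrides everything, LOW without a KEV flag is dropped, and everything else goes to classification, per the decision tree below) reduces to a few lines. A minimal sketch; the function name and return strings are illustrative:

```python
def triage(severity: str, in_kev: bool) -> str:
    """First-pass routing of a CVE before any LLM classification."""
    if in_kev:
        return "PROCESS"   # confirmed active exploitation wins outright
    if severity == "LOW":
        return "IGNORE"    # low severity and no KEV flag: default noise
    return "CLASSIFY"      # MEDIUM/HIGH/CRITICAL go to LLM categorization

print(triage("LOW", False))    # IGNORE
print(triage("LOW", True))     # PROCESS
print(triage("HIGH", False))   # CLASSIFY
```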
Setup
We use Trivy, the most widely adopted open-source container scanner (34.2k+ stars on GitHub). For the experiment we picked three images with different base distributions and workload types:
| Image | Base | Type |
|---|---|---|
| node:20-alpine | Alpine | Runtime |
| nginx:1.25 | Debian | Web server |
| grafana/grafana:10.0.0 | Ubuntu | Application |
These are pinned versions used during testing. The algorithm applies to any image — pinning versions ensures the results in this article are reproducible.
Categorization in Practice
Each CVE gets one of two outcomes: PROCESS or IGNORE. The decision tree:
- KEV listed → PROCESS
- Severity LOW → IGNORE
- MEDIUM / HIGH / CRITICAL → CLASSIFY

How prompt formulation affects results
We ran the same dataset through three prompt versions. The difference between 28% and 45% noise reduction comes entirely from how the rules are formulated:
| Prompt version | node (15) | nginx (363) | grafana (293) | total (671) |
|---|---|---|---|---|
| v1: no image context | 33% (5) | 56% (205) | 23% (68) | 41% (278) |
| v2: image context only | 20% (3) | 46% (167) | 6% (18) | 28% (188) |
| v3: context + heuristics | 27% (4) | 70% (253) | 15% (44) | 45% (301) |
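The article does not publish its exact prompts, but the v3 idea (image context plus explicit heuristics) can be sketched. Everything below is illustrative: the function, the heuristic wording, and the example inputs are assumptions; the category names match those that appear in the results table:

```python
# Hypothetical prompt builder in the spirit of v3 (context + heuristics).
def build_prompt(cve: dict, image_purpose: str) -> str:
    return (
        f"Image purpose: {image_purpose}\n"
        f"CVE: {cve['cve']} in package {cve['package']} ({cve['severity']})\n"
        f"Description: {cve['description']}\n"
        "Heuristics: build-time tools, docs, and locale packages are usually "
        "Not Applicable in runtime images; anything in the network-facing "
        "request path is Directly Exposed.\n"
        "Answer with one category: Directly Exposed | Not Applicable."
    )

prompt = build_prompt(
    {"cve": "CVE-2024-2398", "package": "libcurl4", "severity": "HIGH",
     "description": "HTTP/2 push headers memory leak in libcurl"},
    image_purpose="nginx:1.25, static web server, no outbound curl usage",
)
print(prompt)
```

The heuristics line is what separates v3 from v2: without it, the model tends to default borderline CVEs to PROCESS, which is why v2's noise reduction dropped to 28%.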
Applying the Algorithm
01. Scan
| Image | Critical | High | Medium | Low | Unknown | Total |
|---|---|---|---|---|---|---|
| node:20-alpine | 0 | 12 | 1 | 2 | 0 | 15 |
| nginx:1.25 | 16 | 61 | 125 | 156 | 5 | 363 |
| grafana/grafana:10.0.0 | 15 | 70 | 190 | 18 | 0 | 293 |
02. Enrich
Across the three images, 3 entries match KEV (2 unique CVEs):
| CVE | Severity | Package | Image |
|---|---|---|---|
| CVE-2025-27363 | HIGH | libfreetype6 | nginx |
| CVE-2023-44487 | HIGH | nghttp2-libs | grafana |
| CVE-2023-44487 | MEDIUM | golang.org/x/net | grafana |
CVE-2023-44487 is the HTTP/2 Rapid Reset Attack — one of the most widely exploited vulnerabilities of 2023. It appears twice in grafana because two separate packages are affected.
03. Categorize
LOW-severity CVEs without a KEV flag are ignored by default (176 CVEs). The remaining 495 went through LLM classification, with 125 assigned Not Applicable and ignored. Three examples:
| Image | CVE | Sev | EPSS | KEV | Category | Decision |
|---|---|---|---|---|---|---|
| nginx | CVE-2025-27363 | HIGH | 0.65 | ✓ | KEV | PROCESS |
| node | CVE-2024-21538 | HIGH | 0.00 | — | Directly Exposed | PROCESS |
| nginx | CVE-2024-2398 | HIGH | 0.02 | — | Not Applicable | IGNORE |
Results
| Image | Before | Process | Ignore | Noise reduction |
|---|---|---|---|---|
| node:20-alpine | 15 | 11 | 4 | 27% |
| nginx:1.25 | 363 | 110 | 253 | 70% |
| grafana/grafana:10.0.0 | 293 | 249 | 44 | 15% |
Conclusion
Across three images, 671 CVEs were triaged: 301 filtered out without manual review and 370 flagged for engineering attention. Noise reduction ranges from 15% to 70% depending on the image type and how conservatively the prompt is tuned.
The approach works best on images with a known, fixed purpose. For generic base images without application context, Not Applicable is harder to assign and more CVEs default to PROCESS.
This is the first step: filtering by metadata alone, before any runtime data is involved. The next articles narrow the list further using runtime libraries, call graphs, and syscall-level verification.