
Vulnerability Scanner – Spring AI

March 15, 2026 ai by ravidorairaj

I built a self-hosted AI pipeline that clones a Java or Kotlin repository, resolves the full transitive dependency tree, checks every dependency against the OSV vulnerability database, and generates an actionable security report, complete with BOM-aware upgrade recommendations and the exact build file lines to change. It runs as a Docker container with a built-in web UI, scan history, an MCP server for IDE integration, and PDF export. The code is open source. This is the Java / Spring AI edition, ported from the Python version for type safety, multi-vendor LLM support, and enterprise-grade tooling. GitHub: github.com/crewwithravi/vul-scanner-spring-ai

What is VulnHawk (Spring AI Edition)?

VulnHawk is a personal research project I built to explore multi-agent AI architectures applied to software supply chain security — rebuilt in Java using Spring AI and Spring Boot. The idea is straightforward:

You submit — a GitHub repository URL or a list of Maven/Gradle dependency coordinates (via the REST API, web UI, or MCP from Claude Code / Cursor)
VulnHawk resolves — the full transitive dependency tree using mvn dependency:tree or gradle dependencies, then batch-queries the OSV vulnerability database
AI interprets — a 4-agent Spring AI pipeline reads the vulnerability data, checks whether each dep is BOM-managed, reviews changelogs, and searches the source code for affected APIs
You receive — a structured 10-section Markdown report with CVE IDs, severity ratings, and the exact version bump to make in your pom.xml or build.gradle

The key design principle: the vulnerability data is always real. The AI interprets and recommends — it never fabricates CVE IDs or version numbers.

Why I Built the Java Version

The Python version (CrewAI + FastAPI) works well. But I needed capabilities that the Python ecosystem couldn’t deliver cleanly:

Multi-vendor LLM with zero code changes: Spring AI’s ChatModel interface lets me switch between Gemini, Claude, OpenAI, and Ollama with one env var. No SDK swapping, no import changes.
Compile-time type safety: tool schemas are generated from @Tool-annotated Java method signatures, so a mistyped parameter breaks the build instead of surfacing in production.
True concurrency: a JVM thread pool handles parallel scans without being limited by Python’s GIL.
MCP built in: one Spring AI starter turns the app into an MCP server that Claude Code and Cursor can call directly.
Enterprise ecosystem: Spring Security, Actuator, JPA, and 22 years of battle-tested infrastructure when the roadmap demands it.

The Python version is ~800 lines. The Java version is ~2,500 lines — more explicit, but with stronger guarantees.

Key Features

Feature	Description
Full transitive resolution	Runs `mvn dependency:tree` or `gradle dependencies` to capture every transitive dependency — not just the ones declared in the build file. Falls back to pom.xml/build.gradle parsing if no build tool is installed.
OSV vulnerability database	Batch-queries the OSV API for all dependencies in a single call (up to 1000 deps per batch).
BOM-aware upgrades	Detects Spring Boot BOM-managed libraries and recommends bumping the parent version instead of the dependency directly. Covers 8 Spring Boot releases from 2.6.x through 3.5.x.
Changelog + code review	For each upgrade, fetches release notes from GitHub, searches the project source for affected API usage, and produces a confidence score on whether the upgrade is safe.
Deterministic fallback	If any LLM call fails, the pipeline runs tools directly and builds the report programmatically from raw tool output — no data is lost.
Multi-LLM support	Switch between Google Gemini (default), Anthropic Claude, OpenAI, and Ollama (self-hosted GPU) with one environment variable. Zero code changes.
MCP server	Built-in Model Context Protocol server at `/mcp/sse` — scan repos directly from Claude Code, Cursor, or VS Code without opening a browser.
Web UI + REST API	Dark-themed SPA with live progress, severity badges, scan history panel, and health indicator. Full REST API with async scan support and polling.
PDF export	Generate print-ready PDF reports via FlexMark + Flying Saucer with colour-coded severity sections and professional typography.
Scan history	Every scan is saved to disk (JSON). Browse, reload, or delete past scans from the UI or API. Persisted across container restarts via Docker volumes.
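
To illustrate the "Deterministic fallback" row, here is a minimal sketch in plain Java. It is not the project's actual code: the class name `FallbackStage`, the `run` method, and the `[fallback]` marker are all hypothetical. It only shows the general shape of trying the LLM-backed call first and falling back to the raw tool output if that call throws.

```java
import java.util.function.Supplier;

/** Sketch of an "LLM first, deterministic fallback" stage (all names are hypothetical). */
final class FallbackStage {

    /**
     * Tries the LLM-backed call first. If it throws, runs the tool directly
     * so the raw tool output still reaches the report.
     */
    static String run(Supplier<String> llmCall, Supplier<String> directToolCall) {
        try {
            return llmCall.get();
        } catch (RuntimeException e) {
            // Hypothetical marker showing which path produced this section.
            return "[fallback] " + directToolCall.get();
        }
    }

    public static void main(String[] args) {
        Supplier<String> rawOsvRows = () -> "CVE-2021-44228 | log4j-core 2.14.1";
        System.out.println(run(() -> "LLM-written summary", rawOsvRows));
        System.out.println(run(() -> { throw new IllegalStateException("LLM timeout"); }, rawOsvRows));
    }
}
```

The point of the pattern is that the tool output never depends on the model being reachable; only the wording of the report does.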

Architecture: 4-Agent Pipeline

This is the part I’m most intentional about. VulnHawk follows a data-first architecture — all vulnerability data is fetched from real databases and computed locally. The AI layer receives verified data and is responsible only for interpretation and recommendations.

  Agent 1  Repo Scanner          → detect Maven/Gradle, run dependency:tree
                                    output: [{group_id, artifact_id, version, scope, depth}]

  Agent 2  Vulnerability Analyst → one batch OSV query covering all deps
                                    output: {vuln_count, vulnerabilities: [{id, severity, fix_version}]}

  Agent 3  Upgrade Strategist    → BOM check → version lookup → changelog → code search
                                    output: upgrade plan with confidence score per dep

  Agent 4  Report Generator      → 10-section Markdown report with exact build file changes

Each agent is a single ChatClient.prompt() call with a tailored system prompt and @Tool-annotated Java methods. Spring AI handles the entire tool-calling loop (schema generation, argument deserialization, method invocation, and result forwarding) automatically.

Every API response separates verified data (cve_id, cvss_score, affected_versions) from AI output (upgrade_summary, confidence_score), so users can verify every CVE independently of whatever the model says.
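
As a rough illustration of Agent 1's output shape in the diagram above, the sketch below parses `mvn dependency:tree` text into records with group, artifact, version, scope, and depth. It is an assumption-laden sketch rather than the project's parser: the class `DependencyTreeParser`, the regular expression, and the choice to count direct dependencies as depth 1 are mine, and it only handles the default text output where each tree level is drawn with three characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DependencyTreeParser {

    /** Fields mirror the Agent 1 output shape: group_id, artifact_id, version, scope, depth. */
    record Dep(String groupId, String artifactId, String version, String scope, int depth) {}

    // Tree prefix (three characters per ancestor level), a "+- " or "\- " branch marker,
    // then group:artifact:type[:classifier]:version:scope.
    private static final Pattern NODE = Pattern.compile(
        "((?:[| ] {2})*)[+\\\\]- ([^:\\s]+):([^:\\s]+):[^:\\s]+:(?:[^:\\s]+:)?([^:\\s]+):([a-z]+)");

    static List<Dep> parse(String treeOutput) {
        List<Dep> deps = new ArrayList<>();
        for (String raw : treeOutput.split("\\R")) {
            String line = raw.startsWith("[INFO] ") ? raw.substring(7) : raw;
            Matcher m = NODE.matcher(line);
            if (!m.matches()) {
                continue; // root project line, banners, and other log output
            }
            int depth = m.group(1).length() / 3 + 1; // assumption: direct dependencies are depth 1
            deps.add(new Dep(m.group(2), m.group(3), m.group(4), m.group(5), depth));
        }
        return deps;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
            "[INFO] com.example:demo:jar:0.0.1-SNAPSHOT",
            "[INFO] +- org.springframework:spring-core:jar:5.3.0:compile",
            "[INFO] |  \\- org.springframework:spring-jcl:jar:5.3.0:compile",
            "[INFO] \\- org.junit.jupiter:junit-jupiter:jar:5.10.0:test");
        parse(sample).forEach(System.out::println);
    }
}
```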

How Spring AI Powers the Pipeline

  ChatClient.prompt()
      .system("You are the Vulnerability Analyst agent...")
      .user(dependencyList)
      .tools(vulnHawkTools)       ← Spring AI binds all @Tool methods
      .call()
          |
          +→ LLM receives function schemas (auto-generated from Java methods)
          +→ LLM decides to call checkOsvVulnerabilities(json)
          +→ Spring AI deserializes args, invokes Java method
          +→ Method calls OSV API, returns result
          +→ Spring AI sends result back to LLM
          +→ LLM may call another tool or return final answer
          |
      .content()  → final Markdown string

This replaces CrewAI’s Agent() → Task() → Crew().kickoff() chain with a single fluent API call. No agent class hierarchy, no task objects, no crew orchestrator.
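
To make the loop in the diagram above more concrete, here is a toy version in plain Java with no Spring AI dependency. Everything in it is hypothetical (`ToolLoopSketch`, `Turn`, `Model`, and the scripted fake model); it is not how Spring AI is implemented, only a sketch of the request-tool, run-tool, feed-result-back cycle that the framework runs on the application's behalf.

```java
import java.util.Map;
import java.util.function.Function;

final class ToolLoopSketch {

    /** One model turn: either a request to call a tool, or a final answer. */
    record Turn(String toolName, String toolArgs, String finalAnswer) {
        static Turn callTool(String name, String args) { return new Turn(name, args, null); }
        static Turn answer(String text) { return new Turn(null, null, text); }
    }

    /** Stand-in for the LLM: looks at the transcript so far and decides the next turn. */
    interface Model {
        Turn next(String transcript);
    }

    static String run(Model model, Map<String, Function<String, String>> tools,
                      String userInput, int maxTurns) {
        StringBuilder transcript = new StringBuilder("user: ").append(userInput);
        for (int i = 0; i < maxTurns; i++) {
            Turn turn = model.next(transcript.toString());
            if (turn.toolName() == null) {
                return turn.finalAnswer(); // the model is done
            }
            Function<String, String> tool = tools.get(turn.toolName());
            String result = (tool == null) ? "error: unknown tool" : tool.apply(turn.toolArgs());
            // Feed the tool result back so the model can use it on the next turn.
            transcript.append("\ntool ").append(turn.toolName()).append(": ").append(result);
        }
        throw new IllegalStateException("no final answer after " + maxTurns + " turns");
    }

    public static void main(String[] args) {
        // Scripted fake model: ask for the OSV tool once, then answer.
        Model fake = t -> t.contains("tool checkOsvVulnerabilities")
            ? Turn.answer("Report based on tool output")
            : Turn.callTool("checkOsvVulnerabilities", "[\"log4j-core:2.14.1\"]");
        Map<String, Function<String, String>> tools =
            Map.of("checkOsvVulnerabilities", json -> "1 vulnerability for " + json);
        System.out.println(run(fake, tools, "scan my deps", 5));
    }
}
```

The `maxTurns` guard is a design choice of this sketch: without some bound, a model that keeps requesting tools would never return.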

Tech Stack

Component	Technology	Purpose
Framework	Spring Boot 3.5	REST endpoints, auto-configuration, static file serving
AI Abstraction	Spring AI 1.1.2	Vendor-neutral LLM interface, `@Tool` binding, MCP server
LLM (default)	Google Gemini	Default backend — fast, free tier available
LLM (alt)	Claude / OpenAI / Ollama	Switch with one env var — zero code changes
Vulnerability Data	OSV API	Batch CVE lookup for all Maven/Gradle dependencies
Version Lookup	Maven Central Search API	Find the smallest safe upgrade version on Maven Central
PDF Export	FlexMark + Flying Saucer	Markdown → HTML → PDF with styled severity sections
Build	Gradle 9.3	Dependency management, Spring Boot plugin
Container	Docker + Docker Compose	Multi-stage build, non-root image with Git + Maven
Web UI	HTML + Tailwind CSS + JS	Built-in dark-themed SPA served by embedded Tomcat — no build step

8 Built-in Tools (Spring AI @Tool Methods)

Each tool is a plain Java method annotated with @Tool. Spring AI automatically generates a JSON function schema from the method signature, sends it to the LLM, and handles the invoke/response loop. No manual schema definition needed.

Tool	What it does	External API
`detectBuildSystem`	Identifies Maven or Gradle from project files	Local filesystem
`extractDependencies`	Runs `mvn dependency:tree` or `gradle dependencies`, falls back to build file parsing	Local CLI
`checkOsvVulnerabilities`	Batch CVE lookup (up to 1000 deps per call)	OSV API (osv.dev)
`lookupLatestSafeVersion`	Finds the latest non-vulnerable version	Maven Central (Solr)
`resolveBomParent`	Checks if a dep is managed by Spring Boot BOM and resolves the correct parent version	Static BOM data
`searchCodeUsage`	Greps Java/Kotlin source for vulnerable package usage	Local filesystem
`fetchChangelog`	Pulls release notes for the upgrade target version	GitHub API / Apache
`readProjectDocs`	Reads README, SECURITY.md, CHANGELOG for breaking change context	Local filesystem
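
For a sense of what a tool like `checkOsvVulnerabilities` has to send, the sketch below builds the JSON body for OSV's batch endpoint (`POST https://api.osv.dev/v1/querybatch`), where Maven packages are named `groupId:artifactId` under the `Maven` ecosystem. The class `OsvBatchRequest`, its methods, and the hand-rolled JSON are my assumptions for illustration; the real tool may well use a JSON library. Lists longer than 1000 entries are split into chunks to respect the batch size mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

final class OsvBatchRequest {

    record Dep(String groupId, String artifactId, String version) {}

    static final int MAX_BATCH = 1000; // batch size stated in the tool table

    /** Splits the dependency list into chunks of at most MAX_BATCH entries. */
    static List<List<Dep>> chunk(List<Dep> deps) {
        List<List<Dep>> batches = new ArrayList<>();
        for (int i = 0; i < deps.size(); i += MAX_BATCH) {
            batches.add(deps.subList(i, Math.min(i + MAX_BATCH, deps.size())));
        }
        return batches;
    }

    /** Builds the request body for one batch: {"queries":[{"package":{...},"version":"..."}]}. */
    static String toJson(List<Dep> batch) {
        return batch.stream()
            .map(d -> "{\"package\":{\"ecosystem\":\"Maven\",\"name\":\""
                + escape(d.groupId() + ":" + d.artifactId())
                + "\"},\"version\":\"" + escape(d.version()) + "\"}")
            .collect(Collectors.joining(",", "{\"queries\":[", "]}"));
    }

    // Minimal escaping for backslashes and quotes; coordinates rarely need more.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        List<Dep> deps = List.of(new Dep("org.apache.logging.log4j", "log4j-core", "2.14.1"));
        chunk(deps).forEach(batch -> System.out.println(toJson(batch)));
    }
}
```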

The Web UI

VulnHawk ships with a built-in web interface — no separate frontend build required. It features:

Dark theme with glass-morphism design
GitHub URL tab and dependency list tab — paste coordinates directly to scan without cloning
Live progress bar showing which agent is currently running (5 steps)
Elapsed time counter during scan
Rendered report with colour-coded severity badges (CRITICAL / HIGH / MEDIUM / LOW), sortable dependency table, and upgrade plan
Section navigation auto-generated from report headings
History panel — clock icon top-right — browse, reload, or delete any past scan
Health badge showing active LLM vendor and model
Download as PDF or Markdown, copy to clipboard

The frontend is pure HTML, JavaScript, and CSS served directly by Spring Boot’s embedded Tomcat. Zero build step, zero npm.

MCP Server — Scan from Claude Code / Cursor

VulnHawk includes a built-in MCP (Model Context Protocol) server powered by spring-ai-starter-mcp-server-webmvc. Any MCP-compatible client can discover and call VulnHawk’s tools without a browser.

  Developer in Claude Code:
    > "Scan spring-petclinic for vulnerabilities"
    > "Is log4j 2.14.1 vulnerable?"
    > "What's the safe upgrade for spring-core 5.3.0?"

  Claude Code calls VulnHawk MCP tools automatically.
  No browser. No copy-paste. Integrated into the dev workflow.

6 MCP tools exposed:

MCP Tool	Description
`scan_repository`	Full 4-agent pipeline scan of a GitHub repo URL
`scan_dependencies`	Scan a raw `group:artifact:version` list
`get_vulnerability_report`	Retrieve a past report by ID
`list_scan_history`	List all past scans with summary
`check_single_dependency`	Quick OSV check for one dependency
`get_safe_upgrade`	Find safe upgrade version for one dependency
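
The `group:artifact:version` input accepted by `scan_dependencies` can be validated with very little code. The following is a minimal sketch, assuming coordinates are separated by whitespace or commas (the separator is my assumption, as are the `CoordinateParser` and `Coordinate` names); it rejects anything that does not have exactly three non-empty parts.

```java
import java.util.ArrayList;
import java.util.List;

final class CoordinateParser {

    record Coordinate(String groupId, String artifactId, String version) {}

    /** Parses whitespace- or comma-separated group:artifact:version tokens. */
    static List<Coordinate> parse(String input) {
        List<Coordinate> result = new ArrayList<>();
        for (String token : input.split("[\\s,]+")) {
            if (token.isBlank()) {
                continue; // leading separators produce one empty token
            }
            String[] parts = token.split(":");
            if (parts.length != 3 || parts[0].isEmpty() || parts[1].isEmpty() || parts[2].isEmpty()) {
                throw new IllegalArgumentException("Expected group:artifact:version but got: " + token);
            }
            result.add(new Coordinate(parts[0], parts[1], parts[2]));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "org.apache.logging.log4j:log4j-core:2.14.1\norg.springframework:spring-core:5.3.0";
        parse(input).forEach(System.out::println);
    }
}
```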

API Endpoints

Method	Endpoint	Description
`GET`	`/health`	Health check — LLM backend connectivity, active model, and provider info
`POST`	`/scan`	Start an async vulnerability scan — accepts `github_url` or raw `input` dependency coordinates
`GET`	`/scan/{id}/status`	Poll scan progress — status, current step (0-4), result when complete
`DELETE`	`/scan/{id}`	Cancel an in-progress scan
`POST`	`/report/pdf`	Generate a PDF from a Markdown report
`GET`	`/history`	List past scan summaries ordered most-recent first
`GET`	`/history/{id}`	Retrieve the full Markdown report for a specific past scan
`DELETE`	`/history/{id}`	Remove a scan record from history
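
As a sketch of how a client might drive the async flow with the JDK's built-in `java.net.http` API, the code below builds the `POST /scan` and `GET /scan/{id}/status` requests. It assumes the app is reachable at `http://localhost:9090` and that `/scan` takes a JSON body with a `github_url` field; the `ScanRequests` class and its method names are hypothetical. The response fields are not spelled out here, so the sketch stops at building the requests rather than guessing at how to parse the replies.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class ScanRequests {

    // Assumption: default local address; change to wherever the container is exposed.
    static final String BASE_URL = "http://localhost:9090";

    /** JSON body for POST /scan, with minimal escaping of backslashes and quotes. */
    static String scanBody(String githubUrl) {
        String escaped = githubUrl.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"github_url\":\"" + escaped + "\"}";
    }

    static HttpRequest startScan(String githubUrl) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/scan"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(scanBody(githubUrl)))
            .build();
    }

    static HttpRequest scanStatus(String scanId) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/scan/" + scanId + "/status"))
            .GET()
            .build();
    }

    public static void main(String[] args) {
        HttpRequest start = startScan("https://github.com/spring-projects/spring-petclinic");
        System.out.println(start.method() + " " + start.uri());
        // To actually send it:
        // HttpClient.newHttpClient().send(start, HttpResponse.BodyHandlers.ofString());
    }
}
```

A caller would send `startScan`, read the scan ID from the response, and then send `scanStatus` on an interval until the scan reports completion.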

Running It Yourself

The quickest way to try VulnHawk locally:

  git clone https://github.com/crewwithravi/vul-scanner-spring-ai.git
  cd vul-scanner-spring-ai

  export SPRING_AI_GOOGLE_GENAI_API_KEY=AIza...
  ./gradlew bootRun

Then open http://localhost:9090 in your browser. To run with a different LLM vendor:

  # Anthropic Claude
  export LLM_VENDOR=anthropic
  export SPRING_AI_ANTHROPIC_API_KEY=sk-ant-...
  ./gradlew bootRun

  # OpenAI
  export LLM_VENDOR=openai
  export SPRING_AI_OPENAI_API_KEY=sk-...
  ./gradlew bootRun

  # Ollama (local or remote GPU)
  export LLM_VENDOR=ollama
  export OLLAMA_BASE_URL=http://<host>:11434
  ./gradlew bootRun

To run with Docker:

  docker compose up --build

The container includes Git and Maven pre-installed so it can clone and analyse repositories. Scan history is persisted via Docker volumes.

BOM-Aware Upgrades: How It Works

This is the part of VulnHawk I’m most proud of. For every vulnerable dependency, Agent 3 first checks whether it is managed by the Spring Boot BOM before recommending any version change. Here is what that looks like in the report:

  ⚠  DO NOT bump tomcat-embed-core directly — it is BOM-managed.

  FIX: Upgrade spring-boot 3.2.5 → 3.3.10
       This automatically brings tomcat-embed-core 10.1.30 (≥ safe version).

  pom.xml:       <spring-boot.version>3.3.10</spring-boot.version>
  build.gradle:  id 'org.springframework.boot' version '3.3.10'

Libraries covered by the BOM resolver:

Library	Maven Coordinates	Common Vulnerability
Embedded Tomcat	`org.apache.tomcat.embed:tomcat-embed-core`	Session fixation, request smuggling CVEs
Jackson Databind	`com.fasterxml.jackson.core:jackson-databind`	Deserialization gadget chain CVEs
Netty	`io.netty:netty-all`	HTTP request smuggling CVEs
Log4j	`org.apache.logging.log4j:log4j-core`	Log4Shell (CVE-2021-44228) and follow-on CVEs
Spring Framework	`org.springframework:spring-core`	Spring4Shell and expression injection CVEs
Spring Security	`org.springframework.security:spring-security-core`	Authentication bypass CVEs
Logback	`ch.qos.logback:logback-classic`	JNDI injection CVEs
SnakeYAML	`org.yaml:snakeyaml`	Deserialization CVEs
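
The BOM check described above boils down to a lookup: given a table of which library version each Spring Boot release manages, find the lowest Boot release newer than the current one whose managed version is at or above the safe version. The sketch below shows that logic under stated assumptions. `BomResolverSketch` and its methods are hypothetical, the version table in `main` holds illustrative placeholder values (only the 3.3.10 to 10.1.30 pairing comes from the report excerpt above), and the comparison handles purely numeric dotted versions only.

```java
import java.util.Map;
import java.util.Optional;

final class BomResolverSketch {

    /** Compares dotted numeric versions such as "10.1.30". Qualifiers like "-RC1" are not handled. */
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /**
     * Returns the lowest Spring Boot version above currentBoot whose managed
     * library version is at least safeVersion, or empty if the table has none.
     */
    static Optional<String> resolveParent(Map<String, String> managedByBoot,
                                          String currentBoot, String safeVersion) {
        return managedByBoot.entrySet().stream()
            .filter(e -> compareVersions(e.getKey(), currentBoot) > 0)
            .filter(e -> compareVersions(e.getValue(), safeVersion) >= 0)
            .map(Map.Entry::getKey)
            .min(BomResolverSketch::compareVersions);
    }

    public static void main(String[] args) {
        // Placeholder table for tomcat-embed-core: Spring Boot version -> managed version.
        Map<String, String> tomcatByBoot = Map.of(
            "3.2.5", "10.1.20",   // illustrative placeholder
            "3.3.10", "10.1.30"); // pairing taken from the report excerpt
        // "10.1.25" is a made-up safe version used only to exercise the lookup.
        System.out.println(resolveParent(tomcatByBoot, "3.2.5", "10.1.25"));
    }
}
```

If no Boot release in the table reaches the safe version, the empty result is the signal to fall back to a direct version override.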

What’s Next

PostgreSQL persistence: replace the JSON-on-disk scan history with JPA and unlimited scan storage
GitHub Actions integration: post scan results as PR comments
Private repository support via GitHub token
Scheduled background re-scans with email / Slack alerts when new CVEs are published for tracked repos
Jira integration: auto-create tickets for CRITICAL findings
SBOM export in CycloneDX format

Disclaimer

VulnHawk is an educational and experimental project built for learning and demonstration purposes. It is not production security software. Scan results are a useful starting point but may be incomplete — the OSV database has known gaps, and AI-generated upgrade assessments should always be verified independently before being applied to production systems. Always consult a qualified security professional for critical decisions. Use at your own risk.

Full source code, deployment guide, and API docs: github.com/crewwithravi/vul-scanner-spring-ai

Licensed under MIT · Built with Spring AI, Spring Boot, and OSV
