Built for the team that ships, not just the one that's on call
NessForge models your deployment environment as it actually is — multiple services, multiple teams, shared config, inconsistent naming conventions and all. The analysis works against your real stack, not an idealized one.
Who uses it and how
Visibility across all service deploys, without building internal tooling
When you're running 20+ microservices, with multiple product teams shipping to the same production environment, the question "did this deploy succeed?" is never as simple as a green check. A passing CI run doesn't mean nothing broke downstream. NessForge gives platform teams a unified timeline of every active deploy: what changed, which services it touched, and whether any of those services started behaving differently afterward. When something goes wrong, you get a dependency-aware failure map instead of a blank screen and a timer.
Know whether your change broke something else, specifically
You merged a clean PR. CI passed. The deploy ran. And now the auth service is returning 503s, and nobody's sure if it's related to your change or something that was already broken. NessForge tells you, specifically: whether the failure pattern started before or after your deploy, whether your change altered any shared config or contract that other services depend on, and whether this failure signature has appeared before. It doesn't prove innocence — but it gives you the facts to stop guessing and start fixing.
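To make the "before or after your deploy" question concrete, here is a minimal sketch in Python. It is an illustration only, not NessForge's implementation: the function name, the 30-minute window, and the label strings are all assumptions made for this example.

```python
from datetime import datetime, timedelta

def failure_onset(error_times, deploy_time, window=timedelta(minutes=30)):
    """Classify whether errors near a deploy began before or after it.

    Hypothetical helper: only errors within `window` of the deploy are
    considered, and the earliest of those decides the label.
    """
    nearby = sorted(t for t in error_times if abs(t - deploy_time) <= window)
    if not nearby:
        return "no_failures_in_window"
    return "before_deploy" if nearby[0] < deploy_time else "after_deploy"

# Usage: 503s that first appear four minutes after the deploy.
deploy = datetime(2025, 1, 1, 12, 0)
errors = [deploy + timedelta(minutes=4), deploy + timedelta(minutes=5)]
print(failure_onset(errors, deploy))
```

A failure that starts after the deploy is still only a correlation, which is why the timing check is paired with the shared-config and failure-signature checks described above.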
Reduce MTTR without adding another dashboard to watch
Your MTTR isn't limited by your alerting. It's limited by the time it takes to get to a root cause: correlating evidence across five systems during an incident, when nobody fires up a structured investigation workflow at 2am. NessForge reduces that correlation step from a 20-minute manual process to a 30-second lookup. It also tracks pipeline health metrics over time, so you can catch a degraded CI pipeline (flaky tests accumulating, build duration drifting) before it becomes the next incident.
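As a rough illustration of the two pipeline health signals mentioned here, the Python sketch below computes a build-duration drift ratio and flags flaky tests. The function names, the window sizes, and the "passed and failed on the same commit" definition of flaky are assumptions chosen for the example, not a description of how NessForge computes them.

```python
from collections import defaultdict
from statistics import median

def duration_drift(durations_s, baseline_n=20, recent_n=5):
    """Ratio of the recent median build duration to the baseline median.

    Returns None when there is not enough history or the baseline is zero.
    """
    if len(durations_s) < baseline_n + recent_n:
        return None
    baseline = median(durations_s[-(baseline_n + recent_n):-recent_n])
    recent = median(durations_s[-recent_n:])
    return recent / baseline if baseline else None

def flaky_tests(runs):
    """Flag tests that both passed and failed on the same commit.

    `runs` is an iterable of (commit_sha, test_name, passed) tuples.
    """
    outcomes = defaultdict(set)
    for sha, test, passed in runs:
        outcomes[(sha, test)].add(bool(passed))
    return sorted({test for (_, test), seen in outcomes.items() if len(seen) == 2})

# Usage: builds that drift from 100s to 150s, and one test that flips on a rerun.
print(duration_drift([100] * 20 + [150] * 5))
print(flaky_tests([("abc", "test_login", True), ("abc", "test_login", False)]))
```

Medians are used rather than means so that a single unusually slow build does not register as drift.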
What's under the hood
NessForge is built on a causal event graph, not a time-series database. The distinction matters: it can answer "why did this fail" instead of just "when did this spike."
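For readers who want a mental model of what a causal event graph is, here is a small Python sketch. It assumes a simple design in which each event records its direct causes, and answering "why did this fail" means walking those links backward. The class, method names, and event IDs are hypothetical and do not reflect NessForge's internal schema.

```python
from collections import defaultdict, deque

class CausalEventGraph:
    """Toy causal graph: events are nodes, edges point from cause to effect."""

    def __init__(self):
        self.kind = {}                   # event_id -> "commit", "ci_run", ...
        self.causes = defaultdict(set)   # effect_id -> set of direct cause ids

    def add_event(self, event_id, kind):
        self.kind[event_id] = kind

    def link(self, cause_id, effect_id):
        self.causes[effect_id].add(cause_id)

    def root_causes(self, failure_id):
        """Walk cause links backward and return upstream events with no recorded cause."""
        seen, queue, roots = {failure_id}, deque([failure_id]), []
        while queue:
            node = queue.popleft()
            parents = self.causes.get(node, set())
            if not parents and node != failure_id:
                roots.append(node)
            for parent in parents:
                if parent not in seen:
                    seen.add(parent)
                    queue.append(parent)
        return sorted(roots)

# Usage: a commit flows through CI and a deploy; a config change lands separately.
g = CausalEventGraph()
for event_id, kind in [("commit:abc", "commit"), ("ci:101", "ci_run"),
                       ("deploy:7", "deploy"), ("config:timeout", "config_change"),
                       ("failure:auth-503", "service_failure")]:
    g.add_event(event_id, kind)
g.link("commit:abc", "ci:101")
g.link("ci:101", "deploy:7")
g.link("deploy:7", "failure:auth-503")
g.link("config:timeout", "failure:auth-503")
print(g.root_causes("failure:auth-503"))  # ['commit:abc', 'config:timeout']
```

A time-series store could show when the 503s spiked, but it has no cause links to follow, so it could not produce the list of upstream candidates.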
Data model: Causal event graph linking commits, CI runs, config changes, and service failures
CI integrations: GitHub Actions, GitLab CI, CircleCI, Buildkite
Source control: GitHub, GitLab, Bitbucket
Orchestration: Kubernetes (in-cluster or remote), Amazon ECS
Observability (planned): Datadog, Prometheus / Grafana (import existing metrics to enrich failure context)
Alerting: Slack webhooks, PagerDuty incident creation with pre-populated root cause context
Auth: SSO / SAML, GitHub OAuth, GitLab OAuth
Code storage: NessForge never stores source code. It reads commit metadata and CI log structure, not content.
Deployment model: SaaS (hosted). Self-hosted option planned for Q3 2026.
Retention: 90-day rolling pipeline history by default; configurable up to 365 days
Availability: Target of 99.9% uptime SLA at GA. Early access runs on a best-effort basis.
Compliance: SOC 2 Type II audit in progress. Expected Q4 2026.
Start with your existing stack
No agent to install. No pipeline YAML to change. Connect in 5 minutes and start seeing your deploy history in context.