Diagnostic

Diagnose your OKR in 60 seconds.

Paste your OKR. Get a structured critique in seconds: what's broken, why, and three rewrites you can use immediately. Your API key never leaves your browser.

Runs locally before the LLM call. Output-verb detection, timebox regex, placeholder check. Instant pre-score, no key required.

Try an example

Paste OKR

/100

Scanning

Key stored in browser only. One OKR set at a time.

~$0.002 per analysis with GPT-4o-mini. ~$0.01 with Claude Sonnet. Cmd+Enter to submit.

Where are you?
Starting from scratch, have a goal but no KRs, or have a draft you want to pressure-test?

~$0.02-0.04 per session. Conversation stays in your browser only.

/100

Analysing

Rule engine flags

Findings

Getting LLM analysis

The rubric explained.

OKR Orca scores against a seven-criterion rubric. Three criteria apply to the Objective. Two criteria apply to each Key Result. Two apply to the set as a whole.

The core principle: "Who does what by how much." An outcome is a measurable change in behaviour. Outputs are not outcomes. Impact (revenue, profit) is too far removed. Key Results live at the outcome layer.

Objective criteria

O1 Clarity

Is the customer benefit explicit? 0 = No customer benefit stated. 1 = Vague, unclear who benefits. 2 = Clear customer and specific scope.

O2 Timebox

Is there a deadline? 0 = No deadline. 1 = Implicit ("this quarter"). 2 = Explicit date or quarter.

O3 Strategy

Is a solution prescribed? 0 = A feature or solution is named. 1 = Generic but solution-free. 2 = Problem-framed, no solution prescribed.

Key Result criteria (per KR)

Outcome form

Output or outcome? 0 = Output keyword detected (launch, migrate, deliver, build, implement). 1 = Has a metric but vague actor or direction. 2 = Full "who does what from X to Y".

Measurability

Baseline and target present? 0 = Neither. 1 = One present, one missing. 2 = Both present with implied data source.

Set-level criteria

A1 Alignment

Is there a parent objective or strategy reference? 0 = No reference. 1 = Implied. 2 = Explicit parent link.

C1 Completeness

Placeholders present? 0 = Placeholders detected (X%, TBD, (owner), (tbc)). 1 = Minor gaps. 2 = Fully specified.

The "So What?" test

For every KR, ask: (1) If all KRs are green, is the Objective obviously achieved? (2) If this KR turns red, does it signal a real problem? (3) Does the team actually control this metric? Any "no" means the KR needs rewriting.

Common anti-patterns: Output-as-KR. Impact-as-KR (revenue targets). Vanity metrics (engagement without "who + by how much"). Placeholders. Binary milestones (100% migrated). Task lists disguised as KRs.

Frequently asked.

Is my OKR text sent to a server?

No. Your OKR text is sent directly from your browser to OpenAI or Anthropic. OKR Orca has no backend. The source is public and you can verify this yourself.

Where is my API key stored?

In your browser's localStorage only. It is never transmitted to okrorca.com or any third party. You can clear it at any time via the "change" link in the nav.

Which models are supported?

OpenAI: GPT-4o-mini (default, cheapest), GPT-4o. Anthropic: Claude Sonnet 4.6 (default), Claude Haiku 3.5. The rule-engine pre-score runs locally and is free regardless of key.

How much does an analysis cost?

With GPT-4o-mini roughly $0.001 to $0.003 per analysis. With Claude Sonnet 4.6 roughly $0.008 to $0.015. The estimate shown on the input panel is indicative only.

Can I use it without an API key?

Yes. The rule-engine pre-score runs instantly in the browser with no key required. You get output-verb flags, timebox checks, placeholder detection, and a pre-score. LLM rewrite suggestions require a key.

What is the rule-engine pre-score?

A heuristic score (0-100) computed locally in under 100ms. It checks for output verbs in KRs, missing timeboxes in the Objective, placeholder strings, and the presence of baseline and target numbers. It is a fast signal, not a replacement for the full LLM analysis.

What is Coach mode?

Coach mode is the second tab in the input panel. Instead of pasting an existing OKR, you start a conversation. The tool asks one focused question at a time, surfaces the gap between what you want to achieve and what you have written, and helps you arrive at an OKR in your own words. When the draft is ready, you can send it straight to Diagnose for a rubric score.

Who built this.

I'm Frederik Metz, an agile coach in Munich. I've reviewed several hundred OKRs over the last few years, mostly from product teams trying to get the rubric right and getting tangled by output-disguised-as-outcome KRs.

OKR Orca is the diagnostic I wish my teams had earlier. The 7-criterion rubric is the one I use in coaching conversations. It's opinionated, but it's the opinion that consistently produces OKRs that move outcomes, not OKRs that produce planning theatre.

The tool is free. Your key stays in your browser. No signup, no tracking. If you find a bug or want a feature, the source is on GitHub.

View source on GitHub · frederikmetz@gmail.com