Key Takeaways
Support agent performance is measured across four categories: resolution speed, quality, customer sentiment, and efficiency.
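Strong teams track a small, balanced set rather than any single metric, and pair every number with a benchmark so a score has meaning. The four jobs your metrics need to cover are resolution (FCR, reply time), quality (a QA score), sentiment (CSAT or CES), and efficiency (occupancy).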
Here are the dozen metrics worth tracking, grouped by the job they do. Each one gets a plain definition, the formula, and a 2026 benchmark.
The long tail of niche call-center metrics is left out on purpose, because a shorter list that everyone understands beats a wall of numbers nobody acts on.
First Contact Resolution (FCR). The percentage of issues solved in the first interaction, with no follow-up. It is the single strongest driver of loyalty and cost savings, and the one to protect above all others. Formula: one-touch tickets divided by total tickets, times 100.
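As a minimal sketch of the FCR formula above, the calculation might look like this. The function name and the sample ticket counts are hypothetical, chosen only for illustration.

```python
def first_contact_resolution(one_touch_tickets: int, total_tickets: int) -> float:
    """FCR as a percentage: one-touch tickets divided by total tickets, times 100."""
    if total_tickets <= 0:
        raise ValueError("total_tickets must be positive")
    return 100 * one_touch_tickets / total_tickets


# Hypothetical week: 1,200 tickets, 888 of them solved in a single interaction.
fcr = first_contact_resolution(888, 1200)
print(f"FCR: {fcr:.1f}%")
```

What counts as "one-touch" depends on your helpdesk's definition, so agree on that rule before comparing the result to any benchmark.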
First Reply Time (FRT). How long a customer waits for the first human (or AI) response. Fast first replies lower anxiety, especially on urgent issues. Formula: total first-reply time divided by number of tickets.
Average Resolution Time. The average time from ticket open to close. Useful for spotting workflow bottlenecks, as long as you segment by issue type. Formula: time closed minus time opened, averaged across tickets.
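The two time-based formulas above can be sketched together. This is an illustration under stated assumptions: the `Ticket` record, its field names, and the sample timestamps are hypothetical, and a real helpdesk export will look different.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass
class Ticket:
    issue_type: str
    opened: datetime
    first_reply: datetime
    closed: datetime


def avg_minutes(deltas):
    """Average a series of timedeltas and return the result in minutes."""
    return mean(d.total_seconds() for d in deltas) / 60


# Hypothetical sample data, not real benchmarks.
tickets = [
    Ticket("billing", datetime(2026, 1, 5, 9, 0), datetime(2026, 1, 5, 9, 4), datetime(2026, 1, 5, 10, 0)),
    Ticket("bug", datetime(2026, 1, 5, 9, 30), datetime(2026, 1, 5, 9, 40), datetime(2026, 1, 5, 12, 30)),
    Ticket("billing", datetime(2026, 1, 5, 13, 0), datetime(2026, 1, 5, 13, 1), datetime(2026, 1, 5, 13, 30)),
]

# First Reply Time: total first-reply time divided by number of tickets.
frt = avg_minutes(t.first_reply - t.opened for t in tickets)

# Average Resolution Time: time closed minus time opened, averaged.
art = avg_minutes(t.closed - t.opened for t in tickets)

# Segment resolution time by issue type so one slow category does not hide in the average.
durations_by_type = {}
for t in tickets:
    durations_by_type.setdefault(t.issue_type, []).append(t.closed - t.opened)
art_by_type = {kind: avg_minutes(ds) for kind, ds in durations_by_type.items()}

print(f"FRT: {frt:.1f} min, resolution: {art:.1f} min, by type: {art_by_type}")
```

In this made-up sample the blended average hides the fact that bug tickets take four times as long as billing tickets, which is the reason to segment.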
Average Handle Time (AHT). The time an agent spends on one interaction, including talk, hold, and after-contact work. Track it for capacity planning, but read the traps section before you put it on anyone's scorecard.
Internal Quality Score (QA score). A graded review of a sampled set of tickets against a scorecard covering tone, empathy, process, and whether the issue was truly resolved. It is the counterweight to every speed metric.
Escalation rate. How often tickets get bumped to a higher tier. A climbing rate points to training gaps or unclear ownership, not always to a weak agent. Formula: escalated tickets divided by total tickets, times 100.
Replies per conversation and repeat-contact rate. Two cheap signals of resolution quality. High replies per ticket or repeat contacts about the same issue usually mean the first answer did not stick.
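A rough sketch of escalation rate, replies per conversation, and repeat-contact rate follows. The escalation formula comes from the text; the other two definitions are assumptions made for this example (mean agent replies per ticket, and the share of tickets flagged as a repeat contact about the same issue). All field names and data are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Ticket:
    escalated: bool
    agent_replies: int
    # Assumption: something upstream flags a ticket when the customer
    # comes back about an issue that was already marked solved.
    is_repeat_contact: bool


def percent(count: int, total: int) -> float:
    """Share of total expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * count / total


# Hypothetical sample of five tickets.
tickets = [
    Ticket(escalated=True, agent_replies=2, is_repeat_contact=False),
    Ticket(escalated=False, agent_replies=1, is_repeat_contact=False),
    Ticket(escalated=False, agent_replies=5, is_repeat_contact=True),
    Ticket(escalated=False, agent_replies=1, is_repeat_contact=False),
    Ticket(escalated=False, agent_replies=1, is_repeat_contact=False),
]

escalation_rate = percent(sum(t.escalated for t in tickets), len(tickets))
replies_per_conversation = mean(t.agent_replies for t in tickets)
repeat_contact_rate = percent(sum(t.is_repeat_contact for t in tickets), len(tickets))

print(escalation_rate, replies_per_conversation, repeat_contact_rate)
```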
CSAT, CES, NPS, and DSAT. CSAT captures happiness with a specific interaction, CES measures how hard the customer had to work, NPS tracks long-term loyalty, and DSAT (the inverse of CSAT) pinpoints where things break. CSAT is the day-to-day workhorse.
Why FCR and CSAT move together: research from SQM Group finds that roughly every one percent gain in First Contact Resolution lifts customer satisfaction by about one percent, while making a customer follow up for the same issue can drop their satisfaction by around 15 percent. Speed for its own sake does not buy this. Resolution does.
Utilization and occupancy. Utilization is the share of scheduled time spent available to help; occupancy is the share of logged-in time spent actively handling work. Treat these as team and staffing metrics, not individual report cards.
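To make the difference between the two ratios concrete, here is a small sketch computed at team level, as recommended above. The function names and the minute totals are hypothetical; the 90 percent warning line is taken from the benchmark table in this article.

```python
def utilization(available_minutes: float, scheduled_minutes: float) -> float:
    """Share of scheduled time spent available to help, as a percentage."""
    if scheduled_minutes <= 0:
        raise ValueError("scheduled_minutes must be positive")
    return 100 * available_minutes / scheduled_minutes


def occupancy(handling_minutes: float, logged_in_minutes: float) -> float:
    """Share of logged-in time spent actively handling work, as a percentage."""
    if logged_in_minutes <= 0:
        raise ValueError("logged_in_minutes must be positive")
    return 100 * handling_minutes / logged_in_minutes


# Hypothetical team day: 10 agents scheduled for 480 minutes each.
scheduled = 10 * 480   # 4,800 scheduled minutes
available = 4080       # minutes actually available to help
handling = 3264        # minutes actively handling work while logged in

team_utilization = utilization(available, scheduled)
team_occupancy = occupancy(handling, logged_in_minutes=available)

# Benchmark used in this article: occupancy above 90% predicts burnout.
burnout_risk = team_occupancy > 90

print(f"utilization {team_utilization:.0f}%, occupancy {team_occupancy:.0f}%, burnout risk: {burnout_risk}")
```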
Tickets solved per hour and cost per conversation. Throughput and unit economics. They answer capacity and budget questions, and they tie support directly to a number the finance team recognizes.
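The article does not spell out formulas for these two, so the sketch below assumes the common definitions: tickets solved divided by agent hours worked, and total support cost divided by conversations handled. The figures are invented for illustration.

```python
def tickets_per_hour(tickets_solved: int, agent_hours: float) -> float:
    """Throughput (assumed definition): solved tickets per agent hour worked."""
    if agent_hours <= 0:
        raise ValueError("agent_hours must be positive")
    return tickets_solved / agent_hours


def cost_per_conversation(total_support_cost: float, conversations: int) -> float:
    """Unit cost (assumed definition): total support spend per conversation."""
    if conversations <= 0:
        raise ValueError("conversations must be positive")
    return total_support_cost / conversations


# Hypothetical month: 960 tickets solved in 320 agent hours,
# and $14,400 of total support cost across 1,800 conversations.
throughput = tickets_per_hour(960, 320)
unit_cost = cost_per_conversation(14_400, 1_800)

print(f"{throughput:.1f} tickets/hour, ${unit_cost:.2f} per conversation")
```

What goes into "total support cost" (salaries, tooling, overhead) is a choice to make with finance, and it changes the result more than the arithmetic does.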
Numbers only mean something next to a target. Here are current ranges to aim for. Treat them as starting points, then adjust for your product complexity, because technical B2B tools resolve fewer issues on first contact than simple consumer apps.
| Metric | Good | Elite (top performers) |
|---|---|---|
| First Contact Resolution | 70 to 79% | 80%+ (about the top 5%) |
| CSAT | 75 to 85% | 85%+ |
| Average Handle Time | 4 to 6 min (segment by type) | lower only if quality holds |
| Occupancy | 70 to 85% | below 90% (above predicts burnout) |
| First Reply Time | minutes on chat, hours on email | near-instant with AI assist |
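One way to put the table to work is a small lookup that labels a score against its range. This sketch covers only the two rows with clean numeric thresholds (FCR and CSAT). The tier label "below benchmark" and the choice to count a score of exactly 85 percent CSAT as elite are assumptions made here, since the table lists 85 in both columns.

```python
# (good_floor, elite_floor) in percent, taken from the table above.
BENCHMARKS = {
    "fcr": (70.0, 80.0),
    "csat": (75.0, 85.0),
}


def tier(metric: str, value: float) -> str:
    """Label a percentage score against the benchmark ranges for that metric."""
    good_floor, elite_floor = BENCHMARKS[metric]
    if value >= elite_floor:
        return "elite"
    if value >= good_floor:
        return "good"
    return "below benchmark"


# Hypothetical team scores.
scores = {"fcr": 74.0, "csat": 88.0}
labels = {metric: tier(metric, value) for metric, value in scores.items()}
print(labels)
```

Adjust the floors for your product complexity before using them, since technical B2B tools tend to resolve fewer issues on first contact.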
Some metrics do more harm than good when you use them the wrong way. These three traps catch well-run teams all the time.
Average Handle Time is useful for capacity planning and harmful as an individual KPI. Pressure agents to lower handle time and they rush, skip discovery, and close tickets that bounce right back. Re-contacts rise, First Contact Resolution falls, and CSAT follows it down. The metric improved while the service got worse.
Quality teams have documented this for years. As MaestroQA notes, optimizing one number in isolation invites agents to game it at the expense of the customer. Read handle time next to FCR and quality, never alone.
Any one metric, on its own, gets gamed. CSAT surveys get cherry-picked, ticket counts get padded with easy closes, and AHT gets gamed by transfers. A balanced set protects against all of this, because no single behavior can move every number in the right direction at once.
A spike in escalations after a buggy release is a product signal, not an agent failure. So is a slow reply time during a staffing gap.
When you score agents on outcomes they do not control, you teach your best people to leave. Separate what the agent owns from what the system owns before anyone gets a number.
Most metrics advice is written for high-volume call centers, where the job is to clear thousands of interchangeable contacts as fast as possible. B2B software support is the opposite problem. Lower volume, higher stakes, and known accounts.
Support built for B2B treats every ticket as a window into the health of an account, not a call to be cleared in four minutes.
That changes which metrics you trust. A 12-minute conversation that uncovers a renewal risk is worth more than three two-minute closes.
Average Handle Time, the metric most call-center guides obsess over, is close to meaningless when one ticket can signal six figures of churn.
Lead with First Contact Resolution, CSAT, and a quality score. Then add the metric the other guides miss: account-health contribution.
Did the agent flag the churn risk, surface the upsell, or catch the competitor mention buried in a ticket? In B2B, those signals are the point.
Helply scans every ticket for them and routes churn risk to the CSM and buying signals to the AE automatically, so support produces revenue instead of just deflecting work.
Measuring is the easy part. Moving the numbers takes four habits.
That last habit is where modern tooling earns its keep. Helply's AI assistant drafts every reply with full account context pulled from your CRM, Stripe, and product data, so agents answer faster without cutting corners, at $0.25 per draft. High-confidence tickets can be resolved autonomously at $0.50 per resolution, and you can ask your support data anything in plain language instead of building a report. The helpdesk itself is free, with unlimited seats; you pay only when the AI delivers an outcome.
The math: a 12-seat team on Zendesk Suite Pro with Copilot runs about $3,884 a month. The same team on Helply pays $0 for the helpdesk and only for the AI outcomes it uses, which is why the headline comparison is $3,884 versus $0. Request access to see your own numbers.
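To estimate your own numbers, the usage-based side of that comparison can be sketched as below. The per-draft and per-resolution prices and the $3,884 figure are the ones quoted in this article; the monthly volumes are hypothetical placeholders, and the function name is invented for this example.

```python
PRICE_PER_DRAFT = 0.25        # quoted above, USD per AI-drafted reply
PRICE_PER_RESOLUTION = 0.50   # quoted above, USD per autonomous resolution
SEAT_BASED_MONTHLY = 3884.00  # quoted above, 12 seats on Zendesk Suite Pro with Copilot


def usage_based_monthly_cost(drafts: int, resolutions: int) -> float:
    """Monthly AI spend when the helpdesk itself is free and only outcomes are billed."""
    return drafts * PRICE_PER_DRAFT + resolutions * PRICE_PER_RESOLUTION


# Hypothetical volume: 2,000 AI drafts and 500 autonomous resolutions in a month.
ai_cost = usage_based_monthly_cost(drafts=2000, resolutions=500)
difference = SEAT_BASED_MONTHLY - ai_cost

print(f"usage-based: ${ai_cost:,.2f}, seat-based: ${SEAT_BASED_MONTHLY:,.2f}, difference: ${difference:,.2f}")
```

Swap in your own draft and resolution counts; the result scales with volume, so the comparison shifts as usage grows.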
Good support measurement is not about counting everything. It is about tracking a small, balanced set (FCR, CSAT, quality, and a capacity metric), reading it against real benchmarks, and refusing to let one number define an agent.
For B2B teams, add the account-health signals that turn support into revenue.
The teams that win in 2026 treat their metrics as a revenue lens, not a stopwatch.
If you want a helpdesk that surfaces those signals automatically and only charges when the AI delivers, request access to Helply.
For First Contact Resolution, 70 to 79 percent is good and 80 percent or higher is elite, though technical B2B products often run lower.
For CSAT, aim for 75 to 85 percent, with 85 percent or higher considered top-tier performance.
Average Handle Time is useful for capacity planning but harmful as an individual KPI, because optimizing it in isolation drags down FCR and CSAT.
To measure agent performance, combine resolution (FCR, reply time), quality (a QA score), sentiment (CSAT or CES), and efficiency (occupancy) rather than relying on any single number.
For B2B support, prioritize FCR, CSAT, a quality score, and account-health contribution such as churn or upsell signals surfaced, not raw ticket volume or AHT.
Support metrics turn support into a coachable system and, for B2B teams using Helply, into a revenue signal rather than a cost center.