Why backlog grooming feels like pulling teeth
Your backlog has 347 items. Half of them were added by people who’ve left the company. A third haven’t been touched in six months. And every grooming session turns into a two-hour debate about whether the CEO’s pet feature should jump the queue. Sound familiar?
Backlog grooming—sometimes called backlog refinement—has earned a bad reputation because most teams do it wrong. They treat it as a bureaucratic checkbox, a recurring calendar invite where engineers reluctantly assign story points while checking Slack. But done well, backlog grooming is where product strategy meets execution. It’s how you ensure your team works on what actually matters.
This guide covers how to run grooming sessions that people don’t dread, how to evaluate items without endless debates, and how to keep your backlog lean enough to be useful.
What backlog grooming actually is (and isn’t)
Backlog grooming is the ongoing process of reviewing, refining, and prioritizing the items in your product backlog. It includes:
- Adding detail to upcoming items so they’re ready for development
- Breaking large items into smaller, shippable pieces
- Re-prioritizing based on new information
- Removing items that no longer make sense
- Estimating effort (if your team does that)
What it’s not: a planning session, a sprint commitment meeting, or a place to debate product strategy from first principles. If you’re regularly relitigating your roadmap in grooming, something upstream is broken.
The Scrum Guide renamed this “refinement” years ago, but “grooming” stuck in common usage. Use whichever term your team prefers—the practice matters more than the label.
How often to groom (and for how long)
Earlier editions of the Scrum Guide suggested that refinement usually takes no more than 10% of the team’s capacity (the 2020 edition dropped the figure). For a two-week sprint at 40 hours a week, that’s roughly 8 hours per person, but that doesn’t mean one eight-hour meeting.
Here’s what works for most teams:
Weekly grooming (recommended for most teams)
A 45-60 minute session once per week keeps items flowing without marathon meetings. Schedule it mid-sprint so you’re not competing with planning or retros. Basecamp, before moving away from sprints entirely, found that shorter, more frequent refinement sessions reduced “meeting dread” and kept discussions focused.
Twice-weekly for fast-moving teams
If you’re shipping multiple times per week or dealing with high uncertainty, two 30-minute sessions can work better. This is common at companies like Linear, where the backlog changes frequently based on customer feedback.
Continuous refinement for mature teams
Some teams skip formal grooming sessions entirely. Engineers and PMs refine items asynchronously in Jira, Notion, or Linear, then use sync time only for items that need discussion. This requires high trust and good documentation habits—don’t attempt it until your team has nailed the basics.
Who should be in the room
The worst grooming sessions have too many people. The second-worst have too few.
Always include:
- Product manager (you’re driving)
- Tech lead or senior engineer (technical feasibility and effort)
- Designer (if the items involve UX decisions)
Include when relevant:
- QA lead (for complex items with testing implications)
- The full engineering team (only for estimation or when everyone’s input matters)
Don’t include:
- Stakeholders who’ll derail prioritization with “just one quick question”
- Leadership who should be influencing roadmap, not backlog ordering
- Anyone who hasn’t read the items beforehand
Marty Cagan advocates for a small, empowered product team that can make decisions quickly. Grooming is where that empowerment gets tested. If you need approval from three levels of management to reorder your backlog, you have a bigger problem than this article can solve.
How to evaluate and rank items without drama
Most backlog debates happen because there’s no shared framework for prioritization. People argue based on gut feeling, recency bias, or whoever speaks loudest. Fix this by agreeing on criteria before you start ranking.
Use a lightweight scoring framework
You don’t need anything complex. RICE (Reach, Impact, Confidence, Effort) works well for most teams:
- Reach: How many users/customers will this affect in a given time period?
- Impact: How much will it move the needle for those users? (Use a scale: 3 = massive, 2 = high, 1 = medium, 0.5 = low, 0.25 = minimal)
- Confidence: How sure are you about the above estimates? (100%, 80%, 50%)
- Effort: How many person-weeks will this take?
Score = (Reach × Impact × Confidence) / Effort
Intercom popularized RICE and found it reduced prioritization arguments by giving everyone a shared language. You’re not debating “is this important”—you’re calibrating specific dimensions.
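To make the formula concrete, here is a minimal Python sketch that scores and ranks a few items. The item names and numbers are invented for illustration, and the `BacklogItem` class and `rice_score` function are hypothetical helpers, not part of any tool mentioned here.

```python
from dataclasses import dataclass


@dataclass
class BacklogItem:
    title: str
    reach: float       # users affected in the chosen time period
    impact: float      # 3, 2, 1, 0.5, or 0.25
    confidence: float  # 1.0, 0.8, or 0.5
    effort: float      # person-weeks


def rice_score(item: BacklogItem) -> float:
    """Score = (Reach x Impact x Confidence) / Effort."""
    if item.effort <= 0:
        raise ValueError("effort must be positive")
    return (item.reach * item.impact * item.confidence) / item.effort


# Made-up example items.
items = [
    BacklogItem("Bulk export", reach=500, impact=2, confidence=0.8, effort=4),
    BacklogItem("Onboarding checklist", reach=2000, impact=1, confidence=0.5, effort=2),
    BacklogItem("Dark mode", reach=800, impact=0.5, confidence=1.0, effort=3),
]

# Highest score first.
ranked = sorted(items, key=rice_score, reverse=True)
for item in ranked:
    print(f"{item.title}: {rice_score(item):.1f}")
```

With these numbers the onboarding checklist scores 500, bulk export 200, and dark mode about 133. The useful part is less the final number than the conversation about which input someone disagrees with.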
Separate prioritization from estimation
A common mistake: letting effort estimates drive priority. Yes, a quick win might be worth doing, but don’t let “it’s only two points” become the reason you never tackle important complex work.
Rank items by value first, then factor in effort. A high-value, high-effort item might still outrank a low-value, low-effort one—it just means you need to break it down or staff it appropriately.
Use the “if not now, when?” test
For items that keep hovering in the middle of the backlog, ask: “If we don’t do this in the next 6 weeks, when would we do it?” If the honest answer is “probably never,” either commit to scheduling it or delete it. Purgatory isn’t a prioritization strategy.
When to delete vs. defer items
This is where most PMs fail. Deleting feels wasteful—what if we need that idea later? But a bloated backlog is worse than no backlog. You can’t see what matters when it’s buried under 200 items no one remembers adding.
Delete when:
- The problem no longer exists (market changed, feature shipped differently, workaround emerged)
- The item has been in the backlog for 6+ months untouched
- No one can articulate why this matters to current goals
- The requester has left and no one else cares
- You’ve learned something that invalidates the hypothesis
Defer (move to an “icebox” or separate list) when:
- It’s a good idea but clearly not for this quarter/half
- It depends on something else shipping first
- It’s exploratory and needs more discovery before it’s actionable
At Airbnb, product teams do quarterly “backlog bankruptcy”—a dedicated session to archive anything that hasn’t moved in 90 days. If it’s truly important, it’ll resurface. Most things don’t.
A practical rule: your active backlog should have 2-3 sprints worth of refined items, plus a rough queue for the rest of the quarter. Everything else goes in an icebox or gets deleted.
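The delete and defer rules above can be roughed out as a small triage function. This is a sketch under assumptions: the field names (`still_relevant`, `planned_this_quarter`, and so on) are hypothetical, six months is approximated as 182 days, and judgment calls such as “no one can articulate why this matters” are reduced to a single flag that a human still has to set.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

STALE_AFTER = timedelta(days=182)  # roughly six months


@dataclass
class Item:
    title: str
    last_touched: date
    still_relevant: bool = True        # problem exists and someone can say why it matters
    planned_this_quarter: bool = True
    blocked_by: Optional[str] = None   # something that must ship first
    needs_discovery: bool = False


def triage(item: Item, today: date) -> str:
    """Return 'delete', 'defer', or 'keep' for one backlog item."""
    # Delete rules are checked first: stale or irrelevant items go.
    if not item.still_relevant or today - item.last_touched >= STALE_AFTER:
        return "delete"
    # Defer rules: good ideas that are not actionable right now.
    if (not item.planned_this_quarter
            or item.blocked_by is not None
            or item.needs_discovery):
        return "defer"
    return "keep"


today = date(2025, 6, 1)
items = [
    Item("Legacy importer", last_touched=date(2024, 10, 1)),
    Item("Team billing", last_touched=date(2025, 5, 1), blocked_by="Accounts v2"),
    Item("Bulk export", last_touched=date(2025, 5, 20)),
]
for item in items:
    print(item.title, "->", triage(item, today))
```

Here the legacy importer is deleted (untouched for about eight months), team billing is deferred because it depends on something else, and bulk export stays in the active backlog.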
How to keep the backlog from becoming a graveyard
Healthy backlogs require ongoing maintenance, not just occasional grooming. Here’s how to prevent decay:
Set a size limit
Teresa Torres recommends keeping your working backlog under 50 items. When you hit the limit, you must delete something before adding anything new. This forces continuous triage instead of endless accumulation.
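If your tooling allows scripting, the limit can be enforced rather than just agreed on. The sketch below is one possible shape, assuming a hypothetical `WorkingBacklog` class that holds item titles in memory; a real tracker would need its own API or automation rules.

```python
class BacklogFullError(Exception):
    """Raised when adding an item would exceed the working-backlog limit."""


class WorkingBacklog:
    def __init__(self, limit: int = 50) -> None:
        self.limit = limit
        self.items: list[str] = []

    def add(self, title: str) -> None:
        # Refuse the add until something has been deleted to make room.
        if len(self.items) >= self.limit:
            raise BacklogFullError(
                f"Backlog is at its limit of {self.limit}; delete an item first."
            )
        self.items.append(title)

    def delete(self, title: str) -> None:
        self.items.remove(title)


# A tiny limit keeps the example short.
backlog = WorkingBacklog(limit=2)
backlog.add("Bulk export")
backlog.add("Onboarding checklist")
try:
    backlog.add("Dark mode")
except BacklogFullError as err:
    print(err)
backlog.delete("Bulk export")
backlog.add("Dark mode")  # succeeds now that there is room
```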
Add expiration dates
When you create an item, add a “review by” date 60-90 days out. If it hasn’t been prioritized by then, it triggers an automatic review: still relevant? Delete or reschedule.
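As a rough illustration, the “review by” date is just the creation date plus a fixed window. The sketch below assumes a 90-day window and a plain dictionary of unprioritized items keyed by title; both are illustrative choices, not a feature of any particular tracker.

```python
from datetime import date, timedelta

REVIEW_WINDOW = timedelta(days=90)  # the text suggests 60-90 days


def review_by(created: date, window: timedelta = REVIEW_WINDOW) -> date:
    """Date by which an unprioritized item should be looked at again."""
    return created + window


def due_for_review(created_dates: dict[str, date], today: date) -> list[str]:
    """Titles of items whose review-by date is today or earlier."""
    return sorted(
        title
        for title, created in created_dates.items()
        if review_by(created) <= today
    )


# Made-up items that have not been prioritized yet.
unprioritized = {
    "Bulk export": date(2025, 1, 10),
    "Dark mode": date(2025, 4, 2),
}
print(review_by(date(2025, 1, 10)))                     # 2025-04-10
print(due_for_review(unprioritized, date(2025, 5, 1)))  # ['Bulk export']
```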
Track where items come from
Tag items by source (customer feedback, internal request, analytics insight, stakeholder ask). During backlog grooming, you can quickly filter and audit: “We have 40 items from stakeholder requests but only 5 from customer research. That’s a smell.”
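Once items carry a source tag, the audit is a simple count. A minimal sketch, assuming the backlog can be exported as (title, source) pairs; the items below are invented.

```python
from collections import Counter

# (title, source tag) pairs; the tags mirror the sources named above.
items = [
    ("Bulk export", "stakeholder ask"),
    ("SSO support", "stakeholder ask"),
    ("Custom report colors", "stakeholder ask"),
    ("Fix confusing empty state", "customer feedback"),
    ("Reduce checkout drop-off", "analytics insight"),
]

by_source = Counter(source for _, source in items)
total = sum(by_source.values())

# Most common source first, with its share of the backlog.
for source, count in by_source.most_common():
    print(f"{source}: {count} ({count / total:.0%})")
```

In this made-up sample, stakeholder asks account for three of five items, which is the kind of skew worth raising in the next session.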
Require a problem statement
No item enters the backlog without a clear problem statement: Who has this problem? How do we know? What happens if we don’t solve it? This alone eliminates half of low-quality additions. Lenny Rachitsky’s research found that teams who enforce problem statements spend 30% less time in grooming debates.
Review aging items monthly
Spend 15 minutes once a month looking at items older than 60 days. Either promote them (they’re important, get them into the next few sprints), demote them (move to icebox), or delete them. No item should age indefinitely.
Making grooming work in practice
The best teams treat backlog grooming as a forcing function for clarity. It’s where vague ideas get sharpened, bad ideas get killed, and good ideas get the detail they need to become shippable work.
Start with these changes:
- Cut your grooming invite list to the minimum viable attendees
- Adopt a simple scoring framework and use it consistently
- Schedule 30 minutes this week to delete anything untouched for 6+ months
- Set a backlog size limit and enforce it
Your backlog should feel like a curated queue of your best ideas, not a graveyard of forgotten requests. When someone new joins your team, they should be able to scan the backlog and understand what matters and why. That’s the goal—and consistent, thoughtful grooming is how you get there.
Frequently asked questions
What is backlog grooming (or backlog refinement)?
Backlog grooming (officially called backlog refinement in Scrum) is the ongoing process of reviewing, updating, and prioritizing items in the product backlog to ensure the team always has a ready queue of well-defined work.
How often should you groom the backlog?
For most teams, one 45-60 minute session per week works well; fast-moving teams may prefer two 30-minute sessions. The goal is that the top of the backlog is always ready to pull into the next sprint.
Who should attend backlog grooming?
At minimum: the product manager and a tech lead or senior engineer. Designers join when items involve UX decisions, and a QA lead joins when testing is complex. The whole dev team isn’t always necessary, and outside stakeholders usually shouldn’t attend. Keep the session lean.
What should you do with old backlog items?
If an item has been in the backlog for 6+ months without being prioritized, it should be deleted or archived. A bloated backlog is a distraction. If it was important, it will come back up.
