Stock Screens That Filter Quality vs. Ones That Curve-Fit

Most stock screens fail not because the metrics are wrong, but because the operator mistakes correlation with past returns for evidence of business quality. A screen that backtests beautifully on 2010-2021 data is often a fossil of one regime: cheap money, expanding multiples, and tech-led concentration. The screens that hold up across regimes ask a different question: does this business throw off real cash, defend its margins, and reinvest at a decent rate? Everything else is noise or curve-fit.

Here's how to separate the two.

Quality Screening Metrics That Actually Filter Durable Businesses

These are the inputs that map to something a business actually does, not just how its stock has behaved.

Return on invested capital (ROIC), 5-year average. A single year of high ROIC is often just a cyclical peak. A five-year average smooths out working-capital noise and tells you whether the business earns more than its cost of capital across conditions. A useful rough cut: five-year average ROIC > 15%, and ROIC > WACC (weighted average cost of capital) in every one of those years. Costco, Visa, and Moody's clear this bar for boring reasons: they have structural moats, not lucky quarters.
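The rough cut above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name, the input layout (one decimal per fiscal year), and the requirement of at least five years of history are assumptions, and the sample figures are made up.

```python
from statistics import mean

def passes_roic_screen(roic_by_year, wacc_by_year, min_avg_roic=0.15):
    """Return True if average ROIC clears the floor and ROIC beats WACC every year.

    Both inputs are lists of decimals (0.18 means 18%), one per fiscal year,
    in the same order. The 15% floor is the rough cut from the text.
    """
    if len(roic_by_year) != len(wacc_by_year) or len(roic_by_year) < 5:
        return False  # assumption: insist on a full five-year history
    avg_ok = mean(roic_by_year) > min_avg_roic
    spread_ok = all(r > w for r, w in zip(roic_by_year, wacc_by_year))
    return avg_ok and spread_ok

# Hypothetical inputs, not real company data.
steady = passes_roic_screen([0.18, 0.17, 0.19, 0.16, 0.18], [0.08] * 5)
one_bad_year = passes_roic_screen([0.30, 0.28, 0.05, 0.25, 0.27], [0.08] * 5)
```

The second case averages 23% but still fails, because one year falls below the cost of capital. That is the point of checking every year rather than the average alone.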

Free cash flow conversion. Net income is an opinion; free cash flow is closer to a fact. Screen for free cash flow / net income > 80% over a trailing five-year window. Businesses that consistently fail this test are usually either capex-heavy (which isn't necessarily bad, but you need to know) or running aggressive revenue recognition. Subtracting stock-based compensation from FCF is the single biggest upgrade you can make: it is added back to operating cash flow as a non-cash expense, and most screeners leave it there by default.
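A minimal sketch of the conversion test, with and without the stock-based-comp adjustment. It assumes FCF is operating cash flow minus capex, that capex is entered as a positive number, and that the five-year ratio is computed on cumulative totals rather than as an average of annual ratios. Those are illustrative choices, and the figures are invented.

```python
def fcf_conversion(operating_cash_flow, capex, net_income, stock_based_comp=None):
    """Cumulative FCF / cumulative net income, optionally net of stock-based comp.

    Each argument is a list of annual figures in the same currency units.
    capex is a positive outflow. Returns None if cumulative net income <= 0.
    """
    sbc = stock_based_comp or [0.0] * len(net_income)
    fcf = sum(ocf - cx - s for ocf, cx, s in zip(operating_cash_flow, capex, sbc))
    total_ni = sum(net_income)
    if total_ni <= 0:
        return None  # ratio is not meaningful against cumulative losses
    return fcf / total_ni

# Hypothetical five-year figures for a tech company with heavy SBC.
ocf = [120, 130, 140, 150, 160]
capex = [20, 20, 20, 20, 20]
ni = [100, 105, 110, 115, 120]
sbc = [40, 40, 40, 40, 40]

reported = fcf_conversion(ocf, capex, ni)        # about 1.09, clears the 80% bar
adjusted = fcf_conversion(ocf, capex, ni, sbc)   # about 0.73, fails it
```

The same company passes on reported FCF and fails once SBC is treated as a real cost, which is why the adjustment changes screen output so much for tech names.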

Gross margin stability, not level. A 70% gross margin SaaS company is no better than a 35% gross margin industrial if the industrial's margin stays within ±300bps across a cycle. Stability signals pricing power. Screen for gross margin standard deviation < 200bps over five years before screening for absolute level.
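As a rough sketch, the stability filter is a standard deviation of annual gross margins, expressed in basis points. The choice of population standard deviation (pstdev) is an assumption; a sample standard deviation would come out somewhat larger on five data points. The inputs are hypothetical.

```python
from statistics import pstdev

def gross_margin_stdev_bps(revenue, cogs):
    """Standard deviation of annual gross margin, in basis points."""
    margins = [(r - c) / r for r, c in zip(revenue, cogs)]
    return pstdev(margins) * 10_000  # 1 percentage point = 100 bps

# Hypothetical five-year histories, revenue normalized to 100 each year.
industrial = gross_margin_stdev_bps([100] * 5, [65, 64, 66, 65, 65])  # margins near 35%
saas = gross_margin_stdev_bps([100] * 5, [30, 25, 34, 28, 38])        # margins 62% to 75%

passes = {"industrial": industrial < 200, "saas": saas < 200}
```

Here the lower-margin industrial passes the 200bps test and the higher-margin SaaS name does not, which is the ordering the paragraph argues for.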

Net debt / EBITDA with a sector overlay. Below 2x is conservative for most industries, but the right threshold depends on cash flow stability. A regulated utility at 4x is fine; a cyclical at 4x is a future restructuring.
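One way to express the sector overlay is a lookup of leverage ceilings with a conservative default. In this sketch only the 2x default and the utility-at-4x example come from the text; the sector labels, the inclusive comparison, and the decision to reject non-positive EBITDA outright are assumptions.

```python
# Hypothetical sector ceilings; extend with your own judgment per industry.
MAX_NET_DEBT_TO_EBITDA = {"regulated_utility": 4.0}
DEFAULT_CEILING = 2.0

def passes_leverage_screen(total_debt, cash, ebitda, sector):
    """True if net debt / EBITDA is at or below the ceiling for the sector."""
    if ebitda <= 0:
        return False  # assumption: no leverage multiple is acceptable on negative EBITDA
    net_debt = total_debt - cash  # negative for net-cash companies, which always pass
    ceiling = MAX_NET_DEBT_TO_EBITDA.get(sector, DEFAULT_CEILING)
    return net_debt / ebitda <= ceiling

utility_ok = passes_leverage_screen(400, 0, 100, "regulated_utility")  # 4.0x vs 4.0x ceiling
cyclical_ok = passes_leverage_screen(400, 0, 100, "cyclical")          # 4.0x vs 2.0x default
```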

Curve-Fit Metrics That Look Smart But Filter Nothing

These show up in a lot of "quality" screens and contribute almost nothing once you control for the metrics above.

P/E below the sector median. This is the canonical value trap generator. Sector medians are dragged around by a handful of names, and "cheap relative to sector" usually means the market is pricing in something you missed. If you want to use a multiple, use EV/EBIT and pair it with ROIC: cheap and high-returning is a real signal; cheap alone isn't.

Dividend yield > X%. A high yield is a market verdict, not a quality signal. The yield is high because the price fell, and the price fell for a reason. Screen on dividend growth with a payout ratio sanity check (< 60% for most non-utilities) if you care about income durability.
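A sketch of the replacement filter: dividend growth plus a payout-ratio sanity check. The definition of "growth" used here (dividends per share never cut, and higher at the end than at the start) and the use of the latest year's payout ratio are assumptions. The 60% cap is the figure from the text.

```python
def passes_dividend_screen(dividends_per_share, earnings_per_share, max_payout=0.60):
    """True if DPS was never cut, grew over the window, and latest payout is under the cap."""
    dps = dividends_per_share
    never_cut = all(later >= earlier for earlier, later in zip(dps, dps[1:]))
    grew = never_cut and dps[-1] > dps[0]
    latest_eps = earnings_per_share[-1]
    if latest_eps <= 0:
        return False  # paying a dividend out of losses fails the sanity check
    return grew and dps[-1] / latest_eps < max_payout

# Hypothetical per-share histories.
grower = passes_dividend_screen([1.0, 1.1, 1.2, 1.3, 1.4], [3.0, 3.2, 3.4, 3.6, 3.8])
flat_high_yield = passes_dividend_screen([2.0] * 5, [2.5] * 5)
stretched = passes_dividend_screen([1.0, 1.2, 1.4, 1.6, 1.8], [2.0] * 5)
```

Note that yield never appears as an input. The filter looks only at what the business pays and earns, not at where the share price has gone.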

Revenue growth > 20%. Top-line growth is regime-dependent and easy to buy with discounts, capex, or stock-funded M&A. A growth screen without a unit-economics filter (gross margin trend, operating leverage, customer acquisition cost trends where disclosed) will load you up on the next WeWork.

RSI, moving averages, and any momentum overlay. These aren't bad — they just aren't quality screens. Don't pretend a 200-day moving average crossover is telling you something about the business. Use technicals as a timing layer if you want, but keep them out of your quality filter.

How To Build A Screen That Survives Out-Of-Sample

Three practical rules.

First, screen on five-year averages, not trailing twelve months. TTM screens pick up cycle peaks. If you'd screened US homebuilders on TTM ROE in 2005, you would have loved them. A five-year window doesn't fix this completely, but it raises the bar.

Second, stack filters in order of business quality first, valuation last. Start with ROIC, FCF conversion, and margin stability. Apply leverage and accounting-quality screens next. Then layer valuation. Reversing the order (cheap first, quality second) is how value investors ended up owning newspapers in 2008.
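One caveat worth making concrete: if every filter is a pure pass/fail threshold, the order of application does not change the final list. Order starts to matter when valuation is a ranking step, for example "take the N cheapest". The sketch below assumes that reading; the company records, field names, and the 15% ROIC gate are hypothetical stand-ins for a full quality filter.

```python
def quality_first(universe, is_quality, top_n):
    """Filter on quality, then keep the cheapest top_n survivors by EV/EBIT."""
    survivors = [c for c in universe if is_quality(c)]
    return sorted(survivors, key=lambda c: c["ev_ebit"])[:top_n]

def cheap_first(universe, is_quality, top_n):
    """Keep the cheapest top_n by EV/EBIT, then filter what is left on quality."""
    cheapest = sorted(universe, key=lambda c: c["ev_ebit"])[:top_n]
    return [c for c in cheapest if is_quality(c)]

# Hypothetical universe; names and figures are invented.
universe = [
    {"ticker": "ALPHA", "roic": 0.20, "ev_ebit": 18},
    {"ticker": "BRAVO", "roic": 0.05, "ev_ebit": 6},
    {"ticker": "CHARLIE", "roic": 0.18, "ev_ebit": 12},
    {"ticker": "DELTA", "roic": 0.04, "ev_ebit": 5},
    {"ticker": "ECHO", "roic": 0.22, "ev_ebit": 25},
]

def high_roic(company):
    return company["roic"] > 0.15

good_order = [c["ticker"] for c in quality_first(universe, high_roic, top_n=2)]
bad_order = [c["ticker"] for c in cheap_first(universe, high_roic, top_n=2)]
```

With quality first, the screen returns the two cheapest good businesses. With cheap first, the shortlist is filled by the two lowest-ROIC names and nothing survives the quality check.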

Third, always run a sniff test on your output list. If your screen returns 80 names and 30 of them are Chinese ADRs you've never heard of, the screen is probably picking up reporting differences, not quality. A good screen returns names you mostly recognize, with a few interesting outliers worth researching.

The Backtesting Trap Most Prosumers Walk Into

A screen that backtests to 18% annualized over a decade has almost certainly been over-fit, either by you or by whoever published it. Three tells:

It uses more than four or five filters. Each additional filter is a degree of freedom that can be tuned to the past.
The thresholds are oddly specific (ROIC > 17.3%, debt/equity < 0.62). A real signal still works at round numbers.
Performance is concentrated in one regime — usually 2013-2020 — and falls apart in 2022.

A screen that produces 12% with three filters and works across multiple decades is far more useful than one that backtests to 20% with seven filters.
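The first two tells can be checked mechanically before you look at any returns. In this sketch "oddly specific" is approximated as "not a multiple of a round step for that metric". That heuristic, the step sizes, and the tuple layout are all assumptions, not a standard test. The third tell needs regime-by-regime performance data and is not covered here.

```python
def overfit_warnings(filters, max_filters=5):
    """Flag too many filters and thresholds that are not on a round grid.

    filters: list of (metric_name, threshold, round_step) tuples, where
    round_step is the granularity you would consider "round" for that metric.
    """
    warnings = []
    if len(filters) > max_filters:
        warnings.append(f"{len(filters)} filters is more than {max_filters}")
    for name, threshold, step in filters:
        steps = threshold / step
        if abs(steps - round(steps)) > 1e-9:  # not a whole number of steps
            warnings.append(f"{name} threshold {threshold} is not a multiple of {step}")
    return warnings

# Hypothetical screens: one with round thresholds, one with the tuned values from the text.
clean = overfit_warnings([("roic_pct", 15, 5), ("net_debt_to_ebitda", 2.0, 0.5)])
tuned = overfit_warnings([("roic_pct", 17.3, 5), ("debt_to_equity", 0.62, 0.25)])
```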

What To Watch Next

Audit your current screen. Count the filters. If it's more than five, you're likely curve-fitting.
Replace any TTM metric with a 5-year average where the data is available. This alone removes a lot of cycle noise.
Add a stock-based-comp adjustment to your FCF filter if you screen tech. Most stock screeners don't do this by default, and it changes the output materially.
Re-run your screen across two distinct market regimes (e.g., 2015-2019 and 2021-2024). If the top 20 names overlap by less than half, your screen is regime-dependent, not quality-dependent.
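The regime check in the last item reduces to a set overlap. A minimal sketch, assuming each run of the screen gives you a ranked list of tickers (best first) with at least n names; the ticker strings below are placeholders.

```python
def top_n_overlap(ranked_a, ranked_b, n=20):
    """Fraction of the top n names that appear in both ranked lists."""
    if len(ranked_a) < n or len(ranked_b) < n:
        raise ValueError("both lists need at least n names")
    return len(set(ranked_a[:n]) & set(ranked_b[:n])) / n

# Placeholder output from two hypothetical runs of the same screen.
run_2015_2019 = [f"T{i}" for i in range(20)]      # T0 .. T19
run_2021_2024 = [f"T{i}" for i in range(12, 32)]  # T12 .. T31

overlap = top_n_overlap(run_2015_2019, run_2021_2024)  # 8 shared names out of 20
regime_dependent = overlap < 0.5
```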