May 12, 2026

investing
fundamentals
quantitative

Which share buybacks work best: a step-by-step backtest

An experiment with historical data to distinguish buybacks that truly reduce the share count from those that merely improve EPS and decorate the narrative.

In the previous post, we talked about the theory of share buybacks: what they are, when they create value, when they destroy value, and why looking only at EPS can be a fairly sophisticated way of fooling yourself.

If that post was the recipe, this is the practical cooking class.

It is one thing to say "making a Spanish omelette is easy". It is quite another to stand in front of the pan, with the oil hot, at that critical moment when you have to flip it. Something similar happens with buybacks.

In theory, a good buyback should do three things:

consider the opportunity cost
be done at a reasonable price
be financed responsibly

Then we run into real life.

A real buyback comes with share issuances, stock-based compensation, debt, acquisitions paid for with shares, buybacks that reduce nothing, managers who repurchase because it looks nice in the quarterly presentation, and companies that buy at expensive prices with a conviction that can only be explained by bonuses, ego, or both.

So our mission is to answer this question using historical data:

what kind of buyback contains useful information, and what kind of buyback is just financial decoration?

And, so nobody gets lost, we are going to go slowly and step by step.

How did we set up the experiment?

We are going to use Portfolio123 and its FactSet data.

Since we are quite limited by the tools, the goal is not to prove causality. We are not going to prove that one specific buyback causes one specific return, because that requires quite a bit more academic rigor.

The goal is more modest: to look for signals.

In other words, to see whether the theory from the previous post points in the right direction or whether, instead, we have built a magnificent sandcastle.

The base universe is Easy To Trade USA.
When we talk about the benchmark, we mean SPY.

We evaluate the universe every four weeks between 2006 and 2026 and check whether variables associated with buybacks help rank the universe by future returns.

To do this, we use decile rankings. On each rebalance date, companies are ranked according to the relevant factor and grouped into ten equal-weight portfolios. The idea is simple: if the factor contains useful information, the higher deciles should do better than the lower ones in a reasonably consistent way.
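As a rough sketch of what that grouping looks like in code, we can sort the companies by factor value and cut the sorted list into ten buckets of nearly equal size. This is not how Portfolio123 implements it: the function names `assign_deciles` and `equal_weight_return` are made up for illustration, and the tie-breaking rule is an assumption.

```python
def assign_deciles(factor_by_ticker, n_buckets=10):
    """Map each ticker to a bucket from 1 (lowest factor) to n_buckets (highest).

    Assumption: ties are broken by ticker name and buckets are cut by rank,
    which may differ from what a real backtesting platform does.
    """
    ordered = sorted(factor_by_ticker, key=lambda t: (factor_by_ticker[t], t))
    n = len(ordered)
    return {
        ticker: (position * n_buckets) // n + 1
        for position, ticker in enumerate(ordered)
    }


def equal_weight_return(returns_by_ticker, tickers):
    """Return of one bucket for one period: every stock gets the same weight."""
    tickers = list(tickers)
    return sum(returns_by_ticker[t] for t in tickers) / len(tickers)
```

On each rebalance date we would call `assign_deciles` with the factor values available on that date, and then `equal_weight_return` on each bucket over the following four weeks.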

Returns include dividends and are calculated before taxes, transaction costs, and slippage. The data is point-in-time and includes delisted stocks.

To avoid turning this section into a technical manual, I leave the exact formulas and the composition of each ranking in the methodological appendix. Here we will stick with the important idea: we are going to check whether buybacks rank the universe better when we add real share count reduction, FCF yield, and debt control.

And with that, we have all the ingredients. Now it is time to start cooking. Can we find signal in share buybacks?

Experiment

Do companies that buy back more stock do better?

We start with the most obvious thing: net buyback yield. This metric compares the cash a company spent buying back stock, net of the cash it raised by issuing shares, with its market cap.

Net buyback yield = (amount spent on buybacks - amount raised through share issuance) / market capitalization

It is a good starting point, but we need to keep in mind that it does not tell us whether the number of diluted shares actually goes down, because dilution can happen in other ways. It only tells us that, net of reported issuances, there is buyback flow.
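To make the formula concrete, here is a minimal sketch with a made-up company. The function name `net_buyback_yield` and the figures are illustrative, not taken from the backtest.

```python
def net_buyback_yield(buybacks, issuance, market_cap):
    """(cash spent on buybacks - cash raised by issuing shares) / market cap."""
    if market_cap <= 0:
        raise ValueError("market_cap must be positive")
    return (buybacks - issuance) / market_cap


# Hypothetical company: 500M repurchased, 100M issued, 10B market cap.
print(f"{net_buyback_yield(500e6, 100e6, 10e9):.1%}")  # 4.0%
```

A company that issues more than it repurchases gets a negative yield, which is what pushes net issuers toward the low deciles.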

What do we expect to see?

If buybacks contain signal, as we expect, the high deciles of BBY_Net should do better than the low deciles. In other words, returns should trend upward from the lowest decile to the highest, with a meaningful gap between the two ends.

Result:

Factor	Top 20% CAGR	Bottom 20% CAGR	Spread	Best decile
`Net BBY`	10.35%	1.51%	+8.84 pp	Decile 9

There is a big difference between the companies that score best on "net buyback yield" and those that score worst. So it does look like there is some signal.
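For readers who want to reproduce the columns of the table, this is a small sketch of the arithmetic, assuming the spread is simply the top CAGR minus the bottom CAGR (which matches the numbers shown). The helper names `cagr` and `spread_pp` are made up for illustration.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a decimal (0.10 means 10% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


def spread_pp(top_cagr_pct, bottom_cagr_pct):
    """Difference between two CAGRs, expressed in percentage points."""
    return top_cagr_pct - bottom_cagr_pct


# Values from the table: top 20% at 10.35%, bottom 20% at 1.51%.
print(round(spread_pp(10.35, 1.51), 2))  # 8.84
```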

But be careful. Not everything is as pretty as it seems.

Decile ranking

CAGR by net buyback yield

A metric capable of ranking the universe should show a monotonic trend.

SPY: 10.64% | Universe: 6.81%

Here we can see that it is not really clear what is wheat and what is chaff. It more or less separates good stocks from bad stocks, but not with much precision in the middle deciles.
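One way to put a number on "how well ordered are the deciles" is the Spearman rank correlation between the decile number and the decile's CAGR. This is my own illustration, not a statistic reported in the backtest. The function name is hypothetical, the sample values are invented, and the sketch assumes no two deciles share exactly the same CAGR.

```python
def spearman_vs_decile(cagr_by_decile):
    """Spearman rank correlation between decile position and decile CAGR.

    +1 means returns rise strictly from the first decile to the last,
    -1 means they fall strictly. Assumption: no tied CAGR values.
    """
    n = len(cagr_by_decile)
    if n < 2:
        raise ValueError("need at least two deciles")
    order = sorted(range(n), key=lambda i: cagr_by_decile[i])
    rank = [0] * n
    for r, i in enumerate(order):
        rank[i] = r
    d2 = sum((i - rank[i]) ** 2 for i in range(n))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Invented CAGRs (in %) for deciles 1..10, NOT the values from the chart.
noisy = [1.0, 4.0, 3.0, 6.0, 5.0, 7.0, 6.5, 8.0, 10.0, 9.0]
print(spearman_vs_decile(noisy))
```

A value close to 1 with a messy middle is exactly the pattern described above: the extremes are separated, but the middle deciles are not cleanly ordered.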

This is to be expected. As we already explained, buying and reducing shares is not enough for a buyback to be good. Sometimes companies repurchase with poor price discipline, sometimes they do it simply to offset SBC dilution, sometimes they weaken the balance sheet, etc. We are going to correct these problems step by step.

The first thing we are going to improve: net buyback yield measures recent net buyback flow, but it does not guarantee that the shareholder owns a larger percentage of the business.

We need to look at whether the share count actually goes down.

What if there are fewer diluted shares after the buyback?

If a company repurchases shares and there are NOT fewer shares outstanding, the buyback is more narrative than economics. To study this effect, we are going to add:

ShrRedFD_3Y: reduction in fully diluted shares over three years.

With the formula:

3-year diluted share reduction = 100 x (1 - current diluted shares / diluted shares 3 years ago)

We use diluted shares because they better capture options, RSUs, convertibles, and other instruments that dilute the shareholder. We use three years because a single year can be contaminated by timing, SBC, one-off issuances, or acquisitions.

It is not perfect, because a three-year window may react late to recent opportunistic buybacks. But in exchange, we get a more consistent signal.
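A minimal sketch of the formula, with two made-up companies to show the sign convention. The function name `diluted_share_reduction_3y` and the share counts are illustrative.

```python
def diluted_share_reduction_3y(current_diluted, diluted_3y_ago):
    """100 x (1 - current diluted shares / diluted shares 3 years ago)."""
    if diluted_3y_ago <= 0:
        raise ValueError("diluted_3y_ago must be positive")
    return 100 * (1 - current_diluted / diluted_3y_ago)


# Hypothetical company A: 100M diluted shares three years ago, 90M today.
print(f"{diluted_share_reduction_3y(90e6, 100e6):.1f}")  # 10.0

# Hypothetical company B: it also repurchases, but SBC and issuances
# outweigh the buyback, so the diluted count grows from 100M to 104M.
print(f"{diluted_share_reduction_3y(104e6, 100e6):.1f}")  # -4.0
```

Company B is the cosmetic case: it can show a positive net buyback yield and still leave the remaining shareholder with a smaller slice of the business.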

Result:

Ranking	Top 20% CAGR	Bottom 20% CAGR	Spread	Best decile
`Net BBY`	10.35%	1.51%	+8.84 pp	9
`BBY + ShrRedFD_3Y`	10.93%	0.54%	+10.40 pp	10

Decile ranking

CAGR by BBY + real share reduction

The message improves: buybacks matter more when diluted shares truly go down.

SPY: 10.64% | Universe: 6.81%

The signal improves: the spread widens from +8.84 pp to +10.40 pp, and the best decile is now decile 10 instead of decile 9. We are doing a decent job of ranking the stocks that historically had better or worse future returns after each rebalance.

This is the important part: when we require a real reduction in diluted shares, we filter out cosmetic buybacks and move closer to buybacks that increase the economic ownership of the shareholder who stays. In other words, if we remove the "fake buybacks" that do not reduce the share count, the signal improves.

But it is still not enough. We can do better by applying the theory we already explained. Reducing shares is not enough if:

the company does it with poor financing
the company does it at bad prices

Which takes us to the next step of our investigation.

Free Cash Flow Yield

To try to improve the signal even further, and to capture the performance of "good buybacks", we are going to add Free Cash Flow Yield (FCFY) as a parameter.

This metric serves two purposes:

It introduces a measure of cash generation, meaning credible financing.
It starts connecting the buyback with valuation.

Buying back stock with abundant FCF is not the same as buying back stock with scarce FCF, just as buying back at high prices is not the same as buying back at low prices. Financing and valuation.
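A small sketch of the financing side, using the definition from the appendix (free cash flow divided by market cap). The coverage check is my own illustration and is not part of the ranking. The function names and the figures are made up.

```python
def fcf_yield(free_cash_flow, market_cap):
    """Free cash flow divided by market capitalization."""
    if market_cap <= 0:
        raise ValueError("market_cap must be positive")
    return free_cash_flow / market_cap


def buyback_covered_by_fcf(net_buybacks, free_cash_flow):
    """True when free cash flow alone pays for the net buyback."""
    return free_cash_flow >= net_buybacks


# Two hypothetical companies, both with a 10B market cap and 400M of net buybacks.
print(f"{fcf_yield(800e6, 10e9):.1%}", buyback_covered_by_fcf(400e6, 800e6))  # 8.0% True
print(f"{fcf_yield(100e6, 10e9):.1%}", buyback_covered_by_fcf(400e6, 100e6))  # 1.0% False
```

Both companies have the same net buyback yield, 4%, but only the first one pays for it out of the cash the business generates. Since both yields share the same denominator, a net buyback yield above the FCF yield means the buyback is being funded from somewhere else.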

Result:

Ranking	Top 20% CAGR	Bottom 20% CAGR	Spread	Best decile
`Net BBY`	10.35%	1.51%	+8.84 pp	9
`BBY + ShrRedFD_3Y`	10.93%	0.54%	+10.40 pp	10
`BBY + ShrRedFD_3Y + FCFY`	11.66%	-1.47%	+13.13 pp	10

Decile ranking

CAGR by BBY + real reduction + FCF yield

The separation improves: the bottom 20% falls into negative territory and decile 10 remains the best decile.

SPY: 10.64% | Universe: 6.81%

The curve improves again. A real buyback becomes more interesting when it appears alongside FCF yield. This suggests that the market distinguishes, at least partially, between companies that buy back stock with enough cash and at reasonable prices, and companies that buy back stock because it sounds good to say so.

But we are not going to stop here. We have already said that FCF yield helps with the valuation leg and also with the financing leg, but let's see whether we can improve the signal by adding information about debt.

Net debt to FCF

If we are levered up to the eyeballs, is it really a good idea to spend that money on buybacks?

That is what we are going to answer. The new question is:

does the buyback look reasonable relative to the balance sheet?

We add a new metric:

Net debt to FCF = net debt / free cash flow

This metric is more of a guardrail. We want to avoid companies that buy back stock while the balance sheet is stretched, which is a bad idea and a very stupid way to destroy the company if things go wrong.
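A minimal sketch of the ratio, assuming the usual definition of net debt as total debt minus cash. The edge-case handling is my own assumption and not necessarily what the backtest does: a company with net cash gets 0, and a company with debt but no positive FCF gets infinity so that it sorts as the worst case. The function name and the figures are made up.

```python
import math


def net_debt_to_fcf(total_debt, cash, free_cash_flow):
    """Net debt / FCF: roughly the years of FCF needed to repay net debt."""
    net_debt = total_debt - cash
    if net_debt <= 0:
        return 0.0  # assumption: net cash means no debt burden
    if free_cash_flow <= 0:
        return math.inf  # assumption: debt with no FCF is the worst case
    return net_debt / free_cash_flow


# Hypothetical company: 3B of debt, 1B of cash, 500M of FCF.
print(net_debt_to_fcf(3e9, 1e9, 500e6))  # 4.0
```

In a ranking, lower values are better, so this factor has to be sorted in the opposite direction from buyback yield or FCF yield.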

Result:

Ranking	Top 20% CAGR	Bottom 20% CAGR	Spread	Best decile
`Net BBY`	10.35%	1.51%	+8.84 pp	9
`BBY + ShrRedFD_3Y`	10.93%	0.54%	+10.40 pp	10
`BBY + ShrRedFD_3Y + FCFY`	11.66%	-1.47%	+13.13 pp	10
`BBY + ShrRedFD_3Y + FCFY + debt`	11.59%	-1.80%	+13.39 pp	10

Decile ranking

CAGR by BBY + real reduction + FCF yield + debt

When we add debt, the top 20% barely changes, but the bottom 20% gets worse.

SPY: 10.64% | Universe: 6.81%

The improvement comes not from pushing up the top 20%, which barely moves (11.66% to 11.59%), but from separating the bottom 20% better (-1.47% to -1.80%). Put differently: controlling for debt helps avoid mistakes.

Next steps

There is still a lot to do:

If we look at the comparison against the S&P 500, we realize how hard it is to beat the index.

The best deciles do manage to beat SPY, but the edge is not large enough to declare victory. We still need to look at volatility, drawdowns, transaction costs, temporal robustness, and exposure to other factors. In other words: there is signal, but there is still no strategy.

We need to verify that the results are robust.

Is it just a matter of the selected time period? Is the "profitability" factor generating this alpha? Does the story change if we tweak the parameters a little? These are things we need to test to make sure it was not just luck.

We can refine the signal much further.

What if we subtract SBC from FCF, as many investors do? What if we combine it with insider buying signals? What if we improve how we measure whether the shares are undervalued?

In another post, we will keep refining the signal, run robustness tests, and understand how we can apply this information in our investment strategies.

Stay tuned so you do not miss anything.

Conclusion

Buybacks are not magic.

Net buyback yield contains signal, but also a lot of noise. The separation between deciles increases when the buyback truly reduces the number of diluted shares, when the company generates enough FCF, and when the balance sheet is not strained.

Added signal	What it corrects	What happens
Net buyback yield	measures net buyback flow	there is signal, but it is noisy
Diluted share reduction	filters out cosmetic buybacks	the ranking orders the universe better
FCF yield	adds available cash and valuation	improves the separation between good and bad deciles
Net debt / FCF	avoids stretched balance sheets	mainly improves the identification of the bottom

Put differently: do not look for companies that buy back a lot, look for companies that buy back well.

Methodological appendix

The main variables in the experiment are these:

Variable	What it measures	Formula
`BBY_Net`	recent net buyback	buybacks net of share issuance, divided by market capitalization
`ShrRedFD_3Y`	real reduction in diluted shares	percentage decline in fully diluted shares over three years
`FCFY`	cash generation capacity and valuation	free cash flow divided by market capitalization
`NetDebtToFCF`	balance sheet risk	net debt divided by free cash flow

None of these variables is a strategy on its own. What we study is their relationship with future returns: we use decile rankings to check whether they can separate the wheat from the chaff, meaning the best companies from the worst ones and everything in between.

The rankings used in the experiment are built like this:

Ranking	Composition
`Net BBY`	100% net buyback yield
`BBY + ShrRedFD_3Y`	50% net buyback yield + 50% three-year diluted share reduction
`BBY + ShrRedFD_3Y + FCFY`	34% net buyback yield + 33% three-year diluted share reduction + 33% free cash flow yield
`BBY + ShrRedFD_3Y + FCFY + debt`	30% net buyback yield + 30% three-year diluted share reduction + 25% free cash flow yield + 15% lower net debt to FCF

The final ranking no longer weights its factors equally, because debt plays a different role: filtering out the worst companies. BBY and ShrRedFD_3Y are the core of the thesis (net buyback plus real share reduction), which is why they carry the highest weights, 30% each. FCFY is the first important economic control, checking that there is cash flow and a reasonable valuation, so its weight is high but slightly lower, at 25%. Debt, by contrast, works more as a risk guardrail than as the main alpha engine, which is why it stays at 15%.
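To show how weights like these can turn into a single score, here is a sketch that assumes each factor is first converted into a percentile rank across the universe and the ranks are then combined as a weighted average. That is a common way to build a composite ranking, but the exact mechanics inside the platform may differ. The names `WEIGHTS`, `percentile_ranks`, and `composite_scores`, and the three sample companies, are made up for illustration.

```python
# Weights of the final ranking, as listed in the table above.
WEIGHTS = {
    "bby_net": 0.30,
    "shr_red_3y": 0.30,
    "fcfy": 0.25,
    "net_debt_to_fcf": 0.15,
}
LOWER_IS_BETTER = {"net_debt_to_fcf"}


def percentile_ranks(values, lower_is_better=False):
    """Rank values from 0 (worst) to 100 (best).

    Assumption: tied values are simply ordered by their position in the list.
    """
    n = len(values)
    if n == 1:
        return [100.0]
    order = sorted(range(n), key=lambda i: values[i], reverse=lower_is_better)
    ranks = [0.0] * n
    for position, i in enumerate(order):
        ranks[i] = 100.0 * position / (n - 1)
    return ranks


def composite_scores(factors):
    """factors maps each factor name to a list of values, one per company."""
    n = len(next(iter(factors.values())))
    scores = [0.0] * n
    for name, weight in WEIGHTS.items():
        ranks = percentile_ranks(factors[name], name in LOWER_IS_BETTER)
        for i in range(n):
            scores[i] += weight * ranks[i]
    return scores


# Three hypothetical companies: the first is best on every factor.
sample = {
    "bby_net": [0.05, 0.02, -0.01],
    "shr_red_3y": [8.0, 3.0, -4.0],
    "fcfy": [0.07, 0.04, 0.01],
    "net_debt_to_fcf": [0.5, 2.0, 6.0],
}
print(composite_scores(sample))
```

The composite score is what gets sorted into deciles on each rebalance date. Because debt carries only 15% of the weight, a stretched balance sheet drags a company down without being able to decide the ranking on its own.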
