A long form note on systematic short volatility strategies. Where they fail, where they succeed, and what I would actually deploy. The premise was simple: I wanted to know whether the wheel strategy was real alpha or marketing. What followed was a year of backtesting, two layers of Monte Carlo validation, an extended history test through the dot com crash and the GFC, a live paper trading harness running against real option chains, and a long list of approaches that did not work. What I found surprised me.
The wheel strategy keeps showing up in retail finance discussions. Sell cash secured puts on stocks you would not mind owning, take assignment if the stock falls below your strike, sell covered calls against those shares until they get called away, repeat. The pitch is that you get paid premium for being patient. I wanted to know if this was real alpha or marketing.
My personal benchmark was always QQQ buy and hold. From 2010 through 2024, QQQ buy and hold returned roughly 18.5 percent CAGR with a worst drawdown of minus 35 percent during 2022 and a worst case recovery time of roughly two years. That is the bar. Anything I built had to do better than that on a risk adjusted basis, or there was no point.
My constraint was specific. I told myself that a drawdown of 30 or even 50 percent was acceptable as long as the strategy could characterize it as normal behavior and the math supported recovery. What I would not tolerate was a strategy that could permanently impair my capital. So the early question I was asking was not "what is the highest CAGR I can achieve" but rather "what is the highest CAGR I can achieve without an account blow up risk."
I wanted a system that I could leave running and trust. A system that had a known failure mode I could prepare for, not a hidden one that would surprise me.
An option is a contract. The buyer pays a premium today for the right, but not the obligation, to do something at a future date. The seller collects the premium today in exchange for taking on the obligation. There are two types of basic options.
The seller of either option collects the premium up front but takes on the corresponding obligation. The buyer is essentially purchasing insurance or leverage. The seller is essentially providing it.
On average, across long historical samples, the premium that option buyers pay is more than the eventual cost of the obligation. This gap is called the volatility risk premium. Two reasons it exists.
First, hedging demand. Pension funds, insurance companies, and mutual funds need to hedge their long equity exposure. They buy index puts as portfolio insurance. There is structurally more demand for these puts than there are sellers, so the price gets bid up. The expected payout on those puts is lower than what buyers pay.
Second, lottery preference. Retail traders and speculators buy calls hoping for outsized returns. They are systematically willing to overpay for the lottery ticket profile of out of the money calls. Same effect: the price exceeds the rational expected value.
The empirical size of this gap, on the S&P 500 index, is around 2 to 4 percentage points per year: implied vol runs that far above the vol that is subsequently realized. That is the structural edge that any short volatility strategy is trying to harvest. Everything else is execution detail.
Before I built anything, I wanted to know how big the premium gap actually is on the tickers I would be trading. So I pulled live option chains and computed the ratio of market implied vol to the underlying's trailing realized vol. This is the practical version of the academic VRP measurement.
Three things are worth noting here.
Index ETFs (SPY, QQQ, IWM) command the largest premium because every pension fund and mutual fund hedges with index puts. The 1.30x multiplier is roughly textbook for the SPX put premium.
Single name stocks command much less premium, around 1.10x, because there is less hedging demand for them. Pension funds do not buy AAPL puts, they buy SPX puts.
NVDA is anomalous. Its implied vol trades at almost exactly the same level as realized. The reason is that retail call buying has been so intense that market makers can hedge calls cheaply against existing demand, suppressing the put side premium that normally lifts the average IV.
I used these ticker specific multipliers throughout the backtest, calibrated against current chains. They are not arbitrary numbers.
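The implied to realized ratio described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual calibration code: the function names, the 252 day annualization, the alternating toy price series, and the 0.21 implied vol are all invented for the example. In practice the closes and the implied vol would come from a live data source.

```python
import math

def realized_vol(closes, trading_days=252):
    """Annualized close-to-close realized volatility from a list of prices."""
    log_returns = [math.log(b / a) for a, b in zip(closes, closes[1:])]
    mean = sum(log_returns) / len(log_returns)
    variance = sum((r - mean) ** 2 for r in log_returns) / (len(log_returns) - 1)
    return math.sqrt(variance * trading_days)

def vol_uplift(implied_vol, closes):
    """Ratio of market implied vol to trailing realized vol (the multiplier)."""
    return implied_vol / realized_vol(closes)

# Hypothetical 30 day history alternating +1% and -1% daily moves.
closes = [100.0]
for day in range(30):
    closes.append(closes[-1] * (1.01 if day % 2 == 0 else 0.99))

rv = realized_vol(closes)
# 0.21 is a made-up at-the-money implied vol, not a quoted market value.
print(round(rv, 4), round(vol_uplift(0.21, closes), 2))
```

A ratio above 1.0 means the market is charging more vol than the underlying has recently delivered, which is the gap the multipliers are meant to capture.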
The cheapest historical options data available costs roughly 30 dollars per month from Polygon. I wanted to start without that, so I built a synthetic engine. The logic is straightforward.
On every trading day in the simulation, I take the underlying's price from yfinance, compute its trailing 30 day realized volatility, multiply by the ticker specific uplift from the previous section to get a synthetic implied vol, and use that to price options via Black Scholes. The strike I select is whichever strike has the target delta. For example, a 35 delta put on QQQ is the strike that has roughly a 35 percent probability of finishing in the money, by Black Scholes assumptions.
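The pricing step can be sketched like this. It is a simplified illustration, assuming European exercise, no dividends, a zero interest rate by default, and a continuous strike grid (real chains only list discrete strikes). The function names and the sample inputs (spot 500, 20 percent realized vol, 14 days) are hypothetical and not taken from the engine.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution

def synthetic_iv(realized_vol, uplift):
    """Synthetic implied vol = trailing realized vol times the ticker uplift."""
    return realized_vol * uplift

def bs_put(spot, strike, sigma, t_years, rate=0.0):
    """Black-Scholes price and delta of a European put, no dividends."""
    sqrt_t = math.sqrt(t_years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * t_years) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    price = strike * math.exp(-rate * t_years) * STD_NORMAL.cdf(-d2) - spot * STD_NORMAL.cdf(-d1)
    delta = STD_NORMAL.cdf(d1) - 1.0
    return price, delta

def put_strike_for_delta(spot, target_delta, sigma, t_years, rate=0.0):
    """Solve N(d1) - 1 = target_delta for the strike (target_delta is negative)."""
    d1 = STD_NORMAL.inv_cdf(1.0 + target_delta)
    sqrt_t = math.sqrt(t_years)
    return spot * math.exp((rate + 0.5 * sigma ** 2) * t_years - d1 * sigma * sqrt_t)

# Hypothetical inputs: spot 500, 20% realized vol, 1.30x uplift, 14 days out.
sigma = synthetic_iv(0.20, 1.30)
strike = put_strike_for_delta(500.0, -0.35, sigma, 14 / 365)
price, delta = bs_put(500.0, strike, sigma, 14 / 365)
print(round(strike, 2), round(price, 2), round(delta, 3))
```

Inverting the delta formula in closed form avoids a root search. A version that works against real chains would instead pick the listed strike whose delta is closest to the target.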
The simulation runs once per day. Each iteration handles three things in order:
I want to be explicit about what this engine cannot model. It does not capture bid ask widening during crises. It does not model volatility skew (the fact that puts trade at higher IV than calls in real markets). It does not handle pin risk at expiration. It assumes I can always exit at intrinsic value. These are real limitations and I quantify their impact later.
Imagine you would not mind owning 100 shares of QQQ at a 5 percent discount to today's price. The wheel lets you get paid to be patient about that purchase.
You sell a cash secured put with a strike about 5 percent below the current price. You collect a premium. The buyer of that put has the right to sell you those shares at the strike on the expiration date. Three things can happen.
Once you own shares, you sell a covered call with a strike about 5 percent above your cost basis. You collect another premium. Three things can happen.
I ran the classic wheel on QQQ from 2010 to 2024, selling 35 delta puts at 14 days to expiration. The premium uplift was 1.30x realized vol from the calibration above. Here is what came out.
| Metric | QQQ Buy and Hold | Classic Wheel 35d / 14DTE |
|---|---|---|
| CAGR | 18.51% | 18.40% |
| Sharpe ratio | 0.74 | 1.10 |
| Maximum drawdown | -35.1% | -22.3% |
The wheel roughly matched buy and hold on CAGR (18.40 versus 18.51 percent), but it had a meaningfully better Sharpe ratio and a smaller worst drawdown. So it was not worthless. But it was not what the marketing suggested either.
The reason is structural. The 14 year window from 2010 to 2024 was an unusually strong bull market for tech. QQQ went up roughly 10x in that period. Every time the wheel had me assigned and selling covered calls, the calls would get exercised and I would sell the shares at the call strike, capturing a small profit but missing the next 20 percent of the rally. The covered call leg is a structural tax on upside in a trending market.
If the covered call is the problem, what happens if I never take assignment? I built a variant called CSP only. When the put expires in the money, I just cash settle the loss and immediately open a new put at a fresh delta. I never hold the underlying.
The result was worse on CAGR. The reason is that when QQQ takes a drawdown and comes back, the wheel captures the share recovery as gains on the underlying position. CSP only does not own anything. It just keeps selling premium against a falling and recovering price, which is roughly a coin flip in net effect.
| Metric | Classic Wheel | CSP only |
|---|---|---|
| CAGR | 18.40% | 12.30% |
| Max drawdown | -22.3% | -19.9% |
What CSP only does give you is a tighter, more predictable drawdown profile because you never carry an underwater stock position. The recovery time is also shorter. So there is a use case, but it is not the CAGR maximizing answer.
One obvious refinement to the wheel is to avoid selling puts during obvious downtrends. The classic problem with cash secured puts is that they make you a catcher of falling knives. If QQQ is in a clear bear market, selling puts means agreeing to buy at a strike that is likely to be far above future spot. Painful.
So I added a simple momentum filter. Only open new puts when the underlying is above its 50 day simple moving average. If it falls below, I hold existing positions to expiration but do not open new ones. This is the "wheel plus momentum" variant.
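The entry gate described above amounts to a few lines. This is a sketch with hypothetical function names. It assumes that when there is not yet enough history to compute the average, no new put is opened.

```python
def sma(values, window):
    """Simple moving average of the last `window` values, or None if too short."""
    if len(values) < window:
        return None
    return sum(values[-window:]) / window

def may_open_new_put(closes, window=50):
    """True only when the latest close is above its simple moving average."""
    average = sma(closes, window)
    return average is not None and closes[-1] > average

# Toy price paths, invented for illustration.
rising = [float(p) for p in range(1, 61)]
falling = list(reversed(rising))
print(may_open_new_put(rising), may_open_new_put(falling))
```

Existing positions are untouched by this check. It only decides whether a fresh put may be opened on a given day.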
| Metric | Classic Wheel | Wheel + Momentum |
|---|---|---|
| CAGR | 18.40% | 16.67% |
| Max drawdown | -22.3% | -22.8% |
On QQQ, the trend filter reduces CAGR by about 1.7 percentage points (because you skip some of the recovery rallies that begin while price is still below the SMA), and it does not tighten the max drawdown, which is essentially unchanged at minus 22.8 percent versus minus 22.3 percent. The bigger payoff for momentum filtering, as I later found, is on the leveraged ETFs where bear regimes are existential.
At this point in the project I was disappointed. The basic wheel strategies I had been testing were not really beating buy and hold on a clear risk adjusted basis. I needed to think differently.
The short strangle is structurally different from the wheel. Instead of selling a put and waiting to potentially own shares, you simultaneously sell an out of the money put AND an out of the money call. You never own the underlying. Both legs are cash settled at expiration.
What makes the strangle interesting is that it has no upside cap from a covered call leg. Once shares get assigned in the wheel, you cap your upside. In a strangle, you never own shares. The trade off is that you have unbounded loss potential on the call side (technically the stock can rocket to infinity) and bounded loss on the put side (a stock cannot fall below zero).
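The asymmetry between the two sides is easy to see in the payoff at expiration. The sketch below uses invented strikes (95 and 105) and an invented total premium of 2.0 per share. It assumes cash settlement and ignores fees and early exercise.

```python
def short_strangle_pnl(spot_at_expiry, put_strike, call_strike, premium):
    """Per-share profit of a short strangle held to expiration."""
    put_loss = max(put_strike - spot_at_expiry, 0.0)
    call_loss = max(spot_at_expiry - call_strike, 0.0)
    return premium - put_loss - call_loss

# Hypothetical strangle: short 95 put, short 105 call, 2.0 total premium.
for spot in (0.0, 90.0, 100.0, 120.0, 200.0):
    print(spot, short_strangle_pnl(spot, 95.0, 105.0, 2.0))
```

Between the strikes the full premium is kept. The put side loss stops growing once the price reaches zero, while the call side loss keeps growing as the price rises.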
I ran a 30 delta strangle at 14 days to expiration on QQQ.
| Metric | QQQ Buy and Hold | Strangle 30d / 14DTE |
|---|---|---|
| CAGR | 18.51% | 20.70% |
| Sharpe ratio | 0.74 | 1.55 |
| Maximum drawdown | -35.1% | -19.3% |
This was the first strategy that beat buy and hold meaningfully on both CAGR and drawdown. The Sharpe ratio jumped substantially because the strangle's equity curve is much smoother than QQQ's. Each cycle collects two premiums (put plus call) and as long as QQQ stays between the strikes (which it does most weeks), the strategy makes money.
The strangle has been a known retail strategy for decades but is less commonly written about than the wheel. I think this is partly because the wheel sounds friendlier (you "own the stocks you like"), and partly because strangles have a scarier risk profile on paper (unbounded call side loss). In practice, the unbounded loss is a theoretical concern that almost never matters at reasonable position sizes, and the CAGR advantage is real.
Once strangles were the focus, the next question was parameter tuning. Two main dials: delta and days to expiration.
Delta controls how close to the current price your strikes are. A 30 delta strike is about 30 percent likely to finish in the money, so it is moderately out of the money. A 45 delta strike is much closer to the current price and collects much more premium but also has higher probability of being tested.
Days to expiration controls how often the strategy cycles. A 14 day to expiration strategy cycles every two weeks. A 7 day strategy cycles weekly. A 3 day strategy cycles essentially three times a week.
The theoretical tradeoff is that shorter DTE captures faster theta decay (the premium decays nonlinearly toward expiration) but adds gamma risk (the option's sensitivity to underlying moves accelerates near expiration). Higher delta captures more premium per trade but has higher probability of needing to be paid out.
I ran a walk forward sweep across delta in {0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45} and DTE in {3, 5, 7, 10, 14, 21} on QQQ. The clear winner was high delta combined with short DTE.
| Strangle config | CAGR | Sharpe | Max DD |
|---|---|---|---|
| 30 delta / 14 DTE | 20.70% | 1.55 | -19.3% |
| 35 delta / 7 DTE | 31.83% | 2.25 | -14.0% |
| 45 delta / 3 DTE | 48.88% | 2.92 | -20.7% |
The 35 delta 7 DTE variant became my Tier 1 candidate. It captures most of the alpha of the more aggressive 45/3 variant but with materially less gamma risk and fewer trades. I considered this the strongest single sleeve candidate I could build on QQQ alone.
The next experiment was extending the same logic to leveraged ETFs. TQQQ is a 3x leveraged version of QQQ. SOXL is 3x leveraged semis. UPRO is 3x leveraged SPX. These instruments have famously high implied vol because their underlying daily moves are roughly 3x the index, which makes the premium yield enormous.
The catch is that they have volatility decay. When the underlying chops sideways, the 3x ETF loses money even though the underlying is flat. This is just compounding math. A 5 percent up day followed by a 5 percent down day leaves the underlying at 100 times 1.05 times 0.95 = 99.75. The 3x leveraged version goes up 15 percent then down 15 percent, leaving it at 100 times 1.15 times 0.85 = 97.75. The decay per up and down pair is the square of the daily move, so tripling the daily move multiplies the decay by nine (2.25 percent versus 0.25 percent).
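The compounding arithmetic above can be checked directly. This sketch assumes an idealized leveraged product that delivers exactly the stated multiple of each daily move, with no fees or tracking error. The function name is hypothetical.

```python
def up_then_down(daily_move, leverage=1.0, start=100.0):
    """Value after one up day and one down day of equal percentage size."""
    after_up = start * (1.0 + leverage * daily_move)
    return after_up * (1.0 - leverage * daily_move)

unlevered = up_then_down(0.05)
levered = up_then_down(0.05, leverage=3.0)
print(round(unlevered, 2), round(levered, 2))

# The loss equals the squared (leveraged) move, so 3x leverage decays 9x faster.
decay_ratio = (100.0 - levered) / (100.0 - unlevered)
print(round(decay_ratio, 2))
```

The unlevered path ends at 99.75 and the 3x path at 97.75, matching the figures in the text.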
I ran the wheel and strangle on the three leveraged ETFs to see what would happen.
| Solo leveraged sleeve | CAGR | Sharpe | Max DD |
|---|---|---|---|
| TQQQ wheel 35/14 | 38.25% | 0.78 | -81.6% |
| SOXL wheel + momentum | 37.49% | 0.86 | -60.5% |
| UPRO strangle 35/14 | 32.78% | 1.00 | -57.1% |
The headline numbers were tempting. 37 percent CAGR on SOXL is the kind of return that gets you on the cover of magazines, if it survives validation. But the drawdowns were brutal. SOXL had a worst peak to trough drawdown of just over 60 percent. When I later ran these through Monte Carlo, the worst random 2 year window on UPRO strangle hit minus 68.7 percent. That is one bad week from breaching my "permanent impairment" threshold.
The next idea was diversification. If selling premium on QQQ produces roughly 21 percent CAGR with a 19 percent drawdown, what if I spread the capital across different asset classes that have low correlation? Bonds, gold, healthcare, all of which sell premium with their own independent paths.
I built a portfolio I called the cross asset core. 45 percent QQQ strangle, 20 percent GLD strangle, 20 percent TLT strangle, 15 percent XLV wheel. Four sleeves on four different asset classes.
| Metric | QQQ strangle 30/14 solo | Cross asset core |
|---|---|---|
| CAGR | 20.70% | 15.82% |
| Sharpe ratio | 1.55 | 1.56 |
| Max drawdown | -19.3% | -14.3% |
The Sharpe ratio held nearly identical while the max drawdown fell from 19.3 percent to 14.3 percent. This looked like a free lunch from diversification. But the historical CAGR was suspicious.
When I later ran the cross asset core through independent block bootstrap Monte Carlo, the CAGR dropped from the historical 16 percent down to about 3 percent. The reason became obvious. From 2010 to 2024, gold had a strong overall uptrend (post 2018 rally) and long bonds had a multi year bull market until 2022. Those drifts contributed most of the cross asset core's CAGR. The strangle premium alone, stripped of the underlying drift, is not enough to compound meaningfully on bonds or gold.
So the cross asset core is an interesting smooth equity curve but the realistic forward return is modest, in the 3 to 5 percent range. It became a "defensive bench" option in the final tier structure but not a primary deployment candidate.
Diversification that did not depend on a one time historical drift was the next idea. Instead of mixing asset classes, I mixed risk profiles within the same general equity exposure. The structure is 60 percent in the QQQ strangle from Tier 1, 20 percent in a SOXL wheel with momentum filter, and 20 percent in a UPRO strangle.
The math works because the three sleeves have correlated but not identical drawdown timing. When semis are dragging through a cyclical down phase, the broad SPX or QQQ might still be flat. So the portfolio max drawdown is much smaller than any individual sleeve.
| Sleeve (solo) | Max drawdown solo |
|---|---|
| QQQ strangle 35/7 | -14.0% |
| SOXL wheel + momentum | -60.5% |
| UPRO strangle 35/14 | -57.1% |
| 60/20/20 blend | -34.3% |
The portfolio drawdown is roughly minus 34 percent, compared to minus 60.5 percent on SOXL alone or minus 57 percent on UPRO alone. The diversification math is doing the heavy lifting. The CAGR of the blend, however, only modestly exceeds the QQQ strangle alone, because the SOXL and UPRO sleeves only contribute their share of the return.
This was the moment I realized leveraged sleeves are not bad strategies. They are bad standalone deployments. Inside a properly weighted blend, the leveraged sleeves' high CAGR contribution is muted by the QQQ core, but their tail risk is also muted. The blend approximately preserves the Sharpe ratio of the components while reducing the worst drawdowns dramatically.
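Why a blend can draw down less than its components is easiest to see on a toy example. The sketch below assumes each sleeve gets a fixed initial allocation with no rebalancing. The two equity curves are invented so that each loses half its value, but at different times. Function names are hypothetical.

```python
def max_drawdown(equity):
    """Worst peak-to-trough decline of an equity curve, as a negative fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst

def blend_curves(curves, weights):
    """Weighted sum of curves, each normalized to start at 1.0 (no rebalancing)."""
    length = len(curves[0])
    return [
        sum(w * curve[i] / curve[0] for curve, w in zip(curves, weights))
        for i in range(length)
    ]

# Invented toy sleeves: same depth of drawdown, different timing.
sleeve_a = [100, 50, 100, 100, 100]
sleeve_b = [100, 100, 100, 50, 100]
blended = blend_curves([sleeve_a, sleeve_b], [0.5, 0.5])
print(max_drawdown(sleeve_a), max_drawdown(sleeve_b), max_drawdown(blended))
```

Each toy sleeve draws down 50 percent on its own, while the 50/50 blend draws down 25 percent. Real sleeves are correlated, so the reduction is smaller than in this extreme case.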
The cleanest insight in the entire project came when I realized that leveraged ETFs have systematically different return profiles in different market regimes. When the underlying is in a clean uptrend, the leveraged ETF compounds beautifully. When the market is chopping or in a bear phase, the leveraged ETF loses to volatility decay.
So I built a regime allocator. It runs two sub portfolios in parallel:
The signal is brutally simple. Take the closing price of QQQ and its 20 day simple moving average. If price is above SMA, run the aggressive sleeve. If price is below SMA, fall back to the safe sleeve. Reassess daily.
The intuition is that the 20 day SMA catches major regime transitions within a few days. When QQQ tips into a clear downtrend (typically the precursor to a leveraged ETF drawdown), the allocator switches to the safer Tier 1 strategy and waits. When QQQ recovers above its SMA, the allocator re engages the leveraged sleeves to capture the recovery rally.
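A sketch of the signal, under the assumption that each day is labeled from its own close and the trailing average including that close. The labels, function names, and toy price path are hypothetical. Counting flips is included because every flip implies a round of liquidating and reopening positions.

```python
def regime_labels(closes, window=20):
    """Label each day 'aggressive' if close > SMA(window), else 'safe'."""
    labels = []
    for i in range(window - 1, len(closes)):
        average = sum(closes[i - window + 1 : i + 1]) / window
        labels.append("aggressive" if closes[i] > average else "safe")
    return labels

def count_flips(labels):
    """Number of days on which the regime differs from the previous day."""
    return sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)

# Toy path: a steady rise from 100 to 129 followed by a steady fall back to 100.
closes = list(range(100, 130)) + list(range(129, 99, -1))
labels = regime_labels(closes)
print(labels[0], labels[-1], count_flips(labels))
```

On this smooth toy path the signal flips exactly once, from aggressive to safe. On real, noisy prices it can flip far more often, especially with short windows.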
| Metric | Tier 1 (QQQ 35/7) | Tier 2 (60/20/20) | Tier 3 (regime) |
|---|---|---|---|
| CAGR (full backtest) | 31.83% | 33.24% | 64.85% |
| Sharpe ratio | 2.25 | 1.54 | 3.76 |
| Max drawdown | -14.0% | -34.3% | -14.7% |
The regime allocator is the moment in this research where the numbers started to look almost too good to be true. I will be transparent about that in the validation sections below, and the live calibration section will further temper the headline estimate.
The numbers above are all in sample. Every backtest claim has to survive proper out of sample validation. I used three different methodologies.
Walk forward validation works like this. I split the historical data into multiple test windows. For each window, I treat it as the "future" and check whether the strategy delivers positive returns. If a strategy only works in one specific window, it is overfit. If it works across many different windows, it is more likely to have structural alpha.
I ran 11 rolling 4 year test windows, sliding by one year, with start years from 2011 through 2021 so that the last window ends in 2024. Every portfolio design produced positive CAGR in every window.
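The window layout can be generated mechanically. This sketch works at calendar year granularity with inclusive end years, which is an assumption about how the windows were aligned. The function name is hypothetical.

```python
def walk_forward_windows(first_start=2011, last_end=2024, length_years=4, step_years=1):
    """Return (start_year, end_year) pairs, inclusive, for rolling test windows."""
    windows = []
    start = first_start
    while start + length_years - 1 <= last_end:
        windows.append((start, start + length_years - 1))
        start += step_years
    return windows

windows = walk_forward_windows()
print(len(windows), windows[0], windows[-1])
```

With these defaults the function yields 11 windows, from 2011 to 2014 through 2021 to 2024. Consecutive windows share three years of data, so they are not independent samples.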
The next test was random start dates. Instead of January aligned windows, I picked 300 random business days as start dates across 2012 to 2022, ran each strategy for a 2 year window from that start, and tracked the outcome. This catches the "what if I deployed at a random Tuesday" scenario rather than the cleaner walk forward setup.
Across 300 random starts on the Tier 1 strangle, the worst 5 percent of starts still produced 14.7 percent CAGR. Zero starts produced a drawdown worse than minus 21 percent. Zero blowups across the entire sample.
The random start Monte Carlo uses overlapping windows, which means its samples are not statistically independent. The bull market of 2012 to 2024 is sampled many times. To get a genuinely independent assessment, I used a stationary block bootstrap. This resamples blocks of consecutive daily returns from history with random block lengths (averaging 20 days), producing fully synthetic price paths that preserve volatility clustering and cross asset correlation but completely randomize the temporal sequence.
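A minimal sketch of a stationary block bootstrap, assuming the common formulation in which each step starts a new block with probability 1 / mean block length and blocks wrap around the end of the sample. Drawing one set of day indices and applying it to every asset is what keeps the cross asset correlation intact. The function names and the toy return series are hypothetical.

```python
import random

def stationary_bootstrap_indices(n_obs, n_out, mean_block=20, rng=None):
    """Day indices for one synthetic path; block lengths are geometric on average."""
    rng = rng or random.Random()
    restart_prob = 1.0 / mean_block
    position = rng.randrange(n_obs)
    indices = []
    while len(indices) < n_out:
        indices.append(position)
        if rng.random() < restart_prob:
            position = rng.randrange(n_obs)   # jump: start a new block
        else:
            position = (position + 1) % n_obs  # continue block, wrapping around
    return indices

def resample_paths(returns_by_asset, indices):
    """Apply the same day indices to every asset so same-day pairs stay aligned."""
    return {name: [series[i] for i in indices] for name, series in returns_by_asset.items()}

# Toy data: asset B always returns exactly twice asset A on the same day.
history = {"A": [i / 1000.0 for i in range(250)], "B": [2 * i / 1000.0 for i in range(250)]}
indices = stationary_bootstrap_indices(250, 5000, mean_block=20, rng=random.Random(0))
synthetic = resample_paths(history, indices)
print(len(synthetic["A"]), len(synthetic["B"]))
```

Because whole days are resampled together, the same day relationship between assets survives in every synthetic path, while the ordering of blocks is randomized.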
The independent Monte Carlo dragged the median CAGRs down meaningfully versus the historical Monte Carlo. The Tier 1 strangle went from 29 percent historical median to 19 percent independent median. The reason is that the historical bull market sequencing was lucky for short volatility strategies. Once that sequencing is scrambled, the realistic forward return is lower.
The drawdown character was unchanged. The worst paths in the independent Monte Carlo were not deeper than in the historical Monte Carlo. So the safety profile was preserved while the CAGR estimates were corrected downward.
None of the above tests included the major crashes. QQQ went down 83 percent from 2000 to 2002. SPX went down 55 percent in 2007 to 2009. Those are the regimes that destroy retail strategies. I needed to know if my strategies could survive them.
So I extended the yfinance dataset back to 1999 (QQQ's inception) and re ran the Monte Carlo with starts during the dot com era and again during the GFC.
The dot com result was the most striking finding in the entire project. While QQQ buy and hold lost 22 percent annualized over the dot com era (with a 57 percent blowup rate across random starts), the QQQ 35 delta 7 day strangle produced 88 percent median CAGR. Yes, 88 percent.
I want to be honest about one caveat. The 88 percent CAGR is almost certainly overstated by real world friction. During the 2000 to 2002 period, bid ask spreads were dramatically wider than today, implied vol did not perfectly track realized vol, and the option markets were less liquid. My synthetic engine assumes perfect execution. A realistic live estimate is probably 30 to 50 percent CAGR rather than 88 percent. But the direction is unambiguous. Short volatility strategies are structurally robust to crashes in ways that buy and hold is not.
Most of the project was failed experiments. Worth documenting because the negative results are educational.
The obvious refinement is to add a stop loss to limit drawdowns. I tested closing any short put position when its mark to market loss reached 2x or 3x the original premium received, on the TQQQ wheel strategy. The result was counterintuitive. Stop losses bumped CAGR up from 31 percent to 38 percent but made the max drawdown worse, from minus 57 percent to minus 82 percent, and tripled the worst recovery time.
The mechanism is subtle. Stops crystallize the loss at the trough, then the strategy redeploys at lower prices, capturing bigger percentage bounces on the recovery. CAGR goes up because you cycle through more capital faster. Drawdown gets worse because you actually realized the losses instead of riding through them. Bad under any reasonable risk model.
I tried gating entry on the current implied vol being above some percentile of its trailing year. The intuition is to skip the cheap premium regimes. In the synthetic model this had zero effect because my IV proxy is a constant multiple of realized vol, so it has no independent signal versus its history. The test was inconclusive because the engine cannot model the real signal we would need.
An iron condor is a short strangle plus protective long wings further out of the money. It bounds the max loss but reduces the net premium. When I sized iron condors by the same cash collateral as a strangle (rather than by the bounded max loss, which would let you over leverage), the CAGR came out around 7 to 8 percent. Too low to be worth the complexity. Iron condors really only outperform strangles if you can over leverage them on margin, which is a fundamentally different risk profile.
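The two sizing conventions give very different contract counts, which is the over leverage point made above. The sketch below uses invented numbers (100,000 of capital, a 480 short put, 10 point wings, 1.50 net credit) and assumes equal width wings, so that only one side can lose at expiration. Function names are hypothetical.

```python
def iron_condor_max_loss(wing_width, net_credit, multiplier=100):
    """Worst case loss per contract: wing width minus the credit received."""
    return (wing_width - net_credit) * multiplier

def contracts_by_cash_collateral(capital, short_put_strike, multiplier=100):
    """Size as if it were a cash secured strangle: full put strike reserved."""
    return int(capital // (short_put_strike * multiplier))

def contracts_by_max_loss(capital, wing_width, net_credit, multiplier=100):
    """Size by the bounded max loss, which allows far more contracts."""
    return int(capital // iron_condor_max_loss(wing_width, net_credit, multiplier))

# Hypothetical trade, not taken from the backtest.
capital = 100_000
conservative = contracts_by_cash_collateral(capital, short_put_strike=480)
aggressive = contracts_by_max_loss(capital, wing_width=10, net_credit=1.50)
print(conservative, aggressive)
```

With these inputs the cash collateral rule allows 2 contracts and the max loss rule allows 117, so the second convention is carrying a very different amount of risk per unit of capital.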
When I tuned the regime allocator's SMA window, I tested windows from 5 through 250 days. The CAGR was monotonically better at shorter windows. SMA 5 produced 85 percent walk forward CAGR. That is a red flag, not real alpha.
The reason is that my engine does not model the transaction cost of switching between sleeve sets when the regime flips. In reality, every regime flip requires liquidating positions in one sleeve and opening positions in another, paying bid ask spread on every contract. At SMA 5, the strategy would flip dozens of times per year, eating all the alpha in transaction costs. The model rewards faster signals only because it does not see the cost of using them.
I tested spending 0.5 percent of equity per month buying 5 delta puts at 60 days to expiration on QQQ. The idea is to insure against fast crashes. The result was instructive.
During the GFC era, the tail hedge added 9 to 13 percentage points of CAGR to every strategy because the sharp 2008 gap downs let the puts pay off massively. During the dot com era, the tail hedge cost 5 to 7 percentage points because the slow grinding bear meant the 60 day puts kept expiring worthless. So the tail hedge insures against fast crashes, not slow bears. And my option selling strategies do not need the hedge during slow bears because they collect rich premium throughout. The tail hedge is appropriate insurance for buy and hold portfolios, not for premium sellers.
Vol targeting scales positions inversely to current implied vol so that the portfolio variance stays approximately constant. When IV is high, shrink positions. When IV is low, grow them. I tested this across all surviving strategies at multiple target levels.
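The scaling rule reduces to a single ratio. In this sketch the 20 percent target matches the level mentioned in the text, while the cap on upscaling (`max_scale`) is an added assumption to keep position sizes bounded when vol is very low. The function name is hypothetical.

```python
def vol_target_scale(current_vol, target_vol=0.20, max_scale=2.0):
    """Position multiplier that aims to hold portfolio vol near target_vol."""
    if current_vol <= 0:
        raise ValueError("current_vol must be positive")
    return min(target_vol / current_vol, max_scale)

# High vol shrinks positions, low vol grows them (up to the assumed cap).
for vol in (0.40, 0.20, 0.10, 0.05):
    print(vol, vol_target_scale(vol))
```

At 40 percent vol the position is halved, at 10 percent it is doubled, and below that the assumed cap holds it at 2x.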
The result is that vol targeting trades CAGR for drawdown roughly proportionally. The Sharpe ratio does not improve. It is a slider on the risk return curve, not a source of free alpha. Useful if you have a specific personal risk tolerance, not as a research finding. The one interesting exception is the 60/20/20 blend where vol targeting at 20 percent target pushed Sharpe from 1.97 to 2.51 (cutting max drawdown nearly in half), at the cost of 12 percentage points of CAGR. Because the Sharpe ratio itself improved, that is a real gain in risk adjusted return, not just a slider, and could be a more conservative variant of Tier 2.
Synthetic backtests can only get you so far. The final and most important validation is live data. I built a paper trading harness that runs the strategies against actual yfinance option chains, logs both the actual market bid and my model's predicted price for every trade decision, and keeps collecting data continuously.
The harness samples every 30 minutes during market hours, logging mark to market data on all open positions. At the close of each market day, it performs full processing: settles any expired positions and opens fresh trades for the next cycle. The infrastructure runs as a continuous daemon and can be left on a dedicated machine for weeks or months without supervision.
The very first paper trading session produced an important calibration finding. The QQQ 35 delta 7 day strangle that day involved selling a put at strike 723 (delta minus 0.35) and a call at strike 738 (delta plus 0.35) for an expiration 7 days out.
| Leg | Market bid | My model's prediction | Gap (model vs. market bid) |
|---|---|---|---|
| Put at 723 (delta -0.35) | $4.90 | $5.13 | +4.6% |
| Call at 738 (delta +0.35) | $4.07 | $5.46 | +34.0% |
The put leg was within 5 percent of my model's prediction. Good. On the call leg, my model overpriced the premium by 34 percent relative to the market bid. That is volatility skew showing up in real life. Markets price puts at higher implied vol than calls because hedging demand is concentrated on the downside (people pay extra for downside insurance, not upside leverage). My engine uses a single sigma for both legs, so it systematically overestimates call premium.
The paper trading harness is still running and continues to collect data. I will revisit these estimates after 4 to 8 weeks of live observations.
Three operating points, ranked by my conviction level.
QQQ 35 delta 7 day strangle, single sleeve. Sell a 35 delta put and a 35 delta call expiring next Friday, every Friday. Cash settle at expiration. Size by the put strike cash collateral requirement.
Walk forward median CAGR: 28.5 percent (synthetic), which adjusts to roughly 16 to 19 percent after the call skew correction. Max drawdown across 11 walk forward windows: minus 20.8 percent. Worst recovery time: 144 days. Zero blowups across 2,500 Monte Carlo paths including the dot com and GFC eras.
60 percent Tier 1, 20 percent SOXL wheel with momentum filter, 20 percent UPRO strangle. Three sleeves running in parallel, each with their own allocated capital.
Walk forward median CAGR: 32 percent (synthetic), which adjusts to roughly 22 to 28 percent after live frictions. Max drawdown: minus 35 percent. Worst recovery: 12 months. Still zero blowups in Monte Carlo because the diversification between sleeves dampens any single sleeve's tail.
Regime allocator. Tier 1 when QQQ is below its 20 day SMA, more aggressive 60/20/20 when QQQ is above.
Walk forward median CAGR: 44 percent (synthetic), which adjusts to roughly 25 to 40 percent live with a wide uncertainty band. The 3 day to expiration leveraged ETF options in the aggressive sleeve have severe model versus reality gaps that paper trading is currently quantifying. Need 4 to 8 weeks of live data before risking real capital.
I want to be explicit about everything the synthetic engine does not model, in case anyone tries to use these results directly without the live calibration step.
The friction sensitivity analysis tried to bound some of this.
The honest reading is this. My backtested CAGRs are likely overstated by 5 to 15 percentage points. The drawdown character is probably accurate to within a few percentage points. The ranking of strategies is robust. The strangle beats the wheel. Diversified blends beat solo leveraged sleeves. Regime allocation beats static allocation. These conclusions are stable across every test I have run. As a conservative estimate, I expect live results to come in at roughly 70 to 85 percent of the synthetic backtest numbers.
Full backtest summary for every strategy discussed in this note. All values 2010 to 2024.
| Strategy | CAGR | Sharpe | Max DD | Median recovery | Worst recovery |
|---|---|---|---|---|---|
| QQQ Buy and Hold (baseline) | 18.51% | 0.74 | -35.1% | 83d | 716d |
| Wheel 35d / 14DTE | 18.40% | 1.10 | -22.3% | 75d | 221d |
| CSP only 35d / 14DTE | 12.30% | 0.82 | -19.9% | 104d | 400d |
| Wheel + momentum 35d / 14DTE | 16.67% | 1.08 | -22.8% | 76d | 465d |
| Strangle 30d / 14DTE | 20.70% | 1.55 | -19.3% | 98d | 202d |
| Strangle 35d / 7DTE (Tier 1) | 31.83% | 2.25 | -14.0% | 56d | 141d |
| Strangle 45d / 3DTE | 48.88% | 2.92 | -20.7% | 49d | 124d |
| TQQQ wheel 35/14 | 38.25% | 0.78 | -81.6% | 28d | 1111d |
| SOXL wheel + momentum | 37.49% | 0.86 | -60.5% | 24d | 791d |
| UPRO strangle 35/14 | 32.78% | 1.00 | -57.1% | 83d | 315d |
| Cross asset core | 15.82% | 1.56 | -14.3% | 133d | 155d |
| 60/20/20 leveraged (Tier 2) | 33.24% | 1.54 | -34.3% | 72d | 202d |
| Regime allocator (Tier 3) | 64.85% | 3.76 | -14.7% | 44d | 111d |
All code, raw data, walk forward windows, Monte Carlo trials, and the live paper trading harness exist as a runnable repository. The paper trader is collecting intraday data continuously across all six portfolio variants. I will revisit these estimates with real fills as ground truth once I have 4 to 8 weeks of live data.