Tension Index

The Tension Index (TX) measures whether research activity or capital deployment is accelerating unusually fast relative to the other over a trailing 12-month window. It is a standardized, bounded signal in the range -100 to +100, derived from ordinary least-squares momentum slopes, z-score normalization, and a hyperbolic-tangent compression.

What It Measures

TX quantifies the divergence between two momentum slopes: research output (arXiv publications) and capital deployed (a composite of five funding sources). Each slope is estimated via OLS on a 12-month rolling window and z-normalized against historical baselines; the difference of the two z-scores is then compressed through 100 × tanh(raw / 4) into the -100 to +100 range. The result is a unitless score indicating which axis is accelerating faster than its own historical norm.

Three Axes of Observation

The dashboard tracks three independent axes: Research Momentum (monthly arXiv submissions per sector), Capital Momentum (the dollar-weighted composite described below), and Public Interest (Google Trends index averaged across five sector keywords). The Tension Index is computed from the first two; the third provides contextual signal only.

The Capital Composite

Capital deployed is aggregated monthly from five institutional sources: NSF Awards, NIH RePORTER, USASpending grants, SEC EDGAR Form D filings, and CORDIS Horizon Europe grants. Each source is deflated to constant 2025 USD using CPI-U, then summed dollar-for-dollar into the composite monthly value. Z-normalization is applied afterwards, to the composite slope itself — not to individual sources. Note: the EDGAR pipeline is temporarily returning zero due to a change in the SEC document-URL format; NSF, NIH, USASpending, and CORDIS carry the signal today.
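
As a rough illustration of the deflate-then-sum step, the sketch below converts each source's nominal dollars to constant dollars and adds them up for one month. The function names, the source keys, and all numbers are hypothetical. The exact CPI-U reference value behind "constant 2025 USD" is not specified here, so cpi_base is an assumed input.

```python
def deflate(nominal_usd, cpi_month, cpi_base):
    """Convert nominal dollars to constant base-period dollars."""
    return nominal_usd * cpi_base / cpi_month


def composite_month(nominal_by_source, cpi_month, cpi_base):
    """Sum the deflated values of all sources, dollar for dollar, for one month."""
    return sum(deflate(v, cpi_month, cpi_base) for v in nominal_by_source.values())


# Illustrative numbers only (not real data). A source that reports
# nothing, such as EDGAR at present, simply contributes zero.
month = {"nsf": 120.0, "nih": 300.0, "usaspending": 900.0, "edgar": 0.0, "cordis": 80.0}
total = composite_month(month, cpi_month=280.0, cpi_base=320.0)
```

Because the sources are summed before any normalization, a source that drops to zero lowers the composite level directly rather than being reweighted.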

Momentum Calculation

For each axis (research and capital), the raw monthly series is first smoothed with a trailing 3-month sum to reduce reporting noise, then passed through a log(1+x) transform so that proportional — not absolute — changes drive the signal. An ordinary least-squares (OLS) regression is fitted to the most recent 12 smoothed log values, treating time as the independent variable. The resulting slope coefficient represents the momentum of that axis. Each slope is then z-normalized against baseline statistics computed from the full 2015-2024 history: z = (slope - baseline_mean) / baseline_std. This normalization makes slopes comparable across sectors with different absolute scales. Because the current calendar month is still being reported, the series is always truncated to the last completed month before computation.
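
The pipeline described above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the dashboard's actual code: the function names are hypothetical, the input is assumed to be already truncated to the last completed month, and time is indexed 0, 1, 2, ... in months.

```python
import math


def trailing_sum(series, window=3):
    """Trailing sum over `window` months; the first window-1 points are dropped."""
    return [sum(series[i - window + 1 : i + 1]) for i in range(window - 1, len(series))]


def ols_slope(y):
    """Least-squares slope of y against time steps 0..n-1."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    return sxy / sxx


def momentum(monthly_values, window=12):
    """Smooth, log-transform, then fit a slope to the last `window` points."""
    smoothed = trailing_sum(monthly_values, 3)
    if len(smoothed) < window:
        raise ValueError("need at least window + 2 completed months")
    logged = [math.log1p(v) for v in smoothed]
    return ols_slope(logged[-window:])


def z_score(slope, baseline_mean, baseline_std):
    """z = (slope - baseline_mean) / baseline_std."""
    return (slope - baseline_mean) / baseline_std


# Illustrative series only: flat activity versus 5% monthly growth.
flat_slope = momentum([100.0] * 24)
growth_slope = momentum([100.0 * 1.05 ** t for t in range(24)])
```

With the log transform, steady 5% monthly growth yields a slope close to log(1.05) per month regardless of the series' absolute level, which is the property the normalization step relies on.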

Tension Formula

The raw tension value is the difference between the two z-scores: raw = z_research − z_capital. This difference is compressed through a hyperbolic tangent: TX = 100 × tanh(raw / 4). The tanh function maps any real-valued divergence smoothly into the bounded range -100 to +100, is nearly linear around zero (about 25 display points per unit of raw divergence), and flattens near the rails so that extreme readings don't dominate the visual. The divisor of 4 keeps typical signals in the expressive part of the curve: raw divergences up to roughly ±2 (in z-score units) map to display values in [-46, +46], while larger divergences are compressed gracefully toward the limits. The output is rounded to the nearest integer.
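
The formula is small enough to check by hand. The sketch below uses a hypothetical function name and takes the two z-scores as given.

```python
import math


def tension_index(z_research, z_capital, scale=4.0):
    """TX = 100 * tanh((z_research - z_capital) / scale), rounded to an integer."""
    raw = z_research - z_capital
    # Python's round() sends exact .5 ties to the nearest even integer;
    # whether the dashboard breaks ties the same way is not stated.
    return round(100 * math.tanh(raw / scale))


tx_research_ahead = tension_index(2.0, 0.0)   # raw = +2
tx_capital_ahead = tension_index(0.0, 2.0)    # raw = -2
tx_in_tandem = tension_index(1.0, 1.0)        # raw = 0
```

A raw divergence of +2 gives 100 × tanh(0.5) ≈ 46.2, which rounds to 46, matching the range quoted above.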

Confidence and Significance

Each TX value carries a 95% confidence interval derived from the standard errors of the two OLS slope estimates. The standard error of the raw tension is SE = sqrt((SE_research / σ_research)² + (SE_capital / σ_capital)²), and the CI endpoints are transformed through tanh exactly, so the displayed band is asymmetric around the point estimate near the rails. A point is marked statistically significant when its raw CI does not cross zero. Non-significant points are drawn in the chart as a dashed line segment, while significant points are drawn solid; the gray band behind the line is the CI itself. So: if the line is solid, the direction of the divergence is reliable; if it is dashed, the divergence cannot be distinguished from noise at 95% and the numeric TX value should be read as "near zero regardless of what the rounded integer says".
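
A minimal sketch of the interval and significance check follows. It assumes a normal critical value of 1.96 for the 95% level; the text does not say whether a normal or a t quantile is used, so z_crit is an assumption, as are the function and parameter names.

```python
import math


def tension_ci(z_research, z_capital, se_research, se_capital,
               sigma_research, sigma_capital, z_crit=1.96, scale=4.0):
    """Return (tx, tx_low, tx_high, significant) for one data point."""
    raw = z_research - z_capital
    # Slope standard errors are rescaled by the same baseline sigma used for z.
    se = math.sqrt((se_research / sigma_research) ** 2
                   + (se_capital / sigma_capital) ** 2)
    raw_low, raw_high = raw - z_crit * se, raw + z_crit * se

    def to_tx(r):
        return 100 * math.tanh(r / scale)

    # Significant when the raw interval lies entirely on one side of zero.
    significant = raw_low > 0 or raw_high < 0
    return to_tx(raw), to_tx(raw_low), to_tx(raw_high), significant


# Illustrative inputs only.
tx, low, high, sig = tension_ci(2.0, 0.0, 0.01, 0.01, 0.02, 0.02)
tx2, low2, high2, sig2 = tension_ci(0.5, 0.0, 0.01, 0.01, 0.02, 0.02)
```

In the first example the raw interval is symmetric around 2, but after the tanh transform the upper half of the band is narrower than the lower half, which is the asymmetry described above.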

How to Read It

  • TX > 0 (significant): Research is accelerating faster than its historical norm relative to capital. This may indicate an emerging scientific front where funding has not yet caught up.
  • TX < 0 (significant): Capital is accelerating faster than its historical norm relative to research. This may indicate crowding or speculative allocation ahead of scientific output.
  • TX near zero or not significant: Either both axes are moving in tandem, or the confidence interval spans zero, meaning the observed divergence cannot be distinguished from noise at the 95% level.
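
The three cases above reduce to a sign check gated by significance. The sketch below is one way to encode that reading; the function name and the label strings are hypothetical, and no numeric threshold for "near zero" is assumed beyond an exact zero.

```python
def read_tx(tx, significant):
    """Map a TX value and its significance flag to a plain-language reading."""
    if not significant or tx == 0:
        return "no distinguishable divergence"
    return "research leading" if tx > 0 else "capital leading"


readings = [read_tx(46, True), read_tx(-30, True), read_tx(12, False)]
```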

Statistical Properties

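The index is bounded by construction (-100 to +100) and symmetric around zero. Each z-score is approximately standardized (mean 0, unit variance) relative to its own baseline, so the pre-tanh difference is centered near zero; its spread depends on how the two axes co-move, and would have variance 2 rather than 1 if the two z-scores were independent. The 12-month OLS window provides robustness against single-month outliers, and the 3-month trailing-sum smoothing absorbs reporting jitter. Z-normalization against 2015-2024 baselines ensures cross-sector comparability. The tanh compression preserves rank order while keeping extreme readings legible.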

Scientific Limitations

OLS slopes assume locally linear trends within the 12-month window; structural breaks or regime changes may produce transient artifacts. The capital composite sums dollars across sources, which means a large-dollar source (USASpending) weighs more than a small-dollar one (CORDIS) in absolute terms. We consider this desirable for a flow-of-funds signal, but it is a modelling choice. CPI-U deflation is a broad measure and may not capture sector-specific cost dynamics. Google Trends data (public interest axis) is not used in the TX formula but is displayed alongside it, and its sampling methodology is opaque. Each sector is treated independently; cross-sector contagion effects are not modelled. Baseline statistics are updated quarterly and may lag structural shifts in funding landscapes. The EDGAR source is currently offline due to an SEC URL-format change, reducing venture-capital coverage until fixed.
