Transparency page

Methodology & Sources

This page documents exactly what HRSA's Health Workforce Simulation Model (HWSM) does, where it falls short, and how our three-layer extension addresses each limitation. Every claim on the rest of the site can be traced back here.

1. What HRSA publishes

HRSA's National Center for Health Workforce Analysis (NCHWA) publishes US health-workforce supply and demand projections generated by HWSM. The FY2025 data file uses 2023 as the baseline year and projects through 2038. It covers ~104 professions across 7 groups, all 50 states + DC + 4 census regions, and 3 rurality slices (Metro / NonMetro / Total).

HWSM is an integrated microsimulation model: supply is modeled at the individual provider level, demand at the individual patient level, then aggregated up to county → state → national totals. Each year the model ages the supply population forward (applying retirement, hours, new-graduate equations), ages the demand population forward (applying per-person utilization equations from MEPS, NHAMCS, NIS), and converts service demand into provider demand via static staffing ratios.
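The yearly loop described above can be pictured with a toy sketch. This is not HWSM code: HWSM works at the individual provider and patient level, while the sketch below uses aggregate totals. The function names, retirement rate, graduate inflow, utilization figure, and staffing ratio are all hypothetical placeholders, chosen only to show the order of operations (age supply forward, grow the demand population, convert service demand into provider demand with a fixed ratio).

```python
from dataclasses import dataclass


@dataclass
class YearState:
    year: int
    supply_fte: float   # active providers, in full-time equivalents
    population: float   # people generating service demand


def step_year(state, retire_rate, new_grads_fte, pop_growth):
    """Advance one year: attrition out, new graduates in, population grows."""
    supply = state.supply_fte * (1.0 - retire_rate) + new_grads_fte
    population = state.population * (1.0 + pop_growth)
    return YearState(state.year + 1, supply, population)


def provider_demand(population, visits_per_person, visits_per_fte):
    """Convert service demand (visits) into provider demand via a static ratio."""
    return population * visits_per_person / visits_per_fte


# Hypothetical inputs, not HRSA values.
state = YearState(year=2023, supply_fte=1000.0, population=1_000_000.0)
for _ in range(3):
    state = step_year(state, retire_rate=0.03, new_grads_fte=25.0, pop_growth=0.01)

demand = provider_demand(state.population, visits_per_person=3.0, visits_per_fte=3000.0)
gap = demand - state.supply_fte  # positive means projected shortage
```

Because the staffing ratio (`visits_per_fte`) is held constant across years, any change in how care is delivered has to be introduced from outside the loop.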

Supply data sources

AMA Physician Professional Data (PPD) — physicians
ADA Masterfile — dentists
AAPA Masterfile — physician associates
HRSA NSSRN — registered nurses
ACS, BLS OEWS, IPEDS (new graduates), state licensure files
Association surveys (APA, APTA, NAADAC, etc.)

Demand data sources

ACS 2023 — base population
BRFSS — health risks
MCBS — Medicare beneficiaries
CMS MDS — nursing homes
MEPS — ambulatory utilization
NHAMCS — hospital ambulatory
NIS — inpatient
Census / S&P Global / state population projections

2. Why HRSA's headline understates the gap

Problem 1 — Demand is defined as utilization, not need

From HWSM Technical Documentation, Chapter I (verbatim):

“Workforce demand is defined as the number of health care workers required to provide a level of services that will be utilized given patient health-seeking behavior, and ability and willingness to pay for health care services. As discussed later, demand is different from need. Demand reflects the level of care that people are likely to use, while need is usually a clinical definition.”

And from Chapter XIII (Validation, Strengths, Limitations), a limitation HRSA itself acknowledges:

“Historically, HWSM operated under the assumption that national demand equals national supply in the starting year for most health professions. The exceptions were primary care physicians and psychiatrists... Growing evidence of national shortfalls across many health professions, partly attributable to the COVID-19 pandemic, has prompted significant updates to this approach starting in 2025.”

The 2025 update partially fixes this — physician specialties, psychiatrists, nursing, and physical therapy now start with vacancy-informed baselines — but the demand trajectory still extrapolates observed utilization, not clinical need. Every population group that has historically been under-served (rural, uninsured, low-income, Medicaid, non-white, non-English-speaking) is systematically counted as lower-demand in the base case because their observed utilization is lower.
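A small worked example shows why extrapolating observed utilization understates demand for an under-served group. All numbers are invented for illustration and the helper name is hypothetical. The only point is that the same population yields a lower provider requirement when it is measured by what people currently use rather than by what they would use without access barriers.

```python
def provider_fte(population, visits_per_person, visits_per_fte):
    """Providers required to deliver a given volume of visits."""
    return population * visits_per_person / visits_per_fte


# Hypothetical under-served population.
population = 100_000
observed_visits = 2.0    # visits per person actually used today
needed_visits = 3.0      # visits per person under a clinical-need definition
visits_per_fte = 2500    # static staffing ratio

utilization_based = provider_fte(population, observed_visits, visits_per_fte)
need_based = provider_fte(population, needed_visits, visits_per_fte)
hidden_gap = need_based - utilization_based

print(utilization_based, need_based, hidden_gap)
```

In this toy case, a utilization-based model reports a requirement of 80 FTEs while a need-based one reports 120, so a third of the need-based requirement never appears in the baseline.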

Problem 2 — Scenario space is too narrow

HRSA ships 13 scenarios: 5 supply (Status Quo, ±10% graduates, ±2y retirement) and 8 demand (Status Quo, Urban/Insurance/Race parity, Combination, Income Effect for oral health, Unmet Need 1/2 and Elevated Need for behavioral health). None model AI productivity, scope-of-practice shifts, staffing ratio laws, team-based care, telehealth capacity, or induced demand from new access modalities.

Problem 3 — Financial impact is not modeled

HRSA publishes full-time-equivalent (FTE) counts only: no dollars, no preventable hospitalizations, no life-years lost, no GDP drag. Translating FTE gaps into the outcomes a policy audience cares about is the missing piece.

3. Our three-layer extension

Our platform is layered on top of HRSA's published output. We do not reimplement HWSM, because we do not have access to the restricted micro-datasets it runs on. Instead, we apply defensible parametric adjustments with literature-sourced defaults.

Layer 1 — Faithful Replication

The HRSA XLSX is ingested into Supabase Postgres exactly as published, with no recomputation and no interpolation. A dedicated "Compare to HRSA" page shows, for any (profession, state, year, scenario) combination, that the value in our database matches the corresponding cell in the HRSA spreadsheet.
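A cell-by-cell check of this kind could be sketched as follows. This is an illustration, not the platform's actual code: loading the spreadsheet and querying Postgres are left out, both sides are assumed to be already available as dictionaries keyed by (profession, state, year, scenario), and the sample values are made up.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str, int, str]  # (profession, state, year, scenario)


def find_mismatches(source: Dict[Key, float], stored: Dict[Key, float]) -> List[Key]:
    """Return keys missing on either side or whose values differ.

    Exact equality is deliberate: with no recomputation or interpolation,
    stored values should be identical to the published ones.
    """
    bad = []
    for key in sorted(set(source) | set(stored)):
        if key not in source or key not in stored or source[key] != stored[key]:
            bad.append(key)
    return bad


# Made-up sample rows, not HRSA figures.
source = {
    ("Registered Nurses", "TX", 2030, "Status Quo"): 250000.0,
    ("Registered Nurses", "TX", 2031, "Status Quo"): 252500.0,
}
stored = {
    ("Registered Nurses", "TX", 2030, "Status Quo"): 250000.0,
    ("Registered Nurses", "TX", 2031, "Status Quo"): 252000.0,  # deliberate error
}

mismatches = find_mismatches(source, stored)
```

An empty list from `find_mismatches` would correspond to a faithful replication. Here the deliberately altered 2031 row is reported.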

Layer 2 — 14 Extension Knobs

Demand side (E1–E6): Need-based baseline, disease trajectory, induced demand from access, AI demand multiplier, aging acceleration, insurance coverage shifts.

Supply side (S1–S8): AI productivity, scope of practice, staffing ratios, team-based care, immigration/IMG, training pipeline, burnout shock, telehealth.

Each knob has a literature-backed default and a sensitivity range. With all knobs at default, Layer 2 output equals Layer 1 exactly — enforced by a unit test.
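The default-equals-Layer-1 guarantee can be illustrated with a minimal sketch. It assumes each knob is a multiplier whose default is the neutral value 1.0. That representation is an assumption made here for illustration; the knob keys and function names are hypothetical, and only two of the fourteen knobs are shown.

```python
# Hypothetical knob names and a neutral default of 1.0 for each.
DEFAULT_KNOBS = {
    "E1_need_based_baseline": 1.0,  # demand-side multiplier
    "S1_ai_productivity": 1.0,      # supply-side multiplier
}


def apply_knobs(supply_fte, demand_fte, knobs):
    """Scale Layer 1 supply and demand by the knob multipliers."""
    supply_out = supply_fte * knobs["S1_ai_productivity"]
    demand_out = demand_fte * knobs["E1_need_based_baseline"]
    return supply_out, demand_out


def test_defaults_reproduce_layer1():
    """With every knob at its default, output must equal the input exactly."""
    layer1_rows = [(1000.0, 1100.0), (523.7, 498.2), (0.0, 12.5)]
    for supply, demand in layer1_rows:
        assert apply_knobs(supply, demand, DEFAULT_KNOBS) == (supply, demand)
```

A test runner such as pytest would pick up `test_defaults_reproduce_layer1` automatically. Multiplying by exactly 1.0 leaves a floating-point value unchanged, so the comparison can be strict rather than tolerance-based.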

Layer 3 — Financial Impact

Converts FTE gaps into dollars and outcomes: health-system P&L impact (revenue loss, premium labor, turnover cost), preventable hospitalizations from AHRQ PQI rates, life-years lost, and macro-GDP drag benchmarked against Milken / Deloitte / McKinsey Global Institute.
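To make the FTE-to-dollars step concrete, here is a deliberately simplified sketch covering only revenue loss and premium labor. Every parameter name and value is a hypothetical placeholder, not a platform default, and turnover cost, hospitalizations, life-years, and GDP effects are left out.

```python
from dataclasses import dataclass


@dataclass
class CostAssumptions:
    revenue_per_fte: float        # annual revenue one filled FTE supports
    premium_fill_share: float     # share of the gap covered by agency or overtime labor
    premium_cost_per_fte: float   # extra annual cost of premium labor per FTE


def gap_cost(gap_fte: float, a: CostAssumptions) -> dict:
    """Split an FTE gap into a covered part (premium labor) and an uncovered part (lost revenue)."""
    covered = gap_fte * a.premium_fill_share
    uncovered = gap_fte - covered
    revenue_loss = uncovered * a.revenue_per_fte
    premium_labor = covered * a.premium_cost_per_fte
    return {
        "revenue_loss": revenue_loss,
        "premium_labor": premium_labor,
        "total": revenue_loss + premium_labor,
    }


# Hypothetical inputs for a 40-FTE gap.
assumptions = CostAssumptions(
    revenue_per_fte=500_000.0,
    premium_fill_share=0.25,
    premium_cost_per_fte=80_000.0,
)
impact = gap_cost(40.0, assumptions)
```

With these made-up inputs, 10 FTEs of the gap are covered by premium labor and 30 go unfilled, so the total comes to 15.8 million dollars per year, almost all of it lost revenue.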

4. Captured HRSA technical documentation

HRSA's Bureau of Health Workforce publishes the HWSM technical documentation as HTML chapters on bhw.hrsa.gov. We captured the relevant chapters verbatim via web.archive.org and stored them in data/docs/hwsm_chapters/. Key chapters include:

I. Introduction
II. Supply Modeling Overview
III. Demand Modeling Overview
IV. Nursing Model Components
V. Physician Model Components
VIII. Behavioral Health Care Provider Model Components
X. Oral Health Care Provider Model Components
XI. Allied Health & Select Other Occupations
XII. Long-Term Services and Support Model Components
XIII. HWSM Validation, Strengths and Limitations, and Improvement

5. Attribution

All HRSA data is in the public domain per HRSA's data-use notice. This platform cites HRSA on every view that uses Layer 1 data. Data currency date: 2025-12-18. Questions about the HRSA source: NCHWAinquiries@hrsa.gov.