Transparency page
Methodology & Sources
This page documents exactly what HRSA's Health Workforce Simulation Model (HWSM) does, where it falls short, and how our three-layer extension addresses each limitation. Every claim on the rest of the site can be traced back here.
1. What HRSA publishes
HRSA's National Center for Health Workforce Analysis (NCHWA) publishes US health-workforce supply and demand projections produced by HWSM. The FY2025 data file uses 2023 as the baseline year and projects through 2038. It covers ~104 professions across 7 groups, all 50 states + DC + 4 census regions, and 3 rurality slices (Metro / NonMetro / Total).
HWSM is an integrated microsimulation model: supply is modeled at the individual provider level and demand at the individual patient level, and both are then aggregated up to county → state → national totals. Each year the model does three things: it ages the supply population forward (applying retirement, hours-worked, and new-graduate equations), ages the demand population forward (applying per-person utilization equations from MEPS, NHAMCS, and NIS), and converts service demand into provider demand via static staffing ratios.
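The aggregation and staffing-ratio steps can be sketched as follows. This is a toy illustration, not HRSA's code: the record layout, the `VISITS_PER_FTE` ratio, and every number are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from collections import defaultdict

# Hypothetical person-level demand records: (state, county, annual_visits).
# Real HWSM inputs come from restricted microdata; these values are made up.
patients = [
    ("TX", "Travis", 4.0),
    ("TX", "Travis", 2.0),
    ("TX", "Harris", 6.0),
    ("OK", "Tulsa", 8.0),
]

# Assumed static staffing ratio (toy value): visits one FTE provider handles.
VISITS_PER_FTE = 2.0


def aggregate_demand(records, visits_per_fte):
    """Convert person-level visits to FTE demand, then roll up
    county -> state -> national."""
    county = defaultdict(float)
    for state, county_name, visits in records:
        county[(state, county_name)] += visits / visits_per_fte

    state_totals = defaultdict(float)
    for (state, _county_name), fte in county.items():
        state_totals[state] += fte

    national = sum(state_totals.values())
    return dict(county), dict(state_totals), national


county_fte, state_fte, national_fte = aggregate_demand(patients, VISITS_PER_FTE)
print(state_fte, national_fte)
```

Because the ratio is static, any change in how care is delivered (for example, more visits handled per provider) has to be introduced from outside the model, which is what the extension layers described later do.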
Supply data sources
- AMA Physician Professional Data (PPD) — physicians
- ADA Masterfile — dentists
- AAPA Masterfile — physician associates
- HRSA NSSRN — registered nurses
- ACS, BLS OEWS, IPEDS (new graduates), state licensure files
- Association surveys (APA, APTA, NAADAC, etc.)
Demand data sources
- ACS 2023 — base population
- BRFSS — health risks
- MCBS — Medicare beneficiaries
- CMS MDS — nursing homes
- MEPS — ambulatory utilization
- NHAMCS — hospital ambulatory; NIS — inpatient
- Census / S&P Global / state population projections
2. Why HRSA's headline understates the gap
Problem 1 — Demand is defined as utilization, not need
From HWSM Technical Documentation, Chapter I (verbatim):
“Workforce demand is defined as the number of health care workers required to provide a level of services that will be utilized given patient health-seeking behavior, and ability and willingness to pay for health care services. As discussed later, demand is different from need. Demand reflects the level of care that people are likely to use, while need is usually a clinical definition.”
And from Chapter XIII (Validation, Strengths, Limitations) — HRSA's own admitted limitation:
“Historically, HWSM operated under the assumption that national demand equals national supply in the starting year for most health professions. The exceptions were primary care physicians and psychiatrists... Growing evidence of national shortfalls across many health professions, partly attributable to the COVID-19 pandemic, has prompted significant updates to this approach starting in 2025.”
The 2025 update partially fixes this: physician specialties, psychiatrists, nursing, and physical therapy now start with vacancy-informed baselines. The demand trajectory, however, still extrapolates observed utilization, not clinical need. Every population group that has historically been under-served (rural, uninsured, low-income, Medicaid, non-white, non-English-speaking) is systematically counted as lower-demand in the base case because its observed utilization is lower.
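A toy calculation makes the utilization-versus-need distinction concrete. All figures below are hypothetical: two groups of equal size and equal clinical need, where one group uses only half the care it needs. The field names and the `VISITS_PER_FTE` ratio are assumptions for illustration, not HWSM parameters.

```python
# Hypothetical groups: same population and same clinical need per person,
# but the under-served group's observed utilization is lower.
groups = {
    "well_served": {
        "population": 1000,
        "needed_visits_pp": 4.0,
        "observed_visits_pp": 4.0,
    },
    "under_served": {
        "population": 1000,
        "needed_visits_pp": 4.0,
        "observed_visits_pp": 2.0,
    },
}

VISITS_PER_FTE = 2000.0  # assumed staffing ratio, illustrative only


def fte_demand(group, visits_key):
    """FTE demand implied by a per-person visit rate."""
    return group["population"] * group[visits_key] / VISITS_PER_FTE


utilization_based = {
    name: fte_demand(g, "observed_visits_pp") for name, g in groups.items()
}
need_based = {
    name: fte_demand(g, "needed_visits_pp") for name, g in groups.items()
}

# Demand that a utilization-based definition never counts.
hidden_gap_fte = sum(need_based.values()) - sum(utilization_based.values())
print(utilization_based, need_based, hidden_gap_fte)
```

In this sketch the under-served group registers half the FTE demand of the well-served group despite identical need, and that shortfall never appears in a utilization-based headline.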
Problem 2 — Scenario space is too narrow
HRSA ships 13 scenarios: 5 supply (Status Quo, graduates ±10%, retirement ±2 years) and 8 demand (Status Quo, Urban/Insurance/Race parity, Combination, Income Effect for oral health, Unmet Need 1/2 and Elevated Need for behavioral health). None of them models AI productivity, scope-of-practice shifts, staffing ratio laws, team-based care, telehealth capacity, or induced demand from new access modalities.
Problem 3 — Financial impact is not modeled
HRSA publishes FTE counts only: no dollars, no preventable hospitalizations, no life-years lost, no GDP drag. Translating FTE gaps into the outcomes a policy audience cares about is the missing piece.
3. Our three-layer extension
Our platform layers over HRSA's published output. We do not rewrite HWSM — we don't have the restricted micro-datasets. Instead we apply defensible parametric adjustments with literature-sourced defaults.
Layer 1 — Faithful Replication
The HRSA XLSX is ingested into Supabase Postgres exactly as published. No recomputation, no interpolation. A dedicated Compare to HRSA page shows, for any (profession, state, year, scenario) combination, that the value in our database matches the corresponding cell in the HRSA spreadsheet.
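The cell-by-cell check behind that comparison can be sketched in a few lines. This is a minimal illustration that assumes both sides have already been loaded into dictionaries keyed by (profession, state, year, scenario); the function name `diff_cells` and the sample values are hypothetical and not taken from the platform's code or from HRSA's file.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str, int, str]  # (profession, state, year, scenario)


def diff_cells(published: Dict[Key, float], stored: Dict[Key, float]) -> List[str]:
    """Return one message per cell that is missing, extra, or different.

    Exact equality is used on purpose: the data is stored as published,
    so no tolerance should be needed.
    """
    problems = []
    for key in sorted(published.keys() | stored.keys()):
        if key not in stored:
            problems.append(f"missing in database: {key}")
        elif key not in published:
            problems.append(f"not in HRSA file: {key}")
        elif published[key] != stored[key]:
            problems.append(
                f"value mismatch at {key}: {published[key]} != {stored[key]}"
            )
    return problems


# Made-up sample cells, for illustration only.
hrsa_file = {
    ("Registered Nurses", "TX", 2030, "Status Quo"): 250000.0,
    ("Registered Nurses", "TX", 2038, "Status Quo"): 270000.0,
}
database = dict(hrsa_file)

print(diff_cells(hrsa_file, database))  # an empty list means a faithful copy
```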
Layer 2 — 14 Extension Knobs
Demand side (E1–E6): Need-based baseline, disease trajectory, induced demand from access, AI demand multiplier, aging acceleration, insurance coverage shifts.
Supply side (S1–S8): AI productivity, scope of practice, staffing ratios, team-based care, immigration/IMG, training pipeline, burnout shock, telehealth.
Each knob has a literature-backed default and a sensitivity range. With all knobs at default, Layer 2 output equals Layer 1 exactly — enforced by a unit test.
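The identity property can be expressed as a small test. The sketch below assumes, purely for illustration, that every knob is a multiplicative factor whose default is the neutral value 1.0; the platform's actual knob parameterization is not described on this page, and `apply_knobs` and the sample FTE values are hypothetical.

```python
import math

# Knob IDs follow the E1 to E6 / S1 to S8 naming used on this page. Treating
# each knob as a multiplier with neutral default 1.0 is an assumption.
DEMAND_KNOBS = [f"E{i}" for i in range(1, 7)]
SUPPLY_KNOBS = [f"S{i}" for i in range(1, 9)]
DEFAULTS = {knob: 1.0 for knob in DEMAND_KNOBS + SUPPLY_KNOBS}


def apply_knobs(supply_fte, demand_fte, knobs=None):
    """Scale Layer 1 supply and demand by the product of their knob values."""
    settings = {**DEFAULTS, **(knobs or {})}
    supply = supply_fte * math.prod(settings[k] for k in SUPPLY_KNOBS)
    demand = demand_fte * math.prod(settings[k] for k in DEMAND_KNOBS)
    return supply, demand


def test_defaults_reproduce_layer1():
    layer1 = (950.0, 1000.0)  # made-up (supply, demand) FTE values
    assert apply_knobs(*layer1) == layer1


test_defaults_reproduce_layer1()
```

Under this assumption, moving a single knob (for example `{"S1": 1.1}`) changes only the side it belongs to, which makes sensitivity runs easy to attribute.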
Layer 3 — Financial Impact
Converts FTE gaps into dollars and outcomes: health-system P&L impact (revenue loss, premium labor, turnover cost), preventable hospitalizations from AHRQ PQI rates, life-years lost, and macro-GDP drag benchmarked against Milken / Deloitte / McKinsey Global Institute.
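As a rough sketch of the health-system P&L piece only, the conversion from an FTE gap to dollars might look like the following. Every parameter name and value here is a hypothetical placeholder, not a platform default or a literature estimate; the hospitalization, life-years, and GDP components are not shown.

```python
from dataclasses import dataclass


@dataclass
class CostAssumptions:
    """Hypothetical inputs; replace with sourced values before real use."""
    revenue_per_fte: float      # annual revenue one filled FTE generates
    premium_labor_share: float  # share of the gap covered by agency/overtime
    premium_per_fte: float      # extra annual cost per FTE covered that way


def financial_impact(supply_fte, demand_fte, a):
    """Split an FTE shortfall into lost revenue and premium-labor cost."""
    gap = max(demand_fte - supply_fte, 0.0)  # a surplus counts as no gap
    covered = gap * a.premium_labor_share
    uncovered = gap - covered
    return {
        "gap_fte": gap,
        "revenue_loss": uncovered * a.revenue_per_fte,
        "premium_labor_cost": covered * a.premium_per_fte,
    }


# Illustrative numbers only.
assumptions = CostAssumptions(
    revenue_per_fte=400_000.0,
    premium_labor_share=0.5,
    premium_per_fte=60_000.0,
)
impact = financial_impact(supply_fte=900.0, demand_fte=1000.0, a=assumptions)
print(impact)
```

Keeping the assumptions in one dataclass makes each dollar figure traceable to an explicit, replaceable input.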
4. Captured HRSA technical documentation
HRSA's Bureau of Health Workforce publishes the HWSM technical documentation as HTML chapters on bhw.hrsa.gov. We captured the relevant chapters verbatim via web.archive.org and stored them in data/docs/hwsm_chapters/. Key chapters include:
- I. Introduction
- II. Supply Modeling Overview
- III. Demand Modeling Overview
- IV. Nursing Model Components
- V. Physician Model Components
- VIII. Behavioral Health Care Provider Model Components
- X. Oral Health Care Provider Model Components
- XI. Allied Health & Select Other Occupations
- XII. Long-Term Services and Support Model Components
- XIII. HWSM Validation, Strengths and Limitations, and Improvement
5. Attribution
All HRSA data is in the public domain per HRSA's data-use notice. This platform cites HRSA on every view that uses Layer 1 data. Data currency date: 2025-12-18. Questions about the HRSA source: NCHWAinquiries@hrsa.gov.