Skip to content

healthcare

3 posts with the tag “healthcare”

A Technical Deep-Dive into Benchmarking Hospital Cost Per Discharge

Introduction: The Efficiency Question

In hospital operations, a key indicator of efficiency is the “cost per inpatient discharge.” It’s a simple concept: on average, how much does it cost to treat one patient from admission to discharge? Knowing this number is useful, but knowing how it stacks up against similar hospitals is what provides true strategic insight.

A hospital executive needs to know: “For fiscal year 2022, where does our hospital rank in cost per inpatient discharge compared to other hospitals in our state with a similar number of beds (±20%)?”

This seems like a straightforward question, but like many things in healthcare finance, the data is buried in the HCRIS database. Answering it requires navigating the complexities of the HCRIS data structure to extract the right data points and perform the correct calculations.

At Spectral Health, we’ve automated this process. Here’s a look at the exact BigQuery SQL we use to go from this plain-English question to a direct, actionable answer in seconds.

The SQL Deep-Dive: From Raw Data to Ranked Peers

Here is the complete query. We’ll break it down step-by-step to show how it works.

/* FY-2022 · Cost per Inpatient Discharge
Target: CCN 372018 · Peers: same state, beds ±20 % */
-- latest report per CCN
WITH rpt AS (
SELECT PRVDR_NUM AS ccn, RPT_REC_NUM
FROM `hcris-demo.data.HOSP10_ALL_RPT`
WHERE EXTRACT(YEAR FROM PARSE_DATE('%m/%d/%Y', FY_END_DT)) = 2022
QUALIFY ROW_NUMBER() OVER (PARTITION BY PRVDR_NUM ORDER BY RPT_REC_NUM DESC)=1
),
-- name + state from S-2
alpha AS (
SELECT
r.ccn,
MAX(IF(a.WKSHT_CD='S200001' AND a.LINE_NUM=300 AND a.CLMN_NUM='00100',
a.ALPHNMRC_ITM_TXT,NULL)) AS name,
MAX(IF(a.WKSHT_CD='S200001' AND a.LINE_NUM=200 AND a.CLMN_NUM='00200',
a.ALPHNMRC_ITM_TXT,NULL)) AS st
FROM rpt r
JOIN `hcris-demo.data.HOSP10_ALL_ALPHA` a
ON a.RPT_REC_NUM=r.RPT_REC_NUM
GROUP BY ccn
),
-- beds, discharges, cost from NMRC
nmrc AS (
SELECT
r.ccn,
MAX(IF(n.WKSHT_CD='S300001' AND n.LINE_NUM=1400 AND n.CLMN_NUM='00200',
CAST(n.ITM_VAL_NUM AS INT64),NULL)) AS beds,
MAX(IF(n.WKSHT_CD='S300001' AND n.LINE_NUM=1400 AND n.CLMN_NUM='01500',
CAST(n.ITM_VAL_NUM AS FLOAT64),NULL)) AS disch,
SUM(IF(n.WKSHT_CD='C000001' AND n.CLMN_NUM='00100'
AND ((n.LINE_NUM BETWEEN 3000 AND 4699) OR (n.LINE_NUM BETWEEN 5000 AND 7799)),
CAST(n.ITM_VAL_NUM AS FLOAT64),0)) AS cost
FROM rpt r
JOIN `hcris-demo.data.HOSP10_ALL_NMRC` n
ON n.RPT_REC_NUM=r.RPT_REC_NUM
GROUP BY ccn
),
-- combine metrics
m AS (
SELECT
n.ccn, a.name, a.st, n.beds, n.disch,
n.cost, n.cost/NULLIF(n.disch,0) AS cpd
FROM nmrc n JOIN alpha a USING(ccn)
),
-- target values
t AS (SELECT st, beds FROM m WHERE ccn=372018),
-- peers + rank
peer AS (
SELECT m.*,
RANK() OVER(ORDER BY cpd) AS peer_rank
FROM m, t
WHERE m.st=t.st
AND m.beds BETWEEN t.beds*0.8 AND t.beds*1.2
AND m.beds IS NOT NULL AND m.disch IS NOT NULL AND m.cost IS NOT NULL
)
SELECT
ccn,
name AS provider_name,
beds,
ROUND(cpd,2) AS cost_per_discharge,
peer_rank
FROM peer
ORDER BY peer_rank;

Step 1: Find the Latest Report (rpt)

Hospitals can refile their cost reports, creating multiple records for the same fiscal year. We use the QUALIFY ROW_NUMBER() OVER(...) clause to ensure we only analyze the most recent report for each hospital in fiscal year 2022.

Step 2: Get Hospital Name and State (alpha)

We pull the provider’s name and state from Worksheet S2. This worksheet contains general information about the hospital.

Step 3: Extract Beds, Discharges, and Costs (nmrc)

This is where we dig into the numeric data. We pull three key metrics:

  • Beds: The number of available beds, from Worksheet S3.
  • Discharges: The total number of inpatient discharges, also from Worksheet S3.
  • Cost: The total inpatient cost, which we calculate by summing the relevant lines from Worksheet C, Part I.

Step 4: Combine Metrics and Calculate Cost Per Discharge (m)

With the raw data extracted, we join our temporary tables and calculate the core metric: cost / discharges. We use NULLIF(n.disch, 0) to avoid any divide-by-zero errors.

Step 5: Define the Peer Group and Rank (t, peer)

This is the crucial step for benchmarking.

  1. t (target): First, we create a small temporary table t containing just the state and bed count for our specific hospital (CCN 372018).
  2. peer: Then, we filter our main metrics table m based on the values in t. The WHERE m.st = t.st clause ensures we only look at hospitals in the same state. The WHERE m.beds BETWEEN t.beds * 0.8 AND t.beds * 1.2 clause narrows the list down to only facilities with a similar bed size (±20%). Finally, we use the RANK() window function to rank these peers by their calculated cost per discharge (cpd).

The Result: A Clear Answer

The query returns a clean, ranked list, which can be visualized to give an immediate sense of where the hospital stands among its peers. (We’re continuously refining our chart designs, but even this simple view makes the result clear):

Cost Per Discharge Comparison Chart

The final output is a simple table, ready for analysis or export. It shows that for FY2022, Post Acute Medical Tulsa (CCN 372018) ranks 12th among its peers in Oklahoma with a cost per discharge of $32,249.14.

ccnprovider_namebedscost_per_dischargepeer_rank
374025OAKWOOD SPRINGS726337.751
374026TULSA CENTER FOR BEHAVIORAL HEALTH567781.02
373034ST. JOHN REHABILITATION HOSPITAL AN6013706.233
370211INTEGRIS CANADIAN VALLEY HOSPITAL6615376.814
373033MERCY REHAB HOSPITAL OKLAHOMA CITY6616108.05
373035PAM REHAB HOSPITAL OF TULSA5316143.66
373025VALIR REHAB HOSPITAL5017223.457
374016ROLLING HILLS HOSPITAL6018050.038
370022JACKSON COUNTY MEMORIAL HOSPITAL4923085.799
370149ST ANTHONY SHAWNEE HOSPITAL5724846.7710
370054GRADY MEMORIAL HOSPITAL4828849.9411
372018POST ACUTE MEDICAL TULSA6032249.1412
370049STILLWATER MEDICAL CENTER5234807.0213
370222MCBRIDE CLINIC ORTHOPEDIC HOSPITAL6835876.6614
372004CURAHEALTH OKLAHOMA CITY5936537.0715
372022SOLARA HOSPITAL MUSKOGEE6439835.8216
370171CHEROKEE NATION WW HASTINGS HOSPITAL5241427.2617
370057MUSCOGEE CREEK NATION MED CENTER5046736.3818
372009SSH - OKLAHOMA CITY INC.7249023.0619
370180CHICKASAW NATION MEDICAL CENTER7250711.9820

Conclusion: From Complexity to Clarity

This example shows how Spectral Health transforms a complex analytical task into a simple, repeatable process. By handling the intricacies of HCRIS data, we provide clear, accurate, and timely insights that empower healthcare leaders to make data-driven decisions.

A Technical Deep-Dive into Benchmarking Hospital Operating Margins

Introduction: The Deceptively Simple Question

In the world of hospital finance, one of the most fundamental questions an executive can ask is: “How is our operating margin performing compared to our peers?” It sounds simple. In reality, answering it has traditionally been a nightmare of data wrangling, domain expertise, and days—if not weeks—of an analyst’s time.

This isn’t a quick Google search. The data required lives in the Healthcare Cost Report Information System (HCRIS), a public dataset published by the Centers for Medicare & Medicaid Services (CMS). It’s a goldmine, but it’s also a beast.

The Manual Nightmare: Why HCRIS Is So Tough

If you wanted to answer that simple question manually, here’s what you’d be up against:

  1. Sourcing the Data: First, you have to get the data from the official CMS website. You’d need to download the files for every fiscal year you want to analyze.

  2. The “Excel Breaking” Problem: You can immediately forget about using spreadsheet software. Taking the HOSPITAL-2010 dataset for a single year like 2020 as an example, the numeric data file is around 700 MB and the alphabetic data file is another 150 MB. Multiply that by seven years of data (2015-2021), and you’re dealing with many gigabytes of raw data. You need a proper database, and the skills to manage it, just to get started.

  3. Deep Domain Knowledge: The data is organized in a way that mirrors the paper forms hospitals submit. To find “Operating Income” or “Net Patient Revenue,” you can’t just look for a column with that name. You need to know that you’re looking for Worksheet G-3, Line 500, Column 1 for operating income, and Line 300 for revenue. This requires constant cross-referencing of arcane documentation.

At Spectral Health, we’ve automated this entire process. We’ll show you exactly how a plain-English request triggers a powerful BigQuery query that returns the answer in under 30 seconds.

The Anatomy of the Request

A user can simply ask:

“What was the operating margin for provider number 222002 from 2015 to 2021, compared to the state average and national median?”

This single sentence is the key that unlocks a multi-step SQL query. A “provider number” (also called a CCN) is just the unique ID CMS assigns to each hospital. Our system parses this request and constructs the following BigQuery query to run against our HCRIS data warehouse.

The SQL Deep-Dive: Breaking Down the 5 Steps to Insight

Here is the exact query, which we’ll break down step-by-step.

/*---
Operating margin for CCN 222002, FY 2015-2021, compared with
state average (with standard deviation) and national median
(with interquartile range).
-----*/
-- 1. Pick the latest cost-report for each {provider, year} --
WITH latest_reports AS (
SELECT
PRVDR_NUM AS ccn, -- INT64
RPT_REC_NUM,
EXTRACT(YEAR FROM SAFE.PARSE_DATE('%m/%d/%Y', FY_END_DT))
AS fiscal_year,
ROW_NUMBER() OVER (PARTITION BY PRVDR_NUM,
EXTRACT(YEAR FROM SAFE.PARSE_DATE('%m/%d/%Y', FY_END_DT))
ORDER BY RPT_REC_NUM DESC) AS rn
FROM `hcris-demo.data.HOSP10_ALL_RPT`
WHERE SAFE.PARSE_DATE('%m/%d/%Y', FY_END_DT)
BETWEEN DATE '2015-01-01' AND DATE '2021-12-31'
),
chosen_reports AS (
SELECT ccn, RPT_REC_NUM, fiscal_year
FROM latest_reports
WHERE rn = 1
),
-- - 2. Pull G-3 line 300 & 500 + provider state ---
g3_lines AS (
SELECT
cr.ccn,
cr.fiscal_year,
COALESCE(CAST(alpha_state.ALPHNMRC_ITM_TXT AS STRING), 'UNK') AS state,
MAX(CASE WHEN nmrc.LINE_NUM = 300 THEN CAST(nmrc.ITM_VAL_NUM AS FLOAT64) END)
AS net_patient_revenue,
MAX(CASE WHEN nmrc.LINE_NUM = 500 THEN CAST(nmrc.ITM_VAL_NUM AS FLOAT64) END)
AS operating_income
FROM chosen_reports cr
JOIN `hcris-demo.data.HOSP10_ALL_NMRC` nmrc
ON nmrc.RPT_REC_NUM = cr.RPT_REC_NUM
AND nmrc.WKSHT_CD = 'G300000'
AND nmrc.CLMN_NUM = '00100'
AND nmrc.LINE_NUM IN (300, 500)
LEFT JOIN `hcris-demo.data.HOSP10_ALL_ALPHA` alpha_state
ON alpha_state.RPT_REC_NUM = cr.RPT_REC_NUM
AND alpha_state.WKSHT_CD = 'S200001'
AND alpha_state.LINE_NUM = 200
AND alpha_state.CLMN_NUM = '00200' -- provider state
GROUP BY cr.ccn, cr.fiscal_year, state
),
-- 3. Calculate operating-margin pct --
margins AS (
SELECT
ccn,
fiscal_year,
state,
ROUND(100 * SAFE_DIVIDE(operating_income, net_patient_revenue), 2)
AS operating_margin_pct
FROM g3_lines
WHERE net_patient_revenue IS NOT NULL
AND operating_income IS NOT NULL
),
-- 4. Derive comparison sets --
hospital AS (
SELECT fiscal_year,
ccn,
operating_margin_pct AS hospital_margin_pct,
state
FROM margins
WHERE ccn = 222002
),
-- state_avg now with standard deviation and N count
state_avg AS (
SELECT fiscal_year,
state,
ROUND(AVG(operating_margin_pct), 2) AS state_avg_margin_pct,
ROUND(STDDEV(operating_margin_pct), 2) AS state_stddev_margin_pct,
COUNT(ccn) AS state_n
FROM margins
GROUP BY fiscal_year, state
),
-- national_stats calculates quartiles for IQR and N count
national_stats AS (
SELECT fiscal_year,
COUNT(ccn) AS national_n,
APPROX_QUANTILES(operating_margin_pct, 4) AS national_quartiles
FROM margins
GROUP BY fiscal_year
)
-- 5. Final answer ---
SELECT
h.fiscal_year,
h.ccn,
h.hospital_margin_pct,
s.state_avg_margin_pct,
s.state_stddev_margin_pct,
s.state_n,
n.national_quartiles[OFFSET(2)] AS national_median_margin_pct,
-- Calculate Interquartile Range (IQR) as Q3 - Q1
ROUND(n.national_quartiles[OFFSET(3)] - n.national_quartiles[OFFSET(1)], 2)
AS national_iqr_margin_pct,
n.national_n
FROM hospital h
LEFT JOIN state_avg s ON h.fiscal_year = s.fiscal_year
AND h.state = s.state
LEFT JOIN national_stats n ON h.fiscal_year = n.fiscal_year
ORDER BY h.fiscal_year;

Step 1: Finding the Needle in the Haystack (latest_reports)

A crucial complication is that hospitals often refile their cost reports. A simple query for “2021” might pull multiple versions of the report for the same hospital. We only want the most recent one. We solve this with a window function: ROW_NUMBER() OVER (PARTITION BY ... ORDER BY RPT_REC_NUM DESC). This assigns a rank to each report for a given hospital and year, with the highest RPT_REC_NUM (report record number) getting rank #1. We then simply select where rn = 1.

Step 2: Extracting the Key Financials (g3_lines)

This is where the domain knowledge comes in. The raw numeric data is in a long format (one row per line item). We need to pivot this into columns. We do this by joining to our chosen_reports and filtering for the exact worksheet (G300000), column (00100), and lines (300 and 500) we need. The MAX(CASE WHEN ...) pattern is a standard SQL trick to perform this pivot, creating columns for net_patient_revenue and operating_income. We also LEFT JOIN to another table to grab the hospital’s state for our later comparison.

Step 3: The Core Calculation (margins)

With our data properly structured, the core calculation is simple: operating_income / net_patient_revenue. We use SAFE_DIVIDE as a best practice to gracefully handle any cases where revenue might be zero, preventing the query from crashing.

Step 4: Building the Benchmarks (hospital, state_avg, national_median)

Now we create three separate, focused datasets for our comparison:

  1. hospital: We filter our margins table for only the specific CCN we care about (222002).
  2. state_avg: We group all hospitals by fiscal_year and state and calculate the AVG(operating_margin_pct).
  3. national_median: We group all hospitals by fiscal_year and use BigQuery’s powerful APPROX_QUANTILES() function to efficiently calculate the median (the 50th percentile). This is far more performant than other methods for calculating medians on large datasets.

Step 5: Assembling the Final Answer

The final step is to bring it all together. We start with our single hospital’s data and LEFT JOIN the state and national benchmarks for each corresponding year. The result is a clean, year-by-year table that directly answers the user’s question.

The final output is this simple, powerful chart:

Operating Margin Comparison

The query also returns data in CSV format. The table below shows the complete dataset that generated the chart, with the standard deviation for the state average, the interquartile range (IQR) for the national median, and the number of hospitals (N) in each comparison group.

Fiscal YearYour MarginState Avg. (Std Dev)State NNational Median (IQR)National N
2015-4.80%-9.34% (34.63)92-0.32% (17.07)5,863
20164.21%-6.64% (22.00)90-0.78% (17.71)5,866
20170.49%-10.25% (21.99)89-2.00% (19.21)5,844
2018-3.86%-8.94% (24.65)88-1.74% (18.94)5,808
201911.33%-8.96% (24.56)89-1.41% (18.99)5,788
202015.78%-16.12% (29.30)88-5.59% (22.45)5,758
202122.36%-15.32% (40.09)89-1.70% (22.06)5,772

Why This is a Game-Changer

Let’s contrast the two approaches:

AttributeManual ProcessSpectral Health Process
TimeDays or weeks of an analyst’s time.Under 30 seconds.
InfrastructureUser must set up and maintain their own database.Handled entirely by our managed BigQuery instance.
ExpertiseRequires deep SQL and HCRIS schema knowledge.Requires asking a question in plain English.

This query is just one example. By embedding this logic into our platform, we abstract away the immense technical complexity and empower users to focus on the strategic insights, not the data janitorial work.

Conclusion: Empowering Decisions with Accessible Data

Our mission is to take this incredibly valuable but notoriously difficult public dataset and make it accessible to everyone. The goal isn’t just about running queries faster; it’s about enabling healthcare finance professionals, consultants, and researchers to make better, more informed decisions without needing a data engineering team on retainer.

What question would you ask the entire HCRIS database if you could? Come give it a try on our free tier and see what you can uncover.

Unlocking Healthcare's Best-Kept Secret: The HCRIS Goldmine

Most people don’t realize the federal government quietly publishes the country’s most valuable hospital data. This is the Healthcare Cost Report Information System (HCRIS). Every year, hospitals submit Medicare cost reports to CMS, packed with details on revenue, costs, staffing, patient volumes, service lines, and more. CMS aggregates all this into HCRIS, covering every hospital nationwide and reaching back decades. On paper, it’s a goldmine for benchmarking, analysis, and strategic decisions.

In reality, HCRIS is massive, messy, and completely unusable in Excel. Even experts rarely use it because it demands SQL skills, deep Medicare knowledge, and hours of manual work. Most decision-makers end up relying on consultants, reimbursement specialists, or outdated surveys and are left guessing about the metrics that actually shape performance and strategy.

Spectral Health was built to fix this gap. After 25 years in hospital finance and as a former Recovery Audit Contractor (RAC), I saw firsthand how limited access to HCRIS holds everyone back. Spectral Health unlocks this data for everyone. It allows anyone to ask plain-English questions and get instant, accurate answers, complete with charts and downloadable files for further analysis.

Wondering how your net revenue per patient day compares to other acute care hospitals in Texas? Just ask. The answer is ready in seconds.

Spectral Health is built on this proven foundation and is now open to everyone. We believe so strongly in the power of this data that we offer a robust free tier, giving you the ability to get instant answers to your most pressing questions.

What You Can Do with Spectral Health

  • Answer Critical Business Questions: Instantly query decades of hospital data to inform your strategy.
  • Benchmark Performance: Compare your facility’s financial metrics against any hospital or group in the country.
  • No Cost, No Commitment: Access the core features of Spectral Health and run a generous number of free queries every month.

Get started for free at https://spectralhealth.ai and unlock the HCRIS goldmine today.

About me:

I’m Eric Hendrickson, co-founder of Spectral Health. For 25 years, I have worked in hospital finance and Medicare reimbursement. I have filed hundreds of cost reports, advised hospital systems, and worked as a Recovery Audit Contractor for insurance payers. Today, I direct reimbursement and revenue cycle for a rehab hospital system, so I understand how critical fast, accurate data is for making decisions. Spectral Health is the tool I spent my career searching for, and I know I am not alone.

If you work in administration, reimbursement, finance, M&A, consulting, investment, journalism, or if you’re simply curious how hospitals really operate, let’s connect. I frequently post insights and use cases powered by Spectral Health.