Analyzing Hospital Patient Outcomes using R
Data analysis is becoming increasingly important in healthcare. Hospitals use big data to make decisions and improve patient care. At the same time, patients use big data, or more likely the published results, to decide where to receive care. There is a third element as well: governments and agencies may use this data to impose fines on institutions that do not meet minimum standards of care.
One site where you can get large amounts of data on healthcare institutions is the Hospital Compare website, http://hospitalcompare.hhs.gov. The website contains information about the quality of care at over 4,000 Medicare-certified hospitals in the U.S.
In a recent school project I analyzed some of the Hospital Compare data. Specifically I used the following files, which represent only a small portion of the data available:
• outcome-of-care-measures.csv: Contains information about 30-day mortality and readmission rates for heart attacks, heart failure, and pneumonia for over 4,000 hospitals.
• hospital-data.csv: Contains information about each hospital.
• Hospital_Revised_Flatfiles.pdf: Descriptions of the variables in each file (i.e., the code book).
The data is in a raw format and contains everything needed to mine all kinds of useful information. I used this data, the R programming language, and its lattice and base graphics packages to perform several analyses. I present a few here, along with the code I used.
Heart Attack 30-day Death Rate by State
In this analysis, I composed a box-and-whisker plot of the 30-day death rate by state for heart attack. I ordered the states by median death rate to make the visualization easier to read, and I removed states with fewer than 20 hospitals.
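The plotting code for this step is not shown in the post, so the following is only a minimal sketch of how such a plot could be prepared with base graphics. The function name death_rate_by_state and its parameters are hypothetical. It assumes that column 11 of outcome-of-care-measures.csv holds the 30-day heart attack death rate (the same column index used by the ranking code in this post), that the state column is named State, and that "fewer than 20 hospitals" means fewer than 20 rows for that state.

```r
# Hypothetical helper (not from the original post): drop small states and
# order the remaining states by their median death rate.
death_rate_by_state <- function(outcome_data, rate_col = 11, min_hospitals = 20) {
  # Rates are read as text; entries such as "Not Available" become NA
  outcome_data$rate <- suppressWarnings(as.numeric(outcome_data[, rate_col]))

  # Keep only states with at least min_hospitals rows (assumed meaning of the filter)
  counts <- table(outcome_data$State)
  keep <- names(counts)[counts >= min_hospitals]
  kept <- outcome_data[outcome_data$State %in% keep, ]

  # Reorder the state factor so its levels run from lowest to highest median rate
  kept$State <- reorder(factor(kept$State), kept$rate, FUN = median, na.rm = TRUE)
  kept
}

# Only runs if the data file from the article is in the working directory
if (file.exists("outcome-of-care-measures.csv")) {
  outcome <- read.csv("outcome-of-care-measures.csv", colClasses = "character")
  ranked <- death_rate_by_state(outcome)
  boxplot(rate ~ State, data = ranked, las = 2,
          ylab = "30-day Death Rate",
          main = "Heart Attack 30-day Death Rate by State")
}
```

Because boxplot draws the groups in factor-level order, reordering the State factor by median is what sorts the boxes from the lowest to the highest median death rate.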
The Best
hospital state
1 CRESTWOOD MEDICAL CENTER AL
2 PROVIDENCE ALASKA MEDICAL CENTER AK
3 MAYO CLINIC HOSPITAL AZ
4 ARKANSAS HEART HOSPITAL AR
5 GLENDALE ADVENTIST MEDICAL CENTER CA
6 ST MARYS HOSPITAL AND MEDICAL CENTER CO
7 WATERBURY HOSPITAL CT
8 BAYHEALTH - KENT GENERAL HOSPITAL DE
9 PROVIDENCE HOSPITAL DC
10 MOUNT SINAI MEDICAL CENTER FL
11 STEPHENS COUNTY HOSPITAL GA
12 HILO MEDICAL CENTER HI
13 PORTNEUF MEDICAL CENTER ID
14 SAINT JOSEPH HOSPITAL IL
15 ST VINCENT HEART CENTER OF INDIANA LLC IN
16 MARY GREELEY MEDICAL CENTER IA
17 KANSAS HEART HOSPITAL KS
18 ST ELIZABETH MEDICAL CENTER NORTH KY
19 ST FRANCIS MEDICAL CENTER LA
20 YORK HOSPITAL ME
21 JOHNS HOPKINS HOSPITAL, THE MD
22 BETH ISRAEL DEACONESS MEDICAL CENTER MA
23 MUNSON MEDICAL CENTER MI
24 ST MARYS HOSPITAL MN
25 WESLEY MEDICAL CENTER MS
26 BOONE HOSPITAL CENTER MO
27 BENEFIS HOSPITALS INC MT
28 FAITH REGIONAL HEALTH SERVICES NE
29 SUNRISE HOSPITAL AND MEDICAL CENTER NV
30 CATHOLIC MEDICAL CENTER NH
31 EAST ORANGE GENERAL HOSPITAL NJ
32 ST VINCENT HOSPITAL NM
33 NYU HOSPITALS CENTER NY
34 CAROLINAS MEDICAL CENTER-NORTHEAST NC
35 SANFORD MEDICAL CENTER FARGO ND
36 JEWISH HOSPITAL, LLC OH
37 OKLAHOMA HEART HOSPITAL SOUTH OK
38 PORTLAND VA MEDICAL CENTER OR
39 DOYLESTOWN HOSPITAL PA
40 HOSPITAL DR CAYETANO COLL Y TOSTE PR
41 MIRIAM HOSPITAL RI
42 MUSC MEDICAL CENTER SC
43 AVERA HEART HOSPITAL OF SOUTH DAKOTA LLC SD
44 METHODIST MEDICAL CENTER OF OAK RIDGE TN
45 CYPRESS FAIRBANKS MEDICAL CENTER TX
46 DIXIE REGIONAL MEDICAL CENTER UT
47 FLETCHER ALLEN HOSPITAL OF VERMONT VT
48 ROY LESTER SCHNEIDER HOSPITAL,THE VI
49 CHESAPEAKE REGIONAL MEDICAL CENTER VA
50 PROVIDENCE SACRED HEART MEDICAL CENTER WA
51 MONONGALIA COUNTY GENERAL HOSPITAL WV
52 BELLIN MEMORIAL HSPTL WI
53 WYOMING MEDICAL CENTER WY
54 GUAM MEMORIAL HOSPITAL AUTHORITY GU
The Worst
1 HELEN KELLER MEMORIAL HOSPITAL AL
2 MAT-SU REGIONAL MEDICAL CENTER AK
3 VERDE VALLEY MEDICAL CENTER AZ
4 MEDICAL CENTER SOUTH ARKANSAS AR
5 METHODIST HOSPITAL OF SACRAMENTO CA
6 NORTH SUBURBAN MEDICAL CENTER CO
7 JOHNSON MEMORIAL HOSPITAL CT
8 ST FRANCIS HEALTHCARE DE
9 HOWARD UNIVERSITY HOSPITAL DC
10 PALMETTO GENERAL HOSPITAL FL
11 WEST GEORGIA MEDICAL CENTER GA
12 PALI MOMI MEDICAL CENTER HI
13 EASTERN IDAHO REGIONAL MEDICAL CENTER ID
14 SAINT ANTHONY MEDICAL CENTER IL
15 MARION GENERAL HOSPITAL IN
16 BOONE COUNTY HOSPITAL IA
17 OLATHE MEDICAL CENTER KS
18 MURRAY-CALLOWAY COUNTY HOSPITAL KY
19 RIVER PARISHES HOSPITAL LA
20 PENOBSCOT VALLEY HOSPITAL ME
21 HARFORD MEMORIAL HOSPITAL MD
22 NOBLE HOSPITAL MA
23 HURLEY MEDICAL CENTER MI
24 HEALTHEAST ST JOHN'S HOSPITAL MN
25 SOUTHWEST MS REGIONAL MEDICAL CENTER MS
26 POPLAR BLUFF REGIONAL MEDICAL CENTER MO
27 BOZEMAN DEACONESS HOSPITAL MT
28 OMAHA VA MEDICAL CENTER (VA NEBRASKA WESTERN IOWA HEALTHCARE SYSTEM) NE
29 DESERT SPRINGS HOSPITAL NV
30 FRANKLIN REGIONAL HOSPITAL NH
31 ROBERT WOOD JOHNSON UNIVERSITY HOSPITAL AT RAHWAY NJ
32 MOUNTAIN VIEW REGIONAL MEDICAL CENTER NM
33 F F THOMPSON HOSPITAL NY
34 WAYNE MEMORIAL HOSPITAL NC
35 ALTRU HOSPITAL ND
36 MERCY FRANCISCAN HOSPITAL WESTERN HILLS OH
37 MERCY MEMORIAL HEALTH CENTER OK
38 THREE RIVERS COMMUNITY HOSPITAL OR
39 EPHRATA COMMUNITY HOSPITAL PA
40 DOCTORS' CENTER HOSPITAL, INC PR
41 WESTERLY HOSPITAL RI
42 WACCAMAW COMMUNITY HOSPITAL SC
43 PRAIRIE LAKES HOSPITAL SD
44 DYERSBURG REGIONAL MEDICAL CENTER TN
45 LAREDO MEDICAL CENTER TX
46 ST MARKS HOSPITAL UT
47 NORTHEASTERN VERMONT REGIONAL HOSPITAL VT
48 GOV JUAN F LUIS HOSPITAL & MEDICAL CTR VI
49 RIVERSIDE TAPPAHANNOCK HOSPITAL VA
50 KADLEC REGIONAL MEDICAL CENTER WA
51 THOMAS MEMORIAL HOSPITAL WV
52 HOLY FAMILY MEMORIAL INC WI
53 SHERIDAN MEMORIAL HOSPITAL WY
54 GUAM MEMORIAL HOSPITAL AUTHORITY GU
The code:
rankall <- function(outcome, num = "best") {
  ## Read outcome data
  outcomeData <- read.csv("outcome-of-care-measures.csv", colClasses = "character")
  if (outcome != "heart attack" && outcome != "heart failure" && outcome != "pneumonia") {
    stop("Invalid Outcome")
  }
  hospital <- character()
  statevector <- character()
  ## For each state, find the hospital with the requested rank by 30-day death rate
  for (state in unique(outcomeData$State)) {
    stateIndex <- which(outcomeData$State == state)
    outcomeDataState <- outcomeData[stateIndex, ]
    ## Sort by death rate, then by hospital name (column 2); na.last = NA drops missing rates
    if (outcome == "heart attack") {
      outcomeData.sorted <- outcomeDataState[order(as.numeric(outcomeDataState[, 11]), outcomeDataState[, 2], na.last = NA), ]
    } else if (outcome == "heart failure") {
      outcomeData.sorted <- outcomeDataState[order(as.numeric(outcomeDataState[, 17]), outcomeDataState[, 2], na.last = NA), ]
    } else if (outcome == "pneumonia") {
      outcomeData.sorted <- outcomeDataState[order(as.numeric(outcomeDataState[, 23]), outcomeDataState[, 2], na.last = NA), ]
    }
    if (num == "best") {
      rownum <- 1
    } else if (num == "worst") {
      rownum <- nrow(outcomeData.sorted)
    } else {
      rownum <- num
    }
    hospital <- c(hospital, outcomeData.sorted[rownum, 2])
    statevector <- c(statevector, state)
  }
  df <- data.frame(hospital = hospital, state = statevector)
  return(df)
}
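To see how the per-state ranking behaves without the full CSV, here is a small self-contained sketch of the same idea on made-up data. The helper rank_by_state, its column names (hospital, state, rate), and the toy values are all hypothetical and are not part of the original project. It sorts each state's hospitals by rate, breaks ties by hospital name, and picks the requested row.

```r
# Hypothetical helper (not from the original post): rank hospitals within each state.
# df is assumed to have columns hospital (character), state (character), rate (numeric).
rank_by_state <- function(df, num = "best") {
  df <- df[!is.na(df$rate), ]                      # drop hospitals with no rate
  per_state <- lapply(split(df, df$state), function(s) {
    s <- s[order(s$rate, s$hospital), ]            # lowest rate first, ties by name
    row <- if (identical(num, "best")) {
      1
    } else if (identical(num, "worst")) {
      nrow(s)
    } else {
      num
    }
    # Indexing past the last row yields NA, so small states are kept with an NA hospital
    data.frame(hospital = s$hospital[row], state = s$state[1],
               stringsAsFactors = FALSE)
  })
  do.call(rbind, per_state)
}

# Made-up data for illustration only
toy <- data.frame(
  hospital = c("B", "A", "C", "D"),
  state    = c("XX", "XX", "XX", "YY"),
  rate     = c(10, 10, 12, 9),
  stringsAsFactors = FALSE
)

rank_by_state(toy)            # best hospital per state
rank_by_state(toy, "worst")   # worst hospital per state
rank_by_state(toy, 2)         # second-ranked hospital per state
```

In this toy data, hospitals A and B tie on rate in state XX, so the alphabetical tie-break puts A first. State YY has only one hospital, so asking for rank 2 gives NA for that state.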