At a glance

Short week

I am heading out of town this weekend, so my time is somewhat constrained this week. I am also waiting on the second dataset, the Nationwide Readmissions Database (NRD), which I need in order to proceed with this project.

In the meantime, I thought I’d do some exploratory data analysis on the NIS dataset to get a quick visualization of what’s going on.

The query

We’ll start by getting any observation in which C. diff (ICD-9-CM code 00845) is listed as a diagnosis. Diagnoses are stored in the DXx columns (DX1, DX2, …, DX30). The following query will bring back anything with a C. diff diagnosis.

SELECT *
  FROM nis
 WHERE nis.dx1  = '00845' 
    OR nis.dx2  = '00845' 
    OR nis.dx3  = '00845' 
    OR nis.dx4  = '00845' 
    OR nis.dx5  = '00845' 
    OR nis.dx6  = '00845' 
    OR nis.dx7  = '00845' 
    OR nis.dx8  = '00845' 
    OR nis.dx9  = '00845' 
    OR nis.dx10 = '00845' 
    OR nis.dx11 = '00845' 
    OR nis.dx12 = '00845' 
    OR nis.dx13 = '00845' 
    OR nis.dx14 = '00845' 
    OR nis.dx15 = '00845'
    OR nis.dx16 = '00845' 
    OR nis.dx17 = '00845' 
    OR nis.dx18 = '00845' 
    OR nis.dx19 = '00845' 
    OR nis.dx20 = '00845' 
    OR nis.dx21 = '00845' 
    OR nis.dx22 = '00845' 
    OR nis.dx23 = '00845' 
    OR nis.dx24 = '00845' 
    OR nis.dx25 = '00845' 
    OR nis.dx26 = '00845' 
    OR nis.dx27 = '00845' 
    OR nis.dx28 = '00845' 
    OR nis.dx29 = '00845'
    OR nis.dx30 = '00845'
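
Writing out thirty OR clauses by hand is easy to get wrong, so the predicate can also be generated. The following is a minimal sketch in Python, assuming the table is called nis and the columns are dx1 through dx30 as in the query above. The in-memory SQLite table and its three toy rows are made up purely so the example runs, and the helper names build_cdiff_query and insert_row are hypothetical.

```python
import sqlite3

CDIFF_CODE = "00845"  # the C. diff code as stored in the DXn columns
DX_COLUMNS = [f"dx{i}" for i in range(1, 31)]  # dx1 .. dx30


def build_cdiff_query(table="nis"):
    """Return a query matching rows where any DXn column holds the code."""
    column_list = ", ".join(DX_COLUMNS)
    # `? IN (dx1, ..., dx30)` is equivalent to the long chain of ORs.
    return f"SELECT * FROM {table} WHERE ? IN ({column_list})"


# Toy stand-in for the real NIS table (hypothetical data, not real records).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nis (id INTEGER, "
    + ", ".join(f"{c} TEXT" for c in DX_COLUMNS)
    + ")"
)


def insert_row(row_id, diagnoses):
    """Insert one discharge; `diagnoses` maps DX position (1-30) to a code."""
    values = [diagnoses.get(i) for i in range(1, 31)]
    placeholders = ", ".join("?" for _ in range(31))
    conn.execute(f"INSERT INTO nis VALUES ({placeholders})", [row_id] + values)


insert_row(1, {1: "00845"})              # C. diff as the principal diagnosis
insert_row(2, {1: "4280", 24: "00845"})  # C. diff in position 24
insert_row(3, {1: "4280"})               # no C. diff at all

matching_ids = [
    row[0] for row in conn.execute(build_cdiff_query(), (CDIFF_CODE,))
]
print(sorted(matching_ids))
```

Because the column names come from a single range, no position can be skipped or repeated by accident. Whether the IN form performs as well as the OR chain on the real table depends on the database engine, so treat this as a convenience rather than an optimization.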

According to the HCUP NIS documentation,

In the HCUP inpatient databases, the first listed diagnosis is the principal diagnosis defined as the condition established after study to be chiefly responsible for occasioning the admission of the patient to the hospital for care.

That is, DX1 is the reason the patient was admitted. For a DX1 of 00845, the patient presumably contracted C. diff somewhere else and was brought to the hospital because of it.

For DX2+, it is possible the patient was brought to the hospital for C. diff and another reason, but more likely, the patient contracted C. diff in the hospital while they were there for the DX1 reason.

This assumption will need a little more expertise to confirm, but we’re just spitballing right now.

Here, we can see that the principal diagnosis, DX1, has the highest occurrence of C. diff. That is to be expected, however: the secondary, presumably hospital-acquired C. diff cases are spread over 29 other positions.

If we sum up the secondary diagnoses, we clearly see that they outnumber the principal (DX1) diagnoses by more than two to one.
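
The tally behind that comparison can be sketched roughly as follows. This is a toy illustration in Python, not the code used to produce the counts: each record is a hypothetical list of diagnosis codes in DX order, and the helper name cdiff_position is an assumption.

```python
from collections import Counter

CDIFF_CODE = "00845"


def cdiff_position(diagnoses):
    """Return the 1-based DX position of the first C. diff code, or None."""
    for position, code in enumerate(diagnoses, start=1):
        if code == CDIFF_CODE:
            return position
    return None


# Hypothetical records: each list is DX1, DX2, ... for one discharge.
records = [
    ["00845", "4280"],
    ["00845"],
    ["00845", "486"],
    ["4280", "00845"],
    ["5849", "00845", "4280"],
    ["486", "4280", "00845"],
    ["486", "4280", "5849", "00845"],
    ["4280", "486"],  # no C. diff, so it is left out of the tally
]

# Count how often C. diff shows up at each diagnosis position.
position_counts = Counter(
    pos for pos in map(cdiff_position, records) if pos is not None
)

principal = position_counts[1]
secondary = sum(n for pos, n in position_counts.items() if pos > 1)

print(dict(position_counts))
print(f"principal={principal}, secondary={secondary}")
```

In this made-up sample DX1 is the single most common position, yet the secondary positions together still add up to more cases, which is the same shape as the pattern described above.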

Furthermore, the number of C. diff observations shows an increasing trend. There was a spike in 2011 and a reduction in 2012, but the overall trend is increasing.

Patients brought in with C. diff were billed an average total charge of $31,826, whereas patients who acquired C. diff in the hospital were billed an average total charge of $93,107.

I would suspect this is because in the latter cases, the patient is already in the hospital for an existing illness or procedure, and the addition of C. diff causes an increased length of stay. This is only speculation though, and should be taken as such.
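
For reference, the comparison of average charges amounts to a grouped mean. Here is a minimal sketch in Python under stated assumptions: each tuple pairs the DX position where C. diff appeared with a total charge, the numbers are invented for illustration and are not NIS values, and records with a missing charge are simply skipped.

```python
from statistics import mean

# Hypothetical (cdiff_dx_position, total_charge) pairs; None marks a
# missing charge. These figures are made up, not taken from the NIS.
discharges = [
    (1, 20000.0),
    (1, 30000.0),
    (2, 80000.0),
    (5, 100000.0),
    (3, None),
]


def average_charge(rows, is_principal):
    """Mean total charge for principal (DX1) or secondary (DX2+) cases."""
    charges = [
        charge
        for position, charge in rows
        if charge is not None and (position == 1) == is_principal
    ]
    return mean(charges)


principal_avg = average_charge(discharges, is_principal=True)
secondary_avg = average_charge(discharges, is_principal=False)

print(f"principal: ${principal_avg:,.0f}")
print(f"secondary: ${secondary_avg:,.0f}")
```

Skipping missing charges is only one possible choice. If many charges were missing in one group, it could skew the comparison, so that is worth checking before reading much into the gap.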

Conclusion

There is some good data in here, and I am excited to start digging in and finding something useful. The addition of the NRD dataset will add a crucial dimension as I begin to look for what works and what doesn’t when treating C. diff.