Monday, July 8, 2019

Second BigData program - look for patients with “bone diseases and arthropathies without major complication

In the previous post we have established connection with a big-data type of database. Now let’s set up another query. We would like to look for patients with “bone diseases and arthropathies without major complication or comorbidity.” This is MS_DRG code 554. This is done through one of the most arcane and complicated coding systems in the world, called ICD-10, which maps virtually any diagnostic condition to a single code.

In this post we are going to search the entire inpatient_charges_2015 dataset looking for the MS_DRG code 554, which is “Bone Diseases And Arthropathies Without Major Complication Or Comorbidity,” or, in other words, people who have issues with their bones, but with no serious issues currently manifesting externally. The following program does this:

import pandas as pd
from google.cloud import bigquery


# set up the query
QUERY = """
SELECT provider_city, provider_state, drg_definition,
average_total_payments, average_medicare_payments
FROM `bigquery-public-data.cms_medicare.inpatient_charges_2015`
WHERE drg_definition LIKE '554 %'
ORDER BY provider_city ASC
LIMIT 1000
"""
client = bigquery.Client.from_service_account_json(
'MedicareProject2-1223283ef413.json')
query_job = client.query(QUERY)
df = query_job.to_dataframe() 


print ("Records Returned: ", df.shape )
print ()
print ("First 3 Records")
print (df.head(3))


When we  run the above program we get these results:

Records Returned: (286, 5)
First 3 Records
provider_city provider_state drg_definition
average_total_payments average_medicare_payments
0 ABINGTON PA 554 - BONE DISEASES & ARTHROPATHIES W/O MCC
5443.67 3992.93
1 AKRON OH 554 - BONE DISEASES & ARTHROPATHIES W/O MCC
5581.00 4292.47
2 ALBANY NY 554 - BONE DISEASES & ARTHROPATHIES W/O MCC
7628.94 5137.31

If we compare to this program with the program from previous post , the only thing different in this program from our previous one is that we added LIKE '554 %', which will match on any DRG that starts with “554.”

Let’s do a little analysis with the obtained data and find out what percent of the total payments for this condition is paid for by Medicare. See the following program:

import pandas as pd
from google.cloud import bigquery


# set up the query
QUERY = """
SELECT provider_city, provider_state, drg_definition,
average_total_payments, average_medicare_payments
FROM `bigquery-public-data.cms_medicare.inpatient_charges_2015`
WHERE drg_definition LIKE '554 %'
ORDER BY provider_city ASC
LIMIT 1000
"""
client = bigquery.Client.from_service_account_json(
'MedicareProject2-1223283ef413.json')
query_job = client.query(QUERY)
df = query_job.to_dataframe() 


print ("Records Returned: ", df.shape )
print ()
total_payment = df.average_total_payments.sum()
medicare_payment = df.average_medicare_payments.sum()
percent_paid = ((medicare_payment/total_payment))*100
print ("Medicare pays {:4.2f}% of Total for 554 DRG".format(percent_paid))
print ("Patient pays {:4.2f}% of Total for 554 DRG".format(100-percent_paid))



The output shows the following results:

Records Returned: (286, 5)
Medicare pays 77.06% of Total for 554 DRG
Patient pays 22.94% of Total for 554 DRG


Our next analysis will be to find payment percent by state as shown in the following program in which we select the unique states in our database and iterate over the states to calculate the percent paid by Medicare by state for 554. The code is shown below:

import pandas as pd
from google.cloud import bigquery
# set up the query
QUERY = """
SELECT provider_city, provider_state, drg_definition,
average_total_payments, average_medicare_payments
FROM `bigquery-public-data.cms_medicare.inpatient_charges_2015`
WHERE drg_definition LIKE '554 %'
ORDER BY provider_city ASC
LIMIT 1000
"""
client = bigquery.Client.from_service_account_json(
'MedicareProject2-1223283ef413.json')
query_job = client.query(QUERY)
df = query_job.to_dataframe()

print ("Records Returned: ", df.shape )
print ()
# find the unique values of State
states = df.provider_state.unique()
states.sort()
total_payment = df.average_total_payments.sum()
medicare_payment = df.average_medicare_payments.sum()
percent_paid = ((medicare_payment/total_payment))*100
print("Overall:")
print ("Medicare pays {:4.2f}% of Total for 554 DRG".format(percent_paid))
print ("Patient pays {:4.2f}% of Total for 554 DRG".format(100-percent_paid))
print ("Per State:")
# now iterate over states
print(df.head(5))
state_percent = []
for current_state in states:
state_df = df[df.provider_state == current_state]
state_total_payment = state_df.average_total_payments.sum()
state_medicare_payment = state_df.average_medicare_payments.sum()
state_percent_paid = ((state_medicare_payment/state_total_payment))*100
state_percent.append(state_percent_paid)
print ("{:s} Medicare pays {:4.2f}% of Total for 554 DRG".format
(current_state,state_percent_paid))


# plot graph

data_array = {'State': states, 'Percent': state_percent}
df_states = pd.DataFrame.from_dict(data_array)

# Now back in dataframe land


import matplotlib.pyplot as plt
import seaborn as sb


print (df_states)
df_states.plot(kind='bar', x='State', y= 'Percent')
plt.show()


The program results in the following graph:





Here I am ending today's post. In the next post we'll take up one more example using another public database on BigQuery. Till we meet again keep studying and learning Python as Python is easy to learn!


Share:

0 comments:

Post a Comment