So my attempt at using a blog to organize my thoughts is off to a rough start, since I have completely neglected it for so long, but let's try again with a new post!
The project that is first in line right now is getting my MS project finished! I introduced it briefly in the first post, but I am going to use this post to give a bit more information. My buddy at grad school (Mark) set me up with two doctors in the UK and his adviser at the University of Minnesota, so I could have a nice project to graduate with. I am of course in the midst of pulling out all of my hair (from a sunburned scalp even!) trying to figure out the tiny details that always fall apart.
In any case, the paper looks at two kinds of hip replacement (full and half). I honestly don’t know a whole lot about the hip replacement side of things, so I would suggest http://en.wikipedia.org/wiki/Hip_replacement for more information on that. Suffice it to say, they are different and we want to know in what way they are different. The way we plan on figuring that out is based on the absurdly large and excellent database that is maintained in the UK about all patients and the surgeries they have. Apparently, in order to get paid as a doctor in the UK you have to submit a good chunk of information to the central agency that does the paying. If you were looking for a reliable way to get good information, this is it: give me what I want and then I will pay you! So the British doctors have access to this wonderful resource.
The database has the type of surgery, age, sex, 30-day dislocation, 6-month dislocation, 4-year dislocation, 6-month revision, 4-year revision and the Charlson score. Each dislocation variable is binary: it is a 1 if you have had a dislocation within the given period of time (this is a complication that is associated with hip replacements). Each revision variable records whether or not they had to go back and do another surgery on your hip within the given period of time. The Charlson score is a measure of overall health (for instance, if you had 4 heart attacks, you would have a large Charlson score, and it would suggest that maybe your problems are a result of external forces and not just the hip replacement).
Naturally we needed a reasonable way to compare this data. In a previous paper they had done some propensity score matching, so I looked into that and we decided that it was a reasonable approach. Propensity score matching pairs patients up using a single score, the estimated probability of getting one treatment rather than the other given a set of variables, instead of matching on every variable directly. Recall that our goal is to compare people that had half hip replacements with similar people that had full hip replacements, so we need a way of reasonably matching them.
The propensity score matching approach starts with a propensity model, which in this case is a simple logistic regression. I found that a propensity score based on Charlson score, age and sex seemed to have the best AIC value. The differences between a few different models were minor, so it would be reasonable to pick a different model, but in the end the results are the same as long as you include age and Charlson score. Next I used the R package called Matching to take these propensity scores and construct two data sets that are well matched.
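To make the two steps above concrete, here is a minimal sketch in Python (not R, and not the Matching package). Everything in it is an assumption for illustration: the data is synthetic, the covariates are pre-scaled, the function names fit_logistic, propensity and match_nearest are made up, and the greedy 1:1 nearest-neighbour matching without replacement is a simple stand-in that is not necessarily what Matching does internally.

```python
import math
import random

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit a logistic regression by plain gradient descent.
    Returns weights as [intercept, coef_1, ..., coef_p]."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def propensity(w, xi):
    """Estimated probability of being in the 'treated' group."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return 1.0 / (1.0 + math.exp(-z))

def match_nearest(scores, treated):
    """Greedy 1:1 matching: each treated patient gets the unused
    control whose propensity score is closest."""
    controls = [i for i, t in enumerate(treated) if t == 0]
    pairs = []
    for i in (k for k, t in enumerate(treated) if t == 1):
        if not controls:
            break
        j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        pairs.append((i, j))
        controls.remove(j)
    return pairs

# Synthetic patients (hypothetical): scaled age, scaled Charlson score, sex.
random.seed(0)
X, treated = [], []
for _ in range(200):
    age_z = random.gauss(0.0, 1.0)
    charlson = random.randint(0, 4) / 4.0
    sex = random.randint(0, 1)
    p_half = 1.0 / (1.0 + math.exp(-(0.8 * age_z + 1.0 * charlson - 0.5)))
    X.append([age_z, charlson, sex])
    treated.append(1 if random.random() < p_half else 0)  # 1 = half hip

weights = fit_logistic(X, treated)
scores = [propensity(weights, xi) for xi in X]
pairs = match_nearest(scores, treated)
print(len(pairs), "matched pairs")
```

Each entry of pairs is (index of a half hip patient, index of the matched full hip patient), so outcomes can then be compared pair by pair. Matching without replacement is a design choice here; matching with replacement or with a caliper would change which patients end up paired.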
The initial results from this suggest that half hip replacements have between a 0.5% and 1.5% decrease in dislocations for all time periods. This is about where I am currently.
Next up I am trying to fill out some more detailed tables that subset the data and compare, say, only the really sick people or only the older people, but the Matching package is doing a damn fine job of hiding the method it uses. I have tried a few things, such as a simple t-test and http://en.wikipedia.org/wiki/McNemar’s_test, but they all give incredibly significant p-values (roughly 0), whereas the output from the Matching package gives p-values that are more in the 0.05 range and are not always significant.
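For reference, McNemar's test only looks at the matched pairs whose outcomes disagree. Here is a minimal sketch in Python, assuming the outcomes are already arranged as (half hip outcome, full hip outcome) for each matched pair. The function name mcnemar and the example counts are made up for illustration, and this version uses the plain chi-square statistic with no continuity correction.

```python
import math

def mcnemar(pairs):
    """McNemar's test for paired binary outcomes.
    pairs: list of (outcome_a, outcome_b), each 0 or 1.
    Returns (chi-square statistic, p-value) with 1 degree of freedom."""
    # Discordant pairs: exactly one member of the pair had the event.
    b = sum(1 for a_out, b_out in pairs if a_out == 1 and b_out == 0)
    c = sum(1 for a_out, b_out in pairs if a_out == 0 and b_out == 1)
    if b + c == 0:
        return 0.0, 1.0  # no discordant pairs, nothing to test
    stat = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Made-up example: 100 matched pairs, 15 of them discordant.
example_pairs = [(1, 0)] * 10 + [(0, 1)] * 5 + [(0, 0)] * 80 + [(1, 1)] * 5
stat, p_value = mcnemar(example_pairs)
print("chi-square:", round(stat, 3), "p-value:", round(p_value, 3))
```

Pairs where both patients had the same outcome drop out of the statistic entirely, so with rare events like dislocations the test rests on a fairly small number of discordant pairs.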
I have 3 articles I am reading on the method the package uses, but given how much trouble I am having getting it sorted out, I think I might just go with McNemar’s test, since a biostatistician my adviser talked to suggested it.