Posts tagged “gbd”
RAW is a really impressive and easy-to-use data visualization tool created by Density Design. I created the following plot in about five minutes from existing GBD data (of DALYs in India for women of all ages).
The first column contains risk categories as defined by the comparative risk assessment of the 2010 Global Burden of Disease. The second column contains individual risk factors, each of which fits into one of those risk categories. The final column shows attributable DALYs by cause. Some color would help distinguish the risks and causes, but the basic picture is clear if you spend a few minutes with the graph. Women in India, according to the 2010 GBD, predominantly lose healthy life years to cardiovascular disease (CVD), chronic respiratory diseases, nutritional deficiencies, and infectious diseases. A fair amount of this burden is attributable to air pollution.
To make this plot, I opened a CSV, copied its contents, pasted them into a text field at RAW, and then used its simple, elegant GUI to generate the code for the plot. The options are a little limited for now (I would like to add some color, shift label positions around, etc.). If I really wanted to make those changes, I could edit the generated code manually. It is a really impressive showcase of what can be done in the browser, and definitely worth checking out and keeping an eye on.
A few requests had come in to download around 12 countries' worth of the recently released Global Burden of Disease data from the IHME website. There's no way to quickly download multiple files; by my count, each one requires you to type the country name, click a link, click a tab, and then option-click a CSV file.
The URLs had relatively similar construction, so I wrote a quick R script to download all of the data and save each country's file as a separate compressed RDS file. I also dropped a couple of redundant columns to try to save some space. The compression is pretty efficient; files of 25 to 27 MB were reduced to between 6.6 and 7.4 MB. Check it out here or below.
Update (April 2015): The script now lets users specify the download location, so it works better 'out of the box'; users can choose whether to save as CSV, RDS, or both; some minor bugs are fixed; and the script has been adjusted for a major change in the URL structure.
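For readers who want a feel for the approach, here is a minimal sketch in base R. It is not the script itself. The URL template (`https://example.org/...`), the function names, and the column names are all hypothetical placeholders; the real IHME URLs are not reproduced here and have changed at least once.

```r
# Build a download URL from a country code.
# ASSUMPTION: the template below is a placeholder, not the real IHME URL.
build_url <- function(country_code,
                      base = "https://example.org/gbd/%s.csv") {
  sprintf(base, country_code)
}

# Drop unwanted columns, then save as compressed RDS and/or CSV.
# Returns a named character vector of the file paths that were written.
save_country <- function(df, country_code, out_dir = tempdir(),
                         drop_cols = character(0),
                         formats = c("rds", "csv")) {
  df <- df[, setdiff(names(df), drop_cols), drop = FALSE]
  paths <- character(0)
  if ("rds" %in% formats) {
    p <- file.path(out_dir, paste0(country_code, ".rds"))
    saveRDS(df, p, compress = "xz")  # xz is slower but usually smaller than gzip
    paths <- c(paths, rds = p)
  }
  if ("csv" %in% formats) {
    p <- file.path(out_dir, paste0(country_code, ".csv"))
    write.csv(df, p, row.names = FALSE)
    paths <- c(paths, csv = p)
  }
  paths
}

# Download one country's CSV to a temporary file, then hand it to save_country().
download_country <- function(country_code, ...) {
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp))
  download.file(build_url(country_code), tmp, mode = "wb", quiet = TRUE)
  save_country(read.csv(tmp, stringsAsFactors = FALSE), country_code, ...)
}

# Usage (not run; needs a real URL template and real column names):
# for (cc in c("IND", "CHN")) {
#   download_country(cc, out_dir = "gbd_data", drop_cols = c("some_redundant_col"))
# }
```

Keeping the save step separate from the download step means a change in the URL structure only touches `build_url()`, and the save logic can be tried on any data frame without a network connection.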
R can be scary for those new to it, but it is exceptionally useful for a number of things, including managing, importing, and merging text files; resaving them; and performing statistical analyses to your heart’s content. It is your friend, albeit one that you must learn to love slowly and painfully.
This brief tutorial does not serve as an introduction to R. Instead, it focuses on reading in a large, complex data set with ~1 million rows and 50+ columns. It was created to facilitate some analysis in a GBD course at Berkeley. It will help you figure out how to do some basic manipulation and subsetting, and how to export the subsetted data to a comma-separated text file (CSV) for analysis in your favorite spreadsheet program. It is a work in progress and will be updated over time.
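The read, subset, export workflow described above might look roughly like the following in base R. This is only a sketch: the tiny data frame stands in for the real file, and the column names (`country`, `sex`, `measure`, `cause`, `value`) are assumptions for illustration, not the actual GBD column names.

```r
# Stand-in for the large GBD file: a tiny CSV with hypothetical column names.
gbd_path <- tempfile(fileext = ".csv")
write.csv(data.frame(
  country = c("India", "India", "China", "India"),
  sex     = c("Female", "Male", "Female", "Female"),
  measure = c("DALYs", "DALYs", "DALYs", "Deaths"),
  cause   = c("CVD", "CVD", "CVD", "CVD"),
  value   = c(100, 120, 90, 5)
), gbd_path, row.names = FALSE)

# Read the file; stringsAsFactors = FALSE keeps text columns as plain strings.
gbd <- read.csv(gbd_path, stringsAsFactors = FALSE)
str(gbd)  # inspect column names and types before subsetting

# Keep only the rows and columns of interest.
india_women <- subset(
  gbd,
  country == "India" & sex == "Female" & measure == "DALYs",
  select = c(cause, value)
)

# Export the subset for use in a spreadsheet program.
out_path <- file.path(tempdir(), "india_women_dalys.csv")
write.csv(india_women, out_path, row.names = FALSE)
```

With a file of around a million rows, `read.csv()` works but can be slow; subsetting down to the rows you need before exporting keeps the resulting CSV small enough for a spreadsheet.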