Posts tagged “data visualization”
A bunch of folks across the internet have been doing some great stuff with the air quality data coming out of China via official channels and the US Embassy Twitter feeds. My advisor asked for some graphs of the available data. They are posted below (all were created in R using ggplot2). If time ever permits, I’ll post some interactive visualizations.
RAW is a really impressive and easy-to-use data visualization tool created by Density Design. I created the following plot in about five minutes from existing GBD data (of DALYs in India for women of all ages).
The first column contains risk categories as defined by the comparative risk assessment of the 2010 Global Burden of Disease. The second column contains individual risk factors (each of which fits into one of those risk categories). The final column shows attributable DALYs by cause. Some color would help distinguish the risks and causes, but the basic picture is clear if you spend a few minutes with the graph. Women in India, according to the 2010 GBD, predominantly lose healthy life years from CVD, chronic respiratory diseases, nutritional deficiencies, and infectious disease. A fair amount of this is attributable to air pollution.
To make this plot, I opened a CSV, copied its contents, pasted them into a text field at RAW, and then used its simple, elegant GUI to generate the code for the plot. The options are a little limited for now (I would like to add some color, shift label positions around, etc.). If I really wanted to make those changes, I could edit the code and do it manually. It’s a really impressive showcase of what can be done in the browser and definitely worth checking out and keeping an eye on.
The Best American Infographics 2013 came in yesterday. It’s chock-full of goodness and inspiring visual displays of data. Some are nonsensical, some are dense and shocking. They’re all pretty engaging and the collection appears well-curated. Wired has a number of the selected graphics online.
The book’s introduction was written by David Byrne. I’ll add a link to the essay if it appears online. In the meantime, my favorite bit follows.
The very best of these, in my opinion, engender and facilitate an insight by visual means - allow us to grasp some relationship quickly and easily that otherwise would take many pages and illustrations and tables to convey. Insight seems to happen most often when data sets are crossed in the design of the piece - when we can quickly see the effects on something over time, for example, or view how factors like income, race, geography, or diet might affect other data. When that happens, there’s an instant “Aha!” - we can see how income affects or at least correlates with, for example, folks’ levels of education. Or, less expectedly, we might, for example, see how rainfall seems to have a profound effect on consumption of hard liquor (I made that part up). What we can get in this medium is the instant revelation of a pattern that wasn’t noticeable before.
One would hope that we could educate ourselves to be able to spot the evil infographics that are being used to manipulate us, or that are being used to hide important patterns and information. Ideally, an educated consumer of infographics might develop some sort of infographic bullshit detector that would beep when told how the trickle-down economic effect justifies fracking, for example. It’s not easy, as one can be seduced relatively easily by colors, diagrams and funny writing.
Johnson M, Pillarisetti A, Allen T, Charron D, Pennise D, Smith KR. A robust, low-cost particle monitor and data platform for evaluation of cookstove performance. EPA Air Sensors 2013: Data Quality & Applications. Research Triangle Park, NC: March 18-19, 2013.
We’ve gathered hour-by-hour observations from tens of thousands of ground stations world-wide, in some places going back a hundred years. We expose it as a sort of “time machine” that lets you explore the past weather at any given location. We’ve also used the data to develop statistical forecasts for any day in the future. For example, say you have an outdoor family reunion in 6 months: with the time machine, you can see what the likely temperature and precipitation will be at the exact day and hour.
Their API sounds good, too, though I haven’t taken the plunge on that yet.
Now that we’ve developed a general-purpose weather API, we’re trying to compete with the other weather APIs available around the Internet. We’ve found those APIs to be difficult and clunky to use, so we’ve tried to make our API as streamlined as possible: you can sign up for a developer account without needing a credit card, and start making requests right away—you can worry about payment information when your app is ready. Additionally, we’ve lowered our prices so that we’re competitive with the other data providers out there.
It's getting hot in here: Shifting Distribution of Northern Hemisphere Summer Temperature Anomalies, 1951-2011 →
This bell curve graph shows how the distribution of Northern Hemisphere summer temperature anomalies has shifted toward an increase in hot summers. The seasonal mean temperature for the entire base period of 1951-1980 is plotted at the top of the bell curve. Decreasing in frequency to the right are what are defined as “hot” anomalies (between 1 and 2 standard deviations from the norm), “very hot” anomalies (between 2 and 3 standard deviations) and “extremely hot” anomalies (greater than 3 standard deviations). The anomalies fall off to the left in mirror-image categories of “cold,” “very cold” and “extremely cold.” The range between the .43 and -.43 standard deviation marks represents “normal” temperatures.
As the graph moves forward in time, the bell curve shifts to the right, representing an increase in the frequency of the various hot anomalies. It also gets wider and shorter, representing a wider range of temperature extremes. As the graph moves beyond 1980, the temperatures are still compared to the seasonal mean of the 1951-1980 base period, so that as it reaches the 21st century, there is a far greater frequency of temperatures that once fell 3 standard deviations beyond the mean.
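For anyone who wants to play with the categories described above, here is a minimal Python sketch of the binning. It assumes each summer's anomaly is expressed in standard deviations of the 1951-1980 base period, which is how the description reads. The function names `standardize` and `classify_anomaly` are my own, not anything from the original analysis. The description does not name the band between .43 and 1 standard deviation, so the sketch returns "unlabeled" there rather than guess, and the handling of values that fall exactly on a boundary is also an assumption.

```python
from statistics import mean, stdev


def standardize(values, base_values):
    """Express each value in standard deviations from the base-period mean.

    Both the mean and the standard deviation come from the base period only,
    so later years are always compared against the same reference.
    """
    base_mean = mean(base_values)
    base_sd = stdev(base_values)  # sample standard deviation of the base period
    return [(v - base_mean) / base_sd for v in values]


def classify_anomaly(z):
    """Label an anomaly given in base-period standard deviations."""
    magnitude = abs(z)
    side = "hot" if z > 0 else "cold"
    if magnitude <= 0.43:
        return "normal"
    if magnitude < 1:
        # Assumption: the description leaves this band unnamed.
        return "unlabeled"
    if magnitude < 2:
        return side
    if magnitude < 3:
        return f"very {side}"
    return f"extremely {side}"
```

Counting the labels year by year would show the same shift as the animation: because the reference stays fixed at the base period, a warming climate moves more and more summers into the "very hot" and "extremely hot" bins.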
A pretty stunning visualization of wind direction and speed over the continental US. Data is pulled from the National Digital Forecast Database every hour, so the visualization is almost real time. And, impressively, they're using HTML5 to draw the map and wind animation.
Beautiful + impressive.