Posts tagged “visualization”
This is relatively old news in the world of the internet... but it's still a pretty awesome visualization. The story's full of interesting facts. For instance:
The distillery in Lawrenceburg, Indiana is known colloquially as LDI, but is now part of MGP, a food conglomerate that specializes in bioplastics, industrial proteins, and starches for use in salad dressings, energy bars, imitation cheese, and fruit fillings. One of the products made in the Indiana facility is a rye whiskey with a mash bill of 95 percent rye, 5 percent malt barley. Most rye whiskeys are no more than 70 percent rye. According to author Chuck Cowdery, this particular whiskey was developed by Seagram's as a flavoring agent for blended whiskeys like Seagram's 7. When Seagram's disintegrated due to mismanagement in the 1990s, the whiskey, then in the process of aging, was sold to other distilleries in the fire sale of assets, as one salvage company after the next tried to determine what to do with the distillery and its excess inventory. This is how one generic whiskey became known by more than a dozen names, including Templeton Rye, Redemption Rye, Bulleit Rye, Willet, Smooth Ambler, and George Dickel Rye, among others. The companies that own each of these brands have purchased LDI rye whiskey and now bottle it under their own labels, adjusting the proof and length of aging in order to create their own differentiations.
What the what.
A bunch of folks across the internet have been doing some great stuff with the air quality data coming out of China via official channels and the US Embassy Twitter feeds. My advisor asked for some graphs of the available data. They are posted below (all were created in R using ggplot2). If time ever permits, I’ll post some interactive visualizations.
We’ve gathered hour-by-hour observations from tens of thousands of ground stations world-wide, in some places going back a hundred years. We expose it as a sort of “time machine” that lets you explore the past weather at any given location. We’ve also used the data to develop statistical forecasts for any day in the future. For example, say you have an outdoor family reunion in 6 months: with the time machine, you can see what the likely temperature and precipitation will be at the exact day and hour.
Their API sounds good, too, though I haven’t taken the plunge on that yet.
Now that we’ve developed a general-purpose weather API, we’re trying to compete with the other weather APIs available around the Internet. We’ve found those APIs to be difficult and clunky to use, so we’ve tried to make our API as streamlined as possible: you can sign up for a developer account without needing a credit card, and start making requests right away—you can worry about payment information when your app is ready. Additionally, we’ve lowered our prices so that we’re competitive with the other data providers out there.
It's getting hot in here: Shifting Distribution of Northern Hemisphere Summer Temperature Anomalies, 1951-2011 →
This bell curve graph shows how the distribution of Northern Hemisphere summer temperature anomalies has shifted toward an increase in hot summers. The seasonal mean temperature for the entire base period of 1951-1980 is plotted at the top of the bell curve. Decreasing in frequency to the right are what are defined as “hot” anomalies (between 1 and 2 standard deviations from the norm), “very hot” anomalies (between 2 and 3 standard deviations) and “extremely hot” anomalies (greater than 3 standard deviations). The anomalies fall off to the left in mirror-image categories of “cold,” “very cold” and “extremely cold.” The range between the .43 and -.43 standard deviation marks represents “normal” temperatures.
As the graph moves forward in time, the bell curve shifts to the right, representing an increase in the frequency of the various hot anomalies. It also gets wider and shorter, representing a wider range of temperature extremes. As the graph moves beyond 1980, the temperatures are still compared to the seasonal mean of the 1951-1980 base period, so that as it reaches the 21st century, there is a far greater frequency of temperatures that once fell 3 standard deviations beyond the mean.
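To make the categories concrete, here is a minimal Python sketch of the binning described above. The function names, the handling of values that land exactly on a boundary, and the "unlabeled" bucket (the quoted text doesn't name the band between 0.43 and 1 standard deviations) are my own assumptions. The second function uses a plain normal curve with hypothetical shift and width values, only to illustrate why a curve that moves right and widens produces many more "extremely hot" summers.

```python
from statistics import NormalDist


def classify_anomaly(z: float) -> str:
    """Name the category for an anomaly given in standard deviations
    from the 1951-1980 base-period mean."""
    magnitude = abs(z)
    if magnitude <= 0.43:
        return "normal"
    side = "hot" if z > 0 else "cold"
    if magnitude > 3:
        return f"extremely {side}"
    if magnitude > 2:
        return f"very {side}"
    if magnitude > 1:
        return side
    # Assumption: the quoted description gives no name to 0.43-1 sigma.
    return "unlabeled"


def fraction_above(threshold: float, mean: float, spread: float) -> float:
    """Share of a normal curve lying above `threshold` (all in base-period sigmas)."""
    return 1.0 - NormalDist(mu=mean, sigma=spread).cdf(threshold)


if __name__ == "__main__":
    for z in (0.2, 1.5, -2.5, 3.2):
        print(z, classify_anomaly(z))
    # Base period vs. a hypothetical shifted, wider curve (made-up parameters).
    print(fraction_above(3, mean=0.0, spread=1.0))
    print(fraction_above(3, mean=1.0, spread=1.5))
```

The anomalies are always measured against the base period, so the threshold stays at 3 while the curve itself moves.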
There’s been a lot of dingus kerfuffle around the US Embassy monitoring air quality in Beijing and posting the results to Twitter at @BeijingAir. I personally like this kind of thing: it’s almost as though the government is acting as an environmental activist with infinite clout, stirring up problems by bringing known issues to light.
I thought, in passing, that it would be fun to pull the data stream from Twitter, parse it, and graph it. The embassy updates the data hourly; I figured I could make a call to Twitter’s API, without the need for any hacky AJAX refreshing. When people view the post, it’ll show the most recent two hundred tweets, representing 200 hours of data. Perhaps there’d be a need or interest to back up more to a database, but I was running out of steam. It turns out that this undertaking wasn’t as easy as one would have hoped.
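The parsing step looks roughly like the Python sketch below. It is not the code behind the graph. It assumes each tweet is a semicolon-separated string of the form `MM-DD-YYYY HH:MM; PM2.5; concentration; AQI; label`, which is my assumption about the feed's layout, and the sample strings are made up for illustration. Fetching the tweets from Twitter's API is left out entirely, and `Reading`, `parse_tweet`, and `parse_timeline` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass
class Reading:
    timestamp: datetime
    pm25: float  # concentration in µg/m³
    aqi: int
    label: str


def parse_tweet(text: str) -> Optional[Reading]:
    """Parse one tweet in the assumed format; return None if it doesn't fit."""
    fields = [field.strip() for field in text.split(";")]
    if len(fields) < 5 or fields[1] != "PM2.5":
        return None
    try:
        return Reading(
            timestamp=datetime.strptime(fields[0], "%m-%d-%Y %H:%M"),
            pm25=float(fields[2]),
            aqi=int(fields[3]),
            label=fields[4],
        )
    except ValueError:
        # Covers things like missing-data notices in place of numbers.
        return None


def parse_timeline(tweets: Iterable[str]) -> List[Reading]:
    """Keep the parseable tweets, oldest first, ready for plotting."""
    readings = [r for r in map(parse_tweet, tweets) if r is not None]
    return sorted(readings, key=lambda r: r.timestamp)


if __name__ == "__main__":
    sample = [  # made-up examples in the assumed format
        "08-07-2012 15:00; PM2.5; No Data",
        "08-07-2012 14:00; PM2.5; 85.0; 166; Unhealthy",
        "08-07-2012 13:00; PM2.5; 80.0; 164; Unhealthy",
    ]
    for reading in parse_timeline(sample):
        print(reading.timestamp, reading.pm25, reading.aqi)
```

Returning `None` for anything that doesn't match keeps one odd tweet from breaking the whole graph.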
So, without further ado, here’s approximately the latest week of PM2.5 data from Beijing. The lower line — in red — is the PM2.5 concentration; the upper line — in green — is the air quality index (AQI). The dotted, light-grey line is the US EPA 24h PM2.5 standard. Note that Beijing is rarely, if ever, below that designation. I’ll do my best to explain what each of those lines represents below. But now, the graph:
PM2.5 is defined by the US EPA as follows:
Particles less than 2.5 micrometers in diameter are called “fine” particles. These particles are so small they can be detected only with an electron microscope. Sources of fine particles include all types of combustion, including motor vehicles, power plants, residential wood burning, forest fires, agricultural burning, and some industrial processes.
Exposure to particles of this size has been implicated in a wide range of health effects. Like other chemical exposures, at a first approximation the intensity of the health effect depends on the duration of exposure, the concentration of particles in the environment, and an individual’s proximity to the source. There’s increasing evidence that any exposure above very low levels (the types we rarely see anywhere on Earth these days) is bad for health and can exacerbate heart and lung disease, asthma, bronchitis, and the like.
The Air Quality Index (or AQI) is a summary measure that
tells you how clean or polluted your air is, and what associated health effects might be a concern for you. The AQI focuses on health effects you may experience within a few hours or days after breathing polluted air. EPA calculates the AQI for five major air pollutants regulated by the Clean Air Act: ground-level ozone, particle pollution (also known as particulate matter), carbon monoxide, sulfur dioxide, and nitrogen dioxide. For each of these pollutants, EPA has established national air quality standards to protect public health. Ground-level ozone and airborne particles are the two pollutants that pose the greatest threat to human health in this country.
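For a sense of how the two lines in the graph relate: the AQI for a pollutant is a piecewise-linear rescaling of its concentration, interpolated between breakpoints. Here is a hedged Python sketch for PM2.5. The breakpoint table is the one from EPA's 2012 revision as I understand it. It may not match what was in force when the embassy numbers were posted, and it has been revised since, so treat the table as an assumption and check it against EPA's current documentation. `pm25_to_aqi` is a hypothetical name.

```python
# (conc_low, conc_high, aqi_low, aqi_high); concentrations in µg/m³.
# Assumption: breakpoints from EPA's 2012 PM2.5 revision.
PM25_BREAKPOINTS = [
    (0.0, 12.0, 0, 50),
    (12.1, 35.4, 51, 100),
    (35.5, 55.4, 101, 150),
    (55.5, 150.4, 151, 200),
    (150.5, 250.4, 201, 300),
    (250.5, 350.4, 301, 400),
    (350.5, 500.4, 401, 500),
]


def pm25_to_aqi(concentration: float) -> int:
    """Linearly interpolate the AQI within the matching breakpoint row."""
    if concentration < 0:
        raise ValueError("concentration cannot be negative")
    # Readings are assumed to arrive at 0.1 µg/m³ resolution; rounding here
    # only keeps a value from falling into the gap between two rows.
    c = round(concentration, 1)
    for c_lo, c_hi, i_lo, i_hi in PM25_BREAKPOINTS:
        if c_lo <= c <= c_hi:
            return round((i_hi - i_lo) / (c_hi - c_lo) * (c - c_lo) + i_lo)
    raise ValueError("concentration is beyond the top of the table")


if __name__ == "__main__":
    for c in (12.0, 35.4, 85.0):
        print(c, pm25_to_aqi(c))
```

Values past the top row raise an error here instead of being extrapolated, since the table defines nothing above it.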
Finally, the US EPA standard is pretty straightforward. In the US, 24-hour average PM2.5 levels are not supposed to exceed 35 µg/m³. Of course, as we might expect, not every locale in the country meets this standard.
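Since the embassy reports hourly values while the standard is a 24-hour average, comparing the two fairly means averaging first. Below is a minimal Python sketch of a trailing 24-hour mean checked against the 35 µg/m³ line. The function names are hypothetical, and this is a simplification: the actual regulatory test for meeting the standard is more involved than counting windows.

```python
from collections import deque
from typing import Iterable, List

EPA_24H_PM25_STANDARD = 35.0  # µg/m³, the figure quoted in the text


def rolling_24h_means(hourly: Iterable[float], window: int = 24) -> List[float]:
    """Trailing means over `window` consecutive hourly readings.

    Nothing is emitted until a full window of readings has been seen.
    """
    recent = deque(maxlen=window)
    means = []
    for value in hourly:
        recent.append(value)
        if len(recent) == window:
            means.append(sum(recent) / window)
    return means


def windows_over_standard(hourly: Iterable[float]) -> int:
    """Count the 24-hour windows whose mean exceeds the standard."""
    return sum(mean > EPA_24H_PM25_STANDARD for mean in rolling_24h_means(hourly))


if __name__ == "__main__":
    flat_and_dirty = [40.0] * 30  # made-up series: 30 hours at 40 µg/m³
    print(windows_over_standard(flat_and_dirty))
```

Gaps in the hourly feed are ignored here; a real version would need to decide how many missing hours a window can tolerate.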
Back to China.
It’d be interesting to add some summary statistics and look at variation between weekdays and weekends — I’m working on that now. I’m also trying to find an accessible data source from China to plot along with the US data. Some comparison would be good, especially after China began posting its own data not too long ago.
The previous (and awesome) work that inspired this undertaking was done by China Air Daily. They’ve got some amazing visuals of the air pollution. One is attached below; I recommend checking out their site for more great stuff.