Posts tagged “R”

Four Gallup polls related to energy and climate

Gallup conducted its annual Environment poll from March 1 to 10 and has released four snippets of fascinating data related to global warming, reliance on fossil fuels, and nuclear energy.

A/ More people are attributing warmer and colder weather to climate change than previously

Seventy percent of the subset of U.S. adults experiencing warmer temperatures this winter, and 44% of those experiencing colder than normal temperatures, attribute their atypical weather to human activity.

In 2012, when temperatures nationwide were 3.69 degrees higher on average than normal, 38% of those experiencing warmer than usual weather blamed it on human activity. The percentage blaming warmer weather on human activity rose into the 50s from 2013 to 2017 before rising to 70% this year.

The sample sizes of those experiencing colder than normal weather in 2012 and 2013 were too small to allow reporting of these respondents’ views on the cause. However, in 2014, when the sample size was sufficient, just 29% thought colder-than-normal weather was due to human activity. That rose to 37% in 2015, to 40% in 2016, and has since been above 40%.

B/ The data on climate change indicate greater concern overall among the populace, but dramatic differences across party and ideological lines.

… the public’s concern about global warming and belief that humans are responsible are holding steady at or near the trend high points. While Americans as a whole are concerned about global warming, the partisan differences between Democrats and Republicans are stark. Most Democrats take the issue seriously and are troubled by it. Republicans remain skeptical and largely unconcerned.

First, some graphs on the overall concern.

The numbers from 2019 that I find most striking revolve around a question asking whether or not “global warming will pose a serious threat to you or your way of life in your lifetime.” In 2019, 45% said yes and 55% said no. If you split that up by ideology, the difference is stark: 67% of liberals, 47% of moderates, and 27% of conservatives said yes.

C/ The next two items released by Gallup deal with future energy options. The first set of questions focused on decreasing use of fossil fuels in the next 10 or 20 years. Sixty percent of respondents thought it was likely or very likely that the US could dramatically decrease dependence on fossil fuels, with the vast majority wanting to see increased emphasis on wind and solar power.

D/ Gallup also asked questions on the use of nuclear power. The data show an almost even split:

Americans are evenly split on the use of nuclear power as a U.S. energy source. Forty-nine percent of U.S. adults either strongly favor (17%) or somewhat favor (32%) the use of nuclear energy to generate electricity, while 49% either strongly oppose (21%) or somewhat oppose (28%) its use.

Roughly equal percentages of Americans say nuclear power plants are safe (47%) as say they are not safe (49%). This is the first time in Gallup’s 10-year trend on this question that a plurality of Americans have considered nuclear power unsafe. Even in the 2011 poll, conducted two weeks after the high-profile Fukushima Daiichi nuclear accident in Japan, a majority said they viewed nuclear power plants as safe.

More information on the sampling methodology and survey methods can be found at the bottom of each page linked above. The visualizations on this page were made by downloading PDFs of data from Gallup, extracting the tables, and generating plots with a few R packages: readxl, ggplot2, and data.table.
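As a rough illustration of the plotting step, here is a minimal sketch that charts the ideology split quoted in section B. It is not the code behind the original figures; the object names (`threat`, `pct_yes`) are made up for this example, and the table is typed in by hand rather than extracted from a Gallup PDF.

```r
library(data.table)
library(ggplot2)

# Share answering "yes" to the serious-threat question, as quoted in section B.
# Typed in by hand here; the original workflow extracted tables from Gallup PDFs.
threat <- data.table(
  ideology = factor(c("Liberal", "Moderate", "Conservative"),
                    levels = c("Liberal", "Moderate", "Conservative")),
  pct_yes  = c(67, 47, 27)
)

# Simple bar chart of the three groups
p <- ggplot(threat, aes(x = ideology, y = pct_yes)) +
  geom_col() +
  labs(x = NULL, y = "% saying yes") +
  theme_minimal()

print(p)
```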

WHO Homes Model

Access the WHO HOMES model. The WHO HOMES model is an online implementation of a single-compartment box model appropriate for estimating PM or CO concentrations resulting from the combustion of solid fuels in homes. It contains a number of easy-to-manipulate parameters, such as air changes per hour and cooking time, that are used to recreate distributions from which Monte Carlo analyses can be performed. It can estimate exposures using a number of methods.
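To give a feel for the approach, here is a minimal sketch of a generic well-mixed, single-compartment mass balance with Monte Carlo sampling in base R. It is not the WHO HOMES source code, and it may differ from the equations that model uses. The function name `box_daily_mean` and every input distribution below are assumptions chosen purely for illustration.

```r
# Generic well-mixed box model (an illustration, not the WHO HOMES code).
# Units: emission in mg/h, volume in m^3, ach in 1/h, times in hours.
# Returns the time-averaged concentration (mg/m^3) over one day, assuming
# a single cooking event that starts at t = 0 in clean air.
box_daily_mean <- function(emission, volume, ach, cook_hours, day_hours = 24) {
  steady <- emission / (ach * volume)               # steady-state concentration
  peak   <- steady * (1 - exp(-ach * cook_hours))   # level when the stove goes off
  # Integral of the concentration while the stove is on
  on  <- steady * (cook_hours - (1 - exp(-ach * cook_hours)) / ach)
  # Integral of the exponential decay after the stove goes off
  off <- peak / ach * (1 - exp(-ach * (day_hours - cook_hours)))
  (on + off) / day_hours
}

set.seed(42)
n <- 10000

# Hypothetical input distributions; not taken from the WHO model
draws <- data.frame(
  emission   = rlnorm(n, meanlog = log(100), sdlog = 0.5),
  volume     = runif(n, min = 20, max = 60),
  ach        = rlnorm(n, meanlog = log(15), sdlog = 0.4),
  cook_hours = runif(n, min = 2, max = 5)
)

# One simulated household-day per row
draws$daily_mean <- with(draws, box_daily_mean(emission, volume, ach, cook_hours))

# Summarize the simulated distribution
quantile(draws$daily_mean, probs = c(0.05, 0.5, 0.95))
```

Because the function is vectorized, all the draws are evaluated in a single call; swapping in measured distributions only means changing the `draws` data frame.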

A concise and useful guide on placing multiple ggplots in R

A great guide by Baptiste Auguié. It explains how to lay out multiple plots in a single window and provides useful examples.
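For a quick taste of one common approach, here is a small sketch using the gridExtra package, which Auguié maintains. It may not match the exact functions the guide covers, and the three plots are arbitrary examples built from the built-in `mtcars` data.

```r
library(ggplot2)
library(gridExtra)

# Three arbitrary example plots from a built-in data set
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(factor(cyl), mpg)) + geom_boxplot()
p3 <- ggplot(mtcars, aes(hp)) + geom_histogram(bins = 10)

# Layout matrix: plot 1 spans the top row, plots 2 and 3 share the bottom row
layout <- rbind(c(1, 1),
                c(2, 3))

# Build the combined layout, then draw it on a fresh page
combined <- arrangeGrob(p1, p2, p3, layout_matrix = layout)
grid::grid.newpage()
grid::grid.draw(combined)
```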

Some more graphs of Beijing's Air Pollution

A bunch of folks across the internet have been doing some great stuff with the air quality data coming out of China via official channels and the US Embassy Twitter feeds. My advisor asked for some graphs of available data. They are posted below (all were created in R using ggplot2). If time ever permits, I’ll post some interactive visualizations.

Shiny Server on WebFaction

Update: WebFaction today released a one-click installer for node.js, obviating Step 2 below. Leaving it in here for posterity.

Shiny “makes it super simple for R users like you to turn analyses into interactive web applications that anyone can use.” It’s a powerful tool with a relatively simple syntax. It’s great for local apps — but I wanted to set up a web-based app that others could access and that wasn’t beholden to Shiny and RStudio’s excellent beta server platform.

I host this site and a few others at WebFaction — an awesome service with little to no downtime, fast servers, and relatively flexible restrictions. Getting Shiny up and running on WebFaction required a little work.

Step 1: SSH into WebFaction. Follow the instructions on their website for your specific server(s).

Step 2: Make a source directory. Download and install node.js.

mkdir src
cd src
wget 'http://nodejs.org/dist/v0.10.20/node-v0.10.20.tar.gz'
tar -xzf node-v0.10.20.tar.gz
cd node-v0.10.20
python2.7 configure --prefix=$HOME
make PYTHON=python2.7
make PYTHON=python2.7 install

export NODE_PATH="$HOME/lib/node_modules:$NODE_PATH"
echo 'export NODE_PATH="$HOME/lib/node_modules:$NODE_PATH"' >> $HOME/.bashrc 

Step 3: Download and install R.

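# install R from source into $HOME
# (Step 2 left the shell inside the node source tree, so return to src first)
cd $HOME/src
wget 'http://cran.us.r-project.org/src/base/R-3/R-3.0.2.tar.gz'
tar -xzf R-3.0.2.tar.gz
cd R-3.0.2
./configure --prefix=$HOME
make
make install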

Step 4: Make a temporary (tmp) directory.

cd $HOME
mkdir tmp
chmod 777 tmp
TMPDIR=$HOME/tmp
export TMPDIR

Step 5: Download the Shiny Server source and install it using npm.

git clone https://github.com/rstudio/shiny-server.git
npm install -g shiny-server/

Installing from npm directly did not work: Shiny would not launch. I believe this is because you’re not allowed root access on WebFaction shared accounts.

Step 6: Launch R and install whatever packages you need.

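install.packages('ggplot2')
install.packages('data.table')
install.packages('devtools')
# current devtools expects "user/repo" rather than separate repo and username arguments
devtools::install_github("trestletech/ShinyDash")
devtools::install_github("rstudio/shiny-incubator")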

Step 7: Want plots to work? In your Shiny app’s global.R file, set

options(bitmapType = 'cairo')

Next up: a cron job to keep a Shiny instance running or to restart it if it goes down… and putting Shiny behind some light authentication to prevent pre-release apps from general consumption.

Batch Download IHME's Global Burden of Disease Data

A few requests had come in to download around 12 countries’ worth of the recently released Global Burden of Disease data from the IHME website. There’s no way to quickly download multiple files; by my count, it requires you to type the country name, click a link, click a tab, and then option-click a CSV file.

The URLs had relatively similar construction, so I wrote a quick R script to download all of the data and save each country’s file as a separate compressed RDS file. I also dropped a couple of redundant columns to try to save some space. The compression is pretty efficient; 25-27 MB files were reduced to between 6.6 and 7.4 MB. Check it out here or below.

Update (April 2015): Updated to allow users to specify download location, making it work better ‘out of the box’; users can specify whether to download as CSV or RDS (or both); fixed some other minor bugs; fixed a major change in the URL structure.
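The general shape of such a script can be sketched as follows. This is not the script linked above: the URL template, the country list, and the helper names (`build_url`, `save_country`, `fetch_country`) are all hypothetical, and the real IHME URL structure is not reproduced here.

```r
# Hypothetical URL template and country list, for illustration only
base_url  <- "https://example.org/gbd/%s.csv"
countries <- c("India", "Nepal", "Guatemala")

# Fill the country name into the template, escaping spaces and the like
build_url <- function(country) {
  sprintf(base_url, utils::URLencode(country, reserved = TRUE))
}

# Drop unwanted columns and write a compressed RDS file; returns the file path
save_country <- function(dat, country, out_dir, drop_cols = character()) {
  dat  <- dat[, setdiff(names(dat), drop_cols), drop = FALSE]
  path <- file.path(out_dir, paste0(country, ".rds"))
  saveRDS(dat, path, compress = "xz")
  path
}

# Download one country's CSV to a temporary file, then hand it to save_country()
fetch_country <- function(country, out_dir, drop_cols = character()) {
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp))
  download.file(build_url(country), tmp, mode = "wb", quiet = TRUE)
  save_country(read.csv(tmp, stringsAsFactors = FALSE), country, out_dir, drop_cols)
}

# Not run here, because the URL above is a placeholder:
# paths <- vapply(countries, fetch_country, character(1), out_dir = "gbd_data")
```

Keeping the download and the save steps in separate functions makes it easy to add a CSV output option or change the compression without touching the URL logic.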

R + Global Burden of Disease / Comparative Risk Assessment Data: A tutorial (version 0.1)

R can be scary for those new to it, but it is exceptionally useful for a number of things, including managing, importing, and merging text files; resaving them; and performing statistical analyses to your heart’s content. It is your friend, albeit one that you must learn to love slowly and painfully.

This brief tutorial does not serve as an introduction to R. Instead, it focuses on reading in a large, complex data set with ~1 million rows and 50+ columns. It was created to help facilitate some analysis in a GBD course at Berkeley. It will help you figure out how to do some basic manipulation and subsetting and export these subsetted data into a comma-separated text file (“csv”) for analysis in your favorite spreadsheet program. It is a work in progress and will be updated over time.
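As a small preview of that read, subset, and export workflow, here is a minimal sketch using the data.table package, which is one option among several and not necessarily what the tutorial itself uses. The toy data and its column names (`country`, `year`, `cause`, `dalys`) are invented for this example and stand in for the much larger GBD file.

```r
library(data.table)

# Toy stand-in for the GBD file; the column names and values are hypothetical
csv_in <- tempfile(fileext = ".csv")
fwrite(data.table(
  country = c("India", "India", "Nepal", "Nepal"),
  year    = c(1990, 2010, 1990, 2010),
  cause   = c("LRI", "LRI", "COPD", "LRI"),
  dalys   = c(10.5, 7.2, 3.1, 2.4)
), csv_in)

# fread() is a fast reader that copes well with very large delimited files
gbd <- fread(csv_in)

# Keep one year and one cause, and only the columns of interest
sub <- gbd[year == 2010 & cause == "LRI", .(country, year, dalys)]

# Write a plain CSV that opens in any spreadsheet program
csv_out <- tempfile(fileext = ".csv")
fwrite(sub, csv_out)
```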

Guess which state has the most medal winners from the Great American Beer Festival?

Subtitle: Mapping the 2012 GABF Winners Using R

The Great American Beer Festival (GABF) announced its winners on October 13. Lots of amazing beers from all over the US. They have a nifty search feature which lets you (1) find beers from specific states, (2) search by year of competition, (3) search by award - gold, silver, or bronze, and (4) search by keyword.

Like a true beer-loving nerd, I was curious to see which state won the most awards and to look at the geographic distribution of winners. I also needed to learn how to make simple maps using R for some work-related stuff. The confluence of curiosity and need got me giddy… and set me to work. Turns out that making simple maps in R is… simple.

More on the details of the process in a few days (along with a table outlining the above data). In the meantime, revel in the beer mecca that is California.