Posts tagged “statistics”

Non-elite marathon performance

Some fun stuff from Runrepeat.com:

The purpose of this study was to compare recreational runners. Therefore elite runners were omitted hence including them would bias the results dramatically.

In this analysis, we looked at the following marathons from 2009 to 2014: Chicago, Marine, Boston, London, Paris, Berlin, Frankfurt, Athens, Amsterdam, Budapest, Warszawa and Madrid. This gave us a total of 72 marathon races.

Iceland’s got the world’s fastest non-elite marathon runners: men clock in at ~3:52 and women at 4:18. Pretty impressive. In comparison, men in the US finish at around 4:19, while women finish at ~4:42. Developed and rapidly developing countries seem to have seen the largest increases in marathon participation between 2009 and 2014.

BBC's Your life on earth

Put in a few facts about yourself (birthdate, gender and height) and get an assortment of facts about how the world has changed since your arrival.

Some of mine:

  • Population has increased by ~2.8 billion; life expectancy is 8 years longer than when I was born
  • The BBC projects that oil and coal will run out by the time I’m 80. They estimate gas supplies will last beyond my lifetime, but not my children’s.

If you were born 4 years ago:

  • Population has increased by ~327 million, 10 million more than the population of the US!
  • While you’re on average (in the US) 3.3 ft tall, a coastal redwood would have grown ~5 ft.

Kind of fun. I’d be interested to know a bit more about their data projections. They do offer a little bit of information, at least, about where the data came from.

via kottke

The New Yorker: Mapping the Rise of Craft Beer

As part of their Ideas of the Week series, the New Yorker mapped craft beer breweries and other beer-related statistics. The interactive maps are particularly fun to play with.

As of March, the United States was home to nearly two thousand four hundred craft breweries, the small producers best known for India pale ales and other decidedly non-Budweiser-esque beers. What’s more, they are rapidly colonizing what one might call the craft-beer frontier: the South, the Southwest, and, really, almost any part of the country that isn’t the West or the Northeast. The interactive map below, based on newly released 2012 data gathered by the Brewers Association, illustrates this phenomenon and offers a detailed overview of the American craft-beer industry.

Visualizing US Gun Killings in 2010

An impressive visualization created by Periscopic using public data. They calculated counterfactual stories for each of the individuals killed by gun violence, offering an alternate likely cause of death had they not been killed. Their description of their methods:

Our data comes from the FBI’s Uniform Crime Reports, which include voluntarily-reported data from police precincts across the country. In 2007, according to the FBI, law enforcement agencies active in the UCR Program represented more than 285 million US inhabitants—94.6% of the total population. This special dataset is at the raw, or incident, level—containing details of each person who was killed, including their age, gender, race, relationship to killer, and more.

For the gray lines, we calculated alternate stories for the people killed with guns using data from the World Health Organization. To calculate an alternate story, we first performed an age prediction weighted according to the age distribution of US deaths. Using this age, we then predicted a likely cause of death at that age. We do not adjust for life-expectancy differences between demographic groups, as we have not yet found data to that extent. We used data from 2005, the most recent year available.
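The two-step draw they describe (an age first, then a cause of death given that age) can be sketched roughly as below. This is only my reading of the description, not Periscopic's code. Every number is an invented placeholder rather than WHO or FBI data, the age brackets and cause labels are hypothetical, and the rule that the alternate age cannot fall in an earlier bracket than the actual age at death is my own assumption.

```python
import random

# PLACEHOLDER weights, invented for illustration only (not WHO or FBI data).
AGE_WEIGHTS = {"0-24": 0.05, "25-49": 0.10, "50-74": 0.35, "75+": 0.50}

# PLACEHOLDER conditional weights for hypothetical cause labels in each bracket.
CAUSES_BY_AGE = {
    "0-24": {"cause_a": 0.6, "cause_b": 0.4},
    "25-49": {"cause_a": 0.4, "cause_b": 0.6},
    "50-74": {"cause_a": 0.2, "cause_b": 0.8},
    "75+": {"cause_a": 0.1, "cause_b": 0.9},
}

AGE_ORDER = list(AGE_WEIGHTS)  # brackets in ascending age order


def alternate_story(actual_bracket, rng):
    """Draw an alternate age bracket, then a cause of death for that bracket."""
    # Assumption: the alternate death cannot come earlier than the actual one,
    # so only the actual bracket and later brackets are candidates.
    candidates = AGE_ORDER[AGE_ORDER.index(actual_bracket):]
    weights = [AGE_WEIGHTS[b] for b in candidates]
    bracket = rng.choices(candidates, weights=weights, k=1)[0]

    # Second step: pick a cause using the weights for the drawn bracket.
    causes = CAUSES_BY_AGE[bracket]
    cause = rng.choices(list(causes), weights=list(causes.values()), k=1)[0]
    return bracket, cause


rng = random.Random(2010)  # fixed seed so the sketch is repeatable
print(alternate_story("25-49", rng))
```

With real tables in place of the placeholders, each gray line would correspond to one such draw.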

Arrested for breaking the law of large numbers

James B. Stewart wrote a piece about Apple and the Law of Large Numbers. Maybe the article was okay as a whole, but he entirely misrepresented the Law of Large Numbers, which has ab-sa-toot-ly nothing to do with individual corporations and their future growth projections.

How does this slip by the NYT? How does it get by their legions of fact-checkers, statisticians, and Nate Silver? Shouldn't a paper that continually pushes the need for better science and math education represent statistical theory and mathematical theorems properly?

I digress. A fun piece from Dr. Drang about this blunder has popped up in the blogosphere.

Let's start with what the Law of Large Numbers really states. Put simply, it says that the sample mean of a random variable will tend toward the underlying population mean as the number of samples grows larger. For example, Wolfram Alpha says the average height of an adult male is 5′ 9″. If you measured the height of a few randomly selected men, you might get an average for your sample that's quite far from 5′ 9″. But if you increased the size of the sample, the tendency would be for your sample average to move closer to 5′ 9″.

The law does not state that "a variable will revert to a mean over a large sample of results." The Law of Large Numbers says nothing about individual measurements; it's all about averages. And it certainly doesn't "suggest" anything about the future growth of large companies.

If the Law of Large Numbers worked the way Stewart says, you could repeatedly measure the height of Dirk Nowitzki and he'd eventually shrink down to 5′ 9″. I'm surprised the Mavericks' opponents haven't thought of this.
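The height example is easy to try as a quick simulation. This is a minimal sketch, assuming heights are normally distributed around 69 inches (the 5′ 9″ figure above) with a standard deviation of 3 inches. The spread and the normal shape are my assumptions, not something from the article, and the function name is made up for illustration.

```python
import random
import statistics


def sample_mean_height(n, mu=69.0, sigma=3.0, seed=0):
    """Average of n simulated heights in inches.

    mu=69.0 is the 5'9" figure quoted above; sigma=3.0 and the normal
    distribution are assumptions made only for this illustration.
    """
    rng = random.Random(seed)
    return statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))


# Larger samples should tend to land closer to the population mean.
for n in (5, 50, 5000, 500000):
    mean = sample_mean_height(n)
    print(f"n={n:>6}: sample mean = {mean:.2f} in, off by {abs(mean - 69.0):.2f} in")
```

The sample averages drift toward 69 as n grows, but nothing pulls any individual toward the mean. Call `sample_mean_height(500000, mu=84.0, sigma=0.1)` to measure one seven-footer over and over, with 0.1 inches as an assumed measurement error, and the average stays at about 84 inches.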

Read the full article here.

Stata Bundle for TextMate

This link is as much for me as for anyone reading this thing. I'm beginning to learn Stata, a statistics software package, as it is broadly used at the School of Public Health at Berkeley [along with a fair amount of SAS and R, just to keep things interesting].

I was looking for and found an updated Stata bundle for TextMate, which works nicely and can be quickly updated and customized.

Thanks to Dan Bylr.

all rights reserved
snarglr is written & maintained by ajay pillarisetti


