Check out this JustSayNow article, seriously in need of a graph. As the author states, "While correlation is not causation, I think it is interesting that the top ranked cities for creativity all have extremely tolerant attitudes toward marijuana use." While not actually ripe for statistical analysis as described, how might these variables be operationalized?
Welcome, Again: This site has gone through several iterations, first as an idea proposed in an Ask Reddit thread, then the creation of the first few posts you see here. The site was built in 2010, although work ended quickly with a post offering to give the site away to a good home.
Several years later, I had a similar idea for a site to be used in courses related to statistics or research methods. I teach in the Sociology and Criminology Department at the University of North Carolina Wilmington, and am also a web hobbyist. After researching domain names, I bought statfail.com and soon discovered this domain had been previously developed. Archive.org presented a site very similar to what you see here. I liked the design, and the content offered a good start in my web-based examination of the perils and prospects of statistical analysis. The site also had a decent Google PageRank, which fed into my curious hobby. Rather than start at zero, I contacted the original developer for permission to use this content. Thank you, Decabear, for agreeing to allow this content to be recycled.
Additional content will be added in time. While often comical, the correlations presented here may be closer to the reality of today's media than we care to admit. The goal is to help my students, and various site visitors, become more critical of statistical analysis. In particular, we examine the confusion of correlation and causation.
Time order may come into play as well. For now, I am going to step outside, open my umbrella, and cause it to rain.
Here are the facts: Album sales in the music industry have been in decline. This decline started in 2000. Nickelback scored their first big hits in 2000. How can the logic and results of this convincing statistical analysis be denied? Interestingly, charting Pontiac Aztek sales against total Pontiac sales will demonstrate the same precipitous decline.
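For anyone who wants to poke at why this kind of "analysis" looks so convincing, here is a minimal sketch in Python. The numbers are entirely made up for illustration (they are not real album sales or airplay figures), and the names `album_sales_index` and `band_airplay_index` are hypothetical. The point is only that two series drifting steadily over the same years will correlate strongly in their raw levels, even when their year-over-year changes have nothing to do with each other.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def year_over_year_change(values):
    """Difference between each value and the one before it."""
    return [later - earlier for earlier, later in zip(values, values[1:])]

# MADE-UP numbers, ten hypothetical years. Each series is a straight-line
# trend plus a small wiggle that is unrelated to the other series' wiggle.
wiggle_a = [1, -1, 1, -1, 1, -1, 1, -1, 1, -1]
wiggle_b = [1, 1, -1, -1, 1, 1, -1, -1, 1, 1]
album_sales_index = [100 - 5 * i + 2 * w for i, w in enumerate(wiggle_a)]   # falling
band_airplay_index = [10 + 4 * i + 2 * w for i, w in enumerate(wiggle_b)]   # rising

# Correlation of the raw levels: dominated by the two opposite trends.
r_levels = pearson(album_sales_index, band_airplay_index)

# Correlation of the year-over-year changes: the shared trend is removed.
r_changes = pearson(
    year_over_year_change(album_sales_index),
    year_over_year_change(band_airplay_index),
)

print(f"levels:  r = {r_levels:.2f}")
print(f"changes: r = {r_changes:.2f}")
```

In this toy setup the level correlation comes out close to -1, while the correlation between the yearly changes is essentially zero. Comparing changes rather than levels is one common, rough check for a shared time trend. It does not prove or disprove causation on its own.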
Here are the facts: The more Google results something has, the more likely it is to be popular. This is a roughly linear relationship. Given this, “sex with a picnic table” garnered about 60% as many results as “sex with a human being.” (Statfail's snarky 2012 comment: Nice work, Decabear! Shady statistical analysis coupled with an imprecise Google search. Well done!)
Here are the facts: Texas has the most Miss USA pageant winners of any state by far, and has one of the highest rates of obesity. Colorado, on the other hand, has the lowest rate of obesity, and a dearth of hot chicks, apparently. This statistical analysis left out a few variables, or perhaps not, but Statfail's extensive ski slope data collection leads one to question this result.
Here are the facts: There have been four generations of Pokémon games. The release of each has been followed by an economic upturn. Coincidence? Statistical analysis at its best?
What do you think? Coincidence or not? How does this graph help us define statistical analysis?
Welcome to Statfail
This site has gone through several iterations, first as an idea proposed in an Ask Reddit thread.
Original content, limited to the first six posts, remains. These posts, and others to be added, are intended to help visitors (including my students) learn statistical analysis. I will use this site in my online statistics courses, where students develop an understanding of tools for analyzing data, correlation, causation, and the application of statistics in data analytics. This site is intended to define statistical analysis by encouraging visitors to ask "What is data analysis?", "Why is it important to understand statistical analysis?", and "How often is data mis-analyzed?"