Calculate Wages and Benefits in R with blscrapeR

The most difficult thing about working with BLS data is gaining a clear understanding of what data are available and what they represent. Some of the more popular data sets can be found on the BLS Databases, Tables & Calculations website. The selected examples below do not include all series or databases. Install blscrapeR The Read More


Mapping US Counties in R with FIPS

Anyone who’s spent any time around data knows primary keys are your friend. Enter the FIPS code. FIPS is the Federal Information Processing Standard and appears in most data sets published by the US government. Name Matching The map below is an example of the “wrong way” to do something like this. This map uses Read More


Adding Pitching Stats to the Lahman R Package

Lately I’ve rediscovered the Lahman package for R. Since I’ve got a Lahman database on my localhost, I normally use a db connection in R to grab the data I need. In the process it’s easy to forget how quick and easy the Lahman package makes it to gather baseball data. No Love for the Read More


Cluster Analysis of an Umpire’s Strikezone in R

Ever wonder how “high” or “low” an umpire’s strikezone is compared to the rest of the league? Thanks to some public data and the pitchRx package, it’s easy to use a cluster analysis to figure it out! Tim Timmons vs. the League For this analysis I’m going to pick on umpire Tim Timmons, for no Read More


Top 5 Replacement Players That Will Surprise You: Baseball’s WAR

Wins above replacement (WAR) is the wild wild west of unstandardized baseball stats. With the 2014 MLB playoffs about a month away, I’ve started looking at some surprises in the season’s WAR rankings. I ran some standard calculations in Ben Baumer’s openWAR package, and among the surprises were the names on the list of “replacement Read More


Top 5 Metrics That Make Batting Average Obsolete

Every time I watch a ball game and hear the announcers talk about a player’s batting average, my skin crawls. Why? Because batting average is one of the worst stats you can base a player’s performance on. By the time you’re done reading this, you’ll convince all your friends never to speak of BA again. Read More


Cool things to do with pitchRx in R Studio (Part 1)

I recently wrote a post on how to collect MLB game day data using the pitchRx package in R and store that data in a SQLite database. Data scraping is fun, but all the data in the world is useless unless you can do something with it. The real power of pitchRx, in my opinion, Read More


How to Use R-Studio with Tableau

A little while back we posted an article on the pros and cons of connecting Tableau with R. All the while, we failed to post a tutorial on how to do so. It’s pretty simple really. 1. Open R-Studio and install the Rserve package. If you need help installing R-Studio on Linux, see our Read More


How to Install R in Linux Ubuntu 14.04

The R programming language is one of the most powerful tools for data analysis on any operating system. There are two major parts of each install: R-base, which installs the programming language and its dependencies, and R-Studio, which is an open-source IDE for R. If you need help installing R on Ubuntu 16.04 Read More