We empirical economists get very excited about finding or generating new data sets. There are big returns to splicing together different data sources to answer new and interesting questions. This is hard work and not everyone is good at it. Many datasets are hidden deep inside government vaults and some are under lock and key. You need to become a sworn Census Officer to access many of them. I have a lot of friends who took that oath. There is a reason to take privacy concerns seriously and we need to protect the identity of individuals in these microlevel datasets.
What I find frustrating, though, is how difficult it is to get access to publicly available data on the energy economy. It should not be hard to figure out where we are drilling for natural gas. It should not be difficult to figure out where pipelines run. Wouldn’t it be nice to have the majority of the public datasets the federal government collects on energy online in an easy and accessible format? Make a map of gas wells. Make a map of oil wells. Download these data in GIS format onto your computer and splice them in with your favorite dataset. For those of you who have tried to do this in the past, it was an exercise in banging your head against your 27” flatscreen. Well, a new era has dawned. I am not sure which EIA administrator (Richard Newell, Howard Gruenspecht or Adam Sieminski) deserves the credit, but a little while back the EIA started rolling out its U.S. Energy Data Mapping System.
What they have done is start putting their spatial datasets online in one easily accessible system. So how about that map of natural gas and oil wells? Thirty seconds of clicking produces this!
A quick click on notes and sources gets you access to where the data come from. If the shapefiles are public, there is a link to download the original data. Clicking on individual points on the map gets you information about that datapoint.
This effort by the EIA to make public data accessible will not only generate new papers, but also make it easy for anyone interested in the local and national energy economy to visualize the aspects relevant to their inquiry.
For the nerdier folks who know what an API is, the EIA now has that too. If you program in R, Matlab or Stata (or use Excel), you can get updated versions of data series automatically via this interface. Currently this interface contains:
408,000 electricity series organized into 29,000 categories
30,000 State Energy Data System series organized into 600 categories
115,052 petroleum series and associated categories
11,989 natural gas series and associated categories
132,331 coal series and associated categories (released Feb 25, 2014)
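For readers who would rather script this than click, here is a minimal Python sketch of what pulling one series through the API might look like. Treat the details as assumptions rather than the EIA's documented interface: the base URL, the api_key and series_id parameter names, and the response shape (a "series" list whose first entry carries a "data" list of period and value pairs) reflect my understanding of the API as first released and may have changed since. The series ID and the sample numbers are made-up placeholders, and you would need to register for your own API key.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: endpoint and parameter names as in the API's first release.
# Check the current EIA documentation before relying on them.
BASE_URL = "https://api.eia.gov/series/"


def build_series_url(api_key, series_id, base_url=BASE_URL):
    """Return the request URL for a single data series."""
    query = urlencode({"api_key": api_key, "series_id": series_id})
    return f"{base_url}?{query}"


def parse_series(payload):
    """Convert a decoded JSON response into (period, value) pairs sorted by period.

    Assumes the shape {"series": [{"data": [[period, value], ...]}]}.
    """
    rows = payload["series"][0]["data"]
    pairs = [(str(period), value) for period, value in rows]
    return sorted(pairs, key=lambda pair: pair[0])


def fetch_series(api_key, series_id):
    """Download one series over HTTP and return its (period, value) pairs."""
    with urlopen(build_series_url(api_key, series_id)) as response:
        return parse_series(json.load(response))


if __name__ == "__main__":
    # Offline demonstration: a hypothetical payload with placeholder numbers.
    sample = {"series": [{"data": [["2013", 2.0], ["2012", 1.0]]}]}
    print(parse_series(sample))
```

With a real key, a call such as fetch_series("YOUR_API_KEY", "EXAMPLE.SERIES.A") would go over the network. The parsing step is kept separate so it can be checked offline, and so a change in the response format only touches one function.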
With so much data readily accessible to anyone with a computer, I anticipate that we will see more papers on, better analysis of, and certainly better maps depicting the national energy economy. For those of you who get your horoscope by email in the morning, I would also highly recommend subscribing to the daily “Today in Energy” mailing. These emails (unlike your daily horoscope) are well researched, short pieces discussing something important and/or timely every morning. I could not imagine having my coffee without them.