A couple of real-life events prompted this post:

  1. I took an online course on Algorithmic Trading for Python; and
  2. My partner and I are thinking about moving and are evaluating the housing market.

The best way to combine the two events is obvious: use my new coding skills to visualize public real estate data. Yes, I know I'm a little late to the data analytics game. That said, I'm sure there are folks out there who could benefit from this data, so here we are.

In Part 1 of a series of posts, I use Zillow data via Quandl to plot some indicators and maps of real estate pricing in the Bay Area. The data is easy to manipulate, and you can change the variables and run it for your own location, or modify the code however you see fit. I used Python in the Jupyter Notebook environment to manipulate the data, and the open source Plotly library to visualize the data. The notebook files are linked at the bottom of the post.

Things to look forward to in this series:

  1. Further data analysis
  2. Analysis of data from other sources (Redfin, US Government, etc.)

First, we are going to look at some geographic plots to see what the market looks like by zip code. Note: Data is not available for every zip code on every date, so we will work with what we have.

Caveat - the Plotly graphs display best on a desktop or larger screen device but still work on mobile.


In this plot, we are looking at the Median Rental Price Per Square Foot in each zip code. Darker red is more expensive; yellow is less expensive. Prices in Alameda County, CA range from about $2 to $3.50 / sq. ft. The numbers are for June 2018.

We can see that Oakland appears to be the most expensive, followed by Berkeley and Alameda. Dublin, Pleasanton, and Fremont are about mid-range. Hayward and Union City offer the lowest rental rates.

Let's compare this with the Zillow Rental Index Per Square Foot, which purportedly accounts for variance in month to month data and outliers.

Based on the Zillow index, the overall range is a bit narrower: roughly $1.60 to $3 / sq. ft. This still supports our estimate that Downtown Oakland and Berkeley are the most expensive places to live, at around $3 per sq. ft., versus Hayward at nearly half that, about $1.60.

What this data translates to is that, as of June 2018, for a 1000 sq. ft. place you are looking at paying between $1,600 and $3,000 a month depending on location. It seems that June may be an expensive month for rentals, but we won't be able to confirm that without accounting for trends over time.
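The arithmetic behind that range is simply price per square foot times floor area. Here is a minimal sketch using the rounded June 2018 endpoints quoted above; the function and variable names are mine for illustration and are not taken from the notebooks.

```python
def monthly_rent(price_per_sqft, sqft):
    """Estimated monthly rent: price per square foot times floor area."""
    return price_per_sqft * sqft


# Rounded endpoints of the rental index per sq. ft. range quoted in the post.
low_rate, high_rate = 1.60, 3.00
area = 1000  # sq. ft.

low = monthly_rent(low_rate, area)
high = monthly_rent(high_rate, area)
print(f"${low:,.0f} to ${high:,.0f} per month")
```

Swap in your own floor area or a rate from another zip code to get a rough estimate for a different place.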


Let's move on to some time series analysis. Here we compare these Zillow Indicators:

  1. Zillow Rental Index
  2. Price to Rent Ratio
  3. Median Rental Price for All Homes
  4. Median Rental Price Per SqFt for All Homes

Across these cities in the East Bay:

  1. Oakland
  2. Berkeley
  3. Dublin
  4. San Ramon
  5. Fremont

The rental index shows that Berkeley and San Ramon are currently the most expensive places to rent. Note: This plot does not take square footage into account. Dublin and Fremont are around the same level and follow a similar trend. We can also start to see a pattern where rents go up between May and October and then fall in the later months. It looks like the optimal time to rent may be between November and March. Let's see if any of the other plots can help confirm this theory.

What the Price to Rent Ratio chart tells us is that the optimal time to buy a house in the East Bay was late 2014 / early 2015. That being said, for most areas except Oakland and Emeryville, it's more economical to rent than to buy a home.

I've read some charts from other sources that do not agree with this data and show the price to rent ratio being much higher for these areas. As with everything on the internet, take this chart with a grain of salt. I will do some more research to confirm or deny these findings in the next post.
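For anyone unfamiliar with the indicator: a price to rent ratio is commonly computed as a home's price divided by one year of rent, so a higher number means buying is expensive relative to renting. The sketch below assumes that common definition (Zillow's exact methodology may differ) and uses made-up numbers, not values from the chart.

```python
def price_to_rent_ratio(home_price, monthly_rent):
    """Home price divided by one year of rent (a common definition)."""
    return home_price / (monthly_rent * 12)


# Hypothetical numbers for illustration only, NOT taken from the Zillow data.
home_price = 900_000
monthly_rent = 3_000

ratio = price_to_rent_ratio(home_price, monthly_rent)
print(f"Price to rent ratio: {ratio:.1f}")  # prints "Price to rent ratio: 25.0"
```

Different sources may use different prices (median vs. index values) and different rent measures, which is one plausible reason two charts of "the" ratio can disagree.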

In the Median Rental Price chart, we can see that there is a seasonality to rental prices. Looking at the data from 2015, 2016, and 2017, we can confirm that rental prices do drop between November and March, with some variance depending on the year. It also supports the Zillow Rental Index finding that Berkeley and San Ramon are the most expensive areas.
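One simple way to check for this kind of seasonality is to average the rent for each calendar month across several years and compare the November to March average with the May to October one. The sketch below does that with the standard library only. The data is synthetic and deliberately built with a summer bump, so it demonstrates the method and says nothing about the real market.

```python
from collections import defaultdict
from statistics import mean

# Made-up monthly median rents as (year, month, dollars), NOT real Zillow data.
# A $150 summer bump and a $100 yearly increase are baked in on purpose.
rents = [
    (year, month, 2500 + 100 * (year - 2015) + (150 if 5 <= month <= 10 else 0))
    for year in (2015, 2016, 2017)
    for month in range(1, 13)
]


def average_by_month(rows):
    """Average the rent for each calendar month across all years."""
    buckets = defaultdict(list)
    for _year, month, rent in rows:
        buckets[month].append(rent)
    return {month: mean(values) for month, values in sorted(buckets.items())}


monthly_avg = average_by_month(rents)

winter = mean(monthly_avg[m] for m in (11, 12, 1, 2, 3))
summer = mean(monthly_avg[m] for m in (5, 6, 7, 8, 9, 10))
print(f"Nov-Mar average: ${winter:,.0f}")
print(f"May-Oct average: ${summer:,.0f}")
```

With real data, a strong upward trend within each year can blur this comparison, so it is worth looking at each year separately as well.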

Take the Median Rental Price Per SqFt data with a grain of salt as well, since the Berkeley and Emeryville numbers only date back to January 2017 and appear to fluctuate differently than the other cities. That being said, it does confirm that Berkeley, Oakland, and Emeryville are the most expensive per square foot. The previous chart showed San Ramon, Dublin, and Fremont with a higher rental index than Oakland and Emeryville, yet in this chart those cities show a much lower price per square foot. This suggests that homes in Berkeley, Oakland, and Emeryville on the whole have much less square footage than their counterparts. This is important to consider if space is an issue, or you are trying to start a family.
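The reasoning here is a division: a typical rent divided by a typical rent per square foot gives a rough implied floor area. A small sketch with hypothetical figures (the city labels and numbers are placeholders, not values read from the charts):

```python
def implied_sqft(median_rent, rent_per_sqft):
    """Rough typical floor area implied by two median figures."""
    return median_rent / rent_per_sqft


# Hypothetical (monthly rent, rent per sq. ft.) pairs for illustration only.
cities = {
    "City A (higher rent, lower $/sq. ft.)": (3200, 2.00),
    "City B (lower rent, higher $/sq. ft.)": (2700, 3.00),
}

for name, (rent, per_sqft) in cities.items():
    print(f"{name}: ~{implied_sqft(rent, per_sqft):,.0f} sq. ft.")
```

Dividing one median by another is only an approximation of the median home size, so treat the result as a ballpark figure.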


TL;DR:

The Bay Area is expensive. It's often more economical to rent than to buy. If you are able to plan for it, you are more likely to get a better rental price between November and March. And you will get much more square footage for your money in Dublin or Fremont than in Berkeley or Oakland.

This was a fun little exercise. My Python Notebook files are listed below for you to use and edit if you'd like. You can run the numbers on different indicators or locations. Let me know if you find something useful! I would love more insight.

Geo Plot Notebook
Time Series Notebook

Questions, errors, comments? Post below.