I have wanted to write about neural networks for a while, mainly because I saw it as an opportunity to summarize some of the things I learned working with Keras, a high-level neural networks API that builds on top of TensorFlow (among others). I have used and come to like the latter, but always thought it lacked a bit on the usability side. This will be the first in a series of posts on Keras and neural networks. We won’t do anything fancy, just introduce the dataset we’ll be working on and showcase the basic usage of Keras.

Oslo Bikes

You probably know I like to work with bike sharing trip data, but this time I wanted to look at the open data provided by the Oslo City Bike program. I downloaded some of their datasets, containing 2.5 million trips taken in Oslo between April 1st, 2017 and September 20th, 2017 (Norwegians love to cycle). The number of daily trips varies quite a bit, as the plot below shows (on May 17th, for example, the system was shut down because of the national holiday).

I aggregated the trips to a resolution of one hour, which gives a dataset looking like this:

date                  trips
2017-05-01 10:00:00   353.0
2017-05-01 11:00:00   617.0
2017-05-01 12:00:00   817.0
2017-05-01 13:00:00  1023.0
2017-05-01 14:00:00  1145.0

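We will use autoregressive models to predict the number of trips taken in a given hour, using the number of trips taken in the past as inputs. Some pandas magic gives us the following dataset. Each column holds the trip count from that far back: 1h is the count one hour earlier, 1d the count at the same hour one day earlier, and so on up to 7d.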

date                     1d      1h      2d      2h      3d      3h      4d      4h      7d  target
2017-05-01 10:00:00   324.0   244.0   461.0   162.0   250.0    94.0   311.0    50.0    88.0   353.0
2017-05-01 11:00:00   589.0   353.0   664.0   244.0   394.0   162.0   394.0    94.0    86.0   617.0
2017-05-01 12:00:00   801.0   617.0   845.0   353.0   472.0   244.0   412.0   162.0   119.0   817.0
2017-05-01 13:00:00   890.0   817.0   966.0   617.0   468.0   353.0   457.0   244.0   182.0  1023.0
2017-05-01 14:00:00  1025.0  1023.0  1100.0   817.0   593.0   617.0   609.0   353.0   288.0  1145.0
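
The pandas code behind this table is not shown in this post, so the following is only a minimal sketch of one way to build it. It assumes an hourly series with a DatetimeIndex and no missing hours; the names `LAGS` and `make_lagged_frame` and the synthetic series are made up for illustration.

```python
import numpy as np
import pandas as pd

# Lags in hours, keyed by the column names used in the table above.
LAGS = {"1h": 1, "2h": 2, "3h": 3, "4h": 4,
        "1d": 24, "2d": 48, "3d": 72, "4d": 96, "7d": 168}

def make_lagged_frame(trips: pd.Series) -> pd.DataFrame:
    """Build lagged covariates plus the target from an hourly trip series.

    Assumes `trips` has a regular hourly index without gaps, so that
    shifting by k rows is the same as going back k hours.
    """
    frame = pd.DataFrame({name: trips.shift(hours) for name, hours in LAGS.items()})
    frame = frame[sorted(frame.columns)]  # same column order as the table
    frame["target"] = trips
    return frame.dropna()                 # drop rows without a full week of history

# Synthetic stand-in for the real hourly counts (two weeks of data).
index = pd.date_range("2017-04-01", periods=24 * 14, freq="h")
trips = pd.Series(np.arange(len(index), dtype=float), index=index, name="trips")
lagged = make_lagged_frame(trips)
print(lagged.head())
```

Because the longest lag is one week, the first seven days of the series only serve as history and do not appear as rows in the result.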

The target is the number of trips taken; the lagged values are our covariates. We’ll use the last week of the data as a test set and train our models on the rest.
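
A split like this can be sketched as follows, assuming the lagged table is a DataFrame indexed by timestamp. The helper name `split_last_week` and the synthetic frame are placeholders, not the code used for this post.

```python
import numpy as np
import pandas as pd

def split_last_week(frame: pd.DataFrame, target: str = "target"):
    """Hold out the final 7 days (by timestamp) as the test set."""
    cutoff = frame.index.max() - pd.Timedelta(days=7)
    train = frame[frame.index <= cutoff]
    test = frame[frame.index > cutoff]
    features = [c for c in frame.columns if c != target]
    return train[features], train[target], test[features], test[target], features

# Synthetic stand-in: four weeks of hourly rows with two lag columns.
index = pd.date_range("2017-08-24", periods=24 * 28, freq="h")
rng = np.random.default_rng(0)
frame = pd.DataFrame(rng.poisson(300, size=(len(index), 3)).astype(float),
                     index=index, columns=["1h", "1d", "target"])

X_train, y_train, X_test, y_test, features = split_last_week(frame)
print(len(X_train), len(X_test))
```

Splitting by time rather than at random keeps the test set strictly in the future of the training data, which is what we want for a forecasting problem.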


To get started with Keras, all you need to do is install it using

pip install keras
pip install tensorflow

Building our first model is pretty straightforward. After reading the excellent user guide and documentation, you’ll soon be writing your own. No comparison to TensorFlow!

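from keras.models import Sequential
from keras.layers import Dense, Dropout

# `features` is the list of lagged column names (1h, 1d, ...); inputs are
# expected with shape (n_samples, 1, len(features)).
dense_model = Sequential()
dense_model.add(Dense(32, input_shape=(1, len(features)), activation='relu'))
dense_model.add(Dense(16, activation='relu'))
dense_model.add(Dense(1, activation='linear'))  # linear output for regression
dense_model.compile(loss='mean_squared_error', optimizer='adam')
# The fit call is cut off after X_train in the original; y_train and the
# default training settings are assumed here.
res = dense_model.fit(X_train, y_train)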
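
Note that `input_shape=(1, len(features))` means the network expects three-dimensional input of shape `(n_samples, 1, n_features)`, i.e. one time step per sample. How `X_train` was prepared is not shown, so the snippet below is just one plausible way to get there from a 2D feature table with NumPy; `to_keras_input` is a hypothetical helper name. Since each Dense layer acts on the last axis, the model output has shape `(n_samples, 1, 1)`, and the target is reshaped to match.

```python
import numpy as np

def to_keras_input(X, y):
    """Reshape 2D features and 1D targets to (n, 1, n_features) and (n, 1, 1)."""
    X = np.asarray(X, dtype="float32")
    y = np.asarray(y, dtype="float32")
    return X.reshape(len(X), 1, -1), y.reshape(-1, 1, 1)

# Toy example: 5 samples with 9 lagged features each.
X_2d = np.arange(45, dtype="float32").reshape(5, 9)
y_1d = np.arange(5, dtype="float32")

X_train, y_train = to_keras_input(X_2d, y_1d)
print(X_train.shape, y_train.shape)  # (5, 1, 9) (5, 1, 1)
```

With a pandas table, passing `frame[features].values` and `frame["target"].values` to such a helper would work the same way.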

So how well does our basic model do? As a comparison, I’ve also fitted a gradient boosted tree and a fancier neural network employing long short-term memory (LSTM) units (building one is super easy in Keras). The good news is that the problem is quite a harmless one, and predictions are close to the actual values.
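
The LSTM network itself is not shown here. As a rough idea of what such a model can look like in Keras, here is a minimal sketch; the layer size, the single time step per sample, and the random stand-in data are all assumptions rather than the configuration behind the results in this post.

```python
import numpy as np
from keras import Input
from keras.models import Sequential
from keras.layers import LSTM, Dense

n_features = 9  # assumed: one input per lagged column

# Input has shape (time steps, features); here a single time step per sample.
lstm_model = Sequential([
    Input(shape=(1, n_features)),
    LSTM(32),                       # returns the last hidden state, shape (batch, 32)
    Dense(1, activation="linear"),  # one regression output per sample
])
lstm_model.compile(loss="mean_squared_error", optimizer="adam")

# Random stand-in data, only to show the expected shapes.
rng = np.random.default_rng(0)
X = rng.random((64, 1, n_features)).astype("float32")
y = rng.random((64, 1)).astype("float32")

history = lstm_model.fit(X, y, epochs=1, batch_size=16, verbose=0)
predictions = lstm_model.predict(X, verbose=0)
print(predictions.shape)
```

Compared to the dense model, the only structural change is swapping the first layers for an LSTM layer, which is why the same 3D input can be reused.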

The errors are decently distributed, and plotting the predictions against the actuals doesn’t show any significant non-linearities.

Staring at the above plot for a little bit, one could argue that the neural networks show a little bit less variance than the tree in this example. The residual errors on the test set are virtually identical for all three methods, though. This is not surprising: of course a neural network does as well as any other method on a harmless dataset. The thing is, using Keras, it is also as easy to train as any other model, e.g. one from scikit-learn.
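
To make such a comparison concrete, one would typically compute a summary of the residuals per model, such as the root mean squared error (RMSE) and the mean absolute error (MAE). The predictions below are made up for illustration, and `residual_summary` is a hypothetical helper; this is not the evaluation code or the results from this post.

```python
import numpy as np

def residual_summary(y_true, y_pred):
    """Return RMSE and MAE of the residuals y_true - y_pred."""
    residuals = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return {"rmse": float(np.sqrt(np.mean(residuals ** 2))),
            "mae": float(np.mean(np.abs(residuals)))}

# Actual counts from the sample table, with made-up predictions
# from two hypothetical models.
actual = np.array([353.0, 617.0, 817.0, 1023.0, 1145.0])
predictions = {
    "dense": np.array([343.0, 627.0, 807.0, 1033.0, 1135.0]),
    "tree": np.array([363.0, 607.0, 827.0, 1013.0, 1155.0]),
}

summaries = {name: residual_summary(actual, pred) for name, pred in predictions.items()}
for name, summary in summaries.items():
    print(name, summary)
```

If the summaries agree across models as closely as they do in this toy example, differences visible in a scatter plot matter little in practice.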

I hope you’ve enjoyed this post and feel inspired to give Keras a try. As always, stay tuned for the next data adventure, where we’ll have even more fun with neural networks!