  • Fun with Neural Networks Part 2: Autoencoders

    After familiarizing ourselves with Keras in the last post, it is time to get more serious. Much has been said and written about neural networks, and nobody working in analytics nowadays can really escape the hype. Most of the time, however, you’ll only read about neural networks for classification or regression, that is to say in a supervised learning setting. That is quite interesting and all, but there are exciting things that you can do with unsupervised problems as well.

  • Fun with Neural Networks Part 1: First Steps with Keras

    I have wanted to write about neural networks for a while, mainly because I saw this as an opportunity to summarize some of the things I learned working with Keras, a high-level neural networks API building on top of TensorFlow (among others). I have used and come to like the latter, but always thought it lacks a bit on the usability side. This will be the first of a series of posts on Keras and neural networks. We won’t do anything fancy, just introduce a dataset we’ll be working on and showcase the basic usage of Keras.

  • Clustering 101, or: On Fridays, People Bike Differently!

    We have talked about the BABS open data set many times before. It lists bike trips in the San Francisco Bay Area, with start and end point, date, time, and some extra information about the rider. What we want to look at in this episode is some basic clustering, and some surprising results from this well-known data set. The plan is to find classes of typical days in terms of bike usage. One would, for example, expect different usage patterns between weekdays and weekends, and we will actually discover some fun things beyond these basics as we go along. Let’s dive right in.

  • Simulating Traffic Jams

    The Monte Carlo method is one heck of a Swiss Army Knife. It’s used in nearly every field where quantitative predictions are needed, from engineering and finance to statistics and physics. As a matter of fact, I even made a living with large-scale Markov Chain Monte Carlo simulations for a while. But that’s not what I want to talk about today. Instead, we’ll be talking about stop-and-go traffic. Let me explain.

  • Don't despair!

    Hi internet! I know it’s been a while since our last data adventure. I am currently quite busy with a full-time job and with teaching a data science course at the University of Oslo with the catchy name STK-INF4000. This means three things.

  • Can't Buy Me Love: Hacking Dating Site Profiles


  • Anomaly Detection in R: Euro 2016 Edition.

    I’ve been working a fair bit with anomaly detection in recent months, and while browsing through Andrew Ng’s excellent machine learning course, I was curious to try out his anomaly detection algorithm. It reads a bit like this:
