  • So I Donated a Dataset

    You know we go the extra mile for an interesting dataset. I’ve recently been doing so quite literally, logging around 5 running sessions per week with the help of a GPS running watch. Thanks to some Selenium magic, I’ve been able to easily download the raw CSV files and can now donate them for your analysis pleasure. You can find them on GitHub.

  • Fun with Neural Networks Part 2: Autoencoders

    After familiarizing ourselves with Keras in the last post, now is the time to get more serious. Much has been said and written about neural networks, and nobody working in analytics nowadays can really escape the hype. Most of the time, however, you’ll only read about neural networks for classification or regression, that is to say in a supervised learning setting. That is quite interesting and all, but there are exciting things that you can do with unsupervised problems as well.

  • Fun with Neural Networks Part 1: First Steps with Keras

    I have wanted to write about neural networks for a while, mainly because I saw this as an opportunity to summarize some of the things I learned working with Keras, a high-level neural networks API building on top of TensorFlow (among others). I have used and come to like the latter, but always thought it was a bit lacking on the usability side. This will be the first of a series of posts on Keras and neural networks. We won’t do anything fancy, just introduce a dataset we’ll be working on and showcase the basic usage of Keras.

  • Clustering 101, or: On Fridays, People Bike Differently!

    We have talked about the BABS open data set many times before. It lists bike trips in the San Francisco Bay Area, with start and end point, date, time, and some extra information about the rider. What we want to look at in this episode is some basic clustering, and some surprising results from this well-known dataset. The plan is to find classes of typical days in terms of bike usage. One would, for example, expect different usage patterns between weekdays and weekends, and we will actually discover some fun things beyond these basics as we go along. Let’s dive right in.

  • Simulating Traffic Jams

    The Monte Carlo method is one heck of a Swiss Army knife. It’s used in nearly every field where quantitative predictions are needed, from engineering and finance to statistics and physics. As a matter of fact, I even made a living with large-scale Markov chain Monte Carlo simulations for a while. But that’s not what I want to talk about today. Instead, we’ll be talking about stop-and-go traffic. Let me explain.

  • Don't despair!

    Hi internet! I know it’s been a while since our last data adventure. I am currently quite busy with a full-time job and with teaching a data science course at the University of Oslo with the catchy name STK-INF4000. This means three things.

  • Can't Buy Me Love: Hacking Dating Site Profiles

