Posts

  • Exploring the Strava API

    If you, like me, are interested in running and data science, you might be interested in analyzing running performance. You could do this on someone else’s data, e.g. on a donated data set, or you could use your own. You might, for example, want to see how your average pace/speed and average heart rate are distributed.

  • Model Parameters from Markov Chain Monte Carlo

    One thing I often miss in discussions of and courses on data science and machine learning is the interpretation of models. A lot of emphasis is placed on predictive power and on which model class is the better one, without stressing what a model can tell you about the data.

  • So I Donated a Dataset

    You know we go the extra mile for an interesting data set. I’ve recently been doing so quite literally, logging around 5 running sessions per week with the help of a GPS running watch. Thanks to some Selenium magic, I’ve been able to easily download the raw CSV files and can now donate them for your analysis pleasure. You can find them on GitHub.

  • Fun with Neural Networks Part 2: Autoencoders

    After we familiarized ourselves with Keras in the last post, now is the time to get more serious. Much has been said and written about neural networks, and nobody working in analytics nowadays can really escape the hype. Most of the time, however, you’ll only read about neural networks for classification or regression, that is to say in a supervised learning setting. That is interesting in its own right, but there are exciting things that you can do with unsupervised problems as well.

  • Fun with Neural Networks Part 1: First Steps with Keras

    I have wanted to write about neural networks for a while, mainly because I saw this as an opportunity to summarize some of the things I learned working with Keras, a high-level neural networks API built on top of TensorFlow (among others). I have used and come to like the latter, but always thought it lacks a bit on the usability side. This will be the first of a series of posts on Keras and neural networks. We won’t do anything fancy, just introduce a dataset we’ll be working on and showcase the basic usage of Keras.

  • Clustering 101, or: On Fridays, People Bike Differently!

    We have talked about the BABS open data set many times before. It lists bike trips in the San Francisco Bay Area, with start and end point, date, time, and some extra information about the rider. What we want to look at in this episode is some basic clustering, and some surprising results from this well-known data set. The plan is to find classes of typical days in terms of bike usage. One would, for example, expect different usage patterns between weekdays and weekends, and we will actually discover some fun things beyond these basics as we go along. Let’s dive right in.
