By now you may know that finding out something about people from their first name is one of my hobbies. A while back we investigated whether people whose age suggests that their parents were following a trend rather than setting one when choosing their name use language differently.

The hypothesis was that trend-setters have on average a higher level of education and because of that chances are that their kids will receive a better than average education as well. That means that if you’re a bit too old for your name you’re likely to have a higher socioeconomic status, and vice versa. A bit like this.

MaritByYear

All of this is an old story that’s kind of well researched. When we investigated the language use of people on either side of the peak we did see some differences which hint at a more correct/formal use of Norwegian on the left side, but it did not fully convince me. Here’s the graph again for reference.

SocioeconomicWordUse

So what else can we find out? Twitter gives us some details about which client was used to send a Tweet, like Twitter for iPhone, Android, etc. One would assume that this could give some insight into socioeconomic status, given the different price tags on different devices.

First though, you should know something about Norway. Everybody has an iPhone. Everybody. It’s almost like there are no other smartphones, to the point where you notice people (like me) not using one. So I would not expect big differences in the number of iPhone users, but maybe something else will come out of it. Let’s investigate.

I’ve collected Norwegian Twitter data over a few months last year and tagged people with probable high/low economic status using the method I described here. I’ve filtered out multiple tweets from the same user on the same platform, but counted a user once per client if they use multiple clients. This means that we don’t take into account whether someone uses Twitter a lot or is an infrequent user. Let’s look at what I found.
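Before the plot, here is a minimal sketch of that counting rule, next to the naive per-tweet count for contrast. The toy data, the user names and the function names are all made up for illustration; this is not the code behind the plots.

```python
from collections import Counter

# Hypothetical toy data: one (user, status, client) tuple per tweet.
tweets = [
    ("anna", "high", "Twitter for iPhone"),
    ("anna", "high", "Twitter for iPhone"),
    ("anna", "high", "Twitter Web Client"),
    ("bjorn", "low", "Twitter for Android"),
    ("bjorn", "low", "Twitter for Android"),
    ("bjorn", "low", "Twitter for Android"),
    ("carl", "low", "Twitter for iPhone"),
]

def clients_per_unique_user(tweets):
    """Count each (user, client) pair once, however many tweets it produced."""
    unique_pairs = {(user, status, client) for user, status, client in tweets}
    return Counter((status, client) for _, status, client in unique_pairs)

def clients_per_tweet(tweets):
    """Count every single tweet, so heavy users weigh more."""
    return Counter((status, client) for _, status, client in tweets)

unique_counts = clients_per_unique_user(tweets)
raw_counts = clients_per_tweet(tweets)
```

In the toy data, bjorn's three Android tweets count once in `unique_counts` but three times in `raw_counts`, while anna shows up once for each of her two clients.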

source_by_status_unique_user

There seems to be no sizable effect, apart from the hint that I might be biased concerning iPhone usage in Norway. So being on the left or right side of your first name’s age peak does not really influence which Twitter client you use. This could mean several things.

  1. There is no connection between socioeconomic status and preferred Twitter client.

  2. There is no connection between socioeconomic status and our name-and-number method.

  3. Both of the above.

Now there is one thing I mentioned earlier. I didn’t double-count if I saw multiple tweets from one user. Maybe more frequent users have stronger preferences for a certain client. There is probably something to be said for using the web client from a computer and having a proper keyboard if you tweet a lot. The plot including double-counting looks like this.

source_by_status

Now that’s a huge difference! We just shouldn’t draw any conclusions without checking whether a few users with a lot of tweets are messing with our distributions. Telling a story just from the above plot would be so easy though: people with a likely low socioeconomic status use Android twice as often as those who are better off, and both the web client and TweetDeck are preferred by smart power users. But let’s look at the histogram of the per-user tweet counts.

tweets_per_user

The distribution has a very long tail, and some users in this tail make up a significant fraction of the total Tweets in the analysis. This skews our per-client counts so much that we’d draw wrong conclusions by looking at them in isolation. Just to rub it in: always reason responsibly!
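To put a number on such a long tail, one could check what share of all tweets comes from the most active users. The following is only a sketch on invented data; `tweet_authors` and `top_user_share` are hypothetical names, and the real distribution is the one in the histogram above.

```python
from collections import Counter

# Hypothetical toy data: one author name per tweet, with one very busy account.
tweet_authors = ["bot"] * 90 + ["anna"] * 5 + ["bjorn"] * 3 + ["carl", "dina"]

def top_user_share(authors, top_n=1):
    """Fraction of all tweets written by the top_n most active users."""
    per_user = Counter(authors)
    top_total = sum(count for _, count in per_user.most_common(top_n))
    return top_total / len(authors)

share = top_user_share(tweet_authors, top_n=1)
print(f"{share:.0%} of tweets come from the single most active user")
```

If a handful of accounts own most of the tweets, any per-tweet client count mostly describes those accounts and not the group they were tagged into.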

Talking of clients. What do you think are the most popular Twitter clients in Norway? That’s in today’s bonus plot.

total_usage_by_client