April 21, 2011

Apple's location tracking

This also fits in the “topics I know almost nothing about” category.

There are a few confusing bits in the story about Apple’s iPhone collecting location data.

I downloaded the app and checked it out. There were a few out-of-place spots, but all in all it was relatively accurate. The points, however, were all laid out on a grid, which I thought was odd: first, the locations were supposedly the phone’s position triangulated from cell towers; and second, even if the locations were the cell towers themselves, I’m pretty sure towers aren’t laid out on an exact grid across metro and rural areas in three states. Additionally, the app’s time progression was broken down by week; I’d hoped it would be more granular. And finally, the points weren’t clickable, so you couldn’t tell what time any of them was generated.

Luckily that page also gives instructions for digging through the raw data without the program, which I did.

Here I’ll repeat the preface that I don’t really know what I’m doing. But based on my amateur, inexperienced, unknowledgeable analysis of what I found, it appears the iPhone isn’t tracking so much as collecting.

Dataset

In my dataset of nearly 10,000 individual pairs of coordinates over 300 days, there were – as far as I could tell – no duplicate latitude-longitude pairs. None.
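For anyone who wants to repeat the duplicate check, here is a minimal sketch in Python. It assumes the raw data has already been exported into a list of (timestamp, latitude, longitude) rows; the function name and the sample rows are made up for illustration and are not from my actual dataset.

```python
from collections import Counter


def find_duplicate_pairs(rows):
    """Return {(lat, lon): count} for coordinate pairs that appear more than once."""
    counts = Counter((lat, lon) for _timestamp, lat, lon in rows)
    return {pair: n for pair, n in counts.items() if n > 1}


# Hypothetical sample rows: (timestamp, latitude, longitude)
sample_rows = [
    (1000, 38.6270, -90.1994),
    (2000, 38.6312, -90.2041),
    (3000, 41.8781, -87.6298),
]

duplicates = find_duplicate_pairs(sample_rows)
print(len(sample_rows), "rows,", len(duplicates), "duplicated pairs")
```

An empty result means every latitude-longitude pair is unique, which is what I saw.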

Additionally, during times when I traveled, many, many more data points were collected than usual. For example, almost a third of my 10,000 pairs were collected during a two-day trip to Chicago from St. Louis. Another 1,000 or so were collected during other out-of-town trips totaling another week or two.

So, nearly 4,000 of 10,000 data points, two-fifths, were collected in fewer than 30 days, less than one-tenth of the total time.

Based on the data, there are about 33 pairs per day, on average. But, as I said, about 4,000 points were collected over 23 days out of town, or about 174 points per day. That leaves 6,000 points collected over 277 days, or about 22 per day.
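The per-day arithmetic can be double-checked with a few lines of Python. The inputs are the rounded figures from this post, not exact counts, so the results are only approximate.

```python
# Rounded figures from the post (approximate, not exact counts)
total_points, total_days = 10_000, 300
away_points, away_days = 4_000, 23

# Whatever was not collected out of town was collected at home
home_points = total_points - away_points
home_days = total_days - away_days

overall_rate = total_points / total_days
away_rate = away_points / away_days
home_rate = home_points / home_days

print(f"overall: {overall_rate:.1f} points/day")  # 33.3
print(f"away:    {away_rate:.1f} points/day")     # 173.9
print(f"home:    {home_rate:.1f} points/day")     # 21.7
```

The out-of-town rate comes out at roughly eight times the at-home rate.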

Analysis

Ok, so Chicago to St. Louis is about 300 miles. Double it for the round trip, 600. I won’t even tack on any for wandering around while I was there. Nearly 3,000 points for 600 miles. That’s about 5 points collected per mile traveled.

That seems ok. But the rest of the time? The 6,000 points over 277 days I spent in and around St. Louis? I put 5,000 miles on my car during that time. That’s just 1.2 points collected per mile.
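The same comparison in points per mile, again as a rough sketch using the rounded numbers from this post (the mileage figures are my own estimates):

```python
# Rounded estimates from the post
trip_points, trip_miles = 3_000, 600     # Chicago round trip
home_points, home_miles = 6_000, 5_000   # driving in and around St. Louis

trip_density = trip_points / trip_miles
home_density = home_points / home_miles
ratio = trip_density / home_density

print(f"trip: {trip_density:.1f} points/mile")   # 5.0
print(f"home: {home_density:.1f} points/mile")   # 1.2
print(f"trip is about {ratio:.1f}x denser")      # 4.2
```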

Conclusion

So what’s the difference? The difference is that Chicago (and the ground covered between here and there) for two days is entirely new ground. The 5,000 miles in St. Louis is largely already-covered territory.

So, given the lack of duplicate points combined with the huge uptick in recorded points whenever the phone is in new territory, I think “tracking” is not the right word for the behavior. Tracking would imply recording a new point every time the phone moved, or a new point at a given time interval.

This seems a lot more like “collecting”. The phone seems to be collecting locations and associating them with cell towers. It doesn’t need to collect a place more than once (unless, probably, something changes, such as the best tower serving a given latitude/longitude).
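To make the distinction concrete, here is a toy model in Python. It is only an illustration of the two behaviors as I’ve described them, not a claim about how the phone actually works; the grid cells, routes, and function names are all invented.

```python
def tracking_log(path):
    """A hypothetical 'tracker': records every position it passes through."""
    return list(path)


def collecting_log(path, seen):
    """A hypothetical 'collector': records a position only the first time it is seen."""
    new_points = []
    for cell in path:
        if cell not in seen:
            seen.add(cell)
            new_points.append(cell)
    return new_points


# Invented routes on a grid of (x, y) cells
commute = [(0, 0), (0, 1), (0, 2), (0, 1), (0, 0)]      # same ground every day
road_trip = [(0, 0)] + [(x, 0) for x in range(1, 11)]   # mostly new ground

seen = set()
days = [commute, commute, commute, road_trip, commute]
collected_per_day = [len(collecting_log(day, seen)) for day in days]
tracked_per_day = [len(tracking_log(day)) for day in days]

print(collected_per_day)  # [3, 0, 0, 10, 0]
print(tracked_per_day)    # [5, 5, 5, 11, 5]
```

The collector’s log spikes on the road trip and goes quiet on familiar routes, which is the pattern in my data. The tracker’s log grows with every movement regardless of whether the ground is new.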

This is probably an outdated analogy, but think of business cards: You collect business cards from contacts. Each time you meet a new contact, you’ll ask for their card. If you go to meetings and professional gatherings in your town, you might collect a new one now and then (both from the occasional new person in town and from someone who’s changed phone numbers or companies), but go to an out-of-town convention and you’ll come home with a thick stack, because you met a bunch of people whose cards you didn’t have.

The phone is collecting data about where it was at what time, and there’s no opt-out. That much seems clear. I’m not quite sure that automatically implies that it’s tracking its whereabouts all the time. I can’t wait to hear from people who really know what they’re talking about.