These days it feels as if every tech behemoth is competing with every other, but one of the more interesting battles has been between Google and Apple. Google is the accepted leader in ML and AI, and it leverages that lead to offer a better and constantly improving user experience. Apple, on the other hand, has been stressing that, unlike Google, it doesn’t make money by mining your data and is instead focused on privacy.
More data is an incredible advantage when it comes to training AI models, and here Google is king. We all want privacy, but the question is whether the benefit of privacy outweighs the convenience of opening up your data. Everyone has a different opinion, but as these AI models improve, the better experience will take precedence over privacy for more and more people, until only a minority clings to its data.
That’s why I was glad to see this mention of Apple leveraging iPhone sensors to collect anonymous data snippets that preserve privacy while still feeding into its models. The resulting models may not be as good as those built on identifiable information, but they are much better than the alternative.
The secret sauce here is what Apple calls probe data: essentially little slices of vector data that represent direction and speed, transmitted back to Apple completely anonymized, with no way to tie them to a specific user or even any given trip. Apple is reaching in and sipping a tiny amount of data from millions of users instead, giving it a holistic, real-time picture without compromising user privacy.
If you’re driving, walking or cycling, your iPhone can already tell. If it knows you’re driving, it can also send relevant traffic and routing data in these anonymous slivers to improve the entire service. This only happens if your Maps app has been active, for example when you check the map or look for directions. If you’re actively using your GPS for walking or driving, the updates are more precise and can help with walking improvements like charting new pedestrian paths through parks, building out the map’s overall quality.
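To make the probe data idea a little more concrete, here is a rough Python sketch of how a trip might be cut into anonymous direction-and-speed slices. To be clear, this is my own illustration and not Apple's implementation: the `Fix` and `Probe` types, the `to_probes` function, the choice to drop the first and last GPS fixes, and the rounding of coordinates are all assumptions made for the example. The only parts taken from the description above are that each slice carries direction and speed and nothing that identifies a user or a trip.

```python
import math
import random
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0


@dataclass(frozen=True)
class Fix:
    """One raw GPS reading on the device (hypothetical type)."""
    t: float    # seconds since some arbitrary start
    lat: float  # degrees
    lon: float  # degrees


@dataclass(frozen=True)
class Probe:
    """One anonymous slice: no user ID, no trip ID, no timestamp."""
    lat: float          # coarse midpoint latitude
    lon: float          # coarse midpoint longitude
    heading_deg: float  # 0 = north, 90 = east
    speed_mps: float


def _distance_m(a: Fix, b: Fix) -> float:
    # Haversine great-circle distance between two fixes.
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def _heading_deg(a: Fix, b: Fix) -> float:
    # Initial compass bearing from a to b.
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dlmb = math.radians(b.lon - a.lon)
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0


def to_probes(trip, trim=1, precision=3, rng=None):
    """Turn an ordered list of Fix objects into shuffled, unlinkable Probes.

    Assumptions of this sketch: `trim` fixes are dropped from each end so the
    start and destination never leave the device, and coordinates are rounded
    to `precision` decimal places.
    """
    rng = rng or random.Random()
    inner = trip[trim:len(trip) - trim]
    probes = []
    for a, b in zip(inner, inner[1:]):
        dt = b.t - a.t
        if dt <= 0:
            continue  # skip out-of-order or duplicate readings
        probes.append(Probe(
            lat=round((a.lat + b.lat) / 2, precision),
            lon=round((a.lon + b.lon) / 2, precision),
            heading_deg=_heading_deg(a, b),
            speed_mps=_distance_m(a, b) / dt,
        ))
    rng.shuffle(probes)  # discard the order, so slices cannot be chained back into a trip
    return probes
```

Calling `to_probes` on a list of fixes returns a shuffled list of `Probe` records. Each one says roughly "something was moving this fast in this direction around here" and nothing more, which is the property that makes aggregating millions of them useful for traffic without revealing any one person's route.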
I’m hopeful that this model succeeds and proves that great predictions are possible while relying only on anonymized data. More importantly, because the data itself is anonymous, it could even be open sourced, which would help the industry compete against those leveraging proprietary data.