Problem Statement

It is well known that smartphone usage affects user privacy. Because phones tend to remain switched on, their usage patterns provide deep insights into their owners’ lives. With this project, we aim to raise awareness of the fact that this is particularly true for messenger apps.

While the upload of users' address books has, so far, been the main concern with messenger apps, we will shed light on the more subtle privacy implications of unintentional system leaks. These are inherent side effects of using a specific messenger service.

One of the most widespread, and at the same time probably the most underestimated, privacy-related system leaks is the user's presence status. Popular messenger apps like WhatsApp or Telegram display a contact's current “online” status whenever that contact opens the messenger app while connected to the Internet. Moreover, a “last seen” timestamp is displayed for each contact, indicating the time the user was last connected.

Certainly, both the “last seen” functionality and the presence status are widely known, as are the privacy implications of users tracking friends’ or acquaintances’ online or “last seen” status manually from within the app. When these notifications are monitored on a large scale and aggregated over a long period of time, however, a completely new dimension of privacy implications opens up. The collected data can be retrospectively analyzed to provide insights into users’ daily routines.

To demonstrate the practicability of this approach and, in particular, to highlight the related privacy implications, we monitored 1,000 randomly chosen users of the popular WhatsApp messenger app from July 2013 to April 2014. Within that 9-month observation period, we successfully received more than 4.5 million presence notifications. These show exactly when a user connected to the WhatsApp network by opening the WhatsApp app, when they closed the app again, and how long they were online, accurate to the second. From this app usage data, one can draw inferences about users' daily routines, such as their online time during working hours and their average sleeping time (including deviations during weekend activities). The data also allows an evaluation of their overall reachability, which, in turn, could result in a considerable loss of plausible deniability. The more frequently the app is used, the more contextual information is available, and the more revealing these analyses become.
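The kind of inference described above can be illustrated with a minimal sketch. It assumes a hypothetical log of (timestamp, status) pairs for a single user; this event format, the function names, and the sample data are illustrative assumptions and do not reflect WhatsApp's actual notification format or the analysis pipeline used in this project. The sketch pairs each "online" event with the following "offline" event, sums the online time, and finds the longest offline gap, which could serve as a rough candidate for a sleeping period.

```python
from datetime import datetime, timedelta


def build_sessions(events):
    """Pair each 'online' event with the next 'offline' event."""
    sessions = []
    start = None
    for ts, status in sorted(events, key=lambda e: e[0]):
        if status == "online" and start is None:
            start = ts
        elif status == "offline" and start is not None:
            sessions.append((start, ts))
            start = None
    return sessions


def total_online(sessions):
    """Sum the duration of all sessions."""
    return sum((end - start for start, end in sessions), timedelta())


def longest_offline_gap(sessions):
    """Return (gap, gap_start, gap_end) for the longest pause, or None."""
    gaps = [
        (next_start - prev_end, prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(sessions, sessions[1:])
    ]
    return max(gaps, default=None)


# Hypothetical sample data for one user (not real measurements).
events = [
    (datetime(2014, 1, 6, 7, 30, 0), "online"),
    (datetime(2014, 1, 6, 7, 32, 15), "offline"),
    (datetime(2014, 1, 6, 12, 5, 0), "online"),
    (datetime(2014, 1, 6, 12, 6, 30), "offline"),
    (datetime(2014, 1, 6, 18, 0, 0), "online"),
    (datetime(2014, 1, 6, 18, 2, 0), "offline"),
    (datetime(2014, 1, 6, 23, 10, 0), "online"),
    (datetime(2014, 1, 6, 23, 15, 0), "offline"),
    (datetime(2014, 1, 7, 6, 45, 0), "online"),
    (datetime(2014, 1, 7, 6, 46, 0), "offline"),
]

sessions = build_sessions(events)
print("Total online time:", total_online(sessions))
print("Longest offline gap:", longest_offline_gap(sessions))
```

In this made-up example, the longest pause falls between the late-evening and early-morning sessions. Aggregated over weeks or months, such gaps would likely outline a user's typical sleeping hours, although a single day on its own is far too little data to conclude anything.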

A recently published study by the University of Ulm also investigated these privacy implications. After collecting presence information about 19 participants over 1 month and comparing it to their real living habits, the authors likewise concluded that “presence information alone is sufficient to accurately identify, for example, daily routines, deviations, times of inappropriate mobile messaging, or conversation partners”. Following up on this, our project demonstrates that the threat is real and that large-scale, long-term monitoring of random users is practically feasible.

For more details on our specific monitoring approach, please refer to the Technical Background section.