Open access to the Proceedings of
the 24th USENIX Security Symposium
is sponsored by USENIX
You Shouldn’t Collect My Secrets:
Thwarting Sensitive Keystroke Leakage
in Mobile IME Apps
Jin Chen and Haibo Chen, Shanghai Jiao Tong University; Erick Bauman
and Zhiqiang Lin, The University of Texas at Dallas; Binyu Zang and Haibing Guan,
Shanghai Jiao Tong University
https://www.usenix.org/conference/usenixsecurity15/technical-sessions/presentation/chen-jin
This paper is included in the Proceedings of the
24th USENIX Security Symposium
August 12–14, 2015 • Washington, D.C.
ISBN 978-1-939133-11-3
USENIX Association 24th USENIX Security Symposium 675
You Shouldn’t Collect My Secrets:
Thwarting Sensitive Keystroke Leakage in Mobile IME Apps
Jin Chen, Haibo Chen, Erick Bauman, Zhiqiang Lin, Binyu Zang, Haibing Guan
Shanghai Key Laboratory of Scalable Computing and Systems, Shanghai Jiao Tong University
Department of Computer Science, The University of Texas at Dallas
ABSTRACT
IME (input method editor) apps are the primary means
of interaction on mobile touch screen devices and thus
are usually granted access to a wealth of private
user input. In order to understand the (in)security of
mobile IME apps, this paper first performs a systematic
study and uncovers that many IME apps may (intention-
ally or unintentionally) leak users’ sensitive data to the
outside world (mainly due to the incentives of improv-
ing the user’s experience). To thwart the threat of sen-
sitive information leakage while retaining the benefits of
an improved user experience, this paper then proposes
I-BOX, an app-transparent oblivious sandbox that mini-
mizes sensitive input leakage by confining untrusted IME
apps to predefined security policies. Several key chal-
lenges have to be addressed due to the proprietary and
closed-source nature of most IME apps and the fact that
an IME app can arbitrarily store and transform user input
before sending it out. By designing system-level transactional
execution, I-BOX works seamlessly and transparently with IME apps.
Specifically, I-BOX first checkpoints an IME app’s state before the
first keystroke of an input, monitors and analyzes the user’s input,
and rolls back the state to the checkpoint if it detects the
potential danger that sensitive input may be leaked. A
proof-of-concept I-BOX prototype has been built for Android and
tested with a set of popular IME apps. Experimental results show
that I-BOX is able to thwart the leakage of sensitive input for
untrusted IME apps, while incurring very small runtime overhead and
little impact on user experience.
1 INTRODUCTION
The Problem. With large touch screens, modern mobile devices
typically feature software keyboards that allow users to enter text.
This differs from traditional desktops, which rely on hardware
keyboards. These soft keyboards are known as Input Method Editor
(IME) apps, and they convert users’ touch events to text. Since IME
apps process almost all of a user’s input on mobile devices, it is
critical to ensure that they are not keyloggers and that they do not
leak any sensitive input to the outside world.
[Figure 1: bar chart. x-axis: the IME apps (Sougou, iFlytek, Google
Pinyin, QQ, TouchPal, Baidu, Jinshou, Guobi, Octopus, Slideit, Vee);
y-axis: the number of downloads (units of 10,000), from 0 to 160,000.]
Figure 1: Download statistics of IME apps in our study.
While all mobile devices have a default IME app installed, users
often demand third-party IME apps with expanded feature sets in
order to gain a better user experience. This is especially common
for non-Latin languages. In order to accommodate this need, mobile
operating systems such as Android and iOS provide an extensible
framework allowing alternate input methods. Due to the ease of
making third-party IME apps and the high demand for customization,
there are currently thousands of IME apps in major app markets like
Google Play and Apple’s App Store. Many of them have gained hundreds
of millions of downloads, as shown in Fig. 1. For instance, the
Sogou IME app has in total 1.6 billion downloads across Google Play
and several third-party app vendors such as 360 and Baidu.
Meanwhile, a recent survey [13] found that 68.3% of smartphones in
China are using third-party IME apps. This survey did not include
statistics from Japan or Korea, where such apps are also very
popular.
Unfortunately, despite these advantages, using a third-party IME app
also brings security and privacy concerns (we assume the default IME
app does not have these problems). First, IME app developers have
incentives to log and collect user input in order to improve the
user’s experience with their products, and user input is as valuable
as email content, from which they can learn users’ needs and push
customized advertising or other business activities. Although an IME
app may state a policy of not collecting certain input from a user,
the policies implemented in the app may unintentionally send
sensitive input outside the phone. In §2.3 we show that such a
threat is real by observing the output of a popular IME app that
periodically sends out user input to a remote server. In addition,
we collected the network activities of a set of IME apps during a
user input study and showed that they also likely send out private
data. In light of this information leakage threat, the Japanese
government’s National Information Security Center has warned its
central government ministries, agencies, research institutions and
public universities to stop using IME apps offered by the search
engine provider Baidu [1].
Even if a user trusts benign IME apps to properly secure private
data, there is still a risk from repackaging attacks targeting
benign apps. In fact, a prior study has shown that around 86% of
Android malware samples are repackaged from legitimate apps [49]. It
is also surprisingly simple to repackage an IME app with a malicious
payload, as we demonstrate in §2. A repackaged malicious IME app is
essentially a keylogger, which has been one of the most dangerous
security threats for years [39]. Also, evidence has shown that IME
apps are a popular target for attackers to inject malicious code
[29].
Challenges. While it may seem trivial to detect these repackaged
malicious IME apps by comparing a hash of the code with that of the
corresponding vendor’s app in the official market, the widespread
existence of third-party markets makes such checks more difficult.
It is also easy for attackers to plant repackaged malware into these
markets, as is shown by the fact that a considerable amount of
repackaged malware has been found in them [48].
Of further concern is the fact that it is very challenging to
analyze whether even “benign” IME apps will leak any sensitive data.
There are several reasons why detecting privacy leaks in IME apps is
challenging. First, many commercial IME apps use excessive amounts
of native code, which makes it very difficult to understand how they
log and process user input. Second, many of the IME apps use
unknown, proprietary protocols, which makes it especially hard to
analyze how they collect and transform user input. Third, many of
them utilize encryption, and their algorithms are also unknown.
Therefore, current privacy-preserving techniques on mobile devices
must eventually treat IME apps as black boxes, and users must either
trust them completely (and risk leaking their private data) or
switch to the default IME app (and lose the improved user
experience).
At a high level, it would seem that existing techniques such as
taint tracking would be viable approaches to precisely tracking and
containing sensitive input. For example, TaintDroid [16, 17] and its
follow-up work have been shown to be very effective at tracking
sensitive input and detecting when it is leaked. However, additional
challenges remain. First, current IME apps tend to use excessive
native code in their core logic, and TaintDroid currently does not
track tainted data in native code. Second, it is a well-known
problem that taint-tracking systems based on data flow fail to
capture control-based propagation. In fact, many of the keystrokes
are generated through lookup tables, as reported in Panorama [46].
Third, sensitive information is often composed of a sequence of
keystrokes, making it challenging to have a well-defined policy to
differentiate between sensitive and non-sensitive keystrokes in
TaintDroid. Therefore, we must look for new techniques.
Our approach. In this paper, we present I-BOX, an app-oblivious IME
sandbox that prevents IME apps from leaking sensitive user input. In
light of the opaque nature of third-party IME apps, the key idea of
I-BOX is to make an IME app oblivious to sensitive input by running
IME apps transactionally; I-BOX eliminates sensitive data from
untrusted IME apps when there is sensitive input during this
process. Specifically, I-BOX checkpoints the states of an IME app
before an input transaction. It then analyzes the user’s input data
using a policy engine to detect whether sensitive input is flowing
into an IME app. If so, I-BOX rolls back the IME app’s states to the
saved checkpoint, which essentially makes an IME app oblivious to
what a user has entered. Otherwise, I-BOX commits the input
transaction by discarding the checkpoint, which enables the IME app
to leverage users’ input to improve the user experience.
One key challenge faced when building I-BOX is how to make the
checkpointing process efficient and consistent, which is
unfortunately complicated by Android’s design, especially its hybrid
execution (of Java and C), multi-threading, and complex IPC
mechanism (e.g., Binder). Fortunately, I-BOX addresses this
challenge by leveraging the event-driven nature of an IME app. More
specifically, we present a novel approach by creating the checkpoint
at a quiescent point, in which its execution states are inactive.
Such a design significantly simplifies many issues such as handling
residual states in the local stack of native code, the Dalvik VM and
IPCs.
We have implemented I-BOX based on Android 4.2.2 running on a
Samsung Galaxy Nexus smartphone. Performance evaluations show that
I-BOX can checkpoint and restore a set of popular third-party IME
apps within a very small amount of time, and thus causes little
impact on user experience. A security evaluation using a set of
popular IME apps shows that I-BOX mitigates the leakage of sensitive
input. Case studies using a popular “benign” IME app and a
repackaged IME app confirm that I-BOX accurately conforms to the
predefined security policies to prevent sending of sensitive input
data.
[Figure 2: diagram. Components: User, IME App, InputMethodService,
InputMethodManagerService, InputConnection, Client Apps, EditText.
Edge labels: Touch Event, Invoke IME, Awaken, Start Input, Plain
Text, Show Text.]
Figure 2: The workflow when using an IME app.
Contributions. In short, we make the following contri-
butions:
New Problem. This is the first attempt to systematically understand
the threat caused by the leakage of private sensitive keystrokes in
third-party IME apps. Our discovery shows the pervasive presence of
such attacks, and the seriousness of the problem.
New Technique. We introduce oblivious sandboxing for IME apps that
embraces both security and usability, and quiescent-point-based
checkpoint/restore that significantly simplifies the design and
implementation of I-BOX.
New System. We demonstrate a working prototype of the techniques and
a set of evaluations confirming the security threat of commercial
IME apps and the effectiveness of I-BOX.
2 BACKGROUND AND MOTIVATION
In this section, we first describe the necessary back-
ground on IME architecture in Android, and then discuss
why commercial IME apps have the incentive to collect
a user’s data, followed by the case studies showing how
IME apps can leak users’ sensitive data to remote parties.
2.1 Input Method Editor
Though Android provides a default IME app for each language, many
end users prefer using third-party IME apps for better user
experiences, such as changing the screen layout for faster input,
generating personalized phrases to provide intelligent associative
input, and providing more accurate translation from keystrokes to
the target languages. As a result, mobile operating systems such as
Android provide an extensible IME infrastructure to allow
third-party vendors to develop their own IME apps.
Figure 2 gives an overview of the involved IME components when
entering text in a client app. Specifically, third-party IME apps
must conform to the IME framework so that the Android Input Method
Management Service (IMMS) can recognize and manage them. For
example, every IME app contains a class that extends from
InputMethodService, which helps Android recognize it as an input
service and add it into the system as an IME app. When an end user
clicks a textbox to invoke an IME app, Android IMMS will start the
default IME activity and build an InputConnection between the IME
app and the client app that helps the IME app to commit the user
input to the client app. In particular, the IME app first gets the
touch event containing the position data and translates it to
meaningful characters or words based on its keyboard layout and
internal logic. Then it sends the keystrokes to the client app
through InputConnection.
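The flow just described can be pictured with a small simulation. This is only a sketch in Python, not Android code: ToyInputConnection, ToyIme, the three-key layout and the fixed key width are hypothetical stand-ins chosen for illustration, and none of them is an Android API.

```python
from dataclasses import dataclass, field


@dataclass
class ToyInputConnection:
    """Hypothetical stand-in for the channel between IME app and client app."""
    client_text: list = field(default_factory=list)

    def commit_text(self, text: str) -> None:
        # The client app only receives committed text, never raw touch data.
        self.client_text.append(text)


class ToyIme:
    """Hypothetical IME: maps touch positions to keys using its own layout."""

    KEY_WIDTH = 100  # assumed pixel width of every key in this toy layout

    def __init__(self, layout: str, connection: ToyInputConnection):
        self.layout = layout
        self.connection = connection

    def on_touch(self, x: int) -> None:
        # Translate the position into a key index, then commit that key.
        index = x // self.KEY_WIDTH
        if 0 <= index < len(self.layout):
            self.connection.commit_text(self.layout[index])


conn = ToyInputConnection()
ime = ToyIme("abc", conn)
for x in (210, 20, 130):  # touches land on keys 'c', 'a', 'b'
    ime.on_touch(x)
typed = "".join(conn.client_text)
```

The point of the sketch is that the IME sits between every touch and every committed character, which is why it sees all user input.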
The IME architecture is clean with well-defined classes. This not
only significantly saves programmers’ effort in developing a new IME
app, but also makes it easy for attackers to locate key points of a
victim IME app. For instance, our study found that simply hooking
the function BaseInputConnection.commitText can intercept all the
user’s input in many IME apps. The hook points can be found by
simply searching for the keyword BaseInputConnection.commitText in
the decompiled code to locate all of its occurrences.
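For an analyst auditing an app, such a keyword search amounts to a recursive text scan. The sketch below is a generic illustration in Python; find_occurrences is a hypothetical helper, the demo files are made up, and the exact spelling of the keyword in real decompiled output depends on the decompiler, so the search string here is an assumption.

```python
import os
import tempfile


def find_occurrences(root: str, keyword: str):
    """Return (relative path, line number) for each line containing keyword."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if keyword in line:
                        hits.append((os.path.relpath(path, root), lineno))
    return sorted(hits)


# Demo on a throwaway directory with made-up file contents.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "A.txt"), "w", encoding="utf-8") as f:
        f.write("nothing here\ncall commitText(text)\n")
    with open(os.path.join(root, "B.txt"), "w", encoding="utf-8") as f:
        f.write("unrelated\n")
    demo_hits = find_occurrences(root, "commitText")
```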
2.2 Why IME Apps Collect Users’ Input
Third-party IME apps usually extend the standard IME
apps with lots of rich features to provide a better user ex-
perience. Such features usually require collecting users’
input data to learn users’ habits to allow personalizing
IME apps. Further, such data may also collectively be
used to improve experiences of other users, i.e., push-
ing phrases learned from a set of users to others. In fact,
there are many features that require collecting user input
data. The following lists a few of them:
Personal dictionary. Commercial IME apps usu-
ally remember the words and phrases from user
input to speed up follow-up input (especially for
non-Latin languages) by prompting potential results
when input is not finished. To achieve this, they
need to maintain a personal dictionary for each user
to save frequently typed or self-made words.
Cloud input. As users usually have multiple devices and need to
synchronize their personal dictionary among them, IME apps utilize
cloud-based services to store the dictionary and to synchronize the
dictionary and personal settings between different devices.
Meanwhile, some non-Latin languages, such as eastern languages,
differ from English in that IMEs need to translate users’ keystrokes
to words in those languages. To accelerate input speed, IMEs often
need to leverage cloud services to analyze and predict users’
intended words based on the current input.
In addition, for some Latin-based languages, some IME apps provide a
feature that leverages the current input to predict the intended
phrases and adjusts the layout of the soft keyboard to bring the
soft key of the next character close to the user’s current finger
position. To better predict user intent, some IME apps leverage the
abundant resources in the cloud to analyze and predict user input.
Meanwhile, they also collect users’ habits to improve the accuracy
of prediction.
Search mediation. Some IME apps have a new feature named “search
mediation”, which intercepts user input and returns some search
results back to the user. However, this means that user input will
be sent to the search engine without restriction.
Note that due to the unstable network connectivity of mobile
devices, almost all IME apps can work properly with and without
network connections. When the network is disconnected, an IME app
may store current input (like frequently used phrases) for later use
when the network connection is restored. Besides, Android’s
configurable permission model means that an IME app usually works
normally even without being granted certain permissions.
2.3 Possible Threats Posed by IME Apps
While third-party IME apps do offer useful features and better user
experiences, they may unduly collect user data or be repackaged to
be malicious. Next, we study the possible threats an IME app could
impose.
Privacy leakage in “benign” IME apps. Conventional wisdom is to
trust a respected service provider, in the hope that the provider
will enforce policies in the cloud to faithfully provide user
secrecy [30]. Unfortunately, this exposes users’ sensitive
keystrokes to two threats. First, a curious or malicious operator
may stealthily steal such data [47, 41], as evidenced by numerous
insider data theft incidents even at reputed companies [40]. Second,
even reputed cloud providers provide no guarantee on the security of
user data, as evidenced by their user agreements. Hence, it is
reasonable not to trust an IME app to securely protect users’ data.
More specifically, a severe threat from “benign” IME
apps is that they may have unduly collected user data
without users’ awareness. Given that we do not have
their source code and they often use proprietary proto-
cols with encryption, it thus remains opaque to end users
how the IME apps really handle the sensitive input data.
At a high level, since they have been collecting user data
for better experiences (especially the personal dictionary
and cloud input), it is highly likely that much of a user’s
sensitive input has been leaked to these IME providers.
To confirm our hypothesis, we conducted an experimental study by
performing a man-in-the-middle attack on a popular IME app, namely
TouchPal Keyboard (version chubao 5.5.5.67049, cootek). This IME app
provides multiple rich functionalities such as cloud input and a
personal dictionary and has been installed more than 7.09 million
times from a third-party market. By intercepting its network packets
using Wireshark¹, we found that its cloud input is implemented using
an HTTP POST command which carries several parameters in plain text.
Therefore, we were able to see how it works without any protocol
reverse engineering or packet decryption. A deeper investigation
revealed that these parameters include a userid, the keycode that a
user just entered, and the existing words of the target input
control that the user is focusing on. This contradicts the claim “No
collection of personal information that you type” in a prior version
of its privacy statement², and thus poses a serious threat to user
privacy.
We suspect there may be many other commercial IME apps that also
leak users’ sensitive input. Currently, we have only used
side-channel analysis [11] to analyze the packet sizes between the
IME apps and their servers. We did notice notable differences in the
number of packets (as reported in §5.2).
Privacy leakage in malicious IME apps. Even if no third-party IME
app leaked any user’s private data, there are still other attack
vectors such as repackaging attacks. In fact, a prior study
uncovered that repackaged malware samples account for 86% of all
malware [49]. Moreover, there are also trojans that serve as
keyloggers but masquerade as IME apps [29]. Finally, IME apps may
also be vulnerable to component-hijacking attacks. It has been shown
that input methods have been a popular means to inject malicious
code [29]. While currently we are not aware of any repackaged
malicious IME apps in Android, we envision that there will be such
malware given the large popularity of the official apps and the ease
of repackaging them, as shown below.
To understand the repackaging threat of IME apps, we conducted an
attack study by repackaging a popular commercial IME app called
Baidu IME, which has been downloaded more than 100 million times in
a third-party market. In this study, we repackaged the IME app by
inserting a malicious payload into the original program. The payload
records all user input and sends it to a specific server.
¹ http://www.wireshark.org/
² We noted that the newer versions of TouchPal changed their privacy
statement to indicate that they will collect user privacy data.
While the core logic of the Baidu IME app is written in C, the other
components are written in Java, which enables easy reverse
engineering of the bytecode, especially with existing tools.
Specifically, we used baksmali [2], a popular Dalvik disassembler,
to reverse classes.dex into an intermediate representation in the
form of smali files. Then we directly modified the smali code to
insert our payload, which captures the text committed by the
function BaseInputConnection.commitText and then sends the data out.
A caveat in this study is that simply repackaging the app did not
work, because the IME app has a checksum protection. However, the
protection mechanism is rather simple: it just calls a self-crash
function when it detects repackaging. The self-crash function is not
itself protected, and thus we rewrote it to return directly to
disable the protection.
We conducted our experiment in a contained environment and did not
upload this repackaged IME app to any third-party Android market,
but attackers can easily do this, as reported before [49, 48]. We
installed this repackaged IME app on our test smartphone and all
data we input through it was divulged. Our attack study shows that
all critical data that a user inputs will be compromised if the IME
app is malicious. The popularity of third-party markets aggravates
this problem, especially considering that 5% to 13% of apps are
repackaged in a number of third-party markets [48].
3 OVERVIEW
The goal of I-BOX is to protect users’ sensitive input, while still
preserving the usability of (curious or malicious) IME apps such
that users can still benefit from the rich features. One possible
approach might be letting users switch to a trusted IME app when
they want to type some sensitive information. While this may work
for simple sensitive data like passwords, some users’ sensitive
input (like addresses and diseases) is scattered in a long
conversation. It is cumbersome for users to constantly keep this in
mind and do the switch. Another intuitive approach would be to block
all network connections during user input, but doing so will
negatively affect the user experience. Besides, there are also other
channels, like third-party content providers and external storage,
in which an IME app may temporarily store input data to be leaked
later. Therefore, we have to look for new approaches.
Approach overview. As discussed, the key challenges of securely
using third-party IME apps are that such apps are usually
closed-source and they may do arbitrary processing and
transformation of users’ input data before sending it out. It is
thus hard to model or predict their behavior. Hence, I-BOX instead
treats an IME app as a black box and makes it oblivious to users’
sensitive input data. To achieve this, I-BOX borrows the idea of
execution transactions and runs an IME app transactionally.
Consequently, if an IME app touches users’ sensitive input data,
I-BOX will roll back the IME app’s states to make it oblivious to
what it has observed, so as to address the problem where an IME app
stores and transforms users’ input data.
I-BOX regards the user input process as a transaction, which begins
when a user starts to enter the input and ends when the input
session ends. A clean snapshot of an IME app will be saved before an
input transaction starts. For normal input transactions without
touching sensitive input data, I-BOX will commit the IME app’s state
such that the IME app can use these data to improve the user
experience. To prevent malicious IME apps from sending private data
out during the input transaction, the network connection of the IME
app will be restricted when the current transaction is marked as
sensitive. When such a sensitive input session ends and thus the
client app has received all user input, I-BOX will abort the input
transaction from the view of the IME app, by restoring the IME app’s
state to the most recent checkpoint. This makes the IME app
oblivious to the sensitive data it observed. Hence, even if the IME
app locally saves a user’s input to be sent later, the input data
will be wiped during restoring.
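The commit/abort logic of an input transaction can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the mechanism of the prototype: ToyImeApp and InputTransaction are hypothetical names, and the snapshot here is an in-memory deep copy of one object, swapped in for the process-level checkpoint that the paper describes.

```python
import copy


class ToyImeApp:
    """Hypothetical black-box IME app: it may store anything it observes."""

    def __init__(self):
        self.personal_dictionary = []  # learned words (the useful feature)
        self.pending_upload = []       # data queued to be sent out later

    def on_input(self, word: str) -> None:
        self.personal_dictionary.append(word)
        self.pending_upload.append(word)


class InputTransaction:
    """Snapshot on start; commit keeps the new state, abort restores the old."""

    def __init__(self, app: ToyImeApp):
        self.app = app
        self.checkpoint = copy.deepcopy(app.__dict__)  # clean snapshot
        self.sensitive = False

    def mark_sensitive(self) -> None:
        # A real system would also restrict the app's networking here.
        self.sensitive = True

    def end(self) -> str:
        if self.sensitive:
            # Abort: restore every field to its checkpointed value.
            self.app.__dict__.clear()
            self.app.__dict__.update(copy.deepcopy(self.checkpoint))
            return "aborted"
        return "committed"  # keep the learned state, drop the checkpoint


app = ToyImeApp()

tx = InputTransaction(app)
app.on_input("hello")
first = tx.end()

tx = InputTransaction(app)
app.on_input("hunter2")
tx.mark_sensitive()
second = tx.end()
```

After the second transaction the toy app retains "hello" in both its dictionary and its upload queue, while every trace of "hunter2" is gone, including the copy it had queued for later sending.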
As input data is provided in a streaming fashion by a user, there is
no general way to know the input stream in advance. Because the IME
app gets the input data prior to I-BOX, it would be too late to stop
an IME app’s leaking channels, like the network connection, after it
has received the whole input, since it may already have sent the
input out or stored it locally. Hence, it is generally impossible
for any approach to avoid leaking some user input before it can
determine whether the current input stream is sensitive.
As a result, I-BOX chooses to use a combination of context-based and
policy-driven approaches based on the state of the IME app, with the
goal of striking a balance between user experience and privacy. For
specific input such as passwords, which I-BOX can determine through
input context, I-BOX can immediately know they are sensitive and
thus constrains the IME app’s behavior (like blocking networking for
the app). For general input, I-BOX uses a state-machine-based policy
engine to predict whether the current input transaction is
sensitive. This is done continuously during the input process, where
I-BOX uses the current partial input stream to determine if the next
string is sensitive or not.
An architectural overview of I-BOX is presented in Figure 3. I-BOX
consists of an isolated user-level policy engine that decides
whether I-BOX shall commit or roll back the execution of an IME
app’s state. The sandbox module is implemented as a kernel module,
which saves and restores the states of an IME app as needed.
[Figure 3: diagram. Labels: USER Level, KERNEL Level, IME App,
Client App, I-Box Daemon, Kernel Module, Network Control,
Checkpoint / Rollback. Sample text shown: “My mail is
alice@bob.com”; “kevin1987@hotmail.com”, “ssn:642-38-1689”, ...]
Figure 3: An Architectural Overview of I-BOX
Challenges. To realize I-BOX, we face several challenges. In
particular:
How to express and enforce security policies? As users’ privacy
policies are usually vague, it is critical to efficiently represent
users’ policies such that there won’t be a state explosion problem.
This is especially challenging for non-Latin languages, as they
usually require an additional layer of translation to represent
them. Further, once the policies are represented, it should also be
relatively easy to check the current input against the policies,
which is critical to keeping the checking latency low.
How to efficiently perform the checkpoint and rollback? As
checkpoint and rollback are triggered during input, lengthy
checkpoint and rollback may extend the latency of users’ input.
However, traditional checkpoint and rollback usually require either
expensive copying of applications’ states, or heavyweight recording
of applications’ execution. For example, prior checkpointing on
server platforms takes around 600ms without copying files [28].
How to ensure consistency upon rollback? By considering the user’s
input process as a transaction, I-BOX can ignore the implementation
details of different IME apps and take them as normal processes from
the kernel’s viewpoint. However, there are also intensive
cross-layer and cross-component interactions between an IME app and
the rest of the environment, like the Dalvik VM, the application
framework and the client app. Further, the IME app is essentially
multi-threaded. Hence, consistently checkpointing and rolling back
an IME app’s states while preserving the states of other components
is another key technical challenge for I-BOX.
Threat model and assumptions. As third-party IME apps have the
incentive to collect and send out users’ data and some IME apps are
even malicious (repackaged or even faked), I-BOX considers all
third-party IME apps as untrusted. However, I-BOX trusts the
underlying smartphone OS, including the OS kernel, system services
and any process with root or system privileges. Also, we assume the
user’s smartphone has not been rooted such that the untrusted IME
app cannot break the default security isolation between different
apps, especially between system and user-level apps.
[Figure 4: diagram. Components: Sandbox, IME App, I-BOX Daemon,
I-BOX K-Module, IMMS, Client App. Labels: Checkpoint IME App state,
Invoke IME App, Start, Input Data, Analyze Input Data, Close
Network, Notify rollback, Rollback IME App state, Step 1, Step 2,
Step 3.]
Figure 4: I-BOX work flow.
I-BOX relies on input contexts and a user’s policy to distinguish
private data from normal input data. It is possible that I-BOX may
leak sensitive user input if the policy is incomplete or inaccurate,
or the user’s intent has changed after specifying the policy.
Further, depending on the state machine, I-BOX may leak a prefix of
some sensitive input.
I-BOX also trusts the end user and relies on her as a witness to
prevent a malicious IME app from tampering with the user’s input
during typing. This should be easy, as she can tell the difference
between what she typed and what she observes on the input screen.
We consider the client app that uses the services from an IME app as
trusted. While a rogue or malicious client app may also steal users’
sensitive input, a malicious IME app causes more security impact
than a malicious client app, as it can leak the input typed into all
client apps (including system apps), in contrast to only the input
typed into one specific (third-party) client app. How to protect
third-party client apps is out of the scope of this work, and many
prior efforts have intensively studied solutions to prevent
information leakage from apps [24, 51, 34].
4 DESIGN AND IMPLEMENTATION
The workflow of I-BOX is illustrated in Figure 4. Specifically,
I-BOX intercepts a user’s input data by placing hooks into the
Android Input Method Service (IMS) and detects the sensitive data
from the input stream based on the policy engine. I-BOX uses both
context-based and prefix-matching policies (§4.1) and enforces them
using transactional execution (§4.2) to protect sensitive data such
as passwords. Before diving into the details of how we design and
implement I-BOX, we first use a running example to illustrate how it
works.
A running example. Assume a sensitive string “IsUsenixSec2015” is
being typed by a user through an IME app. I-BOX first makes a
checkpoint of the IME app as a clean snapshot before input. If this
string is being typed into a password textbox (context-based
policy), I-BOX immediately knows that the string to be typed is
sensitive and will restrict the IME app’s behavior (such as stopping
network connections). Otherwise, I-BOX intercepts the characters and
runs the analysis through the policy engine. After getting the
characters ‘I’, ‘s’, ‘U’, and ‘s’, I-BOX predicts that the user may
be typing the sensitive string “IsUsenixSec2015” and restricts the
IME app’s behavior immediately to prevent it from sending further
keystrokes out (prefix-matching policy). Afterwards, the IME app
continues to accept input from the user’s typing and I-BOX monitors
the file operations of the IME app to record the files that may log
the input data. After the user finishes typing, I-BOX confirms that
a sensitive string was typed into the IME app and restores the
states of the IME app from the checkpoint to clean the sensitive
string out.
4.1 Policy Engine
The policy engine of I-BOX separates sensitive input from normal
input such that different policies can be applied to different
types of data. I-BOX uses both context-based and prefix-matching
strategies to derive policies, with the first strategy having
higher priority.
Context-based policy. We first provide an automated approach to
deriving which input is sensitive based on the type of the input
and the execution context of an app. Specifically, Android uses
text fields to help the user type text into client apps. Text
fields can have different input types, such as numbers, dates,
passwords, or email addresses. In fact, the type information of
text fields in the client app is already used to help an IME app
optimize its layout for frequently used characters. I-BOX also
leverages the type information of the text fields to decide
whether the input is sensitive or not; passwords and email
addresses are sensitive by default. In addition, based on the
user-defined per-client-app policy (e.g., an IME app is providing
services to a banking application), I-BOX will automatically
treat all the input consumed by a sensitive app according to
context [44] as sensitive.
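The context-based decision can be sketched as follows. This is a minimal illustration, not I-BOX’s implementation: the input-type labels, the function name is_sensitive_input, and the package names are hypothetical stand-ins for whatever the hooked framework code actually sees.

```python
# Hypothetical labels for text-field input types. The paper states only that
# passwords and email addresses are sensitive by default.
DEFAULT_SENSITIVE_TYPES = {"password", "email_address"}


def is_sensitive_input(input_type, client_app, sensitive_apps):
    """Return True if the whole input session should be treated as sensitive.

    input_type     -- label of the focused text field (assumed naming)
    client_app     -- package name of the app receiving the input
    sensitive_apps -- user-defined set of apps whose input is always sensitive
    """
    # Per-client-app policy: everything typed into a sensitive app is sensitive.
    if client_app in sensitive_apps:
        return True
    # Otherwise fall back to the type of the text field.
    return input_type in DEFAULT_SENSITIVE_TYPES


if __name__ == "__main__":
    banking = {"com.example.bank"}  # hypothetical package name
    print(is_sensitive_input("password", "com.example.notes", banking))
    print(is_sensitive_input("text", "com.example.bank", banking))
    print(is_sensitive_input("text", "com.example.notes", banking))
```

Input that is not classified as sensitive here would still pass through the prefix-matching policy described next.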
Prefix-matching policy. For general input streams, I-BOX
leverages prefix matching to decide whether an input stream is
sensitive. One challenge in defining policies for I-BOX is that
IME apps may need to handle multiple languages, including both
Latin and non-Latin languages. For non-Latin languages, I-BOX can
only get the text in the target language after an IME app has
translated the keystrokes into the corresponding text. Hence, it
is not viable to simply use keystrokes to represent the current
input. To address this problem, I-BOX instead uses the UTF-8
(8-bit Unicode Transformation Format) encoding of the translated
keystrokes to represent both the current keystrokes and the
entries in the policy engine.
As there may eventually be multiple data instances that should be
considered sensitive, I-BOX uses a trie-like structure to record
them. A trie-like structure is space-efficient for data with
common prefixes and supports fast look-up. I-BOX maintains a
global trie structure to represent the global policy. I-BOX may
also provide an application-specific trie structure if an end
user demands a stricter policy. During a query, I-BOX queries the
global and application-specific trie structures in parallel but
prefers application-specific policies over the global one.
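A small sketch of the two-level lookup follows. The class name PolicyTrie, the status strings, and the sequential (rather than parallel) query order are assumptions made for illustration; the paper states only that both tries are consulted and that application-specific policies take precedence.

```python
END = None  # sentinel key marking the end of a stored sensitive string


class PolicyTrie:
    """Trie over the characters of the sensitive strings in one policy."""

    def __init__(self, words=()):
        self.root = {}
        for word in words:
            self.add(word)

    def add(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True

    def status(self, typed):
        """Return 'none', 'prefix', or 'full' for the typed characters."""
        node = self.root
        for ch in typed:
            if ch not in node:
                return "none"
            node = node[ch]
        return "full" if END in node else "prefix"


def lookup(typed, global_trie, app_trie=None):
    """Consult both tries; an application-specific match wins over a global one."""
    if app_trie is not None:
        app_status = app_trie.status(typed)
        if app_status != "none":
            return ("app", app_status)
    return ("global", global_trie.status(typed))


if __name__ == "__main__":
    global_policy = PolicyTrie(["IsUsenixSec2015"])
    app_policy = PolicyTrie(["Isabel"])  # hypothetical per-app secret
    for typed in ("Is", "IsU", "xyz"):
        print(typed, lookup(typed, global_policy, app_policy))
```

Because strings sharing a prefix share trie nodes, adding more contacts or account names that start alike costs little extra space.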
While much of the sensitive data, such as contacts and cookies,
can be automatically translated into the trie structure as the
default policies, I-BOX also allows end users to use regular
expressions when they manually specify the policy. For example, a
user may define “abc*” to indicate that any word starting with
“abc” is sensitive input. Associated with each regular expression
is an acceptable disclosure rate (ADR), which defines how many
characters can be exposed in an input stream. The larger the ADR,
the more information may be leaked, but the more opportunity is
left for cloud assistance. Regular expressions make it easy for
experienced users to specify sensitive input, as they do not
require users to fully remember all such sensitive data and thus
match users’ ambiguous and incomplete memory. This also avoids
asking users to input full secrets to I-BOX. Alternatively,
average users may specify full secret names (i.e., a special case
of a regular expression) to I-BOX.
I-BOX provides a simple script to add such regular expressions to
the trie-like structure and to report any conflicts that occur.
For example, for a sensitive string of 15 characters (such as
‘IsUsenixSec2015’) and an ADR of 0.2, I-BOX will restrict an IME
app’s behavior when the first three characters (‘IsU’) are typed.
I-BOX runs the trie structure as a state machine to predict the
input stream by matching the typed characters against the trie
structures. Since any substring in the input data may be
sensitive, I-BOX needs to check all of them. To speed up this
process, I-BOX searches all possible substrings when a new
character is typed. Intermediate states are maintained so that
only new characters need to be handled instead of new substrings
constructed by the character.
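The ADR threshold and the incremental substring matching can be sketched as below. Two points are assumptions rather than details from the paper: the threshold is taken as the secret length times the ADR, rounded down with a minimum of one character (chosen because it reproduces the 15-character, ADR 0.2 example), and the class name StreamMatcher and its layout are illustrative.

```python
TRIGGER = None  # per-node key: number of typed characters that triggers restriction


def trigger_depth(secret, adr):
    """Characters that may be exposed before restriction (assumed rounding rule)."""
    # round() guards against float noise such as 15 * 0.2 == 3.0000000000000004
    return max(1, int(round(len(secret) * adr, 9)))


class StreamMatcher:
    """Runs the policy trie as a state machine over the input stream."""

    def __init__(self, secrets, adr):
        self.root = {}
        for secret in secrets:
            limit = trigger_depth(secret, adr)
            node = self.root
            for ch in secret:
                node = node.setdefault(ch, {TRIGGER: limit})
                # Secrets sharing a prefix keep the strictest (smallest) limit.
                node[TRIGGER] = min(node[TRIGGER], limit)
        self.active = []  # (trie node, matched length) for each live substring

    def feed(self, ch):
        """Consume one character; return True if restriction should start."""
        # Every live partial match may continue, and a new one may start here,
        # so only the new character is processed, never whole substrings.
        candidates = self.active + [(self.root, 0)]
        self.active = []
        hit = False
        for node, depth in candidates:
            child = node.get(ch)
            if child is None:
                continue
            self.active.append((child, depth + 1))
            if depth + 1 >= child[TRIGGER]:
                hit = True
        return hit


if __name__ == "__main__":
    matcher = StreamMatcher(["IsUsenixSec2015"], adr=0.2)
    for ch in "xxIsUse":
        print(repr(ch), matcher.feed(ch))
```

In this sketch the first True is returned on ‘U’, the third character of the secret, even though the secret starts in the middle of the stream.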
Note that currently I-BOX directly searches over the plain text
of the policy file and relies on the Android permission system to
protect it for simplicity. This can be further enhanced by
encrypting the policy file and using regular expressions to
search over the encrypted file, which was shown to have small
runtime and space overhead [36].
Prefix-substitution attacks. At first glance, the prefix-matching
policy used by I-BOX would appear to be vulnerable to a
prefix-substitution attack by a malicious IME app. Specifically,
a malicious IME app might first replace the prefix of a typed
string with a non-sensitive one so that I-BOX would not recognize
this prefix and thus no oblivious sandbox would be applied for
this input session. Fortunately, we note that users, the ultimate
witnesses, would immediately notice this by observing the
difference between what they typed and what was displayed on the
screen.
Note that as I-BOX monitors all keystrokes sent from IME apps to
user apps, I-BOX adjusts the state machine accordingly for any
cursor movement and special characters like deletion. This can
detect the case where a malicious IME app stealthily moves the
cursor to deceive I-BOX about the input sensitivity.
Overall, I-BOX relies on the user noticing any difference between
what she types and what she observes in order to detect malicious
behavior from an IME app. If a user does not pay enough attention
to the input process, a malicious IME app may still have the
chance to fool I-BOX about the sensitivity of the input streams.
4.2 Enabling Transactional Execution
To enable transactional execution of an IME app, I-BOX needs to
provide a checkpoint and rollback mechanism. The key challenges
here lie in how to provide low latency and ensure consistency,
which are made especially difficult by Android’s unique design.
For example, Android uses a Dalvik virtual machine (VM) to run
the Java code of the IME app, which interacts intensively with
the application framework. Further, the native code of an IME app
also interacts with the Dalvik VM through the Java Native
Interface (JNI). Finally, Android intensively uses Binder, a
complex IPC mechanism for communication among isolated apps. Such
hybrid execution and complex communication make it hard to
efficiently and consistently checkpoint the states of an IME app.
I-BOX addresses the above challenges by leveraging a set of
quiescent points. A quiescent point is a point at which all
threads of an application have stopped execution and there are no
pending states or requests to be processed. Checkpointing at
quiescent points frees I-BOX from handling a number of subtle
states, like residual states on the stack or in other
communication peers. Further, it also requires fewer states to be
checkpointed. Finally, when I-BOX rolls back the states of an IME
app, the states can be restored consistently without having to
deal with subtle residual states in other apps.
In the following, we describe in greater detail how we choose the
quiescent points (§4.2.1), how I-BOX performs the checkpoint and
restore of the local states of an IME app (§4.2.2), and how I-BOX
handles interactions of an IME app with others through IPCs
(§4.2.3).
4.2.1 Quiescent Points
Our key observation is that an IME app is essentially an
event-driven app that provides services to the client app.
Consequently, it is usually at a quiescent point when a user is
not typing, as no event will be delivered to the IME app at that
time. In this state, the IME app’s states are stable and
consistent. Thus, I-BOX is freed from handling many complex and
subtle local states. To achieve this, I-BOX first checks whether
an IME app is at a quiescent point by checking the process and
thread states (sleeping or not) and the IPC states. The check
succeeds in most cases. Even if the IME app refuses to cooperate
with I-BOX and keeps itself busy, I-BOX can first wait a short
time and then enforce a quiescent point by blocking new requests
and forcing the IME app to sleep in order to do the checkpoint.
Here, a non-cooperative IME app could also be a sign of being
malicious. However, we never encountered this case, as the IME
apps we tested always conform to the Android IME architecture.
Even in that case, I-BOX can always roll back the IME app to a
clean state checkpointed earlier.
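The check, wait, then enforce sequence can be sketched as a small polling loop. This is only a user-space illustration under stated assumptions: the callbacks is_quiescent and force_quiescence are hypothetical hooks standing in for I-BOX’s checks of thread and IPC states and for its blocking of new requests, and the grace period is an arbitrary value.

```python
import time


def reach_quiescent_point(is_quiescent, force_quiescence,
                          grace_period=0.05, poll_interval=0.005):
    """Return 'natural' if the app became idle on its own, else 'forced'.

    is_quiescent     -- callable: True when all threads sleep and no IPC is pending
    force_quiescence -- callable: blocks new requests and puts the app to sleep
    """
    if is_quiescent():          # the common case: the user is not typing
        return "natural"
    deadline = time.monotonic() + grace_period
    while time.monotonic() < deadline:
        time.sleep(poll_interval)   # wait a short time for the app to settle
        if is_quiescent():
            return "natural"
    force_quiescence()          # non-cooperative app: enforce the quiescent point
    return "forced"


if __name__ == "__main__":
    print(reach_quiescent_point(lambda: True, lambda: None))
    print(reach_quiescent_point(lambda: False, lambda: None, grace_period=0.01))
```

A caller could also treat a 'forced' result as a hint that the IME app may be misbehaving, in line with the remark above that non-cooperation can be a sign of malice.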
4.2.2 Checkpointing and Restoring Local States
Since data typed by a user can be stored anywhere in the IME app
in any form, all process states must be restored in order to wipe
out any sensitive data. The traditional way of checkpointing is
to copy all related process states to storage, which is very
heavyweight and would incur long latency. As the main purpose of
I-BOX’s checkpoint is to either roll back or discard it later,
I-BOX chooses a lightweight approach to checkpointing, which
creates a shadow process and then tracks all later changes by
using the copy-on-write (COW) features provided by Linux.
Saving and restoring file states. As typical IME apps usually
modify only a small number of files during one input transaction,
I-BOX currently records and copies such files during
checkpointing and restores them during rollback for simplicity.
Another option is to use a COW file system like Btrfs or ext3cow
to avoid copying. This requires replacing Android’s file system
with one that has a COW feature, which we leave as future work.
Android provides several options to save persistent application
data. Based on where the data is stored, we can divide these
options into two categories: internal and external storage. Every
Android app is assigned a private directory in the internal
storage to store files and data. By default, data saved to the
internal storage are private to an app, and other apps cannot
access them (nor can the user). I-BOX just copies all files in
the IME app’s private directory and then restores the files
modified during the input transaction upon rollback. Since there
are usually only a small number of files in the internal storage
for an IME app and the modified ones are even fewer, the time
cost is negligible.
For external storage, any Android app with proper permissions
(e.g., android.permission.WRITE_EXTERNAL_STORAGE) can access the
whole external storage. It would take very long if I-BOX scanned
the whole external storage to find the modified files. Hence,
I-BOX records all the files modified by the IME app during the
input transaction and then restores them as needed. Specifically,
once I-BOX detects that the IME app tries to write some data into
a file, it duplicates the file for subsequent restoring.
Note that as the checkpointed files are created by I-BOX, which
runs as a system process, the files have system privilege and
thus cannot be read or written by the IME app itself. This
ensures that an IME app cannot first save sensitive key logs into
such files and later read them out. In addition, I-BOX removes
the checkpointed files after rolling back an IME app.
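The record-on-first-write strategy for files can be sketched in user space as follows. This is a simplified illustration, not the system-level mechanism: the class FileCheckpoint and its before_write hook are hypothetical, and the sketch does not model the privilege separation that keeps the backup copies out of the IME app’s reach.

```python
import os
import shutil
import tempfile


class FileCheckpoint:
    """Duplicates each file the first time it is about to be written."""

    def __init__(self):
        self.backup_dir = tempfile.mkdtemp(prefix="ckpt_")
        self.saved = {}  # original path -> backup path, or None if the file was absent

    def before_write(self, path):
        """Hook to call when the monitored app is about to write to path."""
        if path in self.saved:
            return  # already duplicated during this input transaction
        if os.path.exists(path):
            backup = os.path.join(self.backup_dir, str(len(self.saved)))
            shutil.copy2(path, backup)
            self.saved[path] = backup
        else:
            self.saved[path] = None  # file is new; rollback must delete it

    def rollback(self):
        """Restore every touched file, then remove the checkpointed copies."""
        for path, backup in self.saved.items():
            if backup is None:
                if os.path.exists(path):
                    os.remove(path)
            else:
                shutil.copy2(backup, path)
        shutil.rmtree(self.backup_dir)
        self.saved.clear()


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    user_dict = os.path.join(workdir, "user_dict.db")  # hypothetical file name
    with open(user_dict, "w") as f:
        f.write("clean")
    ckpt = FileCheckpoint()
    ckpt.before_write(user_dict)
    with open(user_dict, "w") as f:
        f.write("IsUsenixSec2015")
    ckpt.rollback()
    with open(user_dict) as f:
        print(f.read())
```

Only files that are actually written during the transaction are copied, which matches the motivation for not scanning the whole external storage.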
Saving and restoring memory states. Memory states include the IME
app process’s data in memory and process-related metadata
maintained by the OS kernel (i.e., Linux). Linux uses many data
structures to manage a process and maintain its state, such as
task_struct, thread_info and others. I-BOX relies on a kernel
module to save and restore such data structures. Specifically,
this module maintains a shadow process in the kernel to store the
data of each running IME app. The shadow process duplicates the
process states of the original IME app by copying the metadata of
the IME app into its own task_struct, but with some modifications
for consistency. For example, it has its own kernel stack and
redirects the stack pointer in the task_struct to its own one,
although the content on the stack is the same as in the original
IME app. For independent states like the process ID or kernel
stack, I-BOX just copies the data into a buffer and writes them
back later. As for other states connected with other processes or
other events, like a pipe or waitqueue, I-BOX needs to record the
states and the relationships so that it can recover them
correctly later. Besides this, I-BOX also needs to save the
process memory. Instead of actually copying the memory pages,
I-BOX simply creates a shadow page table that shares the memory
with the target IME app process and marks the page table of the
target process as COW. This avoids a great deal of unnecessary
page copying, since most pages will not be modified during the
input transaction, and restoring the memory only requires
switching the page table root, which is very fast. This helps
reduce the stop time of each IME app process when I-BOX performs
checkpoint and restore.
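The shadow page table idea can be illustrated with a toy user-space model. It is only an analogy under stated assumptions: Python lists stand in for page tables and bytearrays for pages, and the class CowMemory is hypothetical. The real mechanism lives in a kernel module and relies on hardware page protection.

```python
class CowMemory:
    """Toy model: a page table is a list of references to page buffers."""

    def __init__(self, pages):
        self.table = [bytearray(p) for p in pages]  # live page table
        self.shadow = None                          # checkpointed page table
        self.copied = 0                             # pages duplicated so far

    def checkpoint(self):
        # Copying the table copies references only; page contents stay shared.
        self.shadow = list(self.table)
        self.copied = 0

    def write(self, index, offset, data):
        page = self.table[index]
        # First write to a shared page: duplicate it so the shadow stays clean.
        if self.shadow is not None and page is self.shadow[index]:
            page = bytearray(page)
            self.table[index] = page
            self.copied += 1
        page[offset:offset + len(data)] = data

    def rollback(self):
        # Restoring is just switching back to the shadow table.
        self.table = self.shadow
        self.shadow = None


if __name__ == "__main__":
    mem = CowMemory([b"aaaa", b"bbbb", b"cccc"])
    mem.checkpoint()
    mem.write(1, 0, b"XY")
    print(bytes(mem.table[1]), mem.copied)
    mem.rollback()
    print(bytes(mem.table[1]))
```

Only the pages written during the input transaction are duplicated, so both the checkpoint and the rollback cost stay independent of the total memory size in this model.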
Multi-thread rollback. Most Android applications run in the
Dalvik virtual machine and have multiple threads for different
purposes. Besides the main thread for the UI and the core logic
of the IME app, there are about another 10 threads for garbage
collection, event handling, Binder IPC, and so on. To roll back
the process states of an IME app correctly, I-BOX needs to deal
with such threads properly. Linux assigns a task_struct to a
thread, just as it does to a process, to maintain its state, and
groups all threads belonging to one process together through a
list. I-BOX therefore saves each thread with a separate shadow
process and groups these processes together through a list to
maintain their parent-child relationships just like the original
one. The resources shared between threads are duplicated too. For
example, I-BOX saves the pipe states between two threads and
restores them later.
4.2.3 Handling IPCs
One major challenge I-BOX faces in checkpoint and rollback is how
to deal with the IPC states of an IME app process. An IPC
involves multiple processes or even multiple machines, but I-BOX
can only control one end of the communication. One potential
problem is that the other side of an IPC may wait for a reply
that will never be sent, since the IME app process has forgotten
this request after rollback. Another serious problem is that the
client app may communicate with an inactive IPC endpoint that has
been erased from the IME app process due to rollback.
As a result, I-BOX needs to find proper timing to do checkpoint
and rollback such that the consistency of an IPC is not violated.
Proper timing requires several conditions. First, there should
not be any data in transmission between two processes; otherwise
it will lead to a corrupted request with incorrect semantics.
Second, there should be no pending IPC requests. This means an
IME app shall wait for all replies before doing a checkpoint and
ensure that no request to the process is pending before rollback.
Fortunately, it is not hard for I-BOX to find suitable timing
because I-BOX only does checkpoint and rollback when the user is
not typing. In most cases, the IME app processes should be
sleeping at that point. If not, we can safely enforce it without
disturbing other client apps since the user is not typing.
Inter-thread IPC. Linux provides a set of IPC mechanisms such as
pipes, sockets, and shared memory. Android inherits such
mechanisms but only uses them as a method for communication
between threads within a single process. Hence, I-BOX can control
both ends of these inter-thread IPCs, which avoids the
inconsistency issues due to unilateral actions. For example, the
two communicating parties of a pipe in a single process hold a
pair of pipe file descriptors, and the OS kernel allocates a
buffer for them to pass messages. To restore the pipe correctly,
I-BOX just keeps a record of the current pipe status and its
buffered data, then restores them as needed. There is no
restriction on the timing for checkpoint and rollback. Other IPCs
within the same process are handled similarly.
Android Binder. Android heavily uses its own IPC mechanism,
Binder, which helps the Android permission system provide access
control to Android services and resources. By mapping kernel
memory into user space, Binder IPC requires only one data copy
per transmission, i.e., from the sender’s user space to the
kernel buffer of the Binder driver. The receiver can then
directly read the data from its read-only user space mapping,
which is more performance-friendly. There are two issues I-BOX
needs to take care of for a consistent restore of the Binder
state. More specifically:
• Reference counting for Binder proxies. An Android app uses
  Binder proxies (e.g., BBinder, BpBinder) as the reference to
  remote processes instead of simple file descriptors. The Binder
  driver in the kernel needs to manage the reference counter for
  such proxies so that it can know whether a binder instance is
  still in use or not. I-BOX needs to track and record
  modifications to references to Binder proxies so that it can
  keep the reference counters consistent.
• Conversation between the Binder request and response. I-BOX
  also needs to keep the conversation between the Binder
  transaction request and response intact. As an Android service
  provider, an IME app process accepts a Binder transaction
  request from the client app and sends back the transaction
  response after handling the request. To achieve this, I-BOX
  tracks transaction requests and responses to find the right
  timing when all requests have been handled. It is not hard to
  find such a point because I-BOX usually does checkpoint or
  rollback when the IME app is idle without new requests.
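The two kinds of Binder bookkeeping described above can be sketched together. Everything named here is hypothetical: the class BinderTracker, its callbacks, and the idea of returning compensating reference changes are illustrative assumptions, since the paper does not describe the data structures the hooks use.

```python
class BinderTracker:
    """Tracks outstanding transactions and proxy reference changes since a checkpoint."""

    def __init__(self):
        self.pending = set()   # ids of requests that have not been answered yet
        self.ref_deltas = {}   # proxy name -> net reference-count change

    def on_request(self, txn_id):
        self.pending.add(txn_id)

    def on_reply(self, txn_id):
        self.pending.discard(txn_id)

    def on_ref_change(self, proxy, delta):
        self.ref_deltas[proxy] = self.ref_deltas.get(proxy, 0) + delta

    def is_quiescent(self):
        """Checkpoint or rollback is safe only when every request was answered."""
        return not self.pending

    def compensating_ref_changes(self):
        """Changes to apply on rollback so the driver's counters stay consistent."""
        return {p: -d for p, d in self.ref_deltas.items() if d != 0}


if __name__ == "__main__":
    tracker = BinderTracker()
    tracker.on_request(1)
    tracker.on_ref_change("proxyA", +1)
    print(tracker.is_quiescent())
    tracker.on_reply(1)
    print(tracker.is_quiescent(), tracker.compensating_ref_changes())
```

In this sketch a rollback would be deferred until is_quiescent() holds, and the compensating changes would then be applied to undo reference updates made after the checkpoint.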
Content Provider. An IME app may also interact with both
third-party and system content providers. For example, our
analysis of the TouchPal IME app reveals that this app accesses
third-party content providers like
content://com.tencent.mm.sdk.plugin.provider/sharedpref and
content://com.facebook.katana.provider.AttributionIdProvider;
our analysis of the Guobi IME app shows that this app accesses
content://com.iflytek.speechcloud.providers.LocalResourceProvider
and content://com.tencent.mm.sdk.plugin.provider/sharedpref.
TouchPal accesses system content providers like
content://sms/inbox, and both TouchPal and Guobi access
content://telephony/carriers/preferapn. Since all requests to
content providers in Android are issued through the Binder
mechanism, we rely on the Binder mechanism to detect a quiescent
point. Fortunately, we note that accesses to content providers
are request-oriented and thus connection-less. Hence, there is no
request in flight at a quiescent point, and I-BOX can checkpoint
such states accordingly.
Network. Different from Binder, the network driver does not
expose any semantic information about an upper layer’s
connections. Hence, it seems hard to maintain the consistency of
request and response between an IME app process and its
cloud-based server. Fortunately, there are two observations that
help relax the strict consistency requirement. First, network
connections between an IME app and the cloud-based server, like
fetching words by sending the keystrokes, synchronizing the
user’s library, and downloading news or advertisements, are
usually stateless and non-transactional; a redo operation does
not cause any consistency issues. Second, network connections
during input transactions are mostly short-lived synchronous
requests that are finished when input is done; hence they will
not be affected by rollback.
Lessons Learned. While it is generally hard to checkpoint a
complex app like an IME app, the event-driven nature of IME apps
greatly helps simplify the design and implementation of I-BOX. By
leveraging a quiescent-point based approach and conducting
checkpointing at a time at which an IME app is likely to be
quiescent (e.g., before an input session starts), I-BOX enjoys
both less implementation complexity and less runtime overhead.
4.3 Restricting IME Apps’ Behavior
When I-BOX detects a sensitive input session, it needs to
restrict an IME app’s behavior such that no sensitive data is
leaked during this process. A malicious IME app may leverage
various means to store and transform the data during this
process. For example, it may directly send input data to the
network, or store the input data in a content provider to be
retrieved and sent out later. To this end, I-BOX needs to
restrict an IME app’s behavior to close such channels for a
sensitive input stream.
I-BOX prevents an IME app from using the network and restricts
its accesses to content providers and services during a sensitive
input session. Specifically, during a sensitive input session,
I-BOX only grants an IME app read accesses (like query) to such
content providers and services. This is done by interposing on
binder_transaction and acting according to the access type
derived from the transaction code (i.e., query, insert, update or
delete).
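The access rule can be written down as a small decision function. The operation names mirror the access types listed above; the function name allow_action and the string "network" for outbound traffic are assumptions made for this sketch.

```python
READ_ONLY_OPS = {"query"}                       # read accesses stay allowed
WRITE_OPS = {"insert", "update", "delete"}      # could persist the typed secret


def allow_action(action, sensitive_session):
    """Decide whether the IME app may perform action right now.

    action            -- "query", "insert", "update", "delete", or "network"
    sensitive_session -- True while a sensitive input session is in progress
    """
    if not sensitive_session:
        return True                 # no restriction outside sensitive sessions
    if action in READ_ONLY_OPS:
        return True
    if action in WRITE_OPS or action == "network":
        return False
    return False                    # unknown actions are denied conservatively


if __name__ == "__main__":
    for action in ("query", "insert", "network"):
        print(action, allow_action(action, sensitive_session=True))
```

Denying unknown actions by default is a design choice of this sketch, not something the paper specifies.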
One potential issue is that the IME app may not function
correctly without such accesses. Fortunately, most Android apps
(including IME apps) are designed to work gracefully with
different permissions, because the user may grant different
permissions and an IME app may have to work without network
access. As a result, it is non-intrusive to dynamically deprive
the IME app of certain accesses, as evidenced by prior research
on dynamic permissions on Android [32]. After a rollback, as all
residual states inside an IME app have been cleaned, any pending
actions like insertion or deletion will be cleared as if they
never happened. Thus, there will not be any confusion for the
content providers and services.
5 EVALUATION
We have implemented I-BOX based on Android 4.2.2 and Linux kernel
OMAP 3.0.72. It consists of two main parts: i) a user-level
modification of the Android application framework to insert the
I-BOX policy engine and network control module; ii) a kernel
module to handle checkpoints and rollback of IME apps.
Experimental Setup. All of our experiments were performed on a
Samsung Galaxy Nexus smartphone with a 1.2 GHz TI OMAP4460 CPU,
1 GB of memory and 16 GB of internal storage. We evaluate I-BOX
using 11 popular IME apps to measure the performance overhead of
I-BOX. The 11 IME apps (as shown in the first column of Table 1)
are ranked among the highest in popularity in a large third-party
market³. Many of these IME apps have been installed millions of
times (Figure 1). In our testing, we set the security policies to
include all contacts in the phone and all commonly used accounts
and passwords. This forms a trie containing around 400 words.

³ http://www.wandoujia.com/
5.1 Performance Evaluation
The time overhead of I-BOX comes from three parts: (1) time to
find the quiescent points; (2) time to perform memory checkpoint
and rollback; and (3) time to perform file save and restore. To
measure the performance overhead, we asked a volunteer with an
average typing speed of about 100 characters per minute to enter
a 10-word paragraph in an SMS app using the tested IME apps. We
did not use an automation tool like Android Monkey, as it cannot
handle the complex UI of these IME apps.
Latency. As shown in Table 1, the time to find a quiescent point
is very small (less than 14 ms). This confirms our observation
that it is easy and fast to find or force a quiescent point to do
checkpoint and rollback on an IME app. The time for saving and
restoring an IME app’s memory state is also very small (less than
29 ms), since I-BOX does not actually copy the whole memory but
just marks the pages as COW. Based on the files touched by the
IME app process during the typing, I-BOX needs to restore a few
files to prevent the IME app from concealing the secret inside
files. Hence, the time for file save and restore is somewhat
longer (up to 60 ms), which can be further improved by using a
copy-on-write file system. In total, the maximum time to do a
checkpoint (including finding a quiescent point) is less than
103 ms (14 ms + 29 ms + 60 ms). In contrast, the world record for
texting is typing a complicated 25-word message (159 characters)
in 25.94 seconds [5], which corresponds to 163 ms/character and
1.0376 seconds/word. Hence, the time to do checkpointing is very
small compared to user typing. As the time to search the trie is
negligible, we do not report it here.
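The arithmetic behind this comparison can be checked with a few lines. The figures are taken directly from the text above; the variable names are illustrative.

```python
# Upper bounds quoted in the text (milliseconds).
quiescent_ms, memory_ms, file_ms = 14, 29, 60
checkpoint_ms = quiescent_ms + memory_ms + file_ms      # 103 ms in total

# World-record texting figures quoted in the text.
record_seconds, record_chars, record_words = 25.94, 159, 25
record_ms_per_char = record_seconds / record_chars * 1000   # about 163 ms
record_s_per_word = record_seconds / record_words           # 1.0376 s

# The volunteer in the experiment: about 100 characters per minute.
volunteer_ms_per_char = 60_000 / 100                        # 600 ms

if __name__ == "__main__":
    print(checkpoint_ms)
    print(round(record_ms_per_char), round(record_s_per_word, 4))
    print(volunteer_ms_per_char)
```

The worst-case checkpoint cost is thus below the per-character time of even a record-speed typist, and far below that of the volunteer.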
Power. To measure the power overhead incurred by I-BOX, we used
the TouchPal IME to input an article and its non-Latin
translation into a text-note app called Catch and recorded the
battery level. The total input process spans 30 minutes for both
unmodified Android and I-BOX-capable Android. We found that in
both cases the battery level dropped from 100% to 99%, so the
difference was indistinguishable. This is probably because the
IME app is not power-hungry and the additional power consumed by
I-BOX was offset by the reduced network transmissions; the
difference is thus hard to distinguish without a highly accurate
power meter. In future work, we plan to further characterize the
power consumption using an accurate power meter.
5.2 Security Evaluation
Here, we evaluate whether I-BOX has indeed mitigated the leakage
of a user’s sensitive keystrokes. We again use the IME apps from
our performance testing, along with a repackaged malicious IME
app (described in §2.3), to evaluate its effectiveness. According
to the accessibility of these IME apps, we conducted three sets
of experiments to determine effectiveness: black-box testing,
gray-box testing, and white-box testing.

                  Quiescent           Memory              File
IME app        C (ms)   R (ms)    C (ms)   R (µs)    C (ms)   R (ms)
Sogou           13.3     14.1      22.8       91       30       30
Baidu            8.2     11.1      22.6      275       40       40
QQ              12       11.8      24.3       31       60       30
Pinyin          11.8     12        20.8      122       10       10
Vee              5.9     10.3       0.022     61       20       20
Guobi            7.4      9.5      25.5       61       10       10
Octopus         11.4     11        28.9      245       30       20
iFlytek          4.6      9.7      13.9       92       10       10
Slideit         13.2     15.2      13.5      152       20       30
Jinshou          3.1      6.5      28         91       60       50
TouchPal         7.8     13.3      22.1      183       30       30
Baidu            3.3     10.9       9         61       30       40
Average          8.4     11.3      22.5      140       28.3     26.7

Table 1: Time overhead for finding a quiescent point, doing
checkpoint (C) and rollback (R).
5.2.1 Black-box Testing
Since most of the IME apps use proprietary, unknown protocols
with unknown encryption, we cannot directly trace the network
packets to confirm our effectiveness. Therefore, we take a
black-box approach to approximate our result. That is, instead of
inspecting the packet contents, we inspect the differences in the
packets sent by the IME apps with I-BOX and without I-BOX, within
an identical experiment setup and time window.
In particular, we ran all these apps using a two-minute time
window, and we typed around 30 non-Latin words with
[email protected] as the sensitive word and then observed the
packet differences using the Wireshark tool. Usually, these IME
apps send some packets out when a user types something that
triggers the cloud input function. Interestingly, we found that
6 out of the 11 tested apps sent a different number of packets,
as shown in Table 2. With I-BOX enabled, fewer packets are sent
out compared to normal runs. This is because I-BOX controls the
network of the target IME app when it detects sensitive input
data and prevents the target IME app from leaking the data out.
While such side-channel based black-box testing cannot fully
confirm that we have prevented all leaks, we believe it is highly
likely that I-BOX has stopped them, even for the other 5 apps for
which we did not observe packet differences. (It is highly likely
that these IME apps have buffered the input with the intent to
send the data out later. However, our oblivious sandboxing
mechanism will clear the buffered sensitive data.)
IME app      w/o I-BOX   w/ I-BOX
Baidu           17           6
Sogou           44          30
QQ              37          20
Octopus         32          16
TouchPal        70          28
Baidu           30          18

Table 2: Number of packets observed for the tested apps.
Figure 5: Hexdump of the traced TouchPal packet. The leaked SSN
is highlighted.
5.2.2 Gray-box Testing
Among these 11 IME apps, we are able to observe the packet
payload of TouchPal (as discussed in §2.3) because it uses a
plain-text protocol. Therefore, we conducted gray-box testing to
confirm that I-BOX indeed mitigated the privacy leakage. In this
experiment, we open a client “SMS” app to send a short message to
a friend containing a social security number (SSN), which is
private and sensitive by default. The text to send is a mixture
of both Latin and non-Latin languages, as well as the number.
Cloud input functionality will be triggered in this case.
Interestingly, without I-BOX’s protection, we found that TouchPal
uploaded not only the keycodes the user typed as arguments of
cloud input, but also the text message before the current input
cursor, which includes the sensitive social security number, to
the cloud through an HTTP POST method. We intercepted this packet
using a man-in-the-middle attack. Part of the packet is displayed
in Figure 5. However, with I-BOX’s protection, we found that
I-BOX successfully detected the critical number and shut down the
app’s network access to stop the leakage of data, and we did not
observe any network trace.
We also studied the privacy warnings generated by Android about
which data an IME may collect. Figure 6 shows that Android
generates privacy warnings for two popular IME apps, Sogou and
TouchPal, indicating that they may collect users’ passwords,
credit card numbers, etc. This is consistent with our conclusion
that they collect users’ private data.
Apps                   Without I-BOX                          With I-BOX
SMS (phone number)     6204562244                             62045
SMS (message)          Let’s meet tomorrow noon at room 302   Let’s meet tomorrow noon at room 302
Instagram (account)    [email protected]                      thisisf
Instagram (password)   fakepassword                           (none)
Facebook (account)     [email protected]                      thisisf
Facebook (password)    dontbelieveit                          (none)
Alipay                 [email protected]                      nomo
Gmail                  [email protected]                      tosom
Google Play            Ingress                                Ingress
browser                How much is this PS3?                  How much is this PS3?

Table 3: Evaluation result with the repackaged Baidu IME using
different client apps.
(a) Sogou IME App (in Chinese) (b) TouchPal IME App (in English)
Figure 6: Privacy Warning by Android for two popular IME apps.
The left is shown in Chinese and the right is shown in English;
the essential meanings are the same.
5.2.3 White-box Testing
As discussed in §2.3, we repackaged a very popular Baidu IME app
to log all of the user input data and send them out to a
malicious server we controlled. Hence, this repackaged IME app is
essentially a keylogger. We were able to perform white-box
testing by inspecting the packet payloads and confirming them
with the source code of our malicious payload. We installed this
IME app on our test phone and then used this phone to enter some
user-defined private sensitive data with different client apps,
including SMS, Facebook, and Gmail. Table 3 shows the data we
collected at the server side with and without I-BOX’s protection.
From this table we can clearly observe that without
I-BOX, the malicious IME app steals all the data that
a user enters, so all sensitive data is leaked. With
I-BOX, the network connection is automatically blocked so
that the server cannot receive any complete sensitive
information. For instance, for passwords, the malicious
server receives nothing, as shown in the Instagram and
Facebook cases. Because I-BOX shuts down the malicious IME
app’s network when it finds a character sequence that
matches part of a sensitive phrase in our security policy,
the server side can only receive part of the typed
characters. For example, when a user types her Facebook
account thisisfortest@gmail.com, the server side receives
only a part of it, i.e., thisisf⁴. While partial sensitive
input is still leaked, we believe it remains hard for
attackers to guess the original message.
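The prefix-triggered cutoff described above can be illustrated with a minimal Python sketch. It is not I-BOX’s implementation: the function names, the `max_fraction` parameter, and its default of one half are assumptions made purely for illustration (the default happens to reproduce the phone-number row of Table 3, but the actual matching rule may differ). The sketch also ignores rollback and any separate handling of password fields.

```python
from typing import Iterable


def longest_prefix_match(typed: str, secret: str) -> int:
    """Length of the longest prefix of `secret` that ends `typed`."""
    for k in range(min(len(typed), len(secret)), 0, -1):
        if typed.endswith(secret[:k]):
            return k
    return 0


def simulate_typing(text: str, secrets: Iterable[str],
                    max_fraction: float = 0.5) -> str:
    """Return the characters an IME could still send before the cutoff.

    `max_fraction` is a hypothetical policy knob: the network is cut as
    soon as more than this fraction of any secret has been typed.
    """
    secrets = list(secrets)
    typed = ""
    released = []
    for ch in text:
        typed += ch
        for secret in secrets:
            allowed = int(len(secret) * max_fraction)
            if longest_prefix_match(typed, secret) > allowed:
                # Network is shut down: nothing further is released.
                return "".join(released)
        released.append(ch)
    return "".join(released)


if __name__ == "__main__":
    policy = ["6204562244"]  # sensitive phone number from Table 3
    print(simulate_typing("6204562244", policy))
    print(simulate_typing("Let's meet tomorrow noon at room 302", policy))
```

Under these assumptions the phone number is truncated after its first half, while the ordinary SMS message passes through unchanged, which mirrors the qualitative behavior reported in Table 3.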
5.3 User Experience
One principal goal of I-BOX is to minimize any negative
impact on the end user’s experience. To evaluate this, we
tested latency by determining how an end user would feel
when typing characters on devices protected by I-BOX. For
this, we invited a dozen students (6 undergraduate and 6
master’s students) in our lab to install I-BOX on their
phones, and asked them to use our system and provide us
with feedback. By default, I-BOX uses the context-based
policy and derives all sensitive data from the contacts
and cookies. Two of the students also entered their
girlfriends’ names and birth dates into I-BOX.
To our pleasure, none of the users complained of any
latency imposed by our system. As shown in Table 4,
our policy checking imposes only 0.4 milliseconds (ms) of
overhead per character. While network shutdown takes about
180 ms, it is not executed per word and is instead
triggered only when certain sensitive words are about to
be formed. Therefore, the additional overhead added by
I-BOX cannot be detected by end users: the typing speed of
a normal user is 625 ms per character, and the world
record for fastest texting is 160 ms per character, as
shown in Table 4.
One complaint we have received so far is that users now
need to manually type their accounts instead of using the
automation features provided by the IME apps. We believe
this is a worthwhile price for better privacy protection.
Another complaint is that users need to specify their
additional secrets manually; this motivates us to design
a better user interface in our future work.
⁴ Note that we regard the sequence after @ as one character because
an attacker can guess the rest from its first character most of the time.
Item                                        Latency
Policy Checking                             0.4 ms/char
Network Shutdown                            180 ms
Checkpoint/Restore                          103 ms
Guinness World Records of fastest texter    160 ms/char
Normal user speed                           625 ms/char
Table 4: Statistics regarding the usage latency of I-BOX.
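As a quick check of the latency argument, the per-character policy-checking cost can be expressed as a fraction of the typing intervals listed in Table 4. The snippet below is only a worked calculation on the reported numbers; the constant and function names are ours and do not come from I-BOX.

```python
# Latency figures as reported in Table 4 (milliseconds).
POLICY_CHECK_MS_PER_CHAR = 0.4
NETWORK_SHUTDOWN_MS = 180.0
FASTEST_TEXTER_MS_PER_CHAR = 160.0
NORMAL_USER_MS_PER_CHAR = 625.0


def relative_overhead(overhead_ms: float, baseline_ms: float) -> float:
    """Overhead as a fraction of the baseline per-character typing time."""
    return overhead_ms / baseline_ms


normal = relative_overhead(POLICY_CHECK_MS_PER_CHAR, NORMAL_USER_MS_PER_CHAR)
fastest = relative_overhead(POLICY_CHECK_MS_PER_CHAR, FASTEST_TEXTER_MS_PER_CHAR)

print(f"normal user:    {normal:.3%}")   # 0.064%
print(f"fastest texter: {fastest:.3%}")  # 0.250%
```

Policy checking therefore adds well under one percent to each keystroke interval, even at record typing speed. The 180 ms network shutdown is shorter than a normal user's 625 ms interval but slightly longer than the record 160 ms interval, which is consistent with the observation that it matters little only because it is triggered rarely.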
6 DISCUSSIONS AND LIMITATIONS
While I-BOX takes a first step toward mitigating keystroke
leakage by untrusted IME apps, there are still a number
of limitations in its design and implementation.
Side-channel attacks. It has been shown viable to use side
channels to infer some keystroke information [9, 4].
I-BOX currently cannot prevent such side-channel attacks.
However, such threats are usually less severe than those
of malicious IME apps, which can accurately observe all
user input. We leave addressing side-channel leakage
as our future work.
Colluding malware. As I-BOX currently only runs an
IME app inside a sandbox transactionally, it is still
possible that an IME app could collude with other malware
to leak information (i.e., the colluding attack [8]).
For example, an IME app could first save the user input
in a local file and inform colluding malware to read
the file before the transaction has been rolled back and
then divulge the input. This essentially violates the
policies of I-BOX. However, it is challenging for sandboxing
to reliably prevent this, as studied by TxBox [25].
Security of I-BOX. Any new security tool may bring
new security implications, as such tools usually touch
security-sensitive data, and I-BOX is no exception. As I-BOX
can essentially touch all of a user’s sensitive data, it is
essentially a keylogger as well. Yet, I-BOX is much simpler
than closed-source proprietary IME apps (1,700 LOC
vs. hundreds of thousands of LOC). To decide whether to
trust I-BOX or other IME apps, third-party agents only need
to audit the code of I-BOX, instead of using gray-box
approaches to audit the behavior of dozens of
third-party IME apps. Meanwhile, I-BOX is completely
a local service and does not send any private data out of
the phone.
Permission Attacks. As I-BOX’s security is based on the
Android permission system, it cannot defend against
attacks on permissions, such as component hijacking
attacks and confused deputy attacks [23]. We consider
this out of the scope of this paper; there have
been a number of prior systems that statically and
dynamically detect and prevent such attacks (e.g., [12, 43]).
Moreover, Android has significantly improved its permission
system since version 4.2 [3].
Voice input. Currently we limit input data protection
to handwriting input and keystroke input and do not
consider voice input, as it does not involve keystrokes.
Moreover, users usually rely on dedicated system services
for voice input, such as Apple Siri, Google Now, and
Microsoft voice recognition. How to handle voice input and
preserve its privacy is very challenging and will be our
future work.
Beyond Mobile IME Apps. Note that the approach of
I-BOX does not necessarily apply only to mobile platforms;
similar techniques can also be applied to desktops,
which suffer from a similar dilemma between privacy
and usability. We may provide a similar oblivious
sandbox for each IME app, which should be straightforward
as Android actually runs atop Linux. We leave this
as our future work. Besides, other applications that
require a tradeoff between privacy and usability may use
execution transactions as I-BOX does.
7 RELATED WORK
Privacy leakage detection in mobile devices. Recently,
there have been significant efforts on detecting privacy
leakage in mobile devices. Early attempts include
TaintDroid [16, 17] and PiOS [15], and recent efforts
include Woodpecker [22], AndroidLeaks [20],
ContentScope [50], and Appprofiler [35]. In particular,
TaintDroid [16] uses dynamic taint analysis to track
whether sensitive information (e.g., address book) can
be leaked through the network. PiOS [15] uses static
analysis and focuses on privacy leakage in iOS apps.
Woodpecker [22] leverages an inter-procedural data-flow
analysis to inspect whether an untrusted app can obtain
unauthorized access to sensitive data. ContentScope [50]
detects passive content leak vulnerabilities, by which
in-app sensitive data can be leaked.
AndroidLeaks [20] instead uses static analysis to detect
data leakage in Android apps. Chan et al. [10] further
leverage mobile forensics to correlate user actions with
privacy leakages. Appprofiler [35] creates a mapping
between high-level API calls and low-level privacy-related
behavior, which is then used to provide a high-level
profile of an app’s privacy behavior. Besides, there has also
been interest in detecting privacy leakage due to mobile
ads [38]. In contrast, I-BOX focuses on preventing
leakage of sensitive keystrokes.
Privacy leakage prevention in mobile devices. Other
than detecting privacy leakage, there are also a number
of systems that prevent private data from being leaked.
By extending TaintDroid [16], AppFence [24] prevents
applications from accessing sensitive information using
data shadowing, and it also blocks outgoing
communications tainted by sensitive data. While I-BOX and
AppFence both block network communications when
sensitive data is about to be leaked, there are substantial
differences: AppFence uses shadowing to provide an
illusion to the app such that it can continue performing its
taint tracking, whereas I-BOX uses neither any illusion
nor any instruction-level taint tracking, due to the
pervasive existence of native code. Meanwhile, AppFence
does not encounter the challenges we faced, such as
consistent rollback, and it simply blocks the network
communication, whereas I-BOX still has to keep the
connection and allow other data to be transferred.
TISSA [51] tames information-stealing apps to stop
possible privacy leakage. SpanDex [14] further uses
symbolic execution to quantify and limit the implicit
flows through a sandbox, to prevent an untrusted
application from leaking passwords. Through automatic
repackaging of Android apps, Aurasium [43] attaches
sandboxing and policy enforcement atop existing apps, to stop
malicious behaviors such as attempts to retrieve users’
sensitive information. Unlike Aurasium, which adds a
sandbox to an app, πBox [30] shifts the sandboxing
protection of private data from the app level to the system
level, and offers a platform for privacy-preserving apps.
However, πBox trusts a few app vendors to protect users’
private data, while I-BOX treats the vendors of IME apps
as untrusted, due to their incentives to collect users’
input. TinMan [42] instead completely offloads
password-like secrets to a remote cloud, but only handles a
special class of secrets that need not be displayed on
mobile devices. ScreenPass [31] leverages a trusted
software keyboard to input and tag passwords and uses taint
tracking to ensure that a password is only used within
a specific domain. In contrast, while I-BOX also uses a
trusted software keyboard for password input, it focuses
more on preventing a malicious IME from leaking
sensitive data (not only passwords).
Checkpoint and restore. I-BOX employs a checkpoint
and restore mechanism to prevent privacy leakage.
Such a mechanism has been built for transactional
memory [6], execution transactions [37], as well as
whole-system transactions [33]. Retro [26] leverages selective
re-execution for intrusion recovery. Storage Capsules [7]
also use checkpoint and restore to wipe off residual data
after an application has viewed data on a desktop. I-BOX
is an instance of a system transaction but is designed
specifically for untrusted IME apps.
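To make the checkpoint-and-restore idea concrete, the following toy Python sketch treats one input session as a transaction that is either committed or rolled back. It is only an illustration of the general pattern: I-BOX operates on the state of a real IME process, whereas the `ToySandbox` class, its dictionary state, and the `type_session` helper here are hypothetical names invented for this example.

```python
import copy


class ToySandbox:
    """Toy stand-in for an IME process whose state can be checkpointed."""

    def __init__(self):
        # Hypothetical IME-side state (e.g., a learned-word dictionary).
        self.state = {"learned_words": []}
        self._checkpoint = None

    def checkpoint(self) -> None:
        # Save a deep copy so later mutations cannot alter the snapshot.
        self._checkpoint = copy.deepcopy(self.state)

    def restore(self) -> None:
        if self._checkpoint is None:
            raise RuntimeError("no checkpoint taken")
        self.state = copy.deepcopy(self._checkpoint)

    def ime_observe(self, word: str) -> None:
        # The untrusted IME may store anything it sees.
        self.state["learned_words"].append(word)


def type_session(sandbox: ToySandbox, words, sensitive) -> bool:
    """Run one input session as a transaction; return True if committed."""
    sandbox.checkpoint()
    saw_sensitive = False
    for word in words:
        sandbox.ime_observe(word)
        if word in sensitive:
            saw_sensitive = True
    if saw_sensitive:
        sandbox.restore()  # roll back: the IME "forgets" the session
    return not saw_sensitive


if __name__ == "__main__":
    sb = ToySandbox()
    type_session(sb, ["hello", "world"], {"secret"})
    type_session(sb, ["my", "secret"], {"secret"})
    print(sb.state["learned_words"])  # ['hello', 'world']
```

In this toy model a session without sensitive input is committed, so the IME keeps what it learned, while a session containing sensitive input leaves the IME's state exactly as it was at the checkpoint.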
Sandboxing. There have been a large number of efforts
in building sandboxes to execute untrusted programs,
web applications, and native code. These tools were built
using a variety of approaches, such as kernel-based
systems [19], user-level approaches [27], system call
interposition [21], binary code translation [18], and
recompilation [45].
A sandbox that also employs transactions is
TxBox [25], a tool built atop TxOS [33] for speculative
execution and automatic recovery. While I-BOX and
TxBox share the similarity of using transactions to build
a sandbox, there are still significant differences: the goal
of TxBox is to confine the execution of native x86
programs atop the Linux kernel, whereas that of I-BOX is to
confine IME apps atop Android. Consequently, I-BOX
faces additional challenges, including resolving IPC
bindings. Further, using quiescent points in I-BOX
significantly simplifies the design and implementation.
8 CONCLUSION
This paper made a first systematic study of the
(in)security of third-party (trusted or untrusted) IME
apps, and revealed that these apps tend to leak users’
sensitive input (due to their incentives to improve the
user’s experience). To enjoy the rich experience offered
by such apps while mitigating information leakage,
this paper described I-BOX as a first step in this
direction. In light of the opaque nature of an IME
app, I-BOX leverages the idea of transactions to run an
IME app in a way that makes it oblivious to users’
sensitive input. Experiments showed that I-BOX is efficient,
incurs little impact on users’ experience, and successfully
thwarts the leakage of sensitive user input.
ACKNOWLEDGMENTS
We thank our shepherd William Enck and the anony-
mous reviewers for their insightful comments, Xiaojuan
Li and Yutao Liu for helping prepare the final version.
This work is supported in part by the Program for New
Century Excellent Talents in University, Ministry of Ed-
ucation of China (No. ZXZY037003), a foundation for
the Author of National Excellent Doctoral Dissertation
of PR China (No. TS0220103006), the Shanghai Sci-
ence and Technology Development Fund for high-tech
achievement translation (No. 14511100902), Zhangjiang
Hi-Tech program (No. 201501-YP-B108-012), and the
Singapore NRF (CREATE E2S2).
REFERENCES
[1] Free Chinese-made software poses security risk.
http://www.japantimes.co.jp/news/2013/12/26/national/chinese-made-computer-input-system-banned-in-government-agencies/#.U21w5_aPUS0.
[2] smali - An assembler/disassembler for Android’s dex format.
https://code.google.com/p/smali/.
[3] Security enhancements in jelly bean.
http://android-developers.blogspot.jp/2013/02/security-enhancements-in-jelly-bean.html, 2013.
[4] A. J. Aviv, B. Sapp, M. Blaze, and J. M. Smith. Practicality of
accelerometer side channels on smartphones. In ACSAC, 2012.
[5] BBC News. Salford woman makes bid for fastest text title.
http://news.bbc.co.uk/local/manchester/hi/people_and_places/newsid_8939000/8939790.stm, 2010.
[6] A. Birgisson, M. Dhawan, U. Erlingsson, V. Ganapathy, and
L. Iftode. Enforcing authorization policies using transactional
memory introspection. In CCS, pages 223–234, 2008.
[7] K. Borders, E. Vander Weele, B. Lau, and A. Prakash. Protecting
confidential data on personal computers with storage capsules. In
USENIX Security, 2009.
[8] S. Bugiel, L. Davi, A. Dmitrienko, T. Fischer, A.-R. Sadeghi,
and B. Shastry. Towards taming privilege-escalation attacks on
android. In NDSS, 2012.
[9] L. Cai and H. Chen. Touchlogger: inferring keystrokes on touch
screen from smartphone motion. In HotSec, 2011.
[10] J. J. K. Chan, K. W. Tan, L. Jiang, and R. K. Balan. The case
for mobile forensics of private data leaks: Towards large-scale
user-oriented privacy protection. In APSYS, 2013.
[11] S. Chen, R. Wang, X. Wang, and K. Zhang. Side-channel leaks
in web applications: A reality today, a challenge tomorrow. In
Oakland, pages 191–206, 2010.
[12] E. Chin, A. P. Felt, K. Greenwood, and D. Wagner. Analyzing
inter-application communication in android. In MobiSys, pages
239–252. ACM, 2011.
[13] China IT Research Center. Third-part IMEs usage stats in China
for 2014 Q1. http://www.cnit-research.com/content/201405/303.html, 2014.
[14] L. P. Cox, P. Gilbert, G. Lawler, V. Pistol, A. Razeen, B. Wu,
and S. Cheemalapati. Spandex: Secure password tracking for
android. In USENIX Security, 2014.
[15] M. Egele, C. Kruegel, E. Kirda, and G. Vigna. Pios: Detecting
privacy leaks in ios applications. In NDSS, 2011.
[16] W. Enck, P. Gilbert, B. Chun, L. Cox, J. Jung, P. McDaniel, and
A. Sheth. TaintDroid: an information-flow tracking system for
realtime privacy monitoring on smartphones. In OSDI, 2010.
[17] W. Enck, P. Gilbert, S. Han, V. Tendulkar, B.-G. Chun, L. P.
Cox, J. Jung, P. McDaniel, and A. N. Sheth. Taintdroid: an
information-flow tracking system for realtime privacy monitoring
on smartphones. ACM TOCS, 32(2):5, 2014.
[18] B. Ford and R. Cox. Vx32: Lightweight user-level sandboxing
on the x86. In USENIX ATC, 2008.
[19] T. Fraser, L. Badger, and M. Feldman. Hardening cots software
with generic software wrappers. In Oakland, pages 2–16, 1999.
[20] C. Gibler, J. Crussell, J. Erickson, and H. Chen. Androidleaks:
automatically detecting potential privacy leaks in android
applications on a large scale. In Trust, 2012.
[21] I. Goldberg, D. Wagner, R. Thomas, and E. A. Brewer. A secure
environment for untrusted helper applications confining the wily
hacker. In USENIX Security, 1996.
[22] M. Grace, Y. Zhou, Z. Wang, and X. Jiang. Systematic detection
of capability leaks in stock android smartphones. In NDSS, 2012.
[23] N. Hardy. The confused deputy: (or why capabilities might have
been invented). SIGOPS Oper. Sys. Review, 22(4):36–38, 1988.
[24] P. Hornyack, S. Han, J. Jung, S. Schechter, and D. Wetherall.
These aren’t the droids you’re looking for: Retrofitting android
to protect data from imperious applications. In CCS, 2011.
[25] S. Jana, D. E. Porter, and V. Shmatikov. Txbox: Building secure,
efficient sandboxes with system transactions. In Oakland, 2011.
[26] T. Kim, X. Wang, N. Zeldovich, M. Kaashoek, et al. Intrusion
recovery using selective re-execution. In OSDI, 2010.
[27] T. Kim and N. Zeldovich. Practical and effective sandboxing for
non-root users. In USENIX ATC, pages 139–144, 2013.
[28] O. Laadan and J. Nieh. Transparent checkpoint-restart of multiple
processes on commodity operating systems. In USENIX ATC,
pages 323–336, 2007.
[29] W. S. Labs. Fake input method editor(ime) trojan.
http://community.websense.com/blogs/securitylabs/archive/2010/07/05/trojan-using-input-method-inject-technology.aspx.
[30] S. Lee, E. L. Wong, D. Goel, M. Dahlin, and V. Shmatikov. πbox:
a platform for privacy-preserving apps. In NSDI, 2013.
[31] D. Liu, E. Cuervo, V. Pistol, R. Scudellari, and L. P. Cox.
Screenpass: Secure password entry on touchscreen devices. In MobiSys,
pages 291–304, 2013.
[32] M. Nauman, S. Khan, and X. Zhang. Apex: extending android
permission model and enforcement with user-defined runtime
constraints. In ASIACCS, pages 328–332, 2010.
[33] D. E. Porter, O. S. Hofmann, C. J. Rossbach, A. Benn, and
E. Witchel. Operating system transactions. In SOSP, 2009.
[34] V. Rastogi, Y. Chen, and W. Enck. Appsplayground: Automatic
security analysis of smartphone applications. In ACM conference
on Data and application security and privacy, 2013.
[35] S. Rosen, Z. Qian, and Z. M. Mao. Appprofiler: a flexible method
of exposing privacy-related behavior in android applications to
end users. In ACM conference on Data and application security
and privacy, pages 221–232. ACM, 2013.
[36] M. A. Salehi, T. Caldwell, A. Fernandez, E. Mickiewicz, E. W.
Rozier, S. Zonouz, and D. Redberg. Reseed: Regular expression
search over encrypted data in the cloud. In CCGrid, 2014.
[37] S. Sidiroglou, O. Laadan, A. D. Keromytis, and J. Nieh. Using
rescue points to navigate software recovery. In Oakland, 2007.
[38] R. Stevens, C. Gibler, J. Crussell, J. Erickson, and H. Chen.
Investigating user privacy in android ad libraries. In Workshop on
Mobile Security Technologies (MoST), 2012.
[39] K. Subramanyam, C. E. Frank, and D. F. Galli.
Keyloggers: The overlooked threat to computer security.
http://www.keylogger.org/articles/kishore-subramanyam/keyloggers-the-overlooked-threat-to-computer-security-7.html.
[40] TechSpot News. Google fired employees for breaching user privacy.
http://www.techspot.com/news/40280-google-fired-employees-for-breaching-user-privacy.html, 2010.
[41] Y. Xia, Y. Liu, and H. Chen. Architecture support for
guest-transparent vm protection from untrusted hypervisor and physical
attacks. In HPCA, 2013.
[42] Y. Xia, Y. Liu, C. Tan, M. Ma, H. Guan, B. Zang, and H. Chen.
Tinman: eliminating confidential mobile data exposure with
security oriented offloading. In EuroSys, 2015.
[43] R. Xu, H. Saïdi, and R. Anderson. Aurasium: Practical policy
enforcement for android applications. In USENIX Security, 2012.
[44] W. Yang, X. Xiao, B. Andow, S. Li, T. Xie, and W. Enck.
Appcontext: Differentiating malicious and benign mobile app
behaviors using context. In ICSE, 2015.
[45] B. Yee, D. Sehr, G. Dardyk, J. B. Chen, R. Muth, T. Ormandy,
S. Okasaka, N. Narula, and N. Fullagar. Native client: A sandbox
for portable, untrusted x86 native code. Commun. ACM,
53(1):91–99, Jan. 2010.
[46] H. Yin, D. Song, M. Egele, C. Kruegel, and E. Kirda. Panorama:
Capturing system-wide information flow for malware detection
and analysis. In CCS, 2007.
[47] F. Zhang, J. Chen, H. Chen, and B. Zang. Cloudvisor: retrofitting
protection of virtual machines in multi-tenant cloud with nested
virtualization. In SOSP, 2011.
[48] W. Zhou, Y. Zhou, X. Jiang, and P. Ning. Detecting repackaged
smartphone applications in third-party android marketplaces. In
ACM conference on Data and Application Security and Privacy,
pages 317–326. ACM, 2012.
[49] Y. Zhou and X. Jiang. Dissecting android malware:
Characterization and evolution. In Oakland, 2012.
[50] Y. Zhou and X. Jiang. Detecting passive content leaks and
pollution in android applications. In NDSS, 2013.
[51] Y. Zhou, X. Zhang, X. Jiang, and V. W. Freeh. Taming
information-stealing smartphone applications (on android). In
Conference on Trust and Trustworthy Computing, 2011.