Thwarting Sensitive Keystroke Leakage in Mobile IME Apps

Open access to the Proceedings of

the 24th USENIX Security Symposium

is sponsored by USENIX

You Shouldn’t Collect My Secrets:

Thwarting Sensitive Keystroke Leakage

in Mobile IME Apps

Jin Chen and Haibo Chen, Shanghai Jiao Tong University; Erick Bauman

and Zhiqiang Lin, The University of Texas at Dallas; Binyu Zang and Haibing Guan,

Shanghai Jiao Tong University

https://www.usenix.org/conference/usenixsecurity15/technical-sessions/presentation/chen-jin

This paper is included in the Proceedings of the

24th USENIX Security Symposium

August 12–14, 2015 • Washington, D.C.

ISBN 978-1-939133-11-3

USENIX Association 24th USENIX Security Symposium 675

You Shouldn’t Collect My Secrets:

Thwarting Sensitive Keystroke Leakage in Mobile IME Apps

Jin Chen†, Haibo Chen†, Erick Bauman⋆, Zhiqiang Lin⋆, Binyu Zang†, Haibing Guan†

†Shanghai Key Laboratory of Scalable Computing and Systems, Shanghai Jiao Tong University

⋆Department of Computer Science, The University of Texas at Dallas

ABSTRACT

IME (input method editor) apps are the primary means

of interaction on mobile touch screen devices and thus

are usually granted access to a wealth of private

user input. In order to understand the (in)security of

mobile IME apps, this paper ﬁrst performs a systematic

study and uncovers that many IME apps may (intention-

ally or unintentionally) leak users’ sensitive data to the

outside world (mainly due to the incentives of improv-

ing the user’s experience). To thwart the threat of sen-

sitive information leakage while retaining the beneﬁts of

an improved user experience, this paper then proposes

I-BOX, an app-transparent oblivious sandbox that mini-

mizes sensitive input leakage by conﬁning untrusted IME

apps to predeﬁned security policies. Several key chal-

lenges have to be addressed due to the proprietary and

closed-source nature of most IME apps and the fact that

an IME app can arbitrarily store and transform user input

before sending it out. By designing system-level transac-

tional execution,

I-BOX works seamlessly and transpar-

ently with IME apps. Speciﬁcally,

I-BOX ﬁrst check-

points an IME app’s state before the ﬁrst keystroke of an

input, monitors and analyzes the user’s input, and rolls

back the state to the checkpoint if it detects the poten-

tial danger that sensitive input may be leaked. A proof

of concept

I-BOX prototype has been built for Android

and tested with a set of popular IME apps. Experimental

results show that

I-BOX is able to thwart the leakage of

sensitive input for untrusted IME apps, while incurring

very small runtime overhead and little impact on user ex-

perience.

1 INTRODUCTION

The Problem. With large touch screens, modern mo-

bile devices typically feature software keyboards to al-

low users to enter text input. This is different compared

to traditional desktops where we use the hardware key-

boards. These soft keyboards are known as Input Method

Editor (IME) apps, and they convert users’ touch events

to text. Since IME apps process almost all of a user’s input

on mobile devices, it is critical to ensure that they are

not keyloggers and they do not leak any sensitive input

to the outside world.

[Figure 1 plot: bar chart of downloads per IME app. x-axis: the IME apps (Sougou, iFlytek, Google Pinyin, TouchPal, Baidu, Jinshou, Guobi, Octopus, Slideit, Vee); y-axis: the number of downloads (units of 10,000), with ticks from 20,000 to 160,000.]

Figure 1: Download statistics of IME apps in our study.

While all mobile devices have a default IME app in-

stalled, users often demand third-party IME apps with

expanded feature sets in order to gain a better user experience.

This is especially common for non-Latin languages.

In order to accommodate this need, mobile operating

systems such as Android and iOS provide an extensible

framework allowing alternate input methods. Due

to the ease of making third-party IME apps and high de-

mand for customization, there are currently thousands of

IME apps in major app markets like Google Play and Apple’s

App Store, many of which have gained hundreds of

millions of downloads, as shown in Fig. 1. For instance, the

Sogou IME app has in total 1.6 billion downloads across

Google Play and several third-party app vendors such as

360 and Baidu. Meanwhile, a recent survey [13] found

that 68.3% of smartphones in China are using third-party

IME apps. This survey did not include statistics from

Japan or Korea, where such apps are also very popular.

Unfortunately, despite these advantages, using a third-

party IME app also brings security and privacy concerns

(assume the default IME app does not have these prob-

lems). First, IME app developers have incentives to log

and collect user input in order to improve the user’s ex-

perience with their products, and user input is as valuable

as email content, from which they can learn users’ needs

and push customized advertising or other business activ-

ities. Although an IME app may state a policy of not

collecting certain input from a user, the policies imple-


mented in the app may unintentionally send sensitive in-

put outside the phone. In §2.3 we show that such a threat

is real by observing the output of a popular IME app that

periodically sends out user input to a remote server. In

addition, we collected the network activities of a set of

IME apps during a user input study and showed that they

also likely send out private data. In light of this informa-

tion leakage threat, the Japanese government’s National

Information Security Center has warned its central government

ministries, agencies, research institutions and

public universities to stop using IME apps offered by the

search engine provider Baidu [1].

Even if a user trusts benign IME apps to properly secure

private data, there is still a risk from repackaging

attacks targeting benign apps. In fact, a prior study has

shown that around 86% of Android malware samples are

repackaged from legitimate apps [49]. It is also surprisingly

simple to repackage an IME app with a malicious

payload, as we demonstrate in §2. A repackaged

malicious IME app is essentially a keylogger, which

has been one of the most dangerous security threats for

years [39]. Also, evidence has shown that IME apps are

popular targets for attackers to inject malicious code [29].

Challenges. While it may seem trivial to detect these

repackaged malicious IME apps by comparing a hash of

the code with the corresponding vendor in the ofﬁcial

market, the widespread existence of third-party markets

makes such checks more difﬁcult. It is also easy for at-

tackers to plant repackaged malware into these markets,

as is shown by the fact that a considerable amount of

repackaged malware has been found in them [48].

Of further concern is the fact that it is very challenging

to analyze whether even “benign” IME apps will leak any

sensitive data or not. There are several reasons why detecting

privacy leaks in IME apps is challenging. First,

many commercial IME apps use excessive amounts of

native code, which makes it very difﬁcult to understand

how they log and process user input. Second, many of

the IME apps use unknown, proprietary protocols, which

makes it especially hard to analyze how they collect and

transform user input. Third, many of them utilize encryp-

tion, and their algorithms are also unknown. Therefore,

we eventually must treat the IME apps as black boxes

for current privacy-preserving techniques on mobile de-

vices, and users must either trust them completely (and

risk leaking their private data) or switch to the default

IME app (and lose the improved user experience).

At a high level, it would seem that existing techniques

such as taint tracking would be viable approaches to pre-

cisely tracking and containing sensitive input. For ex-

ample, TaintDroid [16, 17] and its follow-up work have

been shown to be very effective at tracking sensitive input and

detecting when it is leaked. However, the following

additional challenges still need to be overcome. First, current

IME apps tend to use excessive native code in their core

logic, and TaintDroid currently does not track tainted

data in native code. Second, it is a well-known problem

that data-flow-based tracking in taint-tracking systems

fails to capture control-based propagation. In fact, many of

the keystrokes are generated through lookup tables, as

reported in Panorama [46]. Third, sensitive information

is often composed of a sequence of keystrokes, making it

challenging to have a well-deﬁned policy to differentiate

between sensitive and non-sensitive keystrokes in TaintDroid.

Therefore, we must look for new techniques.

Our approach. In this paper, we present

I-BOX, an

app-oblivious IME sandbox that prevents IME apps from

leaking sensitive user input. In light of the opaque na-

ture of third-party IME apps, the key idea of

I-BOX is

to make an IME app oblivious to sensitive input by run-

ning IME apps transactionally;

I-BOX eliminates sensi-

tive data from untrusted IME apps when there is sensi-

tive input during this process. Speciﬁcally,

I-BOX checkpoints

the states of an IME app before an input transaction.

It then analyzes the user’s input data using a policy

engine to detect whether sensitive input is flowing

into an IME app. If so,

I-BOX rolls back the IME app’s

states to the saved checkpoint, which essentially makes

an IME app oblivious to what a user has entered.

Otherwise,

I-BOX commits the input transaction by discarding

the checkpoint, which enables the IME app to leverage

users’ input to improve the user experience.

One key challenge faced when building

I-BOX is

how to make the checkpointing process efﬁcient and

consistent, which is unfortunately complicated by An-

droid’s design, especially its hybrid execution (of Java

and C), multi-threading, and complex IPC mechanism

(e.g., Binder). Fortunately,

I-BOX addresses this chal-

lenge by leveraging the event-driven nature of an IME

app. More speciﬁcally, we present a novel approach by

creating the checkpoint at a quiescent point, in which its

execution states are inactive. Such a design signiﬁcantly

simplifies many issues such as handling residual states in

the local stack of native code, the Dalvik VM and IPCs.

We have implemented

I-BOX based on Android 4.2.2

running on a Samsung Galaxy Nexus smartphone. Per-

formance evaluations show that

I-BOX can checkpoint

and restore a set of popular third-party IME apps within a

very small amount of time, and thus causes little impact on

user experience. A security evaluation using a set of popular

IME apps shows that

I-BOX mitigates the leakage of

sensitive input. Case studies using a popular “benign”

IME app and a repackaged IME app conﬁrm that

I-BOX

accurately conforms to the predeﬁned security policies to

prevent sending of sensitive input data.


[Figure 2 diagram: the User, the Client Apps (EditText), the InputMethodManagerService, and the IME App (InputMethodService), linked by an InputConnection. Arrow labels: Touch Event, Plain Text, Invoke IME, Awaken, Start Input, Show Text.]

Figure 2: The workflow when using an IME app.

Contributions. In short, we make the following contri-

butions:

• New Problem. This is the first attempt to systematically

understand the threat caused by the leakage of

private sensitive keystrokes in third-party IME apps.

Our discovery shows the pervasive presence of such

attacks, and the seriousness of the problem.

• New Technique. We introduce oblivious sandboxing

for IME apps that embraces both security

and usability, and quiescent-point-based checkpoint/restore

that significantly simplifies the design

and implementation of

I-BOX.

• New System. We demonstrate a working prototype

of the techniques and a set of evaluations conﬁrming

the security threat of commercial IME apps and the

effectiveness of

I-BOX.

2 BACKGROUND AND MOTIVATION

In this section, we ﬁrst describe the necessary back-

ground on IME architecture in Android, and then discuss

why commercial IME apps have the incentive to collect

a user’s data, followed by case studies showing how

IME apps can leak users’ sensitive data to remote parties.

2.1 Input Method Editor

Though Android provides a default IME app for each

language, many end users prefer using third-party IME

apps for better user experiences, such as changing the

screen layout for faster input, generating personalized

phrases to provide intelligently associational input, and

providing more accurate translation from keystrokes to

the target languages. As a result, mobile operating sys-

tems such as Android provide an extensible IME infras-

tructure to allow third-party vendors to develop their own

IME apps.

Figure 2 gives an overview of the involved IME com-

ponents when entering text in a client app. Speciﬁcally,

third-party IME apps must conform to the IME frame-

work so that the Android Input Method Management

Service (IMMS) can recognize and manage them. For

example, every IME app contains a class that extends

from InputMethodService, which helps Android

recognize it as an input service and add it into the sys-

tem as an IME app. When an end user clicks a textbox

to invoke an IME app, Android IMMS will start the de-

fault IME activity and build an InputConnection

between the IME app and the client app that helps the

IME app to commit the user input to the client app. In

particular, the IME app first gets the touch event containing

the position data and translates it to meaningful

characters or words based on its keyboard layout and internal

logic. Then it sends the keystrokes to the client

app through InputConnection.

The IME architecture is clean with well-deﬁned

classes. This not only signiﬁcantly saves pro-

grammer’s effort in developing a new IME app,

but also makes it easy for attackers to locate

key points of a victim IME app. For instance,

our study found that simply hooking the function

BaseInputConnection.commitText can inter-

cept all the user’s input in many IME apps. This

can be done by simply searching for the keyword

BaseInputConnection.commitText in the de-

compiled code to locate all of its occurrences.
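
As a rough illustration of how little effort this keyword search takes, the following Python sketch scans a directory of disassembled smali files for call sites of commitText. It is not the tooling used in the study: the function name find_commit_text_calls and the regular expression are assumptions made for this example, and the pattern only covers the common smali invoke syntax.

```python
import re
from pathlib import Path

# Assumed pattern for a smali call site such as:
#   invoke-interface {v0, v1, v2}, Lsome/Class;->commitText(...)Z
COMMIT_TEXT = re.compile(
    r"invoke-\w+(?:/range)?\s+\{[^}]*\},\s*(L[\w/$]+;)->commitText\("
)


def find_commit_text_calls(root):
    """Return (relative file, line number, receiver class) per commitText call."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*.smali")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = COMMIT_TEXT.search(line)
            if match:
                relative = path.relative_to(root).as_posix()
                hits.append((relative, lineno, match.group(1)))
    return hits
```

Calling find_commit_text_calls on the output directory of a disassembler lists every file and line where committed text passes through, which is exactly the set of points an attacker (or an analyst) would inspect first.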

2.2 Why IME Apps Collect Users’ Input

Third-party IME apps usually extend the standard IME

apps with lots of rich features to provide a better user ex-

perience. Such features usually require collecting users’

input data to learn users’ habits to allow personalizing

IME apps. Further, such data may also collectively be

used to improve experiences of other users, i.e., push-

ing phrases learned from a set of users to others. In fact,

there are many features that require collecting user input

data. The following lists a few of them:

• Personal dictionary. Commercial IME apps usu-

ally remember the words and phrases from user

input to speed up follow-up input (especially for

non-Latin languages) by prompting potential results

when input is not ﬁnished. To achieve this, they

need to maintain a personal dictionary for each user

to save frequently typed or self-made words.

• Cloud input. As users usually have multiple devices

and need to synchronize the personal dictionary

among them, IME apps utilize cloud-based services

to store the dictionary and to synchronize the dic-

tionary and personal settings between different de-

vices.

Meanwhile, some non-Latin languages such as

those eastern languages differ from English in that

IMEs need to translate users’ keystrokes to words

in those languages. To accelerate input speed, IMEs


may usually need to leverage cloud services to ana-

lyze and predict users’ intended words based on the

current input.

In addition, for some Latin-based languages, some

IME apps provide a feature that leverages the cur-

rent input to predict the intended phrases and adjust

the layout of the soft keyboard to make the soft key

of the next character close to the user’s current finger.

To better predict user intent, some IME apps usually

leverage the abundant resources in the cloud to analyze

and predict user input. Meanwhile, they also collect

users’ habits to improve the accuracy of prediction.

• Search mediation. Some IME apps have a new

feature named “search mediation”, which intercepts

user input and returns some search results back to the

user. However, this means that user inputs will be

unrestrictedly sent to the search engine.

Note that due to the unstable network connectivity of

mobile devices, almost all IME apps can work properly

with and without network connections. When the network is

disconnected, an IME app may store current input (like

frequently used phrases) for later use when the network

connection is on. Besides, Android’s conﬁgurable per-

mission model indicates that an IME app usually works

normally even without grants of certain permissions.

2.3 Possible Threats Posed by IME Apps

While third-party IME apps do offer useful features and

better user experiences, they may unduly collect user

data or be repackaged to be malicious. Next, we study

the possible threats an IME app could pose.

Privacy leakage in “benign” IME apps. Conventional

wisdom is to trust a respected service provider, in the

hope that the provider will enforce policies in the cloud

to faithfully provide user secrecy [30]. Unfortunately,

this exposes users’ sensitive keystrokes to two threats.

First, a curious or malicious operator may stealthily steal

such data [47, 41], as evidenced by numerous

insider data theft incidents even at reputable companies [40].

Second, even reputable cloud providers provide

no guarantee on the security of user data, which is evi-

denced by their user agreements. Hence, it is reasonable

to not trust an IME app to securely protect users’ data.

More speciﬁcally, a severe threat from “benign” IME

apps is that they may have unduly collected user data

without users’ awareness. Given that we do not have

their source code and they often use proprietary proto-

cols with encryption, it thus remains opaque to end users

how the IME apps really handle the sensitive input data.

At a high level, since they have been collecting user data

for better experiences (especially the personal dictionary

and cloud input), it is highly likely that much of a user’s

sensitive input has been leaked to these IME providers.

To confirm our hypothesis, we conducted an experimental

study by performing a man-in-the-middle attack

on a popular IME app, namely TouchPal Keyboard

(version chubao 5.5.5.67049, cootek). This IME app

provides multiple rich functionalities such as cloud input

and a personal dictionary and has been installed

more than 7.09 million times from a third-party market.

By intercepting its network packets using Wireshark,

we found that its cloud input is implemented using an

HTTP POST command which carries several parameters

in plain text. Therefore, we are able to see how it works

without any protocol reverse engineering or packet decryption.

A deeper investigation revealed that these parameters

include a userid, the keycode that a user just

entered, and the existing words of the target input con-

trol that user is focusing on. This contradicts its privacy

statement of “No collection of personal information that

you type” in a prior statement, and thus poses a serious

threat to user privacy.

We suspect there may be many other commercial IME

apps that also leak users’ sensitive input. Currently,

we only used side-channel analysis [11] to analyze the

packet size between the IME apps and their servers. We

did notice there are notable differences in the number of

packets (as reported in §5.2).

Privacy leakage in malicious IME apps. Even if all

third-party IME apps did not leak any user’s private data,

there are still other attack vectors such as repackaging

attacks. In fact, a prior study uncovers that repackaged

malware samples account for 86% of all malware [49].

Moreover, there are also trojans that serve as key loggers

but masquerade as IME apps [29]. Finally, IME apps

may also be vulnerable to component-hijacking attacks.

It has been shown that input methods have been a popular

means to inject malicious code [29]. While currently

we are not aware of any repackaged malicious IME apps

in Android, we envision that there will be such malware

given the large popularity of the official apps and the ease

of repackaging them, as shown below.

To understand the repackaging threat of IME apps,

we conducted an attack study by repackaging a popu-

lar commercial IME app called Baidu IME, which has

been downloaded more than 100 million times in a third-

party market. In this study, we repackage the IME app by

inserting a malicious payload into the original program.

The payload records all user input and sends it to a

speciﬁc server.

Footnote (Wireshark): http://www.wireshark.org/

Footnote (TouchPal privacy statement): We noted that the newer versions of TouchPal changed their privacy

statement, indicating that they will collect user privacy data.

While the core logic of the Baidu IME app is written

using C, the other components are written in Java,

which enables easy reverse engineering of the

bytecode, especially with existing tools. Specifically,

we used baksmali [2], a popular Dalvik disassembler,

to reverse classes.dex into an intermediate

representation in the form of smali ﬁles. Then we

directly modiﬁed smali code to insert our payload,

which captures the text committed by the function

BaseInputConnection.commitText and then

sends the data out. A caveat in this study is that we found

it would not work if we simply repackaged the app be-

cause the IME app has a checksum protection. However,

the protection mechanism is rather simple, as it just calls

a self-crash function when detecting repackaging. How-

ever, the self-crash function is not self-protected and thus

we rewrote it to return directly to disable the protection.

We conducted our experiment in a contained environ-

ment and did not upload this repackaged IME app to any

third-party Android market, but attackers can easily do

this, as reported before [49, 48]. We installed this repackaged

IME app on our test smartphone and all data we

input through it was divulged. Our attack study shows

all critical data that a user inputs will be compromised if

the IME app is malicious. The popularity of third-party

markets aggravates this problem, especially considering

that 5% to 13% of apps are repackaged in a number of

third-party markets [48].

3 OVERVIEW

The goal of I-BOX is to protect users’ sensitive input,

while still preserving the usability of (curious or malicious)

IME apps such that users can still benefit from

the rich features. One possible approach might be let-

ting users switch to a trusted IME app when they want

to type some sensitive information. While this may work

for simple sensitive data like passwords, some users’ sensitive

input (like addresses and diseases) is scattered in a

long conversation. It is cumbersome for users to con-

stantly keep this in mind and do the switch. Another

intuitive approach would be to block all network con-

nections during user input, but doing so will negatively

affect the user experience. Besides, there are also other

channels like third-party content providers and external

storage where an IME app may temporarily store input data

to be leaked later. Therefore, we have to look for new ap-

proaches.

Approach overview. As discussed, the key challenges

of securely using third-party IME apps are that such apps

are usually closed-source and they may do arbitrary processing

and transformation of users’ input data before

sending it out. It is thus hard to model or predict their

behavior. Hence,

I-BOX instead treats an IME app as a

black box and makes it oblivious to users’ sensitive in-

put data. To achieve this,

I-BOX borrows the idea from

execution transactions by running an IME app transac-

tionally. Consequently, if an IME app touches users’

sensitive input data,

I-BOX will roll back the IME app’s

states to make it oblivious to what it has observed so as to

address the problem where an IME app stores and trans-

forms users’ input data.

I-BOX regards the user input process as a transaction,

which begins when a user starts to enter the input and

ends when the input session ends. A clean snapshot of an

IME app will be saved before an input transaction starts.

For normal input transactions without touching sensitive

input data,

I-BOX will commit the IME app’s state such

that the IME app can use these data to improve the user

experience. To prevent malicious IME apps from send-

ing private data out during the input transaction, the net-

work connection of the IME app will be restricted when

the current transaction is marked as sensitive. When an

input session ends and thus the client app has received

all user input,

I-BOX will abort the input transaction from

the view of the IME app, by restoring the IME app’s state

to a most-recent checkpoint. This makes the IME app

oblivious to the sensitive data it observed. Hence, even if

the IME app locally saves a user’s input to be sent later,

the input data will be wiped during restoring.
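
The commit/abort behavior described above can be pictured with a small user-space model. The sketch below is only an analogy under strong assumptions: the real system checkpoints a whole process from a kernel module, whereas here the "IME app" is a toy Python object (ToyIme) and the checkpoint is a deep copy of its state. All class and method names are hypothetical.

```python
import copy


class ToyIme:
    """Stand-in for an untrusted IME app that remembers everything it sees."""

    def __init__(self):
        self.state = {"dictionary": [], "pending_upload": []}

    def on_input(self, text):
        # The app both learns the text and queues it for later upload.
        self.state["dictionary"].append(text)
        self.state["pending_upload"].append(text)


class InputTransaction:
    """Checkpoint before input; commit or roll back when the session ends."""

    def __init__(self, ime):
        self.ime = ime
        self.checkpoint = None
        self.sensitive = False

    def begin(self):
        # Take a clean snapshot before the first keystroke.
        self.checkpoint = copy.deepcopy(self.ime.state)
        self.sensitive = False

    def mark_sensitive(self):
        # In the real design this is also where networking would be restricted.
        self.sensitive = True

    def end(self):
        if self.sensitive:
            # Abort: restore the snapshot so the app forgets what it observed.
            self.ime.state = self.checkpoint
            outcome = "aborted"
        else:
            # Commit: keep the new state and simply drop the snapshot.
            outcome = "committed"
        self.checkpoint = None
        return outcome
```

A normal session ends with "committed" and the learned phrases stay in the toy dictionary; a session flagged through mark_sensitive ends with "aborted", and both the dictionary entry and the queued upload disappear with the restored snapshot.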

As input data is provided in a streaming fashion by a

user, there is no general way to know the input stream

in advance. Because the IME app gets the input data

prior to

I-BOX, it would be too late to stop an IME app’s

leaking channels like network connection after it gets the

whole input since it may have sent it out or stored it locally.

Hence, it is generally impossible for any approach

to avoid leaking any user input before

I-BOX can determine if

the current input stream is sensitive or not.

As a result,

I-BOX chooses to use a combination of

context-based and policy-driven approaches based on the

state of the IME app, with the goal of striking a balance

between user experience and privacy. For speciﬁc input

such as passwords, which

I-BOX can determine through

input context,

I-BOX can immediately know they are

sensitive and thus constrains IME app’s behavior (like

blocking networking for the app). For general input,

I-BOX uses a state-machine based policy engine to

predict whether the current input transaction is sensitive.

This is done continuously during the input process,

where

I-BOX uses the current partial input stream to

determine if the next string is sensitive or not.

An architectural overview of

I-BOX is presented in

Figure 3.

I-BOX consists of an isolated user-level pol-

icy engine that decides whether

I-BOX shall commit or

roll back the execution of an IME app’s state. The sand-

box module is implemented as a kernel module, which

saves and restores the states of an IME app as needed.


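[Figure 3 diagram. User level: the IME App, the Client App (showing the input “My mail is alice@bob.com”), and the I-Box Daemon holding policy entries such as kevin1987@hotmail.com and ssn:642-38-1689. Kernel level: the Kernel Module, which provides Network Control and Checkpoint / Rollback.]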

Figure 3: An Architectural Overview of I-BOX

Challenges. To realize I-BOX, we are facing several

challenges. In particular:

• How to express and enforce security policies? As

users’ privacy policies are usually vague, it is criti-

cal to efﬁciently represent users’ policies such that

there won’t be a state explosion problem. This is

especially challenging to handle for non-Latin lan-

guages as they usually require an additional layer

of translation to represent them. Further, once the

policies are represented, it should also be relatively

easy to check the current input against the policies,

which is critical to the latency of the checking.

• How to efﬁciently perform the checkpoint and

rollback? As checkpoint and rollback are triggered

during input, lengthy checkpoint and rollback may

extend the latency of users’ input. However, traditional

checkpoint and rollback usually require either

expensive copying of applications’ states, or heavy-

weight recording of applications’ execution. For

example, prior checkpointing on server platforms

takes around 600ms without copying ﬁles [28].

• How to ensure consistency upon rollback? By

considering the user’s input process as a transac-

tion,

I-BOX can ignore the implementation details

of different IME apps and take them as normal pro-

cesses from the kernel’s viewpoint. However, there

are also intensive cross-layer and cross-component interactions

between an IME app and the rest of the

environment, like the Dalvik VM, the application

framework and the client app. Further, the IME app

is essentially multi-threaded. Hence, consistently

checkpointing and rolling back an IME app’s states

while preserving the states of other components is

another key technical challenge for

I-BOX.

[Figure 4 diagram: a sequence of steps among the Client App, IMMS, the IME App (inside the sandbox), the I-BOX Daemon, and the I-Box K-Module. Step labels: Invoke IME App, Checkpoint IME App state, Start Input, Input Data, Analyze Input Data, Network, Notify rollback, Rollback IME App state.]

Figure 4: I-BOX workflow.

Threat model and assumptions. As third-party IME

apps have the incentive to collect and send out users’ data

and some IME apps are even malicious (repackaged or

even faked),

I-BOX considers all third-party IME apps as

untrusted. However,

I-BOX trusts the underlying smart-

phone OS, including the OS kernel, system services and

any process with root or system privileges. Also, we as-

sume the user’s smartphone has not been rooted such that

the untrusted IME app cannot break the default security

isolation between different apps, especially for system

and user-level apps.

I-BOX relies on input contexts and a user’s policy to

distinguish private data from normal input data. It is pos-

sible that

I-BOX may leak sensitive user input if the policy

is incomplete or inaccurate, or the user’s intent has

changed after specifying the policy. Further, depending

on the state machine,

I-BOX may leak a preﬁx of some

sensitive input.

I-BOX also trusts the end user and relies on her as a

witness to prevent a malicious IME app from tampering

with the user’s input during typing. This should be easy

as she can tell the difference between what she typed and

what she observed from the input screen.

We consider the client app that uses the services from

an IME app as trusted. While a rogue or malicious client

app may also steal users’ sensitive input, a malicious

IME app causes more security impact than a malicious

client app as it leaks all user input to all client apps (in-

cluding system apps) in contrast to only input to a spe-

ciﬁc (third-party) client app. How to protect third-party

client apps is out of the scope of this work and many

prior efforts have intensively studied solutions to prevent

information leakage from apps [24, 51, 34].


4 DESIGN AND IMPLEMENTATION

The workflow of I-BOX is illustrated in

Figure 4. Speciﬁcally,

I-BOX intercepts a user’s input

data by placing hooks into Android Input Method Ser-

vice (IMS) and detects the sensitive data from the in-

put stream based on the policy engine.

I-BOX uses

both context-based and preﬁx-matching policies (§4.1)

and enforces them using transactional execution (§4.2)

to protect sensitive data such as passwords. Before div-

ing into the details how we design and implement

I-BOX,

we ﬁrst use a running example to illustrate h ow it really

works.

A running example. Assume a sensitive string “IsUsenixSec2015” is being typed by a user through an IME app. I-BOX first makes a checkpoint of the IME app as a clean snapshot before input. If this string is being typed into a password textbox (context-based policy), I-BOX immediately knows that the string is sensitive and restricts the IME app’s behavior (such as stopping network connections). Otherwise, I-BOX intercepts the characters and runs them through the policy engine. After getting the characters ‘I’, ‘s’, ‘U’, and ‘s’, I-BOX predicts that the user may be typing the sensitive string “IsUsenixSec2015” and restricts the IME app’s behavior immediately to prevent it from sending further keystrokes out (prefix-matching policy). Afterwards, the IME app continues to accept input from the user’s typing, and I-BOX monitors the file operations of the IME app to record the files that may log the input data. After the user finishes typing, I-BOX confirms that a sensitive string was typed into the IME app and restores the state of the IME app from the checkpoint to clean the sensitive string out.

4.1 Policy Engine

The policy engine of I-BOX separates sensitive input from normal input such that different policies can be applied to different types of data. I-BOX uses both context-based and prefix-matching strategies to derive policies, with the first strategy having higher priority.

Context-based policy. We first provide an automated approach to deriving which input is sensitive based on the type of the input and the execution context of an app. Specifically, Android uses text fields to help the user type text into client apps. Text fields can have different input types, such as numbers, dates, passwords, or email addresses. In fact, the type information of text fields in the client app is already used to help an IME app optimize its layout for frequently used characters. I-BOX also leverages the type information of the text fields to decide whether the input is sensitive; passwords and email addresses are sensitive by default. In addition, based on the user-defined per-client-app policy (e.g., an IME app is providing services to a banking application), I-BOX automatically treats all input consumed by a sensitive app as sensitive according to context [44].
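The context-based decision can be sketched as follows. This is a minimal illustration under stated assumptions, not I-BOX's actual code: the function name `is_sensitive_context` and the `sensitive_apps` set are hypothetical, and the numeric constants are intended to mirror those of `android.text.InputType` and should be checked against the SDK before use.

```python
# Constants intended to mirror android.text.InputType (assumption: verify
# the values against the Android SDK before relying on them).
TYPE_MASK_CLASS = 0x0000000F
TYPE_MASK_VARIATION = 0x00000FF0
TYPE_CLASS_TEXT = 0x1
TYPE_CLASS_NUMBER = 0x2
TYPE_TEXT_VARIATION_EMAIL_ADDRESS = 0x20
TYPE_TEXT_VARIATION_PASSWORD = 0x80
TYPE_TEXT_VARIATION_VISIBLE_PASSWORD = 0x90
TYPE_TEXT_VARIATION_WEB_EMAIL_ADDRESS = 0xD0
TYPE_TEXT_VARIATION_WEB_PASSWORD = 0xE0
TYPE_NUMBER_VARIATION_PASSWORD = 0x10

# Text variations treated as sensitive by default (passwords and emails).
SENSITIVE_TEXT_VARIATIONS = {
    TYPE_TEXT_VARIATION_EMAIL_ADDRESS,
    TYPE_TEXT_VARIATION_PASSWORD,
    TYPE_TEXT_VARIATION_VISIBLE_PASSWORD,
    TYPE_TEXT_VARIATION_WEB_EMAIL_ADDRESS,
    TYPE_TEXT_VARIATION_WEB_PASSWORD,
}


def is_sensitive_context(input_type, client_package, sensitive_apps):
    """Return True if the text field or the client app marks input as sensitive."""
    # Per-client-app policy: everything typed into a sensitive app is sensitive.
    if client_package in sensitive_apps:
        return True
    field_class = input_type & TYPE_MASK_CLASS
    variation = input_type & TYPE_MASK_VARIATION
    if field_class == TYPE_CLASS_TEXT:
        return variation in SENSITIVE_TEXT_VARIATIONS
    if field_class == TYPE_CLASS_NUMBER:
        return variation == TYPE_NUMBER_VARIATION_PASSWORD
    return False
```

A password field in any app and a plain text field in a user-listed banking app both evaluate to sensitive, while a plain text field in an unlisted app falls through to the prefix-matching policy.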

Prefix-matching policy. For general input streams, I-BOX leverages prefix matching to determine whether an input stream is sensitive. One challenge in defining policies for I-BOX is that IME apps may need to handle multiple languages, including both Latin and non-Latin languages. For non-Latin languages, I-BOX can only get the text in the target language after an IME app has translated the keystrokes into the corresponding text. Hence, it is not viable to simply use keystrokes to represent the current input. To address this problem, I-BOX instead uses the UTF-8 (8-bit Unicode Transformation Format) encoding of the translated keystrokes to represent both the current keystrokes and those in the policy engine.

As there may eventually be multiple data instances that should be considered sensitive, I-BOX uses a trie-like structure to maintain them. A trie-like structure is very space-efficient for data with common prefixes and is very efficient for lookup. I-BOX maintains a global trie structure to represent the global policy. I-BOX may also provide an application-specific trie structure if an end user demands a stricter policy. During a query, I-BOX queries the global and application-specific trie structures in parallel but prefers application-specific policies over the global one.

While much of the sensitive data, like contacts and cookies, can be automatically translated to the trie structure as the default policies, I-BOX also allows end users to use regular expressions when they manually specify the policy. For example, a user may define “abc*” to mark any word starting with “abc” as sensitive input. Associated with each regular expression, there is also an acceptable disclosure rate (ADR), which defines the fraction of characters in an input stream that may be exposed. The larger the ADR, the more information may be leaked but the more opportunity there is for cloud assistance. Regular expressions make it easy for experienced users to specify sensitive input: they do not require users to fully remember all such sensitive data and thus match users’ ambiguous and incomplete memory. This also avoids asking users to input full secrets to I-BOX. Alternatively, average users may specify full secrets (i.e., a special case of a regular expression) to I-BOX.

I-BOX provides a simple script to add such regular expressions to the trie-like structure and report any conflicts if they occur. For example, for a sensitive string of 15 characters (such as ‘IsUsenixSec2015’) and an ADR of 0.2, I-BOX will restrict an IME app’s behavior when the first three characters (‘IsU’) are typed. I-BOX runs the trie structure as a state machine to predict the input stream by matching the typed characters against the trie structures. Since any substring of the input data may be sensitive, I-BOX needs to check all of them. To speed up this process, I-BOX searches all possible substrings when a new character is typed. Intermediate states are maintained so that only the new character needs to be handled, rather than every new substring it forms.
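The trie-as-state-machine idea can be illustrated with a small sketch. It is not the authors' implementation: the class and method names (`PrefixPolicy`, `add`, `feed`) are hypothetical, secrets are stored as plain strings rather than UTF-8 byte sequences, the ADR is assumed to translate into a trigger depth of floor(ADR × length) characters (at least one), and cursor movement and deletion are ignored.

```python
class _Node:
    """One trie node; `trigger` marks the depth at which to restrict the IME."""
    __slots__ = ("children", "trigger")

    def __init__(self):
        self.children = {}
        self.trigger = False


class PrefixPolicy:
    def __init__(self):
        self.root = _Node()
        # Trie nodes reached by substrings ending at the latest character.
        self.active = []

    def add(self, secret, adr):
        """Insert a secret; at most floor(adr * len(secret)) chars may leak."""
        limit = max(1, int(adr * len(secret) + 1e-9))  # epsilon guards float error
        node = self.root
        for depth, ch in enumerate(secret, start=1):
            node = node.children.setdefault(ch, _Node())
            if depth == limit:
                node.trigger = True

    def feed(self, ch):
        """Advance every partial match by one character.

        Returns True when some match has reached its trigger depth, i.e.
        the IME app should be restricted from this point on.
        """
        hit = False
        next_active = []
        # Starting again from the root lets a secret begin at any position.
        for node in self.active + [self.root]:
            child = node.children.get(ch)
            if child is not None:
                next_active.append(child)
                hit = hit or child.trigger
        self.active = next_active
        return hit
```

With `add("IsUsenixSec2015", 0.2)`, feeding the characters of `"xxIsU"` one at a time returns False four times and True on the final `U`, matching the three-character example in the text. Keeping the list of active nodes is what allows each keystroke to be processed once instead of re-scanning every substring.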

Note that currently

I-BOX directly searches over the

plain text of the policy ﬁle and relies on the Android per-

mission system to protect it for simplicity. This can be

further enhanced by encrypting the policy ﬁle and us-

ing regular expressions to search over the encrypted ﬁle,

which was shown to have small runtime and space over-

head [36].

Prefix-substitution attacks. At first glance, the prefix-matching policy used by I-BOX would appear to be vulnerable to a prefix-substitution attack by a malicious IME app. Specifically, a malicious IME app might first replace the prefix of a typed string with a non-sensitive one so that I-BOX would not recognize this prefix, and thus no oblivious sandbox would be applied for this input session. Fortunately, we note that users, the ultimate witnesses, would immediately notice this by observing the difference between what they typed and what was displayed on the screen.

Note that as I-BOX monitors all keystrokes sent from IME apps to user apps, I-BOX will adjust the state machine accordingly for any cursor movement and special characters like deletion. This can detect the case where a malicious IME app stealthily moves the cursor to deceive I-BOX about the input sensitivity.

Overall, I-BOX requires the user to compare what she types with what she observes in order to detect malicious behavior from an IME app. If a user does not pay enough attention to the input process, a malicious IME app may still have the chance to fool I-BOX about the sensitivity of the input streams.

4.2 Enabling Transactional Execution

To enable transactional execution of an IME app, I-BOX needs to provide a checkpoint and rollback mechanism. The key challenges here lie in how to provide low latency and ensure consistency, which are made especially difficult by Android’s unique design. For example, Android uses a Dalvik virtual machine (VM) to run the Java code of the IME app, which interacts intensively with the application framework. Further, the native code of an IME app also interacts with the Dalvik VM through the Java Native Interface (JNI). Finally, Android intensively uses Binder, a complex IPC mechanism for communication among isolated apps. Such hybrid execution and complex communication make it hard to efficiently and consistently checkpoint the state of an IME app.

I-BOX addresses the above challenges by leveraging a set of quiescent points. A quiescent point is a point at which all threads of an application have stopped execution and there are no pending states or requests to be processed. Checkpointing at quiescent points frees I-BOX from handling a number of subtle states, such as residual state on the stack or in other communication peers. Further, it also requires less state to be checkpointed. Finally, when I-BOX rolls back the state of an IME app, the state can be restored consistently without having to deal with subtle residual states in other apps.

In the following, we describe in greater detail how

we choose the quiescent points (§4.2.1), how

I-BOX per-

forms the checkpoint and restore of the local states of an

IME app (§4.2.2), and how

I-BOX handles interactions

of an IME app with others through IPCs (§4.2.3).

4.2.1 Quiescent Points

Our key observation is that an IME app is essentially an event-driven app that provides services to the client app. Consequently, it is usually at a quiescent point when a user is not typing, as no event will be delivered to the IME app at that time. In this state, the IME app’s states are stable and consistent. Thus, I-BOX is relieved from handling many complex and subtle local states. To achieve this, I-BOX first checks whether an IME app is at a quiescent point by checking the process and thread states (sleeping or not) and the IPC states. The check is very likely to succeed in most cases. Even if the IME app refuses to cooperate with I-BOX and keeps itself busy, I-BOX can first wait a short time and then enforce a quiescent point by blocking new requests and forcing the IME app to sleep so that the checkpoint can be taken. Here, a non-cooperative IME app could also be a sign of being malicious. However, we never encountered this case, as the IME apps we tested always conform to the Android IME architecture. Even if this occurs, I-BOX can always roll back the IME app to a clean state checkpointed earlier.
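The quiescence test described above (all threads sleeping, no IPC pending) can be sketched in user space. I-BOX performs this check inside the kernel; the version below is only an illustration that parses lines in the format of Linux's `/proc/<pid>/task/<tid>/stat`, where the scheduling state is the field following the parenthesized command name. The function names and the `pending_ipc` counter are hypothetical.

```python
def thread_state(stat_line):
    """Extract the one-letter scheduling state from a /proc stat line.

    The command name is enclosed in parentheses and may itself contain
    spaces or parentheses, so the state is taken as the first field
    after the last ')'.
    """
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    return rest[0]


def is_quiescent(stat_lines, pending_ipc):
    """True if every thread is in interruptible sleep ('S') and no IPC
    request is outstanding (pending_ipc is a hypothetical counter that a
    Binder hook would maintain)."""
    all_sleeping = all(thread_state(line) == "S" for line in stat_lines)
    return all_sleeping and pending_ipc == 0
```

A caller could poll this for a short interval and, if it never becomes true, fall back to blocking new requests and forcing the app to sleep, as the text describes.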

4.2.2 Checkpointing and Restoring Local States

Since data typed by a user can be stored anywhere in the IME app in any form, all process states must be restored in order to wipe out any sensitive data. The traditional way of doing checkpoints is to copy all related process states to storage, which is very heavyweight and would incur long latency. As the main purpose of I-BOX’s checkpoint is to be either rolled back or discarded later, I-BOX chooses a lightweight approach to checkpointing, which creates a shadow process and then tracks all later changes by using the copy-on-write (COW) features provided by Linux.

Saving and restoring file states. As typical IME apps usually modify only a small number of files during one input transaction, I-BOX currently records and copies such files during checkpointing and restores them during rollback for simplicity. Another option is using a COW file system like Btrfs or ext3cow to avoid copying. This requires replacing Android’s file system with one that has a COW feature, which we leave as future work.

Android provides several options to save persistent application data. Based on where the data is stored, we can divide these options into two categories: internal and external storage. Every Android app is assigned a private directory in the internal storage to store files and data. By default, data saved to the internal storage is private to an app, and other apps cannot access it (nor can the user). I-BOX simply copies all files in the IME app’s private directory and then restores the files modified during the input transaction upon rollback. Since there are usually only a small number of files in the internal storage for an IME app, and even fewer modified ones, the time cost is negligible.

For external storage, any Android app with proper permissions (e.g., android.permission.WRITE_EXTERNAL_STORAGE) can access the whole external storage. It would take very long if I-BOX scanned the whole external storage to find the modified files. Hence, I-BOX records all the files modified by the IME app during the input transaction and then restores them as needed. Specifically, once I-BOX detects that the IME app tries to write data into a file, it duplicates the file for subsequent restoring.
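The copy-before-first-write scheme for external storage can be sketched as follows. This is an illustrative user-space model only: the class `FileCheckpoint` and its methods are hypothetical, and the hook that would call `before_write` when the IME app opens a file for writing is assumed rather than shown.

```python
import os
import shutil
import tempfile


class FileCheckpoint:
    """Back up each file the first time it is about to be written."""

    def __init__(self):
        # In I-BOX the backups are owned by a system process; here a
        # temporary directory stands in for that protected location.
        self.backup_dir = tempfile.mkdtemp(prefix="ckpt-")
        self.saved = {}  # original path -> backup path, or None if new file

    def before_write(self, path):
        """Call before the monitored app writes to `path`."""
        path = os.path.abspath(path)
        if path in self.saved:
            return  # already backed up during this input transaction
        if os.path.exists(path):
            backup = os.path.join(self.backup_dir, str(len(self.saved)))
            shutil.copy2(path, backup)
            self.saved[path] = backup
        else:
            self.saved[path] = None  # file did not exist at checkpoint time

    def rollback(self):
        """Restore every touched file, then discard the backups."""
        for path, backup in self.saved.items():
            if backup is None:
                if os.path.exists(path):
                    os.remove(path)  # created during the transaction
            else:
                shutil.copy2(backup, path)
        shutil.rmtree(self.backup_dir)
        self.saved.clear()
```

Only files that are actually written are copied, which is why the whole external storage never needs to be scanned. Removing the backups after rollback mirrors the paper's note that checkpointed files are deleted once the IME app has been rolled back.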

Note that as the checkpointed ﬁles are created by

I-BOX, which runs as a system process, the ﬁles are with

system privilege and thus cannot be read/written by the

IME app itself. This ensures that an IME app cannot ﬁrst

save sensitive key logs into such ﬁles and later read them

out. Actually,

I-BOX also removes the checkpointed

ﬁles after rolling back an IME app.

Saving and restoring memory states. Memory states include the IME app process’s data in memory and process-related metadata maintained by the OS kernel (i.e., Linux). Linux uses many data structures to manage a process and maintain its state, such as task_struct, thread_info, and others. I-BOX relies on a kernel module to save and restore such data structures. Specifically, this module maintains a shadow process in the kernel to store the data of each running IME app. The shadow process duplicates the process states of the original IME app by copying the metadata of the IME app into its own task_struct, but with some modifications for consistency. For example, it has its own kernel stack and redirects the stack pointer in the task_struct to its own stack, although the content on the stack is the same as in the original IME app. For independent states like the process ID or kernel stack, I-BOX just copies the data into a buffer and writes it back later. As for states connected with other processes or events, like a pipe or wait queue, I-BOX needs to record the states and the relationships so that it can recover them correctly later. Besides this, I-BOX also needs to save the process memory. Instead of actually copying the memory pages, I-BOX simply creates a shadow page table that shares the memory with the target IME app process and marks the page table of the target process as COW. This avoids a lot of unnecessary page copying, since most pages will not be modified during the input transaction, and restoring the memory only requires switching the page table root, which is very fast. This helps reduce the stop time of each IME app process when I-BOX performs checkpoint and restore.
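The shadow-page-table idea can be illustrated with a toy model. It is a deliberate simplification and not kernel code: the class `CowMemory` is hypothetical, pages are whole Python `bytes` objects, writes replace an entire page, and frames orphaned by a rollback are never reclaimed.

```python
class CowMemory:
    """Toy model of checkpointing via a shadow page table with COW."""

    def __init__(self, pages):
        self.frames = list(pages)              # "physical" frames
        self.table = list(range(len(pages)))   # live table: page -> frame
        self.shadow = None                     # table saved at checkpoint

    def checkpoint(self):
        # Copy only the page table; the frames themselves stay shared.
        self.shadow = list(self.table)

    def read(self, page):
        return self.frames[self.table[page]]

    def write(self, page, data):
        shared = self.shadow is not None and self.table[page] == self.shadow[page]
        if shared:
            # First write since the checkpoint: give the live table a
            # private frame so the checkpointed frame stays untouched.
            self.frames.append(data)
            self.table[page] = len(self.frames) - 1
        else:
            self.frames[self.table[page]] = data

    def rollback(self):
        if self.shadow is None:
            raise RuntimeError("no checkpoint to roll back to")
        # Restoring is just switching back to the saved table.
        self.table = self.shadow
        self.shadow = None
```

The cost of `checkpoint` and `rollback` depends on the size of the table rather than on the amount of memory, and only pages written during the input transaction are ever duplicated, which is the property the paragraph above relies on.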

Multi-thread rollback. Most Android applications run in the Dalvik virtual machine and have multiple threads for different purposes. Besides the main thread for UI and the core logic of the IME app, there are about another 10 threads for garbage collection, event handling, Binder IPC, and so on. To roll back the process states of an IME app correctly, I-BOX needs to deal with such threads properly. Linux assigns a task_struct to a thread just like a process to maintain its state, and groups all threads belonging to one process together through a list. So I-BOX saves each thread with a separate shadow process and groups these processes together through a list to maintain their parent-child relationships just like the original one. Resources shared between threads are duplicated too. For example, I-BOX will save the pipe states between two threads and restore them later.

4.2.3 Handling IPCs

One major challenge I-BOX faces in checkpoint and rollback is how to deal with the IPC states of an IME app process. An IPC involves multiple processes or even multiple machines, but I-BOX can only control one end of the communication. One potential problem is that the other side of an IPC may wait for a reply that will never be sent, since the IME app process has forgotten this request after rollback. Another serious problem is that the client app may communicate with an inactive IPC that has been erased from the IME app process due to rollback.

As a result, I-BOX needs to find proper timing to do checkpoint and rollback such that the consistency of an IPC is not violated. Proper timing requires several conditions. First, there should not be any data in transmission between two processes; otherwise it will lead to a corrupted request with incorrect semantics. Second, there should be no pending IPC requests. This means an IME app shall wait for all replies before doing a checkpoint and ensure that no request to the process is pending before rollback. Fortunately, it is not hard for I-BOX to find suitable timing because I-BOX only does checkpoint and rollback when a user is not typing. In most cases, the IME app processes should be sleeping at that point. If not, we can safely enforce it without disturbing other client apps since the user is not typing.

Inter-thread IPC. Linux provides a set of IPC mechanisms such as pipes, sockets, and shared memory. Android inherits such mechanisms but only uses them as a method for communication between threads within a single process. Hence, I-BOX can control both ends of these inter-thread IPCs, which avoids the inconsistency issues due to unilateral actions. For example, the two communicating parties of a pipe in a single process hold a pair of pipe file descriptors; the OS kernel allocates a buffer for them to pass messages. To restore the pipe correctly, I-BOX just keeps a record of the current pipe status and its buffered data, then restores it as needed. There is no restriction on the timing for checkpoint and rollback. Other IPCs within the same process are handled similarly.

Android Binder. Android heavily uses its own IPC mechanism, Binder, which helps the Android permission system provide access control to Android services and resources. By mapping kernel memory into user space, Binder IPC requires only one data copy per transmission, i.e., from the sender’s user space to the kernel buffer of the Binder driver. The receiver can then directly read the data from its read-only user space mapping, which is more performance-friendly. There are two issues I-BOX needs to take care of for a consistent restore of the Binder. More specifically:

• Reference counting for Binder proxies. An Android app uses Binder proxies (e.g., BBinder, BpBinder) as references to remote processes instead of simple file descriptors. The Binder driver in the kernel needs to manage the reference counters for such proxies so that it knows whether a Binder instance is still in use. I-BOX needs to track and record modifications to references to Binder proxies so that it can keep the reference counters consistent.

• Conversation between the Binder request and response. I-BOX also needs to keep the conversation between a Binder transaction request and its response intact. As an Android service provider, an IME app process accepts a Binder transaction request from the client app and sends back the transaction response after handling the request. To achieve this, I-BOX tracks transaction requests and responses to find the right timing, when all requests have been handled. It is not hard to find such a point because I-BOX usually tries to do checkpoint or rollback when the IME is idle without new requests.
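The two Binder concerns above amount to bookkeeping that can be sketched in a few lines. The class below is a hypothetical model, not the Binder driver interface: `txn_id` and `handle` are stand-in identifiers, and the hooks that would call these methods from the driver are assumed.

```python
from collections import Counter


class BinderTracker:
    """Track in-flight transactions and reference-count changes."""

    def __init__(self):
        self.pending = set()        # requests received but not yet answered
        self.ref_delta = Counter()  # net refcount change per proxy handle

    def on_request(self, txn_id):
        self.pending.add(txn_id)

    def on_reply(self, txn_id):
        self.pending.discard(txn_id)

    def on_ref_change(self, handle, delta):
        """Record an increment (+1) or decrement (-1) since the checkpoint."""
        self.ref_delta[handle] += delta

    def is_quiescent(self):
        """Safe to checkpoint or roll back: every request has been answered."""
        return not self.pending

    def compensation(self):
        """Adjustments to apply on rollback so that the driver's counters
        match their values at checkpoint time."""
        return {h: -d for h, d in self.ref_delta.items() if d != 0}
```

For instance, if a proxy handle gained two references during an input session, `compensation()` reports `-2` for that handle, while handles whose increments and decrements cancelled out need no adjustment.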

Content Provider. An IME app may also interact with both third-party and system content providers. For example, our analysis of the TouchPal IME app reveals that this app accesses third-party content providers like content://com.tencent.mm.sdk.plugin.provider/sharedpref and content://com.facebook.katana.provider.AttributionIdProvider; our analysis of the Guobi IME app shows that this app accesses content://com.iflytek.speechcloud.providers.LocalResourceProvider and content://com.tencent.mm.sdk.plugin.provider/sharedpref. TouchPal accesses system content providers like content://sms/inbox, and both TouchPal and Guobi access content://telephony/carriers/preferapn. Since all requests to content providers in Android are issued through the Binder mechanism, we rely on the Binder mechanism to detect a quiescent point. Fortunately, we note that accesses to content providers are request-oriented and thus connectionless. Thus, there are no requests in flight, and I-BOX can checkpoint such states accordingly.

Network. Unlike Binder, the network driver does not expose any semantic information about upper-layer connections. Hence, it seems hard to maintain the consistency of requests and responses between an IME app process and its cloud-based server. Fortunately, two observations help relax the strict consistency requirement. First, network connections between an IME app and the cloud-based server, such as fetching words by sending keystrokes, synchronizing the user’s library, and downloading news or advertisements, are usually stateless and non-transactional; a redo operation does not cause any consistency issues. Second, network connections during input transactions are mostly short-lived synchronous requests that finish when input is done; hence they will not be affected by rollback.

Lessons Learned. While it is generally hard to checkpoint a complex app like an IME app, the event-driven nature of IME apps greatly helps simplify the design and implementation of I-BOX. By leveraging a quiescent-point-based approach and conducting checkpointing at a time at which an IME app is likely to be quiescent (e.g., before an input session starts), I-BOX enjoys both lower implementation complexity and lower runtime overhead.

4.3 Restricting IME Apps’ Behavior

When I-BOX detects a sensitive input session, it needs to restrict an IME app’s behavior such that no sensitive data is leaked during this process. A malicious IME app may leverage various means to store and transform the data during this process. For example, it may directly send input data to the network, or store the input data in a content provider to be retrieved and sent out later. To this end, I-BOX needs to restrict an IME app’s behavior to close such channels for a sensitive input stream.

I-BOX blocks an IME app from using the network and limits its access to content providers and services during a sensitive input session. Specifically, during a sensitive input session, I-BOX grants an IME app only read access (like query) to such content providers and services. This is done by interposing on binder_transaction and acting according to the access type given by the transaction code (i.e., query, insert, update or delete).
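The restriction rule can be written down as a small decision function. This is a sketch of the policy only, with hypothetical names (`allow_transaction`, `allow_network`); the real enforcement point is the kernel's Binder transaction path and the network control module, neither of which is modeled here.

```python
# Operations that only read from a content provider or service.
READ_ONLY_OPS = {"query"}


def allow_transaction(op, sensitive_session):
    """Decide whether a content-provider operation may proceed.

    Outside a sensitive session everything is allowed; inside one, only
    read-only operations pass, so the IME app cannot stash typed input in
    a provider (via insert/update/delete) for later exfiltration.
    """
    if not sensitive_session:
        return True
    return op in READ_ONLY_OPS


def allow_network(sensitive_session):
    """Network access is cut for the duration of a sensitive session."""
    return not sensitive_session
```

Because reads stay permitted, an IME app can still look up dictionary or contact data during a sensitive session, which helps it keep functioning while the write and network channels are closed.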

One potential issue is that the IME app may not function correctly without such accesses. Fortunately, most Android apps (including IME apps) are designed to work gracefully with different permissions, because the user may grant different permissions and an IME app may have to work without network access. As a result, it is non-intrusive to dynamically deprive the IME app of certain accesses, as evidenced by prior research on dynamic permissions on Android [32]. After a rollback, as all residual states inside an IME app have been cleaned, any pending actions like insertion or deletion are cleared as if they never happened. Thus, there will not be any confusion for the content providers and services.

5 EVALUATION

We have implemented I-BOX based on Android 4.2.2 and Linux kernel OMAP 3.0.72. It consists of two main parts: i) a user-level modification of the Android application framework to insert the I-BOX policy engine and network control module; ii) a kernel module to handle checkpoints and rollback of IME apps.

Experimental Setup. All of our experiments were performed on a Samsung Galaxy Nexus smartphone with a 1.2 GHz TI OMAP4460 CPU, 1 GB of memory, and 16 GB of internal storage. We evaluate I-BOX using 11 popular IME apps to measure its performance overhead. The 11 IME apps (as shown in the first column of Table 1) are ranked among the highest in popularity in a large third-party market (http://www.wandoujia.com/). Many of these IME apps have been installed millions of times (Figure 1). In our testing, we set the security policies to include all contacts in the phone and all commonly used accounts and passwords. This forms a trie containing around 400 words.

5.1 Performance Evaluation

The time overhead of I-BOX comes from three parts: (1) time to find the quiescent points; (2) time to perform memory checkpoint and rollback; and (3) time to perform file save and restore. To measure the performance overhead, we asked a volunteer with an average typing speed of about 100 characters per minute to enter a 10-word paragraph in an SMS app using the tested IME apps. We did not use an automation tool like Android Monkey, as it cannot handle the complex UI of these IME apps.

Latency. As shown in Table 1, the time to find a quiescent point is very small (less than 14 ms). This confirms our observation that it is very easy and fast to find or force a quiescent point to do checkpoint and rollback on an IME app. The time for saving and restoring an IME app’s memory state is also very small (less than 29 ms), since I-BOX does not actually copy the whole memory but just marks it as COW. Based on the files touched by the IME app process during typing, I-BOX needs to restore a few files to prevent the IME app from concealing the secret inside files. Hence, the time for file save and restore is somewhat longer (up to 60 ms), which can be further improved by using a copy-on-write file system. In total, the maximum time to do a checkpoint (including finding a quiescent point) is less than 103 ms (14 ms + 29 ms + 60 ms). In contrast, the world record for texting is typing a complicated 25-word message (159 characters) in 25.94 seconds [5], which corresponds to 163 ms/character and 1.0376 seconds/word. Hence, the time to do checkpointing is very small compared to user typing. As the time to search the trie is negligible, we do not report it here.
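The arithmetic behind this comparison can be reproduced directly from the figures quoted above. The variable names are ours; the only added quantity is the per-character time of the volunteer, derived from the stated 100 characters per minute.

```python
# Upper bounds reported for one checkpoint, in milliseconds.
quiescent_ms, memory_ms, file_ms = 14, 29, 60
worst_case_checkpoint_ms = quiescent_ms + memory_ms + file_ms

# Texting world record cited in the text: 159 characters (25 words) in 25.94 s.
record_ms_per_char = 25.94 / 159 * 1000
record_s_per_word = 25.94 / 25

# Volunteer in the experiment: about 100 characters per minute.
volunteer_ms_per_char = 60_000 / 100

print(f"worst-case checkpoint: {worst_case_checkpoint_ms} ms")
print(f"record typing speed:   {record_ms_per_char:.0f} ms/char, "
      f"{record_s_per_word:.4f} s/word")
print(f"volunteer typing:      {volunteer_ms_per_char:.0f} ms/char")
```

Even at record typing speed, the worst-case checkpoint costs less than the time to type one character, and checkpoints are taken once per input session rather than once per keystroke.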

Power. To measure the power overhead incurred by I-BOX, we used the TouchPal IME to input an article and its non-Latin translation into a text-note app called Catch and recorded the battery level. The total input process spans 30 minutes for both unmodified Android and I-BOX-capable Android. We found that in both cases the battery level dropped from 100% to 99%, so the difference was indistinguishable. This is probably because the IME app is not power-hungry and the additional power consumed by I-BOX was offset by the reduced network transmissions, which makes the difference hard to distinguish without a highly accurate power meter. In the future, we plan to further characterize the power consumption using an accurate power meter.

5.2 Security Evaluation

Here, we evaluate whether I-BOX indeed mitigates the leakage of a user’s sensitive keystrokes. We again use the IME apps from our performance testing, along with a repackaged malicious IME app (described in §2.3), to evaluate its effectiveness. According to the accessibility of these IME apps, we conducted three sets of experiments to determine effectiveness: black-box testing, gray-box testing, and white-box testing.

                Quiescent          Memory             File
IME app       C (ms)  R (ms)    C (ms)  R (µs)    C (ms)  R (ms)
Sogou          13.3    14.1      22.8     91        30      30
Baidu           8.2    11.1      22.6    275        40      40
QQ             12      11.8      24.3     31        60      30
Pinyin         11.8    12        20.8    122        10      10
Vee             5.9    10.3      0.022    61        20      20
Guobi           7.4     9.5      25.5     61        10      10
Octopus        11.4    11        28.9    245        30      20
iFlytek         4.6     9.7      13.9     92        10      10
Slideit        13.2    15.2      13.5    152        20      30
Jinshou         3.1     6.5      28       91        60      50
TouchPal        7.8    13.3      22.1    183        30      30
Baidu∗          3.3    10.9       9       61        30      40
Average         8.4    11.3      22.5    140        28.3    26.7

Table 1: Time overhead for finding a quiescent point, doing checkpoint (C) and rollback (R).

5.2.1 Black-box Testing

Since most of the IME apps use proprietary, unknown protocols with unknown encryption, we cannot directly trace the network packets to confirm our effectiveness. Therefore, we take a black-box approach to approximate our result. That is, instead of inspecting the packet contents, we compare the packets sent by the IME apps with I-BOX and without I-BOX, within an identical experiment setup and time window.

In particular, we ran all these apps using a two-minute time window, typed around 30 non-Latin words with “[email protected]” as the sensitive word, and then observed the packet differences using the Wireshark tool. Usually, these IME apps send some packets out when a user types something that triggers the cloud input function. Interestingly, we found that 6 out of the 11 tested apps send a different number of packets, as shown in Table 2. With I-BOX enabled, fewer packets are sent out than in the normal case. This is because I-BOX controls the network of the target IME app when it detects sensitive input data and prevents the target IME app from leaking the data out.

While such side-channel-based black-box testing cannot fully confirm that we have prevented all leaks, we believe it is highly likely that I-BOX has stopped them, even for the other 5 apps for which we did not observe packet differences. (It is highly likely that these IME apps have buffered the input with the intent to send the data out later. However, our oblivious sandboxing mechanism will clear the buffered sensitive data.)

IME app      w/o I-BOX   w/ I-BOX
Baidu           17           6
Sogou           44          30
QQ              37          20
Octopus         32          16
TouchPal        70          28
Baidu∗          30          18

Table 2: Number of packets observed for the tested apps.

Figure 5: Hexdump of the traced TouchPal packet. The leaked SSN is highlighted.

5.2.2 Gray-box Testing

Among these 11 IME apps, we are able to observe the packet payload of TouchPal (as discussed in §2.3) because it uses a plain-text protocol. Therefore, we conducted gray-box testing to confirm that I-BOX indeed mitigates the privacy leakage. In this experiment, we open a client “SMS” app to send a short message to a friend containing a social security number (SSN), which is private and sensitive by default. The text to send is a mixture of both Latin and non-Latin languages, as well as the number. Cloud input functionality will be triggered in this case.

Interestingly, without I-BOX’s protection, we found that TouchPal uploaded not only the keycodes the user typed as arguments of cloud input, but also the text message before the current input cursor, which includes the sensitive social security number, to the cloud through an HTTP POST request. We intercepted this packet using a man-in-the-middle attack. Part of the packet is displayed in Figure 5. However, with I-BOX’s protection, we found that I-BOX successfully detected the critical number and shut down the app’s network access to stop the leakage of data, and we did not observe any network trace.

We also studied the privacy warnings generated by Android about which data an IME may collect. Figure 6 shows that Android generates privacy warnings for two popular IME apps, Sogou and TouchPal, indicating that they may collect users’ passwords, credit card numbers, etc. This further confirms our conclusion that they collect users’ private data.

Apps                   Without I-BOX                          With I-BOX
SMS (phone number)     6204562244                             62045
SMS (message)          Let’s meet tomorrow noon at room 302   Let’s meet tomorrow noon at room 302
Instagram (account)    [email protected]                      thisisf
Instagram (password)   fakepassword                           (nothing received)
Facebook (account)     [email protected]                      thisisf
Facebook (password)    dontbelieveit                          (nothing received)
Alipay                 [email protected]                      nomo
Gmail                  [email protected]                      tosom
Google Play            Ingress                                Ingress
browser                How much is this PS3?                  How much is this PS3?

Table 3: Evaluation result w/ repackaged Baidu IME using different client apps.

(a) Sogou IME App (in Chinese)    (b) TouchPal IME App (in English)

Figure 6: Privacy Warning by Android for two popular IME apps. The left is shown in Chinese and the right is shown in English; the essential meanings are the same.

5.2.3 White-box Testing

As discussed in §2.3, we repackaged a very popular Baidu IME app to log all of the user input data and send it to a malicious server we controlled. Hence, this repackaged IME app is essentially a keylogger. We were able to perform white-box testing by inspecting the packet payloads and checking them against the source code of our malicious payload. We installed this IME app on our test phone and then used the phone to enter some user-defined private, sensitive data with different client apps such as SMS, Facebook, and Gmail. Table 3 shows the data we collected at the server side with and without I-BOX’s protection.

From this table we can clearly observe that without I-BOX, the malicious IME app steals all the data that a user enters, so all sensitive data is leaked. With I-BOX, the network connection is automatically blocked so that the server cannot receive any complete sensitive information. For instance, for passwords, the malicious server receives nothing, as shown in the Instagram and Facebook cases. Because I-BOX shuts down the malicious IME app’s network when it finds a character sequence that matches part of a sensitive phrase in our security policy, the server side can only receive part of the typed characters. For example, when a user types her Facebook account thisisfortest@gmail.com, the server side can only receive a part of it, i.e., thisisf. While partial sensitive input is still leaked, we believe it is hard for attackers to guess the original message from it.
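The partial leakage in Table 3 follows from the moment at which the network is cut. The sketch below is a simplified Python model, not I-BOX’s implementation: it assumes the keylogger sends each character as it is typed, and that the network is shut down as soon as the typed text ends with more than a fixed fraction of a sensitive phrase. The function names and the `match_fraction` threshold are hypothetical; the exact matching rule is not stated here.

```python
import math
from typing import Sequence


def matches_sensitive(buffer: str, sensitive: Sequence[str], match_fraction: float) -> bool:
    """True if the buffer ends with a long enough prefix of any sensitive phrase."""
    for phrase in sensitive:
        # Smallest prefix length that is strictly more than the given fraction.
        threshold = math.floor(len(phrase) * match_fraction) + 1
        for k in range(len(phrase), threshold - 1, -1):
            if buffer.endswith(phrase[:k]):
                return True
    return False


def received_by_server(typed: str, sensitive: Sequence[str], match_fraction: float = 0.5) -> str:
    """Characters a per-keystroke logger gets out before the network is cut."""
    sent = []
    buffer = ""
    for ch in typed:
        buffer += ch
        if matches_sensitive(buffer, sensitive, match_fraction):
            break  # assumed: shutdown happens before this keystroke is sent
        sent.append(ch)
    return "".join(sent)


policy = ["6204562244"]  # phone number taken from Table 3
print(received_by_server("6204562244", policy))
print(received_by_server("Let's meet tomorrow noon at room 302", policy))
```

Under the assumed threshold of one half, the first call returns 62045, the same prefix reported in Table 3, while the non-sensitive message passes through unchanged. A lower threshold would leak fewer characters at the cost of more frequent shutdowns on harmless text.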

5.3 User Experience

One principal goal of I-BOX is to keep the negative influence on an end user’s experience as small as possible. To evaluate this, we tested latency by determining how an end user would feel when typing characters on devices protected by I-BOX. For this, we invited a dozen students (6 undergraduate and 6 master’s students) in our lab to install I-BOX on their phones, and asked them to use our system and provide us with feedback. By default, I-BOX uses the context-based policy and derives all sensitive data from the contacts and cookies. Two of them also tried to input their girlfriends’ names and birth dates into I-BOX.

To our pleasure, none of the users complained of any latency imposed by our system. As shown in Table 4, our policy checking imposes only 0.4 milliseconds (ms) of overhead per character. While network shutdown takes about 180 ms, it is not executed per word and is instead triggered only when certain sensitive words are about to be formed. Therefore, the additional overhead added by I-BOX cannot be detected by end users: the typing speed of a normal user is 625 ms per character, and the world record for fastest texting is 160 ms per character, as shown in Table 4.
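To make the comparison concrete, the per-character overhead can be expressed as a percentage of the time between keystrokes. The short Python calculation below uses only the numbers from Table 4; the function and constant names are assumptions introduced for illustration.

```python
def relative_overhead(check_ms: float, typing_ms: float) -> float:
    """Policy-checking cost as a percentage of the per-character typing time."""
    return 100.0 * check_ms / typing_ms


POLICY_CHECK_MS = 0.4   # Table 4: policy checking per character
NORMAL_USER_MS = 625.0  # Table 4: normal user speed per character
RECORD_MS = 160.0       # Table 4: fastest-texter record per character

normal = relative_overhead(POLICY_CHECK_MS, NORMAL_USER_MS)
record = relative_overhead(POLICY_CHECK_MS, RECORD_MS)
print(f"normal user: {normal:.3f}% of the per-character time")
print(f"record texter: {record:.3f}% of the per-character time")
```

Policy checking thus accounts for about 0.064% of the per-character time for a normal user and about 0.25% even at record typing speed.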

One complaint we have received so far is that users now need to manually type their account instead of using the automation features provided by the IME apps. We believe this is worthwhile for better privacy protection. Another complaint is that they need to specify their additional secrets manually; this motivates us to design a better UI in our future work.

Note that we regard the sequence after @ as one character because an attacker can guess the rest from the first character most of the time.


Policy Checking                              0.4 ms/char
Network Shutdown                             180 ms
Checkpoint/Restore                           103 ms
Guinness World Records of fastest texter     160 ms/char
Normal user speed                            625 ms/char

Table 4: Statistics regarding the usage latency of I-BOX.

6 DISCUSSIONS AND LIMITATIONS

While I-BOX takes a first step toward mitigating keystroke leakage by untrusted IME apps, there are still a number of limitations in its design and implementation.

Side-channel attacks. Side channels have been shown to be a viable way to infer some keystroke information [9, 4]. I-BOX currently cannot prevent such side-channel attacks. However, such threats are usually less severe than those of malicious IME apps, which can accurately observe all user input. We leave addressing side-channel leakage as future work.

Colluding malware. As I-BOX currently only runs an IME app inside a sandbox transactionally, it is still possible for an IME app to collude with another piece of malware to leak information (i.e., the colluding attack [8]). For example, an IME app could first save the user input in a local file and inform a colluding malware to read the file before the transaction has been rolled back and then divulge the input. This essentially violates the policies of I-BOX. However, it is challenging for sandboxing to reliably prevent this, as studied by TxBox [25].

Security of I-BOX. Any new security tool may bring new security implications, as such tools usually touch security-sensitive data, and I-BOX is no exception. As I-BOX can essentially touch all of a user’s sensitive data, it is essentially a keylogger as well. Yet, I-BOX is much simpler than closed-source proprietary IME apps (1,700 LOC vs. hundreds of thousands of LOC). Regarding whether to trust I-BOX or other IME apps, third-party agents only need to audit the code of I-BOX instead of using gray-box approaches to audit the behavior of dozens of third-party IME apps. Meanwhile, I-BOX is completely a local service and will not send any private data out of the phone.

Permission Attacks. As I-BOX’s security is based on the Android permission system, it cannot defend against attacks on permissions such as component hijacking attacks and confused deputy attacks [23]. We consider this out of the scope of this paper; a number of prior systems statically and dynamically detect and prevent such attacks (e.g., [12, 43]). Moreover, Android has significantly improved its permission system since version 4.2 [3].

Voice input. Currently we limit input data protection to handwriting input and keystroke input and do not consider voice input, as it does not involve keystrokes. Moreover, users usually use dedicated system services such as Apple Siri, Google Now and Microsoft voice recognition for voice input. How to handle voice input and preserve its privacy is very challenging and will be our future work.

Beyond Mobile IME Apps. Note that the approach of I-BOX does not necessarily apply only to mobile platforms; similar techniques can also be applied to desktops, which suffer from a similar dilemma between privacy and usability. We may provide a similar oblivious sandbox for each IME app, which should be straightforward as Android actually runs atop Linux. We leave this as our future work. Besides, other applications that require a tradeoff between privacy and usability may use execution transactions as I-BOX does.

7 RELATED WORK

Privacy leakage detection in mobile devices. Recently, there have been significant efforts on the detection of privacy leakage in mobile devices. Early attempts include TaintDroid [16, 17] and PiOS [15], and recent efforts include Woodpecker [22], AndroidLeaks [20], ContentScope [50], and Appprofiler [35]. In particular, TaintDroid [16] uses dynamic taint analysis to track whether sensitive information (e.g., address book) can be leaked through the network. PiOS [15] uses static analysis and focuses on privacy leakage in iOS apps. Woodpecker [22] leverages an inter-procedural data-flow analysis to inspect whether an untrusted app can obtain unauthorized access to sensitive data. ContentScope [50] detects passive content leak vulnerabilities, by which in-app sensitive data can be leaked.

AndroidLeaks [20] instead uses static analysis to detect data leakage in Android apps. Chan et al. [10] further leverage mobile forensics to correlate user actions with privacy leakages. Appprofiler [35] creates a mapping between high-level API calls and low-level privacy-related behavior, which is then used to provide a high-level profile of an app’s privacy behavior. Besides, there has also been interest in detecting privacy leakage due to mobile ads [38]. In contrast, I-BOX focuses on preventing leakage of sensitive keystrokes.

Privacy leakage prevention in mobile devices. Other than detecting privacy leakage, there are also a number of systems that prevent private data from being leaked. By extending TaintDroid [16], AppFence [24] prevents applications from accessing sensitive information using data shadowing, and it also blocks outgoing communications tainted by sensitive data. While I-BOX and AppFence both block network communications when sensitive data is about to be leaked, there are substantial differences: AppFence uses shadowing to provide an illusion to the app such that it can continue performing its taint tracking, whereas I-BOX uses neither an illusion nor instruction-level taint tracking, due to the pervasive existence of native code. Meanwhile, AppFence does not encounter the challenges we faced, such as consistent rollback, and it simply blocks the network communication, whereas I-BOX still has to keep the connection and allow other data to be transferred.

TISSA [51] tames information-stealing apps to stop possible privacy leakage. SpanDex [14] further uses symbolic execution to quantify and limit the implicit flows through a sandbox, to prevent an untrusted application from leaking passwords. Through automatic repackaging of Android apps, Aurasium [43] attaches sandboxing and policy enforcement atop existing apps, to stop malicious behaviors such as attempts to retrieve users’ sensitive information. Unlike Aurasium, which adds a sandbox to an app, πBox [30] shifts the sandboxing protection of private data from the app level to the system level, and offers a platform for privacy-preserving apps. However, πBox trusts a few app vendors to protect users’ private data, while I-BOX treats the vendors of IME apps as untrusted, due to their incentives to collect users’ input. TinMan [42] instead completely offloads password-like secrets to a remote cloud, but only handles a special class of secrets that do not need to be displayed on mobile devices. ScreenPass [31] leverages a trusted software keyboard to input and tag passwords and uses taint tracking to ensure that a password is only used within a specific domain. In contrast, while I-BOX also uses a trusted software keyboard for password input, it focuses more on preventing a malicious IME from leaking sensitive data (not only passwords).

Checkpoint and restore. I-BOX employs a checkpoint and restore mechanism to prevent privacy leakage. Such a mechanism has been built for transactional memory [6], execution transactions [37], as well as whole-system transactions [33]. Retro [26] leverages selective re-execution for intrusion recovery. Storage Capsules [7] also use checkpoint and restore to wipe off residual data after an application has viewed data on a desktop. I-BOX is an instance of a system transaction, but designed specifically for untrusted IME apps.

Sandboxing. There have been a large number of efforts in building sandboxes to execute untrusted programs, web applications, and native code. These tools were built using a variety of approaches such as kernel-based systems [19], user-level approaches [27], system call interposition [21], binary code translation [18], and recompilation [45].

A sandbox that also uses transactions is TxBox [25], a tool built atop TxOS [33] for speculative execution and automatic recovery. While I-BOX and TxBox are similar in using transactions to build a sandbox, there are still significant differences: the goal of TxBox is to confine the execution of native x86 programs atop the Linux kernel, whereas the goal of I-BOX is to confine IME apps atop the Android OS. Consequently, I-BOX faces additional challenges including resolving IPC bindings. Further, using quiescent points in I-BOX significantly simplifies the design and implementation.

8 CONCLUSION

This paper made a first systematic study of the (in)security of third-party (trusted or untrusted) IME apps, and revealed that these apps tend to leak users’ sensitive input (due to their incentives of improving the user experience). To enjoy the rich experience offered by such apps while mitigating information leakage, this paper described I-BOX as a first step in this direction. In light of the opaque nature of an IME app, I-BOX leverages the idea of transactions to run an IME app so that it is oblivious to users’ sensitive input. Experiments showed that I-BOX is efficient, incurs little impact on users’ experience, and successfully thwarts the leakage of sensitive user input.

ACKNOWLEDGMENTS

We thank our shepherd William Enck and the anonymous reviewers for their insightful comments, and Xiaojuan Li and Yutao Liu for helping prepare the final version. This work is supported in part by the Program for New Century Excellent Talents in University, Ministry of Education of China (No. ZXZY037003), a foundation for the Author of National Excellent Doctoral Dissertation of PR China (No. TS0220103006), the Shanghai Science and Technology Development Fund for high-tech achievement translation (No. 14511100902), Zhangjiang Hi-Tech program (No. 201501-YP-B108-012), and the Singapore NRF (CREATE E2S2).

REFERENCES

[1] Free Chinese-made software poses security risk. http://www.japantimes.co.jp/news/2013/12/26/national/chinese-made-computer-input-system-banned-in-government-agencies/#.U21w5_aPUS0.

[2] smali - An assembler/disassembler for Android’s dex format. https://code.google.com/p/smali/.

[3] Security enhancements in Jelly Bean. http://android-developers.blogspot.jp/2013/02/security-enhancements-in-jelly-bean.html, 2013.

[4] A. J. Aviv, B. Sapp, M. Blaze, and J. M. Smith. Practicality of accelerometer side channels on smartphones. In ACSAC, 2012.

[5] BBC News. Salford woman makes bid for fastest text title. http://news.bbc.co.uk/local/manchester/hi/people_and_places/newsid_8939000/8939790.stm, 2010.

[6] A. Birgisson, M. Dhawan, U. Erlingsson, V. Ganapathy, and L. Iftode. Enforcing authorization policies using transactional memory introspection. In CCS, pages 223–234, 2008.

[7] K. Borders, E. Vander Weele, B. Lau, and A. Prakash. Protecting confidential data on personal computers with storage capsules. In USENIX Security, 2009.


[8] S. Bugiel, L. Davi, A. Dmitrienko, T. Fischer, A.-R. Sadeghi, and B. Shastry. Towards taming privilege-escalation attacks on Android. In NDSS, 2012.

[9] L. Cai and H. Chen. TouchLogger: inferring keystrokes on touch screen from smartphone motion. In HotSec, 2011.

[10] J. J. K. Chan, K. W. Tan, L. Jiang, and R. K. Balan. The case for mobile forensics of private data leaks: Towards large-scale user-oriented privacy protection. In APSYS, 2013.

[11] S. Chen, R. Wang, X. Wang, and K. Zhang. Side-channel leaks in web applications: A reality today, a challenge tomorrow. In Oakland, pages 191–206, 2010.

[12] E. Chin, A. P. Felt, K. Greenwood, and D. Wagner. Analyzing inter-application communication in Android. In MobiSys, pages 239–252. ACM, 2011.

[13] China IT Research Center. Third-party IMEs usage stats in China for 2014 Q1. http://www.cnit-research.com/content/201405/303.html, 2014.

[14] L. P. Cox, P. Gilbert, G. Lawler, V. Pistol, A. Razeen, B. Wu, and S. Cheemalapati. SpanDex: Secure password tracking for Android. In USENIX Security, 2014.

[15] M. Egele, C. Kruegel, E. Kirda, and G. Vigna. PiOS: Detecting privacy leaks in iOS applications. In NDSS, 2011.

[16] W. Enck, P. Gilbert, B. Chun, L. Cox, J. Jung, P. McDaniel, and A. Sheth. TaintDroid: an information-flow tracking system for realtime privacy monitoring on smartphones. In OSDI, 2010.

[17] W. Enck, P. Gilbert, S. Han, V. Tendulkar, B.-G. Chun, L. P. Cox, J. Jung, P. McDaniel, and A. N. Sheth. TaintDroid: an information-flow tracking system for realtime privacy monitoring on smartphones. ACM TOCS, 32(2):5, 2014.

[18] B. Ford and R. Cox. Vx32: Lightweight user-level sandboxing on the x86. In USENIX ATC, 2008.

[19] T. Fraser, L. Badger, and M. Feldman. Hardening COTS software with generic software wrappers. In Oakland, pages 2–16, 1999.

[20] C. Gibler, J. Crussell, J. Erickson, and H. Chen. AndroidLeaks: automatically detecting potential privacy leaks in Android applications on a large scale. In Trust, 2012.

[21] I. Goldberg, D. Wagner, R. Thomas, and E. A. Brewer. A secure environment for untrusted helper applications: confining the wily hacker. In USENIX Security, 1996.

[22] M. Grace, Y. Zhou, Z. Wang, and X. Jiang. Systematic detection of capability leaks in stock Android smartphones. In NDSS, 2012.

[23] N. Hardy. The confused deputy: (or why capabilities might have been invented). SIGOPS Oper. Sys. Review, 22(4):36–38, 1988.

[24] P. Hornyack, S. Han, J. Jung, S. Schechter, and D. Wetherall. These aren’t the droids you’re looking for: Retrofitting Android to protect data from imperious applications. In CCS, 2011.

[25] S. Jana, D. E. Porter, and V. Shmatikov. TxBox: Building secure, efficient sandboxes with system transactions. In Oakland, 2011.

[26] T. Kim, X. Wang, N. Zeldovich, M. Kaashoek, et al. Intrusion recovery using selective re-execution. In OSDI, 2010.

[27] T. Kim and N. Zeldovich. Practical and effective sandboxing for non-root users. In USENIX ATC, pages 139–144, 2013.

[28] O. Laadan and J. Nieh. Transparent checkpoint-restart of multiple processes on commodity operating systems. In USENIX ATC, pages 323–336, 2007.

[29] W. S. Labs. Fake input method editor (IME) trojan. http://community.websense.com/blogs/securitylabs/archive/2010/07/05/trojan-using-input-method-inject-technology.aspx.

[30] S. Lee, E. L. Wong, D. Goel, M. Dahlin, and V. Shmatikov. πBox: a platform for privacy-preserving apps. In NSDI, 2013.

[31] D. Liu, E. Cuervo, V. Pistol, R. Scudellari, and L. P. Cox. ScreenPass: Secure password entry on touchscreen devices. In MobiSys, pages 291–304, 2013.

[32] M. Nauman, S. Khan, and X. Zhang. Apex: extending Android permission model and enforcement with user-defined runtime constraints. In ASIACCS, pages 328–332, 2010.

[33] D. E. Porter, O. S. Hofmann, C. J. Rossbach, A. Benn, and E. Witchel. Operating system transactions. In SOSP, 2009.

[34] V. Rastogi, Y. Chen, and W. Enck. AppsPlayground: Automatic security analysis of smartphone applications. In ACM Conference on Data and Application Security and Privacy, 2013.

[35] S. Rosen, Z. Qian, and Z. M. Mao. Appprofiler: a flexible method of exposing privacy-related behavior in Android applications to end users. In ACM Conference on Data and Application Security and Privacy, pages 221–232. ACM, 2013.

[36] M. A. Salehi, T. Caldwell, A. Fernandez, E. Mickiewicz, E. W. Rozier, S. Zonouz, and D. Redberg. Reseed: Regular expression search over encrypted data in the cloud. In CCGrid, 2014.

[37] S. Sidiroglou, O. Laadan, A. D. Keromytis, and J. Nieh. Using rescue points to navigate software recovery. In Oakland, 2007.

[38] R. Stevens, C. Gibler, J. Crussell, J. Erickson, and H. Chen. Investigating user privacy in Android ad libraries. In Workshop on Mobile Security Technologies (MoST), 2012.

[39] K. Subramanyam, C. E. Frank, and D. F. Galli. Keyloggers: The overlooked threat to computer security. http://www.keylogger.org/articles/kishore-subramanyam/keyloggers-the-overlooked-threat-to-computer-security-7.html.

[40] TechSpot News. Google fired employees for breaching user privacy. http://www.techspot.com/news/40280-google-fired-employees-for-breaching-user-privacy.html, 2010.

[41] Y. Xia, Y. Liu, and H. Chen. Architecture support for guest-transparent VM protection from untrusted hypervisor and physical attacks. In HPCA, 2013.

[42] Y. Xia, Y. Liu, C. Tan, M. Ma, H. Guan, B. Zang, and H. Chen. TinMan: eliminating confidential mobile data exposure with security oriented offloading. In EuroSys, 2015.

[43] R. Xu, H. Saïdi, and R. Anderson. Aurasium: Practical policy enforcement for Android applications. In USENIX Security, 2012.

[44] W. Yang, X. Xiao, B. Andow, S. Li, T. Xie, and W. Enck. AppContext: Differentiating malicious and benign mobile app behaviors using context. In ICSE, 2015.

[45] B. Yee, D. Sehr, G. Dardyk, J. B. Chen, R. Muth, T. Ormandy, S. Okasaka, N. Narula, and N. Fullagar. Native Client: A sandbox for portable, untrusted x86 native code. Commun. ACM, 53(1):91–99, Jan. 2010.

[46] H. Yin, D. Song, M. Egele, C. Kruegel, and E. Kirda. Panorama: Capturing system-wide information flow for malware detection and analysis. In CCS, 2007.

[47] F. Zhang, J. Chen, H. Chen, and B. Zang. CloudVisor: retrofitting protection of virtual machines in multi-tenant cloud with nested virtualization. In SOSP, 2011.

[48] W. Zhou, Y. Zhou, X. Jiang, and P. Ning. Detecting repackaged smartphone applications in third-party Android marketplaces. In ACM Conference on Data and Application Security and Privacy, pages 317–326. ACM, 2012.

[49] Y. Zhou and X. Jiang. Dissecting Android malware: Characterization and evolution. In Oakland, 2012.

[50] Y. Zhou and X. Jiang. Detecting passive content leaks and pollution in Android applications. In NDSS, 2013.

[51] Y. Zhou, X. Zhang, X. Jiang, and V. W. Freeh. Taming information-stealing smartphone applications (on Android). In Conference on Trust and Trustworthy Computing, 2011.