Defend Against Website Fingerprinting With Machine Learning


In stream fingerprinting, an attacker can compromise user privacy by leveraging side-channel information (e.g., packet size) of encrypted traffic in streaming services. By taking advantage of machine learning, especially neural networks, an adversary can reveal which YouTube video a victim watches with extremely high accuracy. While effective defense methods have been proposed, they require extremely high bandwidth overheads. In other words, how to build an effective defense with low overheads remains unknown. In this paper, we propose a new defense mechanism, referred to as SmartSwitch, to address this open problem. Our defense intelligently switches the noise level on different packets such that the defense remains effective but minimizes overheads. Specifically, our method produces higher noise to obfuscate the sizes of more significant packets. To identify which packets are more significant, we formulate this as a feature selection problem and investigate several feature selection methods over high-dimensional data. Our experimental results derived from a large-scale dataset demonstrate that our proposed defense is highly effective against stream fingerprinting built upon Convolutional Neural Networks. Specifically, an adversary can infer which YouTube video a user watches with only 1% accuracy (same as random guess) even if the adversary retrains neural networks with obfuscated traffic. Compared to the state-of-the-art defense, our mechanism can save nearly 40% of bandwidth overheads.


  • Encrypted traffic analysis
  • Machine learning
  • Feature selection


Authors from the University of Cincinnati were partially supported by the National Science Foundation (CNS-1947913), the UC Office of the Vice President for Research Pilot Program, and the Ohio Cyber Range at UC.

Hyperparameters of CNN. The tuned hyperparameters of our CNN are described in Table 4. We represent the search space of each hyperparameter as a set. For the activation functions, dropout, filter size and pool size, we searched hyperparameters at each layer. The tuned parameters we report in the table are presented as a sequence of values following the order of layers presented in Fig. 2. For instance, the tuned activation functions are selu (1st Conv), elu (2nd Conv), relu (3rd Conv), elu (4th Conv), and tanh (the second-to-last Dense layer).

Table 4. Tuned hyperparameters of CNN when \(w = 0.05\) s

\(d^*\)-privacy. Xiao et al. [26] proposed \(d^*\)-privacy, a variant of differential privacy on time-series data, to mitigate side-channel information leakage. They proved that \(d^*\)-privacy can achieve \((d^*, 2\epsilon)\)-privacy, where \(d^{*}\) is a distance between two time series and \(\epsilon \) is the privacy parameter in differential privacy.

Let \(\mathbf {x}=(x_{1}, ..., x_{n})\) and \(\mathbf {y}=(y_{1}, ..., y_{n})\) denote two time series with the same length. The \(d^*\)-distance between \(\mathbf {x}\) and \(\mathbf {y}\) is defined as:

$$\begin{aligned} d^*(\mathbf {x}, \mathbf {y}) = \sum _{i\ge 2} \left| (x_{i}-x_{i-1})-(y_{i}-y_{i-1}) \right| \end{aligned}$$

(6)
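
As a minimal sketch of the distance defined above, the following Python function sums the absolute differences between the consecutive increments of two equal-length series. The function name `d_star` and the use of plain Python sequences are illustrative assumptions, not part of the original work.

```python
from typing import Sequence


def d_star(x: Sequence[float], y: Sequence[float]) -> float:
    """d*-distance between two equal-length time series (illustrative sketch).

    With 0-based indexing, the sum over i >= 2 in the 1-indexed definition
    becomes a sum over indices 1 .. n-1.
    """
    if len(x) != len(y):
        raise ValueError("time series must have the same length")
    return sum(
        abs((x[i] - x[i - 1]) - (y[i] - y[i - 1]))
        for i in range(1, len(x))
    )
```

Because only increments are compared, two series that differ by a constant offset are at distance zero under this definition, for example `d_star([5, 6, 7], [0, 1, 2])`.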

\(d^{*}\)-privacy adds noise to the data at a later timestamp by taking into account data from an earlier timestamp in the same time series. Specifically, let \(D(i)\) denote the greatest power of 2 that divides timestamp \(i\). \(d^*\)-privacy computes the noised data at timestamp \(i\) as \(\tilde{x}_{i} = \tilde{x}_{G(i)} + (x_{i} - x_{G(i)}) + r_{i}\), where \(\tilde{x}_{0} = x_{0} = 0\), and the function \(G(\cdot )\) and the noise \(r_{i}\) are defined as below:

$$\begin{aligned} G(i)= {\left\{ \begin{array}{ll} 0 &{} \text {if } i = 1 \\ i/2 &{} \text {if } i = D(i) \\ i-D(i) &{} \text {if } i > D(i) \\ \end{array}\right. } \end{aligned}$$


$$\begin{aligned} r_i= {\left\{ \begin{array}{ll} \mathrm {Laplace}(\frac{1}{\epsilon })&{} \text {if } i = D(i) \\ \mathrm {Laplace}(\frac{\lfloor \log _2i\rfloor }{\epsilon })&{} \text {otherwise} \end{array}\right. } \end{aligned}$$
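
The definitions of \(D(i)\), \(G(i)\) and \(r_i\) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: it assumes the recursive update \(\tilde{x}_{i} = \tilde{x}_{G(i)} + (x_{i} - x_{G(i)}) + r_{i}\) with \(\tilde{x}_{0} = x_{0} = 0\), as in the mechanism of Xiao et al. [26], and the helper names (`laplace`, `d_star_private`) are hypothetical.

```python
import random
from typing import List, Optional, Sequence


def D(i: int) -> int:
    """Greatest power of 2 that divides timestamp i (i >= 1)."""
    return i & -i


def G(i: int) -> int:
    """Index of the earlier timestamp that timestamp i builds on."""
    if i == 1:
        return 0
    d = D(i)
    return i // 2 if i == d else i - d


def laplace(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def d_star_private(
    x: Sequence[float],
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return a noised copy of x (assumed update rule, see lead-in)."""
    rng = rng or random.Random()
    xs = [0.0] + list(x)           # shift to 1-based indexing, x_0 = 0
    noised = [0.0] * len(xs)       # noised[0] = 0 by assumption
    for i in range(1, len(xs)):
        # floor(log2(i)) computed exactly with integer arithmetic
        factor = 1 if i == D(i) else i.bit_length() - 1
        r_i = laplace(factor / epsilon, rng)
        g = G(i)
        noised[i] = noised[g] + (xs[i] - xs[g]) + r_i
    return noised[1:]
```

For example, `d_star_private([1500, 1500, 600, 1500], epsilon=0.5)` returns four noised packet sizes; a smaller `epsilon` yields larger Laplace noise, and passing a seeded `random.Random` makes the output reproducible.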


Li, H., Niu, B., Wang, B. (2020). SmartSwitch: Efficient Traffic Obfuscation Against Stream Fingerprinting. In: Park, N., Sun, K., Foresti, S., Butler, K., Saxena, N. (eds) Security and Privacy in Communication Networks. SecureComm 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 335. Springer, Cham.

