# Sensitivity

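• Define sensitivity

• Find the sensitivity of counting queries

• Find the sensitivity of summation queries

• Decompose average queries into counting and summation queries

• Use clipping to bound the sensitivity of summation queries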

\begin{align} F(x) = f(x) + \textsf{Lap}\left(\frac{s}{\epsilon}\right) \tag{3} \end{align}

\begin{align} GS(f) = \max_{x, x' : d(x,x') \leq 1} |f(x) - f(x')| \tag{4} \end{align}

## Distance

\begin{align} d(x, x') = | (x - x') \cup (x' - x) | \tag{5} \end{align}

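• If $$x'$$ is constructed from $$x$$ by adding one row, then $$d(x,x') = 1$$

• If $$x'$$ is constructed from $$x$$ by removing one row, then $$d(x,x') = 1$$

• If $$x'$$ is constructed from $$x$$ by modifying one row, then $$d(x,x') = 2$$, because a modification counts as one removal plus one addition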
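The three cases above can be checked with a small sketch. It treats a dataset as a multiset of rows and counts the rows that appear in one dataset but not the other. The helper name `dataset_distance` and the toy rows are assumptions made for illustration; they do not come from the original text.

```python
from collections import Counter

def dataset_distance(x, x_prime):
    """Size of the symmetric difference of two datasets, treated as multisets of rows."""
    cx, cxp = Counter(x), Counter(x_prime)
    only_in_x = cx - cxp          # rows present in x but missing from x'
    only_in_x_prime = cxp - cx    # rows present in x' but missing from x
    return sum(only_in_x.values()) + sum(only_in_x_prime.values())

# Toy dataset (made-up rows, for illustration only)
x = [("Alice", 34), ("Bob", 51), ("Carol", 29)]

added = x + [("Dave", 45)]            # one row added
removed = x[:-1]                      # one row removed
modified = x[:-1] + [("Carol", 30)]   # one row modified

print(dataset_distance(x, added))     # 1
print(dataset_distance(x, removed))   # 1
print(dataset_distance(x, modified))  # 2
```

Using `Counter` rather than `set` keeps duplicate rows distinct, which matters when two individuals happen to have identical records.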

## Calculating Sensitivity

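• The global sensitivity of $$f(x) = x$$ is 1, because changing $$x$$ by 1 changes $$f(x)$$ by 1

• The global sensitivity of $$f(x) = x+x$$ is 2, because changing $$x$$ by 1 changes $$f(x)$$ by 2

• The global sensitivity of $$f(x) = 5*x$$ is 5, because changing $$x$$ by 1 changes $$f(x)$$ by 5

• The global sensitivity of $$f(x) = x*x$$ is unbounded, because the change in $$f(x)$$ depends on the value of $$x$$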
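The four examples above can be explored numerically. The sketch below measures the largest change in $$f(x)$$ when $$x$$ moves by 1, over a small and a large range of integer inputs. The helper `max_change` and the chosen ranges are assumptions for illustration. Sampling only gives a lower bound on the global sensitivity, so this is a sanity check rather than a proof.

```python
def max_change(f, xs):
    """Largest |f(x + 1) - f(x)| over the sampled inputs xs."""
    return max(abs(f(x + 1) - f(x)) for x in xs)

small = range(-10, 11)       # inputs from -10 to 10
large = range(-1000, 1001)   # inputs from -1000 to 1000

functions = [
    ("x", lambda x: x),
    ("x + x", lambda x: x + x),
    ("5 * x", lambda x: 5 * x),
    ("x * x", lambda x: x * x),
]

for name, f in functions:
    # For the first three functions the two numbers agree; for x * x the
    # second number grows with the input range, so no finite bound exists.
    print(name, max_change(f, small), max_change(f, large))
```

For $$f(x) = x*x$$ the change is $$|2x + 1|$$, which is why widening the input range keeps increasing the measured value.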

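import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Optional: mplfonts is only needed to render CJK axis labels; skip it if not installed
try:
    from mplfonts.bin.cli import init
    from mplfonts import use_font
    init()
    use_font('SimHei')
except ImportError:
    pass

# plt.style.use('seaborn-whitegrid')
plt.style.use('fivethirtyeight')

# The cells below expect a DataFrame named `adult`; load it before running them, e.g.
# adult = pd.read_csv('path/to/adult_dataset.csv')  # hypothetical path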




### Counting Queries

adult.shape[0]

32561


adult[adult['Education-Num'] > 10].shape[0]


adult[adult['Education-Num'] <= 10].shape[0]


adult[adult['Name'] == 'Joe Near'].shape[0]
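Adding or removing one row changes the result of a counting query by at most 1, so under the distance defined earlier a counting query has sensitivity 1. The sketch below illustrates this on a made-up array standing in for the `Education-Num` column (the values and the fixed seed are assumptions for illustration), then releases the count with Laplace noise of scale $$1/\epsilon$$.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

def laplace_mech(v, sensitivity, epsilon):
    """Laplace mechanism: add noise with scale sensitivity / epsilon."""
    return v + rng.laplace(loc=0, scale=sensitivity / epsilon)

# Made-up education levels standing in for adult['Education-Num']
education_num = np.array([9, 13, 10, 14, 7, 16, 10, 13])

true_count = int((education_num > 10).sum())

# A neighboring dataset with one extra row changes the count by at most 1
count_with_extra_row = int((np.append(education_num, 12) > 10).sum())
print(true_count, count_with_extra_row)

# Noisy release of the count with sensitivity 1
noisy_count = laplace_mech(true_count, sensitivity=1, epsilon=1.0)
print(noisy_count)
```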


### Summation Queries

adult[adult['Education-Num'] > 10]['Age'].sum()

422876


### Average Queries

adult[adult['Education-Num'] > 10]['Age'].mean()

40.21262837580829


adult[adult['Education-Num'] > 10]['Age'].sum() / adult[adult['Education-Num'] > 10]['Age'].shape[0]

40.21262837580829
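Since the mean equals a sum divided by a count, a noisy mean can be built from a noisy sum and a noisy count. The sketch below is one way to do that, under stated assumptions: the ages are synthetic, they are assumed to lie in $$[0, 125]$$ (enforced here with `np.clip`) so that the sum has sensitivity 125, and the privacy budget is split evenly between the two queries. The even split is a choice made for this example, not something prescribed by the text.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

def laplace_mech(v, sensitivity, epsilon):
    """Laplace mechanism: add noise with scale sensitivity / epsilon."""
    return v + rng.laplace(loc=0, scale=sensitivity / epsilon)

# Synthetic ages standing in for adult['Age'] (made-up data)
ages = rng.integers(18, 91, size=10_000)

upper = 125      # assumed upper bound on age
epsilon = 1.0    # total privacy budget for the mean

# Each query uses half of the budget; by sequential composition the total is epsilon
noisy_sum = laplace_mech(np.clip(ages, 0, upper).sum(), sensitivity=upper, epsilon=epsilon / 2)
noisy_count = laplace_mech(len(ages), sensitivity=1, epsilon=epsilon / 2)

noisy_mean = noisy_sum / noisy_count
true_mean = ages.mean()
print(true_mean, noisy_mean)
```

With 10,000 rows the noise on both queries is small relative to the true values, so the noisy mean lands close to the true one. On much smaller datasets the noisy count can approach zero and the ratio becomes unstable.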


## Clipping

adult['Age'].clip(lower=0, upper=125).sum()

1256257
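Clipping is what gives a summation query a finite sensitivity: once every value is forced into $$[0, 125]$$, adding or removing one row can change the sum by at most 125. The sketch below shows this on made-up ages, with a deliberately absurd outlier row (both are assumptions for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Made-up ages standing in for adult['Age']
ages = np.array([23, 35, 47, 52, 61, 90])
upper = 125

def clipped_sum(values, lower, upper):
    """Sum after forcing every value into [lower, upper]."""
    return int(np.clip(values, lower, upper).sum())

# Neighboring dataset: the same rows plus one extreme outlier
with_outlier = np.append(ages, 10_000)

unclipped_change = int(with_outlier.sum() - ages.sum())
clipped_change = clipped_sum(with_outlier, 0, upper) - clipped_sum(ages, 0, upper)
print(unclipped_change, clipped_change)   # 10000 125

# With the bound in place, the sum can be released with sensitivity `upper`
epsilon = 1.0
noisy_sum = clipped_sum(ages, 0, upper) + rng.laplace(loc=0, scale=upper / epsilon)
print(noisy_sum)
```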


plt.hist(adult['Age'])
plt.xlabel('Age')
plt.ylabel('Number of records');


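def laplace_mech(v, sensitivity, epsilon):
    # Laplace mechanism: add noise with scale sensitivity / epsilon
    return v + np.random.laplace(loc=0, scale=sensitivity/epsilon)

epsilon_i = .01
# Noisy clipped sum for each candidate upper bound; the bound is also the sensitivity
plt.plot([laplace_mech(adult['Age'].clip(lower=0, upper=i).sum(), i, epsilon_i) for i in range(100)])
plt.xlabel('Clipping bound for age')
plt.ylabel('Total sum');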


# Same experiment with exponentially growing clipping bounds
xs = [2**i for i in range(15)]
plt.plot(xs, [laplace_mech(adult['Age'].clip(lower=0, upper=i).sum(), i, epsilon_i) for i in xs])
plt.xscale('log')
plt.xlabel('Clipping bound for age')
plt.ylabel('Total sum');
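The shape of such a curve can also be read off numerically. The sketch below computes the clipped sum for the same powers of two on synthetic ages between 18 and 90 (made-up data, an assumption for illustration). Noise is left out on purpose so that the plateau is easy to see, which means these raw sums are not differentially private on their own.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Synthetic ages standing in for adult['Age'] (made-up data)
ages = rng.integers(18, 91, size=1_000)

xs = [2**i for i in range(15)]
clipped_sums = [int(np.clip(ages, 0, b).sum()) for b in xs]

for b, s in zip(xs, clipped_sums):
    # The sum grows with the bound until the bound exceeds the largest age,
    # after which a larger bound only adds noise, not signal.
    print(b, s)
```

Once the bound passes the largest value in the data, the clipped sum equals the true sum, so a larger bound buys nothing while the Laplace noise scale keeps growing with it.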