Tutorials
Counterfit keeps the target in focus for the user and tries to provide a uniform interface to the underlying frameworks. However, understanding how to build a target class is important for using it successfully. As a warm-up, we will build a target for everyone's favorite ML model, MNIST.
- Start Counterfit and execute the new command. Enter a name and select images as the data type.
[[email protected]] -> python .\counterfit.py
---------------------------------------------------
Microsoft
__ _____ __
_________ __ ______ / /____ _____/ __(_) /_
/ ___/ __ \/ / / / __ \/ __/ _ \/ ___/ /_/ / __/
/ /__/ /_/ / /_/ / / / / /_/ __/ / / __/ / /
\___/\____/\__,_/_/ /_/\__/\___/_/ /_/ /_/\__/
#ATML
---------------------------------------------------
[+] 18 attacks
[+] 4 targets
counterfit> new
? Target name: mnist
? Which framework? art
? What data type? image
counterfit>
- Find the new target folder in counterfit/targets, and open the new target Python file in your preferred code editor. The code file is generated from a template in counterfit/core/commands/new.py.
# Generated by counterfit #
from counterfit.core.targets import ArtTarget

class Mnist(ArtTarget):
    model_name = "mnist"
    model_data_type = "image"
    model_endpoint = ""
    model_input_shape = ()
    model_output_classes = []
    X = []

    def __init__(self):
        self.X = []

    def __call__(self, x):
        return x
- In your code editor, fill out the required target properties.
  - model_name and model_data_type were taken care of during new target creation.
  - model_endpoint is where Counterfit will collect outputs from the target model. We will use the mnist_sklearn_pipeline.pkl pre-trained model found in the tutorial folder.
  - model_input_shape is the input shape of the target model, which is known to be (1, 28, 28).
  - model_output_classes are the output classes of the model, which are known to be ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"].
After filling in the blanks with the information above, the target class should look like the following:
# Generated by counterfit #
from counterfit.core.targets import ArtTarget

class Mnist(ArtTarget):
    model_name = "mnist"
    model_data_type = "image"
    model_endpoint = "counterfit/targets/tutorial/mnist_sklearn_pipeline.pkl"
    model_input_shape = (1, 28, 28)
    model_output_classes = ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]
    X = []

    def __init__(self):
        self.X = []

    def __call__(self, x):
        return x
- Interact with the target via interact. Try to reload the target and fix any errors that show up. Now when you run list targets, some of the information should be filled out.
counterfit> interact mnist
...
mnist> reload
mnist> list targets
Name Type Input Shape Location
----------------------------------------------------------------------------------------------------------------------------------------
creditfraud numpy (30,) counterfit/targets/creditfraud/creditfraud_sklearn_pipeline.pkl
mnist image (1, 28, 28) counterfit/targets/tutorial/mnist_sklearn_pipeline.pkl
moviereviews text (1,) counterfit/targets/moviereviews/movie_reviews_sentiment_analysis.pt
satelliteimages image (3, 256, 256) counterfit/targets/satelliteimages/satellite-image-params-airplane-stadium.h5
tutorial image (1, 28, 28) counterfit/targets/tutorial/mnist_sklearn_pipeline.pkl
mnist>
- With the required properties in place, we can start loading resources and implementing functionality.
  - This model uses the image data type. A user can override clip_values in the target, which ensures image values remain valid pixel values.
  - Because this is a local model, we first load the model and expose the predict function that Counterfit will use to interact with the target model.
  - Next, load the sample data X. The sample data is a list of lists, where each list is an array containing a processed sample. The data for the tutorial is in a nice, tidy NumPy zip file; however, most targets will require additional processing to get X.
Paste the __init__ function below into the target class.
    def __init__(self):
        self.clip_values = (0, 255)
        with open(self.model_endpoint, "rb") as f:
            self.model = pickle.load(f)
        self.data_file = "counterfit/targets/tutorial/mnist_784.npz"
        self.sample_data = np.load(self.data_file, allow_pickle=True)
        self.X = self.sample_data["X"]
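Before moving on, it can help to sanity-check the loaded samples against model_input_shape and clip_values. The sketch below is only an illustration: check_samples is a hypothetical helper (not part of Counterfit), and an in-memory archive with random pixel values stands in for the tutorial's mnist_784.npz, whose exact layout is assumed here to be one flattened sample per row under the key "X".

```python
import io

import numpy as np


def check_samples(X, input_shape, clip_values):
    """Hypothetical helper: True if each sample fits input_shape and clip_values."""
    X = np.asarray(X)
    low, high = clip_values
    per_sample = int(np.prod(input_shape))
    # Flatten every sample and compare its size with the declared input shape.
    right_size = X.reshape(X.shape[0], -1).shape[1] == per_sample
    # Every value must already be a valid pixel value.
    in_range = bool(X.min() >= low and X.max() <= high)
    return right_size and in_range


# Stand-in for the tutorial archive (assumption): 5 random "images" under key "X".
rng = np.random.default_rng(0)
buffer = io.BytesIO()
np.savez(buffer, X=rng.integers(0, 256, size=(5, 784)))
buffer.seek(0)

sample_data = np.load(buffer)
X = sample_data["X"]
ok = check_samples(X, (1, 28, 28), (0, 255))
print(X.shape, ok)
```

If the check fails for a real target, that is usually a sign the additional processing mentioned above is still needed before the data can be used as X.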
- Excellent, we now have samples and a model to attack. Next, we will build the __call__ function. Counterfit will use this function to submit samples to the target model via x. x is a perturbed sample of shape (Batch, Channels, Height, Width). Channels, Height, and Width are derived from the model_input_shape that was defined earlier. Functionally, x is a list of lists, where each list is a sample of shape (1, 28, 28). This should sound familiar, as it is the same shape as X and model_input_shape. Paste the following code below the __init__ function.
    def __call__(self, x):
        scores = self.model.predict_proba(x.reshape(x.shape[0], -1))
        return scores.tolist()
Note: A crucial piece of the __call__ function is how scores are returned to the attack algorithm. The function must return a list of probabilities. An attack algorithm uses the returned scores to decide how to change the sample for the next iteration of the attack. In this tutorial, the pre-trained MNIST model returns exactly what is needed, which is a list of probabilities for each label.
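To make the reshape and the return format concrete, here is a minimal sketch that can be run without the pickled model. UniformStubModel and call_target are hypothetical names used only for this illustration; the stub simply returns a uniform distribution over ten classes, whereas the real pipeline returns learned probabilities.

```python
import numpy as np


class UniformStubModel:
    """Hypothetical stand-in for the pickled scikit-learn pipeline."""

    n_classes = 10

    def predict_proba(self, flat_batch):
        # One row of class probabilities per sample; uniform for illustration only.
        return np.full((flat_batch.shape[0], self.n_classes), 1.0 / self.n_classes)


def call_target(model, x):
    """Mirror the body of __call__: flatten each sample, then return plain lists."""
    x = np.asarray(x)
    flat = x.reshape(x.shape[0], -1)  # (Batch, 1, 28, 28) -> (Batch, 784)
    return model.predict_proba(flat).tolist()


batch = np.zeros((3, 1, 28, 28))
scores = call_target(UniformStubModel(), batch)
print(len(scores), len(scores[0]))
```

The result is a list with one entry per sample in the batch, and each entry is a list with one probability per output class.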
- Alright, the new target is almost ready. Add the following imports to the top of the file: import pickle, import numpy as np. Next, execute the reload command to load the updated target into the session. The __init__ function is called on reload or interact. There will be warnings, which are left unsuppressed to keep the target code clean; you can safely ignore them. The final target should look like the following:
# Generated by counterfit #
import pickle
import numpy as np
from counterfit.core.targets import ArtTarget

class Mnist(ArtTarget):
    model_name = "mnist"
    model_data_type = "image"
    model_endpoint = "counterfit/targets/tutorial/mnist_sklearn_pipeline.pkl"
    model_input_shape = (1, 28, 28)
    model_output_classes = ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]
    X = []

    def __init__(self):
        self.clip_values = (0, 255)
        with open(self.model_endpoint, "rb") as f:
            self.model = pickle.load(f)
        self.data_file = "counterfit/targets/tutorial/mnist_784.npz"
        self.sample_data = np.load(self.data_file, allow_pickle=True)
        self.X = self.sample_data["X"]

    def __call__(self, x):
        scores = self.model.predict_proba(x.reshape(x.shape[0], -1))
        return scores.tolist()
- To test the functionality of the target, execute the predict command.
mnist> predict
[!] No index sample, setting random index.
Output Scores
Sample ['0' '1' '2' '3' '4' '5' '6'
Index Sample '7' '8' '9']
------------------------------------------------------------------------------------------------------
65923 mnist-sample-46600446.png [0.000 0.000 0.000 0.000 1.000
0.000 0.000 0.000 0.000 0.000]
mnist>
- Excellent. We are ready to run attacks on the MNIST target. List the frameworks, load art, and then list the available attacks. Attacks are filtered based on the model_data_type defined in the target class.
counterfit> list frameworks
Framework # of Attacks
----------------------------------------------------
art 7
textattack 11
counterfit> load art
[+] Framework loaded successfully!
counterfit> list attacks
Name Type Category Tags Framework
----------------------------------------------------------------------------------------
boundary evasion blackbox image, numpy art
hop_skip_jump evasion blackbox image, numpy art
pixel evasion blackbox image art
spatial_transformation evasion blackbox image art
square evasion blackbox image art
threshold evasion blackbox image art
zoo evasion blackbox image, numpy art
counterfit>
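The filtering by model_data_type can be pictured as a simple tag match. The sketch below is an assumption about the idea, not Counterfit's actual implementation: the tags are copied from the attack listing above, while the ATTACK_TAGS dictionary and the attacks_for function are hypothetical.

```python
# Tags copied from the attack listing above; the dict layout itself is hypothetical.
ATTACK_TAGS = {
    "boundary": ["image", "numpy"],
    "hop_skip_jump": ["image", "numpy"],
    "pixel": ["image"],
    "spatial_transformation": ["image"],
    "square": ["image"],
    "threshold": ["image"],
    "zoo": ["image", "numpy"],
}


def attacks_for(data_type, attack_tags=ATTACK_TAGS):
    """Return the names of attacks whose tags include the target's data type."""
    return sorted(name for name, tags in attack_tags.items() if data_type in tags)


image_attacks = attacks_for("image")
numpy_attacks = attacks_for("numpy")
print(image_attacks)
print(numpy_attacks)
```

Under this reading, an image target such as mnist sees all seven art attacks, while a numpy target such as creditfraud would see only the three that carry the numpy tag.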
- Add an attack to the pipeline by executing use hop_skip_jump.
mnist> use hop_skip_jump
[+] Using hop_skip_jump c596a8f3
mnist>hop_skip_jump>
- Finally, start the attack with run.
mnist>hop_skip_jump> run
[+] Running hop_skip_jump on mnist
HopSkipJump: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:32<00:00, 32.74s/it]
[+] 1/1 succeeded
Sample Index Label (conf) Attack Label (conf) % Eucl. dist. Elapsed Time [sec] Queries (rate) Attack Input
---------------------------------------------------------------------------------------------------------------------------------
1. 0 5 (0.9990) 3 (0.6320) 0.02039% 32.8 24548 (749.4 counterfit/ta
query/sec) rgets/mnist/r
esults/mnist-
c596a8f3-fina
l-0-label-3.p
ng
mnist>hop_skip_jump>
- Alternatively, run multiple attacks with scan. Issue the back command to exit the active attack, then use scan:
mnist>hop_skip_jump> scan --iterations 2 --attack hop_skip_jump
[+] Running these attacks 2x each:
hop_skip_jump
[+] Using hop_skip_jump f36ede50
HopSkipJump: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:04<00:00, 4.07s/it]
[+] Using hop_skip_jump 196a6995
HopSkipJump: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 16.58it/s]
===============
SCAN SUMMARY
===============
Time[sec] Queries Best Score
Attack Name Total Runs Successes (%) (min/avg/max) (min/avg/max) (attack_id) Best Parameters
---------------------------------------------------------------------------------------------------------------------------
hop_skip_jump 2 1 (50.0%) 0.1/ 2.1/ 4.1 51/ 1746/ 3441 1.0 (f36ede50) init_eval=78
init_size=29
max_eval=3631
max_iter=15
norm=inf
targeted=false
sample_index=23298
target_class=6
mnist>hop_skip_jump>
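The min/avg/max columns in the scan summary are straightforward to reproduce. With only two runs, the minimum and maximum are the two runs' own values, so the query counts must have been 51 and 3441. The sketch below recomputes the Queries and Successes cells from those numbers; the summarize helper is hypothetical and is not how Counterfit necessarily computes them.

```python
from statistics import mean


def summarize(values):
    """Return (min, avg, max) for a list of per-run measurements."""
    return min(values), mean(values), max(values)


# Per-run query counts implied by the summary row above (two runs).
queries = [51, 3441]
successes, total_runs = 1, 2

q_min, q_avg, q_max = summarize(queries)
success_pct = 100.0 * successes / total_runs
print(f"{successes} ({success_pct:.1f}%)  {q_min}/{q_avg:.0f}/{q_max}")
```

The average of 51 and 3441 is 1746, and one success out of two runs is 50.0%, matching the summary row.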
- Save the results with save.
mnist>hop_skip_jump> save
[+] Successfully wrote counterfit/targets/mnist/results/mnist_9f806eec.json
mnist>hop_skip_jump>