
add security and privacy chapter initial draft #66

Merged: 43 commits into harvard-edge:main on Nov 30, 2023

Conversation

@eliasab16 (Contributor) commented Nov 21, 2023

Before submitting your Pull Request, please ensure that you have carefully reviewed and completed all items on this checklist.

  1. Content

    • The chapter content is complete and covers the topic in detail.
    • All technical terms are well-defined and explained.
    • Any code snippets or algorithms are well-documented and tested.
    • The chapter follows a logical flow and structure.
  2. References & Citations

    • All references are correctly listed at the end of the chapter.
    • In-text citations are used appropriately and match the references.
    • All figures, tables, and images have proper sources and are cited correctly.
  3. Quarto Website Rendering

    • The chapter has been locally built and tested using Quarto.
    • All images, figures, and tables render properly without any glitches.
    • All images have a source or they are properly linked to external sites.
    • Any interactive elements or widgets work as intended.
    • The chapter's formatting is consistent with the rest of the book.
  4. Grammar & Style

    • The chapter has been proofread for grammar and spelling errors.
    • The writing style is consistent with the rest of the book.
    • Any jargon is clearly explained or avoided where possible.
  5. Collaboration

    • All group members have reviewed and approved the chapter.
    • Any feedback from previous reviews or discussions has been addressed.
  6. Miscellaneous

    • All external links (if any) are working and lead to the intended destinations.
    • If datasets or external resources are used, they are properly credited and linked.
    • Any necessary permissions for reused content have been obtained.
  7. Final Steps

    • The chapter is pushed to the correct branch on the repository.
    • The Pull Request is made with a clear title and description.
    • The Pull Request includes any necessary labels or tags.
    • The Pull Request mentions any stakeholders or reviewers who should take a look.

@profvjreddi profvjreddi marked this pull request as draft November 21, 2023 02:54
@arbass22 (Contributor) left a comment:
Looks great overall, just a few comments. I stopped noting the combined-word fixes once I realized it was a systemic problem. I wonder if someone's editor handled line breaks incorrectly; someone might need to go through this more thoroughly to fix these. Not a deal breaker though, it's still very clear. Good job!


In this chapter, we will be talking about security and privacy together, so there are key terms that we need to be clear about.

- **Privacy:** For instance, when a fitness tracker collects data about your daily activities, privacy concerns revolve around who else can access this data---whether it's just the company, the user, or unwanted third parties as well.
Contributor:

Since the other three examples are about the same security scenario, it makes sense to have this one use the same example, or to make them all different examples. This one just feels a bit inconsistent with the rest!


### Mirai Botnet

The Mirai botnet involved the infection of networked devices such as digital cameras and DVR players [@antonakakis2017understanding]. In October 2016, the botnet was used to conduct one of the largest DDoS attacks ever, disrupting internet access across the United States. The attack was possible because many devices used default usernames and passwords, which were easily exploited by the Mirai malware to control the devices.
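The default-credential weakness Mirai exploited can be illustrated with a minimal audit sketch. The credential pairs and the `audit_device` helper below are hypothetical examples of common factory defaults, not Mirai's actual dictionary:

```python
# Hypothetical sketch: Mirai-style malware tried a short dictionary of
# factory-default username/password pairs, and any device still using
# one was trivially taken over. Auditing for that weakness is simple.
DEFAULT_CREDENTIALS = [
    ("admin", "admin"),
    ("root", "root"),
    ("admin", "1234"),
    ("user", "user"),
]

def audit_device(username: str, password: str) -> bool:
    """Return True if the device still uses a known factory default."""
    return (username, password) in DEFAULT_CREDENTIALS

# A DVR shipped with admin/admin and never reconfigured is flagged.
print(audit_device("admin", "admin"))         # True
print(audit_device("admin", "x9!k2-unique"))  # False
```

The same check, run across a fleet of devices before deployment, is the kind of hygiene step that would have blunted the attack.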
Contributor:

Maybe add a link to more context on what a DDoS attack is. Perhaps https://www.cloudflare.com/learning/ddos/what-is-a-ddos-attack/ ?


The methodology of model inversion typically involves the following steps:

- **Accessing Model Outputs:** The attacker queries the ML model withinput data and observes the outputs. This is often done through alegitimate interface, like a public API.
Contributor:

Suggested change
- **Accessing Model Outputs:** The attacker queries the ML model withinput data and observes the outputs. This is often done through alegitimate interface, like a public API.
- **Accessing Model Outputs:** The attacker queries the ML model with input data and observes the outputs. This is often done through a legitimate interface, such as a public API.
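The query step described in that bullet can be sketched as below. `target_model` is a local stand-in, since a real attacker would be calling a deployed model's public prediction API instead:

```python
import random

# Sketch of the "accessing model outputs" step of model inversion:
# the attacker sends inputs and records the model's outputs.
def target_model(x: float) -> float:
    # Stand-in for a deployed model's API: returns the confidence
    # score for a "positive" classification (a logistic curve here).
    return 1.0 / (1.0 + 2.718281828 ** (-x))

random.seed(0)  # reproducible query set
queries = [random.uniform(-3, 3) for _ in range(100)]
observations = [(x, target_model(x)) for x in queries]

# The attacker now holds input/output pairs to analyze offline.
print(len(observations))  # 100
```

Nothing in this step looks abnormal to the service; each query is an ordinary, legitimate-looking API call, which is what makes this phase hard to detect.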

Contributor:

Yea, this was me - oops. I did a big search and replace for some line breaks and messed it up and thought I had fixed it all. Thanks for catching these and noticing them. Will update.


In these attacks, the objective is to extract information about concrete metrics, such as the learned parameters of a network, the fine-tuned hyperparameters, and the model's internal layer architecture [@oliynyk2023know].

- **Learned Parameters:** adversaries aim to steal the learnedknowledge (weights and biases) of a model in order to replicateit. Parameter theft is generally used in conjunction with otherattacks, such as architecture theft, which lacks parameterknowledge.
Contributor:

Suggested change
- **Learned Parameters:** adversaries aim to steal the learnedknowledge (weights and biases) of a model in order to replicateit. Parameter theft is generally used in conjunction with otherattacks, such as architecture theft, which lacks parameterknowledge.
- **Learned Parameters:** adversaries aim to steal the learned knowledge (weights and biases) of a model in order to replicate it. Parameter theft is generally used in conjunction with other attacks, such as architecture theft, which lacks parameter knowledge.
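Parameter theft can be sketched as fitting a surrogate to the target's query responses. The `target` below is a toy one-dimensional linear model, not a real network; the point is only that its learned parameters are recoverable from input/output pairs alone:

```python
# Sketch of parameter theft by surrogate fitting: query the target,
# then fit a local model that replicates its learned mapping.
def target(x: float) -> float:
    return 2.0 * x + 1.0  # the "learned knowledge": weight 2.0, bias 1.0

xs = [i / 10 for i in range(-50, 51)]
ys = [target(x) for x in xs]

# Least-squares fit of a surrogate w*x + b to the observed pairs.
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
b = mean_y - w * mean_x
print(round(w, 3), round(b, 3))  # recovers ~2.0 and ~1.0
```

Against a real network the surrogate would itself be a network trained on the stolen input/output pairs, but the principle is the same: enough queries pin down the parameters without ever touching the original weights.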


Data poisoning can degrade the accuracy of a model, force it to make incorrect predictions, or cause it to behave unpredictably. In critical applications like healthcare, such alterations can lead to significant trust and safety issues.

There are four main categories of data poisoning [@oprea2022poisoning]:
Contributor:

Looks like six categories below?

Contributor:

Thank you good sir! 🙏 Fixed.


The researchers added synthetically generated toxic comments with slight misspellings and grammatical errors to the model's training data. This slowly corrupted the model, causing it to misclassify increasing numbers of severely toxic inputs as non-toxic over time.

After retraining on the poisoned data, the model's false negative rate increased from 1.4% to 27%, allowing extremely toxic comments to bypass detection. The researchers warned that this stealthy data poisoning could enable the spread of hate speech, harassment, and abuse if deployed against real moderation systems.
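The mechanism can be shown with a toy sketch. The scores and the threshold classifier below are synthetic, not the study's actual setup; they only illustrate how mislabeled training points shift a decision boundary so toxic inputs start passing as non-toxic:

```python
# Toy sketch of label-flipping data poisoning against a 1-D
# toxicity-score classifier with a midpoint decision threshold.
def fit_threshold(samples):
    toxic = [s for s, label in samples if label == "toxic"]
    clean = [s for s, label in samples if label == "non-toxic"]
    # Flag as "toxic" anything above the midpoint of the class means.
    return (sum(toxic) / len(toxic) + sum(clean) / len(clean)) / 2

clean_data = [(0.9, "toxic")] * 10 + [(0.1, "non-toxic")] * 10
t = fit_threshold(clean_data)  # 0.5: a comment scoring 0.6 is flagged

# Poison: toxic-looking samples (score 0.8) mislabeled as non-toxic.
poison = [(0.8, "non-toxic")] * 10
t_poisoned = fit_threshold(clean_data + poison)  # ~0.675

# A toxic comment scoring 0.6 evaded detection after poisoning.
print(t < 0.6 < t_poisoned)  # True
```

Each poisoned point looks individually plausible, which is exactly why this kind of slow corruption is hard to spot during retraining.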
Contributor:

Wow this is super cool/effective/scary!

Contributor:

Indeed, it is nuts! Some neat works out there.


- **Benefit:** The end result is a machine learning model that has learned from a wide range of patient data without any of that sensitive data having to be shared or leave its original location.
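The aggregation step at the heart of this can be sketched as simple federated averaging. The weight vectors below are synthetic toy values standing in for each hospital's locally trained model:

```python
# Minimal federated-averaging sketch: each client trains locally and
# shares only its model weights; the server averages them element-wise.
def federated_average(client_weights):
    n_clients = len(client_weights)
    n_params = len(client_weights[0])
    return [
        sum(w[i] for w in client_weights) / n_clients
        for i in range(n_params)
    ]

# Three hospitals report locally trained weight vectors;
# the raw patient records never leave each hospital.
updates = [[0.2, 1.0], [0.4, 0.8], [0.6, 1.2]]
global_model = federated_average(updates)
print([round(w, 3) for w in global_model])  # [0.4, 1.0]
```

The averaged global model is then sent back to the clients for the next round of local training, so only parameters ever cross the network.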

#### Trade-offs
Contributor:

Might be worth calling out that FL is still vulnerable to some types of attacks like data poisoning?

Contributor:

Good point! Thanks @arbass22 I've added it. Will merge.

@mpstewart1 mpstewart1 marked this pull request as ready for review November 28, 2023 22:51
privacy_security.qmd: 10 outdated review threads, all resolved
@mpstewart1 mpstewart1 merged commit 9df6390 into harvard-edge:main Nov 30, 2023
2 checks passed
5 participants