papers in adversarial machine learning
Adversarial training: attacking your own model as a defense
Posted by Dillon Niederhut
A critical factor in AI safety is robustness to unusual inputs. Without it, models (like ChatGPT) can be tricked into producing dangerous outputs. One way to build this robustness is to run adversarial attacks inside the model's own training loop. As a bonus, this also helps align a model's features with human expectations.
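As a rough illustration, here is what one step of adversarial training might look like in PyTorch, using the fast gradient sign method (FGSM) to craft the attack. The function name, `epsilon`, and the surrounding objects are stand-ins for this sketch, not code from the post:

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, x, y, optimizer, epsilon=0.03):
    # Craft an FGSM adversarial example from the current batch
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    x_adv = (x_adv + epsilon * grad.sign()).clamp(0, 1).detach()

    # Train on the perturbed batch so the model learns to resist the attack
    optimizer.zero_grad()
    adv_loss = F.cross_entropy(model(x_adv), y)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```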
Anti-adversarial examples: what to do if you want to be seen?
Posted by Dillon Niederhut
Most uses of adversarial machine learning involve attacking or bypassing a computer vision system that someone else has designed. However, you can use the same tools to generate "unadversarial" examples that help machine learning models perform much better when deployed in real life.
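The trick is simply to flip the attacker's objective: instead of ascending the loss on the true label, you descend it. A minimal sketch, assuming a classifier `model` and a batch of inputs with their true `labels` (all placeholder names):

```python
import torch
import torch.nn.functional as F

def make_unadversarial(model, x, labels, steps=100, lr=0.01):
    # Additive perturbation we optimize; same shape as the input batch
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        logits = model((x + delta).clamp(0, 1))
        # Descend the loss on the *true* labels -- the opposite of an attack
        loss = F.cross_entropy(logits, labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (x + delta).clamp(0, 1).detach()
```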
I asked galactica to write a blog post and the results weren't great
Posted by Dillon Niederhut
A few weeks ago, Meta AI announced Galactica, a large language model (LLM) built for scientific work. Just for fun, I asked it to write a blog post about adversarial machine learning. Galactica doesn't get anything obviously wrong, but it repeats itself a lot, is fairly light on details, and makes tautological arguments.
Adversarial patch attacks on self-driving cars
Posted by Dillon Niederhut
Self-driving cars rely on vision for safety-critical information like traffic rules, which makes them susceptible to adversarial machine learning attacks. A few carefully placed stickers can make a stop sign invisible to an autonomous vehicle, and an adversarial t-shirt can make a person look like a stop sign.
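A sticker like this is usually trained as an adversarial patch: a small image optimized so that, pasted anywhere on the scene, it pushes the model away from the correct label. Here is a minimal sketch in that spirit, where the random placement loop stands in for expectation over transformation, and `model`, `images`, and `true_labels` are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def train_patch(model, images, true_labels, patch_size=32, steps=200, lr=0.05):
    patch = torch.rand(3, patch_size, patch_size, requires_grad=True)
    opt = torch.optim.Adam([patch], lr=lr)
    for _ in range(steps):
        x = images.clone()
        # Paste the patch at a random location on every image in the batch
        i = torch.randint(0, x.shape[-2] - patch_size, (1,)).item()
        j = torch.randint(0, x.shape[-1] - patch_size, (1,)).item()
        x[:, :, i:i + patch_size, j:j + patch_size] = patch.clamp(0, 1)
        # Ascend the loss on the correct label, e.g. "stop sign"
        loss = -F.cross_entropy(model(x), true_labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return patch.clamp(0, 1).detach()
```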
Faceoff: using stickers to fool Face ID
Posted by Dillon Niederhut
What if breaking into an office was as easy as wearing a special pair of glasses, or putting a sticker on your forehead? It can be, if you make the right adversarial patch. Learn how to use adversarial machine learning to hide from face recognition systems, or convince them that you are someone else.
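Against an embedding-based recognizer, hiding ("dodging") amounts to optimizing the sticker region so your embedding drifts away from the enrolled one. A minimal sketch, assuming a face-embedding network `embed_model`, a stored `enrolled` embedding, and a binary `mask` marking where the sticker sits (all hypothetical names):

```python
import torch
import torch.nn.functional as F

def dodge_face_id(embed_model, face, enrolled, mask, steps=100, lr=0.01):
    delta = torch.zeros_like(face, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        x = (face + delta * mask).clamp(0, 1)  # perturb only the sticker area
        emb = embed_model(x)
        # Drive similarity to the stored identity down (dodging);
        # flip the sign to impersonate a chosen target instead
        loss = F.cosine_similarity(emb, enrolled, dim=-1).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (face + delta * mask).clamp(0, 1).detach()
```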