Abstract
Wasim Arif
Despite their impressive performance across computer vision and natural language processing tasks, deep learning models are highly susceptible to adversarial examples—subtly modified inputs designed to mislead model predictions. This paper offers a comprehensive review of adversarial attack methodologies and corresponding defense mechanisms, as studied up to 2018. We classify attacks into white-box (e.g., FGSM, JSMA, Carlini & Wagner) and black-box categories, evaluating their success across common datasets such as MNIST, CIFAR-10, and ImageNet. Key attack techniques leverage gradients to craft minimal input perturbations that are imperceptible to humans but cause misclassification with high confidence. The review also examines transferability, where adversarial examples generated for one model fool others. On the defense side, strategies such as adversarial training, defensive distillation, and input preprocessing are explored. However, most defenses remain vulnerable under adaptive or stronger attacks, underscoring the need for more robust and principled defense strategies.
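As a concrete illustration of the gradient-based perturbation idea summarized above, the following is a minimal FGSM sketch in PyTorch. The names model, x, y, and epsilon are illustrative assumptions (a differentiable classifier, an input batch scaled to [0, 1], integer class labels, and a perturbation budget), not details taken from the reviewed works.

import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon):
    """One-step Fast Gradient Sign Method.

    Perturbs each input by epsilon in the direction of the sign of the
    loss gradient with respect to the input, a small change that is often
    visually imperceptible yet flips the model's prediction.
    """
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)    # loss the attacker wants to increase
    loss.backward()                        # gradient of the loss w.r.t. the input
    x_adv = x + epsilon * x.grad.sign()    # single signed-gradient step
    return x_adv.clamp(0.0, 1.0).detach()  # keep pixel values in a valid range

Adversarial training, one of the defenses discussed in the review, can be approximated by mixing such x_adv batches into the training loss alongside clean examples.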