Over the last few years, deep neural networks have become immensely popular across a wide variety of applications and industries. Deep learning has proven so effective at tasks such as image classification and object detection that these algorithms have found their way into many safety-critical applications as well. However, because they were adopted so quickly, many developers did not fully consider the security implications of this new class of algorithms. In response, security researchers have established a field called adversarial learning, which attempts to find and document vulnerabilities in machine learning algorithms. Attacks on machine learning algorithms, typically referred to as adversarial examples, are now known to work against some machine learning algorithms in some situations (e.g., in simulation only). To understand and mitigate the true risk of these attacks, however, researchers are now investigating whether adversarial examples can be made physically realizable.
With this research project, the researchers aimed to design a robust, reliable testing framework for creating physically realizable adversarial examples (real-world deep-learning attacks) that can easily adapt to a wide variety of deep learning algorithms. The first step in creating this framework was to incorporate existing adversarial learning vulnerabilities (misclassification). The researchers then improved upon the state of the art by developing and testing a novel approach to spoofing object detection neural networks. While typical object detection spoofing manipulates the classification task, SwRI also researched and tested spoofing object localization by creating patches that attempt to shift detected regions by more than 20% of the original region height or width. SwRI further advanced the state of the art in security testing by creating a better training methodology for these physically realizable adversarial examples. Conventional adversarial example training methods use affine transformations to make the adversarial example scale- and rotation-invariant. SwRI improved upon this by creating “perception-invariant” adversarial examples (a term coined at SwRI), which were trained using full homography transformations of the adversarial example. The advantage of this approach is that, when testing in the physical world, the adversarial example does not need to be perfectly parallel to the camera system; it can be rotated to a certain degree in all three dimensions without compromising its effectiveness. This method allowed SwRI to accurately test how a malicious party might exploit an image processing system in the physical world.
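The perception-invariant training idea above can be sketched in a few lines: instead of sampling only affine (rotation/scale) transforms of the patch at each training step, sample a full 3x3 homography whose projective terms simulate out-of-plane tilt. The sketch below is a minimal illustration under assumed conventions (unit-square patch coordinates, nearest-neighbour inverse warping, illustrative parameter ranges); the function names are hypothetical and this is not SwRI's actual implementation.

```python
import numpy as np

def random_homography(max_tilt=0.15, rng=None):
    """Sample a random 3x3 homography acting on a unit-square patch.

    The top-left 2x2 block is the conventional affine part (rotation
    and scale); the last row's projective terms add out-of-plane tilt,
    which is the "perception-invariant" extension.
    """
    rng = np.random.default_rng() if rng is None else rng
    H = np.eye(3)
    theta = rng.uniform(-0.3, 0.3)          # small in-plane rotation
    s = rng.uniform(0.8, 1.2)               # small scale change
    c, sn = np.cos(theta), np.sin(theta)
    H[:2, :2] = s * np.array([[c, -sn], [sn, c]])
    H[2, :2] = rng.uniform(-max_tilt, max_tilt, size=2)  # projective tilt
    return H

def warp_patch(patch, H, out_size):
    """Warp a (h, w) patch with homography H via inverse mapping and
    nearest-neighbour sampling; pixels mapping outside are zeroed."""
    h, w = patch.shape
    oh, ow = out_size
    ys, xs = np.mgrid[0:oh, 0:ow]
    # Normalize output coordinates so H acts on the unit square.
    pts = np.stack([xs / (ow - 1), ys / (oh - 1), np.ones_like(xs, float)])
    src = np.linalg.inv(H) @ pts.reshape(3, -1)
    src = src[:2] / src[2]                  # perspective divide
    sx = np.round(src[0] * (w - 1)).astype(int)
    sy = np.round(src[1] * (h - 1)).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros((oh, ow), dtype=patch.dtype)
    out.reshape(-1)[valid] = patch[sy[valid], sx[valid]]
    return out
```

In a training loop, each optimization step would warp the current patch with a freshly sampled homography before compositing it into the scene, so that the optimized patch stays effective across viewing angles rather than only when parallel to the camera.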
SwRI successfully created an adversarial learning framework that can quickly adapt to new deep learning algorithms and test their vulnerability to both misclassification and mislocalization attacks. The researchers found that object localization spoofing does not produce a strong enough offset in the localization vector to effectively “displace” the detection. Instead, when the researchers introduced an offset in either the x or y position of more than 20% of the width or height of the original region, that region disappeared entirely (thus creating another type of “disappearing” attack). Another discovery was that misclassification attacks (e.g., converting a truck into a bicycle) produced the largest shift and scale change in the region’s shape, because each class is biased toward a particular set of region shapes. No research into the validity or effectiveness of such an attack existed prior to this project. Finally, the researchers created an improved method for producing more physically realizable adversarial examples, while also reducing the footprint (size) of the adversarial example relative to the current state of the art.
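One way to express the mislocalization objective tested above is a hinge-style loss on the predicted box centre: the loss pushes the detector's output away from the original position until a target displacement (e.g., 20% of the box extent) is reached, then goes to zero. The sketch below is an illustrative assumption, using a hypothetical function name and a (cx, cy, w, h) box convention; it is not SwRI's actual loss.

```python
def mislocalization_loss(pred_box, orig_box, shift_frac=0.2, axis=0):
    """Hinge loss driving the predicted box centre away from the
    original one along `axis` (0 = x, 1 = y) by `shift_frac` of the
    box extent. Boxes are (cx, cy, w, h) tuples.

    Returns 0.0 once the target offset is achieved, so the attack
    optimization stops pushing past the required displacement.
    """
    extent = orig_box[2 + axis]                      # w for x, h for y
    target = shift_frac * extent                     # required displacement
    achieved = abs(pred_box[axis] - orig_box[axis])  # current displacement
    return max(0.0, target - achieved)
```

In practice this term would be minimized (alongside any classification-spoofing terms) with respect to the patch pixels; the project's finding was that pushing `achieved` past roughly 20% of the extent tends to make the region vanish rather than shift.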