How we exploit networks with ENM attacks
In Deep Learning, to learn means to understand that some objects are similar in certain aspects/features and to use those similarities for classification.
It also means discarding irrelevant information that does not contribute to understanding those similarities.
Discarding information is achieved by progressively compressing the input data into a space of lower dimensionality using several Neural Network layers.
- A linear neuron would achieve this, by mapping inputs onto outputs with fewer features, i.e. a lower-dimension Hyperplane.
- During training, it learns to perform useful projections of the training data.
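To make the projection idea concrete, here is a minimal sketch of such a linear mapping, assuming a PyTorch setup; the layer sizes (1024 inputs, 32 outputs) and the batch size are arbitrary choices for illustration, not taken from any particular model.

```python
import torch
import torch.nn as nn

# One linear layer projecting 1024-dimensional inputs onto a
# 32-dimensional space: each output feature is a learned linear
# combination of the input features, i.e. a projection onto a
# lower-dimension hyperplane. Sizes are arbitrary.
projection = nn.Linear(in_features=1024, out_features=32)

x = torch.randn(8, 1024)   # a batch of 8 high-dimensional inputs
a = projection(x)          # compressed representation
print(a.shape)             # torch.Size([8, 32])
```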
Neural Networks then aim to map similar objects close together in this low-dimensional space.
- For instance, all pictures of cats will produce projection vectors a with roughly similar values. The same goes for pictures of dogs.
- Cat vectors and dog vectors will, however, be very different!
- The final layer then implements a binary decision on vector a (see the sketch after this list).
- Due to compression, there are many directions in a high-dimensional feature space along which a small step might lead to big changes in predictions.
- In zones with low training-data density, the decision boundaries can lie very close together, because they were never properly learned from training samples. Small changes to an input can then lead to big changes in the predictions produced by the trained Neural Network.
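Putting the pieces together, the following is a minimal sketch (again assuming PyTorch) of a network with this shape: an encoder that compresses the input into the projection vector a, followed by a final binary decision layer. The class name CatDogNet, the layer sizes, and the two-class reading of the logit are purely illustrative assumptions.

```python
import torch
import torch.nn as nn

class CatDogNet(nn.Module):
    """Illustrative binary classifier: the encoder progressively compresses
    the input into a low-dimensional projection vector a, and the final
    layer turns a into a single cat-vs-dog score. Sizes are arbitrary."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(1024, 128), nn.ReLU(),
            nn.Linear(128, 32), nn.ReLU(),   # progressive compression
        )
        self.decision = nn.Linear(32, 1)      # binary decision on vector a

    def forward(self, x):
        a = self.encoder(x)                   # similar inputs -> nearby a vectors
        return self.decision(a)               # logit: positive -> "cat", negative -> "dog"

model = CatDogNet()
logit = model(torch.randn(1, 1024))
print(torch.sigmoid(logit))                   # probability assigned to "cat"
```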
We exploit this to generate attack samples.
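As one concrete way to take such a small step, the sketch below perturbs an input along the sign of the loss gradient (the classic fast-gradient-sign step). It is only an illustration of how a tiny, targeted change can cross a nearby decision boundary; it is not claimed to be the exact ENM procedure, and it reuses the illustrative model and names from the sketch above.

```python
import torch
import torch.nn.functional as F

def gradient_sign_step(model, x, label, epsilon=0.01):
    """Move x a small distance (epsilon) in the direction that most
    increases the loss: the sign of the gradient of the loss with
    respect to the input. Near a poorly learned decision boundary,
    this tiny step is often enough to flip the prediction."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.binary_cross_entropy_with_logits(model(x), label)
    loss.backward()
    return (x + epsilon * x.grad.sign()).detach()

# Usage with the illustrative CatDogNet above (all names are assumptions):
# x = torch.randn(1, 1024)
# x_adv = gradient_sign_step(model, x, label=torch.ones(1, 1))
# model(x) and model(x_adv) can disagree despite the tiny perturbation.
```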