Highway Layer
A variant of the fully connected Linear layer with an additional gated residual connection.
It computes a standard mapping (typically followed by a non-linear activation function) and blends it with the unchanged input through a learned sigmoid gate.
Examples:
# Layers used by the highway connection (defined in __init__)
self.highway = nn.Linear(128, 128)
self.transform = nn.Linear(128, 128)
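# Forward pass through one highway layer
h = self.highway(x)  # candidate mapping H(x)
t_gate = torch.sigmoid(self.transform(x))  # transform gate T(x), each value in (0, 1)
c_gate = 1 - t_gate  # carry gate, 1 - T(x)
x_ = h * t_gate + x * c_gate  # gated mix of H(x) and the input x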
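
To make the gating arithmetic in the example above concrete, here is a minimal pure-Python sketch of the same element-wise formula, with no learned weights involved. The helper name `highway_combine`, the sample vectors, and the gate logits are hypothetical and chosen only for illustration; in the real layer, `h` and the gate logits would come from `self.highway(x)` and `self.transform(x)`.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def highway_combine(x, h, gate_logits):
    """Element-wise y = h * t + x * (1 - t), where t = sigmoid(gate_logit).

    Hypothetical helper: mirrors the last three lines of the example above,
    but on plain Python lists instead of tensors.
    """
    out = []
    for x_i, h_i, g_i in zip(x, h, gate_logits):
        t = sigmoid(g_i)  # transform gate for this element
        out.append(h_i * t + x_i * (1.0 - t))
    return out


x = [1.0, -2.0, 0.5]  # layer input (made-up values)
h = [0.0, 3.0, -1.0]  # stand-in for the highway branch output (made-up values)

# Very negative logits: gate is close to 0, so the input is carried through.
carried = highway_combine(x, h, [-20.0] * 3)

# Very positive logits: gate is close to 1, so the transformed branch dominates.
transformed = highway_combine(x, h, [20.0] * 3)

# Zero logits: gate is exactly 0.5, so the output is the midpoint of x and h.
blended = highway_combine(x, h, [0.0] * 3)
```

The two extremes show why the connection behaves like a residual path: when the gate closes, the layer reduces to the identity. A common practice, though not something this example depends on, is to initialize the bias of the gate's Linear layer to a negative value so that the layer starts out mostly carrying its input.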