
What is Implicit Deep Learning?

ODSC - Open Data Science
Sep 17, 2019



Prediction rules in deep learning are based on a forward, recursive computation through several layers. Implicit deep learning rules go much further: they rely on the solution of an implicit (or “fixed-point”) equation that has to be solved numerically in order to make the prediction. For a given input vector u, the predicted vector y is of the form


y = Cx + Du,  x = ϕ(Ax + Bu),

where A, B, C, D are matrices containing the model weights, and ϕ is a given (nonlinear) activation function, such as the ReLU. Here, the so-called “state” n-vector x, which contains the hidden features of the model, is not expressed explicitly; rather, it is implicitly defined via the “fixed-point” (or equilibrium) equation x = ϕ(Ax + Bu).
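
To make this concrete, here is a minimal sketch of how such a prediction could be computed. Everything in it (the NumPy implementation, the dimensions, the random weights) is hypothetical rather than code from the authors: it solves the state equation x = ϕ(Ax + Bu) by plain fixed-point (Picard) iteration, assuming the iteration converges, which for the ReLU holds for instance when the infinity-norm of A is below 1.

```python
import numpy as np

def relu(z):
    # Componentwise ReLU, a 1-Lipschitz activation.
    return np.maximum(z, 0.0)

def implicit_predict(A, B, C, D, u, phi=relu, max_iters=100, tol=1e-8):
    """Solve x = phi(A x + B u) by fixed-point (Picard) iteration,
    then return the prediction y = C x + D u.
    Assumes the iteration converges (well-posedness)."""
    x = np.zeros(A.shape[0])
    for _ in range(max_iters):
        x_next = phi(A @ x + B @ u)
        if np.linalg.norm(x_next - x) < tol:
            x = x_next
            break
        x = x_next
    return C @ x + D @ u

# Illustrative dimensions and random weights (hypothetical).
rng = np.random.default_rng(0)
n, p, q = 8, 4, 3                       # state, input, output sizes
A = rng.standard_normal((n, n))
A *= 0.9 / np.linalg.norm(A, np.inf)    # ||A||_inf < 1: contraction for ReLU
B = rng.standard_normal((n, p))
C = rng.standard_normal((q, n))
D = rng.standard_normal((q, p))

u = rng.standard_normal(p)
y = implicit_predict(A, B, C, D, u)
print(y)
```

In practice, well-posedness (existence and uniqueness of the fixed point) has to be guaranteed, for instance by constraining the norm of A during training, and more sophisticated equation solvers than plain iteration can be used.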

At first glance, the above models seem very specific. Perhaps surprisingly, they include as special cases most known neural network architectures, including standard feedforward networks, CNNs, RNNs, and many more. We can recover such architectures with a proper definition of the activation ϕ and by imposing adequate linear structure on the model matrices A, B, C, D. For example, constraining the matrix A to be strictly block upper-triangular corresponds to the class of feedforward networks, as the sketch below illustrates.
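
As an illustration of that last point, here is another hypothetical NumPy sketch. It encodes a plain two-layer feedforward network, h1 = ϕ(W0 u), h2 = ϕ(W1 h1), y = W2 h2, in implicit form: stacking the state as x = (h2; h1) makes A strictly block upper-triangular, and in fact nilpotent, so the fixed-point iteration terminates after one pass per block and reproduces the ordinary forward pass exactly.

```python
import numpy as np

relu = lambda z: np.maximum(z, 0.0)

# Hypothetical layer weights for h1 = relu(W0 u), h2 = relu(W1 h1), y = W2 h2.
rng = np.random.default_rng(1)
p, n1, n2, q = 4, 6, 5, 3
W0 = rng.standard_normal((n1, p))
W1 = rng.standard_normal((n2, n1))
W2 = rng.standard_normal((q, n2))

# Implicit-form matrices with state x = (h2; h1):
# A is strictly block upper-triangular (nilpotent).
A = np.block([[np.zeros((n2, n2)), W1],
              [np.zeros((n1, n2)), np.zeros((n1, n1))]])
B = np.vstack([np.zeros((n2, p)), W0])
C = np.hstack([W2, np.zeros((q, n1))])
D = np.zeros((q, p))

u = rng.standard_normal(p)

# Ordinary forward pass.
y_ff = W2 @ relu(W1 @ relu(W0 @ u))

# Fixed-point iteration: two passes suffice because A^2 = 0.
x = np.zeros(n2 + n1)
for _ in range(2):
    x = relu(A @ x + B @ u)
y_implicit = C @ x + D @ u

print(np.allclose(y_ff, y_implicit))  # True
```

Nilpotency is exactly what makes feedforward networks “explicit”: the fixed-point equation can be solved by forward substitution, block by block, with no iteration to convergence needed.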
