Topics and papers I find interesting in Machine Learning

Updated 2019-04-12

Paper Collection

NLP benchmark

Transformer XL


This is a great blog post on the underlying concept and should help with implementing it in trudle. It would also be good to see how AWD-LSTM compares with attentive AWD-LSTMs.

RNN training methods and corresponding computational complexity

Short term memory for RNNs

This paper by Jimmy Ba and Hinton revives an old idea: using a Hebbian-like learning rule on second-order TPR representations to augment a traditional RNN with temporary, short-term memory.
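As a rough illustration of the mechanism, here is a minimal NumPy sketch of one fast-weights step: the memory matrix decays, receives a Hebbian outer-product update, and is then consulted in a short inner loop. The names (fast_weights_step, decay, eta, inner_steps), the ReLU nonlinearity, the parameter-free layer norm and all numeric values are my own assumptions for illustration, not the authors' code.

```python
import numpy as np

def layer_norm(z, eps=1e-5):
    # Normalize a vector to zero mean and unit variance (no learned gain or bias).
    return (z - z.mean()) / (z.std() + eps)

def fast_weights_step(h, x, A, W, C, decay=0.95, eta=0.5, inner_steps=1):
    """One time step of an RNN augmented with a fast-weight matrix A (illustrative)."""
    # Hebbian-like update: decay the old memory, then add the outer product of h.
    A = decay * A + eta * np.outer(h, h)
    # Slow-weight contribution; it stays fixed during the inner loop.
    boundary = W @ h + C @ x
    h_new = np.maximum(boundary, 0.0)  # preliminary next state (ReLU)
    for _ in range(inner_steps):
        # Each inner step lets the fast memory pull h_new towards recent states.
        h_new = np.maximum(layer_norm(boundary + A @ h_new), 0.0)
    return h_new, A

# Toy run with assumed sizes and random inputs.
rng = np.random.default_rng(0)
n_hidden, n_in = 8, 4
W = 0.05 * np.eye(n_hidden)
C = rng.normal(scale=0.1, size=(n_hidden, n_in))
h = np.zeros(n_hidden)
A = np.zeros((n_hidden, n_hidden))
for x in rng.normal(size=(5, n_in)):
    h, A = fast_weights_step(h, x, A, W, C)
```

Here decay is a fixed constant and inner_steps a fixed integer; both are plain hyperparameters in this sketch.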

My two cents

  • Can this be combined with Neural Ordinary Differential Equations to automatically determine the length of the inner loop?
  • Can the decay parameter be learned, rather than using a constant exponential decay rate?
  • It will be interesting to see how a large-scale ULMFiT-like language model trained using truncated BPTT will do. If it works, it should be faster (and real-time) during inference compared to attentional variants, including Transformer XL.

Symbolic Neural Reasoning using TPR

This follows on from Paul Smolensky's work on symbolic reasoning using TPRs and gives an end-to-end learning framework using fast-weight updates of third-order TPRs. A very interesting paper, with scope to build on it further.
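To make the third-order binding concrete, here is a toy NumPy sketch that stores (entity, relation, entity) facts as a sum of tensor products and retrieves one by contraction. The vocabulary, the one-hot vectors and the helper names (bind, query, decode) are hypothetical; the paper learns its representations end-to-end, whereas fixed orthonormal vectors are used here only so that unbinding is exact.

```python
import numpy as np

# Hypothetical toy vocabulary; one-hot (orthonormal) vectors make unbinding exact.
entities = dict(zip(["mary", "kitchen", "john", "garden"], np.eye(4)))
relations = dict(zip(["is_in", "likes"], np.eye(2)))

def bind(e1, r, e2):
    # Third-order tensor product: entity (x) relation (x) entity.
    return np.einsum("i,j,k->ijk", entities[e1], relations[r], entities[e2])

def query(T, e1, r):
    # Contract the memory with an entity and a relation to recover the bound entity.
    return np.einsum("ijk,i,j->k", T, entities[e1], relations[r])

def decode(vec):
    # Return the entity whose vector best matches the retrieved one.
    return max(entities, key=lambda name: float(entities[name] @ vec))

# The memory is a sum of bound facts, written additively like a fast-weight update.
T = bind("mary", "is_in", "kitchen") + bind("john", "is_in", "garden")
answer = decode(query(T, "mary", "is_in"))  # "kitchen"
```

With non-orthogonal or learned vectors the retrieval becomes approximate, since the stored facts start to interfere with each other.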

My two cents

  • The authors haven't integrated it into a full-fledged LSTM (or were unable to make it work). The idea is novel and needs further investigation.

Graph Neural Networks

Neural and Symbolic Reasoning

My two cents

  • Theorem proving and theorem validation are the epitome of symbolic reasoning. It would be interesting to see how the fast-weights and TPR variants above do on these tasks.

Logic Tensor Network

The paper doesn't do justice to the theory the authors develop in the tutorial. They lay the foundations of first-order (fuzzy) logic in tensors. The tutorial is also a good introduction to fuzzy logic.
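For a feel of what fuzzy first-order logic looks like numerically, here is a small pure-Python sketch. It assumes the product t-norm family for the connectives and a plain mean as a soft universal quantifier; these are common choices but not necessarily the ones used in the paper or tutorial, and the truth values and predicate names are made up for illustration.

```python
def f_and(a, b):
    # Product t-norm: fuzzy conjunction.
    return a * b

def f_or(a, b):
    # Probabilistic sum: the t-conorm dual to the product t-norm.
    return a + b - a * b

def f_not(a):
    # Standard fuzzy negation.
    return 1.0 - a

def f_implies(a, b):
    # Reichenbach implication: 1 - a + a * b.
    return 1.0 - a + a * b

def forall(truths):
    # Soft universal quantifier: here simply the mean truth value (an assumption).
    truths = list(truths)
    return sum(truths) / len(truths)

# Hypothetical truth degrees that grounded predicates might assign to two individuals.
smokes = {"a": 0.9, "b": 0.2}
cancer = {"a": 0.8, "b": 0.1}

# Degree to which "forall x: smokes(x) -> cancer(x)" holds.
rule_truth = forall(f_implies(smokes[x], cancer[x]) for x in smokes)
```

On crisp inputs (0 or 1) these connectives reduce to ordinary Boolean logic. In a learned setting the predicate outputs would come from a network and rule_truth would be maximised as part of the training objective.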

Model confidence recalibration

I stumbled upon this blog post when I noticed that after training my model for longer (and consequently obtaining higher accuracy), the confidence thresholding didn't work as well. This also hijacked the highly confident beams during beam search. The post is based on a nicely written article, link below.
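As a sketch of what recalibration can look like, the snippet below assumes temperature scaling, one common post-hoc method: a single scalar T is fitted on held-out logits by minimising negative log-likelihood, and logits are divided by T before the softmax. Whether the linked post uses exactly this method is an assumption on my part, and the validation logits, labels and grid of candidate temperatures are invented toy values.

```python
import math

def softmax(logits, temperature=1.0):
    # Numerically stable softmax over temperature-scaled logits.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def nll(logit_rows, labels, temperature):
    # Average negative log-likelihood of the true labels at a given temperature.
    losses = [-math.log(softmax(row, temperature)[y])
              for row, y in zip(logit_rows, labels)]
    return sum(losses) / len(losses)

def fit_temperature(logit_rows, labels, candidates):
    # Simple grid search; a gradient-based optimiser would also work.
    return min(candidates, key=lambda t: nll(logit_rows, labels, t))

# Toy held-out set: the model is very confident but only right 3 times out of 4.
val_logits = [[4.0, 0.0], [4.0, 0.0], [4.0, 0.0], [0.0, 4.0]]
val_labels = [0, 0, 1, 1]

T = fit_temperature(val_logits, val_labels, [0.5 + 0.25 * i for i in range(19)])
before = max(softmax(val_logits[0]))     # raw confidence
after = max(softmax(val_logits[0], T))   # recalibrated confidence
```

Dividing by T does not change the argmax, so accuracy is untouched; only the confidence values used for thresholding (or for scoring beams) move closer to the observed accuracy.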

My two cents

  • This has not been tried in NLP settings. I am in the process of implementing it now as part of my work at True AI. I shall post the results afterwards.

From Machine Learning to Machine Reasoning

An old classic that discusses some plausible requirements and neurological evidence for the basis of intelligence. - Paper

Differentiable Inductive Logic Programming

Logical Rule Induction and Theory Learning Using Neural Theorem Proving

Lifted Rule Injection for Relation Embeddings

End-to-end differentiable proving

Programming with a Differentiable Forth Interpreter

Differentiable Programs with Neural Libraries (Terpret)

Differentiable Functional Programming (scala)

Differentiable Functional Program Interpreters

A curated list on code induction and synthesis

Adventures in Neuro Symbolic Machine Learning

Tensor Product Programming Language

Weight Decay, BatchNorm and connection with learning rate