Brian DuSell
Lunch at 12:30pm, talk at 1pm, in 148 Fitzpatrick
Title: Stack Nondeterminism in Neural Networks
Abstract: Learning hierarchical structure in sequential data – from simple algorithmic patterns to natural language – remains a challenging problem for sequential neural networks. Past work has shown that recurrent neural networks (RNNs) struggle to generalize to held-out syntactic patterns without supervision or some inductive bias. To remedy this, many papers have explored augmenting RNNs with various differentiable stacks, by analogy with the relationship between finite automata and pushdown automata. However, these techniques have all modeled deterministic stacks, which are, in theory, insufficient to model the syntactic ambiguity commonly found in natural language.
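For intuition, here is a minimal sketch (not the speaker's implementation) of the kind of differentiable stack this line of work builds on, in the style of superposition stacks such as Joulin and Mikolov (2015): rather than committing to a discrete push or pop, the network takes a weighted mixture of all possible next stacks, so the whole update is differentiable. The function and argument names are illustrative.

```python
import torch

def stack_step(stack, push_vec, actions):
    """One update of a superposition ("soft") differentiable stack.

    stack:    (depth, dim) current stack contents, top at index 0
    push_vec: (dim,) candidate vector to push
    actions:  (3,) softmax weights over [push, pop, no-op]
    Returns a weighted mixture of the three possible next stacks,
    which keeps the update differentiable end to end.
    """
    p_push, p_pop, p_noop = actions
    # Push: shift everything down one cell and insert push_vec on top.
    pushed = torch.cat([push_vec.unsqueeze(0), stack[:-1]])
    # Pop: shift everything up one cell and pad the bottom with zeros.
    popped = torch.cat([stack[1:], torch.zeros_like(stack[:1])])
    return p_push * pushed + p_pop * popped + p_noop * stack
```

In such architectures, a controller RNN typically produces `actions` with a softmax and `push_vec` from its hidden state, then reads the top cell `stack[0]` back as extra input at the next time step.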
In this talk, I will discuss my work with David Chiang on the Nondeterministic Stack RNN (NS-RNN), a novel stack RNN that explicitly models nondeterminism and achieves better performance on formal languages than previously proposed stack RNNs. I will present evidence that nondeterminism improves both trainability and expressive power, and discuss two recent modifications that improve them further. I will also discuss ongoing work aimed at improving performance on natural language.
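To illustrate what nondeterminism adds (again a toy sketch, not the NS-RNN itself): a nondeterministic stack tracks a weighted distribution over many discrete stack configurations at once, instead of a single soft stack. Naively, that looks like the enumeration below, whose number of configurations can grow exponentially with sequence length; the NS-RNN instead simulates a weighted pushdown automaton with a dynamic program (based on Lang's algorithm) so the computation stays polynomial. The transition format here is hypothetical.

```python
from collections import defaultdict

def nondet_step(configs, transitions):
    """One step of a naive nondeterministic stack simulation.

    configs:     dict mapping a stack (tuple of symbols, top first) to weight
    transitions: list of (action, symbol, weight) triples, where action is
                 'push', 'pop', or 'noop'
    Every configuration takes every applicable transition, so the
    distribution over stacks branches; weights multiply along each path
    and merge when different paths reach the same stack.
    """
    new_configs = defaultdict(float)
    for stack, w in configs.items():
        for action, sym, tw in transitions:
            if action == 'push':
                new_configs[(sym,) + stack] += w * tw
            elif action == 'pop' and stack:
                new_configs[stack[1:]] += w * tw
            elif action == 'noop':
                new_configs[stack] += w * tw
    return dict(new_configs)
```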
Bio: I am a PhD candidate in David Chiang’s natural language processing research group at the University of Notre Dame. My primary research focus is on incorporating simulations of nondeterministic pushdown automata into neural networks to improve machine learning on human languages. My interests lie generally in syntax, machine translation, and neural networks.