Tree Memory Networks for Sequence Processing

Frederik Diehl, Alois Knoll

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review



Long-term dependencies are difficult to learn with Recurrent Neural Networks because of the vanishing and exploding gradient problems: the hidden transform is applied a number of times that grows linearly with the sequence length. We introduce a new layer type, the Tree Memory Unit, whose weight application scales logarithmically with the sequence length. We evaluate it on two pathologically hard memory benchmarks and two datasets. On the three tasks that require long-term dependencies, it strongly outperforms Long Short-Term Memory baselines; however, it performs worse on sequences with few long-term dependencies. We believe our approach can enable more efficient sequence learning when applied to sequences with long-term dependencies.
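The core idea of logarithmic scaling can be illustrated with a minimal sketch. The `combine` function below is a hypothetical placeholder for the paper's learned merge operation (the abstract does not specify it); what the sketch shows is only the structural claim: aggregating a sequence as a balanced binary tree means each input influences the output through O(log n) combine steps, versus the O(n) sequential hidden-state updates of an RNN.

```python
def combine(a, b):
    # Placeholder for a learned merge operation (e.g. a small neural
    # layer in the actual model); here we simply average the children.
    return [(x + y) / 2 for x, y in zip(a, b)]

def tree_reduce(seq):
    """Aggregate a sequence of vectors bottom-up as a binary tree.

    Returns the root vector and the tree depth. Each input passes
    through at most `depth` = O(log n) combine steps, in contrast to
    the O(n) sequential applications of an RNN's hidden transform.
    """
    level = list(seq)
    depth = 0
    while len(level) > 1:
        nxt = [combine(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # carry an odd leftover up unchanged
            nxt.append(level[-1])
        level = nxt
        depth += 1
    return level[0], depth

# For a length-1024 sequence the tree has depth 10 = log2(1024).
root, depth = tree_reduce([[float(i)] for i in range(1024)])
print(depth)  # 10
```

With averaging as the merge, the root is just the sequence mean; the point of the structure is that replacing `combine` with a learned transform keeps gradient paths logarithmically short.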

Original language: English
Title of host publication: Artificial Neural Networks and Machine Learning – ICANN 2019
Subtitle of host publication: Theoretical Neural Computation - 28th International Conference on Artificial Neural Networks, 2019, Proceedings
Editors: Igor V. Tetko, Pavel Karpov, Fabian Theis, Vera Kurková
Publisher: Springer Verlag
Number of pages: 13
ISBN (Print): 9783030304867
State: Published - 2019
Event: 28th International Conference on Artificial Neural Networks, ICANN 2019 - Munich, Germany
Duration: 17 Sep 2019 – 19 Sep 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11727 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 28th International Conference on Artificial Neural Networks, ICANN 2019

