Bidirectional recurrent neural networks

<h2 id="architecture">Architecture</h2>

The principle of a BRNN is to split the neurons of a regular RNN into two directions: one for the positive time direction (forward states) and another for the negative time direction (backward states). The outputs of these two sets of states are not connected to the inputs of the states of the opposite direction. The general structure of an RNN and a BRNN is depicted in the diagram on the right. By using two time directions, input information from both the past and the future of the current time frame can be used, unlike a standard RNN, which requires delays to include future information.<a class="footnote-ref" id="fnref:3" href="#fn:3">3</a>
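The structure described above can be illustrated with a minimal sketch. It assumes one tanh unit per direction, scalar inputs, and a linear output layer; the function names and the parameter keys (w_in_f, w_rec_f, v_f and so on) are hypothetical and chosen only for this illustration.

```python
import math

def rnn_scan(xs, w_in, w_rec, reverse=False):
    """Run a single-unit tanh recurrence over xs and return one state per time step."""
    steps = range(len(xs) - 1, -1, -1) if reverse else range(len(xs))
    h = 0.0                      # initial state (assumed to be zero)
    states = [0.0] * len(xs)
    for t in steps:
        h = math.tanh(w_in * xs[t] + w_rec * h)
        states[t] = h            # stored by time index, whatever the scan direction
    return states

def brnn_forward(xs, params):
    """Combine forward and backward states into one output per time step."""
    # Forward states see x[0..t]; backward states see x[t..T-1].
    fwd = rnn_scan(xs, params["w_in_f"], params["w_rec_f"])
    bwd = rnn_scan(xs, params["w_in_b"], params["w_rec_b"], reverse=True)
    # The two chains never feed each other; they only meet in the output layer.
    return [params["v_f"] * f + params["v_b"] * b + params["bias"]
            for f, b in zip(fwd, bwd)]

# Illustrative parameter values, not taken from any trained model.
example_params = {"w_in_f": 0.8, "w_rec_f": 0.5, "w_in_b": 0.6, "w_rec_b": 0.4,
                  "v_f": 1.0, "v_b": 1.0, "bias": 0.0}
outputs = brnn_forward([0.1, -0.3, 0.7, 0.2], example_params)
```

In this sketch the output at the first time step already depends on the last input, because the backward chain has carried it there. Setting v_b to zero reduces the model to a unidirectional RNN in which that dependence disappears.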

<h2 id="training">Training</h2>
BRNNs can be trained using algorithms similar to those used for RNNs, because the neurons of the two directions do not interact. However, when back-propagation through time is applied, additional steps are needed because the input and output layers cannot be updated at once. The general training procedure is as follows: in the forward pass, the forward states and backward states are computed first, then the output neurons. In the backward pass, the output neurons are processed first, then the forward states and backward states. After the forward and backward passes are done, the weights are updated.<a class="footnote-ref" id="fnref:4" href="#fn:4">4</a>
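The ordering of the passes described above can be sketched as follows. This is a simplified illustration, not the procedure from the cited paper: it assumes one tanh unit per direction, a linear output, a squared-error loss, and plain gradient descent, and all function names and parameter keys are hypothetical.

```python
import math

def scan(xs, a, r, order):
    """Single-unit tanh recurrence visited in `order`; states are indexed by time."""
    h, states = 0.0, [0.0] * len(xs)
    for t in order:
        h = math.tanh(a * xs[t] + r * h)
        states[t] = h
    return states

def scan_grad(xs, states, a, r, v, dy, order):
    """BPTT for one chain: walk `order` in reverse, carrying the recurrent gradient."""
    g_a = g_r = carry = 0.0
    for i in range(len(order) - 1, -1, -1):
        t = order[i]
        prev = states[order[i - 1]] if i > 0 else 0.0  # state that fed this step
        dpre = (v * dy[t] + carry) * (1.0 - states[t] ** 2)
        g_a += dpre * xs[t]
        g_r += dpre * prev
        carry = r * dpre
    return g_a, g_r

def loss_and_grads(p, xs, targets):
    n = len(xs)
    fwd_order, bwd_order = list(range(n)), list(range(n - 1, -1, -1))
    # Forward pass: both state chains first, then the output neurons.
    hf = scan(xs, p["a_f"], p["r_f"], fwd_order)
    hb = scan(xs, p["a_b"], p["r_b"], bwd_order)
    ys = [p["v_f"] * f + p["v_b"] * b + p["c"] for f, b in zip(hf, hb)]
    loss = 0.5 * sum((y - t) ** 2 for y, t in zip(ys, targets))
    # Backward pass: output neurons first, then each state chain.
    dy = [y - t for y, t in zip(ys, targets)]
    g = {"v_f": sum(d * f for d, f in zip(dy, hf)),
         "v_b": sum(d * b for d, b in zip(dy, hb)),
         "c": sum(dy)}
    g["a_f"], g["r_f"] = scan_grad(xs, hf, p["a_f"], p["r_f"], p["v_f"], dy, fwd_order)
    g["a_b"], g["r_b"] = scan_grad(xs, hb, p["a_b"], p["r_b"], p["v_b"], dy, bwd_order)
    return loss, g

def train_step(p, xs, targets, lr=0.01):
    """Update all weights only after both passes are complete."""
    loss, g = loss_and_grads(p, xs, targets)
    return {k: p[k] - lr * g[k] for k in p}, loss
```

Because the two chains do not interact, the same gradient routine serves both directions; only the order in which time steps are visited differs.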

<h2 id="applications">Applications</h2>
Applications of BRNNs include:

<ul><li>Speech Recognition (Combined with <a href="/facts/Long_short-term_memory/NBBVEoft">Long short-term memory</a>)<a class="footnote-ref" id="fnref:5" href="#fn:5">5</a><a class="footnote-ref" id="fnref:6" href="#fn:6">6</a></li>
<li>Translation<a class="footnote-ref" id="fnref:7" href="#fn:7">7</a></li>
<li>Handwriting Recognition<a class="footnote-ref" id="fnref:8" href="#fn:8">8</a></li>
<li>Industrial <a href="/facts/Soft_sensor/HWeqtwmM">Soft sensor</a><a class="footnote-ref" id="fnref:9" href="#fn:9">9</a></li>
<li>Protein Structure Prediction<a class="footnote-ref" id="fnref:10" href="#fn:10">10</a><a class="footnote-ref" id="fnref:11" href="#fn:11">11</a></li>
<li>Part-of-speech tagging</li>
<li>Dependency Parsing<a class="footnote-ref" id="fnref:12" href="#fn:12">12</a></li>
<li>Entity Extraction<a class="footnote-ref" id="fnref:13" href="#fn:13">13</a></li></ul>

<h2 id="external-links">External links</h2>
<ul><li><a href="https://github.com/hycis/bidirectional_RNN">[1]</a> Implementation of BRNN/LSTM in Python with Theano</li></ul>

<h2 id="references">References</h2>

<ol>
<li id="fn:1">Schuster, Mike, and Kuldip K. Paliwal. "Bidirectional recurrent neural networks." Signal Processing, IEEE Transactions on 45.11 (1997): 2673-2681. <a href="https://www.researchgate.net/profile/Mike_Schuster/publication/3316656_Bidirectional_recurrent_neural_networks/links/56861d4008ae19758395f85c.pdf" target="_blank">https://www.researchgate.net/profile/Mike_Schuster/publication/3316656_Bidirectional_recurrent_neural_networks/links/56861d4008ae19758395f85c.pdf</a> <a href="#fnref:1" class="footnote-back-ref">↩</a></li>
<li id="fn:2">Salehinejad, Hojjat; Sankar, Sharan; Barfett, Joseph; Colak, Errol; Valaee, Shahrokh (2017). "Recent Advances in Recurrent Neural Networks". arXiv:1801.01078 [cs.NE]. <a href="/wiki/ArXiv_(identifier)" target="_blank">/wiki/ArXiv_(identifier)</a> <a href="#fnref:2" class="footnote-back-ref">↩</a></li>
<li id="fn:3">Schuster, Mike, and Kuldip K. Paliwal. "Bidirectional recurrent neural networks." Signal Processing, IEEE Transactions on 45.11 (1997): 2673-2681. <a href="https://www.researchgate.net/profile/Mike_Schuster/publication/3316656_Bidirectional_recurrent_neural_networks/links/56861d4008ae19758395f85c.pdf" target="_blank">https://www.researchgate.net/profile/Mike_Schuster/publication/3316656_Bidirectional_recurrent_neural_networks/links/56861d4008ae19758395f85c.pdf</a> <a href="#fnref:3" class="footnote-back-ref">↩</a></li>
<li id="fn:4">Schuster, Mike, and Kuldip K. Paliwal. "Bidirectional recurrent neural networks." Signal Processing, IEEE Transactions on 45.11 (1997): 2673-2681. <a href="https://www.researchgate.net/profile/Mike_Schuster/publication/3316656_Bidirectional_recurrent_neural_networks/links/56861d4008ae19758395f85c.pdf" target="_blank">https://www.researchgate.net/profile/Mike_Schuster/publication/3316656_Bidirectional_recurrent_neural_networks/links/56861d4008ae19758395f85c.pdf</a> <a href="#fnref:4" class="footnote-back-ref">↩</a></li>
<li id="fn:5">Graves, Alex, Santiago Fernández, and Jürgen Schmidhuber. "Bidirectional LSTM networks for improved phoneme classification and recognition." Artificial Neural Networks: Formal Models and Their Applications–ICANN 2005. Springer Berlin Heidelberg, 2005. 799-804. <a href="https://mediatum.ub.tum.de/doc/1290195/file.pdf" target="_blank">https://mediatum.ub.tum.de/doc/1290195/file.pdf</a> <a href="#fnref:5" class="footnote-back-ref">↩</a></li>
<li id="fn:6">Graves, Alex, Navdeep Jaitly, and Abdel-rahman Mohamed. "Hybrid speech recognition with deep bidirectional LSTM." Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on. IEEE, 2013. <a href="http://www.cs.toronto.edu/~graves/asru_2013.pdf" target="_blank">http://www.cs.toronto.edu/~graves/asru_2013.pdf</a> <a href="#fnref:6" class="footnote-back-ref">↩</a></li>
<li id="fn:7">Sundermeyer, Martin, et al. "Translation modeling with bidirectional recurrent neural networks." Proceedings of the Conference on Empirical Methods on Natural Language Processing, October. 2014. <a href="https://www.aclweb.org/anthology/D14-1003" target="_blank">https://www.aclweb.org/anthology/D14-1003</a> <a href="#fnref:7" class="footnote-back-ref">↩</a></li>
<li id="fn:8">Liwicki, Marcus, et al. "A novel approach to on-line handwriting recognition based on bidirectional long short-term memory networks." Proc. 9th Int. Conf. on Document Analysis and Recognition. Vol. 1. 2007. <a href="https://mediatum.ub.tum.de/doc/1289961/file.pdf" target="_blank">https://mediatum.ub.tum.de/doc/1289961/file.pdf</a> <a href="#fnref:8" class="footnote-back-ref">↩</a></li>
<li id="fn:9">Lui, Chun Fai, et al. "A Supervised Bidirectional Long Short-Term Memory Network for Data-Driven Dynamic Soft Sensor Modeling." IEEE Transactions on Instrumentation and Measurement 71 (2022): 1-13. <a href="https://ieeexplore.ieee.org/ielx7/19/9717300/09718226.pdf" target="_blank">https://ieeexplore.ieee.org/ielx7/19/9717300/09718226.pdf</a> <a href="#fnref:9" class="footnote-back-ref">↩</a></li>
<li id="fn:10">Baldi, Pierre, et al. "Exploiting the past and the future in protein secondary structure prediction." Bioinformatics 15.11 (1999): 937-946. <a href="https://academic.oup.com/bioinformatics/article-pdf/15/11/937/693153/150937.pdf" target="_blank">https://academic.oup.com/bioinformatics/article-pdf/15/11/937/693153/150937.pdf</a> <a href="#fnref:10" class="footnote-back-ref">↩</a></li>
<li id="fn:11">Pollastri, Gianluca, and Aoife McLysaght. "Porter: a new, accurate server for protein secondary structure prediction." Bioinformatics 21.8 (2005): 1719-1720. <a href="https://academic.oup.com/bioinformatics/article/21/8/1719/250163" target="_blank">https://academic.oup.com/bioinformatics/article/21/8/1719/250163</a> <a href="#fnref:11" class="footnote-back-ref">↩</a></li>
<li id="fn:12">Kiperwasser, Eliyahu; Goldberg, Yoav (2016). "Simple and Accurate Dependency Parsing Using Bidirectional LSTM Feature Representations". Transactions of the Association for Computational Linguistics. 4: 313–327. arXiv:1603.04351. Bibcode:2016arXiv160304351K. doi:10.1162/tacl_a_00101. S2CID 1642392. <a href="https://www.aclweb.org/anthology/Q16-1023/" target="_blank">https://www.aclweb.org/anthology/Q16-1023/</a> <a href="#fnref:12" class="footnote-back-ref">↩</a></li>
<li id="fn:13">Dernoncourt, Franck; Lee, Ji Young; Szolovits, Peter (2017-05-15). "NeuroNER: an easy-to-use program for named-entity recognition based on neural networks". arXiv:1705.05487 [cs.CL]. <a href="/wiki/ArXiv_(identifier)" target="_blank">/wiki/ArXiv_(identifier)</a> <a href="#fnref:13" class="footnote-back-ref">↩</a></li>
</ol>
