Figure 6: Epoch-by-epoch model evaluation on the validation set with the accuracy metric, using the model with all additional features. LSTM is shown with the orange line, GRU with the dark blue, BI-LSTM with the red, and BI-GRU with the light blue line.

Figure 7: Epoch-by-epoch result of the loss function calculated on the model with all additional features. LSTM is shown with the orange line, GRU with the dark blue, BI-LSTM with the red, and BI-GRU with the light blue line.
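The four curves in Figures 6 and 7 come from training otherwise identical taggers that differ only in their recurrent layer. The following is a minimal sketch of such a comparison, not the paper's exact architecture or feature set: the vocabulary size, layer widths, sequence length, tag count, and training tensors are illustrative placeholders.

```python
# Sketch: compare per-epoch validation accuracy/loss of four recurrent taggers.
# All dimensions and data below are placeholders, not the paper's actual setup.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Embedding, LSTM, GRU, Bidirectional,
                                     TimeDistributed, Dense)

VOCAB, EMB, SEQ_LEN, TAGS = 20000, 100, 50, 9  # placeholder dimensions

def build(cell):
    """Identical model except for the recurrent layer passed in."""
    model = Sequential([
        Embedding(VOCAB, EMB),
        cell,
        TimeDistributed(Dense(TAGS, activation="softmax")),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

variants = {
    "LSTM":    LSTM(100, return_sequences=True),
    "GRU":     GRU(100, return_sequences=True),
    "BI-LSTM": Bidirectional(LSTM(100, return_sequences=True)),
    "BI-GRU":  Bidirectional(GRU(100, return_sequences=True)),
}

# Dummy data so the sketch runs end to end; a real run would substitute
# padded CoNLL-2003 word-index sequences and one-hot tag sequences.
X = np.random.randint(0, VOCAB, size=(128, SEQ_LEN))
y = np.eye(TAGS)[np.random.randint(0, TAGS, size=(128, SEQ_LEN))]

for name, cell in variants.items():
    history = build(cell).fit(X, y, validation_split=0.2, epochs=5, verbose=0)
    # history.history holds the epoch-by-epoch curves of the kind plotted
    # in Figures 6 (val_accuracy) and 7 (val_loss).
    print(name, history.history["val_accuracy"][-1],
          history.history["val_loss"][-1])
```

With real data, `history.history["val_accuracy"]` and `history.history["val_loss"]` supply one curve per variant, which is exactly the comparison the two figures visualize.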
6. REFERENCES

[1] H. L. Chieu and H. T. Ng. Named entity recognition with a maximum entropy approach. In Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003 - Volume 4, CONLL '03, pages 160–163, Stroudsburg, PA, USA, 2003. Association for Computational Linguistics.

[2] J. Chung, Ç. Gülçehre, K. Cho, and Y. Bengio. Empirical evaluation of gated recurrent neural networks on sequence modeling. CoRR, abs/1412.3555, 2014.

[3] R. Dey and F. M. Salem. Gate-variants of gated recurrent unit (GRU) neural networks. CoRR, abs/1701.05923, 2017.

[4] R. Florian, A. Ittycheriah, H. Jing, and T. Zhang. Named entity recognition through classifier combination. In Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003 - Volume 4, CONLL '03, pages 168–171, Stroudsburg, PA, USA, 2003. Association for Computational Linguistics.

[5] A. Graves, N. Jaitly, and A.-r. Mohamed. Hybrid speech recognition with deep bidirectional LSTM. In IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2013.

[6] A. Graves, A. Mohamed, and G. E. Hinton. Speech recognition with deep recurrent neural networks. CoRR, abs/1303.5778, 2013.

[7] J. Hammerton. Named entity recognition with long short-term memory. In Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003 - Volume 4, CONLL '03, pages 172–175, Stroudsburg, PA, USA, 2003. Association for Computational Linguistics.

[8] S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, Dec. 1997.

[9] Z. Huang, W. Xu, and K. Yu. Bidirectional LSTM-CRF models for sequence tagging. CoRR, abs/1508.01991, 2015.

[10] D. Jurafsky and J. H. Martin. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall PTR, Upper Saddle River, NJ, USA, 1st edition, 2000.

[11] X. Ma and E. H. Hovy. End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF. CoRR, abs/1603.01354, 2016.

[12] A. McCallum and W. Li. Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons. In Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003 - Volume 4, CONLL '03, pages 188–191, Stroudsburg, PA, USA, 2003. Association for Computational Linguistics.

[13] C. Olah. Understanding LSTM networks. 2015.

[14] L. A. Ramshaw and M. P. Marcus. Text Chunking Using Transformation-Based Learning, pages 157–176. Springer Netherlands, Dordrecht, 1999.

[15] J. Rowley. The wisdom hierarchy: representations of the DIKW hierarchy. Journal of Information Science, 33(2):163–180, 2007.

[16] E. F. T. K. Sang and F. D. Meulder. Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. CoRR, cs.CL/0306050, 2003.

[17] A. Severyn and A. Moschitti. Twitter sentiment analysis with deep convolutional neural networks. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '15, pages 959–962, New York, NY, USA, 2015. ACM.

[18] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res., 15(1):1929–1958, Jan. 2014.

[19] P. Wang, Y. Qian, F. Soong, L. He, and H. Zhao. Part-of-speech tagging with bidirectional long short-term memory recurrent neural network. CoRR, abs/1510.06168, 2015.