Fister Jr., Iztok, and Andrej Brodnik (eds.). StuCoSReC: Proceedings of the 2018 5th Student Computer Science Research Conference. Koper: University of Primorska Press, 2018.
Named Entity Recognition and Classification using Artificial Neural Network

Luka Bašek
Fakulteta za elektrotehniko, računalništvo in informatiko
Koroška cesta 46
2000 Maribor, Slovenija
luka.basek@student.um.si

Borko Boškovič
Fakulteta za elektrotehniko, računalništvo in informatiko
Koroška cesta 46
2000 Maribor, Slovenija
borko.boskovic@um.si

ABSTRACT

In this paper, we analyze variants of Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) based models for sequence tagging, specifically for Named Entity Recognition (NER). The analyzed models include the LSTM network, the bidirectional LSTM network (BI-LSTM), the GRU network, and the bidirectional GRU network (BI-GRU), using pre-trained GloVe vectors^1, part-of-speech tags, and character embeddings as features. We evaluated our models on the CoNLL 2003 corpus, a data set for the sequence labeling task, and obtained an F1 score of 86.04%^2.

^1 https://nlp.stanford.edu/projects/glove/
^2 The code of this project is available at: https://github.com/lbasek/named-entity-recognition

Keywords

Natural Language Processing, Named Entity Recognition, Neural Network, Recurrent Neural Network, LSTM, GRU, Keras, GloVe vectors

1. INTRODUCTION

With the development of information technologies, people today have quick and wide access to large amounts of data. Every day we encounter many sources of knowledge such as social networks, news, reviews, blogs, etc. There is a lot of data, but much of it consists of simple, raw, isolated facts that carry some meaning [15]. The amount of data in the world grows every day, and this data has to be analyzed to become useful information. Computers can quickly process data and store the resulting information; however, without human involvement they cannot analyze text and provide information about it. For this purpose, computer science has developed many methods and techniques for analyzing and discovering knowledge. The field of Natural Language Processing (NLP) is an interdisciplinary area at the crossroads of computer science, artificial intelligence, and linguistics. Our goal is Information Retrieval, more specifically the task of Named Entity Recognition.

A variety of machine learning algorithms have been used to solve the problem of Named Entity Recognition, among them several statistical models such as Hidden Markov Models (HMM), Florian et al. (2003) [4], Maximum Entropy Models (MaxEnt), Chieu et al. (2003) [1], and Conditional Random Fields (CRF), McCallum et al. (2003) [12]. Among the available machine learning methods, in this paper we decided to use deep learning. One of the solutions on CoNLL 2003 uses Recurrent Neural Networks (RNN) with LSTM cells, Hammerton (2003) [7], and this is the starting point of our work.

Our task is to explore deep learning methods in NLP, more specifically for Named Entity Recognition (NER). In the past few years, neural networks have proved to be effective in NLP areas such as text classification, Zhang et al. (2015) [21], sentiment analysis, Severyn et al. (2015) [17], and part-of-speech tagging, Wang et al. (2015) [19].

For the problem of Named Entity Recognition, Hammerton (2003) [7] used an RNN with the LSTM cells presented by Hochreiter et al. (1997) [8]. Huang et al. (2015) [9] presented more complex models for NER based on the LSTM network, the bidirectional LSTM network, and various combinations with a CRF layer.

In this paper, we propose a variety of models based on recurrent neural networks, including LSTM networks, bidirectional LSTM networks (BI-LSTM), GRU networks, and bidirectional GRU networks (BI-GRU). Besides the network architectures, we include additional features to help the neural networks learn about the context of words: pre-trained GloVe vectors for word representation, part-of-speech tags, and character embeddings. We evaluate our models on the English data from the CoNLL 2003 shared task, Sang et al. (2003) [16]. Our best F1 score is 86.05%, which is somewhat worse than current state-of-the-art solutions with an F1 score of 91.21%, Xuezhe et al. (2016) [11], but it is a good starting point for improvements.

2. MODELS

In this section, we describe the models used in this paper: LSTM, BI-LSTM, GRU, and BI-GRU.
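To make the difference between the recurrent cells concrete before describing the models, the per-step computations of an LSTM and a GRU, and the idea behind a bidirectional pass, can be sketched in plain Python with scalar states. This is a didactic simplification on our part: the models in this paper are built with Keras layers, which use vector states and learned weight matrices, and the scalar weights `w` below are placeholders rather than trained parameters.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM step with scalar input and state (didactic sketch)."""
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate value
    c = f * c_prev + i * g   # new cell state: keep part of old, add new
    h = o * math.tanh(c)     # new hidden state
    return h, c

def gru_step(x, h_prev, w):
    """One GRU step: only two gates and no separate cell state."""
    z = sigmoid(w["wz"] * x + w["uz"] * h_prev + w["bz"])          # update gate
    r = sigmoid(w["wr"] * x + w["ur"] * h_prev + w["br"])          # reset gate
    g = math.tanh(w["wg"] * x + w["ug"] * (r * h_prev) + w["bg"])  # candidate
    return (1.0 - z) * h_prev + z * g  # interpolate old and candidate state

def run_lstm(xs, w):
    """Unroll the LSTM over a sequence; returns h = (h1, ..., hn)."""
    h, c, hs = 0.0, 0.0, []
    for x in xs:
        h, c = lstm_step(x, h, c, w)
        hs.append(h)
    return hs

def bi_lstm(xs, w):
    """BI-LSTM sketch: pair each forward state with the backward state."""
    fwd = run_lstm(xs, w)
    bwd = run_lstm(xs[::-1], w)[::-1]
    return list(zip(fwd, bwd))
```

A bidirectional network thus sees both the left and the right context of every token, which is what makes BI-LSTM and BI-GRU attractive for sequence tagging, where the label of a word often depends on the words that follow it.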

2.1 LSTM Network

The RNN is the most widely used approach for processing sequential data. An RNN takes as input a sequence of vectors x = (x1, ..., xn) and returns another sequence h = (h1, ..., hn) that represents some information about the sequence at each time step. In theory, the RNN works

StuCoSReC Proceedings of the 2018 5th Student Computer Science Research Conference, Ljubljana, Slovenia, 9 October. DOI: https://doi.org/10.26493/978-961-7055-26-9.65-70