Page 13 - Fister jr., Iztok, and Andrej Brodnik (eds.). StuCoSReC. Proceedings of the 2016 3rd Student Computer Science Research Conference. Koper: University of Primorska Press, 2016


REFERENCES

[1] European Parliament Proceedings Parallel Corpus
1996-2011. http://www.statmt.org/europarl/.
Accessed: 2016-05-23.

[2] Wikipedia. https://www.wikipedia.org/. Accessed:
2016-05-23.

[3] K. Abainia, S. Ouamour, and H. Sayoud. Robust
language identification of noisy texts: Proposal of
hybrid approaches. In 2014 25th International
Workshop on Database and Expert Systems
Applications, pages 228–232, Sept 2014.

[4] W. B. Cavnar and J. M. Trenkle. N-gram-based text
categorization. In Proceedings of SDAIR-94, 3rd
Annual Symposium on Document Analysis and
Information Retrieval, pages 161–175, 1994.

[5] R. M. Milne, R. A. O’Keefe, and A. Trotman. A study
in language identification. In Proceedings of the
Seventeenth Australasian Document Computing
Symposium, ADCS ’12, pages 88–95, New York, NY,
USA, 2012. ACM.

[6] A. Selamat, N. C. Ching, and Y. Mikami. Arabic script
web documents language identification using decision
tree-ARTMAP model. In Convergence Information
Technology, 2007. International Conference on, pages
721–726, Nov 2007.

[7] A. K. Singh. Study of some distance measures for
language and encoding identification. In Proceedings of
the Workshop on Linguistic Distances, LD ’06, pages
63–72, Stroudsburg, PA, USA, 2006. Association for
Computational Linguistics.

[8] A. Xafopoulos, C. Kotropoulos, G. Almpanidis, and
I. Pitas. Language identification in web documents
using discrete HMMs. Pattern Recognition,
37(3):583–594, 2004.

StuCoSReC Proceedings of the 2016 3rd Student Computer Science Research Conference
Ljubljana, Slovenia, 12 October
