Page 64 - Fister jr., Iztok, and Andrej Brodnik (eds.). StuCoSReC. Proceedings of the 2016 3rd Student Computer Science Research Conference. Koper: University of Primorska Press, 2016
[Figure 2: Tuning process. Inputs: tuning set, models, models' weights; tuners: MERT, GA, GARW; output: new models' weights.]

Table 1: BLEU scores on test corpora (higher is better).

    System   Slovenian → English   English → Slovenian
    MERT           33.18                 23.34
    GA             32.15                 23.39
    GARW           33.19                 23.49

Table 2: Tuning time on the tuning set.

    System   Time (minutes)
    MERT           97
    GA            182
    GARW           91

it makes 50 % fewer evaluations than the basic GA. The current difference is 1.5 hours, but with much larger tuning sets the difference could reach 10 hours or even days. The BLEU score represents the similarity between the machine-translated text and the human-translated text. At this moment it is impossible to achieve a BLEU score of 100 %, because that would mean the two texts are identical; even human translators cannot produce identical translations. That is why every percentage point counts, and a difference of 1 % is actually a good improvement in translation quality.

6. CONCLUSIONS
We successfully built two SMT systems using the JRC-Acquis corpora for the Slovenian-English language pair. We implemented the basic GA and added Roulette Wheel Selection to speed up the process. We showed that with Roulette Wheel Selection the tuning time was shortened while still maintaining comparable translation quality. For further research, more advanced EAs, such as jDE [4], L-SHADE [22], or nature-inspired algorithms [14], could be used to improve the translation quality and shorten the tuning time.

7. REFERENCES
[1] O. Al Jadaan, L. Rajamani, and C. Rao. Improved selection operator for GA. Journal of Theoretical & Applied Information Technology, 4(4), 2008.
[2] T. F. Albat. US Patent 0185235, Systems and Methods for Automatically Estimating a Translation Time, 2012.
[3] N. Bertoldi, B. Haddow, and J.-B. Fouet. Improved Minimum Error Rate Training in Moses. ACL, pages 160-167, 2009.
[4] J. Brest, S. Greiner, B. Boskovic, M. Mernik, and V. Zumer. Self-adapting control parameters in differential evolution: A comparative study on numerical benchmark problems. IEEE Trans. Evolutionary Computation, 10(6):646-657, 2006.
[5] P. Brown, J. Cocke, S. D. Pietra, V. D. Pietra, F. Jelinek, J. D. Lafferty, R. L. Mercer, and P. Roosing. A statistical approach to machine translation. Computational Linguistics, pages 79-85,
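The Roulette Wheel Selection used to speed up the GA above can be sketched as follows; this is a minimal illustration, not the authors' implementation, and the example weight vectors and fitness values are hypothetical:

```python
import random

def roulette_wheel_select(population, fitnesses):
    """Pick one individual with probability proportional to its fitness.
    Assumes non-negative fitness values (BLEU scores satisfy this)."""
    total = sum(fitnesses)
    if total == 0:                      # degenerate case: pick uniformly
        return random.choice(population)
    spin = random.uniform(0, total)     # where the "wheel" stops
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if spin <= running:
            return individual
    return population[-1]               # guard against float rounding

# Illustrative: one GA individual = one vector of SMT model weights,
# and its fitness = the BLEU score obtained with those weights.
population = [[0.2, 0.5, 0.3], [0.4, 0.4, 0.2], [0.1, 0.8, 0.1]]
fitnesses = [33.19, 32.15, 23.49]
parent = roulette_wheel_select(population, fitnesses)
```

Fitter weight vectors are drawn more often, so the search concentrates evaluations on promising regions, which is how fewer tuning-set evaluations can still reach comparable BLEU.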
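The claim that a BLEU score of 100 % would require identical texts can be seen from the metric's definition. A toy sentence-level BLEU (clipped n-gram precisions combined by a geometric mean, times a brevity penalty) is sketched below; real evaluations use corpus-level BLEU, e.g. the multi-bleu script shipped with Moses:

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Toy sentence-level BLEU over token lists (no smoothing)."""
    precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(candidate[i:i + n])
                              for i in range(len(candidate) - n + 1))
        ref_ngrams = Counter(tuple(reference[i:i + n])
                             for i in range(len(reference) - n + 1))
        # clip each n-gram count by its count in the reference
        overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        precisions.append(overlap / max(sum(cand_ngrams.values()), 1))
    if min(precisions) == 0:            # no overlap at some order
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * geo_mean

ref = "the tuning time was shortened while maintaining quality".split()
assert bleu(ref, ref) == 1.0   # only an identical text scores 100 %
```

Any substitution, insertion, or deletion breaks some n-gram match, so every translation that differs from the reference scores below 1.0, which is why small BLEU gains are meaningful.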
StuCoSReC Proceedings of the 2016 3rd Student Computer Science Research Conference 64
Ljubljana, Slovenia, 12 October