
[MTQE] Translation Quality and Error Recognition in Professional Neural Machine Translation Post-Editing

joannekim0420 2021. 11. 22. 15:46

Purpose

  • attempt to model the cognitive processes in humans

 

Terminology

  • NMTPE : Neural Machine Translation Post-Editing

 

2.2.1 Automatic Error Annotation with Hjerson

  • WER = Word Error Rate
  • RPER = position-independent error rate of the reference (reference words that do not appear in the hypothesis)
  • HPER = position-independent error rate of the hypothesis (hypothesis words that do not appear in the reference)
  1. inflectional error
    a word whose full form is marked as an RPER/HPER error but whose base form is the same
  2. reordering error
    a word which occurs in both the reference and the hypothesis, and thus does not contribute to RPER or HPER, but is marked as a WER error
  3. missing word
    a word which occurs as a deletion in the WER errors and at the same time as an RPER error, without sharing its base form with any hypothesis error
  4. extra word
    a word which occurs as an insertion in the WER errors and at the same time as an HPER error, without sharing its base form with any reference error
  5. incorrect lexical choice
    a word which belongs neither to the inflectional errors nor to the missing or extra words is considered a lexical error
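
The five rules above can be sketched in a few lines of Python. This is only a minimal illustration of the rules as listed here, not the Hjerson tool itself: `wer_ops`, `per_errors`, `classify` and `toy_lemma` are hypothetical names, and `toy_lemma` is a toy stand-in for a real lemmatizer. WER errors come from a standard Levenshtein alignment, and RPER/HPER errors from multiset matching.

```python
from collections import Counter

def wer_ops(ref, hyp):
    """Levenshtein alignment. Labels each token 'ok', 'sub', 'del' (ref only) or 'ins' (hyp only)."""
    n, m = len(ref), len(hyp)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j - 1] + cost, d[i - 1][j] + 1, d[i][j - 1] + 1)
    ref_ops, hyp_ops = [None] * n, [None] * m
    i, j = n, m
    while i > 0 or j > 0:  # backtrace one optimal path
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            ref_ops[i - 1] = hyp_ops[j - 1] = "ok" if ref[i - 1] == hyp[j - 1] else "sub"
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ref_ops[i - 1] = "del"
            i -= 1
        else:
            hyp_ops[j - 1] = "ins"
            j -= 1
    return ref_ops, hyp_ops

def per_errors(words, other):
    """True for each token in `words` that has no unused identical token left in `other`."""
    available = Counter(other)
    flags = []
    for w in words:
        if available[w] > 0:
            available[w] -= 1
            flags.append(False)
        else:
            flags.append(True)
    return flags

def toy_lemma(word):
    """Hypothetical base-form lookup; a real setup would use a lemmatizer."""
    return {"reads": "read"}.get(word.lower(), word.lower())

def classify(ref, hyp, lemma=toy_lemma):
    ref_ops, hyp_ops = wer_ops(ref, hyp)
    rper, hper = per_errors(ref, hyp), per_errors(hyp, ref)
    ref_err_lemmas = {lemma(w) for w, e in zip(ref, rper) if e}
    hyp_err_lemmas = {lemma(w) for w, e in zip(hyp, hper) if e}

    def label(word, op, per_err, other_lemmas, gap_op, gap_name):
        if per_err:
            if lemma(word) in other_lemmas:
                return "inflection"      # rule 1: same base form on the other side
            if op == gap_op:
                return gap_name          # rules 3 and 4: deletion / insertion
            return "lexical"             # rule 5: everything else
        return "ok" if op == "ok" else "reordering"  # rule 2: WER error only

    ref_labels = [label(w, o, e, hyp_err_lemmas, "del", "missing")
                  for w, o, e in zip(ref, ref_ops, rper)]
    hyp_labels = [label(w, o, e, ref_err_lemmas, "ins", "extra")
                  for w, o, e in zip(hyp, hyp_ops, hper)]
    return ref_labels, hyp_labels

ref = "she reads the old books today".split()
hyp = "today she read the books".split()
print(classify(ref, hyp))
```

For this made-up sentence pair, the sketch labels `reads`/`read` as inflection, `old` as missing, and `today` as reordering on both sides; all other words are `ok`.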

 

2.2.2 Manual Error Annotation According to the MQM framework

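  • MQM : Multidimensional Quality Metrics → consolidates several previously existing approaches
  • Used to classify the sentences that fall under the lexical errors and extra words categories in NMT and NMTPE according to more detailed criteria.
  1. Mistranslation - the target does not accurately represent the source
  2. Terminology - a domain-specific word is not translated with the term used in that domain
  3. Unidiomatic - grammatically correct, but unnatural
  4. Register - translated with a meaning that is more specific or more general than the original
  5. Spelling - misspelled
  6. Function words - prepositions, determiners, etc. are used incorrectly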
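
As a rough illustration of how manual annotations under these six categories could be recorded and counted, here is a minimal Python sketch. The names `MQMIssue` and `tally` and the `(segment_id, issue)` annotation format are assumptions made for this example; they do not come from the paper or from any MQM tooling.

```python
from collections import Counter
from enum import Enum

class MQMIssue(Enum):
    # The six categories listed above
    MISTRANSLATION = "Mistranslation"
    TERMINOLOGY = "Terminology"
    UNIDIOMATIC = "Unidiomatic"
    REGISTER = "Register"
    SPELLING = "Spelling"
    FUNCTION_WORDS = "Function words"

def tally(annotations):
    """Count issues per category; categories with no hits are reported as 0.

    `annotations` is a hypothetical list of (segment_id, MQMIssue) pairs.
    """
    counts = Counter(issue for _, issue in annotations)
    return {issue: counts.get(issue, 0) for issue in MQMIssue}

# Made-up annotations, only to show the call shape
example = [(1, MQMIssue.SPELLING), (1, MQMIssue.REGISTER), (2, MQMIssue.SPELLING)]
for issue, n in tally(example).items():
    print(f"{issue.value}: {n}")
```

Keeping zero counts in the result makes it easier to compare NMT and NMTPE side by side, since both tallies then have the same six rows.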

RESULTS

 

 

 

Paper : https://www.mdpi.com/2227-9709/6/3/41/htm