MT quality – Automatic metrics or manual evaluation?

Automatic evaluation
One method is the BLEU score (bilingual evaluation understudy). Here, MT output is compared to a human reference translation: the higher the similarity between the MT output and the human version, the higher the translation quality is considered to be.
However, automatic evaluation ignores differences among equally valid stylistic and terminological choices.
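As a rough illustration (not part of the original article), a BLEU score can be computed in a few lines of Python with the sacrebleu library; the example sentences below are invented:

```python
# A minimal sketch of computing BLEU with the sacrebleu library
# (pip install sacrebleu). The sentences are invented for illustration.
import sacrebleu

hypotheses = ["the cat sat on the mat"]           # MT output
references = [["the cat is sitting on the mat"]]  # human reference translation(s)

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.1f}")  # higher score = closer to the human reference
```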

Manual evaluation
This is done by asking human judges, sometimes through crowd-sourcing platforms.
In this case, people were asked through Amazon Mechanical Turk to assess whether the translations were adequate and fluent in their native language.
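For illustration, here is a small Python sketch of how such crowd-sourced adequacy and fluency judgements might be aggregated; the segment IDs, ratings, and 1-5 scale are hypothetical, not taken from the article:

```python
# A minimal sketch of averaging crowd-sourced adequacy/fluency ratings
# (e.g. collected via Mechanical Turk). Data and scale are hypothetical.
from statistics import mean

# segment id -> list of (adequacy, fluency) ratings from different workers
ratings = {
    "seg-1": [(5, 4), (4, 4), (5, 5)],
    "seg-2": [(2, 3), (3, 3), (2, 2)],
}

for seg, judgements in ratings.items():
    adequacy = mean(a for a, _ in judgements)
    fluency = mean(f for _, f in judgements)
    print(f"{seg}: adequacy={adequacy:.2f}, fluency={fluency:.2f}")
```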

Conclusions
Research continues in both directions. Automatic and manual MT evaluation are still considered complementary methods.

You can read the full article here.
