Why pay for the ugly part of MT? With the new innovative pricing model, Globalese users pay only for qualified MT words, meaning words which really help translators. Gábor Bessenyei (MorphoLogic)
13. Sum up
• Pricing model where users only pay for useful MT output
• Can be based on QE or post-editing evaluation
• What is useful MT output? 75% and above
• Same file only counts once
• Repetitions only count once
• No additional costs
Welcome everybody. I am glad to have this opportunity to participate in the TAUS innovation contest. Before I start, let me briefly introduce myself: I am Gábor Bessenyei, CEO and managing partner of MorphoLogic Localisation. We are a software localization and language technology company located in Budapest, Hungary.
I think it was Bob Dylan who said that the reason he started playing music was that nobody played the music he wanted to hear. We had a quite similar motivation to start developing Globalese. We identified several gaps which were not filled by any of the existing MT solutions.
One of these gaps was pricing. There are several pricing models on the machine translation market, but most of them have one thing in common: besides other fees, users have to pay for a flat word quota, without any differentiation by quality. And machine translation quality can differ greatly from segment to segment, even within the same engine.
Do you know Fukubukuro? These are bags filled with unknown random contents and sold at a substantial discount. We felt that these pricing models are like buying such a Fukubukuro. What we thought was: why should users pay for the ugly part of machine translation?
So what we had in mind was a pricing model where users pay only for those machine-translated segments which have a certain quality and which are really good. They should not pay for the crappy part of MT. This means cherry-picking the good machine translations and forgetting the bad ones. Sounds easy? Actually, it is not.
And this is where a very simple thing like this pricing model becomes complicated. In any machine translation project you have a certain amount of good MT output, some segments which are not perfect but still good enough to help the translators, and finally, unfortunately, always some ugly segments. So we had to find a way to measure the quality of the machine-translated segments. And judging language quality is really challenging, even when we speak about human translation.
I have spent a couple of years in the translation industry, and so far I have found only one good definition of translation quality: a good translation is a translation which is accepted by the customer. So simple. But unfortunately this method cannot be applied to all projects, so we had to find a more industrial solution.
This means that we compare the machine translation output with the final post-edited translation. Even if it is not perfect, this is in my opinion still the best method to judge quality. We call this process evaluation, and the result is a table similar to a fuzzy-match analysis, like the one you can see here. The important point is that in this case a reference translation is already available. The basis for billing is the segments or words which fall in the 75% match range or higher. Repetitions and segments under 75% are free.
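The evaluation step above can be sketched in a few lines. The similarity metric used here, Python's `difflib.SequenceMatcher` ratio, is an illustrative assumption standing in for Globalese's actual (unpublished) evaluation metric; it simply yields a fuzzy-match-style percentage per segment, to which the 75% billing threshold is applied.

```python
# Minimal sketch of post-editing evaluation: compare raw MT output with
# the final post-edited translation, score each segment like a fuzzy
# match, and bill only the segments at or above the threshold.
# The metric (SequenceMatcher ratio) is an assumption for illustration.
from difflib import SequenceMatcher


def evaluate_segment(mt_output: str, post_edited: str) -> int:
    """Return a fuzzy-match-style score (0-100) for one segment."""
    return round(SequenceMatcher(None, mt_output, post_edited).ratio() * 100)


def billable_words(segments: list[tuple[str, str]], threshold: int = 75) -> int:
    """Count MT words in segments whose evaluation score reaches the
    threshold; lower-scoring ("ugly") segments are free."""
    total = 0
    for mt, edited in segments:
        if evaluate_segment(mt, edited) >= threshold:
            total += len(mt.split())
    return total


segments = [
    ("The engine works well.", "The engine works well."),          # untouched: 100%
    ("He pressed the red button.", "He pressed the large red button."),  # light edit
    ("Colorless green ideas sleep.", "Restart the application."),  # rewritten: free
]
print(billable_words(segments))  # words from the two segments scoring >= 75%
```

In a real system the scoring would be done per word range rather than with a character-level ratio, but the billing rule, everything under 75% is free, is the same.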
So far, so good. But there is one small problem: evaluation figures are available only after the project is completed. We had to find a way to say something about the machine translation quality before the project starts. The solution was the quality estimation feature, which is based on machine learning technologies and provides figures, similar to fuzzy-match values, about the predicted machine translation quality. Again, the basis for billing is the segments or words which fall in the 75% match range or higher.
Combining evaluation with quality estimation allowed us to implement a two-step price calculation process. Customers have the choice: if they are lazy and do not want to spend time uploading final translations, they can rely on the predicted Quality Estimation figures. These are not perfect, but still fair and close to the actual values. If they prefer to calculate the price based on actual values, they can upload the final translations and measure the actual distance between the MT output and the post-edited translation. In any case, if an evaluation value is available, it is the basis for pricing.
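The two-step calculation boils down to one precedence rule: an actual evaluation value, when it exists, overrides the predicted QE value. A minimal sketch, in which the data model and field names are illustrative assumptions:

```python
# Sketch of the two-step price calculation: bill on the predicted QE
# score unless an actual post-editing evaluation score is available,
# in which case the actual score wins. Field names are assumptions.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    words: int                        # word count of the MT output
    qe_score: int                     # predicted match value (0-100)
    eval_score: Optional[int] = None  # actual evaluation, if uploaded


def qualified_words(segments: list[Segment], threshold: int = 75) -> int:
    total = 0
    for seg in segments:
        # Actual evaluation, when available, always overrides the estimate.
        score = seg.eval_score if seg.eval_score is not None else seg.qe_score
        if score >= threshold:
            total += seg.words
    return total


segments = [
    Segment(words=8, qe_score=82),                 # billed on the QE estimate
    Segment(words=5, qe_score=90, eval_score=60),  # evaluation downgraded it: free
    Segment(words=6, qe_score=70, eval_score=88),  # evaluation upgraded it: billed
]
print(qualified_words(segments))  # 8 + 6 = 14 qualified words
```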
OK, we can now measure the quality. But there were still some issues to solve. What if a file is machine translated several times, because the user tried different engine versions? What if evaluation runs a few weeks after the project start? We tried to create fair rules that allow maximum flexibility for our users. Users can translate the same file several times, and they have a couple of weeks to upload the final version of the translation. Even better: if this happens after the billing period, the estimated value is reconciled with the actual evaluation values. And repetitions are, of course, counted only once.
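These fairness rules can be sketched as two small functions: deduplication keyed on the source segment (so retranslating the same file or repeating a segment adds nothing), and a later reconciliation of the estimated invoice against the actual evaluation. The function names and the flat per-word rate are illustrative assumptions, not the real Globalese billing code.

```python
# Sketch of the fairness rules: repeated segments count once, and a bill
# based on estimated (QE) qualified words is later adjusted against the
# actual evaluation figures. Names and the rate are assumptions.

def unique_qualified_words(segment_scores: dict[str, int],
                           threshold: int = 75) -> int:
    """Keying on the distinct source segment means repetitions and
    retranslations of the same file are counted only once."""
    return sum(len(seg.split())
               for seg, score in segment_scores.items()
               if score >= threshold)


def reconcile(estimated_words: int, actual_words: int, rate: float) -> float:
    """Adjustment once the evaluation arrives after the billing period:
    negative means a credit, positive an extra charge."""
    return (actual_words - estimated_words) * rate


# QE predicted 120 qualified words; evaluation later confirmed only 100.
adjustment = reconcile(120, 100, rate=0.01)
print(adjustment)  # negative adjustment: the customer is credited
```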
I hope I could give you some insight into how this pricing model is just the tip of the iceberg, resting on a big portion of new technology which is not visible.
So let’s sum up. We created a pricing model where users only pay for useful MT segments; we call them qualified words. The same translation file only counts once: users can translate the same file several times at no additional cost. Repetitions only count once. The figures are calculated based on the Quality Estimation or on the post-editing evaluation values. What is useful MT output? Every segment with an evaluation score of at least 75%. And last but not least: there are no additional costs, just a small minimum fee to avoid mini projects with two words.