Analysis Expert explained

The Analysis Expert analyses one or more projects by comparing them to one or more Translation Memories. It calculates the number of segment matches between the projects and the Translation Memories to report a word count. This report can be used by the project manager and/or translator to create a timeline and costing for a translation project.

This article explains how the analysis is performed and how the calculations are made.

First let's look at the report window itself and explain the main

 

 

Now let's consider some scenarios to better understand the results.

Scenario 1

The following sample file is inserted in Catalyst to create a project. It contains the following strings:

ID001=Cancel

ID002=String unique 1

ID003=String unique 2

ID003=Cancel

ID004=OK is repeated 3 times

ID005=OK

ID006=OK

ID007=OK

ID008=OK is translated in the TM file

ID009=Cancel

ID010=Cancel all

ID011=Cancel

The file is inserted in Catalyst using the default (WithIDs) parsing rule for .properties files.

In this scenario, I have only one translation available in the sample TM (Translation Memory). The translated string is for the word "OK".

The Analysis Expert will report the following:

  • The total word count is 27 words. Note that a number in a segment is counted as a word. For example, the segment String unique 1 counts as 3 words.
  • The segment Cancel appears 4 times in the file. The first instance is reported as No match (Unique), circled in blue.
  • In red, all other instances of the segment Cancel, for which no translation is available, are counted as Duplicate.
  • In green, the segment OK, which appears 3 times (ID005 to ID007), is counted as a 100% match because an identical translated segment is available in the TM.
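The figures above can be reproduced with a short sketch. This is only an illustration of the counting rules described in this scenario, not Catalyst's actual implementation: it assumes that words are whitespace-separated tokens and that only exact TM matches are considered. The names count_words and classify are hypothetical.

```python
from collections import Counter

# Source strings from Scenario 1, in file order.
SEGMENTS = [
    "Cancel", "String unique 1", "String unique 2", "Cancel",
    "OK is repeated 3 times", "OK", "OK", "OK",
    "OK is translated in the TM file", "Cancel", "Cancel all", "Cancel",
]
TM = {"OK"}  # the sample TM holds a single translated segment


def count_words(segment):
    # Assumption: a word is any whitespace-separated token, numbers included.
    return len(segment.split())


def classify(segments, tm):
    # Exact TM matches are checked first; a repeat of an untranslated
    # segment is a Duplicate; everything else is No match (Unique).
    seen = set()
    categories = []
    for seg in segments:
        if seg in tm:
            categories.append("100% match")
        elif seg in seen:
            categories.append("Duplicate")
        else:
            categories.append("No match (Unique)")
        seen.add(seg)
    return categories


total_words = sum(count_words(s) for s in SEGMENTS)
tally = Counter(classify(SEGMENTS, TM))
print(total_words, dict(tally))
```

Under these assumptions the sketch counts 27 words, with 3 segments as Duplicate and 3 segments as 100% match.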

The Catalyst project below is viewed using the Duplicate filter to focus on duplicated strings.

 

Scenario 2

We now add 2 strings (ID012 and ID013) to the end of our sample file. The following file is inserted in Catalyst to create a project:

ID001=Cancel

ID002=String unique 1

ID003=String unique 2

ID003=Cancel

ID004=OK is repeated 3 times

ID005=OK

ID006=OK

ID007=OK

ID008=OK is translated in the TM file

ID009=Cancel

ID010=Cancel all

ID011=Cancel

ID012=OK all

ID013=It's OK

In this scenario, I used the same sample TM (Translation Memory) as above, which contains a translation for the segment "OK".

The Analysis Expert reports the same results as above, and now also reports fuzzy matches for the 2 new strings:

  • The total word count is now 31 words. Our 2 new segments contain 4 words (an apostrophe is not counted as a word).
  • In blue are all segments for which no translation is available, including the first instance of the segment Cancel. They are reported as No match (Unique).
  • In red, all other instances of the segment Cancel, for which no translation is available, are counted as Duplicate.
  • In green, the segment OK, which appears 3 times (ID005 to ID007), is counted as a 100% match because an identical translated segment is available in the TM.
  • In orange, the 2 new segments, which include the word OK, are counted as Fuzzy matches in the appropriate bracket. In this case they are 50%-74% fuzzy matches.

 

 

Scenario 3

Using the same source file as above, we now update the sample TM by adding a translation for the segment Cancels, which will be a fuzzy match for the segment Cancel (without the S).

So the TM contains 2 translated segments:

  • OK

  • Cancels

The Analysis Expert reports the following. It is important to note at this point that duplicated segments are only counted as duplicates when no translation is available for them. This scenario demonstrates this.

  • The total word count remains 31 words.
  • In blue are all segments for which no translation is available. This no longer includes the segment Cancel, as fuzzy matches are now available from the TM.
  • Because the segment Cancel now has a fuzzy match available, its repeated instances are no longer counted as Duplicate (previously highlighted in red).
  • In green, the segment OK, which appears 3 times (ID005 to ID007), is counted as a 100% match because an identical translated segment is available in the TM.
  • In orange, the segments which include OK or Cancel are counted as Fuzzy matches in the appropriate bracket.
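The precedence rule in this scenario (a TM match, exact or fuzzy, is checked before a segment can be counted as a Duplicate) can be sketched as follows. This is an illustration only. Catalyst's fuzzy scoring is not described in this article, so Python's difflib.SequenceMatcher is used as a stand-in, and the percentages it produces will not necessarily fall in the brackets Catalyst reports. The 50% threshold and the function names are assumptions.

```python
from difflib import SequenceMatcher

FUZZY_THRESHOLD = 0.50  # assumption: lowest fuzzy bracket mentioned in this article


def best_tm_score(segment, tm):
    # Stand-in similarity score between 0.0 and 1.0; not Catalyst's algorithm.
    return max(
        (SequenceMatcher(None, segment, entry).ratio() for entry in tm),
        default=0.0,
    )


def classify(segments, tm):
    # TM matches take precedence; Duplicate applies only when the TM
    # offers nothing for the segment.
    seen = set()
    categories = []
    for seg in segments:
        score = best_tm_score(seg, tm)
        if score == 1.0:
            categories.append("100% match")
        elif score >= FUZZY_THRESHOLD:
            categories.append("Fuzzy match")
        elif seg in seen:
            categories.append("Duplicate")
        else:
            categories.append("No match (Unique)")
        seen.add(seg)
    return categories


# The four instances of Cancel from the sample file.
cancel_segments = ["Cancel", "Cancel", "Cancel", "Cancel"]

before = classify(cancel_segments, ["OK"])            # TM used in Scenarios 1 and 2
after = classify(cancel_segments, ["OK", "Cancels"])  # TM used in Scenario 3
print(before)
print(after)
```

With only OK in the TM, the first Cancel is No match (Unique) and the other three are Duplicate. Once Cancels is added to the TM, all four instances become fuzzy matches and none is counted as Duplicate.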

 

 

Products or Versions Affected

  • Alchemy CATALYST 10.0 and greater

 

Last updated with Catalyst 11