Integrating DocMaker 3.0 with a translation memory system

The new DocMaker 3.0 API opens up a range of integration possibilities. One long-standing request has been to integrate DocMaker with a translation memory system.

A translation memory (TM) is a database that stores words, sentences and paragraphs in different languages. It helps to translate a document by identifying frequently used text segments and translating them automatically or semi-automatically. Integrating with such a system streamlines, and to a certain degree automates, the creation of multi-language documents.

Our new API allows you to do this in a few simple lines of code:

C#
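// Requires "using System;" and "using System.Xml;" plus the DocMaker API namespace.
// Replaces every Run in the record's master document with its translation,
// when the TM has one, and publishes the result as a new version.
private static void AutoTranslate(Record record, string languageCode)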
{
    if (record == null)
    {
        throw new ArgumentNullException("record");
    }

    if (string.IsNullOrEmpty(languageCode))
    {
        throw new ArgumentNullException("languageCode");
    }

    using (Document doc = Document.CreateFromFile(record.Files.Master))
    {
        foreach (Story story in doc.Stories)
        {
            foreach (Paragraph paragraph in story.Paragraphs)
            {
                foreach (Item item in paragraph.Items)
                {
                    Run run = item as Run;
                    if (run != null)
                    {
                        string translation;
                        if (FindSegment(run.Value, languageCode, out translation))
                        {
                            run.Value = translation;
                        }
                    }
                }
            }                    
        }
        doc.PublishToRecord(DocumentSaveMode.NewVersion);
    }
}

private static bool FindSegment(string segment, string languageCode, out string translation)
{
    if (segment == null)
    {
        throw new ArgumentNullException("segment");
    }

    if (string.IsNullOrEmpty(languageCode))
    {
        throw new ArgumentNullException("languageCode");
    }

    translation = null;

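    XmlDocument xmlDocument = new XmlDocument();
    xmlDocument.Load(@"TranslationMemory.xml");

    // Compare attribute values in code instead of formatting them into the XPath
    // expression: a segment containing an apostrophe would make that expression invalid.
    foreach (XmlElement source in xmlDocument.SelectNodes("//translations/translation/add"))
    {
        if (source.GetAttribute("value") != segment)
        {
            continue;
        }

        // Look for the requested language among the siblings of the matching entry.
        foreach (XmlElement target in source.ParentNode.SelectNodes("add"))
        {
            if (target.GetAttribute("language") == languageCode)
            {
                translation = target.GetAttribute("value");
                return true;
            }
        }

        // Only the first entry matching the segment is considered.
        return false;
    }
    return false;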
}

The code uses the following XML document as its TM, but you could just as easily connect to a database, a web service or whatever your TM vendor uses to expose its data:

XML
<?xml version="1.0" encoding="utf-8"?>
<translations>
    <translation>
        <add language="EN" value="Working within the ADAM ecosystem" />
        <add language="FR" value="Travailler au sein de l’écosystème ADAM" />
        <add language="DE" value="Arbeiten innerhalb des ADAM Ecosystems" />
    </translation>
    <translation>
        <add language="EN" value="The WOW Factor" />
        <add language="FR" value="Le facteur OUAH" />
        <add language="DE" value="Der WOW-Faktor" />
    </translation>
    <translation>
        <add language="EN" value="Our software is everything, but nothing without providing continuous support and standing 100% behind our partners" />
        <add language="FR" value="Notre logiciel est tout et rien à la fois si nous n’assurons pas un soutien permanent et ne sommes pas à 100% derrière nos partenaires." />
        <add language="DE" value="Unsere Software ist alles, aber nichts ohne die Bereitstellung von ständiger Unterstützung und die Tatsache, dass wir zu 100 % hinter unseren Partnern stehen." />
    </translation>
    <translation>
        <add language="EN" value="How can We help you?" />
        <add language="FR" value="Comment pouvons-nous vous aider?" />
        <add language="DE" value="Wie können wir Ihnen helfen?" />
    </translation>
</translations>
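
If you expect to swap the XML file for a database or a web service later, it may help to hide the lookup behind a small interface. The sketch below is only an illustration: ITranslationMemory and InMemoryTranslationMemory are hypothetical names, not part of the DocMaker API, and the in-memory backend is merely the simplest implementation to show the shape.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction, not part of the DocMaker API.
// Any backend (XML file, database, web service) would implement this.
public interface ITranslationMemory
{
    bool TryFind(string segment, string languageCode, out string translation);
}

// Simplest possible backend: entries are added by hand, which is handy for tests.
public sealed class InMemoryTranslationMemory : ITranslationMemory
{
    private readonly Dictionary<string, string> entries = new Dictionary<string, string>();

    public void Add(string segment, string languageCode, string translation)
    {
        entries[MakeKey(segment, languageCode)] = translation;
    }

    public bool TryFind(string segment, string languageCode, out string translation)
    {
        translation = null;
        if (segment == null || languageCode == null)
        {
            return false;
        }
        return entries.TryGetValue(MakeKey(segment, languageCode), out translation);
    }

    private static string MakeKey(string segment, string languageCode)
    {
        // The language code comes first; a newline separates it from the segment.
        return languageCode + "\n" + segment;
    }
}
```

A method such as AutoTranslate could then take an ITranslationMemory parameter and stay unchanged when the backend changes.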

Obviously, the code has been simplified for demonstration purposes and brevity: it only takes into account the Runs of a TextElement; you could enhance it to evaluate nested TextElements, Tables and so on.
It also reloads and re-parses the XML document on every lookup instead of caching it, which is far from the fastest approach.
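
One way to avoid the repeated loading is to read the file once and answer lookups from a dictionary. The following is a minimal sketch under that assumption; CachedTranslationMemory is a hypothetical helper, not part of the DocMaker API, and it expects the same XML layout as the sample TM file.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper, not part of the DocMaker API:
// parses the TM document once and serves lookups from memory.
public sealed class CachedTranslationMemory
{
    // segment (in any language) -> (language code -> translated segment)
    private readonly Dictionary<string, Dictionary<string, string>> index =
        new Dictionary<string, Dictionary<string, string>>();

    public CachedTranslationMemory(XmlDocument xmlDocument)
    {
        if (xmlDocument == null)
        {
            throw new ArgumentNullException("xmlDocument");
        }

        foreach (XmlElement translation in xmlDocument.SelectNodes("//translations/translation"))
        {
            Dictionary<string, string> byLanguage = new Dictionary<string, string>();
            foreach (XmlElement add in translation.SelectNodes("add"))
            {
                string language = add.GetAttribute("language");
                if (!byLanguage.ContainsKey(language))
                {
                    byLanguage[language] = add.GetAttribute("value");
                }
            }

            // Every language variant can serve as the lookup key; the first entry wins.
            foreach (string value in byLanguage.Values)
            {
                if (!index.ContainsKey(value))
                {
                    index[value] = byLanguage;
                }
            }
        }
    }

    public static CachedTranslationMemory Load(string path)
    {
        XmlDocument xmlDocument = new XmlDocument();
        xmlDocument.Load(path);
        return new CachedTranslationMemory(xmlDocument);
    }

    public bool TryFind(string segment, string languageCode, out string translation)
    {
        translation = null;
        Dictionary<string, string> byLanguage;
        return segment != null
            && languageCode != null
            && index.TryGetValue(segment, out byLanguage)
            && byLanguage.TryGetValue(languageCode, out translation);
    }
}
```

With this in place, AutoTranslate would call CachedTranslationMemory.Load(@"TranslationMemory.xml") once before the loops and use TryFind for each Run.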
The customer will probably also want some reporting on what was translated and what was not, how many segments were an exact match and so on, but all of that can be implemented based on the customer's requirements and the capabilities of the translation memory system.
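
As a starting point for such reporting, a small counter object could be fed from inside the translation loop. This is only a sketch: TranslationReport and its members are hypothetical names, not part of the DocMaker API, and a real report would depend on what your TM can tell you about match quality.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the DocMaker API:
// counts exact matches and remembers the segments that had none.
public sealed class TranslationReport
{
    private readonly List<string> untranslated = new List<string>();
    private int translatedCount;

    public int TranslatedCount { get { return translatedCount; } }
    public int UntranslatedCount { get { return untranslated.Count; } }
    public IList<string> UntranslatedSegments { get { return untranslated.AsReadOnly(); } }

    // Call once per segment with the result of the TM lookup.
    public void Track(string segment, bool found)
    {
        if (found)
        {
            translatedCount++;
        }
        else
        {
            untranslated.Add(segment);
        }
    }

    // Share of segments that had an exact match, from 0.0 to 1.0.
    public double ExactMatchRatio
    {
        get
        {
            int total = translatedCount + untranslated.Count;
            return total == 0 ? 0.0 : (double)translatedCount / total;
        }
    }

    public override string ToString()
    {
        int total = translatedCount + untranslated.Count;
        return string.Format("{0} of {1} segments translated", translatedCount, total);
    }
}
```

In AutoTranslate, the result of each FindSegment call would be passed to Track together with run.Value, and the report logged or stored once the document has been published.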

I recommend running this code as a maintenance task (a sort of preprocessing step) or, if your TM is fast enough to do this in real time, incorporating it as a rule.
