Artificial intelligence (AI) is transforming entire industries, and the COVID-19 pandemic is accelerating this development. AI has made particularly rapid progress in natural language processing and understanding: knowledge extraction and knowledge building, machine translation, voice assistants, chatbots and more.
Those who take advantage of these technologies are ahead of the game. D.O.G. recognized this several years ago and assembled a team of specialists to develop solutions and products that leverage the possibilities of AI and Natural Language Understanding (NLU).
Artificial Intelligence at D.O.G.
Since 2018, D.O.G. has been developing AI methods and algorithms to solve tasks in the linguistic and, occasionally, the non-linguistic domain. The results:
- The intelligent terminology management system LookUp with its Knowledge module
- Intelligent context checking in ErrorSpy, the quality assurance software for translations
- Training, maintenance and development of individual machine translation (MT) solutions
- Extracting knowledge from texts and building intelligent terminologies that model knowledge with semantic relationships between concepts.
Artificial Intelligence and Machine Translation
For occasional machine translations, the standard engines from DeepL or Google suffice. A post-editor corrects the result. However, adapting technical terminology and phrasing to the desired corporate language takes a lot of time. Therefore, for larger or regular amounts of text, it is worthwhile to train your own translation engines. This is where D.O.G.'s know-how in AI and machine learning comes into play:
- We train MT systems ourselves and can therefore directly influence a whole range of parameters (e.g. batch size, learning rate and choice of optimizer).
- We optimize the training material to achieve better translation results.
- We adapt automatic evaluation procedures such as BLEU for our training purposes.
- We develop our own methods for better consideration of terminology specifications.
- Our algorithms and tools support post-editing and reduce the effort it requires. They learn from post-editors' corrections which errors MT systems typically make for which customers, languages and topics. With our quality assurance program ErrorSpy, for example, we can detect machine errors that stem from a wrong interpretation of the context (e.g. is the German "Leitung" to be understood as a pipeline, a cable or a telephone line?).
Figure 1: MT error assessment model: learning curve
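The automatic evaluation mentioned in the list above can be illustrated with a minimal sketch of a sentence-level BLEU score. This is not D.O.G.'s adapted procedure, which the text does not describe. It is a simplified variant with a single reference, whitespace tokenization and add-one smoothing, all of which are assumptions made for illustration. The function name `sentence_bleu` is hypothetical; the two sentences come from the second row of Figure 2.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, hypothesis, max_n=4):
    """Simplified sentence-level BLEU: one reference, add-one smoothing."""
    ref = reference.split()  # assumption: plain whitespace tokenization
    hyp = hypothesis.split()
    if not hyp:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hyp, n)
        ref_counts = ngrams(ref, n)
        # Clipped matches: an n-gram counts at most as often as in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        # Add-one smoothing keeps short sentences from scoring exactly zero.
        log_precisions.append(math.log((matches + 1) / (total + 1)))

    # Brevity penalty: hypotheses shorter than the reference are penalized.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(sum(log_precisions) / max_n)


reference = "Handling was requested by the machine."
custom_engine = "Handling was requested by the machine."
generic_engine = "The handling was requested by the machine."

score_custom = sentence_bleu(reference, custom_engine)
score_generic = sentence_bleu(reference, generic_engine)
print(score_custom, score_generic)
```

Because the comparison is case-sensitive and counts every token, the output with the extra article scores lower than the output that matches the reference exactly. Production evaluations normally use a corpus-level score and a standardized tokenizer rather than this toy version.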
We are far from having developed the perfect machine translation system, but the AI know-how of the D.O.G. developers and the ability to directly influence machine translation systems already bring significant benefits for our customers.
| German | Human translation | DOG MT Engine | DeepL | Comment |
| --- | --- | --- | --- | --- |
| Status Fehlteil setzen | Set status reject part | Set status reject part | Set missing part status | Terminology has been learned (no error of meaning) |
| Das Handling wurde von der Maschine angefordert. | Handling was requested by the machine. | Handling was requested by the machine. | The handling was requested by the machine. | Style has been learned |
| Ursache: | Cause: | Cause: | cause: | Case sensitivity has been learned |
| Als erstes muß die Datei Einstellungen.cfg vom Hauptverzeichnis kopiert werden. | File Einstellungen.cfg must be copied beforehand from the main directory. | First copy the file Einstellungen.cfg from the main directory. | First, the Settings.cfg file must be copied from the root directory. | File names (named entities) and terminology have been learned |
Figure 2: Examples of the benefits of having your own engine
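The terminology effect in the first row of Figure 2 can be checked automatically. The following is a minimal sketch of such a check, not the method used in ErrorSpy or in D.O.G.'s engines. The helper `check_terminology` is hypothetical, and the glossary entry ("Fehlteil" must be rendered as "reject part") is an assumption inferred from the human translation in the table.

```python
import re


def check_terminology(source, translation, glossary):
    """Return (source_term, required_target) pairs missing from the translation.

    A pair is reported when the source term occurs in the source sentence
    but the required target term does not occur in the translation.
    Matching is a case-insensitive substring search, which is a simplification:
    it ignores inflection and compound words.
    """
    violations = []
    for src_term, tgt_term in glossary.items():
        in_source = re.search(re.escape(src_term), source, re.IGNORECASE)
        in_target = re.search(re.escape(tgt_term), translation, re.IGNORECASE)
        if in_source and not in_target:
            violations.append((src_term, tgt_term))
    return violations


# Assumed glossary entry, inferred from the human translation in Figure 2.
glossary = {"Fehlteil": "reject part"}
source = "Status Fehlteil setzen"

custom_result = check_terminology(source, "Set status reject part", glossary)
generic_result = check_terminology(source, "Set missing part status", glossary)
print(custom_result)   # no violations
print(generic_result)  # the required term "reject part" is missing
```

A check like this only flags candidates for a post-editor. Deciding whether a deviation is an actual error still requires a look at the context.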
Artificial Intelligence, Knowledge and Terminology
Knowledge and terminology are at the heart of the work of a translator or technical writer. Technical content and terminology can be very demanding, and searching for explanations, connections and further information is time-consuming. There are no reliable figures on this, but it is reasonable to assume that translators, revisers, proofreaders and editors spend several hours a week searching for information.
One focus of D.O.G.'s AI activities is on intelligent methods that automatically extract knowledge from unstructured texts and organize it across multiple languages. The D.O.G. algorithms learn, for example, from corpora on a given topic which terms are semantically related and which meaning a term takes in a given context. With multilingual corpora, differences in meaning between the source and target languages can be detected in this way. D.O.G. terminologists record the results in LookUp, a system that can be used to model knowledge. This knowledge can be used, among other things, to train machine translation systems or to check the quality of machine translations.
Figure 3: Determining semantically related terms using AI methods
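One common way to find semantically related terms in a corpus is to compare the contexts in which terms occur. The sketch below illustrates that idea with co-occurrence counts and cosine similarity. It is not a description of D.O.G.'s algorithms, which the text does not specify. The tiny corpus is invented for illustration, and the function names and the window size are assumptions.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Map each term to a Counter of the terms seen within `window` tokens."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, term in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[term][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def related_terms(term, vectors, top_k=3):
    """Rank all other terms by how similar their contexts are to `term`."""
    target = vectors.get(term, Counter())
    scores = [(other, cosine(target, vec))
              for other, vec in vectors.items() if other != term]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_k]


# Invented toy corpus; a real corpus would contain many thousands of sentences.
corpus = [
    "the pump feeds the pipeline",
    "the valve closes the pipeline",
    "the pump feeds the hose",
    "the valve closes the hose",
    "the operator reads the manual",
]

vectors = cooccurrence_vectors(corpus)
print(related_terms("pipeline", vectors))
```

In this toy corpus, "pipeline" and "hose" appear in the same contexts and are therefore ranked as closely related, while "manual" is not. Current systems usually replace raw counts with learned embeddings, but the underlying principle of comparing contexts is the same.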
This know-how can be used, among other things, for SEO adaptation to other languages and cultures. Our AI services for you:
- Training of customized translation engines for the machine translation of your texts
- Building multilingual intelligent terminologies that represent knowledge using relations.
Request a quote, free of charge and without obligation.