Title

Routledge Studies in Translation Technology and Techno-Humanities
Controlled Document Authoring in a Machine Translation Age

Author

Rei Miyata

Size

236 pages

Language

English

Released

April 29, 2022

ISBN

9780367500207

Published by

Routledge

“以下の書類を提出します” (ika no shorui o teishutsu shimasu)
 
How would you translate this Japanese sentence into English? For a high-school English assignment, it may suffice to supply the missing subject and translate it as “I will submit the following document.” For an actual industrial translation, however, you first need to confirm where the sentence appears and in what kind of document, in terms of genre and topic (indeed, translation should begin only after this has been confirmed). For example, suppose that this sentence is a bullet item in a local government document for citizens that explains administrative procedures step by step, and that two documents to be submitted are listed immediately after it. In that case, you could choose “Submit the following documents,” using the imperative verb form and translating the word “shorui” as the plural “documents.” The final translation should also reflect the purpose of the translation and the relevant style guide.
 
Professional translators make such judgments as a matter of course to arrive at a valid translation. What, then, about machine translation? This book proposes a method for producing translations with machine translation that takes the properties and elements of documents into account, and verifies its effectiveness. The point is not to modify the machine translation system itself but to properly control the Japanese source text given to it as input. In the above example, the source sentence, which expresses a step-by-step action in a bulleted list, can be rewritten in pre-translation processing to the imperative “… teishutsu shiro,” which increases the likelihood that it will be translated as the imperative “Submit …” in English. Controlling expressions thoroughly at the writing stage in this way, using linguistic forms defined according to the document structure, enables more advanced use of machine translation. However, knowledge of what “linguistic form” and “document structure” mean, and of what types of each exist, is not necessarily shared clearly even among professional translators. Part II of this book takes up the work of verbalizing and organizing this knowledge.
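
To make the idea of pre-translation processing concrete, here is a minimal sketch in Python. It is not the system described in the book: the `Sentence` type, the `procedure_step` label, and the single suffix rule are all assumptions introduced for illustration, and the rule only covers sentences ending in the polite form “shimasu” (します). It shows how a rewrite could be chosen from the document element a sentence belongs to, before the text is passed to a machine translation system.

```python
from dataclasses import dataclass

# Hypothetical surface forms for this sketch: polite "shimasu" and imperative "shiro".
POLITE_SURU = "します"
IMPERATIVE_SURU = "しろ"


@dataclass
class Sentence:
    text: str     # Japanese source sentence
    element: str  # hypothetical document-element label, e.g. "procedure_step"


def control_source(sentence: Sentence) -> str:
    """Rewrite a source sentence according to its document element.

    Only one illustrative rule is implemented: a step in a procedure that
    ends in the polite form is rewritten to the imperative form.
    """
    body = sentence.text.rstrip("。")      # set aside a trailing full stop, if any
    tail = sentence.text[len(body):]
    if sentence.element == "procedure_step" and body.endswith(POLITE_SURU):
        return body[: -len(POLITE_SURU)] + IMPERATIVE_SURU + tail
    return sentence.text                   # other elements are left unchanged


if __name__ == "__main__":
    step = Sentence("以下の書類を提出します", "procedure_step")
    print(control_source(step))
```

The rewritten sentence, rather than the original, would then be sent to the machine translation system. A realistic implementation would need morphological analysis and a far richer set of document elements and linguistic forms than this single suffix rule.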
 
The research in this book does not deal with neural machine translation based on deep learning, which is currently the mainstream method. With natural language processing technologies advancing at a dizzying pace, the individual technologies covered herein are becoming obsolete. Current machine translation systems could be said to implicitly acquire knowledge of document properties and elements from a large quantity of training data. Nonetheless, machine translation systems still do not take into account document properties and elements in an extensive or explicit manner. Furthermore, these systems are black-boxed and cannot explain the appropriateness of translation results. Recently, ChatGPT and other generative AI services that use large language models have garnered attention, making it possible to explore the possibility of outputting better translations by providing specific verbal instructions. The framework presented in this book, which aims to understand and control documents and language precisely, is worth reexamining using the latest technologies.
 
We are working to organize document properties and elements that may be referenced in translation more comprehensively and have published the results as part of our project to create and share metalanguages for translation processes, primarily in industrial translation settings:
https://www.u-tokyo.ac.jp/biblioplaza/en/H_00146.html
 

(Written by MIYATA Rei, Lecturer, Graduate School of Education / 2023)

Table of Contents

Part I. Research background
1. Introduction
2. Related work
 
Part II. Controlled document authoring
3. Document formalization
4. Controlled language
5. CL contextualization
6. Terminology management
 
Part III. MuTUAL: An authoring support system
7. System development
8. Evaluation of the CL violation detection component
9. System usability evaluation
 
Part IV. Conclusion
10. Research findings and outlook

Related Info

Book Review:
Review by Xie’an Huang & Hong Xu (Language Resources and Evaluation, Vol. 57, pp. 1423–1430, 2023)
https://doi.org/10.1007/s10579-022-09598-0
