THE CONCEPTS OF EQUIVALENCE, GAIN AND LOSS (DIVERGENCE) IN ENGLISH-URDU WEB-BASED MACHINE TRANSLATION PLATFORMS
Sharmin Muzaffar* & Pitambar Behera**
(sharmin.muzaffar & pitambarbehera2)@gmail.com
Abstract:
Equivalence, gain and loss (divergence) are well-established and highly prevalent concepts in both
theoretical and applied translation studies. Equivalence denotes the ‘ideal or perfect translation’
from the SL to the TL and is inherently vital because it is reader-oriented or listener-oriented.
In the process of translation, total equivalence is hardly ever achieved because of some gain and loss.
Gain and loss can be attributed to the differences or divergences between the languages, viz. the SL
and the TL. The divergences may pertain to linguistic, social, cultural (Vermeer, 1987; Goodenough, 1964),
religious and other knowledge paradigms, as translating a text involves representing all of these
paradigms in the TL text. Anton Popovic (1976) has identified four broad types of equivalence in
translation: linguistic, paradigmatic, stylistic and textual. Dorr has classified divergence into two
basic types: syntactic and lexical-semantic. To deal with the concept of equivalence, Popovic’s
theoretical model has been adopted. So far as divergence is concerned, we have devised our own
theoretical model.
With regard to methodology, we have used a corpus of 1,000 English sentences for this study and
analysed the translated Urdu output for different areas of translational equivalence, gain and loss
on web-based Machine Translation platforms such as Bing and Google Translate. We presume that gain
and loss emerge as a major issue that may owe its genesis to socio-linguistic, cultural, religious
and anthropological factors.
1. Introduction:
Equivalence on the one hand and gain and loss on the other stand in opposition to each other and are
well known in the field of translation studies. Achieving complete equivalence in translation (manual
or machine) is next to impossible because of divergent linguistic patterns between the SL and TL texts.
This study focuses on both manual and machine translation and is applicable to theoretical and applied
translation. A corpus of 1,000 English ILCI sentences was provided as input to Google and Bing
Translate, and the Urdu output was collected in bulk for observation, generalization and analysis. The
outcomes of this study should prove beneficial for improving the accuracy of both of the
aforementioned platforms. Snapshots of Google and Bing Translate are shown in Figures 1 and 2.


Figure 1. Google Translate

Figure 2. Bing Translate

Affiliation: *Research Scholar, Dept. of Linguistics, Aligarh Muslim University, Aligarh, India
** Research Scholar, Centre for Linguistics, Jawaharlal Nehru University, New Delhi, India


2. Equivalence:
Eugene Nida (1964, 1969) categorizes equivalence into two types, i.e. formal and dynamic. In formal
equivalence, there is complete correspondence between the SL and TL texts with regard to both form
and structure (e.g. sentence-to-sentence, word-for-word and concept-to-concept). It further attempts
to convey as much information about the SL text as is feasible. A faithful translation is
characterized by formal equivalence between the two texts. Dynamic equivalence, on the other hand,
aims at recreating a similar relationship between the reader/listener and the text. Both forms of
equivalence remain relevant in translation, each with its merits and demerits. Equivalence is
considered to be a missing link between the dynamic (process-oriented) model and the static
(product-oriented) model (Neubert, 1985). The logic behind equivalence can be represented
mathematically as the binary truth-function that takes the value true when the SL and TL texts are
both true or both false.
Symbol: ≡ or ↔, as in (p ↔ q)
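The truth-function can be checked mechanically. The following is a minimal Python sketch, assuming p and q are Boolean propositions standing for the SL and TL texts; the function name equivalent is illustrative and not taken from any of the cited works.

```python
from itertools import product


def equivalent(p: bool, q: bool) -> bool:
    """Biconditional p <-> q: true exactly when p and q have the same truth value."""
    return p == q


# Enumerate all four combinations of truth values for p and q.
truth_table = [(p, q, equivalent(p, q)) for p, q in product([True, False], repeat=2)]

for p, q, result in truth_table:
    print(f"p={p!s:5} q={q!s:5} p<->q={result}")
```

The table is true in two rows (both true, both false) and false in the two mixed rows, which is the behaviour described for ≡ above.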
2.1. Types of Equivalence:
Anton Popovic (1976) has identified four broad types of equivalence in translation. They are
discussed below and have also been quantified on both platforms for English-to-Urdu translation.
2.1.1. Linguistic equivalence:
When there is word-for-word translation, there is equivalence (similarity or identity) between the
languages (SL and TL) at the lexical level. In the example below, ‘football practice’ is rendered by
Google as ‘fooTball kI meshk’ and by Bing as ‘fooTball practice’. Google supplies a typical Urdu word,
whereas Bing borrows the word from the English input sentence.
For instance,
English: Ben goes to football practice every Tuesday.
Urdu Google: ben fooTball kI meshk har mangal ko jaataa hai
Football practice
Urdu Bing: ben fooTball practice ke liye har mangal ko jaataa hai
Football practice
2.1.2. Paradigmatic equivalence:
It refers to the similarity in grammatical structure between the two texts. André Lefevere (1976)
emphasizes preserving the structures of the SL text as closely as possible, but not so closely that
the TL structures are distorted. In his opinion, translation as a process should be syntax-oriented.
In the example below, the prepositional phrase ‘for cosmetic surgery’ is mapped by both Google and
Bing onto the postpositional phrase ‘kasmeTik sarjari ke liye’ in Urdu.
For example,
English: We produce lasers for cosmetic surgery.
Urdu Google: həm kasmeTik sarjari ke liye lasers ke paidaa karte hai.N
Cosmetic surgery for
Urdu Bing: həm lasers kasmeTik sarjari ke liye paidaa kar rahe hai.N
Cosmetic surgery for


2.1.3. Stylistic equivalence:
It refers to the similarity in the perceived meaning of the translated message, or in its influence
on the reader’s mind. In other words, there is functional equivalence of elements in both the
original and the translation, aiming at an expressive identity with an invariant of identical meaning.
Idiomatic or multiword expressions are crucial for both manual and machine translation, as one
needs to consider the socio-cultural milieu of a given language. In the example below, Google
translates the English idiomatic sentence correctly, taking meaning and socio-cultural context into
consideration, whereas Bing translates it word for word and thereby gets it wrong.
For example,
English: Son is the apple of mother’s eyes.
Urdu Google: beTaa maa.N ki aankho.N kaa sitaaraa hai
Urdu Bing: beTaa maa.N ki aankho.N kaa sev hai
“In translation, there is substitution of TL meanings for SL meanings: not transference of TL meanings
into the SL”- J.C. Catford (1965). In transference, there is an implantation of SL meanings into the TL
text. These two processes of substitution and transference must be clearly differentiated in any theory
of translation. Lefevere (1976) has emphasised the approximation of meanings between the SL and
TL texts. However, encoding the semantic aspect of language in a machine is a daunting task, and it is
especially relevant for MT where the input and output texts are linguistically quite divergent.
2.1.4. Textual or syntagmatic equivalence:
It takes into consideration the similarity in the organizational structure and form of the texts. To
put it differently, if there is equivalence in the syntagmatic structuring of a text, i.e. equivalence
of both form and shape, it is known as textual equivalence. Preserving the form and shape of the
source text while translating is difficult, and failure to do so causes the translation output to
collapse.
For instance,
English: She planned the event all by herself.
Urdu Google: wo tamaam khud kI taraf se wagiyaa kI mansubaa bandI kI
Urdu Bing: unho.N-ne taqrIb kI taraf se sab-ne khud kaa mansubaa banaayaa
2.2. Comparison of Equivalence on Google and Bing:

The statistical data (see Figure 3) represent the rate of equivalence on the Google and Bing MT
platforms at four major levels: linguistic, paradigmatic, stylistic and textual. On both platforms,
the highest rate is registered for linguistic equivalence and the lowest for stylistic equivalence.
For paradigmatic equivalence, Google registers a 33% accuracy rate, which is around 2 percentage
points higher than that of Bing Translate. Similarly, Bing registers a rate about 3 percentage points
lower than Google for textual equivalence.
The analysis of the results from both platforms shows that Bing translates English-to-Urdu texts
word for word, as a result of which it achieves higher linguistic equivalence than Google. As far as
the other categories are concerned, there is not much difference between the two. With regard to
stylistic equivalence, it is worth mentioning that both platforms fail in translating multiword
expressions, which include pair and compound words and reduplicated, abbreviated and idiomatic
expressions, among others. A machine can hardly be made to comprehend, process, parse and translate
higher levels of syntax and semantics well enough for the best translations. Overall, Google performs
better than Bing in all areas of equivalence except linguistic equivalence.

[Figure 3: bar chart. Y-axis: Equivalence Rate in %; x-axis: MT Platforms (Google, Bing); series: Linguistic, Paradigmatic, Stylistic, Textual/syntagmatic]
Figure 3. Equivalence Rate in Google and Bing
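The rates compared in this section can be read as per-category shares of the evaluated sentences. The following Python sketch shows one way such shares could be computed. It assumes that each output sentence has been manually assigned to exactly one category; the labelling scheme, the function name equivalence_rates and the toy labels are illustrative assumptions, not the procedure or the data of this study.

```python
from collections import Counter

CATEGORIES = ("linguistic", "paradigmatic", "stylistic", "textual")


def equivalence_rates(labels):
    """Return each category's share of the labelled sentences, in percent."""
    if not labels:
        raise ValueError("at least one labelled sentence is required")
    counts = Counter(labels)
    total = len(labels)
    return {category: 100.0 * counts[category] / total for category in CATEGORIES}


# Toy labels for a hypothetical 10-sentence sample (not data from the study).
toy_labels = (
    ["linguistic"] * 4 + ["paradigmatic"] * 3 + ["stylistic"] * 1 + ["textual"] * 2
)
rates = equivalence_rates(toy_labels)

for category, rate in rates.items():
    print(f"{category}: {rate:.0f}%")
```

Running the same function separately on the labels for each platform would give two sets of rates that can be compared category by category.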
3. Gain and Loss:
"Every translation entails a loss by comparison with the original" Wolf Harranth (1991)
When a text or communication in one code is translated into another, it is inevitable that something
is gained while some elements are lost, which results in miscommunication. Therefore, it is next to
impossible to achieve complete equivalence. The issue of loss and gain owes its genesis to the cultural
dissimilarity and divergent linguistic structures between two linguistic communities. The more
divergent the structures are, the more error-prone the translation becomes. As a result, it is
essential for any MT platform to observe and incorporate the divergent patterns between a pair of
languages so that its accuracy can be enhanced.
3.1. Motivation:
So far as the translation of literary texts is concerned, gain and loss differ in nature from those in
other genres, for instance technical texts, natural language texts and so on. Gain and loss in
literary texts owe much to the figurative usage of the two languages involved in the process of
translation. By figurative usage, we refer to the fact that figures of speech such as simile,
metaphor, irony, paradox, humor, word-play, metonymy, synecdoche and so on are abundantly employed to
lend flamboyance and sublimity to the language and its impact. The issue of ‘gain’ crops up when an
overenthusiastic translator inadvertently over-translates the text at hand. Divergence is an umbrella
term used in the area of Machine Translation to cover both gain and loss, which are responsible for
the inefficiency of the systems. Translation here refers to the process of rendering a source language
(SL) text into a target language (TL) by a human translator who applies all the meta-linguistic and
contextual knowledge and takes almost all factors into account. Therefore, it is natural that there
are fewer erroneous linguistic patterns in human translation than in automated MT output, which is
produced with the assistance of computers. Divergence emerges due to the parametric variations between
the languages involved in the process. According to Dorr (1993), “translation divergence arises when the
natural translation of one language into another results in a very different form than that of the original.”
As discussed in Muzaffar et al. (2016b), English is a European language while Urdu belongs to the
Indo-Aryan (IA) group of languages. There are many mutually incongruent features related to
morphology, syntax, semantics, and discourse. In consonance with the other IA languages, Urdu has a
rich morphology and allows scrambling as a syntactic process. In contrast, English has a weak
morphology and a fixed word order. The expletive ‘it’ and existential ‘there’ subjects are quite
common in English but have no counterpart in Urdu. In addition, Urdu has lexically marked honorifics
on its verbs, whereas English does not. This divergence can be ascribed to the dissimilarity between
the two cultures. Some instances of gain and loss from natural language text are explicated below.
4. Instances of Gain and Loss:
This section has been divided into two broad sub-sections: Linguistic and Cultural.
4.1. Linguistic
4.1.1. Word order
English is not a free word-order language: unlike Urdu and other Indian languages, it does not allow
scrambling. That is to say, in Urdu, if one changes the order of subject, object and verb, the
sentence still reads meaningfully in all cases. English follows a rigid configurational SVO word
order, whereas Urdu has several acceptable orders such as SOV, SVO and OVS (Muzaffar et al., 2016b),
as in the following examples.
For example,
(Eng) Qasim is feeding the baby.
S V O
(Urdu 1) Qasim bacche ko khilaa rahaa hai.
S O V
(Urdu 2) bacche ko khilaa rahaa hai Qasim
O V S
(Urdu 3) Qasim khilaa rahaa hai bacche ko
S V O
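The three Urdu orders in the examples above can be generated mechanically from the same constituents. The following is a minimal Python sketch; the names CONSTITUENTS, URDU_ORDERS and linearize are illustrative and do not belong to any MT system, and the constituent split simply follows the S/O/V labels of the examples.

```python
# Constituents of the example sentence, keyed by grammatical role.
CONSTITUENTS = {"S": "Qasim", "O": "bacche ko", "V": "khilaa rahaa hai"}

# Word orders illustrated for Urdu in the examples.
URDU_ORDERS = ("SOV", "OVS", "SVO")


def linearize(order, constituents=CONSTITUENTS):
    """Join the constituents in the sequence given by a role string such as 'SOV'."""
    return " ".join(constituents[role] for role in order)


urdu_variants = {order: linearize(order) for order in URDU_ORDERS}

for order, sentence in urdu_variants.items():
    print(f"{order}: {sentence}")
```

An English generator built the same way would need only the single order SVO, which is the asymmetry an English-Urdu system has to handle.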
4.1.2. Gerunds and Participles:
Gerundive and participial constructions are crucial for gain and loss. In all the instances below,
one can observe that English gerunds, participles (both adjuncts and complements) and to + infinitive
constructions are realized by different structures in Urdu. Therefore, such parametric differences
between English and Urdu need to be taken into account.
For instance,
(Eng) To do (doing) exercise is good for health
(Urdu) warzish karnaa sehad kI behatarI ke liye achhaa hai.
Exercise to do health of improvement for good be-PRS.IMPFV
(Eng) He is not able to do this.
(Urdu) wah yah karne ke qaabil nahI.N hai.N
He-3MSG.NOM it doing of able not be-3MSG.PRS.IMPFV.HON.
(Eng) We would like to read.
(Urdu) hum pa.Dhanaa chaahate hai.N (Sinha and Thakur, 2005)
We-1PL.NOM to read want-1PL.IMPFV be-1PL.PRS
(Eng) They have come to serve you.
(Urdu) wo log aapke khidmat mei.N haazir hai.N
Those people your service in present be-3PL.PRS.IMPFV.
4.1.3. Mapping have-verbs in Urdu:
Some English sentences with have-verbs and with copular verbs are quite difficult to map into Urdu.
In English sentences with have-verbs, subjects such as the first person singular ‘I’ and the third
person singular ‘he’ carry no case inflection, and the have-verb inflects for tense. In contrast, the
corresponding Urdu sentences mark the same subjects with oblique case endings, and the verb agrees
with the object. In addition, the Urdu counterparts of English have-verbs are morphologically
identical to the copular verbs.
For instance,
(Eng) He has courage.
(Urdu) usme kaabiliyat hai.
He-LOC ability-3FSG.NOM have-3SG.PRS.IMPFV
(Eng) I have three watches.
(Urdu) mere paas tIn gha.Diyaa.N hai.N
I with three watches-3FPL.NOM have-3FPL.PRS.IMPFV
4.1.4. Optative Mood Constructions:
Optative constructions contain two clauses: independent and dependent. The former contains the
finite verb whereas the latter contains the non-finite verb. The Urdu verbs in the subordinate clause
take inflectional markers for the person, number and gender of the subject. In English, on the other
hand, the verb forms remain constant and keep their root forms in some cases, as in the following
examples. The first instance is an exception, as it is a passive sentence in the optative mood.
For instance,
(English) I want that my letter be sent to me.
(Urdu) Mai.N chaahataa hu.N kI meraa khat mere hawaale kiyaa jaaye.
I-1SG.NOM want-1MSG.IMPFV do-1MSG.PRS my letter my control be done
(English) We want that Rahim succeed.
(Urdu) hum log chahte hEN ki rahim kaamyaab ho.
We-1PL.NOM want-1PL.IMPFV do-1PL.PRS succeed be
4.1.5. Complex Predicates:
Conjunct verbs consist of a noun or an adjective followed by a verb. In Hindi and Urdu, such
combinations “semantically denote an action or a process or a state” as a complete whole
(Begum et al., 2011; Muzaffar et al., 2015 & 2016).
The most frequent verbalizers in Urdu are /karanA/ ‘to do’, /honA/ ‘to be’, /denA/ ‘to give’, /lenA/ ‘to
take’, /AnA/ ‘to come’ and so on (Muzaffar et al., 2016).
For example,
(English) I helped Raam.
(Urdu) Maine raam kI madad kI
I-1SG.ERG Ram GEN help-3FSG.PST.PRFV
4.1.6. Object-verb Agreement
“In Indian languages verbs agree with both the subject and the object; provided some conditions are
fulfilled” (Behera et al., 2016). In Hindi (Jha et al., 2014), Urdu (Muzaffar et al., 2015; Muzaffar &
Behera, 2014) and Marathi, sentences with oblique subjects (both ergative and non-nominative) usually
show object-verb agreement. In the instance below, the verb agrees with the person, number and gender
of the object ‘naak’.
For instance,
(English) Afrin has made us ashamed.
(Urdu) AfarIn-ne naak kaTwaa dI
Afrin-3FSG.ERG nose-3FSG cut-CAUS. Make-3FSG.PST.PRFV
4.2. Socio-cultural:
These factors depend entirely on the societal and cultural norms that are reflected in the
linguistic aspects. Both factors have been adapted from Sinha & Thakur (2005).
4.2.1. Honorificity Markers:
In Urdu, honorific features are marked by pluralization of the verb. There are also specific
morphological forms of pronominal elements that are honorific in nature. These features are not
marked in English.
(English) My father has arrived.
(Urdu) Mere waalid aa chuke hEN
My father-3MSG.NOM. come have-HON be-3MSG.PRS.HON
4.2.2. Mapping of Time
“Usually, people’s perception of different objects in the world is dependent upon several sociocultural
beliefs. For instance, time is conceptualized in the Indian culture differently than that is done in the
Western culture” (Sinha & Thakur, 2005). The concepts of a.m. and p.m. can hardly be mapped into
Urdu with the required exactness. In the examples below, a.m. corresponds to ‘fajr’, while p.m. is
spread over temporal words such as ‘zohar’, ‘asar’, ‘maghrib’ and ‘aeshaa’.
(English) He came around 5 am.
(Urdu) wo fajr ke waqt aayaa.
(English) He came around 1 pm.
(Urdu) wo zohar ke waqt aayaa.
(English) He came around 5 pm.
(Urdu) wo asar ke waqt aayaa.
(English) He came around 6:30 pm.
(Urdu) wo maghrib ke waqt aayaa.
(English) He came around 8:00 pm.
(Urdu) wo aeshaa ke waqt aayaa.
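The correspondence shown in the examples above can be written as a small lookup. The following Python sketch is only illustrative. It records the five clock times used in the examples and, as a simplifying assumption that is not part of this study, maps any other time to the nearest recorded example; actual boundaries between these periods vary with date and location. The names EXAMPLES and urdu_time_word are hypothetical.

```python
# (hour, minute) in 24-hour time for each English example -> Urdu temporal word
# used in its translation.
EXAMPLES = {
    (5, 0): "fajr",
    (13, 0): "zohar",
    (17, 0): "asar",
    (18, 30): "maghrib",
    (20, 0): "aeshaa",
}


def urdu_time_word(hour, minute=0):
    """Return the temporal word of the example closest to the given clock time.

    The nearest-example rule is an illustrative assumption, not a rule from the paper.
    """
    target = hour * 60 + minute
    nearest = min(EXAMPLES, key=lambda hm: abs(hm[0] * 60 + hm[1] - target))
    return EXAMPLES[nearest]


for hour, minute in EXAMPLES:
    print(f"{hour:02d}:{minute:02d} -> {urdu_time_word(hour, minute)}")
```

A lookup of this kind shows why the two-way a.m./p.m. split cannot be transferred directly: one English marker fans out into several Urdu words.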
5. Conclusion:
In this paper, we have dealt with the concepts of equivalence, gain and loss for the English-Urdu
language pair by classifying and analyzing data from Google and Bing. The rationale for taking up
equivalence along with gain and loss is to observe the cases where the structures of the SL and TL are
similar and where they are divergent. For equivalence, we have adhered to the framework provided
by Popovic; for gain and loss, we have used our own categorization. This analytical study should
prove fruitful in making machine translation platforms more efficient, as it provides a detailed
analysis of the kinds of linguistic patterns that can complicate the translation process.
From the above discussion, it can be averred that language and culture play a prominent role
in determining the margin of gain and loss between the SL and TL.
Acknowledgements:
We are hugely indebted to the Google and Bing web-based MT platforms for the translation of English
to Urdu texts.
References:
1. As-Safi, A. B. (2006). Loss & Gain and Translation Strategies with Reference to the
Translations of the Glorious Qur’an. Atlas Stud. Res.
2. Bassnett, Susan. Translation Studies. (1980). Revised edition 1991. London: Routledge.
3. Begum, R., Jindal, K., Jain, A., Husain, S., & Sharma, D. M. (2011). Identification of conjunct
verbs in Hindi and its effect on parsing accuracy. Computational Linguistics and Intelligent Text
Processing, Springer Berlin Heidelberg, 29-40.
4. Behera, P., Maurya, N., & Pandey, V. (2016). Dealing with Linguistic Divergences in
English-Bhojpuri Machine Translation. Proceedings of the 6th Workshop on South and Southeast Asian
Natural Language Processing, ACL, pp. 103-113, Osaka, Japan.
5. Behera, P., Muzaffar, M., Ojha, A. K., & Jha, G. N. (2016a). The IMAGACT4ALL Ontology
of Animated Images: Implications for Theoretical and Machine Translation of Action Verbs
from English-Indian Languages. Proceedings of the 6th Workshop on South and Southeast
Asian Natural Language Processing, ACL, pp. 64-73, Osaka, Japan.
6. Dorr, B. (1993). Machine Translation: a View from the Lexicon. The MIT Press, Cambridge,
Mass.
7. Dorr, B. (1994). Classification of Machine Translation Divergences and a Proposed Solution.
Computational Linguistics 20(4):597-633.
8. Dave, S., Parikh, J., & Bhattacharya, P. (2001). Interlingua-based English-Hindi Machine
Translation. Journal of Machine Translation, 16(4), 251-304. Retrieved on 11.10.2015.
9. Dash, N. S. (2013). Linguistic Divergences in English to Bengali Translation, International
Journal of English Linguistics; Vol. 3, No. 1; 2013.
10. Gautam, T. R. (2012). Loss and gain in translation from Hindi to English: a stylistic study of
multiple English translations of Premchand’s Godaan and Nirmala.
11. Gupta, D. and Chatterjee, N. (2003). Identification of Divergence for English to Hindi EBMT.
In Proceeding of MT Summit-IX, pp. 141-148.
12. Harranth, W. (1991). Das Übersetzen von Kinder- und Jugendliteratur. JuLit Information no.
1:23-27.
13. Jha, G. N., Hellan, L., Beermann, D., Singh, S., Behera, P., and Banerjee, E. (2014). Indian
Languages on the TypeCraft Platform–The Case of Hindi and Odia. In WILDRE-2, LREC2014.
14. Muzaffar, S. and Behera, P. (2014). Error Analysis of the Urdu Verb Markers: A Comparative
Study on Google and Bing Machine Translation Platforms. Aligarh Journal of Linguistics
(ISSN- 2249-1511), 4 (1-2), pp 199-208.
15. Muzaffar, S., Behera, P., Jha, G. N., Hellan, L., & Beermann, D. (2015). TypeCraft Natural
Language Database: Annotating and Incorporating Urdu. Indian Journal of Science and
Technology (ISSN-0974-5645), 8(27).
16. Muzaffar, S., Behera, P., & Jha, G. N. (2016). Issues and Challenges in Annotating Urdu Action
Verbs on the IMAGACT4ALL Platform. LREC-2016.
17. Muzaffar, S., Behera, P., & Jha, G. N. (2016a). A Pāniniān Framework for Analyzing Case
Marker Errors in English-Urdu Machine Translation. Procedia Computer Science, 96, 502-510.
18. Muzaffar, S., Behera, P., & Jha, G. N. (2016b). Classification and Resolution of Linguistic
Divergences in English-Urdu Machine Translation. WILDRE: LREC.
19. Online Machine Translation System, The Bing Translator by Microsoft Inc.
retrieved on 14.10.2015.
20. Online Machine Translation System, The Google Translate by Google Inc.
retrieved on 14.10.2015.
21. Saboor, A. & Khan, M.A. (2010). Lexical-semantic Divergence in Urdu-to-English Example
Based Machine Translation, 6th International Conference on Emerging Technologies (ICET),
pp. 316-320.
22. Shaheen, M. (1991). Theories of translation and their applications to the teaching of
English/Arabic-Arabic/English translating (Doctoral dissertation, University of Glasgow).
23. Shukla, V., & Sinha, R. M. K. (2011). Divergence patterns for Urdu to English and English to
Urdu Translation. In Human-Machine Interaction in Translation: Proceedings of the 8th
International NLPCS Workshop (Vol. 41, p. 21). Samfundslitteratur.
24. Sinha, R. M. K., & Thakur, A. (2005). Divergence patterns in machine translation between
Hindi and English. 10th Machine Translation summit (MT Summit X), Phuket, Thailand, 346-353.
25. Syalies, F. N. (2016). A Loss and Gain in Equivalence Analysis of Noun Phrases in Strawberry
Shortcake Bilingual Series Dandanan Kacau Makeover Madness.
26. Venuti, Lawrence. (2000). Ed. The Translation Studies Reader. London: Routledge.
