Download: master's thesis in English translation studies titled "A generic framework for Arabic to English machine translation of simplex sentences using the Role and Reference Grammar linguistic model", 191 pages, outlined as follows:
File type: editable PDF
A generic framework for Arabic to English machine translation of simplex sentences using the Role and Reference Grammar linguistic model
Abstract
The aim of this research is to develop a rule-based lexical framework for Arabic language
processing using the Role and Reference Grammar linguistic model. A system called
UniArab is introduced to support the framework. UniArab takes Modern Standard
Arabic (MSA) as input in the native orthography, parses
the sentence(s) into a logical meta-representation and, using this, generates grammatically
correct English output with full agreement and morphological resolution. UniArab
utilizes an XML-based implementation of elements of the Role and Reference Grammar
theory and its representations for the universal logical structure of Arabic sentences.
Role and Reference Grammar (RRG) is a functional theory of grammar that posits a
direct mapping between the semantic representation of a sentence and its syntactic representation.
The theory allows a sentence in a specific language to be described in terms
of its logical structure and grammatical procedures. RRG creates a linking relationship
between syntax and semantics, and can account for how semantic representations are
mapped into syntactic representations. We claim that RRG is highly suitable for machine
translation of Arabic via an Interlingua bridge implementation model. RRG is a monostratal
theory, positing only one level of syntactic representation, the actual form of the
sentence, and its linking algorithm can work in both directions: from syntactic representation
to semantic representation, or vice versa. In RRG, semantic decomposition of predicates
and their semantic argument structures are represented as logical structures. RRG
takes the position that lexical entries for verbs should contain unique information
only, with as much information as possible derived from general lexical rules.
For this reason, and due to the functional nature of our linguistic model, we create
our own lexicon.
We use the RRG theory to motivate the architecture of the lexicon and the RRG bidirectional
linking system to design and implement the parse and generate functions between
the syntax-semantic interfaces. Through an input process with seven phases, including
morphological and syntactic unpacking, UniArab extracts the universal logical structure
of an Arabic sentence. Using the XML-based metadata representing the RRG logical
structure (XRRG), UniArab accurately generates an equivalent grammatical sentence in
the target language through four output phases. We outline the conceptual structure of
the UniArab system, which utilizes the framework and translates Arabic
into another natural language. We follow the Interlingua design approach for machine
translation: we analyse the Arabic sentences to create a universal, abstract logical representation,
and from this representation we generate English translations.
We also explore how the characteristics of the Arabic language will affect the development
of a Machine Translation (MT) tool. Several characteristics of Arabic pertinent
to MT will be explored in detail with reference to some potential difficulties that they
present. We will conclude with a proposed model incorporating the Role and Reference
Grammar techniques to achieve this end. The UniArab system has been tested by generating
equivalent grammatical sentences in English, via the universal logical structure of
Arabic sentences, from MSA input, with accurate results.
It provides more accurate translations than the automated translators from
Google and Microsoft, although these systems have much wider coverage than UniArab
at present. The free word order of Arabic and the challenges of incorporating transitivity
into the logical structure will be outlined in detail. This research demonstrates the
capabilities of the Role and Reference Grammar as a base for multilingual translation
systems.
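The Interlingua approach described in the abstract (analyse Arabic into a language-neutral logical structure, then generate English from it) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the XML element and attribute names, the three-word lexicon, the `ENGLISH_VERBS` table and all function names are hypothetical and are not taken from UniArab, whose input and output phases are far more elaborate. The logical-structure notation is a simplified RRG-style rendering of an activity predicate.

```python
import xml.etree.ElementTree as ET

# Hypothetical lexicon: the element and attribute names are invented for this
# sketch and are not taken from UniArab's actual XML schema.
LEXICON_XML = """
<lexicon>
  <entry form="قرأ" pos="verb" predicate="read" tense="past"/>
  <entry form="خالد" pos="proper_noun" en="Khalid"/>
  <entry form="الكتاب" pos="noun" en="book" definite="yes"/>
</lexicon>
"""

# Toy English verb forms, keyed by predicate and then tense (assumed data).
ENGLISH_VERBS = {"read": {"past": "read", "present": "reads"}}

# A verb-subject-object sentence, split on spaces into verb, subject, object.
SENTENCE = "قرأ خالد الكتاب"


def load_lexicon(xml_text):
    """Index lexicon entries by their Arabic surface form."""
    root = ET.fromstring(xml_text.strip())
    return {entry.get("form"): dict(entry.attrib) for entry in root.iter("entry")}


def parse(sentence, lexicon):
    """Map a three-word verb-subject-object sentence to a neutral structure."""
    verb, subject, obj = (lexicon[word] for word in sentence.split())
    if verb["pos"] != "verb":
        raise ValueError("this sketch only handles verb-initial sentences")
    return {"predicate": verb["predicate"], "tense": verb["tense"],
            "actor": subject, "undergoer": obj}


def logical_structure(rep):
    """Render the structure in simplified RRG-style notation."""
    predicate = rep["predicate"]
    actor = rep["actor"]["en"]
    undergoer = rep["undergoer"]["en"]
    return f"do'({actor}, [{predicate}'({actor}, {undergoer})])"


def noun_phrase(entry):
    """Add the English definite article when the entry is marked definite."""
    article = "the " if entry.get("definite") == "yes" else ""
    return article + entry["en"]


def generate(rep):
    """Produce English subject-verb-object order from the neutral structure."""
    verb = ENGLISH_VERBS[rep["predicate"]][rep["tense"]]
    subject = noun_phrase(rep["actor"])
    obj = noun_phrase(rep["undergoer"])
    return f"{subject} {verb} {obj}."


lexicon = load_lexicon(LEXICON_XML)
representation = parse(SENTENCE, lexicon)
print(logical_structure(representation))
print(generate(representation))
```

Run as written, the sketch prints `do'(Khalid, [read'(Khalid, book)])` followed by `Khalid read the book.` The point of the design is that `generate` never looks at Arabic word order: it works only from the neutral structure, which is what lets an interlingua system add a new target language without touching the analysis side.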
Contents
Abstract
Declaration
Acknowledgements
1 Introduction
1.1 Motivation
1.2 Goals
1.3 Technologies
1.4 Thesis organization
2 The Arabic Language
2.1 Characteristics of the Arabic language
2.2 Characteristics of Arabic words
2.2.1 Free word order
2.3 Part of speech inventory of the Arabic language
2.3.1 Noun
2.3.1.1 Definite nouns
2.3.1.2 Indefinite nouns
2.3.2 Adjectives
2.3.3 Adverbs
2.3.4 Verbs
2.3.4.1 Verb tenses
2.3.4.2 Aspect
2.3.4.3 Mood
2.3.4.4 Voice
2.3.4.5 Transitivity
2.3.5 Demonstratives
2.3.6 Others
2.4 Sentence types in Arabic
2.4.1 Equational sentences
2.4.1.1 Verb and noun
2.4.1.2 Verb and two nouns
2.4.1.3 Verb and three nouns
2.4.1.4 Verb and four nouns
2.4.2 The Verbal Sentence
2.4.3 Clause
2.5 Summary
3 Role and Reference Grammar (RRG)
3.1 Role and Reference Grammar linguistic model
3.2 Formal representation of layered structure of the clause
3.2.1 Representing the universal aspects of the layered structure of the clause
3.2.2 Layered structure of the clause (LSC)
3.2.3 Non-universal aspects of the layered structure of the clause
3.3 Noun phrase structure
3.3.1 NP headed
3.4 Lexical representations for verbs
3.4.1 Agents, effectors, instruments and forces
3.4.2 Change of state verb
3.5 Why we use RRG as the linguistic model
3.5.1 RRG representing the universal aspects of the layered structure of the clause
3.5.2 The lexical representation of verbs and their arguments
3.6 Summary
4 Machine translation strategies
4.1 Advantages of machine translation
4.2 Computational techniques in MT
4.2.1 System design
4.2.2 Interactive systems
4.2.3 Lexical databases
4.2.4 Tokens and tokenization
4.2.5 Syntactic analysis (Parsing)
4.3 Basic machine translation strategies
4.3.1 Multilingual versus bilingual systems
4.3.2 Direct translation
4.3.3 Interlingua
4.3.4 Transfer systems
4.3.5 Statistical machine translation
4.4 Linguistic aspects of MT
4.4.1 Non-Roman alphabet scripts
4.4.2 Lexical ambiguity
4.4.2.1 Category ambiguity
4.4.2.2 Homograph
4.4.3 Syntactic ambiguity
4.4.4 Structural differences
4.5 Challenges of Arabic to English MT
4.6 Generation
4.6.1 Generation in direct systems
4.6.2 Generation in transfer-based systems
4.6.3 Generation in interlingua systems
4.7 Summary
5 Design of Arabic to English machine translation system based on RRG
5.1 UniArab: Interlingua-based system
5.2 Designing an XML lexicon architecture for Arabic MT based on RRG
5.2.1 An XML-based lexicon
5.2.2 Lexical representation in UniArab
5.2.3 Lexical properties
5.3 Design of test strategy
5.4 Design of evaluation criteria
5.5 Summary
6 UniArab: a proof-of-concept Arabic to English machine translation system
6.1 Conceptual structure of the UniArab system
6.1.1 Technical architecture of the UniArab system
6.1.2 UniArab: Lexical representation in interlingua system
6.2 UniArab: Lexical representation in interlingua system based on RRG
6.2.1 Verb
6.2.2 Common noun
6.2.3 Proper noun
6.2.4 Adjective
6.2.5 Demonstrative
6.2.6 Adverb
6.2.7 Other Arabic words
6.3 UniArab: Generation
6.4 UniArab: Screen design
6.4.1 Lexicon interface
6.5 Technical challenges
6.6 Summary
7 Testing and evaluation
7.1 Evaluation of MT systems
7.2 Sentence tests
7.2.1 Verb-Subject with one argument in different tenses
7.2.2 Gender-ambiguous proper nouns
7.2.3 Verb ‘to be’
7.2.4 Verb ‘to have’
7.2.5 Free word order
7.2.6 Pro-drop
7.2.7 Transitivity of verbs
7.2.7.1 Intransitive
7.2.7.2 Transitive
7.2.7.3 Ditransitive
7.2.8 Limitation of UniArab
7.3 System evaluation
7.4 Summary
8 Conclusion
8.1 Thesis summary
8.2 Summary of thesis contributions
8.3 Future work
References
Appendix
A The author’s publications related to this research
B Buckwalter Arabic transliteration
C List of translatable sentences
D Verbs in lexicon
E The UniArab code
List of Figures
2.1 A classification for the Arabic language syntax
2.2 A classification of clauses in the Arabic language
3.1 Layout of Role and Reference Grammar
3.2 Arabic sentence types; verb subject object or subject verb object (for gloss please see example 3.1)
3.3 Formal representation of the layered structure of the clause
3.4 English Sentence with precore slot and left-detached position
3.5 Operator projection in LSC
3.6 LSC with constituent and operator projections
3.7 Arabic LSC
3.8 The RRG representing the universal aspects of the layered structure of the clause (Van Valin and LaPolla 1997)
4.1 Direct MT system
4.2 Interlingua model with eight language pairs
4.3 Multilinguality transfer model with eight language pairs
4.4 Difference between direct, transfer, and interlingua MT models (Trujillo 1999)
4.5 NP rule (NP -> det n pp)
4.6 PP is attached at a higher level
4.7 Direct MT system
4.8 Semantic generation
4.9 Structure to be generated
4.10 Interlingua model of Arabic MT
5.1 The conceptual architecture of the UniArab system
5.2 Information recorded in the UniArab lexicon
6.1 Layout of Role and Reference Grammar
6.2 The conceptual architecture of the UniArab system
6.3 Generating the right tense for the verbs
6.4 Information recorded on the Arabic verb
6.5 Information recorded on the Arabic noun
6.6 Information recorded on the Arabic proper noun
6.7 Information recorded on the Arabic adjective
6.8 Information recorded on the Arabic demonstrative
6.9 Information recorded on the Arabic adverb
6.10 Information recorded on the other Arabic words
6.11 UniArab’s GUI 1
6.12 UniArab’s GUI 2
6.13 UniArab’s GUI 3
6.14 UniArab’s lexicon interface
7.1 Verb-Subject with one argument
7.2 Verb-Subject with one argument
7.3 Verb-subject agreement 1
7.4 Verb-subject agreement 2
7.5 Gender-ambiguous proper nouns 1
7.6 Gender-ambiguous proper nouns 2
7.7 Verb ‘to be’ 1
7.8 Verb ‘to be’ 2
7.9 Verb ‘to have’ 1
7.10 Verb ‘to have’ 2
7.11 Free word order (Verb Noun Noun scenario one)
7.12 Free word order (Verb Noun Noun scenario two)
7.13 Free word order (Verb Noun Noun scenario three)
7.14 Pro-drop
7.15 Intransitive
7.16 Intransitive with an adverb
7.17 Transitive
7.18 Ditransitive 1
7.19 Ditransitive with 2 NP
7.20 Ditransitive with preposition
7.21 Limitation of UniArab 1
7.22 Limitation of UniArab 2
7.23 Limitation of UniArab 3
List of Tables
2.1 Dual: merely add two letters to achieve dual form in Arabic
2.2 Grammatical gender
2.3 Feminine is different than masculine
2.4 Feminine and masculine in Arabic
2.5 Definiteness in Arabic
2.6 Definiteness example in Arabic
2.7 Free word order
2.8 Noun example in Arabic
2.9 Definite example in Arabic
2.10 Indefinite example in Arabic
2.11 Arabic adjective
2.12 Arabic adverb
2.13 Imperfect tense
2.14 Perfect tense
2.15 Imperfect inflectional forms of word ‘write’
2.16 Perfect inflectional forms of word ‘wrote’
2.17 Future tense in Arabic
2.18 Indicative mood
2.19 Subjunctive mood
2.20 Jussive mood
2.21 Imperative mood
2.22 Particle ‘Lan’
2.23 Nominal sentence
2.24 Kan and its sisters wth¯a
2.25 zanna and its sisters wth¯a
2.26 Informed and showed
2.27 verb(V), subject(S) and object(O)
2.28 subject(S), verb(V) and object(O)
2.29 verb(V), object(O) and subject(S)
2.30 Two simple clauses by subordinating conjunction
3.1 Relationships between the semantic and syntactic units
3.2 Lexical representations for the basic Aktionsart classes
4.1 Modules required in an all-pairs multilingual transfer system
4.2 Derived words from a three-letter-root in Arabic
5.1 Verb 1
5.2 Verb 2
5.3 Test strategy: verb-subject agreement
5.4 Test strategy: demonstrative adjective-noun agreement
5.5 Test strategy: gender-ambiguous proper nouns
5.6 Test strategy: verb ‘to be’
5.7 Test strategy: verb ‘to have’
5.8 Test strategy: free word order (Verb Noun Noun)
5.9 Test strategy: pro-drop
6.1 Verb 1
6.2 Verb 2
6.3 Noun
6.4 Proper Noun
6.5 Adjective
6.6 Demonstrative representative
6.7 Adverb
6.8 Other Arabic words (where ‘NON’ means not applicable)
7.1 Test: Verb-Subject; one argument
7.2 Test: Verb-subject; agreement 1
7.3 Test: verb-subject; agreement 2
7.4 Test: Gender-ambiguous proper nouns 1
7.5 Test: gender-ambiguous proper nouns 2
7.6 Test: Verb ‘to be’ 1
7.7 Test: Verb ‘to be’ 2
7.8 Test: Verb ‘to have’ 1
7.9 Test: Verb ‘to have’ 2
7.10 Test: Free word order (Verb Noun Noun scenario one)
7.11 Test: Free word order (Verb Noun Noun scenario two)
7.12 Test: Free word order (Verb Noun Noun scenario three)
7.13 Test: Pro-drop
7.14 Test: Intransitive 1
7.15 Test: Intransitive 2
7.16 Test: Transitive
7.17 Test: Ditransitive 1
7.18 Test: Ditransitive with 2 NP
7.19 Test: Ditransitive with preposition
7.20 Test: Limitation of UniArab
7.21 Test: Limitation of UniArab 3 using non existing nonsense word