Hindi Verb Project

Syntax and Morphology in Hindi and Urdu: A Lexical Resource

Alice Davison, U. of Iowa
December 1999

The file which follows is a sample of lexical information about verbs in Hindi and Urdu. It is a searchable database with entries for about 60 verbs, part of a larger database project which is in progress. Each entry has fields for verb attributes, which define the properties that the verb projects into a sentence. This lexical information constrains the possible sentences formed from the verb: for example, the number of arguments which the verb takes, their category and grammatical function, and the case forms which are required or possible. The data fields also contain semantic information about the meaning of the verb, its aspectual class (Aktionsart), and synonymous and derivationally related verbs and nouns, as well as morphological information: whether the verb is a simple verb or a complex predicate compounded of a noun or adjective with a light verb, and whether it has a causative suffix or other overt marker of derivation. There are examples of usage in sentences illustrating the syntactic, morphological and semantic properties of the verb. The notes field lists discussion in published research of the particular properties of that verb or the sub-class to which it belongs, as well as special features of that verb (for example, that the transitive and intransitive senses have the same form, unlike most such related verbs in the language).
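
As a rough illustration of the kind of entry described above, the following Python sketch models one record and a simple search over it. The class names, field names, and the helper find_by_case are hypothetical and are not the project's actual schema; the two entries are placeholders rather than real verbs or real data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Argument:
    """One argument that the verb projects into a sentence."""
    category: str      # syntactic category, e.g. "NP"
    function: str      # grammatical function, e.g. "subject"
    cases: List[str]   # case forms that are required or possible


@dataclass
class VerbEntry:
    """Hypothetical record layout; field names are illustrative only."""
    lemma: str
    gloss: str
    arguments: List[Argument]
    aktionsart: str                 # aspectual class
    is_complex_predicate: bool      # noun/adjective + light verb?
    has_causative_suffix: bool
    related_forms: List[str] = field(default_factory=list)
    examples: List[str] = field(default_factory=list)
    notes: str = ""


def valency(entry: VerbEntry) -> int:
    """Number of arguments the verb takes."""
    return len(entry.arguments)


def find_by_case(entries: List[VerbEntry], function: str, case: str) -> List[VerbEntry]:
    """Entries in which the argument with the given function allows the given case."""
    return [
        e for e in entries
        if any(a.function == function and case in a.cases for a in e.arguments)
    ]


# Placeholder entries (invented for illustration, not actual Hindi/Urdu data).
entries = [
    VerbEntry(
        lemma="VERB_A",
        gloss="placeholder",
        arguments=[
            Argument("NP", "subject", ["ergative"]),
            Argument("NP", "direct object", ["nominative", "dative"]),
        ],
        aktionsart="accomplishment",
        is_complex_predicate=False,
        has_causative_suffix=False,
    ),
    VerbEntry(
        lemma="VERB_B",
        gloss="placeholder",
        arguments=[
            Argument("NP", "subject", ["dative"]),
            Argument("NP", "direct object", ["nominative"]),
        ],
        aktionsart="state",
        is_complex_predicate=True,
        has_causative_suffix=False,
    ),
]

dative_subject_verbs = find_by_case(entries, "subject", "dative")
print([e.lemma for e in dative_subject_verbs])
```

Keeping the arguments as a list of small records, rather than as free text, is one way to make the number, category, function and case of the arguments directly searchable.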

Hindi and Urdu have a number of features which are of interest in linguistic research. Hindi/Urdu is a head-final language with postpositional case marking. Some postpositions are associated with grammatical function, some with specific roles associated with the meaning of the verb. The full database can be used to test hypotheses about the general linking principles governing the syntactic projection and case properties which can be derived from lexical information. Some issues involve syntactic properties such as transitivity. The language has split ergative subject marking, which (in perfective finite clauses) is required for some transitive verbs, optional for others, and prohibited on the subjects of some bivalent verbs; the verbs which prohibit it seem to be just the ones which cannot mark the direct object with the dative postposition. Other issues involve semantic classes, such as psychological verbs of emotion, physical state, and mental perception. These verbs project a variety of sentence types, involving nominative, dative or ergative case. Morphological issues include a characterization of causative derivation and of related transitive and intransitive verbs. Another issue of particular prominence in Hindi and Urdu involves complex predicates formed with words from a variety of sources, including Sanskrit, Persian, Arabic and English. There are many nearly synonymous doublets: simple verbs and complex predicates, or complex predicates formed from Hindi/Urdu nouns and adjectives and from words borrowed from another source. The syntactic projections of these verbs are often quite different from one another, particularly in the grammatical function and case of the thematic object.
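
The suggested correlation between ergative marking and dative object marking is the kind of hypothesis that could be checked mechanically against such a database. The Python sketch below is only an assumption about how that check might look: the CaseProfile record, its field names, and the counterexamples function are hypothetical, the input is assumed to contain bivalent verbs only, and the four profiles are invented placeholders, not findings.

```python
from typing import List, NamedTuple


class CaseProfile(NamedTuple):
    """Hypothetical summary of one bivalent verb's case properties."""
    lemma: str
    ergative_subject: str   # "required", "optional", or "prohibited"
    dative_object: bool     # can the direct object take the dative postposition?


def counterexamples(profiles: List[CaseProfile]) -> List[CaseProfile]:
    """Verbs that break the hypothesis:
    ergative subject prohibited <=> direct object cannot be dative."""
    return [
        p for p in profiles
        # The hypothesis holds when "prohibited" and "dative possible" differ,
        # so a verb is a counterexample when the two values coincide.
        if (p.ergative_subject == "prohibited") == p.dative_object
    ]


# Placeholder profiles (invented for illustration only).
profiles = [
    CaseProfile("VERB_1", "required", True),
    CaseProfile("VERB_2", "optional", True),
    CaseProfile("VERB_3", "prohibited", False),
    CaseProfile("VERB_4", "prohibited", True),   # deliberately violates the hypothesis
]

print([p.lemma for p in counterexamples(profiles)])
```

An empty result over the full database would be consistent with the correlation; any verbs returned would be the cases needing closer study.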

The lexicon of Hindi/Urdu appears in many ways to encode semantic information in a different way from a language like English. In many cases, the meaning of a verb in Hindi/Urdu is broader than the meaning of a translation equivalent. While inherent verbal aspect is determined not just by verb meaning, but also by the properties of the object, aspect in Hindi/Urdu is somewhat less specified by the main verb. The sentence aspect/tense affixes as well as verbal compounds define the interpretation as state, achievement, activity or accomplishment, in combination with some core verb meaning. Some goals of the project are (1) to find a series of tests which specify inherent aspect in verb meanings more narrowly than the currently used tests for sentences and (2) to make clearer the links between semantic class and aspect type, and the kind of case which is selected for the verb arguments.

A major goal of this project is to make accessible, in one resource, many types of information which are otherwise not readily available, previously unknown, or accessible only to people who can read the non-Roman script in which some or all of the information is represented in published sources. The fields of information in each entry allow comparison with other verbs, and this information is stated in a form which is intended to be useful across particular theoretical approaches.