To construct the Danish FrameNet, syntactic verb complementation patterns from a Danish valency lexicon are combined with distributional information drawn from Danish text corpora. These corpora are annotated with the DanGram system, which provides syntactic-structural (dependency) information on the one hand and semantic roles on the other. Another layer of information, the ARG0-1-2 labels used in the English PropBank, remains implicit and can be deduced from the combination of semantic roles and syntactic function (the latter also being part of the DanGram annotation).
The system uses about 500 different semantic verb classes (verb sense frames) and 38 semantic roles, the latter modelled on the role set of DanGram's Portuguese sister grammar PALAVRAS and focusing on valency-bound roles. Another 12 free adverbial roles are not treated in the FrameNet lexicon itself, but will be tagged independently by the associated semantic role tagger.
A basic frame extracted from the dependency-semantic annotation will look like this for the verb donate:
§DON (@SUBJ) donate <13.1.1:give> §REC (@DAT) §TH (@ACC)
(§DON = donor, §REC = recipient, §TH = theme, @SUBJ = subject, @DAT = dative object, @ACC = accusative object)
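As an illustration of how this notation can be read mechanically, the following is a minimal Python sketch that splits such a frame line into its verb, verb class and role/function pairs. The names `Argument`, `Frame` and `parse_frame` are hypothetical and not part of DanGram or the Danish FrameNet; the regular expressions assume exactly the layout of the example line above and may not cover other entry formats.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Argument:
    role: str      # semantic role without the "§" marker, e.g. "DON"
    function: str  # syntactic function without the "@" marker, e.g. "SUBJ"


@dataclass
class Frame:
    verb: str
    class_id: str    # numeric class identifier, e.g. "13.1.1"
    class_name: str  # class label, e.g. "give"
    arguments: List[Argument]


# Assumed layout (taken from the example line only):
#   §ROLE (@FUNC) verb <id:name> §ROLE (@FUNC) ...
ARG_RE = re.compile(r"§(\w+)\s*\(@(\w+)\)")
VERB_RE = re.compile(r"(\S+)\s*<([\d.]+):([^>]+)>")


def parse_frame(line: str) -> Frame:
    """Parse one frame line in the notation shown above (hypothetical helper)."""
    verb_match = VERB_RE.search(line)
    if verb_match is None:
        raise ValueError("no 'verb <id:name>' part found in: " + line)
    verb, class_id, class_name = verb_match.groups()
    arguments = [Argument(role, func) for role, func in ARG_RE.findall(line)]
    return Frame(verb, class_id, class_name, arguments)


frame = parse_frame("§DON (@SUBJ) donate <13.1.1:give> §REC (@DAT) §TH (@ACC)")
for arg in frame.arguments:
    print(frame.verb, frame.class_name, arg.role, arg.function)
```

Keeping the role and the syntactic function together in one pair mirrors the way the frame line attaches each § role to an @ function.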
Further differentiation of the 13.1.1:give frame is handled at the level of verb-semantic classes (e.g. 13.5.1:buy and 13.1.2:sell), not at the level of arguments, the difference being implicit in the verb meanings themselves. Thus, a §DON role will be interpreted as seller not because of different role arguments (they remain the same), but because of the governing verb's semantic class of selling. This way, it becomes easier and more consistent to establish frame sets from a corpus, using only limited linguistic revision. The role Seller, for instance, can thus be read as sell-DONor.
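The idea of reading a class-specific role off the combination of verb class and generic role can be sketched in a few lines of Python. The function name `composite_role` and the output format `class-ROLE` are assumptions made for illustration only; the resource itself may represent this combination differently.

```python
def composite_role(class_label: str, role: str) -> str:
    """Combine a verb class label such as '13.1.2:sell' with a generic role.

    Hypothetical helper: the class name after the colon is prefixed to the
    unchanged role tag, so the role inventory itself stays the same.
    """
    class_name = class_label.split(":", 1)[-1]
    return f"{class_name}-{role}"


# The same generic role under two different verb classes from the text.
print(composite_role("13.1.1:give", "DON"))
print(composite_role("13.1.2:sell", "DON"))
```

The role tag is identical in both calls; only the governing verb class changes, which is the point made in the paragraph above.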
Apart from syntactic (@) and semantic (§) functions, our FrameNet entries will also list syntactic form such as icl (non-finite clause), pp (prepositional phrase) and semantic form in the shape of prototypical slot fillers, so-called semantic prototypes, such as <H> (human), <sem-r> (readable semantic product), <con> (container) or <build> (building).
For individual verbs, the Danish FrameNet can be accessed through a Lookup Interface, currently covering 11873 verb frames, with 10990 valency patterns and 6822 lexemes.
Upon completion, the resource will also be distributed in its entirety. For further information, or licensing issues, please contact Eckhard Bick.