OLiA Discourse Extensions

Ontologies of Linguistic Annotation. Machine-readable tagsets and annotation schemata for more than 100 languages.

The OLiA Discourse Extensions extend the Ontologies of Linguistic Annotation (OLiA) with discourse features. The OLiA ontologies provide a terminology repository that facilitates the conceptual (semantic) interoperability of annotations of discourse phenomena as found in important corpora available to the community, including the RST Discourse Treebank and the Penn Discourse Treebank, as well as in standards such as ISO SemAF and community standards such as CCR.

For full documentation, please see our website.

Coverage

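Discourse phenomena considered here include discourse structure and discourse relations, anaphora, information status and information structure.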

All annotation schemes are formalized as self-contained OWL/DL ontologies (Annotation Models). A declarative linking (Linking Models) connects them to an ontology that provides a generalized vocabulary for discourse annotation (Reference Model). For the latter aspects, we currently provide two ontologies that will subsequently be integrated with the OLiA Reference Model (cf. the provisional linking with the OLiA Reference Model).
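The three-layer arrangement of Annotation Models, Linking Models and a Reference Model can be illustrated with a minimal sketch. It uses plain Python dictionaries instead of OWL, and every scheme name, tag and class name in it is a hypothetical placeholder, not taken from the actual OLiA ontologies.

```python
# Annotation Models: each scheme maps its own tags to classes of its own ontology.
# All scheme names, tags and class names are hypothetical placeholders.
ANNOTATION_MODELS = {
    "scheme_a": {"cause": "a:Cause", "contrast": "a:Contrast"},
    "scheme_b": {"REASON": "b:Reason", "OPPOSITION": "b:Opposition"},
}

# Linking Models: declarative links from annotation-model classes to
# Reference Model concepts (again, hypothetical names).
LINKING_MODELS = {
    "scheme_a": {"a:Cause": "ref:Cause", "a:Contrast": "ref:Contrast"},
    "scheme_b": {"b:Reason": "ref:Cause", "b:Opposition": "ref:Contrast"},
}


def to_reference(scheme, tag):
    """Resolve a scheme-specific tag to a Reference Model concept, or None."""
    annotation_class = ANNOTATION_MODELS.get(scheme, {}).get(tag)
    if annotation_class is None:
        return None
    return LINKING_MODELS.get(scheme, {}).get(annotation_class)


def same_reference_concept(scheme_1, tag_1, scheme_2, tag_2):
    """True if both tags resolve to the same Reference Model concept."""
    ref_1 = to_reference(scheme_1, tag_1)
    ref_2 = to_reference(scheme_2, tag_2)
    return ref_1 is not None and ref_1 == ref_2
```

Keeping the linking separate from the annotation model means a scheme can be re-linked to a revised reference vocabulary without touching the description of the scheme itself.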

The OLiA ontologies currently do not cover dialogue structure, Gricean and Post-Gricean pragmatics, speech act theory, or annotation schemes developed on this basis. In a broad sense, these can be regarded as discourse phenomena, as the distinction between discourse and pragmatics is largely underdefined.

Instead, we follow a pragmatic distinction based on the types of available annotations: we restrict ourselves to the annotation of text (hence, no dialogues), with a particular focus on theories of discourse structure and discourse relations (in the sense of Rhetorical Structure Theory or Segmented Discourse Representation Theory) and on frequently annotated phenomena most often discussed in this context (hence, anaphora, information status and information structure). Further extensions are in progress.

Application

Note that the OLiA discourse model does not constitute an independent theory of discourse. Instead, it focuses strongly on practical application, i.e., the joint handling of heterogeneous annotations from publicly available corpora. Its main function is a practical one: to map annotations from one framework to the specifications of another, and to enable and trace imprecise mappings.

In particular, OLiA makes it possible to perform both precise and imprecise mappings (by subsumption inference or search over subClassOf relations in the reference model), and also to quantify the mismatch between two concepts (using metrics such as the path length in OLiA). At the moment, the OLiA discourse extensions are still considered experimental. In the longer term, they should be formally integrated into the OLiA Reference Model.
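As a rough illustration of search over subClassOf relations and of a path-length metric, the sketch below uses a small hand-written class hierarchy. The concept names and the hierarchy are assumptions made for this example only; they do not reproduce the OLiA Reference Model, and the path length shown here (the shortest route through a common superclass) is only one possible way to quantify mismatch.

```python
from collections import deque

# Hypothetical toy hierarchy (concept -> direct superclasses).
# Not taken from the actual OLiA ontologies.
SUBCLASS_OF = {
    "Cause": ["Contingency"],
    "Condition": ["Contingency"],
    "Contingency": ["DiscourseRelation"],
    "Contrast": ["Comparison"],
    "Comparison": ["DiscourseRelation"],
    "DiscourseRelation": [],
}


def ancestors(concept):
    """Map every superclass of `concept` (and the concept itself) to its
    distance in subClassOf steps, using breadth-first search."""
    dist = {concept: 0}
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        for parent in SUBCLASS_OF.get(current, []):
            if parent not in dist:
                dist[parent] = dist[current] + 1
                queue.append(parent)
    return dist


def is_subsumed_by(concept, candidate):
    """True if `candidate` is `concept` itself or one of its superclasses."""
    return candidate in ancestors(concept)


def path_length(a, b):
    """Shortest number of subClassOf steps connecting `a` and `b` through a
    common superclass, or None if they share none."""
    up_a, up_b = ancestors(a), ancestors(b)
    common = set(up_a) & set(up_b)
    if not common:
        return None
    return min(up_a[c] + up_b[c] for c in common)
```

With this toy hierarchy, `is_subsumed_by("Cause", "Contingency")` holds, so a "Cause" annotation could be mapped imprecisely to "Contingency", while `path_length("Cause", "Condition")` is smaller than `path_length("Cause", "Contrast")`, reflecting the closer relationship between the first pair.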

Despite its name, the OLiA Reference Model is not a prescriptive model (i.e., a standard) for discourse annotation. Instead, it is a tool for mapping annotations from one schema to another, from a particular schema to a reference system such as ISO SemAF or CCR, and from any such standard to a specific annotation scheme. At the moment, SemAF and CCR are linked as external reference models (i.e., OLiA concepts are defined as subconcepts). As soon as similarly authoritative specifications are established for information structure, the OLiA Discourse Extensions will be adjusted accordingly and integrated with the OLiA Reference Model.

Attribution

If you use this resource, please refer to

Chiarcos, C. (2014). Towards interoperable discourse annotation.
Discourse features in the Ontologies of Linguistic Annotation.
In Proceedings of LREC-2014, Reykjavik, Iceland, May 2014 (pp. 4569-4577).

Todos
