Verticalised (.vrt) format

Resembling, and making use of, XML syntax, the verticalised format (VRT, VeRticalised Text) is a “token-oriented columnar text format” (source: https://www.kielipankki.fi/development/korp/corpus-input-format/). Each token occupies its own line, and the tab character (Tab ↹) separates the token from its annotations, such as its lemma and part-of-speech (POS) tag, and potentially any further annotation layers. It is the default input format for the IMS Corpus Workbench ([Evert and Hardie, 2011]) and is also accepted by a number of other corpus tools (e.g. Sketch Engine).
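As a rough illustration of this columnar layout, the following Python sketch writes one token per line, joins each token's attributes with tab characters, and wraps every sentence in a structural tag. The function name `to_vrt`, the column order (word, lemma, POS) and the `<text>`/`<s>` element names are assumptions made for this illustration; the actual columns depend on how the corpus is declared to the tool that reads it.

```python
from xml.sax.saxutils import quoteattr


def to_vrt(sentences):
    """Serialise sentences (lists of (word, lemma, pos) tuples) as VRT text.

    Hypothetical helper: column order and tag names are assumptions.
    """
    lines = ["<text>"]
    for n, sentence in enumerate(sentences, start=1):
        # Structural information goes on its own line, XML-style.
        lines.append(f"<s n={quoteattr(str(n))}>")
        for token in sentence:
            # One token per line; attributes are separated by a tab character.
            lines.append("\t".join(token))
        lines.append("</s>")
    lines.append("</text>")
    return "\n".join(lines) + "\n"


vrt = to_vrt([[("And", "and", "CCONJ"), ("now", "now", "ADV")]])
print(vrt, end="")
```

The sketch does not escape special characters inside tokens; whether that is needed depends on the tool consuming the file.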

In example [e5.29], which shows a sample of the format, the symbol → is the graphical representation of the tab character.

Example [e5.29]
<?xml version='1.0' encoding='UTF-8'?>
<text>
    <s n="1">
        And→and→CCONJ
        now→now→ADV
        ,→,→PUNCT
        for→for→ADP
        something→something→PRON
        completely→completely→ADV
        different→different→ADJ
        !→!→PUNCT
    </s>
</text>
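A file in this shape can be read back with very little code. The sketch below is a minimal, assumption-laden reader, not the parser used by any particular tool: the name `parse_vrt` is hypothetical, a line is treated as structural when it starts with `<`, ends with `>` and contains no tab, and only `<s>` elements are tracked. Note that a real file contains actual tab characters rather than the → symbol used for display in the example.

```python
import re

# Matches an opening sentence tag such as <s> or <s n="1"> (assumed tag name).
SENTENCE_OPEN = re.compile(r"<s(\s[^>]*)?>")


def parse_vrt(text):
    """Return a list of sentences, each a list of token attribute tuples."""
    sentences = []
    current = None  # tokens of the sentence being read, or None outside <s>
    for raw in text.splitlines():
        line = raw.strip()  # drop indentation such as that used in the example
        if not line:
            continue
        # Heuristic (an assumption): structural lines contain no tab.
        if "\t" not in line and line.startswith("<") and line.endswith(">"):
            if SENTENCE_OPEN.fullmatch(line):
                current = []
            elif line == "</s>" and current is not None:
                sentences.append(current)
                current = None
            continue  # other tags (XML declaration, <text>) are skipped
        if current is not None:
            # Token line: split the columns on the tab character.
            current.append(tuple(line.split("\t")))
    return sentences


sample = (
    "<?xml version='1.0' encoding='UTF-8'?>\n"
    "<text>\n"
    '    <s n="1">\n'
    "        And\tand\tCCONJ\n"
    "        !\t!\tPUNCT\n"
    "    </s>\n"
    "</text>\n"
)
parsed = parse_vrt(sample)
print(parsed)
```

Each tuple holds the columns of one token line in file order, so with the layout of the example the first element is the word form, the second the lemma and the third the POS tag.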