Corpus of Historical Low German

HeliPaD: List of tags and empty categories

POS tags

. sentence-final punctuation
, sentence-internal punctuation
' quotation mark
" double quotation mark
+ joins constituent morphemes in compounds
A
ADJ adjective
ADJR adjective, comparative
ADJS adjective, superlative
ADV adverb
ADVR adverb, comparative
ADVS adverb, superlative
ALSO the word ok
B
BE wesan, infinitive
BEDI wesan, past indicative
BEDS wesan, past subjunctive
BEI wesan, imperative
BEPI wesan, present indicative
BEPS wesan, present subjunctive
BG wesan, present participle (uninflected)
BGI wesan, present participle (inflected)
BN wesan, past participle (uninflected)
BNI wesan, past participle (inflected)
C
C complementizer
CODE non-text material (e.g., page or line number, caesura)
CONJ coordinating conjunction
D
D determiner
E
F
FW foreign word
G
GE the prefix ge-
H
HG hebbian, present participle (uninflected)
HGI hebbian, present participle (inflected)
HN hebbian, past participle (uninflected)
HNI hebbian, past participle (inflected)
HV hebbian, infinitive
HVDI hebbian, past indicative
HVDS hebbian, past subjunctive
HVI hebbian, imperative
HVPI hebbian, present indicative
HVPS hebbian, present subjunctive
I
INTJ interjection
J
K
L
M
MAN indefinite man
MD modal verb, infinitive
MDDI modal, past indicative
MDDS modal, past subjunctive
MDI modal, imperative
MDPI modal, present indicative
MDPS modal, present subjunctive
MG modal, present participle (uninflected)
MGI modal, present participle (inflected)
MN modal, past participle (uninflected)
MNI modal, past participle (inflected)
N
N common noun
NEG negation
NPR proper noun
NUM numeral
O
P
P preposition
PRO personal pronoun
PRO$ possessive pronoun
Q
Q quantifier
QR quantifier, comparative (MORE, LESS)
QS quantifier, superlative (MOST, LEAST)
R
RD werthan, infinitive
RDDI werthan, past indicative
RDDS werthan, past subjunctive
RDI werthan, imperative
RDPI werthan, present indicative
RDPS werthan, present subjunctive
RG werthan, present participle (uninflected)
RGI werthan, present participle (inflected)
RN werthan, past participle (uninflected)
RNI werthan, past participle (inflected)
RP adverbial particle (note that RP can adjoin to verbs)
S
T
TO infinitival to
U
UTP the word uton
V
VB infinitive, verbs other than BE, HV, RD, modals
VBDI past indicative
VBDS past subjunctive
VBI imperative
VBPI present indicative
VBPS present subjunctive
VG present participle (uninflected)
VGI present participle (inflected)
VN past participle (uninflected)
VNI past participle (inflected)
W
WADJ wh-adjective
WADV wh-adverb
WPRO wh-pronoun
WPRO$ possessive wh-pronoun
WQ hwethar (as polar-question-introducer)
X
Y
Z
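
The verb tags listed above follow a regular pattern: a stem for the verb class plus a suffix for the form. The following Python sketch rebuilds the tags from that pattern. The names VERB_CLASSES, FINITE_SUFFIXES, PARTICIPLE_SUFFIXES and build_verb_tags are illustrative only and are not part of the corpus or its tools.

```python
# Verb classes with the stems their tags are built from, read off the list above.
# First stem: infinitive and finite forms; second stem: participles.
VERB_CLASSES = {
    "wesan": ("BE", "B"),
    "hebbian": ("HV", "H"),
    "modal": ("MD", "M"),
    "werthan": ("RD", "R"),
    "other verb": ("VB", "V"),
}

FINITE_SUFFIXES = {
    "DI": "past indicative",
    "DS": "past subjunctive",
    "I": "imperative",
    "PI": "present indicative",
    "PS": "present subjunctive",
}

PARTICIPLE_SUFFIXES = {
    "G": "present participle (uninflected)",
    "GI": "present participle (inflected)",
    "N": "past participle (uninflected)",
    "NI": "past participle (inflected)",
}


def build_verb_tags():
    """Map every verb tag to a (verb class, form) pair."""
    tags = {}
    for verb, (long_stem, short_stem) in VERB_CLASSES.items():
        tags[long_stem] = (verb, "infinitive")
        for suffix, form in FINITE_SUFFIXES.items():
            tags[long_stem + suffix] = (verb, form)
        for suffix, form in PARTICIPLE_SUFFIXES.items():
            tags[short_stem + suffix] = (verb, form)
    return tags


VERB_TAGS = build_verb_tags()
```

A lookup such as VERB_TAGS["HVPS"] then gives the verb class and form for a tag, while non-verb tags such as MAN or RP are simply absent from the table.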

Additional attributes

The HeliPaD differs from the Penn historical corpora of English in making more extensive use of attributes separated by a delimiter, ^. Where a part of speech takes more than one of the three attributes, they occur in the order given below (case, person, number). For example, a third-person singular personal pronoun in the dative case is tagged PRO^D^3^SG.

Note that all elements that can receive attributes must receive them. This is the case even for elements that are formally invariant (e.g. PRO$ his, PRO$ iru, Q filo). Participles with attributes are marked with an I (e.g. VGI, VNI); those with no morphological marking have no I (e.g. VG, VN).
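
As a minimal sketch of how such a tag might be read by a script, the Python function below splits a tag on ^ and assigns each attribute to the case, person or number slot, enforcing the fixed order. The attribute values are those given in the Case, Person and Number sections of this page; the function name parse_pos_tag and the dictionary it returns are assumptions made for illustration.

```python
CASES = {"N", "A", "G", "D", "I"}
PERSONS = {"1", "2", "3"}
NUMBERS = {"SG", "PL", "DU"}

# Attribute slots in the order in which they may appear after the part of speech.
SLOTS = [("case", CASES), ("person", PERSONS), ("number", NUMBERS)]


def parse_pos_tag(tag):
    """Split a tag such as 'PRO^D^3^SG' into part of speech and attributes."""
    pos, *attrs = tag.split("^")
    result = {"pos": pos, "case": None, "person": None, "number": None}
    slot_index = 0
    for attr in attrs:
        # Move forward through the slots until one accepts this value;
        # never move backwards, so misordered attributes are rejected.
        while slot_index < len(SLOTS) and attr not in SLOTS[slot_index][1]:
            slot_index += 1
        if slot_index == len(SLOTS):
            raise ValueError(f"unexpected or misordered attribute {attr!r} in {tag!r}")
        result[SLOTS[slot_index][0]] = attr
        slot_index += 1
    return result
```

With this sketch, parse_pos_tag("PRO^D^3^SG") yields the dative case, third person and singular number, whereas a misordered tag such as "N^SG^N" raises a ValueError.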

Case

The possible attribute values for case are ^N, ^A, ^G, ^D and ^I. Our approach differs from that of the YCOE in that we label all relevant parts of speech for case, not just morphologically unambiguous instances.

The following parts of speech are labelled for case:

  • nouns (N, NPR)
  • attributive adjectives (ADJ, ADJR, ADJS, WADJ)
  • inflected participles (VNI, VGI, etc.)
  • quantifiers (Q, QR, QS)
  • determiners (D)
  • numbers (NUM)
  • pronouns (MAN, PRO, PRO$, WPRO, WPRO$)

The case of PRO$ is the case it receives from context (so not always genitive).

Person

The possible attribute values for person are ^1, ^2 and ^3.

The following parts of speech are labelled for person:

  • pronouns (MAN, PRO, PRO$)
  • finite imperative verbs (BEI, HVI, MDI, RDI, VBI)
  • finite past indicative verbs (BEDI, HVDI, MDDI, RDDI, VBDI)
  • finite past subjunctive verbs (BEDS, HVDS, MDDS, RDDS, VBDS)
  • finite present indicative verbs (BEPI, HVPI, MDPI, RDPI, VBPI)
  • finite present subjunctive verbs (BEPS, HVPS, MDPS, RDPS, VBPS)

Number

The possible attribute values for number are ^SG, ^PL and ^DU (the last in Old Saxon only). Our approach differs from the Penn historical corpora of English in that, for consistency, we mark number on nouns as an attribute rather than using the additional tags NS and NPRS.

The following parts of speech are labelled for number:

  • nouns (N, NPR)
  • adjectives (ADJ, ADJR, ADJS, WADJ)
  • inflected participles (VNI, VGI, etc.)
  • quantifiers (Q, QR, QS)
  • determiners (D)
  • pronouns (MAN, PRO, PRO$)
  • finite imperative verbs (BEI, HVI, MDI, RDI, VBI)
  • finite past indicative verbs (BEDI, HVDI, MDDI, RDDI, VBDI)
  • finite past subjunctive verbs (BEDS, HVDS, MDDS, RDDS, VBDS)
  • finite present indicative verbs (BEPI, HVPI, MDPI, RDPI, VBPI)
  • finite present subjunctive verbs (BEPS, HVPS, MDPS, RDPS, VBPS)

The number of PRO$ is the number it receives from context, not its inherent value. So, for example, usa "our" will be treated as singular where it occurs with a singular noun.
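
The three lists above can be combined into a lookup of which attributes a given part of speech is expected to carry. The Python sketch below does this under two assumptions that go beyond the text: that "VNI, VGI, etc." covers the inflected participles of every verb class, and that the restriction of case to attributive adjectives can be ignored at the level of the tag. All names in it are illustrative.

```python
FINITE_VERBS = {
    stem + suffix
    for stem in ("BE", "HV", "MD", "RD", "VB")
    for suffix in ("I", "DI", "DS", "PI", "PS")
}
# Assumption: "VNI, VGI, etc." means the inflected participles of all verb classes.
INFLECTED_PARTICIPLES = {stem + suffix for stem in "BHMRV" for suffix in ("GI", "NI")}

NOUNS = {"N", "NPR"}
ADJECTIVES = {"ADJ", "ADJR", "ADJS", "WADJ"}
QUANTIFIERS = {"Q", "QR", "QS"}
PRONOUNS = {"MAN", "PRO", "PRO$"}

HAS_CASE = (NOUNS | ADJECTIVES | INFLECTED_PARTICIPLES | QUANTIFIERS
            | {"D", "NUM"} | PRONOUNS | {"WPRO", "WPRO$"})
HAS_PERSON = PRONOUNS | FINITE_VERBS
HAS_NUMBER = (NOUNS | ADJECTIVES | INFLECTED_PARTICIPLES | QUANTIFIERS
              | {"D"} | PRONOUNS | FINITE_VERBS)


def expected_attributes(pos):
    """Return the attribute slots, in tag order, listed above for this part of speech."""
    slots = []
    if pos in HAS_CASE:
        slots.append("case")
    if pos in HAS_PERSON:
        slots.append("person")
    if pos in HAS_NUMBER:
        slots.append("number")
    return slots
```

For instance, expected_attributes("PRO") returns all three slots, expected_attributes("NUM") returns only case, and expected_attributes("VBPI") returns person and number.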

Syntactic tags

Basic syntactic tags

Clausal:

  • IP: basic clause, including verb
  • IPX: incomplete IP
  • CP: clausal shell for subordinate and non-declarative clauses
  • CPX: incomplete CP

Nominal:

  • NP: noun phrase
  • WNP: wh- noun phrase
  • NUMP: number phrase
  • QP: quantifier phrase
  • WQP: wh- quantifier phrase
  • ADJP: adjectival phrase
  • WADJP: wh- adjectival phrase
  • PP: prepositional phrase
  • WPP: wh- prepositional phrase

Other:

  • FRAG: sentence fragment
  • PTP: participial phrase
  • QTP: quotative phrase (quoted non-sentential sequences)
  • ADVP: adverbial phrase
  • WADVP: wh- adverbial phrase
  • CONJP: conjunction phrase
  • INTJP: interjection phrase
  • X: incomplete phrase due to text problems

Extended syntactic tags

For nominals:

  • Arguments:
    • -SBJ: subject
    • -OB1: direct or only object
    • -OB2: indirect or second object
    • -PRD: predicate (also for ADJPs and PTPs)
  • Non-arguments:
    • -ADT: adjunct (also for ADJPs and PTPs)
    • -LOC: locative
    • -TMP: temporal
    • -VOC: vocative
    • -POS: possessor (only one immediately dominated by each NP)
  • -RFL: reflexive (not mutually exclusive with above)

For IPs:

  • -MAT: main clause
  • -SUB: subordinate clause, including direct questions
  • -INF: infinitival clause
  • -INF-NCO: non-complement infinitival clause
  • -SMC: small clause
  • -SPE: direct speech (not mutually exclusive with above)

For CPs:

  • -REL: relative clause
  • -FRL: free relative clause
  • -THT: that-clause
  • -ADV: adverbial clause
  • -CMP: comparative clause
  • -QUE: interrogative clause
  • -DEG: degree clause
  • -SPE: direct speech (not mutually exclusive with above)

For ADVPs:

  • -LOC: locative
  • -DIR: directional
  • -TMP: temporal

Miscellaneous:

  • -PRN: parenthetical
  • -LFD: left-dislocated (with later -RSP)
  • -RSP: resumptive (usually with earlier -LFD)
  • -X: expletive subject or associate
  • -1, -2, -3 etc.: indices indicating movement
  • -0, -00, -000 etc.: indices indicating antecedents in instances of ellipsis
  • =0, =00, =000 etc.: indices indicating ellipsis sites

Order of extended tags where they co-occur: clausal (except -SPE), nominal (except -RFL), -RFL, -PRN/-LFD/-RSP, -SPE, -X/-1/-2/-3/-0/-00/-000/=0/=00/=000 etc.
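
To illustrate how a label built from these pieces might be taken apart, the Python sketch below splits a syntactic label into its basic tag, extended tags and index, and checks the extended tags against the order just stated. It assumes that a label carries at most one numeric index and that the index comes last; it also groups -DIR with the nominal extensions for ordering purposes, which the text does not specify. The names parse_label and is_canonical_order are hypothetical.

```python
import re

LABEL_RE = re.compile(
    r"^(?P<category>[A-Z]+)"                 # basic tag, e.g. NP, IP, CP
    r"(?P<extensions>(?:-[A-Z][A-Z0-9]*)*)"  # extended tags, e.g. -SBJ, -OB1
    r"(?P<index>[-=]\d+)?$"                  # optional index (assumed to be last)
)

# Groups of extended tags in the order in which they may co-occur.
ORDER_GROUPS = [
    {"MAT", "SUB", "INF", "NCO", "SMC", "REL", "FRL", "THT", "ADV", "CMP", "QUE", "DEG"},
    {"SBJ", "OB1", "OB2", "PRD", "ADT", "LOC", "DIR", "TMP", "VOC", "POS"},  # DIR placement assumed
    {"RFL"},
    {"PRN", "LFD", "RSP"},
    {"SPE"},
    {"X"},
]


def parse_label(label):
    """Split a label such as 'NP-SBJ-RFL-1' into category, extensions and index."""
    match = LABEL_RE.match(label)
    if match is None:
        raise ValueError(f"not a recognisable label: {label!r}")
    index = match["index"]
    if index is None:
        kind = None
    elif index[0] == "=":
        kind = "ellipsis site"
    elif set(index[1:]) == {"0"}:
        kind = "ellipsis antecedent"
    else:
        kind = "movement"
    return {
        "category": match["category"],
        "extensions": [e for e in match["extensions"].split("-") if e],
        "index": index,
        "index_kind": kind,
    }


def is_canonical_order(extensions):
    """True if the extended tags respect the co-occurrence order."""
    ranks = []
    for ext in extensions:
        for position, group in enumerate(ORDER_GROUPS):
            if ext in group:
                ranks.append(position)
                break
        else:
            raise ValueError(f"unknown extended tag: {ext!r}")
    return ranks == sorted(ranks)
```

Under these assumptions, parse_label("NP-SBJ-RFL-1") gives the category NP, the extensions SBJ and RFL, and a movement index, while is_canonical_order(["SPE", "SUB"]) is false because -SPE must follow the other clausal tags.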

Empty categories

  • *con*: subject elided under co-ordination
  • *exp*: null expletive subject (may be co-referential with a clausal element)
  • *arb*: arbitrary subject in infinitivals
  • *pro*: null subject, none of the above (usually referential)
  • *T*: trace of wh-movement, always coindexed
  • *ICH*: trace of other movement, always coindexed
  • *: other empty category (e.g. in tough-movement)
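
The inventory above lends itself to a simple lookup when scanning terminal nodes. The Python sketch below recognises an empty category and its optional index, assuming that coindexation is written as a hyphen plus digits after the symbol (for example *T*-1), as in other Penn-style treebanks; that notation and the function name describe_empty are assumptions, not something this page states.

```python
import re

EMPTY_CATEGORIES = {
    "*con*": "subject elided under co-ordination",
    "*exp*": "null expletive subject",
    "*arb*": "arbitrary subject in infinitivals",
    "*pro*": "null subject, none of the above",
    "*T*": "trace of wh-movement",
    "*ICH*": "trace of other movement",
    "*": "other empty category",
}
# Traces that, per the list above, are always coindexed.
ALWAYS_COINDEXED = {"*T*", "*ICH*"}

# Symbol, then an optional index; the "-digits" index notation is an assumption.
LEAF_RE = re.compile(r"^(\*(?:[A-Za-z]+\*)?)(?:-(\d+))?$")


def describe_empty(leaf):
    """Return (symbol, description, index) for an empty category, else None."""
    match = LEAF_RE.match(leaf)
    if match is None or match.group(1) not in EMPTY_CATEGORIES:
        return None
    symbol, index = match.group(1), match.group(2)
    if symbol in ALWAYS_COINDEXED and index is None:
        raise ValueError(f"{symbol} should be coindexed: {leaf!r}")
    return symbol, EMPTY_CATEGORIES[symbol], index
```

Ordinary word forms return None, so the function can be used as a filter when counting or extracting empty categories.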