Perspectives on a Qualitative Stylometry: The Example of Stylistically Lowered Lexemes
Here we outline an approach to stylometry that aims to be more comprehensive than classical stylistic metrics and their commonly used lexical frequency counts. As a prerequisite, such an approach needs language data as a basis for its stylistic analyses. In this paper, we describe the acquisition of two relevant resources. First, we report on the collection and preparation of CodE Alltag, a German-language email corpus that contains formal expressions as well as informal and personal interactions and thus exhibits high stylistic variability. Second, with a view to analyzing the vulgar, rough, or obscene dimensions of style, we detail the induction of VulGer, a lexical resource covering the lower end of the German language register.
This work is licensed under a Creative Commons Attribution 4.0 International License.