The Principles of Teachers’ Speech Corpus Annotation

Keywords: teaching practices, classroom discourse, ethnographic methods in education studies, spoken corpus


The article describes the principles of creating a corpus of teachers’ speech that makes it possible to apply an ethnographic approach to the study of teaching practices. Through the analysis of a large dataset of real classroom recordings, the corpus aims to identify linguistic, psychological, and sociological factors that contribute to improving teaching effectiveness. The corpus includes audio recordings of lessons in grades 5–8 from several schools in Russia. The corpus is annotated in Praat. To determine which linguistic parameters can influence teachers’ effectiveness and should therefore be annotated, we conducted a survey to find out how students describe an ideal and a poor teacher. Based on the survey results, along with an analysis of existing spoken corpora and of papers in linguistics and education, we developed an annotation system comprising 19 levels. Some of these levels overlap with those found in other spoken corpora (orthographic transcription of words, lemmas, parts of speech, morphological annotation). The following levels are specific to our corpus: the parts of the lesson (organizational stage, introduction of new material, etc.), a level that separates fragments of reading from the rest of the teacher’s speech, four levels for marking pauses, a phonetic transcription level, volume annotation, two levels for error annotation (phonetic and grammatical separately), and four levels related to vocabulary (words with special derivational features, emotionally evaluative vocabulary, word usage domains, discourse markers). The corpus will make it possible to provide recommendations for improving teachers’ speech behavior.
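
The annotation levels listed in the abstract can be pictured as a simple tier inventory. The sketch below is illustrative only: the tier names (`words`, `pause_1`, and so on), the `Tier` class, and the grouping labels are hypothetical and are not taken from the corpus itself. The abstract names 18 levels explicitly out of the 19 it reports, so the sketch lists only those 18 and leaves the remaining one unspecified.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str         # hypothetical label, not the corpus's actual tier name
    group: str        # "general" (shared with other corpora) or "specific"
    description: str  # wording taken from the abstract


# Levels the abstract describes as overlapping with other spoken corpora.
GENERAL = [
    Tier("words", "general", "orthographic transcription of words"),
    Tier("lemmas", "general", "lemmas"),
    Tier("pos", "general", "parts of speech"),
    Tier("morphology", "general", "morphological annotation"),
]

# Levels the abstract describes as specific to the teachers' speech corpus.
# The four pause levels are not named individually in the abstract, so they
# get placeholder names here.
SPECIFIC = (
    [
        Tier("lesson_part", "specific", "parts of the lesson"),
        Tier("reading", "specific", "fragments of reading vs. other speech"),
    ]
    + [Tier(f"pause_{i}", "specific", "pause marking") for i in range(1, 5)]
    + [
        Tier("phonetic", "specific", "phonetic transcription"),
        Tier("volume", "specific", "volume annotation"),
        Tier("error_phonetic", "specific", "phonetic errors"),
        Tier("error_grammatical", "specific", "grammatical errors"),
        Tier("derivation", "specific", "words with special derivational features"),
        Tier("emotive", "specific", "emotionally evaluative vocabulary"),
        Tier("usage_domain", "specific", "word usage domains"),
        Tier("discourse_marker", "specific", "discourse markers"),
    ]
)

TIERS = GENERAL + SPECIFIC


def count_by_group(tiers):
    """Return how many tiers fall into each group."""
    return Counter(tier.group for tier in tiers)


if __name__ == "__main__":
    for group, n in sorted(count_by_group(TIERS).items()):
        print(f"{group}: {n} tiers")
    print(f"enumerated in the abstract: {len(TIERS)} of 19")
```

Keeping the inventory as data rather than prose makes it easy to check that every annotated file carries the expected set of tiers, whatever names the project actually uses.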



How to Cite
Riekhakaynen Elena I., Bratash Valentina S., Zubov Vladislav I., and Sergomanov Pavel A. 2024. “The Principles of Teachers’ Speech Corpus Annotation”. Voprosy Obrazovaniya / Educational Studies Moscow, no. 2 (July).
Research Articles