Bàlà Xàn (Balaxan) Corpus of Kurmanji.
The first Kurmanji speech corpus.

Corpus Details

There are currently 58 utterances by one speaker of Kurmanji. The utterances are divided into four categories based on sentence structure: declarative, imperative, interrogative, and exclamatory. The corpus has subtitles in both Kurmanji (Latin alphabet) and English.

Technical Details

Recordings were made with professional microphones. All recordings are in WAV format, recorded at a bit rate of 3072 kbps. The subtitles are in plain-text (TXT) format and use some special characters, such as à and ä.
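As a rough illustration of how the bit rate figure relates to the WAV parameters, the sketch below computes the uncompressed PCM bit rate of a file as sample rate × bits per sample × channels, using Python's standard wave module. It also reads a subtitle file as text. The function names and the UTF-8 encoding are assumptions made for this example; the corpus documentation does not state the sample rate, bit depth, channel count, or text encoding. Note that the wave module handles only uncompressed PCM WAV files.

```python
import wave


def wav_bit_rate_kbps(path):
    """Return the uncompressed PCM bit rate of a WAV file in kbps.

    bit rate = sample rate (Hz) * bits per sample * number of channels
    """
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        bytes_per_sample = wav.getsampwidth()
        sample_rate = wav.getframerate()
    return sample_rate * bytes_per_sample * 8 * channels / 1000


def read_subtitle(path, encoding="utf-8"):
    """Read one subtitle (TXT) file and return its text.

    The encoding is an assumption: UTF-8 is tried here because the
    subtitles contain characters such as à and ä. If decoding fails,
    pass a different encoding.
    """
    with open(path, encoding=encoding) as f:
        return f.read().strip()
```

Several parameter combinations give 3072 kbps, for example 96 000 Hz × 16 bits × 2 channels or 48 000 Hz × 32 bits × 2 channels, so calling wav_bit_rate_kbps on a downloaded recording only confirms the total; inspecting the individual values returned by the wave module shows which combination was actually used.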

Obtaining the corpus

The corpus is available from the LINDAT/CLARIN digital library at the Institute of Formal and Applied Linguistics, Charles University in Prague: http://hdl.handle.net/11372/LRT-1531

You can download the corpus here:

  • imperative
  • interrogative
  • exclamatory
  • declarative

You can listen to samples:

    Balaxan (Bàlàxàn, Balakhan, or written separately: Bala Xan) was one of the first Kurds to enter the Amarlu district in Guilan, Iran.

    Copyright by Adel Rahimi - Attribution License 4.0
    Balaxan Corpus of Kurmanji by Adel Rahimi is licensed under a Creative Commons Attribution 4.0 International License.