About me

I am Taraka Rama Kasicheyanula. I did my PhD in NLP at the Graduate School in Language Technology, Sweden, followed by a postdoc with Gerhard Jaeger at the University of Tuebingen, Germany. In Tuebingen, I worked with Cagri Coltekin on shared tasks. I then did another postdoc, in healthcare, at the University of Oslo until 2019. After that, I was a faculty member for a few years at the University of North Texas, Denton, TX. My PhD thesis is here. I have worked a lot with Soeren Wichmann and did exciting work on language change.

My resume is here.

Work Experience

Walmart Global Tech (2022–2023)
University of North Texas (2019–2022)
Post-Doctoral Fellow, BIGMED project with University of Oslo, Norway (2017–2019)
Post-Doctoral Fellow, Language Evolution: The Empirical Turn (EVOLAEMP) project (ERC Advanced Grant) with University of Tuebingen, Germany (November 2015–August 2017)

Education

PhD in NLP: University of Gothenburg, Gothenburg, Sweden
Masters: IIIT-Hyderabad, Hyderabad, India
Bachelors: DA-IICT, Gandhinagar, India

Honors and Awards

Ranked first in:
- VarDial 2016 and 2017 shared tasks on dialect classification
- SemEval-2018 shared task on multilingual emoji classification
- CLPsych-2018 Shared Task on Predicting Current and Future Psychological Health from Childhood Essays
Outstanding Reviewer for COLING 2018
Seed grant of USD 5,000 from the College of Information, University of North Texas
NVIDIA Academic GPU grants in 2016 and 2019

Paper database Links

Wordlists links

Dravidian Wordlists: collected basic word lists for most of the Dravidian family.
Linguistic Survey of India Digitization

Selected Papers

Luise Häuser, Gerhard Jäger, Johann-Mattis List, Taraka Rama, and Alexandros Stamatakis. 2024. Are Sounds Sound for Phylogenetic Reconstruction? In Proceedings of the 6th Workshop on Research in Computational Linguistic Typology and Multilingual NLP, pages 78–87, St. Julian’s, Malta. Association for Computational Linguistics. pdf
Søren Wichmann, Taraka Rama; Testing methods of linguistic homeland detection using synthetic data. Philos Trans R Soc Lond B Biol Sci 10 May 2021; 376 (1824): 20200202. pdf
Rama Taraka, Wichmann Søren (2020) A test of Generalized Bayesian dating: A new linguistic dating method. PLoS ONE 15(8): e0236522. pdf
Taraka Rama, Lisa Beinborn, and Steffen Eger. 2020. Probing Multilingual BERT for Genetic and Typological Signals. In Proceedings of the 28th International Conference on Computational Linguistics, pages 1214–1228, Barcelona, Spain (Online). International Committee on Computational Linguistics. pdf
Brekke PH, Rama T, Pilán I, Nytrø Ø, Øvrelid L. Synthetic data for annotation and extraction of family history information from clinical text. J Biomed Semantics. 2021 Jul 14;12(1):11. doi: 10.1186/s13326-021-00244-2. PMID: 34261535; PMCID: PMC8278746. pdf
Dahl, Fredrik A., Taraka Rama, Petter Hurlen, Pål H. Brekke, Haldor Husby, Tore Gundersen, Øystein Nytrø, and Lilja Øvrelid. “Neural classification of Norwegian radiology reports: using NLP to detect findings in CT-scans of children.” BMC Medical Informatics and Decision Making 21, no. 1 (2021): 84. pdf
Çağrı Çöltekin and Taraka Rama. 2018. Tübingen-Oslo at SemEval-2018 Task 2: SVMs perform better than RNNs in Emoji Prediction. In Proceedings of the 12th International Workshop on Semantic Evaluation, pages 34–38, New Orleans, Louisiana. Association for Computational Linguistics. pdf
Çağrı Çöltekin and Taraka Rama. 2017. Tübingen system in VarDial 2017 shared task: experiments with language identification and cross-lingual parsing. In Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial), pages 146–155, Valencia, Spain. Association for Computational Linguistics. pdf
Taraka Rama and Johann-Mattis List. 2019. An Automated Framework for Fast Cognate Detection and Bayesian Phylogenetic Inference in Computational Historical Linguistics. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 6225–6235, Florence, Italy. Association for Computational Linguistics. pdf
Taraka Rama. 2018. Similarity Dependent Chinese Restaurant Process for Cognate Identification in Multilingual Wordlists. In Proceedings of the 22nd Conference on Computational Natural Language Learning, pages 271–281, Brussels, Belgium. Association for Computational Linguistics. pdf
Chundra Cathcart and Taraka Rama. 2020. Disentangling dialects: a neural approach to Indo-Aryan historical phonology and subgrouping. In Proceedings of the 24th Conference on Computational Natural Language Learning, pages 620–630, Online. Association for Computational Linguistics. pdf
Sowmya Vajjala and Taraka Rama. 2018. Experiments with Universal CEFR Classification. In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 147–153, New Orleans, Louisiana. Association for Computational Linguistics. pdf
Taraka Rama and Sowmya Vajjala. 2017. A Telugu treebank based on a grammar book. In Proceedings of the 16th International Workshop on Treebanks and Linguistic Theories, pages 119–128, Prague, Czech Republic. pdf
Taraka Rama, Çağrı Çöltekin, and Pavel Sofroniev. 2017. Computational analysis of Gondi dialects. In Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial), pages 26–35, Valencia, Spain. Association for Computational Linguistics. pdf
Aleksandrs Berdicevskis, Çağrı Çöltekin, Katharina Ehret, Kilu von Prince, Daniel Ross, Bill Thompson, Chunxiao Yan, Vera Demberg, Gary Lupyan, Taraka Rama, and Christian Bentz. 2018. Using Universal Dependencies in cross-linguistic complexity research. In Proceedings of the Second Workshop on Universal Dependencies (UDW 2018), pages 8–17, Brussels, Belgium. Association for Computational Linguistics. pdf