METU NLP — Publications

OCRTurk: A Comprehensive OCR Benchmark for Turkish

D Yılmaz, EA Munis, C Toraman, SK Köse, B Aktaş, MC Baytekin, ...

Proceedings of the Second Workshop on Natural Language Processing for Turkic Languages (SIGTURK 2026), 2026

RAGTurk: Best Practices for Retrieval Augmented Generation in Turkish

SK Köse, MC Baytekin, B Aktaş, BK Görür, EA Munis, D Yılmaz, MY Kartal, ...

Proceedings of the Second Workshop on Natural Language Processing for Turkic Languages (SIGTURK 2026), 2026

BIRDTurk: Adaptation of the BIRD Text-to-SQL Dataset to Turkish

B Aktaş, MC Baytekin, SK Köse, Ö İlbilgi, EÖ Yılmaz, C Toraman, ...

Proceedings of the Second Workshop on Natural Language Processing for Turkic Languages (SIGTURK 2026), 2026

TurkBench: A Benchmark for Evaluating Turkish LLMs

Ç Toraman, AK Sever, AA Cengiz, EE Arslan, G Sevinç, MM Birdal

arXiv preprint arXiv:2601.07020, 2026

FIBER: A Multilingual Evaluation Resource for Factual Inference Bias

EA Munis, D Yılmaz, A Muti, Ç Toraman

arXiv preprint arXiv:2512.11110, 2025

Evaluating the quality of benchmark datasets for low-resource languages: A case study on Turkish

EE Umutlu, AA Cengiz, AK Sever, S Erdem, B Aytan, B Tufan

Proc. of the 4th Workshop on Generation, Evaluation and Metrics, 2025

OpenEthics: A Comprehensive Ethical Evaluation of Open-Source Generative LLMs

BE Çetin, Y Özen, EN Demiryılmaz, K Engür, C Toraman

arXiv preprint arXiv:2505.16036, 2025

Adapting open-source generative LLMs for low-resource languages: A case study for Turkish

C Toraman

Proc. of the 4th Workshop on Multilingual Representation Learning, 2024

MiDe22: An annotated multi-event tweet dataset for misinformation detection

C Toraman, O Ozcelik, F Şahinuç, F Can

Proc. of the 2024 Joint Int. Conf. on Computational Linguistics, 2024

JL-Hate: An Annotated Dataset for Joint Learning of Hate Speech and Target Detection

K Büyükdemirci, IE Kucukkaya, E Ölmez, C Toraman

Proc. of the 2024 Joint Int. Conf. on Computational Linguistics, 2024

PejorativITy: Disambiguating pejorative epithets to improve misogyny detection

A Muti, F Ruggeri, C Toraman, A Barrón-Cedeño, S Algherini

Proc. of the 2024 Joint Int. Conf. on Computational Linguistics, 2024

ARC-NLP at ClimateActivism 2024: Stance and hate speech detection

A Kaya, O Ozcelik, C Toraman

Proc. of the 7th Workshop on Challenges of Automated Processing, 2024

Detecting Misinformation on Social Media Using Community Insights

O Ozcelik, C Toraman, F Can

ACM Transactions on Intelligent Systems and Technology, 2024

Constructing ensembles for hate speech detection

IE Kucukkaya, C Toraman

Natural Language Processing, 2024

SiMiD: Similarity-based misinformation detection via communities

O Ozcelik, C Toraman, F Can

2023 10th Int. Conf. on Social Networks Analysis, 2023

ARC-NLP at PAN 2023: Writing Style Detection

IE Kucukkaya, U Sahin, C Toraman

arXiv preprint arXiv:2307.14913, 2023

ARC-NLP at PAN 2023: Hierarchical long text classification

U Sahin, IE Kucukkaya, C Toraman

arXiv preprint arXiv:2307.14912, 2023

ARC-NLP at Multimodal Hate Speech Event Detection 2023

U Sahin, IE Kucukkaya, O Ozcelik, C Toraman

arXiv preprint arXiv:2307.13829, 2023

Zero and few-shot hate speech detection in earthquake disaster

U Sahin, IE Kucukkaya, O Ozcelik, C Toraman

2023 31st Signal Processing and Communications Conf. (SIU), 2023

The effect of gender bias on hate speech detection

F Şahinuç, EH Yilmaz, C Toraman, A Koç

Signal, Image and Video Processing 17 (4), 2023

Impact of tokenization on language models: An analysis for Turkish

C Toraman, EH Yilmaz, F Şahinuç, O Ozcelik

ACM Trans. on Asian and Low-Resource Language Information, 2023

Tweets under the rubble: Detection of messages calling for help (v1)

C Toraman, IE Kucukkaya, O Ozcelik, U Sahin

arXiv preprint arXiv:2302.13403, 2023

Tweets under the rubble: Detection of messages calling for help (v2)

C Toraman, IE Kucukkaya, O Ozcelik, U Sahin

arXiv preprint arXiv:2302.13403, 2023

ARC-NLP at CASE 2022: Ensemble learning for protest detection

U Sahin, O Ozcelik, IE Kucukkaya, C Toraman

Proc. of the 5th Workshop on Automated Processing, 2022

Understanding social engagements: Comparative analysis of Twitter

C Toraman, F Şahinuç, EH Yilmaz, IB Akkaya

Social network analysis and mining 12 (1), 2022

Named entity recognition in Turkish: A comparative study

O Ozcelik, C Toraman

Information Processing & Management 59 (6), 2022

D2U: distance-to-uniform learning for out-of-scope detection

E Yilmaz, C Toraman

Proc. of NAACL 2022, 2022

BlackLivesMatter 2020: analysis of deleted and suspended users

C Toraman, F Şahinuç, EH Yilmaz

Proc. of the 14th ACM Web Science Conference, 2022

Large-scale hate speech detection with cross-domain transfer

C Toraman, F Şahinuç, E Yilmaz

Proc. of the 13th Language Resources and Evaluation Conference, 2022

Slot filling for voice assistants

O Ozcelik, EH Yilmaz, F Şahinuç, C Toraman

2022 30th Signal Processing and Communications Conf. (SIU), 2022

ARC-NLP at CheckThat!-2022: Contradiction for Harmful Tweets

C Toraman, O Ozcelik, F Şahinuç, U Sahin

CLEF (Working Notes), 2022

Event-related microblog retrieval in Turkish

Ç Toraman

Turkish Journal of Electrical Engineering & Computer Sciences, 2022

ConQX: Semantic expansion of spoken queries

EH Yilmaz, C Toraman

arXiv preprint arXiv:2109.00729, 2021

Topic Detection based on Deep Learning in Turkish Microblogs

F Şahinuç, C Toraman, A Koç

2021 29th Signal Processing and Communications Conf. (SIU), 2021

Intent classification based on deep learning in Turkish dialogs

EH Yilmaz, C Toraman

2021 29th Signal Processing and Communications Conf. (SIU), 2021

Tweet Length Matters: Topic Detection in Microblogs

F Şahinuç, C Toraman

European Conference on Information Retrieval, 2021

Türkçe mikroblog metinlerinde derin öğrenme tabanlı konu tespiti [Deep learning-based topic detection in Turkish microblog texts]

F Şahinuç, Ç Toraman, A Koç

IEEE, 2021

KLOOS: KL divergence-based out-of-scope intent detection

EH Yilmaz, C Toraman

Proc. of the 43rd international ACM SIGIR conference, 2020

CrossSimON: Probabilistic approach to OSN simulation

J Liu, W Chung, Y Huang, C Toraman

2019 IEEE Int. Conf. on Intelligence and Security Informatics, 2019

SimON-Feedback: Performance tuning in social simulation

M Vora, W Chung, C Toraman, Y Huang

2019 IEEE Int. Conf. on Intelligence and Security Informatics, 2019

Deep learning approach to modeling temporal networks on Reddit

W Chung, C Toraman, Y Huang, M Vora, J Liu

2019 IEEE Int. Conf. on Intelligence and Security Informatics, 2019

Discovering story chains: zigzagged search and news actors

C Toraman, F Can

Journal of the Association for Information Science and Technology, 2017

Early prediction of public reactions using microblogs

C Toraman

7th BCS-IRSG Symposium on Future Directions, 2017

Past, present, and future on news streams (Thesis)

Ç Toraman

Bilkent University, 2017

A front-page news-selection algorithm based on topic modelling

C Toraman, F Can

Journal of Information Science 41 (5), 2015

Türkçe Haber Yazılarında Sosyal Ağların İncelenmesi [Analysis of Social Networks in Turkish News Articles]

Ç Toraman, F Can

17. Akademik Bilişim Konferansı, 2015

News Selection with Topic Modeling

C Toraman

5th BCS-IRSG Symposium (FDIA 2013), 2013

Haber Yığınlarında Konu Başlıklarının Belirlenmesi [Identifying Topic Headings in News Collections]

Ç Toraman, F Can

29. Ulusal Bilişim Kurultayı (Bilişim 2012), 2012

Squeezing the ensemble pruning: Faster news categorization

C Toraman, F Can

European Conference on Information Retrieval, 2012

Ensemble pruning for text categorization (Data Partitioning)

C Toraman, F Can

Asia Information Retrieval Symposium, 2011

Developing a text categorization template for Turkish news

C Toraman, F Can, S Koçberber

2011 Int. Symposium on Innovations in Intelligent Systems, 2011

Text categorization and ensemble pruning in Turkish news (Thesis)

Ç Toraman

Bilkent University, 2011