Haris Riaz


710 Gould Simpson

1040 E 4th Street

Tucson, AZ 85721

I am a PhD student at the University of Arizona, advised by Professor Mihai Surdeanu. My primary area of interest is the faithfulness and causality of reasoning in Large Language Models. Specifically, I study strategies for internally integrating knowledge from symbolic reasoners into LLMs via reward modeling, and for generating synthetic reward-feedback data using weak supervision techniques. I have worked on meta-algorithms for synthetic data generation, on incorporating causal reasoning and pragmatics into retrieval-augmented generation (RAG) frameworks as a downstream application of my research interests, and on exploiting linguistic hints as weak supervision for the NER task.

I recently completed an internship at Amazon Web Services, where I was an Applied Scientist Intern on the Amazon Science Bedrock team. I worked on an agentic meta-approach to generating formally diverse synthetic data, which can adapt LLMs to specific domains using only a small amount of synthetic data and no real data. This work resulted in a paper currently under review at ACL 2025.

Before joining the UofA, I completed my undergraduate studies in Computer Science at the School of Electrical Engineering and Computer Science at the National University of Sciences and Technology in 2021.

Besides work, I am learning to play the guitar. I enjoy rock climbing and hiking along the various trails surrounding Tucson. A long time ago, I memorized the spelling of every word in the English dictionary and was a finalist in the 4th Dawn in Education National Spelling Bee.

news

Dec 31, 2024 New paper “MetaSynth: Meta-Prompting Your Large Language Model to Generate Formally Diverse Synthetic Data”. Under submission.
Dec 10, 2024 Our work Say Less, Mean More: Leveraging Pragmatics in Retrieval-Augmented Generation is presented as a contributed lightning talk at the MusIML workshop, co-located with NeurIPS 2024.
Nov 03, 2024 Serving as reviewer for ICLR 2025.
Oct 15, 2024 New paper: Say Less, Mean More: Leveraging Pragmatics in Retrieval-Augmented Generation. Under submission.
May 28, 2024 Started my internship as an Applied Scientist Intern at Amazon Bedrock hosted by Sourav Bhabesh and Vinnayak Arannil in Herndon, Virginia!
Mar 13, 2024 Our paper Best of Both Worlds: A Pliable and Generalizable Neuro-Symbolic Approach for Relation Classification is accepted at NAACL 2024 Findings!
Feb 20, 2024 Our paper ELLEN: Extremely Lightly Supervised Learning For Efficient Named Entity Recognition is accepted at LREC-COLING 2024!
Feb 16, 2024 Attending Stanford Treehacks 2024. Our hackathon project eventually morphed into StoryEngine ($750k pre-seed backed by A16z). Note: I am not affiliated with StoryEngine (all credit goes to my hackathon teammate Wanrong He).
Jan 15, 2024 Serving as reviewer for NAACL 2024.
Apr 04, 2022 Serving as reviewer for the Second Workshop on Pattern-based Approaches to NLP in the Age of Deep Learning (Pan-DL), co-located with EMNLP 2023.
Apr 04, 2022 Awarded an AI Talent Bursary to attend AI Week, organized by the Alberta Machine Intelligence Institute (AMII).
Jan 12, 2022 Started my PhD at the University of Arizona, working with Professor Mihai Surdeanu.
Jun 01, 2021 Graduated with a Bachelors in CS from NUST-SEECS. My Final Year Project on Handwritten Sequence Recognition with Time Series Transformers was one of three projects selected for the Rector's Gold Medal for best final-year CS project.
May 07, 2021 Serving as a volunteer for ICLR 2021.

selected publications

  1. Say Less, Mean More: Leveraging Pragmatics in Retrieval-Augmented Generation
    Haris Riaz, Ellen Riloff, and Mihai Surdeanu
    Jan 2025
    Under Submission
  2. MetaSynth: Meta-Prompting Your Large Language Model to Generate Formally Diverse Synthetic Data
    Jan 2025
    Under Submission
  3. Best of Both Worlds: A Pliable and Generalizable Neuro-Symbolic Approach for Relation Classification
    Robert Vacareanu, Fahmida Alam, Md Asiful Islam, Haris Riaz, and Mihai Surdeanu
    In Findings of the Association for Computational Linguistics: NAACL 2024, Jun 2024
  4. ELLEN: Extremely Lightly Supervised Learning for Efficient Named Entity Recognition
    Haris Riaz, Razvan Gabriel Dumitru, and Mihai Surdeanu
    In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), May 2024
  5. Deep neural network techniques in the calibration of space-charge distortion fluctuations for the ALICE TPC
    Sergey Gorbunov, Ernst Hellbär, Gian Michele Innocenti, Marian Ivanov, Maja Kabus, Matthias Kleiner, Haris Riaz, David Rohr, Rifki Sadikin, Kai Schweda, and others
    In Proceedings of the 25th International Conference on Computing in High Energy and Nuclear Physics (CHEP 2021), Aug 2021