Bowen Wei


Hello! My name is Bowen Wei, and I am a third-year Ph.D. student in Computer Science at George Mason University. I am fortunate to be advised by Professor Ziwei Zhu.

My research spans trustworthy and interpretable AI and agentic reinforcement learning (RL) for large language models. I develop prototype-based, symbolic, and explanation-driven methods to make model behavior transparent and robust, enabling users to understand and trust AI decisions in high-stakes settings. In parallel, I study RL and post-training techniques that distill multi-agent reasoning into single, verifiable agents, improving reasoning quality, evidence attribution, and causal grounding. Together, these directions aim to advance AI systems that are both interpretable and competent in reasoning.

news

Apr 14, 2026 🎉 Two papers accepted to ACL 2026! “VIGNETTE: Socially Grounded Bias Evaluation for Vision-Language Models” as an oral in the Main Conference, and “Context-Aware Decoding for Faithful Vision-Language Generation” in Findings.
Nov 08, 2025 🎉 Our paper “Making Sense of LLM Decisions: A Prototype-based Framework for Explainable Classification” has been accepted for an oral presentation at AAAI 2026!
May 15, 2025 🎉 Our paper “ProtoLens: Advancing Prototype Learning for Fine-Grained Interpretability in Text Classification” has been accepted to the main conference at ACL 2025!

selected publications

  1. AAAI 2026 Oral
    Making Sense of LLM Decisions: A Prototype-based Framework for Explainable Classification
    Bowen Wei, Mehrdad Fazli, and Ziwei Zhu
    In Proceedings of the AAAI Conference on Artificial Intelligence, 2026
  2. ACL 2025 Main
    ProtoLens: Advancing Prototype Learning for Fine-Grained Interpretability in Text Classification
    Bowen Wei and Ziwei Zhu
    In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Jul 2025
  3. NeurIPS LAW 2025
    CORTEX: Collaborative LLM Agents for High-Stakes Alert Triage
    Bowen Wei, Yuan Shen Tay, Howard Liu, and 4 more authors
    Jul 2025
  4. WACV 2026
    Mitigating Hallucination in Large Vision-Language Models via Adaptive Attention Calibration
    Mehrdad Fazli, Bowen Wei, Ahmet Sari, and 1 more author
    Jul 2025
  5. arXiv
    ClawSafety: "Safe" LLMs, Unsafe Agents
    Bowen Wei, Yunbei Zhang, Jinhao Pan, and 5 more authors
    Jul 2026
  6. arXiv
    A Logical-Rule Autoencoder for Interpretable Recommendations
    Bowen Wei*, Jinhao Pan*, and Ziwei Zhu
    Jul 2026
    * Equal contribution.
  7. arXiv
    KnowBias: Mitigating Social Bias in LLMs via Know-Bias Neuron Enhancement
    Jinhao Pan, Chahat Raj, Anjishnu Mukherjee, and 4 more authors
    Jul 2026
  8. arXiv
    Neural Symbolic Logical Rule Learner for Interpretable Learning
    Bowen Wei and Ziwei Zhu
    Jul 2024
  9. ACL 2026 Oral
    VIGNETTE: Socially Grounded Bias Evaluation for Vision-Language Models
    Chahat Raj, Bowen Wei, Aylin Caliskan, and 2 more authors
    In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics, Jul 2026
  10. ACL 2026 Findings
    Context-Aware Decoding for Faithful Vision-Language Generation
    Mehrdad Fazli, Bowen Wei, and Ziwei Zhu
    In Findings of the Association for Computational Linguistics: ACL 2026, Jul 2026