Beyond English Safety: Measuring Behavioral Risk in Multilingual & Code-Switched LLMs

Cohere Scholars Presentation


In this presentation for Cohere, I develop a risk-science framework for evaluating and hardening multilingual LLMs. It moves beyond static refusal benchmarks to measure real behavioral risks, including the portability of jailbreaks across languages and the persistence (or decay) of safety patches.
