Banca d'Italia today publishes 'Chat Bankman-Fried? An Exploration of LLM Alignment in Finance'.
Advances in large language models (LLMs) renew concerns about whether artificial intelligence shares human values - the so-called alignment problem. We assess whether various LLMs comply with fiduciary duty in simulated financial scenarios. We prompt the LLMs to impersonate the CEO of a financial institution and test their willingness to misappropriate customer assets to repay corporate debt. After evaluating a baseline scenario, we adjust preferences and incentives. We find significant heterogeneity among LLMs in baseline behavior. Responses to changes in risk tolerance, profit expectations, and regulation all match predictions from economic theory. Simulation-based testing can be informative for regulators seeking to ensure LLM safety, but it should be complemented by an analysis of internal LLM mechanics. Appropriate frameworks for LLM risk governance within financial institutions are also necessary.