The current development of large-scale machine learning agents highlights the problem of aligning agents with human values and morals.
A recent study on arXiv.org introduces the Jiminy Cricket environment suite for evaluating moral behavior in text-based games.
The suite consists of text adventures with dense morality annotations. For every action taken by the agent, the environment reports the moral valence of the scenario and its degree of severity. This enables researchers to evaluate how well agents adhere to ethical standards while maximizing reward in complex settings.
The authors also show that large language models with ethical understanding can be used to guide agents by translating moral knowledge into action. Their experiments show that this artificial conscience approach steers agents towards moral behavior without sacrificing performance.
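As a rough illustration of how such steering might work, the sketch below lowers an agent's estimated value for any action that a morality scorer flags, before the agent picks its move. It is a toy under stated assumptions, not the paper's implementation: the function names, the threshold, the penalty size, and the keyword-based scorer (a stand-in for a learned commonsense morality model) are all hypothetical.

```python
from typing import Callable, Dict


def shape_q_values(
    q_values: Dict[str, float],
    immorality_score: Callable[[str], float],
    threshold: float = 0.5,  # assumed cutoff for flagging an action
    penalty: float = 10.0,   # assumed size of the value penalty
) -> Dict[str, float]:
    """Subtract a fixed penalty from every action the scorer flags as immoral."""
    return {
        action: (q - penalty) if immorality_score(action) > threshold else q
        for action, q in q_values.items()
    }


def toy_immorality_score(action: str) -> float:
    """Hypothetical stand-in for a learned morality model: keyword matching."""
    harmful_words = {"steal", "attack", "kill"}
    words = action.lower().split()
    return 1.0 if any(word in harmful_words for word in words) else 0.0


def choose_action(
    q_values: Dict[str, float],
    immorality_score: Callable[[str], float],
) -> str:
    """Pick the highest-valued action after the morality penalty is applied."""
    shaped = shape_q_values(q_values, immorality_score)
    return max(shaped, key=shaped.get)


# The unshaped agent would prefer "steal lamp" because it has the highest value.
q = {"open door": 1.0, "steal lamp": 3.0, "talk to guard": 0.5}
best = choose_action(q, toy_immorality_score)
print(best)
```

In this toy setup the penalty only changes the ranking of flagged actions, so the agent remains free to pursue reward through any action the scorer does not flag.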
When making everyday decisions, people are guided by their conscience, an internal sense of right and wrong. By contrast, artificial agents are currently not endowed with a moral sense. As a consequence, they may learn to behave immorally when trained on environments that ignore moral concerns, such as violent video games. With the advent of generally capable agents that pretrain on many environments, it will become necessary to mitigate inherited biases from environments that teach immoral behavior. To facilitate the development of agents that avoid causing wanton harm, we introduce Jiminy Cricket, an environment suite of 25 text-based adventure games with thousands of diverse, morally salient scenarios. By annotating every possible game state, the Jiminy Cricket environments robustly evaluate whether agents can act morally while maximizing reward. Using models with commonsense moral knowledge, we create an elementary artificial conscience that assesses and guides agents. In extensive experiments, we find that the artificial conscience approach can steer agents towards moral behavior without sacrificing performance.
Research paper: Hendrycks, D. et al., “What Would Jiminy Cricket Do? Towards Agents That Behave Morally”, 2021. Link: https://arxiv.org/abs/2110.13136