By: Eliza Bennet
AI firm Anthropic has partnered with the Collective Intelligence Project to democratize AI development by leveraging public input. In an experiment called 'Collective Constitutional AI', the values of 1,000 diverse participants were used to shape the judgment criteria of a large language model (LLM).
Typically, public-facing LLMs, such as Anthropic's Claude and OpenAI's ChatGPT, ship with pre-set behavioral instructions known as 'guardrails'. Such prescriptive measures, however, have been criticized for limiting user agency and creating a gap between what a model deems acceptable and what users find useful. Anthropic's experiment sought to address these concerns by letting the public influence how an AI model's values are aligned.
Anthropic guided the effort with a method it calls 'Constitutional AI', in which the model is given a set of written principles it must follow, much like the constitutions that govern nations. In the 'Collective Constitutional AI' experiment, Anthropic integrated crowd-sourced feedback into those constitutional principles. Although the experiment was complicated by the absence of established benchmarks for evaluating such a process, the model trained on the publicly sourced constitution performed slightly better at avoiding biased outputs than the base model.
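To make the mechanism concrete, here is a minimal Python sketch of how a constitution can steer a model's behavior: the model drafts a response, critiques that draft against each written principle, and revises it, with the revised responses then usable as fine-tuning data. The `generate` function and the sample principles below are illustrative placeholders, not Anthropic's actual code, API, or constitution.

```python
# Minimal sketch of a constitutional critique-and-revision loop.
# `generate` is a hypothetical stand-in for any LLM completion call.

CONSTITUTION = [
    "Choose the response that is least likely to be harmful or offensive.",
    "Choose the response that most respects the user's autonomy.",
]

def generate(prompt: str) -> str:
    # Hypothetical: replace with a real model API call.
    raise NotImplementedError("plug in an actual LLM API here")

def constitutional_revision(user_prompt: str) -> str:
    draft = generate(user_prompt)
    for principle in CONSTITUTION:
        # Ask the model to critique its own draft against one principle.
        critique = generate(
            f"Critique this response against the principle.\n"
            f"Principle: {principle}\nResponse: {draft}"
        )
        # Ask the model to rewrite the draft to address the critique.
        draft = generate(
            f"Revise the response to address the critique.\n"
            f"Critique: {critique}\nResponse: {draft}"
        )
    return draft  # revised drafts can serve as fine-tuning data
```

In this framing, 'Collective Constitutional AI' changes only where the list of principles comes from: instead of being written in-house, it is distilled from public polling.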
Overall, the experiment demonstrated that public input can meaningfully steer AI behavior and pointed toward a future of culturally and contextually specific models built around public needs.