Beyond these broad objectives, the document also provides clear instructions, which the blog refers to as “rules.” These rules are designed to address complex situations and “help ensure the safety and legality” of AI actions. They include following instructions from users, complying with laws, avoiding the creation of information hazards, respecting user rights and privacy, and avoiding the generation of inappropriate or NSFW (not safe for work) content.
Finally, the Model Spec acknowledges that there may be situations where these objectives and rules “conflict.” To navigate such cases, the document suggests default behaviors for the AI model to follow. These include assuming the best intentions of users, being helpful without “overstepping” boundaries, and encouraging respectful interactions.
“This is the direction the models should ideally be going, and it’s great to see OpenAI making the effort with this new spec on how a model should behave according to the user with greater context and personalization, but more so ‘responsibly,’” said Neil Shah, VP for research and partner at Counterpoint Research, a global research and consulting firm.
OpenAI’s emphasis on transparency and collaboration
In the blog post, OpenAI described the Model Spec as a “living document,” meaning that it is open to feedback and will evolve alongside the field of AI.
“Our intention is to use the Model Spec as guidelines for researchers and data labelers to create data as part of a technique called reinforcement learning from human feedback (RLHF),” another document by OpenAI detailing the Model Spec said. “The Spec, like our models themselves, will be continuously updated based on what we learn by sharing it and listening to feedback from stakeholders.”
RLHF will tune the model more closely to actual human behavior while also making it more transparent through set objectives, principles, and rules. This takes the OpenAI model to the next level, making it more responsible and useful, Shah said. “Though this will be a constantly moving target to fine-tune the specs as there are a lot of grey areas with respect to how a query is construed and what the final objective is and the model has to be intelligent and responsible enough to detect if the query and response is less responsible.”