Databricks crowdsourced 13,000 demonstrations of instruction-following behaviour from more than 5,000 of its employees between March and April 2023. The resulting data set, along with Dolly’s model weights and training code, has been released as fully open source under a Creative Commons license, enabling anyone to use, modify, or extend the data set for any purpose, including commercial applications.
In contrast, OpenAI’s ChatGPT is a proprietary model that requires users to pay for API access and adhere to specific terms of service, potentially limiting the flexibility and customization options for businesses and organizations. Meta’s LLaMA, a partially open-source model (with restricted weights) that recently spawned a wave of derivatives after its weights leaked on BitTorrent, does not allow commercial use.
And because Dolly is open source, it can help spark further innovation on its own, free of the restrictions imposed by OpenAI’s proprietary ChatGPT.