ChatGPT implementation: key takeaways from our internal projects
At Boldare we take a ‘hands-on’ approach, which is why we decided to explore ChatGPT by using it rather than just reading about it. Our research & development team spent the last month brainstorming, prototyping, and coding AI-powered apps. In this article, we’ll share nine lessons we’ve learned about the GPT model, including the significance of vector databases, security concerns, and the importance of data. We hope you’ll find them helpful!
Disclaimer: ‘ChatGPT’ is an app based on the ‘GPT’ model created by OpenAI. However, for the purposes of this article we use both terms interchangeably. While working with this model, we mostly used the GPT-3.5 and GPT-3.5 Turbo versions.
It’s a revolution!
ChatGPT (and other GPT models) will revolutionize the software development market, though the exact nature of that revolution remains unclear. The possibilities are limited only by our creativity and the technical boundaries of the model, and with each new GPT release those boundaries recede: newer, more efficient models can handle more data and power user-friendly, out-of-the-box solutions that are easy to implement.
Build and learn
One of the best ways to learn about GPT’s capabilities is to adopt a proof of concept (POC) approach. In product development, a POC provides practical evidence of the technical feasibility (or not) of an idea. Using this approach, we were able to test multiple ideas and verify our assumptions with data inputs and prompts, helping us to quickly validate hypotheses and gather valuable information for future implementations before any code is even written.
We also need to emphasize the process of prompting: understanding how to build and fine-tune prompts is an essential part of working with ChatGPT or similar AI apps, and iterative, intensive testing of the results is crucial here.
Therefore, we suggest brainstorming, playing with ChatGPT, creating POCs and prototypes, making mistakes, and learning from them. In the near future, as GPT models improve in efficiency, security, and reliability, previous experience with the technology will be extremely valuable to businesses seeking to benefit from it quickly.
Text only
Despite promising visions of the future, it’s important to note that GPT is currently a language model limited to processing and outputting plain text through the publicly available OpenAI API. GPT-3.5 may not be suitable for nuanced tasks that rely on multiple variables and up-to-date data, as the model’s training data ends in 2021. Therefore, if we want to create something reliable, it is important to feed the model with up-to-date information and leave it to format the final user-readable output, rather than relying on GPT’s general knowledge about the world.
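In practice, “feeding the model with up-to-date information” usually means injecting fresh context directly into the prompt. Below is a minimal sketch of that idea as a plain Python function that assembles a chat-style message list; the function name, instructions, and sample snippet are our own illustrations, not part of any official API.

```python
def build_grounded_messages(context_snippets, question):
    """Assemble a chat-style message list that injects up-to-date
    context into the prompt and asks the model to answer only from it."""
    context_block = "\n\n".join(context_snippets)
    system = (
        "You are a helpful assistant. Answer using ONLY the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context_block}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

messages = build_grounded_messages(
    ["Boldare's R&D team built several GPT-3.5 proofs of concept in 2023."],
    "What did the R&D team build?",
)
# `messages` can then be passed to a chat completion endpoint
# (e.g. a gpt-3.5-turbo call), leaving the model to format the answer.
```

This way the model acts as a formatter and summarizer of data you control, instead of a source of (possibly outdated) facts.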
Low reliability… for now
When we consider the number of new AI-powered tools, it’s hard to believe that AI and LLM (large language model) technology is still in its infancy. However, ChatGPT has limited capabilities and may not be the best choice for businesses that want to use API-based AI solutions as a primary product feature. It’s impossible to guarantee high performance and uninterrupted access to the API.
At the moment, the OpenAI service is unstable and inefficient, and solutions based on it still have a lot of room for improvement. That said, it can visibly improve certain workflows and assist in building digital products, as long as it isn’t directly responsible for their critical features.
Data is like oxygen
Although it may sound harsh, GPT without the latest data can be useless for many business applications - much like any other large language model. As ChatGPT’s popularity grows, the value of reliable and truthful data sources will increase exponentially. As a result, many companies that provide access to content (such as media corporations, magazines, and data warehouses) will restrict access to their data behind paywalls, making it harder to obtain reliable data for projects that depend on it. Those who already hold such data on their servers can also consider an additional line of business: providing reliable content for a fee.
From an engineering perspective, the most significant challenge of any product that uses GPT is ensuring that the data is properly prepared. Connecting data with GPT or other models is relatively straightforward for someone with experience in product development. If you plan to create an app that uses a specific dataset, you need to know how to prepare the data correctly, and seeking assistance from a professional data engineer may be a good idea.
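A typical first step in that preparation is splitting documents into chunks small enough to fit a model’s context window. Here is a naive sketch that splits raw text into overlapping character-based chunks; the sizes are illustrative, and production pipelines usually split on sentence or token boundaries instead.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split `text` into chunks of `chunk_size` characters, each
    overlapping the previous chunk by `overlap` characters so that
    sentences cut at a boundary still appear whole in one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the final chunk already covers the end of the text
    return chunks
```

Each chunk can then be embedded and stored separately, which is exactly what the vector databases discussed below are for.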
Vector databases
Once we started working on ChatGPT-based features, we realized how crucial vector databases are.
You can think of a vector database as the long-term memory of your AI app. These databases store learned input representations - typically a text document along with an embedding vector (a list of numbers representing the LLM’s “notion” of a particular document). These documents can later be queried with AI and used to improve the output of language models.
Properly maintaining and curating vector databases can greatly improve ChatGPT’s accuracy and quality. This involves regularly updating the database with new information, removing irrelevant data, and optimizing for the implementation’s specific use case. Managing the vector database carefully can improve ChatGPT’s performance and effectiveness.
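To make the “long-term memory” idea concrete, here is the core query operation of a vector database reduced to brute-force cosine similarity in plain Python. Real vector stores (e.g. Pinecone, Weaviate, pgvector) add indexing, persistence, and filtering on top of this; the documents and embedding numbers below are made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(store, query_vector, k=2):
    """Return the texts of the k documents closest to the query vector."""
    scored = sorted(
        store,
        key=lambda doc: cosine_similarity(doc["embedding"], query_vector),
        reverse=True,
    )
    return [doc["text"] for doc in scored[:k]]

store = [
    {"text": "invoice policy", "embedding": [0.9, 0.1, 0.0]},
    {"text": "holiday schedule", "embedding": [0.0, 0.2, 0.9]},
    {"text": "billing FAQ", "embedding": [0.8, 0.3, 0.1]},
]

# A query embedding that lands near the "billing" region of the space:
print(top_k(store, [1.0, 0.2, 0.0], k=2))  # ['invoice policy', 'billing FAQ']
```

The retrieved documents are then injected into the prompt, which is how a vector database improves the output of a language model without retraining it.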
Brace yourselves, regulations are coming
While working on our internal PoCs here at Boldare, we have learned that even though GPT can do amazing things and offers endless possibilities, we need to be careful about the data we collect and use. This means encrypting data, controlling access, and regularly checking the system for vulnerabilities. We also need to ensure that user data is only used for the intended purposes, and that users are informed about how their data is being used.
By prioritizing security and privacy from the outset of working with ChatGPT, it will be easier to comply both with existing legislation, such as the European Union’s GDPR, and with upcoming AI-specific regulations.
Will it replace people?
One of the valuable lessons we learned:
AI won’t replace experts, but people who know how to make use of AI will.
To make effective use of AI, a development team needs to acquaint itself with a variety of new technologies, learn how to load data efficiently within data limits, and be aware of the available databases and mechanisms for a given use case. Since new AI-related technologies are released every week, companies must invest significant effort into staying up to date. It’s essential to invest in the skills of both developers and non-developers.
Not just GPT
GPT is only one of many models available at the moment. If you want to explore different tools, we recommend Hugging Face - a vibrant community and library of models for different purposes, such as speech-to-text, text-to-image, audio-to-audio, and video classification.
Lessons learned - a summary
We won’t stop here with our GPT- and AI-related work. Our research and development team is still working on various projects, ranging from small but helpful apps like Slack-based virtual assistants to more complex features and tools that we can offer as parts of our partners’ products. There’s plenty to discover and learn, and fortunately, we’re pretty good at learning! If you have any questions, related or not, feel free to let us know.