The long-expected competition to OpenAI's ChatGPT was always going to come from Google Cloud. At TensorOps we received early access to the studio, took it for a trial run, and here is what we found.
Over the past decades Google has been leading the global advance in AI, and although the company seemed to be lagging in recent months, the recently announced Generative AI Studio, part of Vertex AI, definitely puts it back in the lead. We got an exclusive preview of the new features, and this is what you can expect from them.
What’s in the box?
Let’s start with the “grocery list” of features. Generative AI Studio started with text support only but has grown its offering since. In the UI you will find usage examples of the services, including:
Image generation
Code chat
Speech to text and text to speech
Various text applications
Behind all these examples sits a growing set of models. Google initially offered the Text-Bison model in two versions, 001 and alpha. Since then, more models have been introduced: some have bigger and better architectures, and some are fine-tuned for specific tasks such as code.
Additionally, Google helps users discover the potential of the studio by providing pre-built prompt templates that show how different API calls can be adapted to their business use case.
The chat app is based on PaLM 2, the successor of the original PaLM that Google released in April 2022. PaLM 2’s 340B parameters make it capable of accomplishing complex tasks like coding and holding conversations. The model will likely improve as usage grows, and we can also expect Google to release more advanced backing models in the future.
Built for developers
Generative AI Studio was clearly designed for developers. The UI is less playful than ChatGPT’s; however, it gives access to the model parameters, which can be tuned to achieve more accurate results.
All the services in Generative AI Studio are, of course, accessible via API. Users of the Vertex AI SDK for Python will find it easy to integrate the models into their apps.
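As a rough illustration, here is a minimal sketch of calling the text model through the Vertex AI SDK for Python. The project ID and region are placeholders, and the exact module path may differ between SDK versions:

```python
import vertexai
from vertexai.language_models import TextGenerationModel  # may live under vertexai.preview in older SDK versions

# Initialize the SDK with your own GCP project and region (placeholders here).
vertexai.init(project="my-gcp-project", location="us-central1")

# Load one of the PaLM-based text models offered in the studio.
model = TextGenerationModel.from_pretrained("text-bison@001")

# Send a prompt together with the generation parameters exposed in the UI.
response = model.predict(
    "Summarize the key benefits of Vertex AI Generative AI Studio in two sentences.",
    temperature=0.2,        # lower values -> more deterministic output
    max_output_tokens=256,  # cap on the length of the response
    top_k=40,
    top_p=0.8,
)
print(response.text)
```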
Tuned models
LLMs and other models are trained on huge amounts of data, but this can make them too generic. Tuning an LLM on your own data can also lead to cost reduction. How come? As of today, there are three main ways to make LLMs output more specific results:
Prompt engineering — providing more context in the prompt and engineering the request.
Context database — pointing the model to a vector store of embedded knowledge, or injecting the relevant raw data into the request before calling the model.
Tuned models — re-training the model on your own data.
While the first two methods require engineering every request, tuning the model is the only way to persist the modification, so that every API call fits your use case better with less data sent to the API.
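For reference, this is roughly what launching a tuning job looks like with the Vertex AI SDK for Python. The bucket path, step count, and regions below are placeholders, and the exact arguments may vary between SDK versions:

```python
import vertexai
from vertexai.language_models import TextGenerationModel

vertexai.init(project="my-gcp-project", location="us-central1")

# Start from the base foundation model.
model = TextGenerationModel.from_pretrained("text-bison@001")

# Launch a tuning job on your own examples, stored as a JSONL file in Cloud Storage
# with "input_text" / "output_text" fields (hypothetical bucket path below).
model.tune_model(
    training_data="gs://my-bucket/tuning_examples.jsonl",
    train_steps=100,
    tuning_job_location="europe-west4",  # region where the tuning pipeline runs
    tuned_model_location="us-central1",  # region where the tuned model is deployed
)

# Once tuning finishes (the job can take a while), the tuned model is called just like
# the base one, without repeating your domain examples in every prompt.
print(model.predict("Classify this support ticket: 'My invoice is wrong'").text)
```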
Guide the models through examples
Helping the models understand the user’s intention can be the difference between a cool tool that you play with or show to grandma and an app that you can trust in production. Generative AI Studio lets you provide examples of the expected results.
For example, the transcript summarization template lets users input examples of expected summaries. These hints refine the output of the model, so you need less coding and fewer transformations on your data before getting the response you wanted.
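Via the SDK, one way to pass such examples programmatically is through the chat model’s examples parameter; the input/output pairs below are hypothetical and only illustrate the pattern:

```python
import vertexai
from vertexai.language_models import ChatModel, InputOutputTextPair

vertexai.init(project="my-gcp-project", location="us-central1")
chat_model = ChatModel.from_pretrained("chat-bison@001")

# Example input/output pairs guide the model toward the summary style you expect.
chat = chat_model.start_chat(
    context="You summarize meeting transcripts into short bullet points.",
    examples=[
        InputOutputTextPair(
            input_text="Transcript: Alice proposed moving the launch to June; Bob agreed.",
            output_text="- Launch moved to June (proposed by Alice, agreed by Bob)",
        ),
    ],
)

response = chat.send_message(
    "Transcript: The team decided to hire two more engineers for the data platform.",
    temperature=0.2,
)
print(response.text)
```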
UI for non AI experts
Democratizing AI has never been as close as it is these days. People with no AI or software background can leverage AI tools to create content, apps, and more.
The little info-boxes that Generative AI Studio places everywhere provide users with more information about the parameters they are using. This is especially helpful for users who are new to AI or unfamiliar with specific parameters (like temperature or top-k). These parameters control how the LLMs generate output, so non-AI-experts can easily learn how to modify the results even if they haven’t read the most recent scientific articles about LLMs.
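To get a feel for what those parameters do, you can run the same prompt at different sampling settings; a minimal sketch, with the project ID and prompt as placeholders:

```python
import vertexai
from vertexai.language_models import TextGenerationModel

vertexai.init(project="my-gcp-project", location="us-central1")
model = TextGenerationModel.from_pretrained("text-bison@001")

prompt = "Suggest a name for a beachside coffee shop."

# Temperature controls randomness; top-k limits sampling to the k most likely tokens.
for temperature in (0.0, 0.5, 1.0):
    response = model.predict(
        prompt,
        temperature=temperature,  # 0 -> near-deterministic, 1 -> more creative
        top_k=40,
        max_output_tokens=32,
    )
    print(f"temperature={temperature}: {response.text}")
```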
Quality of the results?
When we originally tested the model in early April 2023, it seemed like some features still needed improvement and development. However, since then, as we forecasted, the results have improved dramatically. We examined this article from The Portugal News website. In April, the model seemed to confuse statistics from the article with other information it was trained on, or it was hallucinating due to a high temperature setting.
The April model was asked which countries have less female representation in government than Portugal. While the list of countries it provided was correct, there were a few mismatches between the article and the answer. For example, it stated that female representation in Portugal was 45.3% even though the article clearly speaks of around 35%, and the list included countries that are not mentioned in the article at all.
What’s next for LLMs on GCP?
It appears that as more users adopt the product and usage increases, the underlying LLMs that Google uses should improve. The design and compatibility for backend integration place Vertex AI in a leading position to provide AI as a service. In the next blog post, we will dive further into specific use cases of Vertex AI Generative AI Studio and how it can be used in production systems.
Follow us on LinkedIn to get more updates and content!