OpenAI
In this section, we present how to bring OpenAI models to MindsDB.
OpenAI is an AI research organization and company, known for developing AI models like GPT-3 and GPT-4. Their mission is to advance AI for the betterment of humanity, emphasizing transparency, safety, and ethics in AI development.
Read on to find out how to use OpenAI models within MindsDB.
Setup
MindsDB provides the OpenAI handler that enables you to create OpenAI models within MindsDB.
AI Engine
Before creating a model, you must create an AI engine based on the provided handler.
If you installed MindsDB locally, make sure to install all OpenAI dependencies by running `pip install .[openai]` or by installing them from the `requirements.txt` file.
You can create an OpenAI engine using this command:
Please note that you need to provide your OpenAI API key. To find it, see OpenAI’s help center or watch this video.
The name of the engine (here, `openai_engine`) should be used as the value of the `engine` parameter in the `USING` clause of the `CREATE MODEL` statement.
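As a minimal sketch, the engine could be created as shown below. It assumes the `api_key` parameter name used in the tables on this page (some handler versions may expect a different name, such as `openai_api_key`), and the key value is a placeholder.

```sql
-- Create an AI engine named openai_engine from the openai handler.
-- The api_key parameter name is an assumption; replace the placeholder with your own key.
CREATE ML_ENGINE openai_engine
FROM openai
USING
    api_key = 'your-openai-api-key';
```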
AI Model
The `CREATE MODEL` statement is used to create, train, and deploy models within MindsDB.
Default Model
When you create an OpenAI model in MindsDB, it uses the `gpt-3.5-turbo` model by default. You can use the `gpt-4` model instead by passing it to the `model_name` parameter in the `USING` clause of the `CREATE MODEL` statement.
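For illustration, selecting `gpt-4` might look like the sketch below. The model name `gpt4_model` and the input column `question` are hypothetical, and the engine is assumed to be the `openai_engine` described in the setup section.

```sql
-- Hypothetical model name (gpt4_model) and input column (question).
CREATE MODEL mindsdb.gpt4_model
PREDICT answer
USING
    engine = 'openai_engine',      -- engine name assumed from the setup section
    model_name = 'gpt-4',          -- replaces the default gpt-3.5-turbo
    question_column = 'question';
```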
Supported Models: To see all supported models, including chat models, embedding models, and more, click here.
The `USING` clause takes more parameters depending on the operation mode. See the next section to learn about the available operation modes.
Operation Modes
Let’s go through the available operation modes.
Please note that the examples presented here use SQL. To see how to create OpenAI models in Mongo database using MQL, check out this example on sentiment classification.
Answering Questions without Context
Here is how to create a model that answers questions without context:
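A minimal sketch of such a statement, built from the parameters listed in the table that follows. The input column name `question` and the placeholder key are assumptions for illustration.

```sql
-- question is a hypothetical input column; the key value is a placeholder.
-- The optional model_name parameter is omitted, so the default model is used.
CREATE MODEL mindsdb.openai_model
PREDICT answer
USING
    engine = 'openai_engine',
    question_column = 'question',
    api_key = 'your-openai-api-key';
```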
Where:
Expression | Description |
---|---|
openai_model | The model name is openai_model and it resides inside the mindsdb project by default. Learn more about MindsDB projects here. |
answer | It is the value to be predicted. |
engine | The openai engine is used. |
question_column | It is the column that stores input data. |
model_name | Optional. By default, the gpt-3.5-turbo model is used (see Default Model above). If you prefer to use a cheaper model or a model that was fine-tuned outside of MindsDB, use this parameter. |
api_key | Your OpenAI API key. |
Let’s look at an example.
On execution, we get:
Now we can query for answers.
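A query of this kind might look like the sketch below, assuming a model named `openai_model` whose input column is called `question` (a hypothetical name). The question text is only an example.

```sql
-- Pass the question in the WHERE clause; the model fills in the answer column.
SELECT question, answer
FROM mindsdb.openai_model
WHERE question = 'Where is Stockholm located?';
```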
On execution, we get:
Answering Questions with Context
Here is how to create a model that answers questions with context:
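A minimal sketch of such a statement, built from the parameters listed in the table that follows. The input column names `question` and `context` and the placeholder key are assumptions for illustration.

```sql
-- question and context are hypothetical input columns; the key value is a placeholder.
CREATE MODEL mindsdb.openai_model
PREDICT answer
USING
    engine = 'openai_engine',
    question_column = 'question',
    context_column = 'context',
    api_key = 'your-openai-api-key';
```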
Where:
Expression | Description |
---|---|
openai_model | The model name is openai_model and it resides inside the mindsdb project by default. |
answer | It is the value to be predicted. |
engine | The openai engine is used. |
question_column | It is the column that stores input data being a question. |
context_column | It is the column that stores input data being a context. |
api_key | Your OpenAI API key. |
Let’s look at an example.
On execution, we get:
Now we can query for answers.
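With a context column in place, a query could supply both values, as in this sketch. The column names `context` and `question` are hypothetical, and the sample values are only examples.

```sql
-- Both the context and the question are passed in the WHERE clause.
SELECT context, question, answer
FROM mindsdb.openai_model
WHERE context = 'Stockholm is the capital of Sweden.'
AND question = 'Which country is Stockholm in?';
```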
On execution, we get:
Prompt Completion
Here is how to create a model that offers the most flexible mode of operation. It answers any query provided in the `prompt_template` parameter.
Good prompts are the key to getting great completions out of large language models like the ones that OpenAI offers. For best performance, we recommend you read their prompting guide before trying your hand at prompt templating.
Where:
Expression | Description |
---|---|
openai_model | The model name is openai_model and it resides inside the mindsdb project by default. |
answer | It is the value to be predicted. |
engine | The openai engine is used. |
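prompt_template | It is the parameter that stores the query to be answered. It can include placeholders such as {{column_name}} that are filled with values from input columns. Please note that this parameter can be overridden at prediction time. |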
max_tokens | It defines the maximum token cost of the prediction. Please note that this parameter can be overridden at prediction time. |
temperature | It defines how risky the answers are. The value of 0 marks a well-defined answer, and the value of 0.9 marks a more creative answer. Please note that this parameter can be overridden at prediction time. |
api_key | Your OpenAI API key. |
Let’s create a model:
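A sketch of such a model, using the parameters from the table above. The `instruction` placeholder column, the template wording, and the numeric values are hypothetical choices for illustration.

```sql
-- {{instruction}} is filled from a hypothetical input column named instruction.
CREATE MODEL mindsdb.openai_model
PREDICT answer
USING
    engine = 'openai_engine',
    prompt_template = 'Answer the following instruction briefly: {{instruction}}',
    max_tokens = 100,      -- upper bound on the token cost of a prediction
    temperature = 0.3,     -- closer to 0 gives more well-defined answers
    api_key = 'your-openai-api-key';
```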
Let’s look at an example that uses parameters provided at model creation time.
On execution, we get:
Now let’s look at an example that overrides parameters at prediction time.
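Overriding at prediction time could look like the sketch below, assuming a model named `openai_model` whose template reads a hypothetical `instruction` column. The overriding values are placed in the `WHERE` clause alongside the input.

```sql
-- prompt_template, max_tokens, and temperature replace the values set at creation time
-- for this query only.
SELECT instruction, answer
FROM mindsdb.openai_model
WHERE instruction = 'Name three primary colors'
AND prompt_template = 'Reply in one short sentence: {{instruction}}'
AND max_tokens = 50
AND temperature = 0.9;
```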
On execution, we get:
Model Modes
You can define a mode for OpenAI models using the `mode` parameter in the `USING` clause.
The available modes include `default`, `conversational`, `conversational-full`, `image`, and `embedding`.
- The `default` mode is used by default. The model replies to the `prompt_template` message.
- The `conversational` mode enables the model to read and reply to multiple messages.
- The `conversational-full` mode enables the model to read and reply to multiple messages, one reply per message.
- The `image` mode is used to create an image instead of a text reply.
- The `embedding` mode enables the model to return output in the form of embeddings.
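As an illustration of setting a mode, the sketch below creates a model in `image` mode. The model name `image_model`, the target column `img_url`, and the `description` placeholder column are all hypothetical.

```sql
-- Hypothetical names: image_model, img_url, and the description input column.
CREATE MODEL mindsdb.image_model
PREDICT img_url
USING
    engine = 'openai_engine',
    mode = 'image',
    prompt_template = '{{description}}, watercolor style';
```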
Examples
Sentiment Analysis
Let’s go through a sentiment classification example to understand better how to bring OpenAI models to MindsDB as AI tables.
On execution, we get:
Where:
Expressions | Values |
---|---|
project_name | mindsdb |
predictor_name | sentiment_classifier |
target_column | sentiment |
engine | openai |
prompt_template | predict the sentiment of the text:{{review}} exactly as either positive or negative or neutral |
In the `prompt_template` parameter, we use a placeholder for a text value that comes from the `review` column, that is, `text:{{review}}`.
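Putting the values from the table together, the statement might look like this sketch. The engine is written as `openai_engine` on the assumption that it was created under that name; the table lists it as `openai`, so use whichever engine name exists in your setup.

```sql
-- Values taken from the table above; the engine name is an assumption.
CREATE MODEL mindsdb.sentiment_classifier
PREDICT sentiment
USING
    engine = 'openai_engine',
    prompt_template = 'predict the sentiment of the text:{{review}} exactly as either positive or negative or neutral';
```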
Before querying for predictions, we should verify the status of the `sentiment_classifier` model.
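One way to check the status is the `DESCRIBE` statement, sketched here for the model named above.

```sql
-- Returns model metadata, including its current status.
DESCRIBE sentiment_classifier;
```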
On execution, we get:
Once the status is `complete`, we can query for predictions.
Don’t forget to create the `example_db` database before using one of its tables, like in the query above.
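A batch prediction query of this kind could be sketched as a join between a data table and the model. The table path `example_db.demo_data.amazon_reviews` is an assumption for illustration; substitute any table that has a `review` column.

```sql
-- Join input rows with the model to get one sentiment per review.
-- The table path is hypothetical.
SELECT input.review, output.sentiment
FROM example_db.demo_data.amazon_reviews AS input
JOIN mindsdb.sentiment_classifier AS output
LIMIT 3;
```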
On execution, we get:
Fine-Tuning of OpenAI Models
The OpenAI text models used here belong to the group of Large Language Models (LLMs). By definition, these are pre-trained on large amounts of data. However, it is possible to fine-tune these models with a task-specific dataset for a defined use case.
OpenAI supports fine-tuning of some of its models, including `davinci`, `curie`, `babbage`, and `ada` (more details here). With MindsDB, you can fine-tune an OpenAI model to make it more applicable to your specific use case.
Let’s create a model to answer questions about MindsDB’s custom SQL syntax.
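A sketch of such a model, using the `openai_davinci` name that appears later in this section. The target column `completion`, the input column `prompt`, and the template wording are assumptions for illustration.

```sql
-- davinci is one of the base models listed above as supporting fine-tuning.
-- completion and prompt are hypothetical column names.
CREATE MODEL openai_davinci
PREDICT completion
USING
    engine = 'openai_engine',
    model_name = 'davinci',
    prompt_template = 'Return a valid SQL string for the following question about MindsDB in-database machine learning: {{prompt}}';
```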
You can check model status with this command:
Once the status is complete, we can query for predictions:
On execution, we get:
If you followed one of the MindsDB tutorials before, you’ll see that the syntax provided by the model is not exactly as expected.
Now, we’ll fine-tune our model using a table that stores details about MindsDB’s custom SQL syntax.
Here is a table we’ll use to fine-tune our model:
And here is its content:
This is how you can fine-tune an OpenAI model:
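As a sketch, the `FINETUNE` statement takes the model name and a query that selects the training data. The table `demo_data.openai_learninghub_ft` and its `prompt` and `completion` columns are hypothetical names for the fine-tuning table described above.

```sql
-- Fine-tune the existing model on rows returned by the nested SELECT.
-- The source table and column names are assumptions.
FINETUNE openai_davinci
FROM example_db
    (SELECT prompt, completion FROM demo_data.openai_learninghub_ft);
```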
The `FINETUNE` command creates a new version of the `openai_davinci` model. You can query all available versions as below:
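A sketch of such a query, assuming the project exposes a `models_versions` table (the table name may differ between MindsDB versions):

```sql
-- Lists every stored version of the model, with its status and active flag.
SELECT *
FROM models_versions
WHERE name = 'openai_davinci';
```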
Once the new version status is complete and active, we can query the model again, expecting a more accurate output.
On execution, we get: