NLP with MindsDB and Hugging Face
MindsDB NLP Supported Tasks
There are four main NLP tasks currently supported by MindsDB:
- Text Classification
- Zero-Shot Classification
- Translation
- Summarization
Currently, MindsDB’s NLP engine is powered by Hugging Face and OpenAI. But we plan to expand to other NLP options in the future, so stay tuned!
MindsDB’s Hugging Face engine is extensible. We are actively working on adding more tasks and models. If you have a specific task or model in mind, please let us know in the MindsDB Community.
MindsDB NLP Tested Models
Hugging Face models are used to perform these tasks. Keep in mind that there is usually more than one model for each task, so you can choose the one that suits you best.
How to Bring the Hugging Face Model to MindsDB
We use the CREATE MODEL statement to bring the Hugging Face models to MindsDB.
Generally, it looks like this:
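```sql
-- Sketch of the general form; all names below are placeholders explained in the table that follows
CREATE MODEL project_name.predictor_name
PREDICT target_column
USING
  engine = 'engine_name',
  task = 'task_name',
  model_name = 'model_name',
  input_column = 'input_column',
  labels = ['label_1', 'label_2'];
```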
Where:
| Expressions | Description |
| --- | --- |
| project_name | Name of the project where the model is created. By default, the mindsdb project is used. |
| predictor_name | Name of the model to be created. |
| target_column | Column to store the predicted values. |
| engine | Optional. You can provide an ML engine, based on which the model is created. |
| task | Optional. It corresponds to the Hugging Face task tag. |
| model_name | Model name from the Hugging Face model hub. |
| input_column | Name of the column that contains the input data; especially important for batch predictions using JOIN. |
| labels | Depends on the model. Usually used for Zero-Shot Classification models. |
For more examples and explanations, visit our doc page on Hugging Face.
Example using SQL
Let’s go through a Spam Classification example to better understand how to link Hugging Face models and bring them to MindsDB as AI tables.
Using Local Installation of MindsDB
Please note that if you use a local installation of MindsDB, instead of MindsDB Cloud, you should install transformers==4.21.0 to be able to use the Hugging Face models.
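Based on the values listed below, the statement for this example looks roughly like this:

```sql
CREATE MODEL mindsdb.spam_classifier
PREDICT PRED
USING
  engine = 'huggingface',
  task = 'text-classification',
  model_name = 'mrm8488/bert-tiny-finetuned-sms-spam-detection',
  input_column = 'text_spammy',
  labels = ['ham', 'spam'];
```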
Where:
| Expressions | Values |
| --- | --- |
| project_name | mindsdb |
| predictor_name | spam_classifier |
| target_column | PRED |
| engine | huggingface |
| task | text-classification |
| model_name | mrm8488/bert-tiny-finetuned-sms-spam-detection |
| input_column | text_spammy |
| labels | ['ham', 'spam'] |
On execution, we get:
Before querying for predictions, we should verify the status of the spam_classifier model.
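One way to do that, assuming a recent MindsDB version where models are listed in the models system table (older versions use the predictors table instead), is:

```sql
SELECT name, status
FROM mindsdb.models
WHERE name = 'spam_classifier';
```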
On execution, we get:
Once the status is complete, we can query for predictions.
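For example, a single prediction on a made-up message (the input text below is purely illustrative) looks like this:

```sql
SELECT text_spammy, PRED
FROM mindsdb.spam_classifier
WHERE text_spammy = 'Congratulations! You have won a free prize. Reply WIN to claim.';
```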
On execution, we get:
For the full library of supported examples, please go here.
Example using MQL
Let’s go through a Sentiment Classification example, but this time we’ll use a Mongo database.
Using Local Installation of MindsDB
Please note that if you use a local installation of MindsDB, instead of MindsDB Cloud, you should install transformers==4.21.0 to be able to use the Hugging Face models.
We have a sample Mongo database that you can connect to your MindsDB Cloud account by running this command in Mongo Shell:
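A minimal sketch of the first command, assuming your Mongo Shell session is already pointed at MindsDB:

```
use mindsdb
```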
Followed by:
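Then register the sample database. The connection details shown here are placeholders, not the actual credentials of the sample database:

```
db.databases.insertOne({
    name: "mongo_test_db",
    engine: "mongodb",
    connection_args: {
        host: "mongodb+srv://<user>:<password>@<host>/", // placeholder connection string
        database: "<sample_database_name>"               // placeholder database name
    }
});
```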
We use this sample database throughout the example.
The next step is to create a connection between Mongo and MindsDB. Follow the instructions to connect MindsDB via Mongo Compass or Mongo Shell.
Now, we are ready to create a Hugging Face model.
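A sketch of the command, assuming the model is named sentiment_classifier and uses a public sentiment model from the Hugging Face hub; the model, target, and input column names here are illustrative, not necessarily the ones from the original example (on older MindsDB versions the collection is called predictors instead of models):

```
db.models.insertOne({
    name: "sentiment_classifier",
    predict: "sentiment",
    training_options: {
        engine: "huggingface",
        task: "text-classification",
        model_name: "cardiffnlp/twitter-roberta-base-sentiment", // illustrative model choice
        input_column: "comment",                                 // illustrative input column
        labels: ["negative", "neutral", "positive"]
    }
});
```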
On execution, we get:
We can check its status using this command:
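Assuming the model name used above, the status check looks like this:

```
db.models.find({ name: "sentiment_classifier" });
```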
On execution, we get:
Once the status is complete, we can query for predictions.
Here is how to query for a single prediction:
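For example, with a made-up input text and the illustrative input column from the model definition above:

```
db.sentiment_classifier.find({ comment: "I absolutely love this product!" });
```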
On execution, we get:
You can also query for batch predictions. Here we use the mongo_test_db database connected earlier in this example.
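A sketch of a batch query, assuming the sample database contains a collection of texts to classify (the collection name here is an assumption):

```
db.sentiment_classifier.find({
    collection: "mongo_test_db.reviews", // assumption: a collection of input texts in the connected database
    query: {}
}).limit(3);
```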
On execution, we get:
For the full library of supported examples, please go here.
What’s Next?
Have fun while trying it out yourself!
- Bookmark MindsDB repository on GitHub.
- Sign up for a free MindsDB account.
- Engage with the MindsDB community on Slack or GitHub to ask questions and share your ideas and thoughts.
If this tutorial was helpful, please give us a GitHub star here.