`AI Table` or a `predictor`. By querying the model, we’ll predict real estate sales using a multivariate time series strategy.
Make sure you have access to a working MindsDB installation, either locally or at MindsDB Cloud.
If you want to learn how to set up your account at MindsDB Cloud, follow this guide. Another way is to set up MindsDB locally using Docker or Python.
One way to get the data is to connect it as a database (the `example_db.demo_data.house_sales` table).
You can also download the CSV data file here (we use the `ma_lga_12345.csv` file) and upload it via the MindsDB SQL Editor.
Follow this guide to find out how to upload a file to MindsDB.
Now you can run queries directly on the file as if it were a table. Let’s preview the data that we’ll use to train our predictor.
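
A preview query could look like the following minimal sketch. It assumes the file was uploaded under the name `house_sales`, so that it is reachable as `files.house_sales`; the `LIMIT` value is an arbitrary choice.

```sql
-- Preview the first rows of the uploaded file.
-- If the data is connected as a database instead, query
-- example_db.demo_data.house_sales in the FROM clause.
SELECT *
FROM files.house_sales
LIMIT 10;  -- arbitrary row count, chosen only for a quick look
```
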
From now on, we use the `files.house_sales` file as a table. Make sure you replace it with `example_db.demo_data.house_sales` if you connect the data as a database.

We use this dataset to predict the `MA` column values. It tracks quarterly moving averages (`MA`) of house sales aggregated by real estate type and the number of bedrooms in each listing.
The columns of the `files.house_sales` table are described below.
| Column | Description | Data Type | Usage |
|---|---|---|---|
| `saledate` | The date of sale. | `date` | Feature |
| `MA` | Moving average of the historical median price of the house or unit. | `integer` | Label |
| `type` | Type of property (`house` or `unit`). | `character varying` | Feature |
| `bedrooms` | Number of bedrooms. | `integer` | Feature |
To train the predictor, we use the `CREATE MODEL` statement and specify the input columns used to train `FROM` (features) and what we want to `PREDICT` (labels).

All columns are used as features except the `MA` column, whose values will be predicted.
MindsDB makes it simple so that we don’t need to repeat the predictor creation process for every group, that is, for every distinct number of bedrooms or for every distinct type of real estate. Instead, we just group by both the `bedrooms` and `type` columns, and the predictor learns from all series and enables forecasts for all of them!
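
As a sketch of what such a training statement might look like: the model name `house_sales_predictor` and the `WINDOW` and `HORIZON` values below are assumptions chosen for illustration, not values confirmed by this tutorial. Because the data is quarterly, a window of 8 looks back two years and a horizon of 4 forecasts one year ahead.

```sql
-- Hypothetical model name; pick any name you like.
CREATE MODEL mindsdb.house_sales_predictor
FROM files
  (SELECT * FROM house_sales)  -- training data: the uploaded file
PREDICT MA                     -- label: the moving average column
ORDER BY saledate              -- time column that orders each series
GROUP BY bedrooms, type        -- one time series per (bedrooms, type) pair
WINDOW 8                       -- assumed: look back 8 quarters
HORIZON 4;                     -- assumed: forecast the next 4 quarters
```

If the data is connected as a database, the `FROM` clause would reference `example_db` and the inner query `demo_data.house_sales` instead.
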
Once the status of the predictor is `complete`, we can start making predictions!
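
The status can be checked with a query along these lines. This is a sketch that assumes the hypothetical model name `house_sales_predictor` and that model metadata is exposed through the `mindsdb.models` table; the table name may differ across MindsDB versions.

```sql
-- Returns the current training status of the (hypothetically named) model.
-- Training is finished when the status column reads 'complete'.
SELECT status
FROM mindsdb.models
WHERE name = 'house_sales_predictor';
```
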
The `SELECT` statement lets you make predictions for the label based on the chosen features for a given time period. Usually, you want to know what happens right after the latest training data point. We have a special keyword for that, the `LATEST` keyword.
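
A forecasting query in this style might look like the sketch below. The model name `house_sales_predictor`, the aliases `m` (predictor) and `t` (data table), the output column names, and the filter values are illustrative assumptions. Joining the predictor with the data table and filtering on `t.saledate > LATEST` asks for forecasts that start right after the most recent data point of the selected group.

```sql
SELECT m.saledate AS date,      -- forecast dates come from the model
       m.MA       AS forecast   -- predicted moving average
FROM mindsdb.house_sales_predictor AS m
JOIN files.house_sales AS t
WHERE t.saledate > LATEST       -- start after the latest known data point
  AND t.type = 'house'          -- pick one series: property type...
  AND t.bedrooms = 2            -- ...and number of bedrooms
LIMIT 4;                        -- assumed to match a 4-quarter horizon
```
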
In the `SELECT` statement, we select `m.saledate` (from the predictor) instead of `t.saledate` (from the data table) because we make predictions for future dates that are not in the data table.
Now, try changing the `type` column value to `unit`, or the `bedrooms` column value to any number from 1 to 5, and check how the forecasts vary. They differ because MindsDB treats each group as its own time series.