Google Sheets
This is the implementation of the Google Sheets data handler for MindsDB.
Google Sheets is a spreadsheet program included as a part of the free, web-based Google Docs Editors suite offered by Google.
Please note that the integration of MindsDB with Google Sheets works for public sheets only.
Implementation
This handler is implemented using duckdb
, a library that allows SQL queries to be executed on pandas
DataFrames.
In essence, when querying a particular sheet, the entire sheet is first pulled into a pandas
DataFrame using the Google Visualization API. Once this is done, SQL queries can be run on the DataFrame using duckdb
.
Since the entire sheet needs to be pulled into memory first (DataFrame), it is recommended to be somewhat careful when querying large datasets so as not to overload your machine.
The required arguments to establish a connection are as follows:
spreadsheet_id
is the unique ID of the Google Sheet.sheet_name
is the name of the sheet within the Google Sheet.
If you installed MindsDB locally via pip, you need to install all handler dependencies manually. To do so, go to the handler’s folder (mindsdb/integrations/handlers/sheets_handler) and run this command: pip install -r requirements.txt
.
Usage
In order to make use of this handler and connect to a Google Sheet in MindsDB, the following syntax can be used:
You can use this established connection to query your table as follows:
The name of the table will be the name of the relevant sheet, provided as an input to the sheet_name
parameter.
At the moment, only the SELECT
statemet is allowed to be executed through duckdb
. This, however, has no restriction on running machine learning algorithms against your data in Google Sheets using the CREATE PREDICTOR
statement.