Choose a training method

This topic explains the key differences between training a model in Vertex AI, using either AutoML or custom training, and training a model using BigQuery ML.

With AutoML, you create and train a model with minimal technical effort. You can use AutoML to quickly prototype models and explore new datasets before investing in development. For example, you can use it to learn which features are best for a given dataset.

With custom training, you create a training application optimized for your targeted outcome. You have complete control over the training application's functionality: you can target any objective, use any algorithm, develop your own loss functions or metrics, or make any other customization.
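
As one illustration of the kind of customization a training application can contain, the following is a minimal sketch of a hand-written loss function. It implements the pinball (quantile) loss in plain Python. The choice of this loss, the function name `pinball_loss`, and the default `quantile` value are assumptions made for the example and are not prescribed by Vertex AI.

```python
from typing import Sequence


def pinball_loss(
    y_true: Sequence[float],
    y_pred: Sequence[float],
    quantile: float = 0.9,
) -> float:
    """Mean pinball (quantile) loss.

    Under-prediction costs `quantile` per unit of error and
    over-prediction costs `1 - quantile` per unit of error.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must be strictly between 0 and 1")

    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        error = actual - predicted
        # Positive error: the model under-predicted. Negative: it over-predicted.
        total += max(quantile * error, (quantile - 1.0) * error)
    return total / len(y_true)


# Illustrative values only: with quantile=0.9, missing low is penalized
# more heavily than missing high by the same amount.
print(pinball_loss([10.0], [8.0]))
print(pinball_loss([10.0], [12.0]))
```

In a real training application, a function like this would typically be written against the tensor library of your chosen framework so that it can be differentiated during training.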

Using BigQuery ML, you can train models using your BigQuery data directly in BigQuery. Using SQL commands, you can quickly create a model and use it to get batch predictions.
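
The SQL workflow described above can be sketched as two statements: a `CREATE MODEL` statement to train, and an `ML.PREDICT` query to get batch predictions. The Python helpers below only build the SQL text. The dataset, table, and column names are hypothetical, and logistic regression is an arbitrary choice of model type for the example.

```python
def create_model_sql(model: str, source_table: str, label_column: str) -> str:
    """Build a BigQuery ML CREATE MODEL statement for logistic regression."""
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = 'LOGISTIC_REG', "
        f"input_label_cols = ['{label_column}']) AS\n"
        f"SELECT * FROM `{source_table}`"
    )


def batch_predict_sql(model: str, input_table: str) -> str:
    """Build an ML.PREDICT query that scores every row of input_table."""
    return (
        "SELECT *\n"
        f"FROM ML.PREDICT(MODEL `{model}`, (SELECT * FROM `{input_table}`))"
    )


# Hypothetical names, used only for illustration.
train_sql = create_model_sql(
    "my_dataset.churn_model", "my_dataset.customers", "churned"
)
predict_sql = batch_predict_sql(
    "my_dataset.churn_model", "my_dataset.new_customers"
)
print(train_sql)
print(predict_sql)
```

These helpers interpolate identifiers directly into the SQL text, so they are only suitable for trusted, hard-coded names. To run the statements, you could paste them into the BigQuery console or submit them through a client library, for example with the `query` method of `google.cloud.bigquery.Client`.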

To compare the functionality of each service and the expertise it requires, review the following table.

| | AutoML | Custom training | BigQuery ML |
| --- | --- | --- | --- |
| Data science expertise needed | No. | Yes, to develop the training application and to do some of the data preparation, such as feature engineering. | No. |
| Programming ability needed | No, AutoML is codeless. | Yes, to develop the training application. | SQL programming ability is required to build, evaluate, and use the model in BigQuery ML. |
| Time to trained model | Lower. Less data preparation is required, and no development is needed. | Higher. More data preparation is required, and training application development is needed. | Lower. Model development is faster because you don't need to build the infrastructure required for batch predictions or model training; BigQuery ML uses the BigQuery computational engine. This shortens the time to training, evaluation, and prediction. |
| Limits on machine learning objectives | Yes, you must target one of AutoML's predefined objectives. | No. | Yes. |
| Can manually optimize model performance with hyperparameter tuning | No. AutoML does some automated hyperparameter tuning, but you can't modify the values used. | Yes. You can tune the model during each training run for experimentation and comparison. | Yes. BigQuery ML supports hyperparameter tuning when training ML models using `CREATE MODEL` statements. |
| Can control aspects of the training environment | Limited. For image and tabular datasets, you can specify the number of node hours to train for, and whether to allow early stopping of training. | Yes. You can specify aspects of the environment such as Compute Engine machine type, disk size, machine learning framework, and number of nodes. | No. |
| Limits on data size | Yes. AutoML uses managed datasets; data size limitations vary depending on the type of dataset. For specifics, refer to the documentation for your dataset type. | For unmanaged datasets, no. Managed datasets have the same limits as the managed dataset objects that are created in and hosted by Vertex AI and used to train AutoML models. | Yes. BigQuery ML enforces appropriate quotas on a per-project basis. To learn more, see Quotas and limits. |
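
To make the BigQuery ML hyperparameter tuning entry in the table more concrete, the following sketch builds a `CREATE MODEL` statement that asks BigQuery ML to run several tuning trials. It relies on the `NUM_TRIALS` option and the `HPARAM_RANGE` and `HPARAM_CANDIDATES` keywords. The model type, search ranges, and all dataset, table, and column names are assumptions chosen for illustration.

```python
def tuned_linear_model_sql(
    model: str,
    source_table: str,
    label_column: str,
    num_trials: int = 10,
) -> str:
    """Build a CREATE MODEL statement that tunes L1/L2 regularization."""
    if num_trials < 1:
        raise ValueError("num_trials must be at least 1")
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        "OPTIONS (\n"
        "  model_type = 'LINEAR_REG',\n"
        f"  input_label_cols = ['{label_column}'],\n"
        f"  num_trials = {num_trials},\n"
        # Search a continuous range for l1_reg and a fixed set for l2_reg.
        "  l1_reg = HPARAM_RANGE(0, 5),\n"
        "  l2_reg = HPARAM_CANDIDATES([0, 0.1, 1, 10])\n"
        ") AS\n"
        f"SELECT * FROM `{source_table}`"
    )


# Hypothetical names, used only for illustration.
tuning_sql = tuned_linear_model_sql(
    "my_dataset.price_model", "my_dataset.listings", "price"
)
print(tuning_sql)
```

Which hyperparameters can be tuned depends on the model type, so check the BigQuery ML documentation for the model you are training before adapting the ranges.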

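For custom training, the environment settings listed in the table (machine type, disk size, and number of nodes) are typically expressed as a worker pool specification. The sketch below only assembles such a specification as a plain Python dictionary. The field names follow the Vertex AI custom job worker pool structure, while the helper name, the default values, and the container image URI are hypothetical. The machine learning framework is determined by whatever is installed in the container image.

```python
from typing import Any, Dict, List


def build_worker_pool_specs(
    image_uri: str,
    machine_type: str = "n1-standard-4",
    replica_count: int = 1,
    boot_disk_size_gb: int = 100,
) -> List[Dict[str, Any]]:
    """Describe a single worker pool for a custom training job."""
    if replica_count < 1:
        raise ValueError("replica_count must be at least 1")
    return [
        {
            # Compute Engine machine type for each node.
            "machine_spec": {"machine_type": machine_type},
            # Number of nodes in this worker pool.
            "replica_count": replica_count,
            # Boot disk type and size for each node.
            "disk_spec": {
                "boot_disk_type": "pd-ssd",
                "boot_disk_size_gb": boot_disk_size_gb,
            },
            # Container image that holds the training application.
            "container_spec": {"image_uri": image_uri},
        }
    ]


# Hypothetical image URI, used only for illustration.
specs = build_worker_pool_specs(
    "us-docker.pkg.dev/my-project/my-repo/trainer:latest",
    replica_count=2,
)
print(specs)
```

A list like this could then be passed as the `worker_pool_specs` argument when creating a custom job with the Vertex AI SDK for Python.
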
What's next