Smart Autofill is a Google Spreadsheets add-on that uses the Prediction API to perform machine learning directly in a Google Spreadsheet. Click here to install and enable the add-on. After the add-on is installed, you should see a menu item under "Add-ons" called "Smart Autofill".
Estimating car prices
Suppose you want to price a used vehicle based on a few simple attributes: year, mileage, number of doors, and vehicle type (Car or Truck). Of course, accurate pricing depends on many other variables as well, but we will use these four simple features to illustrate the use of Smart Autofill.
Using Smart Autofill with text data
Smart Autofill is designed to handle free-form text specially, and can use it to help fill in missing numeric or categorical values.
As an example, suppose you sent out a survey using Google Forms to a multi-lingual audience that could potentially respond with comments in either English, Spanish, or French. When someone responds to the Form, their answer will automatically be populated into a Google Sheets spreadsheet.
Now, you wish to read some of the comments in the form, but suppose you only speak English. Instead of sifting through the responses looking for the ones that are in English, you can use Smart Autofill to simplify your task. If you would like to follow along with this example, the fully labeled data is available here.
Although this example has only one filled column and one column with missing data, Smart Autofill can also handle more than one filled column. Some of the filled columns can contain text and others can contain numeric or categorical data (as in the used vehicle example).
There are many other text applications for Smart Autofill; here are just a few more examples:
Note that the larger the vocabulary of the text, the more non-missing examples you will generally need to get accurate filled values. Smart Autofill may not be well suited to all data, but we hope you can use these examples as a guide to find many interesting use cases.
- What do all the fields in Autofill Info mean?
Estimated Accuracy / Average Error
If the autofill column contains categorical values (e.g. True or False), Smart Autofill reports the Estimated accuracy. If the autofill column contains numeric values, it reports the Average error.
These metrics are computed on the cells that already have values: Smart Autofill measures how well it would do if it hypothetically had to fill in those cells, by comparing its predicted values with the actual values. (Smart Autofill cannot measure the accuracy of the autofilled entries directly, since the true missing values are unknown.) In the categorical scenario, Estimated accuracy is the fraction of cells Smart Autofill gets right (100% is best). In the numeric scenario, Average error is the average difference between the predicted value and the actual value (0 is best).
Smart Autofill does not guarantee the accuracy of the autofilled entries. The Estimated accuracy and Average error metrics are only a guideline and may not be representative of the actual accuracy of the autofilled entries, especially if (a) the number of non-missing entries is small or (b) the rows with non-missing values have very different characteristics from the rows with missing values.
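As an illustration of the two metrics described above, here is a minimal Python sketch. It is not Smart Autofill's actual implementation: the function names are hypothetical, and it assumes that "average difference" means the mean absolute difference between predicted and actual values.

```python
def estimated_accuracy(predicted, actual):
    """Fraction of cells whose predicted label equals the actual label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / len(actual)


def average_error(predicted, actual):
    """Mean absolute difference between predicted and actual values.

    Assumption: the absolute value is used so that over- and
    under-estimates do not cancel each other out.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - a) for p, a in zip(predicted, actual))
    return total / len(actual)


# Categorical column: 3 of 4 predictions match the known labels.
labels_predicted = ["True", "False", "True", "True"]
labels_actual = ["True", "False", "False", "True"]
accuracy = estimated_accuracy(labels_predicted, labels_actual)

# Numeric column: the errors are 2, 2 and 0.
prices_predicted = [10.0, 20.0, 30.0]
prices_actual = [12.0, 18.0, 30.0]
error = average_error(prices_predicted, prices_actual)
```

Both functions compare predictions only against cells whose true values are known, which is why the resulting numbers are estimates rather than guarantees for the autofilled cells.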
Number of rows filled
The number of rows in the selection that were autofilled by Smart Autofill.
Number of labeled rows
The number of rows in the selection that were already labeled and therefore were not modified by Smart Autofill.
Number of empty rows
The number of rows in the selection that were completely empty and therefore skipped by Smart Autofill.
Smart Autofill will work even if a few cells in the non-autofilled columns are empty. However, the autofilled (predicted) value for those rows may be less accurate.
If the column value is free-form text, it is fine to leave it as regular text. However, if the column value represents a single entity, you will most likely get better performance if you replace spaces with underscores.
Suppose you wish to add another column that represents a comment or review about the vehicle. In that case, keep the text as is - e.g. "This car is really fast!".
Suppose you wish to add another column that represents the city in which the vehicle is being sold. In that case, replace all the spaces with underscores - e.g. "NEW_YORK_CITY".
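If you prepare your data outside the spreadsheet, the underscore replacement described above could be done with a small helper like the following Python sketch. The function name `normalize_entity` is hypothetical and not part of Smart Autofill. As an extra assumption, the helper also trims leading and trailing whitespace and collapses runs of spaces into a single underscore.

```python
def normalize_entity(value):
    """Join the whitespace-separated words of an entity name with underscores.

    str.split() with no arguments drops leading/trailing whitespace and
    treats any run of whitespace as a single separator.
    """
    return "_".join(value.split())


city = normalize_entity("NEW YORK CITY")

# Free-form text such as a review should be left unchanged.
review = "This car is really fast!"
```

Apply the helper only to columns whose values name a single entity (such as a city), and leave comment or review columns as they are.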
If you run into any other issues, please email email@example.com.