Cloud Dataprep can import data from a variety of flat file formats and other distributed sources.
NOTE: Cloud Dataprep does not modify a source. Instead, it associates a set of metadata with the raw data, which lets you define transformations without changing the source. On export, a new version of the data is written to one or more specified output destinations.
For more information on the formats supported for input, see Supported File Formats.
- When data is imported, a reference to it is stored by the platform as an imported dataset. The source data is not modified.
- In the application, you modify the recipe associated with a dataset to transform the imported data.
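The non-destructive model described above can be sketched in a few lines of Python. This is an illustration only and not the Cloud Dataprep API: the `ImportedDataset` class, its `recipe` list, and the `export` method are hypothetical names, and a local CSV file stands in for the source.

```python
import csv
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable

Row = dict[str, str]


@dataclass
class ImportedDataset:
    """Hypothetical stand-in: a reference to a source plus a recipe."""

    source: Path  # reference only; the file is never opened for writing
    recipe: list[Callable[[Row], Row]] = field(default_factory=list)

    def export(self, destination: Path) -> None:
        # Read the raw rows from the source without touching the file itself.
        with self.source.open(newline="") as src:
            rows = list(csv.DictReader(src))
        # Apply each recipe step, in order, to an in-memory copy of the rows.
        for step in self.recipe:
            rows = [step(row) for row in rows]
        # Write the transformed rows as a new file at the destination.
        fieldnames = list(rows[0]) if rows else []
        with destination.open("w", newline="") as dst:
            writer = csv.DictWriter(dst, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)


def demo() -> tuple[str, str]:
    """Return (source text, exported text) after running a one-step recipe."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "source.csv"
        source.write_text("name,city\nada,london\n")
        original = source.read_text()

        dataset = ImportedDataset(source)
        # Editing the recipe changes metadata only, not the source file.
        dataset.recipe.append(lambda row: {**row, "name": row["name"].upper()})

        output = Path(tmp) / "output.csv"
        dataset.export(output)

        # The source is byte-for-byte unchanged after export.
        assert source.read_text() == original
        return original, output.read_text()


if __name__ == "__main__":
    before, after = demo()
    print(before)
    print(after)
```

In this sketch the recipe plays the role of the metadata: steps accumulate on the dataset object, and only `export` produces new data, always at a separate destination.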
NOTE: Any user with a valid user account can import data from a local file.
- In the menu bar, click Datasets. Then click Import Data.
To add a dataset:
- Select the connection where your source is located. For this basic workflow, select Upload to upload a file from your local desktop. You can select multiple files to upload. For this example, select only one file.
- Navigate to and select the file or files for your source. Click Open.
- To queue the dataset for uploading, click the Plus icon next to its name.
To begin working with a dataset, you must first add it to a flow, which is a container for datasets. Select the Add Dataset to a Flow checkbox and enter a name for the new flow.
Tip: If you have selected a single file, you can begin wrangling it immediately. Click Import and Wrangle. The flow is created for you, and your dataset is added to it.
- Click Import & Add to Flow.
- After the flow has been created, open the flow and select the dataset to begin transforming it. See Transform Basics.