This guide explains how to customize Cloud Datalab by adding Python libraries to your Cloud Datalab VM instance.
Adding Python libraries to a Cloud Datalab instance
Cloud Datalab includes a set of libraries intended to support common data analysis, transformation, and visualization scenarios. You can add more Python libraries using one of the following mechanisms:
- Add a code cell in a notebook to pip install the library, and then run the
  code cell after substituting the name of your library for lib_name:

      !pip install lib_name

  This is the easiest way to customize for individual needs. It involves minimal maintenance when the underlying Cloud Datalab image is updated, because rerunning the code cell is trivial.
- Create a new notebook and add a code cell with the following content after
  substituting the name of your library for lib_name:

      %%bash
      echo "pip install lib_name" >> /content/datalab/.config/startup.sh
      cat /content/datalab/.config/startup.sh

  Run the cell, then restart the Cloud Datalab instance: click the information icon in the top-right corner of the Cloud Datalab notebook or notebook listing page in your browser, then click the Restart Server option.
- Inherit from the Cloud Datalab Docker
  container using a Docker customization mechanism. This option is much more
  heavyweight than the options listed above. However, it provides
  maximum flexibility for those who intend to significantly customize the
  container for use by a team or organization. To use this mechanism,
  you need to build your own container (named "Dockerfile-extended-example"
  below). Also see the customization example in the Cloud Datalab GitHub repo.

      FROM datalab
      ...
      pip install lib_name
      ...

  This approach requires you to take on the additional work of building and maintaining your own image as the underlying
  datalab container evolves. Therefore, use this approach only if the other mechanisms described above do not meet your needs.
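For the first mechanism, the install step can be made conditional so that rerunning the notebook does not invoke pip when the library is already present. The following is a minimal sketch, assuming a Python 3 kernel. The helper names is_installed and ensure_installed are hypothetical and are not part of Cloud Datalab, and lib_name remains a placeholder for your library.

```python
import importlib.util
import subprocess
import sys


def is_installed(module_name):
    """Return True if a top-level module can be imported in this kernel."""
    # find_spec returns None when a top-level module cannot be located.
    return importlib.util.find_spec(module_name) is not None


def ensure_installed(package_name, module_name=None):
    """Install package_name with pip unless its module is already importable.

    module_name is the import name, which may differ from the package name
    on PyPI. It defaults to package_name (an assumption that does not hold
    for every library).
    Returns True if pip was run, False if nothing needed to be done.
    """
    module_name = module_name or package_name
    if is_installed(module_name):
        return False
    # Run pip with the interpreter that backs the current kernel.
    subprocess.check_call([sys.executable, "-m", "pip", "install", package_name])
    return True


# Hypothetical usage in a notebook cell (lib_name is a placeholder):
# ensure_installed("lib_name")
```

Because the check uses the import name, pass module_name explicitly when the library is imported under a different name than the one it is installed with.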
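For the second mechanism, note that the shell >> operator appends a new line each time the cell runs, so rerunning the cell leaves duplicate install commands in startup.sh. As a sketch only, the same edit could be made from a Python cell in a way that skips lines already present. The function name add_startup_command is hypothetical; the path is the one used in the cell above.

```python
import os

# Startup script path taken from the %%bash cell in this guide.
STARTUP_SCRIPT = "/content/datalab/.config/startup.sh"


def add_startup_command(command, path=STARTUP_SCRIPT):
    """Append command to the startup script unless an identical line exists.

    Returns True if the line was appended, False if it was already present.
    """
    content = ""
    if os.path.exists(path):
        with open(path) as f:
            content = f.read()

    # Compare against existing lines, ignoring surrounding whitespace.
    if command in (line.strip() for line in content.splitlines()):
        return False

    with open(path, "a") as f:
        # Avoid fusing the new command onto an unterminated last line.
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(command + "\n")
    return True


# Hypothetical usage in a notebook cell (lib_name is a placeholder):
# add_startup_command("pip install lib_name")
```

As with the %%bash version, the instance still needs to be restarted before the startup script takes effect.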