Problem
Start a Dataproc cluster with an environment variable set to a predefined value, and read that value from a Jupyter notebook.
Environment
- Dataproc cluster
- Jupyter notebook
- Jupyter optional component enabled
Solution
Workaround
- Set a new environment variable and restart the Jupyter service.
- Sample init script:
#!/bin/bash

function set_env() {
  local role
  role="$(/usr/share/google/get_metadata_value attributes/dataproc-role)"
  if [[ "${role}" == 'Master' ]]; then
    cat <<EOF >>"/etc/environment"
DATAPROC_TEST_ENV="my_env_var"
EOF
    systemctl restart jupyter.service
  fi
}

set_env
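To check that the restarted service actually picked up the variable, one option is to inspect the environment of the running Jupyter process on the master node. The sketch below is a minimal example, not part of the original workaround: the helper name proc_env_value is hypothetical, and it assumes a Linux /proc filesystem and a user (typically root) allowed to read the target process's environment.

```shell
#!/bin/bash

# proc_env_value PID NAME
# Hypothetical helper: prints the value NAME had in the environment that
# process PID was started with, or nothing if NAME was not set.
# /proc/PID/environ holds NUL-separated NAME=value pairs captured at exec
# time, so it reflects what the process saw when it started.
proc_env_value() {
  local pid="$1" name="$2"
  tr '\0' '\n' <"/proc/${pid}/environ" | sed -n "s/^${name}=//p"
}

# Example use on the master node (assumes the unit name jupyter.service
# from the init script and a systemd version that supports --value):
#   proc_env_value "$(systemctl show -p MainPID --value jupyter.service)" DATAPROC_TEST_ENV
```

If the command prints nothing, the service was most likely started before the variable was written and has not been restarted since.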
Cause
The Jupyter service reads its environment variables once, when the dataproc-startup script starts it. The initialization script, which defines the new environment variables, runs after the dataproc-startup script. The Jupyter service is already running by then, so the new variables are not available in Jupyter notebooks until the service is restarted.