Dataproc Serverless Spark runtime 1.2.x

Spark runtime version 1.2 components

Component                1.2.30      1.2.29      1.2.28      1.2.27      1.2.26
Release date             2024/10/31  2024/10/25  2024/10/17  2024/10/11  2024/10/04
Apache Spark (Note 1)    3.5.1       3.5.1       3.5.1       3.5.1       3.5.1
Cloud Storage Connector  3.0.3       3.0.3       3.0.0       3.0.0       3.0.0
BigQuery Connector       0.36.4      0.36.4      0.36.4      0.36.4      0.36.4
Java                     17          17          17          17          17
Conda                    24.1        24.1        24.1        24.1        24.1
Python                   3.12        3.12        3.12        3.12        3.12
R                        4.3         4.3         4.3         4.3         4.3
Scala                    2.12        2.12        2.12        2.12        2.12

Notes:

1. The Dataproc Serverless 1.2 runtime uses the UTF-8 default character encoding.
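
When moving between subminor releases, it can help to see which components actually changed. The following is a minimal sketch that encodes the component table above as plain Python data and reports the differences between two releases. The names `RELEASES`, `COMPONENTS`, and `changed_components` are illustrative and not part of any Dataproc API.

```python
# Runtime releases in the same left-to-right order as the table columns.
RELEASES = ["1.2.30", "1.2.29", "1.2.28", "1.2.27", "1.2.26"]

# Component versions copied from the table, one list entry per release.
COMPONENTS = {
    "Apache Spark": ["3.5.1"] * 5,
    "Cloud Storage Connector": ["3.0.3", "3.0.3", "3.0.0", "3.0.0", "3.0.0"],
    "BigQuery Connector": ["0.36.4"] * 5,
    "Java": ["17"] * 5,
    "Conda": ["24.1"] * 5,
    "Python": ["3.12"] * 5,
    "R": ["4.3"] * 5,
    "Scala": ["2.12"] * 5,
}


def changed_components(old: str, new: str) -> dict:
    """Map each component that differs to its (old, new) version pair."""
    i_old, i_new = RELEASES.index(old), RELEASES.index(new)
    return {
        name: (versions[i_old], versions[i_new])
        for name, versions in COMPONENTS.items()
        if versions[i_old] != versions[i_new]
    }


print(changed_components("1.2.28", "1.2.29"))
# prints {'Cloud Storage Connector': ('3.0.0', '3.0.3')}
```

For the releases listed here, the only component that changes is the Cloud Storage Connector, which moves from 3.0.0 to 3.0.3 in 1.2.29.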

Spark runtime 1.2 libraries

Spark runtime 1.2 includes machine learning libraries, such as TensorFlow, PyTorch, and XGBoost, and offers a ready-to-use environment for machine learning and data science applications.

The following sections list the library versions that are available in Dataproc Serverless for Spark runtime version 1.2.

GPU-specific libraries

The following NVIDIA driver and Spark RAPIDS library versions are available in the Dataproc Serverless container to accelerate Spark batch workloads with the NVIDIA Spark RAPIDS library.

Package Name Version
Spark RAPIDS 24.04.0
NVIDIA Driver 550.127.05
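
As a rough illustration of how the RAPIDS library is typically switched on, the sketch below assembles a comma-separated Spark properties string of the kind accepted by a `--properties` flag. The two property names come from the RAPIDS Accelerator for Apache Spark configuration. Whether Dataproc Serverless already sets them for GPU batches, and which accelerator-specific properties a batch additionally needs, is not stated on this page, so treat the values as assumptions to verify. The helper `to_properties_flag` is hypothetical.

```python
# Assumption: these RAPIDS Accelerator settings are relevant to the workload.
# The runtime may already apply some of them for GPU batches.
RAPIDS_PROPERTIES = {
    "spark.plugins": "com.nvidia.spark.SQLPlugin",
    "spark.rapids.sql.enabled": "true",
}


def to_properties_flag(properties: dict) -> str:
    """Join key=value pairs with commas into a single --properties argument."""
    pairs = ",".join(f"{key}={value}" for key, value in properties.items())
    return f"--properties={pairs}"


print(to_properties_flag(RAPIDS_PROPERTIES))
```

Values that themselves contain commas would need a different delimiter, which this sketch does not handle.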

XGBoost libraries

The following Maven package versions are available in Dataproc Serverless for Spark runtime version 1.2 to use XGBoost with Spark in Java or Scala.

Group ID Package Name Version
ml.dmlc xgboost4j-gpu_2.12 2.0.3
ml.dmlc xgboost4j-spark-gpu_2.12 2.0.3

Python libraries

The following Python library versions are included in Dataproc Serverless for Spark runtime version 1.2.

Package Name Version
accelerate 0.33
bigframes 1.7
cookiecutter 2.6
cython 3.0
dask 2024.5
deepspeed 0.14
evaluate 0.4
fastavro 1.9
fastparquet 2024.2
gcsfs 2024.5
git 2.45
google-auth-oauthlib 1.2
google-cloud-aiplatform 1.60
google-cloud-bigquery 3.23
google-cloud-bigquery-storage 2.25
google-cloud-bigtable 2.23
google-cloud-container 2.45
google-cloud-datacatalog 3.19
google-cloud-dataproc 5.9
google-cloud-datastore 2.19
google-cloud-dlp 3.22
google-cloud-language 2.13
google-cloud-logging 3.10
google-cloud-monitoring 2.21
google-cloud-pubsub 2.21
google-cloud-redis 2.15
google-cloud-secret-manager 2.20
google-cloud-spanner 3.46
google-cloud-speech 2.26
google-cloud-storage 2.16
google-cloud-texttospeech 2.16
google-cloud-translate 3.15
google-cloud-vision 3.7
httplib2 0.22
ipyparallel 8.8
ipython-sql 0.3
ipywidgets 8.1
jupyter_http_over_ws 0.0
jupyterlab 4.1
jupyterlab-git 0.50
keyrings.google-artifactregistry-auth 1.1
langchain 0.2
lightgbm 4.5
markdown 3.6
matplotlib 3.8
nbclassic 1.0
nbconvert 7.16
nbdime 4.0
nltk 3.8
nodejs 20.12
numba 0.59
numpy 1.26
oauth2client 4.1
openblas 0.3
opencv 4.9
orc 2.0
pandas 2.2
papermill 2.6
pyarrow 15.0
pydot 2.0
pyhive 0.7
pymongo 4.7
pynvml 11.5
pytables 3.9
pytorch-cpu 2.3
regex 2024.5
requests 2.31
rtree 1.2
scikit-image 0.22
scikit-learn 1.5
scipy 1.11
seaborn 0.12
sentence-transformers 3.0
sqlalchemy 2.0
sympy 1.12
tokenizers 0.19
transformers 4.43
tornado 6.4
uritemplate 4.1
virtualenv 20.26
wordcloud 1.9
xgboost 2.0
ydata-profiling 4.8
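
To confirm that a workload sees the versions listed above, one option is to compare installed packages against the table from inside the job. The sketch below uses the standard-library `importlib.metadata` module and checks only a few sample rows. `LISTED_VERSIONS` and the helper functions are illustrative names. Some table entries (for example `git` and `nodejs`) are not Python distributions, so a lookup like this may report them as not installed.

```python
from importlib import metadata
from typing import Dict, Optional

# A few rows copied from the table above (package name -> listed version).
LISTED_VERSIONS = {"numpy": "1.26", "pandas": "2.2", "pyarrow": "15.0"}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_listed(actual: str, listed: str) -> bool:
    """True when `actual` starts with the same dotted components as `listed`."""
    listed_parts = listed.split(".")
    return actual.split(".")[: len(listed_parts)] == listed_parts


def compare_to_listed(listed: Dict[str, str]) -> Dict[str, str]:
    """Build a per-package status report against the listed versions."""
    report = {}
    for package, expected in listed.items():
        actual = installed_version(package)
        if actual is None:
            report[package] = "not installed"
        elif matches_listed(actual, expected):
            report[package] = f"ok ({actual})"
        else:
            report[package] = f"differs: listed {expected}, found {actual}"
    return report


if __name__ == "__main__":
    for package, status in compare_to_listed(LISTED_VERSIONS).items():
        print(f"{package}: {status}")
```

The comparison is by dotted prefix because the table lists only major and minor versions, so an installed `1.26.4` counts as matching a listed `1.26`.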

R libraries

The following R library versions are included in Dataproc Serverless for Spark runtime version 1.2.

Package Name Version
askpass 1.2
assertthat 0.2
backports 1.5
bit 4.0
bit64 4.0
blob 1.2
boot 1.3_30
brew 1.0_10
broom 1.0
callr 3.7
caret 6.0_94
cellranger 1.1
chron 2.3_61
class 7.3_22
cli 3.6
clipr 0.8
cluster 2.1
codetools 0.2_20
colorspace 2.1_0
commonmark 1.9
cpp11 0.4
crayon 1.5
curl 5.1
data.table 1.15
dbi 1.2
dbplyr 2.5
desc 1.4
devtools 2.4
digest 0.6
dplyr 1.1
ellipsis 0.3
evaluate 0.23
fansi 1.0
fastmap 1.2
forcats 1.0
foreach 1.5
foreign 0.8_86
fs 1.6
future 1.33
generics 0.1
ggplot2 3.5
gh 1.4
glmnet 4.1_8
globals 0.16
glue 1.7
gower 1.0
gtable 0.3
haven 2.5
highr 0.10
hms 1.1
htmltools 0.5.8
htmlwidgets 1.6
httpuv 1.6
httr 1.4
hwriter 1.3.2
ini 0.3
ipred 0.9_14
isoband 0.2
iterators 1.0
jsonlite 1.8
kernsmooth 2.23_24
knitr 1.46
labeling 0.4
later 1.3
lattice 0.22_6
lava 1.7
lifecycle 1.0
listenv 0.9
lubridate 1.9
magrittr 2.0
markdown 1.12
mass 7.3_60
matrix 1.6_5
memoise 2.0
mgcv 1.9_1
mime 0.12
modelmetrics 1.2.2
modelr 0.1
munsell 0.5
nlme 3.1_164
nnet 7.3_19
numderiv 2016.8_1
openssl 2.2
pillar 1.9
pkgbuild 1.4
pkgconfig 2.0
pkgload 1.3
plogr 0.2
plyr 1.8
praise 1.0
prettyunits 1.2
processx 3.8
prodlim 2023.08
progress 1.2
promises 1.3
proto 1.0
ps 1.7
purrr 1.0
r6 2.5
randomforest 4.7_1
rappdirs 0.3
rcmdcheck 1.4
rcolorbrewer 1.1_3
rcpp 1.0
rcurl 1.98_1
readr 2.1
readxl 1.4
recipes 1.0
rematch 2.0
remotes 2.5
reprex 2.1
reshape2 1.4
rlang 1.1
rmarkdown 2.27
rodbc 1.3_23
roxygen2 7.3
rpart 4.1
rprojroot 2.0
rserve 1.8_7
rsqlite 2.3
rstudioapi 0.16
rvest 1.0
scales 1.3
selectr 0.4_2
sessioninfo 1.2
shape 1.4.6
shiny 1.8.1
sourcetools 0.1
spatial 7.3_17
squarem 2021.1
stringi 1.8
stringr 1.5
survival 3.6_4
sys 3.4
teachingdemos 2.12
testthat 3.2.1
tibble 3.2
tidyr 1.3
tidyselect 1.2
tidyverse 2.0
timedate 4032.109
tinytex 0.51
usethis 2.2
utf8 1.2
uuid 1.2_0
vctrs 0.6
whisker 0.4
withr 3.0
xfun 0.44
xml2 1.3
xopen 1.0
xtable 1.8_4
yaml 2.3
zip 2.3