
Installation

Install via helm chart

Add Cuebook helm repo

helm repo add cuebook https://cuebook.github.io/helm-charts/ 
helm repo update

Install CueLake

Install CueLake with default configuration:

helm install cuelake cuebook/cuelake

Install CueLake in a specific namespace:

helm install cuelake cuebook/cuelake -n <NAMESPACE>

To install CueLake with custom configuration, first download the values.yaml file for CueLake:

wget https://raw.githubusercontent.com/cuebook/helm-charts/main/helm-chart-sources/cuelake/values.yaml

Edit the file and install CueLake:

helm install cuelake cuebook/cuelake -f values.yaml
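The namespace and custom-values options above can be combined in one command. The following is a minimal sketch, not taken from the CueLake docs: `run` and `install_cuelake` are hypothetical helper names, the `DRY_RUN` variable is an illustrative convention for printing the command instead of executing it, and `--create-namespace` is the standard Helm 3 flag that creates the namespace if it does not exist yet.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: install CueLake into a namespace with a custom values file.
# $1: namespace, $2: path to the edited values.yaml
install_cuelake() {
  run helm install cuelake cuebook/cuelake -n "$1" --create-namespace -f "$2"
}
```

Calling `DRY_RUN=1 install_cuelake analytics values.yaml` prints the helm command for review (`analytics` is just an example namespace); unsetting `DRY_RUN` runs it.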

After installation is complete, the CueLake UI can be accessed via port forwarding:

kubectl port-forward services/lakehouse 8080:80 -n <NAMESPACE>
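With the port-forward running in another terminal, local port 8080 maps to port 80 of the lakehouse service, so the UI should answer at http://localhost:8080. The sketch below polls that address until it responds. `wait_for_url` is a hypothetical helper, and the attempt count and delay are arbitrary defaults, not values from the CueLake docs.

```shell
#!/bin/sh
# Hypothetical helper: poll a URL until it responds or the attempts run out.
# $1: URL, $2: number of attempts (default 30), $3: seconds between attempts (default 2)
wait_for_url() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: treat HTTP errors as failures, -sS: quiet but still show errors
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_url http://localhost:8080/ && echo "CueLake UI is up"` returns as soon as the UI responds, or fails after the attempts are exhausted.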

Install via Kubectl

kubectl create namespace cuelake
kubectl apply -f https://raw.githubusercontent.com/cuebook/cuelake/main/cuelake.yaml -n cuelake
kubectl port-forward services/lakehouse 8080:80 -n cuelake

To install CueLake with custom configuration, first download the cuelake.yaml file, which contains the cuelake-conf config map:

wget https://raw.githubusercontent.com/cuebook/cuelake/main/cuelake.yaml

Edit the properties in the cuelake-conf config map, then apply the changes via kubectl:

kubectl apply -f cuelake.yaml -n cuelake

Properties

The properties below can be set in both cuelake.yaml (kubectl installation) and values.yaml (Helm installation).

CueLake DB Properties (Optional)

If the properties below are not set, SQLite is used as the CueLake database.

POSTGRES_DB_HOST="localhost"
POSTGRES_DB_USERNAME="postgres"
POSTGRES_DB_PASSWORD="postgres"
POSTGRES_DB_SCHEMA="cuelake_db"
POSTGRES_DB_PORT=5432

Metastore DB Settings

CueLake currently supports Postgres as the Spark metastore database, which stores information about saved Spark tables and views. If these properties are not set, the tables and views are destroyed on every interpreter restart.

METASTORE_POSTGRES_HOST="localhost"
METASTORE_POSTGRES_PORT=5432
METASTORE_POSTGRES_USERNAME="postgres"
METASTORE_POSTGRES_PASSWORD="postgres"
METASTORE_POSTGRES_DATABASE="cuelake_metastore"

Spark Interpreter Settings

To enable the Hive metastore, set the following properties on the Spark interpreter:

spark.sql.catalogImplementation                       hive
spark.sql.warehouse.dir                               s3a://<BUCKET_NAME>/warehouse
spark.sql.catalog.spark_catalog.type                  hive
spark.sql.catalog.spark_catalog                       org.apache.iceberg.spark.SparkSessionCatalog
spark.hadoop.javax.jdo.option.ConnectionURL           jdbc:postgresql://<POSTGRES_HOST>:5432/<DATABASE_NAME>
spark.hadoop.javax.jdo.option.ConnectionUserName      username
spark.hadoop.javax.jdo.option.ConnectionPassword      password
spark.hadoop.javax.jdo.option.ConnectionDriverName    org.postgresql.Driver
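The ConnectionURL value follows the standard PostgreSQL JDBC format `jdbc:postgresql://host:port/database`. As a small sketch, the hypothetical helper below assembles it from a host, port, and database name. It assumes, which the text here does not state explicitly, that the interpreter should point at the same Postgres database configured under Metastore DB Settings; the example arguments are the defaults shown in that section.

```shell
#!/bin/sh
# Hypothetical helper: build the JDBC URL for
# spark.hadoop.javax.jdo.option.ConnectionURL.
# $1: Postgres host, $2: port, $3: database name
metastore_jdbc_url() {
  printf 'jdbc:postgresql://%s:%s/%s\n' "$1" "$2" "$3"
}

# Example using the default values from Metastore DB Settings (an assumption).
metastore_jdbc_url localhost 5432 cuelake_metastore
# prints: jdbc:postgresql://localhost:5432/cuelake_metastore
```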

To enable the Hadoop catalog for Iceberg tables:

spark.sql.catalog.cuelake                             org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.cuelake.type                        hadoop
spark.sql.catalog.cuelake.warehouse                   s3a://<BUCKET_NAME>/cuelake