Deploy using CLI
This guide describes how to use the kfctl command line interface (CLI) to deploy Kubeflow on GCP. The command line deployment gives you more control over the deployment process and configuration than you get if you use the deployment UI. If you’re looking for a simpler deployment procedure, see how to deploy Kubeflow using the deployment UI.
Before you start
Before installing Kubeflow on the command line:
Ensure you have installed the following tools:
If you’re using Cloud Shell, enable boost mode.
Make sure that your GCP project meets the minimum requirements described in the project setup guide.
If you want to use Cloud Identity-Aware Proxy (Cloud IAP) for access control, follow the guide to setting up OAuth credentials. Cloud IAP is recommended for production deployments or deployments with access to sensitive data. Alternatively, you can use basic authentication with a username and password.
Prepare your environment
Follow these steps to download the kfctl binary for the Kubeflow CLI and set some handy environment variables:
Download the kfctl v0.7.0 release from the Kubeflow releases page.
Unpack the tarball:
tar -xvf kfctl_v0.7.0_<platform>.tar.gz
Create user credentials. You only need to run this command once:
gcloud auth application-default login
Create environment variables to make the deployment process easier:
# Set your GCP project ID and the zone where you want to create
# the Kubeflow deployment:
export PROJECT=<your GCP project ID>
gcloud config set project ${PROJECT}
export ZONE=<your GCP zone>
gcloud config set compute/zone ${ZONE}

# Use the following kfctl configuration file for authentication with
# Cloud IAP (recommended):
export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_gcp_iap.0.7.0.yaml"

# If using Cloud IAP for authentication, create environment variables
# from the OAuth client ID and secret that you obtained earlier:
export CLIENT_ID=<CLIENT_ID from OAuth page>
export CLIENT_SECRET=<CLIENT_SECRET from OAuth page>

# Alternatively, use the following kfctl configuration if you want to use
# basic authentication:
export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_gcp_basic_auth.0.7.0.yaml"

# If using basic authentication, create environment variables
# for username and password:
export KUBEFLOW_USERNAME=<your username>
export KUBEFLOW_PASSWORD=<your password>

# Set KF_NAME to the name of your Kubeflow deployment. You also use this
# value as directory name when creating your configuration directory.
# See the detailed description in the text below this code snippet.
# For example, your deployment name can be 'my-kubeflow' or 'kf-test'.
export KF_NAME=<your choice of name for the Kubeflow deployment>

# Set the path to the base directory where you want to store one or more
# Kubeflow deployments. For example, /opt/.
# Then set the Kubeflow application directory for this deployment.
export BASE_DIR=<path to a base directory>
export KF_DIR=${BASE_DIR}/${KF_NAME}

# The following command is optional. It adds the kfctl binary to your path.
# If you don't add kfctl to your path, you must use the full path
# each time you run kfctl.
export PATH=$PATH:<path to your kfctl file>
Notes:
- ${PROJECT} - The project ID of the GCP project where you want Kubeflow deployed.
- ${ZONE} - The GCP zone where you want to create the Kubeflow deployment. You can see a list of zones in the Compute Engine documentation. If you plan to use accelerators, you must choose a zone that supports the type you want. See the guide to customizing your Kubeflow deployment.
- ${CONFIG_URI} - The GitHub address of the configuration YAML file that you want to use to deploy Kubeflow. For GCP deployments, the following configurations are available:
  - https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_gcp_iap.0.7.0.yaml
  - https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_gcp_basic_auth.0.7.0.yaml
  When you run kfctl apply or kfctl build (see the next step), kfctl creates a local version of the configuration YAML file, which you can further customize if necessary.
- ${KF_NAME} - The name of your Kubeflow deployment. If you want a custom deployment name, specify that name here. For example, my-kubeflow or kf-test. The value of KF_NAME must consist of lowercase alphanumeric characters or ‘-’, and must start and end with an alphanumeric character. The value cannot be longer than 25 characters. It must contain just a name, not a directory path. You also use this value as the directory name when creating the directory where your Kubeflow configurations are stored, that is, the Kubeflow application directory.
- ${KF_DIR} - The full path to your Kubeflow application directory.
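The naming rules for KF_NAME described above can be checked before you create any directories. The following is a minimal sketch, not part of kfctl: validate_kf_name is a hypothetical helper name, and the regular expression is one reading of the rules in the notes.

```shell
# Sketch: check a candidate deployment name against the KF_NAME rules.
# validate_kf_name is a hypothetical helper, not a kfctl command.
validate_kf_name() {
  local name="$1"
  # The name cannot be longer than 25 characters.
  if [ "${#name}" -gt 25 ]; then
    echo "KF_NAME is longer than 25 characters: ${name}" >&2
    return 1
  fi
  # Lowercase alphanumerics or '-', starting and ending with an alphanumeric.
  # This also rejects empty names and directory paths (which contain '/').
  if ! printf '%s' "${name}" | LC_ALL=C grep -Eq '^[a-z0-9]([a-z0-9-]*[a-z0-9])?$'; then
    echo "KF_NAME is not a valid deployment name: ${name}" >&2
    return 1
  fi
  return 0
}

# Example usage:
#   validate_kf_name "${KF_NAME}" && export KF_DIR="${BASE_DIR}/${KF_NAME}"
```

Running the check before setting KF_DIR avoids creating an application directory for a name that the deployment would later reject.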
Set up and deploy Kubeflow
To set up and deploy Kubeflow using the default settings, run the kfctl apply command:
mkdir -p ${KF_DIR}
cd ${KF_DIR}
kfctl apply -V -f ${CONFIG_URI}
Alternatively, set up your configuration for later deployment
If you want to customize your configuration before deploying Kubeflow, you can set up your configuration files first, then edit the configuration, then deploy Kubeflow:
Run the kfctl build command to set up your configuration:
mkdir -p ${KF_DIR}
cd ${KF_DIR}
kfctl build -V -f ${CONFIG_URI}
Edit the configuration files, as described in the guide to customizing your Kubeflow deployment.
Set an environment variable for your local configuration file:
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_iap.0.7.0.yaml
Or:
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_basic_auth.0.7.0.yaml
Run the kfctl apply command to deploy Kubeflow:
kfctl apply -V -f ${CONFIG_FILE}
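If you are not sure which of the two configuration files ended up in your application directory, the choice of CONFIG_FILE can be scripted. The sketch below assumes the local file keeps the same name as the remote configuration, as in the export commands above. find_local_config is a hypothetical helper, not a kfctl feature.

```shell
# Sketch: locate the local kfctl configuration file in an application directory.
# find_local_config is a hypothetical helper; the file names are the two
# configurations named in this guide.
find_local_config() {
  local dir="$1"
  local candidate
  for candidate in kfctl_gcp_iap.0.7.0.yaml kfctl_gcp_basic_auth.0.7.0.yaml; do
    if [ -f "${dir}/${candidate}" ]; then
      # Print the first configuration file that exists and stop.
      printf '%s\n' "${dir}/${candidate}"
      return 0
    fi
  done
  echo "No kfctl configuration file found in ${dir}" >&2
  return 1
}

# Example usage (after running kfctl build):
#   export CONFIG_FILE="$(find_local_config "${KF_DIR}")"
#   kfctl apply -V -f "${CONFIG_FILE}"
```

The helper checks the Cloud IAP file first, so if both files are present it returns the Cloud IAP configuration.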
Check your deployment
Follow these steps to verify the deployment:
The deployment process creates a separate deployment for your data storage. After running kfctl apply, you should notice two new deployments:
- ${KF_NAME}-storage: This deployment has persistent volumes for your pipelines.
- ${KF_NAME}: This deployment has all the components of Kubeflow, including a GKE cluster named ${KF_NAME} with Kubeflow installed.
When the deployment finishes, check the resources installed in the namespace kubeflow in your new cluster. To do this from the command line, first set your kubectl credentials to point to the new cluster:
gcloud container clusters get-credentials ${KF_NAME} --zone ${ZONE} --project ${PROJECT}
Then see what’s installed in the kubeflow namespace of your GKE cluster:
kubectl -n kubeflow get all
Access the Kubeflow user interface (UI)
Follow these steps to access the Kubeflow central dashboard:
Enter the following URI into your browser address bar. It can take 20 minutes for the URI to become available:
https://<KF_NAME>.endpoints.<project-id>.cloud.goog/
You can run the following command to get the URI for your deployment:
kubectl -n istio-system get ingress
NAME            HOSTS                                                      ADDRESS         PORTS   AGE
envoy-ingress   your-kubeflow-name.endpoints.your-gcp-project.cloud.goog   34.102.232.34   80      5d13h
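The following command sets an environment variable named HOST to the host name of your deployment. The JSONPath expression is quoted so that the shell does not interpret the braces and brackets:
export HOST=$(kubectl -n istio-system get ingress envoy-ingress -o=jsonpath='{.spec.rules[0].host}')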
Follow the instructions on the UI to create a namespace. See the guide to creation of profiles.
Notes:
- It can take 20 minutes for the URI to become available. Kubeflow needs to provision a signed SSL certificate and register a DNS name.
- If you own or manage the domain or a subdomain with Cloud DNS then you can configure this process to be much faster. See kubeflow/kubeflow#731.
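The URI pattern shown above can also be assembled from the environment variables you set earlier. The following sketch is illustrative only: kubeflow_uri is a hypothetical helper name, and the commented polling loop assumes that curl is installed and that a successful HTTPS response is a reasonable signal that the endpoint is ready.

```shell
# Sketch: build the expected dashboard URI from the deployment name and
# the GCP project ID. kubeflow_uri is a hypothetical helper.
kubeflow_uri() {
  local kf_name="$1"
  local project="$2"
  printf 'https://%s.endpoints.%s.cloud.goog/\n' "${kf_name}" "${project}"
}

# Example usage:
#   kubeflow_uri "${KF_NAME}" "${PROJECT}"
#
# One possible way to wait for the endpoint (assumes curl is available):
#   until curl --silent --fail --output /dev/null "$(kubeflow_uri "${KF_NAME}" "${PROJECT}")"; do
#     sleep 30
#   done
```

The printed URI should match the host reported by kubectl -n istio-system get ingress once the ingress exists.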
Understanding the deployment process
This section gives you more details about the kfctl configuration and deployment process, so that you can customize your Kubeflow deployment if necessary.
kfctl process and configuration
The kfctl deployment process includes the following commands:
- kfctl build - (Optional) Creates configuration files defining the various resources in your deployment. You only need to run kfctl build if you want to edit the resources before running kfctl apply. See the guide to customizing your Kubeflow deployment.
- kfctl apply - Creates or updates the resources.
- kfctl delete - Deletes the resources.
The kfctl deployment process applies default values to certain properties as follows:
- Email address: kfctl attempts to fetch your email address from your Cloud SDK configuration. You can run gcloud config list to see the default email address, which the command output lists as the account. If kfctl can’t find a valid email address, you must use the flag --email <your email address> to pass a valid email address. This email address becomes an administrator in the configuration of your Kubeflow deployment.
- GCP project ID: kfctl attempts to fetch your project ID from your Cloud SDK configuration. You can run gcloud config list to see your active project ID.
- GCP zone: kfctl attempts to fetch the zone from your Cloud SDK configuration. You can run gcloud config list to see your active zone.
- Kubeflow deployment name: kfctl defaults to the name of the directory where you run the kfctl build or kfctl apply command.
You can also explicitly set the following values in your ${CONFIG_FILE} configuration file:
- Kubeflow deployment name
- GCP project
- GCP zone
- Email address
The following snippet shows you how to set values in the configuration file using yq:
yq w -i ${CONFIG_FILE} 'spec.plugins[0].spec.project' ${PROJECT}
yq w -i ${CONFIG_FILE} 'spec.plugins[0].spec.zone' ${ZONE}
yq w -i ${CONFIG_FILE} metadata.name ${KF_NAME}
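Before you decide whether to override anything, it can help to see the values that the defaults described above would resolve to. The sketch below is an approximation, not kfctl's own logic: default_kf_name and show_kfctl_defaults are hypothetical helper names, and the gcloud config get-value calls only mirror what gcloud config list reports.

```shell
# Sketch: preview the default values described above.
# Both functions are hypothetical helpers, not part of kfctl.

# The deployment name defaults to the name of the directory where kfctl runs.
default_kf_name() {
  basename "${1:-$PWD}"
}

# Print the account, project, and zone from the active Cloud SDK configuration.
show_kfctl_defaults() {
  echo "deployment name: $(default_kf_name)"
  if command -v gcloud >/dev/null 2>&1; then
    echo "email address:   $(gcloud config get-value account 2>/dev/null)"
    echo "project ID:      $(gcloud config get-value project 2>/dev/null)"
    echo "zone:            $(gcloud config get-value compute/zone 2>/dev/null)"
  else
    echo "gcloud not found; cannot read Cloud SDK defaults" >&2
  fi
}

# Example usage:
#   cd "${KF_DIR}" && show_kfctl_defaults
```

If any printed value is empty or not what you expect, set it explicitly in the configuration file as shown in the yq snippet.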
Application layout
Your Kubeflow application directory ${KF_DIR} contains the following files and directories:
- ${CONFIG_FILE} is a YAML file that defines configurations related to your Kubeflow deployment.
  - This file is a copy of the GitHub-based configuration YAML file that you used when deploying Kubeflow:
    - either https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_gcp_iap.0.7.0.yaml
    - or https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_gcp_basic_auth.0.7.0.yaml
  - When you run kfctl apply or kfctl build, kfctl creates a local version of the configuration file, ${CONFIG_FILE}, which you can further customize if necessary.
- gcp_config is a directory that contains Deployment Manager configuration files defining your GCP infrastructure.
  - The directory is created when you run kfctl build or kfctl apply.
  - You can modify these configurations to customize your GCP infrastructure. After modifying a configuration, run kfctl apply again.
- kustomize is a directory that contains the kustomize packages for Kubeflow applications. See how Kubeflow uses kustomize.
  - The directory is created when you run kfctl build or kfctl apply.
  - You can customize the Kubernetes resources by modifying the manifests and running kfctl apply again.
We recommend that you check the contents of your ${KF_DIR} directory into source control.
GCP service accounts
The kfctl deployment process creates three service accounts in your GCP project. These service accounts follow the principle of least privilege. The service accounts are:
- ${KF_NAME}-admin is used for some admin tasks like configuring the load balancers. The principle is that this account is needed to deploy Kubeflow but not needed to actually run jobs.
- ${KF_NAME}-user is intended to be used by training jobs and models to access GCP resources (Cloud Storage, BigQuery, etc.). This account has a much smaller set of privileges compared to admin.
- ${KF_NAME}-vm is used only for the virtual machine (VM) service account. This account has the minimal permissions needed to send metrics and logs to Stackdriver.
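To confirm that the three accounts exist after deployment, you can derive their expected email addresses and compare them with what your project reports. This sketch assumes the accounts use the usual GCP service account address form, NAME@PROJECT_ID.iam.gserviceaccount.com. kf_service_accounts is a hypothetical helper, not part of kfctl.

```shell
# Sketch: print the expected service account emails for a deployment.
# kf_service_accounts is a hypothetical helper. The email form
# NAME@PROJECT_ID.iam.gserviceaccount.com is assumed here.
kf_service_accounts() {
  local kf_name="$1"
  local project="$2"
  local role
  for role in admin user vm; do
    printf '%s-%s@%s.iam.gserviceaccount.com\n' "${kf_name}" "${role}" "${project}"
  done
}

# Example usage: compare with the accounts that exist in the project.
#   kf_service_accounts "${KF_NAME}" "${PROJECT}"
#   gcloud iam service-accounts list --project "${PROJECT}"
```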
Next steps
- Run a full ML workflow on Kubeflow, using the end-to-end MNIST tutorial or the GitHub issue summarization example.
- See how to delete your Kubeflow deployment using the CLI.
- See how to customize your Kubeflow deployment.
- See how to upgrade Kubeflow and how to upgrade or reinstall a Kubeflow Pipelines deployment.
- Troubleshoot any issues you may find.