Version v0.7 of the documentation is no longer actively maintained. The site you are currently viewing is an archived snapshot. For up-to-date documentation, see the latest version.

Serving

Serving of ML models in Kubeflow

Overview

Model serving overview

KFServing

Model serving using KFServing

Seldon Serving

Model serving using Seldon

NVIDIA TensorRT Inference Server

Model serving using NVIDIA TensorRT Inference Server

TensorFlow Serving

Serving TensorFlow models

TensorFlow Batch Predict

See the Kubeflow v0.6 docs for batch prediction with TensorFlow models

PyTorch Serving

Instructions for serving a PyTorch model with Seldon