
Creating a Konan Compatible ML Webservice

info

If you're using Python, an easier option for deploying models to Konan is to use Konan Templates.

Web Server

The first thing we need in order to communicate with the ML model is a web server. There are plenty of web server frameworks out there; feel free to use whichever you're comfortable with. If you're undecided, we recommend FastAPI for Python implementations and plumber for R implementations.

FastAPI is a minimal web framework with an interface similar to Flask's but with better performance. It's easy to get started with, and it's loaded with the advanced features we'll need for a smooth integration with Konan. To install FastAPI, run:

pip install fastapi[all]

This will install fastapi along with its dependencies, such as uvicorn, the server that runs the code. You will need to create a requirements.txt file and add fastapi as well as uvicorn to it for later use.
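For reference, a requirements.txt for this setup can be as small as the following (consider pinning exact versions for reproducible builds):

```
fastapi[all]
uvicorn
```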

Creating a web server for your app is as simple as this:

server.py
from fastapi import FastAPI

# initializing your app
app = FastAPI()

# TODO: load your model weights here (for later use)

We'll be building on top of server.py as we go along the guide. For testing purposes, you can run the web server locally using this command:

uvicorn server:app --reload

then go to http://localhost:8000 in your browser. There's nothing to see or test yet, but that'll change in the next section.

note

server: the file server.py (the Python "module").

app: the object created inside of server.py with the line app = FastAPI().

--reload: make the server restart after code changes. Only use for development.


Endpoints

An endpoint is a URL used to communicate with the web server. A request refers to the input to the endpoint and a response refers to the output of the endpoint. In the case of model serving, we'll need a prediction endpoint for inference.

There are two main components that need to be defined in most endpoints:

  1. Data validators, which are used to both enforce and communicate the expected format of the data going into (and out of) the model. For example, if your model expects feature A as a boolean and feature B as an integer, your data validator should specify these types. If the input does not follow this format, a value error should be raised. Data validators are also essential for generating the model's API docs which makes the model usable (more on that in a bit).
  2. Endpoint logic, which calls the model's predict function. It may also include some pre-processing or post-processing before or after calling the predict function.
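To make the first point concrete, here is a minimal sketch of a pydantic data validator. The field names feature_a and feature_b are hypothetical, matching the boolean/integer example above:

```python
from pydantic import BaseModel, ValidationError

class ModelInput(BaseModel):
    # Hypothetical schema: feature_a must be a boolean, feature_b an integer
    feature_a: bool
    feature_b: int

# A well-formed payload validates cleanly
ok = ModelInput(feature_a=True, feature_b=7)

# A malformed payload raises a ValidationError
# (FastAPI turns this into a 422 response automatically)
try:
    ModelInput(feature_a="not-a-bool", feature_b=7)
except ValidationError:
    print("payload rejected")
```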

/predict

The first and most important endpoint required in the web server is, of course, the prediction endpoint. This is the endpoint that will return the model's output for a given input. The following is an example of how you can implement your endpoint while incorporating data validators.

Prediction request validator:

FastAPI uses pydantic for data validation, which supports an array of data types you can find here. The following is an example of a prediction request data validator that specifies the format of the input data coming into the model.

server.py
from typing import Optional

from pydantic import BaseModel, validator

class PredictionRequest(BaseModel):
    """
    Request validator.
    """
    some_feat: str
    other_feat: int
    optional_field: Optional[bool] = None  # default value

    # TODO: add validators to enforce value ranges
    @validator('some_feat')
    def in_values(cls, v):
        """
        Validates that some_feat is one of the allowed values.
        """
        if v not in ["A", "B", "C"]:
            raise ValueError('Unknown value, must be a value in ["A", "B", "C"]')
        return v

As the use of the @validator decorator shows in the example, you're encouraged to go all-in on validation: the stricter the validation, the easier the debugging. For example, for a gender column, instead of just declaring a str type, add a restriction enforcing only the values M and F, or whatever else your model accepts.
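As a sketch of that suggestion (the gender field and its allowed values are illustrative, not part of Konan's API):

```python
from pydantic import BaseModel, ValidationError, validator

class GenderedRequest(BaseModel):
    gender: str  # hypothetical field; adjust to your model's schema

    @validator("gender")
    def gender_allowed(cls, v):
        # Reject anything the model wasn't trained on
        if v not in ("M", "F"):
            raise ValueError('gender must be "M" or "F"')
        return v

GenderedRequest(gender="M")  # accepted
try:
    GenderedRequest(gender="male")  # rejected by the validator
except ValidationError:
    print("invalid gender rejected")
```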

Prediction endpoint logic:

Create a POST /predict endpoint; this is where the prediction logic is called. Make sure you pass the prediction request validator as an input to the function, as shown below (PredictionRequest was defined in the snippet above).

server.py
@app.post("/predict")
def predict(req: PredictionRequest):

    # TODO: call preprocessing function (if it exists)

    # TODO: call model's predict function
    prediction = True  # TODO: replace

    # TODO: call postprocessing function (if it exists)

    return {"output": prediction}
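To show how those TODOs might be filled in, here's a pure-Python sketch with a stand-in model. Every name below (preprocess, model_predict, postprocess, the encoding, the decision rule) is hypothetical; replace them with your real pipeline:

```python
# Hypothetical encoding for the categorical feature from PredictionRequest
FEATURE_ENCODING = {"A": 0, "B": 1, "C": 2}

def preprocess(some_feat: str, other_feat: int) -> list:
    # Turn raw request fields into the model's input vector
    return [FEATURE_ENCODING[some_feat], other_feat]

def model_predict(features: list) -> bool:
    # Toy decision rule standing in for your model's predict() call
    return sum(features) > 2

def postprocess(raw_output: bool) -> bool:
    # e.g. thresholding or label mapping; identity here
    return bool(raw_output)

# Inside the /predict endpoint, the body would then be:
features = preprocess("A", 3)
prediction = postprocess(model_predict(features))
print({"output": prediction})  # {'output': True}
```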

/docs

This endpoint returns the generated API documentation for the web server's endpoints. The generated document communicates how to use the API by defining the inputs and outputs the API expects. In our case, the API is the collection of endpoints exposed by the web server. Generating this document is made possible by the data validators previously defined in the web server; once the endpoints' data validators are in place, it becomes fairly simple to generate.

If you run the server (uvicorn server:app --reload) and go to http://localhost:8000/docs, you should see the interactive API document generated by SwaggerUI.

Swagger UI

In order to integrate with Konan, we'll need this endpoint to return the docs in JSON format instead of the rendered HTML. This can be achieved by passing the following argument when initializing the app:

server.py
from fastapi import FastAPI

app = FastAPI(openapi_url='/docs')
note

If you're using a different framework and plan to implement this endpoint, make sure you return an OpenAPI V3 JSON document; earlier versions will not render.

/healthz

This is a simple GET endpoint that is used internally by Konan for integration purposes. Including it is required for the deployment to work.

server.py
from fastapi import Response

@app.get('/healthz')
def healthz_func():
    """
    Health check for API server.
    """
    return Response(content="\n", status_code=200, media_type="text/plain")

/evaluate

The evaluate endpoint is optional, but highly useful: it lets you expose the model evaluation metrics of your choice. This is a POST endpoint that takes a list of dicts, where each dict contains a prediction value and its corresponding ground truth obtained via the Konan feedback endpoint.

[
  {
    "prediction": "value_1",
    "target": "true_value_1"
  },
  {
    "prediction": "value_2",
    "target": "true_value_2"
  }
]

It expects the response to also be a list of dicts (possibly just one), where each dict contains an evaluation metric name and its value. For example:

[
  {"metric_name": "MSE", "metric_value": "1.5"},
  {"metric_name": "Accuracy", "metric_value": "0.99"}
]
server.py
from typing import Any, List

class EvaluationRequestDict(BaseModel):
    prediction: Any
    target: Any

class EvaluationRequest(BaseModel):
    data: List[EvaluationRequestDict]

class EvaluationResponseDict(BaseModel):
    metric_name: str
    metric_value: Any


class EvaluationResponse(BaseModel):
    results: List[EvaluationResponseDict]

@app.post("/evaluate", response_model=EvaluationResponse)
def evaluate_func(req: EvaluationRequest):
    """
    Evaluate metrics for model performance based on predictions and ground truth.
    """

    # Evaluate model performance
    results = [
        {"metric_name": "MSE", "metric_value": "1.5"},
        {"metric_name": "Accuracy", "metric_value": "0.99"},
    ]

    return {"results": results}
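The metric values in the snippet above are hard-coded placeholders. For numeric predictions, the actual computation can be sketched in plain Python (the helper name evaluate_pairs is illustrative):

```python
def evaluate_pairs(pairs):
    """Compute MSE and accuracy from (prediction, target) feedback pairs.

    `pairs` mirrors the request body: a list of dicts with
    "prediction" and "target" keys, assumed numeric here.
    """
    n = len(pairs)
    mse = sum((p["prediction"] - p["target"]) ** 2 for p in pairs) / n
    accuracy = sum(p["prediction"] == p["target"] for p in pairs) / n
    return [
        {"metric_name": "MSE", "metric_value": mse},
        {"metric_name": "Accuracy", "metric_value": accuracy},
    ]

pairs = [
    {"prediction": 1.0, "target": 1.0},
    {"prediction": 0.0, "target": 1.0},
]
print(evaluate_pairs(pairs))
# [{'metric_name': 'MSE', 'metric_value': 0.5}, {'metric_name': 'Accuracy', 'metric_value': 0.5}]
```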
---

Your server file should now look like this:

server.py
from fastapi import FastAPI, Response
from pydantic import BaseModel, validator

# import the types you need
from typing import Any, List, Optional

app = FastAPI(openapi_url="/docs")
# TODO: load your model weights here

class PredictionRequest(BaseModel):
    """
    Request serializer for input format validation.
    """

    some_feat: str
    other_feat: int
    optional_field: Optional[bool] = None  # default value

    # TODO: add validators to enforce value ranges
    @validator("some_feat")
    def in_values(cls, v):
        """
        Validates that some_feat is one of the allowed values.
        """
        if v not in ["A", "B", "C"]:
            raise ValueError('Unknown value, must be a value in ["A", "B", "C"]')
        return v


@app.post("/predict")
def predict(req: PredictionRequest):
    """
    Prediction logic.
    """

    # TODO: call preprocessing function (if it exists)

    # TODO: call model's predict function
    prediction = True  # TODO: replace

    # TODO: call postprocessing function (if it exists)

    return {"output": prediction}


@app.get("/healthz")
def healthz_func():
    """
    Health check for API server.
    """
    return Response(content="\n", status_code=200, media_type="text/plain")


class EvaluationRequestDict(BaseModel):
    prediction: Any
    target: Any

class EvaluationRequest(BaseModel):
    data: List[EvaluationRequestDict]

class EvaluationResponseDict(BaseModel):
    metric_name: str
    metric_value: Any


class EvaluationResponse(BaseModel):
    results: List[EvaluationResponseDict]

@app.post("/evaluate", response_model=EvaluationResponse)
def evaluate_func(req: EvaluationRequest):
    """
    Evaluate metrics for model performance based on predictions and ground truth.
    """

    # Evaluate model performance
    results = [
        {"metric_name": "MSE", "metric_value": "1.5"},
        {"metric_name": "Accuracy", "metric_value": "0.99"},
    ]

    return {"results": results}


Local Testing

After creating all the endpoints, test them locally before moving on to the next step. You can do that by starting your web server locally and then hitting each of the endpoints above with the expected (and unexpected) data.

Here's an example of how you can test:

  1. Run the server:
uvicorn server:app --reload
  2. In a separate command-line window, run the following command; feel free to change the data after -d to what your model expects:
curl -X POST "http://localhost:8000/predict/" -H  "accept: application/json" -H  "Content-Type: application/json" -d "{\"some_feat\":\"A\",\"other_feat\":\"1\", \"optional_field\": \"True\"}"

This command sends a POST request to the predict endpoint with the following payload, i.e., request body:

{
  "some_feat": "A",
  "other_feat": "1",
  "optional_field": "True"
}

If you passed the correct values in the request, you should see your model's expected output in the terminal:

{"output":true}

Play around with different values or invalid types to test your validation.

Now that you’re done exposing all the necessary endpoints, you're one step closer to deploying your model on Konan. Head over to the next section to containerize your app.