Prediction creation is now async
Jun 10, 2024
1 minute read
Why async
The prediction creation endpoint previously operated synchronously: the client sent a request and waited for the model to finish before receiving a response. This worked for fast models, but ML inference times vary widely depending on model size, input complexity, and current load. Long-running predictions risked HTTP timeouts, dropped connections, and blocked clients.
What changed
POST /v1/predictions now returns immediately with a prediction object in a pending state. The model runs in the background, and the prediction status updates as it progresses from pending to succeeded or failed.
# Create a prediction
curl "https://api.wrift.ai/v1/predictions" \
  -H "Authorization: Bearer $WRIFTAI_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "owner/model_name",
    "input": {
      "prompt": "Summarize quantum computing."
    }
  }'

# Response returns immediately
{
  "id": "a1b2c3d4-5678-9abc-def0-1234567890ab",
  "status": "pending",
  ...
}
Getting results
Poll the prediction by ID to check its status:
curl https://api.wrift.ai/v1/predictions/a1b2c3d4-5678-9abc-def0-1234567890ab \
  -H "Authorization: Bearer $WRIFTAI_ACCESS_TOKEN"
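If you need to block until the prediction finishes, you can poll in a loop. The following is a minimal sketch, assuming jq is available for JSON parsing and that succeeded and failed are the only terminal states; adjust the sleep interval to suit your workload.

# Poll until the prediction reaches a terminal state
PREDICTION_ID="a1b2c3d4-5678-9abc-def0-1234567890ab"

while true; do
  STATUS=$(curl -s "https://api.wrift.ai/v1/predictions/$PREDICTION_ID" \
    -H "Authorization: Bearer $WRIFTAI_ACCESS_TOKEN" | jq -r '.status')

  # Stop once the prediction is no longer pending
  if [ "$STATUS" = "succeeded" ] || [ "$STATUS" = "failed" ]; then
    echo "Prediction finished with status: $STATUS"
    break
  fi

  sleep 2  # pause between polls to avoid hammering the API
done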