Prediction creation is now async

Jun 10, 2024

Improvement

Why async

The prediction creation endpoint previously operated synchronously — the client sent a request and waited for the model to finish before receiving a response. This worked for fast models, but ML inference times vary widely depending on model size, input complexity, and current load. Long-running predictions risked HTTP timeouts, dropped connections, and blocked clients.

What changed

POST /v1/predictions now returns immediately with a prediction object in a pending state. The model runs in the background, and the prediction's status updates as it moves from pending to succeeded or failed.


# Create a prediction
curl "https://api.wrift.ai/v1/predictions" \
  -H "Authorization: Bearer $WRIFTAI_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "owner/model_name",
    "input": {
      "prompt": "Summarize quantum computing."
    }
  }'

# Response returns immediately
{
  "id": "a1b2c3d4-5678-9abc-def0-1234567890ab",
  "status": "pending",
  ...
}
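
The creation response no longer carries the model's result. Once the run finishes, fetching the prediction (see below) returns it in a terminal state. The shape here is only a sketch: apart from id and status, field names such as output are assumptions about where results land, not confirmed API fields.

# Illustrative terminal state (fields other than "id" and "status" are assumed)
{
  "id": "a1b2c3d4-5678-9abc-def0-1234567890ab",
  "status": "succeeded",
  "output": "..."
}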

Getting results

Poll the prediction by ID to check its status:


curl "https://api.wrift.ai/v1/predictions/a1b2c3d4-5678-9abc-def0-1234567890ab" \
  -H "Authorization: Bearer $WRIFTAI_ACCESS_TOKEN"

