Prediction creation is now async
Jun 10, 2024
1 minute read
Why async
The prediction creation endpoint previously operated synchronously: the client sent a request and waited for the model to finish before receiving a response. This worked for fast models, but ML inference times vary widely depending on model size, input complexity, and current load. Long-running predictions risked HTTP timeouts, dropped connections, and blocked clients.
What changed
POST /v1/predictions now returns immediately with a prediction object in a pending state. The model runs in the background, and the prediction status updates as it progresses from pending to succeeded or failed.
# Create a prediction
curl "https://api.wrift.ai/v1/predictions" \
  -H "Authorization: Bearer $WRIFTAI_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "owner/model_name",
    "input": {
      "prompt": "Summarize quantum computing."
    }
  }'

# Response returns immediately
{
  "id": "a1b2c3d4-5678-9abc-def0-1234567890ab",
  "status": "pending",
  ...
}
Getting results
Poll the prediction by ID to check its status:
curl https://api.wrift.ai/v1/predictions/a1b2c3d4-5678-9abc-def0-1234567890ab \
  -H "Authorization: Bearer $WRIFTAI_ACCESS_TOKEN"
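If you need to block until the prediction finishes, you can poll in a loop. The following is a minimal sketch, assuming jq is available for JSON parsing and that succeeded and failed are the only terminal states; adjust the sleep interval to suit your workload.

# Poll until the prediction reaches a terminal state
PREDICTION_ID="a1b2c3d4-5678-9abc-def0-1234567890ab"

while true; do
  STATUS=$(curl -s "https://api.wrift.ai/v1/predictions/$PREDICTION_ID" \
    -H "Authorization: Bearer $WRIFTAI_ACCESS_TOKEN" | jq -r '.status')

  # Stop once the prediction is no longer pending
  if [ "$STATUS" = "succeeded" ] || [ "$STATUS" = "failed" ]; then
    echo "Prediction finished with status: $STATUS"
    break
  fi

  sleep 2  # pause between polls to avoid hammering the API
done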