Evaluate and run large AI models affordably with Bytez: GPU performance at CPU pricing.
Two steps to run inference in minutes:
- Get your API Key
- Choose your path: run inference locally via Docker or use our API (JavaScript, Python, or REST)
Join the Bytez Discord or go to Bytez.com, sign in, and visit user settings to get your key.
All Bytez models are available on Docker Hub or our About page 🤙
Load and run a model after installing our Python library (`pip install bytez`).
```python
import os

from bytez import Bytez

client = Bytez(api_key=os.environ.get("YOUR_API_KEY"))

# Grab a model
model = client.model("openai-community/gpt2")

# Start a model
model.load()

# Run a model
output = model.run(
    "Once upon a time there was a",
    model_params={"max_new_tokens": 1, "min_new_tokens": 1},
)

print(output)
```
See the API Documentation for all examples.
Load and run a model after installing our TypeScript library (`npm i bytez.js`).
```typescript
import Bytez from "bytez.js";

const client = new Bytez("YOUR_API_KEY");

// Grab a model
const model = client.model("openai-community/gpt2");

// Start a model
const results = await model.load();
console.log(results);

// Run a model
const output = await model.run("Once upon a time there was a");
console.log(output);
```
See API Documentation for all examples.
Load and run a model after installing our Julia library (`add Bytez` in the Pkg REPL).
Interactive Notebook! (Coming Soon)
```julia
using Bytez

client = Bytez.init("YOUR_API_KEY")

# Grab a model
# args => modelId, concurrency = 1, timeout = 300 secs
model = client.model("openai-community/gpt2")

# Start a model
model.load()

# Run a model
output = model.run("Roses are")

println(output)
```
Bytez has a REST API for loading, running, and requesting new models.
```shell
curl --location 'https://api.bytez.com/model/load' \
--header 'Authorization: Key API_KEY' \
--header 'Content-Type: application/json' \
--data '{
    "model": "openai-community/gpt2",
    "concurrency": 1
}'
```

```shell
curl --location 'https://api.bytez.com/model/run' \
--header 'Authorization: Key API_KEY' \
--header 'Content-Type: application/json' \
--data '{
    "model": "openai-community/gpt2",
    "prompt": "Once upon a time there was a",
    "params": {
        "min_length": 30,
        "max_length": 256
    },
    "stream": true
}'
```

```shell
curl --location 'https://api.bytez.com/model/job' \
--header 'Authorization: Key API_KEY' \
--header 'Content-Type: application/json' \
--data '{
    "model": "openai-community/gpt2"
}'
```
See the API Documentation for all endpoints.
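The same REST calls can be made from any HTTP client. As a minimal sketch, here is a helper that builds the request the `/model/run` curl example sends; the endpoint path and `Authorization: Key …` header format are taken from the curl commands above, while the `build_request` helper itself is purely illustrative and not part of the Bytez SDK.

```python
import json

API_BASE = "https://api.bytez.com"

def build_request(endpoint, api_key, body):
    """Return (url, headers, JSON-encoded body) for a Bytez REST call."""
    url = f"{API_BASE}/model/{endpoint}"
    headers = {
        "Authorization": f"Key {api_key}",
        "Content-Type": "application/json",
    }
    return url, headers, json.dumps(body)

# Mirror the /model/run curl example above
url, headers, payload = build_request(
    "run",
    "API_KEY",
    {
        "model": "openai-community/gpt2",
        "prompt": "Once upon a time there was a",
        "params": {"min_length": 30, "max_length": 256},
        "stream": True,
    },
)
print(url)  # https://api.bytez.com/model/run
```

From here, any HTTP library can send the request, e.g. `requests.post(url, headers=headers, data=payload)`.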
We currently support 14K+ open source AI models across 30+ ML tasks.
Task | Total Models |
---|---|
Total Available | 14559 |
Text-generation | 5765 |
Summarization | 380 |
Unconditional-image-generation | 416 |
Text2text-generation | 393 |
Audio-classification | 390 |
Image-classification | 533 |
Zero-shot-classification | 213 |
Token-classification | 546 |
Video-classification | 419 |
Text-classification | 474 |
Fill-mask | 358 |
Text-to-image | 467 |
Depth-estimation | 53 |
Object-detection | 405 |
Sentence-similarity | 457 |
Image-segmentation | 322 |
Image-to-text | 249 |
Zero-shot-image-classification | 174 |
Translation | 592 |
Automatic-speech-recognition | 455 |
Question-answering | 563 |
Image-feature-extraction | 114 |
Visual-question-answering | 105 |
Feature-extraction | 399 |
Mask-generation | 77 |
Zero-shot-object-detection | 27 |
Text-to-video | 11 |
Text-to-speech | 173 |
Document-question-answering | 18 |
Text-to-audio | 11 |
To see the full list, run:
```python
models = client.list_models()
print(models)
```
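Since models are grouped by task, it is often useful to narrow the full list to a single task. The exact shape of the `client.list_models()` response is not shown here, so this sketch assumes a list of dicts with hypothetical `"modelId"` and `"task"` keys; adjust the key names to match the real response.

```python
def models_for_task(models, task):
    """Return the ids of models whose task label matches `task` (assumed keys)."""
    return [m["modelId"] for m in models if m["task"] == task]

# Illustrative stand-in for a list_models() response
sample = [
    {"modelId": "openai-community/gpt2", "task": "text-generation"},
    {"modelId": "facebook/bart-large-cnn", "task": "summarization"},
    {"modelId": "bigscience/bloom-560m", "task": "text-generation"},
]

print(models_for_task(sample, "text-generation"))
# ['openai-community/gpt2', 'bigscience/bloom-560m']
```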
Here are some models that can be run, along with the RAM each one requires.
Model Name | Required RAM (GB) |
---|---|
EleutherAI/gpt-neo-2.7B | 2.23 |
bigscience/bloom-560m | 3.78 |
succinctly/text2image-prompt-generator | 1.04 |
ai-forever/mGPT | 9.59 |
microsoft/phi-1 | 9.16 |
facebook/opt-1.3b | 8.06 |
openai-community/gpt2 | 0.50 |
bigscience/bloom-1b7 | 7.82 |
databricks/dolly-v2-3b | 11.09 |
tiiuae/falcon-40b-instruct | 182.21 |
tiiuae/falcon-7b-instruct | 27.28 |
codellama/CodeLlama-7b-Instruct-hf | 26.64 |
deepseek-ai/deepseek-coder-6.7b-instruct | 26.50 |
upstage/SOLAR-10.7B-Instruct-v1.0 | 57.63 |
elyza/ELYZA-japanese-Llama-2-7b-instruct | 38.24 |
NousResearch/Meta-Llama-3-8B-Instruct | 30.93 |
VAGOsolutions/SauerkrautLM-Mixtral-8x7B-Instruct | 211.17 |
codellama/CodeLlama-34b-Instruct-hf | 186.52 |
deepseek-ai/deepseek-coder-7b-instruct-v1.5 | 27.05 |
Equall/Saul-Instruct-v1 | 2.44 |
Equall/Saul-7B-Instruct-v1 | 10.20 |
microsoft/Phi-3-mini-128k-instruct | 14.66 |
microsoft/Phi-3-mini-4k-instruct | 14.65 |
victor/CodeLlama-34b-Instruct-hf | 127.37 |
gradientai/Llama-3-8B-Instruct-262k | 30.80 |
gradientai/Llama-3-8B-Instruct-Gradient-1048k | 30.59 |
yanolja/EEVE-Korean-Instruct-10.8B-v1.0 | 54.30 |
codellama/CodeLlama-13b-Instruct-hf | 50.38 |
deepseek-ai/deepseek-coder-1.3b-instruct | 6.16 |
deepseek-ai/deepseek-coder-33b-instruct | 158.74 |
filipealmeida/Mistral-7B-Instruct-v0.1-sharded | 27.42 |
unsloth/llama-3-8b-Instruct | 30.77 |
speakleash/Bielik-7B-Instruct-v0.1 | 27.52 |
Deci/DeciLM-7B-instruct | 26.90 |
tokyotech-llm/Swallow-70b-instruct-hf | 242.23 |
tokyotech-llm/Swallow-7b-NVE-instruct-hf | 26.89 |
codellama/CodeLlama-70b-Instruct-hf | 372.52 |
togethercomputer/Llama-2-7B-32K-Instruct | 25.65 |
beomi/Llama-3-Open-Ko-8B-Instruct-preview | 30.81 |
abhishekchohan/SOLAR-10.7B-Instruct-Forest-DPO-v1 | 15.38 |
deepseek-ai/deepseek-math-7b-instruct | 28.08 |
occiglot/occiglot-7b-eu5-instruct | 28.94 |
MediaTek-Research/Breeze-7B-Instruct-v1_0 | 29.84 |
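One practical use of the RAM column is picking models that fit the memory you have available. As a minimal sketch, the helper below filters a few (model, required RAM) pairs copied from the table above; the `models_within_budget` function is illustrative, not a Bytez API.

```python
# A few rows from the RAM table above (GB)
REQUIRED_RAM_GB = {
    "openai-community/gpt2": 0.50,
    "EleutherAI/gpt-neo-2.7B": 2.23,
    "bigscience/bloom-560m": 3.78,
    "facebook/opt-1.3b": 8.06,
    "tiiuae/falcon-7b-instruct": 27.28,
}

def models_within_budget(ram_table, budget_gb):
    """Return model ids whose required RAM fits under budget_gb, smallest first."""
    fits = [(ram, name) for name, ram in ram_table.items() if ram <= budget_gb]
    return [name for ram, name in sorted(fits)]

print(models_within_budget(REQUIRED_RAM_GB, 10))
# ['openai-community/gpt2', 'EleutherAI/gpt-neo-2.7B', 'bigscience/bloom-560m', 'facebook/opt-1.3b']
```

Note that running more than one concurrent instance of a model multiplies its memory footprint accordingly.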