Build Your Own AI: Access GPT, Claude & 300+ Models in Python
Introduction
Are you hemorrhaging money on multiple AI subscriptions? I was. Every month another charge would appear for ChatGPT Plus, followed by a bill for Claude Pro, and then a separate fee for Perplexity. Fortunately, I discovered a better approach – building my own CLI AI tool on top of a single unified API.
In this guide, I’ll walk you through building a powerful yet minimal Python client that lets you instantly switch between the world’s most advanced AI models:
- OpenAI’s GPT-4.5 and GPT-4o
- Anthropic’s Claude 3.7 and Claude Opus
- Perplexity’s Sonar Reasoning Pro
- And many more, all through one simple interface
As a result, you’ll gain access to premium AI capabilities without the burden of multiple subscriptions.
The best part? You’ll build this with less than 50 lines of core code, and you’ll only pay for what you actually use – no more subscriptions burning holes in your wallet. My client defaults to OpenAI’s GPT-4o-mini, which I’ve found offers an excellent balance of capability and cost, but you can switch to any model with a simple command-line flag whenever you need to.
Key Points
- Create a powerful Python client that taps into 300+ AI models, including GPT-4.5, Claude 3.7, and Perplexity
- Switch between models with a simple command-line flag without changing your code
- Save on AI subscriptions by paying only for what you use instead of monthly fees
- Implement advanced features like multimodal image processing with less than 50 lines of core code
- Run everything on a standard Debian 12 system with minimal dependencies
How Building Your AI Saves You Money
By using this OpenRouter client, you can significantly reduce your AI expenses. To illustrate: a typical user paying for ChatGPT Plus ($20), Claude Pro ($20), and Perplexity Pro ($20) spends $60 monthly regardless of actual usage. With OpenRouter, most users report spending between $5 and $15 monthly, depending on their needs – a potential saving of $45+ every month while still accessing the same powerful models.
Prerequisites
Before diving in, let’s make sure you have everything needed on your Debian 12 system:
- Python 3.9+ installed on your Debian 12 server or workstation
- An OpenRouter account and API key (sign up at openrouter.ai)
- Basic knowledge of Python and command-line operations
First, let’s set up our Debian 12 environment with all the necessary packages:
# Update your package lists
sudo apt update
# Install Python 3 and pip if not already installed
sudo apt install -y python3 python3-pip python3-venv
# Install git for version control (recommended)
sudo apt install -y git
# Install additional libraries for SSL/TLS support (prevents certificate errors)
sudo apt install -y ca-certificates libssl-dev
# Upgrade pip to the latest version
python3 -m pip install --upgrade pip
I’ve run into SSL certificate issues in the past when working with APIs on Debian, so installing the SSL libraries upfront saves headaches later. Before proceeding further, ensure your system has internet access and can reach the OpenRouter API endpoints. Moreover, if you’re behind a corporate firewall, you might need to configure proxy settings to allow this communication.
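A quick way to confirm that DNS and TLS are working end to end is to hit OpenRouter’s public model catalog, which doesn’t require an API key:
# Should print the start of a JSON document listing available models
curl -s https://openrouter.ai/api/v1/models | head -c 300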
Setting Up Your Project
Let’s create our project directory and set up a virtual environment to manage dependencies:
# Create the project directory
mkdir -p ~/openrouter-client
cd ~/openrouter-client
# Create a virtual environment
python3 -m venv venv
# Activate the virtual environment
source venv/bin/activate
# Install required packages
pip install requests python-dotenv
I always use virtual environments for my Python projects – they keep dependencies isolated and make it much easier to manage different projects on the same system. For this particular client, we only need two external packages: requests for handling API communication and python-dotenv for managing environment variables.
Once the virtual environment is set up, we can proceed with creating the project structure:
# Create the necessary files
touch .env config.py openrouter_client.py cli.py README.md
Project Structure
The project has a simple but effective organization:
openrouter-client/
│
├── .env # API keys file (don't commit this!)
├── config.py # Configurations and constants
├── openrouter_client.py # Main client
├── cli.py # Command-line interface
└── README.md # Documentation
I’m a big fan of keeping things minimal but organized. This structure separates concerns cleanly – configuration, core functionality, and user interface are all in their own files. As a result, the code becomes more maintainable and easier to extend in the future.
The Core: OpenRouter Client
Let’s implement the client that handles all communication with the OpenRouter API:
# openrouter_client.py
import requests
import json
from config import MODELS, DEFAULT_MODEL, API_URL
class OpenRouterClient:
    def __init__(self, api_key, default_model=DEFAULT_MODEL, referer=None, title=None):
        self.api_key = api_key
        self.default_model = default_model
        self.referer = referer
        self.title = title

    def ask(self, query, model=None, max_tokens=None, temperature=0.7):
        """
        Send a query to the specified model (or default)
        and return the response.

        Args:
            query (str): The question or prompt for the model
            model (str, optional): Model identifier. Defaults to self.default_model.
            max_tokens (int, optional): Maximum tokens in the response. Defaults to None.
            temperature (float, optional): Controls randomness (0-1). Defaults to 0.7.

        Returns:
            str: The model's response text
        """
        model_to_use = model if model else self.default_model

        # Verify the model exists
        if model_to_use not in MODELS:
            available = ", ".join(MODELS.keys())
            raise ValueError(f"Model '{model_to_use}' not available. Choose from: {available}")

        # Prepare headers
        headers = {
            "Authorization": f"Bearer {self.api_key}",
            "Content-Type": "application/json"
        }

        # Add optional headers if available
        if self.referer:
            headers["HTTP-Referer"] = self.referer
        if self.title:
            headers["X-Title"] = self.title

        # Prepare request body
        data = {
            "model": MODELS[model_to_use],
            "messages": [
                {
                    "role": "user",
                    "content": query
                }
            ],
            "temperature": temperature
        }

        # Add max_tokens if specified
        if max_tokens:
            data["max_tokens"] = max_tokens

        # Make API call
        response = requests.post(
            url=API_URL,
            headers=headers,
            data=json.dumps(data),
            timeout=30  # Default timeout of 30 seconds
        )

        # Handle response
        if response.status_code == 200:
            result = response.json()
            return result["choices"][0]["message"]["content"]
        else:
            error_msg = f"Error {response.status_code}: {response.text}"
            raise Exception(error_msg)
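The CLI we’ll build shortly also calls an ask_with_image helper for multimodal models. Here’s a minimal sketch of that method to add inside OpenRouterClient; it uses the OpenAI-style multimodal message format (a content array mixing text and image_url parts), which OpenRouter accepts for vision-capable models – treat it as a starting point rather than a definitive implementation:
    def ask_with_image(self, query, image_url, model=None, max_tokens=None, temperature=0.7):
        """Send a text prompt plus a publicly accessible image URL to a multimodal model."""
        model_to_use = model if model else self.default_model
        if model_to_use not in MODELS:
            available = ", ".join(MODELS.keys())
            raise ValueError(f"Model '{model_to_use}' not available. Choose from: {available}")

        headers = {
            "Authorization": f"Bearer {self.api_key}",
            "Content-Type": "application/json"
        }
        # OpenAI-style multimodal message: a list of content parts instead of a plain string
        data = {
            "model": MODELS[model_to_use],
            "messages": [{
                "role": "user",
                "content": [
                    {"type": "text", "text": query},
                    {"type": "image_url", "image_url": {"url": image_url}}
                ]
            }],
            "temperature": temperature
        }
        if max_tokens:
            data["max_tokens"] = max_tokens

        # Vision requests tend to be slower, so allow a longer timeout
        response = requests.post(url=API_URL, headers=headers, data=json.dumps(data), timeout=60)
        if response.status_code == 200:
            return response.json()["choices"][0]["message"]["content"]
        raise Exception(f"Error {response.status_code}: {response.text}")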
I’ve designed the client to be simple yet flexible. It accepts various configuration parameters and, importantly, handles errors explicitly, making debugging easier if issues arise. Furthermore, the default timeout is set to 30 seconds, which I’ve found works well for most models. However, you might want to increase this value for complex queries with premium models that require more processing time.
One thing I particularly like about OpenRouter is that it normalizes the response format across different providers. Whether you’re using OpenAI, Anthropic, or Perplexity, the response structure remains consistent. Consequently, it becomes much easier to work with multiple models without having to adapt your code for each provider’s unique response format.
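For reference, here is roughly what that normalized response looks like, trimmed to the fields the client actually reads (exact metadata varies by provider):
{
  "id": "gen-...",
  "model": "openai/gpt-4o-mini",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Rome is the capital of Italy."
      }
    }
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 9, "total_tokens": 21}
}
The client simply indexes into result["choices"][0]["message"]["content"], which works identically across providers.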
System Configuration
Now, let’s create the configuration file containing constants and available models:
# config.py
import os
from dotenv import load_dotenv
# Load environment variables from .env file
load_dotenv()
# OpenRouter API URL
API_URL = "https://openrouter.ai/api/v1/chat/completions"
# Default model
DEFAULT_MODEL = "gpt-4o-mini"
# Dictionary of available models
# Key: friendly name, Value: full API identifier
MODELS = {
    # Premium models (most expensive)
    "gpt-4.5": "openai/gpt-4.5-preview",
    "claude-3.7": "anthropic/claude-3.7-sonnet",
    "claude-opus": "anthropic/claude-3-opus",
    "gpt-4o": "openai/gpt-4o",
    "gpt-4-turbo": "openai/gpt-4-turbo",
    "o1": "openai/o1",

    # Mid-tier models
    "claude-3.5-sonnet": "anthropic/claude-3.5-sonnet",
    "sonar-deep": "perplexity/sonar-deep-research",
    "perplexity": "perplexity/sonar-reasoning-pro",
    "sonar-pro": "perplexity/sonar-pro",
    "r1-1776": "perplexity/r1-1776",
    "gpt-4o-mini": "openai/gpt-4o-mini",

    # Affordable models (most economical)
    "claude-haiku": "anthropic/claude-3.5-haiku",
    "o3-mini": "openai/o3-mini",
    "llama-70b": "perplexity/llama-3.1-sonar-70b-online",
    "llama-8b": "perplexity/llama-3.1-sonar-8b-online"
}
# Get API key from environment variables
API_KEY = os.getenv("OPENROUTER_API_KEY")
REFERER = os.getenv("SITE_URL", None)
SITE_TITLE = os.getenv("SITE_TITLE", None)
In this file, I’ve included the most popular models organized by pricing tier. I’ve set OpenAI’s GPT-4o-mini as the default model because, in my experience, it offers a great balance of capability and cost-effectiveness. You can easily expand this list by adding more entries to the MODELS dictionary as new options become available on the OpenRouter platform.
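If you’d rather not maintain this dictionary by hand, you can pull the current catalog from OpenRouter’s public /models endpoint; a small sketch (field names follow the API’s OpenAI-style schema):
# list_models.py - print every model identifier OpenRouter currently offers
import requests

resp = requests.get("https://openrouter.ai/api/v1/models", timeout=30)
resp.raise_for_status()

for model in resp.json()["data"]:
    print(model["id"])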
One tip I’ve discovered through extensive testing: For reasoning-heavy tasks (like math problems or logical analysis), I tend to switch to models like GPT-4.5 or Claude 3.7. In contrast, for creative writing, I prefer Claude Opus which excels at generating nuanced text. Meanwhile, for daily information queries, Perplexity’s models with their online search capabilities are incredibly useful.
Building the Command-Line Interface for Your AI
To make the client convenient to use, let’s wrap it in a command-line interface:
# cli.py
import argparse
import sys

from config import API_KEY, MODELS, DEFAULT_MODEL, REFERER, SITE_TITLE
from openrouter_client import OpenRouterClient

def main():
    parser = argparse.ArgumentParser(description="OpenRouter client for accessing multiple AI models")
    parser.add_argument("query", nargs="?", help="The question to ask the model")
    parser.add_argument("--model", "-m", choices=MODELS.keys(), default=DEFAULT_MODEL,
                        help=f"The model to use (default: {DEFAULT_MODEL})")
    parser.add_argument("--interactive", "-i", action="store_true",
                        help="Interactive mode for continuous conversations")
    parser.add_argument("--temperature", "-t", type=float, default=0.7,
                        help="Temperature (0.0-1.0) controlling randomness (default: 0.7)")
    parser.add_argument("--max-tokens", type=int,
                        help="Maximum number of tokens in the response")
    parser.add_argument("--image", type=str,
                        help="URL to an image (for multimodal models)")
    parser.add_argument("--list-models", "-l", action="store_true",
                        help="List all available models grouped by pricing tier")
    parser.add_argument("--save", "-s", type=str, metavar="FILENAME",
                        help="Save the response to a file")
    args = parser.parse_args()

    # Verify that the API key is configured
    if not API_KEY:
        print("Error: OPENROUTER_API_KEY not configured in .env file")
        print("Create a .env file with the content: OPENROUTER_API_KEY=your_api_key")
        sys.exit(1)

    # Initialize client
    client = OpenRouterClient(API_KEY, args.model, REFERER, SITE_TITLE)

    # List all available models if requested
    if args.list_models:
        print("Available models:")
        # Group models by pricing tier
        model_tiers = {
            "Premium (High Cost)": ["gpt-4.5", "claude-3.7", "claude-opus", "o1", "gpt-4o", "gpt-4-turbo"],
            "Mid-tier": ["claude-3.5-sonnet", "perplexity", "sonar-pro", "sonar-deep", "r1-1776", "gpt-4o-mini"],
            "Economy (Low Cost)": ["claude-haiku", "o3-mini", "llama-70b", "llama-8b"]
        }
        for tier, models in model_tiers.items():
            print(f"\n{tier}:")
            for model in models:
                if model in MODELS:
                    if model == DEFAULT_MODEL:
                        print(f"  {model} (default) -> {MODELS[model]}")
                    else:
                        print(f"  {model} -> {MODELS[model]}")
        print("\nUsage example:")
        print('  python cli.py "What is quantum computing?" --model claude-3.7')
        sys.exit(0)

    if args.interactive:
        print(f"Interactive mode with {args.model} (temp: {args.temperature}). Press Ctrl+C to exit.")
        try:
            while True:
                query = input("\nQuestion: ")
                if not query.strip():
                    continue
                try:
                    if args.image:
                        print(f"Using image: {args.image}")
                        response = client.ask_with_image(query, args.image, args.model)
                    else:
                        response = client.ask(query, args.model, args.max_tokens, args.temperature)
                    print(f"\nResponse ({args.model}):\n{response}")

                    # Save response if requested
                    if args.save:
                        with open(args.save, 'a') as f:
                            f.write(f"Q: {query}\n\nA: {response}\n\n{'=' * 50}\n\n")
                        print(f"Response saved to {args.save}")
                except Exception as e:
                    print(f"Error: {e}")
        except KeyboardInterrupt:
            print("\nGoodbye!")
    elif args.query:
        try:
            if args.image:
                print(f"Using image: {args.image}")
                response = client.ask_with_image(args.query, args.image, args.model)
            else:
                response = client.ask(args.query, args.model, args.max_tokens, args.temperature)
            print(response)

            # Save response if requested
            if args.save:
                with open(args.save, 'a') as f:
                    f.write(f"Q: {args.query}\n\nA: {response}\n\n{'=' * 50}\n\n")
                print(f"Response saved to {args.save}")
        except Exception as e:
            print(f"Error: {e}")
    else:
        parser.print_help()

if __name__ == "__main__":
    main()
I’ve added several quality-of-life features to this interface. First, it supports both single-question use and an interactive mode for continuous conversations. Second, the --save option automatically logs your interactions to a file – a feature I’ve found incredibly useful for research and documentation.
The --list-models flag displays the available models grouped by pricing tier, making it easier to choose the right model for your budget and needs. I’ve also included support for temperature adjustment, which controls how creative or deterministic the responses are – useful when you need to balance innovative output against predictable output.
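In practice, I raise the temperature for brainstorming and lower it for factual lookups; for example:
# Looser, more varied output for creative work
python cli.py "Brainstorm five taglines for a coffee shop" --temperature 0.9
# Tighter, more deterministic output for factual questions
python cli.py "Convert 72 degrees Fahrenheit to Celsius" --temperature 0.1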
Environment File for Securing Your API Credentials
Don’t forget to create a .env file to securely store your API key:
# .env
OPENROUTER_API_KEY=your_openrouter_api_key
SITE_URL=https://your-site.com # Optional
SITE_TITLE=Your Project # Optional
You can create this file with:
cat > .env << EOF
OPENROUTER_API_KEY=your_openrouter_api_key
SITE_URL=https://your-site.com
SITE_TITLE=Your Project
EOF
I’ve learned the hard way never to hardcode API keys into your scripts. Instead, using environment variables with python-dotenv keeps your credentials secure. Moreover, this approach makes it easier to manage different environments (development, production, etc.) without modifying your core code.
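On the same theme, make sure .env never lands in version control. If you initialized the project with git, one line takes care of it:
# Keep secrets out of the repository
echo ".env" >> .gitignore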
How to Use Your CLI AI Tool
Now for the fun part! Here are some examples of how to use the command-line client:
# Use the default model (gpt-4o-mini)
python cli.py "What is the capital of Italy?"
# Specify a different model
python cli.py "Write a haiku about autumn" --model claude-3.7
# Use interactive mode with a more affordable model
python cli.py --interactive --model llama-8b
# Use a premium model for complex reasoning tasks
python cli.py "Explain quantum computing principles" --model gpt-4.5
# Use a search-augmented model for current information
python cli.py "latest news about tech" --model sonar-deep
# Use a multimodal-capable model with an image (the --image flag triggers the image path)
python cli.py "What's in this image?" --image https://example.com/image.jpg --model gpt-4o
I’ve been using this client daily for a few weeks now and have noticed some patterns in model performance. For creative writing, Claude models generally excel thanks to their nuanced use of language. For code generation, GPT-4.5 has become my go-to because of its precision and technical understanding. And for up-to-date information, Perplexity’s search-connected models are unbeatable.
Model Selection Guide
Our client includes models across different price points and capabilities:
Premium Tier (High Cost, Maximum Capability)
- gpt-4.5: OpenAI’s latest research preview with advanced reasoning
- claude-3.7: Anthropic’s advanced model with reasoning capabilities
- claude-opus: Anthropic’s most powerful model for complex tasks
- o1: OpenAI’s model optimized for STEM reasoning
Mid-Tier (Balanced Cost/Performance)
- claude-3.5-sonnet: Good all-around performer with reasonable pricing
- perplexity: Great reasoning with online search
- gpt-4o-mini: Affordable multimodal model from OpenAI
Economy Tier (Cost-Effective)
- claude-haiku: Fast responses at lower cost
- o3-mini: OpenAI’s affordable STEM reasoning model
- llama-8b: Very economical for basic tasks
I’ve found that for daily questions, using the mid-tier or economy models saves a significant amount of money without sacrificing much in quality. I typically reserve the premium models for truly complex tasks or when I need the highest quality output.
Creating a System-Wide Command for Easy Access
To make the client easily usable from any location in the system, we can create a system-wide command:
# Create a bash script wrapper called "assistant"
# (use tee rather than "sudo cat >" so the redirection also runs with root privileges)
sudo tee /usr/local/bin/assistant > /dev/null << 'EOF'
#!/bin/bash
# Path to the Python virtual environment
VENV_PATH="$HOME/openrouter-client/venv"
# Path to the client script
CLIENT_PATH="$HOME/openrouter-client/cli.py"
# Activate virtual environment and run the command
source "$VENV_PATH/bin/activate"
python "$CLIENT_PATH" "$@"
deactivate
EOF
# Make the script executable
sudo chmod +x /usr/local/bin/assistant
This is one of my favorite productivity hacks – creating a system-wide command that can be called from anywhere. Now you can use the assistant command from any directory:
# Simple question with the default model
assistant "What is the capital of Italy?"
# Creative response from a specific model
assistant "Write a haiku about autumn" --model claude-3.7
# Start an interactive session
assistant --interactive --model perplexity
I use this constantly throughout my workday – it’s incredibly convenient to have AI assistance just a terminal command away.
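Because assistant is an ordinary shell command, it also composes with pipes and command substitution (the file names here are hypothetical):
# Summarize a local file
assistant "Summarize the following notes: $(cat meeting-notes.txt)"
# Capture a response in a variable for use in a script
answer=$(assistant "Give me a one-line description of Debian 12")
echo "$answer"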
Extending Your CLI AI Tool
Adding a Simple Web Interface
For those who prefer a web interface, here’s a quick Flask server implementation:
# web_interface.py
from flask import Flask, request, render_template, jsonify
from openrouter_client import OpenRouterClient
from config import API_KEY, MODELS, DEFAULT_MODEL
app = Flask(__name__)
client = OpenRouterClient(API_KEY)

@app.route('/')
def index():
    return render_template('index.html', models=MODELS.keys())

@app.route('/ask', methods=['POST'])
def ask():
    data = request.json
    query = data.get('query')
    model = data.get('model', DEFAULT_MODEL)
    try:
        response = client.ask(query, model)
        return jsonify({'response': response})
    except Exception as e:
        return jsonify({'error': str(e)}), 400

if __name__ == '__main__':
    app.run(debug=True)
To set this up, you would need additional dependencies:
pip install flask
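One note: render_template expects a templates/index.html next to web_interface.py (Flask looks in a templates/ folder by default), so create at least a placeholder page there. In the meantime, you can exercise the JSON endpoint directly:
# Ask a question through the Flask endpoint (Flask's dev server defaults to port 5000)
curl -s -X POST http://127.0.0.1:5000/ask \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the capital of Italy?", "model": "gpt-4o-mini"}'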
I’ve found that having both CLI and web interfaces provides maximum flexibility – the CLI for quick questions and the web interface for longer sessions or when sharing with team members.
Running Your CLI AI as a Service
For those wanting to run this as a service on a Debian 12 server, here’s how to set it up using systemd:
# Create a systemd service file
sudo nano /etc/systemd/system/openrouter-client.service
Add the following content:
[Unit]
Description=OpenRouter Python Client for CLI AI
After=network.target
[Service]
User=your_username
WorkingDirectory=/home/your_username/openrouter-client
ExecStart=/home/your_username/openrouter-client/venv/bin/python /home/your_username/openrouter-client/web_interface.py
Restart=on-failure
Environment=PYTHONUNBUFFERED=1
[Install]
WantedBy=multi-user.target
Replace your_username
with your actual username. Then:
# Enable and start the service
sudo systemctl enable openrouter-client.service
sudo systemctl start openrouter-client.service
# Check the status
sudo systemctl status openrouter-client.service
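If something goes wrong, the service logs are one journalctl call away:
# Follow the service logs in real time
sudo journalctl -u openrouter-client.service -f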
I’ve set up this service on my home server, and it’s been running flawlessly for weeks. The automatic restart feature ensures it recovers from any temporary failures.
Troubleshooting
Over the weeks I’ve been using this client, I’ve encountered a few common issues. Here are their solutions:
- Authentication Error: Verify that your API key is correct and that there are no extra spaces in the .env file. I once spent hours debugging what turned out to be an extra newline at the end of my API key. You can sanity-check the key with the curl call just below.
- Model Unavailable: Check the list of available models on OpenRouter – they sometimes update model names or availability. The OpenRouter status page is a good resource to check whether a model is temporarily down.
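For the authentication case, OpenRouter exposes a key-info endpoint (path per OpenRouter’s docs at the time of writing; verify against the current documentation):
# Returns metadata about the key if it is valid; a 401 means the key is bad
curl -s https://openrouter.ai/api/v1/auth/key \
  -H "Authorization: Bearer $OPENROUTER_API_KEY"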
- Quota Limitations: OpenRouter has different pricing policies for various models. If you hit a quota limit, the error message can sometimes be cryptic. Check your usage in the control panel.
- Response Timeouts: For complex queries, some models might take longer to respond. I’ve noticed the premium models (GPT-4.5, Claude 3.7, etc.) often require more time, especially for reasoning tasks. Consider increasing the timeout:
response = requests.post(
    url=API_URL,
    headers=headers,
    data=json.dumps(data),
    timeout=60  # Increase timeout to 60 seconds
)
- SSL Certificate Errors: On some Debian installations, you might need to update your certificates:
sudo apt install -y ca-certificates
- Rate Limiting: Some models like GPT-4.5 and O1 have strict rate limits. If you encounter rate limit errors, try implementing exponential backoff:
import time
from requests.exceptions import RequestException

max_retries = 3
backoff_factor = 2

for attempt in range(max_retries):
    try:
        response = requests.post(
            url=API_URL,
            headers=headers,
            data=json.dumps(data),
            timeout=30
        )
        response.raise_for_status()
        break
    except RequestException as e:
        if attempt == max_retries - 1:
            raise
        wait_time = backoff_factor ** attempt
        print(f"Request failed, retrying in {wait_time} seconds...")
        time.sleep(wait_time)
- Image Processing Errors: When using multimodal models with images, ensure the image URL is publicly accessible. I’ve found that some models like GPT-4o have restrictions on image file sizes and formats.
- Model-Specific Parameters: Some models accept extra parameters. For example, OpenAI’s o-series reasoning models support a reasoning_effort setting for harder problems. At the time of writing, I pass it as a top-level field in the request body, which OpenRouter forwards to the provider – check the current OpenRouter docs if a model rejects it:
data = {
    "model": MODELS[model_to_use],
    "messages": [...],
    "temperature": temperature,
    "reasoning_effort": "high"  # supported by OpenAI's o-series reasoning models
}
Smart Model Selection
Based on my testing, here are my personal recommendations:
- For complex reasoning tasks: gpt-4.5, claude-3.7, or o1
- For creative writing: claude-3.7 and claude-opus excel at this
- For image understanding: use multimodal models like gpt-4o or claude-3.7
- For code generation: claude-3.7 and claude-3.5-sonnet are particularly strong
- For research tasks: perplexity and sonar-pro offer internet-enhanced responses
- For budget-conscious applications: llama-8b, o3-mini, or claude-haiku
I’ve seen pricing fluctuate somewhat, so always check the current rates on the OpenRouter website before making heavy use of a particular model.
Conclusion: The Power of Building Your Own CLI AI
After weeks of daily use, I can confidently say that this minimalist OpenRouter client has completely transformed my AI workflow. I no longer need separate subscriptions to ChatGPT Plus, Claude Pro, and Perplexity Pro – I can access all these models (and hundreds more) through a single interface, paying only for what I actually use.
The power of this approach lies in its flexibility and simplicity:
- A common interface for all AI models
- Easy model switching based on specific needs
- Cost control by selecting appropriate models for different tasks
- Extensibility with custom parameters, image support, and more
Whether you’re building a personal assistant, a research tool, or incorporating AI into your business applications, this client provides a solid foundation. The code is minimal, but the capabilities are vast.
I encourage you to implement this client and experiment with different models to find the right balance of capability and cost for your specific use case. And remember – with OpenRouter, you’re never locked into a single model or provider.
Next Steps to Enhance Your AI
If you’re looking to extend this project further, here are some ideas I’m considering for my own implementation:
- Adding support for conversation history (storing context across multiple queries – see the sketch after this list)
- Implementing real-time response streaming for more interactive experiences
- Creating a graphical interface with Tkinter or PyQt
- Integrating the client into a Discord or Telegram bot
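As a taste of the first idea, here’s a minimal sketch of conversation history: keep a running messages list and send the whole list on each turn. It assumes a small variant of the client – the ask_messages method shown here is hypothetical, not part of the code above:
# chat_session.py - minimal conversation-history sketch
history = []

def chat(client, user_text, model=None):
    """Append the user turn, send the full history, and store the reply."""
    history.append({"role": "user", "content": user_text})
    # Assumes OpenRouterClient grew an ask_messages(messages, model=...) method
    # that posts {"messages": history} instead of wrapping a single prompt.
    reply = client.ask_messages(history, model=model)
    history.append({"role": "assistant", "content": reply})
    return reply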
By building your own CLI AI tool that leverages OpenRouter’s API, you’ll not only save money but also gain more flexibility and control over your AI interactions.