Use Models Directly

This example shows how to use Athena’s language models directly when building with our Python SDK.

  • Use any model available in your workspace
  • Fully compatible with LangChain, a popular open-source library for LLM apps
  • Get started with athena.llm.invoke("Hello!")
  • Process many prompts in parallel at high volume with llm.batch()

1. Install Package

$ pip install athena-intelligence

2. Set Up Client

from athena.client import Athena

# Initialize client
athena = Athena(api_key="<YOUR_API_KEY>")

# Access models interface
llm = athena.llm
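
In practice you may prefer to load the key from the environment rather than hard-coding it. A minimal sketch; the ATHENA_API_KEY variable name is illustrative, not an SDK convention:

import os

from athena.client import Athena

# ATHENA_API_KEY is an illustrative name for wherever you store the key
athena = Athena(api_key=os.environ["ATHENA_API_KEY"])
llm = athena.llm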

3. Basic Usage

# Simple synchronous invocation
response = llm.invoke("What is the capital of France?")
print(response.content)

# Use a list of message objects
from langchain_core.messages import HumanMessage

message_response = llm.invoke([HumanMessage(content="Hello, how are you today?")])
print(message_response.content)
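
Since the interface is LangChain-compatible (see the feature list above), it should also compose into LangChain Expression Language chains. A sketch, assuming llm behaves as a standard LangChain Runnable:

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Pipe a prompt template into the Athena model, then parse the reply to a string
prompt = ChatPromptTemplate.from_template("Summarize in one sentence: {topic}")
chain = prompt | llm | StrOutputParser()

print(chain.invoke({"topic": "the history of the telescope"}))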

4. Available Models

Specify models explicitly using with_config(). The default model is Claude 3.7 Sonnet.

Available models include:

  • claude_3_7_sonnet: Claude 3.7 Sonnet (default)
  • openai_gpt_4_5: OpenAI GPT-4.5 Preview
  • openai_gpt_4: OpenAI GPT-4
  • openai_gpt_4_turbo: OpenAI GPT-4 Turbo
  • openai_gpt_4_turbo_preview: OpenAI GPT-4 Turbo Preview
  • openai_gpt_4o: OpenAI GPT-4o
  • openai_gpt_4o_mini: OpenAI GPT-4o Mini
  • fireworks_llama_3p1_70b: Llama 3.1 70B (Fireworks)
  • fireworks_llama_3p1_405b: Llama 3.1 405B (Fireworks)
  • fireworks_function_v2: Fireworks Function v2

# Specify a model
claude = llm.with_config(configurable={"model": "claude_3_7_sonnet"})
response = claude.invoke("Who are you?")
print(response.content)

# Use another model
gpt4 = llm.with_config(configurable={"model": "openai_gpt_4o"})
response = gpt4.invoke("Explain quantum computing briefly")
print(response.content)
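
Because with_config() returns a new configured runnable without mutating llm, you can build several model handles side by side. A small sketch comparing one prompt across models from the list above:

# Compare the same prompt across several models from the list above
for name in ["claude_3_7_sonnet", "openai_gpt_4o", "openai_gpt_4o_mini"]:
    model = llm.with_config(configurable={"model": name})
    response = model.invoke("In one sentence, what is entropy?")
    print(f"{name}: {response.content}")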

5. Batch Processing

Process multiple prompts in parallel:

# Multiple inputs in one request
prompts = [
    "Explain the theory of relativity",
    "What is machine learning?",
    "How does photosynthesis work?",
    "Describe the water cycle"
]

batch_results = llm.batch(prompts)

for i, result in enumerate(batch_results):
    print(f"Response {i+1}:")
    print(result.content)
    print("-" * 40)
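
If llm.batch() follows LangChain's Runnable semantics, parallelism can also be capped through the config argument, which helps when rate limits are a concern:

# max_concurrency is a standard LangChain batch option; assumed to apply here
batch_results = llm.batch(prompts, config={"max_concurrency": 2})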

6. Streaming

import asyncio

async def stream_example():
    prompt = "Write a short story about a robot learning to paint"

    print("Streaming response:")
    print("-" * 40)

    # Stream events as they arrive
    async for event in llm.astream_events(prompt):
        data = event["data"]
        if "chunk" in data:
            # Print each chunk as it arrives
            print(data["chunk"].content, end="", flush=True)
        elif "output" in data:
            # Final output is available once streaming finishes
            print("\n" + "-" * 40)
            print("Complete!")

    print("\nStreaming complete!")

# Run the async function
asyncio.run(stream_example())
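
For chunk-only streaming without the event envelope, a LangChain-style Runnable also exposes astream(). A sketch, assuming the Athena llm interface mirrors that protocol:

import asyncio

async def simple_stream():
    # astream() yields message chunks directly (assumes the LangChain
    # Runnable protocol applies to the Athena llm interface)
    async for chunk in llm.astream("Name three uses for a paperclip"):
        print(chunk.content, end="", flush=True)
    print()

asyncio.run(simple_stream())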