AI21 Studio Tutorial: Mastering the Chat Completions API

Welcome to this advanced tutorial on AI21 Studio! This notebook demonstrates how to leverage the unified Chat Completions API for a wide range of tasks. Rather than calling a separate API for each job, we'll show how to instruct a single model, jamba-mini, to perform text generation, paraphrasing, summarization, and contextual Q&A through carefully crafted prompts.

1. Setup and Installation

First, let's install the necessary libraries. We'll use python-dotenv to load your API key from a .env file into the environment; the AI21Client then picks it up automatically from the AI21_API_KEY environment variable.

#!pip install -q ai21 python-dotenv
import os
from ai21 import AI21Client
from ai21.models.chat import ChatMessage
from dotenv import load_dotenv

# Load environment variables from a .env file
load_dotenv()

# The client automatically uses the AI21_API_KEY environment variable
client = AI21Client()
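For reference, the .env file in your project root only needs a single line. The key value below is a placeholder; substitute your own key from the AI21 Studio dashboard:

```shell
# .env — loaded by load_dotenv(); keep this file out of version control
AI21_API_KEY=your-api-key-here
```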

2. General Text Generation

For general text generation, we provide the model with a user message containing our instruction. The model then generates a response in the role of an 'assistant'.

def generate_text_with_chat(user_prompt):
    """Generates text using the Chat Completions API."""
    try:
        messages = [ChatMessage(role="user", content=user_prompt)]
        
        response = client.chat.completions.create(
            model="jamba-mini",
            messages=messages,
            max_tokens=150
        )
        
        return response.choices[0].message.content
    except Exception as e:
        return f"An error occurred: {e}"

# Example usage
prompt = "Write a short, optimistic paragraph about the future of renewable energy."
generated_text = generate_text_with_chat(prompt)
print(f"Generated Text:\n{generated_text}")
Generated Text:
The future of renewable energy is brighter than ever, with groundbreaking innovations and growing global investments driving its rapid expansion. Solar, wind, and battery technologies are becoming more efficient and cost-effective, making clean energy accessible to communities worldwide. As nations prioritize sustainability, renewable energy will play a pivotal role in combating climate change, creating jobs, and fostering energy independence. The transition to renewables is not just an environmental necessity but a powerful opportunity to build a cleaner, healthier, and more equitable future for generations to come.

3. Paraphrasing as a Chat Task

To paraphrase text, we instruct the model by assigning it a specific role using a system message. The system message guides the model's behavior, telling it to act as an expert editor. The user message then provides the text to be paraphrased.

def paraphrase_text_with_chat(text_to_paraphrase):
    """Paraphrases text using an instruction-based chat call."""
    try:
        system_prompt = "You are an expert editor. Your task is to paraphrase the given text, ensuring the original meaning is perfectly preserved but the wording is different."
        user_prompt = f"Please paraphrase the following text: \"{text_to_paraphrase}\""
        
        messages = [
            ChatMessage(role="system", content=system_prompt),
            ChatMessage(role="user", content=user_prompt)
        ]
        
        response = client.chat.completions.create(
            model="jamba-mini",
            messages=messages,
            max_tokens=100
        )
        
        return response.choices[0].message.content
    except Exception as e:
        return f"An error occurred: {e}"

# Example usage
original_text = "The quick brown fox jumps over the lazy dog."
paraphrased_text = paraphrase_text_with_chat(original_text)
print(f"Original Text: {original_text}")
print(f"Paraphrased Text: {paraphrased_text}")
Original Text: The quick brown fox jumps over the lazy dog.
Paraphrased Text: The fast-moving brown fox leaps gracefully over the sluggish dog.

4. Summarization as a Chat Task

Similarly, we can perform summarization by defining the model's role in a system message and then passing the long text in the user message.

def summarize_text_with_chat(long_text):
    """Summarizes text using an instruction-based chat call."""
    try:
        system_prompt = "You are a helpful assistant that summarizes long texts into a single, concise paragraph."
        user_prompt = f"Please summarize the following text: \n\n{long_text}"
        
        messages = [
            ChatMessage(role="system", content=system_prompt),
            ChatMessage(role="user", content=user_prompt)
        ]
        
        response = client.chat.completions.create(
            model="jamba-mini",
            messages=messages,
            max_tokens=200
        )
        
        return response.choices[0].message.content
    except Exception as e:
        return f"An error occurred: {e}"

# Example usage
text_to_summarize = """AI21 Labs is an AI company specializing in Natural Language Processing. Founded in 2017, its goal is to build AI systems with an unprecedented capacity to understand and generate natural language. They have developed a family of large language models, including the Jurassic and Jamba series, which are among the largest and most sophisticated in the world. These models power applications for text generation, summarization, and paraphrasing."""
summary = summarize_text_with_chat(text_to_summarize)
print(f"Summary:\n{summary}")
Summary:
AI21 Labs, founded in 2017, is an AI company specializing in Natural Language Processing, aiming to build AI systems with advanced natural language understanding and generation capabilities. They have developed a family of large language models, including the Jurassic and Jamba series, which are among the largest and most sophisticated in the world, powering applications for text generation, summarization, and paraphrasing.

5. Contextual Answering as a Chat Task

Finally, we can build a Q&A system by providing the context and the question within the user prompt. The system prompt instructs the model to use only the provided context for its answer.

def get_contextual_answer_with_chat(context, question):
    """Gets a contextual answer using an instruction-based chat call."""
    try:
        system_prompt = "You are a question-answering assistant. You must answer the user's question based ONLY on the provided context. If the answer is not in the context, say 'The answer is not available in the provided text.'"
        user_prompt = f"Context: {context}\n\nQuestion: {question}"
        
        messages = [
            ChatMessage(role="system", content=system_prompt),
            ChatMessage(role="user", content=user_prompt)
        ]
        
        response = client.chat.completions.create(
            model="jamba-mini",
            messages=messages,
            max_tokens=100
        )
        
        return response.choices[0].message.content
    except Exception as e:
        return f"An error occurred: {e}"

# Example usage
qa_context = "The capital of France is Paris. Paris is known for its art, fashion, and culture."
qa_question = "What is the primary industry in Paris?"
answer = get_contextual_answer_with_chat(qa_context, qa_question)
print(f"Context: {qa_context}")
print(f"Question: {qa_question}")
print(f"Answer: {answer}")
Context: The capital of France is Paris. Paris is known for its art, fashion, and culture.
Question: What is the primary industry in Paris?
Answer: The answer is not available in the provided text.

6. Conclusion & Next Steps

In this notebook you saw how a single unified Chat Completions interface (with the jamba-mini model) can power multiple classic NLP capabilities simply by varying the conversation messages:

What we built

  • General text generation with only a user message
  • Paraphrasing by assigning an explicit editing role via a system message
  • Summarization by constraining style and compression in the system prompt
  • Context‑bound Q&A that refuses to hallucinate beyond supplied context

Next Steps

  • Wrap these functions in a lightweight FastAPI or Flask service.
  • Create a front‑end (Streamlit / Gradio) for interactive experimentation.
  • Expand to multi-turn conversations by appending prior assistant and user messages.
  • Explore larger AI21 models for higher quality where latency/token cost trade‑offs make sense.
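To illustrate the multi-turn next step, here is a minimal sketch of how conversation history accumulates. It uses a stand-in dataclass with the same role/content shape as the SDK's ChatMessage so the example runs without an API key; in practice you would import ChatMessage from ai21.models.chat and pass the full history as `messages` on each client.chat.completions.create(...) call.

```python
from dataclasses import dataclass

# Stand-in for ai21.models.chat.ChatMessage (same role/content shape),
# so this sketch runs without the SDK or an API key.
@dataclass
class ChatMessage:
    role: str
    content: str

def append_turn(history, user_text, assistant_text):
    """Append one completed user/assistant exchange to the running history."""
    history.append(ChatMessage(role="user", content=user_text))
    history.append(ChatMessage(role="assistant", content=assistant_text))
    return history

# Start from a system message, then accumulate turns. On each new user
# message, send the entire list as `messages` so the model sees context.
history = [ChatMessage(role="system", content="You are a concise assistant.")]
append_turn(history, "What is the capital of France?", "Paris.")
history.append(ChatMessage(role="user", content="What is it known for?"))
print([m.role for m in history])  # ['system', 'user', 'assistant', 'user']
```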

With careful prompt design and a thin layer of helper functions, you can cover a large surface area of language tasks using one consistent chat endpoint. Happy building! 🚀