Jun 23, 2026
As artificial intelligence continues to redefine user interactions, conversational interfaces have evolved from static response pages into real-time, interactive experiences. Modern users expect immediate feedback, watching answers generate word-by-word just like in ChatGPT, Claude, or Gemini. For developers building on Telegram, achieving this smooth token-by-token streaming has historically been a significant challenge. Traditional workarounds relying on continuous message editing caused severe screen flickering, cluttered chat histories, and quickly triggered restrictive 429 Too Many Requests rate limits. Fortunately, the release of the native sendMessageDraft method in the Telegram Bot API (finalized in Bot API 9.5 and expanded in 10.1) has completely solved this problem. This comprehensive guide will walk you through the step-by-step process of implementing native, flicker-free AI streaming in your Telegram bots using Python.
To stream AI responses natively in a Telegram bot, use the new sendMessageDraft API method to push incremental token updates to an ephemeral draft bubble in the user's chat. This prevents API rate limits and screen flickering. Once the AI finishes generating, finalize the message by calling the standard sendMessage method to save the final text permanently to the chat history. For highly rated developer templates and curated bot channels, visit Telekit's Science & Technology Catalog.
Prior to the introduction of the sendMessageDraft method, developers wanting to stream text from an LLM had to write complex loops that repeatedly called editMessageText. While this approach worked for simple single-user bots, it failed spectacularly in production under concurrent loads. Here is why the old method is obsolete:
Each call to editMessageText triggers a UI update in the Telegram client. This causes the bubble to shift, flash, and flicker, creating a poor user experience.

The new sendMessageDraft method introduces a dedicated, ephemeral draft state that accepts high-frequency partial updates to a temporary bubble. Because Telegram optimizes the transmission of drafts, these updates do not count against normal message editing rate limits and animate on the client without flickering.
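To make the rate-limit pressure concrete, here is a rough back-of-the-envelope sketch. The function name `calls_per_reply` and the figures (a 500-token reply generated at 50 tokens per second) are illustrative assumptions, not measurements or documented Telegram limits. It compares one API call per token with calls throttled to a minimum interval.

```python
import math
from typing import Optional


def calls_per_reply(tokens: int, tokens_per_second: int,
                    min_interval_ms: Optional[int] = None) -> int:
    """Estimate how many update calls one streamed reply produces."""
    if min_interval_ms is None:
        # Naive approach: one call for every generated token
        return tokens
    duration_ms = tokens * 1000 / tokens_per_second
    # Throttled approach: at most one call per interval, never more than one per token
    return min(tokens, math.ceil(duration_ms / min_interval_ms))


print(calls_per_reply(500, 50))       # 500 calls in about 10 seconds
print(calls_per_reply(500, 50, 400))  # 25 calls over the same 10 seconds
```

Under these assumed numbers, updating on every token means roughly 50 requests per second for a single user, which is why per-token editing runs into 429 responses so quickly.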
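Before coding your streaming bot, make sure you have a working Python environment, a Telegram bot token, and a Gemini API key. The code below reads the token and key from the TELEGRAM_BOT_TOKEN and GEMINI_API_KEY environment variables.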
Let's set up the workspace, install the required packages, and implement the code step-by-step.
First, install the necessary Python libraries using your terminal. We will install aiogram for the Telegram Bot API and google-generativeai to demonstrate a real Gemini API stream integration.
```bash
# Install the latest aiogram and Google Gemini AI SDK
pip install -U aiogram google-generativeai
```
Create a file named bot.py and add the following implementation. The code uses aiogram to receive user messages, requests a streaming response from the Gemini API, and streams the tokens with sendMessageDraft before finalizing the reply with sendMessage.
```python
import asyncio
import os
import random

import google.generativeai as genai
from aiogram import Bot, Dispatcher, F, html
from aiogram.filters import CommandStart
from aiogram.methods import SendMessage, SendMessageDraft
from aiogram.types import Message

# Configure API keys (load from environment variables in production)
TELEGRAM_BOT_TOKEN = os.getenv("TELEGRAM_BOT_TOKEN", "YOUR_BOT_TOKEN_HERE")
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY", "YOUR_GEMINI_API_KEY_HERE")

# Initialize Gemini client
genai.configure(api_key=GEMINI_API_KEY)
model = genai.GenerativeModel("gemini-1.5-flash")

# Initialize bot and dispatcher
bot = Bot(token=TELEGRAM_BOT_TOKEN)
dp = Dispatcher()


@dp.message(CommandStart())
async def command_start_handler(message: Message) -> None:
    # Send start message; parse_mode="HTML" is required for html.bold() to render
    await message.answer(
        f"Hello, {html.bold(message.from_user.full_name)}! "
        "Ask me anything, and I will stream the answer in real-time.",
        parse_mode="HTML",
    )


@dp.message(F.text)
async def chat_handler(message: Message) -> None:
    # Handle incoming text messages and stream responses
    user_prompt = message.text
    chat_id = message.chat.id

    # Generate a unique draft_id for this specific stream. The Bot API documents
    # draft_id as a non-zero integer, so a random positive int replaces the UUID string.
    draft_id = random.randint(1, 2**31 - 1)

    # Send an initial typing indicator
    await bot.send_chat_action(chat_id=chat_id, action="typing")

    try:
        # Request a streaming response from Gemini without blocking the event loop
        response_stream = await model.generate_content_async(user_prompt, stream=True)

        current_text = ""
        update_interval = 0.4  # Throttle draft updates to one every 400 ms
        loop = asyncio.get_running_loop()
        last_update_time = loop.time()

        async for chunk in response_stream:
            current_text += chunk.text
            now = loop.time()
            # Send a draft update only once the throttle interval has elapsed
            if now - last_update_time >= update_interval:
                await bot(SendMessageDraft(
                    chat_id=chat_id,
                    draft_id=draft_id,
                    text=current_text + " ▌",  # Visual cursor indicator
                    parse_mode="Markdown",
                ))
                last_update_time = now

        # Stream has completed. Finalize the message to save it permanently in the chat history.
        await bot(SendMessage(
            chat_id=chat_id,
            text=current_text,
            parse_mode="Markdown",
        ))
    except Exception as e:
        # Finalize with an error message if the API call fails
        await bot(SendMessage(
            chat_id=chat_id,
            text=f"❌ Error occurred: {e}",
        ))


async def main() -> None:
    print("Starting bot polling...")
    await dp.start_polling(bot)


if __name__ == "__main__":
    asyncio.run(main())
```
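The throttling logic in the handler can also be pulled out into a small reusable helper. The following is a minimal sketch, not part of aiogram or the Bot API: the class name `DraftThrottle`, its `ready()` method, and the injectable `clock` parameter are all hypothetical names chosen for illustration. Injecting the clock makes the timing behavior testable without real delays.

```python
import time
from typing import Callable, Optional


class DraftThrottle:
    """Allow an action at most once per `min_interval` seconds (hypothetical helper)."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._min_interval = min_interval
        self._clock = clock
        self._last_sent: Optional[float] = None

    def ready(self) -> bool:
        """Return True and record the time if enough time has passed since the last send."""
        now = self._clock()
        if self._last_sent is None or now - self._last_sent >= self._min_interval:
            self._last_sent = now
            return True
        return False


# Usage with a fake clock: only calls spaced at least 0.4 s apart are allowed
fake_times = iter([0.0, 0.1, 0.5, 0.6, 1.0])
throttle = DraftThrottle(0.4, clock=lambda: next(fake_times))
print([throttle.ready() for _ in range(5)])  # [True, False, True, False, True]
```

Unlike the handler above, this sketch lets the very first update through immediately. Whether that is preferable to waiting one full interval is a design choice for your bot.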
To help you understand the architectural trade-offs, here is a detailed breakdown comparing the three primary methods used to display active thinking or generation in Telegram:
When deploying real-time streaming bots in production, you must adhere to several structural and safety parameters to protect your bot and ensure compliance with Telegram's guidelines:
Use an asynchronous framework (asyncio) and consider using a production process manager like PM2 or systemd to manage your bot lifecycle.

A draft sent with sendMessageDraft is temporary. If the connection fails or the final sendMessage takes longer than 30 seconds to arrive, the draft bubble automatically expires and disappears from the user's view. Always wrap your streaming loops in try-finally blocks to guarantee that a final message is delivered even if the model fails mid-generation.

Unlike editMessageText, which writes persistent data to Telegram's cloud database, sendMessageDraft updates an ephemeral memory-cached draft. Because it requires no heavy database write operations, Telegram allows bots to push draft updates at much higher frequencies without triggering 429 rate limit errors.
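The try-finally advice above can be sketched independently of any Telegram library. In this minimal example, `stream_with_guaranteed_final`, `send_draft`, `send_final`, and the fallback text are hypothetical names and values introduced for illustration. In a real bot the two callbacks would wrap the sendMessageDraft and sendMessage calls.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable


async def stream_with_guaranteed_final(
    chunks: AsyncIterator[str],
    send_draft: Callable[[str], Awaitable[None]],
    send_final: Callable[[str], Awaitable[None]],
    fallback: str = "Sorry, the response could not be generated.",
) -> None:
    """Stream partial text as drafts and always deliver one final message."""
    text = ""
    try:
        async for chunk in chunks:
            text += chunk
            await send_draft(text)
    finally:
        # Runs whether the stream completed or raised mid-generation:
        # deliver whatever text was accumulated, or the fallback if none was.
        await send_final(text or fallback)
```

The exception from a failed stream still propagates to the caller after the final message is sent, so logging and retry logic can live outside this function.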
Yes, the method fully supports the parse_mode argument. You can stream text formatted with MarkdownV2 or HTML, allowing inline bold text, spoilers, code tags, and links to render on the fly as they are streamed.
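One practical wrinkle when streaming formatted text is that a partial chunk can end in the middle of a markup entity, for example inside an unclosed code block, and Telegram rejects text whose entities it cannot parse. The helper below is a minimal sketch under that assumption. The function name `close_open_code_fences` is hypothetical, and it only handles triple-backtick code blocks, not bold, italic, links, or the character escaping that MarkdownV2 requires.

```python
FENCE = "`" * 3  # Triple backtick, built this way to keep the example readable


def close_open_code_fences(partial: str) -> str:
    """Append a closing fence if the partial text has an unclosed code block."""
    if partial.count(FENCE) % 2 == 1:
        # An odd number of fences means the last code block is still open
        return partial + "\n" + FENCE
    return partial
```

A draft update would pass the accumulated text through this function before sending, while the final message uses the complete, already balanced model output.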
The streamed draft is ephemeral. If you do not call sendMessage when generation is complete, the draft bubble will eventually expire and vanish (typically within 30–60 seconds), leaving the user with a completely empty chat window. Finalization is mandatory to keep the message permanent.
The sendMessageDraft method works perfectly in private direct chats, group chats, and Telegram forum topics. However, it is not supported in broadcast Channels, where messages are sent directly as permanent posts and do not have draft previews.
The sendMessageDraft API lets Telegram AI bots stream responses without the flickering and HTTP 429 rate limits that came with repeated message edits, so you can build a ChatGPT-like streaming interface in a short Python script. Ready to showcase your Telegram tools or find top developer channels? Join the conversation and explore verified resources in our catalog today!