FAQ

Frequently asked questions about the YT2Text API, features, pricing, and integration patterns.

By YT2Text Team • Published February 26, 2026

Answers to the most common questions about YT2Text, the YouTube-to-document API platform.

General

What is YT2Text?

YT2Text is a SaaS platform and REST API that converts YouTube videos into structured text documents, including full transcripts and AI-powered summaries. The service extracts captions from any public YouTube video with available subtitles, then processes the content through five distinct AI summary modes: TL;DR, Detailed, Study Notes, Timestamped, and Key Insights. YT2Text supports transcript extraction in nine or more languages and exports results in multiple formats including Markdown, JSON, HTML, CSV, and plain text. Developers integrate via the public REST API at https://api.yt2text.cc/api/v1, while non-technical users access the same capabilities through the web application at yt2text.cc. The platform handles video processing asynchronously, returning structured data that teams can feed into content pipelines, research workflows, educational tools, and accessibility systems. Pro plan users also get batch processing, webhook notifications, custom prompts, and infographic generation. See Getting Started for a quickstart walkthrough.

What summary modes does YT2Text offer?

YT2Text offers five AI summary modes, each designed for a different use case and output style. TL;DR produces a concise two-to-three sentence overview of the video content, ideal for quick scanning and social sharing. Detailed generates a comprehensive breakdown with key points and supporting context, suitable for thorough review and documentation. Study Notes formats the output as structured educational notes with bullet points and section headings, built for learning workflows and exam preparation. Timestamped creates a chronological outline with time references linked to specific moments in the video, useful for navigation and precise citation. Key Insights extracts the most important takeaways and actionable conclusions from the content. You can request one or more modes per video by passing an array of mode identifiers to the summary_modes field on the processing endpoint. The Free plan supports TL;DR only, while Starter and Pro plans unlock all five modes. See Videos API for request format details.

What export formats are available?

YT2Text supports five export formats for transcript and summary output, covering both human-readable and machine-readable use cases. Markdown is the default format, delivering structured content with headings, lists, and emphasis that renders cleanly in documentation tools, note-taking apps, and static site generators. Plain text strips all formatting for maximum compatibility with legacy systems, clipboard sharing, and simple text processors. JSON returns structured data with discrete fields for video metadata, transcript segments, and summaries, making it ideal for programmatic consumption, database storage, and pipeline automation. HTML provides ready-to-render markup suitable for direct embedding in web pages, email templates, and CMS platforms. CSV outputs tabular data that imports directly into spreadsheets and data analysis tools like Excel and Google Sheets. The format is specified when retrieving results from the /api/v1/videos/result/{job_id} endpoint, allowing you to request the same job output in different formats without reprocessing the video.

How many languages does YT2Text support?

YT2Text supports transcript extraction in nine or more languages, covering the most widely spoken languages globally including English, Spanish, French, German, Portuguese, Japanese, Korean, Chinese, and Hindi. The platform works with both auto-generated captions produced by YouTube's speech recognition engine and manually uploaded subtitle tracks provided by video creators. Language detection happens automatically during processing -- YT2Text identifies the available caption tracks for a given video and selects the best match based on quality and completeness. For videos with multiple caption tracks, the system prioritizes manual subtitles over auto-generated ones because manual tracks are typically more accurate. The extracted transcript preserves the original language of the captions. AI summary generation processes the transcript content regardless of source language, though summary quality is highest for English-language content. Language availability depends entirely on what caption tracks exist for each specific YouTube video, as YT2Text reads existing captions rather than performing its own speech-to-text conversion.

API and Integration

How do I get a YT2Text API key?

The YT2Text API key is your credential for authenticating all requests to the REST API. To obtain one, first create an account at yt2text.cc/auth/signup using email and password, Google OAuth, or a magic link. After signing in, navigate to your Dashboard and open the API Keys section. Click "Create New API Key" to generate a new key. Your key uses the sk_ prefix followed by 64 hexadecimal characters (for example, sk_8f4e5c...). Copy the key immediately after creation because the full value is only displayed once and cannot be retrieved later. Pass the key on every API request using either the Authorization: Bearer <api_key> header or the X-API-Key: <api_key> header. Query parameter authentication is not supported. Store your key in environment variables and never expose it in client-side code or version control. See Authentication for complete security guidance and best practices.
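The two supported header styles can be sketched as a small helper. This is an illustrative snippet, not official client code: the function name build_auth_headers and its use_bearer flag are hypothetical, and only the header names and the Bearer scheme come from this page.

```python
def build_auth_headers(api_key: str, use_bearer: bool = True) -> dict[str, str]:
    """Return request headers for one of the two documented auth styles.

    build_auth_headers is a hypothetical helper, not part of a YT2Text SDK.
    """
    if not api_key:
        raise ValueError("API key is empty")
    if use_bearer:
        # Style 1: Authorization: Bearer <api_key>
        return {"Authorization": f"Bearer {api_key}"}
    # Style 2: X-API-Key: <api_key>
    return {"X-API-Key": api_key}
```

Pass the returned dictionary as the headers of whichever HTTP client you use. Send one style per request rather than both.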

What is the base URL for the YT2Text API?

The base URL for all YT2Text API requests is https://api.yt2text.cc/api/v1. Every endpoint documented in the API reference is relative to this base path. For example, the video processing endpoint resolves to https://api.yt2text.cc/api/v1/videos/process, the batch processing endpoint resolves to https://api.yt2text.cc/api/v1/batch/process, and the usage endpoint resolves to https://api.yt2text.cc/api/v1/usage. The API is served over HTTPS exclusively -- plain HTTP requests are rejected. The backend runs on a dedicated VPS separate from the frontend web application at yt2text.cc, which is hosted on Cloudflare Pages. All API requests require authentication via an API key passed in the Authorization or X-API-Key header. Interactive API documentation with a Swagger UI is available at https://api.yt2text.cc/api/docs during development for testing endpoints directly in the browser. See Getting Started for a complete walkthrough of your first API call.
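As a small illustration of how endpoint paths resolve against the base path, the sketch below joins the two. The helper name endpoint_url is hypothetical. Only the base URL and the three example paths come from this page.

```python
BASE_URL = "https://api.yt2text.cc/api/v1"


def endpoint_url(path: str) -> str:
    """Join a relative endpoint path onto the documented base URL.

    endpoint_url is a hypothetical helper used only for illustration.
    """
    # Strip stray slashes so "videos/process" and "/videos/process" behave the same.
    return f"{BASE_URL}/{path.strip('/')}"


print(endpoint_url("videos/process"))  # https://api.yt2text.cc/api/v1/videos/process
print(endpoint_url("/batch/process"))  # https://api.yt2text.cc/api/v1/batch/process
print(endpoint_url("usage"))           # https://api.yt2text.cc/api/v1/usage
```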

How does video processing work?

Video processing in YT2Text follows an asynchronous job pattern designed for reliability and scalability. You submit a YouTube URL to POST /api/v1/videos/process with your desired summary modes and optional parameters, and the API immediately returns a job_id with a queued status. The backend then extracts captions from the video, runs AI summarization across your selected modes, and assembles the structured output. To retrieve results, either poll GET /api/v1/videos/status/{job_id} until the status reaches completed, or register a webhook_url in your processing request to receive a job.completed callback automatically when finished. Once complete, fetch the full output from GET /api/v1/videos/result/{job_id}, which includes video metadata, the raw transcript, and all requested summaries. Most videos process in under 30 seconds, though longer videos may take additional time proportional to their duration. If a job fails, the status shows failed with an error code explaining the reason. See Videos API for complete endpoint details and request examples.
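The polling half of this pattern might look roughly like the sketch below. It makes several assumptions: the status response is taken to be a JSON object with a "status" field, wait_for_job is a hypothetical name, and fetch_status stands for whatever function you write to call GET /api/v1/videos/status/{job_id} and return the parsed body. The interval and timeout defaults are arbitrary.

```python
import time
from typing import Callable

# Terminal states named on this page; the "status" field name is an assumption.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_job(
    job_id: str,
    fetch_status: Callable[[str], dict],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll fetch_status(job_id) until the job is terminal or the timeout elapses."""
    waited = 0.0
    while True:
        body = fetch_status(job_id)
        if body.get("status") in TERMINAL_STATES:
            return body
        if waited >= timeout_s:
            raise TimeoutError(f"job {job_id} not finished after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Injecting fetch_status and sleep keeps the loop independent of any particular HTTP client and makes it easy to exercise in tests without network access. On a completed status, the caller would go on to fetch the result endpoint.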

What is batch processing in YT2Text?

Batch processing is a Pro plan feature that allows you to submit multiple YouTube videos for processing in a single API call rather than sending individual requests for each video. You post an array of job objects to POST /api/v1/batch/process, and the API returns a batch_id that tracks the entire group as one unit. Each video within the batch can specify its own summary_mode, webhook_url, and custom_instructions, giving you fine-grained control over individual outputs. You monitor aggregate progress via GET /api/v1/batch/status/{batch_id}, which reports the status of every individual job within the batch alongside summary statistics. When processing finishes, retrieve all results at once from GET /api/v1/batch/results/{batch_id}. The batch status can be completed, processing, failed, or partial_failure depending on individual job outcomes. Batch processing is ideal for content teams, researchers, and automated pipelines that need to process entire playlists or video collections efficiently. See Batch API for request format details and response schemas.
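To make the four batch statuses concrete, here is one plausible way they could be derived from individual job statuses. This mapping is an assumption for illustration only; the service's actual rule is not documented on this page, and summarize_batch is a hypothetical name.

```python
def summarize_batch(job_statuses: list[str]) -> str:
    """Derive a batch-level status from per-job statuses (assumed rule, not official)."""
    if not job_statuses:
        raise ValueError("a batch needs at least one job")
    # Any job still queued or running keeps the whole batch in progress.
    if any(status in ("queued", "processing") for status in job_statuses):
        return "processing"
    failed = sum(1 for status in job_statuses if status == "failed")
    if failed == 0:
        return "completed"
    if failed == len(job_statuses):
        return "failed"
    # Some jobs finished and some failed.
    return "partial_failure"
```

A client can use the same idea in reverse: when the batch status is partial_failure, inspect the per-job entries to decide which videos to resubmit.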

How do webhooks work in YT2Text?

Webhooks in YT2Text are outbound HTTP POST callbacks that notify your server automatically when a processing job finishes, eliminating the need to poll the status endpoint. You include a webhook_url field in your video processing or batch processing request, and YT2Text sends a JSON payload to that URL when the job reaches a terminal state. Two event types are currently supported: job.completed delivers the full result payload including video metadata, transcript, and summaries, while job.failed delivers the error code and message explaining the failure reason. Each webhook request includes a Content-Type: application/json header and an X-Webhook-Signature header containing an HMAC-SHA256 signature computed over the raw JSON body bytes using your webhook secret. Your server should verify this signature before processing the payload to confirm authenticity and prevent spoofed requests. Return a 2xx status code to acknowledge successful receipt. Webhooks are available on Pro plans. See Webhooks for complete payload schemas, event types, and signature verification implementation details.

What SDKs are available for YT2Text?

YT2Text provides SDK examples in four languages to accelerate integration with the REST API. The JavaScript/TypeScript SDK (yt2text on npm) offers typed methods for video processing, batch operations, polling helpers, and error handling with classes like RateLimitError and AuthenticationError. The Python SDK (yt2text on PyPI) provides equivalent functionality with async support and automatic retry logic. The Swift SDK targets iOS and macOS applications, enabling native Apple platform integrations. The Kotlin SDK supports Android and JVM environments for mobile and server-side Java ecosystem projects. All SDKs authenticate using the Authorization: Bearer header pattern and accept API keys via constructor parameters or environment variables. Note that first-party SDK packages are planned but not yet published as official releases. The REST API remains the officially supported integration path today. See SDK Roadmap for current status.

Pricing and Plans

How much does YT2Text cost?

YT2Text offers three pricing tiers designed for different usage levels. The Free plan costs $0 and includes 3 videos per month with TL;DR summaries and Markdown export, suitable for evaluation and light personal use. The Starter plan costs $9 per month or $90 per year and includes 50 videos per month with access to all five summary modes, all export formats, and API access, plus a 7-day free trial for new subscribers. The Pro plan costs $29 per month or $290 per year and includes 200 videos per month with batch processing, webhook notifications, custom prompt instructions, infographic generation, and AI chat capabilities. Yearly billing on both paid plans provides a discount equivalent to two months free. All plans include transcript extraction and access to the web application. Paid plan limits reset monthly. See Rate Limits for daily and monthly quota breakdowns.
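The "two months free" figure can be checked with quick arithmetic on the prices quoted above. The snippet is only a worked calculation. The PLANS table and months_free helper are illustrative names.

```python
# (monthly USD, yearly USD) as quoted in this FAQ
PLANS = {"Starter": (9, 90), "Pro": (29, 290)}


def months_free(monthly: int, yearly: int) -> float:
    """Yearly saving expressed in months of the monthly price."""
    return (monthly * 12 - yearly) / monthly


for name, (monthly, yearly) in PLANS.items():
    saving = monthly * 12 - yearly
    print(f"{name}: saves ${saving}/year = {months_free(monthly, yearly):g} months free")
# Starter: saves $18/year = 2 months free
# Pro: saves $58/year = 2 months free
```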

What are the rate limits for the YT2Text API?

The rate limits for the YT2Text API are enforced per API key and plan tier, covering both request frequency and video processing quotas. Every API response includes rate limit headers: X-RateLimit-Limit shows the request cap for the current window, X-RateLimit-Used shows consumed requests, X-RateLimit-Remaining shows requests left before throttling, and X-RateLimit-Reset provides the Unix timestamp when the window resets. When you exceed your limit, the API returns HTTP status 429 Too Many Requests with a Retry-After header indicating how many seconds to wait. Video processing quotas are separate: Free allows 3 videos per day and 3 per month, Starter allows 10 per day and 50 per month, and Pro allows 30 per day and 200 per month. Use exponential backoff when you receive a 429 response rather than retrying in a tight loop. See Rate Limits for the complete quota table.
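A client can read these headers into a small structure to decide when to slow down. The sketch below assumes each header carries a plain integer string. RateLimitInfo, parse_rate_limit, and seconds_until_reset are hypothetical names, not part of any published SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateLimitInfo:
    limit: int        # X-RateLimit-Limit
    used: int         # X-RateLimit-Used
    remaining: int    # X-RateLimit-Remaining
    reset_epoch: int  # X-RateLimit-Reset (Unix timestamp)


def parse_rate_limit(headers: dict[str, str]) -> RateLimitInfo:
    """Parse the four rate limit headers; header names are matched case-insensitively."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitInfo(
        limit=int(lowered["x-ratelimit-limit"]),
        used=int(lowered["x-ratelimit-used"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_epoch=int(lowered["x-ratelimit-reset"]),
    )


def seconds_until_reset(info: RateLimitInfo, now_epoch: float) -> float:
    """Seconds left in the current window, never negative."""
    return max(0.0, info.reset_epoch - now_epoch)
```

For example, when remaining reaches zero, a client could pause for seconds_until_reset(info, time.time()) before sending the next request instead of waiting to be throttled.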

Is there a free trial?

YT2Text provides two ways to evaluate the platform at no cost. The Free plan is permanently available and includes 3 video processing jobs per month with TL;DR summary mode and Markdown export, requiring no credit card to sign up. This plan never expires and is suitable for ongoing light usage or initial evaluation. Additionally, the Starter plan includes a 7-day free trial period that grants full access to all Starter features: 50 videos per month, all five AI summary modes, all export formats, and API access. The trial begins when you subscribe and requires payment information upfront, but you will not be charged until the trial period ends. You can cancel during the trial to avoid charges. There is no separate trial for the Pro plan, but you can upgrade from Starter to Pro at any time. Visit yt2text.cc/auth/signup to create your account and start processing videos immediately.

Troubleshooting

What does error 401 UNAUTHORIZED mean?

The 401 UNAUTHORIZED error indicates that your API request is missing valid authentication credentials or the provided credentials were rejected. The most common cause is an absent Authorization header -- every request to the YT2Text API must include either Authorization: Bearer <api_key> or X-API-Key: <api_key>. Other causes include a mistyped API key, a key that has been revoked or disabled, or a key that does not match the expected sk_ prefix format. To resolve this error, first confirm your key is correctly formatted as sk_ followed by 64 hexadecimal characters. Then verify you are passing it in a supported header and that no extra whitespace or characters were introduced during copy-paste. If the key was recently created, ensure it was fully copied before the creation dialog was dismissed, as keys are only shown once. See Authentication for the complete credential format and header requirements.
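The format check described here is easy to automate before a request is ever sent. The following sketch assumes both uppercase and lowercase hex digits are acceptable, which this page does not specify. looks_like_api_key is a hypothetical helper.

```python
import re

# sk_ followed by exactly 64 hexadecimal characters (case handling is an assumption).
KEY_PATTERN = re.compile(r"sk_[0-9a-fA-F]{64}")


def looks_like_api_key(value: str) -> bool:
    """Return True only if the whole string matches the documented key shape."""
    # fullmatch rejects leading/trailing whitespace picked up during copy-paste.
    return KEY_PATTERN.fullmatch(value) is not None
```

A passing check only confirms the shape of the key. A well-formed key can still be rejected with 401 if it has been revoked or disabled.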

What does error 429 RATE_LIMITED mean?

The 429 RATE_LIMITED error means your API key has exceeded the allowed number of requests for the current rate limiting window. YT2Text enforces per-key request throttling and per-plan video processing quotas to ensure fair access across all users. When you receive this error, check the X-RateLimit-Remaining response header on your recent requests to see how close you are to the limit, and check the Retry-After header on the 429 response to determine how long to wait before sending another request. Implement exponential backoff in your client code: wait the number of seconds indicated by Retry-After, then double the wait interval on each subsequent 429 response. Do not retry in a tight loop, as this will extend your throttling period. If you consistently hit rate limits, consider upgrading your plan for higher quotas or optimizing your request patterns to reduce unnecessary calls. See Rate Limits for per-plan quota tables and header descriptions.
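The doubling schedule described above can be sketched as a pure function that produces the wait before each retry. The name backoff_delays, the retry count, and the upper cap are assumptions added for illustration. Only the "start at Retry-After, then double" rule comes from this page.

```python
def backoff_delays(retry_after_s: float, max_retries: int = 5, cap_s: float = 300.0) -> list[float]:
    """Waits (in seconds) before each retry: Retry-After first, then doubling.

    cap_s is an added safety limit so delays do not grow without bound.
    """
    delays = []
    delay = retry_after_s
    for _ in range(max_retries):
        delays.append(min(delay, cap_s))
        delay *= 2
    return delays


print(backoff_delays(2, max_retries=4))  # [2, 4, 8, 16]
```

In a real client you would sleep for each delay in turn and stop as soon as a request succeeds. Many implementations also add a small random jitter so that several clients do not retry in lockstep.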

What does error 422 TRANSCRIPT_UNAVAILABLE mean?

The 422 TRANSCRIPT_UNAVAILABLE error means YT2Text could not extract captions or subtitles from the requested YouTube video. This occurs when the video has no available caption tracks, which is common in several situations: live streams that have not been archived with captions, private or unlisted videos that restrict access to subtitle data, videos where the creator has explicitly disabled captions, and very recently uploaded videos where YouTube has not yet generated automatic captions. YT2Text requires at least one caption track -- either auto-generated by YouTube's speech recognition or manually uploaded by the creator -- to produce a transcript. To confirm whether captions exist, open the video on YouTube and check if the closed captions (CC) button is available in the player controls. If the video genuinely has no captions, YT2Text cannot process it. Consider selecting a different video or waiting for YouTube to generate automatic captions, which typically appear within 24 hours of upload.

Why is my job stuck in processing status?

A job remaining in processing status usually indicates that the video requires more time than average to complete. Typical processing finishes in under 30 seconds, but longer videos, videos with extensive caption tracks, or jobs requesting multiple summary modes may take additional time due to the AI summarization step. First, verify the current status by calling GET /api/v1/videos/status/{job_id} and checking the progress_percentage and current_step fields for signs of active progress. If the job has been processing for more than five minutes without progress, the job may have encountered an internal error that was not properly surfaced. In this case, try submitting the same video as a new processing request. If the issue persists across multiple attempts, contact support with your job_id for investigation. Registering a webhook_url with your processing request eliminates the need for polling and delivers results automatically upon completion or failure.
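The "five minutes without progress" rule can be checked mechanically if your poller records each progress_percentage it sees. The sketch below is one way to do that. The snapshot representation (timestamp, percentage) and the is_stalled name are hypothetical.

```python
def is_stalled(snapshots: list[tuple[float, int]], stall_after_s: float = 300.0) -> bool:
    """Return True if progress has not changed for longer than stall_after_s.

    snapshots holds (timestamp_seconds, progress_percentage) pairs, oldest first.
    """
    if not snapshots:
        return False
    latest_time, latest_progress = snapshots[-1]
    # Walk backwards to find when the current progress value was first observed.
    unchanged_since = latest_time
    for timestamp, progress in reversed(snapshots):
        if progress != latest_progress:
            break
        unchanged_since = timestamp
    return latest_time - unchanged_since > stall_after_s
```

When is_stalled returns True, the guidance above applies: resubmit the video as a new request, and contact support with the job_id if the problem repeats.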

What types of YouTube videos can YT2Text process?

YT2Text can process any public YouTube video that has available captions or subtitles. The platform supports both auto-generated captions produced by YouTube's speech recognition system and manually uploaded subtitle tracks provided by video creators. When multiple caption tracks are available, YT2Text prioritizes manual subtitles over auto-generated ones for higher accuracy. Videos must be publicly accessible -- private videos, deleted videos, and age-restricted content that requires authentication cannot be processed. Unlisted videos may work if the full URL is provided, but access depends on YouTube's caption availability policies. Live streams can be processed after they conclude and YouTube generates the archived caption track, which typically happens within a few hours. There is no maximum video length restriction imposed by YT2Text, though very long videos will take proportionally longer to process. The primary requirement is the existence of at least one caption track on the video.

Security

How should I store my API key securely?

YT2Text API keys are secret credentials that should be protected with the same care as passwords or database connection strings. Store your key exclusively on the server side and never embed it in client-side JavaScript, mobile application bundles, or publicly accessible configuration files. The recommended approach is to use environment variables -- set YT2TEXT_API_KEY in your server's environment and reference it in your application code rather than hardcoding the value. Never commit API keys to version control systems like Git, even in private repositories, as credential leaks from repository history are a common attack vector. Add your .env file to .gitignore to prevent accidental commits. Rotate your API keys on a regular schedule and immediately if you suspect exposure. If a key is compromised, revoke it from the API Keys dashboard and generate a new one. All SDK examples support reading the key from environment variables automatically. See Authentication for header format guidance.
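Reading the key from the environment at startup, and failing fast when it is missing, might look like this. load_api_key is a hypothetical helper. The variable name YT2TEXT_API_KEY is the one suggested above.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None, name: str = "YT2TEXT_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is not configured."""
    source = os.environ if env is None else env
    # Strip whitespace that can sneak in from shell exports or .env files.
    key = source.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set in the server environment")
    return key
```

Failing at startup surfaces a misconfigured deployment immediately, rather than as 401 responses later. The optional env argument exists so tests can pass a plain dictionary instead of touching the real environment.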

How does webhook signature verification work?

Webhook signature verification is a security mechanism that lets your server confirm each incoming webhook was genuinely sent by YT2Text and was not tampered with in transit. When YT2Text delivers a webhook, it computes an HMAC-SHA256 hash over the exact raw JSON body bytes using the webhook secret configured for your API key, then sends the resulting hex digest, prefixed with sha256=, in the X-Webhook-Signature header. To verify, your server should compute the same HMAC-SHA256 using your copy of the webhook secret and the raw request body bytes, then compare your computed signature with the value in the header using a constant-time comparison function to prevent timing attacks. If the signatures match, the payload is authentic and unmodified. If they do not match, reject the request with a non-2xx status and log the event for investigation. Never skip signature verification in production, as unsigned webhook endpoints are vulnerable to spoofed payloads. See Webhooks for payload structure and delivery behavior.
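The verification steps above map directly onto Python's standard hmac module. This is a minimal sketch assuming the secret is a UTF-8 string and the header value has the form sha256= followed by the hex digest. verify_signature is a hypothetical function name.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, header_value: str, secret: str) -> bool:
    """Check an X-Webhook-Signature header against the raw request body."""
    # Recompute the HMAC-SHA256 over the exact bytes received, not re-serialized JSON.
    digest = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest runs in constant time to avoid leaking match position.
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))
```

In a web framework, call this with the unparsed request body before decoding the JSON, and respond with a non-2xx status when it returns False. Parsing and re-serializing the body first can change whitespace or key order and cause valid signatures to fail.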