FAQ

Q: How do I convert a YouTube video to text?

Submit a public YouTube URL through the YT2Text web app or API. YT2Text reads the available caption track, returns the full transcript, and lets you export the result in multiple text formats.

Q: Can I summarize YouTube videos automatically?

Yes. After transcript extraction, YT2Text can generate TL;DR, detailed notes, study notes, timestamped outlines, and key insights automatically from the video transcript.

Q: Is YT2Text a YouTube to text converter or a transcript tool?

It is both. YT2Text first extracts the transcript from a supported YouTube video, then adds summary, export, and automation workflows on top of that transcript.

Q: Can I export a YouTube transcript?

Yes. YT2Text supports Markdown, plain text, JSON, HTML, and CSV exports so a transcript can move directly into docs, research notes, or downstream automation.

Q: What is the difference between a YouTube transcript and a summary?

A transcript captures what was said in the video. A summary condenses the key ideas into a shorter format for quick reading, notes, or publishing workflows.

Answers to the most common questions about YT2Text, the YouTube-to-document API platform.

Query-Led Answers

How do I convert a YouTube video to text?

To convert a YouTube video to text, submit a public video URL to YT2Text through the web app or the API. The system reads the available caption track, extracts the transcript text, preserves language information, and returns structured transcript data. In the web app, users can export completed jobs in multiple text formats. In the public API, transcript and summary data are returned as structured JSON, and Pro users can also download PDF exports for completed jobs. This is the cleanest workflow when someone needs a YouTube transcript for research, publishing, note-taking, or AI processing.

Can I summarize YouTube videos automatically?

Yes. YT2Text can summarize YouTube videos automatically after transcript extraction. This transcript-first workflow is more reliable than summary tools that look only at titles or descriptions because the model sees the actual spoken content. Available summary modes include TL;DR, Detailed, Study Notes, Timestamped, and Key Insights.

Is YT2Text a YouTube to text converter or a transcript tool?

YT2Text functions as both a YouTube to text converter and a transcript tool. It extracts the transcript from a supported YouTube video first, then layers AI summaries, exports, and automation workflows on top of that transcript. That makes it useful for both simple copy-paste tasks and production content pipelines.

Can I export a YouTube transcript?

Yes. Every completed YT2Text job can be retrieved in multiple export formats. Markdown works well for docs and note-taking systems, JSON is best for automation and databases, HTML is useful for direct rendering, CSV fits spreadsheet workflows, and plain text is the lowest-friction option for quick reuse.

What is the difference between a YouTube transcript and a summary?

A YouTube transcript is the full text representation of what was said in the video. A summary is a shorter interpretation of the most important ideas. In practice, teams often keep both: the transcript as source material and the summary as the fast-reading layer for notes, reports, or content briefs.

General

What is YT2Text?

YT2Text is a SaaS platform and REST API that converts YouTube videos into structured text documents, including full transcripts and AI-powered summaries. The service extracts captions from any public YouTube video with available subtitles, then processes the content through five distinct AI summary modes: TL;DR, Detailed, Study Notes, Timestamped, and Key Insights. Developers integrate via the public REST API at https://api.yt2text.cc/api/v1, while non-technical users access the same capabilities through the web application at yt2text.cc. The platform handles video processing asynchronously, returning structured data that teams can feed into content pipelines, research workflows, educational tools, and accessibility systems. Pro plan users also get batch processing, webhook notifications, custom prompts, infographic generation, AI chat, and PDF export. See Getting Started for a quickstart walkthrough.

What summary modes does YT2Text offer?

YT2Text offers five AI summary modes, each designed for a different use case and output style. TL;DR produces a concise two-to-three sentence overview of the video content, ideal for quick scanning and social sharing. Detailed generates a comprehensive breakdown with key points and supporting context, suitable for thorough review and documentation. Study Notes formats the output as structured educational notes with bullet points and section headings, built for learning workflows and exam preparation. Timestamped creates a chronological outline with time references linked to specific moments in the video, useful for navigation and precise citation. Key Insights extracts the most important takeaways and actionable conclusions from the content. You can request one or more modes per video by passing an array of mode identifiers to the summary_modes field on the processing endpoint. The Free plan supports TL;DR only, while Plus and Pro plans unlock all five modes. See Videos API for request format details.

What export formats are available?

YT2Text supports multiple output formats across the product, but the surface depends on which API you use. The public developer API returns structured JSON for job results and offers a PDF download endpoint for completed Pro/Admin jobs. The browser app's internal workflow supports richer export preferences such as Markdown, plain text, JSON, HTML, and CSV for completed jobs inside the product experience. If you are building an external integration, treat the public API response payload as the stable source of transcript and summary data, and use the PDF export endpoint when you need a rendered document.

How many languages does YT2Text support?

YT2Text supports transcript extraction in nine or more languages, covering the most widely spoken languages globally including English, Spanish, French, German, Portuguese, Japanese, Korean, Chinese, and Hindi. The platform works with both auto-generated captions produced by YouTube's speech recognition engine and manually uploaded subtitle tracks provided by video creators. Language detection happens automatically during processing -- YT2Text identifies the available caption tracks for a given video and selects the best match based on quality and completeness. For videos with multiple caption tracks, the system prioritizes manual subtitles over auto-generated ones because manual tracks are typically more accurate. The extracted transcript preserves the original language of the captions. AI summary generation processes the transcript content regardless of source language, though summary quality is highest for English-language content. Language availability depends entirely on what caption tracks exist for each specific YouTube video, as YT2Text reads existing captions rather than performing its own speech-to-text conversion.

API and Integration

How do I get a YT2Text API key?

The YT2Text API key is your credential for authenticating all requests to the REST API. To obtain one, first create an account at yt2text.cc/auth/signup using email and password, Google OAuth, or a magic link. After signing in, navigate to your Dashboard and open the API Keys section. Click "Create New API Key" to generate a new key. Your key uses the sk_ prefix followed by 64 hexadecimal characters (for example, sk_8f4e5c...). Copy the key immediately after creation because the full value is only displayed once and cannot be retrieved later. Pass the key on every API request using either the Authorization: Bearer <api_key> header or the X-API-Key: <api_key> header. Query parameter authentication is not supported. Store your key in environment variables and never expose it in client-side code or version control. See Authentication for complete security guidance and best practices.
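As a rough illustration of the two header styles described above, the sketch below builds an authenticated request object with Python's standard library. The helper name authed_request and the placeholder key are hypothetical; only the header names and the usage endpoint path come from this FAQ.

```python
import urllib.request


def authed_request(url: str, api_key: str, use_bearer: bool = True) -> urllib.request.Request:
    """Attach the API key using one of the two supported header styles."""
    if use_bearer:
        headers = {"Authorization": f"Bearer {api_key}"}
    else:
        headers = {"X-API-Key": api_key}
    # urllib normalizes header-name capitalization; HTTP header names are
    # case-insensitive, so this does not change what the server sees.
    return urllib.request.Request(url, headers=headers)


# Placeholder key for illustration only; a real key is sk_ + 64 hex characters.
request = authed_request("https://api.yt2text.cc/api/v1/usage", "sk_example")
```

The request is only constructed here, not sent. Passing it to urllib.request.urlopen would perform the call.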

What is the base URL for the YT2Text API?

The base URL for all YT2Text API requests is https://api.yt2text.cc/api/v1. Every endpoint documented in the API reference is relative to this base path. For example, the video processing endpoint resolves to https://api.yt2text.cc/api/v1/videos/process, the batch processing endpoint resolves to https://api.yt2text.cc/api/v1/batch/process, and the usage endpoint resolves to https://api.yt2text.cc/api/v1/usage. The API is served over HTTPS exclusively -- plain HTTP requests are not accepted and will be rejected. The backend runs on a dedicated VPS separate from the frontend web application at yt2text.cc, which is hosted on Cloudflare Pages. All API requests require authentication via an API key passed in the Authorization or X-API-Key header. Interactive API documentation with a Swagger UI interface is available at https://api.yt2text.cc/api/docs during development for testing endpoints directly in the browser. See Getting Started for a complete walkthrough of your first API call.
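A small sketch of how relative endpoint paths resolve against the base URL. The endpoint helper is a hypothetical name introduced for illustration; the base URL and paths are the ones listed above.

```python
BASE_URL = "https://api.yt2text.cc/api/v1"


def endpoint(path: str) -> str:
    """Join a documented relative path onto the base URL."""
    return BASE_URL + "/" + path.lstrip("/")


process_url = endpoint("videos/process")
batch_url = endpoint("/batch/process")
usage_url = endpoint("usage")
```

Plain concatenation is used here instead of urllib.parse.urljoin because urljoin would replace part of the /api/v1 path when the base has no trailing slash or the relative path starts with a slash.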

How does video processing work?

Video processing in YT2Text follows an asynchronous job pattern designed for reliability and scalability. You submit a YouTube URL to POST /api/v1/videos/process with your desired summary modes and optional parameters, and the API immediately returns a job_id with a queued status. The backend then extracts captions from the video, runs AI summarization across your selected modes, and assembles the structured output. To retrieve results, either poll GET /api/v1/videos/status/{job_id} until the status reaches completed, or register a webhook_url in your processing request to receive a job.completed callback automatically when finished. Once complete, fetch the full output from GET /api/v1/videos/result/{job_id}, which includes video metadata, the raw transcript, and all requested summaries. Most videos process in under 30 seconds, though longer videos may take additional time proportional to their duration. If a job fails, the status shows failed with an error code explaining the reason. See Videos API for complete endpoint details and request examples.
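The submit-then-poll pattern described above could be sketched roughly as follows. Several details are assumptions rather than confirmed API behavior: the request body field name url for a single video, the mode identifier "tldr", and the JSON key status in the status response. The fetch_status callable is a hypothetical stand-in for whatever function performs GET /api/v1/videos/status/{job_id} in your client.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "https://api.yt2text.cc/api/v1"
TERMINAL_STATES = ("completed", "failed")


def build_process_request(api_key: str, video_url: str, modes: list[str]) -> urllib.request.Request:
    """Build (but do not send) the POST that queues a processing job."""
    # "url" is an assumed field name; "summary_modes" is named in this FAQ.
    body = json.dumps({"url": video_url, "summary_modes": modes}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/videos/process",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def wait_for_job(
    fetch_status: Callable[[], dict],
    interval: float = 2.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reports completed or failed, or give up."""
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("status") in TERMINAL_STATES:
            return status
        sleep(interval)
    raise TimeoutError("job did not reach a terminal state in time")
```

Injecting fetch_status and sleep keeps the polling loop independent of any HTTP library and makes it easy to exercise without network access. The interval and poll count are illustrative defaults, not recommended values from YT2Text.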

What is batch processing in YT2Text?

Batch processing is a Pro plan feature that allows you to submit multiple YouTube videos for processing in a single API call rather than sending individual requests for each video. You post a videos array to POST /api/v1/batch/process, and the API returns a batch_id that tracks the entire group as one unit. Each video can provide its own url and options object, including its own requested summary_modes. The current batch request schema uses a batch-level webhook_url and does not expose per-video custom prompts or per-video webhook destinations. That batch-level webhook URL is currently stored by the backend, but batch completion delivery is not yet wired up, so polling remains the reliable integration pattern. You monitor aggregate progress via GET /api/v1/batch/status/{batch_id}, which reports the status of every individual job within the batch alongside summary statistics. When processing finishes, retrieve all results at once from GET /api/v1/batch/results/{batch_id}. The batch status can be completed, processing, failed, or partial_failure depending on individual job outcomes. Batch processing is ideal for content teams, researchers, and automated pipelines that need to process entire playlists or video collections efficiently. See Batch API for request format details and response schemas.
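A minimal sketch of a batch request body, based on the field names mentioned above (videos, url, options, summary_modes, webhook_url). The helper name build_batch_payload and the mode identifiers such as "tldr" and "key_insights" are assumptions for illustration; check the Batch API reference for the exact schema.

```python
import json
from typing import Optional


def build_batch_payload(
    jobs: list[tuple[str, list[str]]],
    webhook_url: Optional[str] = None,
) -> dict:
    """Build the JSON body for POST /api/v1/batch/process."""
    payload: dict = {
        "videos": [
            # Each video carries its own url and its own options object.
            {"url": url, "options": {"summary_modes": list(modes)}}
            for url, modes in jobs
        ]
    }
    # The webhook destination is batch-level only, never per video.
    if webhook_url is not None:
        payload["webhook_url"] = webhook_url
    return payload


body = json.dumps(
    build_batch_payload(
        [
            ("https://www.youtube.com/watch?v=VIDEO_ONE", ["tldr"]),
            ("https://www.youtube.com/watch?v=VIDEO_TWO", ["tldr", "key_insights"]),
        ]
    )
)
```

Because batch completion delivery is not yet wired up, a client built on this payload would still poll the batch status endpoint rather than wait for a callback.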

How do webhooks work in YT2Text?

Webhooks in YT2Text are outbound HTTP POST callbacks that notify your server automatically when a processing job finishes, eliminating the need to poll the status endpoint. You include a webhook_url field in your video processing or batch processing request, and YT2Text sends a JSON payload to that URL when the job reaches a terminal state. For batch requests, the webhook_url is accepted and stored, but batch completion delivery is not yet wired up, so keep polling the batch status endpoint. Two event types are currently supported: job.completed delivers completion data, while job.failed reports the failure state. Webhooks are available on Pro plans. Production webhook destinations must use HTTPS and pass server-side SSRF validation. The transport layer can support HMAC signing internally, but the current public API does not expose webhook-secret configuration, so public integrations should not rely on signature headers today. See Webhooks for current payload details.
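One way a receiving server might dispatch on the two event types is sketched below. The payload keys event and job_id are assumptions, since this FAQ does not describe the payload shape; consult the Webhooks reference for the real field names. The function name and the returned action labels are also hypothetical.

```python
import json


def handle_webhook(raw_body: bytes) -> str:
    """Decide what to do with an incoming webhook body."""
    try:
        payload = json.loads(raw_body)
    except ValueError:
        return "reject: body is not valid JSON"
    if not isinstance(payload, dict):
        return "reject: unexpected payload shape"

    event = payload.get("event")    # assumed key name
    job_id = payload.get("job_id")  # assumed key name
    if event == "job.completed":
        return f"fetch result for {job_id}"
    if event == "job.failed":
        return f"record failure for {job_id}"
    return "ignore: unknown event type"
```

Since callbacks are unsigned today, a cautious design treats the webhook only as a hint and then fetches the authoritative result from GET /api/v1/videos/result/{job_id} with its own API key, rather than trusting data in the callback body.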

What SDKs are available for YT2Text?

YT2Text provides SDK examples in four languages to accelerate integration with the REST API. The JavaScript/TypeScript SDK (yt2text on npm) offers typed methods for video processing, batch operations, polling helpers, and error handling with classes like RateLimitError and AuthenticationError. The Python SDK (yt2text on PyPI) provides equivalent functionality with async support and automatic retry logic. The Swift SDK targets iOS and macOS applications, enabling native Apple platform integrations. The Kotlin SDK supports Android and JVM environments for mobile and server-side Java ecosystem projects. All SDKs authenticate using the Authorization: Bearer header pattern and accept API keys via constructor parameters or environment variables. Note that first-party SDK packages are planned but not yet published as official releases. The REST API remains the officially supported integration path today. See SDK Roadmap for current status.

Pricing and Plans

How much does YT2Text cost?

YT2Text offers three pricing tiers designed for different usage levels. The Free plan costs $0 and includes 5 videos per month with TL;DR summaries. The Plus plan costs $9 per month or $90 per year and includes 200 videos per month with access to all five summary modes and API keys. The Pro plan costs $29 per month or $290 per year and includes 1,000 videos per month with batch processing, webhook notifications, custom prompt instructions, infographic generation, AI chat capabilities, PDF export, and priority queue access. Yearly billing on both paid plans provides a discount equivalent to two months free. Paid plan limits reset monthly. See Rate Limits for daily and monthly quota breakdowns.
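The "two months free" claim can be checked with quick arithmetic using the prices quoted above. The helper name months_free is introduced here purely for illustration.

```python
def months_free(monthly_price: int, yearly_price: int) -> float:
    """How many months of the monthly price the yearly plan saves."""
    return (monthly_price * 12 - yearly_price) / monthly_price


plus_saving = months_free(9, 90)    # (108 - 90) / 9
pro_saving = months_free(29, 290)   # (348 - 290) / 29
```

Both plans work out to exactly two months saved per year.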

What are the rate limits for the YT2Text API?

The rate limits for the YT2Text API are enforced through a combination of per-key request throttling and plan-based video quotas. API responses can include rate limit headers such as X-RateLimit-Limit, X-RateLimit-Used, X-RateLimit-Remaining, and X-RateLimit-Reset, and throttled requests return HTTP 429 Too Many Requests with Retry-After. Video processing quotas are separate: Free allows 5 videos per day and 5 per month, Plus allows 40 per day and 200 per month, and Pro allows 100 per day and 1,000 per month. Use exponential backoff when you receive a 429 response rather than retrying in a tight loop. See Rate Limits for the current quota table and implementation caveats.
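A small sketch of reading the rate limit headers named above from a response. It assumes the limit, used, and remaining values are plain integers, which this FAQ does not state explicitly. X-RateLimit-Reset is deliberately left unparsed because its format (timestamp or seconds) is not described here.

```python
COUNTER_HEADERS = {
    "limit": "X-RateLimit-Limit",
    "used": "X-RateLimit-Used",
    "remaining": "X-RateLimit-Remaining",
}


def parse_rate_limit(headers: dict[str, str]) -> dict[str, int]:
    """Collect whichever counter headers are present, as integers."""
    # Header names are case-insensitive, so compare in lowercase.
    lowered = {name.lower(): value for name, value in headers.items()}
    counters: dict[str, int] = {}
    for key, header_name in COUNTER_HEADERS.items():
        value = lowered.get(header_name.lower())
        if value is not None and value.strip().isdigit():
            counters[key] = int(value)
    return counters
```

Missing or non-numeric headers are skipped rather than treated as errors, since the text says responses "can include" these headers, not that they always do.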

Is there a free trial?

YT2Text provides two ways to evaluate the platform at no cost. The Free plan is permanently available and includes 5 video processing jobs per month with TL;DR summary mode, requiring no credit card to sign up. This plan never expires and is suitable for ongoing light usage or initial evaluation. Additionally, the Plus plan includes a 7-day free trial period that grants full access to all Plus features: 200 videos per month, all five AI summary modes, and API key access. The trial begins when you subscribe and requires payment information upfront, but you will not be charged until the trial period ends. You can cancel during the trial to avoid charges. There is no separate trial for the Pro plan, but you can upgrade from Plus to Pro at any time. Visit yt2text.cc/auth/signup to create your account and start processing videos immediately.

Troubleshooting

What does error 401 UNAUTHORIZED mean?

The 401 UNAUTHORIZED error indicates that your API request is missing valid authentication credentials or the provided credentials were rejected. The most common cause is an absent Authorization header -- every request to the YT2Text API must include either Authorization: Bearer <api_key> or X-API-Key: <api_key>. Other causes include a mistyped API key, a key that has been revoked or disabled, or a key that does not match the expected sk_ prefix format. To resolve this error, first confirm your key is correctly formatted as sk_ followed by 64 hexadecimal characters. Then verify you are passing it in a supported header and that no extra whitespace or characters were introduced during copy-paste. If the key was recently created, ensure it was fully copied before the creation dialog was dismissed, as keys are only shown once. See Authentication for the complete credential format and header requirements.
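The format checks above can be run locally before sending a request. The sketch below is a hypothetical helper, not part of any YT2Text SDK; it accepts both uppercase and lowercase hex digits, which is an assumption because the text does not specify case.

```python
import string

PREFIX = "sk_"
BODY_LENGTH = 64


def diagnose_api_key(key: str) -> list[str]:
    """Return a list of format problems; an empty list means none found."""
    problems: list[str] = []
    cleaned = key.strip()
    if cleaned != key:
        problems.append("leading or trailing whitespace")
    if cleaned.startswith(PREFIX):
        body = cleaned[len(PREFIX):]
    else:
        problems.append("missing sk_ prefix")
        body = cleaned
    if len(body) != BODY_LENGTH:
        problems.append(f"expected {BODY_LENGTH} characters after the prefix, found {len(body)}")
    if any(ch not in string.hexdigits for ch in body):
        problems.append("non-hexadecimal characters in key body")
    return problems
```

A clean result only confirms the key's shape. A correctly formatted key can still be rejected with 401 if it has been revoked or disabled.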

What does error 429 RATE_LIMITED mean?

The 429 RATE_LIMITED error means your API key has exceeded the allowed number of requests for the current rate limiting window. YT2Text enforces per-key request throttling and per-plan video processing quotas to ensure fair access across all users. When you receive this error, check the X-RateLimit-Remaining response header on your recent requests to see how close you are to the limit, and check the Retry-After header on the 429 response to determine how long to wait before sending another request. Implement exponential backoff in your client code: wait the number of seconds indicated by Retry-After, then double the wait interval on each subsequent 429 response. Do not retry in a tight loop, as this will extend your throttling period. If you consistently hit rate limits, consider upgrading your plan for higher quotas or optimizing your request patterns to reduce unnecessary calls. See Rate Limits for per-plan quota tables and header descriptions.
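The backoff rule described above (start at Retry-After, then double on each further 429) can be sketched as a delay schedule. The 300-second cap and the 1-second fallback for a missing or zero Retry-After are assumptions added for safety, not values from YT2Text.

```python
def backoff_delays(retry_after: float, attempts: int, cap: float = 300.0) -> list[float]:
    """Wait times for consecutive 429 responses."""
    # Fall back to 1 second so a zero or missing value cannot
    # degenerate into a tight retry loop.
    delay = retry_after if retry_after > 0 else 1.0
    delays: list[float] = []
    for _ in range(attempts):
        delays.append(min(delay, cap))
        delay *= 2
    return delays


schedule = backoff_delays(5, 4)  # [5, 10, 20, 40]
```

A client would sleep for each value in turn between retries and stop once a request succeeds. Adding random jitter to each delay is a common refinement when many workers share one key.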

What does error 422 TRANSCRIPT_UNAVAILABLE mean?

The 422 TRANSCRIPT_UNAVAILABLE error means YT2Text could not extract captions or subtitles from the requested YouTube video. This occurs when the video has no available caption tracks, which is common in several situations: live streams that have not been archived with captions, private or unlisted videos that restrict access to subtitle data, videos where the creator has explicitly disabled captions, and very recently uploaded videos where YouTube has not yet generated automatic captions. YT2Text requires at least one caption track -- either auto-generated by YouTube's speech recognition or manually uploaded by the creator -- to produce a transcript. To confirm whether captions exist, open the video on YouTube and check if the closed captions (CC) button is available in the player controls. If the video genuinely has no captions, YT2Text cannot process it. Consider selecting a different video or waiting for YouTube to generate automatic captions, which typically appear within 24 hours of upload.

Why is my job stuck in processing status?

A job remaining in processing status usually indicates that the video requires more time than average to complete. Typical processing finishes in under 30 seconds, but longer videos, videos with extensive caption tracks, or jobs requesting multiple summary modes may take additional time due to the AI summarization step. First, verify the current status by calling GET /api/v1/videos/status/{job_id} and checking the progress_percentage and current_step fields for signs of active progress. If the job has been processing for more than five minutes without progress, the job may have encountered an internal error that was not properly surfaced. In this case, try submitting the same video as a new processing request. If the issue persists across multiple attempts, contact support with your job_id for investigation. On Pro plans, registering a webhook_url with your processing request eliminates the need for polling and delivers results automatically upon completion or failure.
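The "five minutes without progress" check can be automated while polling. The sketch below compares successive status snapshots using the progress_percentage and current_step fields named above. The class name StallDetector and the example step value are hypothetical, and timestamps are passed in explicitly so the logic does not depend on the wall clock.

```python
from typing import Optional

STALL_SECONDS = 5 * 60  # five minutes without progress, per the guidance above


class StallDetector:
    """Track status snapshots and flag a job that has stopped progressing."""

    def __init__(self, stall_seconds: float = STALL_SECONDS) -> None:
        self.stall_seconds = stall_seconds
        self._last_seen: Optional[tuple] = None
        self._last_change: float = 0.0

    def update(self, status: dict, now: float) -> bool:
        """Record a snapshot; return True if the job looks stalled."""
        seen = (status.get("progress_percentage"), status.get("current_step"))
        if seen != self._last_seen:
            # Either field changed, so the job is still moving.
            self._last_seen = seen
            self._last_change = now
            return False
        return now - self._last_change > self.stall_seconds
```

In a polling loop, call update with each status response and time.monotonic() as the timestamp. When it returns True, resubmit the video as a new request as suggested above.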

What types of YouTube videos can YT2Text process?

YT2Text can process any public YouTube video that has available captions or subtitles. The platform supports both auto-generated captions produced by YouTube's speech recognition system and manually uploaded subtitle tracks provided by video creators. When multiple caption tracks are available, YT2Text prioritizes manual subtitles over auto-generated ones for higher accuracy. Videos must be publicly accessible -- private videos, deleted videos, and age-restricted content that requires authentication cannot be processed. Unlisted videos may work if the full URL is provided, but access depends on YouTube's caption availability policies. Live streams can be processed after they conclude and YouTube generates the archived caption track, which typically happens within a few hours. There is no maximum video length restriction imposed by YT2Text, though very long videos will take proportionally longer to process. The primary requirement is the existence of at least one caption track on the video.

Security

How should I store my API key securely?

YT2Text API keys are secret credentials that should be protected with the same care as passwords or database connection strings. Store your key exclusively on the server side and never embed it in client-side JavaScript, mobile application bundles, or publicly accessible configuration files. The recommended approach is to use environment variables -- set YT2TEXT_API_KEY in your server's environment and reference it in your application code rather than hardcoding the value. Never commit API keys to version control systems like Git, even in private repositories, as credential leaks from repository history are a common attack vector. Add your .env file to .gitignore to prevent accidental commits. Rotate your API keys on a regular schedule and immediately if you suspect exposure. If a key is compromised, revoke it from the API Keys dashboard and generate a new one. All SDK examples support reading the key from environment variables automatically. See Authentication for header format guidance.
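A minimal sketch of reading the key from the YT2TEXT_API_KEY environment variable instead of hardcoding it. The function name load_api_key is hypothetical; the optional env argument exists only so the lookup can be exercised without touching the real process environment.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Read the API key from the environment and fail loudly if absent."""
    source = os.environ if env is None else env
    key = source.get("YT2TEXT_API_KEY", "").strip()
    if not key:
        raise RuntimeError("YT2TEXT_API_KEY is not set; refusing to start without a key")
    return key
```

Failing at startup when the variable is missing tends to surface configuration mistakes earlier than a 401 response at request time would.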

How does webhook signature verification work?

The webhook transport layer can support HMAC signing internally, but the current public API does not expose webhook-secret configuration. Public integrations should therefore treat webhooks as unsigned callbacks today and rely on HTTPS, destination validation, and endpoint-level hardening on their own servers. If public signature support is added later, the webhook reference will be updated with the exact verification flow.