YouTube Transcript API: Extract Video Captions & Speech-to-Text
Need the text content of a YouTube video? The Transcript API extracts captions and subtitles from YouTube videos, which makes it useful for content analysis, SEO, and accessibility.
Why Use YouTube Transcripts?
- Content Repurposing - Turn videos into blog posts
- SEO Optimization - Extract keywords from video content
- Research - Analyze video content at scale
- Accessibility - Provide captions for all users
- Translation - Get multi-language transcripts
- AI Training - Build datasets from video content
Using the Transcript API
const response = await fetch('https://api.sociavault.com/youtube/video/transcript', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    url: 'https://www.youtube.com/watch?v=dQw4w9WgXcQ'
  })
});
const transcript = await response.json();
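The use cases further down call a `getTranscript(videoUrl)` helper that this article never defines. Here is a minimal sketch of what it could look like, built on the request shown above. The helper itself, the `language` field in the request body, the `apiKey` option, and the injectable `fetchImpl` parameter are all assumptions made for illustration, so check the API documentation for the real parameter names.

```javascript
// Endpoint taken from the request example above.
const TRANSCRIPT_ENDPOINT = 'https://api.sociavault.com/youtube/video/transcript';

// Build the fetch() options for a transcript request.
// ASSUMPTION: the optional `language` body field is a guess based on the
// getTranscript(videoUrl, { language: 'es' }) call used later in the article.
function buildTranscriptRequest(url, options = {}, apiKey = 'YOUR_API_KEY') {
  const body = { url };
  if (options.language) {
    body.language = options.language;
  }
  return {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(body)
  };
}

// Hypothetical helper: sends the request and returns the parsed JSON.
// `fetchImpl` defaults to the global fetch and can be swapped for a stub.
async function getTranscript(url, options = {}, fetchImpl = fetch) {
  const init = buildTranscriptRequest(url, options, options.apiKey);
  const response = await fetchImpl(TRANSCRIPT_ENDPOINT, init);
  if (!response.ok) {
    // Surface HTTP errors instead of trying to parse an error page as a transcript.
    throw new Error(`Transcript request failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

Keeping the request builder separate from the network call makes it possible to inspect exactly what would be sent without spending an API request.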
Sample Response
{
  "videoId": "dQw4w9WgXcQ",
  "title": "Video Title",
  "duration": 212,
  "transcript": {
    "text": "Welcome to today's video. Today we're going to discuss...",
    "segments": [
      {
        "start": 0.0,
        "end": 3.5,
        "text": "Welcome to today's video."
      },
      {
        "start": 3.5,
        "end": 7.2,
        "text": "Today we're going to discuss..."
      }
    ],
    "language": "en",
    "isAutoGenerated": false
  },
  "availableLanguages": ["en", "es", "fr", "de", "ja"]
}
Use Cases
Convert Videos to Blog Posts
Extract content for written articles:
const { transcript } = await getTranscript(videoUrl);
// Clean up for blog format
const sentences = transcript.text
  .replace(/\s+/g, ' ')       // Collapse runs of whitespace
  .match(/[^.!?]+[.!?]*/g) || []; // Split into sentences, keeping their punctuation
const blogContent = sentences
  .map(s => s.trim())
  .filter(s => s.length > 0)
  .join('\n\n');              // One sentence per paragraph
console.log(blogContent);
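One sentence per paragraph can read choppy in a blog post. An alternative, sketched below, is to use the `segments` timestamps from the sample response and start a new paragraph wherever the speaker pauses. The function name and the two-second threshold are assumptions, not part of the API, so tune the threshold to your content.

```javascript
// Group transcript segments into paragraphs, starting a new paragraph
// whenever the gap between two segments exceeds `maxGapSeconds`.
// Hypothetical helper; the 2-second default is an arbitrary starting point.
function segmentsToParagraphs(segments, maxGapSeconds = 2) {
  const paragraphs = [];
  let current = [];
  let previousEnd = null;

  for (const segment of segments) {
    const gap = previousEnd === null ? 0 : segment.start - previousEnd;
    if (gap > maxGapSeconds && current.length > 0) {
      paragraphs.push(current.join(' '));
      current = [];
    }
    current.push(segment.text.trim());
    previousEnd = segment.end;
  }

  // Flush whatever is left after the last segment.
  if (current.length > 0) {
    paragraphs.push(current.join(' '));
  }
  return paragraphs;
}
```

Calling `segmentsToParagraphs(transcript.segments).join('\n\n')` would give a draft with paragraph breaks that follow the rhythm of the speech rather than the punctuation.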
Extract Keywords for SEO
Analyze video content for keyword research:
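const { transcript } = await getTranscript(videoUrl);
// Normalize: lowercase, strip punctuation, drop short words
const words = transcript.text.toLowerCase()
  .replace(/[^\w\s]/g, '')
  .split(/\s+/)
  .filter(w => w.length > 3);
// Use a prototype-free object so words like "constructor" are counted correctly
const wordFreq = Object.create(null);
words.forEach(w => {
  wordFreq[w] = (wordFreq[w] || 0) + 1;
});
// Sort by frequency, highest first, and keep the top 20
const topKeywords = Object.entries(wordFreq)
  .sort((a, b) => b[1] - a[1])
  .slice(0, 20);
console.log('Top keywords:', topKeywords);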
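A raw frequency count tends to surface filler words such as "this" or "with" that pass the length filter. The sketch below adds a stop-word list to the same counting approach. The function name and the word list are illustrative assumptions; a real project would likely use a fuller, language-specific list.

```javascript
// Small illustrative stop-word list (an assumption, deliberately incomplete).
const STOP_WORDS = new Set([
  'this', 'that', 'with', 'from', 'have', 'were', 'will', 'what',
  'when', 'your', 'they', 'there', 'their', 'about', 'going', 'today'
]);

// Count word frequencies, skipping short words and stop words.
// Returns an array of [word, count] pairs, most frequent first.
function topKeywords(text, limit = 20) {
  const counts = new Map();
  const words = text.toLowerCase()
    .replace(/[^\w\s]/g, '')
    .split(/\s+/);

  for (const word of words) {
    if (word.length <= 3 || STOP_WORDS.has(word)) continue;
    counts.set(word, (counts.get(word) || 0) + 1);
  }

  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit);
}
```

A `Map` is used here instead of a plain object so that words matching built-in property names cannot distort the counts.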
Create Timestamps
Generate chapter markers:
const { transcript } = await getTranscript(videoUrl);
// Find key phrases for chapters
const chapters = [];
let currentChapter = { start: 0, text: '' };
transcript.segments.forEach(segment => {
  // Compare in lowercase so "Next" and "Moving on" also match
  const text = segment.text.toLowerCase();
  // Detect topic changes (simplified)
  if (text.includes('moving on') ||
      text.includes('next') ||
      text.includes('now let\'s talk about')) {
    if (currentChapter.text) {
      chapters.push(currentChapter);
    }
    currentChapter = {
      start: segment.start,
      text: segment.text
    };
  }
});
// Keep the final chapter, which the loop itself never pushes
if (currentChapter.text) {
  chapters.push(currentChapter);
}
chapters.forEach(ch => {
  const time = formatTimestamp(ch.start);
  console.log(`${time} - ${ch.text}`);
});
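The chapter example calls `formatTimestamp`, which is not defined anywhere in this article. A minimal sketch of one possible implementation follows. It assumes `start` is given in seconds, as in the sample response, and produces the `m:ss` or `h:mm:ss` style commonly seen in video descriptions.

```javascript
// Hypothetical implementation of formatTimestamp: converts a number of
// seconds into "m:ss", or "h:mm:ss" once the value reaches one hour.
function formatTimestamp(totalSeconds) {
  const whole = Math.floor(totalSeconds); // drop fractional seconds
  const hours = Math.floor(whole / 3600);
  const minutes = Math.floor((whole % 3600) / 60);
  const seconds = whole % 60;
  const ss = String(seconds).padStart(2, '0');

  if (hours > 0) {
    return `${hours}:${String(minutes).padStart(2, '0')}:${ss}`;
  }
  return `${minutes}:${ss}`;
}
```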
Multi-Language Support
Get transcripts in different languages:
const data = await getTranscript(videoUrl);
console.log('Available languages:', data.availableLanguages);
// Get Spanish transcript if available
if (data.availableLanguages.includes('es')) {
  const spanishTranscript = await getTranscript(videoUrl, { language: 'es' });
  console.log(spanishTranscript.transcript.text);
}
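When you care about more than one target language, it can help to choose from an ordered list of preferences instead of checking a single code. The helper below is a small sketch of that idea; its name is an assumption, and it only works with the `availableLanguages` array shown in the sample response.

```javascript
// Return the first preferred language code that the video offers,
// or null when none of the preferences is available.
// Hypothetical helper, not part of the API.
function pickLanguage(availableLanguages, preferredLanguages) {
  for (const code of preferredLanguages) {
    if (availableLanguages.includes(code)) {
      return code;
    }
  }
  return null;
}
```

For the sample response, `pickLanguage(data.availableLanguages, ['pt', 'es', 'en'])` would select `'es'`, which could then be passed as the `language` option in the second request.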
Content Analysis at Scale
Analyze multiple videos:
const videoUrls = ['url1', 'url2', 'url3'];
const transcripts = await Promise.all(
  videoUrls.map(url => getTranscript(url))
);
// Find common topics
const allText = transcripts.map(t => t.transcript.text).join(' ');
// Run through your analysis pipeline
const insights = analyzeContent(allText);
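With `Promise.all`, a single failed request rejects the whole batch, and a long URL list sends every request at once. The sketch below processes the list in small batches and uses `Promise.allSettled` so one failure does not discard the other results. The function name, the batch size of 5, and the `fetchOne` parameter (which would be your `getTranscript` helper) are assumptions. This article does not state the API's rate limits, so pick a batch size that matches your plan.

```javascript
// Fetch transcripts in fixed-size batches, collecting successes and
// failures separately instead of failing the whole run.
// `fetchOne` is any async function that takes a URL, e.g. getTranscript.
async function getTranscriptsInBatches(videoUrls, fetchOne, batchSize = 5) {
  const succeeded = [];
  const failed = [];

  for (let i = 0; i < videoUrls.length; i += batchSize) {
    const batch = videoUrls.slice(i, i + batchSize);
    // allSettled waits for every request in the batch, even if some reject.
    const results = await Promise.allSettled(batch.map(url => fetchOne(url)));

    results.forEach((result, index) => {
      if (result.status === 'fulfilled') {
        succeeded.push({ url: batch[index], data: result.value });
      } else {
        failed.push({ url: batch[index], error: result.reason });
      }
    });
  }

  return { succeeded, failed };
}
```

The combined text for analysis can then be built from `succeeded.map(s => s.data.transcript.text).join(' ')`, while `failed` tells you which videos to retry or skip.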
Related Endpoints
- YouTube Video Details - Video metadata
- YouTube Videos Scraper - Channel videos
- YouTube Comments - Video comments
- TikTok Transcript - TikTok transcripts
- Instagram Transcript - Instagram transcripts
Frequently Asked Questions
What if a video has no captions?
If no captions exist, the API uses speech-to-text to generate a transcript. The isAutoGenerated field in the response indicates whether the transcript was produced this way.
How many languages are supported?
The API supports 100+ languages for both manual and auto-generated captions.
Are timestamps included?
Yes, the segments array includes start/end timestamps for each phrase, useful for caption generation.
Can I get transcripts for live streams?
Live streams must end and process before transcripts are available. Recent live streams may not have transcripts yet.
How accurate are auto-generated transcripts?
Auto-generated transcripts typically achieve 90-95% accuracy for clear English speech. Accuracy varies with audio quality and language.
Can I download SRT/VTT files?
The API returns JSON data. You can convert the segments to SRT/VTT format:
function toSRT(segments) {
  return segments.map((seg, i) =>
    `${i + 1}\n${formatTime(seg.start)} --> ${formatTime(seg.end)}\n${seg.text}\n`
  ).join('\n');
}
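The `toSRT` function relies on a `formatTime` helper that is not shown. SRT timestamps use the form `HH:MM:SS,mmm`, while WebVTT uses a period before the milliseconds and starts with a `WEBVTT` header line. The sketch below covers both. The `separator` parameter and the `toVTT` function are assumptions added for illustration.

```javascript
// Hypothetical implementation of formatTime: seconds -> "HH:MM:SS,mmm".
// Pass '.' as the separator to get the WebVTT style instead.
function formatTime(totalSeconds, separator = ',') {
  // Work in whole milliseconds to avoid floating-point drift.
  const totalMs = Math.round(totalSeconds * 1000);
  const ms = totalMs % 1000;
  const seconds = Math.floor(totalMs / 1000) % 60;
  const minutes = Math.floor(totalMs / 60000) % 60;
  const hours = Math.floor(totalMs / 3600000);
  const pad = (n, width = 2) => String(n).padStart(width, '0');
  return `${pad(hours)}:${pad(minutes)}:${pad(seconds)}${separator}${pad(ms, 3)}`;
}

// Convert segments to a WebVTT document: a header line followed by cues.
function toVTT(segments) {
  const cues = segments.map(seg =>
    `${formatTime(seg.start, '.')} --> ${formatTime(seg.end, '.')}\n${seg.text}\n`
  );
  return `WEBVTT\n\n${cues.join('\n')}`;
}
```

To save the result in Node.js, the returned string can be written to a `.srt` or `.vtt` file with `fs.writeFileSync`.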
Get Started
Create your account and start extracting YouTube transcripts.
Documentation: /docs/api-reference/youtube/video-transcript