Tutorial

Twitter Scraping API: How to Extract X/Twitter Data in 2026

February 17, 2026
10 min read
By SociaVault Team
Twitter Scraping · X API · Data Extraction · Python · JavaScript · Social Media API


Twitter's official API starts at $100/month for any meaningful access. The free tier caps you at 1,500 tweet reads a month, barely enough to test.

If you need serious Twitter data—tweets, profiles, followers, trends—at scale, you need a scraping API.

This guide shows you how to get Twitter/X data without the ridiculous pricing.

The Twitter API Problem

Here's what Twitter's official API tiers look like in 2026:

Tier  | Price        | Tweet Reads     | Tweet Posts
Free  | $0           | 1,500/month     | 500/month
Basic | $100/month   | 10,000/month    | 3,000/month
Pro   | $5,000/month | 1,000,000/month | 300,000/month

For comparison, scraping 10,000 tweets with SociaVault costs about $10.
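That gap is simple arithmetic. Here's a quick sketch using the figures quoted in this guide (the ~$0.001 per-request price is approximate; check current pricing):

```python
# Back-of-the-envelope cost comparison using the prices quoted in this guide.
OFFICIAL_BASIC_MONTHLY = 100.0    # Twitter Basic tier: $100/month for 10,000 reads
SCRAPE_PRICE_PER_REQUEST = 0.001  # approximate pay-per-request price

def scraping_cost(num_requests):
    """Pay-as-you-go cost for a batch of requests."""
    return num_requests * SCRAPE_PRICE_PER_REQUEST

# 10,000 tweets: a recurring $100/month vs a one-time ~$10
print(f"Official API: ${OFFICIAL_BASIC_MONTHLY:.2f}/month")
print(f"Scraping API: ${scraping_cost(10_000):.2f} one-time")
```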

What Twitter Data Can You Scrape?

Data Type      | What You Get
Profiles       | Username, bio, followers, following, tweet count, verified status, join date
Tweets         | Text, likes, retweets, replies, quotes, media, timestamps
Search Results | Tweets matching keywords or hashtags
Followers      | List of accounts following a user
Following      | List of accounts a user follows
Trends         | Trending topics by location

All publicly available data—the same information anyone can see on twitter.com.

Method 1: Twitter Scraping API

The practical approach. Make HTTP requests, get JSON data. No rate limits to manage, no proxies needed.

Get a Twitter Profile

const API_KEY = 'your_api_key';

async function getTwitterProfile(username) {
  const response = await fetch(
    `https://api.sociavault.com/v1/scrape/twitter/profile?username=${username}`,
    {
      headers: {
        'Authorization': `Bearer ${API_KEY}`
      }
    }
  );
  
  return response.json();
}

// Usage
const profile = await getTwitterProfile('elonmusk');

console.log({
  name: profile.data.name,
  username: profile.data.username,
  followers: profile.data.followers_count,
  following: profile.data.following_count,
  tweets: profile.data.tweet_count,
  verified: profile.data.is_verified,
  bio: profile.data.description
});

/* Output:
{
  name: "Elon Musk",
  username: "elonmusk",
  followers: 170000000,
  following: 400,
  tweets: 35000,
  verified: true,
  bio: "..."
}
*/

Get User Tweets

async function getUserTweets(username, limit = 20) {
  const response = await fetch(
    `https://api.sociavault.com/v1/scrape/twitter/tweets?username=${username}&limit=${limit}`,
    {
      headers: { 'Authorization': `Bearer ${API_KEY}` }
    }
  );
  
  return response.json();
}

// Get latest tweets
const tweets = await getUserTweets('naval', 50);

tweets.data.forEach(tweet => {
  console.log({
    text: tweet.text.substring(0, 100),
    likes: tweet.favorite_count,
    retweets: tweet.retweet_count,
    replies: tweet.reply_count,
    date: tweet.created_at
  });
});

Search Twitter

async function searchTwitter(query, limit = 100) {
  const response = await fetch(
    `https://api.sociavault.com/v1/scrape/twitter/search?q=${encodeURIComponent(query)}&limit=${limit}`,
    {
      headers: { 'Authorization': `Bearer ${API_KEY}` }
    }
  );
  
  return response.json();
}

// Search for tweets about a topic
const results = await searchTwitter('artificial intelligence', 100);

console.log(`Found ${results.data.length} tweets about AI`);

// Get most engaged tweets
const topTweets = results.data
  .sort((a, b) => b.favorite_count - a.favorite_count)
  .slice(0, 10);

topTweets.forEach(t => {
  console.log(`${t.favorite_count} likes: ${t.text.substring(0, 80)}...`);
});

Get Hashtag Tweets

async function getHashtagTweets(hashtag, limit = 100) {
  const response = await fetch(
    `https://api.sociavault.com/v1/scrape/twitter/hashtag?tag=${hashtag}&limit=${limit}`,
    {
      headers: { 'Authorization': `Bearer ${API_KEY}` }
    }
  );
  
  return response.json();
}

// Monitor a hashtag
const tweets = await getHashtagTweets('buildinpublic', 200);

// Analyze hashtag performance
const totalLikes = tweets.data.reduce((sum, t) => sum + t.favorite_count, 0);
const avgLikes = totalLikes / tweets.data.length;

console.log(`#buildinpublic stats:`);
console.log(`  Average likes: ${avgLikes.toFixed(1)}`);
console.log(`  Total engagement: ${totalLikes.toLocaleString()}`);

Get Twitter Followers

async function getTwitterFollowers(username, limit = 100) {
  const response = await fetch(
    `https://api.sociavault.com/v1/scrape/twitter/followers?username=${username}&limit=${limit}`,
    {
      headers: { 'Authorization': `Bearer ${API_KEY}` }
    }
  );
  
  return response.json();
}

// Find influencers following an account
const followers = await getTwitterFollowers('ycombinator', 500);

const influencers = followers.data
  .filter(f => f.followers_count > 10000)
  .sort((a, b) => b.followers_count - a.followers_count);

console.log(`Influencers following @ycombinator:`);
influencers.slice(0, 20).forEach(f => {
  console.log(`  @${f.username} - ${f.followers_count.toLocaleString()} followers`);
});

Python Examples

import requests
import json
from datetime import datetime

API_KEY = 'your_api_key'
BASE_URL = 'https://api.sociavault.com/v1/scrape/twitter'

def get_profile(username):
    """Get Twitter profile data."""
    response = requests.get(
        f'{BASE_URL}/profile',
        params={'username': username},
        headers={'Authorization': f'Bearer {API_KEY}'}
    )
    return response.json()

def get_tweets(username, limit=20):
    """Get user's tweets."""
    response = requests.get(
        f'{BASE_URL}/tweets',
        params={'username': username, 'limit': limit},
        headers={'Authorization': f'Bearer {API_KEY}'}
    )
    return response.json()

def search_tweets(query, limit=100):
    """Search Twitter."""
    response = requests.get(
        f'{BASE_URL}/search',
        params={'q': query, 'limit': limit},
        headers={'Authorization': f'Bearer {API_KEY}'}
    )
    return response.json()

def get_followers(username, limit=100):
    """Get account followers."""
    response = requests.get(
        f'{BASE_URL}/followers',
        params={'username': username, 'limit': limit},
        headers={'Authorization': f'Bearer {API_KEY}'}
    )
    return response.json()


# Example: Competitor analysis
def analyze_competitor(username):
    """Analyze a Twitter competitor."""
    
    profile = get_profile(username)['data']
    tweets = get_tweets(username, 50)['data']
    
    # Calculate engagement metrics
    total_likes = sum(t['favorite_count'] for t in tweets)
    total_retweets = sum(t['retweet_count'] for t in tweets)
    total_replies = sum(t['reply_count'] for t in tweets)
    
    avg_engagement = (total_likes + total_retweets + total_replies) / len(tweets)
    engagement_rate = (avg_engagement / profile['followers_count']) * 100
    
    # Calculate posting frequency
    dates = [datetime.fromisoformat(t['created_at'].replace('Z', '+00:00')) for t in tweets]
    days_span = (dates[0] - dates[-1]).days or 1
    tweets_per_day = len(tweets) / days_span
    
    return {
        'username': username,
        'followers': profile['followers_count'],
        'avg_likes': total_likes / len(tweets),
        'avg_retweets': total_retweets / len(tweets),
        'engagement_rate': f"{engagement_rate:.3f}%",
        'tweets_per_day': round(tweets_per_day, 2)
    }


# Analyze multiple competitors
competitors = ['stripe', 'shopify', 'squarespace']

for comp in competitors:
    analysis = analyze_competitor(comp)
    print(f"\n@{analysis['username']}:")
    print(f"  Followers: {analysis['followers']:,}")
    print(f"  Avg likes: {analysis['avg_likes']:.0f}")
    print(f"  Engagement rate: {analysis['engagement_rate']}")
    print(f"  Tweets/day: {analysis['tweets_per_day']}")

Practical Use Cases

1. Twitter Lead Generation

Find potential customers mentioning problems you solve:

async function findLeads(painPoints, excludeKeywords = []) {
  const leads = [];
  
  for (const query of painPoints) {
    const results = await searchTwitter(query, 100);
    
    for (const tweet of results.data) {
      // Filter out competitors and spam
      const text = tweet.text.toLowerCase();
      if (excludeKeywords.some(kw => text.includes(kw))) continue;
      
      // Score based on engagement and recency
      const score = tweet.favorite_count + (tweet.retweet_count * 2);
      
      leads.push({
        username: tweet.user.username,
        tweet: tweet.text,
        followers: tweet.user.followers_count,
        score,
        url: `https://twitter.com/${tweet.user.username}/status/${tweet.id}`
      });
    }
  }
  
  // Return highest potential leads
  return leads
    .sort((a, b) => b.score - a.score)
    .slice(0, 50);
}

// Find people complaining about expensive APIs
const leads = await findLeads(
  ['"API too expensive"', '"need cheaper API"', '"API pricing is ridiculous"'],
  ['sponsor', 'ad', 'promoted']
);

leads.forEach(lead => {
  console.log(`@${lead.username} (${lead.followers} followers):`);
  console.log(`  ${lead.tweet.substring(0, 100)}...`);
  console.log(`  ${lead.url}\n`);
});

2. Brand Monitoring

Track mentions of your brand or competitors:

async function monitorBrand(brandName, competitors = []) {
  const allMentions = [];
  
  // Search for brand mentions
  const brandMentions = await searchTwitter(brandName, 200);
  
  for (const tweet of brandMentions.data) {
    const sentiment = analyzeSentiment(tweet.text);
    
    allMentions.push({
      brand: brandName,
      username: tweet.user.username,
      text: tweet.text,
      sentiment,
      engagement: tweet.favorite_count + tweet.retweet_count,
      date: tweet.created_at
    });
  }
  
  // Compare with competitors
  for (const competitor of competitors) {
    const mentions = await searchTwitter(competitor, 100);
    
    allMentions.push(...mentions.data.map(t => ({
      brand: competitor,
      username: t.user.username,
      text: t.text,
      sentiment: analyzeSentiment(t.text),
      engagement: t.favorite_count + t.retweet_count,
      date: t.created_at
    })));
  }
  
  // Generate report
  const report = {
    [brandName]: {
      total: allMentions.filter(m => m.brand === brandName).length,
      positive: allMentions.filter(m => m.brand === brandName && m.sentiment === 'positive').length,
      negative: allMentions.filter(m => m.brand === brandName && m.sentiment === 'negative').length
    }
  };
  
  competitors.forEach(c => {
    report[c] = {
      total: allMentions.filter(m => m.brand === c).length,
      positive: allMentions.filter(m => m.brand === c && m.sentiment === 'positive').length,
      negative: allMentions.filter(m => m.brand === c && m.sentiment === 'negative').length
    };
  });
  
  return report;
}

function analyzeSentiment(text) {
  const positive = /love|great|amazing|awesome|best|thank|excellent|perfect/i;
  const negative = /hate|terrible|awful|worst|bad|sucks|disappointed|broken/i;
  
  if (positive.test(text)) return 'positive';
  if (negative.test(text)) return 'negative';
  return 'neutral';
}

3. Export Tweets to CSV

import csv
from datetime import datetime

def export_tweets_to_csv(username, filename, limit=500):
    """Export user's tweets to CSV file."""
    
    tweets = get_tweets(username, limit)['data']
    
    with open(filename, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow([
            'id', 'date', 'text', 'likes', 'retweets', 
            'replies', 'quotes', 'has_media', 'url'
        ])
        
        for tweet in tweets:
            writer.writerow([
                tweet['id'],
                tweet['created_at'],
                tweet['text'].replace('\n', ' '),
                tweet['favorite_count'],
                tweet['retweet_count'],
                tweet['reply_count'],
                tweet.get('quote_count', 0),
                bool(tweet.get('media')),
                f"https://twitter.com/{username}/status/{tweet['id']}"
            ])
    
    print(f"Exported {len(tweets)} tweets to {filename}")

# Export tweets
export_tweets_to_csv('paulg', 'paulg_tweets.csv', 500)

4. Thread Extraction

async function extractThread(tweetUrl) {
  // Extract the tweet ID from the URL (dropping any trailing query string)
  const tweetId = tweetUrl.split('/').pop().split('?')[0];
  
  const response = await fetch(
    `https://api.sociavault.com/v1/scrape/twitter/tweet?id=${tweetId}`,
    { headers: { 'Authorization': `Bearer ${API_KEY}` } }
  );
  
  const tweet = await response.json();
  
  // If it's a thread, get all replies from the same author
  if (tweet.data.conversation_id) {
    const thread = await fetch(
      `https://api.sociavault.com/v1/scrape/twitter/thread?id=${tweet.data.conversation_id}`,
      { headers: { 'Authorization': `Bearer ${API_KEY}` } }
    );
    
    const threadData = await thread.json();
    
    // Return tweets in order
    return threadData.data
      .filter(t => t.user.username === tweet.data.user.username)
      .sort((a, b) => new Date(a.created_at) - new Date(b.created_at));
  }
  
  return [tweet.data];
}

// Extract and display a thread
const thread = await extractThread('https://twitter.com/naval/status/1234567890');

console.log(`Thread by @${thread[0].user.username}:\n`);
thread.forEach((tweet, i) => {
  console.log(`${i + 1}. ${tweet.text}\n`);
});

Method 2: DIY Twitter Scraping

If you want full control, you can scrape Twitter directly. But it's challenging:

const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');

puppeteer.use(StealthPlugin());

async function scrapeTweets(username, maxTweets = 50) {
  const browser = await puppeteer.launch({ headless: 'new' });
  const page = await browser.newPage();
  
  await page.setViewport({ width: 1366, height: 768 });
  
  try {
    await page.goto(`https://twitter.com/${username}`, {
      waitUntil: 'networkidle2'
    });
    
    const tweets = [];
    let previousHeight = 0;
    
    while (tweets.length < maxTweets) {
      // Extract tweets from page
      const newTweets = await page.evaluate(() => {
        const articles = document.querySelectorAll('article[data-testid="tweet"]');
        
        return Array.from(articles).map(article => {
          const text = article.querySelector('[data-testid="tweetText"]')?.innerText;
          const time = article.querySelector('time')?.getAttribute('datetime');
          
          return { text, time };
        });
      });
      
      // Add unique tweets
      newTweets.forEach(t => {
        if (t.text && !tweets.find(e => e.text === t.text)) {
          tweets.push(t);
        }
      });
      
      // Scroll down
      await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
      await new Promise(resolve => setTimeout(resolve, 2000)); // waitForTimeout was removed in newer Puppeteer
      
      const currentHeight = await page.evaluate(() => document.body.scrollHeight);
      if (currentHeight === previousHeight) break;
      previousHeight = currentHeight;
    }
    
    return tweets.slice(0, maxTweets);
    
  } finally {
    await browser.close();
  }
}

Why DIY is Hard

  • Login walls - Twitter shows limited content without auth
  • Rate limits - IP bans after ~50-100 requests
  • Dynamic content - React-based UI is hard to scrape
  • Anti-bot detection - Sophisticated fingerprinting
  • Constant changes - DOM structure changes frequently

Twitter API vs Scraping API Comparison

Feature             | Official API        | Scraping API
Cost for 10K tweets | $100+/month         | ~$10 one-time
Setup time          | Days (app approval) | Minutes
Rate limits         | Strict              | Flexible
Data freshness      | Real-time           | Near real-time
Historical tweets   | Limited             | Full access
Follower lists      | Limited             | Full access
Search capabilities | Limited             | Full access

Best Practices

  1. Cache aggressively - Store results to avoid re-scraping
  2. Respect rate limits - Add delays between requests
  3. Handle errors - Implement retry logic
  4. Stay updated - Twitter changes frequently
// Caching implementation
const cache = new Map();

async function getCachedTwitterData(endpoint, params, ttl = 3600000) {
  const cacheKey = `${endpoint}-${JSON.stringify(params)}`;
  
  if (cache.has(cacheKey)) {
    const { data, timestamp } = cache.get(cacheKey);
    if (Date.now() - timestamp < ttl) {
      return data;
    }
  }
  
  const response = await fetch(
    `https://api.sociavault.com/v1/scrape/twitter/${endpoint}?${new URLSearchParams(params)}`,
    { headers: { 'Authorization': `Bearer ${API_KEY}` } }
  );
  
  const data = await response.json();
  
  cache.set(cacheKey, { data, timestamp: Date.now() });
  
  return data;
}
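Items 2 and 3 above (delays and retries) can be combined into one helper. A minimal Python sketch with exponential backoff and jitter; the retry count and delays are illustrative defaults, not values required by any particular API:

```python
import random
import time

def fetch_with_retry(fetch_fn, max_retries=3, base_delay=1.0):
    """Call fetch_fn, retrying on failure with exponential backoff plus jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fetch_fn()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            # 1s, 2s, 4s, ... plus a little jitter to avoid thundering herds
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.25)
            time.sleep(delay)
```

Wrap any of the fetch helpers above, e.g. `fetch_with_retry(lambda: get_profile('naval'))`.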

Getting Started

  1. Sign up at sociavault.com/auth/sign-up
  2. Get 50 free credits to test
  3. Copy your API key
  4. Start scraping Twitter data
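If you want to sanity-check a request before writing any client code, the URL shape used throughout this guide is easy to assemble by hand (endpoint paths as shown in the examples above; treat the exact paths as illustrative):

```python
from urllib.parse import urlencode

BASE_URL = "https://api.sociavault.com/v1/scrape/twitter"

def build_url(endpoint, **params):
    """Assemble a request URL matching the endpoints used in this guide."""
    return f"{BASE_URL}/{endpoint}?{urlencode(params)}"

print(build_url("profile", username="naval"))
print(build_url("search", q="artificial intelligence", limit=100))
```

Send it with any HTTP client, adding the `Authorization: Bearer <key>` header shown in the earlier examples.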

No approval process, no $100/month fees, no rate limit headaches.


Frequently Asked Questions

Is Twitter scraping legal?

Yes, scraping publicly available Twitter data is legal. The hiQ Labs v. LinkedIn case established that public data isn't protected by the CFAA. However, never bypass login walls or scrape private accounts.

What's the difference between Twitter's API and a scraping API?

Twitter's official API has strict rate limits and costs $100-5,000/month. A scraping API like SociaVault extracts the same public data via web scraping, with pay-per-request pricing and no rate limits.

Can I scrape Twitter followers?

Yes. You can scrape follower lists from any public Twitter account. This includes their usernames, bios, follower counts, and verification status.

How much does Twitter scraping cost?

With SociaVault, about $0.001 per request. Scraping 10,000 tweets costs roughly $10 vs $100+/month with Twitter's official API.




Ready to Try SociaVault?

Start extracting social media data with our powerful API. No credit card required.