Twitter/X Community Scraper: Extract Posts and Members from Any Community

TL;DR: Twitter/X Communities are niche groups where highly engaged users discuss specific topics — making them goldmines for audience research, lead generation, and competitor monitoring. The SociaVault API provides two endpoints to extract community metadata and tweets, giving you structured access to this data without the complexity of the official X API.

Twitter/X Communities are one of the platform's most underutilized data sources. Unlike the main feed — which is algorithmically curated and noisy — Communities are self-selected groups of people who are genuinely interested in a specific topic. A community about indie game development, DTC e-commerce, or quantitative trading contains exactly the kind of engaged, niche audience that's valuable for research, outreach, and competitive intelligence.

The problem: the official X API has become expensive and restrictive, and Communities data in particular is hard to access programmatically. This guide shows you how to use the SociaVault API to extract community data and tweets with clean Python code.

What Are Twitter/X Communities?

Twitter/X Communities are invite-based or open groups organized around a specific topic. Members can post within the community, and those posts are visible to other members and (for public communities) to anyone on the platform.

Communities differ from regular Twitter in a few important ways:

Self-selected membership. People join communities because they're genuinely interested in the topic — not because an algorithm showed them content.
Higher signal-to-noise ratio. Posts in a community are on-topic by design. A DTC founders community will have posts about e-commerce, not random viral content.
Concentrated expertise. Niche communities often contain domain experts, practitioners, and enthusiasts who post substantive content.
Engagement quality. Replies and interactions within communities tend to be more substantive than on the main feed.

For researchers, marketers, and developers, this makes Communities a high-quality data source for understanding a specific audience.

The Two Community Endpoints

SociaVault provides two endpoints for Twitter/X Communities:

Endpoint	Path	What it returns
Community Info	`/v1/scrape/twitter/community?id=COMMUNITY_ID`	Community metadata, member count, description
Community Tweets	`/v1/scrape/twitter/community-tweets?id=COMMUNITY_ID`	Posts from the community feed

Both require your API key in the x-api-key header.
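
As a small illustration of the header requirement, the sketch below builds the request headers from an environment variable instead of a hard-coded string. The variable name `SOCIAVAULT_API_KEY` and the helper `build_headers` are assumptions made for this example, not names defined by SociaVault.

```python
import os

# Hypothetical environment variable name, chosen for this example only.
API_KEY_ENV_VAR = "SOCIAVAULT_API_KEY"

def build_headers() -> dict:
    """Return the x-api-key header, reading the key from the environment."""
    api_key = os.environ.get(API_KEY_ENV_VAR)
    if not api_key:
        raise RuntimeError(f"Set the {API_KEY_ENV_VAR} environment variable first")
    return {"x-api-key": api_key}
```

The returned dict can be passed as the `headers` argument of `requests.get`, which keeps the key out of source control.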

Finding a Community ID

The community ID is in the URL when you visit a community on X. For example: https://twitter.com/i/communities/1234567890123456789

The ID is 1234567890123456789.
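
If you collect community links in bulk, a short helper can pull the ID out of each URL. This is a minimal sketch: the function name `extract_community_id` is hypothetical, and it assumes community links on the x.com domain use the same `/i/communities/` path as the twitter.com example above.

```python
import re
from typing import Optional

# Matches the numeric ID in a community URL on twitter.com or (assumed) x.com.
_COMMUNITY_URL = re.compile(r"(?:twitter\.com|x\.com)/i/communities/(\d+)")

def extract_community_id(url: str) -> Optional[str]:
    """Return the community ID found in a community URL, or None if absent."""
    match = _COMMUNITY_URL.search(url)
    return match.group(1) if match else None

print(extract_community_id("https://twitter.com/i/communities/1234567890123456789"))
```

The ID is kept as a string because that is how it is passed to the API as a query parameter.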

Endpoint 1: Community Info

The community endpoint returns metadata about a community — its name, description, member count, creation date, and admin information.

import requests

API_KEY = "your_api_key_here"
BASE_URL = "https://api.sociavault.com"

def get_community(community_id: str) -> dict:
    """Fetch metadata for a Twitter/X Community."""
    response = requests.get(
        f"{BASE_URL}/v1/scrape/twitter/community",
        params={"id": community_id},
        headers={"x-api-key": API_KEY},
        timeout=15
    )
    response.raise_for_status()
    return response.json()

# Example usage
community = get_community("1234567890123456789")
print(f"Name: {community['name']}")
print(f"Members: {community['member_count']:,}")
print(f"Description: {community['description']}")
print(f"Created: {community['created_at']}")

Sample response:

{
  "id": "1234567890123456789",
  "name": "DTC Founders",
  "description": "A community for direct-to-consumer brand founders to share learnings, ask questions, and connect.",
  "member_count": 8420,
  "post_count": 14203,
  "created_at": "2022-09-15T10:00:00Z",
  "admin": {
    "id": "987654321",
    "username": "dtcfounder",
    "display_name": "DTC Founder",
    "followers_count": 24500
  },
  "rules": [
    "No self-promotion without context",
    "Be respectful and constructive"
  ],
  "is_public": true
}
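
The metadata fields can be combined into rough activity ratios when comparing communities. The sketch below assumes the response has the `member_count`, `post_count`, and `created_at` fields shown in the sample; the helper `summarize_community` and the ratios it reports are illustrative choices, not part of the API.

```python
from datetime import datetime, timezone
from typing import Optional

def summarize_community(info: dict, now: Optional[datetime] = None) -> dict:
    """Derive simple activity ratios from a community info response."""
    now = now or datetime.now(timezone.utc)
    # Older Python versions reject a trailing "Z" in fromisoformat(), so swap it for an offset.
    created = datetime.fromisoformat(info["created_at"].replace("Z", "+00:00"))
    age_days = max((now - created).days, 1)  # avoid dividing by zero for brand-new communities
    members = info["member_count"]
    posts = info["post_count"]
    return {
        "age_days": age_days,
        "posts_per_member": round(posts / members, 2) if members else 0.0,
        "posts_per_day": round(posts / age_days, 2),
    }

# Values copied from the sample response above.
sample = {
    "member_count": 8420,
    "post_count": 14203,
    "created_at": "2022-09-15T10:00:00Z",
}
print(summarize_community(sample))
```

A high posts-per-member ratio suggests a small, active core, while a low one suggests many passive members.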

Endpoint 2: Community Tweets

The community tweets endpoint returns posts from the community feed, including the tweet text, author info, engagement metrics, and media.

def get_community_tweets(community_id: str, cursor: str = None) -> dict:
    """Fetch tweets from a Twitter/X Community."""
    params = {"id": community_id}
    if cursor:
        params["cursor"] = cursor

    response = requests.get(
        f"{BASE_URL}/v1/scrape/twitter/community-tweets",
        params=params,
        headers={"x-api-key": API_KEY},
        timeout=15
    )
    response.raise_for_status()
    return response.json()

# Example usage
data = get_community_tweets("1234567890123456789")
for tweet in data["tweets"]:
    print(f"@{tweet['author']['username']}: {tweet['text'][:100]}...")
    print(f"  Likes: {tweet['like_count']} | Replies: {tweet['reply_count']} | Reposts: {tweet['repost_count']}")

Sample tweet object:

{
  "id": "1789012345678901234",
  "text": "Just hit $100k MRR with our DTC skincare brand. Happy to share what worked — AMA in the comments.",
  "author": {
    "id": "111222333",
    "username": "skincarefounder",
    "display_name": "Sarah Chen",
    "followers_count": 3200,
    "verified": false,
    "profile_image_url": "https://pbs.twimg.com/profile_images/..."
  },
  "like_count": 284,
  "reply_count": 47,
  "repost_count": 31,
  "quote_count": 8,
  "posted_at": "2026-05-20T16:45:00Z",
  "media": [],
  "urls": [],
  "hashtags": [],
  "mentions": []
}
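
For analysis in a spreadsheet or a dataframe, it can help to flatten each tweet object into a single row. The following is a minimal sketch that assumes the field names shown in the sample object; the helpers `tweet_to_row` and `write_tweets_csv` and the column names are hypothetical.

```python
import csv
import io

FIELDS = ["id", "username", "posted_at", "likes", "replies",
          "reposts", "quotes", "engagement", "text"]

def tweet_to_row(tweet: dict) -> dict:
    """Flatten one tweet object into a single-level dict."""
    likes = tweet.get("like_count", 0)
    replies = tweet.get("reply_count", 0)
    reposts = tweet.get("repost_count", 0)
    quotes = tweet.get("quote_count", 0)
    return {
        "id": tweet["id"],
        "username": tweet["author"]["username"],
        "posted_at": tweet.get("posted_at", ""),
        "likes": likes,
        "replies": replies,
        "reposts": reposts,
        "quotes": quotes,
        "engagement": likes + replies + reposts + quotes,
        "text": tweet["text"],
    }

def write_tweets_csv(tweets: list, handle) -> None:
    """Write tweets as CSV rows to any writable text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for tweet in tweets:
        writer.writerow(tweet_to_row(tweet))

# Shortened copy of the sample tweet object above.
sample_tweet = {
    "id": "1789012345678901234",
    "text": "Just hit $100k MRR with our DTC skincare brand.",
    "author": {"username": "skincarefounder"},
    "like_count": 284, "reply_count": 47, "repost_count": 31, "quote_count": 8,
    "posted_at": "2026-05-20T16:45:00Z",
}
buffer = io.StringIO()
write_tweets_csv([sample_tweet], buffer)
print(buffer.getvalue())
```

To write to disk, pass a handle from `open("tweets.csv", "w", newline="", encoding="utf-8")` in place of the in-memory buffer.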

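Paginating Through Community Tweets

When more results are available, the response includes a cursor value. Pass it back as the cursor parameter to request the next page, and stop when no cursor is returned.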

import time

def get_all_community_tweets(community_id: str, max_tweets: int = 200) -> list:
    """Fetch up to max_tweets tweets from a community, following the cursor."""
    all_tweets = []
    cursor = None

    while len(all_tweets) < max_tweets:
        data = get_community_tweets(community_id, cursor=cursor)
        tweets = data.get("tweets", [])
        if not tweets:
            break  # An empty page means there is nothing more to collect
        all_tweets.extend(tweets)

        cursor = data.get("cursor")
        if not cursor:
            break  # No cursor means this was the last page

        time.sleep(0.5)  # Pause between pages to stay within rate limits

    return all_tweets[:max_tweets]
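
The same cursor loop can also be written as a generator, which lets the caller stop early without holding every page in memory. This is a sketch under stated assumptions: `iter_community_tweets` and the `fetch_page` callable are hypothetical names, and the response is assumed to carry `tweets` and `cursor` keys as in the examples above. A fake fetcher stands in for the API so the block runs offline.

```python
from typing import Callable, Iterator, Optional

def iter_community_tweets(
    fetch_page: Callable[[str, Optional[str]], dict],
    community_id: str,
    max_pages: int = 50,
) -> Iterator[dict]:
    """Yield tweets page by page until the cursor runs out or repeats."""
    cursor: Optional[str] = None
    seen_cursors = set()
    for _ in range(max_pages):
        data = fetch_page(community_id, cursor)
        tweets = data.get("tweets", [])
        if not tweets:
            return  # empty page: nothing more to read
        yield from tweets
        cursor = data.get("cursor")
        if not cursor or cursor in seen_cursors:
            return  # last page, or a repeated cursor that would loop forever
        seen_cursors.add(cursor)

# Offline stand-in for get_community_tweets, returning three canned pages.
def fake_fetch(community_id: str, cursor: Optional[str]) -> dict:
    pages = {
        None: {"tweets": [{"id": "1"}, {"id": "2"}], "cursor": "a"},
        "a": {"tweets": [{"id": "3"}], "cursor": "b"},
        "b": {"tweets": [{"id": "4"}], "cursor": None},
    }
    return pages[cursor]

print([t["id"] for t in iter_community_tweets(fake_fetch, "123")])
```

With the real client, `fetch_page` could be `lambda cid, cur: get_community_tweets(cid, cursor=cur)`, and `itertools.islice` can cap the number of tweets consumed.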

Use Case 1: Niche Audience Research

Understanding what a niche audience talks about, what questions they ask, and what content resonates with them is foundational for content strategy, product development, and marketing.

from collections import Counter
import re

def analyze_community(community_id: str, tweet_count: int = 200):
    """Analyze a community's content patterns."""
    tweets = get_all_community_tweets(community_id, max_tweets=tweet_count)

    if not tweets:
        print("No tweets found.")
        return

    # Extract all words (simple tokenization)
    all_words = []
    for tweet in tweets:
        words = re.findall(r'\b[a-zA-Z]{4,}\b', tweet["text"].lower())
        all_words.extend(words)

    # Remove common stop words
    stop_words = {"that", "this", "with", "have", "from", "they", "will", "been",
                  "your", "what", "when", "just", "like", "more", "some", "also"}
    filtered_words = [w for w in all_words if w not in stop_words]

    # Top terms
    top_terms = Counter(filtered_words).most_common(20)

    # Top authors by engagement
    author_engagement = {}
    for tweet in tweets:
        username = tweet["author"]["username"]
        engagement = tweet["like_count"] + tweet["reply_count"] + tweet["repost_count"]
        author_engagement[username] = author_engagement.get(username, 0) + engagement

    top_authors = sorted(author_engagement.items(), key=lambda x: x[1], reverse=True)[:10]

    # High-engagement tweets, ranked by the same likes + replies + reposts total used above
    def engagement_of(t):
        return t["like_count"] + t["reply_count"] + t["repost_count"]

    top_tweets = sorted(tweets, key=engagement_of, reverse=True)[:5]

    print(f"\n=== Community Analysis ({len(tweets)} tweets) ===")
    print(f"\nTop 20 terms: {', '.join(f'{w}({c})' for w, c in top_terms)}")
    print("\nTop authors by engagement:")
    for author, eng in top_authors:
        print(f"  @{author}: {eng} total engagement")
    print("\nTop 5 tweets by engagement:")
    for t in top_tweets:
        print(f"  [{engagement_of(t)} eng] @{t['author']['username']}: {t['text'][:80]}...")

analyze_community("1234567890123456789", tweet_count=200)

Use Case 2: Competitor Community Monitoring

If a competitor has their own community, or if there's a community where your competitors are active, monitoring it gives you intelligence on their messaging, customer pain points, and product feedback.

def monitor_competitor_community(community_id: str, competitor_handles: list) -> list:
    """Find tweets from competitor accounts in a community."""
    tweets = get_all_community_tweets(community_id, max_tweets=500)
    # Lowercase the handles once so the membership check is case-insensitive
    handles = {h.lower() for h in competitor_handles}
    competitor_tweets = [
        t for t in tweets
        if t["author"]["username"].lower() in handles
    ]

    print(f"Found {len(competitor_tweets)} tweets from competitors:")
    for tweet in competitor_tweets:
        print(f"\n@{tweet['author']['username']} ({tweet['posted_at'][:10]}):")
        print(f"  {tweet['text']}")
        print(f"  Likes: {tweet['like_count']} | Replies: {tweet['reply_count']}")

    return competitor_tweets

# Monitor a community for activity from specific competitors
monitor_competitor_community(
    "1234567890123456789",
    competitor_handles=["competitor1", "competitor2"]
)

Use Case 3: Lead Generation from Niche Communities

Communities are self-selected audiences. A community about e-commerce founders contains e-commerce founders. A community about real estate investing contains real estate investors. This makes them excellent sources for B2B lead generation.

def extract_leads(community_id: str, min_followers: int = 500, min_engagement: int = 10) -> list:
    """
    Extract potential leads from a community — active members with meaningful followings.
    """
    tweets = get_all_community_tweets(community_id, max_tweets=300)

    # Deduplicate by author
    seen_authors = set()
    leads = []

    for tweet in tweets:
        author = tweet["author"]
        username = author["username"]

        # Skip authors already recorded as leads
        if username in seen_authors:
            continue

        # Filter by follower count and engagement
        total_engagement = tweet["like_count"] + tweet["reply_count"] + tweet["repost_count"]
        if author["followers_count"] >= min_followers and total_engagement >= min_engagement:
            # Mark the author as seen only once a tweet qualifies, so a
            # low-engagement first tweet does not hide a later qualifying one
            seen_authors.add(username)
            leads.append({
                "username": username,
                "display_name": author["display_name"],
                "followers": author["followers_count"],
                "sample_tweet": tweet["text"][:100],
                "engagement": total_engagement,
                "profile_url": f"https://twitter.com/{username}"
            })

    # Sort by followers
    leads.sort(key=lambda x: x["followers"], reverse=True)

    print(f"Found {len(leads)} potential leads:")
    for lead in leads[:20]:
        print(f"  @{lead['username']} ({lead['followers']:,} followers) — {lead['sample_tweet']}...")

    return leads

leads = extract_leads("1234567890123456789", min_followers=1000, min_engagement=5)

Use Case 4: Content Ideas from High-Engagement Posts

The highest-engagement posts in a niche community tell you exactly what topics resonate with that audience. This is a goldmine for content strategy.

def find_content_ideas(community_id: str, top_n: int = 20) -> list:
    """Find the highest-engagement posts in a community for content inspiration."""
    tweets = get_all_community_tweets(community_id, max_tweets=300)

    # Score by engagement
    scored = [
        {
            "text": t["text"],
            "author": t["author"]["username"],
            "score": t["like_count"] * 2 + t["reply_count"] * 3 + t["repost_count"],
            "likes": t["like_count"],
            "replies": t["reply_count"],
            "url": f"https://twitter.com/{t['author']['username']}/status/{t['id']}"
        }
        for t in tweets
    ]

    top = sorted(scored, key=lambda x: x["score"], reverse=True)[:top_n]

    print(f"\nTop {len(top)} content ideas from community:")
    for i, post in enumerate(top, 1):
        print(f"\n{i}. Score: {post['score']} | @{post['author']}")
        print(f"   {post['text'][:150]}")

    return top

find_content_ideas("1234567890123456789", top_n=10)

Frequently Asked Questions

What is a Twitter/X Community ID and how do I find it?

The community ID is the number in the URL when you visit a community: twitter.com/i/communities/COMMUNITY_ID. You can find communities by searching on X or by following links shared by community members.

Does this work for private communities?

No. The API only accesses publicly visible content. Private communities require membership to view, and their content is not accessible.
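
When working through a list of communities, one option is to check the `is_public` flag from the info response before requesting tweets. This sketch assumes the info endpoint still returns metadata with `is_public` set to false for a private community, which the source does not confirm; `public_community_ids` is a hypothetical helper.

```python
def public_community_ids(communities: list) -> list:
    """Keep only the IDs of communities whose metadata marks them as public."""
    return [c["id"] for c in communities if c.get("is_public") is True]

# Hand-written metadata dicts standing in for community info responses.
infos = [
    {"id": "111", "is_public": True},
    {"id": "222", "is_public": False},
    {"id": "333"},  # flag missing: treated as not public
]
print(public_community_ids(infos))
```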

How many tweets can I fetch from a community?

This depends on the community's activity level and how far back the pagination goes. Active communities may have thousands of accessible tweets. The max_tweets parameter in the examples above lets you control how many you collect.

Can I get a list of all community members?

The community info endpoint returns the member count and admin information. A full member list is not available through the current API endpoints.

How is this different from searching Twitter for a hashtag?

Hashtag searches return any tweet using that hashtag — including spam, bots, and off-topic content. Community tweets are from self-selected members who joined specifically because of their interest in the topic. The signal quality is significantly higher.

What's the rate limit for these endpoints?

Rate limits depend on your SociaVault plan. Free tier: 100 requests/day. Starter: 1,000/day. Pro: 10,000/day. Each request returns one page of tweets; use the cursor for pagination.
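
If you run close to your plan's limit, a retry wrapper with exponential backoff can smooth over temporary rejections. The sketch below assumes the API signals rate limiting with HTTP status 429, which is a common convention but is not stated in this guide; `call_with_backoff` is a hypothetical helper. It inspects the `response.status_code` attribute that `requests.HTTPError` carries after `raise_for_status()`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying with exponential delays when it raises an HTTP 429 error."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            # requests.HTTPError exposes the response; other errors will not match.
            status = getattr(getattr(exc, "response", None), "status_code", None)
            if status != 429 or attempt == max_attempts - 1:
                raise  # not a rate-limit error, or out of attempts
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s with the defaults
    raise RuntimeError("max_attempts must be at least 1")
```

A call such as `call_with_backoff(lambda: get_community_tweets("1234567890123456789"))` would then retry only on rate-limit errors and re-raise everything else immediately.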

Get started with a free account at sociavault.com — 50 free credits, no credit card required.
