Twitter/X Community Scraper: Extract Posts and Metadata from Any Public Community
TL;DR: Twitter/X Communities are niche groups where highly engaged users discuss specific topics — making them goldmines for audience research, lead generation, and competitor monitoring. The SociaVault API provides two endpoints to extract community metadata and tweets, giving you structured access to this data without the complexity of the official X API.
Twitter/X Communities are one of the platform's most underutilized data sources. Unlike the main feed — which is algorithmically curated and noisy — Communities are self-selected groups of people who are genuinely interested in a specific topic. A community about indie game development, DTC e-commerce, or quantitative trading contains exactly the kind of engaged, niche audience that's valuable for research, outreach, and competitive intelligence.
The problem: the official X API has become expensive and restrictive, and Communities data in particular is hard to access programmatically. This guide shows you how to use the SociaVault API to extract community data and tweets with clean Python code.
What Are Twitter/X Communities?
Twitter/X Communities are invite-based or open groups organized around a specific topic. Members can post within the community, and those posts are visible to other members and (for public communities) to anyone on the platform.
Communities differ from regular Twitter in a few important ways:
- Self-selected membership. People join communities because they're genuinely interested in the topic — not because an algorithm showed them content.
- Higher signal-to-noise ratio. Posts in a community are on-topic by design. A DTC founders community will have posts about e-commerce, not random viral content.
- Concentrated expertise. Niche communities often contain domain experts, practitioners, and enthusiasts who post substantive content.
- Engagement quality. Replies and interactions within communities tend to be more substantive than on the main feed.
For researchers, marketers, and developers, this makes Communities a high-quality data source for understanding a specific audience.
The Two Community Endpoints
SociaVault provides two endpoints for Twitter/X Communities:
| Endpoint | Path | What it returns |
|---|---|---|
| Community Info | /v1/scrape/twitter/community?id=COMMUNITY_ID | Community metadata, member count, description |
| Community Tweets | /v1/scrape/twitter/community-tweets?id=COMMUNITY_ID | Posts from the community feed |
Both require your API key in the x-api-key header.
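The snippets in this guide hard-code the key for brevity, which makes it easy to leak through version control. A minimal sketch of reading it from the environment instead is shown here; the variable name `SOCIAVAULT_API_KEY` and the helper `build_headers` are illustrative choices for this example, not something SociaVault prescribes.

```python
import os


def build_headers(env_var: str = "SOCIAVAULT_API_KEY") -> dict:
    """Build request headers from an API key stored in the environment.

    The environment variable name is an assumption made for this example.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    # The x-api-key header name comes from the guide itself.
    return {"x-api-key": api_key}
```

The returned dictionary can be passed as the `headers` argument of `requests.get` in place of the inline `{"x-api-key": API_KEY}`.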
Finding a Community ID
The community ID is in the URL when you visit a community on X. For example:
https://twitter.com/i/communities/1234567890123456789
The ID is 1234567890123456789.
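If you collect community links rather than raw IDs, a small helper can pull the ID out of the URL. This is a sketch that assumes the `/i/communities/<digits>` path shape shown above; `extract_community_id` is a hypothetical helper name, not part of any API.

```python
import re
from typing import Optional

# Matches the numeric ID in URLs such as
# https://twitter.com/i/communities/1234567890123456789
_COMMUNITY_URL_RE = re.compile(r"/i/communities/([0-9]+)")


def extract_community_id(url_or_id: str) -> Optional[str]:
    """Return the community ID from a community URL, or None if absent.

    A bare numeric string is treated as an ID and returned unchanged.
    """
    candidate = url_or_id.strip()
    if re.fullmatch(r"[0-9]+", candidate):
        return candidate
    match = _COMMUNITY_URL_RE.search(candidate)
    return match.group(1) if match else None
```

The ID is kept as a string throughout, matching how the examples in this guide pass it to the API.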
Endpoint 1: Community Info
The community endpoint returns metadata about a community — its name, description, member count, creation date, and admin information.
import requests

API_KEY = "your_api_key_here"
BASE_URL = "https://api.sociavault.com"


def get_community(community_id: str) -> dict:
    """Fetch metadata for a Twitter/X Community."""
    response = requests.get(
        f"{BASE_URL}/v1/scrape/twitter/community",
        params={"id": community_id},
        headers={"x-api-key": API_KEY},
        timeout=15,
    )
    response.raise_for_status()
    return response.json()


# Example usage
community = get_community("1234567890123456789")
print(f"Name: {community['name']}")
print(f"Members: {community['member_count']:,}")
print(f"Description: {community['description']}")
print(f"Created: {community['created_at']}")
Sample response:
{
  "id": "1234567890123456789",
  "name": "DTC Founders",
  "description": "A community for direct-to-consumer brand founders to share learnings, ask questions, and connect.",
  "member_count": 8420,
  "post_count": 14203,
  "created_at": "2022-09-15T10:00:00Z",
  "admin": {
    "id": "987654321",
    "username": "dtcfounder",
    "display_name": "DTC Founder",
    "followers_count": 24500
  },
  "rules": [
    "No self-promotion without context",
    "Be respectful and constructive"
  ],
  "is_public": true
}
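The raw counts become more useful once they are turned into rates. The sketch below derives a community's age and posting rates from the fields in the sample response; it assumes `created_at` is an ISO 8601 UTC timestamp as in the sample, and `summarize_community` is a hypothetical helper. Real responses may omit fields or name them differently.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2022-09-15T10:00:00Z."""
    # fromisoformat() only accepts a trailing "Z" on Python 3.11+,
    # so normalise it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_community(info: dict, now: Optional[datetime] = None) -> dict:
    """Derive simple activity rates from a community info response."""
    now = now or datetime.now(timezone.utc)
    created = parse_timestamp(info["created_at"])
    age_days = max((now - created).days, 1)  # avoid dividing by zero
    members = info.get("member_count") or 0
    posts = info.get("post_count") or 0
    return {
        "age_days": age_days,
        "posts_per_day": posts / age_days,
        "posts_per_member": posts / members if members else 0.0,
    }
```

A high `posts_per_member` alongside a low `posts_per_day` usually points to a small but active core, which is worth knowing before you decide how many tweets to pull.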
Endpoint 2: Community Tweets
The community tweets endpoint returns posts from the community feed, including the tweet text, author info, engagement metrics, and media.
from typing import Optional


def get_community_tweets(community_id: str, cursor: Optional[str] = None) -> dict:
    """Fetch one page of tweets from a Twitter/X Community."""
    params = {"id": community_id}
    if cursor:
        params["cursor"] = cursor

    response = requests.get(
        f"{BASE_URL}/v1/scrape/twitter/community-tweets",
        params=params,
        headers={"x-api-key": API_KEY},
        timeout=15,
    )
    response.raise_for_status()
    return response.json()


# Example usage
data = get_community_tweets("1234567890123456789")
for tweet in data["tweets"]:
    print(f"@{tweet['author']['username']}: {tweet['text'][:100]}...")
    print(f"  Likes: {tweet['like_count']} | Replies: {tweet['reply_count']} | Reposts: {tweet['repost_count']}")
Sample tweet object:
{
  "id": "1789012345678901234",
  "text": "Just hit $100k MRR with our DTC skincare brand. Happy to share what worked — AMA in the comments.",
  "author": {
    "id": "111222333",
    "username": "skincarefounder",
    "display_name": "Sarah Chen",
    "followers_count": 3200,
    "verified": false,
    "profile_image_url": "https://pbs.twimg.com/profile_images/..."
  },
  "like_count": 284,
  "reply_count": 47,
  "repost_count": 31,
  "quote_count": 8,
  "posted_at": "2026-05-20T16:45:00Z",
  "media": [],
  "urls": [],
  "hashtags": [],
  "mentions": []
}
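Several of the use cases in this guide add up the engagement counters by hand. A small helper keeps that arithmetic in one place and tolerates missing fields. This is a sketch based only on the field names in the sample object; `total_engagement` is a hypothetical name, and whether to count quotes is a choice you should make once and apply consistently.

```python
def total_engagement(tweet: dict, include_quotes: bool = True) -> int:
    """Sum the engagement counters on a tweet, treating missing ones as 0."""
    fields = ["like_count", "reply_count", "repost_count"]
    if include_quotes:
        fields.append("quote_count")
    return sum(int(tweet.get(field) or 0) for field in fields)


# Counters copied from the sample tweet object
sample_tweet = {"like_count": 284, "reply_count": 47, "repost_count": 31, "quote_count": 8}
print(total_engagement(sample_tweet))                        # 284 + 47 + 31 + 8 = 370
print(total_engagement(sample_tweet, include_quotes=False))  # 284 + 47 + 31 = 362
```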
Paginating Through Community Tweets
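Each response carries a cursor when more pages are available. Pass it back on the next request, and stop once the cursor is missing or a page comes back empty.

import time


def get_all_community_tweets(community_id: str, max_tweets: int = 200) -> list:
    """Fetch up to max_tweets tweets from a community, following the cursor."""
    all_tweets = []
    cursor = None

    while len(all_tweets) < max_tweets:
        data = get_community_tweets(community_id, cursor=cursor)
        tweets = data.get("tweets", [])
        if not tweets:
            break  # An empty page means there is nothing left to fetch
        all_tweets.extend(tweets)

        cursor = data.get("cursor")
        if not cursor:
            break

        time.sleep(0.5)  # Respect rate limits

    return all_tweets[:max_tweets]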
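When you want to process tweets as they arrive instead of building one large list, the same cursor loop can be written as a generator. The sketch below takes the page-fetching function as an argument so it can be exercised without network access; `iter_community_tweets`, the `max_pages` guard, and the de-duplication by `id` are additions for this example, not documented API behavior.

```python
import time
from typing import Callable, Iterator, Optional


def iter_community_tweets(
    fetch_page: Callable[[Optional[str]], dict],
    max_pages: int = 50,
    delay: float = 0.5,
) -> Iterator[dict]:
    """Yield tweets page by page, skipping IDs that were already yielded."""
    cursor = None
    seen_ids = set()
    for _ in range(max_pages):  # hard cap in case the cursor never runs out
        data = fetch_page(cursor)
        tweets = data.get("tweets", [])
        if not tweets:
            return
        for tweet in tweets:
            tweet_id = tweet.get("id")
            if tweet_id is not None and tweet_id in seen_ids:
                continue  # the same tweet appeared on an earlier page
            seen_ids.add(tweet_id)
            yield tweet
        cursor = data.get("cursor")
        if not cursor:
            return
        time.sleep(delay)  # pause between pages to respect rate limits
```

To use it with the function defined earlier, pass a lambda that binds the community ID: `iter_community_tweets(lambda c: get_community_tweets("1234567890123456789", cursor=c))`.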
Use Case 1: Niche Audience Research
Understanding what a niche audience talks about, what questions they ask, and what content resonates with them is foundational for content strategy, product development, and marketing.
from collections import Counter
import re


def analyze_community(community_id: str, tweet_count: int = 200):
    """Analyze a community's content patterns."""
    tweets = get_all_community_tweets(community_id, max_tweets=tweet_count)
    if not tweets:
        print("No tweets found.")
        return

    # Extract all words (simple tokenization)
    all_words = []
    for tweet in tweets:
        words = re.findall(r'\b[a-zA-Z]{4,}\b', tweet["text"].lower())
        all_words.extend(words)

    # Remove common stop words
    stop_words = {"that", "this", "with", "have", "from", "they", "will", "been",
                  "your", "what", "when", "just", "like", "more", "some", "also"}
    filtered_words = [w for w in all_words if w not in stop_words]

    # Top terms
    top_terms = Counter(filtered_words).most_common(20)

    # Top authors by engagement
    author_engagement = {}
    for tweet in tweets:
        username = tweet["author"]["username"]
        engagement = tweet["like_count"] + tweet["reply_count"] + tweet["repost_count"]
        author_engagement[username] = author_engagement.get(username, 0) + engagement
    top_authors = sorted(author_engagement.items(), key=lambda x: x[1], reverse=True)[:10]

    # High-engagement tweets
    top_tweets = sorted(tweets, key=lambda t: t["like_count"] + t["reply_count"], reverse=True)[:5]

    print(f"\n=== Community Analysis ({len(tweets)} tweets) ===")
    print(f"\nTop 20 terms: {', '.join(f'{w}({c})' for w, c in top_terms)}")
    print("\nTop authors by engagement:")
    for author, eng in top_authors:
        print(f"  @{author}: {eng} total engagement")
    print("\nTop 5 tweets by engagement:")
    for t in top_tweets:
        total = t["like_count"] + t["reply_count"]
        print(f"  [{total} eng] @{t['author']['username']}: {t['text'][:80]}...")


analyze_community("1234567890123456789", tweet_count=200)
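Term counts show what a community talks about, but not what it asks. A rough way to surface questions is to keep tweets whose text contains a question mark and rank them by replies. This is a heuristic sketch, not a classifier: `find_questions` is a hypothetical helper, and stripping URLs first is only there to avoid matching the `?` in query strings.

```python
import re

_URL_RE = re.compile(r"https?://\S+")


def find_questions(tweets: list, min_replies: int = 0) -> list:
    """Return tweets that look like questions, most-replied first."""
    questions = []
    for tweet in tweets:
        # Drop links so a "?" inside a URL does not count as a question
        text = _URL_RE.sub("", tweet.get("text", ""))
        if "?" in text and tweet.get("reply_count", 0) >= min_replies:
            questions.append(tweet)
    return sorted(questions, key=lambda t: t.get("reply_count", 0), reverse=True)
```

Feed it the list returned by `get_all_community_tweets` and raise `min_replies` to focus on questions that actually started a discussion.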
Use Case 2: Competitor Community Monitoring
If a competitor has their own community, or if there's a community where your competitors are active, monitoring it gives you intelligence on their messaging, customer pain points, and product feedback.
def monitor_competitor_community(community_id: str, competitor_handles: list) -> list:
    """Find tweets from competitor accounts in a community."""
    tweets = get_all_community_tweets(community_id, max_tweets=500)

    # Normalize the handles once instead of once per tweet
    handles = {h.lower() for h in competitor_handles}
    competitor_tweets = [
        t for t in tweets
        if t["author"]["username"].lower() in handles
    ]

    print(f"Found {len(competitor_tweets)} tweets from competitors:")
    for tweet in competitor_tweets:
        print(f"\n@{tweet['author']['username']} ({tweet['posted_at'][:10]}):")
        print(f"  {tweet['text']}")
        print(f"  Likes: {tweet['like_count']} | Replies: {tweet['reply_count']}")

    return competitor_tweets


# Monitor a community for activity from specific competitors
monitor_competitor_community(
    "1234567890123456789",
    competitor_handles=["competitor1", "competitor2"]
)
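Monitoring implies running the check repeatedly, and each run will return many tweets you have already seen. One simple approach is to persist the IDs from previous runs and report only new ones. The sketch below stores them in a local JSON file; `filter_new_tweets` and the file-based state are assumptions for this example, and a database would be a better fit at larger volumes.

```python
import json
from pathlib import Path


def filter_new_tweets(tweets: list, state_path: str) -> list:
    """Return tweets whose IDs are not yet recorded, then record them."""
    path = Path(state_path)
    # Load IDs from earlier runs, if the state file exists
    seen = set(json.loads(path.read_text())) if path.exists() else set()

    new_tweets = [t for t in tweets if t["id"] not in seen]

    # Persist the updated ID set for the next run
    seen.update(t["id"] for t in new_tweets)
    path.write_text(json.dumps(sorted(seen)))
    return new_tweets
```

Call it on the list returned by `monitor_competitor_community`, for example `filter_new_tweets(competitor_tweets, "seen_tweets.json")`, and schedule the script with cron or a similar tool.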
Use Case 3: Lead Generation from Niche Communities
Communities are self-selected audiences. A community about e-commerce founders contains e-commerce founders. A community about real estate investing contains real estate investors. This makes them excellent sources for B2B lead generation.
def extract_leads(community_id: str, min_followers: int = 500, min_engagement: int = 10) -> list:
    """
    Extract potential leads from a community: active members with meaningful followings.
    """
    tweets = get_all_community_tweets(community_id, max_tweets=300)

    # Keep at most one entry per author
    seen_authors = set()
    leads = []
    for tweet in tweets:
        author = tweet["author"]
        username = author["username"]
        if username in seen_authors:
            continue

        # Filter by follower count and engagement
        total_engagement = tweet["like_count"] + tweet["reply_count"] + tweet["repost_count"]
        if author["followers_count"] >= min_followers and total_engagement >= min_engagement:
            # Mark the author as seen only once a tweet qualifies, so a weak
            # first tweet does not hide a stronger later one
            seen_authors.add(username)
            leads.append({
                "username": username,
                "display_name": author["display_name"],
                "followers": author["followers_count"],
                "sample_tweet": tweet["text"][:100],
                "engagement": total_engagement,
                "profile_url": f"https://twitter.com/{username}"
            })

    # Sort by followers
    leads.sort(key=lambda x: x["followers"], reverse=True)

    print(f"Found {len(leads)} potential leads:")
    for lead in leads[:20]:
        print(f"  @{lead['username']} ({lead['followers']:,} followers) - {lead['sample_tweet']}...")

    return leads


leads = extract_leads("1234567890123456789", min_followers=1000, min_engagement=5)
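A printed list is rarely where lead data ends up. The sketch below writes the dictionaries produced by `extract_leads` to CSV so they can be opened in a spreadsheet or imported into a CRM. `write_leads_csv` is a hypothetical helper, and the column order is just one reasonable choice.

```python
import csv
from typing import Iterable, TextIO

LEAD_COLUMNS = ["username", "display_name", "followers", "engagement",
                "profile_url", "sample_tweet"]


def write_leads_csv(leads: Iterable[dict], destination: TextIO) -> int:
    """Write lead dictionaries as CSV rows and return the number written."""
    writer = csv.DictWriter(destination, fieldnames=LEAD_COLUMNS,
                            extrasaction="ignore")  # skip unexpected keys
    writer.writeheader()
    count = 0
    for lead in leads:
        writer.writerow(lead)
        count += 1
    return count
```

When writing to disk, open the file with `newline=""` as the `csv` module documentation recommends, for example `with open("leads.csv", "w", newline="", encoding="utf-8") as f: write_leads_csv(leads, f)`.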
Use Case 4: Content Ideas from High-Engagement Posts
The highest-engagement posts in a niche community tell you exactly what topics resonate with that audience. This is a goldmine for content strategy.
def find_content_ideas(community_id: str, top_n: int = 20) -> list:
    """Find the highest-engagement posts in a community for content inspiration."""
    tweets = get_all_community_tweets(community_id, max_tweets=300)

    # Score by engagement
    scored = [
        {
            "text": t["text"],
            "author": t["author"]["username"],
            "score": t["like_count"] * 2 + t["reply_count"] * 3 + t["repost_count"],
            "likes": t["like_count"],
            "replies": t["reply_count"],
            "url": f"https://twitter.com/{t['author']['username']}/status/{t['id']}"
        }
        for t in tweets
    ]
    top = sorted(scored, key=lambda x: x["score"], reverse=True)[:top_n]

    print(f"\nTop {top_n} content ideas from community:")
    for i, post in enumerate(top, 1):
        print(f"\n{i}. Score: {post['score']} | @{post['author']}")
        print(f"   {post['text'][:150]}")

    return top


find_content_ideas("1234567890123456789", top_n=10)
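The score above weights replies most heavily (3), then likes (2), then reposts (1), on the reasoning that a reply takes more effort than a like. Pulling the weights into a dictionary makes them easy to tune. The sketch below reproduces the same formula; `score_tweet` and `DEFAULT_WEIGHTS` are names introduced for this example, and the weights themselves are a judgment call rather than an established standard.

```python
from typing import Optional

# Same weights as the inline formula in find_content_ideas
DEFAULT_WEIGHTS = {"like_count": 2, "reply_count": 3, "repost_count": 1}


def score_tweet(tweet: dict, weights: Optional[dict] = None) -> int:
    """Weighted engagement score; missing counters count as 0."""
    weights = DEFAULT_WEIGHTS if weights is None else weights
    return sum(weight * (tweet.get(field) or 0) for field, weight in weights.items())


# Worked example with the counters from the sample tweet object
sample = {"like_count": 284, "reply_count": 47, "repost_count": 31}
print(score_tweet(sample))  # 284*2 + 47*3 + 31*1 = 740
```

Passing a different dictionary, such as `{"reply_count": 1}`, ranks posts purely by how much discussion they generated.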
Frequently Asked Questions
What is a Twitter/X Community ID and how do I find it?
The community ID is the number in the URL when you visit a community: twitter.com/i/communities/COMMUNITY_ID. You can find communities by searching on X or by following links shared by community members.
Does this work for private communities?
No. The API only accesses publicly visible content. Private communities require membership to view, and their content is not accessible.
How many tweets can I fetch from a community?
This depends on the community's activity level and how far back the pagination goes. Active communities may have thousands of accessible tweets. The max_tweets parameter in the examples above lets you control how many you collect.
Can I get a list of all community members?
The community info endpoint returns the member count and admin information. A full member list is not available through the current API endpoints.
How is this different from searching Twitter for a hashtag?
Hashtag searches return any tweet using that hashtag — including spam, bots, and off-topic content. Community tweets are from self-selected members who joined specifically because of their interest in the topic. The signal quality is significantly higher.
What's the rate limit for these endpoints?
Rate limits depend on your SociaVault plan. Free tier: 100 requests/day. Starter: 1,000/day. Pro: 10,000/day. Each request returns one page of tweets; use the cursor for pagination.
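When a long pagination run hits a limit, retrying with increasing delays is usually gentler than failing outright. The sketch below shows a generic exponential backoff wrapper. It assumes you translate the API's rate-limit signal into the `RateLimited` exception defined here; this guide does not state how SociaVault reports a limit, so checking for HTTP status 429 is an assumption you should verify against the official documentation.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Raised by the caller's fetch function when the API reports a limit."""


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on RateLimited with exponentially growing waits."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    raise ValueError("max_attempts must be at least 1")
```

A fetch function could catch `requests.HTTPError`, inspect `response.status_code`, and raise `RateLimited` when it sees the status your plan's documentation specifies. Daily quotas will not reset within a few seconds, so keep `max_attempts` small.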
Get started with a free account at sociavault.com — 50 free credits, no credit card required.