You want to know how a brand is being talked about in China. The catch: the conversation isn't on one platform. It's split across Weibo (microblog), RedNote / Xiaohongshu (product & lifestyle), Bilibili (video), Douban (long-form reviews) and Xueqiu (retail-investor chatter). So you wire up five scrapers — and that's where the real work starts.
The part nobody warns you about
Pulling data from each platform is the easy 20%. The other 80% is turning five raw feeds into one trustworthy dataset:
Five completely different shapes. A "post" on Weibo, a "note" on RedNote, a "video" on Bilibili, a "review" on Douban, a "cashtag comment" on Xueqiu: different fields, different engagement metrics, different date formats. Normalizing them into one table is a chore you redo every time a platform tweaks its response format.
Duplicates everywhere. A KOL announces a collab and it's reposted across three platforms; creators cross-post the same clip. Count naively and your "mention volume" is inflated 2–3×, which quietly ruins every trend line and alert you build on top of it.
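To make the two problems above concrete, here is a minimal Python sketch of one way to approach them: map each platform's raw record onto a shared `Mention` shape, then collapse cross-posts by fingerprinting the text. Everything in it is an assumption for illustration. The field names in `FIELD_MAPS` are hypothetical and are not the platforms' real response keys, naive timestamps are assumed to be China Standard Time (UTC+8), and the fingerprint only catches copies that are identical once case, punctuation, whitespace and emoji are stripped.

```python
import hashlib
import re
import unicodedata
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

CST = timezone(timedelta(hours=8))  # assumption: naive timestamps are Beijing time

# Hypothetical field names per platform; real responses will differ.
FIELD_MAPS = {
    "weibo": {"author": "user_name", "text": "text", "time": "created_at",
              "engagement": ("reposts", "comments", "likes")},
    "bilibili": {"author": "owner", "text": "title", "time": "pubdate",
                 "engagement": ("like", "reply")},
}


@dataclass(frozen=True)
class Mention:
    platform: str
    author: str
    text: str
    posted_at: datetime  # always timezone-aware, in UTC
    engagement: int      # sum of the platform's own counters


def to_utc(value) -> datetime:
    """Accept epoch seconds or an ISO-style string and return a UTC datetime."""
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value, tz=timezone.utc)
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=CST)
    return parsed.astimezone(timezone.utc)


def normalize(platform: str, raw: dict) -> Mention:
    """Map one raw record onto the shared Mention shape."""
    fields = FIELD_MAPS[platform]
    return Mention(
        platform=platform,
        author=str(raw[fields["author"]]),
        text=str(raw[fields["text"]]),
        posted_at=to_utc(raw[fields["time"]]),
        engagement=sum(int(raw.get(key, 0)) for key in fields["engagement"]),
    )


def fingerprint(text: str) -> str:
    """Hash the text after folding width and case and dropping non-word characters."""
    folded = unicodedata.normalize("NFKC", text).lower()
    folded = re.sub(r"[\W_]+", "", folded)  # removes spaces, punctuation, emoji
    return hashlib.sha1(folded.encode("utf-8")).hexdigest()


def dedupe(mentions: list) -> list:
    """Keep only the earliest mention for each text fingerprint."""
    earliest = {}
    for mention in sorted(mentions, key=lambda m: m.posted_at):
        earliest.setdefault(fingerprint(mention.text), mention)
    return list(earliest.values())
```

With this shape, a Weibo post and a Bilibili upload carrying the same announcement text count as one mention, attributed to whichever appeared first. Reposts that have been reworded or trimmed would slip through an exact hash like this one; catching those would need a fuzzier similarity measure, which is left out of the sketch.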










