Shan Jiang 江山

PhD Student @Northeastern University

Email | LinkedIn | GitHub | Scholar




  • ComLex: An emotional and topical lexicon of 300 clusters, generated from user comments on social media.
    Only the 56 named clusters have been human-evaluated.

  • Fact-Checked Posts: A dataset of 5K+ social media posts fact-checked by Snopes or PolitiFact.

  • User Comments: A dataset of 2.6M+ user comments on social media for the posts above.
    Facebook | Twitter | YouTube

Partisan Bias

  • PolarShare: Visualization of media bias by polarized sharing on Twitter.
    Available at:

  • Data: The complete dataset for 10K+ websites is available upon request.


  • TNCsToday: Visualization of Uber and Lyft drivers in San Francisco.
    Available at:

  • Data: Unfortunately, due to Uber’s and Lyft’s Terms of Service, we cannot make the data from the study publicly available.