Research dataset release for stock index prediction paper
This dataset contains the news used to predict the Chinese Stock Index from 1 Jan 2015 to 14 Feb 2017. It is used in the paper:
Chen, Weiling, Chai Kiat Yeo, Chiew Tong Lau, and Bu Sung Lee. “Leveraging social media news to predict stock index movement using RNN-Boost.” submitted to Data & Knowledge Engineering.
training.csv includes all the news we collected from official Sina Weibo accounts for predicting the CSI. mid is the unique ID of each Weibo post and uid is the user ID of its author. The full content of a Weibo post can be retrieved by its mid using the API provided by Sina: http://open.weibo.com/wiki/2/statuses/show
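
As a rough illustration of that lookup, here is a minimal Python sketch. It rests on several assumptions that are not stated in this README: that training.csv has a header row with a column named mid, that the documentation page above corresponds to the endpoint https://api.weibo.com/2/statuses/show.json taking access_token and id parameters, and that you have your own access token from the Weibo open platform. The function names are made up for this example; please check the linked page before relying on any of it.

```python
import csv
import json
import urllib.parse
import urllib.request

# Assumed endpoint for the documentation page linked above; verify it there.
STATUS_SHOW_URL = "https://api.weibo.com/2/statuses/show.json"


def read_mids(csv_lines):
    """Yield the mid of every row.

    Assumes the CSV has a header row containing a 'mid' column.
    Accepts an open file or any iterable of CSV lines.
    """
    for row in csv.DictReader(csv_lines):
        yield row["mid"]


def build_status_url(mid, access_token):
    """Build the request URL for one Weibo post (no network access)."""
    query = urllib.parse.urlencode({"access_token": access_token, "id": mid})
    return f"{STATUS_SHOW_URL}?{query}"


def fetch_status(mid, access_token, timeout=10):
    """Fetch one post and return the decoded JSON response as a dict.

    This performs a real HTTP request and needs a valid access token.
    """
    url = build_status_url(mid, access_token)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

To use it, open training.csv with `open("training.csv", newline="", encoding="utf-8")`, iterate over `read_mids(f)`, and call `fetch_status(mid, token)` for each mid. The file encoding is also an assumption, and the API may apply rate limits or fail for deleted posts, so a real script would likely need delays and error handling.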