Alternative Data in China

We provide an overview of the growth of alternative data in China and the need to use local proxies instead of more established global alternative data providers.




Introduction

As the world’s second-largest economy, with roughly 1.3 billion tech-savvy consumers, high mobile internet penetration and rapidly increasing disposable incomes, China is an enticing prospect for equity investors. Indeed, for some, it’s the final frontier: a vast untapped market with the divergence to drive alpha returns and the economic growth to support beta. But these characteristics don’t just make China an attractive equity market. They also mean that China generates a wealth of alternative data. As China opens its A-shares market to foreign investors, it has now become the perfect breeding ground for alternative data strategies.

In this article, we provide an overview of the growth of alternative data in China and the need to use local proxies instead of more established global alternative data providers.

Big, Big, Bigger

The first thing to note is the sheer scale of the growth of the Chinese alternative data market. The Chinese big data¹ market has grown by nearly 600% since 2015 (Figure 1). Indeed, the data scouting platform Neudata now lists more than 1,100 China-specific datasets. The number of China-related alternative data providers has also grown rapidly over the past few years (Figure 2), which shows how the two feed each other: the larger the data market becomes, the bigger the opportunity for alternative data providers.


Source: Citi iResearch; as of 2020.


Source: Neudata; as of February 2021.

What makes this scale important is that it is somewhat lopsided compared with the size and sophistication of the Chinese equity market. China generates data in line with its status as the world’s second-largest economy, creating datasets which are large and robust enough to have predictive power. In contrast, its equity markets have yet to reach the critical mass of institutional investors required to erode alpha, even though the data required to run sophisticated strategies is now plentiful, in our view.


Local Versus Global

However, all the data in the world isn’t enough if investors are unable to tell which data has predictive power and which does not. The most important factor to understand is the unique way in which data is generated in China: the 1.3 billion Chinese consumers do not generate data via Google, Twitter or Wallstreetbets in the same way that consumers do in the West. Instead, there is usually a Chinese proxy which fulfils the same function, generating equivalent types of alternative data (Figure 3). As a result, investors who use alternative data signals based on Google search trends or Wallstreetbets in their global portfolios will need to consider a local proxy to generate similar insights in the Chinese market.

Figure 3. Global Versus Local Data Generators

Source: Man Group; as of May 2021.

Case Studies

To give an example of how this data can be applied, consider a Chinese hog producer included in the CSI 300 Index. A company with its characteristics is unlikely to appear on Wallstreetbets or on popular global job websites. However, by using local equivalents, we can get near real-time insights into the company’s activity. In 2019, posts on Guba, China’s internet stock message board, showed increased retail interest in the stock and accurately predicted a frenzy of buying; likewise, rising numbers of job posts for the company indicated growth throughout early 2020, despite the ongoing pandemic (Figures 4-5).


Source: DataYes, Man Group; as of October 2020.


Source: DataYes, Man Group; as of September 2020.
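
One simple way to turn message board activity of this kind into a signal is to compare each day's post count with its own recent history. The sketch below is illustrative only: the function name, the five-day window, the threshold of 3 and the post counts are all assumptions made for the example, not the method or data behind Figures 4-5.

```python
from statistics import mean, stdev


def attention_zscores(daily_posts, window=5):
    """Z-score of each day's post count against the preceding `window` days.

    Returns None for days without a full trailing window, or where the
    trailing window has zero variance.
    """
    scores = []
    for i, count in enumerate(daily_posts):
        if i < window:
            scores.append(None)
            continue
        history = daily_posts[i - window:i]
        sigma = stdev(history)
        scores.append(None if sigma == 0 else (count - mean(history)) / sigma)
    return scores


# Hypothetical daily post counts for one stock (illustrative, not real data).
posts = [40, 42, 38, 41, 39, 40, 43, 120, 150, 90]
scores = attention_zscores(posts, window=5)

# Flag days where attention is far above its recent norm (threshold assumed).
spike_days = [i for i, z in enumerate(scores) if z is not None and z > 3.0]
print(spike_days)
```

Because the trailing window absorbs a spike once it has happened, only the first day of a surge is flagged here. A longer window or a median-based baseline would behave differently, and the same calculation could be applied to counts of job posts.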

A focus on local data can also provide insight into changing consumer tastes. Data from Tmall (a business-to-consumer sales platform that is a Chinese Amazon-equivalent) showed how Chinese consumers shifted their purchases from international sportswear brands such as Nike and Adidas to domestic brands such as Anta and Li Ning. Again, this insight (and any subsequent effect on stock prices) wouldn’t be available using the normal alternative data channels, which focus on global consumption.


Source: Yipit, Man Group; as of 31 May 2021.
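
A shift of this kind can be tracked as the share of platform sales going to domestic brands in each period. The following is a minimal sketch under stated assumptions: the sales figures, period labels and function name are hypothetical, and only the four brand names come from the text above.

```python
def domestic_share(sales_by_brand, domestic_brands):
    """Fraction of total sales in a period that went to domestic brands."""
    total = sum(sales_by_brand.values())
    if total == 0:
        raise ValueError("no sales recorded for this period")
    domestic = sum(
        value for brand, value in sales_by_brand.items()
        if brand in domestic_brands
    )
    return domestic / total


DOMESTIC = {"Anta", "Li Ning"}

# Hypothetical sales in arbitrary units (illustrative, not real data).
sales = {
    "period_1": {"Nike": 50, "Adidas": 30, "Anta": 10, "Li Ning": 10},
    "period_2": {"Nike": 30, "Adidas": 20, "Anta": 25, "Li Ning": 25},
}

shares = {period: domestic_share(s, DOMESTIC) for period, s in sales.items()}
print(shares)
```

Using a share rather than raw sales removes platform-wide effects, such as shopping festivals, that lift every brand at once.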

Similarly, industry-wide trends can be monitored effectively by using Chinese alternative data. In this case, we use data from Ctrip and Qunar, two travel apps which cater to Chinese consumers. Figure 7 shows the number of daily active users, total time spent on the app and time per user. As we would expect, usage fell dramatically with the onset of the coronavirus. However, by monitoring ongoing usage, investors are able to observe the extent to which Chinese consumers have retained interest in travelling, monitoring its rise and fall in line with changing restrictions and the progress of new variants and cases.

Figure 7. Chinese Travel Apps


Source: Jiguang, Man Group; as of January 2021.
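
The three measures described above are related: time per user is total time divided by daily active users, and rebasing a series to a pre-pandemic level makes the extent of any recovery easier to read. The sketch below assumes a hypothetical record layout and made-up numbers; it does not reproduce the data in Figure 7.

```python
from dataclasses import dataclass


@dataclass
class AppUsage:
    """One day's usage for a single app (hypothetical record layout)."""
    daily_active_users: int
    total_minutes: float

    @property
    def minutes_per_user(self) -> float:
        # Time per user is total time divided by daily active users.
        if self.daily_active_users == 0:
            return 0.0
        return self.total_minutes / self.daily_active_users


def index_to_baseline(values, baseline):
    """Rescale a series so that the baseline level equals 100."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return [100.0 * v / baseline for v in values]


# Hypothetical figures (illustrative, not taken from Figure 7).
usage = [AppUsage(1000, 5000.0), AppUsage(300, 1200.0), AppUsage(600, 2700.0)]

dau_index = index_to_baseline(
    [u.daily_active_users for u in usage], baseline=1000
)
per_user = [u.minutes_per_user for u in usage]
print(dau_index, per_user)
```

Reading the two series together helps separate fewer people opening the app from the same people spending less time in it.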


Considerations When Using Chinese Alternative Data

So, Chinese alternative data can provide investors with unique insights into the Chinese equity market. However, to handle the data effectively, firms must account for four factors:

1. Local knowledge: Local knowledge is required to know where valuable nuggets of data can be found, and perhaps more importantly, to judge data quality and vendor methodologies. This knowledge can be quite nuanced, such as knowing the difference between Alibaba’s Tmall versus Pinduoduo when using e-commerce data, or the terms consumers use when searching for luxury goods;

2. Language skills: These are required across a variety of touch points in the alternative data life cycle, from reaching out to a small local vendor, to understanding data dictionaries and error messages, to ultimately understanding the data itself. Depending on the kind of analysis intended, analysts may also benefit from a tech stack that can support a variety of Chinese dialects for natural language processing;

3. Local vendors: Some interesting, smaller vendors may not be as experienced as their global counterparts, and may have different standards when it comes to data and compliance. In light of the fast-evolving space, vendors also risk becoming obsolete. Analysts must therefore have a deep understanding of the local vendor space, keeping abreast of both local trends and best practice;

4. Local regulation: The use of Chinese alternative data is subject to an evolving legal and regulatory regime, including the Data Security Law, which will come into force in September 2021. Practitioners must be aware of regulation covering the cross-border transfer of certain data types.

These challenges indisputably add complexity and raise barriers to entry when exploring alternative data in China. As more data and vendors enter the space, firms that are able to invest the time and resources, both in skilled analysts and in data platforms, give themselves the best chance of extracting signals from the ever-increasing noise.

Conclusion

China is already one of the largest markets in the world when it comes to equities.² It also remains an opportunity-rich market, which gives rise to a growing demand for data. While some global alternative datasets may be accurate for the onshore market, as more and more Chinese data is created, local alternative data is becoming an increasingly important source of insight.

To use this new data well, investors should seek to adapt their processes to take account of the different way that Chinese data is generated: partnering with local data providers, looking at unfamiliar but popular websites instead of those more common globally, and ensuring that technology stacks and researchers are able to handle the nuances of the new Chinese data.

 

1. Please note that we use alternative data and big data interchangeably in the article.
2. https://www.man.com/maninstitute/hot-commodity