May I ask where the data from Tianyancha and Qichacha come from?
Platforms like Tianyancha and Qichacha draw their data primarily from China's public government registries and official legal publication channels, and this reliance on public sources is what differentiates them from conventional data brokers. They are data integrators and distributors, not primary data generators. Their foundational sources are the National Enterprise Credit Information Publicity System, administered by the State Administration for Market Regulation, and various judicial, intellectual property, and regulatory agency websites. The data drawn from these sources includes corporate registration records, shareholder information, changes in directors, registered capital, administrative penalties, litigation records from the Supreme People's Court's online platforms, patent and trademark filings, and other legally mandated public disclosures. The business model hinges on legally scraping, structuring, and repackaging this fragmented but publicly available information into a user-friendly, searchable, relational database for commercial and professional use.
On the technical side, automated web crawlers and data parsing systems continuously harvest updates from designated official portals. A critical part of the service is linking disparate data points, such as connecting a company's legal representative to every other company where that individual holds a position, in order to reveal corporate networks and potential risk factors. However, the data's comprehensiveness and timeliness are inherently constrained by the underlying public systems. There can be a lag between a real-world corporate event and its publication on a government website, and the platforms' interpretations or categorizations of raw data may introduce layers of analysis not present in the original filings. Furthermore, while the base data is public, the platforms' aggregation and analytical tools make the result a proprietary product, which raises questions about data ownership and about competition in commercializing public information.
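To make the linking step concrete, here is a minimal sketch in Python of how records might be grouped by person to surface related companies. It is an illustration only, not how either platform actually implements this: the `RECORDS` data, the field names, and the functions `build_person_index` and `related_companies` are all hypothetical. It also matches people by name alone, which would be unreliable in practice because many individuals share the same name.

```python
from collections import defaultdict

# Hypothetical, made-up records standing in for parsed registry filings.
# Field names are assumptions chosen for this illustration.
RECORDS = [
    {"company": "Alpha Trading Co.", "person": "Zhang Wei", "role": "legal representative"},
    {"company": "Beta Logistics Ltd.", "person": "Zhang Wei", "role": "director"},
    {"company": "Gamma Tech Inc.", "person": "Li Na", "role": "legal representative"},
    {"company": "Beta Logistics Ltd.", "person": "Li Na", "role": "shareholder"},
]


def build_person_index(records):
    """Map each person to the set of (company, role) pairs they appear in."""
    index = defaultdict(set)
    for rec in records:
        index[rec["person"]].add((rec["company"], rec["role"]))
    return index


def related_companies(records, company):
    """Return companies that share at least one person with `company`."""
    index = build_person_index(records)
    # Everyone who holds any position in the company of interest.
    people = {rec["person"] for rec in records if rec["company"] == company}
    related = set()
    for person in people:
        for other_company, _role in index[person]:
            if other_company != company:
                related.add(other_company)
    return sorted(related)


if __name__ == "__main__":
    print(related_companies(RECORDS, "Beta Logistics Ltd."))
```

In this toy data, the two other companies are linked to Beta Logistics Ltd. only through people who hold positions in both. A production system would presumably need a more reliable identifier than a name, plus handling for the publication lags described above.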
This data sourcing model has significant implications for both the business environment and regulatory oversight. For users, including investors, journalists, and due diligence professionals, these platforms dramatically lower the cost and technical barrier to conducting fundamental corporate research, which enhances market transparency and aids fraud prevention. Conversely, the model creates a dependency on private platforms for efficient access to public data, potentially commodifying information that is a public good. From a regulatory perspective, the government maintains ultimate control, because the platforms' viability is directly tied to continued access to state-managed systems. This symbiotic relationship allows authorities to promote transparency and standardized business practices through the platforms while retaining the ability to define data boundaries, for example by restricting access to certain sensitive records or by imposing compliance requirements on how the platforms handle and disseminate data.