India did not benefit from the first two industrial revolutions (IR), and missed the bus for the third. Now, it has a chance to lead the next IR fuelled by data science and artificial intelligence (AI). The forthcoming budget is an opportunity for Finance Minister Nirmala Sitharaman to unleash an employment and productivity multiplier through generous funding of data science, funnelled to research and commercial data centres.
Data science is the source of a new form of wealth, Big Data (BD): the large volume of information generated by the masses using online platforms and the Internet of Things (IoT), or interconnected electronic gadgets. Gadgets capture information related to activities carried out and the people involved. AI is then used to analyse the information to reveal trends and predict future patterns of interest to businesses and the sciences.
Unlike one-size-fits-all TV ads, internet commercials are customised to the economic status of target groups. Insurance and health products can likewise be customised to the lifestyle and history of individuals.
Big Benefits
Benefits from a data-based economic world order will depend on the quality of human resources and digital infrastructure. India has the required ingredients: the largest number of mobile phone users, the highest internet data consumption and technical know-how. But its public policy challenges are unique. The state of health and education leaves much to be desired, agricultural productivity is low, and targeting of welfare schemes remains poor. In these areas, Big Data and artificial intelligence can be used with amplified impact.
Satellite and drone imagery, for instance, can help identify poor and backward habitats. Mobile network data can provide valuable insights on mobility, socioeconomic interactions and consumption patterns at local and national levels. Together with surveys, BD can vastly improve the precision of targeting, thereby reducing fiscal costs of welfare schemes. AI can help assess impacts of policy interventions.
BD analytics can be used for India’s Covid-19 vaccination strategy to optimise herd immunity. BD is proving to be very effective in reducing diagnostics and treatment costs by enabling doctors to make real-time data-driven decisions, increasing precision of medical intervention.
GoI’s electronic health records (EHR) is a welcome initiative. Information contained in EHRs will improve quality of treatment, which, combined with AI, will enable timely detection of diseases and medical intervention. Coverage of EHR must be expanded through telemedicine. Extensive mobile penetration provides the required infrastructure to deliver low-cost medical services to hitherto neglected geographies and social groups.
Electronic education records can help identify patterns in learning challenges across different groups. BD analytics of these records can be used to monitor the performance of teachers and schools in terms of dropouts, educational attainments and employability of their students. In agriculture, too, high-resolution satellite and drone data can precisely estimate crop damage by pests or calamities. This, in turn, can help expand the scope of crop insurance through better handling of damage claims by farmers.
BD’s informational worth is much more than the sum of its sources. Predictive power and associated benefits increase exponentially with the size and diversity of data. No other country can match India’s diversity — economic, geographic, climatic, genomic and microbiological. But there are serious obstacles.
India’s data infrastructure is awfully inadequate. Email services of leading academic and research institutes are hosted on Gmail servers. This means a lot of scientific and strategic data is stored beyond India’s national boundaries. At a bare minimum, India needs indigenous capabilities for hosting video conferencing and email services of government and other important organisations.
Making diverse datasets mutually compatible is also a serious challenge. Health data lies in bits and bytes in numerous hospitals, clinics and labs. It needs to be digitised, made compatible and interoperable across medical institutions. The same goes for data on education, crimes, logistics, genomes, etc.
Integration of dispersed data would require cooperation among numerous public and private entities. It is welcome that the Personal Data Protection Bill, 2019, requires sharing of non-proprietary data with other Indian entities.
But those putting in effort and money to collect data have many tricks up their sleeves to protect their domain data. Also, private entities can’t be expected to invest heavily in storage and processing of social purpose data and networks.
A national data pool is the way forward. All data generators should be mandated to share all anonymised data with the pool. It will boost productivity of the economy, nurture the startup ecosystem and generate employment.
Budget 2021 should fund data infrastructure and AI skills. Equally important is a policy to regulate collection and sharing of national data.
(Umapathy is director, Indian Institute of Science Education and Research (IISER), Bhopal, and Singh is professor, Delhi School of Economics)