The health of a country’s economy is hiding in plain tweets


Bloomberg News July 31, 2019

This article was written by Jeff Kearns and Saijel Kishan. It appeared first on the Bloomberg Terminal.

Apurv Jain has used some unconventional data to predict the course of U.S. employment: 1.2 billion tweets and 830 million web searches.

As a Microsoft Corp. researcher in 2014, he was inspired by a comment from former Federal Reserve Chair Janet Yellen that labor-market conditions might be worse than official statistics indicated. So he began looking for unofficial ones.

He analyzed tweets from 230,000 Twitter users who’d lost or gained work to understand the sentiment behind the numbers. He has also discovered that scouring six years of web-search queries allows him to predict revisions in one of the world’s most critical indicators: the U.S. Labor Department’s monthly report on nonfarm payrolls.

It’s all part of the fast-growing world of alternative data, which “can provide details about the economic narrative of our country that the existing government data simply cannot,” said Jain, 41, who is now a visiting researcher at Harvard Business School. He presented his findings in March at a New York conference on artificial intelligence and data science in trading.

For years, major banks, hedge funds and other financial players have been racing to collect and crunch all sorts of statistics to gain an edge in the markets. Now the competition is increasing: Spending in the field will exceed $7 billion next year compared with $4.3 billion in 2017, estimates Opimas LLC, a financial-industry consultant.

Providers are proliferating, too, to more than 400, according to an industry group. One that attended the New York event was Thasos Group, which sells mobile-phone-location analytics to hedge funds tracking the health of retail chains and shopping malls.

“Malls have turned out to be a very good proxy for retail sales,” said founder Greg Skibiski, who noted the information predicted December’s unexpected plunge to a nine-year low — and January’s rebound.

Mobile phones and apps are providing geolocation data for a wide variety of customers, including oil traders, who can track the number of people working at a refinery to estimate production and gain insight into staffing changes.

Fidelity Investments, Point72 Asset Management and Neuberger Berman are among money managers buying alternative data and employing data scientists and quants seeking to gain an advantage in predicting market moves.

Far from Wall Street, officials in Washington are trying to keep up. Government agencies historically have relied on surveys of businesses and households to construct important indicators. But some “don’t like to fill out surveys,” Jain said. “So they just don’t respond to them.”

Supplemental resources

Other sources can supplement the official reports that remain the bedrock of economic measurement, though the new information must be carefully vetted to ensure continuity.

Two conferences in March in Washington explored the trend as government, university and Fed economists debated the best ways to apply new measurements to old indicators.

“Traditional methods of collecting data from businesses and households face increasing challenges,” said one report co-authored by U.S. Census Bureau Deputy Director Ron Jarmin and presented at a National Bureau of Economic Research gathering. “These include declining response rates to surveys, increasing costs to traditional modes of data collection and the difficulty of keeping pace with rapid changes in the economy. The digitization of virtually all market transactions offers the potential for re-engineering key national economic indicators.”

Natural-language processing

Bank of England research economist Arthur Turrell explained how the bank has used natural-language processing with 15 million U.K. help-wanted ads to measure labor-market demand by region and occupation — details company-based surveys don’t typically collect.

During the other event, at the Brookings Institution, researchers from the Commerce Department’s Bureau of Economic Analysis and the University of North Carolina showed that property-market data from Zillow Group Inc. offer a better read on home prices than official figures.

Crystal Konny, chief of the Bureau of Labor Statistics unit that produces the Consumer Price Index, discussed plans to replace a significant portion of the critical indicator with alternative data sources, such as information scraped from the web or downloaded directly from companies, rather than relying largely on surveys or employees visiting stores in person.

She described the purchase from an insurance carrier of a small data set covering medical claims in the Chicago area as part of the bureau’s research. “Currently, the medical-care major group has the worst response rate of all major groups in the CPI, and of that major group, ‘physicians’ services’ and ‘hospital services’ have the highest relative importances.”

Unexpected shift

Konny cautioned there are dangers in the new methods, such as importing price data directly from companies, citing an unexpected shift by a department store she referred to as CorpX.

“They changed their database structure, and they brought each store online at a different time,” she told attendees — which included Yellen. “It totally ruined any continuity and analysis of the history of data and the index calculation, so we had to start all over.”
