The Asian giant is building on its lead in stored data, distancing itself from the U.S., new analyses show.
TO UNDERSTAND HOW rapidly the global datasphere – the amount of data around the world that is created, captured, and replicated online – is growing, consider one word: zettabyte.
In the language of people who measure data in the digital space, a zettabyte is equal to 1 trillion gigabytes. Put another way, eight years ago The Guardian described the amount of data contained in one zettabyte as being equal to 250 billion DVDs. Alternatively, Cisco’s Taru Khurana says that if each gigabyte in a zettabyte were a brick, one zettabyte could build 258 Great Walls of China.
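The unit comparison above can be checked with a few lines of arithmetic. This is only a sketch: it assumes decimal (SI) prefixes, where a gigabyte is 10^9 bytes and a zettabyte is 10^21 bytes, and the variable names are illustrative.

```python
# Assumption: decimal SI prefixes (1 GB = 10**9 bytes, 1 ZB = 10**21 bytes).
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_ZETTABYTE = 10**21

# How many gigabytes fit in one zettabyte.
gigabytes_per_zettabyte = BYTES_PER_ZETTABYTE // BYTES_PER_GIGABYTE

# The Guardian's comparison, as cited in the article: 250 billion DVDs.
dvds_per_zettabyte = 250 * 10**9

# DVD capacity implied by that comparison, in gigabytes.
implied_dvd_capacity_gb = gigabytes_per_zettabyte / dvds_per_zettabyte

print(f"Gigabytes per zettabyte: {gigabytes_per_zettabyte:,}")
print(f"Implied capacity per DVD: {implied_dvd_capacity_gb} GB")
```

The implied figure is a round number a little below the roughly 4.7 GB of a single-layer DVD, so the DVD comparison is best read as an approximation.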
And now, new research shows that the total amount of online data created, replicated and stored will increase from 33 zettabytes in 2018 to 175 zettabytes by 2025.
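To get a sense of the pace behind those two figures, the projection can be turned into an implied yearly growth rate. The sketch below assumes smooth compound growth between 2018 and 2025, which the research does not necessarily claim; the function name is illustrative.

```python
def compound_annual_growth_rate(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that takes `start` to `end`."""
    return (end / start) ** (1 / years) - 1


# Figures cited in the article, in zettabytes.
GLOBAL_2018_ZB = 33.0
GLOBAL_2025_ZB = 175.0
YEARS = 2025 - 2018

global_growth = compound_annual_growth_rate(GLOBAL_2018_ZB, GLOBAL_2025_ZB, YEARS)
print(f"Implied global growth per year: {global_growth:.1%}")
```

Under that assumption, the global datasphere would need to grow by a bit more than a quarter every year to reach the 2025 projection.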
That growth will occur in every country, but at different rates. China has already overtaken the U.S. as the country creating, replicating and storing the most data, according to reports from the International Data Corporation, or IDC, a market intelligence and advisory firm in Massachusetts, and Seagate, a California-based data storage company.
By 2025, China will be responsible for storing 27.8 percent of global online data, while the U.S. share will be 17.5 percent, a drop from its 21 percent share in 2018 and a reflection of how U.S. data growth will occur at a much slower rate than in other regions in the world.
The findings are significant, coming as global worries grow about online privacy and as Beijing’s approach to online regulation and censorship is exported abroad. Yet not all data are created equal, experts say.
“Data is not a fungible resource like oil,” says Paul Scharre, senior fellow and director of the technology and national security program at the bipartisan Center for a New American Security. “Data can be used to train algorithms for tasks specific to that data, but not other tasks. For example, facial recognition algorithms trained on Chinese citizens may do very well at recognizing Chinese faces but may fare poorly in Africa or Europe.”
In the U.S., the amount of data will rise from 6.9 zettabytes in 2018 to 30.6 zettabytes in 2025, according to the IDC-Seagate reports. That growth will be driven by increased amounts of metadata, video surveillance and the connection of devices to the internet, the so-called internet of things. Growth in entertainment-driven data will slow down, and productivity data will accelerate, according to the same source.
The estimates shouldn’t cause much concern in the U.S., as America already is far more advanced than other nations when it comes to data, say experts.
“Everything is growing massively, but just some (amounts of data owned by countries) are growing faster than others,” says Jeff Fochtman, vice president of global marketing at Seagate. “And that’s sometimes because they need to catch up.”
The United States is a global leader in public cloud storage, with Amazon, Microsoft and Google controlling the market for virtual machines, applications or storage available to the general public online. If that changes or becomes more balanced, it will be because these same companies expand into other parts of the world.
“We see some balancing act(s) beginning to take place in the world where that early leadership in public cloud for the U.S. will be caught up to sometimes by those same providers, but building out in other regions,” Fochtman says.
Data storage growth in China is outpacing global growth by an annual average of about 3 percentage points.
“In 2018, China’s datasphere was 23.4 percent of the global datasphere, or 7.6 zettabytes,” the IDC-Seagate report notes. “This will grow to 48.6 zettabytes in 2025 and emerge as the largest datasphere in the world, at 27.8 percent of the global datasphere.”
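The shares and growth rates behind these figures can be recomputed from the zettabyte totals reported in the article. This is a rough sketch that assumes smooth compound growth from 2018 to 2025; because the published numbers are rounded, recomputed shares may differ slightly from the quoted percentages.

```python
# Datasphere sizes in zettabytes, as cited in the article: (2018, 2025).
GLOBAL = (33.0, 175.0)
REGIONS = {
    "China": (7.6, 48.6),
    "U.S.": (6.9, 30.6),
}
YEARS = 2025 - 2018


def annual_growth(start: float, end: float, years: int) -> float:
    """Constant yearly growth rate that takes `start` to `end`."""
    return (end / start) ** (1 / years) - 1


global_growth = annual_growth(GLOBAL[0], GLOBAL[1], YEARS)
share_2025 = {}
growth = {}

for name, (size_2018, size_2025) in REGIONS.items():
    # Regional share of the projected 2025 global total.
    share_2025[name] = size_2025 / GLOBAL[1]
    growth[name] = annual_growth(size_2018, size_2025, YEARS)
    print(
        f"{name}: 2025 share {share_2025[name]:.1%}, "
        f"yearly growth {growth[name]:.1%} "
        f"({(growth[name] - global_growth) * 100:+.1f} points vs. global)"
    )
```

On these numbers, China's implied yearly growth runs a few percentage points above the global rate, while the U.S. rate falls below it, which is why the U.S. share shrinks even as its total more than quadruples.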
As elsewhere around the world, China’s datasphere will grow because of an increase in metadata, entertainment-related data, cloud storage, devices connected to the internet and edge computing (a computing structure that brings memory and computing power closer to where it’s needed, rather than concentrating it in a data center).
China’s lead in data storage is no surprise, says Fochtman, since the Asian giant is intensifying its competition with the U.S. and other countries in technology sectors.
At the same time, the combined Europe, Middle East and Africa (EMEA) region is growing more slowly than the global datasphere overall. By 2025, the share of global data tied to countries in the far-ranging region is expected to decrease from 28.8 percent in 2018 to 27.6 percent. The dominant type of data in the EMEA region will also shift from entertainment to productivity and the connection of devices to the internet.
Overall, about a third of global datasphere growth will be driven by increased video surveillance, the connection of devices to the internet, metadata and entertainment. “For example, user-created and user-consumed online video like YouTube is one of the top five fastest-growing segments of data creation,” according to a separate IDC study that examined that region.
Scharre says he is less concerned about a “data gap” between China and the U.S. than about other factors that contribute to innovation, such as human capital.
“There is a fierce competition for talent in the AI (artificial intelligence) sector and China is working hard to train additional AI researchers internally and aggressively recruit top experts from abroad,” he says. “Meanwhile, the number of foreign students applying to study in the United States declined over the past two years and the Trump administration has increased the rate of H-1B visa denials.”
While the world data map might soon look different, what’s important is not to lose sight of the purpose of data and its benefits for the world as a whole, other experts say.
“It’s important to think about what problems we are trying to solve with data and how this layout could hurt us from solving these problems,” Fochtman says. “In many ways the world is much smaller than it’s ever been because of technology, and this is going to continue even with geopolitical constraints about where the data is housed.”