AI Transformation in Oil and Gas Starts with Data

2026-05-20

Source: Xinjiang Daily

On May 17, 2026, at the “AI+ Empowering Scientific Research Innovation” scientists’ spirit sharing session held at the Xinjiang International Convention and Exhibition Center, Liu He, an academician of the Chinese Academy of Engineering and head of the Xinjiang Base research team of Huairou National Laboratory, spent nearly two hours candidly discussing the core issues in the intelligent transformation of the oil and gas industry. As a leading figure in China’s oil production engineering sector, Liu has in recent years served as chief engineer for the design of PetroChina’s major artificial intelligence program and completed the top-level planning for PetroChina’s AI development.

“What goes in as garbage can only come out as garbage.” During the lecture, Liu repeatedly emphasized a key distinction: data and information are not the same thing. “A considerable part of what we obtained historically was information, not data.”

He explained that data refers to unprocessed original facts and represents the truth, while information is the product of human interpretation. “Information is not necessarily fact.” In the current oil and gas system, large amounts of data need to go through collection, reporting, sorting, and other processes. Limitations in collection equipment and technology, as well as interference from human factors, can all create deviations between the data and the true original underground conditions.

“Because of the interference from these factors, major differences can arise between data and information.” Such deviations are particularly prominent in areas with a low level of automation, directly affecting the accuracy and reliability of subsequent model training.

Liu summarized the problem with a classic warning: “Garbage in, garbage out.” If upstream data is unreliable, all downstream applications will encounter problems.

In addition to data quality, data silos are the second major challenge facing the oil and gas industry’s AI transformation.

Liu pointed out that data related to exploration, development, and production in the oil and gas industry is held by different departments and organizations, while data-sharing mechanisms remain incomplete. Confidential data is subject to strict access restrictions, and questions over the ownership of commercial data will take a long time to resolve.

“These issues have greatly constrained the training of large AI models and their industrial application,” he said. Liu called for the government to take the lead and work with enterprises to establish high-quality datasets and a unified data-sharing platform. This, he said, will be a foundational project for the development of energy AI in Xinjiang.

To illustrate how to overcome data difficulties, Liu shared his team’s own experience with rock thin-section identification.

Facing a huge volume of historical geological thin-section images, the team invited more than 20 senior experts to compare the images with the reports from that time and re-label them one by one, identifying which annotations were accurate and which needed correction. This process took five years, eventually raising the accuracy rate of AI recognition from 95% to 99.8% and helping form a national standard.

“This process was difficult, but without high-quality datasets, there would have been no later products capable of solving real problems.”
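For a sense of scale, the accuracy figures quoted above can be restated as error rates. The short Python sketch below does only that arithmetic; it assumes the percentages are per-image identification accuracy, which the report does not specify, and the function name is purely illustrative.

```python
def error_reduction(acc_before: float, acc_after: float) -> float:
    """Return how many times smaller the error rate is after re-labeling."""
    err_before = 1.0 - acc_before  # 95% accuracy leaves a 5% error rate
    err_after = 1.0 - acc_after    # 99.8% accuracy leaves a 0.2% error rate
    return err_before / err_after


factor = error_reduction(0.95, 0.998)
print(f"Error rate reduced roughly {factor:.0f}-fold")
```

Under that assumption, about 50 misidentified images per 1,000 fall to about 2 per 1,000.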

In his view, for oil and gas AI to truly be implemented, it must first pass the hurdle of data governance.

Existing business data should be standardized and cleaned, and barriers between multiple systems should be removed so that data can first become usable.

Efforts should focus on core business pain points, with dedicated datasets built on a small scale to enable rapid implementation and verification before iterative expansion.

Data desensitization and authorization traceability should be prepared in advance throughout the process to avoid rework caused by compliance issues later.
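The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of PetroChina's systems: the field names, the sample records, the salted-hash masking, and the audit-log layout are all hypothetical.

```python
import hashlib
from datetime import datetime, timezone

# Hypothetical records from two systems that name and format fields differently.
RAW = [
    {"Well": " XJ-001 ", "depth_m": "3120.5", "operator": "Zhang"},
    {"well_id": "xj-002", "Depth": "n/a", "operator": "Li"},
    {"well_id": "XJ-001", "depth_m": "3120.5", "operator": "Zhang"},  # duplicate
]

# Assumed mapping from legacy field names to one shared schema.
ALIASES = {"Well": "well_id", "Depth": "depth_m"}


def standardize(rec):
    """Rename fields to the shared schema and normalize value formats."""
    out = {ALIASES.get(key, key): value for key, value in rec.items()}
    out["well_id"] = out["well_id"].strip().upper()
    try:
        out["depth_m"] = float(out["depth_m"])
    except ValueError:
        out["depth_m"] = None  # unparseable measurement
    return out


def clean(records):
    """Drop records with missing depth and exact repeats of (well, depth)."""
    seen, kept = set(), []
    for rec in records:
        key = (rec["well_id"], rec["depth_m"])
        if rec["depth_m"] is None or key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept


def desensitize(rec, salt="demo-salt"):
    """Replace the operator name with a salted hash prefix (illustrative only)."""
    masked = dict(rec)
    digest = hashlib.sha256((salt + rec["operator"]).encode()).hexdigest()
    masked["operator"] = digest[:12]
    return masked


def release(records, requester, audit_log):
    """Record who received how many rows, then hand out masked copies."""
    audit_log.append({
        "requester": requester,
        "rows": len(records),
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return [desensitize(rec) for rec in records]


audit_log = []
dataset = release(clean([standardize(r) for r in RAW]), "model-team", audit_log)
print(dataset)
print(audit_log)
```

Starting with a small dataset like this mirrors the "small scale first" advice: the pipeline can be verified on a handful of records before it is extended to more systems. Dropping unparseable values rather than imputing them is a choice made here for simplicity, and a real desensitization scheme would need to meet the applicable compliance requirements.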

Liu proposed that AI implementation in the oil and gas industry requires coordinated efforts across four dimensions.

Data must be strengthened by building high-standard governance systems to solve the problem of having “no grain to cook with.”

Models must be optimized by deepening the injection of industry knowledge and eliminating incompatibility with real-world industry conditions.

Computing power must be upgraded by improving the efficiency of infrastructure and breaking through the “power bottleneck.”

Scale must be expanded by using cloud computing to pool resources, overcoming difficult implementation and slow rollout, and ultimately achieving exponential improvements in efficiency.

At the end of the lecture, Liu expressed high expectations for young students. He encouraged young people to first develop a sincere love for scientific research, saying that “once you have a passion for it, nothing feels too hard.” He also reminded them that, while studying their major well, they should “read some leisure books,” cultivate a hobby that relaxes both body and mind, and maintain healthy living habits. In his view, systematic thinking and a healthy body are both indispensable foundations on the road of scientific research.

Further Analysis

Liu He’s sharing session took place in Xinjiang, one of China’s important oil and gas production bases. Xinjiang is rich in oil and gas resources and provides solid support for the country’s energy supply. In the era of “AI+,” whether Xinjiang’s energy industry can take the lead in completing data governance and breaking down data silos will directly affect both regional AI competitiveness and national energy security.

From a national perspective, the intelligent transformation of the oil and gas industry generally faces the dilemma of “having large amounts of data that are difficult to use and many systems that do not interconnect.” Liu’s four-dimensional coordinated approach is not only applicable to Xinjiang, but also provides a clear action path for other energy bases in China.

It is worth noting that his repeated emphasis on the distinction between data and information reminds the industry to avoid a common misunderstanding: informatization should not be mistaken for digitalization, and having massive volumes of data does not automatically mean possessing big data capabilities. True big data depends on high-quality, shareable, and computable data assets.

It is foreseeable that in the coming years, whoever is willing to make real and sustained efforts in data governance will gain the first-mover advantage in the oil and gas AI race. As Liu said, “This process was difficult, but without high-quality datasets, there would have been no later products capable of solving real problems.”


Disclaimer: The above content was edited by Energy China Forum (www.energychinaforum.com). Please contact ECF before reproducing it.
