By the early 1990s, petroleum companies had spent decades exploring the North
Sea for oil. The Norwegian government realized that the data these companies
had been collecting held their own value, not only for the development at
hand, but for future development and other potential applications. The government
also saw a need to ensure the quality and longevity of the data.
The Norwegian government formed the first national geoscience data repository.
Run by the Norwegian Petroleum Directorate, the data repository is both a system
for managing seismic data and a core store, where anyone can access the hundreds
of drill cores that petroleum companies have taken from the bottom of the
North Sea. To ensure that the repository's data are readily available
and of high quality, the government mandated that companies submit data
(seismic surveys, well logs and cores) to the national data repository.
More importantly, the government decided that the repository should function
not merely as a reference library of earth materials, but as a data
management system integral to the operations of both the industry and government
agencies.
A key concept intrinsic to supporting national data repositories is that geoscience
data are a valuable national asset. Though the exploration and production
staff of most companies recognize this value in geoscience data, many senior
leaders in industry view data storage and management as a cost center. In
Norway and the United Kingdom, for example, companies offload the management
of seismic data to DISKOS, the component of Norway's repository where
well logs and seismic data are stored, and offload well logs to Common Data
Access, a repository formed in England that also houses North Sea data. As
a result, companies reduce the burden of a direct cost center. Additionally,
with ready access to their data held by the national data repositories, the
companies are able to provide equal access to their data for all appropriate
parties. For example, in 1996, the Norwegian Parliament changed the reporting
requirements for petroleum producers to an on-request system. Now companies
no longer need to worry about meeting some of their reporting deadlines, as
the licensing and tax authorities are able to build the reports directly on demand
from the data held in the repository. Additionally, with these systems, data
transfers during lease sales now take minutes rather than days or weeks,
as all that is required is reassignment of access rights within the data system.
This approach not only maximizes the value of the data repository concept,
but also ensures the long-term value of the data through extremely effective
access and quality control.
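The lease-transfer point can be illustrated with a minimal sketch. It assumes a hypothetical catalog in which each dataset carries a set of authorized parties; the record layout, the function name and the company names are all invented for illustration and do not describe how DISKOS or any real repository is built. The idea is only that a transfer changes who may read the data, while the data themselves stay where they are.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical catalog entry; the stored data never move."""
    dataset_id: str
    lease_id: str
    kind: str  # e.g. "seismic" or "well_log"
    authorized: set = field(default_factory=set)


def transfer_lease(catalog, lease_id, seller, buyer):
    """Reassign access for every dataset tied to a lease; return the IDs changed."""
    changed = []
    for record in catalog:
        if record.lease_id == lease_id and seller in record.authorized:
            record.authorized.discard(seller)
            record.authorized.add(buyer)
            changed.append(record.dataset_id)
    return changed


# Illustrative data only: two datasets on one lease, one on another.
catalog = [
    DatasetRecord("S-001", "LEASE-A", "seismic", {"CompanyX", "regulator"}),
    DatasetRecord("W-014", "LEASE-A", "well_log", {"CompanyX", "regulator"}),
    DatasetRecord("S-002", "LEASE-B", "seismic", {"CompanyZ", "regulator"}),
]

moved = transfer_lease(catalog, "LEASE-A", seller="CompanyX", buyer="CompanyY")
print(moved)
```

In this sketch the regulator's access is untouched by the sale, which mirrors the article's point that authorities can keep building reports from the same holdings.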
The concept of creating data repositories has since spread across the world,
with many countries either having recently established data repositories or
being in the process of initiating operations. In March 2002, the Norwegian
Petroleum Directorate, the Department of Trade and Industry of the United Kingdom,
and the Petrotechnical Open Software Corporation hosted the fourth meeting of
National Geoscience Data Repositories in Stavanger, Norway. Bringing together
data repository personnel from 18 of the approximately 50 countries that house
data repositories, the meeting revealed that national geoscience data repositories
in all countries face similar issues.
A data repository operates as more than a simple library of data.
It is an opportunity for government institutions and industry to cooperate
for mutual benefit. The issues driving the specific functionality and design
of any given country's repository are who owns the data, the priorities
of national investment, and the nature of the geoscience industries in that
country.
Petroleum geoscience activities drive the formation of national geoscience
data repositories. In most countries, the focus is on well logs, seismic
data, and cores and cuttings. Other geoscience data, such as maps, scout tickets,
paleontological samples or materials gathered for mineral exploration, are
also managed, but often at a lower priority. Depending on the particular country's
need, some repositories focus on all subsurface information,
including mineral and groundwater cores.
Ownership rights of geoscience data vary from country to country. Most European
and North American countries grant initial data ownership to the acquiring
entity. These data, however, generally must be reported to the national data
repository and are held proprietary for a set amount of time or until the
company abandons its exploration and production leases. This open-market
approach made some companies reluctant to participate in establishing national
data repositories in these areas. In most other countries, including South
Africa and New Zealand, all geoscience data are the property of the government
and are borrowed by a company for exploration and development.
These countries also hold the data proprietary during the company's operations,
but companies otherwise have little control over their acquired data, particularly
during lease sales or transfers.
Most national data repositories function as government or quasi-government
organizations serving three sets of clients. They must provide data management
services to the operating companies, regulatory data management for the licensing
and tax authorities, and access for economic development agencies that try
to promote future exploration of the country's resources. Thus, the data
repositories are responsible to multiple clients, all of whom have distinctly
different interests and valuations of the geoscience data. The only overarching
need of these groups, beyond basic access, is reasonable data quality.
During the March meeting, most of the representatives said the largest issue
they encounter is the quality of the data provided by the companies. Most
national data repositories spend a substantial proportion of their non-fixed
costs on quality control issues, including evaluation and data correction.
The quality issue is so critical that in some cases, such as New Zealand,
the national data repository will return the data to the acquiring company
until those data are sufficiently improved. The most common quality control
problem is with the metadata, the data about the data. What is often
lacking is sufficient information about where and at what depth each core
was taken, or sufficient navigational information about seismic data. However,
all of the national data repository representatives who attended the March
meeting said that the ultimate responsibility for quality control rests not
with the companies acquiring the data, but with the organizations running
the repositories.
The United States is an exception, as it does not have a centralized data
repository system. The data at risk are also different in the United States.
Whereas other countries make preservation of well log and seismic data the
priority, the data most at risk in the United States are drill cores. Most
countries prioritize the active management of seismic and well log data, because
they are considered easier to manage, can be made digital, and are the
critical first-order data for exploration. Most countries actively preserve
cores and cuttings, but they consider these data the details supporting the
use of well logs and seismic data. The American Geological Institute, with
support from the Department of Energy, is working to establish a National
Geoscience Data Repository System. And the National Research Council recently
published a report calling for preservation of geoscience data (see the story
on page 16). Both of these projects focus on preserving physical data, such
as cores and cuttings. The application of the free market to data preservation
in the United States has created large commercial markets for seismic and
well log resales, but limited commercial interest in cores and cuttings. Just
as in other countries, the cost of preserving the physical data is high compared
to the near-term return on investment. However, while most other countries
do recognize that cores and cuttings have substantial, long-term value, it
is difficult for a free-market system to address such needs.
Digital data management is also an area of concern, for which two distinct
perspectives exist. The largest data management issue is the retention of
original or field data. Representatives agreed unanimously that paper records, such
as well logs, should be scanned and stored in digital databases, followed by
disposal of the paper files, and everywhere except the United States this is
already standard practice. At the same time, particularly with seismic data, no consensus exists
about retaining field data. A number of representatives said their countries
only want to preserve the stacked seismic data that are ready for interpretation.
Yet some representatives, such as those from Brazil, argued that unprocessed
field data must be retained as their value will likely increase significantly
with improvements in processing. However, storing field data is very expensive
because they are produced in large volumes, they must be transcribed periodically
to new media, and they are vulnerable to becoming worthless if their metadata
are lost.
The third major issue is determining whether the perceived need for massive
amounts of digital data to be online is real, or whether a more cost-effective near-line
system, where most digital data can be made available in hours or days, would suffice. Both
DISKOS in Norway and CDA in the United Kingdom provide complete online access
to seismic data and well logs, respectively. Although these systems are exemplary,
those running most other national data repositories do not perceive that their
customers have immediate need for such rapid access, and their clients probably
would not appreciate the costs they would incur to enable such access. The
eventual need to provide more instantaneously accessible online data is widely
recognized, but multi-terabyte data centers require guaranteed funding, and
funding has varied with the level of exploration activity
in each country.
The final major issue all national data repositories face is ensuring the
preservation and accessibility of the data. Key to preservation and accessibility
is valid and extensive metadata, including ownership, location, methodologies
and other pertinent parameters. Regrettably, poor metadata are a common issue
throughout the world, particularly for data that were taken in areas eventually
regarded as uneconomical for exploitation. Without sufficient metadata,
quality control efforts often lead to the disposal of the data, and the fundamental
loss of a potential asset.
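A simple completeness check shows what a first pass at metadata quality control might look like. This is a sketch under stated assumptions: the required field names (owner, location, depth_m, method) are invented to match the kinds of parameters named above, and the accept-or-return triage is only loosely modeled on the practice of sending deficient submissions back to the company. No actual repository's rules are represented.

```python
# Assumed field names, chosen for illustration; real repositories define their own.
REQUIRED_FIELDS = ("owner", "location", "depth_m", "method")


def missing_metadata(record):
    """List the required fields that are absent or blank in a submission."""
    return [name for name in REQUIRED_FIELDS if record.get(name) in (None, "")]


def triage(records):
    """Split submissions into accepted IDs and (ID, gaps) pairs to send back."""
    accepted, returned = [], []
    for record in records:
        gaps = missing_metadata(record)
        if gaps:
            returned.append((record["id"], gaps))
        else:
            accepted.append(record["id"])
    return accepted, returned


# Illustrative submissions: the second core has no recorded depth.
submissions = [
    {"id": "CORE-1", "owner": "CompanyX", "location": "60.5N 2.8E",
     "depth_m": 2150.0, "method": "rotary core"},
    {"id": "CORE-2", "owner": "CompanyX", "location": "60.6N 2.9E",
     "depth_m": None, "method": "rotary core"},
]

accepted, returned = triage(submissions)
print(accepted)
print(returned)
```

Returning a submission with a list of its gaps, rather than discarding it, keeps open the chance that the acquiring company can still supply the missing details.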
A geoscience data repository is more than a dusty building full of rocks and
paper. National data repositories are centers of active data management and
distribution focused on supporting a country's economic and environmental
needs. The United States lacks such a centralized function and, given the
independent natures of companies, states and federal agencies, it is difficult
to envision one evolving. Many states house world-class state data repositories.
But all the domestic data repositories should look beyond U.S. borders and
understand not only data management best practices, but also the vision of
a dynamic data archive and management approach seen throughout the world.