New Standards For Hydrological Information

The smallest unit of the “common language” used by the National Hydrologic Services (NHS) is the hydrologic datum.

Hence, the “smaller” the world becomes, the more “enlarged” and comprehensive the “language” of the community of NHSs needs to be. Nowadays, the astonishingly fast technological development of both distributed computing and web services has allowed the sharing of data among a wide range of users, not only for the benefit of a particular NHS itself, but also for the use of the whole community of a given transnational region, crossing international borders, and even, as we all know, transferring data from one continent to another.

Besides, the ever-increasing demand for cross-disciplinary research calls for communication that is less ambiguous than ever.

In a nutshell, as far as sharing data is concerned, the physical distances and boundaries among countries and regions of the globe are becoming less and less relevant, whereas a core challenge faced by users, for the time being, is how to understand the data received from others and how to make their own data understood by the whole community.

 In 1999, after having recognized:

(1) the responsibility of the Members in providing mitigation of water-related hazards and sustainable management of water resources,

(2) the potential benefits of enhanced exchange of hydrological data and information within shared river basins and aquifers,

(3) the continuing need for strengthening the capabilities of NHSs,

(4) the right of Governments to choose the manner by which (and the extent to which) they make hydrological data and products available,

(5) the right of Governments to choose the extent to which they make available internationally data which are vital to national defense and security,

(6) the requirement by some Members that their NHSs earn revenue from users and/or adopt commercial practices in managing their businesses, and

(7) the long-established provision of some hydrological products and services on a commercial basis (and in a competitive environment), and the impacts associated with such arrangements,

WMO’s Resolution 25 adopted “a stand of committing to broadening and enhancing, whenever possible, the free and unrestricted international exchange of hydrological data and products, in consonance with the requirements for WMO’s scientific and technical programmes”.

In June 2000, an expert meeting on the establishment of a global hydrological observation network for climate was held in Germany, led by the Global Terrestrial Observing System (GTOS), the Global Climate Observing System (GCOS) and WMO’s Hydrology and Water Resources Programme (HWRP). The meeting provided a synthesis of the main requirements for hydrological observation data, and identified the following essential drivers for data exchange:

1) Improving climate and weather prediction;

2) Characterizing hydrological variability to detect climate change;

3) Developing the ability to predict the impacts of change;

4) Assessing water sustainability as a function of water use versus water availability; and

5) Understanding the water cycle at global level.

This eventually led to WMO’s Technical Report in Hydrology and Water Resources No. 74, entitled Exchange of Hydrological Data and Products. That report recalled (and thus reinforced) the three types of requirements for hydrological data and products identified in Resolution 25, namely:

(1) those which are, in the last instance, fundamental for the protection of life and property and for the well-being of all nations shall be provided on a free and unrestricted basis;

(2) those which are required to sustain programmes and projects of the UN agencies, the International Council for Science (ICSU) and other organizations of similar status should also be provided where available; and

(3) those that are exchanged under the auspices of WMO for the non-commercial activities of the research and education communities should be provided on a free and unrestricted basis.

The report also described operational hydrology, such as real-time applications (forecasting and warning of extreme events and project operation) and engineering design, as typically being carried out on a national scale, even though international exchange of information may be necessary where there are shared basins across the borders of the countries.

It also noted that hydrological and environmental science, as well as the monitoring of trends in the global environment, may require an international exchange of information.

***

Whenever data need to be shared among different systems, raw information very often arrives stripped of its proper context.

The lack of context is perhaps attributable to the specialized nature or the security requirements of a particular information system.

There are even cases in which one has to climb out of one database in order to climb down into another. In any case, there seems to be no doubt that a great worldwide endeavour is needed towards the development and improvement of consistent information models for gathering and sharing both spatial and temporal data.

From this perspective, the ability to share metadata is paramount, since knowing how to understand the data is fundamental. That is, concisely speaking, one of the main tasks of the Global Earth Observation System of Systems (GEOSS).

A number of initiatives are currently being carried out across the world aiming at dealing with the mammoth complexity of incongruent data sets.

All of these schemes and plans endeavour to bring about improved standards for water information, covering both temporal and spatial data sets at different levels of complexity.

Whenever possible, most of them seek to improve existing methodologies and standards, mainly in order to avoid re-inventing the wheel, so to speak.

An international workshop on Water Resources Information Models, held in Australia in 2007, pointed out that improved efficiency and quality of local information models and systems, wider use and re-use of information, and the development of new tools are some of the benefits of developing shared models. It also indicated that new value drawn from existing information via unexpected uses could be another beneficial side effect.

The workshop therefore recommended that a harmonized information model and transfer formats for water data should be developed. Subsequently, a forum for collaboration on and development of standards for hydrological data, named the Hydrology Domain Working Group (HDWG), was formed by WMO and the Open Geospatial Consortium (OGC).

Depending on the context within which a particular set of hydrological data is to operate, different standards are developed, aimed at different targets, subject to different constraints and focused on different aspirations.

Within the hydrology community, one of the most popular data models for water resources is ArcHydro. A toolset based on this model is available for ArcGIS desktop applications.

That suite of tools makes it possible to create, manipulate and display ArcHydro objects and features, handling raster, vector and time series data in a functional fashion.

Some of the main tools available are Digital Elevation Model (DEM) reconditioning, the creation of flow direction and flow accumulation grids from a DEM grid, stream definition, catchment grid delineation, the creation of streamlines, the creation of a line that follows the longest flow path based on the steepest descent in a catchment or watershed, the calculation of river lengths, and many others.
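To make the idea of deriving flow direction and flow accumulation from a DEM grid more concrete, the following is a minimal sketch of the widely used eight-neighbour steepest-descent scheme in plain Python. It is not the ArcHydro implementation: the function names, the unit cell size and the small elevation grid are assumptions made only for illustration.

```python
# The eight neighbours of a cell as (row offset, column offset).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1),
              (0, -1),           (0, 1),
              (1, -1),  (1, 0),  (1, 1)]


def flow_direction(dem):
    """For each cell, return the (row, col) of its steepest-descent
    neighbour, or None when no neighbour is lower (pit or outlet)."""
    rows, cols = len(dem), len(dem[0])
    target = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_slope = 0.0
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    # Diagonal neighbours are sqrt(2) cell widths away.
                    distance = (dr * dr + dc * dc) ** 0.5
                    slope = (dem[r][c] - dem[nr][nc]) / distance
                    if slope > best_slope:
                        best_slope = slope
                        target[r][c] = (nr, nc)
    return target


def flow_accumulation(dem, target):
    """For each cell, count the cells that drain through it (itself included)."""
    rows, cols = len(dem), len(dem[0])
    acc = [[1] * cols for _ in range(rows)]
    # Visit cells from highest to lowest, so that every contributing cell
    # is processed before the cell it drains into.
    order = sorted(((dem[r][c], r, c) for r in range(rows) for c in range(cols)),
                   reverse=True)
    for _, r, c in order:
        if target[r][c] is not None:
            nr, nc = target[r][c]
            acc[nr][nc] += acc[r][c]
    return acc


# Hypothetical 3 x 3 elevation grid sloping towards the lower right corner.
dem = [[9, 8, 7],
       [8, 6, 5],
       [7, 5, 1]]
directions = flow_direction(dem)
accumulation = flow_accumulation(dem, directions)
```

In this small grid every cell ends up draining to the lower right corner, which therefore accumulates all nine cells. Real DEMs contain flat areas and pits that would have to be treated before such a scheme gives a connected drainage network.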

NWIS is the National Water Information System of the United States Geological Survey (USGS). One tool reads the USGS gauge identification shown on the map and retrieves stream discharge information from NWIS for a specified period of record, in a time series table format.

ESAR is the acronym of Environmental Sampling, Analysis and Results, a data standard created by the Environmental Protection Agency (EPA) of the USA in order to assist the sharing of laboratory result data. It was on the basis of ESAR that the Water Quality Exchange (WQX) was developed by the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) to be used by the EPA. WQX’s focal point is the exchange of water quality information.

The standards used by WQX were developed by the Environmental Data Standards Council, whose primary function is to develop and adopt documented agreements on terms, definitions, and formats.

The initial driver for the development of WaterML 1.0, a standard developed by CUAHSI, was to encode the semantics of hydrological observation discovery and retrieval, and to implement water data services in a generic and unambiguous fashion across different data providers.

Since it was implemented as an Extensible Markup Language (XML) schema and does not make use of any other existing standard, one of the future objectives of developing a harmonized observation model is to allow its convergence with existing standards.
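As a rough illustration of what encoding a time series of observations in XML involves, the sketch below writes and reads back a tiny document with Python's standard library. The element and attribute names (timeSeries, siteCode, variable, values, value, dateTime, unit) are invented for this example and are not the actual WaterML 1.0 schema; the gauge code and readings are hypothetical as well.

```python
import xml.etree.ElementTree as ET


def encode_series(site_code, variable, unit, observations):
    """Serialize (timestamp, value) pairs into a small XML document."""
    root = ET.Element("timeSeries")
    ET.SubElement(root, "siteCode").text = site_code
    ET.SubElement(root, "variable", unit=unit).text = variable
    values = ET.SubElement(root, "values")
    for timestamp, value in observations:
        ET.SubElement(values, "value", dateTime=timestamp).text = str(value)
    return ET.tostring(root, encoding="unicode")


def decode_series(xml_text):
    """Read the document back into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "site_code": root.findtext("siteCode"),
        "variable": root.findtext("variable"),
        "unit": root.find("variable").get("unit"),
        "observations": [(v.get("dateTime"), float(v.text))
                         for v in root.find("values")],
    }


# Hypothetical gauge and readings, used only to exercise the round trip.
document = encode_series("G001", "discharge", "m3/s",
                         [("2007-01-01T00:00:00", 12.5),
                          ("2007-01-02T00:00:00", 13.0)])
record = decode_series(document)
```

The point of the sketch is that the receiver can only interpret the values because the site, the variable and the unit travel with them; agreeing on those names and their meaning is precisely what a shared standard provides.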

To meet its own needs, the Bureau of Meteorology (BOM) of Australia developed the Water Data Transfer Format (WDTF), guided by the principles of a solid modeling foundation, “adopt, adapt, invent”, validation, XML for current tools, and normalization. Its purpose is to encode the information that the state water agencies, which take the hydrological measurements, send to BOM.

Not only observational data are addressed, but also descriptions of features, transactional information for synchronizing with a data warehouse, conversions (such as rating table conversion), and water quality samples.

Stating that “the loss of time and resources in searching for existing spatial data or establishing whether they may be used for a particular purpose is a key obstacle to the full exploitation of the data available”, the Infrastructure for Spatial Information in the European Community (INSPIRE) initiative is tasked with developing a broad infrastructure for sharing spatial data sets within the European Community.

The United Kingdom Environment Agency Time Series Data Exchange (UK-EA-TS) standard addresses rainfall amounts, river levels, river flows, tide levels, lake and reservoir levels, groundwater levels, areal modeled evaporation, soil moisture deficits, water quality parameters (such as dissolved oxygen and ammonia quantities), atmospheric temperature, wind speed and radiation, among others.

SANDRE is the French acronym that stands for the National Data Reference Centre for Water (Secrétariat National des Données Relatives à l’Eau). Managed by the International Office for Water (OIEAU), SANDRE is in charge of putting a shared language into practice, setting up dictionaries, technical definitions, conceptual data models and reference lists for water data exchange within the French territory. It is part of the French Information System for Water (SIE), which is based on a set of specifications, rules and datasets entitled Water Data Frame of Reference, which, in turn, aims at technical and semantic interoperability by means of the development of a common language for both exchange services and data banks.

SANDRE’s top priorities are to make data definitions compatible and homogeneous, so that data can be exchanged among producers, users and databanks at any administrative and hydrographic level. It also provides, free of charge, specification documents for use in water data banking and interchange. SANDRE makes use of International Organization for Standardization (ISO) standards for its metadata definitions, and of several OGC service interfaces for exposing data assets.

According to many experts, the good quality of the information models developed within SANDRE makes it an instrument of particular interest to the process of harmonizing data standards.

The Marine Metadata Interoperability (MMI) is a project aimed at the promotion of the exchange, the integration and the use of marine data via enhanced data publishing, discovery, documentation and accessibility.

Some prominent activities within MMI are guides that introduce readers to all aspects of metadata and best practices; an ontology repository associated with a semantic framework that can be used by the marine community to store, manage and work with the vocabulary; and references to vocabularies, standards and best practices, among many other useful resources. Although it is not aimed at hydrological purposes, having a look at it may be useful for the future of hydrologic data and metadata transfers.

The German Federal Waterways Engineering and Research Institute (BAW, Bundesanstalt für Wasserbau) developed Xhydro for transferring information for its own use within Germany. The key point of interest is its time series model.

According to its documentation, a generic conceptual model has been created in a way that other schemes can be created from it, to address particular needs, which can be said to be the core premise of the proposed methodology. Last, but not least, its modularity also lends a hand when assessing the standard from a harmonization viewpoint.

The development of standards associated with encoding and transmitting sensor data, sensor descriptions, control, alerting and processing within the OGC is led by a group known as Sensor Web Enablement (SWE).

SensorML, the Sensor Model Language Encoding Standard, specifies models that provide a framework within which the dynamic, geometric, and observational characteristics of both sensors and sensor systems can be defined. Within SWE there is a common specification called SWE Common, which resulted from the shared need of SensorML and Observations & Measurements (O&M) for common data models. It was initially defined within the SensorML Technical Specification, but later acquired its own specification so that it can be easily used by different standards.

The Global Runoff Data Centre (GRDC) developed a metadata profile for hydrological datasets, using a model-driven approach aligned with the metadata specifications developed by ISO. Its definitions of hydrological features, as well as its use of standards, make this specification an important tool for the future harmonization of standards.

The Climate Science Modelling Language (CSML) is a standards-based data model and a Geography Markup Language (GML) application schema for atmospheric and oceanographic data. It tries to capture the important semantics of climate science data in a generic fashion. Since it provides an abstract semantic model for representing nonspecific data objects, CSML seems worth investigating by the NHSs.

Ground Water Markup Language (GWML) is also a GML application schema focused on the definition of features. Its scope covers the geological aspects of groundwater and the technical details of wells and groundwater measurements.

The data covered include aquifers, water quantity, flow systems, water quality, suspended, dissolved and colloidal contents, water wells and well components such as screens and casing. It seems that it can be used as a methodological reference.

Although the Integrated Ocean Observing System (IOOS) was designed to collect, deliver and use ocean information, the fact that the project makes use of O&M, GML and SWE Common makes it a somewhat generic approach that can serve as guidance for NHSs in developing common standards for hydrologic data exchange.

More than half a century ago, the Brazilian Engineer of German origin Otto Pfafstetter developed a codification for hydrographic basins based on successive subdivisions of the drainage areas, to which integer numerical values are progressively associated. The method starts at continental scale and continues until it reaches the level of tiny brooks and rivulets.

The procedure always works upwards; in other words, it is always carried out from downstream to upstream. According to this methodology, the main watercourse is the one that drains the largest area. The process starts at the main mouth (or estuary) and, at each confluence, follows the stretch with the largest contribution area, regardless of the cartographic name of the stretch. Since the names of the rivers are not taken into account, a given watercourse may, under this method, bear several geographic names.

This, by the way, is far from unusual in real life, since a given river is not infrequently given different names by the different peoples and cultures that share its waters. That is how the main watercourse is identified by this method.

After having identified the main watercourse, the basins of its four tributaries that have the largest contribution area are numbered with even integers (2, 4, 6 and 8) from downstream to upstream.

As illustrative examples, in South America, basin 2 is the Orinoco river basin, basin 4 is the Amazon river basin, basin 6 is the Tocantins river basin, and basin 8 is the Paraná river basin. In North America, 2 stands for the Mackenzie river basin, 4 represents the Nelson river basin, 6 denotes the Saint Lawrence river basin, and 8 corresponds to the Mississippi river basin. In Africa, the Congo river basin is numbered as basin 2, the Zambezi river basin is labeled as basin 4, the Nile river basin is designated by basin 6, and the Niger river basin is indicated as basin 8.

The contribution areas that drain directly to the main watercourse are called inter-basins. Each inter-basin is given an odd integer (1, 3, 5, 7 and 9), also from downstream to upstream. It follows that inter-basin 1 is the one that contains the mouth (or the estuary) of the main river, and that inter-basin 9 is where its headwaters are.
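One level of this numbering can be sketched in a few lines of Python. The function below is only an illustration of the rule as described here: it assumes the tributaries of the main watercourse are already listed from the mouth upstream together with their contribution areas, and the tributary names and areas are hypothetical.

```python
def pfafstetter_level(tributaries):
    """Assign one level of digits to the tributaries of a main watercourse.

    tributaries: list of (name, contribution_area), ordered from the mouth
    upstream. The four largest receive 2, 4, 6 and 8 (downstream to
    upstream); every other tributary receives the odd digit of the
    inter-basin into which it drains.
    """
    if len(tributaries) < 4:
        raise ValueError("at least four tributaries are needed")
    by_area = sorted(range(len(tributaries)),
                     key=lambda i: tributaries[i][1], reverse=True)
    largest = sorted(by_area[:4])  # back to downstream-to-upstream order
    codes = {}
    interbasin = 1  # inter-basin 1 contains the mouth
    for i, (name, _area) in enumerate(tributaries):
        if i in largest:
            codes[name] = 2 * (largest.index(i) + 1)
            interbasin = codes[name] + 1  # next inter-basin upstream
        else:
            codes[name] = interbasin
    return codes


# Hypothetical tributaries, listed from the mouth upstream (areas in km2).
example = [("A", 10), ("B", 120), ("C", 15), ("D", 300),
           ("E", 90), ("F", 5), ("G", 200)]
codes = pfafstetter_level(example)
```

For this hypothetical list, B, D, E and G become basins 2, 4, 6 and 8, while the minor tributaries A, C and F fall into inter-basins 1, 3 and 7 respectively. Applying the same function again inside each basin or inter-basin would produce the next digit of the code.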

These integers are then appended to the right of the code of the parent basin or inter-basin. For each basin and inter-basin, the process continues until it reaches the smallest basin at the cartographic scale used by the Brazilian National Water Resources Information System (SNIRH), which is 1:1,000,000. For instance, basin 7464 is the basin of the Gorutuba river, a tributary of the Verde Grande river (whose basin is 746), which, in turn, is a tributary of the São Francisco river, whose basin is 74 and which belongs to the immense inter-basin 7.
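Because each digit is simply appended to the code of the parent unit, the whole hierarchy can be read back from the code itself. The short sketch below does this for the example just given; the function name is an assumption, and only the digits 1 to 9 described in this text are handled.

```python
def decompose(code):
    """Split a basin code into its nested levels, coarsest first.

    An even last digit marks a tributary basin and an odd one an
    inter-basin, following the numbering described in the text.
    """
    if not code or any(d not in "123456789" for d in code):
        raise ValueError("expected a non-empty string of digits 1 to 9")
    levels = []
    for depth in range(1, len(code) + 1):
        prefix = code[:depth]
        kind = "basin" if int(prefix[-1]) % 2 == 0 else "inter-basin"
        levels.append((prefix, kind))
    return levels


# The Gorutuba example: 7 (inter-basin), 74 (Sao Francisco),
# 746 (Verde Grande), 7464 (Gorutuba).
levels = decompose("7464")
```

Reading the result from left to right reproduces the chain given above: inter-basin 7, then basins 74, 746 and 7464.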

Inter-basin 7 runs anticlockwise around the South American continent from the mouth of the Paraná river, near Buenos Aires, to the mouth of the Tocantins river in the neighborhood of Belém, in the Amazon rainforest.

An essential characteristic of this representation is its topological consistency, i.e., the hydrologic flow of the watercourses is correctly represented. Another convenient feature is that increasing levels of basin detail can be easily incorporated, to such an extent that in most cases there is no need to change the codes derived from former divisions; thus, the coherence of the codification is maintained at any scale.

Furthermore, because the method integrates easily with Geographic Information Systems, its logic makes it straightforward to implement tabular queries that have the same topological consistency as conventional spatial queries.
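A tabular query of this kind can be sketched as a pure comparison of code strings, with no geometry involved. The rule used below is derived from the numbering described above rather than taken from the SNIRH implementation: a unit drains through another if, from the first digit where the two codes differ, the downstream code contains only odd digits (it stays on the main watercourse) and the upstream code has the larger digit. The function name is an assumption.

```python
def drains_through(upstream_code, downstream_code):
    """Return True if the unit `upstream_code` lies inside, or upstream of,
    the unit `downstream_code`, judging only by the digits of the codes."""
    a, b = downstream_code, upstream_code
    for k in range(min(len(a), len(b))):
        if a[k] != b[k]:
            # From the first differing digit onward, `a` must stay on the
            # main watercourse (odd digits only) and `b` must join it
            # further upstream (larger digit).
            return b[k] > a[k] and all(int(d) % 2 == 1 for d in a[k:])
    # One code is a prefix of the other: `b` lies inside `a` only when it
    # is at least as detailed as `a`.
    return len(b) >= len(a)


# Checks based on the codes quoted in the text.
gorutuba_into_verde_grande = drains_through("7464", "746")
gorutuba_through_sao_francisco_mouth = drains_through("7464", "741")
```

With the codes quoted earlier, the Gorutuba basin (7464) is found to lie inside the Verde Grande basin (746) and to drain through inter-basin 741, the one containing the mouth of the São Francisco, whereas it does not drain through basin 744, which is a separate tributary.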

According to the terminology adopted by the SNIRH, a river stretch is defined as any portion of a watercourse that lies between two adjacent tributaries of the watercourse.

Thus, the number of river stretches that a certain watercourse may have is somewhat dependent on the scale used. As already noted, the scale used by the SNIRH is 1:1,000,000.

SNIRH’s methodology essentially consists in splitting the hydrographic network into watercourse stretches and generating the contribution area for each stretch by means of a DEM such as, for instance, the data provided by the Shuttle Radar Topography Mission (SRTM), spearheaded by NASA and the USA’s National Geospatial-Intelligence Agency (NGA).

Then, based on the Hydrographic Basin Coding System described above, both upstream and downstream topological hydrographic information is stored for each river stretch.

Some typical information that can be immediately obtained includes: the land use, the vegetation and the geology of the drainage area of any river stretch, the users located upstream of a certain point, the population and the drainage area upstream of any river stretch, and the stream gauges and/or hydrometeorological stations upstream of a certain village or city, among other convenient attributes.
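The sketch below shows, under simplifying assumptions, how upstream totals such as drainage area and population could be obtained once each stretch records which stretch lies immediately downstream of it. The table layout, field names, stretch identifiers and figures are all hypothetical and are not taken from the SNIRH database.

```python
def upstream_totals(stretches):
    """Accumulate area and population over everything upstream of each stretch.

    stretches: dict mapping a stretch id to a dict with the keys
    "downstream" (id of the next stretch, or None at the outlet),
    "area_km2" and "population" (values of the stretch's own
    contribution area). The network is assumed to contain no loops.
    """
    totals = {sid: {"area_km2": s["area_km2"], "population": s["population"]}
              for sid, s in stretches.items()}
    for sid, s in stretches.items():
        # Add this stretch's own values to every stretch downstream of it.
        current = s["downstream"]
        while current is not None:
            totals[current]["area_km2"] += s["area_km2"]
            totals[current]["population"] += s["population"]
            current = stretches[current]["downstream"]
    return totals


# Hypothetical network: s1 and s2 join into s3, which flows into s4.
network = {
    "s1": {"downstream": "s3", "area_km2": 10.0, "population": 100},
    "s2": {"downstream": "s3", "area_km2": 20.0, "population": 200},
    "s3": {"downstream": "s4", "area_km2": 5.0, "population": 50},
    "s4": {"downstream": None, "area_km2": 8.0, "population": 0},
}
totals = upstream_totals(network)
```

In this example the outlet stretch s4 collects the area and population of the whole network, while the headwater stretches s1 and s2 keep only their own values. The same traversal could gather any other attribute attached to the stretches, such as gauges or water users.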

The concept as a whole is referred to by the SNIRH staff as Hydro-Referencing, according to which all the information is geographical.

Taken together, the set of procedures presently being developed by the SNIRH team is gradually becoming a new, robust and conceptually consistent standard for hydrological information.

Metadata standards, specifically, must provide for the creation of “formal metadata” with consistent collection criteria, terminology and structure, such that they can be approved by a standards organization such as ISO.

They specify the types of information needed to describe the data and the digital storage models.

If dealing exclusively with data, the metadata standard is said to be of the descriptive (or content) type, which means that it specifies a list (or hierarchy) of required metadata elements to be included in the metadata description. The specific information contained in each metadata element is combined with the content of the other elements to fully describe a data set.
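In practice, a content standard of this kind lends itself to a simple completeness check. The sketch below assumes a flat record and an invented list of required elements; the element names are hypothetical and do not reproduce any of the standards discussed here.

```python
# Hypothetical list of required elements, for illustration only.
REQUIRED_ELEMENTS = ("title", "abstract", "contact",
                     "temporal_extent", "spatial_extent")


def missing_elements(record, required=REQUIRED_ELEMENTS):
    """Return the required elements that are absent or empty in `record`."""
    missing = []
    for name in required:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Hypothetical metadata record for a discharge data set.
record = {
    "title": "Daily discharge at a hypothetical gauge",
    "abstract": "Mean daily discharge derived from stage readings.",
    "contact": "",
    "temporal_extent": "2000-01-01/2007-12-31",
}
problems = missing_elements(record)
```

Here the record would be rejected because the contact element is empty and the spatial extent is absent. A real content standard would also constrain the form of each element, which this sketch does not attempt.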

If dealing exclusively with digital storage models, the metadata standard is of the format type, which describes the digital storage and structural requirements of metadata.

This kind of standard is implemented in file formats, and must ensure that different software programs are able to read or query the data. Metadata standards may be of either type, or of both.

Metadata standards may be (1) widely used, such as the ones adopted by major national or international organizations or communities; (2) broadly applicable, designed to describe a variety of different types of data that users might have; and (3) maintained, that is, supported by a standards body and extended and updated as appropriate.

ISO 19115 is one of the best-known metadata standards. It was developed by ISO to describe geographic information, contains both mandatory and optional components organized into sections, and has defined methods for extending the standard to fit specific needs.

The Content Standard for Digital Geospatial Metadata (CSDGM) was developed (and maintained) by the Federal Geographic Data Committee of the USA. It is mandatory for any geospatial data produced by USA Federal Agencies. Many US State and Local Governments also use it.

Certainly one of the oldest standards is the Directory Interchange Format (DIF), which was created in 1987 as a way of creating catalog inter-operability. As far as the number of elements is concerned, it is one of the smallest standards, having only eight mandatory elements out of thirty-six total elements.

Dublin Core is another small standard. Although it was originally designed to deal with bibliographic data, it is presently used much more widely, due to its flexibility.

ADN takes its name from the initials of ADEPT/DLESE/NASA, whose metadata framework it is. It was originally designed to describe educational resources for Earth sciences educators. Presently, it is used much more widely, for instance to describe geospatial and temporal aspects of data. The set of elements that the metadata cataloguer is required to include is very small.

A small set for the resource creator to fill out, such as language and terms, is also included. Finally, there is a larger set of optional elements that can describe the content in more detail.

Antônio Cardoso Neto