GLUE2 Design

This topic explores the relationship between the Registry Information Model and Data Model and the Glue2 OGF standard.

Registry Information Model and Data Model

Given the needs of Globus.org, Teragrid and BIRN, a single registry with a fixed schema (data model) is unlikely to serve any one of these organizations, let alone all of them simultaneously.  A goal of the Registry work is to provide a repository of resource information to which various organizations/actors can publish and which others can query.  The Registry work seeks to support Globus.org, Teragrid and BIRN, and perhaps other organizations.

Flexibility in the information model is key to achieving these goals.  It is also important to observe that there is not a single registry; rather, there are multiple "kinds" of registry (eg a Resource Catalog, a Capability Registry, a Registry of current SLA measurements, etc.) and multiple instances of any given registry "kind".  It is possible, for example, for a Virtual Organization to be formed and for the administrators of that VO to build an instance of one or more registry "kinds" to serve the needs of that VO.  For example, a VO-specific resource catalog could be created and populated with a subset of resource metadata pertinent to the hardware, software and service resources of primary interest to the members of that VO.  This data would be a replicated subset of another resource catalog, with some sort of synchronization mechanism to keep the VO-specific resource catalog current.

In order to realize and deploy a network of registry instances of various kinds, we need an underlying information model that is flexible enough to let developers specify simple but powerful data transformations that create and synchronize the contents of the various registry instances in the network of registries.
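The kind of transformation described above, a VO-specific catalog kept as a replicated subset of a larger one, can be sketched as follows. This is only an illustration under stated assumptions: the `Record` type, its `version` field, the `sync_subset` function and the use of in-memory dicts are all hypothetical and are not part of Glue2 or of any existing registry implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical catalog record; the field names are illustrative only."""
    uri: str      # identifier of the entity instance
    kind: str     # e.g. "Service" or "Resource"
    version: int  # assumed to be bumped by the source catalog on every change


def sync_subset(
    source: Dict[str, Record],
    replica: Dict[str, Record],
    selector: Callable[[Record], bool],
) -> Tuple[List[str], List[str]]:
    """Make `replica` equal to the records of `source` accepted by `selector`.

    Both catalogs are dicts keyed by URI. Returns the URIs that were added
    or updated, and the URIs that were removed from the replica.
    """
    wanted = {uri: rec for uri, rec in source.items() if selector(rec)}
    changed = sorted(uri for uri, rec in wanted.items() if replica.get(uri) != rec)
    removed = sorted(uri for uri in replica if uri not in wanted)
    for uri in changed:
        replica[uri] = wanted[uri]
    for uri in removed:
        del replica[uri]
    return changed, removed
```

A VO that only cares about services would call `sync_subset(source, replica, lambda r: r.kind == "Service")` periodically; running it twice in a row without changes in the source reports nothing to do.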

We have two choices upon which to base the Registry Information Model:

a) a green field information model we create

or

b) an industry standard of some sort.

It is clear that none of the communities we serve has either the design resources or the patience to invest in developing a brand new systems management resource information model.  The industry has lots of these efforts.  Each new effort takes time and talent to develop, and a significant investment to achieve buy-in from the various organizations that own and run the hardware, software and services our customers want to use.  It seems to me that starting from scratch makes little sense at this point.  Therefore we dismiss option a).

With respect to option b), two industry standards come to mind: the DMTF Common Information Model (CIM) and the Glue standard published by OGF.

While CIM is very comprehensive and has some support among the major IT systems management vendors, its breadth, complexity and WS-* focus suggest it is inappropriate for Grid-centric activities such as Globus.org, BIRN and Teragrid.  Furthermore, many of the important resource providers with which our communities work (eg the major Teragrid resource providers) have some deployment of systems management infrastructure that maintains or generates resource metadata conformant to Glue.  Therefore option b), based on Glue, seems to be the most appropriate basis for the Registry information model.

Overview of Glue2

Glue V 2.0 (aka Glue2) is the latest published version of a standards effort running within the OGF.  The purpose of Glue2 is to define a "standard" information model and data models for defining common entities in a grid computing environment. 

An information model is published in the Glue2 Spec. The information model defines an abstract "Entity" entity and around a dozen specializations.  The following figure shows the Entity specializations defined in the Glue2 information model:

[Figure: Entity specializations defined in the Glue2 information model]

We note that the information model, as represented by the set of entities, covers a lot of territory.  Not only does it have entities for "traditional" network-available hardware, software and services (Resource, Manager, Endpoint, Service), but it also has entities to describe other Virtual Organization concepts (Domain, Share, Contact).  Glue2 also describes an authorization mechanism (Policy) that ties into the Virtual Organization entities and a form of task management (Activity).  The information model also describes 3 possible forms of extensibility: the Extension entity, a specific "OtherInfo" property (for free-form information about any Entity) and traditional "extension by specialization".
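The three forms of extensibility named above can be pictured with a small sketch. The class and attribute names below (`Entity`, `Extension`, `other_info`, `extensions`, `Service.type`) are simplified stand-ins chosen for illustration; they are an assumption and do not reproduce the attribute lists of the Glue2 Spec.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Extension:
    """Form 1: a key/value pair attached to an entity (simplified)."""
    key: str
    value: str


@dataclass
class Entity:
    """Simplified stand-in for the abstract Glue2 Entity."""
    id: str                                                    # URI of the instance
    name: str = ""
    other_info: List[str] = field(default_factory=list)        # form 2: free-form "OtherInfo"
    extensions: List[Extension] = field(default_factory=list)  # form 1: Extension entities


@dataclass
class Service(Entity):
    """Form 3: extension by specialization adds typed attributes."""
    type: str = ""


# Hypothetical instance; the URN and values are made up for the example.
svc = Service(id="urn:example:service:gridftp-1", name="GridFTP", type="org.example.gridftp")
svc.other_info.append("maintenance window: Sundays")
svc.extensions.append(Extension(key="rack", value="B12"))
```

The trade-off is the usual one: `other_info` and `extensions` need no schema change but are opaque to queries, while a specialization such as `Service` gives typed attributes at the cost of a new entity definition.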

Glue2 Strengths

  • The information model is MUCH simpler than CIM
  • Has some standing within the Grid community
  • Some organizations in the Grid community have already modeled their resources based on Glue2
  • Separates the information model from concrete renderings (eg an XML data model, an SQL data model, etc.)
  • Uses URI as the identifier mechanism for entity instances
  • Has rich extensibility mechanisms (although not ones that are commonly used)

Glue2 Weaknesses

  • The information model, although simpler than CIM, is still too monolithic for our purposes, covering territory such as VO entities, the Policy entity, etc. that Globus.org is handling by other means outside a registry initiative.
  • The Contact entity conflates the notion of a "user" with the means to contact that user.  For our purposes it is much more appropriate to model these concepts as separate entities.  Furthermore, these entities are being developed by the User Profile Management service initiative within Globus.org.
  • The XML representation (XML data model) does not fit our needs.  We would prefer to use the Atom Syndication Format to list collections of entity instances, use atom:link elements to model associations between entity instances, and use more standard XML Schema mechanisms (eg attribute and element extensibility through xs:anyAttribute and xs:any).

Glue2-based Registry Information Model

Therefore we propose to base the Registry Information Model on a subset of the Glue2 information model, with some specific concrete specializations to reflect the common (2 or 3 dozen) entities encountered in Grid infrastructure.  We further propose that an Atom-based XML data model be the basis for the concrete data model used by the Registry work.
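To make the Atom-based proposal more concrete, here is a minimal sketch of a feed that lists entity instances as atom:entry elements and models an association with an atom:link. The helper names, the URNs and the `http://example.org/rel/endpoint` relation IRI are hypothetical; elements that a valid Atom document also requires (such as atom:updated) are omitted for brevity.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ET.register_namespace("atom", ATOM)  # serialize with the "atom:" prefix


def entity_entry(uri, title, links):
    """Build an atom:entry for one entity; `links` is a list of (rel, href) pairs."""
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}id").text = uri
    ET.SubElement(entry, f"{{{ATOM}}}title").text = title
    for rel, href in links:
        # Each association to another entity instance becomes one atom:link.
        ET.SubElement(entry, f"{{{ATOM}}}link", {"rel": rel, "href": href})
    return entry


def entity_feed(feed_uri, title, entries):
    """Wrap a collection of entries in an atom:feed."""
    feed = ET.Element(f"{{{ATOM}}}feed")
    ET.SubElement(feed, f"{{{ATOM}}}id").text = feed_uri
    ET.SubElement(feed, f"{{{ATOM}}}title").text = title
    feed.extend(entries)
    return feed


# Hypothetical data: one Service associated with one Endpoint.
service = entity_entry(
    "urn:example:service:gridftp-1",
    "GridFTP service",
    [("http://example.org/rel/endpoint", "urn:example:endpoint:gridftp-1")],
)
feed = entity_feed("urn:example:catalog:services", "Services", [service])
xml_text = ET.tostring(feed, encoding="unicode")
```

Because associations are plain links, a client can follow them without knowing the full entity schema, which is the main attraction of this rendering.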

Subset of Glue2

We will use the following subset of entities from the Glue2 Information Model:

  • Service
  • Endpoint
  • Resource
  • Manager

Specific Extensions of Entity

We will define concrete entities for the following common Grid infrastructure:

  • GridFTP
  • GRAM5 (GRAM services need to be distinguishable as Fork or specific local resource manager)
  • GRAM4/WS-GRAM (GRAM services need to be distinguishable as Fork or specific local resource manager)
  • GRAM2/PreWS GRAM (GRAM services need to be distinguishable as Fork or specific local resource manager)
  • SRB service (Storage Resource Broker/Data Replication Service)
  • (GSI)OpenSSH login service
  • (GSI)OpenSSH SCP w/HPN data movement service
  • Group/Team/VO management service
  • MyProxy/credential service
  • User/Profile service
  • IIS (endpoints for accessing various registries)
  • Infrastructure Portal website
  • Infrastructure Home website
  • Nimbus IaaS
  • EC2 IaaS
  • Data Movement SaaS
  • WS-OGSADAI service
  • RabbitMQ/AMQP service

These legacy services:

  • WS-MDS4
  • WS-Delegation
  • WS-RFT
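
The requirement noted above, that GRAM services be distinguishable as Fork or as a specific local resource manager, could be captured by a concrete entity along the following lines. This is a sketch only: `GramService`, `GramVersion`, the `job_manager` attribute and the example URNs are hypothetical names, not definitions taken from Glue2 or from GRAM.

```python
from dataclasses import dataclass
from enum import Enum


class GramVersion(Enum):
    """The three GRAM generations listed in this section."""
    GRAM2 = "GRAM2/PreWS GRAM"
    GRAM4 = "GRAM4/WS-GRAM"
    GRAM5 = "GRAM5"


@dataclass
class GramService:
    """Hypothetical concrete entity for a GRAM service."""
    id: str                     # URI of the service instance
    version: GramVersion
    job_manager: str = "fork"   # "fork" or the name of a local resource manager

    @property
    def is_fork(self) -> bool:
        """True when jobs are forked directly rather than handed to a scheduler."""
        return self.job_manager.lower() == "fork"


# Made-up instances for illustration.
login_node = GramService("urn:example:gram:1", GramVersion.GRAM5)
batch = GramService("urn:example:gram:2", GramVersion.GRAM5, job_manager="pbs")
```

Keeping the job manager as an attribute of one entity, rather than defining a separate entity per scheduler, keeps the number of concrete entities small while still letting queries select on it.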