Relations between Users, Groups and datasets

Related Issues    Document Status

#17               In Progress

#21               In Progress

Section authors:

Philipp S. Sommer, Linda Baldewein, Hatef Takyar, Andrea Pörsch, Emanuel Söding, Housam Dibeh, Carsten Lemmen, Elke Meyer, Marcus Lange, Sascha Hokamp, Ute Daewel, Julian Quinting, Annika Oertel, Andreas Lehmann, Klaus Getzlaff

Datasets, groups, users and authors are related to each other in the model data explorer framework. The kind of relation between two objects

  1. determines how the relation is displayed on the frontend

  2. has implications for who can edit what

  3. categorizes the content and makes it findable

This is necessary to generate pages where all datasets of a data group are listed, or all datasets that an author participated in. With this strategy we honor active users and groups that make their data available to the public, and it helps to browse the available resources.

In the following, we describe the relations that are possible in the model data explorer, namely between

  • users and authors

  • authors and datasets

  • users and data groups

  • data groups

  • datasets

  • datasets and data groups

See the Terminology section if you want to know more about the different terms.

Relation-based permission system

From a database perspective, a relation is a many-to-many relationship with a certain access right. Consider the following model:

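from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetUserRelation:
    """Conceptual model; Dataset, User and Permission are defined elsewhere."""

    left: "Dataset"
    right: "User"
    permissions: List["Permission"]
    relation_type: List[str]  # e.g. "originated by", "distributed by", ...
    description: Optional[str] = None  # Human-readable description, optional.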

Here we have a relation between a dataset and a user. This relation is described by a certain relation_type coming from a controlled vocabulary (see the roles for Authors and Contact Persons for instance) and involves two parties: a left party (the dataset) and a right party (the user). Our aim within the model data explorer is to get this information (relation_type, left and right party) from the metadata in the original dataset, e.g. the netCDF-Header or INSPIRE ISO (see Metadata of datasets).

Such a relation may, however, also imply some privileges. If a user is the distributor of a dataset, he or she should also have the possibility to edit the dataset. Therefore, the creator of a dataset can equip a relation with permissions, e.g. the permission to edit a dataset, to view a dataset, or to list a dataset on the user's personal page in the model data explorer.

These permissions must be confirmed by both parties, the granting party (the creator of the left dataset) and the granted party (the right user in the relation above).
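The two-sided confirmation can be sketched in plain Python, independent of the database layer. The class below mirrors the left_approved and right_approved flags of the RelationPermission model in this section; the class name PermissionGrant and the is_active property are assumptions made for illustration, not part of the model data explorer.

```python
from dataclasses import dataclass


@dataclass
class PermissionGrant:
    """A permission attached to a relation (hypothetical helper class)."""

    name: str                     # e.g. "can_edit"
    left_approved: bool = False   # confirmed by the granting party
    right_approved: bool = False  # confirmed by the granted party

    @property
    def is_active(self) -> bool:
        # The permission only takes effect once both parties have agreed.
        return self.left_approved and self.right_approved


grant = PermissionGrant(name="can_edit", left_approved=True)
print(grant.is_active)  # False: the user has not confirmed yet

grant.right_approved = True
print(grant.is_active)  # True: both parties confirmed
```

Keeping the two flags separate makes it possible to show pending requests to each party individually.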

As such, our database looks like this:

digraph model_graph {
  // Dotfile by Django-Extensions graph_models
  // Created: 2023-02-14 14:49
  // Cli Options: --output /home/docs/checkouts/readthedocs.org/user_builds/mde-prototype/checkouts/latest/source/tmp_graph.dot templateapp

  fontname = "Roboto"
  fontsize = 8
  splines  = true
  rankdir = "TB"

  node [
    fontname = "Roboto"
    fontsize = 8
    shape = "plaintext"
  ]

  edge [
    fontname = "Roboto"
    fontsize = 8
  ]

  // Labels


  templateapp_models_RelationPermission [label=<
    <TABLE BGCOLOR="white" BORDER="1" CELLBORDER="0" CELLSPACING="0">
    <TR><TD COLSPAN="2" CELLPADDING="5" ALIGN="CENTER" BGCOLOR="#1b563f">
    <FONT FACE="Roboto" COLOR="white" POINT-SIZE="10"><B>
    RelationPermission
    </B></FONT></TD></TR>
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto"><B>id</B></FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto"><B>BigAutoField</B></FONT>
    </TD></TR>
  
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto">left_approved</FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto">BooleanField</FONT>
    </TD></TR>
  
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto">name</FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto">CharField</FONT>
    </TD></TR>
  
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto">right_approved</FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto">BooleanField</FONT>
    </TD></TR>
  
  
    </TABLE>
    >]

  templateapp_models_DatasetUserRelation [label=<
    <TABLE BGCOLOR="white" BORDER="1" CELLBORDER="0" CELLSPACING="0">
    <TR><TD COLSPAN="2" CELLPADDING="5" ALIGN="CENTER" BGCOLOR="#1b563f">
    <FONT FACE="Roboto" COLOR="white" POINT-SIZE="10"><B>
    DatasetUserRelation
    </B></FONT></TD></TR>
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto"><B>id</B></FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto"><B>BigAutoField</B></FONT>
    </TD></TR>
  
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto"><B>dataset</B></FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto"><B>ForeignKey (id)</B></FONT>
    </TD></TR>
  
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto"><B>user</B></FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto"><B>ForeignKey (id)</B></FONT>
    </TD></TR>
  
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto">description</FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto">TextField</FONT>
    </TD></TR>
  
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto">visible</FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto">BooleanField</FONT>
    </TD></TR>
  
  
    </TABLE>
    >]

  templateapp_models_Dataset [label=<
    <TABLE BGCOLOR="white" BORDER="1" CELLBORDER="0" CELLSPACING="0">
    <TR><TD COLSPAN="2" CELLPADDING="5" ALIGN="CENTER" BGCOLOR="#1b563f">
    <FONT FACE="Roboto" COLOR="white" POINT-SIZE="10"><B>
    Dataset
    </B></FONT></TD></TR>
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto"><B>id</B></FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto"><B>BigAutoField</B></FONT>
    </TD></TR>
  
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto">title</FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto">CharField</FONT>
    </TD></TR>
  
  
    </TABLE>
    >]




  // Relations

  templateapp_models_DatasetUserRelation -> templateapp_models_Dataset
  [label=" dataset (datasetuserrelation)"] [arrowhead=none, arrowtail=dot, dir=both];
  django_contrib_auth_models_User [label=<
  <TABLE BGCOLOR="white" BORDER="0" CELLBORDER="0" CELLSPACING="0">
  <TR><TD COLSPAN="2" CELLPADDING="4" ALIGN="CENTER" BGCOLOR="#1b563f">
  <FONT FACE="Roboto" POINT-SIZE="12" COLOR="white">User</FONT>
  </TD></TR>
  </TABLE>
  >]
  templateapp_models_DatasetUserRelation -> django_contrib_auth_models_User
  [label=" user (datasetuserrelation)"] [arrowhead=none, arrowtail=dot, dir=both];

  templateapp_models_DatasetUserRelation -> templateapp_models_RelationPermission
  [label=" permissions (datasetuserrelation)"] [arrowhead=dot arrowtail=dot, dir=both];


}

Representation of the Dataset - User relation.

from django.contrib.auth.models import User
from django.db import models


class RelationPermission(models.Model):

    name = models.CharField(max_length=20)
    left_approved = models.BooleanField(default=False)
    right_approved = models.BooleanField(default=False)


class DatasetUserRelation(models.Model):

    description = models.TextField(
        help_text="Human-readable description of the relation."
    )
    dataset = models.ForeignKey("Dataset", on_delete=models.CASCADE)
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    permissions = models.ManyToManyField(RelationPermission)
    visible = models.BooleanField(default=True)


class Dataset(models.Model):

    title = models.CharField(max_length=50)
    users = models.ManyToManyField(
        User, through=DatasetUserRelation
    )

This method allows us to implement a fine-grained permission system where the dataset owner can tightly control who can do what with his or her data. To simplify this, we also define roles within the model data explorer, which are preconfigured permission sets. If a user has the Owner role for a dataset, for instance, the corresponding DatasetUserRelation has the can_edit, can_view, can_delete and can_list permissions (or even more fine-grained ones, see Users and datasets).

Users and Authors

A user, i.e. a person with an active login for the model data explorer, is always associated with one specific author.

Not every author has a user, but every author is related to one or more datasets and is listed on the detail page of the dataset.

Authors and datasets

An author can be a contributor to a dataset, meaning that the author participated in the generation of a dataset.

This has no further implications, but all authors are listed when displaying the metadata of a dataset.

Users and datasets

A user can be linked to a dataset and can be equipped with a combination of the following roles:

Owner

The user has full control over the dataset. He or she can

  • delete the dataset

  • change the metadata

  • change the groups

  • register new services (for visualization, analysis, etc.)

  • remove services

Data Manager

The user can register and remove services.

Editor

The user can change the metadata of a dataset. He or she can

  • change the metadata

  • change the groups

Viewer

The user can see the metadata and services of the dataset.
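Since a user can hold a combination of these roles, the resulting rights are the union of what each role allows. The following is a minimal sketch of that idea, assuming a Flag enumeration for the individual actions. The names Capability, ROLE_CAPABILITIES and capabilities_for are hypothetical, and whether the Owner role implies viewing is an assumption derived from "full control".

```python
from enum import Flag, auto


class Capability(Flag):
    """Actions on a dataset (names are illustrative assumptions)."""

    VIEW = auto()
    EDIT_METADATA = auto()
    EDIT_GROUPS = auto()
    MANAGE_SERVICES = auto()
    DELETE = auto()


# Each role is a preconfigured set of capabilities, following the lists above.
ROLE_CAPABILITIES = {
    "Owner": (
        Capability.VIEW | Capability.EDIT_METADATA | Capability.EDIT_GROUPS
        | Capability.MANAGE_SERVICES | Capability.DELETE
    ),
    "Data Manager": Capability.MANAGE_SERVICES,
    "Editor": Capability.EDIT_METADATA | Capability.EDIT_GROUPS,
    "Viewer": Capability.VIEW,
}


def capabilities_for(roles):
    """Combine the capabilities of all roles a user holds on a dataset."""
    combined = Capability(0)
    for role in roles:
        combined |= ROLE_CAPABILITIES[role]
    return combined


caps = capabilities_for(["Editor", "Viewer"])
print(Capability.EDIT_METADATA in caps)  # True
print(Capability.DELETE in caps)         # False
```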

Authors and data groups

No relations between authors and data groups are planned, mainly because there is no way to validate whether an author is really a member of a data group: the author cannot confirm the membership unless he or she has an associated user account.

Users and data groups

A data group is a collection of users and datasets. Each related user of a data group can have one or more of the following roles:

Owner

The user has full control over the data group and the associated contents. He or she can

  • delete the group

  • change the metadata of the group

  • change the group-to-group relations

  • remove or add linked datasets

  • add new users to the group

  • approve roles of users

  • remove (disable) users from the group

User manager

A user manager can control who is in the data group and approve roles.

Data manager

The user has Data manager privileges on all datasets that are owned by the group (see Users and datasets above, and Datasets and data groups below).

Data editor

The user has Editor privileges on all datasets that are owned by the group (see Users and datasets above, and Datasets and data groups below).

Editor

The user can change the metadata, e.g.

  • title and group description

  • group-to-group relations

Member

The user can view all datasets that the group has view permissions on. Members of a group can also grant the following permissions to the group:

  1. edit permissions: all datasets of the user can be edited by the data managers/editors of the group

  2. view permissions: all datasets of the user can be viewed by members of this group

These permissions can also be granted on a per-dataset basis (see Datasets and data groups below).

He or she can furthermore specify if all datasets of the user are automatically marked as products of the group (see below).
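How roles in a data group translate into roles on individual datasets could look roughly as follows. This is only a sketch under assumptions: the mapping GROUP_TO_DATASET_ROLE and the function dataset_roles_via_group are hypothetical names, and the sketch treats Member as an explicit role rather than one implied by the other group roles.

```python
# Group role -> dataset role granted on datasets owned by the group.
GROUP_TO_DATASET_ROLE = {
    "Data manager": "Data Manager",
    "Data editor": "Editor",
}


def dataset_roles_via_group(group_roles, group_owns_dataset, group_can_view):
    """Derive the dataset roles a user obtains through a data group.

    group_roles: roles of the user within the data group
    group_owns_dataset: whether the dataset is owned by the group
    group_can_view: whether the group has view permission on the dataset
    """
    roles = set()
    if group_owns_dataset:
        for group_role in group_roles:
            if group_role in GROUP_TO_DATASET_ROLE:
                roles.add(GROUP_TO_DATASET_ROLE[group_role])
    # Members may view every dataset the group has view permission on.
    if "Member" in group_roles and group_can_view:
        roles.add("Viewer")
    return roles


print(sorted(dataset_roles_via_group({"Data editor", "Member"}, True, True)))
# ['Editor', 'Viewer']
```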

Data groups

Data groups can also be related to each other. This way, we can visualize the network of a group, and we can make sure that the content is maintained, even if a group ends.

Relations between two groups must be confirmed by the owners of both groups (i.e. users with can_edit permission).

Two groups can be related in the following ways:

Group A is parent of Group B

Members of a data group might form a subgroup of a larger group to visualize the content of this subgroup on a dedicated page.

  • All datasets related to the child group are also related to the parent group

  • Owners of the parent group have the same rights as the owners of the child group

  • Members of the child group have view permissions on datasets of the parent group

  • Members of the parent group do not automatically have view permissions on the items of the child group (unless explicitly configured)

Example: Helmholtz-Zentrum Hereon is the parent of the Institute of Coastal Systems - Analysis and Modeling.
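The first rule of the parent relation (datasets of the child are also related to the parent) can be illustrated with a small recursive lookup. The Group class is a hypothetical stand-in, not the Django model, and the dataset identifiers are invented. The sketch assumes that the rule applies transitively to nested subgroups and that the hierarchy contains no cycles.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Group:
    """Minimal stand-in for a data group (hypothetical helper class)."""

    name: str
    datasets: Set[str] = field(default_factory=set)
    children: List["Group"] = field(default_factory=list)


def related_datasets(group: Group) -> Set[str]:
    """Datasets of a group plus those of all its (nested) child groups."""
    result = set(group.datasets)
    for child in group.children:
        result |= related_datasets(child)
    return result


institute = Group(
    "Institute of Coastal Systems - Analysis and Modeling",
    datasets={"dataset-a"},
)
hereon = Group(
    "Helmholtz-Zentrum Hereon", datasets={"dataset-b"}, children=[institute]
)

print(sorted(related_datasets(hereon)))     # ['dataset-a', 'dataset-b']
print(sorted(related_datasets(institute)))  # ['dataset-a']
```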

Group A is collaborating with Group B

A permanent collaboration between two groups has a visual implication only: it acknowledges the partner group for its contribution.

It has no implications on permissions and only serves to display the network on the webpage.

It also adds the datasets of Group B as datasets through collaboration with Group A (see below).

  • MuSSeL is collaborating with Hereon

  • Hereon is collaborating with AWI

digraph model_graph {
  // Dotfile by Django-Extensions graph_models
  // Created: 2023-02-14 14:49
  // Cli Options: --output /home/docs/checkouts/readthedocs.org/user_builds/mde-prototype/checkouts/latest/source/tmp_graph.dot templateapp

  fontname = "Roboto"
  fontsize = 8
  splines  = true
  rankdir = "TB"

  node [
    fontname = "Roboto"
    fontsize = 8
    shape = "plaintext"
  ]

  edge [
    fontname = "Roboto"
    fontsize = 8
  ]

  // Labels


  templateapp_models_DataGroup [label=<
    <TABLE BGCOLOR="white" BORDER="1" CELLBORDER="0" CELLSPACING="0">
    <TR><TD COLSPAN="2" CELLPADDING="5" ALIGN="CENTER" BGCOLOR="#1b563f">
    <FONT FACE="Roboto" COLOR="white" POINT-SIZE="10"><B>
    DataGroup
    </B></FONT></TD></TR>
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto"><B>id</B></FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto"><B>BigAutoField</B></FONT>
    </TD></TR>
  
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto">name</FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto">CharField</FONT>
    </TD></TR>
  
  
    </TABLE>
    >]

  templateapp_models_RelationPermission [label=<
    <TABLE BGCOLOR="white" BORDER="1" CELLBORDER="0" CELLSPACING="0">
    <TR><TD COLSPAN="2" CELLPADDING="5" ALIGN="CENTER" BGCOLOR="#1b563f">
    <FONT FACE="Roboto" COLOR="white" POINT-SIZE="10"><B>
    RelationPermission
    </B></FONT></TD></TR>
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto"><B>id</B></FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto"><B>BigAutoField</B></FONT>
    </TD></TR>
  
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto">left_approved</FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto">BooleanField</FONT>
    </TD></TR>
  
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto">name</FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto">CharField</FONT>
    </TD></TR>
  
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto">right_approved</FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto">BooleanField</FONT>
    </TD></TR>
  
  
    </TABLE>
    >]

  templateapp_models_DataGroupRelation [label=<
    <TABLE BGCOLOR="white" BORDER="1" CELLBORDER="0" CELLSPACING="0">
    <TR><TD COLSPAN="2" CELLPADDING="5" ALIGN="CENTER" BGCOLOR="#1b563f">
    <FONT FACE="Roboto" COLOR="white" POINT-SIZE="10"><B>
    DataGroupRelation
    </B></FONT></TD></TR>
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto"><B>id</B></FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto"><B>BigAutoField</B></FONT>
    </TD></TR>
  
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto"><B>left</B></FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto"><B>ForeignKey (id)</B></FONT>
    </TD></TR>
  
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto"><B>right</B></FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto"><B>ForeignKey (id)</B></FONT>
    </TD></TR>
  
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto">description</FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto">TextField</FONT>
    </TD></TR>
  
  
  
    <TR><TD ALIGN="LEFT" BORDER="0">
    <FONT FACE="Roboto">visible</FONT>
    </TD><TD ALIGN="LEFT">
    <FONT FACE="Roboto">BooleanField</FONT>
    </TD></TR>
  
  
    </TABLE>
    >]




  // Relations

  templateapp_models_DataGroupRelation -> templateapp_models_DataGroup
  [label=" left (related_group_left)"] [arrowhead=none, arrowtail=dot, dir=both];

  templateapp_models_DataGroupRelation -> templateapp_models_DataGroup
  [label=" right (related_group_right)"] [arrowhead=none, arrowtail=dot, dir=both];

  templateapp_models_DataGroupRelation -> templateapp_models_RelationPermission
  [label=" permissions (datagrouprelation)"] [arrowhead=dot arrowtail=dot, dir=both];


}

Relations between data groups

from django.db import models


class DataGroup(models.Model):

    name = models.CharField(max_length=100)
    related_groups = models.ManyToManyField(
        "self",
        through="DataGroupRelation",
        through_fields=("left", "right"),
    )


class RelationPermission(models.Model):

    name = models.CharField(max_length=20)
    left_approved = models.BooleanField(default=False)
    right_approved = models.BooleanField(default=False)


class DataGroupRelation(models.Model):

    description = models.TextField(
        help_text="Human-readable description of the relation."
    )
    left = models.ForeignKey(
        DataGroup,
        on_delete=models.CASCADE,
        related_name="related_group_left",
    )
    right = models.ForeignKey(
        DataGroup,
        on_delete=models.CASCADE,
        related_name="related_group_right",
    )
    permissions = models.ManyToManyField(RelationPermission)
    visible = models.BooleanField(default=True)

Datasets and data groups

The link between datasets and data groups is important in multiple aspects:

  1. datasets should still be editable, even if the creator of the data has left the institute/project/science

  2. data groups should be able to list the datasets that their members have made public.

To view and manage datasets of a specific data group, one can set the same roles as for Users and datasets. Depending on the role in the data group, the users will then get the appropriate rights (see Users and data groups).

Datasets

Datasets can also be related to each other. These relations do not have any implications on the permissions, i.e. on who can see or edit the datasets. They are of an informative nature only, e.g. to create a machine-readable representation of which dataset has been forced by which other dataset.

The relations that we can think of are

  • Dataset A is forced by Dataset B (i.e. dataset A uses dataset B as (local) boundary conditions)

  • Dataset A supplements Dataset B

  • Dataset A references Dataset B

  • Dataset A is a new version of Dataset B (open question: what does this mean?)

  • Dataset A continues Dataset B

  • Dataset A has Dataset B as its initial conditions

  • Dataset A requires Dataset B

  • Dataset A replaces Dataset B

Datasets should also have a status, e.g. inactive, active, deprecated.
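A machine-readable representation of these relations and of the status could be built on enumerations, so that only values from a fixed vocabulary are accepted. The sketch below is one possible layout; the identifier strings and the class names DatasetRelationType, DatasetStatus and DatasetRelation are assumptions and not part of the model data explorer.

```python
from dataclasses import dataclass
from enum import Enum


class DatasetRelationType(Enum):
    """Relation types listed above (identifier strings are assumptions)."""

    IS_FORCED_BY = "is_forced_by"
    SUPPLEMENTS = "supplements"
    REFERENCES = "references"
    IS_NEW_VERSION_OF = "is_new_version_of"
    CONTINUES = "continues"
    HAS_INITIAL_CONDITIONS = "has_initial_conditions"
    REQUIRES = "requires"
    REPLACES = "replaces"


class DatasetStatus(Enum):
    INACTIVE = "inactive"
    ACTIVE = "active"
    DEPRECATED = "deprecated"


@dataclass(frozen=True)
class DatasetRelation:
    left: str   # identifier of Dataset A
    right: str  # identifier of Dataset B
    relation_type: DatasetRelationType

    def describe(self) -> str:
        return f"{self.left} --{self.relation_type.value}--> {self.right}"


relation = DatasetRelation(
    "dataset-a", "dataset-b", DatasetRelationType.IS_FORCED_BY
)
print(relation.describe())  # dataset-a --is_forced_by--> dataset-b
```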

Metadata through Tags

When using metadata tags (see Tags), one can add another relation, namely Dataset A is described by tag B. For instance, MPI-ESM-LR-rcp45 is described by the rcp45 tag. Since rcp45 is a child tag of CMIP, the dataset can also be found automatically under the CMIP tag.

This gives us the possibility to generate a controlled vocabulary for metadata with a unique handle per metadata item.
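The lookup through the tag hierarchy can be sketched as follows. Only the rcp45/CMIP pair and the MPI-ESM-LR-rcp45 dataset are taken from the text; the dictionaries and function names are hypothetical, and the sketch assumes that every tag has at most one parent.

```python
# Child tag -> parent tag.
TAG_PARENTS = {"rcp45": "CMIP"}

# Dataset -> tags that describe it directly.
DATASET_TAGS = {"MPI-ESM-LR-rcp45": {"rcp45"}}


def expand_tags(tags):
    """Return the given tags together with all their ancestor tags."""
    expanded = set()
    for tag in tags:
        # Walk up the hierarchy until the root or a known tag is reached.
        while tag is not None and tag not in expanded:
            expanded.add(tag)
            tag = TAG_PARENTS.get(tag)
    return expanded


def datasets_for_tag(tag):
    """All datasets described by the tag or by one of its descendants."""
    return sorted(
        name for name, tags in DATASET_TAGS.items() if tag in expand_tags(tags)
    )


print(datasets_for_tag("rcp45"))  # ['MPI-ESM-LR-rcp45']
print(datasets_for_tag("CMIP"))   # ['MPI-ESM-LR-rcp45']
```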