Datasets

Models representing datasets.

class renku.domain_model.dataset.AnnotationJson(*, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None)[source]

Bases: Schema

Schema for Annotations.

class renku.domain_model.dataset.Dataset(*args, **kwargs)[source]

Bases: Persistent

Represent a dataset.

add_or_update_files(files)[source]

Add new files or update existing files.

clear_files()[source]

Remove all files.

copy()[source]

Return a clone of this dataset.

property creators_csv

Comma-separated list of creators associated with dataset.

property creators_full_csv

Comma-separated list of creators with full identity.

derive_from(dataset, creator, identifier=None, date_created=None)[source]

Make self a derivative of dataset and update related fields.

property files

Return list of existing files.

find_file(path)[source]

Find a file in the dataset using its relative path.

static generate_id(identifier)[source]

Generate an identifier for Dataset.

get_datadir()[source]

Return dataset’s data directory relative to project’s root.

is_derivation()[source]

Return whether the dataset has a valid derived_from.

is_removed()[source]

Return true if dataset is removed.

is_within_datadir(path)[source]

Return True if a given path is inside dataset’s data directory.
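A minimal sketch of the kind of containment check that is_within_datadir describes. It is not Renku's implementation: the helper name path_is_inside and its two-argument form are assumptions made for illustration, and both paths are taken to be POSIX-style paths relative to the project root.

```python
from pathlib import PurePosixPath


def path_is_inside(datadir: str, path: str) -> bool:
    """Return True if ``path`` is ``datadir`` itself or lies below it.

    Hypothetical helper: the comparison is done on whole path components,
    so "data/my-dataset-2" is not treated as a child of "data/my-dataset".
    ".." segments are not resolved here.
    """
    try:
        PurePosixPath(path).relative_to(PurePosixPath(datadir))
    except ValueError:
        # relative_to() raises ValueError when path is not under datadir.
        return False
    return True
```

Comparing path components instead of raw string prefixes avoids false positives for sibling directories whose names share a prefix.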

property keywords_csv

Comma-separated list of keywords associated with dataset.

remove(date=None)[source]

Mark the dataset as removed.

replace_identifier(identifier=None)[source]

Replace dataset’s identifier and update relevant fields.

NOTE: Call this only for newly-created/-imported datasets that don’t have a mutability chain because it sets initial_identifier.

Mark a file as removed using its relative path.

update_files_from(current_dataset, date=None)[source]

Check the files of current_dataset to reuse existing entries and mark removed files.
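The reuse-and-mark-removed idea behind update_files_from can be sketched as follows. This is an illustration under stated assumptions, not the library's code: FileEntry, merge_files, and the checksum-based notion of "unchanged" are all hypothetical.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical stand-in for a dataset file entry."""

    path: str
    checksum: str
    date_removed: Optional[datetime] = None


def merge_files(
    current: List[FileEntry], new: List[FileEntry], date: Optional[datetime] = None
) -> List[FileEntry]:
    """Reuse unchanged entries from ``current`` and mark missing ones as removed."""
    date = date or datetime.now(timezone.utc)
    # Index the live (not yet removed) entries of the current version by path.
    current_by_path = {f.path: f for f in current if f.date_removed is None}

    merged = []
    for entry in new:
        existing = current_by_path.pop(entry.path, None)
        if existing is not None and existing.checksum == entry.checksum:
            merged.append(existing)  # unchanged content: keep the old entry
        else:
            merged.append(entry)  # new or modified file
    # Entries left in the index existed before but are absent now.
    merged.extend(replace(f, date_removed=date) for f in current_by_path.values())
    return merged
```

Keeping the old entry for unchanged files means its identifier stays stable across dataset versions, while removed files remain in the history with a removal date.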

update_metadata(**kwargs)[source]

Update metadata from the given keyword arguments.

update_metadata_from(other, exclude=None)[source]

Update metadata from another dataset.

class renku.domain_model.dataset.DatasetCreatorsJson(*, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None)[source]

Bases: Schema

Schema for the dataset creators.

class renku.domain_model.dataset.DatasetDetailsJson(*, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None)[source]

Bases: Schema

Serialize a dataset to a response object.

static get_datadir(obj)[source]

Get data directory.

class renku.domain_model.dataset.DatasetFile(*args, **kwargs)[source]

Bases: Slots

A file in a dataset.

Create and return an empty instance of the class.

copy()[source]

Return a clone of this object.

correct_linked_attribute()[source]

Replace is_external attribute with linked for linked dataset files.

classmethod from_dataset_file(other)[source]

Return a copy with a different id.

classmethod from_path(path, source=None, based_on=None, checksum=None, size=None)[source]

Return an instance from a path.

static generate_id()[source]

Generate an identifier for DatasetFile.

NOTE: ID should not rely on Entity properties because the same Entity can be added and removed multiple times. So, it should be marked by different DatasetFiles.

has_valid_checksum()[source]

Return whether the file has a valid checksum.

has_valid_size()[source]

Return whether the file has a valid size.

is_equal_to(other)[source]

Compare content.

NOTE: id is generated randomly and should not be included in this comparison.
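The two notes above (a random id that does not depend on the underlying entity, and a comparison that ignores that id) can be illustrated with a small stand-in class. FileRecord and its fields are hypothetical and are not the DatasetFile model itself.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """Hypothetical record with a random id and content fields."""

    path: str
    checksum: str
    # Random identifier: two records for the same content get different ids.
    id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def is_equal_to(self, other: "FileRecord") -> bool:
        """Compare content only; the random id is deliberately left out."""
        return (self.path, self.checksum) == (other.path, other.checksum)
```

Because the id is random, adding, removing, and re-adding the same file yields distinct records that still compare equal by content.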

is_removed()[source]

Return true if the file is removed and should not be accessed.

remove(date=None)[source]

Create a new instance and mark it as removed.
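A sketch of the "return a new, removed instance" pattern described for remove. The class RemovableFile and the date_removed field are assumptions for illustration; the actual attribute names in the model may differ.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class RemovableFile:
    """Hypothetical immutable file record."""

    path: str
    date_removed: Optional[datetime] = None

    def is_removed(self) -> bool:
        return self.date_removed is not None

    def remove(self, date: Optional[datetime] = None) -> "RemovableFile":
        """Return a copy marked as removed; the original stays untouched."""
        return replace(self, date_removed=date or datetime.now(timezone.utc))
```

Returning a copy instead of mutating in place keeps earlier versions of the record intact, which matters when older dataset versions still refer to them.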

class renku.domain_model.dataset.DatasetFileDetailsJson(*, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None)[source]

Bases: Schema

Serialize dataset files to a response object.

class renku.domain_model.dataset.DatasetTag(*args, **kwargs)[source]

Bases: Persistent

Represents a Tag of an instance of a dataset.

static generate_id(dataset_id, name)[source]

Define default value for id field.

class renku.domain_model.dataset.ImageObjectJson(*, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None)[source]

Bases: Schema

ImageObject JSON schema.

class renku.domain_model.dataset.ImageObjectRequestJson(*, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None)[source]

Bases: Schema

JSON schema for ImageObject requests.

class renku.domain_model.dataset.Language(*args, **kwargs)[source]

Bases: Immutable

Represent a language of an object.

Create and return an empty instance of the class.

static generate_id(name)[source]

Generate @id field.

class renku.domain_model.dataset.RemoteEntity(*args, **kwargs)[source]

Bases: Slots

Reference to an Entity in a remote repository.

Create and return an empty instance of the class.

static generate_id(checksum, path, url)[source]

Generate an id.

class renku.domain_model.dataset.Url(*, id=None, url=None, url_str=None, url_id=None)[source]

Bases: object

Represents a schema URL reference.

static generate_id(url_str, url_id)[source]

Generate an identifier for Url.

property value

Return the URL value as a string.

renku.domain_model.dataset.generate_default_slug(name, version=None)[source]

Get dataset slug.

renku.domain_model.dataset.get_file_path_in_dataset(dataset, dataset_file)[source]

Return path of a file relative to dataset’s data dir.

renku.domain_model.dataset.is_dataset_slug_valid(slug)[source]

Check if a given slug is valid.
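To make the relationship between slug generation and slug validation concrete, here is a hedged sketch. The normalization rule (lowercase ASCII letters and digits joined by hyphens) is an assumption chosen for illustration and is not necessarily the rule Renku applies; make_slug and is_slug_valid are hypothetical names.

```python
import re
import unicodedata
from typing import Optional


def make_slug(name: str, version: Optional[str] = None) -> str:
    """Build a slug from a name and an optional version (illustrative rule)."""
    text = f"{name} {version}" if version else name
    # Strip accents by decomposing characters and dropping non-ASCII parts.
    ascii_text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Collapse every run of other characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def is_slug_valid(slug: str) -> bool:
    """A slug is valid here if it is non-empty and already normalized."""
    return bool(slug) and make_slug(slug) == slug
```

Defining validity as "normalizing the slug changes nothing" keeps the generator and the validator in agreement by construction.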