Gateways

Renku uses several gateways to abstract away dependencies on external systems such as the database or git.
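The pattern can be illustrated with a minimal sketch: an abstract interface that core logic depends on, and a concrete class that supplies the storage. The names `IPlanStore`, `InMemoryPlanStore`, and `describe_plan` are hypothetical and are not part of Renku; the dict-backed store stands in for a real database.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class IPlanStore(ABC):
    """Hypothetical interface: the only surface the core logic may call."""

    @abstractmethod
    def add(self, name: str, plan: dict) -> None:
        ...

    @abstractmethod
    def get_by_name(self, name: str) -> Optional[dict]:
        ...


class InMemoryPlanStore(IPlanStore):
    """Hypothetical implementation backed by a dict instead of a database."""

    def __init__(self) -> None:
        self._plans: Dict[str, dict] = {}

    def add(self, name: str, plan: dict) -> None:
        self._plans[name] = plan

    def get_by_name(self, name: str) -> Optional[dict]:
        return self._plans.get(name)


def describe_plan(store: IPlanStore, name: str) -> str:
    """Core logic depends only on the interface, not on the storage backend."""
    plan = store.get_by_name(name)
    return "missing" if plan is None else f"{name}: {plan['command']}"
```

Because `describe_plan` only sees the interface, the backend can be swapped (for example in tests) without touching the calling code.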

Interfaces

Interfaces that the Gateways implement.

Renku activity gateway interface.

class renku.core.interface.activity_gateway.IActivityGateway[source]

Bases: abc.ABC

Interface for the ActivityGateway.

add(activity)[source]

Add an Activity to storage.

add_activity_collection(activity_collection)[source]

Add an ActivityCollection to storage.

get_activities_by_generation(path, checksum=None)[source]

Return the list of all activities that generate a path.

get_all_activities()[source]

Get all activities in the project.

get_all_activity_collections()[source]

Get all activity collections in the project.

get_all_generation_paths()[source]

Return all generation paths.

get_all_usage_paths()[source]

Return all usage paths.

get_downstream_activities(activity, max_depth=None)[source]

Get downstream activities that depend on this activity.

get_downstream_activity_chains(activity)[source]

Get a list of tuples of all downstream paths of this activity.

get_upstream_activity_chains(activity)[source]

Get a list of tuples of all upstream paths of this activity.
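As a rough illustration of what a "list of tuples of all downstream paths" could look like, the sketch below enumerates chains in a toy dependency graph. The function `downstream_chains`, the dict-of-lists graph, and the choice to end each chain at a node with no dependents are assumptions made for the example; the exact shape returned by the gateway is not specified here. The graph is assumed to be acyclic.

```python
from typing import Dict, List, Tuple


def downstream_chains(graph: Dict[str, List[str]], start: str) -> List[Tuple[str, ...]]:
    """Return every chain of dependents that begins right after `start`.

    `graph` maps a node to the nodes that directly depend on it.
    Each chain ends at a node that has no dependents of its own.
    """
    chains: List[Tuple[str, ...]] = []

    def walk(node: str, prefix: List[str]) -> None:
        children = graph.get(node, [])
        # A leaf closes the chain, unless it is the start node itself.
        if not children and prefix:
            chains.append(tuple(prefix))
        for child in children:
            walk(child, prefix + [child])

    walk(start, [])
    return chains
```

An upstream variant would be the same traversal over a graph whose edges point from a node to the nodes it depends on.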

Renku client dispatcher interface.

class renku.core.interface.client_dispatcher.IClientDispatcher[source]

Bases: abc.ABC

Interface for the ClientDispatcher.

Handles getting current client (LocalClient) and entering/exiting the stack for the client.

property current_client

Get the currently active client.

pop_client()[source]

Remove the current client from the stack.

push_client_to_stack(path, renku_home='.renku', external_storage_requested=True)[source]

Create and push a new client to the stack.

push_created_client_to_stack(client)[source]

Push an already created client to the stack.
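The push/pop behaviour described above amounts to a stack whose top element is the "current" one. The following is a minimal sketch of that idea only; `StackDispatcher` is a hypothetical name, and raising an error on an empty stack is an assumption, not documented Renku behaviour.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")


class StackDispatcher(Generic[T]):
    """Hypothetical dispatcher: the most recently pushed item is current."""

    def __init__(self) -> None:
        self._stack: List[T] = []

    @property
    def current(self) -> T:
        """Return the item on top of the stack."""
        if not self._stack:
            # Assumption: an empty stack is treated as an error.
            raise RuntimeError("no active item on the stack")
        return self._stack[-1]

    def push(self, item: T) -> None:
        """Make `item` the current one."""
        self._stack.append(item)

    def pop(self) -> T:
        """Remove the current item, restoring the previous one."""
        return self._stack.pop()
```

Nesting works naturally: pushing a second item temporarily shadows the first, and popping it makes the first current again.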

Renku database dispatcher interface.

class renku.core.interface.database_dispatcher.IDatabaseDispatcher[source]

Bases: abc.ABC

Interface for the DatabaseDispatcher.

Handles getting current database (Database) and entering/exiting the stack for the database.

property current_database

Get the currently active database.

finalize_dispatcher()[source]

Close all database contexts.

pop_database()[source]

Remove the current database from the stack.

push_database_to_stack(path, commit=False)[source]

Create and push a new database to the stack.

Renku database gateway interface.

class renku.core.interface.database_gateway.IDatabaseGateway[source]

Bases: abc.ABC

Gateway interface for basic database operations.

commit()[source]

Commit changes to database.

get_modified_objects_from_revision(revision_or_range)[source]

Get all database objects modified in a revision.

initialize()[source]

Initialize the database.

Renku dataset gateway interface.

class renku.core.interface.dataset_gateway.IDatasetGateway[source]

Bases: abc.ABC

Interface for the DatasetGateway.

add_or_remove(dataset)[source]

Add or remove a dataset.

add_tag(dataset, tag)[source]

Add a tag to a dataset.

get_all_active_datasets()[source]

Get all datasets.

get_all_tags(dataset)[source]

Return the list of all tags for a dataset.

get_by_id(id)[source]

Get a dataset by id.

get_by_name(name)[source]

Get a dataset by name.

get_provenance_tails()[source]

Return the provenance for all datasets.

remove_tag(dataset, tag)[source]

Remove a tag from a dataset.

Renku plan gateway interface.

class renku.core.interface.plan_gateway.IPlanGateway[source]

Bases: abc.ABC

Interface for the PlanGateway.

add(plan)[source]

Add a plan to the database.

get_all_plans()[source]

Get all plans in project.

get_by_id(id)[source]

Get a plan by id.

get_by_name(name)[source]

Get a plan by name.

get_newest_plans_by_names(with_invalidated=False)[source]

Return a list of all newest plans with their names.

list_by_name(starts_with, ends_with=None)[source]

Search plans by name.

Renku project gateway interface.

class renku.core.interface.project_gateway.IProjectGateway[source]

Bases: abc.ABC

Interface for the ProjectGateway.

get_project()[source]

Get project metadata.

update_project(project)[source]

Update project metadata.

Implementations

Implementation of Gateway interfaces.

Renku activity database gateway implementation.

class renku.infrastructure.gateway.activity_gateway.ActivityGateway[source]

Bases: renku.core.interface.activity_gateway.IActivityGateway

Gateway for activity database operations.

add(activity)[source]

Add an Activity to storage.

add_activity_collection(activity_collection)[source]

Add an ActivityCollection to storage.

get_activities_by_generation(path, checksum=None)[source]

Return the list of all activities that generate a path.

get_all_activities()[source]

Get all activities in the project.

get_all_activity_collections()[source]

Get all activity collections in the project.

get_all_generation_paths()[source]

Return all generation paths.

get_all_usage_paths()[source]

Return all usage paths.

get_downstream_activities(activity, max_depth=None)[source]

Get downstream activities that depend on this activity.

get_downstream_activity_chains(activity)[source]

Get a list of tuples of all downstream paths of this activity.

get_upstream_activity_chains(activity)[source]

Get a list of tuples of all upstream paths of this activity.

Renku generic database gateway implementation.

class renku.infrastructure.gateway.database_gateway.ActivityDownstreamRelation(downstream, upstream)[source]

Bases: object

Implementation of Downstream interface.

class renku.infrastructure.gateway.database_gateway.DatabaseGateway[source]

Bases: renku.core.interface.database_gateway.IDatabaseGateway

Gateway for base database operations.

commit()[source]

Commit changes to database.

get_modified_objects_from_revision(revision_or_range)[source]

Get all database objects modified in a revision.

initialize()[source]

Initialize the database.

renku.infrastructure.gateway.database_gateway.dump_activity(activity, catalog, cache)[source]

Get storage token for an activity.

renku.infrastructure.gateway.database_gateway.dump_downstream_relations(relation, catalog, cache, database_dispatcher)[source]

Dump relation entry to database.

renku.infrastructure.gateway.database_gateway.initialize_database(database)[source]

Initialize an empty database with all required metadata.

renku.infrastructure.gateway.database_gateway.load_activity(token, catalog, cache, database_dispatcher)[source]

Load activity from storage token.

renku.infrastructure.gateway.database_gateway.load_downstream_relations(token, catalog, cache, database_dispatcher)[source]

Load relation entry from database.

Renku dataset database gateway implementation.

class renku.infrastructure.gateway.dataset_gateway.DatasetGateway[source]

Bases: renku.core.interface.dataset_gateway.IDatasetGateway

Gateway for dataset database operations.

add_or_remove(dataset)[source]

Add or remove a dataset.

add_tag(dataset, tag)[source]

Add a tag to a dataset.

get_all_active_datasets()[source]

Return all datasets.

get_all_tags(dataset)[source]

Return the list of all tags for a dataset.

get_by_id(id)[source]

Get a dataset by id.

get_by_name(name)[source]

Get a dataset by name.

get_provenance_tails()[source]

Return the provenance for all datasets.

remove_tag(dataset, tag)[source]

Remove a tag from a dataset.

Renku plan database gateway implementation.

class renku.infrastructure.gateway.plan_gateway.PlanGateway[source]

Bases: renku.core.interface.plan_gateway.IPlanGateway

Gateway for plan database operations.

add(plan)[source]

Add a plan to the database.

get_all_plans()[source]

Get all plans in project.

get_by_id(id)[source]

Get a plan by id.

get_by_name(name)[source]

Get a plan by name.

get_newest_plans_by_names(with_invalidated=False)[source]

Return a list of all newest plans with their names.

list_by_name(starts_with, ends_with=None)[source]

Search plans by name.

Renku project database gateway implementation.

class renku.infrastructure.gateway.project_gateway.ProjectGateway[source]

Bases: renku.core.interface.project_gateway.IProjectGateway

Gateway for project database operations.

get_project()[source]

Get project metadata.

update_project(project)[source]

Update project metadata.

Repository

Renku uses git repositories for tracking changes. To abstract away git internals, we delegate all git calls to the Repository class.

An abstraction layer for the underlying VCS.

class renku.infrastructure.repository.Actor(name, email)[source]

Bases: tuple

Author/creator of a commit.

Create new instance of Actor(name, email)

property email

Alias for field number 1

property name

Alias for field number 0

class renku.infrastructure.repository.BaseRepository(path='.', repository=None)[source]

Bases: object

Abstract Base repository.

property active_branch

Return current checked out branch.

add(*paths, force=False, all=False)[source]

Add a list of files to be committed to the VCS.

property branches

Return all branches.

checkout(reference)[source]

Check-out a specific reference.

clean(paths=None)[source]

Remove untracked files.

commit(message, *, amend=False, author=None, committer=None, no_verify=False, no_edit=False, paths=None)[source]

Commit added files to the VCS.

contains(path)[source]

Return True if path is tracked in the repository.

copy_content_to_file(path, revision=None, checksum=None, output_file=None, apply_filters=True)[source]

Get content of an object using its checksum, write it to a file, and return the file’s path.

fetch(remote=None, refspec=None, all=False, tags=False, unshallow=False, depth=None)[source]

Update remote branches.

property files

Return a list of all files in the current version of the repository.

get_attributes(*paths)[source]

Return a map from paths to their attributes.

NOTE: Dict keys are the same relative or absolute path as inputs.

get_commit(revision)[source]

Return Commit with the provided sha.

get_configuration(writable=False, scope=None)[source]

Return git configuration.

NOTE: Scope can be “global” or “local”.

get_content(path, *, revision=None, checksum=None, binary=False)[source]

Get content of a file in a given revision as text or binary.

get_existing_paths_in_revision(paths=None, revision='HEAD')[source]

List all paths that exist in a revision.

static get_global_configuration(writable=False)[source]

Return global git configuration.

static get_global_user()[source]

Return the global git user.

get_ignored_paths(*paths)[source]

Return ignored paths matching .gitignore file.

get_object_hash(path, revision=None)[source]

Return git hash of an object in a Repo or its submodule.

NOTE: path must be relative to the repo’s root, regardless of whether this function is called from a subdirectory.

get_object_hashes(paths, revision=None)[source]

Return git hashes of a list of objects in a Repo or its submodules.

NOTE: paths must be relative to the repo’s root, regardless of whether this function is called from a subdirectory.

get_previous_commit(path, revision=None, first=False, full_history=True, submodule=False)[source]

Return a previous commit for a given path starting from revision.

get_raw_content(*, path, revision=None, checksum=None)[source]

Get raw content of a file in a given revision as text without applying any filter on it.

get_user()[source]

Return the local/global git user.

static hash_object(path)[source]

Create a git hash for a path. The path doesn’t need to be in a repository.

static hash_objects(paths)[source]

Create a git hash for a list of paths. The paths don’t need to be in a repository.
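Hashing works outside a repository because a git blob id depends only on the file's content. As a sketch of the underlying computation (not necessarily how these methods are implemented, which may simply call git), the SHA-1 blob id is the digest of a `blob <size>\0` header followed by the raw bytes. The helper name `git_blob_hash` is hypothetical, and the sketch ignores clean/smudge filters and SHA-256 repositories.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 id git assigns to a blob with this content."""
    # Git prefixes the content with the object type and its size in bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The empty blob has a well-known id.
print(git_blob_hash(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```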

property head

HEAD of the repository.

is_dirty(untracked_files=False)[source]

Return True if the repository has modified or untracked files ignoring submodules.

is_valid()[source]

Return True if a valid repository exists.

iterate_commits(*paths, revision=None, reverse=False, full_history=False, max_count=-1)[source]

Return a list of commits.

move(*sources, destination, force=False)[source]

Move source files to the destination.

property path

Absolute path to the repository’s root.

pull(remote=None, refspec=None)[source]

Pull changes from a remote.

push(remote=None, refspec=None, *, no_verify=False, set_upstream=False, delete=False, force=False)[source]

Push local changes to a remote repository.

property remotes

Return all remotes.

remove(*paths, index=False, not_exists_ok=False, recursive=False, force=False)[source]

Remove paths from repository or index.

reset(reference=None, hard=False)[source]

Reset a git repository to a given reference.

run_git_command(command, *args, **kwargs)[source]

Run a git command in this repository.

property staged_changes

Return a list of staged changes.

NOTE: This can be implemented by git diff --cached --name-status -z.
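To make the note concrete, here is a sketch of parsing the output of `git diff --cached --name-status -z`. With `-z`, fields are NUL-separated: a status field is followed by one path, or by two paths for renames and copies (whose status carries a similarity score such as `R100`). The function name `parse_name_status_z` and the tuple layout are assumptions for this example, not the property's actual return type.

```python
from typing import List, Tuple


def parse_name_status_z(output: str) -> List[Tuple[str, ...]]:
    """Parse NUL-separated name-status output into (status, path...) tuples."""
    fields = output.split("\0")
    # The output ends with a NUL, which leaves one empty trailing field.
    if fields and fields[-1] == "":
        fields.pop()

    changes: List[Tuple[str, ...]] = []
    index = 0
    while index < len(fields):
        status = fields[index]
        # Renames (R) and copies (C) are followed by source and destination.
        path_count = 2 if status[:1] in ("R", "C") else 1
        paths = fields[index + 1 : index + 1 + path_count]
        # Keep only the status letter and drop the similarity score.
        changes.append((status[:1], *paths))
        index += 1 + path_count
    return changes
```

Using `-z` avoids git's path quoting, so file names with spaces or non-ASCII characters come through unmodified.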

status()[source]

Return status of a repository.

property submodules

Return a list of submodules.

property tags

Return all available tags.

property unmerged_blobs

Return a map of path to stage and blob for unmerged blobs in the current index.

property unstaged_changes

Return a list of changes that are not staged.

property untracked_files

Return the list of untracked files.

class renku.infrastructure.repository.Branch(repository, path)[source]

Bases: renku.infrastructure.repository.Reference

A git branch.

classmethod from_head(repository, head)[source]

Create an instance from a git.Head.

property remote_branch

Return the remote branch if any.

class renku.infrastructure.repository.BranchManager(repository)[source]

Bases: object

Manage branches of a Repository.

add(name)[source]

Add a new branch.

remove(branch, force=False)[source]

Remove an existing branch.

class renku.infrastructure.repository.Commit(repository, commit)[source]

Bases: object

A VCS commit.

property author

Author of the commit.

property authored_datetime

Commit authored date.

property committed_datetime

Commit date.

property committer

Committer of the commit.

compare_to(other)[source]

Return -1 if self is made before other.

classmethod from_commit(repository, commit)[source]

Create an instance from a git Commit object.

get_changes(paths=None, commit=None)[source]

Return list of changes in a commit.

NOTE: This function can be implemented with git diff-tree.

property hexsha

Commit sha.

property message

Commit message.

property parents

List of commit parents.

traverse()[source]

Traverse over all objects that are present in this commit.

property tree

Return all objects in the commit’s tree.

class renku.infrastructure.repository.Configuration(repository=None, scope=None, writable=True)[source]

Bases: object

Git configuration manager.

get_value(section, option, default=None)[source]

Return a config value.

has_section(section)[source]

Return True if the config file has the given section.

remove_value(section, option)[source]

Remove a config entry.

set_value(section, option, value=None)[source]

Set a config value.

class renku.infrastructure.repository.Diff(a_path, b_path, change_type)[source]

Bases: tuple

A single diff object between two trees.

Create new instance of Diff(a_path, b_path, change_type)

property a_path

Alias for field number 0

property added

True if file was added.

property b_path

Alias for field number 1

property change_type

Alias for field number 2

Possible values: A = Added, D = Deleted, R = Renamed, M = Modified, T = Changed in the type

property deleted

True if file was deleted.

classmethod from_diff(diff)[source]

Create an instance from a git object.
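The "Alias for field number N" entries come from the tuple base class: each named field is also reachable by position. A minimal sketch of such a record, using a hypothetical `Change` class rather than Renku's own `Diff`, could look like this; the `added` and `deleted` checks against `"A"` and `"D"` follow the change-type letters listed above.

```python
from typing import NamedTuple


class Change(NamedTuple):
    """Hypothetical stand-in for a diff entry between two trees."""

    a_path: str       # field number 0
    b_path: str       # field number 1
    change_type: str  # field number 2

    @property
    def added(self) -> bool:
        """True if the file was added."""
        return self.change_type == "A"

    @property
    def deleted(self) -> bool:
        """True if the file was deleted."""
        return self.change_type == "D"


change = Change(a_path="", b_path="new.txt", change_type="A")
assert change.b_path == change[1]  # named and positional access agree
```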

class renku.infrastructure.repository.Object(path, type, size, hexsha)[source]

Bases: tuple

Represent a git object.

Create new instance of Object(path, type, size, hexsha)

classmethod from_object(object)[source]

Create an instance from a git object.

property hexsha

Alias for field number 3

property path

Alias for field number 0

property size

Alias for field number 2

property type

Alias for field number 1

class renku.infrastructure.repository.Reference(repository, path)[source]

Bases: object

A git reference.

property commit

Commit pointed to by the reference.

classmethod from_reference(repository, reference)[source]

Create an instance from a git reference.

is_valid()[source]

Return True if the reference is valid.

property name

Reference name.

property path

Reference path.

class renku.infrastructure.repository.Remote(repository, name)[source]

Bases: object

Remote of a Repository.

classmethod from_remote(repository, remote)[source]

Create an instance from a git remote.

is_valid()[source]

Return True if remote exists.

property name

Remote’s name.

property references

Return a list of remote references.

set_url(url)[source]

Change URL of a remote.

property url

Remote’s URL.

class renku.infrastructure.repository.RemoteManager(repository)[source]

Bases: object

Manage remotes of a Repository.

add(name, url)[source]

Add a new remote.

remove(remote)[source]

Remove an existing remote.

class renku.infrastructure.repository.RemoteReference(repository, path)[source]

Bases: renku.infrastructure.repository.Reference

A git remote reference.

property remote

Return reference’s remote.

class renku.infrastructure.repository.Repository(path='.', search_parent_directories=False, repository=None)[source]

Bases: renku.infrastructure.repository.BaseRepository

Concrete git repository, with constructors for cloning and initializing.

classmethod clone_from(url, path, *, branch=None, recursive=False, depth=None, progress=None, no_checkout=False, env=None, clone_options=None)[source]

Clone a remote repository and create an instance.

classmethod initialize(path, *, bare=False, branch=None)[source]

Initialize a git repository.

class renku.infrastructure.repository.Submodule(parent, name, path, url)[source]

Bases: renku.infrastructure.repository.BaseRepository

A git submodule.

classmethod from_submodule(parent, submodule)[source]

Create an instance from a git submodule.

property name

Return submodule’s name.

property relative_path

Submodule’s path relative to its parent repository.

property url

Return submodule’s url.

class renku.infrastructure.repository.SubmoduleManager(repository)[source]

Bases: object

Manage submodules of a Repository.

remove(submodule, force=False)[source]

Remove an existing submodule.

update(initialize=True)[source]

Update all submodules.

class renku.infrastructure.repository.SymbolicReference(repository, path)[source]

Bases: renku.infrastructure.repository.Reference

A git symbolic reference.

property reference

Return the reference that this object points to.

class renku.infrastructure.repository.Tag(repository, path)[source]

Bases: renku.infrastructure.repository.Reference

A git tag.

property commit

Return the commit the tag refers to.

classmethod from_tag(repository, tag)[source]

Create an instance from a git tag.

class renku.infrastructure.repository.TagManager(repository)[source]

Bases: object

Manage tags of a Repository.

add(name)[source]

Add a new tag.

remove(tag)[source]

Remove an existing tag.

renku.infrastructure.repository.git_unicode_unescape(s, encoding='utf-8')[source]

Undoes git/GitPython unicode encoding.
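For context: by default git wraps paths containing non-ASCII bytes in double quotes and writes each such byte as an octal escape, so `café.txt` is reported as `"caf\303\251.txt"`. The sketch below shows one way to reverse that; `unescape_git_path` is a hypothetical name, and it is an assumption that the library function behaves the same way.

```python
def unescape_git_path(s: str, encoding: str = "utf-8") -> str:
    """Turn a git-quoted path back into a normal string."""
    # Unquoted paths contain no escapes and are returned unchanged.
    if not s.startswith('"'):
        return s
    raw = s.strip('"').encode("latin-1")
    # Resolve the octal escapes to code points 0-255, map those back to
    # single bytes, then decode the byte string with the real encoding.
    return raw.decode("unicode-escape").encode("latin-1").decode(encoding)


print(unescape_git_path('"caf\\303\\251.txt"'))  # café.txt
```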

renku.infrastructure.repository.split_paths(*paths)[source]

Return a generator with split list of paths.
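The description does not say how the paths are split. One plausible reading is that a long list of paths is cut into fixed-size batches so that no single git invocation receives too many arguments. The sketch below shows that idea only; the name `batch_paths` and the `batch_size` parameter with its default of 100 are assumptions.

```python
from typing import Iterator, Tuple


def batch_paths(*paths: str, batch_size: int = 100) -> Iterator[Tuple[str, ...]]:
    """Yield the given paths in consecutive batches of at most `batch_size`."""
    for start in range(0, len(paths), batch_size):
        yield paths[start : start + batch_size]


for batch in batch_paths("a.txt", "b.txt", "c.txt", batch_size=2):
    print(batch)
```

A caller would then run one command per batch and merge the results.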