Repository API

This API is built on top of Git and Git-LFS.

Renku repository management.

renku.core.management.RENKU_HOME = '.renku'

Project directory name.

Datasets

Client for handling datasets.

class renku.core.management.datasets.DatasetsApiMixin[source]

Client for handling datasets.

Method generated by attrs for class DatasetsApiMixin.

CACHE = 'cache'

Directory to cache transient data.

DATASET_IMAGES = 'dataset_images'

Directory for dataset images.

POINTERS = 'pointers'

Directory for storing external pointer files.

add_data_to_dataset(dataset, urls, force=False, overwrite=False, sources=(), destination='', ref=None, external=False, extract=False, all_at_once=False, destination_names=None, repository=None, clear_files_before=False)[source]

Import the data into the data directory.

create_dataset(name=None, title=None, description=None, creators=None, keywords=None, images=None, safe_image_paths=None, update_provenance=True, custom_metadata=None)[source]

Create a dataset.

property datasets

A map from dataset names to datasets.

static get_dataset(name, strict=False, immutable=False)[source]

Load dataset reference file.

has_external_files()[source]

Return True if project has external files.

is_protected_path(path)[source]

Checks if a path is a protected path.

move_files(files, to_dataset)[source]

Move files and their metadata from one or more datasets to a target dataset.

static remove_file(filepath)[source]

Remove a file/symlink and its pointer file (for external files).

property renku_dataset_images_path

Return a Path instance of Renku dataset metadata folder.

property renku_pointers_path

Return a Path instance of Renku pointer files folder.

set_dataset_images(dataset, images, safe_image_paths=None)[source]

Set the images on a dataset.

update_dataset_custom_metadata(dataset, custom_metadata)[source]

Update custom metadata on a dataset.

update_dataset_git_files(files, ref, delete=False)[source]

Update files and dataset metadata according to their remotes.

Parameters
  • files – List of files to be updated

  • delete – Indicates whether to delete files or not

Returns

List of files that should be deleted

update_dataset_local_files(records, delete=False)[source]

Update files metadata from the git history.

update_external_files(records)[source]

Update files linked to external storage.

with_dataset(database_dispatcher, name=None, create=False, commit_database=False, creator=None)[source]

Yield an editable metadata object for a dataset.

Repository

Client for handling a local repository.

class renku.core.management.repository.PathMixin(path=<function default_path>)[source]

Define a default path attribute.

Method generated by attrs for class PathMixin.

class renku.core.management.repository.RepositoryApiMixin(renku_home='.renku', parent=None, remote_cache=NOTHING, *, data_dir='data')[source]

Client for handling a local repository.

Method generated by attrs for class RepositoryApiMixin.

DATABASE_PATH = 'metadata'

Directory for metadata storage.

DOCKERFILE = 'Dockerfile'

Name of the Dockerfile in the repository.

LOCK_SUFFIX = '.lock'

Default suffix for Renku lock file.

data_dir

Define a name of the folder for storing datasets.

property database_path

Path to the metadata storage directory.

property docker_path

Path to the Dockerfile.
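As an illustration of how the constants above could combine into concrete paths, here is a minimal sketch. It assumes that the metadata directory lives inside the Renku folder and that the Dockerfile sits at the project root. The functions `database_path` and `docker_path` borrow the names of the documented properties but are reimplemented here for illustration only; they are not the library's code, and the project location is hypothetical.

```python
from pathlib import Path

# Values copied from the documented constants.
RENKU_HOME = ".renku"
DATABASE_PATH = "metadata"
DOCKERFILE = "Dockerfile"


def database_path(project_root: Path) -> Path:
    """Assumed layout: <project>/.renku/metadata."""
    return project_root / RENKU_HOME / DATABASE_PATH


def docker_path(project_root: Path) -> Path:
    """Assumed layout: <project>/Dockerfile."""
    return project_root / DOCKERFILE


root = Path("/tmp/my-project")  # hypothetical project location
print(database_path(root))
print(docker_path(root))
```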

get_in_submodules(commit, path)[source]

Resolve filename in submodules.

get_template_files(template_path, metadata)[source]

Get paths in a rendered Renku template.

has_graph_files()[source]

Return True if the database exists.

import_from_template(template_path, metadata, force=False)[source]

Render template files from a template directory.

init_repository(force=False, user=None, initial_branch=None)[source]

Initialize an empty Renku repository.

is_project_set()[source]

Return whether a project is set for the client.

property latest_agent

Returns latest agent version used in the repository.

property lock

Create a Renku config lock.

parent

Store a pointer to the parent repository.

property project

Return the Project instance.

property remote

Return host, owner and name of the remote if it exists.

renku_home

Define a name of the Renku folder (default: .renku).

renku_path

Store a Path instance of the Renku folder.

property template_checksums

Return a Path instance to the template checksums file.

property transaction_id

Get a transaction id for the current client to be used for grouping git commits.

with_metadata(project_gateway, database_gateway, read_only=False, name=None, description=None, keywords=None, custom_metadata=None)[source]

Yield an editable metadata object.

renku.core.management.repository.path_converter(path)[source]

Converter for path in PathMixin.

Git Internals

Wrap Git client.

class renku.core.management.git.GitCore[source]

Wrap Git client.

Method generated by attrs for class GitCore.

property candidate_paths

Return all paths in the index and untracked files.

commit(commit_only=None, commit_empty=True, raise_if_empty=False, commit_message=None, abbreviate_message=True, skip_dirty_checks=False)[source]

Automatic commit.

property dirty_paths

Get paths of dirty files in the repository.

ensure_clean(ignore_std_streams=False)[source]

Make sure the repository is clean.

ensure_unstaged(path)[source]

Ensure that path is not part of git staged files.

ensure_untracked(path)[source]

Ensure that path is not part of git untracked files.

find_ignored_paths(*paths)[source]

Return ignored paths matching .gitignore file.

property modified_paths

Return paths of modified files.

remove_unmodified(paths, autocommit=True)[source]

Remove unmodified paths and return their names.

setup_credential_helper()[source]

Set up the Git credential helper to cache, if not already set.

worktree(path=None, branch_name=None, commit=None, merge_args=('--ff-only',))[source]

Create new worktree.

renku.core.management.git.finalize_commit(client, diff_before, commit_only=None, commit_empty=True, raise_if_empty=False, commit_message=None, abbreviate_message=True)[source]

Commit modified/added paths.

renku.core.management.git.finalize_worktree(client, isolation, path, branch_name, delete, new_branch, merge_args=('--ff-only',), exception=None)[source]

Cleanup and merge a previously created Git worktree.

renku.core.management.git.get_mapped_std_streams(lookup_paths, streams=('stdin', 'stdout', 'stderr'))[source]

Get a mapping of standard streams to given paths.
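One plausible way to build such a mapping is to compare the device and inode numbers of each open stream with those of the candidate paths. The sketch below takes that approach. It is an assumption about the technique, not a copy of the library's implementation, and the function `map_streams_to_paths` and its `streams` argument are hypothetical names chosen so the example can run without redirecting the real standard streams.

```python
import os
import sys
import tempfile


def map_streams_to_paths(lookup_paths, streams=None):
    """Map stream names to the paths they are redirected to (hypothetical helper)."""
    if streams is None:
        streams = {name: getattr(sys, name) for name in ("stdin", "stdout", "stderr")}

    # Identify each stream by the (device, inode) pair of its file descriptor.
    stream_ids = {}
    for name, stream in streams.items():
        try:
            stat = os.fstat(stream.fileno())
        except (AttributeError, OSError, ValueError):
            continue  # stream is missing or has no real file descriptor
        stream_ids[(stat.st_dev, stat.st_ino)] = name

    # A path matches a stream when both refer to the same file on disk.
    mapping = {}
    for path in lookup_paths:
        try:
            stat = os.stat(path)
        except OSError:
            continue  # path does not exist
        name = stream_ids.get((stat.st_dev, stat.st_ino))
        if name is not None:
            mapping[name] = str(path)
    return mapping


# Simulate "stdout redirected to a file" with a temporary file.
with tempfile.NamedTemporaryFile("w") as handle:
    print(map_streams_to_paths([handle.name], {"stdout": handle}))
```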

renku.core.management.git.prepare_commit(client, commit_only=None, skip_dirty_checks=False)[source]

Gather information about repo needed for committing later on.

renku.core.management.git.prepare_worktree(original_client, path=None, branch_name=None, commit=None)[source]

Set up a Git worktree to provide isolation.

Git utilities.

class renku.core.models.git.GitURL(href, pathname=None, protocol='ssh', hostname='localhost', username=None, password=None, port=None, owner=None, name=None, slug=None, regex=None)[source]

Parser for common Git URLs.

Method generated by attrs for class GitURL.

property image

Return image name.

property instance_url

Get the url of the git instance.

classmethod parse(href)[source]

Derive URI components.
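To make "derive URI components" concrete, the following sketch splits two common Git URL shapes into parts with regular expressions. It is a simplified stand-in, not the GitURL class: `ParsedGitURL` and `parse_git_url` are hypothetical names, the field names merely echo some of GitURL's arguments, and the example URLs are made up.

```python
import re
from typing import NamedTuple, Optional


class ParsedGitURL(NamedTuple):
    """Hypothetical container; field names echo some GitURL arguments."""

    protocol: str
    hostname: str
    owner: Optional[str]
    name: str


# Shape 1: "<protocol>://[user@]host[:port]/owner/name[.git]"
_URL_LIKE = re.compile(
    r"^(?P<protocol>https?|ssh|git)://(?:[^@/]+@)?(?P<hostname>[^/:]+)(?::\d+)?/"
    r"(?:(?P<owner>.+)/)?(?P<name>[^/]+?)(?:\.git)?/?$"
)
# Shape 2 (scp-like): "[user@]host:owner/name[.git]"
_SCP_LIKE = re.compile(
    r"^(?:[^@/]+@)?(?P<hostname>[^/:]+):"
    r"(?:(?P<owner>.+)/)?(?P<name>[^/]+?)(?:\.git)?/?$"
)


def parse_git_url(href: str) -> ParsedGitURL:
    """Split href into protocol, hostname, owner and repository name."""
    match = _URL_LIKE.match(href)
    if match:
        return ParsedGitURL(
            match["protocol"], match["hostname"], match["owner"], match["name"]
        )
    match = _SCP_LIKE.match(href)
    if match:
        # scp-like URLs are transported over SSH.
        return ParsedGitURL("ssh", match["hostname"], match["owner"], match["name"])
    raise ValueError(f"Unrecognized Git URL: {href!r}")


print(parse_git_url("https://gitlab.example.com/group/project.git"))
print(parse_git_url("git@example.com:group/project.git"))
```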

renku.core.models.git.filter_repo_name(repo_name)[source]

Remove the .git extension from the repo name.
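The behavior described here can be sketched in a few lines. The helper below, `strip_git_suffix`, is a hypothetical stand-in written for illustration; it is not the body of `filter_repo_name`, and how the real function treats other inputs (for example `None`) is not stated on this page.

```python
def strip_git_suffix(repo_name: str) -> str:
    """Return repo_name without a trailing '.git' (hypothetical helper)."""
    suffix = ".git"
    if repo_name.endswith(suffix):
        return repo_name[: -len(suffix)]
    return repo_name


print(strip_git_suffix("project.git"))
print(strip_git_suffix("project"))
```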

Command Builder

Most renku commands require context (database/git/etc.) to be set up for them. The command builder pattern makes this easy by wrapping commands in factory methods.
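The general shape of such a builder can be sketched as follows. `MiniCommand` is a deliberately simplified, hypothetical class: its method names echo `command`, `working_directory`, `build` and `execute` from the Command class documented below, but the chaining through `return self` and the checks around finalization are assumptions of this sketch, not confirmed behavior of the library.

```python
class MiniCommand:
    """Hypothetical, simplified builder; not the renku Command class."""

    def __init__(self):
        self._operation = None
        self._working_directory = None
        self._finalized = False

    def _ensure_open(self):
        if self._finalized:
            raise RuntimeError("Cannot modify a finalized command")

    def command(self, operation):
        self._ensure_open()
        self._operation = operation
        return self  # returning self allows chained configuration

    def working_directory(self, directory):
        self._ensure_open()
        self._working_directory = directory
        return self

    def build(self):
        if self._operation is None:
            raise ValueError("No operation was set")
        self._finalized = True
        return self

    def execute(self, *args, **kwargs):
        if not self._finalized:
            raise RuntimeError("Call build() before execute()")
        return self._operation(*args, **kwargs)


def greet(name):
    return f"hello {name}"


result = MiniCommand().command(greet).working_directory(".").build().execute("renku")
print(result)
```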

Renku Command Builder.

class renku.core.management.command_builder.Command[source]

Base renku command builder.

__init__ of Command.

add_injection_pre_hook(order, hook)[source]

Add a pre-execution hook for dependency injection.

Parameters
  • order – Determines the order of executed hooks, lower numbers get executed first.

  • hook – The hook to add

add_post_hook(order, hook)[source]

Add a post-execution hook.

Parameters
  • order – Determines the order of executed hooks, lower numbers get executed first.

  • hook – The hook to add

add_pre_hook(order, hook)[source]

Add a pre-execution hook.

Parameters
  • order – Determines the order of executed hooks, lower numbers get executed first.

  • hook – The hook to add

build()[source]

Build (finalize) the command.

command(operation)[source]

Set the wrapped command.

Parameters

operation – The function to wrap in the command builder.

execute(*args, **kwargs)[source]

Execute the wrapped operation.

First executes the pre_hooks in ascending order, passing a read/write context between them. It then calls the wrapped operation. The result of the operation is then passed to all the post_hooks, in descending order. Finally, it returns the result, or the error if there was one.
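The ordering described for execute can be illustrated with a small standalone sketch. The function `run_with_hooks`, the dictionaries keyed by order number, and the hook signatures (`hook(context)` for pre-hooks, `hook(context, result)` for post-hooks) are all assumptions made for this example; the page does not specify what arguments the real hooks receive.

```python
def run_with_hooks(operation, pre_hooks, post_hooks, *args, **kwargs):
    """Run operation between ordered hooks (hypothetical helper)."""
    context = {}  # shared read/write state handed to every hook

    # Pre-hooks: lowest order number first.
    for order in sorted(pre_hooks):
        for hook in pre_hooks[order]:
            hook(context)

    result, error = None, None
    try:
        result = operation(*args, **kwargs)
    except Exception as exc:  # keep the error so post-hooks still run
        error = exc

    # Post-hooks: highest order number first.
    for order in sorted(post_hooks, reverse=True):
        for hook in post_hooks[order]:
            hook(context, result)
    return result, error


trace = []
pre_hooks = {
    10: [lambda ctx: trace.append("pre 10")],
    5: [lambda ctx: trace.append("pre 5")],
}
post_hooks = {
    5: [lambda ctx, res: trace.append("post 5")],
    10: [lambda ctx, res: trace.append("post 10")],
}


def operation():
    trace.append("operation")
    return 42


result, error = run_with_hooks(operation, pre_hooks, post_hooks)
print(trace)  # ['pre 5', 'pre 10', 'operation', 'post 10', 'post 5']
```

Running post-hooks in the reverse order of pre-hooks means that whatever was set up last is finished first, much like nested context managers.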

property finalized

Whether this builder is still being constructed or has been finalized.

lock_dataset()[source]

Acquire a lock for a dataset.

lock_project()[source]

Acquire a lock for the whole project.

require_clean()[source]

Check that the repository is clean.

require_migration()[source]

Check if a migration is needed.

require_nodejs()[source]

Ensure nodejs is installed.

track_std_streams()[source]

Whether to track STD streams or not.

with_commit(message=None, commit_if_empty=False, raise_if_empty=False, commit_only=None)[source]

Create a commit.

Parameters
  • message – The commit message. Auto-generated if left empty.

  • commit_if_empty – Whether to commit if there are no modified files.

  • raise_if_empty – Whether to raise an exception if there are no modified files.

  • commit_only – Only commit the supplied paths.
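How the two "empty" flags might interact is sketched below. The precedence shown, where `commit_if_empty` is checked before `raise_if_empty`, is an assumption of this example rather than something this page states, and both `decide_commit` and `NothingToCommit` are hypothetical names defined only for the sketch.

```python
class NothingToCommit(Exception):
    """Hypothetical error type for this sketch."""


def decide_commit(modified_paths, commit_if_empty=False, raise_if_empty=False):
    """Return True if a commit should be created (hypothetical helper)."""
    if modified_paths:
        return True  # there is something to commit
    if commit_if_empty:
        return True  # an empty commit was explicitly requested
    if raise_if_empty:
        raise NothingToCommit("There are no modified files to commit")
    return False  # silently skip the commit


print(decide_commit(["data/file.csv"]))
print(decide_commit([], commit_if_empty=True))
print(decide_commit([]))
```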

with_communicator(communicator)[source]

Create a communicator.

with_database(write=False, path=None, create=False)[source]

Provide an object database connection.

with_git_isolation()[source]

Whether to run in git isolation or not.

working_directory(directory)[source]

Set the working directory for the command.

Parameters

directory – The working directory to work in.