Repository

class pynblint.repository.GitHubRepository(github_url: str)[source]

This class stores data about a GitHub repository.

class pynblint.repository.LocalRepository(source_path: pathlib.Path)[source]

This class stores data about a local code repository. The source_path can point either to a local directory or to a zip archive.

class pynblint.repository.Repository(path: pathlib.Path)[source]

This class stores data about a code repository.

property large_file_paths: List[pathlib.Path]

Return the list of files whose size exceeds the configured threshold.

The threshold size for data files is defined in the settings.

Returns

the list of large files.

Return type

List[Path]
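As a rough illustration of what this property computes, the sketch below walks a directory tree and collects files above a size limit. It is not Pynblint's implementation: the function name `find_large_files` and the constant `MAX_FILE_SIZE_BYTES` are hypothetical, and the real threshold is read from Pynblint's settings.

```python
from pathlib import Path
from typing import List

# Hypothetical threshold in bytes; Pynblint reads the real value from its settings.
MAX_FILE_SIZE_BYTES = 2_000_000


def find_large_files(root: Path, threshold: int = MAX_FILE_SIZE_BYTES) -> List[Path]:
    """Return every regular file under ``root`` whose size exceeds ``threshold``."""
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.stat().st_size > threshold
    )
```

With the real library, the equivalent information would be read from the `large_file_paths` property of a `Repository` instance.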

Repository Linting

Linting functions for repositories containing notebooks

pynblint.repo_linting.dependencies_unmanaged(repo: pynblint.repository.Repository) → bool[source]

Check the absence of configuration files for dependency management tools.

All configuration files are searched in the root of the repository.
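A minimal sketch of this kind of check, assuming a particular set of file names, is shown below. The tuple `DEPENDENCY_FILES` and the function name are illustrative only; the files that Pynblint actually looks for may differ.

```python
from pathlib import Path

# Assumed list of dependency-management files; Pynblint's actual list may differ.
DEPENDENCY_FILES = (
    "requirements.txt",
    "setup.py",
    "pyproject.toml",
    "environment.yml",
    "Pipfile",
)


def dependencies_unmanaged_sketch(repo_root: Path) -> bool:
    """Return True if none of the assumed files exists in the repository root."""
    # Only the root is inspected, mirroring the behaviour described above.
    return not any((repo_root / name).is_file() for name in DEPENDENCY_FILES)
```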

pynblint.repo_linting.duplicate_notebook_filename(repo: pynblint.repository.Repository) → List[pathlib.Path][source]

Check the existence of notebooks with the same filename within a repository.
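One way such a check could work is to group notebook paths by filename and keep the groups with more than one member. The sketch below assumes that notebooks are identified by the `.ipynb` extension and that every path in a duplicate group is reported; the function name is hypothetical and Pynblint's own logic may differ.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def duplicate_notebook_paths(repo_root: Path) -> List[Path]:
    """Return the paths of notebooks that share a filename with another notebook."""
    by_name: Dict[str, List[Path]] = defaultdict(list)
    # Sort for a deterministic result; rglob makes no ordering guarantee.
    for notebook in sorted(repo_root.rglob("*.ipynb")):
        by_name[notebook.name].append(notebook)
    # Keep only filenames that occur in more than one location.
    return [path for paths in by_name.values() if len(paths) > 1 for path in paths]
```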

pynblint.repo_linting.repository_not_versioned(repo: pynblint.repository.Repository) → bool[source]

Check the absence of the .git folder.
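The check described above amounts to a single filesystem test. A minimal sketch, with a hypothetical function name and operating on a plain path rather than a `Repository` object, could look like this:

```python
from pathlib import Path


def repository_not_versioned_sketch(repo_root: Path) -> bool:
    """Return True if ``repo_root`` contains no ``.git`` folder."""
    return not (repo_root / ".git").is_dir()
```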

pynblint.repo_linting.unversioned_large_data_files(repo: pynblint.repository.Repository) → List[pathlib.Path][source]

Check the presence of unversioned large data files.

Check whether large data files are versioned with a data version control system.

Currently, the only data VCS that Pynblint detects is DVC (https://dvc.org). Alternative solutions will be added soon.

Parameters

repo (Repository) – The repository to be analyzed.

Returns

the list of large data files that should be put under version control.

Return type

List[Path]
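To make the idea concrete, the sketch below flags large files that have no DVC metadata file next to them. It relies on a simplifying assumption: a file tracked with `dvc add` gets a sibling file named `<filename>.dvc`. Files tracked in other ways (for example as part of a tracked directory or as a pipeline output) would be reported incorrectly. The function name and the `threshold` parameter are hypothetical, and this is not necessarily how Pynblint performs the detection.

```python
from pathlib import Path
from typing import List


def unversioned_large_files(root: Path, threshold: int) -> List[Path]:
    """Return large files under ``root`` that lack a sibling ``<name>.dvc`` file."""
    large_files = [
        path
        for path in root.rglob("*")
        if path.is_file() and path.stat().st_size > threshold
    ]
    # Assumption: `dvc add data.csv` leaves a `data.csv.dvc` file beside the data.
    return sorted(
        path
        for path in large_files
        if not path.with_name(path.name + ".dvc").exists()
    )
```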