gimie package¶
Subpackages¶
Submodules¶
gimie.cli module¶
Command line interface to the gimie package.
- class gimie.cli.RDFFormatChoice(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]¶
- jsonld = 'json-ld'¶
- nt = 'nt'¶
- ttl = 'ttl'¶
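The members above pair an attribute name with the serialization format string it stands for. The following is a minimal sketch of an equivalent enumeration, assuming a `str`-based `Enum` (the base classes are not listed in this reference). `FormatChoiceSketch` is a hypothetical stand-in, not the class shipped by gimie.

```python
from enum import Enum


class FormatChoiceSketch(str, Enum):
    """Hypothetical stand-in mirroring the members of gimie.cli.RDFFormatChoice."""

    ttl = "ttl"
    jsonld = "json-ld"
    nt = "nt"


# Look up a member from the string a user would type on the command line.
choice = FormatChoiceSketch("json-ld")
print(choice.name, choice.value)
```

Note that the member name (`jsonld`) and its value (`json-ld`) differ, so a lookup by value uses the hyphenated form.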
- gimie.cli.advice(url: str)[source]¶
Show a metadata completion report for a Git repository at the target URL.
NOTE: Not implemented yet
- gimie.cli.callback(version: bool | None = <typer.models.OptionInfo object>)[source]¶
gimie digs Git repositories for metadata.
- gimie.cli.data(url: str, format: ~gimie.cli.RDFFormatChoice = <typer.models.OptionInfo object>, base_url: str | None = <typer.models.OptionInfo object>, include_parser: ~typing.List[str] | None = <typer.models.OptionInfo object>, exclude_parser: ~typing.List[str] | None = <typer.models.OptionInfo object>, version: bool | None = <typer.models.OptionInfo object>)[source]¶
Extract linked metadata from a Git repository at the target URL.
The output is sent to stdout, and turtle is used as the default serialization format.
gimie.io module¶
Standard input interfaces to local or remote resources for gimie.
- class gimie.io.IterStream(iterator: Iterator[bytes])[source]¶
Bases:
RawIOBase
Wraps an iterator under a file-like interface. Empty elements in the iterator are ignored.
- Parameters:
iterator – An iterator yielding bytes.
Examples
>>> stream = IterStream(iter([b"Hello ", b"", b"World"]))
>>> stream.read()
b'Hello World'
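To illustrate how an iterator of bytes can be exposed through a file-like interface, here is a minimal sketch built on `io.RawIOBase`. It is an assumption about one possible approach, not gimie's actual implementation. The class name `IterStreamSketch` and the `_leftover` attribute are hypothetical.

```python
import io
from typing import Iterator


class IterStreamSketch(io.RawIOBase):
    """Illustrative only: one way to wrap a bytes iterator as a raw stream."""

    def __init__(self, iterator: Iterator[bytes]):
        self._iterator = iterator
        self._leftover = b""  # bytes fetched from the iterator but not yet read

    def readable(self) -> bool:
        return True

    def readinto(self, buffer) -> int:
        # Pull chunks until a non-empty one is found, skipping empty elements.
        while not self._leftover:
            try:
                self._leftover = next(self._iterator)
            except StopIteration:
                return 0  # zero bytes read signals end of stream
        # Copy as much as fits into the caller's buffer and keep the rest.
        n = min(len(buffer), len(self._leftover))
        buffer[:n] = self._leftover[:n]
        self._leftover = self._leftover[n:]
        return n


stream = IterStreamSketch(iter([b"Hello ", b"", b"World"]))
data = stream.read()
```

Only `readinto` needs to be written by hand here, because `io.RawIOBase` derives `read` and `readall` from it.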
- class gimie.io.LocalResource(path: str | PathLike)[source]¶
Bases:
Resource
Provides read-only access to local data via a file-like interface.
Examples
>>> resource = LocalResource("README.md")
- class gimie.io.RemoteResource(path: str, url: str, headers: dict | None = None)[source]¶
Bases:
Resource
Provides read-only access to remote data via a file-like interface.
- Parameters:
url – The URL where the resource can be downloaded from.
headers – Optional headers to pass to the request.
Examples
>>> url = "https://raw.githubusercontent.com/sdsc-ordes/gimie/main/README.md"
>>> content = RemoteResource("README.md", url).open().read()
>>> assert isinstance(content, bytes)
gimie.models module¶
Data models to represent nodes in the graph generated by gimie.
- class gimie.models.Organization(_id: str, name: str, legal_name: str | None = None, email: List[str] | None = None, description: str | None = None, logo: str | None = None)[source]¶
Bases:
object
See http://schema.org/Organization
- class gimie.models.OrganizationSchema(*args, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None, flattened=False, lazy=False, _all_objects=None, _visited=None, _top_level=True)[source]¶
Bases:
JsonLDSchema
- class Meta[source]¶
Bases:
object
- model¶
alias of
Organization
- rdf_type = rdflib.term.URIRef('http://schema.org/Organization')¶
- opts: SchemaOpts = <calamus.schema.JsonLDSchemaOpts object>¶
- class gimie.models.Person(_id: str, identifier: str, name: str | None = None, email: str | None = None, affiliations: List[Organization] | None = None)[source]¶
Bases:
object
See http://schema.org/Person
- affiliations: List[Organization] | None = None¶
- class gimie.models.PersonSchema(*args, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None, flattened=False, lazy=False, _all_objects=None, _visited=None, _top_level=True)[source]¶
Bases:
JsonLDSchema
- opts: SchemaOpts = <calamus.schema.JsonLDSchemaOpts object>¶
- class gimie.models.Release(tag: str, date: datetime, commit_hash: str)[source]¶
Bases:
object
This class represents a release of a repository.
- Parameters:
tag (str) – The tag of the release.
date (datetime.datetime) – The date of the release.
commit_hash (str) – The commit hash of the release.
- date: datetime¶
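As an illustration of the three documented fields, the sketch below models a release as a plain dataclass and picks the most recent one by date. `ReleaseSketch` is a hypothetical stand-in, and the tags, dates and commit hashes are made-up sample values, not data from any real repository.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ReleaseSketch:
    """Hypothetical stand-in with the same fields as gimie.models.Release."""

    tag: str
    date: datetime
    commit_hash: str


# Made-up sample values, for illustration only.
releases = [
    ReleaseSketch(tag="v0.1.0", date=datetime(2023, 1, 15), commit_hash="a1b2c3"),
    ReleaseSketch(tag="v0.2.0", date=datetime(2023, 5, 1), commit_hash="d4e5f6"),
]

# Select the most recent release by comparing the date field.
latest = max(releases, key=lambda release: release.date)
print(latest.tag)
```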
- class gimie.models.Repository(url: str, name: str, authors: ~typing.List[~gimie.models.Organization | ~gimie.models.Person] | None = None, contributors: ~typing.List[~gimie.models.Person] | None = None, date_created: datetime | None = None, date_modified: datetime | None = None, date_published: datetime | None = None, description: str | None = None, download_url: str | None = None, identifier: str | None = None, keywords: ~typing.List[str] | None = None, licenses: ~typing.List[str] | None = None, parent_repository: str | None = None, prog_langs: ~typing.List[str] | None = None, version: str | None = None)[source]¶
Bases:
object
This class represents a git repository. It does not contain any information about the content of the repository. See https://schema.org/SoftwareSourceCode
- authors: List[Organization | Person] | None = None¶
- date_created: datetime | None = None¶
- date_modified: datetime | None = None¶
- date_published: datetime | None = None¶
- class gimie.models.RepositorySchema(*args, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None, flattened=False, lazy=False, _all_objects=None, _visited=None, _top_level=True)[source]¶
Bases:
JsonLDSchema
This defines the schema used for json-ld serialization.
- class Meta[source]¶
Bases:
object
- add_value_types = False¶
- model¶
alias of
Repository
- rdf_type = rdflib.term.URIRef('http://schema.org/SoftwareSourceCode')¶
- opts: SchemaOpts = <calamus.schema.JsonLDSchemaOpts object>¶
gimie.project module¶
Orchestration of multiple extractors for a given project. This is the main entry point for end-to-end analysis.
- class gimie.project.Project(path: str, base_url: str | None = None, git_provider: str | None = None, parser_names: Iterable[str] | None = None)[source]¶
Bases:
object
A class to represent a project’s git repository.
- Parameters:
path – The full path (URL) of the repository.
base_url – The base URL of the git remote. Can be used to specify delimitation between base URL and project name.
git_provider – The name of the git provider to extract metadata from (‘git’, ‘github’, ‘gitlab’).
parser_names – Names of file parsers to use (‘license’). If None, default parsers are used (see gimie.parsers.PARSERS).
Examples
>>> from rdflib import Graph
>>> proj = Project("https://github.com/sdsc-ordes/gimie")
>>> assert isinstance(proj.extract(), Graph)