gimie package¶
Subpackages¶
Submodules¶
gimie.cli module¶
Command line interface to the gimie package.
- class gimie.cli.RDFFormatChoice(value)[source]¶
- jsonld = 'json-ld'¶
- nt = 'nt'¶
- ttl = 'ttl'¶
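The three members above map Python-friendly names to RDF serialization format strings. A minimal sketch of an equivalent enumeration follows; the class name `RDFFormatChoiceSketch` and the `str` mixin are assumptions made for illustration and are not taken from the gimie source.

```python
from enum import Enum


class RDFFormatChoiceSketch(str, Enum):
    """Hypothetical stand-in mirroring the documented members."""

    ttl = "ttl"
    jsonld = "json-ld"
    nt = "nt"


# Look up a member by its value, as a command-line option parser would.
chosen = RDFFormatChoiceSketch("json-ld")
print(chosen.name, chosen.value)
```

Note that the member name (`jsonld`) differs from its value (`json-ld`), since a hyphen is not valid in a Python identifier.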
- gimie.cli.advice(url: str)[source]¶
Show a metadata completion report for a Git repository at the target URL.
NOTE: Not implemented yet
- gimie.cli.callback(version: bool | None = <typer.models.OptionInfo object>)[source]¶
gimie digs Git repositories for metadata.
- gimie.cli.data(url: str, format: RDFFormatChoice = <typer.models.OptionInfo object>, base_url: str | None = <typer.models.OptionInfo object>, include_parser: List[str] | None = <typer.models.OptionInfo object>, exclude_parser: List[str] | None = <typer.models.OptionInfo object>, version: bool | None = <typer.models.OptionInfo object>)[source]¶
Extract linked metadata from a Git repository at the target URL.
The output is written to stdout; Turtle is the default serialization format.
gimie.io module¶
Standard input interfaces to local or remote resources for gimie.
- class gimie.io.IterStream(iterator: Iterator[bytes])[source]¶
Bases:
RawIOBase
Wraps an iterator under a file-like interface. Empty elements in the iterator are ignored.
- Parameters:
iterator – An iterator yielding bytes.
Examples
>>> stream = IterStream(iter([b"Hello ", b"", b"World"]))
>>> stream.read()
b'Hello World'
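To illustrate how an iterator of byte chunks can be exposed through a file-like interface, here is a minimal sketch built on `io.RawIOBase`. The class name `IterStreamSketch` and its internals are assumptions for illustration; the actual gimie implementation may differ.

```python
import io
from typing import Iterator


class IterStreamSketch(io.RawIOBase):
    """Hypothetical re-implementation of the documented behaviour."""

    def __init__(self, iterator: Iterator[bytes]):
        super().__init__()
        self._iterator = iterator
        self._leftover = b""  # bytes fetched but not yet handed to the reader

    def readable(self) -> bool:
        return True

    def readinto(self, buffer) -> int:
        try:
            # Pull chunks until a non-empty one arrives; empty chunks are skipped.
            while not self._leftover:
                self._leftover = next(self._iterator)
        except StopIteration:
            return 0  # signals end of stream
        n = min(len(buffer), len(self._leftover))
        buffer[:n] = self._leftover[:n]
        self._leftover = self._leftover[n:]
        return n


stream = IterStreamSketch(iter([b"Hello ", b"", b"World"]))
data = stream.read()
```

Only `readable()` and `readinto()` need to be defined: `io.RawIOBase` derives `read()` and `readall()` from them.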
- class gimie.io.LocalResource(path: str | PathLike)[source]¶
Bases:
Resource
Provides read-only access to local data via a file-like interface.
Examples
>>> resource = LocalResource("README.md")
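As a rough sketch of what read-only access to a local file through a file-like interface can look like, the example below wraps a path and returns an unbuffered binary stream from `open()`. The class `LocalResourceSketch` is hypothetical, and the assumption that `open()` returns a raw stream is made for illustration only.

```python
import io
import os
import tempfile
from pathlib import Path
from typing import Union


class LocalResourceSketch:
    """Hypothetical stand-in: read-only, file-like access to a local path."""

    def __init__(self, path: Union[str, os.PathLike]):
        self.path = Path(path)

    def open(self) -> io.RawIOBase:
        # buffering=0 in binary mode yields a raw, unbuffered io.FileIO object.
        return open(self.path, mode="rb", buffering=0)


# Demonstrate on a throwaway file so the example does not depend on the cwd.
with tempfile.TemporaryDirectory() as tmp:
    readme = Path(tmp) / "README.md"
    readme.write_bytes(b"# demo\n")
    with LocalResourceSketch(readme).open() as stream:
        content = stream.read()
```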
- class gimie.io.RemoteResource(path: str, url: str, headers: dict | None = None)[source]¶
Bases:
Resource
Provides read-only access to remote data via a file-like interface.
- Parameters:
url – The URL from which the resource can be downloaded.
headers – Optional headers to pass to the request.
Examples
>>> url = "https://raw.githubusercontent.com/sdsc-ordes/gimie/main/README.md"
>>> content = RemoteResource("README.md", url).open().read()
>>> assert isinstance(content, bytes)
gimie.models module¶
Data models to represent nodes in the graph generated by gimie.
- class gimie.models.Organization(_id: str, name: str, legal_name: str | None = None, email: List[str] | None = None, description: str | None = None, logo: str | None = None)[source]¶
Bases:
object
See http://schema.org/Organization
- class gimie.models.OrganizationSchema(*args, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None, flattened=False, lazy=False, _all_objects=None, _visited=None, _top_level=True)[source]¶
Bases:
JsonLDSchema
- class Meta[source]¶
Bases:
object
- model¶
alias of
Organization
- rdf_type = rdflib.term.URIRef('http://schema.org/Organization')¶
- opts: SchemaOpts = <calamus.schema.JsonLDSchemaOpts object>¶
- class gimie.models.Person(_id: str, identifier: str, name: str | None = None, email: str | None = None, affiliations: List[Organization] | None = None)[source]¶
Bases:
object
See http://schema.org/Person
- affiliations: List[Organization] | None = None¶
- class gimie.models.PersonSchema(*args, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None, flattened=False, lazy=False, _all_objects=None, _visited=None, _top_level=True)[source]¶
Bases:
JsonLDSchema
- opts: SchemaOpts = <calamus.schema.JsonLDSchemaOpts object>¶
- class gimie.models.Release(tag: str, date: datetime.datetime, commit_hash: str)[source]¶
Bases:
object
This class represents a release of a repository.
- Parameters:
tag (str) – The tag of the release.
date (datetime.datetime) – The date of the release.
commit_hash (str) – The commit hash of the release.
- date: datetime.datetime¶
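The following sketch shows how a record with the three documented fields might be built and compared by date. `ReleaseSketch` is a hypothetical dataclass mirroring the documented parameters, and all tags, dates, and hashes are made-up values.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ReleaseSketch:
    """Hypothetical mirror of the documented fields (not gimie's class)."""

    tag: str
    date: datetime
    commit_hash: str


# Illustrative values only.
releases = [
    ReleaseSketch("v1.0.0", datetime(2023, 5, 1, tzinfo=timezone.utc), "0123abc"),
    ReleaseSketch("v1.1.0", datetime(2024, 1, 15, tzinfo=timezone.utc), "4567def"),
]

# Pick the most recent release by its date.
latest = max(releases, key=lambda release: release.date)
```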
- class gimie.models.Repository(url: str, name: str, authors: List[Organization | Person] | None = None, contributors: List[Person] | None = None, date_created: datetime.datetime | None = None, date_modified: datetime.datetime | None = None, date_published: datetime.datetime | None = None, description: str | None = None, download_url: str | None = None, identifier: str | None = None, keywords: List[str] | None = None, licenses: List[str] | None = None, parent_repository: str | None = None, prog_langs: List[str] | None = None, version: str | None = None)[source]¶
Bases:
object
This class represents a git repository. It does not contain any information about the content of the repository. See https://schema.org/SoftwareSourceCode
- authors: List[Organization | Person] | None = None¶
- date_created: datetime.datetime | None = None¶
- date_modified: datetime.datetime | None = None¶
- date_published: datetime.datetime | None = None¶
- class gimie.models.RepositorySchema(*args, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None, flattened=False, lazy=False, _all_objects=None, _visited=None, _top_level=True)[source]¶
Bases:
JsonLDSchema
This defines the schema used for JSON-LD serialization.
- class Meta[source]¶
Bases:
object
- add_value_types = False¶
- model¶
alias of
Repository
- rdf_type = rdflib.term.URIRef('http://schema.org/SoftwareSourceCode')¶
- opts: SchemaOpts = <calamus.schema.JsonLDSchemaOpts object>¶
gimie.project module¶
Orchestration of multiple extractors for a given project. This is the main entry point for end-to-end analysis.
- class gimie.project.Project(path: str, base_url: str | None = None, git_provider: str | None = None, parser_names: Iterable[str] | None = None)[source]¶
Bases:
object
A class to represent a project’s git repository.
- Parameters:
path – The full path (URL) of the repository.
base_url – The base URL of the git remote. Can be used to specify delimitation between base URL and project name.
git_provider – The name of the git provider to extract metadata from. (‘git’, ‘github’, ‘gitlab’)
parser_names – Names of file parsers to use (‘license’). If None, default parsers are used (see gimie.parsers.PARSERS).
Examples
>>> proj = Project("https://github.com/sdsc-ordes/gimie")
>>> assert isinstance(proj.extract(), Graph)
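The `base_url` parameter marks where the git remote's base URL ends and the project name begins. The sketch below illustrates that idea with a hypothetical helper, `split_repo_url`; it is not gimie's implementation, and the fallback of using the scheme and host as the base URL is an assumption.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse


def split_repo_url(path: str, base_url: Optional[str] = None) -> Tuple[str, str]:
    """Hypothetical helper: split a repository URL into (base URL, project name)."""
    if base_url is None:
        # Assumed fallback: treat scheme + host as the base URL.
        parsed = urlparse(path)
        base_url = f"{parsed.scheme}://{parsed.netloc}"
    base_url = base_url.rstrip("/")
    if not path.startswith(base_url):
        raise ValueError(f"{path!r} does not start with {base_url!r}")
    project = path[len(base_url):].strip("/")
    return base_url, project


default_split = split_repo_url("https://github.com/sdsc-ordes/gimie")

# A self-hosted instance served under a sub-path (made-up URL) needs an
# explicit base URL, since the host alone is ambiguous.
custom_split = split_repo_url(
    "https://example.org/gitlab/group/repo", base_url="https://example.org/gitlab/"
)
```

An explicit base URL matters mainly for instances hosted under a sub-path, where the host name alone does not reveal where the project name starts.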