Tools
MAVA tools so far:
TIB-AV-A
Category | Link or Value |
---|---|
Project Page | TIB AV-Analytics ⧉ |
Demo | Video and Demo ⧉ |
Code | tibava ⧉ |
Status | stable with heavy new development |
About
TIB-AV-A analyzes single videos with AI tools, but also allows for manual annotations.
Export
Exports are offered in the following formats:
- individual `tsv`s
- merged `tsv`s
- ELAN ⧉ exports
TIB-AV-A Export data (TSVs)
One TSV file represents an "annotation band".
- 5 fields for anything with a duration: `start`, `start in seconds`, `duration`, `duration in seconds` and `annotations`
- 3 fields for anything that happens in a specific moment: `start in seconds`, `start hh:mm:ss.ms` and `annotations`
Fields:
- `start in seconds` - seconds from the start of the video
- `start hh:mm:ss.ms` - maps `start in seconds` to a human-readable format
- `duration in seconds` - duration in seconds
- `duration hh:mm:ss.ms` - maps `duration in seconds` to a human-readable format
- `annotations` - either a numeric value (example: `Angry.tsv`), an array of numeric values (example: `Dominant Color(s).tsv`) or an array of strings (`Face Emotions.tsv`, `Speech Recognition.tsv`). The string array values usually start with a qualifier: `'Transcript:: Here is the first German television with the daily news.'` or `'Emotion::Neutral'`
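As an illustration of how the human-readable fields map to the seconds fields, a minimal Python sketch (the function name is ours, not part of TIB-AV-A):

```python
def seconds_to_timestamp(seconds: float) -> str:
    """Map a 'start in seconds' / 'duration in seconds' value to the
    'hh:mm:ss.ms' format used in the exports (e.g. 14.0 -> '00:00:14.0')."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    # the exports shown below print a single fractional digit
    return f"{int(hours):02d}:{int(minutes):02d}:{secs:04.1f}"

assert seconds_to_timestamp(14.0) == "00:00:14.0"
```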
Examples:
Example of a transcript annotation file with duration (annotation is a string):

```
start hh:mm:ss.ms start in seconds duration hh:mm:ss.ms duration in seconds annotations
00:00:00.0 0.0 00:00:14.0 14.0 ['Transcript:: Sieh mal was ich gefunden hab.']
00:00:14.0 14.0 00:00:03.0 3.0 ['Transcript:: Was ist das?']
00:00:17.0 17.0 00:00:07.0 7.0 ['Transcript:: Samenkörner.']
00:00:24.0 24.0 00:00:02.0 2.0 ['Transcript:: Wir sind uns so ähnlich wie beide.']
```
Example of an audio average power annotation file with time points (annotation is a number)
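For working with these exports programmatically, a minimal parsing sketch in Python (assuming tab-separated files with the header row shown above; `parse_band` is our name, not a TIB-AV-A API):

```python
import ast
import csv

def parse_band(path: str) -> list[dict]:
    """Read one annotation band TSV into a list of row dicts, decoding the
    'annotations' field into a float, or a list of numbers or strings."""
    rows = []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            ann = row["annotations"]
            # arrays of numeric values or of qualified strings start with '['
            row["annotations"] = ast.literal_eval(ann) if ann.startswith("[") else float(ann)
            rows.append(row)
    return rows
```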
Open Questions
Question
Does TIB-AV-A plan to have MAVA imports?
Videoscope
Category | Link or Value |
---|---|
Platform | Videoscope ⧉ |
Landing Page | Videoscope in LiRI ⧉ |
Tutorial | Videoscope Manual ⧉ |
Code | private code base |
Context | LiRI-Corpus-Platform-LCP ⧉ |
Status | stable with heavy new development |
About
Videoscope is part of a bigger application that imports linguistic corpora to make them searchable. Alongside videoscope, which takes care of video corpora, there are also soundscript ⧉ and catchphrase ⧉, which take care of audio and text collections.
Import
- imports need to be done for a corpus of multiple videos all at once
- videoscope might come with additional context that is shared between the videos of a corpus
Videoscope internal data
SRT files:
```
1
00:00:10,433 --> 00:00:14,366
It shouldn't be that complicated to
carry out simple edits in a speech corpus.

2
00:00:14,766 --> 00:00:18,200
And luckily with databaseExplorer
it isn't.
```
Direct parts of the SRT file
- `Document`: an SRT file or a part of it (for example, when the speaker changes it could be a new document)
- `Segment`: an SRT item that has a time interval
- `Token`: words in the SRT file: everything separated by a blank or punctuation
- `Char`: all characters are counted to provide the position of `Tokens`
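A minimal sketch of how a `Segment`'s text could be split into `Tokens` with `Char` positions; the splitting rule (blank or punctuation) follows the description above, the function and names are ours:

```python
import re

def tokenize(segment_text: str, offset: int = 0) -> list[tuple[str, int, int]]:
    """Split a Segment's text into Tokens (everything separated by a blank
    or punctuation) with (char_start, char_end) positions; `offset` lets
    char counts continue across Segments."""
    return [(m.group(), offset + m.start(), offset + m.end())
            for m in re.finditer(r"\w+|[^\w\s]", segment_text)]

print([tok for tok, _, _ in tokenize("It shouldn't be that complicated")])
# ['It', 'shouldn', "'", 't', 'be', 'that', 'complicated']
```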
Derived parts of the SRT file
- `Form`: a unique `Token` within the file, so a `Form` can appear more than once as a `Token`: in the example, `t` appears two times, in `shouldn't` and `isn't`.
- `Lemma`: the tutorial says that all corpora need to define a `lemma` attribute, but it is left open what it is: in the tutorial the `form` is just reused as the `lemma`. In the examples the lemma is High German, whereas the form is Swiss German.

There can be other derived attributes besides `lemma`, such as `shortened`; `lemma` is just one example of a mapping of tokens to some attribute.
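A sketch of the `Form` derivation: deduplicate tokens and hand out sequence numbers (the numbering scheme follows the identification rules in the next subsection; the code itself is our illustration):

```python
def derive_forms(tokens: list[str]) -> tuple[dict[str, int], list[int]]:
    """Map each unique Token to a Form with a sequence number (form_id)
    and return the form_id for every token occurrence."""
    form_ids: dict[str, int] = {}
    occurrence_ids = []
    for tok in tokens:
        form_ids.setdefault(tok, len(form_ids) + 1)  # new Form on first sight
        occurrence_ids.append(form_ids[tok])
    return form_ids, occurrence_ids

forms, ids = derive_forms(["shouldn", "'", "t", "isn", "'", "t"])
# forms == {'shouldn': 1, "'": 2, 't': 3, 'isn': 4} - 't' is a single Form
# ids == [1, 2, 3, 4, 2, 3]                         - appearing twice as a Token
```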
Identification
Globally within videoscope
`Segments` get a uuid; they are the building blocks of the SRT file.
Locally within video
`document`, `token`, `form` and `lemma` get a sequence number within the video, but no uuid: they are only identified within the scope of the video, not globally. The identifiers are `token_id`, `form_id` and `lemma_id`.
Position measurements
- `char_range`: position within the text of the video; all characters are counted
- `frame_range`: time position in the video, counted at 25 frames per second
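With the stated 25 fps convention, converting between `frame_range` values and seconds is a plain division; a short sketch (function names are ours):

```python
FPS = 25  # frame_range is counted at 25 frames per second

def frame_to_seconds(frame: int) -> float:
    return frame / FPS

def seconds_to_frame(seconds: float) -> int:
    return round(seconds * FPS)

# e.g. the SRT cue starting at 00:00:10,433 begins around frame 261
assert seconds_to_frame(10.433) == 261
```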
Files
```
├── annotation.csv
├── document.csv
├── fts_vector.csv
├── global_attribute_who.csv
├── meta.json
├── segment.csv
├── token.csv
├── token_form.csv
└── token_lemma.csv
```
Video parts ("Layers")
FK = foreign key to another table

File Name | Identifier(s) | Position Columns | Metadata / Other Columns | Description / Purpose |
---|---|---|---|---|
`document.csv` | `document_id` (assumed) | `char_range`, `frame_range` | `name`, `who_id`, `transcriptor`, `tool`, `transcription_phase`, `normalisation`, `media` | Document-level info |
`segment.csv` | `segment_id` (uuid) | `char_range`, `frame_range` | `who_id`, `meta` (link/details?) | Segments within documents |
`fts_vector.csv` | `segment_id` (FK) | (N/A - relates to segment) | Segment as vector data | Full-text search vectors |
`token.csv` | `token_id` (assumed), `segment_id` (FK), `form_id` (FK), `lemma_id` (FK) | `char_range`, `frame_range` | `xpos`, `unclear`, `truncated`, `vocal` | Tokens within segments |
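To get readable tokens back out of these files, the surface text has to be joined in via `form_id` (and the lemma via `lemma_id`); a sketch with pandas, assuming the CSVs carry header rows with the column names from the table above:

```python
import pandas as pd

tokens = pd.read_csv("token.csv")
forms = pd.read_csv("token_form.csv")    # form_id, form
lemmas = pd.read_csv("token_lemma.csv")  # lemma_id, lemma

# attach surface text and lemma to every token occurrence
tokens = tokens.merge(forms, on="form_id").merge(lemmas, on="lemma_id")
print(tokens[["token_id", "form", "lemma", "char_range", "frame_range"]].head())
```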
Annotations
File Name | Identifier(s) | Position Columns | Metadata / Other Columns | Description / Purpose |
---|---|---|---|---|
`annotations.csv` | `annotation_id` (assumed) | `char_range` (link) | `type`, `meta` (needs clarification) | Annotations on text spans (mostly on `char_range` gaps between tokens) |
Metadata
File Name | Identifier(s) | Metadata / Other Columns | Description / Purpose |
---|---|---|---|
`global_attribute_who.csv` | `who_id` | Speaker/participant metadata (name, gender, occupation) | Speaker/participant details |
`meta.json` | `doi` in `meta/url` | JSON structure (needs clarification) | Corpus-level metadata |
`token_form.csv` | `form_id` | `form` | Unique token forms (surface text) |
`token_lemma.csv` | `lemma_id` | `lemma` | Unique token lemmas (base form; needs clarification) |
Open Questions
Question
Does videoscope plan to offer MAVA exports?
VIAN
Category | Link or Value |
---|---|
Outdated Project Page | VIAN (predecessor) ⧉ |
VIAN in the digital humanities context | Methods and Advanced Tools for the Analysis of Film Colors in Digital Humanities ⧉ |
Status | VIAN is currently being refactored and redesigned by the TIB-AV-A team |
Code | No code base is currently published |
About
VIAN is a project to annotate movies manually. It serves to train students in categorizing shots in films.
VIAN Internal data
VIAN Lite uses JSON as input/output format for both its web and Electron app.
There are 3 metadata files: one for the overview of the project(s) and two specific to a given project.
To accompany the metadata there are also screenshots and thumbnails in JPEG format.
Files
JSON metadata
`meta.json` contains information about the projects (name and ID).
Simplified example (in YAML):
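A hypothetical sketch of the structure, assuming the projects are keyed by the video `id` that is used across all files (the project name is invented):

```yaml
# hypothetical reconstruction - maps a video id to its project name
c9b3a996-3556-46c4-b451-144146cde671:
  name: "Example Project"  # invented value
```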
`main.json` contains general information on a given video (`fps` = frames per second).
Simplified example (in YAML):
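A hypothetical sketch using the keys from the interpretation section below (`fps`, `videoDuration`, `video`); the fps and duration values are taken from the flowchart example further down, the path is invented:

```yaml
# hypothetical reconstruction for one video
id: "c9b3a996-3556-46c4-b451-144146cde671"
fps: 24
videoDuration: 888  # seconds
video: "app:///home/vian-lite/c9b3a996-3556-46c4-b451-144146cde671/video.mp4"  # invented path
```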
`undoable.json` contains information about the annotations done by the user.
Simplified example (in YAML):
id: "c9b3a996-3556-46c4-b451-144146cde671"
subtitles: null
subtitlesVisible: true
timelines:
0:
type: "screenshots"
name: "Screenshots"
id: "f3d05e17-cc24-4c47-b714-e11e38d43339"
data:
0:
frame: 0
thumbnail: "app:///home/vian-lite/c9b3a996-3556-46c4-b451-144146cde671/screenshots/00000000_mini.jpg"
image: "app:///home/vian-lite/c9b3a996-3556-46c4-b451-144146cde671/screenshots/00000000.jpg"
id: "571bccc2-7354-4ed9-b07a-178093b101b5"
1:
frame: 2400
thumbnail: "app:///home/vian-lite/c9b3a996-3556-46c4-b451-144146cde671/screenshots/00002400_mini.jpg"
image: "app:///home/vian-lite/c9b3a996-3556-46c4-b451-144146cde671/screenshots/00002400.jpg"
id: "a8cba49f-c88e-477d-ae02-ce4370fb97d7"
1:
type: "shots"
name: "Shots"
id: "cf54019d-abf8-4818-97e6-f8c0bfeac1ef"
data:
0:
start: 0
end: 751
id: "9785c66d-747a-48a1-bed0-cca63c501dee"
annotation: "test"
1:
start: 752
end: 1017
id: "c6d5aade-b94c-4f4e-aef9-5e1614bc3203"
2: {...}
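A minimal Python sketch of walking this structure, distinguishing the frame-based `screenshots` entries from the range-based `shots` entries (it assumes `timelines` and `data` are JSON objects keyed by index, as the simplified example suggests):

```python
import json

with open("undoable.json", encoding="utf-8") as f:
    project = json.load(f)

for timeline in project["timelines"].values():
    for entry in timeline["data"].values():
        if timeline["type"] == "screenshots":
            print(f"screenshot at frame {entry['frame']}: {entry['image']}")
        elif timeline["type"] == "shots":
            label = entry.get("annotation", "")
            print(f"shot, frames {entry['start']}-{entry['end']} {label}")
```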
JSON metadata interpretation
The `id` represents a unique identifier for a video and is used as a key across all metadata files. Via the `id`, one can retrieve:
- the name of the project the video is in (`meta.json`)
- the duration, the frames per second (`fps`) and where the video is stored (`main.json`)
- subtitle presence and the various annotations (`undoable.json`)
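A sketch of collecting everything known about one video via its `id` (how the three files are laid out on disk is our assumption, based on the `app:///home/vian-lite/<id>/` paths above):

```python
import json

def load(path: str) -> dict:
    with open(path, encoding="utf-8") as f:
        return json.load(f)

video_id = "c9b3a996-3556-46c4-b451-144146cde671"
meta = load("meta.json")                      # project name(s) and IDs
main = load(f"{video_id}/main.json")          # fps, videoDuration, video path
undoable = load(f"{video_id}/undoable.json")  # subtitles and timelines

print(main["fps"], main["videoDuration"], main["video"])
print(undoable["subtitlesVisible"], len(undoable["timelines"]))
```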
```mermaid
flowchart LR
    id --> name
    id --> fps
    id --> videoDuration
    id --> video
    id --> subtitles
    id --> subtitlesVisible
    id --> timelines
    subgraph undoable.json
        timelines_desc["video annotations"]
        subtitles_desc["path to subtitles file or null"]
        subtitlesVisible_desc["show subtitles (true) or not (false)"]
        subtitles --> subtitles_desc
        subtitlesVisible --> subtitlesVisible_desc
        timelines --> timelines_desc
    end
    subgraph main.json
        frames[frames per second]
        duration[video duration in seconds]
        videoPath[path to video storage]
        fps --> frames
        videoDuration --> duration
        video --> videoPath
    end
    subgraph meta.json
        name --> main
        main["Project name"]
    end
```
Metrics: from frames per second to timestamps
The `fps` and `videoDuration` entries (`main.json`) are important to understand how the video annotations (`undoable.json`) are handled internally in VIAN. All annotations are defined for either a specific frame (`screenshots`) or a frame range (`shots`). To go from an annotation to a timestamp in the video, one multiplies the `fps` with the `videoDuration` to obtain the total number of frames in the video. The timestamp is then obtained by a proportion: if the total video duration corresponds to that total number of frames, which timestamp corresponds to frame X? (The proportion simplifies to timestamp = frame / fps.)
```mermaid
flowchart TD
    subgraph Video
        videoDetails["`fps: 24
        videoDuration: 888`"]
    end
    calculation1["`21'312 frames
    888 seconds`"]
    videoDetails --> calculation1
    subgraph Screenshot
        frame[frame: 2400]
    end
    calculation2["`2'400 frames
    X seconds`"]
    frame --> calculation2
    calculation3["`(2'400 x 888) / 21'312
    => 100 seconds`"]
    calculation1 --> calculation3
    calculation2 --> calculation3
```
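The same calculation as a minimal Python sketch (names are ours):

```python
def frame_to_timestamp(frame: int, fps: float, video_duration: float) -> float:
    """Convert an annotation frame number to a timestamp in seconds
    via the proportion described above."""
    total_frames = fps * video_duration           # 24 * 888 = 21'312
    return frame * video_duration / total_frames  # simplifies to frame / fps

assert frame_to_timestamp(2400, fps=24, video_duration=888) == 100.0
```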
Metadata overview
File Name | Identifier(s) | Metadata / Other Columns | Description / Purpose |
---|---|---|---|
`meta.json` | `id` | Name of project(s) | List of projects |
`main.json` | `id` | FPS, path to the video and video duration | General video information |
`undoable.json` | `id` | Presence of subtitles and annotations (either for a frame or for a shot) | User annotations |
Open Questions
Question
What are the status and current data of VIAN Lite? What are VIAN's plans for MAVA?