multimodal_fin.embeddings.speech_tree package
Submodules
multimodal_fin.embeddings.speech_tree.conference_node module
- class multimodal_fin.embeddings.speech_tree.conference_node.ConferenceNode(name, node_type, text_embeddings=<factory>, audio_embeddings=<factory>, video_embeddings=<factory>, num_sentences=None, metadata=<factory>)
Bases: NodeMixin
Represents a node in the hierarchical structure of a financial conference (monologue, question, or answer).
Each node may contain multimodal embeddings (text, audio, video), metadata, and a reference to its parent node (used by anytree for tree traversal).
- audio_embeddings: Dict[str, List[List[float]]]
Audio embeddings per sentence.
- metadata: Dict
Metadata dictionary for classification, QA response, coherence, etc.
- name: str
Unique identifier for the node.
- node_type: str
Type of node: ‘monologue’, ‘question’, or ‘answer’.
- num_sentences: int | None = None
Total number of sentences (optional).
- text_embeddings: Dict[str, List[List[float]]]
Textual embeddings per sentence.
- video_embeddings: Dict[str, List[List[float]]]
Video embeddings per sentence.
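The field shapes listed above can be illustrated with a small stand-in. This is a minimal sketch, not the library's class: `SketchNode` is a hypothetical dataclass that mirrors the documented fields using only the standard library (it does not inherit from anytree's NodeMixin). The embedding key `"model_a"` and all vector values are invented for illustration. The sketch assumes each dictionary key maps to one vector per sentence.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SketchNode:
    """Hypothetical stand-in mirroring the documented ConferenceNode fields."""

    name: str
    node_type: str  # 'monologue', 'question', or 'answer'
    text_embeddings: Dict[str, List[List[float]]] = field(default_factory=dict)
    audio_embeddings: Dict[str, List[List[float]]] = field(default_factory=dict)
    video_embeddings: Dict[str, List[List[float]]] = field(default_factory=dict)
    num_sentences: Optional[int] = None
    metadata: Dict = field(default_factory=dict)


# Two sentences, each with a 3-dimensional text embedding (values are invented).
node = SketchNode(
    name="monologue_0",
    node_type="monologue",
    text_embeddings={"model_a": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]},
    num_sentences=2,
)

# Defaults come from factories, so separate nodes never share a dict instance.
other = SketchNode(name="question_0", node_type="question")
other.metadata["topic"] = "guidance"  # assumed metadata key, for illustration
```

The `<factory>` defaults in the documented signature suggest the same pattern: each node gets its own empty dictionaries, so writing to one node's metadata leaves every other node untouched.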
multimodal_fin.embeddings.speech_tree.conference_tree_builder module
- class multimodal_fin.embeddings.speech_tree.conference_tree_builder.ConferenceTreeBuilder(json_path)
Bases: object
Builds a hierarchical tree of a financial conference using the ConferenceNode class.
This includes:
- A root node for the whole conference.
- Monologue nodes as direct children.
- QA pair nodes, each with a question and an answer node.
- json_path: str
Path to the JSON file containing the conference data.
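The tree shape described for ConferenceTreeBuilder can be sketched as follows. This is an illustration under stated assumptions, not the builder's actual implementation: `TreeNode` and `build_tree` are hypothetical names, the JSON keys `"monologues"` and `"qa_pairs"` are an assumed layout (the real schema is not documented here), and the node types `"root"` and `"qa_pair"` are placeholders beyond the three documented types. The real class reads from `json_path` and uses ConferenceNode with anytree. The sketch parses an inline string and uses a plain parent/children node instead.

```python
import json
from typing import Dict, List, Optional


class TreeNode:
    """Hypothetical minimal node; the real ConferenceNode relies on anytree."""

    def __init__(self, name: str, node_type: str,
                 parent: Optional["TreeNode"] = None) -> None:
        self.name = name
        self.node_type = node_type
        self.parent = parent
        self.children: List["TreeNode"] = []
        if parent is not None:
            parent.children.append(self)


def build_tree(data: Dict) -> TreeNode:
    """Build root -> monologues and root -> QA pairs -> (question, answer)."""
    root = TreeNode("conference", "root")
    for i, _ in enumerate(data.get("monologues", [])):
        TreeNode(f"monologue_{i}", "monologue", parent=root)
    for i, _ in enumerate(data.get("qa_pairs", [])):
        pair = TreeNode(f"qa_pair_{i}", "qa_pair", parent=root)
        TreeNode(f"question_{i}", "question", parent=pair)
        TreeNode(f"answer_{i}", "answer", parent=pair)
    return root


# Assumed JSON layout, for illustration only.
raw = (
    '{"monologues": ["Opening remarks"], '
    '"qa_pairs": [{"question": "Q?", "answer": "A."}]}'
)
root = build_tree(json.loads(raw))
child_types = [child.node_type for child in root.children]
```

Grouping each question and answer under a shared pair node keeps the two linked, so a traversal can reach an answer from its question through the common parent.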