multimodal_fin.embeddings.speech_tree package

Submodules

multimodal_fin.embeddings.speech_tree.conference_node module

class multimodal_fin.embeddings.speech_tree.conference_node.ConferenceNode(name, node_type, text_embeddings=<factory>, audio_embeddings=<factory>, video_embeddings=<factory>, num_sentences=None, metadata=<factory>)[source]

Bases: NodeMixin

Represents a node in the hierarchical structure of a financial conference (monologue, question, or answer).

Each node may contain multimodal embeddings (text, audio, video), metadata, and a reference to its parent node (used by anytree for tree traversal).

audio_embeddings: Dict[str, List[List[float]]]: Audio embeddings per sentence.

metadata: Dict: Metadata dictionary for classification, QA response, coherence, etc.

name: str: Unique identifier for the node.

node_type: str: Type of node: 'monologue', 'question', or 'answer'.

num_sentences: int | None = None: Total number of sentences (optional).

text_embeddings: Dict[str, List[List[float]]]: Textual embeddings per sentence.

video_embeddings: Dict[str, List[List[float]]]: Video embeddings per sentence.
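
As a rough illustration of the fields listed above, the following sketch defines a dependency-free stand-in with the same constructor arguments. It is not the package's class: `SketchNode` is a hypothetical name, it does not inherit from anytree's `NodeMixin`, and the dictionary key `"sentences"`, the node names, and the metadata values are assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SketchNode:
    """Stand-in that mirrors the documented ConferenceNode fields (not the real class)."""

    name: str
    node_type: str  # 'monologue', 'question', or 'answer'
    text_embeddings: Dict[str, List[List[float]]] = field(default_factory=dict)
    audio_embeddings: Dict[str, List[List[float]]] = field(default_factory=dict)
    video_embeddings: Dict[str, List[List[float]]] = field(default_factory=dict)
    num_sentences: Optional[int] = None
    metadata: Dict = field(default_factory=dict)


# Hypothetical data: an answer with two sentences and 3-dimensional text vectors.
# The "sentences" key is an assumption; the real key scheme is not documented here.
answer = SketchNode(
    name="qa_0_answer",
    node_type="answer",
    text_embeddings={"sentences": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]},
    num_sentences=2,
    metadata={"coherence": 0.9},
)

# Fields declared with a factory default get a fresh container per node.
monologue = SketchNode(name="monologue_0", node_type="monologue")
```

The `<factory>` defaults in the signature suggest that each node receives its own empty dictionaries, so filling in one node's embeddings or metadata should not leak into another node.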

multimodal_fin.embeddings.speech_tree.conference_tree_builder module

class multimodal_fin.embeddings.speech_tree.conference_tree_builder.ConferenceTreeBuilder(json_path)[source]

Bases: object

Builds a hierarchical tree of a financial conference using the ConferenceNode class.

This includes:

- A root node for the whole conference.
- Monologue nodes as direct children.
- QA pair nodes, each with a question and an answer node.

json_path: str: Path to the JSON file containing the conference data.

build_tree()[source]

Builds and returns the root node of the conference tree.

Returns: The root node, with the full tree structure attached as its descendants.
Return type: ConferenceNode
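
The tree shape described for this class can be sketched without the package or anytree. Everything named here is an assumption for illustration only: the `SketchNode` class, the `build_sketch_tree` and `render` helpers, the JSON keys `"monologues"` and `"qa_pairs"`, the node names, and the `"root"` and `"qa_pair"` type labels. The real input schema and naming are not documented in this section.

```python
class SketchNode:
    """Minimal parent/children node (stand-in for ConferenceNode, no anytree)."""

    def __init__(self, name, node_type, parent=None):
        self.name = name
        self.node_type = node_type
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def build_sketch_tree(data):
    """Build root -> monologues and root -> QA pair -> (question, answer)."""
    root = SketchNode("conference", "root")
    for i, _ in enumerate(data.get("monologues", [])):
        SketchNode(f"monologue_{i}", "monologue", parent=root)
    for i, _ in enumerate(data.get("qa_pairs", [])):
        pair = SketchNode(f"qa_{i}", "qa_pair", parent=root)
        SketchNode(f"qa_{i}_question", "question", parent=pair)
        SketchNode(f"qa_{i}_answer", "answer", parent=pair)
    return root


def render(node, depth=0):
    """Return one indented line per node, depth-first."""
    lines = ["  " * depth + f"{node.name} ({node.node_type})"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical input with one monologue and one question/answer pair.
data = {
    "monologues": [{"text": "Opening remarks"}],
    "qa_pairs": [{"question": "Guidance for Q3?", "answer": "We expect growth."}],
}
tree = build_sketch_tree(data)
print("\n".join(render(tree)))
```

With the real classes, the equivalent call would presumably be `ConferenceTreeBuilder(json_path).build_tree()`, after which the returned root can be walked with anytree's own traversal utilities.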
