multimodal_fin.embeddings.speech_tree package

Submodules

multimodal_fin.embeddings.speech_tree.conference_node module

class multimodal_fin.embeddings.speech_tree.conference_node.ConferenceNode(name, node_type, text_embeddings=<factory>, audio_embeddings=<factory>, video_embeddings=<factory>, num_sentences=None, metadata=<factory>)

Bases: NodeMixin

Represents a node in the hierarchical structure of a financial conference (monologue, question, or answer).

Each node may contain multimodal embeddings (text, audio, video), metadata, and a reference to its parent node (used by anytree for tree traversal).

audio_embeddings: Dict[str, List[List[float]]]

Audio embeddings per sentence.

metadata: Dict

Metadata dictionary for classification, QA response, coherence, etc.

name: str

Unique identifier for the node.

node_type: str

Type of node: ‘monologue’, ‘question’, or ‘answer’.

num_sentences: int | None = None

Total number of sentences (optional).

text_embeddings: Dict[str, List[List[float]]]

Textual embeddings per sentence.

video_embeddings: Dict[str, List[List[float]]]

Video embeddings per sentence.
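
The field layout above can be illustrated with a minimal sketch. It uses a hypothetical stand-in dataclass, SketchNode, that only mirrors the documented attribute names and types; it is not the package's ConferenceNode and omits the anytree NodeMixin base. The dictionary key "encoder_a" and the metadata key "topic" are invented for illustration, since the real key scheme is not described here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SketchNode:
    """Hypothetical stand-in mirroring the documented ConferenceNode fields."""

    name: str
    node_type: str  # 'monologue', 'question', or 'answer'
    text_embeddings: Dict[str, List[List[float]]] = field(default_factory=dict)
    audio_embeddings: Dict[str, List[List[float]]] = field(default_factory=dict)
    video_embeddings: Dict[str, List[List[float]]] = field(default_factory=dict)
    num_sentences: Optional[int] = None
    metadata: Dict = field(default_factory=dict)


# "encoder_a" is an assumed key; each inner list is one sentence vector.
node = SketchNode(
    name="monologue_0",
    node_type="monologue",
    text_embeddings={"encoder_a": [[0.1, 0.2], [0.3, 0.4]]},
    num_sentences=2,
)

# Fields declared with a default factory start empty and are not shared
# between instances.
other = SketchNode(name="question_0", node_type="question")
other.metadata["topic"] = "guidance"  # assumed metadata key
```

The `<factory>` defaults in the class signature suggest per-instance containers of this kind, so mutating one node's metadata should leave other nodes untouched.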

multimodal_fin.embeddings.speech_tree.conference_tree_builder module

class multimodal_fin.embeddings.speech_tree.conference_tree_builder.ConferenceTreeBuilder(json_path)

Bases: object

Builds a hierarchical tree of a financial conference using the ConferenceNode class.

This includes:

- A root node for the whole conference.
- Monologue nodes as direct children.
- QA pair nodes, each with a question and an answer node.

json_path

Path to the JSON file containing the conference data.

Type:

str

build_tree()

Builds and returns the root node of the conference tree.

Returns:

Root node with full tree structure as children.

Return type:

ConferenceNode
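
As a rough illustration of the tree shape described for this class, the sketch below assembles the same layout by hand and walks it depth-first. SketchNode and walk are hypothetical stand-ins: the real ConferenceNode obtains its parent and children links from anytree's NodeMixin, and the real builder reads its input from json_path, whose schema is not documented here. The node names and the "root" and "qa_pair" type labels are assumptions.

```python
from typing import Iterator, List, Optional, Tuple


class SketchNode:
    """Hypothetical stand-in with manual parent/children bookkeeping."""

    def __init__(self, name: str, node_type: str,
                 parent: Optional["SketchNode"] = None) -> None:
        self.name = name
        self.node_type = node_type
        self.parent = parent
        self.children: List["SketchNode"] = []
        if parent is not None:
            parent.children.append(self)


def walk(node: SketchNode, depth: int = 0) -> Iterator[Tuple[int, SketchNode]]:
    """Yield (depth, node) pairs in pre-order."""
    yield depth, node
    for child in node.children:
        yield from walk(child, depth + 1)


# "root" and "qa_pair" are assumed labels, not documented node_type values.
root = SketchNode("conference", "root")
SketchNode("monologue_0", "monologue", parent=root)
pair = SketchNode("qa_pair_0", "qa_pair", parent=root)
SketchNode("question_0", "question", parent=pair)
SketchNode("answer_0", "answer", parent=pair)

# Indent each node by its depth to show the hierarchy.
outline = ["  " * depth + f"{n.name} ({n.node_type})" for depth, n in walk(root)]
print("\n".join(outline))
```

With the actual package, the root returned by build_tree() would presumably be traversed with anytree's own iterators rather than a hand-written walk.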

Module contents