multimodal_fin.processing.multimodal.audio package

Submodules

multimodal_fin.processing.multimodal.audio.audio_emotion_analyzer module

class multimodal_fin.processing.multimodal.audio.audio_emotion_analyzer.AudioEmotionAnalyzer(mode='emotion2vec', device='cpu', model_name='iic/emotion2vec_plus_large')

Bases: object

Extracts emotion-based audio embeddings or emotion classifications using a specified recognizer.

classify_audio(audio_path)

Returns the top predicted emotion for a given audio file.

Return type: str

classify_dataframe(df)

Adds a ‘classification’ column to a DataFrame by predicting emotions from audio file paths.

Parameters: df (DataFrame) – Must contain a ‘Path’ column with paths to audio files.
Returns: The same DataFrame with a new ‘classification’ column.
Return type: DataFrame
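
The column-wise behaviour described for classify_dataframe can be sketched as follows. This is a minimal illustration that assumes pandas is installed, not the package's actual implementation. `fake_classify` and `classify_dataframe_sketch` are hypothetical names; `fake_classify` stands in for `classify_audio` so that the example runs without loading a model or reading audio files.

```python
import pandas as pd


def fake_classify(audio_path: str) -> str:
    """Hypothetical stand-in for AudioEmotionAnalyzer.classify_audio."""
    # Made-up rule for illustration only; a real classifier would run a model.
    return "happy" if "call_a" in audio_path else "neutral"


def classify_dataframe_sketch(df: pd.DataFrame, classify=fake_classify) -> pd.DataFrame:
    """Add a 'classification' column by applying `classify` to each path."""
    if "Path" not in df.columns:
        raise KeyError("DataFrame must contain a 'Path' column")
    # Mutates and returns the same DataFrame, matching the documented contract.
    df["classification"] = df["Path"].apply(classify)
    return df


df = pd.DataFrame({"Path": ["clips/call_a.wav", "clips/call_b.wav"]})
result = classify_dataframe_sketch(df)
```

Because the same object is returned, callers who need to keep the original DataFrame unchanged would pass in `df.copy()` instead.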

device: str = 'cpu': The computation device to use (‘cuda’ or ‘cpu’).

get_embeddings(audio_path)

Returns a centered logits vector representing emotional content from the given audio file.

The vector is ordered as: [‘happy’, ‘neutral’, ‘surprise’, ‘disgust’, ‘anger’, ‘sadness’, ‘fear’]

Parameters: audio_path (str) – Path to the audio file.
Returns: Centered logits vector of emotion scores.
Return type: Tensor
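
To make the ordering concrete, the sketch below pairs a seven-element vector with the labels listed above. It assumes that "centered" means subtracting the mean across the seven logits; the reference does not define the term, so treat that as a guess. Plain Python lists are used instead of a Tensor to keep the example dependency-free, the input values are made up, and `center` and `label_scores` are hypothetical helper names.

```python
# Label order as documented for get_embeddings.
EMOTION_ORDER = ["happy", "neutral", "surprise", "disgust", "anger", "sadness", "fear"]


def center(logits):
    """Subtract the mean so the scores sum to zero (assumed meaning of 'centered')."""
    mean = sum(logits) / len(logits)
    return [x - mean for x in logits]


def label_scores(vector):
    """Pair each score with its emotion label, using the documented order."""
    if len(vector) != len(EMOTION_ORDER):
        raise ValueError(f"expected {len(EMOTION_ORDER)} scores, got {len(vector)}")
    return dict(zip(EMOTION_ORDER, vector))


raw_logits = [3.0, 2.0, 1.0, 0.0, 1.5, 0.5, -1.0]  # made-up values
centered = center(raw_logits)
scores = label_scores(centered)
top_emotion = max(scores, key=scores.get)
```

With a real Tensor returned by get_embeddings, calling `.tolist()` on it first would let the same `label_scores` helper be reused.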

mode: str = 'emotion2vec': The name of the recognition model type. Currently, only ‘emotion2vec’ is supported.

model_name: str = 'iic/emotion2vec_plus_large': Name or path of the model to be loaded.
