multimodal_fin.utils package
Submodules
multimodal_fin.utils.cli module
CLI-related utility functions for validating user input in multimodal_fin.
This module provides validation logic for CLI arguments, specifically used when embedding enriched JSON files.
- multimodal_fin.utils.cli.validate_embed_inputs(json_path=None, json_csv=None)
Validate and resolve the JSON file paths to be embedded.
This function ensures that exactly one of the two options is provided: either a single JSON path or a CSV file containing multiple paths.
- Parameters:
json_path (Optional[Path]) – Path to a single enriched transcript JSON file.
json_csv (Optional[Path]) – Path to a CSV containing a ‘Paths’ column.
- Returns:
A list of file paths to be processed.
- Return type:
List[str]
- Raises:
typer.Exit – If both or neither of the arguments is provided, or if CSV reading fails.
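The "exactly one of two options" rule can be sketched with the standard library alone. The helper below is a hypothetical stand-in, not the package's implementation: the name resolve_embed_inputs is invented for illustration, it raises ValueError where the real function raises typer.Exit, and it assumes the CSV is read row by row from the ‘Paths’ column named in the parameter description.

```python
import csv
from pathlib import Path
from typing import List, Optional


def resolve_embed_inputs(json_path: Optional[Path] = None,
                         json_csv: Optional[Path] = None) -> List[str]:
    """Hypothetical stand-in for validate_embed_inputs (ValueError, not typer.Exit)."""
    # Reject the call when both options or neither option is supplied.
    if (json_path is None) == (json_csv is None):
        raise ValueError("Provide exactly one of json_path or json_csv.")

    # Single-file mode: wrap the one path in a list.
    if json_path is not None:
        return [str(json_path)]

    # CSV mode: collect every non-empty entry of the 'Paths' column.
    with open(json_csv, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames is None or "Paths" not in reader.fieldnames:
            raise ValueError("CSV must contain a 'Paths' column.")
        return [row["Paths"] for row in reader if row["Paths"]]
```

Comparing the two `is None` checks for equality covers both failure cases (both given, neither given) in a single condition.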
multimodal_fin.utils.files module
File and path utilities for the multimodal_fin package.
Includes reusable helpers for reading CSV and JSON files, and locating specific files within conference directories.
- multimodal_fin.utils.files.find_audio_file(directory)
Locate the first audio file in a directory (supports mp3, wav, flac).
- Parameters:
directory (Path) – Directory to search in.
- Returns:
Full path to the first found audio file.
- Return type:
Path
- Raises:
FileNotFoundError – If no supported audio file is found.
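A minimal sketch of this lookup, under stated assumptions: the helper name find_first_audio is hypothetical, the search is non-recursive, extensions are matched case-insensitively, and "first" means first in sorted filename order. The documentation does not specify any of these details.

```python
from pathlib import Path

# Extensions listed in the description above.
SUPPORTED_AUDIO_SUFFIXES = (".mp3", ".wav", ".flac")


def find_first_audio(directory: Path) -> Path:
    """Hypothetical sketch: return the first supported audio file in `directory`."""
    # Sorting makes the result deterministic; the real ordering is an assumption.
    for candidate in sorted(directory.iterdir()):
        if candidate.is_file() and candidate.suffix.lower() in SUPPORTED_AUDIO_SUFFIXES:
            return candidate
    raise FileNotFoundError(f"No supported audio file found in {directory}")
```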
- multimodal_fin.utils.files.find_level3_json(directory)
Locate a LEVEL_3.json file in a given directory.
- Parameters:
directory (Path) – Directory to search in.
- Returns:
Full path to the LEVEL_3.json file.
- Return type:
Path
- Raises:
FileNotFoundError – If the file is not found.
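The same pattern applies to the transcript file. The sketch below assumes an exact filename match directly inside the directory, with no recursion; the helper name find_level3 is hypothetical and the real function may search differently.

```python
from pathlib import Path


def find_level3(directory: Path) -> Path:
    """Hypothetical sketch: locate LEVEL_3.json directly inside `directory`."""
    candidate = directory / "LEVEL_3.json"
    # Fail loudly so callers can skip or report incomplete conference folders.
    if not candidate.is_file():
        raise FileNotFoundError(f"LEVEL_3.json not found in {directory}")
    return candidate
```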
- multimodal_fin.utils.files.make_processed_path(original)
Generate the processed output path from an original conference path.
If the original path contains a folder named ‘companies’, that folder is replaced with ‘processed_companies’. Otherwise, the function appends ‘_processed’ to the directory name, keeping the same parent.
- Parameters:
original (Path) – Original input directory path.
- Returns:
Transformed path pointing to processed data.
- Return type:
Path
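The two branches of this rule can be illustrated with pathlib. This is a sketch of the documented behaviour, not the package's code: the name processed_path_sketch and the example paths are hypothetical, and replacing only the first ‘companies’ component is an assumption.

```python
from pathlib import Path


def processed_path_sketch(original: Path) -> Path:
    """Hypothetical sketch of the documented path transformation."""
    parts = list(original.parts)
    if "companies" in parts:
        # Swap the first 'companies' component (first-match is an assumption).
        parts[parts.index("companies")] = "processed_companies"
        return Path(*parts)
    # No 'companies' folder: rename the last component, keep the same parent.
    return original.with_name(original.name + "_processed")


# Illustrative inputs only; these directories are made up.
with_companies = processed_path_sketch(Path("data/companies/ACME/2023_Q1"))
without_companies = processed_path_sketch(Path("data/calls/2023_Q1"))
```

Here with_companies equals Path("data/processed_companies/ACME/2023_Q1") and without_companies equals Path("data/calls/2023_Q1_processed").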
- multimodal_fin.utils.files.read_json_file(json_path)
Read a JSON file and return its parsed content.
- Parameters:
json_path (Path) – Full path to a JSON file.
- Returns:
Parsed JSON content (usually a dict or list).
- Return type:
Any
- Raises:
FileNotFoundError – If the file does not exist.
json.JSONDecodeError – If the file content is not valid JSON.
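Both documented exceptions fall out naturally from the standard library: open() raises FileNotFoundError and json.load() raises json.JSONDecodeError. The sketch below shows a loader with that behaviour plus a caller that handles both cases. The names load_json and load_or_none are hypothetical, and UTF-8 encoding is an assumption.

```python
import json
from pathlib import Path
from typing import Any, Optional


def load_json(json_path: Path) -> Any:
    """Hypothetical sketch: parse a JSON file, letting both errors propagate."""
    with open(json_path, encoding="utf-8") as handle:
        return json.load(handle)


def load_or_none(json_path: Path) -> Optional[Any]:
    """Example caller: treat a missing or malformed file as 'no data'."""
    try:
        return load_json(json_path)
    except FileNotFoundError:
        return None
    except json.JSONDecodeError:
        return None
```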
- multimodal_fin.utils.files.read_paths_csv(csv_path)
Read a CSV file with a ‘path’ column and return a list of valid paths.
- Parameters:
csv_path (str) – Path to the CSV file containing a ‘path’ column.
- Returns:
List of paths to directories or files.
- Return type:
List[str]
- Raises:
ValueError – If the ‘path’ column is missing from the CSV.
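A minimal sketch of the column check using the csv module. The helper name read_path_column is hypothetical, the real function may rely on a different CSV reader, and interpreting "valid" as "non-empty after stripping whitespace" is an assumption.

```python
import csv
from typing import List


def read_path_column(csv_path: str) -> List[str]:
    """Hypothetical sketch: return the non-empty entries of the 'path' column."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # The header row must expose a 'path' column.
        if reader.fieldnames is None or "path" not in reader.fieldnames:
            raise ValueError(f"'path' column missing from {csv_path}")
        # Skip rows whose 'path' cell is absent or blank (assumed meaning of "valid").
        return [
            row["path"].strip()
            for row in reader
            if row["path"] and row["path"].strip()
        ]
```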
multimodal_fin.utils.logging module
Logging utilities for the multimodal_fin package.
Provides a standardized logger with timestamps and log levels.
- multimodal_fin.utils.logging.get_logger(name)
Create and configure a logger for a given module or component.
Ensures a consistent logging format and level across the package.
- Parameters:
name (str) – Name of the logger (typically __name__).
- Returns:
A configured logger instance.
- Return type:
Logger
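A logger factory of this kind is commonly built on the standard logging module. The sketch below is one possible shape, not the package's implementation: the name make_logger, the format string, and the INFO level are all assumptions.

```python
import logging

# Assumed format: timestamp, level, logger name, message.
LOG_FORMAT = "%(asctime)s [%(levelname)s] %(name)s: %(message)s"


def make_logger(name: str) -> logging.Logger:
    """Hypothetical sketch: return a logger with a single stream handler."""
    logger = logging.getLogger(name)
    # Guard against attaching duplicate handlers on repeated calls.
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(LOG_FORMAT))
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

Because logging.getLogger returns the same object for the same name, the handler guard keeps repeated calls from printing each message more than once.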
- multimodal_fin.utils.logging.log_ensemble_prediction(model_outputs, final_label, final_confidence, logger=None)
Log the ensemble classification results in a clean, readable format.
- Parameters:
model_outputs (List[Tuple[str, str, float]]) – Tuples (model_name, predicted_label, confidence).
final_label (str) – Final combined prediction label.
final_confidence (float) – Final confidence (0 to 100).
logger (Optional[Logger]) – Logger to use. If None, uses default package logger.
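To make the expected argument shapes concrete, here is a sketch that formats one line per model and a final summary line, then sends them to a logger. The function names, the column layout, and the fallback logger name "multimodal_fin" are assumptions; the actual output format is not documented here.

```python
import logging
from typing import List, Optional, Tuple


def format_ensemble_lines(model_outputs: List[Tuple[str, str, float]],
                          final_label: str,
                          final_confidence: float) -> List[str]:
    """Hypothetical layout: one aligned line per model, then a summary line."""
    lines = [
        f"{name:<20} {label:<12} {confidence:6.1f}%"
        for name, label, confidence in model_outputs
    ]
    lines.append(f"{'FINAL':<20} {final_label:<12} {final_confidence:6.1f}%")
    return lines


def log_ensemble_sketch(model_outputs: List[Tuple[str, str, float]],
                        final_label: str,
                        final_confidence: float,
                        logger: Optional[logging.Logger] = None) -> None:
    """Emit the formatted lines at INFO level."""
    # Fall back to a package-level logger when none is given (name assumed).
    if logger is None:
        logger = logging.getLogger("multimodal_fin")
    for line in format_ensemble_lines(model_outputs, final_label, final_confidence):
        logger.info(line)
```

Confidences are passed on the 0 to 100 scale described above, so a value such as 72.5 is rendered as a percentage without rescaling.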