phlower.services.trainer.PhlowerTrainer¶
- class phlower.services.trainer.PhlowerTrainer(setting, restart_directory=None)[source]¶
Bases:
object
PhlowerTrainer is a class that manages the training process.
Examples
>>> trainer = PhlowerTrainer.from_setting(setting)
>>> trainer.train(
...     output_directory,
...     train_directories,
...     validation_directories
... )
Methods
__init__(setting[, restart_directory]) – Initialize PhlowerTrainer.
attach_handler(name, handler[, allow_overwrite]) – Attach handler to the trainer.
draw_model(output_directory) – Draw model.
from_setting(setting[, decrypt_key]) – Create PhlowerTrainer from PhlowerSetting.
Get the number of handlers.
get_registered_trainer_setting() – Get registered trainer setting.
load_pretrained(model_directory, selection_mode) – Load pretrained model.
restart_from(model_directory[, decrypt_key]) – Restart PhlowerTrainer from model directory.
train() – Train the model.
train_ddp(rank, world_size, output_directory) – Train the model with Distributed Data Parallel (DDP).
- Parameters:
setting (PhlowerSetting)
restart_directory (pathlib.Path | None)
- attach_handler(name, handler, allow_overwrite=False)[source]¶
Attach handler to the trainer.
- Raises:
ValueError – If handler with the same name already exists
- Parameters:
name (str) – Name of the handler
handler (PhlowerHandlersRunner) – Handler to attach
allow_overwrite (bool)
- Return type:
None
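A minimal usage sketch; the handler objects below are assumed to be already-constructed PhlowerHandlersRunner instances (their construction is not shown here).
>>> trainer = PhlowerTrainer.from_setting(setting)
>>> trainer.attach_handler("early_stopping", handler)
>>> # Re-attaching under the same name raises ValueError unless allow_overwrite=True
>>> trainer.attach_handler("early_stopping", new_handler, allow_overwrite=True)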
- draw_model(output_directory)[source]¶
Draw model
- Parameters:
output_directory (pathlib.Path) – Output directory
- Return type:
None
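Illustrative sketch; the output path is an arbitrary placeholder.
>>> import pathlib
>>> trainer = PhlowerTrainer.from_setting(setting)
>>> trainer.draw_model(pathlib.Path("out/model_graph"))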
- classmethod from_setting(setting, decrypt_key=None)[source]¶
Create PhlowerTrainer from PhlowerSetting
- Parameters:
setting (PhlowerSetting) – PhlowerSetting used to create the trainer
decrypt_key (bytes | None)
- Returns:
PhlowerTrainer
- Return type:
PhlowerTrainer
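A minimal sketch, assuming the setting is loaded from a YAML file via PhlowerSetting.read_yaml; the import path and file name are assumptions.
>>> from phlower.settings import PhlowerSetting
>>> setting = PhlowerSetting.read_yaml("settings.yml")
>>> trainer = PhlowerTrainer.from_setting(setting)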
- get_registered_trainer_setting()[source]¶
Get registered trainer setting
- Returns:
Trainer setting
- Return type:
- load_pretrained(model_directory, selection_mode, target_epoch=None, map_location=None, decrypt_key=None)[source]¶
Load pretrained model
- Parameters:
model_directory (pathlib.Path) – Model directory
selection_mode (Literal["best", "latest", "train_best", "specified"]) – Selection mode
target_epoch (int | None) – Target epoch. Defaults to None.
map_location (str | dict | None) – Map location (device) used when loading the model. Defaults to None.
decrypt_key (bytes | None) – Decrypt key. Defaults to None.
- Return type:
None
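Illustrative sketch; the paths are placeholders, and target_epoch is presumably required when selection_mode is "specified".
>>> import pathlib
>>> trainer = PhlowerTrainer.from_setting(setting)
>>> trainer.load_pretrained(
...     pathlib.Path("out/model"),
...     selection_mode="best",
... )
>>> # Load a specific snapshot by epoch number
>>> trainer.load_pretrained(
...     pathlib.Path("out/model"),
...     selection_mode="specified",
...     target_epoch=100,
... )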
- classmethod restart_from(model_directory, decrypt_key=None)[source]¶
Restart PhlowerTrainer from model directory
- Parameters:
model_directory (pathlib.Path) – Model directory
decrypt_key (bytes | None)
- Returns:
PhlowerTrainer
- Return type:
PhlowerTrainer
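A minimal sketch of restarting from a previous run's model directory and then continuing training with the same train() call shown in the class-level example; all paths are placeholders.
>>> import pathlib
>>> trainer = PhlowerTrainer.restart_from(pathlib.Path("out/model"))
>>> trainer.train(
...     output_directory,
...     train_directories,
...     validation_directories
... )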
- train_ddp(rank, world_size, output_directory, train_directories=None, validation_directories=None, disable_dimensions=False, decrypt_key=None, encrypt_key=None)[source]¶
Train the model with Distributed Data Parallel (DDP)
- Parameters:
rank (int) – Rank of the current process
world_size (int) – Total number of processes
output_directory (pathlib.Path) – Output directory
train_directories (list[pathlib.Path] | None, optional) – List of directories containing training data. If None, directories defined in the setting are used. Default is None.
validation_directories (list[pathlib.Path] | None, optional) – List of directories containing validation data. If None, directories defined in the setting are used. Default is None.
disable_dimensions (bool, optional) – Disable dimensions. Default is False.
decrypt_key (bytes | None, optional) – Key used for decrypting data files, if necessary. Default is None.
encrypt_key (bytes | None, optional) – Key used for encrypting output files, if necessary. Default is None.
- Return type:
float
Examples
>>> import torch.multiprocessing as mp
>>> trainer = PhlowerTrainer.from_setting(setting)
>>> mp.spawn(
...     trainer.train_ddp,
...     args=(world_size, output_directory),
...     nprocs=world_size,
...     join=True
... )
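A common PyTorch pattern (not specific to phlower) is to derive world_size from the number of visible GPUs and guard the spawn call with a __main__ check; a sketch assuming one process per GPU:
>>> import torch
>>> import torch.multiprocessing as mp
>>> if __name__ == "__main__":
...     world_size = torch.cuda.device_count()  # one process per GPU
...     # mp.spawn passes the process rank as the first argument to train_ddp
...     mp.spawn(
...         trainer.train_ddp,
...         args=(world_size, output_directory),
...         nprocs=world_size,
...         join=True,
...     )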