ocl.scheduling
Scheduling of learning rates and hyperparameters.
HPScheduler
Base class for scheduling scalar hyperparameters based on the number of training steps.
A separate callback, ocl.callbacks.UpdateHyperparameterScheduling, calls update_global_step to update the state of the hyperparameter according to the schedule.
Instances can be used in computations like a regular float if operations are applied from the left; otherwise they need to be converted explicitly using float(instance), which returns the currently scheduled value of the hyperparameter.
Source code in ocl/scheduling.py
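For example (a minimal sketch using LinearHPScheduler, documented below; during training, update_global_step is invoked by the UpdateHyperparameterScheduling callback rather than by hand):

```python
from ocl.scheduling import LinearHPScheduler

# Linearly warm a weight up from 0.0 to 1.0 over the first 1000 steps.
weight = LinearHPScheduler(end_value=1.0, end_step=1000)

weight.update_global_step(500)  # normally done by the callback
print(float(weight))  # explicit conversion; prints the scheduled value, here ~0.5
print(weight * 2.0)   # works, since the instance is the left operand
# 2.0 * weight may not work: operations have to be applied from the left.
```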
update_global_step
Update global step used in compute_scheduled_value.
This should be called by the ocl.callbacks.UpdateHyperparameterScheduling callback.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
global_step | int | The current global step. | required |
Source code in ocl/scheduling.py
compute_scheduled_value
abstractmethod
Return the current value of the hyperparameter based on the global step.
Returns:

Type | Description |
---|---|
float | The scheduled hyperparameter value. |
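New schedules are added by subclassing HPScheduler and implementing this method. A hypothetical sketch (the zero-argument signature and the locally tracked step are assumptions; the base class may store the global step differently):

```python
import math

from ocl.scheduling import HPScheduler


class InverseSqrtHPScheduler(HPScheduler):
    """Hypothetical schedule: constant during warmup, then inverse-sqrt decay."""

    def __init__(self, base_value: float, warmup_steps: int = 1000):
        super().__init__()
        self.base_value = base_value
        self.warmup_steps = warmup_steps
        self._step = 0  # tracked locally to avoid relying on base-class internals

    def update_global_step(self, global_step: int):
        super().update_global_step(global_step)
        self._step = global_step

    def compute_scheduled_value(self) -> float:
        step = max(self._step, 1)
        return self.base_value * min(1.0, math.sqrt(self.warmup_steps / step))
```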
LinearHPScheduler
Bases: HPScheduler
Linearly increase the value of a hyperparameter.
Source code in ocl/scheduling.py
__init__
Initialize LinearHPScheduler.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
end_value | float | Value after scheduling. | required |
end_step | int | Step at which end_value should be reached. | required |
start_value | float | Value to be used prior to start_step. | 0.0 |
start_step | int | Step at which linear scheduling should start. | 0 |
Source code in ocl/scheduling.py
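For instance, warming a loss weight up from 0.1 to 1.0 between steps 100 and 1100 (a sketch; clamping to start_value before start_step and to end_value after end_step is assumed):

```python
from ocl.scheduling import LinearHPScheduler

weight = LinearHPScheduler(end_value=1.0, end_step=1100, start_value=0.1, start_step=100)

for step in (0, 100, 600, 1100, 2000):
    weight.update_global_step(step)
    print(step, float(weight))  # expected: 0.1, 0.1, 0.55, 1.0, 1.0
```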
StepHPScheduler
Bases: HPScheduler
Schedule a hyperparameter using a discrete step.
Source code in ocl/scheduling.py
__init__
Initialize StepHPScheduler.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
end_value | float | Value to be used after switch_step. | required |
switch_step | int | Step at which the value switches from start_value to end_value. | required |
start_value | float | Value to be used prior to switch_step. | 0.0 |
Source code in ocl/scheduling.py
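For instance, switching a loss term on at step 5000 (a sketch; whether the switch happens at exactly switch_step or one step later is not specified here, hence the off-by-one-safe probes):

```python
from ocl.scheduling import StepHPScheduler

weight = StepHPScheduler(end_value=1.0, switch_step=5000)

weight.update_global_step(4_999)
print(float(weight))  # 0.0, the default start_value
weight.update_global_step(5_001)
print(float(weight))  # 1.0, the end_value
```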
CosineAnnealingHPScheduler
Bases: HPScheduler
Cosine annealing of a hyperparameter.
Source code in ocl/scheduling.py
__init__
Initialize CosineAnnealingHPScheduler.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
end_value | float | Value after scheduling. | required |
end_step | int | Step at which end_value should be reached. | required |
start_value | float | Value to be used prior to start_step. | required |
start_step | int | Step at which cosine scheduling should start. | required |
Source code in ocl/scheduling.py
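For example, annealing a loss weight from 1.0 down to 0.1 over the first 10000 steps (a sketch; the midpoint value assumes the standard cosine interpolation between start_value and end_value):

```python
from ocl.scheduling import CosineAnnealingHPScheduler

weight = CosineAnnealingHPScheduler(
    start_value=1.0, end_value=0.1, start_step=0, end_step=10_000
)
weight.update_global_step(5_000)
print(float(weight))  # midpoint of the cosine curve, ~0.55
```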
exponential_decay_with_optional_warmup
Return a PyTorch Lightning optimizer configuration for exponential decay with optional warmup.
Exponential decay is applied at each optimization step and already starts while warmup is still taking place. This is in line with the typical schedule used to train Transformer models.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
optimizer | Optimizer | PyTorch optimizer whose learning rate should be scheduled. | required |
decay_rate | float | Decay rate of the exponential decay. | 1.0 |
decay_steps | int | Number of optimization steps after which the learning rate is decayed by decay_rate. | 10000 |
warmup_steps | int | Number of warmup steps. | 0 |

Returns:

Type | Description |
---|---|
Dict[str, Any] | Dict with a structure compatible with PyTorch Lightning; see the PyTorch Lightning documentation. |
Source code in ocl/scheduling.py
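A usage sketch (assuming the returned dict can be handed back from configure_optimizers unchanged; the module and optimizer settings are illustrative):

```python
import pytorch_lightning as pl
import torch

from ocl.scheduling import exponential_decay_with_optional_warmup


class LitModel(pl.LightningModule):
    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=4e-4)
        # Halve the learning rate every 100k steps, with 10k warmup steps;
        # decay already runs during warmup.
        return exponential_decay_with_optional_warmup(
            optimizer, decay_rate=0.5, decay_steps=100_000, warmup_steps=10_000
        )
```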
exponential_decay_after_optional_warmup
Return a PyTorch Lightning optimizer configuration for exponential decay after optional warmup.
Exponential decay is applied at each optimization step, but only starts once warmup has completed.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
optimizer | Optimizer | PyTorch optimizer whose learning rate should be scheduled. | required |
decay_rate | float | Decay rate of the exponential decay. | 1.0 |
decay_steps | int | Number of optimization steps after which the learning rate is decayed by decay_rate. | 10000 |
warmup_steps | int | Number of warmup steps. | 0 |

Returns:

Type | Description |
---|---|
Dict[str, Any] | Dict with a structure compatible with PyTorch Lightning; see the PyTorch Lightning documentation. |
Source code in ocl/scheduling.py
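To make the difference between the two variants concrete, here are plausible learning-rate multipliers at global step t (illustrative only, not the library's code):

```python
def decay_with_warmup(t: int, decay_rate: float, decay_steps: int, warmup_steps: int) -> float:
    # Variant above: decay counts from step 0, overlapping with warmup.
    warmup = min(t / warmup_steps, 1.0) if warmup_steps > 0 else 1.0
    return warmup * decay_rate ** (t / decay_steps)


def decay_after_warmup(t: int, decay_rate: float, decay_steps: int, warmup_steps: int) -> float:
    # This variant: decay only counts steps taken after warmup has completed.
    warmup = min(t / warmup_steps, 1.0) if warmup_steps > 0 else 1.0
    return warmup * decay_rate ** (max(t - warmup_steps, 0) / decay_steps)
```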
plateau_decay
Return a PyTorch Lightning optimizer configuration for plateau decay.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
optimizer | Optimizer | PyTorch optimizer whose learning rate should be scheduled. | required |
decay_rate | float | Factor by which the learning rate should be decayed when a plateau is reached. | 1.0 |
patience | int | Number of epochs to wait for improvement. | 10 |
mode | str | Whether the monitored metric improves by decreasing ('min') or increasing ('max'). | 'min' |

Returns:

Type | Description |
---|---|
Dict[str, Any] | Dict with a structure compatible with PyTorch Lightning; see the PyTorch Lightning documentation. |
Source code in ocl/scheduling.py
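A usage sketch (hypothetical; plateau detection requires a monitored validation metric, and how the monitor is configured is not shown in the table above, so check the source):

```python
import pytorch_lightning as pl
import torch

from ocl.scheduling import plateau_decay


class LitModel(pl.LightningModule):
    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
        # Halve the learning rate once the monitored metric has stopped
        # improving (mode="min": lower is better) for 10 epochs.
        return plateau_decay(optimizer, decay_rate=0.5, patience=10, mode="min")
```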
cosine_annealing_with_optional_warmup
Return a PyTorch Lightning optimizer configuration for cosine annealing with optional warmup.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
optimizer | Optimizer | PyTorch optimizer whose learning rate should be scheduled. | required |
T_max | int | The length of the schedule in steps. | 100000 |
eta_min | float | Minimal fraction of the initial learning rate that should be reached when the scheduling cycle is complete. | 0.0 |
warmup_steps | int | Number of warmup steps. | 0 |
error_on_exceeding_steps | bool | Raise an error if more than T_max steps are taken. | False |

Returns:

Type | Description |
---|---|
Dict[str, Any] | Dict with a structure compatible with PyTorch Lightning; see the PyTorch Lightning documentation. |
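A usage sketch (parameter values are illustrative; note that eta_min is a fraction of the initial learning rate, not an absolute value):

```python
import torch

from ocl.scheduling import cosine_annealing_with_optional_warmup

params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.Adam(params, lr=4e-4)

# Warm up for 10k steps, then cosine-anneal over 100k steps
# down to 5% of the initial learning rate.
config = cosine_annealing_with_optional_warmup(
    optimizer, T_max=100_000, eta_min=0.05, warmup_steps=10_000
)
```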