
Economical Hyperparameter Optimization

flaml.tune is a module for economical hyperparameter tuning. It frees users from manually tuning the many hyperparameters of a piece of software, such as a machine learning training procedure. It can be used standalone or together with ray tune or nni.

  • Example for sequential tuning (recommended when compute resources are limited and each trial can consume all the resources):
# require: pip install flaml[blendsearch]
from flaml import tune
import time

def evaluate_config(config):
    '''evaluate a hyperparameter configuration'''
    # we use a toy example with 2 hyperparameters
    metric = (round(config['x'])-85000)**2 - config['x']/config['y']
    # usually the evaluation takes a non-negligible cost
    # and the cost could be related to certain hyperparameters
    # in this example, we assume it's proportional to x
    time.sleep(config['x']/100000)
    # use tune.report to report the metric to optimize    
    tune.report(metric=metric) 

analysis = tune.run(
    evaluate_config,    # the function to evaluate a config
    config={
        'x': tune.qloguniform(lower=1, upper=100000, q=1),
        'y': tune.randint(lower=1, upper=100000)
    }, # the search space
    low_cost_partial_config={'x':1},    # an initial (partial) config with low cost
    metric='metric',    # the name of the metric used for optimization
    mode='min',         # the optimization mode, 'min' or 'max'
    num_samples=-1,    # the maximal number of configs to try, -1 means infinite
    time_budget_s=60,   # the time budget in seconds
    local_dir='logs/',  # the local directory to store logs
    # verbose=0,          # verbosity    
    # use_ray=True, # uncomment when performing parallel tuning using ray
    )

print(analysis.best_trial.last_result)  # the best trial's result
print(analysis.best_config) # the best config
  • Example for using ray tune's API:
# require: pip install flaml[blendsearch] ray[tune]
from ray import tune as raytune
from flaml import CFO, BlendSearch
import time

def evaluate_config(config):
    '''evaluate a hyperparameter configuration'''
    # we use a toy example with 2 hyperparameters
    metric = (round(config['x'])-85000)**2 - config['x']/config['y']
    # usually the evaluation takes a non-negligible cost
    # and the cost could be related to certain hyperparameters
    # in this example, we assume it's proportional to x
    time.sleep(config['x']/100000)
    # use raytune.report to report the metric to optimize
    raytune.report(metric=metric)

analysis = raytune.run(
    evaluate_config,    # the function to evaluate a config
    config={
        'x': raytune.qloguniform(lower=1, upper=100000, q=1),
        'y': raytune.randint(lower=1, upper=100000)
    }, # the search space
    metric='metric',    # the name of the metric used for optimization
    mode='min',         # the optimization mode, 'min' or 'max'
    num_samples=-1,    # the maximal number of configs to try, -1 means infinite
    time_budget_s=60,   # the time budget in seconds
    local_dir='logs/',  # the local directory to store logs
    search_alg=CFO(low_cost_partial_config={'x':1}) # or BlendSearch
    )

print(analysis.best_trial.last_result)  # the best trial's result
print(analysis.best_config) # the best config
  • Example for using NNI: An example of using BlendSearch with NNI can be seen in test. CFO can be used as well in a similar manner. To run the example, first make sure you have NNI installed, then run:
$ nnictl create --config ./config.yml
  • For more examples, please check out notebooks.
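
The toy objective shared by the examples above can be inspected on its own, without flaml or ray installed. The sketch below splits it into a loss and a simulated cost. The helper names toy_metric and toy_cost_seconds are hypothetical and not part of flaml. It suggests why {'x': 1} is a natural low-cost starting point: a trial is nearly free there, while a trial in the low-loss region near x = 85000 takes close to a second.

```python
def toy_metric(config):
    """Loss from the examples above: low when x is near 85000 and y is small."""
    return (round(config['x']) - 85000) ** 2 - config['x'] / config['y']


def toy_cost_seconds(config):
    """Simulated evaluation time, proportional to x as in the examples above."""
    return config['x'] / 100000


# A cheap starting point and a point in the low-loss region (illustrative values).
low_cost = {'x': 1, 'y': 1}
near_optimum = {'x': 85000, 'y': 1}

for name, cfg in [('low cost', low_cost), ('near optimum', near_optimum)]:
    # Print the loss and the simulated cost side by side.
    print(name, toy_metric(cfg), toy_cost_seconds(cfg))
```

The loss and the cost pull in different directions here, which is the situation the methods described below are designed for.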

flaml offers two HPO methods: CFO and BlendSearch. flaml.tune uses BlendSearch by default.

CFO: Frugal Optimization for Cost-related Hyperparameters


CFO uses the randomized direct search method FLOW2 with adaptive stepsize and random restart. It requires a low-cost initial point as input if such a point exists. The search begins at the low-cost initial point and gradually moves to the high-cost region if needed. The local search method has a provable convergence rate and bounded cost.

About FLOW2: FLOW2 is a simple yet effective randomized direct search method. It is an iterative optimization method that can optimize black-box functions. FLOW2 requires only pairwise comparisons between function values to perform each iterative update. Compared with existing HPO methods, FLOW2 has the following appealing properties:

  1. It is applicable to general black-box functions with a good convergence rate in terms of loss.
  2. It provides theoretical guarantees on the total evaluation cost incurred.
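
To make the idea of a comparison-based randomized direct search concrete, here is a minimal sketch. It is not flaml's FLOW2 implementation. The function names, the stepsize-halving rule, and the patience threshold of 2 * dim failed iterations are all assumptions made for illustration, and random restart and cost tracking are left out. Each iteration samples a random unit direction, tries a step forward and then backward, and moves only if the new point compares better than the current one.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Sample a direction uniformly at random from the unit sphere."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0:
            return [c / norm for c in v]


def direct_search(loss, start, step=1.0, iters=200, seed=0):
    """Simplified randomized direct search (illustrative, not flaml's code)."""
    rng = random.Random(seed)
    x, best = list(start), loss(start)
    dim = len(x)
    fails = 0  # consecutive iterations without improvement
    for _ in range(iters):
        u = random_unit_vector(dim, rng)
        moved = False
        for sign in (1.0, -1.0):  # try x + step*u, then x - step*u
            cand = [xi + sign * step * ui for xi, ui in zip(x, u)]
            val = loss(cand)
            if val < best:  # only a pairwise comparison is needed
                x, best, moved = cand, val, True
                break
        if moved:
            fails = 0
        else:
            fails += 1
            if fails >= 2 * dim:  # assumed patience before shrinking the step
                step *= 0.5
                fails = 0
    return x, best


def sphere(p):
    """Toy black-box loss with its minimum at (3, -2)."""
    return (p[0] - 3.0) ** 2 + (p[1] + 2.0) ** 2


point, value = direct_search(sphere, [0.0, 0.0])
print(point, value)
```

Because the update uses only the comparison `val < best`, the same loop works for any black-box loss that can rank two configurations, even when the absolute loss values are not meaningful.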

The GIFs below demonstrate an example search trajectory of FLOW2 in the loss space and the evaluation cost (i.e., training time) space, respectively. The demonstration shows that (1) FLOW2 moves quickly toward the low-loss region, showing a good convergence property, and (2) FLOW2 tends to avoid exploring the high-cost region until necessary.


Figure 1. FLOW2 in tuning the # of leaves and the # of trees for XGBoost. The two background heatmaps show the loss and cost distribution of all configurations. The black dots are the points evaluated in FLOW2. Black dots connected by lines are points that yield better loss performance when evaluated.

Example:

from flaml import CFO
tune.run(
    ...,
    search_alg=CFO(low_cost_partial_config=low_cost_partial_config),
)

Recommended scenario: there exist cost-related hyperparameters and a low-cost initial point is known before optimization. If the search space is complex and CFO gets trapped in local optima, consider using BlendSearch.

BlendSearch: Economical Hyperparameter Optimization With Blended Search Strategy


BlendSearch combines local search with global search. It leverages the frugality of CFO and the space exploration ability of global search methods such as Bayesian optimization. Like CFO, BlendSearch requires a low-cost initial point as input if such a point exists, and starts the search from there. Unlike CFO, BlendSearch does not wait for the local search to fully converge before trying new start points. The new start points are suggested by the global search method and filtered based on their distance to the existing points in the cost-related dimensions. BlendSearch still gradually increases the trial cost. It prioritizes among the global search thread and multiple local search threads based on optimism in the face of uncertainty.
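
The filtering step described above can be pictured with a small sketch. It assumes one possible reading: a globally suggested start point is kept only if its cost-related coordinates lie within some radius of an already-evaluated point, so that trial cost grows gradually. The function within_cost_radius, the radius value, and the sample configurations are hypothetical and are not taken from flaml's implementation.

```python
def within_cost_radius(candidate, evaluated, cost_dims, radius):
    """Return True if the candidate is within `radius` of some evaluated
    config along every cost-related dimension (illustrative rule only)."""
    return any(
        all(abs(candidate[d] - point[d]) <= radius for d in cost_dims)
        for point in evaluated
    )


# Configs that have already been tried (made-up values for illustration).
evaluated = [{'x': 1, 'y': 50}, {'x': 10, 'y': 70000}]

# Start points proposed by a global search method (made-up values).
suggestions = [{'x': 12, 'y': 3}, {'x': 90000, 'y': 5}]

# Only 'x' is treated as cost-related, matching the toy example above.
accepted = [
    c for c in suggestions
    if within_cost_radius(c, evaluated, cost_dims=['x'], radius=5)
]
print(accepted)
```

In this sketch the suggestion with x = 90000 is set aside because it would jump straight into the expensive region, while the one with x = 12 stays close to what has already been paid for.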

Example:

# require: pip install flaml[blendsearch]
from flaml import BlendSearch
tune.run(
    ...,
    search_alg=BlendSearch(low_cost_partial_config=low_cost_partial_config),
)

Recommended scenario: cost-related hyperparameters exist, a low-cost initial point is known, and the search space is complex enough that local search is prone to getting stuck in local optima.

For more technical details, please check our papers.

@inproceedings{wu2021cfo,
    title={Frugal Optimization for Cost-related Hyperparameters},
    author={Qingyun Wu and Chi Wang and Silu Huang},
    year={2021},
    booktitle={AAAI'21},
}
@inproceedings{wang2021blendsearch,
    title={Economical Hyperparameter Optimization With Blended Search Strategy},
    author={Chi Wang and Qingyun Wu and Silu Huang and Amin Saied},
    year={2021},
    booktitle={ICLR'21},
}