improve max_valid_n and doc (#933)

* improve max_valid_n and doc

* Update README.md

Co-authored-by: Li Jiang <lijiang1@microsoft.com>

* newline at end of file

* doc

---------

Co-authored-by: Li Jiang <lijiang1@microsoft.com>
Co-authored-by: Susan Xueqing Liu <liususan091219@users.noreply.github.com>
Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
Chi Wang 2023-03-05 08:40:57 -08:00 committed by GitHub
parent 97928609ba
commit 1ec77b58b4
9 changed files with 1780 additions and 1752 deletions

View File

@@ -14,20 +14,22 @@
 <br>
 </p>
-:fire: An [upcoming tutorial on FLAML](https://github.com/microsoft/FLAML/tree/tutorial-aaai23/tutorial) at [AAAI-23](https://aaai.org/Conferences/AAAI-23/aaai23tutorials/) (to be held on Feb 08, 2023)
+:fire: OpenAI GPT-3 models support in v1.1.3. ChatGPT support is coming.
+
+:fire: A [lab forum](https://github.com/microsoft/FLAML/tree/tutorial-aaai23/tutorial) on FLAML at AAAI 2023.
 :fire: A [hands-on tutorial](https://github.com/microsoft/FLAML/tree/tutorial/tutorial) on FLAML presented at KDD 2022
 ## What is FLAML
 FLAML is a lightweight Python library that finds accurate machine
 learning models automatically, efficiently and economically. It frees users from selecting
-learners and hyperparameters for each learner. It can also be used to tune generic hyperparameters for MLOps workflows, pipelines, mathematical/statistical models, algorithms, computing experiments, software configurations and so on.
+models and hyperparameters for each model. It can also be used to tune generic hyperparameters for large language models (LLM), MLOps/LMOps workflows, pipelines, mathematical/statistical models, algorithms, computing experiments, software configurations and so on.
-1. For common machine learning tasks like classification and regression, it quickly finds quality models for user-provided data with low computational resources. It supports both classifcal machine learning models and deep neural networks.
+1. For common machine learning or AI tasks like classification, regression, and generation, it quickly finds quality models for user-provided data with low computational resources. It supports both classical machine learning models and deep neural networks, including large language models such as the OpenAI GPT-3 models.
 1. It is easy to customize or extend. Users can find their desired customizability from a smooth range: minimal customization (computational resource budget), medium customization (e.g., scikit-style learner, search space and metric), or full customization (arbitrary training and evaluation code).
 1. It supports fast automatic tuning, capable of handling complex constraints/guidance/early stopping. FLAML is powered by a new, [cost-effective
 hyperparameter optimization](https://microsoft.github.io/FLAML/docs/Use-Cases/Tune-User-Defined-Function/#hyperparameter-optimization-algorithm)
-and learner selection method invented by Microsoft Research.
+and model selection method invented by Microsoft Research, and many followup [research studies](https://microsoft.github.io/FLAML/docs/Research).
 FLAML has a .NET implementation in [ML.NET](http://dot.net/ml), an open-source, cross-platform machine learning framework for .NET. In ML.NET, you can use FLAML via low-code solutions like [Model Builder](https://dotnet.microsoft.com/apps/machinelearning-ai/ml-dotnet/model-builder) Visual Studio extension and the cross-platform [ML.NET CLI](https://docs.microsoft.com/dotnet/machine-learning/automate-training-with-cli). Alternatively, you can use the [ML.NET AutoML API](https://www.nuget.org/packages/Microsoft.ML.AutoML/#versions-body-tab) for a code-first experience.
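Since the README now advertises GPT-3 model support and LLM hyperparameter tuning, a minimal sketch of that entry point may help orient the diffs below. The call mirrors the `oai.Completion.tune` usage whose trial logs appear later on this page; `tune_data` and `eval_func` are user-supplied placeholders, and exact defaults may vary by FLAML version.

```python
# A minimal sketch, assuming flaml v1.1.3+ with the [openai] option installed.
# `tune_data` (list of problem instances) and `eval_func` are placeholders.
from flaml import oai

config, analysis = oai.Completion.tune(
    data=tune_data,             # instances to tune on
    metric="expected_success",  # the metric reported in the trial logs below
    mode="max",
    eval_func=eval_func,        # user-defined scoring of model responses
    inference_budget=0.05,      # average $ allowed per instance at inference
    optimization_budget=3,      # total $ allowed for the tuning run
    num_samples=-1,             # let the budget determine the number of trials
)
```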

View File

@@ -207,11 +207,11 @@ def metric_loss_score(
     except ImportError:
         raise ValueError(
             metric_name
-            + " is not an built-in sklearn metric and nlp is not installed. "
+            + " is not an built-in sklearn metric and [hf] is not installed. "
             "Currently built-in sklearn metrics are: "
             "r2, rmse, mae, mse, accuracy, roc_auc, roc_auc_ovr, roc_auc_ovo,"
             "log_loss, mape, f1, micro_f1, macro_f1, ap. "
-            "If the metric is an nlp metric, please pip install flaml[nlp] ",
+            "If the metric is a huggingface metric, please pip install flaml[hf] ",
             "or pass a customized metric function to AutoML.fit(metric=func)",
         )
     # If the metric is not found from huggingface dataset metric list (i.e., FileNotFoundError)
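The error message above tells users they can pass a customized metric function to `AutoML.fit(metric=func)`. A hedged sketch of such a function follows; the parameter list reflects FLAML's documented custom-metric convention, so treat it as illustrative rather than as part of this commit.

```python
# Illustrative custom metric for AutoML.fit: FLAML minimizes the first return
# value and logs the dict. The signature follows FLAML's custom-metric convention.
import numpy as np
from sklearn.datasets import make_regression
from flaml import AutoML

def custom_metric(X_val, y_val, estimator, labels,
                  X_train, y_train, weight_val=None, weight_train=None,
                  *args, **kwargs):
    val_loss = np.mean((estimator.predict(X_val) - y_val) ** 2)
    return val_loss, {"val_mse": val_loss}

X, y = make_regression(n_samples=500, n_features=10, random_state=0)
automl = AutoML()
automl.fit(X, y, task="regression", metric=custom_metric, time_budget=10)
```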

View File

@@ -179,6 +179,7 @@ class Completion:
         """
         cost = 0
         data = cls.data
+        data_length = len(data)
         target_n_tokens = (
             1000 * cls.inference_budget / cls.price1K[config["model"]]
             if cls.inference_budget and cls.price1K.get(config["model"])
@@ -187,26 +188,33 @@
         prune_hp = cls._prune_hp
         metric = cls._metric
         config_n = config[prune_hp]
-        max_tokens = config["max_tokens"]
+        max_tokens = config.get("max_tokens", 16)  # default value in OpenAI is 16
         region_key = cls._get_region_key(config)
         prompt = cls._prompts[config["prompt"]]
         stop = cls._stops and cls._stops[config["stop"]]
         if prune and target_n_tokens:
             max_valid_n = cls._get_max_valid_n(region_key, max_tokens)
-            min_invalid_n = cls._get_min_invalid_n(region_key, max_tokens)
-            if min_invalid_n is not None and config_n >= min_invalid_n:
-                if config_n > max_valid_n:
+            if cls.avg_input_tokens:
+                # max_tokens bounds the maximum tokens
+                # so using it we can calculate a valid n according to the avg # input tokens
+                max_valid_n = max(
+                    max_valid_n,
+                    int((target_n_tokens - cls.avg_input_tokens) // max_tokens),
+                )
+            else:
+                input_tokens = [None] * data_length
+            if config_n <= max_valid_n:
+                start_n = config_n
+            else:
+                min_invalid_n = cls._get_min_invalid_n(region_key, max_tokens)
+                if min_invalid_n is not None and config_n >= min_invalid_n:
                     # prune this config
                     return {
                         "inference_cost": np.inf,
                         metric: np.inf if cls._mode == "min" else -np.inf,
                         "cost": cost,
                     }
-                # since config_n<=max_valid_n, there is a chance config_n is valid
-                start_n = config_n
-            else:
-                # start from a valid n
-                start_n = min(max_valid_n, config_n)
+                start_n = max_valid_n + 1
         else:
             start_n = config_n
         params = config.copy()
@@ -214,7 +222,6 @@
         temperature_or_top_p = params.pop("temperature_or_top_p", None)
         if temperature_or_top_p:
             params.update(temperature_or_top_p)
-        data_length = len(data)
         num_completions, previous_num_completions = start_n, 0
         n_tokens_list, result, responses_list = [], {}, []
         while True:  # n <= config_n
@@ -242,6 +249,14 @@
                     if previous_num_completions
                     else response["usage"]["total_tokens"]
                 )
+                if (
+                    prune
+                    and target_n_tokens
+                    and not cls.avg_input_tokens
+                    and not input_tokens[i]
+                ):
+                    # store the # input tokens
+                    input_tokens[i] = response["usage"]["prompt_tokens"]
                 # Under Assumption 1, we should count both the input and output tokens in the first query,
                 # and only count ouput tokens afterwards
                 query_cost = (
@@ -335,6 +350,8 @@
                 result["inference_cost"] = (
                     avg_n_tokens * cls.price1K[config["model"]] / 1000
                 )
+                if prune and target_n_tokens and not cls.avg_input_tokens:
+                    cls.avg_input_tokens = np.mean(input_tokens)
                 break
             else:
                 if data_early_stop:
@@ -424,6 +441,7 @@
         cls._total_cost = 0  # total optimization cost
         cls._eval_func = eval_func
         cls.data = data
+        cls.avg_input_tokens = None
         search_alg = BlendSearch(
             cost_attr="cost",
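The arithmetic behind the new `max_valid_n` bound is worth spelling out. With `target_n_tokens = 1000 * inference_budget / price1K[model]` as defined above, once the average number of input tokens is known, each of the `n` completions can add at most `max_tokens` output tokens, so any `n` up to `(target_n_tokens - avg_input_tokens) // max_tokens` stays within the inference budget. A toy calculation with made-up numbers:

```python
# Made-up numbers; variable names mirror the diff above.
inference_budget = 0.02  # $ allowed per data instance
price1K = 0.002          # $ per 1K tokens for the chosen model
target_n_tokens = 1000 * inference_budget / price1K  # 10000 affordable tokens

avg_input_tokens = 400   # averaged prompt_tokens from earlier responses
max_tokens = 347         # per-completion output cap in this config

# Each of the n completions costs at most max_tokens output tokens, so:
max_valid_n = int((target_n_tokens - avg_input_tokens) // max_tokens)
print(max_valid_n)  # 27
```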

File diff suppressed because one or more lines are too long

View File

(The remaining hunks in this notebook file are re-execution churn from rerunning the notebook: every cell's `iopub.execute_input`, `iopub.status.busy`, `iopub.status.idle`, and `shell.execute_reply` timestamps move from the 2023-02-13T23:40-23:53 run to 2023-02-24T23:25-23:29, and the progress-bar widget `model_id` changes from 454146d0f7224f038689031002906e6f to 35cd066a31b242bb87b2c106ee72e5f2. The stderr/stdout logs for tuning trials 1-7 are regenerated with the new timestamps, updated `flaml.tune.tune` source-line references ({806} → {811}, {827} → {834}), and small run-to-run drift in cost and timing, e.g. trial 4's total_cost 2.4239719999999987 → 2.3693479999999987 and time_total_s 34.577 → 36.605. The optimized config printed by cell 12 is unchanged: {'model': 'code-cushman-001', 'max_tokens': 433, 'n': 29, 'prompt': '{prompt}', 'stop': ['\nclass', '\ndef', '\nif', '\nprint'], 'top_p': 0.6125260668293881}, with best tuning-data result expected_success 0.6186627404336135 and success 0.65.)
@@ -921,7 +921,7 @@
    "source": [
     "### Evaluate the success rate on the test data\n",
     "\n",
-    "You can use flaml's `oai.Completion.eval` to evaluate the performance of an entire dataset with the tuned config. To do that you need to set `oai.Completion.data` to the data to evaluate. The following code will take a while to evaluate all the 144 test data instances. Compared to the baseline success rate (0.46) on the [HELM benchmark](https://crfm.stanford.edu/helm/latest/?group=code_humaneval), the tuned config has a success rate of 0.68. It can be further improved if the inference budget and optimization budget are further increased."
+    "You can use flaml's `oai.Completion.eval` to evaluate the performance of an entire dataset with the tuned config. To do that you need to set `oai.Completion.data` to the data to evaluate. The following code will take a while to evaluate all the 144 test data instances. Compared to the baseline success rate (46%) on the [HELM benchmark](https://crfm.stanford.edu/helm/latest/?group=code_humaneval), the tuned config has a success rate of 68%. It can be further improved if the inference budget and optimization budget are further increased."
    ]
   },
   {
(Cell 14's execution timestamps likewise move to the 2023-02-24 run, and its printed evaluation result changes only in total_cost, 12.227739999999997 → 12.210191999999997; expected_success 0.6364503360372493, success 0.6805555555555556, cost 8.181360000000003, and inference_cost 0.056815 are unchanged.)
(Everything below in this file is regenerated ipywidgets state with no substantive change: the widget IDs from the 2023-02-13 run, such as 454146d0f7224f038689031002906e6f and 2d910cfd2d2a4fc49fc30fbbdc5576a7, are replaced by new IDs such as 35cd066a31b242bb87b2c106ee72e5f2 and 24dd93300e0442788ee6cc1310e5bf14; the associated LayoutModel, HTMLModel, HTMLStyleModel, FloatProgressModel, and ProgressStyleModel entries are re-keyed accordingly; and the only textual change is the progress-bar label, " 1/1 [00:00<00:00, 44.69it/s]" → " 1/1 [00:00<00:00, 44.40it/s]".)

View File

@@ -92,7 +92,14 @@ setuptools.setup(
         "vw": [
             "vowpalwabbit>=8.10.0, <9.0.0",
         ],
-        "nlp": [
+        "hf": [
+            "transformers[torch]==4.26",
+            "datasets",
+            "nltk",
+            "rouge_score",
+            "seqeval",
+        ],
+        "nlp": [  # for backward compatibility; hf is the new option name
             "transformers[torch]==4.26",
             "datasets",
             "nltk",

View File

@@ -2,9 +2,9 @@
 ### Requirements
-This example requires GPU. Install the [nlp] option:
+This example requires GPU. Install the [hf] option:
 ```python
-pip install "flaml[nlp]"
+pip install "flaml[hf]"
 ```
 ### A simple sequence classification example
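The commit only renames the install option here, but for context, a hedged sketch of the kind of sequence-classification run the page goes on to show; the toy dataset and fit arguments are illustrative assumptions, not content from this commit.

```python
# Hedged sketch, assuming flaml[hf] is installed and a GPU is available.
import pandas as pd
from flaml import AutoML

train = pd.DataFrame(
    {"sentence": ["a gripping, well-acted film", "tedious and overlong"],
     "label": [1, 0]}
)
automl = AutoML()
automl.fit(
    X_train=train[["sentence"]],
    y_train=train["label"],
    task="seq-classification",  # task enabled by the transformers extras
    time_budget=300,
)
```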

View File

@@ -3,17 +3,17 @@
 <!-- ### Welcome to FLAML, a Fast Library for Automated Machine Learning & Tuning! -->
 FLAML is a lightweight Python library that finds accurate machine
-learning models automatically, efficiently and economically. It frees users from selecting learners and hyperparameters for each learner.
+learning models automatically, efficiently and economically. It frees users from selecting models and hyperparameters for each model.
 ### Main Features
-1. For common machine learning tasks like classification and regression, it quickly finds quality models for user-provided data with low computational resources. It supports both classical machine learning models and deep neural networks.
+1. For common machine learning or AI tasks like classification, regression, and generation, it quickly finds quality models for user-provided data with low computational resources. It supports both classical machine learning models and deep neural networks, including large language models such as the OpenAI GPT-3 models.
 2. It is easy to customize or extend. Users can find their desired customizability from a smooth range: minimal customization (computational resource budget), medium customization (e.g., scikit-style learner, search space and metric), or full customization (arbitrary training and evaluation code). Users can customize only when and what they need to, and leave the rest to the library.
 3. It supports fast and economical automatic tuning, capable of handling large search space with heterogeneous evaluation cost and complex constraints/guidance/early stopping. FLAML is powered by a new, [cost-effective
 hyperparameter optimization](Use-Cases/Tune-User-Defined-Function#hyperparameter-optimization-algorithm)
-and learner selection method invented by Microsoft Research.
+and model selection method invented by Microsoft Research, and many followup [research studies](Research).
 ### Quickstart

View File

@@ -24,8 +24,11 @@ install flaml with the [notebook] option:
 pip install flaml[notebook]
 ```
-#### Extra learners
+#### Extra learners/models
+* openai models
+```bash
+pip install flaml[openai]
+```
 * catboost
 ```bash
 pip install flaml[catboost]
@@ -38,10 +41,9 @@ pip install flaml[vw]
 ```bash
 pip install flaml[forecast]
 ```
-
-* natural language processing: transformers
+* huggingface transformers
 ```bash
-pip install flaml[nlp]
+pip install flaml[hf]
 ```
 #### Distributed tuning