mirror of https://github.com/microsoft/autogen.git
improve max_valid_n and doc (#933)
* improve max_valid_n and doc

* Update README.md

Co-authored-by: Li Jiang <lijiang1@microsoft.com>

* newline at end of file

* doc

---------

Co-authored-by: Li Jiang <lijiang1@microsoft.com>
Co-authored-by: Susan Xueqing Liu <liususan091219@users.noreply.github.com>
Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
parent: 97928609ba
commit: 1ec77b58b4
README.md (10 lines changed)
@@ -14,20 +14,22 @@
 <br>
 </p>
 
-:fire: An [upcoming tutorial on FLAML](https://github.com/microsoft/FLAML/tree/tutorial-aaai23/tutorial) at [AAAI-23](https://aaai.org/Conferences/AAAI-23/aaai23tutorials/) (to be held on Feb 08, 2023)
+:fire: OpenAI GPT-3 models support in v1.1.3. ChatGPT support is coming.
 
+:fire: A [lab forum](https://github.com/microsoft/FLAML/tree/tutorial-aaai23/tutorial) on FLAML at AAAI 2023.
+
 :fire: A [hands-on tutorial](https://github.com/microsoft/FLAML/tree/tutorial/tutorial) on FLAML presented at KDD 2022
 
 ## What is FLAML
 FLAML is a lightweight Python library that finds accurate machine
 learning models automatically, efficiently and economically. It frees users from selecting
-learners and hyperparameters for each learner. It can also be used to tune generic hyperparameters for MLOps workflows, pipelines, mathematical/statistical models, algorithms, computing experiments, software configurations and so on.
+models and hyperparameters for each model. It can also be used to tune generic hyperparameters for large language models (LLM), MLOps/LMOps workflows, pipelines, mathematical/statistical models, algorithms, computing experiments, software configurations and so on.
 
-1. For common machine learning tasks like classification and regression, it quickly finds quality models for user-provided data with low computational resources. It supports both classifcal machine learning models and deep neural networks.
+1. For common machine learning or AI tasks like classification, regression, and generation, it quickly finds quality models for user-provided data with low computational resources. It supports both classical machine learning models and deep neural networks, including large language models such as the OpenAI GPT-3 models.
 1. It is easy to customize or extend. Users can find their desired customizability from a smooth range: minimal customization (computational resource budget), medium customization (e.g., scikit-style learner, search space and metric), or full customization (arbitrary training and evaluation code).
 1. It supports fast automatic tuning, capable of handling complex constraints/guidance/early stopping. FLAML is powered by a new, [cost-effective
 hyperparameter optimization](https://microsoft.github.io/FLAML/docs/Use-Cases/Tune-User-Defined-Function/#hyperparameter-optimization-algorithm)
-and learner selection method invented by Microsoft Research.
+and model selection method invented by Microsoft Research, and many followup [research studies](https://microsoft.github.io/FLAML/docs/Research).
 
 FLAML has a .NET implementation in [ML.NET](http://dot.net/ml), an open-source, cross-platform machine learning framework for .NET. In ML.NET, you can use FLAML via low-code solutions like [Model Builder](https://dotnet.microsoft.com/apps/machinelearning-ai/ml-dotnet/model-builder) Visual Studio extension and the cross-platform [ML.NET CLI](https://docs.microsoft.com/dotnet/machine-learning/automate-training-with-cli). Alternatively, you can use the [ML.NET AutoML API](https://www.nuget.org/packages/Microsoft.ML.AutoML/#versions-body-tab) for a code-first experience.
 
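For reference, the minimal AutoML usage the README alludes to looks roughly like this (an illustration against FLAML's public API, not part of the diff; the iris dataset stands in for user data):

```python
from sklearn.datasets import load_iris
from flaml import AutoML

X, y = load_iris(return_X_y=True)
automl = AutoML()
# FLAML selects both the model and its hyperparameters within the time budget
automl.fit(X, y, task="classification", time_budget=10)
print(automl.best_estimator, automl.best_config)
```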
@@ -207,11 +207,11 @@ def metric_loss_score(
     except ImportError:
         raise ValueError(
             metric_name
-            + " is not an built-in sklearn metric and nlp is not installed. "
+            + " is not an built-in sklearn metric and [hf] is not installed. "
             "Currently built-in sklearn metrics are: "
             "r2, rmse, mae, mse, accuracy, roc_auc, roc_auc_ovr, roc_auc_ovo,"
             "log_loss, mape, f1, micro_f1, macro_f1, ap. "
-            "If the metric is an nlp metric, please pip install flaml[nlp] ",
+            "If the metric is a huggingface metric, please pip install flaml[hf] ",
             "or pass a customized metric function to AutoML.fit(metric=func)",
         )
     # If the metric is not found from huggingface dataset metric list (i.e., FileNotFoundError)
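The error message above offers two ways out. The second one, `AutoML.fit(metric=func)`, expects a callable shaped roughly as below, following FLAML's documented custom-metric interface (a sketch, not code from this commit; the error-rate body is arbitrary):

```python
def custom_metric(
    X_val, y_val, estimator, labels,
    X_train, y_train, weight_val=None, weight_train=None,
    *args,
):
    # first return value: the loss FLAML minimizes;
    # second: a dict of additional metrics to log
    y_pred = estimator.predict(X_val)
    val_error = float((y_pred != y_val).mean())
    return val_error, {"val_error": val_error}

# usage: automl.fit(X, y, task="classification", metric=custom_metric, time_budget=10)
```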
@@ -179,6 +179,7 @@ class Completion:
         """
         cost = 0
         data = cls.data
+        data_length = len(data)
         target_n_tokens = (
             1000 * cls.inference_budget / cls.price1K[config["model"]]
             if cls.inference_budget and cls.price1K.get(config["model"])
@@ -187,26 +188,33 @@ class Completion:
         prune_hp = cls._prune_hp
         metric = cls._metric
         config_n = config[prune_hp]
-        max_tokens = config["max_tokens"]
+        max_tokens = config.get("max_tokens", 16)  # default value in OpenAI is 16
         region_key = cls._get_region_key(config)
         prompt = cls._prompts[config["prompt"]]
         stop = cls._stops and cls._stops[config["stop"]]
         if prune and target_n_tokens:
             max_valid_n = cls._get_max_valid_n(region_key, max_tokens)
-            min_invalid_n = cls._get_min_invalid_n(region_key, max_tokens)
-            if min_invalid_n is not None and config_n >= min_invalid_n:
-                if config_n > max_valid_n:
+            if cls.avg_input_tokens:
+                # max_tokens bounds the maximum tokens
+                # so using it we can calculate a valid n according to the avg # input tokens
+                max_valid_n = max(
+                    max_valid_n,
+                    int((target_n_tokens - cls.avg_input_tokens) // max_tokens),
+                )
+            else:
+                input_tokens = [None] * data_length
+            if config_n <= max_valid_n:
+                start_n = config_n
+            else:
+                min_invalid_n = cls._get_min_invalid_n(region_key, max_tokens)
+                if min_invalid_n is not None and config_n >= min_invalid_n:
                     # prune this config
                     return {
                         "inference_cost": np.inf,
                         metric: np.inf if cls._mode == "min" else -np.inf,
                         "cost": cost,
                     }
-                # since config_n<=max_valid_n, there is a chance config_n is valid
-                start_n = config_n
-            else:
-                # start from a valid n
-                start_n = min(max_valid_n, config_n)
+                start_n = max_valid_n + 1
         else:
             start_n = config_n
         params = config.copy()
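This hunk is the heart of the commit: once an average input length has been measured, the validity of a candidate n can be derived directly from the token budget instead of being learned region by region. With illustrative numbers (mine, not from the diff), an inference budget of $0.02 per instance on a model priced at $0.02 per 1K tokens gives target_n_tokens = 1000; with an average prompt of 200 tokens and max_tokens = 100 per completion, every n satisfying 200 + 100*n <= 1000, i.e. n <= 8, is affordable:

```python
# illustrative numbers, not taken from the repo
inference_budget = 0.02  # dollars per data instance
price1K = 0.02           # dollars per 1K tokens for the chosen model
target_n_tokens = 1000 * inference_budget / price1K  # 1000 tokens per instance

avg_input_tokens = 200   # measured on the first full pass over the data
max_tokens = 100         # per-completion token cap from the config

# n completions cost at most avg_input_tokens + n * max_tokens tokens,
# so the budget admits any n up to:
max_valid_n = int((target_n_tokens - avg_input_tokens) // max_tokens)
print(max_valid_n)  # 8
```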
@@ -214,7 +222,6 @@ class Completion:
         temperature_or_top_p = params.pop("temperature_or_top_p", None)
         if temperature_or_top_p:
             params.update(temperature_or_top_p)
-        data_length = len(data)
         num_completions, previous_num_completions = start_n, 0
         n_tokens_list, result, responses_list = [], {}, []
         while True: # n <= config_n
@@ -242,6 +249,14 @@ class Completion:
                     if previous_num_completions
                     else response["usage"]["total_tokens"]
                 )
+                if (
+                    prune
+                    and target_n_tokens
+                    and not cls.avg_input_tokens
+                    and not input_tokens[i]
+                ):
+                    # store the # input tokens
+                    input_tokens[i] = response["usage"]["prompt_tokens"]
                 # Under Assumption 1, we should count both the input and output tokens in the first query,
                 # and only count ouput tokens afterwards
                 query_cost = (
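The "Under Assumption 1" comment states the cost model this function uses: per data instance, prompt tokens are billed once (with the first query), while every query adds its completion tokens. Numerically, with made-up counts:

```python
price1K = 0.02                  # $ per 1K tokens (illustrative)
prompt_tokens = 200             # response["usage"]["prompt_tokens"]
first_completion_tokens = 300   # output of the first query on this instance
extra_completion_tokens = 450   # output of a follow-up query with larger n

# first query: input and output both count
first_query_cost = (prompt_tokens + first_completion_tokens) * price1K / 1000
# later queries: only the new output tokens count
followup_cost = extra_completion_tokens * price1K / 1000
print(first_query_cost, followup_cost)  # 0.01 0.009
```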
@@ -335,6 +350,8 @@ class Completion:
                 result["inference_cost"] = (
                     avg_n_tokens * cls.price1K[config["model"]] / 1000
                 )
+                if prune and target_n_tokens and not cls.avg_input_tokens:
+                    cls.avg_input_tokens = np.mean(input_tokens)
                 break
             else:
                 if data_early_stop:
@@ -424,6 +441,7 @@ class Completion:
         cls._total_cost = 0  # total optimization cost
         cls._eval_func = eval_func
         cls.data = data
+        cls.avg_input_tokens = None
 
         search_alg = BlendSearch(
             cost_attr="cost",
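These attributes are initialized near the top of Completion.tune. For orientation, a typical invocation (patterned on the notebook re-run later in this commit; tune_data and eval_func must be supplied by the caller, and the exact keyword set is a sketch, not copied from the repo):

```python
from flaml import oai

config, analysis = oai.Completion.tune(
    data=tune_data,          # list of dicts, one per problem instance
    metric="success",
    mode="max",
    eval_func=eval_func,     # scores the model's responses for one instance
    inference_budget=0.02,   # dollars per instance at inference time
    optimization_budget=2,   # dollars for the whole tuning run
    num_samples=-1,          # let the optimization budget decide
)
```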
File diff suppressed because one or more lines are too long
@@ -30,10 +30,10 @@
 "execution_count": 1,
 "metadata": {
 "execution": {
-"iopub.execute_input": "2023-02-13T23:40:52.317406Z",
+"iopub.execute_input": "2023-02-24T23:25:36.910966Z",
-"iopub.status.busy": "2023-02-13T23:40:52.316561Z",
+"iopub.status.busy": "2023-02-24T23:25:36.910473Z",
-"iopub.status.idle": "2023-02-13T23:40:52.321193Z",
+"iopub.status.idle": "2023-02-24T23:25:36.914554Z",
-"shell.execute_reply": "2023-02-13T23:40:52.320628Z"
+"shell.execute_reply": "2023-02-24T23:25:36.914030Z"
 }
 },
 "outputs": [],
@@ -54,10 +54,10 @@
 "execution_count": 2,
 "metadata": {
 "execution": {
-"iopub.execute_input": "2023-02-13T23:40:52.324240Z",
+"iopub.execute_input": "2023-02-24T23:25:36.917301Z",
-"iopub.status.busy": "2023-02-13T23:40:52.323783Z",
+"iopub.status.busy": "2023-02-24T23:25:36.917011Z",
-"iopub.status.idle": "2023-02-13T23:40:52.330570Z",
+"iopub.status.idle": "2023-02-24T23:25:36.923156Z",
-"shell.execute_reply": "2023-02-13T23:40:52.329750Z"
+"shell.execute_reply": "2023-02-24T23:25:36.922619Z"
 }
 },
 "outputs": [],
@@ -81,10 +81,10 @@
 "execution_count": 3,
 "metadata": {
 "execution": {
-"iopub.execute_input": "2023-02-13T23:40:52.333547Z",
+"iopub.execute_input": "2023-02-24T23:25:36.925804Z",
-"iopub.status.busy": "2023-02-13T23:40:52.333249Z",
+"iopub.status.busy": "2023-02-24T23:25:36.925423Z",
-"iopub.status.idle": "2023-02-13T23:40:52.336508Z",
+"iopub.status.idle": "2023-02-24T23:25:36.928191Z",
-"shell.execute_reply": "2023-02-13T23:40:52.335858Z"
+"shell.execute_reply": "2023-02-24T23:25:36.927673Z"
 }
 },
 "outputs": [],
@@ -109,10 +109,10 @@
 "execution_count": 4,
 "metadata": {
 "execution": {
-"iopub.execute_input": "2023-02-13T23:40:52.339977Z",
+"iopub.execute_input": "2023-02-24T23:25:36.931255Z",
-"iopub.status.busy": "2023-02-13T23:40:52.339556Z",
+"iopub.status.busy": "2023-02-24T23:25:36.930838Z",
-"iopub.status.idle": "2023-02-13T23:40:54.603349Z",
+"iopub.status.idle": "2023-02-24T23:25:39.148799Z",
-"shell.execute_reply": "2023-02-13T23:40:54.602630Z"
+"shell.execute_reply": "2023-02-24T23:25:39.148113Z"
 }
 },
 "outputs": [
@@ -126,7 +126,7 @@
 {
 "data": {
 "application/vnd.jupyter.widget-view+json": {
-"model_id": "454146d0f7224f038689031002906e6f",
+"model_id": "35cd066a31b242bb87b2c106ee72e5f2",
 "version_major": 2,
 "version_minor": 0
 },
@@ -186,10 +186,10 @@
 "execution_count": 5,
 "metadata": {
 "execution": {
-"iopub.execute_input": "2023-02-13T23:40:54.607152Z",
+"iopub.execute_input": "2023-02-24T23:25:39.152156Z",
-"iopub.status.busy": "2023-02-13T23:40:54.606441Z",
+"iopub.status.busy": "2023-02-24T23:25:39.151531Z",
-"iopub.status.idle": "2023-02-13T23:40:54.610504Z",
+"iopub.status.idle": "2023-02-24T23:25:39.155313Z",
-"shell.execute_reply": "2023-02-13T23:40:54.609759Z"
+"shell.execute_reply": "2023-02-24T23:25:39.154731Z"
 },
 "slideshow": {
 "slide_type": "subslide"
@@ -238,10 +238,10 @@
 "execution_count": 6,
 "metadata": {
 "execution": {
-"iopub.execute_input": "2023-02-13T23:40:54.613590Z",
+"iopub.execute_input": "2023-02-24T23:25:39.158398Z",
-"iopub.status.busy": "2023-02-13T23:40:54.613168Z",
+"iopub.status.busy": "2023-02-24T23:25:39.157766Z",
-"iopub.status.idle": "2023-02-13T23:40:54.616873Z",
+"iopub.status.idle": "2023-02-24T23:25:39.161396Z",
-"shell.execute_reply": "2023-02-13T23:40:54.616193Z"
+"shell.execute_reply": "2023-02-24T23:25:39.160797Z"
 }
 },
 "outputs": [
@@ -287,10 +287,10 @@
 "execution_count": 7,
 "metadata": {
 "execution": {
-"iopub.execute_input": "2023-02-13T23:40:54.619618Z",
+"iopub.execute_input": "2023-02-24T23:25:39.164187Z",
-"iopub.status.busy": "2023-02-13T23:40:54.619218Z",
+"iopub.status.busy": "2023-02-24T23:25:39.163867Z",
-"iopub.status.idle": "2023-02-13T23:40:54.624272Z",
+"iopub.status.idle": "2023-02-24T23:25:39.169009Z",
-"shell.execute_reply": "2023-02-13T23:40:54.623664Z"
+"shell.execute_reply": "2023-02-24T23:25:39.168427Z"
 }
 },
 "outputs": [],
@@ -337,10 +337,10 @@
 "execution_count": 8,
 "metadata": {
 "execution": {
-"iopub.execute_input": "2023-02-13T23:40:54.626998Z",
+"iopub.execute_input": "2023-02-24T23:25:39.171752Z",
-"iopub.status.busy": "2023-02-13T23:40:54.626593Z",
+"iopub.status.busy": "2023-02-24T23:25:39.171347Z",
-"iopub.status.idle": "2023-02-13T23:40:54.631383Z",
+"iopub.status.idle": "2023-02-24T23:25:39.176343Z",
-"shell.execute_reply": "2023-02-13T23:40:54.630770Z"
+"shell.execute_reply": "2023-02-24T23:25:39.175510Z"
 }
 },
 "outputs": [],
@@ -391,10 +391,10 @@
 "execution_count": 9,
 "metadata": {
 "execution": {
-"iopub.execute_input": "2023-02-13T23:40:54.634335Z",
+"iopub.execute_input": "2023-02-24T23:25:39.179030Z",
-"iopub.status.busy": "2023-02-13T23:40:54.633929Z",
+"iopub.status.busy": "2023-02-24T23:25:39.178624Z",
-"iopub.status.idle": "2023-02-13T23:40:56.105700Z",
+"iopub.status.idle": "2023-02-24T23:25:40.584410Z",
-"shell.execute_reply": "2023-02-13T23:40:56.105085Z"
+"shell.execute_reply": "2023-02-24T23:25:40.583802Z"
 },
 "slideshow": {
 "slide_type": "slide"
@@ -418,10 +418,10 @@
 "execution_count": 10,
 "metadata": {
 "execution": {
-"iopub.execute_input": "2023-02-13T23:40:56.109177Z",
+"iopub.execute_input": "2023-02-24T23:25:40.587815Z",
-"iopub.status.busy": "2023-02-13T23:40:56.108624Z",
+"iopub.status.busy": "2023-02-24T23:25:40.587283Z",
-"iopub.status.idle": "2023-02-13T23:40:56.112651Z",
+"iopub.status.idle": "2023-02-24T23:25:40.590826Z",
-"shell.execute_reply": "2023-02-13T23:40:56.112076Z"
+"shell.execute_reply": "2023-02-24T23:25:40.590158Z"
 },
 "slideshow": {
 "slide_type": "slide"
@@ -483,10 +483,10 @@
 "execution_count": 11,
 "metadata": {
 "execution": {
-"iopub.execute_input": "2023-02-13T23:40:56.115383Z",
+"iopub.execute_input": "2023-02-24T23:25:40.593603Z",
-"iopub.status.busy": "2023-02-13T23:40:56.114975Z",
+"iopub.status.busy": "2023-02-24T23:25:40.593269Z",
-"iopub.status.idle": "2023-02-13T23:41:55.045654Z",
+"iopub.status.idle": "2023-02-24T23:26:38.349191Z",
-"shell.execute_reply": "2023-02-13T23:41:55.044973Z"
+"shell.execute_reply": "2023-02-24T23:26:38.348392Z"
 }
 },
 "outputs": [
@@ -494,119 +494,119 @@
 "name": "stderr",
 "output_type": "stream",
 "text": [
-"\u001b[32m[I 2023-02-13 23:40:56,159]\u001b[0m A new study created in memory with name: optuna\u001b[0m\n"
+"\u001b[32m[I 2023-02-24 23:25:40,643]\u001b[0m A new study created in memory with name: optuna\u001b[0m\n"
 ]
 },
 {
 "name": "stderr",
 "output_type": "stream",
 "text": [
-"\u001b[32m[I 2023-02-13 23:40:56,161]\u001b[0m A new study created in memory with name: optuna\u001b[0m\n"
+"\u001b[32m[I 2023-02-24 23:25:40,646]\u001b[0m A new study created in memory with name: optuna\u001b[0m\n"
 ]
 },
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"[flaml.tune.tune: 02-13 23:40:56] {806} INFO - trial 1 config: {'model': 'code-davinci-002', 'temperature_or_top_p': {'temperature': 0.36865945026811975}, 'max_tokens': 347, 'n': 1, 'prompt': 1, 'stop': 0}\n"
+"[flaml.tune.tune: 02-24 23:25:40] {811} INFO - trial 1 config: {'model': 'code-davinci-002', 'temperature_or_top_p': {'temperature': 0.36865945026811975}, 'max_tokens': 347, 'n': 1, 'prompt': 1, 'stop': 0}\n"
 ]
 },
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"[flaml.tune.tune: 02-13 23:40:59] {215} INFO - result: {'expected_success': 0.6, 'success': 0.6, 'total_cost': 0.4624999999999999, 'cost': 0.4624999999999999, 'inference_cost': 0.023125, 'training_iteration': 0, 'config': {'model': 'code-davinci-002', 'temperature_or_top_p': {'temperature': 0.36865945026811975}, 'max_tokens': 347, 'n': 1, 'prompt': 1, 'stop': 0}, 'config/model': 'code-davinci-002', 'config/temperature_or_top_p': {'temperature': 0.36865945026811975}, 'config/max_tokens': 347, 'config/n': 1, 'config/prompt': 1, 'config/stop': 0, 'experiment_tag': 'exp', 'time_total_s': 3.7016141414642334}\n"
+"[flaml.tune.tune: 02-24 23:25:44] {215} INFO - result: {'expected_success': 0.6, 'success': 0.6, 'total_cost': 0.4624999999999999, 'cost': 0.4624999999999999, 'inference_cost': 0.023125, 'training_iteration': 0, 'config': {'model': 'code-davinci-002', 'temperature_or_top_p': {'temperature': 0.36865945026811975}, 'max_tokens': 347, 'n': 1, 'prompt': 1, 'stop': 0}, 'config/model': 'code-davinci-002', 'config/temperature_or_top_p': {'temperature': 0.36865945026811975}, 'config/max_tokens': 347, 'config/n': 1, 'config/prompt': 1, 'config/stop': 0, 'experiment_tag': 'exp', 'time_total_s': 3.687161445617676}\n"
 ]
 },
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"[flaml.tune.tune: 02-13 23:40:59] {806} INFO - trial 2 config: {'model': 'code-cushman-001', 'temperature_or_top_p': {'temperature': 0.36865945026811975}, 'max_tokens': 347, 'n': 1, 'prompt': 1, 'stop': 0}\n"
+"[flaml.tune.tune: 02-24 23:25:44] {811} INFO - trial 2 config: {'model': 'code-cushman-001', 'temperature_or_top_p': {'temperature': 0.36865945026811975}, 'max_tokens': 347, 'n': 1, 'prompt': 1, 'stop': 0}\n"
 ]
 },
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"[flaml.tune.tune: 02-13 23:41:00] {215} INFO - result: {'expected_success': 0.35, 'success': 0.35, 'total_cost': 0.5671159999999997, 'cost': 0.104616, 'inference_cost': 0.0052308, 'training_iteration': 0, 'config': {'model': 'code-cushman-001', 'temperature_or_top_p': {'temperature': 0.36865945026811975}, 'max_tokens': 347, 'n': 1, 'prompt': 1, 'stop': 0}, 'config/model': 'code-cushman-001', 'config/temperature_or_top_p': {'temperature': 0.36865945026811975}, 'config/max_tokens': 347, 'config/n': 1, 'config/prompt': 1, 'config/stop': 0, 'experiment_tag': 'exp', 'time_total_s': 0.673302412033081}\n"
+"[flaml.tune.tune: 02-24 23:25:45] {215} INFO - result: {'expected_success': 0.35, 'success': 0.35, 'total_cost': 0.5671159999999997, 'cost': 0.104616, 'inference_cost': 0.0052308, 'training_iteration': 0, 'config': {'model': 'code-cushman-001', 'temperature_or_top_p': {'temperature': 0.36865945026811975}, 'max_tokens': 347, 'n': 1, 'prompt': 1, 'stop': 0}, 'config/model': 'code-cushman-001', 'config/temperature_or_top_p': {'temperature': 0.36865945026811975}, 'config/max_tokens': 347, 'config/n': 1, 'config/prompt': 1, 'config/stop': 0, 'experiment_tag': 'exp', 'time_total_s': 0.6666913032531738}\n"
 ]
 },
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"[flaml.tune.tune: 02-13 23:41:00] {806} INFO - trial 3 config: {'model': 'code-cushman-001', 'temperature_or_top_p': {'top_p': 0.4985070123025904}, 'max_tokens': 97, 'n': 20, 'prompt': 0, 'stop': 0}\n"
+"[flaml.tune.tune: 02-24 23:25:45] {811} INFO - trial 3 config: {'model': 'code-cushman-001', 'temperature_or_top_p': {'top_p': 0.4985070123025904}, 'max_tokens': 97, 'n': 20, 'prompt': 0, 'stop': 0}\n"
 ]
 },
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"[flaml.tune.tune: 02-13 23:41:17] {215} INFO - result: {'expected_success': 0.5080706992649381, 'success': 0.55, 'total_cost': 1.1848999999999996, 'cost': 0.617784, 'inference_cost': 0.0287676, 'training_iteration': 0, 'config': {'model': 'code-cushman-001', 'temperature_or_top_p': {'top_p': 0.4985070123025904}, 'max_tokens': 97, 'n': 20, 'prompt': 0, 'stop': 0}, 'config/model': 'code-cushman-001', 'config/temperature_or_top_p': {'top_p': 0.4985070123025904}, 'config/max_tokens': 97, 'config/n': 20, 'config/prompt': 0, 'config/stop': 0, 'experiment_tag': 'exp', 'time_total_s': 16.56331181526184}\n"
+"[flaml.tune.tune: 02-24 23:26:01] {215} INFO - result: {'expected_success': 0.5080706992649381, 'success': 0.55, 'total_cost': 1.1424679999999998, 'cost': 0.575352, 'inference_cost': 0.0287676, 'training_iteration': 0, 'config': {'model': 'code-cushman-001', 'temperature_or_top_p': {'top_p': 0.4985070123025904}, 'max_tokens': 97, 'n': 20, 'prompt': 0, 'stop': 0}, 'config/model': 'code-cushman-001', 'config/temperature_or_top_p': {'top_p': 0.4985070123025904}, 'config/max_tokens': 97, 'config/n': 20, 'config/prompt': 0, 'config/stop': 0, 'experiment_tag': 'exp', 'time_total_s': 16.66586470603943}\n"
 ]
 },
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"[flaml.tune.tune: 02-13 23:41:17] {806} INFO - trial 4 config: {'model': 'code-cushman-001', 'temperature_or_top_p': {'top_p': 0.6125260668293881}, 'max_tokens': 433, 'n': 29, 'prompt': 0, 'stop': 0}\n"
+"[flaml.tune.tune: 02-24 23:26:01] {811} INFO - trial 4 config: {'model': 'code-cushman-001', 'temperature_or_top_p': {'top_p': 0.6125260668293881}, 'max_tokens': 433, 'n': 29, 'prompt': 0, 'stop': 0}\n"
 ]
 },
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"[flaml.tune.tune: 02-13 23:41:51] {215} INFO - result: {'expected_success': 0.6186627404336135, 'success': 0.65, 'total_cost': 2.4239719999999987, 'cost': 1.2390720000000002, 'inference_cost': 0.059620799999999995, 'training_iteration': 0, 'config': {'model': 'code-cushman-001', 'temperature_or_top_p': {'top_p': 0.6125260668293881}, 'max_tokens': 433, 'n': 29, 'prompt': 0, 'stop': 0}, 'config/model': 'code-cushman-001', 'config/temperature_or_top_p': {'top_p': 0.6125260668293881}, 'config/max_tokens': 433, 'config/n': 29, 'config/prompt': 0, 'config/stop': 0, 'experiment_tag': 'exp', 'time_total_s': 34.57707595825195}\n"
+"[flaml.tune.tune: 02-24 23:26:38] {215} INFO - result: {'expected_success': 0.6186627404336135, 'success': 0.65, 'total_cost': 2.3693479999999987, 'cost': 1.2268800000000002, 'inference_cost': 0.059620799999999995, 'training_iteration': 0, 'config': {'model': 'code-cushman-001', 'temperature_or_top_p': {'top_p': 0.6125260668293881}, 'max_tokens': 433, 'n': 29, 'prompt': 0, 'stop': 0}, 'config/model': 'code-cushman-001', 'config/temperature_or_top_p': {'top_p': 0.6125260668293881}, 'config/max_tokens': 433, 'config/n': 29, 'config/prompt': 0, 'config/stop': 0, 'experiment_tag': 'exp', 'time_total_s': 36.605130434036255}\n"
 ]
 },
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"[flaml.tune.tune: 02-13 23:41:51] {806} INFO - trial 5 config: {'model': 'code-davinci-002', 'temperature_or_top_p': {'temperature': 0.6177669784693172}, 'max_tokens': 231, 'n': 65, 'prompt': 3, 'stop': 0}\n"
+"[flaml.tune.tune: 02-24 23:26:38] {811} INFO - trial 5 config: {'model': 'code-davinci-002', 'temperature_or_top_p': {'temperature': 0.6177669784693172}, 'max_tokens': 231, 'n': 65, 'prompt': 3, 'stop': 0}\n"
 ]
 },
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"[flaml.tune.tune: 02-13 23:41:51] {215} INFO - result: {'expected_success': 0, 'total_cost': 2.6356719999999987, 'cost': 0.2117, 'training_iteration': 0, 'config': {'model': 'code-davinci-002', 'temperature_or_top_p': {'temperature': 0.6177669784693172}, 'max_tokens': 231, 'n': 65, 'prompt': 3, 'stop': 0}, 'config/model': 'code-davinci-002', 'config/temperature_or_top_p': {'temperature': 0.6177669784693172}, 'config/max_tokens': 231, 'config/n': 65, 'config/prompt': 3, 'config/stop': 0, 'experiment_tag': 'exp', 'time_total_s': 0.0022132396697998047}\n"
+"[flaml.tune.tune: 02-24 23:26:38] {215} INFO - result: {'expected_success': 0, 'total_cost': 2.5295479999999984, 'cost': 0.1602, 'training_iteration': 0, 'config': {'model': 'code-davinci-002', 'temperature_or_top_p': {'temperature': 0.6177669784693172}, 'max_tokens': 231, 'n': 65, 'prompt': 3, 'stop': 0}, 'config/model': 'code-davinci-002', 'config/temperature_or_top_p': {'temperature': 0.6177669784693172}, 'config/max_tokens': 231, 'config/n': 65, 'config/prompt': 3, 'config/stop': 0, 'experiment_tag': 'exp', 'time_total_s': 0.0020499229431152344}\n"
 ]
 },
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"[flaml.tune.tune: 02-13 23:41:51] {806} INFO - trial 6 config: {'model': 'code-davinci-002', 'max_tokens': 263, 'n': 41, 'prompt': 0, 'stop': 0, 'temperature_or_top_p': {'top_p': 0.49834557213253655}}\n"
+"[flaml.tune.tune: 02-24 23:26:38] {811} INFO - trial 6 config: {'model': 'code-davinci-002', 'max_tokens': 263, 'n': 41, 'prompt': 0, 'stop': 0, 'temperature_or_top_p': {'top_p': 0.49834557213253655}}\n"
 ]
 },
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"[flaml.tune.tune: 02-13 23:41:54] {215} INFO - result: {'expected_success': 0, 'total_cost': 3.003171999999999, 'cost': 0.3675, 'training_iteration': 0, 'config': {'model': 'code-davinci-002', 'max_tokens': 263, 'n': 41, 'prompt': 0, 'stop': 0, 'temperature_or_top_p': {'top_p': 0.49834557213253655}}, 'config/model': 'code-davinci-002', 'config/max_tokens': 263, 'config/n': 41, 'config/prompt': 0, 'config/stop': 0, 'config/temperature_or_top_p': {'top_p': 0.49834557213253655}, 'experiment_tag': 'exp', 'time_total_s': 3.3002660274505615}\n"
+"[flaml.tune.tune: 02-24 23:26:38] {215} INFO - result: {'expected_success': 0, 'total_cost': 2.8578479999999984, 'cost': 0.32830000000000004, 'training_iteration': 0, 'config': {'model': 'code-davinci-002', 'max_tokens': 263, 'n': 41, 'prompt': 0, 'stop': 0, 'temperature_or_top_p': {'top_p': 0.49834557213253655}}, 'config/model': 'code-davinci-002', 'config/max_tokens': 263, 'config/n': 41, 'config/prompt': 0, 'config/stop': 0, 'config/temperature_or_top_p': {'top_p': 0.49834557213253655}, 'experiment_tag': 'exp', 'time_total_s': 0.002808809280395508}\n"
 ]
 },
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"[flaml.tune.tune: 02-13 23:41:55] {806} INFO - trial 7 config: {'model': 'code-cushman-001', 'temperature_or_top_p': {'temperature': 0.8286813263076767}, 'max_tokens': 57, 'n': 63, 'prompt': 3, 'stop': 0}\n"
+"[flaml.tune.tune: 02-24 23:26:38] {811} INFO - trial 7 config: {'model': 'code-cushman-001', 'temperature_or_top_p': {'temperature': 0.8286813263076767}, 'max_tokens': 57, 'n': 63, 'prompt': 3, 'stop': 0}\n"
 ]
 },
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"[flaml.tune.tune: 02-13 23:41:55] {215} INFO - result: {'expected_success': 0, 'total_cost': 4.046379999999999, 'cost': 1.043208, 'training_iteration': 0, 'config': {'model': 'code-cushman-001', 'temperature_or_top_p': {'temperature': 0.8286813263076767}, 'max_tokens': 57, 'n': 63, 'prompt': 3, 'stop': 0}, 'config/model': 'code-cushman-001', 'config/temperature_or_top_p': {'temperature': 0.8286813263076767}, 'config/max_tokens': 57, 'config/n': 63, 'config/prompt': 3, 'config/stop': 0, 'experiment_tag': 'exp', 'time_total_s': 0.007852792739868164}\n"
+"[flaml.tune.tune: 02-24 23:26:38] {215} INFO - result: {'expected_success': 0, 'total_cost': 4.028831999999999, 'cost': 1.170984, 'training_iteration': 0, 'config': {'model': 'code-cushman-001', 'temperature_or_top_p': {'temperature': 0.8286813263076767}, 'max_tokens': 57, 'n': 63, 'prompt': 3, 'stop': 0}, 'config/model': 'code-cushman-001', 'config/temperature_or_top_p': {'temperature': 0.8286813263076767}, 'config/max_tokens': 57, 'config/n': 63, 'config/prompt': 3, 'config/stop': 0, 'experiment_tag': 'exp', 'time_total_s': 0.015198230743408203}\n"
 ]
 },
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"[flaml.tune.tune: 02-13 23:41:55] {827} WARNING - fail to sample a trial for 100 times in a row, stopping.\n"
+"[flaml.tune.tune: 02-24 23:26:38] {834} WARNING - fail to sample a trial for 100 times in a row, stopping.\n"
 ]
 }
 ],
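Two sanity checks on the logged numbers, using arithmetic only. Dividing each trial's cost by its inference_cost gives 20, so the run tunes on 20 instances and inference_cost is the per-instance average. And the re-run's lower total_cost on most trials (3 through 6) is consistent with the improved max_valid_n bound wasting less money on configurations that end up pruned:

```python
# trial 1: per-trial cost over 20 tuning instances equals the logged inference_cost
assert round(0.4624999999999999 / 20, 6) == 0.023125
# trial 2 likewise
assert round(0.104616 / 20, 7) == 0.0052308
```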
@@ -656,10 +656,10 @@
 "execution_count": 12,
 "metadata": {
 "execution": {
-"iopub.execute_input": "2023-02-13T23:41:55.049204Z",
+"iopub.execute_input": "2023-02-24T23:26:38.352710Z",
-"iopub.status.busy": "2023-02-13T23:41:55.048871Z",
+"iopub.status.busy": "2023-02-24T23:26:38.352378Z",
-"iopub.status.idle": "2023-02-13T23:41:55.053284Z",
+"iopub.status.idle": "2023-02-24T23:26:38.356939Z",
-"shell.execute_reply": "2023-02-13T23:41:55.052574Z"
+"shell.execute_reply": "2023-02-24T23:26:38.356217Z"
 }
 },
 "outputs": [
@@ -668,7 +668,7 @@
 "output_type": "stream",
 "text": [
 "optimized config {'model': 'code-cushman-001', 'max_tokens': 433, 'n': 29, 'prompt': '{prompt}', 'stop': ['\\nclass', '\\ndef', '\\nif', '\\nprint'], 'top_p': 0.6125260668293881}\n",
-"best result on tuning data {'expected_success': 0.6186627404336135, 'success': 0.65, 'total_cost': 2.4239719999999987, 'cost': 1.2390720000000002, 'inference_cost': 0.059620799999999995, 'training_iteration': 0, 'config': {'model': 'code-cushman-001', 'temperature_or_top_p': {'top_p': 0.6125260668293881}, 'max_tokens': 433, 'n': 29, 'prompt': 0, 'stop': 0}, 'config/model': 'code-cushman-001', 'config/temperature_or_top_p': {'top_p': 0.6125260668293881}, 'config/max_tokens': 433, 'config/n': 29, 'config/prompt': 0, 'config/stop': 0, 'experiment_tag': 'exp', 'time_total_s': 34.57707595825195}\n"
+"best result on tuning data {'expected_success': 0.6186627404336135, 'success': 0.65, 'total_cost': 2.3693479999999987, 'cost': 1.2268800000000002, 'inference_cost': 0.059620799999999995, 'training_iteration': 0, 'config': {'model': 'code-cushman-001', 'temperature_or_top_p': {'top_p': 0.6125260668293881}, 'max_tokens': 433, 'n': 29, 'prompt': 0, 'stop': 0}, 'config/model': 'code-cushman-001', 'config/temperature_or_top_p': {'top_p': 0.6125260668293881}, 'config/max_tokens': 433, 'config/n': 29, 'config/prompt': 0, 'config/stop': 0, 'experiment_tag': 'exp', 'time_total_s': 36.605130434036255}\n"
 ]
 }
 ],
@@ -696,10 +696,10 @@
 "execution_count": 13,
 "metadata": {
 "execution": {
-"iopub.execute_input": "2023-02-13T23:41:55.056205Z",
+"iopub.execute_input": "2023-02-24T23:26:38.359902Z",
-"iopub.status.busy": "2023-02-13T23:41:55.055631Z",
+"iopub.status.busy": "2023-02-24T23:26:38.359506Z",
-"iopub.status.idle": "2023-02-13T23:41:56.039259Z",
+"iopub.status.idle": "2023-02-24T23:26:39.343921Z",
-"shell.execute_reply": "2023-02-13T23:41:56.038427Z"
+"shell.execute_reply": "2023-02-24T23:26:39.343051Z"
 },
 "slideshow": {
 "slide_type": "subslide"
@@ -921,7 +921,7 @@
 "source": [
 "### Evaluate the success rate on the test data\n",
 "\n",
-"You can use flaml's `oai.Completion.eval` to evaluate the performance of an entire dataset with the tuned config. To do that you need to set `oai.Completion.data` to the data to evaluate. The following code will take a while to evaluate all the 144 test data instances. Compared to the baseline success rate (0.46) on the [HELM benchmark](https://crfm.stanford.edu/helm/latest/?group=code_humaneval), the tuned config has a success rate of 0.68. It can be further improved if the inference budget and optimization budget are further increased."
+"You can use flaml's `oai.Completion.eval` to evaluate the performance of an entire dataset with the tuned config. To do that you need to set `oai.Completion.data` to the data to evaluate. The following code will take a while to evaluate all the 144 test data instances. Compared to the baseline success rate (46%) on the [HELM benchmark](https://crfm.stanford.edu/helm/latest/?group=code_humaneval), the tuned config has a success rate of 68%. It can be further improved if the inference budget and optimization budget are further increased."
 ]
 },
 {
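A sketch of the evaluation step this markdown cell describes. The markdown only names oai.Completion.eval and oai.Completion.data; the keyword arguments below are my assumption about the hidden code cell, not copied from it:

```python
from flaml import oai

oai.Completion.data = test_data  # the 144 HumanEval test instances
# assumed call signature; the markdown does not show the actual arguments
result = oai.Completion.eval(config, prune=False, eval_only=True)
print(result)  # {'expected_success': ..., 'success': ..., ...}
```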
@@ -929,10 +929,10 @@
 "execution_count": 14,
 "metadata": {
 "execution": {
-"iopub.execute_input": "2023-02-13T23:41:56.042764Z",
+"iopub.execute_input": "2023-02-24T23:26:39.347295Z",
-"iopub.status.busy": "2023-02-13T23:41:56.042086Z",
+"iopub.status.busy": "2023-02-24T23:26:39.346994Z",
-"iopub.status.idle": "2023-02-13T23:53:05.597643Z",
+"iopub.status.idle": "2023-02-24T23:29:27.160335Z",
-"shell.execute_reply": "2023-02-13T23:53:05.596603Z"
+"shell.execute_reply": "2023-02-24T23:29:27.159519Z"
 }
 },
 "outputs": [
@@ -940,7 +940,7 @@
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"{'expected_success': 0.6364503360372493, 'success': 0.6805555555555556, 'total_cost': 12.227739999999997, 'cost': 8.181360000000003, 'inference_cost': 0.056815}\n"
+"{'expected_success': 0.6364503360372493, 'success': 0.6805555555555556, 'total_cost': 12.210191999999997, 'cost': 8.181360000000003, 'inference_cost': 0.056815}\n"
 ]
 }
 ],
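The changed total is pure bookkeeping and can be re-derived from the logs above: the final cumulative cost equals the tuning run's last total_cost plus the test evaluation's cost, and dividing the evaluation cost by the 144 test instances reproduces the reported inference_cost:

```python
tuning_total = 4.028831999999999      # trial 7's total_cost in the log above
evaluation_cost = 8.181360000000003   # 'cost' in this output
assert round(tuning_total + evaluation_cost, 6) == round(12.210191999999997, 6)
assert round(evaluation_cost / 144, 6) == 0.056815
```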
@@ -977,60 +977,25 @@ … @@ -1045,95 +1010,34 @@ … @@ -1149,66 +1053,33 @@ … @@ -1261,7 +1132,60 @@ … @@ -1276,15 +1200,91 @@
[The last five hunks of the notebook diff cover its "widgets" metadata; the side-by-side extraction here is too scrambled to rebuild line by line, so only the recoverable substance is kept. Re-running the notebook regenerated the Jupyter widget state for its single tqdm progress bar: the old widget IDs (2d910cfd…, 454146d0…, 577e1e3c…, 6086462a…, 74a6ba0c…, 7d3f3d9e…, b40bdfb1…, ca245376…, dc83c7bf…, e4ae2b6f…, f1355871…) are replaced by freshly generated ones (24dd9330…, 35cd066a…, 3d5d106a…, 3e1ebb31…, 421e02a1…, 47d30462…, 754800f7…, 77db9797…, 7b6c4e1c…, 8e7ee768…, e6398d40…). The widget graph is unchanged: an HBoxModel whose children are an HTMLModel (value "100%"), a FloatProgressModel (max 1, value 1, bar_style "success"), and a trailing HTMLModel whose value changes from " 1/1 [00:00<00:00, 44.69it/s]" to " 1/1 [00:00<00:00, 44.40it/s]"; the accompanying LayoutModel, HTMLStyleModel, and ProgressStyleModel entries hold only default/null fields.]
setup.py (9 lines changed)
@@ -92,7 +92,14 @@ setuptools.setup(
         "vw": [
             "vowpalwabbit>=8.10.0, <9.0.0",
         ],
-        "nlp": [
+        "hf": [
+            "transformers[torch]==4.26",
+            "datasets",
+            "nltk",
+            "rouge_score",
+            "seqeval",
+        ],
+        "nlp": [  # for backward compatibility; hf is the new option name
             "transformers[torch]==4.26",
             "datasets",
             "nltk",
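Because the new hf extra pins the same dependencies as the legacy nlp extra, either `pip install "flaml[hf]"` or `pip install "flaml[nlp]"` produces the same stack. A quick way to verify what landed in the environment (a generic sketch, not repo code; distribution names are as pinned in setup.py and may need their PyPI spellings on older Pythons):

```python
import importlib.metadata as md

for pkg in ("transformers", "datasets", "nltk", "rouge_score", "seqeval"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "missing: install flaml[hf]")
```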
@@ -2,9 +2,9 @@
 
 ### Requirements
 
-This example requires GPU. Install the [nlp] option:
+This example requires GPU. Install the [hf] option:
 ```python
-pip install "flaml[nlp]"
+pip install "flaml[hf]"
 ```
 
 ### A simple sequence classification example
@@ -3,17 +3,17 @@
 <!-- ### Welcome to FLAML, a Fast Library for Automated Machine Learning & Tuning! -->
 
 FLAML is a lightweight Python library that finds accurate machine
-learning models automatically, efficiently and economically. It frees users from selecting learners and hyperparameters for each learner.
+learning models automatically, efficiently and economically. It frees users from selecting models and hyperparameters for each model.
 
 ### Main Features
 
-1. For common machine learning tasks like classification and regression, it quickly finds quality models for user-provided data with low computational resources. It supports both classical machine learning models and deep neural networks.
+1. For common machine learning or AI tasks like classification, regression, and generation, it quickly finds quality models for user-provided data with low computational resources. It supports both classical machine learning models and deep neural networks, including large language models such as the OpenAI GPT-3 models.
 
 2. It is easy to customize or extend. Users can find their desired customizability from a smooth range: minimal customization (computational resource budget), medium customization (e.g., scikit-style learner, search space and metric), or full customization (arbitrary training and evaluation code). Users can customize only when and what they need to, and leave the rest to the library.
 
 3. It supports fast and economical automatic tuning, capable of handling large search space with heterogeneous evaluation cost and complex constraints/guidance/early stopping. FLAML is powered by a new, [cost-effective
 hyperparameter optimization](Use-Cases/Tune-User-Defined-Function#hyperparameter-optimization-algorithm)
-and learner selection method invented by Microsoft Research.
+and model selection method invented by Microsoft Research, and many followup [research studies](Research).
 
 ### Quickstart
 
@@ -24,8 +24,11 @@ install flaml with the [notebook] option:
 pip install flaml[notebook]
 ```
 
-#### Extra learners
+#### Extra learners/models
+* openai models
+```bash
+pip install flaml[openai]
+```
 * catboost
 ```bash
 pip install flaml[catboost]
@@ -38,10 +41,9 @@ pip install flaml[vw]
 ```bash
 pip install flaml[forecast]
 ```
-* natural language processing: transformers
+* huggingface transformers
 ```bash
-pip install flaml[nlp]
+pip install flaml[hf]
 ```
 
 #### Distributed tuning
Reference in New Issue