Commit Graph

126 Commits

Author SHA1 Message Date
Shaokun dd9202bb01
Update OptunaSearch (#1106)
* update optuna

* update setup

* fix dependencies

* fix bugs in test

* fix bugs, web format

---------

Co-authored-by: “skzhang1” <“shaokunzhang529@gmail.com”>
Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-07-05 03:14:02 +00:00
Shaokun 7a64148676
support string alg in tune (#1093)
* support string alg in tune

* add test, enforce string feasible, support lexico in set_search_priorities in CFO

* fix bug

* fix bug

* fix bug

* fix bug

* fix bugs

* fix yiran

---------

Co-authored-by: “skzhang1” <“shaokunzhang529@gmail.com”>
2023-07-01 03:01:14 +00:00
Li Jiang aa05434c87
temp solution for joblib 1.3.0 issue (#1100)
* temp solution for joblib 1.3.0 issue, no need once https://github.com/joblib/joblib-spark/pull/48 is merged

* update option
2023-06-30 14:17:55 +00:00
Yiran Wu e3ca95bf8a
An agent implementation of MathChat (#1090)
* mathcaht implementation

* code forrmat

* update readme

* update openai.yml

* update openai.yml

* update openai.yml
2023-06-25 13:49:34 +00:00
EgorKraevTransferwise 5245efbd2c
Factor out time series-related functionality into a time series Task object (#989)
* Refactor into automl subpackage

Moved some of the packages into an automl subpackage to tidy before the
task-based refactor. This is in response to discussions with the group
and a comment on the first task-based PR.

Only changes here are moving subpackages and modules into the new
automl, fixing imports to work with this structure and fixing some
dependencies in setup.py.

* Fix doc building post automl subpackage refactor

* Fix broken links in website post automl subpackage refactor

* Fix broken links in website post automl subpackage refactor

* Remove vw from test deps as this is breaking the build

* Move default back to the top-level

I'd moved this to automl as that's where it's used internally, but had
missed that this is actually part of the public interface so makes sense
to live where it was.

* Re-add top level modules with deprecation warnings

flaml.data, flaml.ml and flaml.model are re-added to the top level,
being re-exported from flaml.automl for backwards compatability. Adding
a deprecation warning so that we can have a planned removal later.

* Fix model.py line-endings

* WIP

* WIP - Notes below

Got to the point where the methods from AutoML are pulled to
GenericTask. Started removing private markers and removing the passing
of automl to these methods. Done with decide_split_type, started on
prepare_data. Need to do the others after

* Re-add generic_task

* Most of the merge done, test_forecast_automl fit succeeds, fails at predict()

* Remaining fixes - test_forecast.py passes

* Comment out holidays-related code as it's not currently used

* Further holidays cleanup

* Fix imports in a test

* tidy up validate_data in time series task

* Test fixes

* Fix tests: add Task.__str__

* Fix tests: test for ray.ObjectRef

* Hotwire TS_Sklearn wrapper to fix test fail

* Attempt at test fix

* Fix test where val_pred_y is a list

* Attempt to fix remaining tests

* Push to retrigger tests

* Push to retrigger tests

* Push to retrigger tests

* Push to retrigger tests

* Remove plots from automl/test_forecast

* Remove unused data size field from Task

* Fix import for CLASSIFICATION in notebook

* Monkey patch TFT to avoid plotting, to fix tests on MacOS

* Monkey patch TFT to avoid plotting v2, to fix tests on MacOS

* Monkey patch TFT to avoid plotting v2, to fix tests on MacOS

* Fix circular import

* remove redundant code in task.py post-merge

* Fix test: set svd_solver="full" in PCA

* Update flaml/automl/data.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Fix review comments

* Fix task -> str in custom learner constructor

* Remove unused CLASSIFICATION imports

* Hotwire TS_Sklearn wrapper to fix test fail by setting
optimizer_for_horizon == False

* Revert changes to the automl_classification and pin FLAML version

* Fix imports in reverted notebook

* Fix FLAML version in automl notebooks

* Fix ml.py line endings

* Fix CLASSIFICATION task import in automl_classification notebook

* Uncomment pip install in notebook and revert import

Not convinced this will work because of installing an older version of
the package into the environment in which we're running the tests, but
let's see.

* Revert c6a5dd1a0

* Fix get_classification_objective import in suggest.py

* Remove hcrystallball docs reference in TS_Sklearn

* Merge markharley:extract-task-class-from-automl into this

* Fix import, remove smooth.py

* Fix dependencies to fix TFT fail on Windows Python 3.8 and 3.9

* Add tensorboardX dependency to fix TFT fail on Windows Python 3.8 and 3.9

* Set pytorch-lightning==1.9.0 to fix  TFT fail on Windows Python 3.8 and 3.9

* Set pytorch-lightning==1.9.0 to fix  TFT fail on Windows Python 3.8 and 3.9

* Disable PCA reduction of lagged features for now, to fix svd convervence fail

* Merge flaml/main into time_series_task

* Attempt to fix formatting

* Attempt to fix formatting

* tentatively implement holt-winters-no covariates

* fix forecast method, clean class

* checking external regressors too

* update test forecast

* remove duplicated test file, re-add sarimax, search space cleanup

* Update flaml/automl/model.py

removed links. Most important one probably was: https://robjhyndman.com/hyndsight/ets-regressors/

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* prevent short series

* add docs

* First attempt at merging Holt-Winters

* Linter fix

* Add holt-winters to TimeSeriesTask.estimators

* Fix spark test fail

* Attempt to fix another spark test fail

* Attempt to fix another spark test fail

* Change Black max line length to 127

* Change Black max line length to 120

* Add logging for ARIMA params, clean up time series models inheritance

* Add more logging for missing ARIMA params

* Remove a meaningless test causing a fail, add stricter check on ARIMA params

* Fix a bug in HoltWinters

* A pointless change to hopefully trigger the on and off KeyError in ARIMA.fit()

* Fix formatting

* Attempt to fix formatting

* Attempt to fix formatting

* Attempt to fix formatting

* Attempt to fix formatting

* Add type annotations to _train_with_config() in state.py

* Add type annotations to prepare_sample_train_data() in state.py

* Add docstring for time_col argument of AutoML.fit()

* Address @sonichi's comments on PR

* Fix formatting

* Fix formatting

* Reduce test time budget

* Reduce test time budget

* Increase time budget for the test to pass

* Remove redundant imports

* Remove more redundant imports

* Minor fixes of points raised by Qingyun

* Try to fix pandas import fail

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Formatting fixes

* More formatting fixes

* Added test that loops over TS models to ensure coverage

* Fix formatting issues

* Fix more formatting issues

* Fix random fail in check

* Put back in tests for ARIMA predict without fit

* Put back in tests for lgbm

* Update test/test_model.py

cover dedup

* Match target length to X length in missing test

---------

Co-authored-by: Mark Harley <mark.harley@transferwise.com>
Co-authored-by: Mark Harley <mharley.code@gmail.com>
Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Andrea W <a.ruggerini@ammagamma.com>
Co-authored-by: Andrea Ruggerini <nescio.adv@gmail.com>
Co-authored-by: Egor Kraev <Egor.Kraev@tw.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-06-19 11:20:32 +00:00
Chi Wang e1da7f7d68
update openai model support (#1082)
* update openai model support

* new gpt3.5

* docstr

* function_call and content may co-exist

* test function call

---------

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-06-16 00:58:44 +00:00
Qingyun Wu 9356a92ba5
add pands requirement in benchmark option (#1070)
Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-06-13 08:25:41 +00:00
Chi Wang 5387a0a607
Agent notebook example with human feedback; Support shell command and multiple code blocks; Improve the system message for assistant agent; Improve utility functions for config lists; reuse docker image (#1056)
* add agent notebook and documentation

* fix bug

* set flush to True when printing msg in agent

* add a math problem in agent notebook

* remove

* header

* improve notebook doc

* notebook update

* improve notebook example

* improve doc

* agent notebook example with user feedback

* log

* log

* improve notebook doc

* improve print

* doc

* human_input_mode

* human_input_mode str

* indent

* indent

* Update flaml/autogen/agent/user_proxy_agent.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* shell command and multiple code blocks

* Update notebook/autogen_agent.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update notebook/autogen_agent.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update notebook/autogen_agent.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* coding agent

* math notebook

* renaming and doc format

* typo

* infer lang

* sh

* docker

* docker

* reset consecutive autoreply counter

* fix explanation

* paper talk

* human feedback

* web info

* rename test

* config list explanation

* link to blogpost

* installation

* homepage features

* features

* features

* rename agent

* remove notebook

* notebook test

* docker command

* notebook update

* lang -> cmd

* notebook

* make it work for gpt-3.5

* return full log

* quote

* docker

* docker

* docker

* docker

* docker

* docker image list

* notebook

* notebook

* use_docker

* use_docker

* use_docker

* doc

* agent

* doc

* abs path

* pandas

* docker

* reuse docker image

* context window

* news

* print format

* pyspark version in py3.8

* pyspark in py3.8

* pyspark and ray

* quote

* pyspark

* pyspark

* pyspark

---------

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-06-09 18:40:04 +00:00
Chi Wang a0b318b12e
create an automl option to remove unnecessary dependency for autogen and tune (#1007)
* version update post release v1.2.2

* automl option

* import pandas

* remove automl.utils

* default

* test

* type hint and version update

* dependency update

* link to open in colab

* use packging.version to close #725

---------

Co-authored-by: Li Jiang <lijiang1@microsoft.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-05-24 23:55:04 +00:00
Chi Wang b3fba9734e
Mark experimental classes; doc; multi-config trial (#1021)
* Mark experimental classes

* template

* multi model

* test

* multi-config doc

* doc

* doc

* test

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-05-05 02:48:31 +00:00
Chi Wang 19aee67f55
coding agent; logging (#1011)
* coding agent

* tsp

* tsp

* aoai

* logging

* compact

* Handle Import Error

* cost function

* reset counter; doc

* reset_counter

* home page update

* use case

* catboost in linux

* catboost

* catboost

* catboost

* doc

* intro

* catboost
2023-05-02 20:38:23 +00:00
Li Jiang 39b9a9a417
Fix catboost failure in mac-os python<3.9 (#1020) 2023-05-02 14:19:56 +00:00
Jirka Borovec 73bb6e7667
pyproject.toml & switch to Ruff (#976)
* unify config to pyproject.toml
replace flake8 with Ruff

* drop configs

* update

* fixing

* Apply suggestions from code review

Co-authored-by: Zvi Baratz <z.baratz@gmail.com>

* setup

* ci

* pr template

* reword

---------

Co-authored-by: Zvi Baratz <z.baratz@gmail.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2023-04-28 01:54:55 +00:00
Chi Wang fa5ccea862
extract code from text; solve_problem; request_timeout in config; improve code (#999)
* extract code from text

* solve_problem; request_timeout in config

* improve

* move import statement

* improve code

* generate assertions

* constant

* configs for implement; voting

* doc

* execute code in docker

* success indicator of code executation in docker

* success indicator

* execute code

* strip n

* add cost in generate_code

* add docstr

* filename

* bytes

* check docker version

* print log

* python test

* remove api key address

* rename exit code

* success exit code

* datasets

* exit code

* recover openai tests

* cache and pattern match

* wait

* wait

* cache and test

* timeout test

* python image name and skip macos

* windows image

* docker images

* volume path and yaml

* win path -> posix

* extensions

* path

* path

* path

* path

* path

* path

* path

* path

* path

* path

* path

* skip windows

* path

* timeout in windows

* use_docker

* use_docker

* hot fix from #1000

---------

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-04-23 11:50:29 +00:00
Li Jiang c9fc622af1
fix tests failure caused by version incompatibility (#995) 2023-04-15 14:52:40 +00:00
Chi Wang 82f0a4309d
autogen subpackage (#968)
* math utils in autogen

* cleanup

* code utils

* remove check function from code response

* comment out test

* GPT-4

* increase request timeout

* name

* logging and error handling

* better doc

* doc

* codegen optimized

* GPT series

* text

* no demo example

* math

* import openai

* import openai

* azure model name

* azure model name

* openai version

* generate assertion if necessary

* condition to generate assertions

* init region key

* rename

* comments about budget

* prompt

---------

Co-authored-by: Susan Xueqing Liu <liususan091219@users.noreply.github.com>
2023-04-08 03:04:01 +00:00
Chi Wang 595f5a8025
gpt-4 support; openai workflow fix; model str; timeout; voting (#958)
* workflow; model str; timeout

* voting

* notebook

* pull request

* recover workflow

* voted answer

* aoai

* ignore None answer

* default config

* note

* gpt-4

* n=5

* cleanup

* config name

* introduction

* readme

* avoid None

* add output/ to gitignore

* openai version

* invalid var

* comment long running cells
2023-03-26 17:13:06 +00:00
Li Jiang 50334f2c52
Support spark dataframe as input dataset and spark models as estimators (#934)
* add basic support to Spark dataframe

add support to SynapseML LightGBM model

update to pyspark>=3.2.0 to leverage pandas_on_Spark API

* clean code, add TODOs

* add sample_train_data for pyspark.pandas dataframe, fix bugs

* improve some functions, fix bugs

* fix dict change size during iteration

* update model predict

* update LightGBM model, update test

* update SynapseML LightGBM params

* update synapseML and tests

* update TODOs

* Added support to roc_auc for spark models

* Added support to score of spark estimator

* Added test for automl score of spark estimator

* Added cv support to pyspark.pandas dataframe

* Update test, fix bugs

* Added tests

* Updated docs, tests, added a notebook

* Fix bugs in non-spark env

* Fix bugs and improve tests

* Fix uninstall pyspark

* Fix tests error

* Fix java.lang.OutOfMemoryError: Java heap space

* Fix test_performance

* Update test_sparkml to test_0sparkml to use the expected spark conf

* Remove unnecessary widgets in notebook

* Fix iloc java.lang.StackOverflowError

* fix pre-commit

* Added params check for spark dataframes

* Refactor code for train_test_split to a function

* Update train_test_split_pyspark

* Refactor if-else, remove unnecessary code

* Remove y from predict, remove mem control from n_iter compute

* Update workflow

* Improve _split_pyspark

* Fix test failure of too short training time

* Fix typos, improve docstrings

* Fix index errors of pandas_on_spark, add spark loss metric

* Fix typo of ndcgAtK

* Update NDCG metrics and tests

* Remove unuseful logger

* Use cache and count to ensure consistent indexes

* refactor for merge maain

* fix errors of refactor

* Updated SparkLightGBMEstimator and cache

* Updated config2params

* Remove unused import

* Fix unknown parameters

* Update default_estimator_list

* Add unit tests for spark metrics
2023-03-25 19:59:46 +00:00
Chi Wang 169012f3e7
ChatGPT support (#942)
* improve max_valid_n and doc

* Update README.md

Co-authored-by: Li Jiang <lijiang1@microsoft.com>

* add support for chatgpt

* notebook

* newline at end of file

* chatgpt notebook

* ChatGPT in Azure

* doc

* math

* warning, timeout, log file name

* handle import error

* doc update; default value

* paper

* doc

* docstr

* eval_func

* prompt and messages

* remove confusing words

* notebook name

---------

Co-authored-by: Li Jiang <lijiang1@microsoft.com>
Co-authored-by: Susan Xueqing Liu <liususan091219@users.noreply.github.com>
2023-03-10 19:35:36 +00:00
Chi Wang 1ec77b58b4
improve max_valid_n and doc (#933)
* improve max_valid_n and doc

* Update README.md

Co-authored-by: Li Jiang <lijiang1@microsoft.com>

* newline at end of file

* doc

---------

Co-authored-by: Li Jiang <lijiang1@microsoft.com>
Co-authored-by: Susan Xueqing Liu <liususan091219@users.noreply.github.com>
Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-03-05 16:40:57 +00:00
Susan Xueqing Liu 2273937e68
Update hf version (#918)
* update hf version

* adding transformers version

---------

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-02-17 23:52:36 +00:00
Li Jiang 138eb78dbc
Added extras for synapse (#916)
* Added extras for synapse

* Update Installation doc
2023-02-17 16:38:55 +00:00
Chi Wang 35ce9b79e8
azure oai (#920)
* azure oai

* price update in notebook

* text Davinci

* pytorch-lightning version

* trigger action in merge queue

* types

* doc check in mege group
2023-02-16 23:38:50 +00:00
Chi Wang 63d350d4c8
Openai (#905)
* add cost budget; move loc of make_dir

* support openai completion

* install pytest in workflow

* skip openai test

* test openai

* path for docs rebuild

* install datasets

* signal

* notebook

* notebook in workflow

* optional arguments and special params

* key -> k

* improve readability

* assumption

* optimize for model selection

* larger range of max_tokens

* notebook

* python package workflow

* skip on win
2023-02-05 20:13:08 -08:00
Chi Wang 75e3454120
notebook test; spark warning message; reproducibility bug; sequential tuning stop condition (#869)
* notebook test

* add ipykernel, remove except

* only create dir if not empty

* Stop sequential tuning when result is None

* fix reproducibility of global search

* save gs seed

* use get to avoid KeyError

* test
2023-01-07 18:39:29 -08:00
Li Jiang da2cd7ca89
Add supporting using Spark as the backend of parallel training (#846)
* Added spark support for parallel training.

* Added tests and fixed a bug

* Added more tests and updated docs

* Updated setup.py and docs

* Added customize_learner and tests

* Update spark tests and setup.py

* Update docs and verbose

* Update logging, fix issue in cloud notebook

* Update github workflow for spark tests

* Update github workflow

* Remove hack of handling _choice_

* Allow for failures

* Fix tests, update docs

* Update setup.py

* Update Dockerfile for Spark

* Update tests, remove some warnings

* Add test for notebooks, update utils

* Add performance test for Spark

* Fix lru_cache maxsize

* Fix test failures on some platforms

* Fix coverage report failure

* resovle PR comments

* resovle PR comments 2nd round

* resovle PR comments 3rd round

* fix lint and rename test class

* resovle PR comments 4th round

* refactor customize_learner to broadcast_code
2022-12-23 08:18:49 -08:00
Chi Wang 92b79221b6
make performance test reproducible (#837)
* make performance test reproducible

* fix test error

* Doc update and disable logging

* document random_state and version

* remove hardcoded budget

* fix test error and dependency; close #777

* iloc
2022-12-06 10:13:39 -08:00
Chi Wang 595af7a04f
install editable package in codespace (#826)
* install editable package in codespace

* fix test error in test_forecast

* fix test error in test_space

* openml version

* break tests; pre-commit

* skip on py10+win32

* install mlflow in test

* install mlflow in [test]

* skip test in windows

* import

* handle PermissionError

* skip test in windows

* skip test in windows

* skip test in windows

* skip test in windows

* remove ts_forecast_panel from doc
2022-11-27 14:22:54 -05:00
Anonymous-submission-repo f7a9d42dc7 update 2022-10-10 01:15:17 +00:00
Anonymous-submission-repo 4050d3f1cb update 2022-10-09 13:18:15 -04:00
Qingyun Wu 8b3c6e4d7b
VW version requirement and documentation on config_constraints vs metric_constraints (#686)
* add vw version requirement

* vw version

* version range

* add documentation

* vw version range

* skip test on py3.10

* vw version

* rephrase

* don't install vw on py 3.10

* move import location

* remove inherit

* 3.10 in version

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2022-08-15 20:16:11 -07:00
Kevin Chen f718d18b5e
time series forecasting with panel datasets (#541)
* time series forecasting with panel datasets
- integrate Temporal Fusion Transformer as a learner based on pytorchforecasting

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update setup.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update test_forecast.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update setup.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update setup.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update model.py and test_forecast.py
- remove blank lines

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update model.py to prevent errors

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update automl.py and data.py
- change forecast task name
- update documentation for fit() method

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update test_forecast.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update test_forecast.py
- add performance test
- use 'fit_kwargs_by_estimator'

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* add time index function

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update test_forecast.py performance test

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update data.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update automl.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update data.py to prevent type error

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update setup.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update for pytorch forecasting tft on panel datasets

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update automl.py documentations

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* - rename estimator
- add 'gpu_per_trial' for tft estimator

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update test_forecast.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* include ts panel forecasting as an example

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update model.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update documentations

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update automl_time_series_forecast.ipynb

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update documentations

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* "weights_summary" argument deprecated and removed for pl.Trainer()

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update model.py tft estimator prediction method

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update model.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update `fit_kwargs` documentation

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update automl.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2022-08-12 08:39:22 -07:00
Xueqing Liu 21fa6c10ec
Fixing the issue that FLAML trial number is significantly smaller than Transformers.hyperparameter_search (#657)
* fix 636

* adding low cost config

* update padding; update tokenization output y type (series -> DF); update low cost init config

* updating todf; updating metric_loss_score
2022-08-03 00:11:29 -04:00
Chi Wang cbb85e2aab
Py36 (#614)
* allow installation in py 3.6

* test py 3.6
2022-06-26 08:32:28 -07:00
Chi Wang c45741a67b
support latest xgboost version (#599)
* support latest xgboost version

* Update test_classification.py

* Update 

Exists problems when installing xgb1.6.1 in py3.6

* cleanup

* xgboost version

* remove time_budget_s in test

* remove redundancy

* stop support of python 3.6

Co-authored-by: zsk <shaokunzhang529@gmail.com>
Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2022-06-21 18:59:07 -07:00
Chi Wang 2d31138191
set holiday version <0.14 for prophet (#573)
* set holiday version <0.14 for prophet

* bump version to 1.0.5
2022-06-05 09:53:18 -07:00
Chi Wang c79c07f450 version update 2022-06-02 12:43:59 -07:00
Chi Wang d747800509 include .json file in flaml.default package 2022-05-31 06:56:58 -07:00
Chi Wang 49e8f7f028
use zeroshot when no budget is given; custom_hp (#563)
* use zeroshot when no budget is given; custom_hp

* update Getting-Started

* protobuf version

* X_val
2022-05-28 17:22:09 -07:00
LinWencong 515a77ac71
solve issue #542. fix pickle.UnpickingError while blendsearch warm start (#554)
Issue I encountered:

#542 run test_restore.py and got _pickle.UnpicklingError: state is not a dictionary

I observed:

1. numpy version
  i. When numpy==1.16*, np.random.RandomState.__getstate__() returns a tuple, not a dict.
    _pickle.UnpicklingError occurs
  ii. When numpy>1.17.0rc1, it returns a dict;
    _pickle.UnpicklingError does not occur
  iii. When numpy>1.17.0rc1, flaml uses np_random_generator = np.random.Generator,
    _pickle.UnpicklingError does not occur

2. class _BackwardsCompatibleNumpyRng
  When I remove func _BackwardsCompatibleNumpyRng.__getattr__() , _pickle.UnpicklingError doesn't occur (regardless of numpy version == 1.16* or 1.17*)

To sum up,
I think making modifications to class _BackwardsCompatibleNumpyRng is not a good choice (_BackwardsCompatibleNumpyRng came from ray)and we still need to learn more about the operation mechanism of pickle.

So I upgraded the numpy version that flaml requires:
  setup.py:"NumPy>=1.17.0rc1"
2022-05-23 11:23:00 -07:00
Chi Wang b4d312412a
bump ray version to 1.10 (#450)
* bump ray version to 1.10

* init ray in test

* Update setup.py to include hotfixes

Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2022-02-09 15:04:29 -08:00
Kevin Chen 81f54026c9
Support time series forecasting for discrete target variable (#416)
* support 'ts_forecast_classification' task to forecast discrete values

* update test_forecast.py
- add test for forecasting discrete values

* update test_model.py

* pre-commit changes
2022-01-24 18:39:36 -08:00
Kevin Chen d4273669e6
Time series forecasting with sklearn regressors (#362)
* add sklearn regressors as learners for ts_forecast task

* add direct forecasting strategy
warnings and errors for duplicate rows and missing values

- add preprocess for sklearn time series forecast
 update automl.py
 update test/test_forecast.py

* update model.py and test_forecast.py for cv eval_method

* add "hcrystalball" dependency in setup.py

* update automl.py
- add _validate_ts_data function for abstraction
- include xgb_limitdepth as a learner

* update model.py
- update search space for sklearn ts regressors

* update automl.py and test_forecast.py for numpy array inputs

* add documentations to model.py

* add documentation for removing catboost regressor

* update automl.py
- _validate_ts_data() function

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
2022-01-06 23:12:38 -08:00
Chi Wang cd9740f022
Fix several issues for nlp tasks (#380)
* num cpu issue #378;
* temp fix for ray issue #379;
* transformers version.
2022-01-05 13:49:12 -08:00
Xueqing Liu 207b6935d9
adding token classification (#376)
* adding ner
2022-01-03 13:44:10 -05:00
oberonbot 9c00e4272a
Finish the Multiple Choice Classification (#367)
* adding multiple choice

* update test cases (hard coded)

* merged common code in predict_proba and predict in TransformersEstimator
2022-01-02 20:12:34 -05:00
Xueqing Liu ee3162e232
Adding the NLP task summarization (#346)
* Add test_autohf_summarization.py

* adding seq2seq

* Update flaml/nlp/huggingface/trainer.py

* rouge metrics

Co-authored-by: XinZofStevens <xzhao4346@gmail.com>
Co-authored-by: JinzhuoWu <wujinzhuo0105@gmail.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-12-20 14:19:32 -08:00
Chi Wang efd85b4c86
Deploy a new doc website (#338)
A new documentation website. And:

* add actions for doc

* update docstr

* installation instructions for doc dev

* unify README and Getting Started

* rename notebook

* doc about best_model_for_estimator #340

* docstr for keep_search_state #340

* DNN

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
Co-authored-by: Z.sk <shaokunzhang@psu.edu>
2021-12-16 17:11:33 -08:00
Xueqing Liu 42de3075e9
Make NLP tasks available from AutoML.fit() (#210)
Sequence classification and regression: "seq-classification" and "seq-regression"

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-11-16 11:06:20 -08:00
Chi Wang 92ebd1f7f9
when max_iter=1, skip search only if retrain_final (#280)
* when max_iter=1, skip search only if retrain_final

* remove nlp
redesign in #210

* minor change in readme example
2021-11-09 21:51:23 -08:00
Chi Wang 549a0dfb53
limit time and memory consumption (#264)
* limit time and memory

* separate tests

* lrl1 can't be limited by limit_resource

* free memory when possible

* passthrough=False when ensemble fails;
retrain when trained_estimator is None

* use callback to for resource limit

* handle lower version of xgb with no callback

* free mem ratio

* reduce verbosity

* retrain_final when max_iter==1

* remove trained_estimator from result

* model_history

* wheel

* retrain time as best_config_train_time

* ci: libomp version for xgboost on macos

* limit_resource not working in windows

* test pickle load

* mute forecaster

* notebook update

* check hard

* preventive callback

* add use_ray
2021-11-03 19:08:23 -07:00
Kevin Chen 519bfc2a18
Integrate multivariate time series forecasting (#254)
* Integrate multivariate time series forecasting, now supports
continuous and categorical variables

- update data.py to transform time series data
- update search space
- update documentations to reflect changes
- update test_forecast.py
- rename 'forecast' task to 'ts_forecast' task

* update automl.py and test_forecast.py

* update forecast notebook

* update README.md and setup.py

* update ml.py and test_forecast.py

- make "ds" and "y" constant variables

* replace constants with constant variables

* bump version to 0.7.0

* update setup.py
- support 'forecast' and 'ts_forecast'

* update automl.py and data.py
- support 'forecast' and 'ts_forecast' tasks
2021-10-30 09:48:57 -07:00
Chi Wang ddc1a63a76
Package (#244)
* build and upload pypi package

* pandas in dependency
2021-10-10 22:57:22 -07:00
Chi Wang f48ca2618f
warning -> info for low cost partial config (#231)
* warning -> info for low cost partial config
#195, #110

* when n_estimators < 0, use trained_estimator's

* log debug info

* test random seed

* remove "objective"; avoid ZeroDivisionError

* hp config to estimator params

* check type of searcher

* default n_jobs

* try import

* Update searchalgo_auto.py

* CLASSIFICATION

* auto_augment flag

* min_sample_size

* make catboost optional
2021-10-08 16:09:43 -07:00
Chi Wang f4529dfe89
package name in setup (#198)
* package name

* learning to rank example: close #200

* try import prophet #201
2021-09-11 21:19:18 -07:00
Chi Wang 71219df6c6
notebook example (#189)
* config in result

* value can be float

* pytorch notebook example

* docker, pre-commit

* max_failure (#192); early_stop

* extend starting_points (#196)

Co-authored-by: Chi Wang (MSR) <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qw2ky@virginia.edu>
2021-09-10 16:39:16 -07:00
Chi Wang 6ab0730793
remove catboost training dir; ensemble api; blendsearch for hierarchical space; ranking task; forecast improvement (#178)
* remove catboost training dir

* close #48

* bs for hierarchical space. close #85

* retrain for hierarchical space

* clean ml (#180)

Co-authored-by: Qingyun Wu <qxw5138@psu.edu>

* support ranking task

* examples

* cv shuffle

* forecast api and implementation cleaner

* period constraints

* delete groups after fit
2021-09-01 16:25:04 -07:00
Chi Wang 1bc8786dcb
remove big objects after fit (#176)
* remove big objects after fit

* xgboost>1.3.3 has a weird auc socre on:
kr-vs-kp, fold 5, 1h1c

* keep_search_state
2021-08-26 13:45:13 -07:00
Kevin Chen 3d0a3d26a2
Forecast (#162)
* added 'forecast' task with estimators ['fbprophet', 'arima', 'sarimax']

* update setup.py

* add TimeSeriesSplit to 'regression' and 'classification' task

* add 'time' split_type for 'classification' and 'regression' task

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* feature importance

* variable name

* Update test/test_split.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update test/test_forecast.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* prophet installation fail in windows

* upload flaml_forecast.ipynb

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
2021-08-23 13:26:46 -07:00
Xueqing Liu eeaf5b5963
space -> main (#148)
* subspace in flow2

* search space and trainable from AutoML

* experimental features: multivariate TPE, grouping, add_evaluated_points

* test experimental features

* readme

* define by run

* set time_budget_s for bs

Co-authored-by: liususan091219 <Xqq630517>

* version

* acl

* test define_by_run_func

* size

* constraints

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-08-02 16:10:26 -07:00
Qingyun Wu 58c0ec959d
Update readme for flaml.tune (#137)
* add time_budget_s for bs in readme

* version update

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-07-24 17:10:43 -07:00
Qingyun Wu a291abfab9
Cha cha (#127)
* unordered categorical

* allow cost attribute to be None

* tensorboardX version

* quote

* cfo cat

* trunc

* Update version.py

* incumbent is normalized

* python 3.9

* remove ConcurrencyLimiter

* seed

* estimator

* update autovw notebook

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qiw@microsoft.com>
2021-07-05 18:17:26 -07:00
Xueqing Liu e41b42842a
fixing "discount running thread " (#122)
* add tf to test dependency

Co-authored-by: liususan091219 <Xqq630517>
2021-06-25 22:26:47 -07:00
Xueqing Liu a5a5a4bc20
fixed API doc and import (#108)
* removed run_analysis.py, run_autohf.py, test_jupyter.py
2021-06-15 09:55:23 -07:00
Xueqing Liu 926589bdda
exception, coverage for autohf (#106)
* increase coverage

* fixing exception messages

* fixing import
2021-06-14 14:11:40 -07:00
Xueqing Liu a4049ad9b6
autohf (#43)
automate huggingface transformer
2021-06-09 08:37:03 -07:00
Qingyun Wu 0d3a0bfab6
Add ChaCha (#92)
* pickle the AutoML object

* get best model per estimator

* test deberta

* stateless API

* pickle the AutoML object

* get best model per estimator

* test deberta

* stateless API

* prevent divide by zero

* test roberta

* BlendSearchTuner

* sync

* version number

* update gitignore

* delta time

* reindex columns when dropping int-indexed columns

* add seed

* add seed in Args

* merge

* init upload of ChaCha

* remove redundancy

* add back catboost

* improve AutoVW API

* set min_resource_lease in VWOnlineTrial

* docstr

* rename

* docstr

* add docstr

* improve API and documentation

* fix name

* docstr

* naming

* remove max_resource in scheduler

* add TODO in flow2

* remove redundancy in rearcher

* add input type

* adapt code from ray.tune

* move files

* naming

* documentation

* fix import error

* fix format issues

* remove cb in worse than test

* improve _generate_all_comb

* remove ray tune

* naming

* VowpalWabbitTrial

* import error

* import error

* merge test code

* scheduler import

* fix import

* remove

* import, minor bug and version

* Float or Categorical

* fix default

* add test_autovw.py

* add vowpalwabbit and openml

* lint

* reorg

* lint

* indent

* add autovw notebook

* update notebook

* update log msg and autovw notebook

* update autovw notebook

* update autovw notebook

* add available strings for model_select_policy

* string for metric

* Update vw format in flaml/onlineml/trial.py

Co-authored-by: olgavrou <olgavrou@gmail.com>

* make init_config optional

* add _setup_trial_runner and update notebook

* space

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qiw@microsoft.com>
Co-authored-by: olgavrou <olgavrou@gmail.com>
2021-06-02 22:08:24 -04:00
Chi Wang 97a7c114ee
Issue58 (#59)
* iter per learner

* code cleanup
2021-04-08 09:29:55 -07:00
Chi Wang b7a91e0385
V0.3.0 (#55)
* flaml v0.3

* low cost partial config
2021-04-06 11:37:52 -07:00
Chi Wang 4a8110c87b
pickle the AutoML object (#37)
* pickle the AutoML object

* get best model per estimator

* test deberta

* stateless API

* Add Gitter badge (#41)

* prevent divide by zero

* test roberta

* BlendSearchTuner

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: The Gitter Badger <badger@gitter.im>
2021-03-16 22:13:35 -07:00
Chi Wang 7bd231e497
v0.2.6 (#32)
* xgboost notebook

* finetuning notebook

* finetuning test

* experimental nni support

* support nested search space

* log file name

* record training_iteration

* eps

* reset times

* std set to default step size if 0
2021-02-28 12:43:43 -08:00
Chi Wang (MSR) bd16eeee69 sample_weight; dependency; notebook 2021-02-13 10:43:11 -08:00
Chi Wang (MSR) 582eacdc79 cleanup 2021-02-05 22:02:14 -08:00
Chi Wang 776aa55189
V0.2.2 (#19)
* v0.2.2

separate the HPO part into the module flaml.tune
enhanced implementation of FLOW^2, CFO and BlendSearch
support parallel tuning using ray tune
add support for sample_weight and generic fit arguments
enable mlflow logging

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: qingyun-wu <qw2ky@virginia.edu>
2021-02-05 21:41:14 -08:00
Chi Wang cb5ce4e3a6
v0.1.3 Set default logging level to INFO (#14)
* set default logging level to INFO

* remove unnecessary import

* API future compatibility

* add test for customized learner

* test dependency

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
2020-12-15 08:10:43 -08:00
Chi Wang (MSR) 492990655d v0.1.0 2020-12-04 09:40:27 -08:00