Introduction

This is the implementation of DDG-DA, built on the Meta Controller component provided by Qlib.

Please refer to the paper for more details: DDG-DA: Data Distribution Generation for Predictable Concept Drift Adaptation [arXiv]

Background

Many real-world scenarios involve streaming data that is collected sequentially over time. Because the environment is non-stationary, the distribution of the streaming data may change in unpredictable ways, a phenomenon known as concept drift. To handle concept drift, previous methods first detect when and where the drift happens and then adapt models to fit the distribution of the latest data. However, in many cases some underlying factors of the environment's evolution are predictable, which makes it possible to model the future trend of concept drift in the streaming data. Such cases have not been fully explored in previous work.

Therefore, we propose a novel method, DDG-DA, that can effectively forecast the evolution of the data distribution and improve the performance of models. Specifically, we first train a predictor to estimate the future data distribution, then use it to generate training samples, and finally train models on the generated data.
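As a rough illustration of the "generate training samples" step, the sketch below resamples historical data so that each sample is drawn with probability proportional to a weight assigned to its time period. It is a simplified stand-in under stated assumptions, not the code in this directory: the function `resample_by_period`, the `(period, feature, label)` tuple layout, and the `predicted` weights are all hypothetical, and in DDG-DA the weights would come from the trained predictor rather than being written by hand.

```python
import random
from collections import Counter


def resample_by_period(samples, period_weights, n, seed=0):
    """Draw n samples with replacement from historical data.

    samples: list of (period, feature, label) tuples (hypothetical layout).
    period_weights: dict mapping period -> non-negative weight. In DDG-DA
        these would be produced by the distribution predictor; here they
        are supplied by hand for illustration.
    Each sample is drawn with probability proportional to the weight of
    the period it belongs to.
    """
    rng = random.Random(seed)
    weights = [period_weights.get(period, 0.0) for period, _, _ in samples]
    if sum(weights) <= 0:
        raise ValueError("at least one sample must have a positive weight")
    return rng.choices(samples, weights=weights, k=n)


# Toy history: two periods with two samples each.
history = [
    ("2019", 0.1, 0.2),
    ("2019", 0.3, 0.1),
    ("2020", 0.5, 0.9),
    ("2020", 0.7, 1.1),
]
# Hypothetical predictor output: the future is expected to resemble 2020 only.
predicted = {"2019": 0.0, "2020": 1.0}

resampled = resample_by_period(history, predicted, n=6)
counts = Counter(period for period, _, _ in resampled)
```

A forecasting model would then be trained on `resampled` instead of on the raw history. Passing the weights to a learner as per-sample weights is an alternative to resampling that avoids the sampling noise.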

Dataset

The data used in the paper are private, so we conduct the experiments on Qlib's public dataset instead. Although the dataset differs, the conclusion remains the same: with DDG-DA applied, both the proxy models' ICs and the performance of the forecasting models show a rising trend in the test phase.
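For readers unfamiliar with the metric, the sketch below assumes that IC stands for the information coefficient, commonly computed as the Pearson correlation between predicted and realized values. The helper `information_coefficient` is written for illustration and is not a function from this directory.

```python
import math


def information_coefficient(predictions, realized):
    """Pearson correlation between predictions and realized values."""
    if len(predictions) != len(realized):
        raise ValueError("inputs must have the same length")
    n = len(predictions)
    if n < 2:
        raise ValueError("at least two observations are required")

    mean_p = sum(predictions) / n
    mean_r = sum(realized) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predictions, realized))
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    var_r = sum((r - mean_r) ** 2 for r in realized)
    if var_p == 0 or var_r == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_r)


# Predictions that rank the outcomes perfectly give an IC of about 1,
# while predictions that rank them in reverse give an IC of about -1.
ic_aligned = information_coefficient([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
ic_reversed = information_coefficient([1.0, 2.0, 3.0], [6.0, 4.0, 2.0])
```

An IC that rises over the test phase means the predictions track the realized values more closely as time goes on.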

Run the Code

Users can try DDG-DA by running the following command:

    python workflow.py run

The default forecasting model is linear. Users can choose another forecasting model by changing the forecast_model parameter when DDG-DA is initialized. For example, users can try LightGBM as the forecasting model by running the following command:

    python workflow.py --conf_path=../workflow_config_lightgbm_Alpha158.yaml run

Results

The results of related methods on Qlib's public dataset can be found here.

Requirements

The minimum hardware requirements for running the workflow.py of DDG-DA are as follows.

Memory: 45G
Disk: 4G

PyTorch running on CPU with system RAM is sufficient for this example.
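Before starting a long run, it can be convenient to check the machine against the figures above. The following is a minimal standard-library sketch, not part of this directory. It assumes that "G" means GiB, and it reads physical memory through `os.sysconf`, which is not available on every platform (for example, Windows), in which case the memory check is reported as failed.

```python
import os
import shutil

REQUIRED_MEMORY_GB = 45  # from the requirements listed above
REQUIRED_DISK_GB = 4     # from the requirements listed above
GB = 1024 ** 3           # assumption: "G" is interpreted as GiB


def total_memory_gb():
    """Return total physical memory in GiB, or None if it cannot be read."""
    try:
        return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") / GB
    except (AttributeError, ValueError, OSError):
        return None


def free_disk_gb(path="."):
    """Return free disk space in GiB on the filesystem containing path."""
    return shutil.disk_usage(path).free / GB


def meets_requirements(memory_gb, disk_gb):
    """Check measured values against the minimum requirements."""
    return (
        memory_gb is not None
        and memory_gb >= REQUIRED_MEMORY_GB
        and disk_gb >= REQUIRED_DISK_GB
    )


if __name__ == "__main__":
    memory = total_memory_gb()
    disk = free_disk_gb()
    print("memory (GiB):", memory)
    print("free disk (GiB):", disk)
    print("meets requirements:", meets_requirements(memory, disk))
```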