Introduction

This folder contains two examples:

A high-frequency dataset example
An example of predicting the price trend in high-frequency data

High-Frequency Dataset

This dataset is an example for reinforcement learning (RL) based high-frequency trading.

Get High-Frequency Data

Get high-frequency data by running the following command:

    python workflow.py get_data

Dump & Reload & Reinitialize the Dataset

The High-Frequency Dataset is implemented as qlib.data.dataset.DatasetH in workflow.py. DatasetH is a subclass of qlib.utils.serial.Serializable, so its state can be dumped to or loaded from disk in pickle format.
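The dump-and-reload idea can be sketched with the standard library alone. The snippet below is a minimal illustration, not Qlib's implementation: `ToyDataset`, its `dump` and `load` methods, and the `round_trip` helper are hypothetical stand-ins, and only the plain attribute dictionary is pickled.

```python
import os
import pickle
import tempfile


class ToyDataset:
    """Hypothetical stand-in for a serializable dataset (NOT Qlib's DatasetH)."""

    def __init__(self, instruments, segments):
        self.instruments = instruments
        self.segments = segments

    def dump(self, path):
        # Persist only the object's state (its attribute dict) in pickle format.
        with open(path, "wb") as f:
            pickle.dump(self.__dict__, f)

    @classmethod
    def load(cls, path):
        # Only unpickle files you trust: pickle can execute arbitrary code.
        with open(path, "rb") as f:
            state = pickle.load(f)
        obj = cls.__new__(cls)  # rebuild the object without calling __init__
        obj.__dict__.update(state)
        return obj


def round_trip(dataset):
    """Dump the dataset to a temporary file and load it back as a new object."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "dataset.pkl")
        dataset.dump(path)
        return ToyDataset.load(path)
```

For example, `round_trip(ToyDataset(["AAA"], {"test": ("2021-01-19", "2021-01-25")}))` returns a distinct object carrying the same `instruments` and `segments`. Because unpickling untrusted files is unsafe, the reload step should only be applied to files produced by your own runs.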

About Reinitialization

After reloading the Dataset from disk, Qlib also supports reinitializing it. Users can reset some states of the Dataset or DataHandler, such as instruments, start_time, end_time and segments, and generate new data according to the updated states.
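The reset-then-regenerate pattern described above can be sketched as follows. This is an illustrative toy under stated assumptions, not Qlib's API: `ToyHandler`, its `config` and `setup_data` methods, and the in-memory `raw` list are all hypothetical.

```python
class ToyHandler:
    """Hypothetical stand-in for a data handler (NOT Qlib's DataHandler)."""

    # State attributes that may be reset after the object is reloaded.
    RESETTABLE = ("start_time", "end_time")

    def __init__(self, raw, start_time, end_time):
        self.raw = raw  # list of (timestamp string, value) pairs
        self.start_time = start_time
        self.end_time = end_time
        self.data = []
        self.setup_data()

    def config(self, **kwargs):
        # Update state only; the derived data is not touched here.
        for key, value in kwargs.items():
            if key not in self.RESETTABLE:
                raise KeyError(f"unknown state: {key}")
            setattr(self, key, value)

    def setup_data(self):
        # Regenerate the data from the current state. Timestamps in the same
        # "YYYY-MM-DD HH:MM:SS" format compare correctly as plain strings.
        self.data = [
            (ts, value)
            for ts, value in self.raw
            if self.start_time <= ts <= self.end_time
        ]
```

Calling `config(start_time=..., end_time=...)` changes only the stored state; the new time range takes effect once `setup_data()` is called again. Keeping the two steps separate lets several states be reset before the data is regenerated once.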

An example is given in workflow.py; users can run it as follows.

Run the Code

Run the example with the following command:

    python workflow.py dump_and_load_dataset

Benchmark Performance (predicting the price trend in high-frequency data)

Here are the results of models for predicting the price trend in high-frequency data. We will keep updating the benchmark models in the future.

Model Name	Dataset	IC	ICIR	Rank IC	Rank ICIR	Long Precision	Short Precision	Long-Short Average Return	Long-Short Average Sharpe
LightGBM	Alpha158	0.0349±0.00	0.3805±0.00	0.0435±0.00	0.4724±0.00	0.5111±0.00	0.5428±0.00	0.000074±0.00	0.2677±0.00