Intro

Yi-1.5 is an upgraded version of Yi. It is continuously pre-trained on Yi with a high-quality corpus of 500B tokens and fine-tuned on 3M diverse fine-tuning samples.

Compared with Yi, Yi-1.5 delivers stronger performance in coding, math, reasoning, and instruction-following capability, while still maintaining excellent capabilities in language understanding, commonsense reasoning, and reading comprehension.

Model	Context Length	Pre-trained Tokens
Yi-1.5	4K, 16K, 32K	3.6T

## Use with openMind

environment variable

# source environment variable
source /usr/local/Ascend/ascend-toolkit/set_env.sh
export OPENMIND_FRAMEWORK=pt

pip install openMind Library

OpenMind Library can be installed through pip, please select the corresponding command according to the actual environment for installation.

It should be noted that since torch npu depends on torch, which can be directly installed through pip in aarch64, but requires a specific URL to download the CPU version in x86, the installation commands in the two environments are different. The specific installation code has been distinguished and presented in the following text.

# aarch64
pip install openmind[all]
# x86
pip install openmind[all] --extra-index-url https://download.pytorch.org/whl/cpu

Inference

from openmind import AutoTokenizer, AutoModelForCausalLM
import torch
import torch_npu

model_dir = "HangZhou_Ascend/Yi-1.5-6B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_dir, device_map="auto", trust_remote_code=True)
# Set `torch_dtype=torch.float16` to load model in float16, otherwise it will be loaded as float32 and might cause OOM Error.
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto",  trust_remote_code=True, torch_dtype=torch.float16)
model = model.eval()
response, history = model.chat(tokenizer, "1+1=", history=[], meta_instruction="")
print(response)