ITDR-Qwen2.5-7B: A Recommendation-Enhanced Large Language Model Based on the ITDR Instruction Tuning Dataset

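The ITDR instruction tuning dataset is built by integrating 13 public recommendation datasets. It contains about 200,000 instances covering 7 subtasks under two core tasks, user-item interaction and user-item understanding, and it significantly improves the performance of mainstream open-source large language models on recommendation tasks. [This summary was AI-generated.]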

d4dde36e created on August 5, 2025 · 8 commits

File	Last commit message	Last updated
.gitattributes	add model files	9 months ago
Modelfile	add model files	9 months ago
README.md	Update README.md	9 months ago
added_tokens.json	add model files	9 months ago
config.json	add model files	9 months ago
configuration.json	System init configuration.json	9 months ago
generation_config.json	add model files	9 months ago
merges.txt	add model files	9 months ago
model-00001-of-00009.safetensors (LFS)	add model files	9 months ago
model-00002-of-00009.safetensors (LFS)	add model files	9 months ago
model-00003-of-00009.safetensors (LFS)	add model files	9 months ago
model-00004-of-00009.safetensors (LFS)	add model files	9 months ago
model-00005-of-00009.safetensors (LFS)	add model files	9 months ago
model-00006-of-00009.safetensors (LFS)	add model files	9 months ago
model-00007-of-00009.safetensors (LFS)	add model files	9 months ago
model-00008-of-00009.safetensors (LFS)	add model files	9 months ago
model-00009-of-00009.safetensors (LFS)	add model files	9 months ago
model.safetensors.index.json	add model files	9 months ago
special_tokens_map.json	add model files	9 months ago
tokenizer.json	add model files	9 months ago
tokenizer_config.json	add model files	9 months ago
vocab.json	add model files	9 months ago

license: mit
language:
- en
tasks:
- question-answering
- text-generation
- text-classification
- nli
- feature-extraction
- entity-typing
frameworks: PyTorch
base_model_relation: finetune
metrics:
- bleu
- accuracy
base_model:
- Qwen/Qwen2.5-7B

ITDR: An Instruction Tuning Dataset for Enhancing Large Language Models in Recommendations

Introduction

Large language models (LLMs) have demonstrated outstanding performance on natural language processing tasks. In recommendation systems, however, user behavior data differs structurally from natural language, so LLMs struggle to model the associations between user preferences and items. Prompt-based methods can generate recommendation results, but their limited understanding of recommendation tasks constrains performance. To address this gap, we construct ITDR, an instruction tuning dataset that covers 7 subtasks across two root tasks: user-item interaction and user-item understanding. The dataset integrates data from 13 public recommendation datasets, is built with manually crafted standardized templates, and comprises approximately 200,000 instances. Experimental results demonstrate that ITDR significantly improves the performance of mainstream open-source LLMs such as GLM-4, Qwen2.5, Qwen2.5-Instruct, and LLaMA-3.2 on recommendation tasks. We also analyze the correlations between tasks and explore how task descriptions and data scale affect the effectiveness of instruction tuning. Finally, we run comparative experiments against closed-source LLMs with large parameter counts.
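To make the template-based construction described above more concrete, here is a minimal sketch of how one instruction instance could be produced from a user-item interaction record. The record schema (`Interaction`), the template wording (`TEMPLATE`), and the output keys (`instruction`, `output`) are all assumptions made for illustration; the actual ITDR templates and data format are not shown in this document.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Interaction:
    """One user-item interaction record (hypothetical schema, not ITDR's)."""
    user_id: str
    history: list[str]      # items the user interacted with, oldest first
    candidates: list[str]   # items to choose from
    target: str             # the item the user actually interacted with next


# Hypothetical template; the real ITDR templates are manually crafted
# by the authors and are not reproduced here.
TEMPLATE = (
    "Task: {task_description}\n"
    "User history: {history}\n"
    "Candidates: {candidates}\n"
    "Which candidate is the user most likely to interact with next?"
)


def build_instance(rec: Interaction, task_description: str) -> dict[str, str]:
    """Fill the template with one record and pair it with the target answer."""
    prompt = TEMPLATE.format(
        task_description=task_description,
        history=", ".join(rec.history),
        candidates=", ".join(rec.candidates),
    )
    return {"instruction": prompt, "output": rec.target}


if __name__ == "__main__":
    rec = Interaction(
        user_id="u1",
        history=["The Matrix", "Inception"],
        candidates=["Interstellar", "Titanic"],
        target="Interstellar",
    )
    instance = build_instance(rec, "Predict the next item the user will choose.")
    print(instance["instruction"])
    print("Expected answer:", instance["output"])
```

Keeping the task description as a separate argument makes it easy to vary or drop it, which is one way to set up the kind of task-description comparison the introduction mentions.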
