model-index:

  • name: xlm-roberta-longformer-base-16384
    results: []

license: mit

language:
  • multilingual
  • af
  • am
  • ar
  • as
  • az
  • be
  • bg
  • bn
  • br
  • bs
  • ca
  • cs
  • cy
  • da
  • de
  • el
  • en
  • eo
  • es
  • et
  • eu
  • fa
  • fi
  • fr
  • fy
  • ga
  • gd
  • gl
  • gu
  • ha
  • he
  • hi
  • hr
  • hu
  • hy
  • id
  • is
  • it
  • ja
  • jv
  • ka
  • kk
  • km
  • kn
  • ko
  • ku
  • ky
  • la
  • lo
  • lt
  • lv
  • mg
  • mk
  • ml
  • mn
  • mr
  • ms
  • my
  • ne
  • nl
  • 'no'
  • om
  • or
  • pa
  • pl
  • ps
  • pt
  • ro
  • ru
  • sa
  • sd
  • si
  • sk
  • sl
  • so
  • sq
  • sr
  • su
  • sv
  • sw
  • ta
  • te
  • th
  • tl
  • tr
  • ug
  • uk
  • ur
  • uz
  • vi
  • xh
  • yi
  • zh

pipeline_tag: feature-extraction

frameworks:
  • PyTorch

hardwares:
  • NPU

library_name: openmind

xlm-roberta-longformer-base-16384

⚠️ This is just the PyTorch version of hyperonym/xlm-roberta-longformer-base-16384 without any modifications.

xlm-roberta-longformer is a multilingual Longformer initialized with XLM-RoBERTa's weights without further pretraining. It is intended to be fine-tuned on a downstream task.
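
To illustrate what "initialized with XLM-RoBERTa's weights" can mean for the position embeddings, here is a minimal sketch of the tiling approach used in the original Longformer conversion recipe: the learned position rows are copied repeatedly until the longer table is filled. Whether the notebook linked below does exactly this is an assumption, and `extend_position_embeddings` is a hypothetical helper written in plain Python for illustration, not part of any library.

```python
def extend_position_embeddings(old_rows, new_max_pos, offset=2):
    """Tile learned position embeddings to cover a longer sequence.

    old_rows:    list of embedding vectors, one per original position id.
    new_max_pos: number of rows wanted in the extended table.
    offset:      number of leading reserved rows; RoBERTa-style models
                 reserve the first two position ids.
    """
    reserved, learned = old_rows[:offset], old_rows[offset:]
    new_rows = [list(row) for row in reserved]
    i = 0
    # Copy the learned rows over and over until the new table is full.
    while len(new_rows) < new_max_pos:
        new_rows.append(list(learned[i % len(learned)]))
        i += 1
    return new_rows


# Toy table: 2 reserved rows + 4 learned rows, stretched to 2 + 8 rows.
old = [[0.0], [0.1], [1.0], [2.0], [3.0], [4.0]]
new = extend_position_embeddings(old, new_max_pos=10)
print([row[0] for row in new])
# [0.0, 0.1, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
```

Because the copied rows were never trained for positions beyond the original range, the resulting model still needs fine-tuning before its long-range behavior is useful, which matches the note above.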

The notebook for replicating the model is available on GitHub: https://github.com/hyperonym/dirge/blob/master/models/xlm-roberta-longformer/convert.ipynb

Use in Openmind

import argparse
import time

import torch
from openmind import AutoModel, AutoTokenizer, is_torch_npu_available


# Mean pooling: average the token embeddings, using the attention mask so padding tokens are ignored
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0]  # First element of model_output contains all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--model_name_or_path",
        type=str,
        help="Path to model",
        default="jeffding/xlm-roberta-longformer-base-16384-openmind",
    )
    return parser.parse_args()


def main():
    args = parse_args()
    model_path = args.model_name_or_path

    # Use the NPU when available, otherwise fall back to the CPU.
    # Half precision is used only on the NPU; float16 on the CPU is slow
    # or unsupported for some operations, depending on the PyTorch version.
    if is_torch_npu_available():
        device = "npu:0"
        dtype = torch.float16
    else:
        device = "cpu"
        dtype = torch.float32

    # Load the tokenizer and the base encoder. The checkpoint has no fine-tuned
    # task head, so this example extracts embeddings rather than classification logits.
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModel.from_pretrained(
        model_path, trust_remote_code=True,
        torch_dtype=dtype
    ).to(device)
    model.eval()

    sentences = ["中国的首都在哪儿", "北京", "what is the capital of China?", "how to implement quick sort in python?", "Introduction of quick sort"]

    start_time = time.time()

    with torch.no_grad():
        # max_length=512 keeps the demo fast; raise it (up to 16384) for long documents
        inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt', max_length=512).to(device)
        outputs = model(**inputs, return_dict=True)
        embeddings = mean_pooling(outputs, inputs["attention_mask"]).float()
        print(embeddings.shape)

    end_time = time.time()
    print(f"Device: {device}, inference time: {end_time - start_time} seconds")


if __name__ == "__main__":
    main()
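
The `mean_pooling` helper in the script above averages token embeddings while skipping padded positions. The following is a small plain-Python sketch of the same arithmetic on a single toy sequence, so the effect of the attention mask can be checked by hand. The function name `masked_mean` and the toy values are illustrative assumptions, not part of openmind or PyTorch.

```python
def masked_mean(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 and ignore the rest."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            for j in range(dim):
                total[j] += vector[j]
    # Same guard as torch.clamp(..., min=1e-9): avoids dividing by zero
    # when every position is masked out.
    count = max(sum(attention_mask), 1e-9)
    return [value / count for value in total]


tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]  # last row is padding
mask = [1, 1, 0]
print(masked_mean(tokens, mask))  # [2.0, 3.0]
```

Without the mask, the padding row would pull the average toward `[100.0, 100.0]`, which is why padded batches should not be averaged naively.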