Installation | Quick Start | Features | Community

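PaddleNLP is an easy-to-use and powerful development library for natural language processing (NLP) and large language models (LLMs). It brings together high-quality pretrained models from across the industry and offers an out-of-the-box development experience. Its model library covers many NLP scenarios and, together with industrial practice examples, meets developers' needs for flexible customization.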

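News

2024.04.24 PaddleNLP v2.8: Adds the in-house RsLoRA+ algorithm, which substantially improves both the convergence speed and the final quality of PEFT training. High-performance generation acceleration is now built into the RLHF PPO algorithm, removing the generation-speed bottleneck in PPO training and putting PPO training performance well ahead. General support for FastFNN, FusedQKV, and other large-model training optimizations makes large-model training faster and more stable.
2024.01.04 PaddleNLP v2.7: A comprehensive upgrade of the large-model experience, with a single entry point for the large-model toolchain. The implementation code for pretraining, fine-tuning, compression, inference, and deployment is unified under the PaddleNLP/llm directory. New large-model toolchain documentation guides users from getting started with large models through to production deployment. The Unified Checkpoint mechanism for full checkpoint storage greatly improves the generality of stored large-model checkpoints. Efficient fine-tuning is upgraded: it can now be used together with LoRA, and algorithms such as QLoRA are supported.
2023.08.15 PaddleNLP v2.6: Releases the full-pipeline large-model toolchain, covering pretraining, fine-tuning, compression, inference, and deployment, and giving users an end-to-end large-model solution with a one-stop development experience. It includes a built-in 4D-parallel distributed Trainer, the efficient fine-tuning algorithms LoRA and Prefix Tuning, in-house INT8/INT4 quantization algorithms, and more. Mainstream large models such as LLaMA 1/2, BLOOM, ChatGLM 1/2, GLM, and OPT are fully supported.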

Installation

Requirements

python >= 3.7
paddlepaddle >= 2.6.0
For large-model features, use paddlepaddle-gpu >= 2.6.0
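
A quick way to confirm that an environment meets these minimums is to compare installed versions programmatically. The sketch below is illustrative only: the helper names `parse_version` and `check_requirements` are made up for this example, it assumes Python 3.8+ so that `importlib.metadata` is available, and it compares only the leading numeric components of each version string.

```python
import re
import sys
from importlib import metadata  # standard library, Python 3.8+

# Minimum versions taken from the requirements listed above.
# Either the CPU or the GPU distribution of PaddlePaddle may be installed.
MIN_PYTHON = (3, 7)
MIN_PACKAGES = {"paddlepaddle": (2, 6, 0), "paddlepaddle-gpu": (2, 6, 0)}


def parse_version(version):
    """Return the leading numeric components, e.g. '2.6.0rc1' -> (2, 6, 0)."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups(default="0"))


def check_requirements(minimums=MIN_PACKAGES):
    """Map each package to True (new enough), False (too old) or None (not installed)."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = parse_version(metadata.version(name))
        except metadata.PackageNotFoundError:
            report[name] = None
            continue
        report[name] = installed >= minimum
    return report


if __name__ == "__main__":
    print("python ok:", sys.version_info[:2] >= MIN_PYTHON)
    print(check_requirements())
```

A `None` entry means that distribution is not installed, which is expected for whichever of the two PaddlePaddle builds is not in use.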

Install with pip

pip install --upgrade paddlenlp

Alternatively, install the latest code from the develop branch with:

pip install --pre --upgrade paddlenlp -f https://www.paddlepaddle.org.cn/whl/paddlenlp.html

For detailed tutorials on installing PaddlePaddle and PaddleNLP, see Installation.

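Quick Start

Text generation with large models

PaddleNLP provides a convenient Auto API for quickly loading models and tokenizers. The example below uses the linly-ai/chinese-llama-2-7b model for text generation: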

>>> from paddlenlp.transformers import AutoTokenizer, AutoModelForCausalLM
>>> tokenizer = AutoTokenizer.from_pretrained("linly-ai/chinese-llama-2-7b")
>>> model = AutoModelForCausalLM.from_pretrained("linly-ai/chinese-llama-2-7b", dtype="float16")
>>> # Prompt: "Hello! Please introduce yourself."
>>> input_features = tokenizer("你好！请自我介绍一下。", return_tensors="pd")
>>> outputs = model.generate(**input_features, max_length=128)
>>> # The reply below reads: "Hello! I am an AI language model; I can answer your questions and provide help."
>>> tokenizer.batch_decode(outputs[0])
['\n你好！我是一个AI语言模型，可以回答你的问题和提供帮助。']

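One-click UIE prediction

PaddleNLP provides one-click prediction: with no training required, you can feed in data directly and get open-domain extraction results. The example below uses the UIE model on a named entity recognition task (information extraction):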

>>> from pprint import pprint
>>> from paddlenlp import Taskflow

>>> schema = ['时间', '选手', '赛事名称']  # Define the schema for entity extraction: time, athlete, event name
>>> ie = Taskflow('information_extraction', schema=schema)
>>> pprint(ie("2月8日上午北京冬奥会自由式滑雪女子大跳台决赛中中国选手谷爱凌以188.25分获得金牌！"))
[{'时间': [{'end': 6,
          'probability': 0.9857378532924486,
          'start': 0,
          'text': '2月8日上午'}],
  '赛事名称': [{'end': 23,
            'probability': 0.8503089953268272,
            'start': 6,
            'text': '北京冬奥会自由式滑雪女子大跳台决赛'}],
  '选手': [{'end': 31,
          'probability': 0.8981548639781138,
          'start': 28,
          'text': '谷爱凌'}]}]
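
The result is a list with one dictionary per input text, mapping each schema label to a list of spans with `text`, `start`, `end`, and `probability`. A minimal sketch of post-processing that structure follows. The helper name `flatten_spans` and the 0.86 cutoff are illustrative assumptions, and the sample data is copied from the output above rather than produced by calling the model.

```python
def flatten_spans(text, result, min_probability=0.0):
    """Flatten one result dict into (label, span_text, start, end) tuples.

    Spans are sorted by start offset. Each span is checked against the input
    text, assuming `end` is exclusive as in the sample output above.
    """
    spans = []
    for label, matches in result.items():
        for match in matches:
            start, end = match["start"], match["end"]
            if text[start:end] != match["text"]:
                raise ValueError(f"offsets do not match text for {label!r}")
            if match["probability"] >= min_probability:
                spans.append((label, match["text"], start, end))
    return sorted(spans, key=lambda span: span[2])


text = "2月8日上午北京冬奥会自由式滑雪女子大跳台决赛中中国选手谷爱凌以188.25分获得金牌！"
# Copied from the sample output above (not a live model call).
sample = [{
    "时间": [{"end": 6, "probability": 0.9857378532924486, "start": 0, "text": "2月8日上午"}],
    "赛事名称": [{"end": 23, "probability": 0.8503089953268272, "start": 6,
              "text": "北京冬奥会自由式滑雪女子大跳台决赛"}],
    "选手": [{"end": 31, "probability": 0.8981548639781138, "start": 28, "text": "谷爱凌"}],
}]

# Keep only spans the model is at least 86% confident about.
for label, span_text, start, end in flatten_spans(text, sample[0], min_probability=0.86):
    print(label, span_text, start, end)
```

With the 0.86 cutoff the event-name span (probability about 0.85) is dropped, while the time and athlete spans are kept.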

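For more about PaddleNLP, see:

- Full-pipeline large-model toolchain: full-pipeline solutions for mainstream Chinese large models.
- Curated model library: end-to-end, full-pipeline usage of high-quality pretrained models.
- Multi-scenario examples: how to use PaddleNLP to solve a range of NLP problems, including basic techniques, system applications, and extended applications.
- Interactive tutorials: learn PaddleNLP quickly on AI Studio, a platform with free compute.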

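Features

- Out-of-the-box NLP toolset
- Rich and comprehensive Chinese model library
- Industrial-grade end-to-end system examples
- High-performance distributed training and inference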

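Out-of-the-box NLP toolset

Taskflow provides a rich set of out-of-the-box, industrial-grade pre-built NLP models. They cover the two major scenarios of natural language understanding and generation, and offer industrial-grade quality with highly optimized inference performance.

For more usage, see the Taskflow documentation.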

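Rich and comprehensive Chinese model library

The industry's most complete set of Chinese pretrained models

A curated selection of 45+ network architectures and 500+ sets of pretrained model parameters covers the industry's most complete set of Chinese pretrained models. It includes ERNIE, PLATO, and other Wenxin (ERNIE) NLP large models as well as mainstream architectures such as BERT, GPT, RoBERTa, and T5. All of them can be downloaded at high speed with a single call to the AutoModel API.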

from paddlenlp.transformers import AutoModel, AutoModelForPretraining

ernie = AutoModel.from_pretrained('ernie-3.0-medium-zh')
bert = AutoModel.from_pretrained('bert-wwm-chinese')
albert = AutoModel.from_pretrained('albert-chinese-tiny')
roberta = AutoModel.from_pretrained('roberta-wwm-ext')
electra = AutoModel.from_pretrained('chinese-electra-small')
gpt = AutoModelForPretraining.from_pretrained('gpt-cpm-large-cn')

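To address the compute bottleneck of pretrained models, the full ERNIE-Tiny series of lightweight models is available through the same one-line API, which makes pretrained models easier to deploy.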

from paddlenlp.transformers import AutoModel

# 6L768H: 6 layers, hidden size 768
ernie = AutoModel.from_pretrained('ernie-3.0-medium-zh')
# 6L384H
ernie = AutoModel.from_pretrained('ernie-3.0-mini-zh')
# 4L384H
ernie = AutoModel.from_pretrained('ernie-3.0-micro-zh')
# 4L312H
ernie = AutoModel.from_pretrained('ernie-3.0-nano-zh')
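
The comments above give each model's depth and width (for example, `6L768H` is 6 layers with hidden size 768). As a rough illustration of how much smaller the lighter variants are, the sketch below estimates encoder weight counts. It assumes a standard Transformer encoder layer with a feed-forward size of 4 times the hidden size, and it ignores embeddings, biases, and layer norms, so the figures are order-of-magnitude estimates rather than official parameter counts. `encoder_weights` is a name made up for this example.

```python
def encoder_weights(num_layers, hidden_size, ffn_multiplier=4):
    """Approximate weight count of a Transformer encoder stack.

    Per layer: 4 * H^2 for the attention projections (Q, K, V, output)
    plus 2 * ffn_multiplier * H^2 for the two feed-forward matrices.
    """
    per_layer = (4 + 2 * ffn_multiplier) * hidden_size ** 2
    return num_layers * per_layer


# (layers, hidden size) pairs taken from the comments above.
variants = {
    "ernie-3.0-medium-zh": (6, 768),
    "ernie-3.0-mini-zh": (6, 384),
    "ernie-3.0-micro-zh": (4, 384),
    "ernie-3.0-nano-zh": (4, 312),
}

baseline = encoder_weights(*variants["ernie-3.0-medium-zh"])
for name, (layers, hidden) in variants.items():
    weights = encoder_weights(layers, hidden)
    print(f"{name}: ~{weights / 1e6:.1f}M encoder weights ({weights / baseline:.0%} of medium)")
```

Because the estimate grows with the square of the hidden size, halving the width (768 to 384) at the same depth cuts the encoder weights to about a quarter.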

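A unified API is provided across the ways pretrained models are applied, such as semantic representation, text classification, sentence-pair matching, sequence labeling, and question answering.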

import paddle
from paddlenlp.transformers import (
    AutoModel,
    AutoModelForQuestionAnswering,
    AutoModelForSequenceClassification,
    AutoModelForTokenClassification,
    AutoTokenizer,
)

tokenizer = AutoTokenizer.from_pretrained('ernie-3.0-medium-zh')
text = tokenizer('自然语言处理')  # "natural language processing"

# Semantic representation
model = AutoModel.from_pretrained('ernie-3.0-medium-zh')
sequence_output, pooled_output = model(input_ids=paddle.to_tensor([text['input_ids']]))
# Text classification & sentence-pair matching
model = AutoModelForSequenceClassification.from_pretrained('ernie-3.0-medium-zh')
# Sequence labeling
model = AutoModelForTokenClassification.from_pretrained('ernie-3.0-medium-zh')
# Question answering
model = AutoModelForQuestionAnswering.from_pretrained('ernie-3.0-medium-zh')

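Application examples covering all scenarios

The NLP application examples range from academic to industrial use and cover basic NLP techniques, NLP system applications, and extended applications. They are developed entirely on the new API system of the PaddlePaddle 2.0 core framework and offer developers best practices for text tasks in PaddlePaddle.

For curated pretrained model examples, see Model Zoo. For more scenario examples, see the examples directory. Interactive Notebook tutorials on the AI Studio platform, which offers free compute, provide hands-on practice.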

Summary of tasks supported by PaddleNLP pretrained models

The summary covers five task types: Sequence Classification, Token Classification, Question Answering, Text Generation, and Multiple Choice.

Models included: ALBERT, BART, BERT, BigBird, BlenderBot, ChineseBERT, ConvBERT, CTRL, DistilBERT, ELECTRA, ERNIE, ERNIE-CTM, ERNIE-Doc, ERNIE-GEN, ERNIE-Gram, ERNIE-M, FNet, Funnel-Transformer, GPT, LayoutLM, LayoutLMv2, LayoutXLM, LUKE, mBART, MegatronBERT, MobileBERT, MPNet, NEZHA, PP-MiniLM, ProphetNet, Reformer, RemBERT, RoBERTa, RoFormer, SKEP, SqueezeBERT, T5, TinyBERT, UnifiedTransformer, XLNet.

See the Transformer documentation for the currently supported pretrained model architectures, parameters, and detailed usage.

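Industrial-grade end-to-end system examples

For high-frequency NLP scenarios such as information extraction, semantic retrieval, intelligent question answering, and sentiment analysis, PaddleNLP provides end-to-end system examples. They connect the full pipeline of data annotation, model training, model tuning, and inference deployment, and keep lowering the barrier to putting NLP technology into production. For detailed instructions on these system-level industrial examples, see Applications.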

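Semantic retrieval system

For different data situations, such as unsupervised data and supervised data, this system combines SimCSE, In-batch Negatives, the ERNIE-Gram single-tower model, and other techniques into a state-of-the-art semantic retrieval solution. It includes recall and ranking stages and connects the full pipeline of training, tuning, index building for an efficient vector search engine, and querying.

For more usage instructions, see Semantic Retrieval System.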
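
The recall and ranking stages described above can be pictured with a toy two-stage pipeline. This is a minimal sketch, not the code of the PaddleNLP system: the vectors are hand-written stand-ins for sentence embeddings, recall is a brute-force cosine-similarity scan rather than a vector index, and `rerank_score` is a hypothetical placeholder for a trained ranking model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(query_vec, corpus, top_k):
    """Stage 1: return the ids of the top_k documents by embedding similarity."""
    ranked = sorted(corpus, key=lambda doc_id: cosine(query_vec, corpus[doc_id]["vec"]),
                    reverse=True)
    return ranked[:top_k]


def rerank_score(query_terms, doc_text):
    """Stage 2 placeholder: fraction of query terms that appear in the document."""
    words = set(doc_text.lower().split())
    return sum(term in words for term in query_terms) / len(query_terms)


def search(query_terms, query_vec, corpus, top_k=2):
    """Recall a small candidate set cheaply, then reorder it with the finer score."""
    candidates = recall(query_vec, corpus, top_k)
    return sorted(candidates,
                  key=lambda doc_id: rerank_score(query_terms, corpus[doc_id]["text"]),
                  reverse=True)


# Toy corpus: the texts and vectors are invented for this example.
corpus = {
    "d1": {"text": "how to install paddlenlp with pip", "vec": [0.9, 0.1, 0.0]},
    "d2": {"text": "install paddlepaddle gpu build", "vec": [0.8, 0.3, 0.1]},
    "d3": {"text": "sentiment analysis of product reviews", "vec": [0.0, 0.2, 0.9]},
}
query_vec = [0.8, 0.3, 0.1]
print(recall(query_vec, corpus, top_k=2))                      # ['d2', 'd1']
print(search(["install", "paddlenlp"], query_vec, corpus))    # ['d1', 'd2']
```

The split reflects a common design choice: the recall stage only needs to be cheap and reasonably inclusive, because the ranking stage gets a second, more precise look at the few candidates that survive.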

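Intelligent question answering system

A retrieval-based question answering system built on RocketQA, supporting business scenarios such as FAQ question answering and product-manual question answering.

For more usage instructions, see Intelligent Question Answering System and Document Intelligent Question Answering.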

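Opinion extraction and sentiment analysis for reviews

Built on SKEP, a pretrained model enhanced with sentiment knowledge, this example extracts evaluation aspects and opinions from product reviews and performs fine-grained sentiment analysis.

For more usage instructions, see Sentiment Analysis.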

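Intelligent voice command parsing

This example integrates speech recognition from PaddleSpeech and the Baidu Open Platform with UIE universal information extraction to build an integrated voice command parsing system. The solution can be applied to voice-driven form filling, voice interaction, voice search, and similar scenarios to make human-computer interaction more efficient.

For more usage instructions, see Intelligent Voice Command Parsing.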

High-performance distributed training and inference

FastTokenizer: a high-performance text processing library

from paddlenlp.transformers import AutoTokenizer

AutoTokenizer.from_pretrained("ernie-3.0-medium-zh", use_fast=True)

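For the best model deployment performance, install FastTokenizer and set use_fast=True on the AutoTokenizer API. This calls the high-performance tokenization operators implemented in C++ and speeds up text processing by more than a hundred times over the Python implementation. For more usage instructions, see the FastTokenizer documentation.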

FastGeneration: a high-performance generation acceleration library

from paddlenlp.transformers import GPTLMHeadModel

model = GPTLMHeadModel.from_pretrained('gpt-cpm-large-cn')
...
outputs, _ = model.generate(
    input_ids=inputs_ids, max_length=10, decode_strategy='greedy_search',
    use_fast=True)

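Simply setting use_fast=True on the generate() API gives more than 5x GPU speedup on generative pretrained models such as Transformer, GPT, BART, PLATO, and UniLM. For more usage instructions, see the FastGeneration documentation.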
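
For readers unfamiliar with the `decode_strategy='greedy_search'` and `max_length` arguments used above, the following toy sketch shows the idea behind greedy decoding: at each step, pick the highest-scoring next token, and stop at an end token or after a fixed number of generated tokens. It is an illustration only. The bigram table `NEXT_TOKEN_SCORES` and the function `greedy_decode` are invented for this example and say nothing about how FastGeneration is implemented.

```python
# Toy next-token scores: NEXT_TOKEN_SCORES[current][candidate] -> score.
NEXT_TOKEN_SCORES = {
    "<s>": {"hello": 0.6, "hi": 0.4},
    "hello": {"world": 0.7, "there": 0.3},
    "hi": {"there": 0.9, "world": 0.1},
    "world": {"</s>": 0.8, "hello": 0.2},
    "there": {"</s>": 1.0},
}


def greedy_decode(start_token="<s>", end_token="</s>", max_length=10):
    """Generate up to max_length tokens, always taking the best-scoring next token."""
    generated = []
    current = start_token
    for _ in range(max_length):
        candidates = NEXT_TOKEN_SCORES.get(current)
        if not candidates:
            break  # no known continuation for this token
        current = max(candidates, key=candidates.get)
        if current == end_token:
            break
        generated.append(current)
    return generated


print(greedy_decode())              # ['hello', 'world']
print(greedy_decode(max_length=1))  # ['hello']
```

Greedy search never revisits a choice, which keeps it simple and fast but means it can miss a sequence whose first token scores slightly lower.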

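Fleet: PaddlePaddle's 4D hybrid-parallel distributed training technology

For more on distributed training of AI models at the hundred-billion-parameter scale, see GPT-3.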

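Community

Scan the QR code with WeChat and fill in the questionnaire, then reply to the assistant with the keyword (NLP) to join the discussion group and receive the following benefits:
- In-depth exchange with many community developers and the official team.
- A 10 GB pack of NLP learning resources.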

Citation

If PaddleNLP is helpful to your research, please consider citing it:

@misc{paddlenlp,
    title={PaddleNLP: An Easy-to-use and High Performance NLP Library},
    author={PaddleNLP Contributors},
    howpublished = {\url{https://github.com/PaddlePaddle/PaddleNLP}},
    year={2021}
}

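Acknowledgements

We have borrowed from the excellent design of Hugging Face's Transformers for working with pretrained models, and we thank the authors of Hugging Face and its open-source community.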

License

PaddleNLP is released under the Apache-2.0 license.
