According to the report at https://arxiv.org/pdf/2104.12369, searching for "GPT" turns up the following passages:
- From applications perspective, GPT-3 is revolutionary...
- Inspired by GPT-3 and our preliminary experiments, we choose the Transformer-based autoregressive language model...
- The architecture of PanGu-α is based on Transformer [13], which has been extensively used as the backbone of a variety of pretrained language models such as BERT [2] and GPT [10, 11, 1].
- Following the practice in GPT-3 ...
- Similar to the GPT-3 ...
- We follow the same closed-book setting in GPT-3 ...
- ... which is consistent with the observations in GPT-3.