BLOOM: BigScience Large Open-science Open-access Multilingual Language Model / Introducing the largest multilingual model


joannekim0420 2022. 11. 1. 16:21


https://huggingface.co/bigscience/bloom

The official Hugging Face documentation already introduces the model very well.


Brief summary

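BLOOM is a multilingual generative LLM released by BigScience, covering 59 languages in total (46 natural languages + 13 programming languages).
The training process was documented openly, and the model has 176 billion parameters.
For most of these languages, BLOOM is the first language model with more than 100B parameters; it was trained for several months on a cluster of 416 A100 80GB GPUs.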

Model Type: Transformer-based Language Model

Model Architecture and Objective

Modified from Megatron-LM GPT2 (see paper, BLOOM Megatron code):
Decoder-only architecture
Layer normalization applied to word embeddings layer (StableEmbedding; see code, paper)
ALiBi positional encodings (see paper), with GELU activation functions
176,247,271,424 parameters:
- 3,596,615,680 embedding parameters
- 70 layers, 112 attention heads
- Hidden layers are 14336-dimensional
- Sequence length of 2048 tokens used (see BLOOM tokenizer, tokenizer description)
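
The numbers in the list above can be checked with some quick arithmetic. The sketch below is only a rough accounting, not BLOOM's actual code: the vocabulary size of 250,880 is an assumption (it does not appear in the list above, but it is the value that reproduces the listed embedding count), and the layer layout (fused QKV projection with bias, 4x MLP expansion, two LayerNorms per layer, one LayerNorm after the embeddings, one final LayerNorm, output layer sharing the embedding weights) is an assumed GPT-style layout. The function name `bloom_param_count` is hypothetical.

```python
# Dimensions quoted in the model card list above.
HIDDEN = 14336
LAYERS = 70
HEADS = 112
# Assumption: vocabulary size, not stated above. It is chosen because
# 250,880 * 14,336 equals the listed embedding parameter count.
VOCAB = 250_880


def bloom_param_count(hidden=HIDDEN, layers=LAYERS, vocab=VOCAB):
    """Count parameters for an assumed GPT-style decoder layout."""
    embedding = vocab * hidden
    layer_norm = 2 * hidden                        # weight + bias
    attention = 3 * hidden * hidden + 3 * hidden   # fused Q, K, V projection
    attention += hidden * hidden + hidden          # attention output projection
    mlp = hidden * 4 * hidden + 4 * hidden         # up projection (hidden -> 4*hidden)
    mlp += 4 * hidden * hidden + hidden            # down projection (4*hidden -> hidden)
    per_layer = attention + mlp + 2 * layer_norm
    # One LayerNorm after the embeddings plus one final LayerNorm.
    # ALiBi contributes no learned position parameters.
    total = embedding + layers * per_layer + 2 * layer_norm
    return embedding, per_layer, total


embedding, per_layer, total = bloom_param_count()
print(f"embedding parameters: {embedding:,}")
print(f"per-layer parameters: {per_layer:,}")
print(f"total parameters:     {total:,}")
print(f"dimension per head:   {HIDDEN // HEADS}")
```

Under these assumptions the embedding count and the total come out to the 3,596,615,680 and 176,247,271,424 listed above, and each of the 112 attention heads works on a 128-dimensional slice of the hidden state.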
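
To get a feel for the ALiBi positional encodings mentioned in the list above: instead of learned position embeddings, ALiBi adds a penalty to each attention score that grows linearly with the distance between query and key, with a different slope per head. The following is a minimal pure-Python sketch of that idea, not BLOOM's implementation. The function names `alibi_slopes` and `alibi_bias` are hypothetical, and the sketch only handles head counts that are a power of two; BLOOM's 112 heads are not, so its real slopes need an extra step that is not covered here.

```python
def alibi_slopes(num_heads):
    """Geometric per-head slopes for a power-of-two head count."""
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    start = 2 ** (-8 / num_heads)
    return [start ** (k + 1) for k in range(num_heads)]


def alibi_bias(num_heads, seq_len):
    """bias[h][i][j] is added to the score between query i and key j.

    Past keys (j <= i) get a penalty proportional to their distance.
    Future keys (j > i) get -inf, which folds the causal mask into the bias.
    """
    return [
        [
            [slope * (j - i) if j <= i else float("-inf") for j in range(seq_len)]
            for i in range(seq_len)
        ]
        for slope in alibi_slopes(num_heads)
    ]


bias = alibi_bias(num_heads=8, seq_len=4)
print(alibi_slopes(8))
print(bias[0][3])  # penalties seen by the last query position in head 0
```

Heads with a large slope focus on nearby tokens, while heads with a small slope can still look far back in the sequence.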

Objective Function: Cross Entropy with mean reduction (see API documentation).
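
As a small illustration of "cross entropy with mean reduction": for each position, take the negative log-probability that the model assigns to the correct next token, then average over all positions. The sketch below is plain Python written just to show the math; it is not the training code, and the function name `cross_entropy_mean` and the toy logits are made up for illustration.

```python
import math


def cross_entropy_mean(logits, targets):
    """Mean negative log-likelihood of the target token at each position.

    logits:  list of per-position score lists (one score per vocabulary entry)
    targets: list of correct token indices, one per position
    """
    if not logits or len(logits) != len(targets):
        raise ValueError("logits and targets must be non-empty and equally long")
    total = 0.0
    for row, target in zip(logits, targets):
        peak = max(row)  # subtract the max for numerical stability
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_norm - row[target]  # equals -log softmax(row)[target]
    return total / len(targets)


# Toy example: 2 positions, vocabulary of 4 tokens.
toy_logits = [[2.0, 0.5, 0.1, -1.0], [0.0, 0.0, 0.0, 0.0]]
toy_targets = [0, 3]
print(cross_entropy_mean(toy_logits, toy_targets))
```

A model that spreads probability uniformly over a vocabulary of size V gets a loss of log(V) at every position, which is a handy sanity check for the starting loss of an untrained model.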

Official introduction blog: https://bigscience.huggingface.co/blog/bloom
