RNNs with Attention

＊All English-language images used here are taken from the cs231n lecture materials.＊

1. Seq2seq for Machine Translation

2. Seq2seq with Attention

3. MRC with Attention

Seq2seq for Machine Translation

Sequence to Sequence Model (seq2seq)

Takes a sequence as input and produces a sequence as output
Chatbots, machine translation, ···

Seq2seq for Machine Translation

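Takes a source sequence as input and produces a target sequence

All of the information is packed into the hidden state at the last time step of the input sequence and passed to the decoder as its hidden state

→ When the source sequence gets long, information from the early time steps is poorly reflected.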
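
The hand-off described above can be sketched with a toy vanilla RNN encoder. This is a minimal illustration, not the lecture's implementation: the sizes, the random untrained weights `W_xh` and `W_hh`, and the `encode` helper are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
emb_dim, hid_dim = 4, 3  # toy sizes, chosen only for illustration

# Hypothetical, randomly initialised (untrained) encoder weights.
W_xh = rng.normal(size=(hid_dim, emb_dim))
W_hh = rng.normal(size=(hid_dim, hid_dim))

def encode(source):
    """Run a vanilla RNN over `source` (shape [T, emb_dim]).

    Returns all hidden states and the last one, which a plain
    seq2seq model hands to the decoder as its initial hidden state.
    """
    h = np.zeros(hid_dim)
    states = []
    for x_t in source:
        h = np.tanh(W_xh @ x_t + W_hh @ h)
        states.append(h)
    return np.stack(states), h

source = rng.normal(size=(5, emb_dim))  # 5 source tokens as toy embeddings
enc_states, last_state = encode(source)
decoder_init = last_state  # the only encoder information the decoder receives
```

Whatever the source length, `decoder_init` keeps the same fixed size, which is why early tokens tend to be squeezed out as the sequence grows.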

Seq2seq with Attention

Each output word is generated by combining information from every position of the input sequence

Dot-Product Attention

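Uses a dot product to compute the attention vector

→ Take the dot product between the decoder vector g and the hidden state of each encoder input, apply softmax to the scores, and take the resulting weighted average of the hidden states to obtain the attention vector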
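
The three steps above (dot product, softmax, weighted average) can be sketched as follows. The function names and the toy numbers are assumptions made for the example; `g` and `enc_states` stand for the decoder vector and the encoder hidden states.

```python
import numpy as np

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(scores - scores.max())
    return e / e.sum()

def dot_product_attention(g, enc_states):
    """g: decoder vector [hid]; enc_states: encoder hidden states [T, hid]."""
    scores = enc_states @ g           # one dot-product similarity per input position
    weights = softmax(scores)         # normalise the scores so they sum to 1
    attention = weights @ enc_states  # weighted average of the hidden states
    return attention, weights

# Toy values: 3 encoder positions, hidden size 2.
enc_states = np.array([[1.0, 0.0],
                       [0.0, 1.0],
                       [1.0, 1.0]])
g = np.array([2.0, 0.0])
attention, weights = dot_product_attention(g, enc_states)
```

In this toy case the first and third positions get equal, larger weights because their hidden states point in the direction of `g`, while the second position is orthogonal to it.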

Bahdanau Attention (Additive model)

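Uses concatenation to compute the attention vector

→ The decoder vector g is concatenated with the hidden state of each input and fed through an FC layer; the value of the output node serves as the similarity score from which the attention vector is computed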
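
A minimal sketch of this concat-then-FC scoring is shown below. The `tanh` nonlinearity and the projection vector `v` follow a common formulation of additive attention and are assumptions here, as are the random untrained weights and the toy sizes.

```python
import numpy as np

rng = np.random.default_rng(0)
hid_dim, attn_dim = 3, 4  # toy sizes, chosen only for illustration

# Hypothetical, untrained parameters of the scoring network.
W = rng.normal(size=(attn_dim, 2 * hid_dim))  # FC layer applied to [g; h_i]
v = rng.normal(size=attn_dim)                 # maps the FC output to one score

def additive_attention(g, enc_states):
    """g: decoder vector [hid]; enc_states: encoder hidden states [T, hid]."""
    T = enc_states.shape[0]
    # Concatenate g with every encoder hidden state: shape [T, 2 * hid].
    concat = np.concatenate([np.tile(g, (T, 1)), enc_states], axis=1)
    scores = np.tanh(concat @ W.T) @ v        # one similarity score per position
    e = np.exp(scores - scores.max())
    weights = e / e.sum()                     # softmax over positions
    return weights @ enc_states, weights      # weighted average and its weights

enc_states = rng.normal(size=(5, hid_dim))
g = rng.normal(size=hid_dim)
attention, weights = additive_attention(g, enc_states)
```

Compared with the dot-product variant, the similarity here is learned by the small network rather than fixed, at the cost of extra parameters.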

Decoder input

The decoder input at the current step: x(t)
The hidden state from the previous step: g(t-1)
The weighted-average vector from the encoder

→ Produces g(t)
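
One decoder step that combines these three inputs might look like the sketch below. It assumes dot-product attention scored with g(t-1) and a plain tanh RNN cell; the weight names `W_in` and `W_gg` and the toy sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
emb_dim, hid_dim = 4, 3  # toy sizes, chosen only for illustration

# Hypothetical, untrained decoder weights.
W_in = rng.normal(size=(hid_dim, emb_dim + hid_dim))  # acts on [x(t); context]
W_gg = rng.normal(size=(hid_dim, hid_dim))            # acts on g(t-1)

def decoder_step(x_t, g_prev, enc_states):
    """Combine x(t), g(t-1) and the attention-weighted encoder vector into g(t)."""
    scores = enc_states @ g_prev       # similarity of g(t-1) to each encoder state
    e = np.exp(scores - scores.max())
    weights = e / e.sum()              # softmax over source positions
    context = weights @ enc_states     # weighted average from the encoder
    g_t = np.tanh(W_in @ np.concatenate([x_t, context]) + W_gg @ g_prev)
    return g_t, weights

enc_states = rng.normal(size=(5, hid_dim))  # toy encoder hidden states
x_t = rng.normal(size=emb_dim)              # toy embedding of the current input
g_prev = np.zeros(hid_dim)                  # g(t-1); zeros at the first step
g_t, weights = decoder_step(x_t, g_prev, enc_states)
```

Because the context vector is recomputed at every step, each output word can draw on a different part of the source sequence.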

Attention Example in Machine Translation

The model learns the word-order differences between languages
Unneeded words such as articles are skipped

MRC with Attention

Trends of Machine Translation

Statistical Machine Translation (SMT) uses a probabilistic model for word-to-word translation

→ Requires feature engineering by the user

With the arrival of Neural Machine Translation, features are extracted automatically from the data

Encoder-Decoder(seq2seq)

Source sentence	Target sentence
나 / 는 / 학교 / 에 / 간다	I / go / to / school
5 tokens	4 tokens

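→ The mismatch between source and target sentence lengths degrades training

→ Overcome with the seq2seq encoder-decoder

→ The encoder compresses the sentence into a context vector and passes it to the decoder

→ When the source sentence gets long, the early context is reflected less

→ Overcome with the attention mechanism

→ Identifies the position of each vector in the source sentence and passes that to the decoder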

Machine Reading Comprehension (MRC)

Machine Reading Comprehension
Question Answering

→ Reads the context, interprets the question, and then produces an answer

Context-to-Context Attention (Self Attention)

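When the encoder's context is very long, attention is applied within the context itself to capture its meaning

→ The encoder performs self attention on its own input to derive how the parts of the context relate to one another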
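
A simplified sketch of attention applied inside the context is shown below. It scores every context position against every other with a plain dot product; the learned query, key and value projections used in many self-attention models are deliberately left out, and the function name and toy values are assumptions.

```python
import numpy as np

def self_attention(context):
    """context: [T, hid] context vectors; returns attended vectors and weights."""
    scores = context @ context.T                          # [T, T] pairwise similarities
    scores = scores - scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ context, weights

# Toy context: 3 positions with mutually orthogonal vectors.
context = 2.0 * np.eye(3)
attended, weights = self_attention(context)
```

Row i of `weights` shows how strongly position i relates to every other position of the context; with the orthogonal toy vectors each position attends mostly to itself.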


동영`s 인공지능 공부방
