XLNet for NLP Study
Section 1. Setting Up the Development and Test Environment
Mount Google Drive so that the data and models stored there can be loaded.
#!pip install pytorch-transformers # For Colab
import pandas as pd
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
!ls drive/'My Drive'/'Colab Notebooks'
amazon_cells_labelled.txt Untitled0.ipynb
bert_sentiment.ipynb XLM-Roberta_tensor.ipynb
Cooljamm_stat.ipynb XLnet_sentiment.ipynb
'Cooljamm_stat.ipynb의 사본' xtest_93_bs3_ep7.npy
data xtest_93.npy
ids2.npy xtest_95_bs3_ep7.npy
ids_93.npy xtrain_93_bs3_ep7.npy
Keras_API.ipynb xtrain_93.npy
model xtrain_95_bs3_ep7.npy
musicEmotion.ipynb ytest_93_bs3_ep7.npy
requirement.txt ytest_93.npy
sst2_albert.ipynb ytest_95_bs3_ep7.npy
'sst2_albert.ipynb의 사본' ytrain_93_bs3_ep7.npy
test.ipynb ytrain_93.npy
transformerXLnet_sentiment.ipynb ytrain_95_bs3_ep7.npy
PATH = "drive/My Drive/Colab Notebooks/amazon_cells_labelled.txt"
Section 2. Installing the Libraries Needed to Run the Algorithm
!pip install transformers
Collecting transformers
  Downloading transformers-3.2.0-py3-none-any.whl (1.0 MB)
Collecting sacremoses
  Downloading sacremoses-0.0.43.tar.gz (883 kB)
Requirement already satisfied: requests in /usr/local/lib/python3.6/dist-packages (from transformers) (2.23.0)
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.6/dist-packages (from transformers) (2019.12.20)
Requirement already satisfied: dataclasses; python_version < "3.7" in /usr/local/lib/python3.6/dist-packages (from transformers) (0.7)
Requirement already satisfied: filelock in /usr/local/lib/python3.6/dist-packages (from transformers) (3.0.12)
Requirement already satisfied: packaging in /usr/local/lib/python3.6/dist-packages (from transformers) (20.4)
Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.6/dist-packages (from transformers) (4.41.1)
Collecting sentencepiece!=0.1.92
  Downloading sentencepiece-0.1.91-cp36-cp36m-manylinux1_x86_64.whl (1.1 MB)
Collecting tokenizers==0.8.1.rc2
  Downloading tokenizers-0.8.1rc2-cp36-cp36m-manylinux1_x86_64.whl (3.0 MB)
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from transformers) (1.18.5)
Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from sacremoses->transformers) (1.15.0)
Requirement already satisfied: click in /usr/local/lib/python3.6/dist-packages (from sacremoses->transformers) (7.1.2)
Requirement already satisfied: joblib in /usr/local/lib/python3.6/dist-packages (from sacremoses->transformers) (0.16.0)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.6/dist-packages (from requests->transformers) (3.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.6/dist-packages (from requests->transformers) (2020.6.20)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.6/dist-packages (from requests->transformers) (2.10)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.6/dist-packages (from requests->transformers) (1.24.3)
Requirement already satisfied: pyparsing>=2.0.2 in /usr/local/lib/python3.6/dist-packages (from packaging->transformers) (2.4.7)
Building wheels for collected packages: sacremoses
  Building wheel for sacremoses (setup.py) ... done
  Created wheel for sacremoses: filename=sacremoses-0.0.43-cp36-none-any.whl size=893257
Successfully built sacremoses
Installing collected packages: sacremoses, sentencepiece, tokenizers, transformers
Successfully installed sacremoses-0.0.43 sentencepiece-0.1.91 tokenizers-0.8.1rc2 transformers-3.2.0
from transformers import *  # the wildcard import provides XLNetTokenizer, AdamW, etc.
#from google.colab import files
#uploaded = files.upload()
import torch
# from pytorch_transformers import XLNetTokenizer, XLNetForSequenceClassification, AdamW  # legacy package, superseded by transformers
from transformers import XLNetForSequenceClassification
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
from keras.preprocessing.sequence import pad_sequences
from torch.utils.data import TensorDataset, DataLoader, RandomSampler, SequentialSampler
Load the dataset and inspect sample data
fd = pd.read_csv(PATH, sep='\t')  # caution: without header=None, the first review is consumed as the column header (one sample is lost)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
fd.head()
|   | So there is no way for me to plug it in here in the US unless I go by a converter. | 0 |
|---|---|---|
| 0 | Good case, Excellent value. | 1 |
| 1 | Great for the jawbone. | 1 |
| 2 | Tied to charger for conversations lasting more... | 0 |
| 3 | The mic is great. | 1 |
| 4 | I have to jiggle the plug to get it to line up... | 0 |
Dataset preprocessing
fd.columns = ['sentence','value']
fd.head()
|   | sentence | value |
|---|---|---|
| 0 | Good case, Excellent value. | 1 |
| 1 | Great for the jawbone. | 1 |
| 2 | Tied to charger for conversations lasting more... | 0 |
| 3 | The mic is great. | 1 |
| 4 | I have to jiggle the plug to get it to line up... | 0 |
sentences = []
for sentence in fd['sentence']:
    sentence = sentence + "[SEP] [CLS]"  # append special-token strings at the END, per XLNet convention
    sentences.append(sentence)
sentences[0:3]
['Good case, Excellent value.[SEP] [CLS]',
'Great for the jawbone.[SEP] [CLS]',
'Tied to charger for conversations lasting more than 45 minutes.MAJOR PROBLEMS!![SEP] [CLS]']
Section 3. Training and Reporting Classification (Prediction) Accuracy
- Convert the text into tokens with a tokenizer based on the pre-trained XLNet model, using XLNet's vocabulary set.
- The resulting integer token ids are stored as index arrays and used for model training and testing.
Load the pre-trained model's tokenizer
tokenizer = XLNetTokenizer.from_pretrained('xlnet-base-cased', do_lower_case=True)  # note: do_lower_case=True lowercases input even though the model is cased
tokenized_text = [tokenizer.tokenize(sent) for sent in sentences]
Sample text
tokenized_text[100]
['▁buyer',
'▁be',
'ware',
',',
'▁you',
'▁could',
'▁flush',
'▁money',
'▁right',
'▁down',
'▁the',
'▁toilet',
'.',
'[',
's',
'ep',
']',
'▁[',
'cl',
's',
']']
ids = [tokenizer.convert_tokens_to_ids(x) for x in tokenized_text]
print(ids[100])
labels = fd['value'].values
print(labels[100])
[8689, 39, 3676, 19, 44, 121, 13179, 356, 203, 151, 18, 8976, 9, 10849, 23, 3882, 3158, 4145, 11974, 23, 3158]
0
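As the tokenized sample above shows, the literal strings "[SEP]" and "[CLS]" are split into ordinary subword pieces ('[', 's', 'ep', ']', ...) rather than mapped to XLNet's actual <sep>/<cls> special tokens. For comparison, a minimal sketch (assuming the transformers 3.x encode_plus API) that appends the real special tokens, converts to ids, and pads in one call:

# Hedged alternative: let the tokenizer handle XLNet's real <sep>/<cls> tokens itself.
enc = tokenizer.encode_plus(fd['sentence'][100],
                            add_special_tokens=True,  # appends <sep> and <cls> at the end
                            max_length=54,            # MAX_LEN, computed in the next cell
                            padding='max_length',     # pad to a fixed length
                            truncation=True)
print(enc['input_ids'])       # padded id sequence including the real special tokens
print(enc['attention_mask'])  # 1 for real tokens, 0 for padding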
Padding to unify token lengths: every sequence is padded to the length of the longest one.
max1 = len(ids[0])
for i in ids:
    if len(i) > max1:
        max1 = len(i)
print(max1)
MAX_LEN = max1
54
Apply padding
#input_ids2 = pad_sequences(ids,maxlen=MAX_LEN,dtype="long",truncating="post",padding="post")
import numpy as np
input_ids2 = np.load('drive/My Drive/Colab Notebooks/ids_93.npy')  # reload previously saved padded ids for reproducibility
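The padded id matrix is reloaded from a saved .npy file so earlier results can be reproduced exactly. If the file is unavailable, the commented-out call above regenerates it; a sketch (pad_sequences pads with 0 by default):

input_ids2 = pad_sequences(ids, maxlen=MAX_LEN, dtype="long",
                           truncating="post", padding="post")  # 0-padding appended at the end
print(input_ids2.shape)  # (number of sentences, MAX_LEN)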
Split the data into training and test sets.
#xtrain,xtest,ytrain,ytest = train_test_split(input_ids2,labels,test_size=0.15)
xtrain = np.load('drive/My Drive/Colab Notebooks/xtrain_95_bs3_ep7.npy')
ytrain = np.load('drive/My Drive/Colab Notebooks/ytrain_95_bs3_ep7.npy')
xtest = np.load('drive/My Drive/Colab Notebooks/xtest_95_bs3_ep7.npy')
ytest = np.load('drive/My Drive/Colab Notebooks/ytest_95_bs3_ep7.npy')
ytest[0:10]
array([0, 1, 0, 0, 1, 0, 0, 1, 0, 1])
# np.save('drive/My Drive/Colab Notebooks/ids_93.npy', input_ids2) # x_save.npy
# code that saves the indices so the same split can be re-tested later
PATH2 = "drive/My Drive/Colab Notebooks/model/XLnet-based-Custom-96"
Xtrain = torch.tensor(xtrain)
Ytrain = torch.tensor(ytrain)
Xtest = torch.tensor(xtest)
Ytest = torch.tensor(ytest)
# np.save('drive/My Drive/Colab Notebooks/xtrain_93_bs3_ep7.npy', xtrain) # x_save.npy
# np.save('drive/My Drive/Colab Notebooks/ytrain_93_bs3_ep7.npy', ytrain) # x_save.npy
# np.save('drive/My Drive/Colab Notebooks/ytest_93_bs3_ep7.npy', ytest) # x_save.npy
# np.save('drive/My Drive/Colab Notebooks/xtest_93_bs3_ep7.npy', xtest) # x_save.npy
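The saved arrays freeze one particular train/test split. A hypothetical alternative that makes the split reproducible without saving arrays is to pass a fixed random_state (42 below is an arbitrary choice):

xtrain, xtest, ytrain, ytest = train_test_split(
    input_ids2, labels, test_size=0.15, random_state=42)  # fixed seed -> same split every run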
Load the pre-trained model, configure training, and run it
Choose how much data goes into each iteration (batch size), the number of layers to train, and how long to train (epochs). batch_size: (default) 3
batch_size = 5  # default 3
no_train = 0    # running count of training examples seen
epochs = 5      # default 3
train_data = TensorDataset(Xtrain,Ytrain)
test_data = TensorDataset(Xtest,Ytest)
loader = DataLoader(train_data,batch_size=batch_size)
test_loader = DataLoader(test_data,batch_size=batch_size)
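RandomSampler and SequentialSampler are imported above but never used, so the training loader iterates the data in the same order every epoch. A sketch that shuffles training batches, using the standard torch.utils.data API:

# Shuffle training batches each epoch; keep the test order deterministic.
loader = DataLoader(train_data, sampler=RandomSampler(train_data), batch_size=batch_size)
test_loader = DataLoader(test_data, sampler=SequentialSampler(test_data), batch_size=batch_size)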
Run a performance test with the previously saved fine-tuned model.
def flat_accuracy(preds, labels):  # a function to compute accuracy
    correct = 0
    for i in range(0, len(labels)):
        if preds[i] == labels[i]:
            correct += 1
    return (correct / len(labels)) * 100
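For reference, the same accuracy can be computed without an explicit loop (a sketch using the NumPy already imported above; flat_accuracy_np is a hypothetical helper name):

def flat_accuracy_np(preds, labels):
    # element-wise comparison, then the mean of the boolean array, as a percentage
    return (np.asarray(preds) == np.asarray(labels)).mean() * 100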
model = XLNetForSequenceClassification.from_pretrained("xlnet-base-cased", num_labels=2)  # or 'xlnet-large-cased'
torch.cuda.empty_cache()
model.cuda()
optimizer = AdamW(model.parameters(), lr=2e-5)  # initial learning rate 0.00002
checkpoint = torch.load(PATH2)  # restore the saved fine-tuned weights and optimizer state
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
epoch = checkpoint['epoch']
loss = checkpoint['loss']
model.eval()
acc = []
lab = []
t = 0
for inp, lab1 in test_loader:
    inp = inp.to(device)    # .to() returns a new tensor; assign it back
    lab1 = lab1.to(device)
    t += lab1.size(0)
    outp1 = model(inp)
    acc.extend(p1.item() for p1 in torch.argmax(outp1[0], axis=1).flatten())  # predicted labels
    lab.extend(z1.item() for z1 in lab1)                                      # true labels
print("Total Examples : {} Accuracy {}".format(t, flat_accuracy(acc, lab)))
/usr/local/lib/python3.6/dist-packages/transformers/configuration_xlnet.py:211: FutureWarning: This config doesn't use attention memories, a core feature of XLNet. Consider setting `men_len` to a non-zero value, for example `xlnet = XLNetLMHeadModel.from_pretrained('xlnet-base-cased'', mem_len=1024)`, for accurate training performance as well as an order of magnitude faster inference. Starting from version 3.5.0, the default parameter will be 1024, following the implementation in https://arxiv.org/abs/1906.08237
FutureWarning,
Some weights of the model checkpoint at xlnet-base-cased were not used when initializing XLNetForSequenceClassification: ['lm_loss.weight', 'lm_loss.bias']
- This IS expected if you are initializing XLNetForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing XLNetForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of XLNetForSequenceClassification were not initialized from the model checkpoint at xlnet-base-cased and are newly initialized: ['sequence_summary.summary.weight', 'sequence_summary.summary.bias', 'logits_proj.weight', 'logits_proj.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Total Examples : 150 Accuracy 96.66666666666667
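One note on the evaluation loop above: it builds autograd graphs it never uses. Wrapping inference in torch.no_grad() reduces memory use without changing the result; a sketch of the same loop:

model.eval()
preds, truth = [], []
with torch.no_grad():                      # no gradient bookkeeping during inference
    for inp, lab1 in test_loader:
        logits = model(inp.to(device))[0]  # first output is the logits
        preds.extend(p.item() for p in torch.argmax(logits, axis=1).flatten())
        truth.extend(z.item() for z in lab1)
print("Total Examples : {} Accuracy {}".format(len(truth), flat_accuracy(preds, truth)))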
Section 4. Evidence of the Training Process
Download the pre-trained model, configure its parameters, and demonstrate the training process.
Model initialization
model = XLNetForSequenceClassification.from_pretrained("xlnet-base-cased", num_labels=2)  # or 'xlnet-large-cased'
torch.cuda.empty_cache()
model.cuda()
(The same FutureWarning and weight-initialization messages as in Section 3 are printed again.)
XLNetForSequenceClassification(
  (transformer): XLNetModel(
    (word_embedding): Embedding(32000, 768)
    (layer): ModuleList(
      (0): XLNetLayer(
        (rel_attn): XLNetRelativeAttention(
          (layer_norm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
          (dropout): Dropout(p=0.1, inplace=False)
        )
        (ff): XLNetFeedForward(
          (layer_norm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
          (layer_1): Linear(in_features=768, out_features=3072, bias=True)
          (layer_2): Linear(in_features=3072, out_features=768, bias=True)
          (dropout): Dropout(p=0.1, inplace=False)
        )
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (1)-(11): eleven further XLNetLayer blocks, identical in structure to (0)
    )
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (sequence_summary): SequenceSummary(
    (summary): Linear(in_features=768, out_features=768, bias=True)
    (first_dropout): Identity()
    (last_dropout): Dropout(p=0.1, inplace=False)
  )
  (logits_proj): Linear(in_features=768, out_features=2, bias=True)
)
optimizer = AdamW(model.parameters(),lr=2e-5) # initial learning rate 0.00002
import torch.nn as nn
criterion = nn.CrossEntropyLoss()
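An external CrossEntropyLoss works, but in the transformers 3.x API XLNetForSequenceClassification computes the same cross-entropy internally when labels are passed, returning (loss, logits, ...). A sketch on one batch:

batch_inputs, batch_labels = next(iter(loader))  # one training batch as a demonstration
loss, logits = model(batch_inputs.to(device),
                     labels=batch_labels.to(device))[:2]  # model returns (loss, logits, ...)
print(loss.item(), logits.shape)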
# flat_accuracy() is already defined in Section 3 and is reused here
Begin transfer learning (fine-tuning).
for epoch in range(epochs):
    model.train()
    loss1 = []
    steps = 0
    train_loss = []
    l = []
    for inputs, labels1 in loader:
        inputs = inputs.to(device)    # .to() returns a new tensor; assign it back
        labels1 = labels1.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs[0], labels1)
        train_loss.extend(p.item() for p in torch.argmax(outputs[0], axis=1).flatten())  # our predicted labels
        l.extend(z.item() for z in labels1)                                              # real labels
        loss.backward()
        optimizer.step()
        loss1.append(loss.item())
        no_train += inputs.size(0)
        steps += 1
    print("Current Loss is : {} Step is : {} number of Example : {} Accuracy : {}".format(loss.item(), epoch, no_train, flat_accuracy(train_loss, l)))
Current Loss is : 0.6339799165725708 Step is : 0 number of Example : 849 Accuracy : 51.9434628975265
Current Loss is : 0.8297128677368164 Step is : 1 number of Example : 1698 Accuracy : 86.45465253239105
Current Loss is : 0.020124394446611404 Step is : 2 number of Example : 2547 Accuracy : 91.51943462897526
Current Loss is : 0.007085510529577732 Step is : 3 number of Example : 3396 Accuracy : 97.1731448763251
Current Loss is : 0.005526750348508358 Step is : 4 number of Example : 4245 Accuracy : 98.70435806831567
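A caveat on the numbers above: the model is never given an attention_mask, so it also attends to padding positions. Since pad_sequences filled with 0s, a mask is a simple nonzero test; a sketch (passing it may shift the reported accuracies):

# 1 where a real token id sits, 0 where pad_sequences wrote a 0.
attn_test = torch.tensor((xtest != 0).astype("int64"))
masked_loader = DataLoader(TensorDataset(Xtest, attn_test, Ytest), batch_size=batch_size)
for inp, mask, lab1 in masked_loader:
    outp = model(inp.to(device), attention_mask=mask.to(device))
    break  # one batch as a demonstration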
Test results
model.eval()  # test the fine-tuned model
acc = []
lab = []
t = 0
for inp, lab1 in test_loader:
    inp = inp.to(device)
    lab1 = lab1.to(device)
    t += lab1.size(0)
    outp1 = model(inp)
    acc.extend(p1.item() for p1 in torch.argmax(outp1[0], axis=1).flatten())
    lab.extend(z1.item() for z1 in lab1)
print("Total Examples : {} Accuracy {}".format(t, flat_accuracy(acc, lab)))
Total Examples : 150 Accuracy 96.0
#!pip freeze > 'drive/My Drive/Colab Notebooks/requirement.txt'
Save the fine-tuned model
PATH2 = "drive/My Drive/Colab Notebooks/model/XLnet-based-Custom-96"
# torch.save(model.state_dict(), PATH2)
# torch.save({
# 'epoch': epoch,
# 'model_state_dict': model.state_dict(),
# 'optimizer_state_dict': optimizer.state_dict(),
# 'loss': loss,
# }, PATH2)
Reload and re-test the trained model
# model = TheModelClass(*args, **kwargs)
# optimizer = TheOptimizerClass(*args, **kwargs)
checkpoint = torch.load(PATH2)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
epoch = checkpoint['epoch']
loss = checkpoint['loss']
model.eval()
acc = []
lab = []
t = 0
for inp, lab1 in test_loader:
    inp = inp.to(device)
    lab1 = lab1.to(device)
    t += lab1.size(0)
    outp1 = model(inp)
    acc.extend(p1.item() for p1 in torch.argmax(outp1[0], axis=1).flatten())
    lab.extend(z1.item() for z1 in lab1)
print("Total Examples : {} Accuracy {}".format(t, flat_accuracy(acc, lab)))
Total Examples : 150 Accuracy 96.66666666666667