3.4 Fine-tuning Gemma-2B-it on a Single GPU

  • Fine-tune Gemma-2B-it

3.4.1 Set Up Runpod
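Once the Runpod GPU pod is running, it is worth confirming that PyTorch actually sees the GPU before loading Gemma. A minimal sanity check, assuming a standard PyTorch pod image:

import torch

# Verify that a CUDA device is visible and report its name and memory size.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print("Device:", props.name)
    print("Total VRAM (GB):", round(props.total_memory / 1024**3, 1))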

3.4.2 Model Preparation

  1. Create a Hugging Face account
  2. Create an access token
  3. Use the model: https://huggingface.co/google/gemma-2b-it
  4. Run the code below:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "google/gemma-2b-it"

# Load the pretrained causal language model.
model = AutoModelForCausalLM.from_pretrained(
    model_name,                   # Name or path of the pretrained model.
    use_cache=False,              # Disable KV caching to save memory (re-enabled for generation later).
    device_map="auto",            # Automatically place the model on the best available device(s).
    torch_dtype=torch.bfloat16,   # bfloat16 precision reduces memory usage with little quality loss.
    low_cpu_mem_usage=True,       # Reduce CPU memory usage while loading a large model.
    attn_implementation="eager",  # Use the eager attention implementation for compatibility.
)

# Load the tokenizer that matches the model.
tokenizer = AutoTokenizer.from_pretrained(model_name)
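Gemma is a gated model on the Hugging Face Hub, so the token created in step 2 must be supplied before from_pretrained can download the weights. A minimal sketch using the huggingface_hub login helper; reading the token from an HF_TOKEN environment variable is an assumption, not part of the original post:

import os
from huggingface_hub import login

# Authenticate with the access token from step 2 so the gated
# google/gemma-2b-it weights can be downloaded.
# The HF_TOKEN environment variable name is a placeholder.
login(token=os.environ["HF_TOKEN"])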

3.4.3 Prepare Dataset
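The dataset code is not listed in this excerpt; the sketch below shows one plausible way to load a Korean summarization corpus with the datasets library and reshape it into the same user/assistant chat format used at inference time. The dataset ID and column names are placeholders, not the ones used by the author:

from datasets import load_dataset

# Placeholder dataset ID; substitute the actual summarization corpus used for fine-tuning.
dataset = load_dataset("your-namespace/korean-news-summarization", split="train")

def to_chat_format(example):
    # Pair each article with its reference summary as a user/assistant exchange.
    return {
        "messages": [
            {"role": "user", "content": example["document"]},      # "document" is a placeholder column name.
            {"role": "assistant", "content": example["summary"]},  # "summary" is a placeholder column name.
        ]
    }

dataset = dataset.map(to_chat_format)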

3.4.4 Checking the Gemma Model's Capabilities

Keyword Extraction

def change_inference_chat_format(input_text):
    # Build a chat-style message list (alternating user/assistant turns) in the
    # structured format that Gemma-2B-it's chat template expects.
    return [
        {"role": "user", "content": f"{input_text}"},  # The user's input, passed in dynamically.
        # The assistant turn is prefilled with a short Korean news passage
        # (a fatal wrong-way accident in Busan involving a high-school delivery-motorcycle rider)
        # so the model has concrete text to extract keywords from.
        {"role": "assistant", "content": """부산의 한 왕복 2차선 도로에서 역주행 사고로 배달 오토바이 운전자인 고등학생이 숨지는 사고가 발생했다.
유족은 '가해자가 사고 후 곧바로 신고하지 않고 늑장 대응해 피해를 키웠다'고 주장하고 있다."""},
        # The user then asks for five important keywords ("중요한 키워드 5개를 뽑아주세요.").
        {"role": "user", "content": "중요한 키워드 5개를 뽑아주세요."},
        {"role": "assistant", "content": ""},  # Left empty; the model generates this turn.
    ]

# Build the conversation prompt from input_text (the news article defined earlier in the post).
prompt = change_inference_chat_format(input_text)

# Apply the chat template to turn the message list into model-ready input IDs.
inputs = tokenizer.apply_chat_template(
    prompt,                      # The chat exchange built above.
    tokenize=True,               # Return token IDs rather than a formatted string.
    add_generation_prompt=True,  # Append the tokens that start the assistant's turn.
    return_tensors="pt"          # Return PyTorch tensors.
).to(model.device)               # Move the input to the model's device (e.g., GPU).

# Generate the assistant's response with the model.
outputs = model.generate(
    input_ids=inputs,    # Tokenized chat prompt (already on the model's device).
    max_new_tokens=256   # Cap the length of the generated response.
)

# Decode the generated sequence and print it, skipping special tokens such as <bos>/<eos>.
print(tokenizer.decode(
    outputs[0],               # The first (and only) generated sequence.
    skip_special_tokens=True  # Strip special tokens from the output.
))
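Note that outputs[0] contains the prompt tokens followed by the newly generated ones. If only the model's answer is needed, the prompt length can be sliced off first; a small variation on the code above, not shown in the original post:

# Keep only the tokens generated after the prompt.
generated = outputs[0][inputs.shape[-1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))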

Text Summarization

# input_text: the same article as defined above

def change_inference_chat_format(input_text):
    return [
        {"role": "user", "content": f"{input_text}"},
        # Prefill the assistant turn with "한국어 요약:\n" ("Korean summary:")
        # to steer the model toward producing a Korean-language summary.
        {"role": "assistant", "content": "한국어 요약:\n"}
    ]

# Apply the chat template
prompt = change_inference_chat_format(input_text)

# Generate the summary
inputs = tokenizer.apply_chat_template(prompt, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=256, use_cache=True)  # Re-enable the KV cache for faster generation.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
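Prefilling the final assistant turn with "한국어 요약:" ("Korean summary:") is a small prompt-engineering choice: the phrase places an explicit summarization cue in the chat context, so when the generation prompt is appended the model tends to respond with a Korean-language summary of the article rather than some other continuation.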

 
