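11 Prompt Engineering in Quarto

Structured Prompt Management

Learn how to manage and reuse prompts systematically in a Quarto environment. Move beyond ad hoc conversational use and build a reproducible, scalable prompt system.

11.1 Structured Prompt Management

11.1.1 Building a Prompt Library

A directory structure for managing prompts systematically in a Quarto project: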

project/
├── prompts/
│   ├── academic/
│   │   ├── literature_review.yml
│   │   ├── methodology.yml
│   │   ├── discussion.yml
│   │   └── conclusion.yml
│   ├── analysis/
│   │   ├── data_summary.yml
│   │   ├── visualization.yml
│   │   └── interpretation.yml
│   └── writing/
│       ├── translate.yml
│       ├── proofread.yml
│       └── summarize.yml
├── _quarto.yml
└── chapters/
    ├── chapter1.qmd
    └── chapter2.qmd

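11.1.2 Prompt Template Example

# prompts/academic/literature_review.yml
name: "Literature review generation"
version: "1.2.0"
author: "이광춘"
created: "2024-08-01"
updated: "2024-08-26"

description: |
  Prompt for generating the literature review section of an academic paper.
  Writes a systematic review based on the supplied list of papers.

prompt_template: |
  You are an expert researcher in the field of {field}.

  Write a literature review based on the following papers:

  {paper_list}

  Review structure:
  1. **Current research trends**: the main research directions of the last three years
  2. **Methodological advances**: newly introduced research methods
  3. **Key findings**: results that had a major impact on the field
  4. **Research gaps**: areas that have not yet been explored enough
  5. **Future directions**: where research should go next

  Write each section in 300-400 words and include appropriate citations.
  Keep an academic tone while staying clear and easy to follow.

variables:
  - name: field
    type: string
    required: true
    description: "Research field (e.g., data science, biotechnology)"

  - name: paper_list
    type: text
    required: true
    description: "List of papers to review (title, authors, key content)"

model_config:
  model: "gpt-4"
  temperature: 0.7
  # Five sections of 300-400 words can exceed 2000 tokens;
  # raise this limit if responses come back truncated.
  max_tokens: 2000
  top_p: 0.9

quality_criteria:
  - Academic accuracy
  - Logical structure
  - Appropriate citations
  - Objective tone
  - Comprehensive coverage

tags: ["academic", "literature-review", "research", "writing"]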

11.1.3 Using Templates and Variables

Using a prompt template from a Quarto document:

# Prompt loading function
load_prompt <- function(prompt_file, variables = list()) {

  prompt_config <- yaml::read_yaml(prompt_file)

  # Variable substitution. fixed = TRUE matches the placeholder literally and
  # inserts the value as-is, so backslashes in a value are not read as
  # regex backreferences.
  prompt_text <- prompt_config$prompt_template
  for (var_name in names(variables)) {
    placeholder <- paste0("{", var_name, "}")
    value <- as.character(variables[[var_name]])
    prompt_text <- gsub(placeholder, value, prompt_text, fixed = TRUE)
  }

  return(list(
    prompt = prompt_text,
    config = prompt_config$model_config,
    metadata = list(
      name = prompt_config$name,
      version = prompt_config$version,
      file = prompt_file,
      timestamp = Sys.time()
    )
  ))
}

# Usage example
lit_review_prompt <- load_prompt(
  "prompts/academic/literature_review.yml",
  variables = list(
    field = "data science",
    paper_list = "
    1. Smith et al. (2024): Machine Learning in Healthcare
    2. Johnson et al. (2023): Deep Learning Applications
    3. Kim et al. (2024): AI Ethics in Practice
    "
  )
)
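
The same substitution step can be sketched in Python for projects that render with the Python engine. This is a minimal sketch, not part of the chapter's R code: the function name `fill_template` and its `required` parameter are assumptions introduced here. It replaces `{name}` placeholders literally, fails when a required variable is missing, and leaves unknown placeholders in place so they stand out during review.

```python
import re

# Matches placeholders such as {field} or {paper_list}.
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def fill_template(template: str, variables: dict, required: tuple = ()) -> str:
    """Replace {name} placeholders with values from `variables`.

    `fill_template` and `required` are hypothetical names for this sketch.
    """
    missing = [name for name in required if name not in variables]
    if missing:
        raise KeyError(f"missing required variables: {missing}")

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        # Unknown placeholders are left untouched rather than silently dropped.
        return str(variables.get(name, match.group(0)))

    # A function replacement inserts values literally (no backreference parsing).
    return PLACEHOLDER.sub(substitute, template)


prompt = fill_template(
    "You are an expert researcher in {field}.\n\n{paper_list}",
    {"field": "data science", "paper_list": "1. Smith et al. (2024)"},
    required=("field", "paper_list"),
)
print(prompt)
```

Passing a function to `re.sub` keeps the inserted text literal, which matters when a paper title or abstract happens to contain backslashes.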

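11.2 AI Integration in Code Chunks

11.2.1 AI Integration Through a Quarto Extension

import pandas as pd
import matplotlib.pyplot as plt

# Load the data
df = pd.read_csv("data/mtcars.csv")

# Basic statistics
summary_stats = df.describe()
print("Basic dataset statistics:")
print(summary_stats)

# The AI interprets these statistics and generates text
# (in a real implementation, a Quarto extension handles this)

11.2.2 Inline AI Calls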

According to the results, `{python} ai_summarize(results, style="academic")`.
This tends to agree with prior work `{python} ai_cite(topic="fuel_efficiency", year=2024)`.

11.2.3 Metadata-Driven AI Control

---
title: "AI-Assisted Data Analysis Report"
format: html

ai:
  default_model: "gpt-4"
  default_temperature: 0.7
  cache_enabled: true
  
  prompt_library: "prompts/"
  
  analysis_config:
    style: "academic"
    length: "medium"  
    include_caveats: true
    
  quality_checks:
    fact_verification: true
    citation_check: true
    consistency_check: true
---
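
One way to read this front matter is as a set of document-wide defaults that individual prompt files can override. The sketch below assumes that precedence (document defaults, then the prompt file's `model_config`, then per-call overrides); the chapter does not fix an order, and the function names `document_defaults` and `resolve_model_config` are hypothetical.

```python
def document_defaults(ai_meta: dict) -> dict:
    """Turn keys such as default_model into model settings (model, temperature)."""
    prefix = "default_"
    return {
        key[len(prefix):]: value
        for key, value in ai_meta.items()
        if key.startswith(prefix)
    }


def resolve_model_config(defaults: dict, prompt_config=None, overrides=None) -> dict:
    """Merge settings; later layers win (an assumed precedence, not a Quarto rule)."""
    resolved = dict(defaults)
    for layer in (prompt_config, overrides):
        if layer:
            # Skip None values so an unset option never erases a default.
            resolved.update({k: v for k, v in layer.items() if v is not None})
    return resolved


# The `ai:` block from the front matter above, as a plain dict.
ai_meta = {"default_model": "gpt-4", "default_temperature": 0.7, "cache_enabled": True}

config = resolve_model_config(
    document_defaults(ai_meta),
    prompt_config={"temperature": 0.2, "max_tokens": 2000},
)
print(config)
```

Keeping the merge in one function makes it easy to log the fully resolved settings next to each generated passage.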

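11.3 Multimodal Content Generation

11.3.1 Text + Image Generation Pipeline

import matplotlib.pyplot as plt
import pandas as pd

# perform_analysis, create_visualization, and generate_ai_content are
# project-specific helpers assumed to be defined elsewhere.
def create_multimodal_content(topic, data_path):
    """Pipeline that generates text and an image together."""

    # 1. Analyze the data
    df = pd.read_csv(data_path)
    analysis_result = perform_analysis(df)

    # 2. Create the visualization (the images/ directory must already exist)
    image_path = f"images/{topic}_analysis.png"
    fig, ax = plt.subplots(figsize=(10, 6))
    create_visualization(analysis_result, ax)
    fig.savefig(image_path, dpi=300, bbox_inches="tight")
    plt.close(fig)

    # 3. Generate the AI text
    prompt = "\n".join([
        "Write report text based on the following data analysis results:",
        "",
        f"Analysis topic: {topic}",
        f"Key findings: {analysis_result.summary}",
        f"Statistics: {analysis_result.stats}",
        "",
        "Refer to the attached graph in your explanation.",
    ])

    ai_text = generate_ai_content(prompt, include_image=image_path)

    # 4. Build the markdown document. Lines are joined without indentation so
    #    the result is not rendered as a code block, and the figure uses the
    #    same path the image was saved to.
    markdown_content = "\n\n".join([
        f"## {topic} Analysis Results",
        ai_text,
        f"![Analysis results]({image_path}){{#fig-analysis}}",
        "As @fig-analysis shows...",
    ])

    return markdown_content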

11.3.2 Automatic Code + Explanation Generation

# Complex data processing code
# (raw_data and custom_order are assumed to exist in the session)
library(dplyr)
library(ggplot2)

processed_data <- raw_data %>%
  filter(!is.na(value)) %>%
  group_by(category, year) %>%
  summarise(
    mean_value = mean(value, na.rm = TRUE),
    median_value = median(value, na.rm = TRUE),
    std_dev = sd(value, na.rm = TRUE),
    count = n(),
    .groups = "drop"
  ) %>%
  mutate(
    cv = std_dev / mean_value,  # coefficient of variation
    category = factor(category, levels = custom_order)
  )

# The AI generates a step-by-step explanation of this code

11.3.3 Combining Visualization + Narrative

import seaborn as sns
import matplotlib.pyplot as plt

# Create the visualization (df is the mtcars data frame loaded earlier)
plt.figure(figsize=(12, 8))
sns.scatterplot(data=df, x='hp', y='mpg', hue='cyl', size='wt')
plt.title('Car Performance Analysis: Horsepower vs. Fuel Economy')
plt.xlabel('Horsepower')
plt.ylabel('Fuel Economy (Miles per Gallon)')

# The AI generates a narrative for this graph
plt.show()

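11.4 Reusable Prompt Modules

11.4.1 Functional Prompt Design

# prompts/modules/citation.yml
name: "Citation generation module"
type: "function"

prompt_template: |
  Generate a citation in {citation_style} style from the following information:

  Authors: {authors}
  Title: {title}
  Journal: {journal}
  Year: {year}
  Pages: {pages}
  DOI: {doi}

variables:
  citation_style:
    type: "enum"
    options: ["APA", "MLA", "Chicago", "IEEE"]
    default: "APA"

# prompts/modules/data_description.yml
name: "Data description module"
type: "function"

prompt_template: |
  Describe the following dataset at a level suitable for a {audience} audience:

  Data shape: {data_shape}
  Variable list: {variables}
  Missing values: {missing_info}
  Notes: {notes}

  The description should cover:
  - The source of the data and how it was collected
  - The meaning and range of each variable
  - Data quality and limitations
  - Considerations for analysis

variables:
  audience:
    type: "enum"
    options: ["general public", "researcher", "policymaker"]
    default: "researcher"

11.4.2 Combining Modules

# Using modular prompts
# (use_prompt_module, get_missing_summary, create_summary_plots, and
#  identify_patterns are project helpers assumed to be defined elsewhere)
create_data_report <- function(data, title, audience = "researcher") {

  # Module 1: data description
  data_desc <- use_prompt_module(
    "prompts/modules/data_description.yml",
    variables = list(
      audience = audience,
      data_shape = paste(dim(data), collapse = " × "),
      variables = paste(names(data), collapse = ", "),
      missing_info = get_missing_summary(data),
      notes = "None"  # fills the {notes} placeholder in the template
    )
  )

  # Module 2: statistical summary
  stats_summary <- use_prompt_module(
    "prompts/modules/statistical_summary.yml",
    variables = list(
      summary_stats = summary(data),
      data_types = sapply(data, class)
    )
  )

  # Module 3: visualization narrative
  viz_desc <- use_prompt_module(
    "prompts/modules/visualization_narrative.yml",
    variables = list(
      plot_path = create_summary_plots(data),
      key_patterns = identify_patterns(data)
    )
  )

  # Assemble the final report. R has no triple-quoted strings, so this is an
  # ordinary multi-line string; glue trims the surrounding blank lines and
  # the common indentation.
  report <- glue::glue("
  # {title}

  ## Data Overview
  {data_desc$content}

  ## Statistical Summary
  {stats_summary$content}

  ## Visualization Analysis
  {viz_desc$content}

  ---
  *This report was generated with the help of AI tools.*
  ")

  return(report)
}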
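
The `variables` blocks in the module files above declare an enum type with `options` and a `default`. Before a module is filled in, those declarations can be enforced. The following Python sketch shows one way to do it; `resolve_variables` is a hypothetical helper, and the spec dictionary simply mirrors the mapping form used in `citation.yml`.

```python
def resolve_variables(spec: dict, supplied: dict) -> dict:
    """Apply defaults and check enum options for each declared variable."""
    resolved = {}
    for name, rules in spec.items():
        if name in supplied:
            value = supplied[name]
        elif "default" in rules:
            value = rules["default"]
        else:
            raise KeyError(f"no value and no default for variable '{name}'")

        options = rules.get("options")
        if rules.get("type") == "enum" and options and value not in options:
            raise ValueError(f"'{name}' must be one of {options}, got {value!r}")
        resolved[name] = value
    return resolved


# Mirrors the `variables` block of prompts/modules/citation.yml.
CITATION_SPEC = {
    "citation_style": {
        "type": "enum",
        "options": ["APA", "MLA", "Chicago", "IEEE"],
        "default": "APA",
    }
}

print(resolve_variables(CITATION_SPEC, {}))
print(resolve_variables(CITATION_SPEC, {"citation_style": "IEEE"}))
```

Rejecting an unsupported option before the API call is cheaper than discovering it in a malformed citation afterwards.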

11.5 Hands-On: Building a Prompt Library

11.5.1 Step 1: Create the Basic Directory Structure

# Run from the project root (brace expansion requires bash or zsh)
mkdir -p prompts/{academic,analysis,writing,modules}
mkdir -p cache/ai_responses
mkdir -p logs/ai_usage
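
On systems without a brace-expanding shell, the same layout can be created from Python. This is a minimal sketch using only the standard library; `create_prompt_layout` is a hypothetical name, and the demo writes into a temporary directory rather than a real project.

```python
import tempfile
from pathlib import Path

# The directories created by the shell commands above.
PROMPT_DIRS = [
    "prompts/academic",
    "prompts/analysis",
    "prompts/writing",
    "prompts/modules",
    "cache/ai_responses",
    "logs/ai_usage",
]


def create_prompt_layout(root):
    """Create the directories under `root`; safe to run repeatedly."""
    created = []
    for relative in PROMPT_DIRS:
        path = Path(root) / relative
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


# Demo in a throwaway directory; pass "." to use the current project root.
with tempfile.TemporaryDirectory() as tmp:
    for path in create_prompt_layout(tmp):
        print(path.relative_to(tmp))
```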

11.5.2 Step 2: Create the Configuration File

# .ai-config.yml
project_name: "my-research-project"
version: "1.0.0"

models:
  primary: "gpt-4"
  fallback: "claude-3-opus"
  
prompt_library: "prompts/"
cache_directory: "cache/ai_responses/"
log_directory: "logs/ai_usage/"

default_settings:
  temperature: 0.7
  max_tokens: 2000
  cache_enabled: true
  human_review_required: true

quality_thresholds:
  minimum_confidence: 0.8
  factcheck_required: true
  citation_verify: true
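
With `cache_enabled: true`, each response needs a stable key so that re-rendering a document reuses earlier output instead of calling the model again. The sketch below derives that key from the filled prompt and the model settings. The helper names `cache_key` and `cache_path` and the one-JSON-file-per-response layout are assumptions made here, not something the configuration file prescribes.

```python
import hashlib
import json
from pathlib import Path


def cache_key(prompt_text: str, model_config: dict) -> str:
    """Hash the prompt and settings; sort_keys makes the key order-independent."""
    payload = json.dumps(
        {"prompt": prompt_text, "config": model_config},
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cache_path(cache_dir: str, prompt_text: str, model_config: dict) -> Path:
    """Location of the cached response (assumed layout: one JSON file per key)."""
    return Path(cache_dir) / f"{cache_key(prompt_text, model_config)}.json"


config = {"model": "gpt-4", "temperature": 0.7, "max_tokens": 2000}
print(cache_path("cache/ai_responses/", "Summarize the mtcars dataset.", config))
```

Because the settings are part of the key, changing the temperature or the model produces a new cache entry rather than returning a stale response.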

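11.5.3 Step 3: Prompt Validation System

# scripts/validate_prompts.py
from pathlib import Path

import yaml  # PyYAML


def validate_prompt_file(prompt_path):
    """Validate the format and content of a prompt file."""

    try:
        with open(prompt_path, 'r', encoding='utf-8') as f:
            prompt_data = yaml.safe_load(f)
    except yaml.YAMLError as e:
        return {"valid": False, "error": f"YAML parse error: {e}"}

    # An empty file parses to None; anything but a mapping is invalid
    if not isinstance(prompt_data, dict):
        return {"valid": False, "error": "top level must be a YAML mapping"}

    # Check required fields. The module files in this chapter (type: "function")
    # carry no model_config, so it is required only for standalone prompts.
    required_fields = ["name", "prompt_template"]
    if prompt_data.get("type") != "function":
        required_fields.append("model_config")
    missing_fields = [field for field in required_fields if field not in prompt_data]

    if missing_fields:
        return {"valid": False, "error": f"missing required fields: {missing_fields}"}

    # Validate variables. Accept both the list form (literature_review.yml)
    # and the mapping form (module files) used in this chapter.
    variables = prompt_data.get("variables") or []
    if isinstance(variables, dict):
        variables = [{"name": name, **(spec or {})} for name, spec in variables.items()]
    for var in variables:
        if not isinstance(var, dict) or "name" not in var or "type" not in var:
            return {"valid": False, "error": f"incomplete variable definition: {var}"}

    return {"valid": True, "metadata": prompt_data}


def validate_all_prompts():
    """Validate every prompt file."""
    prompt_dir = Path("prompts")
    results = {}

    for prompt_file in prompt_dir.rglob("*.yml"):
        result = validate_prompt_file(prompt_file)
        results[str(prompt_file)] = result

    return results


if __name__ == "__main__":
    validation_results = validate_all_prompts()

    valid_count = sum(1 for r in validation_results.values() if r["valid"])
    total_count = len(validation_results)

    print(f"Prompt validation finished: {valid_count}/{total_count} passed")

    for file_path, result in validation_results.items():
        if not result["valid"]:
            print(f"❌ {file_path}: {result['error']}")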
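
A further check worth adding to the validator is whether the placeholders used in `prompt_template` match the declared variables. The sketch below compares the two sets; `check_placeholders` and its helpers are hypothetical names, and the function takes an already parsed prompt dictionary so it does not depend on a YAML library.

```python
import re


def placeholder_names(template: str) -> set:
    """Names used as {name} placeholders in the template."""
    return set(re.findall(r"\{(\w+)\}", template))


def declared_names(variables) -> set:
    """Declared variable names, for either the list form or the mapping form."""
    if isinstance(variables, dict):
        return set(variables)
    return {
        var["name"]
        for var in (variables or [])
        if isinstance(var, dict) and "name" in var
    }


def check_placeholders(prompt_data: dict) -> dict:
    """Report placeholders without a declaration and declarations never used."""
    used = placeholder_names(prompt_data.get("prompt_template", ""))
    declared = declared_names(prompt_data.get("variables"))
    return {
        "undeclared": sorted(used - declared),
        "unused": sorted(declared - used),
    }


report = check_placeholders({
    "prompt_template": "Generate a {citation_style} citation for {title}.",
    "variables": {"citation_style": {"type": "enum"}},
})
print(report)
```

Run against the module files in this chapter, a check like this would flag placeholders such as `{authors}` and `{doi}` that appear in the template without a matching entry under `variables`.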

11.5.4 Step 4: Create a Quarto Extension

-- _extensions/ai-assistant/ai-assistant.lua
-- load_prompt_config, substitute_variables, call_ai_api, cache_response, and
-- log_ai_usage are helper functions assumed to be defined elsewhere.
function generate_ai_content(prompt_file, variables)
  -- Load the prompt
  local prompt_config = load_prompt_config(prompt_file)

  -- Substitute variables
  local filled_prompt = substitute_variables(prompt_config.prompt_template, variables)

  -- Call the AI API
  local response = call_ai_api(filled_prompt, prompt_config.model_config)

  -- Cache the result
  cache_response(response, prompt_file, variables)

  -- Record metadata
  log_ai_usage(prompt_file, response.token_usage)

  return response.content
end

-- Register as a Quarto filter
return {
  {
    CodeBlock = function(elem)
      if elem.classes:includes("ai-generate") then
        local prompt_file = elem.attributes["ai-prompt-file"]
        -- Attribute values are plain strings; substitute_variables is
        -- assumed to parse this value into individual variables.
        local variables = elem.attributes["ai-variables"]

        if prompt_file then
          local ai_content = generate_ai_content(prompt_file, variables)
          return pandoc.CodeBlock(ai_content)
        end
      end
      return elem
    end
  }
}

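11.6 Quality Management and Monitoring

11.6.1 Tracking Prompt Performance

library(dplyr)
library(ggplot2)

# Prompt performance monitoring
# (calculate_cost and calculate_quality_score are project helpers assumed
#  to be defined elsewhere)
track_prompt_performance <- function(prompt_file, response, human_rating = NA) {

  log_file <- "logs/prompt_performance.csv"
  dir.create(dirname(log_file), recursive = TRUE, showWarnings = FALSE)

  # One-row data frame. NA rather than NULL keeps the human_rating column
  # present, so every appended row has the same columns.
  log_entry <- data.frame(
    timestamp = format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
    prompt_file = prompt_file,
    model = response$model,
    tokens_used = response$usage$total_tokens,
    response_time = response$response_time,
    cost = calculate_cost(response$usage),
    human_rating = human_rating,
    auto_quality_score = calculate_quality_score(response$content),
    stringsAsFactors = FALSE
  )

  # Append to the log file; write the header only when creating it
  write_header <- !file.exists(log_file)
  write.table(
    log_entry,
    log_file,
    append = !write_header,
    sep = ",",
    row.names = FALSE,
    col.names = write_header
  )
}

# Performance analysis dashboard
generate_performance_dashboard <- function() {

  performance_data <- read.csv("logs/prompt_performance.csv")

  # Compare performance by prompt
  prompt_summary <- performance_data %>%
    group_by(prompt_file) %>%
    summarise(
      avg_quality = mean(auto_quality_score, na.rm = TRUE),
      avg_cost = mean(cost, na.rm = TRUE),
      usage_count = n(),
      .groups = "drop"
    )

  # Visualization
  p1 <- ggplot(prompt_summary, aes(x = avg_cost, y = avg_quality)) +
    geom_point(aes(size = usage_count)) +
    geom_text(aes(label = basename(prompt_file)), vjust = -0.5) +
    labs(title = "Prompt Quality vs. Cost",
         x = "Average cost", y = "Average quality score")

  dir.create("reports", showWarnings = FALSE)
  ggsave("reports/prompt_performance_dashboard.png", p1)

  return(prompt_summary)
}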
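
The tracking function above relies on a `calculate_cost` helper that the chapter does not define. As a rough Python sketch of what such a helper and the log append could look like: the per-token prices below are placeholders, not real vendor pricing, and the usage keys `prompt_tokens` and `completion_tokens` are assumed field names that depend on the API in use.

```python
import csv
import tempfile
from datetime import datetime, timezone
from pathlib import Path

# Placeholder prices per 1,000 tokens. NOT real vendor pricing; replace them.
PRICE_PER_1K = {"prompt": 0.01, "completion": 0.03}

FIELDS = ["timestamp", "prompt_file", "model", "total_tokens", "cost", "human_rating"]


def calculate_cost(usage: dict, prices: dict = PRICE_PER_1K) -> float:
    """Estimate cost from token counts (assumed usage keys)."""
    cost = (
        usage["prompt_tokens"] / 1000 * prices["prompt"]
        + usage["completion_tokens"] / 1000 * prices["completion"]
    )
    return round(cost, 6)


def append_log(log_path, prompt_file, model, usage, human_rating=None):
    """Append one row to the CSV log, writing the header on first use."""
    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    write_header = not log_path.exists()
    row = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prompt_file": prompt_file,
        "model": model,
        "total_tokens": usage["prompt_tokens"] + usage["completion_tokens"],
        "cost": calculate_cost(usage),
        "human_rating": human_rating,  # None is written as an empty cell
    }
    with log_path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow(row)
    return row


# Demo in a temporary directory so no project files are touched.
with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "logs" / "prompt_performance.csv"
    usage = {"prompt_tokens": 1200, "completion_tokens": 800}
    append_log(log_file, "prompts/academic/literature_review.yml", "gpt-4", usage)
    append_log(log_file, "prompts/academic/literature_review.yml", "gpt-4", usage, human_rating=4)
    print(log_file.read_text(encoding="utf-8"))
```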

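11.7 Next Steps

The next chapter shows how to integrate the prompt system built here into a real AI workflow. You will build a complete AI assistant system that includes real-time collaboration, automated quality checks, and a continuous improvement cycle.

Exercise

Pick one AI task you perform often and turn it into a reusable prompt template. The goal is to include variable substitution, quality criteria, and performance tracking.

Advanced Tip

As a prompt library grows, you need to consider the dependencies between prompts and how they can be combined. It is best to start with a modular design and build up to more complex workflows step by step.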
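
Dependencies between prompt modules can be made explicit and resolved automatically. The sketch below assumes each module lists the modules whose output it needs; that dependency mapping is a hypothetical addition, since the YAML files in this chapter have no such field. It uses the standard library's `graphlib` (Python 3.9 or later) to compute a valid execution order and to detect circular dependencies.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(dependencies: dict) -> list:
    """Return module names ordered so each runs after the modules it depends on."""
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical dependency map: module name -> modules it depends on.
DEPENDENCIES = {
    "report": {"data_description", "statistical_summary"},
    "statistical_summary": {"data_description"},
}

print(execution_order(DEPENDENCIES))

try:
    execution_order({"a": {"b"}, "b": {"a"}})
except CycleError as error:
    print(f"circular dependency: {error.args[1]}")
```

Running the check as part of prompt validation catches circular references before any API call is made.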