Day 25: Review of TeddyNote's RAG Secret Notes (RAG 비법노트) course, RAG from GPT to Local Models with LangChain (Memory 2)

By 말하는 감자에요, 2025. 9. 7. 15:25

This is my study log for Day 25.

Yesterday (Day 24) I covered the most basic memory types for conversational AI: ConversationBufferMemory, ConversationBufferWindowMemory, ConversationTokenBufferMemory, and ConversationEntityMemory.

Today continues from there with some more advanced memory techniques: ConversationKGMemory, which structures the conversation as a knowledge graph, and ConversationSummaryMemory and ConversationSummaryBufferMemory, which compress the conversation into a summary so it can be managed efficiently.

These three go beyond simply piling up a conversation log. They address two practical problems: managing long-term context and using tokens efficiently.

1. ConversationKGMemory

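📌 Concept

ConversationKGMemory extracts information from the conversation as triples (subject - relation - object) and stores them as a knowledge graph.
Rather than just accumulating the conversation as text,
👉 it represents "who is related to whom, and how" in a structured way.
For example, when the sentence
"Alice is Bob's coworker and Charlie's boss."
comes up in the conversation, it is stored as:
- (Alice, coworker, Bob)
- (Alice, boss, Charlie)

In other words, the conversation context can be handed to the model in graph form, which is useful for fact- and relationship-based question answering.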
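Before the LangChain code, here is a minimal sketch of the triple idea in plain Python. It is not how ConversationKGMemory is implemented; the `TripleStore` class and its method names are hypothetical, and the triples are typed in by hand instead of being extracted by an LLM.

```python
from collections import defaultdict


class TripleStore:
    """Toy knowledge graph (hypothetical): subject -> list of (relation, object)."""

    def __init__(self):
        self._edges = defaultdict(list)

    def add(self, subject, relation, obj):
        """Store one (subject, relation, object) triple."""
        self._edges[subject].append((relation, obj))

    def about(self, subject):
        """Return a one-line text summary of everything known about `subject`."""
        facts = [f"{subject} {relation} {obj}." for relation, obj in self._edges.get(subject, [])]
        return f"On {subject}: " + " ".join(facts) if facts else ""


kg = TripleStore()
kg.add("Alice", "is a coworker of", "Bob")
kg.add("Alice", "is the boss of", "Charlie")

print(kg.about("Alice"))
# On Alice: Alice is a coworker of Bob. Alice is the boss of Charlie.
```

Looking facts up by entity, instead of replaying the whole log, is what makes this kind of memory a good match for "who is X?" style questions.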

Let's look at it in code.

Code example (LangChain)

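from langchain_openai import ChatOpenAI
from langchain.memory import ConversationKGMemory

llm = ChatOpenAI(temperature=0)

# The LLM is used to extract triples from each saved turn.
memory = ConversationKGMemory(llm=llm, return_messages=True)

print(memory)

# "This is 김셜리, who lives in Panygo." / "Who is 김셜리?"
memory.save_context(
    {"input": "이쪽은 Panygo에 거주중인 김셜리씨 입니다."},
    {"output": "김셜리씨는 누구시죠?"}
)

# "김셜리 is a new developer at our company." / "Nice to meet you."
memory.save_context(
    {"input": "김셜리씨는 우리 회사의 신입 개발자입니다."},
    {"output": "만나서 반갑습니다."}
)

# Ask "Who is 김셜리?" and load the facts stored about that entity.
memory.load_memory_variables({"inputs": "김셜리씨는 누구?"})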

Output:

{'history': [SystemMessage(content='On Panygo: Panygo has resident 김셜리씨.', additional_kwargs={}, response_metadata={}),
SystemMessage(content='On 김셜리씨: 김셜리씨 is a 신입 개발자. 김셜리씨 is in 우리 회사.', additional_kwargs={}, response_metadata={})]}

Next, let's set ConversationKGMemory as the memory of a ConversationChain, have a conversation, and then check the memory.

from langchain.prompts.prompt import PromptTemplate
from langchain.chains import ConversationChain

llm = ChatOpenAI(temperature=0)

template = """The following is a friendly conversation between a human and an AI. 
The AI is talkative and provides lots of specific details from its context. 
If the AI does not know the answer to a question, it truthfully says it does not know. 
The AI ONLY uses information contained in the "Relevant Information" section and does not hallucinate.

Relevant Information:

{history}

Conversation:
Human: {input}
AI:"""
prompt = PromptTemplate(
    input_variables=["history", "input"], template=template)

conversation_with_kg = ConversationChain(
    llm=llm, prompt=prompt, memory=ConversationKGMemory(llm=llm)
)

conversation_with_kg.predict(
    input="My name is Teddy. Shirley is a coworker of mine, and she's a new designer at our company."
)

Output:

"Hello Teddy! It's nice to meet you. Shirley must be excited to be starting a new job as a designer at your company. I hope she's settling in well and getting to know everyone. If you need any tips on how to make her feel welcome or help her adjust to the new role, feel free to ask me!"

# Ask about Shirley
conversation_with_kg.memory.load_memory_variables({"input": "who is Shirley?"})

Output:

{'history': 'On Shirley: Shirley is a coworker. Shirley is a new designer. Shirley is at company.'}
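To make it concrete what the model ends up seeing, the prompt can be filled in by hand. This is only a sketch: it uses a shortened copy of the template above and plain `str.format`, on the assumption that PromptTemplate's default format substitutes `{history}` and `{input}` in essentially the same way.

```python
# Shortened copy of the template above (same two placeholders).
template = """Relevant Information:

{history}

Conversation:
Human: {input}
AI:"""

# History string exactly as returned by load_memory_variables in the run above.
history = "On Shirley: Shirley is a coworker. Shirley is a new designer. Shirley is at company."

# Substitute the memory contents and the new question into the prompt.
filled = template.format(history=history, input="who is Shirley?")
print(filled)
```

Only the facts attached to the entity in the question go into "Relevant Information", not the full conversation log.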

2. ConversationSummaryMemory

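📌 Concept

A memory that compresses the entire conversation log into a summary.
Each time a new turn arrives, it generates an updated summary from the existing summary plus the latest exchange and stores it.
This keeps long-term context lightweight while greatly reducing token usage.

When to use it

Customer support / concierge / agent scenarios where a single session runs long.
A good fit when you want to "keep only the core context and drop the fluff".

Key parameters

llm: the model that generates and updates the summary (required)
return_messages: whether to return history as a list of messages (the default is a string)
prompt: lets you customize the summarization rules (no guessing, include key facts, etc.)

How it works

summary (existing summary) + new input/output → passed to the LLM
LLM generates a new summary → summary is replaced
On the next turn, only the summary + the latest exchange are used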
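The update loop described above can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: `RollingSummary` is a hypothetical class, and `toy_summarizer` stands in for the LLM call by simply keeping the last 12 words, so it shows the data flow and not real summarization quality.

```python
from typing import Callable


def toy_summarizer(previous_summary: str, new_lines: str) -> str:
    """Hypothetical stand-in for the LLM call: keep only the last 12 words."""
    words = (previous_summary + " " + new_lines).split()
    return " ".join(words[-12:])


class RollingSummary:
    """Keeps a single running summary instead of the full conversation log."""

    def __init__(self, summarize: Callable[[str, str], str]):
        self.summarize = summarize
        self.summary = ""

    def save_context(self, human: str, ai: str) -> None:
        # existing summary + newest exchange -> new summary (replaces the old one)
        new_lines = f"Human: {human} AI: {ai}"
        self.summary = self.summarize(self.summary, new_lines)


memory = RollingSummary(toy_summarizer)
memory.save_context("How much is the package?", "The base price is 3,500 euros.")
memory.save_context("Is insurance included?", "Yes, basic travel insurance is included.")
print(memory.summary)
```

The stored state never grows beyond what the summarizer returns, which is why the history stays small however long the session runs.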

Code example (LangChain)

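from langchain.memory import ConversationSummaryMemory
from langchain_openai import ChatOpenAI

# The LLM passed here writes and updates the running summary.
memory = ConversationSummaryMemory(
    llm=ChatOpenAI(model='gpt-4o', temperature=0), return_messages=True
)

# The sample dialogue below is in Korean: a customer asks about a European
# package tour (price, sights, insurance, seat upgrade, hotels, meals,
# deposit and cancellation policy).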

memory.save_context(
    inputs={"human": "유럽 여행 패키지의 가격은 얼마인가요?"},
    outputs={
        "ai": "유럽 14박 15일 패키지의 기본 가격은 3,500유로입니다. 이 가격에는 항공료, 호텔 숙박비, 지정된 관광지 입장료가 포함되어 있습니다. 추가 비용은 선택하신 옵션 투어나 개인 경비에 따라 달라집니다."
    },
)
memory.save_context(
    inputs={"human": "여행 중에 방문할 주요 관광지는 어디인가요?"},
    outputs={
        "ai": "이 여행에서는 파리의 에펠탑, 로마의 콜로세움, 베를린의 브란덴부르크 문, 취리히의 라이네폴 등 유럽의 유명한 관광지들을 방문합니다. 각 도시의 대표적인 명소들을 포괄적으로 경험하실 수 있습니다."
    },
)
memory.save_context(
    inputs={"human": "여행자 보험은 포함되어 있나요?"},
    outputs={
        "ai": "네, 모든 여행자에게 기본 여행자 보험을 제공합니다. 이 보험은 의료비 지원, 긴급 상황 발생 시 지원 등을 포함합니다. 추가적인 보험 보장을 원하시면 상향 조정이 가능합니다."
    },
)
memory.save_context(
    inputs={
        "human": "항공편 좌석을 비즈니스 클래스로 업그레이드할 수 있나요? 비용은 어떻게 되나요?"
    },
    outputs={
        "ai": "항공편 좌석을 비즈니스 클래스로 업그레이드하는 것이 가능합니다. 업그레이드 비용은 왕복 기준으로 약 1,200유로 추가됩니다. 비즈니스 클래스에서는 더 넓은 좌석, 우수한 기내식, 그리고 추가 수하물 허용량 등의 혜택을 제공합니다."
    },
)
memory.save_context(
    inputs={"human": "패키지에 포함된 호텔의 등급은 어떻게 되나요?"},
    outputs={
        "ai": "이 패키지에는 4성급 호텔 숙박이 포함되어 있습니다. 각 호텔은 편안함과 편의성을 제공하며, 중심지에 위치해 관광지와의 접근성이 좋습니다. 모든 호텔은 우수한 서비스와 편의 시설을 갖추고 있습니다."
    },
)
memory.save_context(
    inputs={"human": "식사 옵션에 대해 더 자세히 알려주실 수 있나요?"},
    outputs={
        "ai": "이 여행 패키지는 매일 아침 호텔에서 제공되는 조식을 포함하고 있습니다. 점심과 저녁 식사는 포함되어 있지 않아, 여행자가 자유롭게 현지의 다양한 음식을 경험할 수 있는 기회를 제공합니다. 또한, 각 도시별로 추천 식당 리스트를 제공하여 현지의 맛을 최대한 즐길 수 있도록 도와드립니다."
    },
)
memory.save_context(
    inputs={"human": "패키지 예약 시 예약금은 얼마인가요? 취소 정책은 어떻게 되나요?"},
    outputs={
        "ai": "패키지 예약 시 500유로의 예약금이 필요합니다. 취소 정책은 예약일로부터 30일 전까지는 전액 환불이 가능하며, 이후 취소 시에는 예약금이 환불되지 않습니다. 여행 시작일로부터 14일 전 취소 시 50%의 비용이 청구되며, 그 이후는 전액 비용이 청구됩니다."
    },
)

Check the history stored in the memory.

You can see a compressed summary of the entire conversation so far.

# Check the stored memory
print(memory.load_memory_variables({})["history"])

Output:

[SystemMessage(content="The human asks about the price of a European travel package. The AI responds that the basic price for a 14-night, 15-day European package is 3,500 euros, which includes airfare, hotel accommodations, and entrance fees to designated tourist sites. Additional costs depend on optional tours and personal expenses. The trip includes visits to major tourist attractions such as the Eiffel Tower in Paris, the Colosseum in Rome, the Brandenburg Gate in Berlin, and Rhine Falls in Zurich, offering a comprehensive experience of each city's iconic landmarks. The human inquires if travel insurance is included, and the AI confirms that basic travel insurance is provided for all travelers, covering medical expenses and emergency support, with options for additional coverage available. The human asks if it is possible to upgrade to business class seats, and the AI confirms that it is possible for an additional cost of approximately 1,200 euros round-trip, offering benefits such as wider seats, superior in-flight meals, and additional baggage allowance. The human then asks about the hotel rating included in the package, and the AI states that the package includes 4-star hotel accommodations, which offer comfort, convenience, and excellent service, and are centrally located for easy access to tourist sites. The human asks for more details about meal options, and the AI explains that the package includes daily breakfast at the hotel, while lunch and dinner are not included, allowing travelers to explore local cuisine. The AI also provides a list of recommended restaurants in each city to enhance the culinary experience. The human asks about the deposit and cancellation policy, and the AI explains that a deposit of 500 euros is required at the time of booking. The cancellation policy allows for a full refund if canceled 30 days before the booking date, but the deposit is non-refundable if canceled after that. 
If canceled 14 days before the start of the trip, 50% of the cost is charged, and after that, the full cost is charged.", additional_kwargs={}, response_metadata={})]

Pros and cons

Pros: keeps long-term context, very token-efficient, filters out noise
Cons: specific sentences and wording may be lost (depends on summary quality)
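The token-efficiency point is easier to see with some back-of-the-envelope arithmetic. The numbers below (120 tokens per turn, a summary that levels off at about 300 tokens) are made-up assumptions, not measurements, and a real summary is not hard-capped like this.

```python
def prompt_tokens_full_buffer(turns: int, tokens_per_turn: int) -> int:
    """History tokens sent to the model when every past turn is kept verbatim."""
    return turns * tokens_per_turn


def prompt_tokens_summary(turns: int, summary_cap: int, tokens_per_turn: int) -> int:
    """History tokens when the past is replaced by a summary of at most summary_cap tokens."""
    return min(turns * tokens_per_turn, summary_cap)


for turns in (1, 10, 50):
    full = prompt_tokens_full_buffer(turns, tokens_per_turn=120)
    summ = prompt_tokens_summary(turns, summary_cap=300, tokens_per_turn=120)
    print(turns, full, summ)
# 1 120 120
# 10 1200 300
# 50 6000 300
```

One thing this simple model leaves out: the summary itself is produced by an extra LLM call each time a turn is saved, so the savings on the chat prompt are partly offset by summarization cost.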

3. ConversationSummaryBufferMemory

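📌 Concept

A hybrid memory that keeps a summary together with the most recent turns verbatim.
Long-term context is compressed into a summary, while the latest turns (as many as fit within max_token_limit) are attached as-is to preserve detail.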
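The hybrid behaviour can be sketched in plain Python as follows. This is a rough illustration under stated assumptions, not LangChain's implementation: `SummaryBuffer` is a hypothetical class, `count_tokens` counts whitespace-separated words instead of using a real tokenizer, and `toy_summarizer` only records how many messages were folded in.

```python
from typing import List


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def toy_summarizer(previous_summary: str, dropped: List[str]) -> str:
    """Hypothetical stand-in for the LLM: just counts the messages folded in so far."""
    already = int(previous_summary.split()[0]) if previous_summary else 0
    return f"{already + len(dropped)} earlier messages summarized"


class SummaryBuffer:
    def __init__(self, max_token_limit: int):
        self.max_token_limit = max_token_limit
        self.summary = ""
        self.buffer: List[str] = []

    def save_context(self, human: str, ai: str) -> None:
        self.buffer += [f"Human: {human}", f"AI: {ai}"]
        dropped: List[str] = []
        # Pop the oldest messages until the verbatim buffer fits the limit.
        while self.buffer and sum(count_tokens(m) for m in self.buffer) > self.max_token_limit:
            dropped.append(self.buffer.pop(0))
        if dropped:
            self.summary = toy_summarizer(self.summary, dropped)

    def history(self) -> List[str]:
        # Summary first (if any), then the recent messages verbatim.
        head = [f"System: {self.summary}"] if self.summary else []
        return head + self.buffer


mem = SummaryBuffer(max_token_limit=12)
mem.save_context("price?", "3,500 euros for 14 nights.")
mem.save_context("insurance?", "Basic travel insurance is included.")
print(mem.history())
```

After the second call the first exchange no longer fits, so it is folded into the summary while the latest exchange stays word for word.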

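When to use it

Production services where both "long-term context" and "recent details" matter.
E.g., guidance right before payment, or recently confirmed codes, figures, or addresses, where the exact original text is needed.

Code example (LangChain)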

from langchain_openai import ChatOpenAI
from langchain.memory import ConversationSummaryBufferMemory

llm = ChatOpenAI()

memory = ConversationSummaryBufferMemory(
    llm=llm,
    max_token_limit=200,  # Token threshold; older messages beyond it are folded into the summary.
    return_messages=True,
)

First, let's save just one exchange and check the memory.

memory.save_context(
    inputs={"human": "유럽 여행 패키지의 가격은 얼마인가요?"},
    outputs={
        "ai": "유럽 14박 15일 패키지의 기본 가격은 3,500유로입니다. 이 가격에는 항공료, 호텔 숙박비, 지정된 관광지 입장료가 포함되어 있습니다. 추가 비용은 선택하신 옵션 투어나 개인 경비에 따라 달라집니다."
    },
)

Check the conversation stored in the memory.

# Check the conversation stored in memory
memory.load_memory_variables({})["history"]

Output:

[HumanMessage(content='유럽 여행 패키지의 가격은 얼마인가요?', additional_kwargs={}, response_metadata={}),
AIMessage(content='유럽 14박 15일 패키지의 기본 가격은 3,500유로입니다. 이 가격에는 항공료, 호텔 숙박비, 지정된 관광지 입장료가 포함되어 있습니다. 추가 비용은 선택하신 옵션 투어나 개인 경비에 따라 달라집니다.', additional_kwargs={}, response_metadata={})]

The conversation has not been summarized yet, because it has not reached the 200-token threshold.

Let's save more exchanges so that the 200-token limit is exceeded.

memory.save_context(
    inputs={"human": "여행 중에 방문할 주요 관광지는 어디인가요?"},
    outputs={
        "ai": "이 여행에서는 파리의 에펠탑, 로마의 콜로세움, 베를린의 브란덴부르크 문, 취리히의 라이네폴 등 유럽의 유명한 관광지들을 방문합니다. 각 도시의 대표적인 명소들을 포괄적으로 경험하실 수 있습니다."
    },
)
memory.save_context(
    inputs={"human": "여행자 보험은 포함되어 있나요?"},
    outputs={
        "ai": "네, 모든 여행자에게 기본 여행자 보험을 제공합니다. 이 보험은 의료비 지원, 긴급 상황 발생 시 지원 등을 포함합니다. 추가적인 보험 보장을 원하시면 상향 조정이 가능합니다."
    },
)
memory.save_context(
    inputs={
        "human": "항공편 좌석을 비즈니스 클래스로 업그레이드할 수 있나요? 비용은 어떻게 되나요?"
    },
    outputs={
        "ai": "항공편 좌석을 비즈니스 클래스로 업그레이드하는 것이 가능합니다. 업그레이드 비용은 왕복 기준으로 약 1,200유로 추가됩니다. 비즈니스 클래스에서는 더 넓은 좌석, 우수한 기내식, 그리고 추가 수하물 허용량 등의 혜택을 제공합니다."
    },
)
memory.save_context(
    inputs={"human": "패키지에 포함된 호텔의 등급은 어떻게 되나요?"},
    outputs={
        "ai": "이 패키지에는 4성급 호텔 숙박이 포함되어 있습니다. 각 호텔은 편안함과 편의성을 제공하며, 중심지에 위치해 관광지와의 접근성이 좋습니다. 모든 호텔은 우수한 서비스와 편의 시설을 갖추고 있습니다."
    },
)

Check the stored conversation. The most recent exchange is kept verbatim, while the earlier exchanges are stored as a summary.

# Check the conversation stored in memory
memory.load_memory_variables({})["history"]

Output:

[SystemMessage(content='The human asks about the price of a European travel package and the key tourist attractions included in the itinerary. The AI lists iconic landmarks in cities like Paris, Rome, Berlin, and Zurich. The human then inquires about upgrading airline seats to business class and the cost involved. The AI confirms that upgrading to business class is possible with an additional cost of approximately 1,200 euros for roundtrip flights, offering benefits like wider seats, premium meals, and extra baggage allowance.', additional_kwargs={}, response_metadata={}),
HumanMessage(content='패키지에 포함된 호텔의 등급은 어떻게 되나요?', additional_kwargs={}, response_metadata={}),
AIMessage(content='이 패키지에는 4성급 호텔 숙박이 포함되어 있습니다. 각 호텔은 편안함과 편의성을 제공하며, 중심지에 위치해 관광지와의 접근성이 좋습니다. 모든 호텔은 우수한 서비스와 편의 시설을 갖추고 있습니다.', additional_kwargs={}, response_metadata={})]

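Comparison of the two memories

Memory	Storage	Pros	Cons	Recommended use
ConversationSummaryMemory	Accumulates the whole conversation into a single summary	Very token-efficient, keeps long-term context	Detail may be lost, depends on summary quality	Long sessions, summary-style logs where only the key points matter
ConversationSummaryBufferMemory	Summary + most recent turns verbatim	Reflects both long-term context and the latest details	Higher token usage (recent turns are included)	Production services where accurate recent information matters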

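Wrap-up

Today (Day 25) we looked at three memory types: ConversationKGMemory, ConversationSummaryMemory, and ConversationSummaryBufferMemory.

ConversationKGMemory stores the conversation as knowledge-graph triples, so that who is related to whom, and how, can be managed structurally. → Strong at person- and relationship-based question answering.
ConversationSummaryMemory keeps the history as a compressed summary, saving many tokens while preserving long-term context. → Efficient even in long sessions.
ConversationSummaryBufferMemory keeps the most recent turns verbatim on top of the summary, a compromise that covers both long-term context and recent detail. → Often the most practical choice when running a real service.

In short, they differ as follows:

KGMemory is "relationship-centered"
SummaryMemory is "core-summary-centered"
SummaryBufferMemory is "summary + recent verbatim text"

👉 In the end, which memory to use depends on the nature and needs of your service.

If you need long-term profile and relationship management, KGMemory is a good fit;
to keep long conversations efficient, SummaryMemory;
and if you don't want to lose the latest details, SummaryBufferMemory.

Thanks for reading.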
