• LLM-based chatbot design and how its architecture is changing
    DEV 2023. 12. 29. 21:25

    LLM-powered chatbot services are being built everywhere. Using LangChain code examples, let's look at how the design and architecture of chatbot development have been evolving.

    1. Using only the LLM API

    Connecting a prompt to the LLM API

    • Simply query the LLM with a single prompt
    • The answer reads a bit like a dad joke, but it seems to land in the US.
    from langchain.chat_models import ChatOpenAI
    from langchain.prompts import ChatPromptTemplate
    from langchain_core.output_parsers import StrOutputParser
    
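    # Prompt template with a single {topic} variable
    prompt = ChatPromptTemplate.from_template("Tell me a short joke about {topic}")
    model = ChatOpenAI(model="gpt-3.5-turbo")
    # Convert the chat model's message into a plain string
    output_parser = StrOutputParser()
    
    # The output of each step is piped into the next one
    chain = prompt | model | output_parser
    
    chain.invoke({"topic": "ice cream"})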
    
    output:
    "Why did the ice cream go to therapy?\n\n
    Because it had too many toppings and couldn't find its cone-fidence!"
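To make the `prompt | model | output_parser` line less magical, here is a minimal sketch of how `|` composition can work in plain Python. The `Step` class and `fake_model` are hypothetical stand-ins written for this post, not LangChain's actual implementation, and no API call is made.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Step:
    """Wraps a one-argument function so steps can be chained with `|`."""

    func: Callable[[Any], Any]

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new step that runs a first, then feeds its result to b.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


# Hypothetical stand-ins for the prompt template, chat model, and output parser.
prompt = Step(lambda inputs: f"Tell me a short joke about {inputs['topic']}")
fake_model = Step(lambda text: {"content": f"(model reply to: {text})"})
output_parser = Step(lambda message: message["content"])

chain = prompt | fake_model | output_parser
result = chain.invoke({"topic": "ice cream"})
print(result)
```

The point of the design is that every stage exposes the same `invoke` interface, so any stage can be swapped out without touching the rest of the chain.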

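    2. Prompt chaining

    prompt chaining

    • Break a problem that needs several steps into one prompt per step, then chain the prompts together to solve it
    • The answer to prompt1 (Honolulu) is fed into the 'city' variable of prompt2, which is told to respond in 'spanish'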
    from operator import itemgetter
    
    from langchain.chat_models import ChatOpenAI
    from langchain.prompts import ChatPromptTemplate
    from langchain.schema import StrOutputParser
    
    prompt1 = ChatPromptTemplate.from_template("what is the city {person} is from?")
    prompt2 = ChatPromptTemplate.from_template(
        "what country is the city {city} in? respond in {language}"
    )
    
    model = ChatOpenAI()
    
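    # Step 1: resolve the person to a city
    chain1 = prompt1 | model | StrOutputParser()
    
    # Step 2: chain1's answer fills {city}; {language} is taken from the input dict
    chain2 = (
        {"city": chain1, "language": itemgetter("language")}
        | prompt2
        | model
        | StrOutputParser()
    )
    
    chain2.invoke({"person": "obama", "language": "spanish"})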
    
    output:
    'El país donde se encuentra la ciudad de Honolulu, donde nació Barack Obama, 
    el 44º Presidente de los Estados Unidos, es Estados Unidos. 
    Honolulu se encuentra en la isla de Oahu, en el estado de Hawái.'
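The data flow of the two chained prompts can also be sketched without any framework. This is only an illustration: `fake_llm` and its canned answers are hypothetical and stand in for a real chat model call.

```python
from typing import Callable, Dict

# Hypothetical canned answers standing in for a real chat model.
CANNED_ANSWERS: Dict[str, str] = {
    "what is the city obama is from?": "Honolulu",
    "what country is the city Honolulu in? respond in spanish": "Estados Unidos",
}


def fake_llm(prompt: str) -> str:
    """Return a canned answer; a real implementation would call an LLM API."""
    return CANNED_ANSWERS.get(prompt, "unknown")


def run_chain(person: str, language: str, llm: Callable[[str], str] = fake_llm) -> str:
    # Step 1: ask only for the city.
    city = llm(f"what is the city {person} is from?")
    # Step 2: feed the step-1 answer into the second prompt.
    return llm(f"what country is the city {city} in? respond in {language}")


answer = run_chain("obama", "spanish")
print(answer)
```

Each step only has to do one small job, which is usually why chaining gives more reliable answers than a single large prompt.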

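    3. Connecting external data

    A chatbot connected to external data

    • Embeddings + vector database
      - Use an LLM to generate embeddings, then build ML applications such as search and recommendation systems on top of those embeddings
      - FAISS.from_texts(["harrison worked at kensho"], embedding=OpenAIEmbeddings())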
    !pip install langchain openai faiss-cpu tiktoken
    
    from operator import itemgetter
    
    from langchain.chat_models import ChatOpenAI
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.prompts import ChatPromptTemplate
    from langchain.vectorstores import FAISS
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.runnables import RunnablePassthrough
    
    vectorstore = FAISS.from_texts(
        ["harrison worked at kensho"], embedding=OpenAIEmbeddings()
    )
    retriever = vectorstore.as_retriever()
    
    template = """Answer the question based only on the following context:
    {context}
    
    Question: {question}
    """
    prompt = ChatPromptTemplate.from_template(template)
    
    model = ChatOpenAI()
    
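    # The retriever receives the question and returns matching documents as {context};
    # RunnablePassthrough forwards the question unchanged as {question}
    chain = (
        {"context": retriever, "question": RunnablePassthrough()}
        | prompt
        | model
        | StrOutputParser()
    )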
    
    chain.invoke("where did harrison work?")
    
    output:
    'Harrison worked at Kensho.'
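What the vector store does during retrieval can be approximated with a toy example. Note the assumptions: the word-count `embed` function below is a crude stand-in for a learned embedding model such as `OpenAIEmbeddings`, the linear scan stands in for FAISS's index, and the second document is made up for contrast.

```python
import math
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Dict[str, int]:
    """Toy embedding: word counts. Real systems use a learned embedding model."""
    return Counter(text.lower().replace("?", "").split())


def cosine(a: Dict[str, int], b: Dict[str, int]) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, documents: List[str], k: int = 1) -> List[str]:
    """Return the k documents most similar to the question."""
    query_vector = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(query_vector, embed(d)), reverse=True)
    return ranked[:k]


# The second document is a hypothetical distractor.
documents = ["harrison worked at kensho", "bears like to eat honey"]
top = retrieve("where did harrison work?", documents)
print(top)
```

The retrieved text is what ends up in the `{context}` slot of the prompt, so the model answers from the stored document rather than from its own memory.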

    Orchestration Framework

    • LangChain, Semantic Kernel(MS)
    • The philosophy these frameworks share
      - Augment the prompt by selecting a model, applying prompt templates/chaining, and retrieving relevant data

    RAG (Retrieval-augmented generation)
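The three jobs an orchestration framework takes on (pick a model, fill a prompt template, pull in related data) can be sketched as one small function. Everything here is hypothetical: the `MODELS` registry holds echo functions instead of real LLM clients, and `lookup_context` is a naive keyword match standing in for a real retriever.

```python
from typing import Callable, Dict, List

# Hypothetical model registry; the lambdas stand in for real LLM clients.
MODELS: Dict[str, Callable[[str], str]] = {
    "echo-small": lambda prompt: f"[small] {prompt}",
    "echo-large": lambda prompt: f"[large] {prompt}",
}

TEMPLATE = (
    "Answer the question based only on the following context:\n"
    "{context}\n\n"
    "Question: {question}\n"
)


def lookup_context(question: str, knowledge: Dict[str, str]) -> List[str]:
    """Naive keyword lookup standing in for a real retriever."""
    return [text for keyword, text in knowledge.items() if keyword in question.lower()]


def orchestrate(question: str, model_name: str, knowledge: Dict[str, str]) -> str:
    model = MODELS[model_name]  # 1. select a model
    context = "\n".join(lookup_context(question, knowledge))  # 2. retrieve related data
    prompt = TEMPLATE.format(context=context, question=question)  # 3. augment the prompt
    return model(prompt)  # 4. call the model with the augmented prompt


knowledge = {"harrison": "harrison worked at kensho"}
reply = orchestrate("where did harrison work?", "echo-small", knowledge)
print(reply)
```

Because the echo models just return their input, printing `reply` shows exactly what augmented prompt a real model would receive.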

    4. Using an agent

    agent system

    • agent: given a problem, the agent plans a solution on its own using the tools it has been given, then acts to solve it
    • https://github.com/101dotxyz/GPTeam
    • http://aidev.co.kr/chatbotdeeplearning/12619
    • Tell the agent which tools are available, and it picks the tool it needs for each query (search, calculator, foobar-db) to solve the problem
    from langchain.agents import AgentType, Tool, initialize_agent
    from langchain.chains import LLMMathChain
    from langchain.chat_models import ChatOpenAI
    from langchain.utilities import SerpAPIWrapper, SQLDatabase
    from langchain_experimental.sql import SQLDatabaseChain
    
    llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613")
    search = SerpAPIWrapper()
    llm_math_chain = LLMMathChain.from_llm(llm=llm, verbose=True)
    db = SQLDatabase.from_uri("sqlite:///../../../../../notebooks/Chinook.db")
    db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)
    
    tools = [
        Tool(
            name="Search",
            func=search.run,
            description="useful for when you need to answer questions about current events. You should ask targeted questions",
        ),
        Tool(
            name="Calculator",
            func=llm_math_chain.run,
            description="useful for when you need to answer questions about math",
        ),
        Tool(
            name="FooBar-DB",
            func=db_chain.run,
            description="useful for when you need to answer questions about FooBar. Input should be in the form of a question containing full context",
        ),
    ]
    
    from langchain.memory import ConversationBufferMemory
    from langchain.prompts import MessagesPlaceholder
    
    agent_kwargs = {
        "extra_prompt_messages": [MessagesPlaceholder(variable_name="memory")],
    }
    memory = ConversationBufferMemory(memory_key="memory", return_messages=True)
    
    agent = initialize_agent(
        tools,
        llm,
        agent=AgentType.OPENAI_FUNCTIONS,
        verbose=True,
        agent_kwargs=agent_kwargs,
        memory=memory,
    )
    
    agent.run("hi")
    
    
    > Entering new  chain...
    Hello! How can I assist you today?
    
    > Finished chain.
    
    
    agent.run("my name is bob")
    
    
    > Entering new  chain...
    Nice to meet you, Bob! How can I help you today?
    
    > Finished chain.
    
    'Nice to meet you, Bob! How can I help you today?'
    
    agent.run("whats my name")
    
    
    > Entering new  chain...
    Your name is Bob.
    
    > Finished chain.
    
    
    'Your name is Bob.'
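The two ideas in this section, picking a tool per query and remembering earlier turns, can be sketched in a toy agent. This is not how `initialize_agent` works internally: in the real agent the LLM chooses the tool, while here a hypothetical rule-based `choose_tool` plays that role, and `ToyAgent`, `calculator`, and `search` are made-up names for illustration.

```python
from typing import Callable, Dict, List, Tuple


def calculator(expression: str) -> str:
    """Hypothetical tool: evaluates 'a op b' for +, -, and *."""
    left, op, right = expression.split()
    a, b = float(left), float(right)
    return str({"+": a + b, "-": a - b, "*": a * b}[op])


def search(query: str) -> str:
    """Hypothetical tool: a real one would call a search API."""
    return f"(pretend search results for: {query})"


TOOLS: Dict[str, Callable[[str], str]] = {"Calculator": calculator, "Search": search}


def choose_tool(text: str) -> str:
    """Rule-based stand-in for the LLM's tool choice."""
    return "Calculator" if any(op in text.split() for op in "+-*") else "Search"


class ToyAgent:
    PREFIX = "my name is "

    def __init__(self, tools: Dict[str, Callable[[str], str]]) -> None:
        self.tools = tools
        self.memory: List[Tuple[str, str]] = []  # (user text, agent reply) turns

    def run(self, text: str) -> str:
        if text.startswith(self.PREFIX):
            reply = f"Nice to meet you, {text[len(self.PREFIX):].title()}!"
        elif text == "whats my name":
            reply = self._recall_name()
        else:
            reply = self.tools[choose_tool(text)](text)
        self.memory.append((text, reply))  # keep the turn for later questions
        return reply

    def _recall_name(self) -> str:
        # Scan the conversation buffer from the most recent turn backwards.
        for user_text, _ in reversed(self.memory):
            if user_text.startswith(self.PREFIX):
                return f"Your name is {user_text[len(self.PREFIX):].title()}."
        return "I don't know your name yet."


toy_agent = ToyAgent(TOOLS)
greeting = toy_agent.run("my name is bob")
product = toy_agent.run("2 * 21")
recalled = toy_agent.run("whats my name")
print(greeting, product, recalled, sep="\n")
```

The memory list plays the part of `ConversationBufferMemory`: the third question can only be answered because the first turn was stored.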