MLX

This notebook shows how to get started using MLX LLM’s as chat models.

In particular, we will: 1. Utilize the MLXPipeline, 2. Utilize the ChatMLX class to enable any of these LLMs to interface with LangChain’s Chat Messages abstraction. 3. Demonstrate how to use an open-source LLM to power an ChatAgent pipeline

%pip install --upgrade --quiet  mlx-lm transformers huggingface_hub

1. Instantiate an LLM

There are three LLM options to choose from.

from langchain_community.llms.mlx_pipeline import MLXPipeline

llm = MLXPipeline.from_model_id(
    "mlx-community/quantized-gemma-2b-it",
    pipeline_kwargs={"max_tokens": 10, "temp": 0.1},
)

API Reference:

MLXPipeline

2. Instantiate the `ChatMLX` to apply chat templates

Instantiate the chat model and some messages to pass.

from langchain.schema import (
    HumanMessage,
)
from langchain_community.chat_models.mlx import ChatMLX

messages = [
    HumanMessage(
        content="What happens when an unstoppable force meets an immovable object?"
    ),
]

chat_model = ChatMLX(llm=llm)

API Reference:

Inspect how the chat messages are formatted for the LLM call.

chat_model._to_chat_prompt(messages)

Call the model.

res = chat_model.invoke(messages)
print(res.content)

3. Take it for a spin as an agent!

Here we’ll test out gemma-2b-it as a zero-shot ReAct Agent. The example below is taken from here.

Note: To run this section, you’ll need to have a SerpAPI Token saved as an environment variable: SERPAPI_API_KEY

from langchain import hub
from langchain.agents import AgentExecutor, load_tools
from langchain.agents.format_scratchpad import format_log_to_str
from langchain.agents.output_parsers import (
    ReActJsonSingleInputOutputParser,
)
from langchain.tools.render import render_text_description
from langchain_community.utilities import SerpAPIWrapper

API Reference:

Configure the agent with a react-json style prompt and access to a search engine and calculator.

# setup tools
tools = load_tools(["serpapi", "llm-math"], llm=llm)

# setup ReAct style prompt
prompt = hub.pull("hwchase17/react-json")
prompt = prompt.partial(
    tools=render_text_description(tools),
    tool_names=", ".join([t.name for t in tools]),
)

# define the agent
chat_model_with_stop = chat_model.bind(stop=["\nObservation"])
agent = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_log_to_str(x["intermediate_steps"]),
    }
    | prompt
    | chat_model_with_stop
    | ReActJsonSingleInputOutputParser()
)

# instantiate AgentExecutor
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke(
    {
        "input": "Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?"
    }
)

1. Instantiate an LLM​

API Reference:

2. Instantiate the ChatMLX to apply chat templates​

API Reference:

3. Take it for a spin as an agent!​

API Reference:

Help us out by providing feedback on this documentation page:

1. Instantiate an LLM

2. Instantiate the `ChatMLX` to apply chat templates

3. Take it for a spin as an agent!