LangChain integration

cogcache implements the LangChain BaseCache interface, so it works as a drop-in replacement for InMemoryCache, SQLiteCache, RedisCache, and other LangChain caches.

Install

pip install "cogcache[langchain]"

Usage

from langchain_core.globals import set_llm_cache
from langchain_openai import ChatOpenAI
from cogcache import CogniCache
from cogcache.integrations.langchain import CogniCacheLangChain

# 1. Create the underlying semantic cache
cogcache_instance = CogniCache(
    similarity_threshold=0.92,
    redis_url="redis://localhost:6379/0",   # or None for in-memory
)

# 2. Wrap it in the LangChain adapter
langchain_cache = CogniCacheLangChain(cogcache_instance)

# 3. Register globally — every LangChain LLM call goes through it
set_llm_cache(langchain_cache)

# 4. Use any LangChain LLM normally
llm = ChatOpenAI(model="gpt-4o-mini")
llm.invoke("What is gradient descent?")    # cache miss: calls the LLM
llm.invoke("Explain gradient descent.")     # semantic hit: served from the cache

Per-LLM cache

If you don't want a global cache, pass the adapter to a single LLM through its cache parameter:

llm = ChatOpenAI(model="gpt-4o-mini", cache=langchain_cache)

Caveats

The default BaseCache contract uses (prompt, llm_string) as the key, where llm_string is the serialized LLM settings. cogcache uses only the prompt for semantic matching, so different LLM configurations (temperature, top_p, etc.) share the same cache entry. A response generated at one temperature can therefore be returned for a request made at another.
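The difference between the two keying schemes can be sketched with two toy caches. This is an illustration only, not cogcache's implementation: the class names are hypothetical, exact string matching stands in for semantic matching, and the llm_string values are simplified placeholders for the much longer serialized settings LangChain actually passes. The lookup and update signatures follow the BaseCache contract.

```python
from typing import Any, Dict, Optional, Tuple


class ExactKeyCache:
    """Toy cache keyed on (prompt, llm_string), like the default BaseCache contract."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], Any] = {}

    def lookup(self, prompt: str, llm_string: str) -> Optional[Any]:
        return self._store.get((prompt, llm_string))

    def update(self, prompt: str, llm_string: str, return_val: Any) -> None:
        self._store[(prompt, llm_string)] = return_val


class PromptOnlyCache:
    """Toy cache that ignores llm_string, mirroring the caveat above.

    Exact string match is used here as a stand-in for semantic matching.
    """

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    def lookup(self, prompt: str, llm_string: str) -> Optional[Any]:
        return self._store.get(prompt)

    def update(self, prompt: str, llm_string: str, return_val: Any) -> None:
        self._store[prompt] = return_val


prompt = "What is gradient descent?"

# Keyed on (prompt, llm_string): a different configuration misses.
exact = ExactKeyCache()
exact.update(prompt, "temperature=0", "answer A")
exact_other_config = exact.lookup(prompt, "temperature=1.0")

# Keyed on the prompt alone: a different configuration gets the same entry.
prompt_only = PromptOnlyCache()
prompt_only.update(prompt, "temperature=0", "answer A")
shared = prompt_only.lookup(prompt, "temperature=1.0")
```

Here exact_other_config is None, while shared holds the entry stored under the other configuration.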

If you need to keep configurations separate, set distinct routes:

cache_temp_0 = CogniCacheLangChain(cogcache_instance, route="temp_0")
cache_temp_1 = CogniCacheLangChain(cogcache_instance, route="temp_1")

llm_deterministic = ChatOpenAI(temperature=0,   cache=cache_temp_0)
llm_creative      = ChatOpenAI(temperature=1.0, cache=cache_temp_1)
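With more than a couple of configurations, hand-written route names become easy to mix up. One option is to derive the name from the settings themselves. The helper below is a hypothetical sketch, not part of cogcache, and it assumes route accepts an arbitrary string.

```python
import hashlib
import json


def route_for_settings(**settings) -> str:
    """Hypothetical helper: map LLM settings to a stable route name."""
    # Sort keys so the same settings always serialize identically,
    # regardless of the order in which they were passed.
    canonical = json.dumps(settings, sort_keys=True)
    # A short hash prefix keeps the route name compact.
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:8]
    return f"cfg_{digest}"


route_a = route_for_settings(temperature=0, top_p=1)
route_b = route_for_settings(top_p=1, temperature=0)   # same settings, different order
route_c = route_for_settings(temperature=1.0, top_p=1)  # different temperature
```

route_a and route_b are the same name, and route_c differs. The result could then be passed along the lines of CogniCacheLangChain(cogcache_instance, route=route_for_settings(temperature=0)). Note that 0 and 0.0 serialize differently, so pass values in one consistent form.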