Show HN: In-Browser Graph RAG with Kuzu-WASM and WebLLM

blog.kuzudb.com

158 points by sdht0 4 months ago

We show the potential of modern, embedded graph databases in the browser by demonstrating a fully in-browser chatbot that can perform Graph RAG using Kuzu (the graph database we're building) and WebLLM, a popular in-browser inference engine for LLMs. The post retrieves from the graph via a Text-to-Cypher pipeline that translates a user question into a Cypher query, and the LLM uses the retrieved results to synthesize a response. As LLMs get better, and WebGPU and Wasm64 become more widely adopted, we expect to be able to do more and more in the browser in combination with LLMs, so a lot of the performance limitations we see currently may not be as much of a problem in the future.

We will soon also be releasing a vector index as part of Kuzu that you can also use in the browser to build traditional RAG or Graph RAG that retrieves from both vectors and graphs. The system has come a long way since we open sourced it about 2 years ago, so please give us feedback about how it can be more useful!

willguest 4 months ago

I absolutely love this. I make VR experiences that run on the ICP, which delivers wasm modules as smart contracts - I've been waiting for a combo of node-friendly, wasm deployable tools and webLLM. The ICP essentially facilitates self-hosting of data and provides consensus protocols for secure messaging and transactions.

This will make it super easy for me to add LLM functionality to existing webxr spaces, and I'm excited to see how an intelligent avatar or convo between them will play out. This is, very likely, the thing that will make this possible :)

If anyone wants to collab, or contribute in some way, I'm open to ideas and support. Search for 'exeud' to find more info

wkat4242 4 months ago

Why the Blockchain there? I don't really see the value. But maybe I misunderstand. It's just that I tend to be pretty dismissive of products mentioning blockchain. Mostly from the time when this tech was severely overhyped. Like Metaverse after it and now of course AI. I do know there's some usecases for it, I just wonder what they are and why you chose it.
I think I like the idea but I don't think I fully understand what it is that you're doing :) But I love everything VR.
- varelaseb 4 months ago
  
  Take this with a grain of salt, as I run a startup in the industry.
  Blockchain has taken a weird path. It started with Bitcoin offering something genuinely new - a Byzantine fault-tolerant mechanism for decentralized value exchange without trusted intermediaries. But the industry has drifted toward "web3" hype where the technology often isn't necessary.
  Companies pick tech stacks for all sorts of reasons beyond technical merit - vendor relationships, development velocity, legacy system compatibility, and UX considerations all factor into these decisions.
  Truth is, most blockchain companies today are solving problems that could be handled just fine with traditional databases and APIs. The industry is shifting toward abstraction layers that hide the consensus mechanisms anyway, focusing on user experience instead.
  The project mentioned probably doesn't actually need a blockchain backend for what it's doing, except maybe for tradable collectibles on an ERC standard.
  - willguest 4 months ago
    
    It is wise to be suspicious - spending even a small amount of time near the "web3" space will make attentive person suspicious of scams and parlour tricks.
    I use the network to host the webxr experiences, which are bundled wasm files, from unity. All of the code lives on the blockchain so, in this sense, i really couldn't do without it.
    If you are referring to blockchain-specific functionality, then this is largely true, however i have implemented some demostrations of the consensus mechanism being used from inside an immersive space. This is really to illustrate that it is possible, rather than try to sell you a meme coin.
    In the near future I expect to be using the ICP for a lot more, however, since it provides some rather interesting technical opportunities. It is wrapped around something called a 'network nervous system' which acts as a sort of administrator for proposals, so your point about abstraction is accurate, but this is the case with any large vendor.
    I choose to build on the ICP because it is more secure and straightforward for my particular stack, plus it has a lot of potential, despite the noise. I've implemented webrtc messaging to keep the cost of normal multiplayer data transfer as low as possible, because consensus is expensive and the network runs on a pay-per-use compute model.
    I'm offering a new route, outside of big tech, if you don't consider Unity a menace, which i acknowledge that some do. I am taking an admittedly more radical stance, by including the hosting in this.
    
    varelaseb 4 months ago
    
    Hi Will!
    Thanks for taking the time to reply to this.
    Now that I have the context, I find what you're doing super interesting, and well thought out. And greatly value the passion you're building it with as well.
    I want to clarify, my message wasn't intended as a dig at what you are doing - especially since I didn't actually look into it at all before writing my reply.
    By definition, and especially when explained with enough detail, anything that we do couldn't be done with a different tool, in the sense it'd have to be done differently.
    What my comment was meant to address was the original comment's question regarding the value of doing something "on-chain". Mainly because it's something that I've been thinking about a lot, being a founder as well in a similarly-hyped vertical.
    At it's core, blockchains are a database, and so any high level goal - beyond the composability/interoperability of on-chain primitives through tokens and shared state - can be achieved without a blockchain.
    However, there are many reasons to leverage the position of a blockchain project beyond the technical _need_ for a DLT.
    The VC environment might be attractive to some -leveraging the network effects of a sufficiently decentralized network, tapping into ecosystem incentives and growth programs, personal alignment with the moral values typically associated with decentralization, personal connections in the industry/vertical, etc.
    All that to say; no one asks why you're using a relational database or a graph database with as much suspicion or caution as they do why you're putting your stuff on-chain, and while that makes sense because of the... unique circumstances of DLTs, there's a lot more nuance to it from a business perspective than just asking yourself the "is it a grift?" question.
    
    willguest 4 months ago
    
    I guess what i wanted to get across was that ICP is considerably more than a database, but tbh it's a massive rabbit hole, unless you're already into cryptography. I'm impressed by the foresight and scale of it though.
    I thinks it's also the first thing i would call an AI governor, and there is a whole liquid democracy protocol that's pretty well thought out, imo
    Thanks for the kind words
- willguest 4 months ago
  
  Yes, the hype waves have been painful. I am still annoyed by the lack of good tools and content for VR lovers. I am personally trying to do something by offering a toolkit for building VR websites in unity.
  My website has a few examples on it, including domestic interior, bowling simulator. Recently I helped to make a VR museum, which will be there soon, and I'm working on a flying game with realistic aerodynamics.
  I've open-sourced both the toolkit and template for self-hosting, essentially providing a no-code route for creating interactive webxr spaces. Putting it on the IC means you can also self-host your creation and keep control of the data, code and costs of running it.
  You could equally just use the unity toolkit and host it elsewhere, but i wouldn't be able to provide the same level of support, if it broke in unexpected ways.
  What i like about the original post was the fully in-browser RAG and webLLM, as it looks compatible with the rest, so i could, for example, broadcast responses across webrtc data channels. So many options...

esafak 4 months ago

The example is not ideal for showcasing a graph analytics database because they could have used a traditional relational database to answer the same query, Which of my contacts work at Google?

laminarflow027 4 months ago

Hi, I work at Kuzu and can offer my thoughts on this.
You're making a fair observation here and it's true for any high level query language - SQL and Cypher and interchangeable unless the queries are recursive, in which case Cypher's graph syntax (e.g., the Kleene star * or shortest paths) has several advantages. One could make the argument that Cypher is easier for LLMs to generate because the joins are less verbose (you simply express the join as a query pattern). This post is not necessarily about graph analytics. It's about demonstrating that it's very simple to develop a relatively complex application using LLMs and a database fully in-browser, which can potentially open up new use cases. I'm sure many people will come up with other creative ways putting these fully in-browser technologies, both graph-specific, and not, e.g., using vector search-based retrieval. In fact, there are already some of our users doing this right now.
- echelon 4 months ago
  
  This is really cool, but I'm super anxious about entering my personal data, especially LinkedIn connections.
  Is there some other demo you could do with public graph data? It'd be just as cool of a demo, but with less fear of information misuse.
  I'm even more anxious about leaking information about my professional connections as I am leaking my own data.
  - laminarflow027 4 months ago
    
    Your concern makes sense, but in the demo we show, all your private data AND the graph database AND the LLM (basically, everything) is confined to your client session in the browser, and no data actually ever leaves your machine. That's the whole point of Wasm!
    The graph that you build is more for your own exploration and not for sharing with the outside world.
    
    w10-1 4 months ago
    
    Still, using non-personal example would mean the user wouldn't have to consider whether to trust you on that point (or do the analysis), and would make the technology demo friction-free.
    imo, privacy shouldn't be the driver but the kicker, because it's so inflammatory.
    
    sdht0 4 months ago
    
    https://demo.kuzudb.com is our general WASM explorer with some synthetic data you can try.
    Also note that Kuzu is open source. You can try running the explorer locally using Docker: https://github.com/kuzudb/explorer?tab=readme-ov-file#webass...
    
    semihsalihoglu 4 months ago
    
    I think having a version with a sample synthetic dataset makes sense.
beefnugs 4 months ago

We wont be seeing any ai examples that actually are anywhere near useful until we rewrite all serialization/de serialization into "natural language" as well as create layers upon layers of loops of frameworks with simulations and test cases around this nonsense

nattaylor 4 months ago

This is very cool. Kuzu has a ton of great blog content on all the ways they make Kuzu light and fast. WebLMM (or in the future chrome.ai.* etc) + embedded graph could make for some great UXes

At one time I thought I read that there was a project to embed Kuzu into DuckDB, but bringing a vector store natively into kuzu sounds even better.

laminarflow027 4 months ago

Great point! Several years ago there was a project GRainDB, which along with GraphflowDB (a purely in-memory graph database) formed the ideas of what is now Kuzu :)
https://graindb.github.io/ https://github.com/graphflow/graphflow-columnar-techniques

mentalgear 4 months ago

Nice! You might also want to check out Orama - which is also an open-source hybrid vector/full text search engine for any js runtime.

canadiantim 4 months ago

Could it be viable to have one or multiple kuzu databases per user? What’s the story like for backups with kuzu?

I saw you recently integrated FTS which is very exciting. I love everything about Kuzu and want to use it, but currently tempted to use Turso to allow for multiple sqlite dbs per user (eg one for each device).

Or would it be possible to use Kuzu to query data stored on sqlite?

Great work through and through tho. Really amazing to see the progress you’ve all made!

guodong 4 months ago

Hi! Work at Kuzu here.
> would it be possible to use Kuzu to query data stored on sqlite? Yes, we have a SQLite extension (https://docs.kuzudb.com/extensions/attach/rdbms/) that can read data from SQLite databases.
> Could it be viable to have one or multiple kuzu databases per user? What’s the story like for backups with kuzu? You can have multiple databases, but con only connect to one at a time for now. We don't have support for backups for now, but we'd like to hear more about your specific use cases. Would be great if you could join our discord (https://kuzudb.com/chat) or contact us through contact@kuzudb.com, and we can chat more there.

jasonthorsness 4 months ago

Don't the resource requirements from even small LLMs exclude most devices/users from being able to use stuff like this?

laminarflow027 4 months ago

True, but there are likely innovations happening in multiple dimensions all at once: WebGPU improvements that better utilize a device's compute, Wasm64. And of course, LLMs over time become SLMs (smaller and smaller models), that can do a surprisingly large variety of things well.
Putting aside LLMs for a minute, even applications that do not need LLMs, but benefit from a graph database, can be unlocked to help build interactive UIs and visualizations that retain privacy on the client side without ever moving the data to a server. Loads of possibilities!

nsonha 4 months ago

Could someone please explain in-browser inference to me? So in the context of OpenAI usage (WebLLM github), this means I will send binary to OpenAI instead of text? And it will lower the cost and run faster?

a-ungurianu 4 months ago

Not exactly. If you refer to the following line:
> Full OpenAI Compatibility
> WebLLM is designed to be fully compatible with OpenAI API.
It means that WebLLM exposes an API that is identical in behaviour with the OpenAI one, so any tools that build against that API could also build against WebLLM and it will still work.
WebLLM by the looks of it runs the inference purely in the browser. None of your data leaves your browser.
WebLLM does need to get a model from somewhere, with the demo linked here getting Llama3.1 Instruct 8B model{1}.
1: https://huggingface.co/mlc-ai/Llama-3.1-8B-Instruct-q4f32_1-...

DavidPP 4 months ago

I'm new to the world of graph, and I just started building with SurrealDB in embedded mode.

If you don't mind taking a few minutes, what are the main reasons to use Kuzu instead?

laminarflow027 4 months ago

Hi, glad to help! I'm a DevRel advocate at Kuzu, and have spent a decent amount of time in other database paradigms thinking about these things. I'm familiar with SurrealDB too.
Although I cannot comment too much SurrealDB's exact capabilities and performance at this point, I can definitely highlight that at the data modeling and query language-level: Kuzu's data model is a property graph model (so an actual "graph" model rather than a multi-model database) and Kuzu implements Cypher as its query language, which is already widely adopted in the industry and is very intuitive to write (for both humans and LLMs).
Although Surreal DB does indeed offer an embedded mode, Kuzu is by design 100% embedded, is super-lightweight and can run natively in many environments, such as browsers, Android applications, AWS Lambda (serverless) and we're especially designed to be a VERY Python-friendly graph database that integrates with pretty much all well-known Python libraries. Because of its columnar storage layer, Kuzu can seamlessly read and write different data formats, such as Panda/Polars DataFrame, Arrow tables, Iceberg or Delta Lake tables and seamlessly move data between advanced graph analytics libraries like NetworkX. For anything related to graph computation, Kuzu is likely to have all the right tools and utilities to help you solve the problem at hand.
In my opinion, it's a myth that databases are heavy, monolithic pieces of software, and hopefully, using Kuzu will demonstrate that it's totally possible to have data in your primary store but seamlessly move it to a performant graph storage layer when required, and move the results back with minimum friction and cost. Hope that helps!

srameshc 4 months ago

I heard about it for the first time, an embedable graph database Kuzu and even better the WASM mix and LLM.

itissid 4 months ago

Since I already have a browser connected to the Internet where this would execute, could one have the option of transparently executing the webGPU + LLM in a cloud container communicating with the browser process?

mewim 4 months ago

I think WebGPU is mostly for running inside the browser. If one has the option to use a cloud container + GPU, running LLM inference directly with CUDA/ROCm/TPU will be possible and runs more efficiently.