I like to explain contrarian reasoning because, no matter what else happens, someone is going to learn something, i.e., update their priors. Either I’m going to learn that I’m wrong, or the other person is going to learn that I’m not.
We built a GenAI appliance because regulated industries have lots of data not in the cloud and enterprises in those industries need GenAI, too. Our on-prem appliance—Stardog Karaoke—brings GenAI to regulated on-prem.
What am I Talking About?
Some considerations—
- Built? More like assembled from pre-existing components, i.e., tech legos.
- LLM: A model-serving capability in Appliance 1.0. We took our existing CPU platform, added GPU-based model-serving (plus some other GNN and vector pieces), and then extended our K8s control plane over the entirety.
- Appliance: A vertically-integrated SW/HW/networking system that will reside on-prem in the customer’s DC and be operated remotely by us, by extending the control plane of our multi-AZ, SOC 2 Type II AWS cloud offering. Includes:
  - Separately extensible CPU and GPU components
  - A CPU component (i.e., one or more L10s) that runs our existing knowledge graph platform via K8s. We intend 3 variations of the CPU component:
    - a 2U4N variant (“smol boi”)
    - a 4U8N variant (“chunky monkey”)
    - an 8U20N variant (“zaddy”)
  - A GPU component (one or more L10s) that runs the new LLM services we’ve added to the platform, also by extending our K8s control plane to include GPU-powered services.
  - North-South networking (200GbE)
  - integrated racking, PDU, etc.
  - custom bezels for that brand awareness in the DC and for the LinkedIn lulz!
  - performance targets of up to 50,000 concurrent users, primarily for KGQA
  - no external run-time service or software dependencies; i.e., no shelling out to OpenAI, etc. That’s no way to live yr life, son.
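
A rough sketch of how the CPU variant labels above can be read, assuming the usual server shorthand in which `2U4N` means a 2-rack-unit chassis holding 4 nodes (the post does not spell this out). The `FormFactor` class and `parse_form_factor` helper are hypothetical names made up for illustration, not part of any Stardog tooling.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FormFactor:
    """Chassis size in rack units plus the number of nodes it holds."""
    rack_units: int
    nodes: int

    @property
    def nodes_per_rack_unit(self) -> float:
        # Density: how many nodes fit into each rack unit.
        return self.nodes / self.rack_units


def parse_form_factor(label: str) -> FormFactor:
    """Parse a label such as '2U4N' (assumed: <rack units>U<nodes>N)."""
    match = re.fullmatch(r"(\d+)U(\d+)N", label.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized form factor: {label!r}")
    return FormFactor(rack_units=int(match.group(1)), nodes=int(match.group(2)))


# Nicknames and labels are taken from the list above.
CPU_VARIANTS = {
    "smol boi": parse_form_factor("2U4N"),
    "chunky monkey": parse_form_factor("4U8N"),
    "zaddy": parse_form_factor("8U20N"),
}

for name, ff in CPU_VARIANTS.items():
    print(f"{name}: {ff.rack_units}U, {ff.nodes} nodes, "
          f"{ff.nodes_per_rack_unit:.1f} nodes/U")
```

Under that reading, the two smaller variants come out at 2.0 nodes per rack unit and the largest at 2.5.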
Motivation #1 World-class UX
Why is an enterprise software company doing this? Basically because there’s a shit tonne of scrilla to be made; but why do we think that?
World-class UX requires full-stack control in the GenAI era, at least right now. The LLM systems integration problem is quite nasty and the only reliable way to solve and control it is to fucking solve and control it. The part you don’t have direct leverage over IS the part that will fuck yr users up and that fucks you up.
Motivation #2 World-class Unit Economics
Happily, the path of world-class unit economics in the GenAI era is also the path of full-stack control.