Stardog Labs

Table of Contents

  • Model Pruning
  • Quantization
  • Model Distillation
  • Model Serving
  • Optimizing Attention

About the Labs

Stardog Labs is the R&D unit of Stardog Union. We innovate in AI fields such as knowledge graphs, LLMs, automated inference, and related areas. Some examples of work we are tracking follow.

Model Pruning

  • Eigenpruning
  • ShortGPT: LLM Layers are more redundant than you expect
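The papers above study removing redundant parameters or whole layers from an LLM. As a generic illustration only (not the specific method of either paper), here is a minimal magnitude-pruning sketch in NumPy: the smallest-magnitude weights are zeroed to reach a target sparsity.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude entries, keeping (1 - sparsity) of them."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    # Threshold = k-th smallest absolute value across the whole tensor.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

# Example: prune half the weights of a toy 2x3 matrix.
w = np.array([[0.9, -0.05, 0.4], [-0.01, 0.7, 0.02]])
pruned = magnitude_prune(w, sparsity=0.5)
```

In practice pruning is applied per-layer or with structured patterns (heads, channels, layers); this global unstructured variant is just the simplest case.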

Quantization

  • FP6-LLM Quantization
  • Extreme Compression of Large Language Models via Additive Quantization
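These works push quantization well below 8 bits. As a baseline illustration (not the FP6 or additive schemes above), here is a minimal symmetric per-tensor int8 quantize/dequantize round trip, where `x ≈ scale * q`:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats onto [-127, 127]."""
    scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original floats."""
    return q.astype(np.float32) * scale

x = np.array([0.0, 0.5, -1.0, 1.27])
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)
```

Per-channel scales and outlier handling are what make this work at LLM scale; the per-tensor version above only shows the core idea.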

Model Distillation

  • Divide-or-Conquer? Which Part Should You Distill Your LLM?
  • Efficiently distilling LLMs for Edge Apps
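Both items concern transferring a large teacher model's behavior to a smaller student. As a generic sketch (the classic temperature-scaled distillation loss, not the specific recipes above), the student is trained to match the teacher's softened output distribution:

```python
import numpy as np

def softmax(z: np.ndarray, T: float = 1.0) -> np.ndarray:
    """Numerically stable softmax with temperature T."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T: float = 2.0) -> float:
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 to keep gradient magnitudes comparable across T."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean() * T * T)

teacher = np.array([[2.0, 1.0, 0.1]])
matched = distillation_loss(teacher, teacher)      # student agrees: ~0
mismatched = distillation_loss(teacher[:, ::-1].copy(), teacher)
```

A full recipe typically mixes this term with the ordinary cross-entropy on ground-truth labels.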

Model Serving

  • Xc-Cache: Cross-Attending to Cached Context for Efficient LLM Inference
  • TCRA-LLM: Token Compression Retrieval Augmented Large Language Model for Inference Cost Reduction
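Both papers cut inference cost by reusing or compressing context rather than reprocessing it per request. As a toy illustration of the underlying mechanism (not either paper's system), here is a minimal key/value cache for incremental decoding: each step attends over all previously cached tokens without recomputing their projections.

```python
import numpy as np

class KVCache:
    """Toy per-sequence key/value cache for autoregressive decoding."""

    def __init__(self, d: int):
        self.keys = np.empty((0, d))
        self.values = np.empty((0, d))

    def append(self, k: np.ndarray, v: np.ndarray) -> None:
        """Store the projections for one new token (shape (1, d) each)."""
        self.keys = np.vstack([self.keys, k])
        self.values = np.vstack([self.values, v])

    def attend(self, q: np.ndarray) -> np.ndarray:
        """Scaled dot-product attention of query q over all cached tokens."""
        scores = self.keys @ q / np.sqrt(q.size)
        w = np.exp(scores - scores.max())
        w /= w.sum()
        return w @ self.values

rng = np.random.default_rng(0)
cache = KVCache(4)
for _ in range(3):  # simulate three decoded tokens
    cache.append(rng.normal(size=(1, 4)), rng.normal(size=(1, 4)))
out = cache.attend(rng.normal(size=4))
```

Production servers layer paging, eviction, and cross-request sharing on top of this basic structure.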

Optimizing Attention

  • You need to pay better attention
  • DeFT: Flash Tree-attention with IO-Awareness for Efficient Tree-search-based LLM Inference
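These papers optimize attention's memory and IO behavior. For reference, a minimal NumPy sketch of the baseline they improve on, plain scaled dot-product attention `softmax(QK^T / sqrt(d)) V`:

```python
import numpy as np

def attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    """Plain (unfused) scaled dot-product attention over one sequence."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w = w / w.sum(axis=-1, keepdims=True)
    return w @ V

Q = np.eye(2)
K = np.eye(2)
V = np.array([[1.0, 0.0], [0.0, 1.0]])
out = attention(Q, K, V)
```

The naive version materializes the full score matrix; IO-aware kernels like the ones above avoid that by tiling the computation.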
 



© 2024 Stardog Union. All Rights Reserved.