Our HPC-hosted LLM service gives external collaborators secure, high-performance access to language models through a VPN-protected gateway. Powered by Ollama on GPU-accelerated nodes, it offers a browser-based Open WebUI for interactive chat and prompt workflows, plus an OpenAI-compatible REST API for programmatic use. Access is authenticated with per-user API keys and role-based quotas, and the backend integrates with the cluster scheduler to ensure fair, reliable resource allocation. By default, data remains on-cluster: prompts and outputs can be logged for audit and usage reporting in line with project privacy policies, and no user data is used for model training unless explicitly approved. The result is a turnkey platform for prototyping, evaluating, and operationalizing LLM-driven research and applications without managing infrastructure.
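For programmatic use, the OpenAI-compatible API can be called with a standard chat-completion request. The sketch below uses only the Python standard library; the gateway URL, model name, and API key are placeholders, not real service values — substitute the credentials and endpoint issued for your project, and note that the request only succeeds from inside the VPN.

```python
# Minimal sketch of a chat-completion call against the service's
# OpenAI-compatible REST API. BASE_URL, API_KEY, and MODEL are
# hypothetical placeholders for illustration only.
import json
import urllib.request

BASE_URL = "https://llm-gateway.example.org/v1"  # hypothetical gateway address
API_KEY = "your-per-user-api-key"                # issued per user
MODEL = "llama3"                                 # any model served by Ollama


def build_chat_request(prompt: str) -> urllib.request.Request:
    """Assemble an authenticated POST to the chat-completions endpoint."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Requires VPN access to the gateway; prints the model's reply.
    with urllib.request.urlopen(build_chat_request("Hello!")) as resp:
        body = json.load(resp)
        print(body["choices"][0]["message"]["content"])
```

Because the API follows the OpenAI wire format, the official `openai` client library can also be pointed at the gateway by setting its `base_url` and `api_key`, so existing OpenAI-based code typically needs only a configuration change.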