Short description: LiteLLM is a gateway that puts every model, local or cloud, behind one OpenAI-compatible API. It adds virtual keys, usage logging, and a control plane UI. Purpose and how people use it: People use LiteLLM to stop juggling different SDKs and keys for different providers. Every app points at LiteLLM, and LiteLLM…
Tag: Nvidia L4
Build a Local AI Stack: Series Index
This is a set of quick setup guides for running a full local AI tooling stack on your own hardware. Every tool runs on your machine, talks to your own models, and costs nothing per request. The guides are written to be posted one at a time, so each one stands on its own. Before…
NVIDIA A2: Cooling a Lower-Power GPU with a DIY Blower Fan
My third ESXi host, ESX-3, runs an NVIDIA A2 instead of the L4 found in ESX-1 and ESX-2. The A2 is a lower-power data center GPU with a 60 W TDP compared to the L4’s 72 W. Like the L4, it is passively cooled and supports NVIDIA vGPU. I am cooling it with the same…
n3rdware NVIDIA L4 Coolers: 3-Slot vs 1-Slot Compared
As part of my collaboration with Robbe from n3rdware, he designed two different aftermarket coolers for the NVIDIA L4: a 3-slot version and a 1-slot version. I tested each one on a separate ESXi host running the same stress test. This article brings both results together in a direct comparison so you can see how…
NVIDIA L4 Cooling Results with a Custom 1 Slot Cooler from n3rdware
I’m happy to say that my collaboration with Robbe from n3rdware has finally reached the finish line. He has been designing and selling custom GPU coolers for years, with most of his customers coming from the home lab community, so this project was in very capable hands from the start. As part of this collaboration,…
My NVIDIA L4 Now Runs 18°C Cooler with a Custom 3 Slot Cooler
I’m happy to say that my collaboration with Robbe from n3rdware has finally reached the finish line. Robbe has been designing and selling custom GPU coolers for years, with most of his customers coming from the home lab community, so this project was in very capable hands from the start. Our goal was to build…
Building a Custom Silent Cooler for NVIDIA A2 and L4 Data Center GPUs with n3rdware
My goal with my home lab has always been clear. Keep it as silent as possible and as cool as possible, without sacrificing performance. Recently I started working more seriously with the NVIDIA A2 and the NVIDIA L4. Both are data center GPUs designed to operate inside rack servers with strong, directed airflow. They do…
Silent Cooling Solution for the Nvidia L4 24 GB GPU
I am keeping this post very short, with mostly photos. I tested the cooling performance with different games. The GPU’s maximum power is 72 W, though it exceeded 75 W during my tests. It can also be limited to 30 W. I tested the GPU by running games like Black Myth: Wukong, Cyberpunk 2077, Uncharted 4: A…
Nvidia L4: Powerful Low-Power GPU for Nvidia AI Enterprise and Virtual GPU
I’ve been searching the internet for a long time to find a versatile GPU for AI and video graphics workloads that also supports vGPU and Nvidia AI Enterprise. Some of the GPUs I considered were the RTX 6000 Ada, A2, A10, L4, T4, A40, and A16. I was most drawn to the RTX 6000 Ada…