Penguin Solutions (PENG) said Thursday it released an updated version of its ClusterWareAI software, adding tools to help operators manage and optimize large-scale AI infrastructure.
The company said the update includes an AI Factory Operations Agent that lets administrators monitor GPU cluster performance and troubleshoot issues using natural language queries.
The release also expands automated remediation for Kubernetes-based inference environments and adds hardware-level monitoring to detect performance degradation, the company said.
Penguin said the software is designed to improve AI infrastructure performance, resilience, and GPU utilization while reducing operational complexity.