
Partner Perspectives

Altair® PBS Professional® and Lenovo Intelligent Computing Orchestration (LiCO): Advanced HPC and AI Workload Management

By Altair | Lenovo

High-performance computing (HPC) and artificial intelligence (AI) are often characterized by a mix of diverse workloads, from large language models (LLMs) and large simulations to numerous small tasks, all running on differently sized clusters. Allocation optimization and efficient compute resource utilization are critical for these workloads, especially in HPC environments.

Lenovo and Altair deliver a fully integrated, easy-to-use, thoroughly tested and supported compute orchestration solution that helps customers manage their workloads and simplifies administration. The solution ensures jobs are scheduled efficiently, thereby minimizing resource waste and maximizing performance.

 

Efficient Orchestration with LiCO

Lenovo Intelligent Computing Orchestration (LiCO) is an in-house-developed software solution that simplifies the use of clustered computing resources for AI model development and training and for HPC workloads.

The unified platform simplifies interaction with underlying compute resources, enabling customers to leverage popular open-source cluster tools and reduce the effort and complexity of using them for HPC and AI.

 

Altair® PBS Professional® in Action with LiCO

Altair® PBS Professional® is fully integrated into LiCO to provide users with a smart, easy way to schedule and optimize their HPC and AI workloads. 

Lenovo and Altair’s combined efforts give customers a single, easy-to-use solution that’s thoroughly tested and supported so they can focus on their work without the burden of managing the complexity of an HPC/AI cluster.

The integration of LiCO and PBS Professional results in more effective resource utilization and job scheduling. With automation and efficient resource management, the LiCO and PBS Professional combination frees IT staff from routine tasks, increasing productivity and allowing them to focus on more strategic responsibilities. This improves efficiency, reduces wait times, and lowers operational costs.

Resource optimization and efficient workload management can result in significant cost savings by reducing energy consumption and maximizing hardware utilization.

The Lenovo-Altair solution is highly scalable, accommodating organizations of all sizes and enabling expansion of computing resources as needed to support growing workloads.

LiCO's compatibility with AI frameworks complements PBS Professional's ability to handle HPC workloads. This combination allows organizations to leverage both AI and HPC resources efficiently, making it suitable for research and development projects that require diverse computing capabilities. Detailed reporting and analytics provide organizations with valuable insights into resource usage, job performance, and system health, enabling data-driven decision-making and continuous optimization.

The integrated solution facilitates team collaboration by enabling users to share computing resources and workloads, enhancing the cooperative aspects of research and development.

Both LiCO and PBS Professional include strong security features to protect sensitive data, ensuring that only authorized users can access and manage resources.

Another advantage is ecosystem compatibility. LiCO and PBS Professional are designed to integrate with various HPC and AI hardware and software solutions, making the combined offering adaptable to existing infrastructures and workflows.

 

LiCO and PBS Professional Integration

HPC and AI users provide inputs (scripts, containers, and resources requested) through LiCO’s interface, and LiCO creates a PBS Professional batch script to deploy and manage workloads based on those inputs.
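As a rough illustration of this step, the sketch below turns a set of user inputs into a PBS Professional batch script. It is not LiCO's actual implementation, which the article does not describe. The `JobRequest` class, its field names, and the `render_pbs_script` function are hypothetical; only the `#PBS` directives (`-N`, `-q`, `-l select=...`, `-l walltime=...`) and the `PBS_O_WORKDIR` variable are standard PBS Professional usage. The `ngpus` resource is assumed to be defined at the site.

```python
from dataclasses import dataclass


@dataclass
class JobRequest:
    """Hypothetical stand-in for the inputs a user enters in the web form."""
    name: str
    queue: str
    nodes: int
    cpus_per_node: int
    gpus_per_node: int
    walltime: str   # HH:MM:SS
    command: str    # the user's script or container command line


def render_pbs_script(req: JobRequest) -> str:
    """Render the request as the text of a PBS Professional batch script."""
    # Resource request: one chunk per node, each with the requested CPUs.
    chunk = f"select={req.nodes}:ncpus={req.cpus_per_node}"
    if req.gpus_per_node > 0:
        # Assumption: the site defines an "ngpus" resource for GPUs.
        chunk += f":ngpus={req.gpus_per_node}"
    lines = [
        "#!/bin/bash",
        f"#PBS -N {req.name}",
        f"#PBS -q {req.queue}",
        f"#PBS -l {chunk}",
        f"#PBS -l walltime={req.walltime}",
        # Start in the directory the job was submitted from.
        'cd "$PBS_O_WORKDIR"',
        req.command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    request = JobRequest(
        name="train", queue="gpuq", nodes=2, cpus_per_node=16,
        gpus_per_node=1, walltime="04:00:00", command="python train.py",
    )
    print(render_pbs_script(request))
```

The resulting text could then be handed to `qsub`. Keeping the rendering separate from the submission makes the generated script easy to inspect before it reaches the scheduler.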

Queues allow administrators to partition hardware resources based on varying types or requirements. Within the integrated environment of LiCO and PBS Professional, administrators can establish and modify queues directly from the LiCO interface, eliminating the need to access the PBS Professional scheduler via a console. This streamlined approach simplifies the ongoing management of enterprise AI environments and reduces the necessity for specialized expertise in HPC software tools.

A Scheduler page is available for the administrator role in LiCO’s graphical interface, allowing administrators to create, edit, and delete queues and set queue and node state.

Figure 1. Scheduler management page in LiCO

Queues can be created through the graphical interface, so administrators who are not well-versed in command-line operations can still perform the task efficiently. Within the interface, they can specify parameters such as node assignment, job priority, maximum job runtime, and whether compute resources (individual CPUs) may be shared among multiple jobs, with an optional job count parameter indicating the number of jobs allocated to each resource.

Administrators also have the authority to set a queue’s state, determining whether jobs can be allocated to nodes or placed in a queue for future processing. More experienced users can manage queues through the PBS Professional scheduler's command-line interface. After creating a queue and restarting PBS Professional-related services, the newly created queue becomes accessible via the web portal interface. 

Within the same interface, administrators can edit and delete queues, set a queue's state, and modify node state as needed.

Figure 2. Creating a queue in LiCO
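For administrators who prefer the command line, several of the queue settings described above correspond to PBS Professional `qmgr` commands. The sketch below builds such a command list. It is an assumption about how the settings map onto the scheduler, not a description of what LiCO issues internally. The function name and its parameters are hypothetical. The sketch does not cover the CPU-sharing option, since the article does not say how that setting is applied.

```python
def qmgr_commands(queue, priority, max_walltime, nodes,
                  accept_jobs=True, run_jobs=True):
    """Build qmgr commands that create and configure an execution queue.

    Hypothetical helper: it returns command strings only and does not
    contact a scheduler.
    """
    def flag(value):
        return "true" if value else "false"

    cmds = [
        f"create queue {queue} queue_type=execution",
        f"set queue {queue} Priority={priority}",
        # Maximum runtime a job in this queue may request.
        f"set queue {queue} resources_max.walltime={max_walltime}",
        # enabled: the queue accepts newly submitted jobs.
        f"set queue {queue} enabled={flag(accept_jobs)}",
        # started: jobs in the queue may be scheduled onto nodes.
        f"set queue {queue} started={flag(run_jobs)}",
    ]
    # Associate each node with the queue.
    cmds += [f"set node {node} queue={queue}" for node in nodes]
    return cmds


if __name__ == "__main__":
    for cmd in qmgr_commands("gpuq", 100, "24:00:00", ["node01", "node02"]):
        # Each line could be passed to the scheduler as: qmgr -c "<cmd>"
        print(cmd)
```

Setting `enabled` to true and `started` to false gives a queue that accepts jobs but holds them for later processing, which corresponds to the queue-state choice described above.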

 

Altair and Lenovo for Efficient Compute Orchestration

Together, Lenovo and Altair offer a powerful, comprehensive approach to HPC and AI workload management.

The value of the combined solution, LiCO and PBS Professional, lies in its ability to manage and optimize HPC and AI workloads. It delivers simplicity, resource optimization, scalability, productivity gains, and robust security, making it a smart choice for organizations with diverse and demanding computing requirements in areas including research, engineering, and simulation. The Lenovo and Altair teams will be at Supercomputing 23 from November 12-17, 2023. For more information, please visit Lenovo at Booth #601 or Altair at Booth #825.