Phoenix Rising: Next-Generation Job Scheduling with Fenice
Deep in the belly of Altair Labs — rumored to be tucked away in an arcane network of subvolcanic tunnels — our boffins have been busy. One of their latest projects, codenamed “Fenice” (which is Italian for phoenix and pronounced “fen-ee-chay”), promises to be the future of job scheduling and best-of-everything workload management.
Fenice is designed to be scalable and flexible enough to handle every workload, from millions of short, high-throughput electronic design automation (EDA) jobs to large, complex high-performance computing (HPC) jobs. The plug-and-play solution will combine all the best features of the trusted Altair scheduling and workload management tools on which many organizations already rely.
“The goal of our next-generation scheduler is to help Altair customers extract maximum value from the computing resources they’ve invested in,” says Altair Chief Scientist Andrea Casotto. “Fenice delivers unprecedented performance and capabilities that aren’t available anywhere else.”
Fenice Highlights
- Scalable from a single job to 20 million queued jobs
- Small memory footprint, even at the highest load levels
- Top performance with up to 70,000 tasks per second
Beyond job scheduling, Fenice’s capabilities include cost control, job prediction, and workload analysis.
To manage millions of queued jobs, Fenice allows a federation of schedulers to work together as one. The workload is distributed across multiple schedulers. As demand evolves, both workload and compute resources can be migrated across federation components.
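To make the federation idea concrete, here is a minimal toy sketch in Python. It is not Fenice's implementation or API: the `Scheduler` and `Federation` classes, the shortest-queue routing rule, and the `rebalance` method are all hypothetical, chosen only to illustrate several schedulers acting as one and queued work migrating between them.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Scheduler:
    """One member of a toy federation, holding its own job queue."""
    name: str
    queue: deque = field(default_factory=deque)


class Federation:
    """Presents several schedulers as a single submission point (hypothetical)."""

    def __init__(self, names):
        self.members = [Scheduler(n) for n in names]

    def submit(self, job_id):
        # Route each new job to the member with the shortest queue.
        target = min(self.members, key=lambda s: len(s.queue))
        target.queue.append(job_id)
        return target.name

    def add_member(self, name):
        # Grow the federation as demand evolves.
        self.members.append(Scheduler(name))

    def rebalance(self):
        # Migrate queued jobs from the busiest member to the idlest one
        # until their queue lengths differ by at most one.
        moved = 0
        while True:
            busiest = max(self.members, key=lambda s: len(s.queue))
            idlest = min(self.members, key=lambda s: len(s.queue))
            if len(busiest.queue) - len(idlest.queue) <= 1:
                return moved
            idlest.queue.append(busiest.queue.pop())
            moved += 1


fed = Federation(["sched-a", "sched-b"])
for job_id in range(10):
    fed.submit(job_id)

fed.add_member("sched-c")   # a new scheduler joins with an empty queue
moved = fed.rebalance()     # queued jobs migrate toward the new member
```

A production scheduler would weigh far more than queue length, such as resource requirements, priorities, and data locality. The sketch only shows the shape of the idea: one submission point, many queues, and migration when the balance shifts.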
Space Science and the Swift X-Ray Telescope
As part of NASA’s Explorers Program, in cooperation with Italy and the U.K., the Swift X-Ray Telescope (XRT) aboard the orbiting Neil Gehrels Swift Observatory measures the position, spectrum, and brightness of gamma-ray bursts, afterglows, and many other cosmic X-ray sources.
When a team of researchers needed to process the entire Swift XRT archive, which had been collected over 18 years, they enlisted Fenice to get the job done, and they completed it in record time.
The Swift XRT workload consisted of over 600,000 jobs. The team used task parallelism, running the processing as many independent jobs in an AWS cloud environment. Starting from a previously released Swift-DeepSky software pipeline encapsulated in a Docker container, the team processed the entire collection of available XRT X-ray images within an unprecedented 60 hours of wall-clock time and at a cost of less than $200, a massive reduction compared to traditional methods.
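For a sense of scale, the short Python sketch below does two things. First, it illustrates task parallelism with a placeholder `process_observation` function; this name and the thread pool are assumptions for illustration, not the Swift-DeepSky pipeline or the way Fenice dispatched the work. Second, it turns the quoted figures into rough averages, treating "over 600,000 jobs", "60 hours", and "less than $200" as round bounds.

```python
from concurrent.futures import ThreadPoolExecutor


def process_observation(obs_id):
    # Placeholder for one independent job (hypothetical); the real workload
    # ran a containerized pipeline, which is not reproduced here.
    return obs_id, f"processed-{obs_id}"


def run_independent_jobs(obs_ids, workers=4):
    # Jobs share no state, so they can run in any order on any worker.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(process_observation, obs_ids))


results = run_independent_jobs(range(5))

# Figures quoted in the article, treated as round bounds.
JOBS = 600_000    # "over 600,000 jobs"
HOURS = 60        # "60 hours of wall-clock time"
COST_USD = 200    # "less than $200"

jobs_per_hour = JOBS / HOURS            # at least this many on average
jobs_per_second = jobs_per_hour / 3600
cost_per_job = COST_USD / JOBS          # at most this many dollars per job

print(f"{jobs_per_hour:,.0f} jobs/hour, {jobs_per_second:.2f} jobs/second")
print(f"under ${cost_per_job:.5f} per job")
```

On those bounds, the run averaged at least 10,000 jobs per hour, or close to three jobs per second, at well under a tenth of a cent per job.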
From Outer Space to Wall Street
Turning data into useful information that enables researchers to advance their science is a big part of what keeps us excited about the technology we work to evolve — and astrophysical data is just the start.
Fenice has also shown excellent efficiency in running high-throughput jobs in areas like financial services and semiconductor design, and it’s capable of full-spectrum workload management.
What’s next from Altair Labs? Stay tuned for more hot developments from under the volcano!