Fujitsu Develops Real-Time CPU/GPU Optimization
Fujitsu develops the world's first technology for real-time CPU and GPU processing optimization to address the global GPU shortage.
November 10, 2023
Tokyo (Japan), November 9, 2023 – Fujitsu today announced the development of the world’s first technology to optimize the use of CPUs and GPUs by allocating resources in real time to give priority to processes with high execution efficiency, even when running programs that use GPUs. Fujitsu designed the new technology to address the global shortage of GPUs due to the explosive demand for generative AI, deep learning, and other applications, by optimizing users’ existing computing resources.
Fujitsu has also developed a new parallel-processing technology for HPC systems, which perform large-scale computations by linking multiple computers. The technology switches between multiple programs in real time without waiting for a running program to complete. This makes it possible to immediately execute applications that require large-scale computational resources and real-time performance, such as digital twin and generative AI programs.
Fujitsu will provide the newly developed technology as part of a future computer workload broker, a software initiative currently under development that enables AI to automatically calculate and select the most appropriate resource for a problem that customers want to solve according to their requirements, including computation time, computation accuracy, and cost. Fujitsu will continue to validate this technology with customers, to realize a platform that can solve societal problems and create innovation to achieve a sustainable future.
Fujitsu will demonstrate these technologies at SC23, which will be held at the Colorado Convention Center in Denver, Colorado, from Sunday, November 12, 2023, with the HPC technology featured in the Research Posters program.
Features of new technology
1. World’s first technology to use CPU and GPU even during program processing
For example, as shown in Figure 1, suppose the user wants to efficiently process three programs using one CPU and two GPUs. GPUs are first assigned to programs 1 and 2 according to GPU availability. Then, in response to a request from program 3, the GPU allocation is temporarily changed from program 1 to program 3 so that the degree of processing acceleration on the GPU can be measured. In this example, the measurement shows that the overall processing time would be shorter if the GPU were allocated to program 3 rather than to program 1. The GPU is therefore allocated to program 3, and the CPU is allocated to program 1 during that time. When program 2 finishes, a GPU becomes free and is allocated to program 1 again. In this way, computational resources are allocated so that program processing is completed in the shortest time. This technology makes it possible to quickly train models for processing graph AI data in the development of applications such as AI using GPUs and advanced image recognition.
2. World’s first technology for real-time switching of execution of multiple programs on an HPC system
The conventional control method uses unicast communication, which sends the instruction to switch program execution to each server one by one. This causes variations in switching timing and makes it difficult to switch program execution on all servers as a batch in real time. By instead using broadcast communication, which delivers the switching instruction to all servers simultaneously, Fujitsu has enabled real-time batch switching of program execution. The interval between program processing switches, which affects program performance, is reduced from a few seconds to 100 milliseconds in a 256-node HPC environment.
Since the appropriate communication method depends on application requirements and network quality, the optimal communication method can be selected in consideration of the degree of performance improvement due to broadcast communication and performance degradation due to packet loss. This technology enables applications requiring real-time performance for digital twins, generative AI, and materials and drug discovery to be executed more rapidly using HPC-like computational resources.
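The allocation flow described in item 1 can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not Fujitsu's implementation: the `Program` class, the `allocate` and `simulate` functions, the workload sizes, and the speedup factors are all hypothetical, and the sketch assumes the GPU speedup of each program has already been measured and that a program running on the CPU is not slowed by the others. It uses a simple greedy rule (GPUs go to the programs with the highest measured speedup) and re-runs the allocation whenever a program finishes.

```python
from dataclasses import dataclass, replace


@dataclass
class Program:
    name: str
    cpu_time: float     # remaining work, in seconds if run on the CPU (hypothetical)
    gpu_speedup: float  # measured speedup factor on a GPU (hypothetical)


def allocate(programs, num_gpus):
    """Greedy rule: GPUs go to the programs with the highest measured speedup."""
    ranked = sorted(programs, key=lambda p: p.gpu_speedup, reverse=True)
    on_gpu = {p.name for p in ranked[:num_gpus]}
    return {p.name: ("GPU" if p.name in on_gpu else "CPU") for p in programs}


def simulate(programs, num_gpus):
    """Re-allocate each time a program finishes; return (timeline, total time)."""
    active = [replace(p) for p in programs]  # work on copies of the inputs
    clock = 0.0
    timeline = []
    while active:
        plan = allocate(active, num_gpus)
        timeline.append((clock, plan))
        # Work completed per wall-clock second under the current plan.
        rate = {
            p.name: (p.gpu_speedup if plan[p.name] == "GPU" else 1.0)
            for p in active
        }
        # Advance time until the next program finishes.
        step = min(p.cpu_time / rate[p.name] for p in active)
        clock += step
        for p in active:
            p.cpu_time -= step * rate[p.name]
        active = [p for p in active if p.cpu_time > 1e-9]
    return timeline, clock


# Three programs, one CPU, two GPUs, as in the example above (values are made up).
demo = [
    Program("program 1", cpu_time=100.0, gpu_speedup=2.0),
    Program("program 2", cpu_time=60.0, gpu_speedup=6.0),
    Program("program 3", cpu_time=120.0, gpu_speedup=8.0),
]
timeline, total = simulate(demo, num_gpus=2)
for start, plan in timeline:
    print(f"t={start:5.1f}s  {plan}")
print(f"all programs finished after {total:.1f}s")
```

With these made-up numbers, program 1 starts on the CPU while programs 2 and 3 hold the GPUs, and program 1 moves to a GPU as soon as program 2 finishes, which mirrors the sequence described in item 1. A real scheduler would also have to account for the cost of the measurement itself and of moving a running program between devices, which this sketch ignores.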
Future Plans