MuSE: Clustering, Extensibility & Almost Infinite I/O Power with OPAL-RT Simulations

OPAL-RT has improved the user experience for setups that use multiple simulators or remote targets, or that require clustered machines for increased I/O or other capabilities. It does so through a high-speed link referred to as MuSE (Multi-System Expansion link).

Irène Pérès, MuSE’s Product Owner at OPAL-RT, recently joined us to speak about the incredible power, flexibility, versatility and user-friendliness of this special add-on—that is so much more than just an expansion kit.

Interviewer (IV): Thanks for joining us, Irène! I have a partial understanding that MuSE is sort of a hardware expansion kit, allowing clustering of multiple machines, and frequently used for augmenting available I/O. Is that how you’d describe it?

Irène Pérès (IP): “MuSE does indeed involve some hardware components: optical fiber and SFP (Small Form-factor Pluggable) transceivers. However, the hardware is just the tip of the iceberg, as the essence of MuSE lies in the software layers we’ve developed. The problem we needed to solve was: when a user needs more I/O channels than are available in one chassis, how can we facilitate connection with other chassis to add that additional I/O capacity?”

IV: MuSE has been used for simulation in industries where they tend to use a lot of I/O, like aviation and aeronautics—is that right?

IP: “Yes! We first developed the link with these industries in mind. But actually, we see an emerging trend to use MuSE in very diverse applications and even for smaller systems. The ease with which systems can be joined in a modular way is a very persuasive selling point. It makes things a great deal easier. So, for example, if you have an OP4510 simulator with its 128 I/O channels and need more I/O, with PCIe you could add an OP4520 expansion chassis and double the capacity to 256 I/Os–but what if you needed even more? The OP4510 is a fairly small box, so it can’t take more than one PCIe card. But now, using the four SFP ports in the front of the chassis, we can connect four I/O expansion chassis to that OP4510, and these chassis can be of any type–even bigger ones like the OP5607 with 256 I/O channels, multiplying the available I/O several times over.”

IV: So you mean you can mix any type of OPAL-RT chassis with MuSE?

IP: “Indeed! As I mentioned, we were using SFPs to connect to third-party devices (amplifiers, MMC controllers), and already had SFP ports on all our newer systems. So the software we developed is compatible with all systems, which allows the user to build a network of simulators and expansion chassis of different types, depending on their requirements and budget. On our OP5707 simulator, for example, we have 16 SFP ports, so we can connect 16 OP5607 chassis, each with up to 256 I/Os. We have even recently upgraded the OP5600 family to add MuSE capability, with the new OP5650 just released, which has an Artix-7 FPGA and 4 SFP ports.”
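The channel counts quoted so far reduce to simple arithmetic. The short sketch below is illustrative only: the function name is hypothetical, and it assumes that every SFP port hosts one fully populated expansion chassis and that channel counts simply add up. It is not part of any OPAL-RT tool.

```python
def expansion_io(sfp_ports: int, io_per_chassis: int, base_io: int = 0) -> int:
    """Total I/O channels when every SFP port hosts one expansion chassis.

    Hypothetical helper: assumes channel counts simply add up.
    """
    if sfp_ports < 0 or io_per_chassis < 0 or base_io < 0:
        raise ValueError("counts must be non-negative")
    return base_io + sfp_ports * io_per_chassis


# OP4510 (128 channels, 4 SFP ports) with four 256-channel OP5607 chassis
print(expansion_io(4, 256, base_io=128))  # 1152

# OP5707 (16 SFP ports) with sixteen OP5607 chassis, remote channels only
print(expansion_io(16, 256))  # 4096
```

The second figure counts only the channels on the remote chassis, since the interview does not state how many channels the OP5707 itself provides.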

IV: So really: up to 4,096 I/O connections? Have we ever done anything like that as far as you know?

IP: “Well, let’s run through some pretty exceptional simulation challenges, and it’ll give you an idea of the flexibility and expandability we can provide. Configurations like this simply weren’t possible with PCIe in a user-friendly way, if at all:

  • One of our clients needed to expand their OP4510 simulator (running with RT-LAB) to perform real-time control of power electronics and electrical machines on a test bench. In that case, adding four clustered OP4200s allowed them to run the control on the real-time computer of the OP4510 while driving the inverters and collecting measurement data on the OP4200s;
  • Another client has two setups, one OP5707 with four OP4520s, and another OP5707 with four OP4200s, running with HYPERSIM. In this case, the remote units are located ~100 meters from the simulator, and the customer needs to use the optical fiber pre-installed in their building. We could not add real-time synchronization cables between the units, as we had done previously, so with this setup in mind we integrated our real-time synchronization signal into the MuSE link, and all communications now run over one single link;
  • Finally, one of our long-term clients had bought several of our chassis (OP5600, OP4520, OP5707, OP4510) and wanted to reuse these for a larger project they were preparing. Adding the MuSE capability allows them to connect their OP4510s and OP4520s to the OP5707 to collect measurement data, yet still be able to easily disconnect an OP4510 to use it as a standalone chassis in other experiments when needed.”

IV: Well, we’ve talked a lot about hardware, but you mentioned the software layer was a large part of the link. What exactly does it bring, in terms of expandability?

IP: “Usability, both during model preparation and during execution of the simulation, was our main objective. In the past, we had developed a generic Xilinx Aurora interface that users could use to prepare FPGA bitstreams for our chassis. But managing the data packing/unpacking and the timing constraints of the protocol was cumbersome, and we understood that our users were obviously more interested in developing their control algorithms than tackling these low-level problems. Now the Aurora integration in the bitstream is completely taken care of by our RT-XSG toolbox during the generation of the bitstream.”

“And for the CPU model, we made it easy through the RT-LAB and HYPERSIM interfaces: the user has only to add remote units through the GUIs. The simulator’s bitstream will not need to be changed if the users need to use one more remote unit later on: everything is taken care of. Much more user-friendly, and a huge step forward!”

IV: Pretty exceptional, and we helped them all out. Okay, so what we refer to as MEA (More Electric Aircraft), or electric boats, for example–it’s not hard to imagine a scenario where we actually need 4,096 connections! Given the increasing electrical/electronic complexity of some of the transportation and other electronic networks and infrastructures out there, that is.

IP: “We know, through the installations I’ve outlined to you above, that these setups work reliably. Knowing this and going forward, in terms of expandability, it’s more an issue of time, of latency—how much data can you transfer in a time-step? But you have the same considerations as PCIe. We start to hit the ceiling at some point–but that’s not a limitation distinct to MuSE, or imposed by OPAL-RT.”
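The question of how much data fits in a time-step can be framed as a rough upper bound: line rate times step duration, reduced by line-coding overhead. The sketch below uses purely hypothetical placeholder values (a 5 Gbit/s line rate, a 20 µs time-step, 32-bit words, and an 8/10 coding efficiency); none of them are MuSE specifications, and the estimate ignores protocol framing and link latency.

```python
def words_per_step(line_rate_bps: int, step_ns: int, word_bits: int = 32,
                   eff_num: int = 8, eff_den: int = 10) -> int:
    """Upper bound on payload words one link can carry in one time-step.

    All parameters are illustrative assumptions. Integer arithmetic is used
    so that the result is exact and always rounded down.
    """
    if min(line_rate_bps, step_ns, word_bits, eff_num, eff_den) <= 0:
        raise ValueError("all parameters must be positive")
    numerator = line_rate_bps * step_ns * eff_num          # bits x 1e9 x eff_den
    denominator = 1_000_000_000 * eff_den * word_bits      # undo ns, efficiency, word size
    return numerator // denominator


# Hypothetical link: 5 Gbit/s, 20 microsecond step, 32-bit words
print(words_per_step(5_000_000_000, 20_000))  # 2500
```

Halving the time-step halves the budget, which is why the ceiling mentioned above applies to any link technology, PCIe included.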

IV: And in terms of performance, do the optical fibers reduce latency?

IP: “Actually it is not the link itself which removes latency, but the fact that we no longer need to use one or more PCIe slots managed by the chassis’ motherboard, and different PCIe interconnection cards and cables with various chassis. We thus gained better control by using the Aurora link, and the only PCIe connection remaining is between the simulator’s motherboard and the simulator’s internal FPGA. The rest of the communication happens between the FPGAs of the various chassis directly. We also control the network enumeration, the detection of the number and type of chassis connected, etc.”

IV: In your experience, is this a highly specialized configuration for a certain type of user…?

IP: “Not at all, perhaps even the opposite. Before, when we had PCIe connections, the users didn’t have to worry about it, the motherboard did. So we wanted to have the same ease of use with the SFP and optical fibers. Generally, all the user has to do is cable the SFPs, and configure the simulator and its remote chassis in RT-LAB or HYPERSIM. The way the users interact in the model, it’s the same as with any other chassis–there’s nothing special to program.”

IV: So I understand this doesn’t just increase the I/O of the machine, but it also allows many other connection schemes. So this is not just a linear improvement in use, but an exponential one. Are there other developments coming?

IP: “Of course! We achieved ease of use for the CPU model and the bitstream preparation. But we are now tackling other aspects of FPGA-to-FPGA communications required by our users: models distributed over multiple FPGAs, communication with power amplifiers, with MMC controllers, etc. I think we’re just in the first phase of this MuSE project, and there are plenty of interesting features still to come!”

About the Interviewee

Irène Pérès joined OPAL-RT in 2001, and she has been involved in the development of OPAL-RT simulators, including software drivers, firmware and aspects of hardware management.
She is now a Product Manager and technical fellow and focuses on the evolution of OPAL-RT multi-FPGA platforms.
From her early career years (she received a Ph.D. in Plasma Physics from Paul Sabatier University–in Toulouse, France–in 1990), she retains a strong interest in challenging, complex technical projects, and strives to bring simplicity to these challenges.

HYPERSIM on Demand

As part of an ongoing blog post series reflecting on our biggest product introductions of 2018, we spoke with HYPERSIM Product Manager, Etienne Leduc. Etienne comments here on the introduction of OPAL-RT’s new EMT prep station platform offering, HYPERSIM on Demand. It allows users to accelerate the prototyping, development and testing of power system equipment on remote servers, without concerns about IT resources, licensing, or the scarcity of high-performance hardware.

What is perhaps the most innovative feature though, is that since the cloud is infinitely scalable, and since HYPERSIM on Demand works on a ‘# of cores times # of hours’ pricing model, scenarios the industry has never been able to accommodate suddenly become available, and former limitations are struck down. A train manufacturer with a new network design, for example, may require 20,000 hours of simulation to satisfy industry requirements. The math on this is simple: 20 cores times 1,000 hours each—or, for that matter, 200 cores times 100 hours each: the choice is yours!
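The “cores times hours” trade-off can be made concrete with a few lines of arithmetic. This is a minimal sketch with a hypothetical helper name; it assumes the workload divides evenly across cores (for example, as independent simulation runs) and deliberately leaves out any price, since none is quoted here.

```python
def core_hour_splits(total_core_hours: int,
                     core_options: list[int]) -> list[tuple[int, int]]:
    """Return (cores, hours_per_core) pairs whose product equals the total.

    Hypothetical helper: core counts that do not divide the total evenly
    are skipped.
    """
    return [(cores, total_core_hours // cores)
            for cores in core_options
            if cores > 0 and total_core_hours % cores == 0]


for cores, hours in core_hour_splits(20_000, [20, 200, 2_000]):
    print(f"{cores} cores x {hours} h each = {cores * hours} core-hours")
```

Every row adds up to the same 20,000 core-hours; what changes is how long the user waits for the last result.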

We spoke with Etienne as he prepared for the company’s participation at CIGRE, a world forum for power systems important to several of OPAL-RT’s core sectors.

Interviewer [IV]: “So, Etienne. Good day. You’re the HYPERSIM product manager and have been for a while. How do you see this suite of tools as an addition to the simulation capabilities at any given ‘shop’?”

Etienne Leduc [EL]: “Good day to you! Well, first of all, make no mistake, it’s currently possible to run real-time simulations only on our HIL simulators to test external devices such as PMUs and controllers. In that way, the cloud implementation is not meant to be everything to all people; but what it does open up and make abundantly possible is very exciting in and of itself.”

“We were already very proud of the speed at which the user can prepare their real-time simulation and run offline tests on their own Windows PC, thanks to our know-how in parallelizing computation. This is called ‘accelerated simulation’ or ‘faster than real-time’. This particular capability is not currently available on the power system simulation market elsewhere, and with cloud simulation, we’ve pushed it multiple levels further—it’s truly amazing!”

“Another thing we’re currently investigating is to make sure that the latency between the data centers and the hardware in the user’s lab is low enough to run tests using Ethernet-based communication protocols, such as C37.118, OPC-UA or other SCADA protocols. The remote cloud server solutions are coming of age, and are filling in niche needs, use cases, and scenarios that have always existed, but perhaps have not been addressed directly up to this point due to various limitations.”

“So we’ve already changed the market thanks to the ‘very high performance at very low cost’ that our cloud services represent, and we’ll continue to change it further once we’ve enabled cloud-based Hardware-in-the-Loop simulation.”

IV: “We’re the first company in our market to make it to the ‘cloud’—how do you feel about us being ‘disruptive’ in this way, and how might it help or change the market?”

EL: “You know, OPAL-RT has always been about disruptive innovation and democratizing real-time simulation. We’ve had a dream since 1997, and a credo that goes with it, to “put a high-end simulator on every engineer’s desk”. Jean Bélanger and Lise Laforce, our founders, were and are very serious about this. To achieve this dream, we have to work on multiple aspects of our products, such as performance, usability, reliability, ease of access, etc.”

“And cloud services address exactly these aspects. By having your simulation computationally powered by Amazon, you ensure data security, you have access to the latest hardware technology, and can simulate from anywhere around the globe, as long as you have an internet connection. More and more tools—powerful, world-class tools—are available nowadays directly through a web browser, or through the Software as a Service [SaaS] business model, and this is also a next step we are aiming at; we have some milestones attached to that. One of the ways this becomes really exciting for our core customers in various key sectors is that the user is entirely freed from managing software versions, subsystems, licensing, and hardware, and whose turn it is to use the IT resources that week, etc. All that just becomes a non-issue at this stage.”

IV: “How might existing OPAL-RT customers or new ones work this into their simulation routines, depending on the simulation type they’re doing?”

EL: “The answer to this is so simple that it’s also disruptive! From the software point of view, the user simply has to add the virtual simulator, as with any other target, by using HYPERSIM’s target manager and inputting the IP address. From the point of view of habitual workflow, all users have to validate various elements of their model(s), at various stages, and prepare test cases before they go to real-time.”

“If the access to the simulator is limited, whether it’s shared with colleagues or located in another building, for example, the user could previously only run their simulations on their own machines. And this is what everyone got used to doing! Now: this might work seamlessly for smaller models, but if you’ve got complex or much larger models, you’d be rapidly pretty limited as you sat and watched your machine chew on the simulation. And an extremely powerful personal computer (which may only be required for a small portion of the time) is not necessarily within the reach of every user, or university budget, or design lab.”

“So as soon as the user feels their simulation is running too slowly–which is common these days with the integration of distributed generation, more complex control and protection schemes, FACTS and HVDC, etc.–they can switch, in literally one click, to a virtual, fast parallel simulator, and access the massive server power of Amazon. And the power is far from being the only great thing about it. When validating a model, you can spend a lot of time analyzing data after the simulation itself—going back and looking at various items forensically. With the service we’re offering, the user pays only for the time during which they actually use the virtual simulator—not for the time it’s sitting around unused during data analysis, for example.”

“There are various ideal use cases for HYPERSIM on Demand, and what they all have in common is high need for short-term computational power and immediate access to resources: students with huge end-of-term Engineering projects and theses; or manufacturers who have to do what used to be cripplingly large chunks of simulation in a short time. This is a solution whose time has come, and we’re receiving a lot of enthusiastic feedback from various sectors.”

IV: “Why would a client use HYPERSIM on Demand if they already had a fast real-time simulator, either HYPERSIM or a competing solution?”

EL: “It’s a matter of asset optimization. In practice, several simulation specialists need to perform offline simulation studies over several months, using software like PSCAD and EMTP-RV, before performing real-time hardware-in-the-loop (HIL) simulation tests with an external control replica. These specialists must optimize the global power grid, including protection and control systems, using approximate numerical models. Several teams must work concurrently to find the critical cases to be simulated later, using actual protection and control system replicas.”

“In theory, the real-time simulator could be used to perform offline studies faster, since they use parallel processing. But in practice, real-time simulators are a precious asset always used to test actual protection and control hardware systems. As a result, using HYPERSIM on Demand offers the possibility of executing several simulation studies concurrently by several teams while the real-time simulator is busy. This is called ‘Software-in-the-loop’, implying that control and protection systems are simulated in software instead of using the actual hardware.”

IV: “What are the real advantages of HYPERSIM on Demand over more classical methods consisting of running several single-processor simulations in parallel?”

EL: “It is true that running 100 simulations taking ten minutes each will give 100 results after 10 minutes if 100 processors are used to simulate a moderately large network. On the other hand, HYPERSIM will give one result every six seconds due to its very efficient parallel computing features—whose efficiency increases with the complexity and size of the power grid. This means two things. One, that users are getting much better interactivity! This is like HIL, but with a Human-in-the-Loop! If the simulation goes poorly for any reason, you don’t have to wait 10 minutes before you realize it. This means users can interact with the simulation to find worst-case scenarios using human intelligence and intuition. And two, this also means that one could develop an automatic testing system based on clever algorithms or artificial intelligence to find the worst cases and to optimize parameters automatically and in less time. It’s all about making more sophisticated tests in less time, in order to decrease time-to-commissioning, even with increasing grid size and complexity.”
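The difference between the two approaches is in when results arrive, not in the total time. The sketch below compares completion times under a simplifying assumption that is ours, not a measured figure: the parallel solver achieves an ideal 100x speed-up on a single simulation, which is what the six-second figure implies. Function names are hypothetical.

```python
def batch_completion_times(n_sims: int, sim_seconds: float) -> list[float]:
    """One simulation per processor: every result arrives when the batch ends."""
    return [sim_seconds] * n_sims


def pipelined_completion_times(n_sims: int, sim_seconds: float,
                               speedup: float) -> list[float]:
    """All processors cooperate on one simulation at a time.

    Assumes an ideal speed-up factor, so each run takes sim_seconds / speedup.
    """
    per_sim = sim_seconds / speedup
    return [per_sim * (k + 1) for k in range(n_sims)]


batch = batch_completion_times(100, 600.0)
piped = pipelined_completion_times(100, 600.0, speedup=100.0)

print(batch[0], batch[-1])  # 600.0 600.0
print(piped[0], piped[-1])  # 6.0 600.0
```

Both approaches finish the hundredth run after ten minutes, but the second one delivers its first result after six seconds, which is what makes it possible to steer or abort a study early.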

IV: “Etienne, I’d like to thank you for your time, and for this glimpse into the potential future of simulation. Very informative and entertaining—thanks.”

Etienne Leduc | Product Owner of HYPERSIM®

Etienne Leduc received his bachelor’s degree in Electrical Engineering from Germany’s Bremen University of Applied Sciences in 2013 and joined the R&D department of OPAL-RT upon his return to Canada. After having worked in the protection field and IEC 61850 with HYPERSIM for a year, Etienne focused on providing services for the growing number of HYPERSIM customers. Etienne was promoted to Product Owner of HYPERSIM in 2015, and has since been dedicated to planning and designing the software suite’s evolution and development.
