The best machine to run PowerFactory
Update — April 13, 2026
I tested a fifth VM (c4-highmem-2) with the same CPU as the winner and it came out 28% slower. I don't know why yet — I have a hypothesis but I haven't verified it. See details ↓
Day 33 / 60
The best machine to run PowerFactory is the one with the highest clock frequency possible.
I'd been avoiding this question for weeks: which machine should run PowerFactory?
I'll be honest: I assumed more cores and more RAM meant faster. I used the one recommended by default, a c2d-standard-16 with 16 cores and 64 GB of RAM. Never measured it.
So I did what I should have done weeks ago: a systematic benchmark. Five machines, 25 power flows each, multiple runs. Same CEN database (~2,600 buses from the Chilean grid), same PowerFactory version, with support from Don Nelson.
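The harness is conceptually simple: discard a warm-up run, then time each flow individually. A minimal sketch of that loop — `run_power_flow` is a placeholder for whatever triggers one solve (here, Spark calling PowerFactory), not a real API:

```python
import statistics
import time

def benchmark(run_power_flow, n_flows=25, warmup=1):
    """Time n_flows consecutive executions of a solver call.

    `run_power_flow` is a placeholder for whatever triggers one
    power flow. A warm-up run is discarded first so a cold disk
    cache doesn't skew the numbers.
    """
    for _ in range(warmup):
        run_power_flow()
    times = []
    for _ in range(n_flows):
        start = time.perf_counter()
        run_power_flow()
        times.append(time.perf_counter() - start)
    return {
        "avg": statistics.mean(times),
        "best": min(times),
        "worst": max(times),
    }
```

Keeping per-flow times (instead of one total) is what makes the throttling pattern below visible at all.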
The machines
| VM | CPU | vCPUs | RAM | Disk |
|---|---|---|---|---|
| n2-standard-4 | Intel 2.2 GHz | 4 | 16 GB | pd-ssd |
| c2-standard-4 | Intel 3.1 GHz | 4 | 16 GB | pd-ssd |
| c2-standard-8 | Intel 3.1 GHz | 8 | 32 GB | pd-ssd |
| c2d-standard-16 | AMD 3.5 GHz | 16 | 64 GB | pd-ssd |
| c4-highcpu-4 | Intel Emerald Rapids ~4.2 GHz | 4 | 8 GB | Hyperdisk |
| c4-highmem-2 (update) | Intel Emerald Rapids ~4.2 GHz | 2 | 15 GB | Hyperdisk |
All running PowerFactory 2024 SP1 on GCP, with Spark 0.3.1 executing the flows. I didn't get to collect full data for the n2-standard-4, but it's the slowest by far — the results below are from the other four. I tested the c4-highmem-2 after publishing the post; details are in the update at the end.
Results
The chart speaks for itself:
The c4-highcpu-4 with 4 cores and 8 GB of RAM averages 2.19 seconds per flow. The c2d-standard-16 — the one I was using — with 16 cores and 64 GB averages 2.72. Four times the resources, 20% slower.
Why? PowerFactory is single-threaded for power flow solving. The only thing that matters is core frequency. More RAM doesn't help. More cores don't help. The 15 extra cores I was paying for did absolutely nothing.
| VM | Average | Best | Worst | Variation |
|---|---|---|---|---|
| c2-standard-4 | 3.903s | 3.704s | 4.136s | ~11% |
| c2-standard-8 | 3.603s | 3.597s | 3.608s | <1% |
| c2d-standard-16 | 2.724s | 2.707s | 2.742s | <1% |
| c4-highcpu-4 | 2.192s | 2.165s | 2.239s | <3% |
Another data point confirming the thesis: the c2-standard-8 (8 cores) is only ~3% faster than the c2-standard-4 (4 cores) on their best runs (3.597s vs 3.704s); the gap in the averages is mostly the c2-standard-4 throttling. Same frequency, double the cores, virtually identical performance. The extra cores are useless here.
Thermal throttling
The c2-standard-4 has a problem: it throttles under sustained load. On the second run, the first 13 flows average 3.7 seconds, but from flow 14 onward they jump to 4.6 seconds. The CPU drops frequency, presumably to stay within its thermal and power limits.
The red line is the c2-standard-4: you can clearly see the jump after flow 13. The c4-highcpu-4 (green) and c2d-standard-16 (blue) stay flat. The c4 isn't just faster — it's consistent.
This matters. When you run a full study with multiple scenarios, you need each flow to take the same time. If the machine throttles halfway through, your timing becomes unpredictable and your cost estimates are useless.
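You can flag this automatically from the per-flow times — a sketch (not part of Spark), comparing the mean of the early flows against the late ones:

```python
def detect_throttle(times, split=None, threshold=1.15):
    """Flag a run as throttled when the mean of the later flows
    exceeds the mean of the earlier flows by more than `threshold`
    (15% by default). `times` is a list of per-flow seconds.
    """
    split = split or len(times) // 2
    early = sum(times[:split]) / split
    late = sum(times[split:]) / (len(times) - split)
    return late / early > threshold

# The c2-standard-4 pattern: ~3.7s for flows 1-13, ~4.6s after.
throttled_run = [3.7] * 13 + [4.6] * 12
```

On the c2-standard-4 series this fires (4.6/3.7 is a ~24% jump); on the flat c4 and c2d series it stays quiet.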
The other times
The power flow isn't the only thing that takes time. Before solving, Spark has to load the database and activate the scenario. These times also vary between machines.
Load time
Loading the CEN database (~2,600 buses) is pure I/O: reading the file from disk and parsing it into memory.
The c4-highcpu-4 with Hyperdisk loads in 8.3 seconds, roughly a third less time than the c2 machines on pd-ssd (~12s). Disk type matters here.
An interesting detail: in some runs the load time spiked to 80-130 seconds. That happens when the disk cache is cold — the first time the VM reads the file. In consecutive runs the OS keeps it cached and it drops to normal times.
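A quick way to see the cold-vs-warm cache effect in isolation, independent of PowerFactory (a sketch; the path is whatever file you want to probe):

```python
import time

def load_times(path, reads=2):
    """Read the same file twice and time each read. The first read
    may hit cold disk; the second should come from the OS page
    cache and be much faster on a large file.
    """
    times = []
    for _ in range(reads):
        start = time.perf_counter()
        with open(path, "rb") as f:
            f.read()
        times.append(time.perf_counter() - start)
    return times
```

This is why the benchmark harness discards a warm-up run: otherwise one cold read gets averaged into the load numbers.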
Activation time
Activating the scenario (study case + operation variation) tells PowerFactory "I want to simulate this system state."
Same thing here: the c4 activates in ~1.95 seconds vs ~2.7s on the c2-standard-4. The absolute difference is small but it shows that CPU frequency impacts every operation.
The c2d-standard-16 had an outlier of 10.2 seconds in one run. No clear explanation — possibly contention on the shared VM.
Full pipeline
Putting it all together — load + activation + solver — shows the end-to-end difference:
| VM | Load | Activation | Solver | Total |
|---|---|---|---|---|
| c2-standard-4 | 12.2s | 2.7s | 92.6s | 107.5s |
| c2-standard-8 | 12.0s | 2.5s | 89.9s | 104.5s |
| c2d-standard-16 | 9.1s | 2.2s | 68.0s | 79.4s |
| c4-highcpu-4 | 8.3s | 2.0s | 54.1s | 64.3s |
The c4-highcpu-4 completes all 25 flows in 64 seconds end-to-end. The c2-standard-4 takes 107. The c4 wins at every stage.
Cost
In practice I keep the machines running by the hour. What matters is how much each dollar buys me.
| VM | Price/hour | Flows/hour |
|---|---|---|
| c2-standard-4 | $0.2088 | ~922 |
| c2-standard-8 | $0.4176 | ~999 |
| c2d-standard-16 | $0.7264 | ~1,322 |
| c4-highcpu-4 | $0.1701 | ~1,642 |
The c4-highcpu-4 costs $0.17/hr and does ~1,642 flows per hour. The c2d-standard-16 I was using costs $0.73/hr and does ~1,322. I was paying 4.3x more per hour for 20% less throughput.
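The flows-per-hour column follows directly from the solver averages; a quick sanity check of the table:

```python
def cost_efficiency(price_per_hour, avg_flow_s):
    """Throughput and unit cost for a machine doing nothing but
    solving power flows back to back."""
    flows_per_hour = 3600 / avg_flow_s
    cost_per_1000 = price_per_hour / flows_per_hour * 1000
    return flows_per_hour, cost_per_1000

# c4-highcpu-4: $0.1701/hr at 2.192s per flow -> ~1,642 flows/hr
flows, cost = cost_efficiency(0.1701, 2.192)
```

This ignores load and activation overhead, which is fine once the VM stays up between runs.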
I was burning money.
Infinite compute
In a job application to OpenAI they asked: if you had infinite compute, what would you do? At the time I didn't have a good answer. Now I understand.
I would simulate the electrical grid in all its infinite possibilities to learn from it. Contingencies, protection settings, operation scenarios, demand variations. Every possible combination. A machine running 24/7 generating knowledge.
The c4-highcpu-4 I'm using costs $0.1701/hr. Running 24/7:
$0.1701/hr × 24 hr × 30 days = $122.47/month
~1,642 flows/hr × 24 hr × 30 days = ~1,182,240 flows/month
But if the only thing that matters is core frequency and I'm keeping the machine on 24/7, the best option isn't a VM — it's an Intel i9-14900K running at 6.0 GHz turbo. If the solver scales linearly with frequency, each flow would drop from ~2.19s to ~1.53s: 30% less time per flow, or ~43% more flows per hour.
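The projection is just inverse scaling with clock — a strong assumption, since memory latency doesn't speed up with the core:

```python
def scaled_flow_time(t_current_s, f_current_ghz, f_target_ghz):
    """Project per-flow time onto a faster clock, assuming the
    solver is purely frequency-bound. This is a best case: real
    gains will be smaller because memory access doesn't scale.
    """
    return t_current_s * f_current_ghz / f_target_ghz

t = scaled_flow_time(2.19, 4.2, 6.0)  # ~1.53s per flow
throughput_gain = 2.19 / t            # ~1.43x flows per hour
```

Treat the 1.43x as an upper bound until someone actually benchmarks PowerFactory on that chip.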
The best PC to run PowerFactory
| Part | Component | Price |
|---|---|---|
| CPU | Intel i9-14900K — 6.0 GHz turbo | ~$460 |
| RAM | 16 GB DDR5 | ~$200 |
| Storage | 120 GB SSD (enough) | ~$20 |
| Motherboard | ASUS Prime Z690-P (LGA 1700, DDR5) | ~$120 |
| PSU | MSI MAG A650BN 650W 80+ Bronze | ~$60 |
| **Total** | | **~$860** |
The c4-highcpu-4 running 24/7 costs ~$122/month. This PC pays for itself in ~7 months.
Compute isn't cheap. But it's what needs to be done. For an AGI to understand the electrical grid, someone has to generate the data. A PC running 24/7 simulating contingencies, running power flows, tuning protections. That's what I'm going to do.
Update — April 13, 2026
I tested a fifth VM: c4-highmem-2
After publishing this post a question lingered: if the only thing that matters is core frequency, what happens if I cut vCPUs in half? It should theoretically be just as fast and cheaper.
I tested c4-highmem-2: same CPU as c4-highcpu-4 (Intel Emerald Rapids 4.2 GHz turbo), but only 2 vCPUs and 15 GB of RAM.
| VM | vCPU | Average | Price/hr | Cost/1,000 flows |
|---|---|---|---|---|
| c4-highcpu-4 | 4 | 2.192s | $0.1701 | ~$0.104 |
| c4-highmem-2 | 2 | 2.810s | $0.1263 | ~$0.099 |
28% slower with the same CPU. I didn't expect that. If core frequency is all that matters, two VMs with the same Intel Emerald Rapids at 4.2 GHz should perform practically the same.
Hypothesis (unverified): on GCP vCPUs are hyperthreads, so 2 vCPUs might be sharing a single physical core. If that were the case, PowerFactory's solver would be competing with the OS (Windows) and the Spark server for the same core and losing time to context switches.
But that's just a hypothesis. It could be something else: memory differences (15 GB vs 8 GB, different access latencies), GCP scheduler behavior, disk type, or something specific to the c4-highmem-2 SKU. I didn't measure it.
Tentative takeaway: dropping below 4 vCPUs on GCP for PowerFactory didn't go as expected. Until I understand why, I'll stick with c4-highcpu-4 as the baseline. If anyone knows the real reason, reach out.