# metagraph-optimization

Directed Acyclic Graph optimization for QED

## Generate Operations from chains

We assume a valid graph is given. From it, we can generate all initially possible graph operations and calculate graph properties such as compute effort and total data transfer.

Goal: After an operation has been applied, regenerate the possible operations without having to copy the entire graph. This would help optimization algorithms try out paths of optimizations and build up tree structures, as chess engines do, for example.

Idea: Keep the original graph, a list of the operations possible in the current state, and a queue of applied operations together. The "actual" graph is then the original graph with all operations in the queue applied. We can push operations to and pop them from the queue, automatically updating the graph's global metrics and the possible optimizations each time.

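A minimal sketch of the push/pop bookkeeping, written in Python for illustration rather than in the project's Julia. The names `Operation`, `GraphState`, `push`, and `pop` and the metric deltas are hypothetical and not the package's API. Only the global metrics are modeled here, not the graph itself; the starting values are the AB->AB numbers from the benchmark section, and compute intensity is taken as effort divided by data transfer, which is consistent with the values reported there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    """Hypothetical operation; only its effect on the global metrics is modeled."""
    name: str
    effort_delta: int    # change in total compute effort when applied
    transfer_delta: int  # change in total data transfer when applied


class GraphState:
    """Metrics of the original graph plus a queue of applied operations."""

    def __init__(self, effort, transfer):
        self.effort = effort      # current total compute effort
        self.transfer = transfer  # current total data transfer
        self.applied = []         # operations applied so far, oldest first

    def push(self, op):
        """Apply an operation and update the metrics incrementally."""
        self.applied.append(op)
        self.effort += op.effort_delta
        self.transfer += op.transfer_delta

    def pop(self):
        """Undo the most recently applied operation and restore the metrics."""
        op = self.applied.pop()
        self.effort -= op.effort_delta
        self.transfer -= op.transfer_delta
        return op

    @property
    def intensity(self):
        """Compute intensity, taken here as effort divided by data transfer."""
        return self.effort / self.transfer


# Usage: start from the AB->AB metrics and try one made-up operation.
state = GraphState(effort=185, transfer=102)
fuse = Operation("hypothetical_fusion", effort_delta=-5, transfer_delta=-12)
state.push(fuse)
print(state.effort, state.transfer, state.intensity)  # 180 90 2.0
state.pop()
print(state.effort, state.transfer)  # 185 102
```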

Problems:
- The list of operations can be very large, so it would be helpful to know separately which possible operations changed after applying one.
- Operations need to be perfectly reversible, so we need to store both the replaced nodes and the new nodes.
- Lots of testing is required because mistakes will propagate and multiply.

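One possible way to approach the first two problems, sketched in Python for illustration rather than in the project's Julia. Each operation is stored as a diff of removed and added nodes and edges, which makes it reversible, and the nodes touched by the diff mark the region where possible operations may have changed. All names (`Graph`, `Diff`, `apply_diff`, `revert_diff`, `touched_nodes`) and the node-merge example are hypothetical, not the package's API.

```python
from dataclasses import dataclass


class Graph:
    """Toy DAG: a set of node names and a set of (parent, child) edges."""

    def __init__(self, nodes, edges):
        self.nodes = set(nodes)
        self.edges = set(edges)


@dataclass(frozen=True)
class Diff:
    """Everything an operation replaces and everything it introduces."""
    removed_nodes: frozenset
    removed_edges: frozenset
    added_nodes: frozenset
    added_edges: frozenset


def apply_diff(graph, diff):
    """Apply the operation in place. Assumes the removed items are present
    and the added items are absent, otherwise the revert is not exact."""
    graph.nodes -= diff.removed_nodes
    graph.edges -= diff.removed_edges
    graph.nodes |= diff.added_nodes
    graph.edges |= diff.added_edges


def revert_diff(graph, diff):
    """Undo apply_diff by swapping the roles of added and removed items."""
    graph.nodes -= diff.added_nodes
    graph.edges -= diff.added_edges
    graph.nodes |= diff.removed_nodes
    graph.edges |= diff.removed_edges


def touched_nodes(diff):
    """Nodes whose neighbourhood changed; only possible operations that
    involve these nodes would need to be regenerated."""
    nodes = set(diff.removed_nodes) | set(diff.added_nodes)
    for parent, child in diff.removed_edges | diff.added_edges:
        nodes.update((parent, child))
    return nodes


# Usage: merge the parallel nodes "a" and "b" into one node "ab", then undo it.
g = Graph({"x", "a", "b", "y"},
          {("x", "a"), ("x", "b"), ("a", "y"), ("b", "y")})
merge = Diff(
    removed_nodes=frozenset({"a", "b"}),
    removed_edges=frozenset({("x", "a"), ("x", "b"), ("a", "y"), ("b", "y")}),
    added_nodes=frozenset({"ab"}),
    added_edges=frozenset({("x", "ab"), ("ab", "y")}),
)
apply_diff(g, merge)
print(sorted(g.nodes))  # ['ab', 'x', 'y']
revert_diff(g, merge)
print(sorted(g.nodes))  # ['a', 'b', 'x', 'y']
```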

## Other TODOs
- Reduce the memory footprint of the graph. Are the UUIDs too large?
- Memory layout of Nodes: they should lie linearly in memory, but right now they are probably allocated on the heap.
- Add scaling functions

## Benchmarks of graphs

For graphs AB->AB^n (for example, AB->ABBB is n = 3):
- Number of Sums (ComputeTaskSum) should always be 1
- Number of ComputeTaskS2 should always be (n+1)!
- Number of ComputeTaskU should always be (n+3)

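These expected counts can be checked against the node counts reported in the benchmark output of this section. The following is a small Python check for illustration, not part of the package; `expected_counts` is a hypothetical helper name, and the tuples are copied from the AB->AB, AB->ABBB, AB->ABBBBB, and AB->ABBBBBBB runs.

```python
from math import factorial

# n -> (ComputeTaskS2, ComputeTaskU, ComputeTaskSum), copied from the benchmark output
reported = {
    1: (2, 4, 1),       # AB->AB
    3: (24, 6, 1),      # AB->ABBB
    5: (720, 8, 1),     # AB->ABBBBB
    7: (40320, 10, 1),  # AB->ABBBBBBB
}


def expected_counts(n):
    """Expected (ComputeTaskS2, ComputeTaskU, ComputeTaskSum) for AB->AB^n."""
    return factorial(n + 1), n + 3, 1


for n, counts in reported.items():
    assert counts == expected_counts(n), f"mismatch for n={n}"
print("all reported counts match")
```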

Times are from my home machine: AMD Ryzen 7900X3D, 64 GB DDR5 RAM @ 6000 MHz

```
$ julia --project examples/import_bench.jl
AB->AB:
Graph:
Nodes: Total: 34, DataTask: 19, ComputeTaskP: 4, ComputeTaskS2: 2, ComputeTaskV: 4, ComputeTaskU: 4, ComputeTaskSum: 1
Edges: 37
Total Compute Effort: 185
Total Data Transfer: 102
Total Compute Intensity: 1.8137254901960784
Graph size in memory: 8.3594 KiB
11.362 μs (522 allocations: 29.70 KiB)

AB->ABBB:
Graph:
Nodes: Total: 280, DataTask: 143, ComputeTaskP: 6, ComputeTaskS2: 24, ComputeTaskV: 64, ComputeTaskU: 6, ComputeTaskSum: 1, ComputeTaskS1: 36
Edges: 385
Total Compute Effort: 2007
Total Data Transfer: 828
Total Compute Intensity: 2.4239130434782608
Graph size in memory: 88.2188 KiB
95.234 μs (4781 allocations: 270.82 KiB)

AB->ABBBBB:
Graph:
Nodes: Total: 7854, DataTask: 3931, ComputeTaskP: 8, ComputeTaskS2: 720, ComputeTaskV: 1956, ComputeTaskU: 8, ComputeTaskSum: 1, ComputeTaskS1: 1230
Edges: 11241
Total Compute Effort: 58789
Total Data Transfer: 23244
Total Compute Intensity: 2.5292118396145242
Graph size in memory: 2.0988 MiB
2.810 ms (136432 allocations: 7.57 MiB)

AB->ABBBBBBB:
Graph:
Nodes: Total: 438436, DataTask: 219223, ComputeTaskP: 10, ComputeTaskS2: 40320, ComputeTaskV: 109600, ComputeTaskU: 10, ComputeTaskSum: 1, ComputeTaskS1: 69272
Edges: 628665
Total Compute Effort: 3288131
Total Data Transfer: 1297700
Total Compute Intensity: 2.53381444093396
Graph size in memory: 118.4037 MiB
463.082 ms (7645256 allocations: 422.57 MiB)

ABAB->ABAB:
Graph:
Nodes: Total: 3218, DataTask: 1613, ComputeTaskP: 8, ComputeTaskS2: 288, ComputeTaskV: 796, ComputeTaskU: 8, ComputeTaskSum: 1, ComputeTaskS1: 504
Edges: 4581
Total Compute Effort: 24009
Total Data Transfer: 9494
Total Compute Intensity: 2.528860332841795
Graph size in memory: 891.375 KiB
1.155 ms (55467 allocations: 3.09 MiB)

ABAB->ABC:
Graph:
Nodes: Total: 817, DataTask: 412, ComputeTaskP: 7, ComputeTaskS2: 72, ComputeTaskV: 198, ComputeTaskU: 7, ComputeTaskSum: 1, ComputeTaskS1: 120
Edges: 1151
Total Compute Effort: 6028
Total Data Transfer: 2411
Total Compute Intensity: 2.5002073828287017
Graph size in memory: 225.0625 KiB
286.583 μs (13996 allocations: 804.48 KiB)
```