Co-authored-by: Anton Reinhard <anton.reinhard@proton.me> Reviewed-on: #9
metagraph-optimization
Directed Acyclic Graph optimization for QED
Usage
For all the julia calls, use -t n
to give julia n
threads.
Instantiate the project first:
julia --project=./ -e 'import Pkg; Pkg.instantiate()'
Run Tests
To run all tests, run
julia --project=./ -e 'import Pkg; Pkg.test()' -O0
Run Examples
Get the correct environment for the examples folder:
julia --project=examples/ -e 'import Pkg; Pkg.develop(Pkg.PackageSpec(path=pwd())); Pkg.instantiate();'
Then execute a specific example:
julia --project=examples examples/<file>.jl -O3
Concepts
Generate Operations from chains
We assume we have a (valid) graph given. We can generate all initially possible graph operations from it, and we can calculate the graph properties like compute effort and total data transfer.
Goal: For some operation, regenerate possible operations after that one has been applied, but without having to copy the entire graph. This would be helpful for optimization algorithms to try paths of optimizations and build up tree structures, like for example chess computers do.
Idea: Keep the original graph, a list of possible operations at the current state, and a queue of applied operations together. The "actual" graph is then the original graph with all operations in the queue applied. We can push and pop new operations to/from the queue, automatically updating the graph's global metrics and possible optimizations from there.
Problems:
- The list of operations can be very large, it would be helpful to somehow separately know which possible operations changed after applying one.
- Operations need to be perfectly reversible, so we need to store replaced nodes and new nodes.
- Lots of testing required because mistakes will propagate and multiply.
Other TODOs
- Reduce memory footprint of the graph
- Memory layout of Nodes? They should lie linearly in memory, right now probably on heap?
- Add scaling functions
Benchmarks of graphs
For graphs AB->AB^n:
- Number of Sums should always be 1
- Number of ComputeTaskABC_S2 should always be (n+1)!
- Number of ComputeTaskABC_U should always be (n+3)
Times are from my home machine: AMD Ryzen 7900X3D, 64GB DDR5 RAM @ 6000MHz (not necessarily up to date, check Jupyter Notebooks in notebooks/
instead)
$ julia --project examples/import_bench.jl
AB->AB:
Graph:
Nodes: Total: 34, DataTask: 19, ComputeTaskABC_P: 4,
ComputeTaskABC_S2: 2, ComputeTaskABC_V: 4, ComputeTaskABC_U: 4,
ComputeTaskABC_Sum: 1
Edges: 37
Total Compute Effort: 185
Total Data Transfer: 102
Total Compute Intensity: 1.8137254901960784
Graph size in memory: 8.3594 KiB
11.362 μs (522 allocations: 29.70 KiB)
AB->ABBB:
Graph:
Nodes: Total: 280, DataTask: 143, ComputeTaskABC_P: 6,
ComputeTaskABC_S2: 24, ComputeTaskABC_V: 64, ComputeTaskABC_U: 6,
ComputeTaskABC_Sum: 1, ComputeTaskABC_S1: 36
Edges: 385
Total Compute Effort: 2007
Total Data Transfer: 828
Total Compute Intensity: 2.4239130434782608
Graph size in memory: 88.2188 KiB
95.234 μs (4781 allocations: 270.82 KiB)
AB->ABBBBB:
Graph:
Nodes: Total: 7854, DataTask: 3931, ComputeTaskABC_P: 8,
ComputeTaskABC_S2: 720, ComputeTaskABC_V: 1956, ComputeTaskABC_U: 8,
ComputeTaskABC_Sum: 1, ComputeTaskABC_S1: 1230
Edges: 11241
Total Compute Effort: 58789
Total Data Transfer: 23244
Total Compute Intensity: 2.5292118396145242
Graph size in memory: 2.0988 MiB
2.810 ms (136432 allocations: 7.57 MiB)
AB->ABBBBBBB:
Graph:
Nodes: Total: 438436, DataTask: 219223, ComputeTaskABC_P: 10,
ComputeTaskABC_S2: 40320, ComputeTaskABC_V: 109600, ComputeTaskABC_U: 10,
ComputeTaskABC_Sum: 1, ComputeTaskABC_S1: 69272
Edges: 628665
Total Compute Effort: 3288131
Total Data Transfer: 1297700
Total Compute Intensity: 2.53381444093396
Graph size in memory: 118.4037 MiB
463.082 ms (7645256 allocations: 422.57 MiB)
AB->ABBBBBBBBB:
Graph:
Nodes: Total: 39456442, DataTask: 19728227, ComputeTaskABC_S1: 6235290, ComputeTaskABC_P: 12, ComputeTaskABC_U: 12, ComputeTaskABC_V: 9864100, ComputeTaskABC_S2: 3628800, ComputeTaskABC_Sum: 1
Edges: 56578129
Total Compute Effort: 295923153
Total Data Transfer: 175407750
Total Compute Intensity: 1.6870585991782006
Graph size in memory: 12.7845 GiB
ABAB->ABAB:
Graph:
Nodes: Total: 3218, DataTask: 1613, ComputeTaskABC_P: 8,
ComputeTaskABC_S2: 288, ComputeTaskABC_V: 796, ComputeTaskABC_U: 8,
ComputeTaskABC_Sum: 1, ComputeTaskABC_S1: 504
Edges: 4581
Total Compute Effort: 24009
Total Data Transfer: 9494
Total Compute Intensity: 2.528860332841795
Graph size in memory: 891.375 KiB
1.155 ms (55467 allocations: 3.09 MiB)
ABAB->ABC:
Graph:
Nodes: Total: 817, DataTask: 412, ComputeTaskABC_P: 7,
ComputeTaskABC_S2: 72, ComputeTaskABC_V: 198, ComputeTaskABC_U: 7,
ComputeTaskABC_Sum: 1, ComputeTaskABC_S1: 120
Edges: 1151
Total Compute Effort: 6028
Total Data Transfer: 2411
Total Compute Intensity: 2.5002073828287017
Graph size in memory: 225.0625 KiB
286.583 μs (13996 allocations: 804.48 KiB)