The challenge isn't in extracting a dataflow graph from an imperative control flow format. Compilers are already capable of doing this, and every major compiler has an expression DAG at some point. The challenge is in mapping from a generic dataflow graph to the actual dataflow hardware primitives, which have different requirements whose constraints may require non-trivial mappings.
Register allocation is also NP complete, but we have plenty of efficient approximations that work well enough in practice, like linear scan for fast compilation, or by graph coloring or puzzle solving for slower but more efficient compilation. Is there a comparable tradeoff available in this case?