我正在尝试确定完成下述任务的最佳时间效率算法.
我有一套记录.对于这组记录,我有连接数据,表明该组中的记录对如何相互连接.这基本上代表一个无向图,其中记录是顶点,连接数据是边.
集合中的所有记录都有连接信息(即不存在孤立记录;集合中的每个记录都连接到集合中的一个或多个其他记录).
我想从集合中选择任意两条记录,并能够显示所选记录之间的所有简单路径."简单路径"是指路径中没有重复记录的路径(即仅限于有限路径).
注意:两个选择的记录将始终不同(即开始和结束顶点永远不会相同;没有循环).
例如:
If I have the following records: A, B, C, D, E and the following represents the connections: (A,B),(A,C),(B,A),(B,D),(B,E),(B,F),(C,A),(C,E), (C,F),(D,B),(E,C),(E,F),(F,B),(F,C),(F,E) [where (A,B) means record A connects to record B]
如果我选择B作为我的起始记录而E作为我的结束记录,我希望找到通过记录连接将记录B连接到记录E的所有简单路径.
All paths connecting B to E: B->E B->F->E B->F->C->E B->A->C->E B->A->C->F->E
这是一个例子,实际上我可能有包含数十万条记录的集合.
看起来这可以通过深度优先搜索图来完成.深度优先搜索将找到两个节点之间的所有非循环路径.该算法应该非常快并且可以扩展到大型图形(图形数据结构稀疏,因此它只使用它所需的内存).
我注意到你上面指定的图只有一个方向边(B,E).这是一个错字还是它真的是有向图?这个解决方案无论如何 对不起,我无法用C做,我在那个区域有点弱.我希望你能够毫不费力地翻译这个Java代码.
Graph.java:
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.LinkedList;
import java.util.Map;
import java.util.Set;
public class Graph {
private Map> map = new HashMap();
public void addEdge(String node1, String node2) {
LinkedHashSet adjacent = map.get(node1);
if(adjacent==null) {
adjacent = new LinkedHashSet();
map.put(node1, adjacent);
}
adjacent.add(node2);
}
public void addTwoWayVertex(String node1, String node2) {
addEdge(node1, node2);
addEdge(node2, node1);
}
public boolean isConnected(String node1, String node2) {
Set adjacent = map.get(node1);
if(adjacent==null) {
return false;
}
return adjacent.contains(node2);
}
public LinkedList adjacentNodes(String last) {
LinkedHashSet adjacent = map.get(last);
if(adjacent==null) {
return new LinkedList();
}
return new LinkedList(adjacent);
}
}
Search.java:
import java.util.LinkedList;
public class Search {
private static final String START = "B";
private static final String END = "E";
public static void main(String[] args) {
// this graph is directional
Graph graph = new Graph();
graph.addEdge("A", "B");
graph.addEdge("A", "C");
graph.addEdge("B", "A");
graph.addEdge("B", "D");
graph.addEdge("B", "E"); // this is the only one-way connection
graph.addEdge("B", "F");
graph.addEdge("C", "A");
graph.addEdge("C", "E");
graph.addEdge("C", "F");
graph.addEdge("D", "B");
graph.addEdge("E", "C");
graph.addEdge("E", "F");
graph.addEdge("F", "B");
graph.addEdge("F", "C");
graph.addEdge("F", "E");
LinkedList visited = new LinkedList();
visited.add(START);
new Search().depthFirst(graph, visited);
}
private void depthFirst(Graph graph, LinkedList visited) {
LinkedList nodes = graph.adjacentNodes(visited.getLast());
// examine adjacent nodes
for (String node : nodes) {
if (visited.contains(node)) {
continue;
}
if (node.equals(END)) {
visited.add(node);
printPath(visited);
visited.removeLast();
break;
}
}
for (String node : nodes) {
if (visited.contains(node) || node.equals(END)) {
continue;
}
visited.addLast(node);
depthFirst(graph, visited);
visited.removeLast();
}
}
private void printPath(LinkedList visited) {
for (String node : visited) {
System.out.print(node);
System.out.print(" ");
}
System.out.println();
}
}
节目输出:
B E
B A C E
B A C F E
B F E
B F C E
美国国家标准与技术研究院(NIST)在线算法和数据结构词典将此问题列为" 所有简单路径",并建议使用深度优先搜索.CLRS提供相关算法.
使用Petri网一个聪明的技术被发现在这里
这是我想出的伪代码.这不是任何特定的伪代码方言,但应该足够简单.
任何人都希望分开.
[p]是表示当前路径的顶点列表.
[x]是符合标准的路径列表
[s]是源顶点
[d]是目标顶点
[c]是当前顶点(PathFind例程的参数)
假设有一种查找相邻顶点的有效方法(第6行).
1 PathList [p] 2 ListOfPathLists [x] 3 Vertex [s], [d] 4 PathFind ( Vertex [c] ) 5 Add [c] to tail end of list [p] 6 For each Vertex [v] adjacent to [c] 7 If [v] is equal to [d] then 8 Save list [p] in [x] 9 Else If [v] is not in list [p] 10 PathFind([v]) 11 Next For 12 Remove tail from [p] 13 Return
由于此答案中给出的现有非递归DFS实现似乎已被破坏,因此让我提供一个实际可行的实现.
我用Python编写了这个,因为我发现它非常易读并且不受实现细节的影响(并且因为它有yield
实现生成器的方便关键字),但是移植到其他语言应该相当容易.
# a generator function to find all simple paths between two nodes in a
# graph, represented as a dictionary that maps nodes to their neighbors
def find_simple_paths(graph, start, end):
visited = set()
visited.add(start)
nodestack = list()
indexstack = list()
current = start
i = 0
while True:
# get a list of the neighbors of the current node
neighbors = graph[current]
# find the next unvisited neighbor of this node, if any
while i < len(neighbors) and neighbors[i] in visited: i += 1
if i >= len(neighbors):
# we've reached the last neighbor of this node, backtrack
visited.remove(current)
if len(nodestack) < 1: break # can't backtrack, stop!
current = nodestack.pop()
i = indexstack.pop()
elif neighbors[i] == end:
# yay, we found the target node! let the caller process the path
yield nodestack + [current, end]
i += 1
else:
# push current node and index onto stacks, switch to neighbor
nodestack.append(current)
indexstack.append(i+1)
visited.add(neighbors[i])
current = neighbors[i]
i = 0
此代码维护两个并行堆栈:一个包含当前路径中的早期节点,另一个包含节点堆栈中每个节点的当前邻居索引(这样当我们将节点重新弹出时,我们可以继续迭代遍历节点的邻居堆栈).我同样可以很好地使用单个堆栈(节点,索引)对,但我认为双栈方法更具可读性,并且可能更容易为其他语言的用户实现.
此代码还使用一个单独的visited
集合,它始终包含当前节点和堆栈上的任何节点,以便让我有效地检查节点是否已经是当前路径的一部分.如果您的语言恰好具有"有序集"数据结构,该结构提供了高效的类似堆栈的推送/弹出操作和高效的成员资格查询,则可以将其用于节点堆栈并删除单独的visited
集合.
或者,如果您为节点使用自定义可变类/结构,则可以在每个节点中存储一个布尔标志,以指示它是否已作为当前搜索路径的一部分被访问.当然,如果您出于某种原因希望这样做,此方法将不允许您在同一图表上并行运行两次搜索.
这里有一些测试代码演示了上面给出的函数是如何工作的:
# test graph:
# ,---B---.
# A | D
# `---C---'
graph = {
"A": ("B", "C"),
"B": ("A", "C", "D"),
"C": ("A", "B", "D"),
"D": ("B", "C"),
}
# find paths from A to D
for path in find_simple_paths(graph, "A", "D"): print " -> ".join(path)
在给定的示例图上运行此代码会产生以下输出:
A -> B -> C -> D A -> B -> D A -> C -> B -> D A -> C -> D
请注意,虽然此示例图是无向的(即它的所有边都是双向的),但该算法也适用于任意有向图.例如,删除C -> B
边缘(通过B
从邻居列表中删除C
)产生相同的输出,除了第三个路径(A -> C -> B -> D
),这是不可能的.
PS.构建图表很容易,像这样的简单搜索算法(以及此线程中给出的其他算法)执行得非常糟糕.
例如,考虑在无向图上查找从A到B的所有路径的任务,其中起始节点A具有两个邻居:目标节点B(没有除A之外的其他邻居)和作为集团的一部分的节点C 的ñ +1节点,就像这样:
graph = {
"A": ("B", "C"),
"B": ("A"),
"C": ("A", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
"D": ("C", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
"E": ("C", "D", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
"F": ("C", "D", "E", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
"G": ("C", "D", "E", "F", "H", "I", "J", "K", "L", "M", "N", "O"),
"H": ("C", "D", "E", "F", "G", "I", "J", "K", "L", "M", "N", "O"),
"I": ("C", "D", "E", "F", "G", "H", "J", "K", "L", "M", "N", "O"),
"J": ("C", "D", "E", "F", "G", "H", "I", "K", "L", "M", "N", "O"),
"K": ("C", "D", "E", "F", "G", "H", "I", "J", "L", "M", "N", "O"),
"L": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "M", "N", "O"),
"M": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "N", "O"),
"N": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "O"),
"O": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N"),
}
很容易看出A和B之间的唯一路径是直接路径,但是从节点A开始的天真DFS将浪费O(n!)时间无用地探索集团内的路径,即使它(对于人类)显而易见这些路径都不可能导致B.
还可以构建具有类似属性的DAG,例如通过使起始节点A连接目标节点B和两个其他节点C 1和C 2,两个节点连接到节点D 1和D 2,两者都连接到E 1和E 2,依此类推.对于这样排列的n层节点,从A到B的所有路径的天真搜索将最终浪费O(2 n)时间来检查所有可能的死角,然后放弃.
当然,从clique中的一个节点(除C之外)或DAG的最后一层向目标节点B添加边缘将创建从A到B的指数级大量可能路径,并且纯粹的局部搜索算法无法预先确定是否会找到这样的边缘.因此,从某种意义上说,这种天真搜索的输出敏感性差是由于他们缺乏对图的全局结构的认识.
虽然有各种预处理方法(例如迭代消除叶节点,搜索单节点顶点分隔符等),可以用来避免这些"指数时间死角",我不知道任何一般在所有情况下都可以消除它们的预处理技巧.一般的解决方案是在搜索的每个步骤检查目标节点是否仍然可以访问(使用子搜索),如果不是,则提前回溯 - 但是,这会显着减慢搜索速度(最糟糕的是) ,对于许多不包含这种病态死胡同的图,与图的大小成比例).
与第二层相比,这是一个逻辑上更好看的递归版本。
public class Search { private static final String START = "B"; private static final String END = "E"; public static void main(String[] args) { // this graph is directional Graph graph = new Graph(); graph.addEdge("A", "B"); graph.addEdge("A", "C"); graph.addEdge("B", "A"); graph.addEdge("B", "D"); graph.addEdge("B", "E"); // this is the only one-way connection graph.addEdge("B", "F"); graph.addEdge("C", "A"); graph.addEdge("C", "E"); graph.addEdge("C", "F"); graph.addEdge("D", "B"); graph.addEdge("E", "C"); graph.addEdge("E", "F"); graph.addEdge("F", "B"); graph.addEdge("F", "C"); graph.addEdge("F", "E"); List> paths = new ArrayList >(); String currentNode = START; List visited = new ArrayList (); visited.add(START); new Search().findAllPaths(graph, seen, paths, currentNode); for(ArrayList path : paths){ for (String node : path) { System.out.print(node); System.out.print(" "); } System.out.println(); } } private void findAllPaths(Graph graph, List visited, List > paths, String currentNode) { if (currentNode.equals(END)) { paths.add(new ArrayList(Arrays.asList(visited.toArray()))); return; } else { LinkedList nodes = graph.adjacentNodes(currentNode); for (String node : nodes) { if (visited.contains(node)) { continue; } List temp = new ArrayList (); temp.addAll(visited); temp.add(node); findAllPaths(graph, temp, paths, node); } } } }
节目输出
B A C E B A C F E B E B F C E B F E