122

I am trying to determine the best time efficient algorithm to accomplish the task described below.

I have a set of records. For this set of records I have connection data which indicates how pairs of records from this set connect to one another. This basically represents an undirected graph, with the records being the vertices and the connection data the edges.

All of the records in the set have connection information (i.e. no orphan records are present; each record in the set connects to one or more other records in the set).

I want to choose any two records from the set and be able to show all simple paths between the chosen records. By "simple paths" I mean the paths which do not have repeated records in the path (i.e. finite paths only).

Note: The two chosen records will always be different (i.e. start and end vertex will never be the same; no cycles).

For example:

    If I have the following records:
        A, B, C, D, E

    and the following represents the connections: 
        (A,B),(A,C),(B,A),(B,D),(B,E),(B,F),(C,A),(C,E),
        (C,F),(D,B),(E,C),(E,F),(F,B),(F,C),(F,E)

        [where (A,B) means record A connects to record B]

If I chose B as my starting record and E as my ending record, I would want to find all simple paths through the record connections that would connect record B to record E.

   All paths connecting B to E:
      B->E
      B->F->E
      B->F->C->E
      B->A->C->E
      B->A->C->F->E

This is an example, in practice I may have sets containing hundreds of thousands of records.

3
  • The connections are called cycles, and this answer has a lot of informations for you.
    – elhoim
    Commented Oct 21, 2009 at 7:17
  • 3
    Please say whether you want a finite list of loop-free connections, or infinite stream of connections with all possible loops. Cf. Blorgbeard's answer. Commented Dec 6, 2010 at 8:47
  • can anyone help with this ??? stackoverflow.com/questions/32516706/…
    – tejas3006
    Commented Sep 11, 2015 at 7:15

17 Answers 17

122

It appears that this can be accomplished with a depth-first search of the graph. The depth-first search will find all non-cyclical paths between two nodes. This algorithm should be very fast and scale to large graphs (The graph data structure is sparse so it only uses as much memory as it needs to).

I noticed that the graph you specified above has only one edge that is directional (B,E). Was this a typo or is it really a directed graph? This solution works regardless. Sorry I was unable to do it in C, I'm a bit weak in that area. I expect that you will be able to translate this Java code without too much trouble though.

Graph.java:

import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.LinkedList;
import java.util.Map;
import java.util.Set;

public class Graph {
    private Map<String, LinkedHashSet<String>> map = new HashMap();

    public void addEdge(String node1, String node2) {
        LinkedHashSet<String> adjacent = map.get(node1);
        if(adjacent==null) {
            adjacent = new LinkedHashSet();
            map.put(node1, adjacent);
        }
        adjacent.add(node2);
    }

    public void addTwoWayVertex(String node1, String node2) {
        addEdge(node1, node2);
        addEdge(node2, node1);
    }

    public boolean isConnected(String node1, String node2) {
        Set adjacent = map.get(node1);
        if(adjacent==null) {
            return false;
        }
        return adjacent.contains(node2);
    }

    public LinkedList<String> adjacentNodes(String last) {
        LinkedHashSet<String> adjacent = map.get(last);
        if(adjacent==null) {
            return new LinkedList();
        }
        return new LinkedList<String>(adjacent);
    }
}

Search.java:

import java.util.LinkedList;

public class Search {

    private static final String START = "B";
    private static final String END = "E";

    public static void main(String[] args) {
        // this graph is directional
        Graph graph = new Graph();
        graph.addEdge("A", "B");
        graph.addEdge("A", "C");
        graph.addEdge("B", "A");
        graph.addEdge("B", "D");
        graph.addEdge("B", "E"); // this is the only one-way connection
        graph.addEdge("B", "F");
        graph.addEdge("C", "A");
        graph.addEdge("C", "E");
        graph.addEdge("C", "F");
        graph.addEdge("D", "B");
        graph.addEdge("E", "C");
        graph.addEdge("E", "F");
        graph.addEdge("F", "B");
        graph.addEdge("F", "C");
        graph.addEdge("F", "E");
        LinkedList<String> visited = new LinkedList();
        visited.add(START);
        new Search().depthFirst(graph, visited);
    }

    private void depthFirst(Graph graph, LinkedList<String> visited) {
        LinkedList<String> nodes = graph.adjacentNodes(visited.getLast());
        // examine adjacent nodes
        for (String node : nodes) {
            if (visited.contains(node)) {
                continue;
            }
            if (node.equals(END)) {
                visited.add(node);
                printPath(visited);
                visited.removeLast();
                break;
            }
        }
        for (String node : nodes) {
            if (visited.contains(node) || node.equals(END)) {
                continue;
            }
            visited.addLast(node);
            depthFirst(graph, visited);
            visited.removeLast();
        }
    }

    private void printPath(LinkedList<String> visited) {
        for (String node : visited) {
            System.out.print(node);
            System.out.print(" ");
        }
        System.out.println();
    }
}

Program Output:

B E 
B A C E 
B A C F E 
B F E 
B F C E 
13
  • 6
    Please note that this is not a breadth-first traversal. With breadth first you first visit all nodes with distance 0 to the root, then those with distance 1, then 2, etc.
    – mweerden
    Commented Sep 12, 2008 at 20:48
  • 16
    Correct, this is a DFS. A BFS would need to use a queue, enqueuing level-(N+1) nodes to be processed after all level-N nodes. However, for the OP's purposes, either BFS or DFS will work, as no preferred sorting order of paths is specified.
    – Matt J
    Commented Oct 21, 2008 at 17:25
  • 1
    Casey, I've been looking for a solution to this problem for ages. I have recently implemented this DFS in C++ and it works a treat.
    – AndyUK
    Commented Nov 10, 2008 at 11:16
  • 6
    Disadvantage of recursion is if you will have deep graph (A->B->C->...->N) you could have StackOverflowError in java.
    – Rrr
    Commented Aug 15, 2012 at 20:23
  • 2
    I've added an iterative version in C# below.
    – batta
    Commented Oct 17, 2014 at 16:47
25

The National Institute of Standards and Technology (NIST) online Dictionary of Algorithms and Data Structures lists this problem as "all simple paths" and recommends a depth-first search. CLRS supplies the relevant algorithms.

A clever technique using Petri Nets is found here

3
  • 2
    Could you help me with a better solution? a DFS takes forever to run: stackoverflow.com/q/8342101/632951
    – Pacerier
    Commented Dec 1, 2011 at 18:28
  • Note that it's easy to come up with graphs for which DFS is very inefficient, even though the set of all simple paths between the two nodes is small and easy to find. For example, consider an undirected graph where the starting node A has two neighbors: the goal node B (which has no neighbors other than A), and a node C that is part of a fully connected clique of n + 1 nodes. Even though there's clearly just one simple path from A to B, a naïve DFS will waste O(n!) time uselessly exploring the clique. Similar examples (one solution, DFS takes exponential time) can be found among DAGs, too. Commented Feb 20, 2016 at 22:46
  • The NIST says: "The paths may be enumerated with a depth-first search."
    – chomp
    Commented Feb 28, 2017 at 3:01
13

Here is the pseudocode I came up with. This is not any particular pseudocode dialect, but should be simple enough to follow.

Anyone want to pick this apart.

  • [p] is a list of vertices representing the current path.

  • [x] is a list of paths where meet the criteria

  • [s] is the source vertex

  • [d] is the destination vertex

  • [c] is the current vertex (argument to the PathFind routine)

Assume there is an efficient way to look up the adjacent vertices (line 6).

     1 PathList [p]
     2 ListOfPathLists [x]
     3 Vertex [s], [d]

     4 PathFind ( Vertex [c] )
     5     Add [c] to tail end of list [p]
     6     For each Vertex [v] adjacent to [c]
     7         If [v] is equal to [d] then
     8             Save list [p] in [x]
     9         Else If [v] is not in list [p]
    10             PathFind([v])
    11     Next For
    12     Remove tail from [p]
    13 Return
4
  • Can you please shed some light on step 11 and step 12
    – bozo user
    Commented Feb 13, 2013 at 8:37
  • Line 11 just denotes the end block that goes with the For loop that begins on line 6. Line 12 means to remove the last element of the path list before returning to the caller. Commented Feb 13, 2013 at 15:00
  • What is the initial call to PathFind - do you pass in the source vertex [s]?
    – bozo user
    Commented Feb 25, 2013 at 1:39
  • In this example yes, but keep in mind that you may not want to write real code that maps one-to-one with this pseudocode. It's meant more to illustrate a thought process rather than well designed code. Commented Feb 25, 2013 at 6:12
8

Since the existing non-recursive DFS implementation given in this answer seems to be broken, let me provide one that actually works.

I've written this in Python, because I find it pretty readable and uncluttered by implementation details (and because it has the handy yield keyword for implementing generators), but it should be fairly easy to port to other languages.

# a generator function to find all simple paths between two nodes in a
# graph, represented as a dictionary that maps nodes to their neighbors
def find_simple_paths(graph, start, end):
    visited = set()
    visited.add(start)

    nodestack = list()
    indexstack = list()
    current = start
    i = 0

    while True:
        # get a list of the neighbors of the current node
        neighbors = graph[current]

        # find the next unvisited neighbor of this node, if any
        while i < len(neighbors) and neighbors[i] in visited: i += 1

        if i >= len(neighbors):
            # we've reached the last neighbor of this node, backtrack
            visited.remove(current)
            if len(nodestack) < 1: break  # can't backtrack, stop!
            current = nodestack.pop()
            i = indexstack.pop()
        elif neighbors[i] == end:
            # yay, we found the target node! let the caller process the path
            yield nodestack + [current, end]
            i += 1
        else:
            # push current node and index onto stacks, switch to neighbor
            nodestack.append(current)
            indexstack.append(i+1)
            visited.add(neighbors[i])
            current = neighbors[i]
            i = 0

This code maintains two parallel stacks: one containing the earlier nodes in the current path, and one containing the current neighbor index for each node in the node stack (so that we can resume iterating through the neighbors of a node when we pop it back off the stack). I could've equally well used a single stack of (node, index) pairs, but I figured the two-stack method would be more readable, and perhaps easier to implement for users of other languages.

This code also uses a separate visited set, which always contains the current node and any nodes on the stack, to let me efficiently check whether a node is already part of the current path. If your language happens to have an "ordered set" data structure that provides both efficient stack-like push/pop operations and efficient membership queries, you can use that for the node stack and get rid of the separate visited set.

Alternatively, if you're using a custom mutable class / structure for your nodes, you can just store a boolean flag in each node to indicate whether it has been visited as part of the current search path. Of course, this method won't let you run two searches on the same graph in parallel, should you for some reason wish to do that.

Here's some test code demonstrating how the function given above works:

# test graph:
#     ,---B---.
#     A   |   D
#     `---C---'
graph = {
    "A": ("B", "C"),
    "B": ("A", "C", "D"),
    "C": ("A", "B", "D"),
    "D": ("B", "C"),
}

# find paths from A to D
for path in find_simple_paths(graph, "A", "D"): print " -> ".join(path)

Running this code on the given example graph produces the following output:

A -> B -> C -> D
A -> B -> D
A -> C -> B -> D
A -> C -> D

Note that, while this example graph is undirected (i.e. all its edges go both ways), the algorithm also works for arbitrary directed graphs. For example, removing the C -> B edge (by removing B from the neighbor list of C) yields the same output except for the third path (A -> C -> B -> D), which is no longer possible.


Ps. It's easy to construct graphs for which simple search algorithms like this one (and the others given in this thread) perform very poorly.

For example, consider the task of find all paths from A to B on an undirected graph where the starting node A has two neighbors: the target node B (which has no other neighbors than A) and a node C that is part of a clique of n+1 nodes, like this:

graph = {
    "A": ("B", "C"),
    "B": ("A"),
    "C": ("A", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
    "D": ("C", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
    "E": ("C", "D", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
    "F": ("C", "D", "E", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
    "G": ("C", "D", "E", "F", "H", "I", "J", "K", "L", "M", "N", "O"),
    "H": ("C", "D", "E", "F", "G", "I", "J", "K", "L", "M", "N", "O"),
    "I": ("C", "D", "E", "F", "G", "H", "J", "K", "L", "M", "N", "O"),
    "J": ("C", "D", "E", "F", "G", "H", "I", "K", "L", "M", "N", "O"),
    "K": ("C", "D", "E", "F", "G", "H", "I", "J", "L", "M", "N", "O"),
    "L": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "M", "N", "O"),
    "M": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "N", "O"),
    "N": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "O"),
    "O": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N"),
}

It's easy to see that the only path between A and B is the direct one, but a naïve DFS started from node A will waste O(n!) time uselessly exploring paths within the clique, even though it's obvious (to a human) that none of those paths can possibly lead to B.

One can also construct DAGs with similar properties, e.g. by having the starting node A connect target node B and to two other nodes C1 and C2, both of which connect to the nodes D1 and D2, both of which connect to E1 and E2, and so on. For n layers of nodes arranged like this, a naïve search for all paths from A to B will end up wasting O(2n) time examining all the possible dead ends before giving up.

Of course, adding an edge to the target node B from one of the nodes in the clique (other than C), or from the last layer of the DAG, would create an exponentially large number of possible paths from A to B, and a purely local search algorithm can't really tell in advance whether it will find such an edge or not. Thus, in a sense, the poor output sensitivity of such naïve searches is due to their lack of awareness of the global structure of the graph.

While there are various preprocessing methods (such as iteratively eliminating leaf nodes, searching for single-node vertex separators, etc.) that could be used to avoid some of these "exponential-time dead ends", I don't know of any general preprocessing trick that could eliminate them in all cases. A general solution would be to check at every step of the search whether the target node is still reachable (using a sub-search), and backtrack early if it isn't — but alas, that would significantly slow down the search (at worst, proportionally to the size of the graph) for many graphs that don't contain such pathological dead ends.

2
  • Thank you for your DFS non-recursive solution. Just note the last line printing the result has a syntax error, should be for path in find_simple_paths(graph, "A", "D"): print(" -> ".join(path)), the print was missing the parenthesis. Commented Feb 25, 2018 at 23:02
  • 1
    @DavidOlivánUbieto: It's Python 2 code, that's why there's no parentheses. :) Commented Feb 26, 2018 at 0:57
5

Here is a logically better-looking recursive version as compared to the second floor.

public class Search {

private static final String START = "B";
private static final String END = "E";

public static void main(String[] args) {
    // this graph is directional
    Graph graph = new Graph();
    graph.addEdge("A", "B");
    graph.addEdge("A", "C");
    graph.addEdge("B", "A");
    graph.addEdge("B", "D");
    graph.addEdge("B", "E"); // this is the only one-way connection
    graph.addEdge("B", "F");
    graph.addEdge("C", "A");
    graph.addEdge("C", "E");
    graph.addEdge("C", "F");
    graph.addEdge("D", "B");
    graph.addEdge("E", "C");
    graph.addEdge("E", "F");
    graph.addEdge("F", "B");
    graph.addEdge("F", "C");
    graph.addEdge("F", "E");
    List<ArrayList<String>> paths = new ArrayList<ArrayList<String>>();
    String currentNode = START;
    List<String> visited = new ArrayList<String>();
    visited.add(START);
    new Search().findAllPaths(graph, seen, paths, currentNode);
    for(ArrayList<String> path : paths){
        for (String node : path) {
            System.out.print(node);
            System.out.print(" ");
        }
        System.out.println();
    }   
}

private void findAllPaths(Graph graph, List<String> visited, List<ArrayList<String>> paths, String currentNode) {        
    if (currentNode.equals(END)) { 
        paths.add(new ArrayList(Arrays.asList(visited.toArray())));
        return;
    }
    else {
        LinkedList<String> nodes = graph.adjacentNodes(currentNode);    
        for (String node : nodes) {
            if (visited.contains(node)) {
                continue;
            } 
            List<String> temp = new ArrayList<String>();
            temp.addAll(visited);
            temp.add(node);          
            findAllPaths(graph, temp, paths, node);
        }
    }
}
}

Program Output

B A C E 

B A C F E 

B E

B F C E

B F E 
4

Solution in C code. It is based on DFS which uses minimum memory.

#include <stdio.h>
#include <stdbool.h>

#define maxN    20  

struct  nodeLink
{

    char node1;
    char node2;

};

struct  stack
{   
    int sp;
    char    node[maxN];
};   

void    initStk(stk)
struct  stack   *stk;
{
    int i;
    for (i = 0; i < maxN; i++)
        stk->node[i] = ' ';
    stk->sp = -1;   
}

void    pushIn(stk, node)
struct  stack   *stk;
char    node;
{

    stk->sp++;
    stk->node[stk->sp] = node;

}    

void    popOutAll(stk)
struct  stack   *stk;
{

    char    node;
    int i, stkN = stk->sp;

    for (i = 0; i <= stkN; i++)
    {
        node = stk->node[i];
        if (i == 0)
            printf("src node : %c", node);
        else if (i == stkN)
            printf(" => %c : dst node.\n", node);
        else
            printf(" => %c ", node);
    }

}


/* Test whether the node already exists in the stack    */
bool    InStack(stk, InterN)
struct  stack   *stk;
char    InterN;
{

    int i, stkN = stk->sp;  /* 0-based  */
    bool    rtn = false;    

    for (i = 0; i <= stkN; i++)
    {
        if (stk->node[i] == InterN)
        {
            rtn = true;
            break;
        }
    }

    return     rtn;

}

char    otherNode(targetNode, lnkNode)
char    targetNode;
struct  nodeLink    *lnkNode;
{

    return  (lnkNode->node1 == targetNode) ? lnkNode->node2 : lnkNode->node1;

}

int entries = 8;
struct  nodeLink    topo[maxN]    =       
    {
        {'b', 'a'}, 
        {'b', 'e'}, 
        {'b', 'd'}, 
        {'f', 'b'}, 
        {'a', 'c'},
        {'c', 'f'}, 
        {'c', 'e'},
        {'f', 'e'},               
    };

char    srcNode = 'b', dstN = 'e';      

int reachTime;  

void    InterNode(interN, stk)
char    interN;
struct  stack   *stk;
{

    char    otherInterN;
    int i, numInterN = 0;
    static  int entryTime   =   0;

    entryTime++;

    for (i = 0; i < entries; i++)
    {

        if (topo[i].node1 != interN  && topo[i].node2 != interN) 
        {
            continue;   
        }

        otherInterN = otherNode(interN, &topo[i]);

        numInterN++;

        if (otherInterN == stk->node[stk->sp - 1])
        {
            continue;   
        }

        /*  Loop avoidance: abandon the route   */
        if (InStack(stk, otherInterN) == true)
        {
            continue;   
        }

        pushIn(stk, otherInterN);

        if (otherInterN == dstN)
        {
            popOutAll(stk);
            reachTime++;
            stk->sp --;   /*    back trace one node  */
            continue;
        }
        else
            InterNode(otherInterN, stk);

    }

        stk->sp --;

}


int    main()

{

    struct  stack   stk;

    initStk(&stk);
    pushIn(&stk, srcNode);  

    reachTime = 0;
    InterNode(srcNode, &stk);

    printf("\nNumber of all possible and unique routes = %d\n", reachTime);

}
3

This may be late, but here's the same C# version of DFS algorithm in Java from Casey to traverse for all paths between two nodes using a stack. Readability is better with recursive as always.

    void DepthFirstIterative(T start, T endNode)
    {
        var visited = new LinkedList<T>();
        var stack = new Stack<T>();

        stack.Push(start);

        while (stack.Count != 0)
        {
            var current = stack.Pop();

            if (visited.Contains(current))
                continue;

            visited.AddLast(current);

            var neighbours = AdjacentNodes(current);

            foreach (var neighbour in neighbours)
            {
                if (visited.Contains(neighbour))
                    continue;

                if (neighbour.Equals(endNode))
                {
                    visited.AddLast(neighbour);
                    printPath(visited));
                    visited.RemoveLast();
                    break;
                }
            }

            bool isPushed = false;
            foreach (var neighbour in neighbours.Reverse())
            {
                if (neighbour.Equals(endNode) || visited.Contains(neighbour) || stack.Contains(neighbour))
                {
                    continue;
                }

                isPushed = true;
                stack.Push(neighbour);
            }

            if (!isPushed)
                visited.RemoveLast();
        }
    }
This is a sample graph to test:

    // Sample graph. Numbers are edge ids
    //       1     3       
    //    A --- B --- C ----
    //    |     |  2       |
    //    | 4   ----- D    |
    //    ------------------
8
  • 1
    excellent -- about how you replaced the recursion with stack-based iteration. Commented Aug 27, 2015 at 10:37
  • I still don't understand it, what is neighbours.Reverse()? Is it List<T>.Reverse ?
    – user1108948
    Commented Jan 14, 2016 at 2:21
  • I checked this non-recursive version, but it seems not correct. recursive version is fine. maybe when changed to non-recursive, a small mistake is happened
    – arslan
    Commented Feb 2, 2016 at 7:33
  • @alim: Agreed, this code is simply broken. (It does not correctly remove nodes from the visited set when backtracking, and the stack handling appears to be messed up, too. I tried to see if it could be fixed, but that would basically require a complete rewrite.) I've just added an answer with a correct, working non-recursive solution (in Python, but it should be relatively easy to port to other languages). Commented Feb 21, 2016 at 1:53
  • @llmari Karonen, Nice, I am going to check, Great job.
    – arslan
    Commented Feb 22, 2016 at 9:31
1

I have solved a similar problem to this recently, instead of all solutions I was only interested in the shortest.

I used a 'breadth first' iterative search which used a queue of status' each of which held a record containing a current point on the graph and the path taken to get there.

you start with a single record in the queue, which has the starting node and an empty path.

Each iteration through the code takes the item off the head of the list, and checks to see if it is a solution (the node arrived at is the one you want, if it is, we are done), otherwise, it constructs a new queue item with the nodes connecting to the current node, and amended paths that are based on the path of the previous node, with the new jump attached at the end.

Now, you could use something similar, but when you find a solution, instead of stopping, add that solution to your 'found list' and continue.

You need to keep track of a visited nodes list, so that you never backtrack on yourself otherwise you have an infinite loop.

if you want a bit more pseudocode post a comment or something, and I will elaborate.

1
  • 6
    I believe if you're only interested in the shortest path, then Dijkstra's Algorithm is "the solution" :).
    – vicatcu
    Commented Jan 12, 2010 at 22:14
1

I think you should describe your real problem behind this. I say this because you ask for something time efficient, yet the answer set to the problem seems to grow exponentially!

Therefore I wouldn't expect a better algorithm than something exponential.

I'd do backtracking and going through the whole graph. In order to avoid cycles, save all visited nodes along the way. When you go back, unmark the node.

Using recursion:

static bool[] visited;//all false
Stack<int> currentway; initialize empty

function findnodes(int nextnode)
{
if (nextnode==destnode)
{
  print currentway 
  return;
}
visited[nextnode]=true;
Push nextnode to the end of currentway.
for each node n accesible from nextnode:
  findnodes(n);
visited[nextnode]=false; 
pop from currenteay
}

Or is that wrong?

edit: Oh, and I forgot: You should eliminate the recursive calls by utilizing that node stack

1
  • My real problem is exactly as I described, only with much larger sets. I agree this seems to grow exponentially with the size of the set. Commented Dec 7, 2010 at 3:56
1

The basic principle is you need not worry about graphs.This is standard problem known as Dynamic connectivity problem. There are following types of methods from which you can achieve nodes are connected or not:

  1. Quick Find
  2. Quick Union
  3. Improved Algorithm (Combination of both)

Here is The C Code That I've tried with minimum time complexity O(log*n) That means for 65536 list of edges, it requires 4 search and for 2^65536, it requires 5 search. I am sharing my implementation from the algorithm: Algorithm Course from Princeton university

TIP: You can find Java solution from link shared above with proper explanations.

/* Checking Connection Between Two Edges */

#include<stdio.h>
#include<stdlib.h>
#define MAX 100

/*
  Data structure used

vertex[] - used to Store The vertices
size - No. of vertices
sz[] - size of child's
*/

/*Function Declaration */
void initalize(int *vertex, int *sz, int size);
int root(int *vertex, int i);
void add(int *vertex, int *sz, int p, int q);
int connected(int *vertex, int p, int q);

int main() //Main Function
{ 
char filename[50], ch, ch1[MAX];
int temp = 0, *vertex, first = 0, node1, node2, size = 0, *sz;
FILE *fp;


printf("Enter the filename - "); //Accept File Name
scanf("%s", filename);
fp = fopen(filename, "r");
if (fp == NULL)
{
    printf("File does not exist");
    exit(1);
}
while (1)
{
    if (first == 0) //getting no. of vertices
    {
        ch = getc(fp);
        if (temp == 0)
        {
            fseek(fp, -1, 1);
            fscanf(fp, "%s", &ch1);
            fseek(fp, 1, 1);
            temp = 1;
        }
        if (isdigit(ch))
        {
            size = atoi(ch1);
            vertex = (int*) malloc(size * sizeof(int));     //dynamically allocate size  
            sz = (int*) malloc(size * sizeof(int));
            initalize(vertex, sz, size);        //initialization of vertex[] and sz[]
        }
        if (ch == '\n')
        {
            first = 1;
            temp = 0;
        }
    }
    else
    {
        ch = fgetc(fp);
        if (isdigit(ch))
            temp = temp * 10 + (ch - 48);   //calculating value from ch
        else
        {
            /* Validating the file  */

            if (ch != ',' && ch != '\n' && ch != EOF)
            {
                printf("\n\nUnkwown Character Detected.. Exiting..!");

                exit(1);
            }
            if (ch == ',')
                node1 = temp;
            else
            {
                node2 = temp;
                printf("\n\n%d\t%d", node1, node2);
                if (node1 > node2)
                {
                    temp = node1;
                    node1 = node2;
                    node2 = temp;
                }

                /* Adding the input nodes */

                if (!connected(vertex, node1, node2))
                    add(vertex, sz, node1, node2);
            }
            temp = 0;
        }

        if (ch == EOF)
        {
            fclose(fp);
            break;
        }
    }
}

do
{
    printf("\n\n==== check if connected ===");
    printf("\nEnter First Vertex:");
    scanf("%d", &node1);
    printf("\nEnter Second Vertex:");
    scanf("%d", &node2);

    /* Validating The Input */

    if( node1 > size || node2 > size )
    {
        printf("\n\n Invalid Node Value..");
        break;
    }

    /* Checking the connectivity of nodes */

    if (connected(vertex, node1, node2))
        printf("Vertex %d and %d are Connected..!", node1, node2);
    else
        printf("Vertex %d and %d are Not Connected..!", node1, node2);


    printf("\n 0/1:  ");

    scanf("%d", &temp);

} while (temp != 0);

free((void*) vertex);
free((void*) sz);


return 0;
}

void initalize(int *vertex, int *sz, int size) //Initialization of graph
{
int i;
for (i = 0; i < size; i++)
{
    vertex[i] = i;
    sz[i] = 0;
}
}
int root(int *vertex, int i)    //obtaining the root
{
while (i != vertex[i])
{
    vertex[i] = vertex[vertex[i]];
    i = vertex[i];
}
return i;
}

/* Time Complexity for Add --> logn */
void add(int *vertex, int *sz, int p, int q) //Adding of node
{
int i, j;
i = root(vertex, p);
j = root(vertex, q);

/* Adding small subtree in large subtree  */

if (sz[i] < sz[j])
{
    vertex[i] = j;
    sz[j] += sz[i];
}
else
{
    vertex[j] = i;
    sz[i] += sz[j];
}

}

/* Time Complexity for Search -->lg* n */

int connected(int *vertex, int p, int q) //Checking of  connectivity of nodes
{
/* Checking if root is same  */

if (root(vertex, p) == root(vertex, q))
    return 1;

return 0;
}
1
  • This does not seem to solve the problem as asked. The OP wants to find all the simple paths between the two nodes, not merely to check whether a path exists. Commented Feb 21, 2016 at 1:49
1

find_paths[s, t, d, k]

This question is old and answered already. However, none show perhaps a more flexible algorithm for accomplishing the same thing. So I'll throw my hat into the ring.

I personally find an algorithm of the form find_paths[s, t, d, k] useful, where:

  • s is the starting node
  • t is the target node
  • d is the maximum depth to search
  • k is the number of paths to find

Using your programming language's form of infinity for d and k will give you all paths§.

§ obviously if you are using a directed graph and you want all undirected paths between s and t you will have to run this both ways:

find_paths[s, t, d, k] <join> find_paths[t, s, d, k]

Helper Function

I personally like recursion, although it can difficult some times, anyway first lets define our helper function:

def find_paths_recursion(graph, current, goal, current_depth, max_depth, num_paths, current_path, paths_found)
  current_path.append(current)

  if current_depth > max_depth:
    return

  if current == goal:
    if len(paths_found) <= number_of_paths_to_find:
      paths_found.append(copy(current_path))

    current_path.pop()
    return

  else:
    for successor in graph[current]:
    self.find_paths_recursion(graph, successor, goal, current_depth + 1, max_depth, num_paths, current_path, paths_found)

  current_path.pop()

Main Function

With that out of the way, the core function is trivial:

def find_paths[s, t, d, k]:
  paths_found = [] # PASSING THIS BY REFERENCE  
  find_paths_recursion(s, t, 0, d, k, [], paths_found)

First, lets notice a few thing:

  • the above pseudo-code is a mash-up of languages - but most strongly resembling python (since I was just coding in it). A strict copy-paste will not work.
  • [] is an uninitialized list, replace this with the equivalent for your programming language of choice
  • paths_found is passed by reference. It is clear that the recursion function doesn't return anything. Handle this appropriately.
  • here graph is assuming some form of hashed structure. There are a plethora of ways to implement a graph. Either way, graph[vertex] gets you a list of adjacent vertices in a directed graph - adjust accordingly.
  • this assumes you have pre-processed to remove "buckles" (self-loops), cycles and multi-edges
0

Here's a thought off the top of my head:

  1. Find one connection. (Depth-first search is probably a good algorithm for this, since the path length doesn't matter.)
  2. Disable the last segment.
  3. Try to find another connection from the last node before the previously disabled connection.
  4. Goto 2 until there are no more connections.
1
  • This will not work in general: it's quite possible for two or more paths between the vertices to have the same last edge. Your method would only find one of such paths. Commented Feb 21, 2016 at 1:39
0

As far as I can tell the solutions given by Ryan Fox (58343, Christian (58444), and yourself (58461) are about as good as it get. I do not believe that breadth-first traversal helps in this case, as you will not get all paths. For example, with edges (A,B), (A,C), (B,C), (B,D) and (C,D) you will get paths ABD and ACD, but not ABCD.

2
  • mweerden, The breadth-first traversal that I submitted will find ALL paths while avoiding any cycles. For the graph that you have specified, the implementation correctly finds all three paths. Commented Sep 12, 2008 at 14:33
  • I didn't completely read your code and assumed you used a breadth-first traversal (because you said so). However, on closer inspection after your comment, I noticed that it is in fact not. It is actually a memoryless depth-first traversal like those of Ryan, Christian and Robert.
    – mweerden
    Commented Sep 12, 2008 at 15:01
0

I found a way to enumerate all the paths including the infinite ones containing loops.

http://blog.vjeux.com/2009/project/project-shortest-path.html

Finding Atomic Paths & Cycles

Definition

What we want to do is find all the possible paths going from point A to point B. Since there are cycles involved, you can't just go through and enumerate them all. Instead, you will have to find atomic path that doesn't loop and the smallest possible cycles (you don't want your cycle to repeat itself).

The first definition I took of an atomic path is a path that does not go through the same node twice. However, I found out that is was not taking all possibilities. After some reflexion, I figured out that nodes aren't important, however edges are! So an atomic path is a path that does not go through the same edge twice.

This definition is handy, it also works for cycles: an atomic cycle of point A is an atomic path that goes from point A and ends to point A.

Implementation

Atomic Paths A -> B

In order to get all the path starting from point A, we are going to traverse the graph recursively from the point A. While going through a child, we are going to make a link child -> parent in order to know all the edges we have already crossed. Before we go to that child, we must traverse that linked list and make sure the specified edge has not been already walked through.

When we arrive to the destination point, we can store the path we found.

Freeing the list

A problem occurs when you want to free the linked list. It is basically a tree chained in the reverse order. A solution would be to double-link that list and when all the atomic paths been found, free the tree from the starting point.

But a clever solution is to use a reference counting (inspired from Garbage Collection). Each time you add a link to a parent you adds one to its reference count. Then, when you arrive at the end of a path, you go backward and free while the reference count equals to 1. If it is higher, you just remove one and stop.

Atomic Cycle A

Looking for the atomic cycle of A is the same as looking for the atomic path from A to A. However there are several optimizations we can do. First, when we arrive at the destination point, we want to save the path only if the sum of the edges cost is negative: we only want to go through absorbing cycles.

As you have seen previously, the whole graph is being traversed when looking for an atomic path. Instead, we can limit the search area to the strongly connected component containing A. Finding these components requires a simple traverse of the graph with Tarjan's algorithm.

Combining Atomic Paths and Cycles

At this point, we have all the atomic paths that goes from A to B and all the atomic cycles of each node, left to us to organize everything to get the shortest path. From now on we are going to study how to find the best combination of atomic cycles in an atomic path.

1
  • This does not seem to answer the question as asked. Commented Feb 21, 2016 at 1:41
0

As ably described by some of the other posters, the problem in a nutshell is that of using a depth-first search algorithm to recursively search the graph for all combinations of paths between the communicating end nodes.

The algorithm itself commences with the start node you give it, examines all its outgoing links and progresses by expanding the first child node of the search tree that appears, searching progressively deeper and deeper until a target node is found, or until it encounters a node that has no children.

The search then backtracks, returning to the most recent node it hasn’t yet finished exploring.

I blogged about this very topic quite recently, posting an example C++ implementation in the process.

0

Adding to Casey Watson's answer, here is another Java implementation,. Initializing the visited node with the start node.

private void getPaths(Graph graph, LinkedList<String> visitedNodes) {
                LinkedList<String> adjacent = graph.getAdjacent(visitedNodes.getLast());
                for(String node : adjacent){
                    if(visitedNodes.contains(node)){
                        continue;
                    }
                    if(node.equals(END)){
                        visitedNodes.add(node);
                        printPath(visitedNodes);
                        visitedNodes.removeLast();
                    }
                    visitedNodes.add(node);
                    getPaths(graph, visitedNodes);
                    visitedNodes.removeLast();  
                }
            }
0
gg1=nx.from_edgelist([('A','B'),('A','C'),('B','A'),('B','D'),('B','E'),('B','F'),('C','A'),('C','E'),
                  ('C','F'),('D','B'),('E','C'),('E','F'),('F','B'),('F','C'),('F','E')],create_using=nx.DiGraph)

pd.Series(nx.all_simple_paths(gg1, source='B', target='E'))

out

0       [B, A, C, E]
1    [B, A, C, F, E]
2             [B, E]
3       [B, F, C, E]
4          [B, F, E]
dtype: object

Not the answer you're looking for? Browse other questions tagged or ask your own question.