Friday, April 5, 2019

BFS in Practice

                                   BFS in Practice 



      BFS, in general, stands for Breadth First Search in a graph. It has the same time complexity as DFS on a graph, but it tends to use more space, especially when the branching factor of the graph is large. However, as a trade-off, BFS is guaranteed to give an optimal solution: if we can reach the target, the path BFS used to find it is guaranteed to be a shortest path (or one of the paths with the same, minimal distance). 

     We have talked about this topic a long time ago in my previous blog posts, but here are two reasons for this new post: 1. to explore an elegant way to write BFS code for a fairly complicated problem; 2. BFS can also stand for Best First Search in a graph, and we will look at examples of this algorithm. 

1. Level order print Binary Tree 

For example:
Given a binary tree
    3
   / \
  9  20
    /  \
   15   7
return its level order traversal as:
[
  [3],
  [9,20],
  [15,7]
]
--------------------------------------------------------------------------

This is a classic problem for breadth-first search, where we use a queue to sweep the binary tree level by level. Here is the solution in Java:
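A minimal sketch of such a solution, assuming a simple TreeNode class with an int val and left/right children:

import java.util.*;

class TreeNode {
    int val;
    TreeNode left, right;
    TreeNode(int val) { this.val = val; }
}

class LevelOrderTraversal {
    public static List<List<Integer>> levelOrder(TreeNode root) {
        List<List<Integer>> result = new ArrayList<>();
        if (root == null) return result;
        Queue<TreeNode> queue = new LinkedList<>();
        queue.offer(root);
        while (!queue.isEmpty()) {
            int size = queue.size();                 // number of nodes on the current level
            List<Integer> level = new ArrayList<>();
            for (int i = 0; i < size; i++) {
                TreeNode cur = queue.poll();
                level.add(cur.val);
                if (cur.left != null) queue.offer(cur.left);
                if (cur.right != null) queue.offer(cur.right);
            }
            result.add(level);                       // one finished level per outer iteration
        }
        return result;
    }
}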




2. Walls and gates 


Given an m x n 2D grid initialized with these three possible values:
  1. -1 - A wall or an obstacle.
  2. 0 - A gate.
  3. INF - Infinity means an empty room. We use the value 2^31 - 1 = 2147483647 to represent INF; you may assume that the distance to a gate is less than 2147483647.
Fill each empty room with the distance to its nearest gate. If it is impossible to reach a gate, it should be filled with INF.
Example: 
Given the 2D grid:
INF  -1  0  INF
INF INF INF  -1
INF  -1 INF  -1
  0  -1 INF INF
After running your function, the 2D grid should be:
  3  -1   0   1
  2   2   1  -1
  1  -1   2  -1
  0  -1   3   4
---------------------------------------------------------------------------

What is your first algorithm after reading the problem? Well, my first idea is to do a BFS starting from each empty room position. That algorithm will give the optimal answer by the property of BFS. However, because 1. the number of gates is less than the number of empty rooms (at least in this example, and possibly in general), and 2. the number 0 that represents a gate can also represent its distance to the nearest gate (the distance to itself is 0), we can consider doing BFS starting from every gate and then reaching every possible empty room.

Here is the solution based on the revised algorithm above:
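A minimal sketch of this multi-source BFS, where every gate is put into the queue before the search starts:

import java.util.*;

class WallsAndGates {
    private static final int INF = Integer.MAX_VALUE;   // the value used for empty rooms

    public static void wallsAndGates(int[][] rooms) {
        if (rooms == null || rooms.length == 0) return;
        int rows = rooms.length, cols = rooms[0].length;
        int[][] dirs = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
        Queue<int[]> queue = new LinkedList<>();
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
                if (rooms[i][j] == 0) queue.offer(new int[]{i, j});   // start from every gate
        while (!queue.isEmpty()) {
            int[] cur = queue.poll();
            for (int[] d : dirs) {                       // generate up, down, right and left neighbors
                int r = cur[0] + d[0], c = cur[1] + d[1];
                if (r < 0 || r >= rows || c < 0 || c >= cols || rooms[r][c] != INF) continue;
                rooms[r][c] = rooms[cur[0]][cur[1]] + 1; // first time reached = shortest distance
                queue.offer(new int[]{r, c});
            }
        }
    }
}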



Note: 1. We could use new int[]{i, j} to store a coordinate, which saves us the time of creating a node class

          2. How do we search all 4 directions efficiently? A direction array such as {{1, 0}, {-1, 0}, {0, 1}, {0, -1}} is a handy way to do this: adding each pair to the current coordinate generates the coordinates of the up, down, right and left neighbors


3. Number of Islands 

Given a 2d grid map of '1's (land) and '0's (water), count the number of islands. An island is surrounded by water and is formed by connecting adjacent lands horizontally or vertically. You may assume all four edges of the grid are all surrounded by water.
Example 1:
Input:
11110
11010
11000
00000

Output: 1
Example 2:
Input:
11000
11000
00100
00011

Output: 3
--------------------------------------------------------------------------

This is also a very classic problem; I think it appears in my entry-level data structure textbook. To solve it we traverse the input grid and increment the counter by 1 every time we see a "1" at location grid[i][j]. Then, starting from grid[i][j], we do a BFS to mark all the connected "1" values as "0", which makes sure we only increment our counter by 1 for each island.

Here is the solution code:
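A minimal sketch of this idea, sinking every visited '1' to '0' with a BFS:

import java.util.*;

class NumberOfIslands {
    public static int numIslands(char[][] grid) {
        if (grid == null || grid.length == 0) return 0;
        int rows = grid.length, cols = grid[0].length, count = 0;
        int[][] dirs = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                if (grid[i][j] != '1') continue;
                count++;                                 // a new, not yet visited island
                Queue<int[]> queue = new LinkedList<>();
                queue.offer(new int[]{i, j});
                grid[i][j] = '0';
                while (!queue.isEmpty()) {
                    int[] cur = queue.poll();
                    for (int[] d : dirs) {
                        int r = cur[0] + d[0], c = cur[1] + d[1];
                        if (r < 0 || r >= rows || c < 0 || c >= cols || grid[r][c] != '1') continue;
                        grid[r][c] = '0';                // sink connected land so each island is counted once
                        queue.offer(new int[]{r, c});
                    }
                }
            }
        }
        return count;
    }
}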



Note: To search all 4 directions we used the direction-array trick again. Besides, new int[]{i, j} can be used to store a coordinate. 


4. Best First Search 


From the name of this algorithm, we can tell that the "best" node, whose definition will certainly differ from problem to problem, has the highest priority to be searched (expanded). Therefore, a priority queue is often used in this algorithm to find the next "best" node to expand.

 Each Best First algorithm will have 3 components:

1. Initial state (Start node) 
2. Node expansion rule 
3. Termination condition 

Example 1: 

The classic Dijkstra's Algorithm for Single Source Shortest Path (SSSP) is also a variation of Best First Search. Here are the 3 components for Dijkstra's algorithm (a sketch in code follows below):

1. Initial state -> the source node has distance = 0, all other nodes have distance = +INF
2. Node expansion rule -> (a). cost = cost of the parent node + distance from the parent node to the current node
(b). The node with the lowest cost is popped first (it has the highest priority)
(c). Each time the best node is popped, we update all its neighbors' distances using rule (a) above.
3. Termination condition -> when the priority queue becomes empty (all reachable nodes in the graph are completed)
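
A minimal sketch of Dijkstra's algorithm along these lines, assuming an adjacency-list graph where graph[u] holds {neighbor, weight} pairs:

import java.util.*;

class Dijkstra {
    public static int[] shortestPaths(List<int[]>[] graph, int source) {
        int n = graph.length;
        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);            // initial state: every node at +INF
        dist[source] = 0;                                // except the source at 0
        PriorityQueue<int[]> pq = new PriorityQueue<>((a, b) -> a[1] - b[1]);
        pq.offer(new int[]{source, 0});
        while (!pq.isEmpty()) {                          // terminate when the queue is empty
            int[] cur = pq.poll();                       // the "best" (closest) node so far
            int u = cur[0], d = cur[1];
            if (d > dist[u]) continue;                   // stale entry, a better cost was already found
            for (int[] edge : graph[u]) {
                int v = edge[0], w = edge[1];
                if (dist[u] + w < dist[v]) {             // expansion rule (a): parent cost + edge weight
                    dist[v] = dist[u] + w;
                    pq.offer(new int[]{v, dist[v]});
                }
            }
        }
        return dist;
    }
}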

Example 2:

Given a matrix of size N*N, and for each row, the elements are sorted in ascending order, and for each column, the elements are also sorted in ascending order. How to find the K-th smallest element in it?


Solution:

1. Initial state (start node) input[0][0]
2. Node expansion/generation rule: expand node[i][j]
         a. -> generate[i+1][j]
         b. -> generate[i][j+1]
3. Termination condition:
         a. When the k-th element is popped out for expansion, we return it

4.* Be careful about de-duplication of the generated states (see the sketch below)
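
A minimal sketch of this best-first search, assuming an n*n int matrix and a valid k:

import java.util.*;

class KthSmallestInSortedMatrix {
    public static int kthSmallest(int[][] matrix, int k) {
        int n = matrix.length;
        // Min-heap ordered by cell value: the "best" (smallest) cell is expanded first.
        PriorityQueue<int[]> minHeap =
                new PriorityQueue<>((a, b) -> matrix[a[0]][a[1]] - matrix[b[0]][b[1]]);
        boolean[][] visited = new boolean[n][n];         // de-duplication of generated states
        minHeap.offer(new int[]{0, 0});                  // initial state: input[0][0]
        visited[0][0] = true;
        int result = matrix[0][0];
        for (int i = 0; i < k; i++) {
            int[] cur = minHeap.poll();
            result = matrix[cur[0]][cur[1]];             // the (i+1)-th smallest element so far
            if (cur[0] + 1 < n && !visited[cur[0] + 1][cur[1]]) {   // generate [i+1][j]
                visited[cur[0] + 1][cur[1]] = true;
                minHeap.offer(new int[]{cur[0] + 1, cur[1]});
            }
            if (cur[1] + 1 < n && !visited[cur[0]][cur[1] + 1]) {   // generate [i][j+1]
                visited[cur[0]][cur[1] + 1] = true;
                minHeap.offer(new int[]{cur[0], cur[1] + 1});
            }
        }
        return result;
    }
}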


5. Destroy Monsters Game 
(Multiple Breadth First Search) 


Given a 2D array where, for each cell, the value 0 means a wall (cannot cross), the value 1 means a path, and the values 2 and above mean monsters. We want to find the shortest path starting from the point (0, 0) that finds and destroys all the monsters with value 2 (after a monster is found and destroyed, the cell value changes to 1), then finds and destroys all the monsters with value 3, then the monsters with value 4... until we have destroyed all the monsters in the grid.

*for the purpose of easy demonstration, let's just assume the maximum monster has value 3.



If we want to reach all 4 levels, we can easily extend it as:






Hope you enjoy the game :-)

Kind Regards ^^
Ye Jiang



Monday, December 10, 2018

Bank Real-Time Fraud Detection with Flink

                      Bank Fraud Detection with Flink 


        Apache Flink is an open source stream processing framework. The core of Flink is a distributed streaming dataflow engine; however, it also provides support for batch processing, graph processing and iterative processing in ML. Flink is a next-generation engine for stream processing and is reported to be even faster than Spark, a well-known data processing engine specialized for batch processing. So, in terms of speed, Flink > Spark >> Hadoop. Actually, industry giants like Alibaba, Uber, Yelp... are using Flink as their stream processing engine! Check this out to see part of the list.

    Here, I want to present an example of how to use Flink's stream processing API to detect fraudulent bank transactions in real time.

   First, let's define what a fraudulent transaction is in our example; in other words, list what kinds of transactions we need to detect:

   1. If the customer in a transaction is an alarmed customer.

   2. If the credit card is a reported lost card.

   3. The number of transactions is more than 10 in a 1-minute window.

   4. City of transactions is changed more than 2 times in a 1-minute window.


If any one of the 4 conditions is met, we consider the transaction a potential fraud and raise an alarm.

 Then, let's see the data set samples in this example.

1. Transaction data: 

Transaction_ID, TimeStamp, City, Customer_ID, accountNumber, CreditCardNumber, Amount 
OXGT865174,2018-06-14 23:41:27,Surat,id_741yce,yc43363383, 126028343460,        457
UXJT427774,2018-06-14 23:41:37,Pune,id_741yce,yc43363383,   126028343460,        155
FIPW460992,2018-06-14 23:42:15,Surat,id_741yce,yc43363383,   126028343460,        555


2. Alarmed Customer: 

Customer_ID, AccountNumber
id_158gsb,      gs38055270
id_710tar,        ta12685664
id_433zlh,       zl94914183

3. Lost Credit Card: 

CreditCardNumber, TimeStamp,        Name,    Progress
124460647673,2018-01-12 21:41:18,Allison,  Investigating
122018058677,2018-02-13 20:41:19,Arthur,   Pending with customer
128958965781,2018-03-24 22:21:29,Ana,       Complaint registered


After we are clear about the task and familiar with the data, we can finally dive into the Flink logic :D

The first thing we will do is broadcast the alarmed customer file and the lost credit card file, because they are relatively lightweight and we want every node in our distributed system to have access to these two important files.



Note that POJO means Plain Old Java Object.

Then, we start by getting the current streaming execution environment:
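A minimal sketch (Flink 1.x DataStream API):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Entry point of a Flink streaming job: obtain the current execution environment.
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();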



Get and broadcast the 3 kinds of data we discussed above:
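A rough sketch of what this step could look like; the file paths and the constructors that parse each CSV line into its POJO are assumptions, not the original code:

// Read the two lookup files, turn each line into a POJO and broadcast them to every task.
DataStream<AlarmedCustomer> alarmedCustomers = env
        .readTextFile("alarmed_customers.csv")        // hypothetical path
        .map(line -> new AlarmedCustomer(line))
        .broadcast();

DataStream<LostCard> lostCards = env
        .readTextFile("lost_cards.csv")               // hypothetical path
        .map(line -> new LostCard(line))
        .broadcast();

// The transaction stream itself is not broadcast; it is the main, high-volume stream.
DataStream<Transaction> transactions = env
        .readTextFile("transactions.csv")             // hypothetical path
        .map(line -> new Transaction(line));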




Note that Flink has the built-in broadcast method, which makes this task easy for us.

Then, the first check is against alarmed customers:



Here is the AlamedCustCheck class in the last line of the above code:
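The original class is not reproduced here; one possible sketch of the idea is a CoFlatMapFunction that remembers the broadcast alarmed customers in a local set and emits every transaction whose customer id is in that set (the class and field names below are assumptions):

// Used as: transactions.connect(alarmedCustomers).flatMap(new AlarmedCustomerCheck())
public static class AlarmedCustomerCheck
        implements CoFlatMapFunction<Transaction, AlarmedCustomer, Transaction> {

    private final Set<String> alarmedIds = new HashSet<>();

    @Override
    public void flatMap1(Transaction tx, Collector<Transaction> out) {
        if (alarmedIds.contains(tx.customerId)) {
            out.collect(tx);                      // this transaction involves an alarmed customer
        }
    }

    @Override
    public void flatMap2(AlarmedCustomer customer, Collector<Transaction> out) {
        alarmedIds.add(customer.customerId);      // remember every broadcast alarmed customer
    }
}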




Check the second condition:







The actual checking logic is in the LostCardCheck class here:




The third task forms a 1-minute window to detect more than 10 transactions on the same credit card:
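A sketch of the idea using a 1-minute tumbling window keyed by the credit card number (the field names are assumptions):

DataStream<Transaction> excessiveTransactions = transactions
        .keyBy(tx -> tx.creditCardNumber)                    // group transactions per card
        .timeWindow(Time.minutes(1))                         // 1-minute tumbling window
        .apply(new WindowFunction<Transaction, Transaction, String, TimeWindow>() {
            @Override
            public void apply(String card, TimeWindow window,
                              Iterable<Transaction> txs, Collector<Transaction> out) {
                int count = 0;
                for (Transaction tx : txs) count++;
                if (count > 10) {                            // more than 10 transactions in the window
                    for (Transaction tx : txs) out.collect(tx);
                }
            }
        });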



Again the FilterAndMapMoreThan10 class here:



The last condition checks whether the city of the transactions changes more than 1 time in a 1-minute window:



Lastly, we want to combine all the suspicious transactions together with Flink's union method and execute our application:
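A sketch of this last step, assuming the four flagged streams produced above (the stream names are assumptions):

// Union all flagged transactions into a single alert stream, then run the job.
DataStream<Transaction> fraudAlerts = alarmedCustomerTransactions
        .union(lostCardTransactions, excessiveTransactions, cityChangeTransactions);

fraudAlerts.print();                     // in a real job this would go to a sink or an alerting system

env.execute("Bank fraud detection");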




The complete code is in my GitHub Repo if you would like to check them out :-)



Enjoy Learning :D
Ye Jiang


Tuesday, November 27, 2018

CNN with TensorFlow

            An Example of Creating CNN with Tensorflow  




-- Introduction of CNN  


 A Convolutional Neural Network (CNN) is a type of neural network in machine learning that is specialized in image recognition. It can find features that are not in a specific spot, such as a traffic light in a picture or words within a sentence.

 CNNs are actually inspired by the biology of the visual cortex (thank God!), in which local receptive fields are groups of neurons that only respond to a part of what you see. The overlapping of these fields covers the entire visual field, which is similar to the effect of convolution.

 A typical CNN, like the picture at the top, is composed of a few convolutional layers to detect certain features and some pooling layers to reduce the computation in the network. In the end, we use the softmax function to give a prediction probability for each candidate class.

 For the detailed calculation of a convolutional layer, here is an example using the filter matrix
        [ [ 0 0 1],
         [ 1 0 0],
         [ 1 1 0] ]
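
As an illustration (the input values here are made up for the example), applying this filter to a hypothetical 3x3 input patch [ [1 2 3], [4 5 6], [7 8 9] ] multiplies each input value by the filter value at the same position and sums the products: 1*0 + 2*0 + 3*1 + 4*1 + 5*0 + 6*0 + 7*1 + 8*1 + 9*0 = 22, which becomes one value of the output feature map.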


Note that different filter matrices can detect different features of a given picture (data).


-- Create CNN for handwriting recognition 


 We will use the MNIST dataset to train and test our CNN. The MNIST database is a large database of handwritten digits that is commonly used for training image-processing systems. This dataset is actually included inside TensorFlow, so first let's import all the libraries we need for this task. 




Since the images were flattened into 1D arrays of 784 pixels, we want to reshape the data into 2D, 28*28 pixels:



Note that we actually convert each image into [28, 28, 1] data, where the 1 represents the color channels. Since our data is black and white, we only have 1 color channel; it would be 3 for a color picture.

Then we convert our labels into categorical values in one-hot format:



After we pre-processed our data, we will, then, build our CNN model!



After adding 2 convolutional layers in the second and third steps, we reduce our computational complexity by using a 2*2 max-pooling layer and a dropout rate of 0.25.

 In the end, we need to flatten our results into 1D and pass them into our final prediction layer, which is a dense layer of size 10 with a softmax activation function.

Then we compile our model with a loss function and an optimizer; note that Keras saves us a lot of time in building this CNN model.




We train our model with 10 epochs. The more epochs the better, but a CNN is very computationally expensive to execute.



Finally, we could see that our model has over 99% accuracy! It is great, but it did take a lot of time to execute.



Reference:
https://www.mathworks.com/solutions/deep-learning/convolutional-neural-network.html


Happy Learning :_)
Ye Jiang



Kernel Methods and Basis Expansions in Machine Learning


               Kernel Methods and Basis Expansions




-- Introduction


    When we use SVM or other models to deal with classification problems in machine learning (ML), life is relatively easy when the dataset is linearly separable. However, most real problems do not have this property, and that is why kernel methods and basis expansions play a significant role in ML. (Note that if the large majority of the data is linearly separable and only a few points affect our decision boundary, we can use a soft-margin SVM to deal with it.)

    So, we start with some data points that are not linearly separable; by applying a kernel function we can map our data to a higher-dimensional space where the data becomes linearly separable, and then we can easily classify our data points in that high-dimensional space.

    Mapping data to a higher dimension is basically doing some calculation based on each example's features, therefore we usually call it the primal form of kernel methods. The dual form of kernel methods, then, is a similarity function that corresponds to an inner product of two data points in some higher-dimensional space. We can usually apply our domain knowledge in the dual form of kernel functions to define what kinds of points are "similar" to each other in a specific problem.

   The reason we want to use basis expansion on our data is very similar to the reason we use the primal form of kernel methods to map our data: we want to separate the data even when the decision boundary is not linear.


-- An Example of Basis Expansion: 


Here is a small example. Assume that we have 2 data features x1 and x2, and the decision boundary is something like x1 = x2^2 + 10, which is quadratic in 2D space. We then extend our data to a 5D space by applying: X = (x1, x2) --> X = (x1, x2, x1^2, x2^2, x1*x2). Our decision boundary is linear in this 5D space: originally x1 - x2^2 - 10 = 0 --> wX + b = 0, where w = (1, 0, 0, -1, 0) and b = -10.


-- Kernel Ridge Regression:


For ridge linear regression, the goal is to find W that minimizes the loss function:

(a)  L(W) = || Y - X*W ||^2 + Lambda * || W ||^2

We can derive the optimal W* in closed form:

(b)  W* = ( X^T * X + Lambda * I )^(-1) * X^T * Y

After we apply the kernel trick, we do not necessarily know the transformation function f(x); having the inner product of two samples in the new space suffices to give the prediction for a new sample x_new, using the training samples X1, X2, ..., Xn:

Y^new = W* . f(x_new)

By substituting W* with form (b) above, and replacing every inner product of transformed samples with the kernel function, we get the dual form:

--> Y^new = K(x_new, X) * ( K(X, X) + Lambda * I )^(-1) * Y

Where Lambda is a constant, I is the identity matrix, K(X, X) is the kernel (Gram) matrix over the training samples Xi, and K(x_new, X) is the vector of kernel values between the new sample x_new and each training sample.

--------------------

Then given a kernel function K( x1, x2 ) = ( 1 + x1*x2 )^2 , how do we actually perform kernel ridge regression?

-- Here is the main idea for dual form kernels:

We will apply kernel function K to train_x and test_x ( all the input data ), specifically:

kernel_trainX = K ( train_x , train_x ) 

kernel_testX = K (test_x , train_x) 

model.fit ( kernel_trainX , train_y ) 

Y^predict = model.predict( kernel_testX ) 


How to perform K (train_x, train_x)?  Here is the python code for a general polynomial kernel of any degree:





-- The main idea is, again, very similar to polynomial basis expansion, except, like we discussed previously, basis expansion is more similar to a primal form of kernels.

expanded_trainX =  E( train_x) 

expanded_testX = E(test_x) 

model.fit(expanded_trainX, train_y) 

Y^predict = model.predict(expanded_testX) 

Here is an example to perform a general polynomial expansion of any degree:




After this concrete example, I hope you have a better understanding of how kernel methods and basis expansion work in ML.


Happy Exploring^^
Ye Jiang
 






Saturday, November 3, 2018

Binary Search, First/Last Occurrence and K Closest

                               Binary Search, Easy? 


   You have probably heard of and written some binary search code already. However, do you really know binary search well enough to handle its many variants? Good for you if your answer is yes, but I would not call someone a decent software engineer if he/she cannot handle it well. 


1.  The Classic Binary Search: 

Find a target value in a sorted array, return its index value in the array if such target exists. Else return -1. 
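
A minimal sketch in Java:

class BinarySearch {
    public static int binarySearch(int[] array, int target) {
        if (array == null || array.length == 0) return -1;
        int left = 0, right = array.length - 1;
        while (left <= right) {
            int mid = left + (right - left) / 2;       // avoids overflow of (left + right)
            if (array[mid] == target) return mid;
            else if (array[mid] < target) left = mid + 1;
            else right = mid - 1;
        }
        return -1;
    }
}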




2. The First Occurrence 


Given an array and a target number, return the index of the first occurrence of the target in the array or -1 if such target does not exist. 
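
A minimal sketch in Java, using the trick summarized right below:

class FirstOccurrence {
    public static int firstOccurrence(int[] array, int target) {
        if (array == null || array.length == 0) return -1;
        int left = 0, right = array.length - 1;
        while (left < right - 1) {                     // stop when the two pointers are adjacent
            int mid = left + (right - left) / 2;
            if (array[mid] >= target) right = mid;     // mid could still be the first occurrence
            else left = mid;
        }
        if (array[left] == target) return left;        // check the left candidate first
        if (array[right] == target) return right;
        return -1;
    }
}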



To conclude the trick we used here: 

1. Loop while left < right - 1 (stop when the two pointers become adjacent) 
2. Update the right/left bound to mid instead of mid-1/mid+1 
3. Check both the left and the right value in the end to give the correct result. 


3. The last occurrence 


 Given an array and a target number, return the index of the last occurrence of the target in the array or -1 if such target does not exist. 

Could you come up with a solution based on the first occurrence we just solved?
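
A minimal sketch in Java:

class LastOccurrence {
    public static int lastOccurrence(int[] array, int target) {
        if (array == null || array.length == 0) return -1;
        int left = 0, right = array.length - 1;
        while (left < right - 1) {
            int mid = left + (right - left) / 2;
            if (array[mid] <= target) left = mid;      // go right first: a later occurrence may exist
            else right = mid;
        }
        if (array[right] == target) return right;      // check the right candidate first
        if (array[left] == target) return left;
        return -1;
    }
}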



Almost the same trick as the first occurrence. The only 2 differences are that we prefer to go to the right side of the array in the while loop, and we check the right pointer before the left pointer in the end. 


4. Closest Number in Sorted Array 


Given a sorted array and a target, find the number closest to the target and output its index.
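
A minimal sketch in Java:

class ClosestInSortedArray {
    public static int closest(int[] array, int target) {
        if (array == null || array.length == 0) return -1;
        int left = 0, right = array.length - 1;
        while (left < right - 1) {
            int mid = left + (right - left) / 2;
            if (array[mid] == target) return mid;
            else if (array[mid] < target) left = mid;
            else right = mid;
        }
        // Compare the absolute differences of the two remaining candidates.
        return Math.abs(array[left] - target) <= Math.abs(array[right] - target) ? left : right;
    }
}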



1. Loop while left < right - 1 (stop when the two pointers become adjacent) 
2. Check the absolute difference against the target value in the end.


5. K Closest Number in Sorted Array 


Given a sorted array and a target, find the k numbers closest to the target and output all of them as an array.
You may assume that K is always a valid input, that is K < array.length. 
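
A minimal sketch in Java; largestSmallerOrEqual is the binary-search helper mentioned below:

class KClosestInSortedArray {
    public static int[] kClosest(int[] array, int target, int k) {
        int[] result = new int[k];
        int left = largestSmallerOrEqual(array, target);   // last index with value <= target (or -1)
        int right = left + 1;
        for (int i = 0; i < k; i++) {
            // Expand whichever side is currently closer to the target.
            if (right >= array.length ||
                (left >= 0 && target - array[left] <= array[right] - target)) {
                result[i] = array[left--];
            } else {
                result[i] = array[right++];
            }
        }
        return result;
    }

    private static int largestSmallerOrEqual(int[] array, int target) {
        int left = 0, right = array.length - 1;
        while (left < right - 1) {
            int mid = left + (right - left) / 2;
            if (array[mid] <= target) left = mid;
            else right = mid;
        }
        if (array[right] <= target) return right;
        if (array[left] <= target) return left;
        return -1;                                         // every element is larger than the target
    }
}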




Two takeaways here:

1. We used the binary search technique in the helper function largestSmallerOrEqual 

2. Starting from that position we expand the left or right pointer to get the optimal result. The logic of how we expand the two pointers is worth practicing. 


6. Binary Search In a Sorted 2D Array 


Search in a sorted matrix: each row of the matrix is sorted in ascending order, and the first element of each row is equal to or larger than the last element of the previous row. 

Return the position if the target is found, otherwise return {-1, -1}. 


The trick here is to virtually convert the 2D array into a 1D array and do the classic binary search, 

By:  row =  mid / cols;  

        col = mid % cols; 
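
A minimal sketch of that conversion in Java:

class SearchInSortedMatrix {
    public static int[] search(int[][] matrix, int target) {
        if (matrix == null || matrix.length == 0 || matrix[0].length == 0) return new int[]{-1, -1};
        int rows = matrix.length, cols = matrix[0].length;
        int left = 0, right = rows * cols - 1;             // treat the matrix as a sorted 1D array
        while (left <= right) {
            int mid = left + (right - left) / 2;
            int row = mid / cols, col = mid % cols;        // map the 1D index back to 2D
            if (matrix[row][col] == target) return new int[]{row, col};
            else if (matrix[row][col] < target) left = mid + 1;
            else right = mid - 1;
        }
        return new int[]{-1, -1};
    }
}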


* Bonus 7: find the median of two sorted arrays 


There are two sorted arrays A and B of size m and n respectively.
Find the median of the two sorted arrays. The overall runtime complexity should be O(log (m+n)).

The gist of this question is to realize that the split point i needs to satisfy: 
1.  i + j = ( m + n + 1 ) / 2   
2. A[i-1] <= B[j]   
3. B[j-1] <= A[i] 
We use binary search to find such a position i in the smaller sorted array, therefore the runtime will be O(log(min(m, n))). A sketch follows below. 
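
A minimal sketch of this idea in Java:

class MedianOfTwoSortedArrays {
    public static double findMedian(int[] A, int[] B) {
        if (A.length > B.length) return findMedian(B, A);  // binary search on the shorter array
        int m = A.length, n = B.length;
        int lo = 0, hi = m, half = (m + n + 1) / 2;
        while (lo <= hi) {
            int i = (lo + hi) / 2;                         // elements taken from A's left part
            int j = half - i;                              // elements taken from B's left part
            if (i < m && B[j - 1] > A[i]) {
                lo = i + 1;                                // i is too small
            } else if (i > 0 && A[i - 1] > B[j]) {
                hi = i - 1;                                // i is too big
            } else {                                       // found the split point
                int maxLeft;
                if (i == 0) maxLeft = B[j - 1];
                else if (j == 0) maxLeft = A[i - 1];
                else maxLeft = Math.max(A[i - 1], B[j - 1]);
                if ((m + n) % 2 == 1) return maxLeft;
                int minRight;
                if (i == m) minRight = B[j];
                else if (j == n) minRight = A[i];
                else minRight = Math.min(A[i], B[j]);
                return (maxLeft + minRight) / 2.0;
            }
        }
        throw new IllegalArgumentException("Input arrays are not sorted");
    }
}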


Next time when you need to solve a binary search related problem, hope you have a better sense -_ - 

Enjoy Coding^^ 
Ye Jiang 



Thursday, July 20, 2017

Pressing down & Coming back

                         Recursion in Binary Tree  



(Figure: what happens when we call getSize on a binary tree object)

             We can use a recursive method to solve many types of questions, especially when the question can be boiled down to many little sub-problems, such as the classic merge sort, quick sort, and reverse-linked-list problems. You can see that the logic we use to sort a big array with merge sort is the same as the logic we use for a smaller array, and the logic we use to reverse a long linked list is the same as the logic we use to reverse a linked list with only 2 nodes, etc. 
            Therefore, when we take a look at the binary tree data structure, it is no surprise that a recursive approach helps us deal with it in a very handy way. In most cases, we just need to consider a 3-step approach for a node, which has a parent and 2 child nodes, and then apply this logic to every node in the binary tree using recursion. It is both effective and elegant :-) 

            So here are the 3 steps we need to consider: 

            Step 1: What does a node want from its left and right child nodes? 

            Step 2: After receiving these values, what does this node do at the current level?

            Step 3: What should this node return to its parent node? 

Note: Steps 1 and 3 should essentially be the same thing.


-- 1. Check Balance for Binary Tree


Check if a given binary tree is balanced. A balanced binary tree is one in which the depths of every node’s left and right subtree differ by at most 1.

---------------------------------------------------------------------------------------------------------------

According to the definition of a balanced binary tree, we could check the height of every node's two subtrees, which costs O(n*log(n)). We can certainly do better by following the 3-step pattern above. 

I will copy those 3 steps and fill in what we want at each step: 

             Step 1: What does a node want from its left and right child nodes? 
                 -- We want the heights of the left subtree and the right subtree 

            Step 2: After receiving these values, what does this node do at the current level?
                -- We compare those two values to see whether we are balanced at the current level 

            Step 3: What should this node return to its parent node?
                -- If we are still balanced at the current level, keep passing the height to the parent; otherwise a special value like -1 indicates an unbalanced condition. 


Then, according to those 3 steps, we come up with an O(n) solution: 
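
A minimal sketch, assuming a simple TreeNode class with an int val and left/right children:

static boolean isBalanced(TreeNode root) {
    return height(root) != -1;
}

// Returns the height of the subtree, or -1 if it is already unbalanced.
static int height(TreeNode node) {
    if (node == null) return 0;
    int left = height(node.left);                    // Step 1: ask the children for their heights
    if (left == -1) return -1;
    int right = height(node.right);
    if (right == -1) return -1;
    if (Math.abs(left - right) > 1) return -1;       // Step 2: check balance at the current level
    return Math.max(left, right) + 1;                // Step 3: report the height to the parent
}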



Could you match the 3 steps with the code above ?



-- 2. Maximum Sum Path from One Leaf Node to Another


Given a binary tree in which each node contains an integer number. Find the maximum possible sum from one leaf node to another leaf node. If there is no such path available, return Integer.MIN_VALUE(Java)/INT_MIN (C++).
Examples
  -15
  /    \
2      11
     /    \
    6     14
The maximum path sum is 6 + 11 + 14 = 31.
------------------------------------------------------------------------------------------------------------

 We will keep using the 3 steps above to see what we need to do: 

            Step 1: What does a node want from its left and right child nodes? 
                -- We want the max path sum from a leaf node up to its left/right child 

            Step 2: After receiving these values, what does this node do at the current level?
                -- When we can form a leaf-to-leaf path at the current level, we check whether we can update the global maximum sum using ( leftSum + rightSum + current.value ) 

            Step 3: What should this node return to its parent node? 
                -- return the max path sum from one leaf node, through this node, to its parent 


Here are the 3 steps in code: 
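
A minimal sketch with the same TreeNode assumption; maxSum holds the global answer:

static int maxSum = Integer.MIN_VALUE;               // global maximum leaf-to-leaf path sum

// Returns the max sum from a leaf in this subtree up to (and including) the current node.
static int leafToLeaf(TreeNode node) {
    if (node == null) return 0;
    if (node.left == null && node.right == null) return node.val;
    int left = leafToLeaf(node.left);                // Step 1: ask the children
    int right = leafToLeaf(node.right);
    if (node.left != null && node.right != null) {   // Step 2: only here can a leaf-to-leaf path form
        maxSum = Math.max(maxSum, left + right + node.val);
        return Math.max(left, right) + node.val;     // Step 3: pass the better single branch up
    }
    // Only one child: no leaf-to-leaf path at this node, just extend the existing branch.
    return (node.left == null ? right : left) + node.val;
}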




-- 3. Maximum Sum Path from Any Node to Any Node


Given a binary tree in which each node contains an integer number. Find the maximum possible sum from any node to any node (the start node and the end node can be the same). 
Assumptions
  • ​The root of the given binary tree is not null
Examples
   -1
  /    \
2      11
     /    \
    6    -14
one example of paths could be -14 -> 11 -> -1 -> 2
another example could be the node 11 itself
The maximum path sum in the above binary tree is 6 + 11 + (-1) + 2 = 18
---------------------------------------------------------------------------------------------------------------

 We could certainly still use those 3 steps to process this problem, but since we just completed question 2, let's consider the differences between these 2 questions. 
  Difference 1. We can update the global maximum sum at any level 
                  2. We do not have to include the left/right sub-result (because a single node's value by itself is still a valid path). 
Therefore here is the code: 

The 3 steps are still commented in the code for reference ^^ 
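
A minimal sketch for this version, with the same TreeNode assumption:

static int globalMax = Integer.MIN_VALUE;            // global maximum any-to-any path sum

// Returns the max "single branch" sum ending at this node (at least the node itself).
static int anyToAny(TreeNode node) {
    if (node == null) return 0;
    int left = Math.max(anyToAny(node.left), 0);     // Step 1: take a child's branch only if it helps
    int right = Math.max(anyToAny(node.right), 0);
    globalMax = Math.max(globalMax, left + right + node.val);   // Step 2: update at every level
    return Math.max(left, right) + node.val;         // Step 3: pass the better branch to the parent
}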


-- 4. Least Common Ancestor in  Binary Tree


Given two nodes in a binary tree, find their lowest common ancestor.
Assumptions
  • There is no parent pointer for the nodes in the binary tree
  • The given two nodes are guaranteed to be in the binary tree
---------------------------------------------------------------------------------------------------


Compared to the previous examples, the solution for this problem is not very intuitive at the beginning and requires some knowledge of the definition of a common ancestor in a binary tree. Here are 2 examples: 
        5
      /   \
     9     12
   /  \      \
  2    3      14
The lowest common ancestor of 2 and 14 is 5
The lowest common ancestor of 2 and 9 is 9 

-------------------------

            Step 1: What does a node want from its left and right child nodes? 
                -- We want to know whether the two target nodes are under their subtrees 

            Step 2: After receiving these values, what does this node do at the current level?
                -- We combine steps 2 and 3 together, because according to the two values we get from the child nodes, we return different values to the parent node accordingly 

            Step 3: What should this node return to its parent node? 
                 -- if both targets appear, one under the left subtree and one under the right subtree, we return the current node to indicate that it is the least common ancestor 
                  else if we did not see anything from the two children, we return null to tell the parent there is nothing under this node. 

                  else (we found one target under this node) we return that found node to the parent to indicate that this one target is under the current node. 
              

The code is actually very simple:
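
A minimal sketch, with the same TreeNode assumption and assuming both nodes exist in the tree:

static TreeNode lowestCommonAncestor(TreeNode root, TreeNode a, TreeNode b) {
    if (root == null || root == a || root == b) return root;
    TreeNode left = lowestCommonAncestor(root.left, a, b);    // Step 1: ask both subtrees
    TreeNode right = lowestCommonAncestor(root.right, a, b);
    if (left != null && right != null) return root;           // one target on each side: root is the LCA
    return left != null ? left : right;                       // Steps 2/3: pass up what was found, or null
}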



Hope you enjoyed this special way to consider a binary tree problem with recursion !  

Happy Coding <> 
Ye Jiang 





Tuesday, July 18, 2017

Understanding DFS Recursion Tree ( 2 )

                  All Parenthesis 2  &&  Combination of Coins



  After the first blog about how to understand a recursion tree, I think we at least have a little sense that we need to consider a recursion tree from two aspects: 1. the height of the tree, and 2. the branching factor of the tree. In this blog we will use another 2 questions to hammer home this process and explore more exciting parts of DFS recursion trees.
 




-- All Parenthesis 2


Get all valid permutations of l pairs of (), m pairs of [] and n pairs of {}.
Assumptions
  • l, m, n >= 0
Examples
l = 1, m = 1, n = 0, all the valid permutations are ["()[]", "([])", "[()]", "[]()"]
-----------------------------------------------------------------------------------------------------------------------------
  If you review All Parenthesis 1 in the previous blog post, you may quickly come up with a similar solution, which is to carry 6 parameters, one for each of the 6 parenthesis characters. Besides, the rule to decide whether we can add a certain parenthesis will be the same as before. However, it is actually more than what you think it might be :-) 
  If you keep that code there will be 2 obvious defects. 1. It is not appropriate to carry 7 or 8 parameters in any method; if we ever come to this situation, we need to come up with another solution. 2. It has a flaw in its logic: a combination like " ( { ) [ } ] " will appear in the final result. 
  To fix those 2 defects, we will use an array of integers to represent the number of each parenthesis left and maintain a stack to manage the proper parenthesis combination.  
 Here is a solution code: 
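A minimal sketch of this idea:

import java.util.*;

class ValidParentheses {
    public static List<String> validParentheses(int l, int m, int n) {
        int[] remaining = {l, l, m, m, n, n};           // remaining count of ( ) [ ] { }
        char[] parens = {'(', ')', '[', ']', '{', '}'};
        int targetLength = 2 * (l + m + n);
        List<String> result = new ArrayList<>();
        dfs(remaining, parens, targetLength, new StringBuilder(), new ArrayDeque<>(), result);
        return result;
    }

    private static void dfs(int[] remaining, char[] parens, int targetLength,
                            StringBuilder path, Deque<Character> stack, List<String> result) {
        if (path.length() == targetLength) {
            result.add(path.toString());
            return;
        }
        for (int i = 0; i < parens.length; i++) {
            if (i % 2 == 0) {                            // a left parenthesis: can add if any remain
                if (remaining[i] > 0) {
                    path.append(parens[i]);
                    stack.offerFirst(parens[i]);
                    remaining[i]--;
                    dfs(remaining, parens, targetLength, path, stack, result);
                    remaining[i]++;
                    stack.pollFirst();
                    path.deleteCharAt(path.length() - 1);
                }
            } else {                                     // a right parenthesis: only if it matches the stack top
                if (!stack.isEmpty() && stack.peekFirst() == parens[i - 1]) {
                    path.append(parens[i]);
                    stack.pollFirst();
                    remaining[i]--;
                    dfs(remaining, parens, targetLength, path, stack, result);
                    remaining[i]++;
                    stack.offerFirst(parens[i - 1]);
                    path.deleteCharAt(path.length() - 1);
                }
            }
        }
    }
}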




 So what will this recursion tree look like? 
1. The target length is equal to the height = 2 * ( l + m + n )
2. The maximum branching factor will be 6, but the actual tree will never reach this point 
--> O( 6 ^ ( 2 * ( l + m + n ) ) ) 



-- Combination of Coins


Given a number of different denominations of coins (e.g., 1 cent, 5 cents, 10 cents, 25 cents), get all the possible ways to pay a target number of cents.
Arguments
  • coins - an array of positive integers representing the different denominations of coins, there are no duplicate numbers and the numbers are sorted by descending order, eg. {25, 10, 5, 2, 1}
  • target - a non-negative integer representing the target number of cents, eg. 99
Assumptions
  • coins is not null and is not empty, all the numbers in coins are positive
  • target >= 0
  • You have infinite number of coins for each of the denominations, you can pick any number of the coins.
Examples
coins = {2, 1}, target = 4, the return should be
[
  [0, 4],   (4 cents can be conducted by 0 * 2 cents + 4 * 1 cents)
  [1, 2],   (4 cents can be conducted by 1 * 2 cents + 2 * 1 cents)
  [2, 0]    (4 cents can be conducted by 2 * 2 cents + 0 * 1 cents)
]
--------------------------------------------------------------------------------------------------------------------
Let's just take target = 99 cents, coins = { 1, 5, 10, 25 } as an example. How would you approach this problem?  

 There are 2 approaches that may solve this problem, and each of them has a different recursion tree. 
     Plan A: The tree height = 99; at each level we have 4 possible branches, which are to take one coin of value 1, 5, 10 or 25. ( Note that we could add some code to optimize this: once the total value already exceeds the target value, we stop. ) 
     Plan B: The tree height = 4; at each level we have n possible branches, where n depends on how many coins of that value we can take. For example, if we consider the value 25 cents at the first level, we have 4 branches: take 0, 1, 2 or 3 coins of 25 cents in our final result. ( Same idea: the level for 10 cents has 10 branches, and the level for 1 cent has 100. ) 

What is your choice? As software engineers, we can analyze and compare these 2 plans in both time and space complexity. 
   Plan A : Time --> O( 4^99 ) , Space --> O(99) 
   Plan B : Time --> O( 99^4 ) , Space --> O( 4 ) 

After comparing, it is not hard to see that Plan B is much better than Plan A! 
Here is the solution based on Plan B: 
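A minimal sketch of Plan B, where each level of the recursion tree decides how many coins of one denomination to use:

import java.util.*;

class CombinationsOfCoins {
    public static List<List<Integer>> combinations(int target, int[] coins) {
        List<List<Integer>> result = new ArrayList<>();
        dfs(target, coins, 0, new ArrayList<>(), result);
        return result;
    }

    private static void dfs(int remaining, int[] coins, int index,
                            List<Integer> counts, List<List<Integer>> result) {
        if (index == coins.length - 1) {
            // The last (smallest) denomination takes whatever is left, if it divides evenly.
            if (remaining % coins[index] == 0) {
                counts.add(remaining / coins[index]);
                result.add(new ArrayList<>(counts));
                counts.remove(counts.size() - 1);
            }
            return;
        }
        for (int count = 0; count * coins[index] <= remaining; count++) {
            counts.add(count);                           // take `count` coins of this denomination
            dfs(remaining - count * coins[index], coins, index + 1, counts, result);
            counts.remove(counts.size() - 1);            // backtrack
        }
    }
}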

















Here you go: building a recursion tree every time you write DFS code will help you better understand the internal working mechanism of your code and evaluate its performance easily!

Happy Coding < >
Ye Jiang


* Update( 7 / 22 / 2017 ) 
Here is another classic example that I want to add into this topic ! 

-- Binary Tree Sub-Path Sum to Target 


Given a binary tree in which each node contains an integer number. Determine if there exists a path (the path can only be from one node to itself or to any of its descendants), the sum of the numbers on the path is the given target number.
Examples
    5
  /    \
2      11
     /    \
    6     14
  /
 3
If target = 17, There exists a path 11 + 6, the sum of the path is target.
If target = 20, There exists a path 11 + 6 + 3, the sum of the path is target.
If target = 10, There does not exist any paths sum of which is target.
If target = 11, There exists a path only containing the node 11.
-----------------------------------------------------------------------------------------------------------
If we keep the actual path from the root to the current node and then check every possible sub-path at each level, we will end up with an O(n^2) solution, but for sub-path sums we can actually do better :-)

Remember how to get range sum in constant time ? 

-- We preprocess the array to populate a prefix sum at each position; then the range sum from i to j is prefixSum( j ) - prefixSum( i - 1 ). 

We could do a similar trick in this problem. 
      1.  Use a set to store all previous prefix sum. 
      2.  Keep a value to record total sum from root to current position 
      3. Then look for a possible sub-path sum by --> Set.contains( totalSum - target ) in O(1) time. 


Here is a solution based on the strategy we come up with :
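A minimal sketch of this strategy, assuming a simple TreeNode class (int val, TreeNode left, TreeNode right) like the one used in the earlier posts:

import java.util.*;

class PathSumToTarget {
    public static boolean exist(TreeNode root, int target) {
        Set<Integer> prefixSums = new HashSet<>();
        prefixSums.add(0);                           // empty prefix, so a path starting at the root counts
        return dfs(root, 0, target, prefixSums);
    }

    private static boolean dfs(TreeNode node, int sum, int target, Set<Integer> prefixSums) {
        if (node == null) return false;
        sum += node.val;                             // total sum from the root to the current node
        if (prefixSums.contains(sum - target)) return true;   // some sub-path ending here sums to target
        boolean added = prefixSums.add(sum);         // remember this prefix for the nodes below
        boolean found = dfs(node.left, sum, target, prefixSums)
                     || dfs(node.right, sum, target, prefixSums);
        if (added) prefixSums.remove(sum);           // backtrack: this prefix leaves the current path
        return found;
    }
}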



What will this recursion tree look like?

It will look exactly like the tree data structure itself!  😎
We will traverse this tree in a pre-order sequence.
So it is a depth-first O(n) solution that uses the call stack to do the traversal ^^


Hope you enjoyed 💫
Ye Jiang