Monday, March 20, 2017

Substring Problems

                                 SubString Problems 


-- Overview 


    This substring topic is very related to two pointer technique, which is also a topic that may need to utilize sliding window problem. So Let's dive in those problem and, hopefully, get comfortable with many examples.


-- Examples 


1. Determine If One String Is Another's Substring 


Determine if a small string is a substring of another large string.


Return the index of the first occurrence of the small string in the large string.
Return -1 if the small string is not a substring of the large string.
Assumptions
  • Both large and small are not null
  • If small is empty string, return 0
Examples
  • “ab” is a substring of “bcabc”, return 2
  • “bcd” is not a substring of “bcabc”, return -1
  • "" is substring of "abc", return 0



Note: two loop with 3 base cases: 
        1. if we reach the end of needle, then return the index in big string. 
        2. if we reach the end of big string, but still not find any match, we return -1, meaning not exist. 
        3. Else we check the current position to big, and small string, see if we have a match, if not we break this inner loop. 


2.  Longest Substring Without Repeating Characters

Given a string, find the longest substring without any repeating characters and return the length of it. The input string is guaranteed to be not null.
For example, the longest substring without repeating letters for "bcdfbd" is "bcdf", we should return 4 in this case.

Note: It is a typical, classic, Sliding window algorithm applied to a subString problem. 


3.1 : Longest SubString with at most 2 distinct 


Given a string, find the length of the longest substring T that contains at most 2 distinct characters.
For example, Given s = “eceba”,
T is "ece" which its length is 3.


Note: If you really understand example 2, and this 3.1 example, you may find out that the only difference of this two problem, when we use sliding window algorithm to solve it, is how we move our start pointer to let the substring in our window size is a valid substring according to the question requirement. 

3.2 : Longest SubString with at most K distinct 


Given a string, find the length of the longest substring T that contains at most k distinct characters.
For example, Given s = “eceba” and k = 2,
T is "ece" which its length is 3.


Note: Compare 3.1 and 3.2, we only need to change the condition when we should shrink our window. 
Also, notice from line 17 - 19, we only decrement the count of distinct value, if we gonna decrement on a value 1 char on our map, but for every char start pointer is passing, we do need to decrement its map value. 

4. Longest palindrome subString 

Given a string s, find the longest palindromic substring in s. You may assume that the maximum length of s is 1000.
Example:
Input: "babad"

Output: "bab"

Note: "aba" is also a valid answer.
Example:
Input: "cbbd"

Output: "bb"






















Note: It is little abstract if you are not that familiar with DP, but here is a diagram to explain the man flow of the idea:

 


The read arrow in the cell means we will look the previous dp[i+1][j-1] to determine the value of this current cell. and if dp[i][j] = true,  means the substring start at index i, ends at index j , is a valid palindrome. 


5. Find all Anagrams in one String 


Find all occurrence of anagrams of a given string s in a given string l. Return the list of starting indices.
Assumptions
  • s is not null or empty.
  • l is not null.
Examples
  • l = "abcbac", s = "ab", return [0, 3] since the substring with length 2 starting from index 0/3 are all anagrams of "ab" ("ab", "ba").

Note:In this problem, we keep our window size = s.length(), and after ( i >= window size ) every time we move forward one character , we also take care of the one that was at end of our window size. Notice that the condition that we do ( match++, or math-- ), is the same as we did in Example 3.
      --  Only if we will decrement on a count number = 1 --> match++ 
      --  Only if we will increment on a count number = 0  -->  match-- 



Hope you get some experience to deal with other similar substring problems.

Happy Coding -_-
Ye Jiang





Sunday, March 19, 2017

The power of Slow and Fast Pointer Technique -_-

                             The Power of Two Pointer 




-- OverView 
    The are lots of algorithm problems that can be solved by the two (slow & fast) pointer technique. What's more, we could use this technique to in place deal with many String manipulation problem in O(n) time, which requires a programmer have solid coding skills and deep impression about the physical meaning of two pointer in each problem. 

-- Examples 

1. First, Let's look at a pretty basic problem: The partition method we used when we perform quick sort. There are lots of different way to write the code for partition, but with this two pointer technique, I was just impressed. How clear the code will be ! 

By choosing a random pivot value, we partition the given array, from left to right, that all the value that smaller than pivot value be in the left of array, and or the value that is bigger or equal to pivot value sit right hand side of pivot. Finally we return the pivot index value as out put. 

private int partition(int[] a, int left, int right){
    
    }



 
2.1: The First Occurrence 
Given a target integer T and an integer array A sorted in ascending order, find the index of the first occurrence of T in A or return -1 if there is no such index.
Assumptions
  • There can be duplicate elements in the array.
Examples
  • A = {1, 2, 3}, T = 2, return 1
  • A = {1, 2, 3}, T = 4, return -1
  • A = {1, 2, 2, 2, 3}, T = 2, return 1
Corner Cases
  • What if A is null or A of zero length? We should return -1 in this case.


2.2. The Last Occurrence 
Given a target integer T and an integer array A sorted in ascending order, find the index of the last occurrence of T in A or return -1 if there is no such index.
Assumptions
  • There can be duplicate elements in the array.
Examples
  • A = {1, 2, 3}, T = 2, return 1
  • A = {1, 2, 3}, T = 4, return -1
  • A = {1, 2, 2, 2, 3}, T = 2, return 3
Corner Cases
  • What if A is null or A is array of zero length? We should return -1 in this case.

Note: The only difference of 2.1 and 2.2, is the 2 place I made comment in the code above. 


2.3.The Closest Occurrence
Given a target integer T and an integer array A sorted in ascending order, find the index i in A such that A[i] is closest to T.
Assumptions
  • There can be duplicate elements in the array, and we can return any of the indices with same value.
Examples
  • A = {1, 2, 3}, T = 2, return 1
  • A = {1, 4, 6}, T = 3, return 1
  • A = {1, 4, 6}, T = 5, return 1 or 2
  • A = {1, 3, 3, 4}, T = 2, return 0 or 1 or 2
Corner Cases
  • What if A is null or A is of zero length? We should return -1 in this case.


3.  Remove Extra White Space 

Given a string, remove all leading/trailing/duplicated empty spaces.
Assumptions:
  • The given string is not null.
Examples:
  • “  a” --> “a”
  • “   I        love DP     ” --> “I love DP”

Note: People may find various of ways to solve this problem, but the following solution that used slow and fast pointer is the most elegant one that I found so far. 


In this example the i is the fast pointer and the slow value is the slow pointer, the physical meaning is: the result are at left side of slow pointer 


4.  Remove Adjacent Repeated Characters

4.1: 
Remove adjacent, repeated characters in a given string, leaving only one character for each group of such characters.
Assumptions
  • Try to do it in place.
Examples
  • “aaaabbbc” is transferred to “abc”
Corner Cases
  • If the given string is null, we do not need to do anything.

Note: We will still use two pinter here, "end" pointer here is the slow pointer and it also will act like the top of a "Virtual Stack", where when we move the i, the fast pointer here, we constantly look at the top of our stack, see if there is a duplication happen. 



4.2: 
Repeatedly remove all adjacent, repeated characters in a given string from left to right.
No adjacent characters should be identified in the final string.
Examples
  • "abbbaaccz" → "aaaccz" → "ccz" → "z"
  • "aabccdc" → "bccdc" → "bdc"


Note: I comment the only difference between 4.1 and 4.2, For 4.2, if there is a duplication we do not preserve any of them, so we pop off from our stack whenever we find there is adjacent duplication. 

5.   String Abbreviation Matching

Word “book” can be abbreviated to 4, b3, b2k, etc. Given a string and an abbreviation, return if the string matches the abbreviation.
Assumptions:
  • The original string only contains alphabetic characters.
  • Both input and pattern are not null.
Examples:
  • pattern “s11d” matches input “sophisticated” since “11” matches eleven chars “ophisticate”.

Note: If someone still remember we discussed Wild card matching before, then you will find that it is a easy version of wild card matching and also could use recursive method to solve it ( as we did in Regular Expression matching)  

 Basically, we will keep one pointer for input , and the other for pattern. and if they will reach end at the same time, we had a successful matching ! 


-- End: 

Actually, if you would like to add sliding window algorithm into two pointer technique, which is also make sense. I hope you enjoy this basic but very powerful tool that we could use. Thanks. 

Happy Coding -_- 
Ye Jiang  











Thursday, March 2, 2017

DevOps Practice

                        DevOps Practice


  • 1. Project overview
  • 2. What I learned
  • 3. Things to do

  • Project Overview

Launched a Drupal website that hosting on AWS with EC2 instance http://52.11.193.136/  .  Created a tool for testing some basic functionalities of the Drupal site, which includes 1.User login/out, 2. Adding new content to website 3. All links on the website are functional( 200 OK response or similar). 4. User posting comments. 5.The search bar is responsible. 6.Database connection. 7.Requesting new user account.  ( I tried Amazon CodeCommit for version control, but I prefer GitHub. So I maintain this code on my GitHub repo).

Connected Drupal site with an RDS MySql database instance and an S3 bucket to make sure we are able to upload and download relatively large amounts of data online.

Create a Dockerfile and then a docker image based on the first Drupal site on AWS, and upload it into EC2 Container Service(ECS) which allows us to load-balance and future deployment easily. Utilized EC2 container service to create a launch configuration and then launched 2 similar EC2 instances on AWS ( 2 other similar Drupal site http://52.11.112.135/ and http://35.167.78.83/ ).Besides, also connect them with a new RDS MySql instance and S3 Cloud Storage bucket to ensure a working persistent filesystem.

Create an auto scaling group based on the ECS container we created before. Set some scaling policies, which allows AWS CloudWatch Service monitor our EC2 instance and auto-scale our group when there is an alarm triggered.  

Last but not least, we also want our website is Highly Available to users all the time. From the architectural design view, we could build a fault tolerance system where allows a backup deployment exist, but that may cost us too much.Instead we could get help from lots of AWS related services. I created another EC2 instance in different AZ( Available Zone ) and formed an application load balancer for the Drupal site at http://sitebalancer-2044199213.us-west-2.elb.amazonaws.com/ This load balancer will enable balancing our traffic in between this two server and when one of them have a down time for some reason, another one could still serving.

Actually, the auto scaling group we created before will also ensure our website is highly available when we have an overwhelming request to our server. Besides, choosing a relatively big size of the database and keep backing up important user data is also crucial for us to achieve high availability.

  • 2. What I learned


For me, many things need to learn from scratch in order to complete this challenge. To name a few, how to launch a Drupal site on an EC2 instance? how to connect it with DB and S3? How do we perform testings on a website? What is Docker? What is a REST API ? … etc. However, I really enjoy the whole learning process, which gives me an opportunity to gain some handy skills very quickly and broaden my horizon of interests.

Specifically, For Drupal, I learned how to launch and manage a Drupal website through AWS Linux instance. How to connect Drupal site with its dependent services. Major pros and cons for using a Drupal framework.

For AWS services, I  learned how to launch and manage multiple EC2 instances. How to connect an EC2 instance with other related AWS service ( RDS, S3 ). How to configure and deploy our workloads using EC2 Container Service. How to create an auto-scaling group and enable AWS CloudWatch service. Then, set related scaling policies to achieve scaling up under load. Finally, How to create and manage an load balancer with EC2 instances that locate in different Available Zone.

For website testing, I learned how to use and customize Drupal SimpleTest Framework. ( Few PHP test files that I created with SimpleTest were attached below). How to use Selenium WebDriver to achieve web browser automation test. How to create automation test framework. How to use logging infrastructure when testing (log4j). How to connect Database with Selenium WebDriver. My tool for this challenge was created with Selenium WebDriver.  I will also attach a sample log.out file ( log created by my tool )  and a README  file that describe how to compile and run my tool along with it.  
Besides those three major aspects, Fortunately, I also learn how to work with Dockerfile and Docker image. How to make our website Highly Available. What is and why we want to our server work as a REST API.
  • 3.Things to do


If there is more time that I can spend with this challenge or the further steps that could  improve this project I will do the following.

Add more test cases in my tool that it could test a Drupal website in a more comprehensive way.Also, because RESTful Web Services is included in Drupal 8 core, I also want to use my tool to test it as a REST API.     

Keep enhancing the availability of our website host by trying to integrate our system with AWS Elastic Load Balancing service and set up a database replication in our MySQL database. Besides, I would like to try some third party tools that in AWS marketplace to add an extra layer of our fault tolerance system.  

Finally, get more understanding and appreciation of tools, such as Puppet and Docker. Know more about how we, as a software developer, could better use them to configure workloads and managing a production cluster.



Thanks.
Ye Jiang
2/11/2017