Tuesday, 3 November 2015

Check a given sentence for a given set of simple grammer rules

A simple sentence if syntactically correct if it fulfills given rules. The following are given rules.
1. Sentence must start with a Uppercase character (e.g. Noun/ I/ We/ He etc.)
2. Then lowercase character follows.
3. There must be spaces between words.
4. Then the sentence must end with a full stop(.) after a word.
5. Two continuous spaces are not allowed.
6. Two continuous upper case characters are not allowed.
7. However the sentence can end after an upper case character.
Examples:
Correct sentences -
   "My name is Ram."
   "The vertex is S."
   "I am single."
   "I love Geeksquiz and Geeksforgeeks."

Incorrect sentence - 
   "My name is KG."
   "I lovE cinema."
   "GeeksQuiz. is a quiz site."
   "  You are my friend."
   "I love cinema" 
Question: Given a sentence, validate the given sentence for above given rules.
We strongly recommend to minimize the browser and try this yourself first.
The idea is to use an automata for the given set of rules.
Algorithm :
1. Check for the corner cases
…..1.a) Check if the first character is uppercase or not in the sentence.
…..1.b) Check if the last character is a full stop or not.
2. For rest of the string, this problem could be solved by following a state diagram. Please refer to the below state diagram for that.
WP_20141128_002
3. We need to maintain previous and current state of different characters in the string. Based on that we can always validate the sentence of every character traversed.
A C based implementation is below. (By the way this sentence is also correct according to the rule and code)
// C program to validate a given sentence for a set of rules
#include<stdio.h>
#include<string.h>
#include<stdbool.h>
 
// Method to check a given sentence for given rules
bool checkSentence(char str[])
{
    // Calculate the length of the string.
    int len = strlen(str);
 
    // Check that the first character lies in [A-Z].
    // Otherwise return false.
    if (str[0] < 'A' || str[0] > 'Z')
        return false;
 
    //If the last character is not a full stop(.) no
    //need to check further.
    if (str[len - 1] != '.')
        return false;
 
    // Maintain 2 states. Previous and current state based
    // on which vertex state you are. Initialise both with
    // 0 = start state.
    int prev_state = 0, curr_state = 0;
 
    //Keep the index to the next character in the string.
    int index = 1;
 
    //Loop to go over the string.
    while (str[index])
    {
        // Set states according to the input characters in the
        // string and the rule defined in the description.
        // If current character is [A-Z]. Set current state as 0.
        if (str[index] >= 'A' && str[index] <= 'Z')
            curr_state = 0;
 
        // If current character is a space. Set current state as 1.
        else if (str[index] == ' ')
            curr_state = 1;
 
        // If current character is [a-z]. Set current state as 2.
        else if (str[index] >= 'a' && str[index] <= 'z')
            curr_state = 2;
 
        // If current state is a dot(.). Set current state as 3.
        else if (str[index] == '.')
            curr_state = 3;
 
        // Validates all current state with previous state for the
        // rules in the description of the problem.
        if (prev_state == curr_state && curr_state != 2)
            return false;
 
        if (prev_state == 2 && curr_state == 0)
            return false;
 
        // If we have reached last state and previous state is not 1,
        // then check next character. If next character is '\0', then
        // return true, else false
        if (curr_state == 3 && prev_state != 1)
            return (str[index + 1] == '\0');
 
        index++;
 
        // Set previous state as current state before going over
        // to the next character.
        prev_state = curr_state;
    }
    return false;
}
 
// Driver program
int main()
{
    char *str[] = { "I love cinema.", "The vertex is S.",
                    "I am single.", "My name is KG.",
                    "I lovE cinema.", "GeeksQuiz. is a quiz site.",
                    "I love Geeksquiz and Geeksforgeeks.",
                    "  You are my friend.", "I love cinema" };
    int str_size = sizeof(str) / sizeof(str[0]);
    int i = 0;
    for (i = 0; i < str_size; i++)
     checkSentence(str[i])? printf("\"%s\" is correct \n", str[i]):
                            printf("\"%s\" is incorrect \n", str[i]);
 
    return 0;
}
Output:
"I love cinema." is correct
"The vertex is S." is correct
"I am single." is correct
"My name is KG." is incorrect
"I lovE cinema." is incorrect
"GeeksQuiz. is a quiz site." is incorrect
"I love Geeksquiz and Geeksforgeeks." is correct
"  You are my friend." is incorrect
"I love cinema" is incorrect
Time complexity – O(n), worst case as we have to traverse the full sentence where n is the length of the sentence.
Auxiliary space – O(1)
This article is contributed by Kumar Gautam. Please write comments if you find anything incorrect, or you want to share more information about the topic discussed above.

No comments:

Post a Comment