It's an old question, but if you want to get the n-grams as a list of substrings (not as a list of lists or tuples) and don't want to import anything, the following code works fine and is easy to read:
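    def get_substrings(phrase, n):
        # Split on whitespace so each list item is one word
        phrase = phrase.split()
        substrings = []
        for i in range(len(phrase)):
            # Skip the shorter windows at the end of the phrase
            if len(phrase[i:i+n]) == n:
                # Join with a space so the words stay separated
                substrings.append(' '.join(phrase[i:i+n]))
        return substrings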
You can use it like this, for example, to get all n-grams of a list of terms up to a length of `a` words:
    a = 5
    terms = [
        "An n-gram is a contiguous sequence of n items",
        "An n-gram of size 1 is referred to as a unigram",
    ]
    for term in terms:
        for i in range(1, a+1):
            print(f"{i}-grams: {get_substrings(term, i)}")
Prints:
    1-grams: ['An', 'n-gram', 'is', 'a', 'contiguous', 'sequence', 'of', 'n', 'items']
    2-grams: ['An n-gram', 'n-gram is', 'is a', 'a contiguous', 'contiguous sequence', 'sequence of', 'of n', 'n items']
    3-grams: ['An n-gram is', 'n-gram is a', 'is a contiguous', 'a contiguous sequence', 'contiguous sequence of', 'sequence of n', 'of n items']
    4-grams: ['An n-gram is a', 'n-gram is a contiguous', 'is a contiguous sequence', 'a contiguous sequence of', 'contiguous sequence of n', 'sequence of n items']
    5-grams: ['An n-gram is a contiguous', 'n-gram is a contiguous sequence', 'is a contiguous sequence of', 'a contiguous sequence of n', 'contiguous sequence of n items']
    1-grams: ['An', 'n-gram', 'of', 'size', '1', 'is', 'referred', 'to', 'as', 'a', 'unigram']
    2-grams: ['An n-gram', 'n-gram of', 'of size', 'size 1', '1 is', 'is referred', 'referred to', 'to as', 'as a', 'a unigram']
    3-grams: ['An n-gram of', 'n-gram of size', 'of size 1', 'size 1 is', '1 is referred', 'is referred to', 'referred to as', 'to as a', 'as a unigram']
    4-grams: ['An n-gram of size', 'n-gram of size 1', 'of size 1 is', 'size 1 is referred', '1 is referred to', 'is referred to as', 'referred to as a', 'to as a unigram']
    5-grams: ['An n-gram of size 1', 'n-gram of size 1 is', 'of size 1 is referred', 'size 1 is referred to', '1 is referred to as', 'is referred to as a', 'referred to as a unigram']
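If you prefer something shorter, the same idea can probably be written as a list comprehension. This is only a sketch: the name `get_substrings_compact` is made up for illustration, and it assumes `n >= 1`. Instead of checking the window length on every step, it stops the loop early so that every window already has `n` words.

```python
def get_substrings_compact(phrase, n):
    # Hypothetical compact variant of get_substrings; assumes n >= 1.
    words = phrase.split()
    # Visit only start positions that leave room for a full window of n words.
    return [' '.join(words[i:i + n]) for i in range(len(words) - n + 1)]


print(get_substrings_compact("An n-gram is a contiguous sequence", 2))
# ['An n-gram', 'n-gram is', 'is a', 'a contiguous', 'contiguous sequence']
```

When `n` is larger than the number of words, the range is empty and the function returns an empty list, which matches what `get_substrings` does in that case.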