Python NLTK – How to tokenize words and sentences

Hello everyone. In this tutorial, we will show you a simple Python program that tokenizes text into words and sentences with NLTK. The program has been tested, and its output is shared in this post.

Tokenizing in Python NLTK

To split a text into words and sentences, call two functions from the NLTK library: word_tokenize() and sent_tokenize(). Each takes the text as input and returns a list of tokens. word_tokenize() returns the words and punctuation marks, and sent_tokenize() returns the sentences.

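import nltk

# One-time setup: both tokenizers need the Punkt data.
# Run nltk.download("punkt") once (newer NLTK releases ask for "punkt_tab").

text = "I am Dinesh Krishnan. I am a Technology Consultant."

# Split the text into word and punctuation tokens (a list of strings)
word_tokens = nltk.word_tokenize(text)

# Split the text into sentences (a list of strings)
sentence_tokens = nltk.sent_tokenize(text)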

print("Word Tokens\n.....................\n")
for wt in word_tokens:
    print(wt)

print("Sentence Tokens\n.....................\n")
for st in sentence_tokens:
    print(st)

Output

Word Tokens
.....................

I
am
Dinesh
Krishnan
.
I
am
a
Technology
Consultant
.

Sentence Tokens
.....................

I am Dinesh Krishnan.
I am a Technology Consultant.
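
To see roughly what tokenizing does, here is a minimal sketch that mimics the two calls using only Python's standard re module. The helper names simple_word_tokenize and simple_sent_tokenize are made up for this illustration and are not part of NLTK. The regular expressions are a crude approximation, not the algorithm NLTK uses, so they will break on abbreviations, contractions and similar cases that NLTK handles.

```python
import re


def simple_word_tokenize(text):
    # Hypothetical helper: a token is either a run of word characters
    # or a single punctuation character.
    return re.findall(r"\w+|[^\w\s]", text)


def simple_sent_tokenize(text):
    # Hypothetical helper: split on whitespace that follows '.', '!' or '?'.
    # This naive rule will wrongly split after abbreviations such as "Dr.".
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


text = "I am Dinesh Krishnan. I am a Technology Consultant."

word_tokens = simple_word_tokenize(text)
sentence_tokens = simple_sent_tokenize(text)

print(word_tokens)
print(sentence_tokens)
```

For the sample text, both helpers return plain Python lists holding the same tokens as the output above. For real text, prefer the NLTK functions.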