Language Translation with Python

Language translation, in the machine learning sense, is the task of converting a sentence from one natural language to another with a computerized system, without human assistance. In this article, I will introduce you to a machine learning project on language translation with the Python programming language.

Language Translation with Machine Learning

Deep learning is the approach behind most recent language translation systems. Compared with traditional machine translation, such as rule-based or statistical systems, neural machine translation generally produces more accurate translations. Deep neural networks (DNNs) can also be used to improve traditional systems and make them more efficient.

Building a good language translation system draws on several deep learning techniques and libraries. Recurrent neural networks (RNNs), long short-term memory networks (LSTMs), and similar sequence models are trained to convert a sentence from the source language into the target language.

Choosing appropriate network architectures and deep learning strategies matters, because these choices largely determine how accurate the translation system is compared with alternatives.

Machine Learning Project on Language Translation with Python

In this section, I will take you through a Machine Learning project on language translation with Python. Here, I will be creating a machine learning model to translate English to Hindi.

Let’s get started with this task by importing the necessary Python libraries and the dataset. Printing the shape of the loaded dataset shows 25,000 rows and 3 columns:

(25000, 3)

For simplicity, I will lowercase all the characters in the dataset:

lines['english_sentence'] = lines['english_sentence'].apply(lambda x: x.lower())
lines['hindi_sentence'] = lines['hindi_sentence'].apply(lambda x: x.lower())

Now I will remove all the single quotes (apostrophes) from the data:

import re

lines['english_sentence'] = lines['english_sentence'].apply(lambda x: re.sub("'", '', x))
lines['hindi_sentence'] = lines['hindi_sentence'].apply(lambda x: re.sub("'", '', x))

Now I will remove all the special characters from the data. Note that string.punctuation covers ASCII punctuation only, so Hindi-specific marks such as the danda (।) are left in place:

import string

exclude = set(string.punctuation)  # Set of ASCII punctuation characters
# Keep only the characters that are not in the exclude set
lines['english_sentence'] = lines['english_sentence'].apply(lambda x: ''.join(ch for ch in x if ch not in exclude))
lines['hindi_sentence'] = lines['hindi_sentence'].apply(lambda x: ''.join(ch for ch in x if ch not in exclude))

Now I will remove all the numbers and extra spaces from the data:
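The code for this step is not shown here, so the following is only a minimal sketch of one way to do it. The function name remove_numbers_and_extra_spaces is hypothetical, and handling Devanagari digits alongside ASCII digits is my assumption about what the Hindi column needs.

```python
import re
import string

# ASCII digits plus Devanagari digits, since Hindi text may contain either.
# (Including Devanagari digits is an assumption, not something the article states.)
DEVANAGARI_DIGITS = "०१२३४५६७८९"
_DIGIT_TABLE = str.maketrans("", "", string.digits + DEVANAGARI_DIGITS)


def remove_numbers_and_extra_spaces(text):
    """Drop digits, collapse runs of whitespace, and trim both ends."""
    text = text.translate(_DIGIT_TABLE)   # delete every digit character
    text = re.sub(r"\s+", " ", text)      # collapse repeated whitespace
    return text.strip()                   # remove leading/trailing spaces


print(remove_numbers_and_extra_spaces("  i have 2  apples and ३ आम "))
```

With a pandas DataFrame like the one used above, the same function could be passed to apply, for example lines['english_sentence'].apply(remove_numbers_and_extra_spaces).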

Now that we have cleaned the dataset, the next step is to prepare two vocabularies, one for English and one for Hindi:
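As a rough sketch of what preparing the vocabularies can look like, the block below collects the unique words of each language and assigns every word an integer id. The helper names and the toy sentences are hypothetical, and reserving id 0 for padding is a common convention that I am assuming rather than something stated in this article.

```python
def build_vocabulary(sentences):
    """Return the sorted list of unique whitespace-separated words."""
    words = set()
    for sentence in sentences:
        words.update(sentence.split())
    return sorted(words)


def build_token_index(vocabulary):
    """Map each word to an integer id, starting at 1 (0 is kept for padding)."""
    return {word: index for index, word in enumerate(vocabulary, start=1)}


# Toy sentences standing in for the cleaned dataset columns.
english_sentences = ["how are you", "you are here"]
hindi_sentences = ["आप कैसे हैं", "आप यहाँ हैं"]

english_vocab = build_vocabulary(english_sentences)
hindi_vocab = build_vocabulary(hindi_sentences)
english_index = build_token_index(english_vocab)
hindi_index = build_token_index(hindi_vocab)

print(english_vocab)
print(len(english_vocab), len(hindi_vocab))
```

The vocabulary sizes are what an embedding layer and the output layer of a translation model would typically be sized from.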

Now, before training the language translation model, we need to set the input and target values:
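The code for this step is not included here. The slicing in the final snippet of this article ([6:-4]) suggests that each Hindi target sentence is wrapped in a six-character start marker and a four-character end marker, so the sketch below assumes the markers START_ and _END. Those names, the helper functions, and the tiny token indices are all hypothetical. It shows the usual sequence-to-sequence arrangement in which the decoder target is the decoder input shifted by one position.

```python
START, END = "START_", "_END"  # assumed marker names, not confirmed by the article


def add_markers(target_sentence):
    """Wrap a target sentence in start and end markers."""
    return f"{START} {target_sentence} {END}"


def encode(sentence, token_index):
    """Convert a sentence into a list of integer word ids."""
    return [token_index[word] for word in sentence.split()]


def make_training_example(source, target, source_index, target_index):
    """Return (encoder_input, decoder_input, decoder_target) as id lists."""
    target_ids = encode(add_markers(target), target_index)
    encoder_input = encode(source, source_index)
    decoder_input = target_ids[:-1]   # START_ w1 ... wn
    decoder_target = target_ids[1:]   # w1 ... wn _END
    return encoder_input, decoder_input, decoder_target


# Tiny hand-written token indices, for illustration only.
source_index = {"how": 1, "are": 2, "you": 3}
target_index = {"START_": 1, "आप": 2, "कैसे": 3, "हैं": 4, "_END": 5}

example = make_training_example("how are you", "आप कैसे हैं", source_index, target_index)
print(example)
```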

Training Model to Translate English to Hindi

Now that we have prepared our dataset, let’s train a model for the task of language translation. I will first split the data, and then we will move forward to train our model:
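The split itself is not shown here. A typical choice would be train_test_split from scikit-learn; the sketch below does the same kind of shuffled split with only the standard library, so it is a stand-in rather than the code used for this project. The function name split_pairs, the 20% test fraction, and the seed are assumptions.

```python
import random


def split_pairs(sources, targets, test_fraction=0.2, seed=42):
    """Shuffle sentence pairs together and split them into train and test sets."""
    if len(sources) != len(targets):
        raise ValueError("sources and targets must have the same length")
    pairs = list(zip(sources, targets))
    random.Random(seed).shuffle(pairs)        # reproducible shuffle
    n_test = int(len(pairs) * test_fraction)  # number of held-out pairs
    test_pairs, train_pairs = pairs[:n_test], pairs[n_test:]
    X_train = [source for source, _ in train_pairs]
    y_train = [target for _, target in train_pairs]
    X_test = [source for source, _ in test_pairs]
    y_test = [target for _, target in test_pairs]
    return X_train, X_test, y_train, y_test


# Placeholder sentences; each source stays paired with its own target.
sources = [f"english sentence {i}" for i in range(10)]
targets = [f"hindi sentence {i}" for i in range(10)]
X_train, X_test, y_train, y_test = split_pairs(sources, targets)
print(len(X_train), len(X_test))
```

Note that this sketch returns plain lists, whereas the final snippet in this article indexes X_train and y_train as pandas objects.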

Now let’s train our language translation model:
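The model definition and training call are not reproduced here, so I will not guess at them. What can be sketched safely is how batches might be fed to such a model: the final snippet of this article draws samples from a generator called train_gen, and the block below shows one way a generator with that output shape could look. The name generate_batch, the per-batch zero padding, and the use of plain lists instead of NumPy arrays are all assumptions.

```python
def pad(ids, length):
    """Right-pad a list of ids with zeros up to the given length."""
    return ids + [0] * (length - len(ids))


def generate_batch(encoder_seqs, decoder_seqs, batch_size=2):
    """Yield ((encoder_input, decoder_input), decoder_target) batches forever."""
    while True:
        for start in range(0, len(encoder_seqs), batch_size):
            enc = encoder_seqs[start:start + batch_size]
            dec = decoder_seqs[start:start + batch_size]
            enc_len = max(len(seq) for seq in enc)
            dec_len = max(len(seq) for seq in dec) - 1
            encoder_input = [pad(seq, enc_len) for seq in enc]
            # Decoder input drops the last token, target drops the first one.
            decoder_input = [pad(seq[:-1], dec_len) for seq in dec]
            decoder_target = [pad(seq[1:], dec_len) for seq in dec]
            yield (encoder_input, decoder_input), decoder_target


# Toy id sequences; decoder sequences start with id 1 and end with id 5,
# standing in for start and end markers.
encoder_seqs = [[1, 2, 3], [3, 2]]
decoder_seqs = [[1, 2, 3, 4, 5], [1, 2, 4, 5]]

train_gen = generate_batch(encoder_seqs, decoder_seqs, batch_size=2)
(encoder_input, decoder_input), decoder_target = next(train_gen)
print(encoder_input)
print(decoder_input)
print(decoder_target)
```

In a real training run, each batch would usually be converted to arrays and passed to the training routine of the chosen deep learning library.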

So we have successfully built and trained our model for the task of language translation with Machine Learning. Now let’s see how the model performs by translating a sentence:

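# k indexes the sample in X_train and y_train; k = 0 assumes this is the first
# batch drawn and that train_gen yields one sentence pair per batch.
k = 0
(input_seq, actual_output), _ = next(train_gen)
decoded_sentence = decode_sequence(input_seq)
print('Input English sentence:', X_train[k:k+1].values[0])
# The slices presumably strip start and end markers from the Hindi sentence.
print('Actual Hindi Translation:', y_train[k:k+1].values[0][6:-4])
print('Predicted Hindi Translation:', decoded_sentence[:-4])

Input English sentence: in order to understand whether this is true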
Actual Hindi Translation:  यह समझने के लिए कि क्या यह सच है 
Predicted Hindi Translation:  यह समझने के लिए कि क्या यह सच है
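The decode_sequence function used above is not defined in this article. A common way to implement such a function is greedy decoding: start from the start marker, repeatedly ask the decoder for the most likely next word, and stop at the end marker. The sketch below illustrates that loop with a hypothetical predict_next callable standing in for a trained decoder, so it shows the idea rather than the actual function used here.

```python
def greedy_decode(predict_next, start_id, end_id, max_length=20):
    """Generate token ids one at a time until end_id or max_length is reached."""
    output = []
    previous = start_id
    state = None
    for _ in range(max_length):
        scores, state = predict_next(previous, state)
        # Greedy choice: take the id with the highest score.
        previous = max(range(len(scores)), key=scores.__getitem__)
        if previous == end_id:
            break
        output.append(previous)
    return output


# Hypothetical stand-in for a trained decoder: it always scores the id after
# the previous one highest, so decoding walks 2, 3, 4 and then reaches id 5.
def fake_predict_next(previous_id, state):
    scores = [0.0] * 6
    scores[min(previous_id + 1, 5)] = 1.0
    return scores, state


print(greedy_decode(fake_predict_next, start_id=1, end_id=5))
```

The max_length cap matters in practice, because an undertrained model may never produce the end marker.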

I hope you liked this article on building a language translation model with Python to translate English to Hindi. Feel free to ask your valuable questions in the comments section below.

Aman Kharwal