Attention-Based Neural Machine Translator for Hindi-to-English Translation

Abstract: Neural machine translation (NMT) uses neural-network-based models to learn a statistical model for machine translation. The attention mechanism used in the neural network makes the translation process more reliable through a bidirectional encoder-decoder approach, and BLEU scores are used to compare the output with translations from different systems.


I. INTRODUCTION
One of the earliest goals of computing was the automatic translation of text from one language to another. Machine translation (MT) is conceivably one of the most demanding artificial intelligence tasks given the fluidity of human language. It is the task of automatically converting source text in one language into text in a different language, and it serves as a channel for cross-language communication in Natural Language Processing (NLP). MT manages language-barrier problems by using computerized translation between two languages while keeping the meaning intact. The attention mechanism has recently proven useful in neural machine translation by selectively concentrating on parts of the source sentence during translation. NMT is appealing because it requires minimal domain knowledge to get started and has the capacity to generalize well to very long word sequences. For this purpose we use an encoder-decoder design built on a recurrent neural network (RNN) architecture.
In this paper, we build, with simplicity and efficiency in mind, an attention-based model in which all source words are attended to, following the Bahdanau approach.

II. NEURAL MACHINE TRANSLATOR
A neural machine translator is a neural network that directly models the conditional probability $p(b \mid a)$ of translating a source sentence, $a_1, \ldots, a_n$, into a target sentence, $b_1, \ldots, b_m$. It comprises two components: an encoder, which computes a representation $k$ of the source sentence, and a decoder, which generates one target word at a time and hence decomposes the conditional probability as $\log p(b \mid a) = \sum_{j=1}^{m} \log p(b_j \mid b_{<j}, k)$. In this paper, following (Sutskever et al., 2014; Luong et al., 2015), a stacked LSTM architecture is used to build the NMT systems, as illustrated in the figure.
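To make this decomposition concrete, the following is a minimal sketch (not our released code) of a stacked-LSTM encoder-decoder in TensorFlow/Keras. The 3 layers, 250 cells, 250-dimensional embeddings, and 1000-word vocabularies follow the training setup described later; all layer wiring and names are illustrative assumptions.

```python
import tensorflow as tf

VOCAB_SRC, VOCAB_TGT, EMB, UNITS, LAYERS = 1000, 1000, 250, 250, 3

# Encoder: embed source token ids and run them through 3 stacked LSTMs.
enc_in = tf.keras.Input(shape=(None,), dtype="int32")
x = tf.keras.layers.Embedding(VOCAB_SRC, EMB)(enc_in)
enc_states = []
for _ in range(LAYERS):
    x, h, c = tf.keras.layers.LSTM(UNITS, return_sequences=True,
                                   return_state=True)(x)
    enc_states += [h, c]  # final (h, c) per layer serves as the representation k

# Decoder: initialized from k, predicts one target word at a time.
dec_in = tf.keras.Input(shape=(None,), dtype="int32")
y = tf.keras.layers.Embedding(VOCAB_TGT, EMB)(dec_in)
for i in range(LAYERS):
    y = tf.keras.layers.LSTM(UNITS, return_sequences=True)(
        y, initial_state=enc_states[2 * i:2 * i + 2])
logits = tf.keras.layers.Dense(VOCAB_TGT)(y)  # models p(b_j | b_<j, k)

model = tf.keras.Model([enc_in, dec_in], logits)
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
```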

III. ATTENTION BASED MODEL
The attention model used here is of the global type, which differs from the local type in whether the "attention" is placed on all source positions or only on a subset of them. Note that at each time step t of the decoding phase, the approach first takes as input the hidden state ht at the top layer of a stacked LSTM in order to derive a context vector ct that captures relevant source-side information for predicting the current target word bt; put more simply, all hidden states of the encoder are considered when deriving the context vector ct. In addition, as in the originally proposed Bahdanau model, we use the concatenation of the forward and backward source hidden states in the bidirectional encoder.
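As a concrete illustration, the following is a minimal sketch of this global attention step, assuming Bahdanau-style additive scoring; the layer names (W1, W2, v) are illustrative assumptions, not identifiers from our code.

```python
import tensorflow as tf

class GlobalAttention(tf.keras.layers.Layer):
    """Additive (Bahdanau-style) attention over all encoder positions."""
    def __init__(self, units):
        super().__init__()
        self.W1 = tf.keras.layers.Dense(units)  # projects encoder states
        self.W2 = tf.keras.layers.Dense(units)  # projects decoder state h_t
        self.v = tf.keras.layers.Dense(1)       # reduces to a scalar score

    def call(self, h_t, enc_states):
        # enc_states: (batch, src_len, 2*units), the concatenated forward
        # and backward hidden states of the bidirectional encoder.
        # h_t: (batch, units), the top-layer decoder state at step t.
        score = self.v(tf.nn.tanh(
            self.W1(enc_states) + self.W2(tf.expand_dims(h_t, 1))))
        alpha = tf.nn.softmax(score, axis=1)             # weights over ALL source positions
        c_t = tf.reduce_sum(alpha * enc_states, axis=1)  # context vector c_t
        return c_t, alpha                                # alpha feeds the attention plot
```

The encoder states passed in would come from a bidirectional LSTM, e.g. tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(250, return_sequences=True)), whose forward and backward outputs Keras concatenates by default.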

IV. TRAINING FEATURES
Our model (NMT-1) was trained on data collected through web scraping, consisting of 3000 sentence pairs in Hindi and English. Vocabularies were limited to the 1000 most common words for the two languages; words not present in these vocabularies are mapped to a universal token <UKG>. When training our neural machine translator, following (Bahdanau, 2015), we filter out sentence pairs whose lengths exceed 40 words and shuffle mini-batches as we proceed. Our stacked LSTM model has 3 layers, each with 250 cells, and 250-dimensional embeddings. The number of epochs was set to 60. The decoder implementation collects per-step results into Python lists before using tf.concat to join them into tensors. Our code is implemented in a Jupyter notebook. We achieve a speed of 50 target words per second, and it takes 4 hours to fully train the model. For comparison, we also trained a neural machine translation model without attention (NMT-2).
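A hedged sketch of the decoder-loop pattern described above, in which per-step outputs are accumulated in a Python list and joined with tf.concat; here decoder_step stands in for one attention-plus-LSTM step and is a hypothetical helper, not a function from our code.

```python
import tensorflow as tf

def decode_sequence(decoder_step, state, start_ids, max_len=40):
    outputs = []                  # per-step logits gathered in a Python list
    inp = start_ids               # (batch,) ids of the start-of-sentence token
    for _ in range(max_len):
        logits, state = decoder_step(inp, state)    # one attention+LSTM step
        outputs.append(tf.expand_dims(logits, 1))   # add a time axis
        inp = tf.argmax(logits, axis=-1)            # greedy choice of next input
    return tf.concat(outputs, axis=1)               # (batch, max_len, vocab)
```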

VIII. HINDI TO ENGLISH TRANSLATION ANALYSIS
We compare our attention-based neural machine translation system on the Hindi-to-English task with the neural machine translation system without attention. It was interesting to examine the influence of attentional models on correctly translating names such as "Rama" and "Shiva". Non-attentional models, while producing sensible names from a language-model point of view, lack the direct connections from the source side needed to make accurate translations.

IX. RESULTS
The neural machine translation results obtained from NMT-1 and NMT-2, together with translations from the Google and Bing systems, were evaluated using the BLEU score and are shown in Table 1. The attention plot for the attention-based model, which is concentrated along the diagonal, is shown in Figure 3.
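As a minimal sketch of how such scores can be computed, the snippet below uses NLTK's corpus_bleu; the tokenized sentences are placeholders, not data from our experiments.

```python
from nltk.translate.bleu_score import corpus_bleu

# One list of reference translations per hypothesis; tokens are placeholders.
references = [[["the", "temple", "of", "rama", "is", "old"]]]
hypotheses = [["the", "rama", "temple", "is", "old"]]

# Bigram BLEU (weights over 1- and 2-grams) keeps this toy example non-zero.
print(f"BLEU: {corpus_bleu(references, hypotheses, weights=(0.5, 0.5)):.4f}")
```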
Figure 3: Attention plot

X. CONCLUSION AND FUTURE WORKS
Even though a neural machine translation system offers better accuracy than traditional methods such as rule-based machine translation and SMT, it still lags behind manual human translation. In this paper, an NMT system, namely a sequence-to-sequence RNN with an attention mechanism, was built for Hindi-to-English translation and compared with existing MT output in terms of BLEU score. It shows improved performance over the existing systems. However, close analysis of the computed translations reveals that our NMT system needs to be enhanced in handling blank lines in the output and divergent translations of the source sentence. Furthermore, the insight into the impact of the bi-gram model on Hindi language translation and the correlation among similar Indian languages in [7] opens a new direction of research toward direct translation between pairs of similar languages.