Question 1
Which of the following neural network architectures is best suited for capturing long-range dependencies in sequential data, a common challenge in language modeling?
Question 2
In the context of neural language models, what is the primary function of the 'embedding layer'?
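As background for this question: an embedding layer maps discrete token IDs to dense vectors via a learned lookup table. A minimal NumPy sketch (the vocabulary size, embedding dimension, and token IDs below are illustrative assumptions, not values from the quiz):

```python
import numpy as np

np.random.seed(0)
vocab_size, embed_dim = 10, 4            # toy sizes, chosen for illustration
E = np.random.randn(vocab_size, embed_dim)  # the embedding matrix (learned in training)

token_ids = np.array([2, 5, 5])          # a short input sequence of token IDs
vectors = E[token_ids]                   # lookup: each ID indexes a row of E
print(vectors.shape)                     # (3, 4): one dense vector per token
```

Note that repeated IDs (the two 5s above) retrieve the same row, which is how an embedding layer lets a model share what it learns about a word across all of its occurrences.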
Question 3
When comparing n-gram language models and neural language models, which of the following is a significant advantage of neural models in generalizing to unseen data?
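To make the generalization gap concrete: an unsmoothed n-gram model assigns zero probability to any word sequence it never saw in training, whereas a neural model's shared embeddings let it assign plausible probability to novel combinations. A tiny bigram sketch (the corpus and query words are invented for illustration):

```python
from collections import Counter

corpus = "the cat sat on the mat".split()
bigrams = Counter(zip(corpus, corpus[1:]))   # counts of adjacent word pairs
unigrams = Counter(corpus)                   # counts of single words

def bigram_prob(w1, w2):
    # unsmoothed maximum-likelihood estimate: count(w1 w2) / count(w1);
    # any bigram absent from the corpus gets probability exactly 0
    return bigrams[(w1, w2)] / unigrams[w1] if unigrams[w1] else 0.0

print(bigram_prob("the", "cat"))  # seen in training: 0.5
print(bigram_prob("the", "dog"))  # never seen: 0.0
```

A neural model, by contrast, would place "dog" near "cat" in embedding space and so give "the dog" non-negligible probability even without ever observing that exact bigram.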
Question 4
Consider a neural language model trained on a large corpus. If the model consistently predicts the next word with high confidence but the generated text lacks coherence over longer sentences, what might be a potential underlying issue?
Question 5
Which of the following best describes the concept of 'attention mechanism' in advanced neural language models?
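For reference on this question: the attention mechanism lets each position compute a weighted combination of all other positions, with weights derived from query-key similarity. A minimal scaled dot-product attention sketch in NumPy (the matrix shapes and random inputs are assumptions for demonstration):

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # scores: similarity of each query to each key, scaled by sqrt(d_k)
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores)     # each row is a distribution over positions
    return weights @ V, weights   # output: weighted sum of value vectors

np.random.seed(0)
Q = np.random.randn(3, 4)  # 3 query positions, dimension 4
K = np.random.randn(3, 4)
V = np.random.randn(3, 4)
out, weights = attention(Q, K, V)
print(weights.sum(axis=1))  # each row sums to 1
```

Because every position attends directly to every other position, the path length between any two tokens is constant, which is why attention handles long-range dependencies better than purely recurrent architectures.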
Language Modeling Quiz — Natural Language Processing | A-Warded