Multi-task learning is a blocker for general NLP systems.
Unified models can decide how to transfer knowledge (domain adaptation, weight sharing, transfer learning, and zero-shot learning).
Unified AND multi-task models can:
More easily adapt to new tasks.
Make deploying to production simpler (one model to serve and maintain instead of many).
Lower the bar for more people to solve new tasks.
Potentially move towards continual learning.
Sequence tagging: named entity recognition, aspect-specific sentiment.
Text classification: dialogue state tracking, sentiment classification.
Sequence-to-sequence: machine translation, summarization, question answering.
⇒ They correspond to the 3 equivalent super-tasks of NLP: Language Modeling, Question Answering, and Dialogue.
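The reduction behind this framing can be made concrete: each task family above fits a single (context, question, answer) interface. A minimal sketch, with hypothetical example data for one task from each family (the phrasings of the questions are illustrative, not the exact prompts any particular model uses):

```python
# Hypothetical examples showing how distinct NLP tasks can all be cast as
# question answering over a (context, question) pair.
examples = [
    {   # machine translation (sequence-to-sequence)
        "context": "Der Hund bellt.",
        "question": "What is the translation from German to English?",
        "answer": "The dog barks.",
    },
    {   # sentiment classification (text classification)
        "context": "The movie was a delight from start to finish.",
        "question": "Is this review positive or negative?",
        "answer": "positive",
    },
    {   # named entity recognition (sequence tagging)
        "context": "Ada Lovelace worked with Charles Babbage.",
        "question": "Which words are person names?",
        "answer": "Ada Lovelace; Charles Babbage",
    },
]

def as_qa_triple(ex):
    """Return the (context, question, answer) form a unified model consumes."""
    return ex["context"], ex["question"], ex["answer"]
```

Because every task shares this interface, one model with one input/output format can in principle handle all of them.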
Start with a context.
Ask a question.
Generate the answer one word at a time by:
Pointing to context.
Pointing to question.
Or choosing a word from an external vocabulary.
The pointer switch chooses among these three options for each output word.
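The decoding step above can be sketched as mixing three distributions with a learned gate. This is a simplified NumPy illustration with made-up scores; in a real model the context scores, question scores, vocabulary scores, and gate logits would all come from attention and decoder states:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def pointer_switch_step(ctx_scores, q_scores, vocab_scores, gate_logits,
                        ctx_tokens, q_tokens, vocab):
    """One decoding step: mix three distributions (copy from context,
    copy from question, generate from the external vocabulary) with a
    learned three-way gate, then emit the most probable word.
    All scores and logits here are placeholders for real model outputs."""
    # gate over the three sources: [p_context, p_question, p_vocab]
    gate = softmax(gate_logits)
    # per-source distributions over candidate words
    p_ctx = softmax(ctx_scores)      # over context positions
    p_q = softmax(q_scores)          # over question positions
    p_vocab = softmax(vocab_scores)  # over external vocabulary entries
    # accumulate probability mass per surface word across sources
    p_word = {}
    for tokens, probs, g in ((ctx_tokens, p_ctx, gate[0]),
                             (q_tokens, p_q, gate[1]),
                             (vocab, p_vocab, gate[2])):
        for tok, p in zip(tokens, probs):
            p_word[tok] = p_word.get(tok, 0.0) + g * p
    return max(p_word, key=p_word.get)
```

Summing the gated probabilities per surface word means a word that appears both in the context and in the vocabulary gets mass from both sources.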
Train a single question answering model for multiple NLP tasks (aka questions).
Framework for tackling:
More general language understanding.
Weight-sharing, pre-training, fine-tuning.
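One way to realize weight sharing across tasks is a training loop that samples a task per step and applies one update to a single shared-parameter model. A minimal sketch, assuming a hypothetical `model` object with a `step(batch)` method and per-task iterators of (context, question, answer) batches (all names here are illustrative):

```python
import random

def train(model, task_batches, steps, seed=0):
    """Round-robin-by-sampling multi-task training: every task's batches
    flow through the same shared weights, so knowledge can transfer."""
    rng = random.Random(seed)
    tasks = list(task_batches)
    losses = []
    for _ in range(steps):
        task = rng.choice(tasks)          # sample a task for this step
        batch = next(task_batches[task])  # its next QA-formatted batch
        losses.append(model.step(batch))  # one update to the shared model
    return losses
```

The same loop supports pre-training (train on some tasks first) and fine-tuning (continue with a new task's batches) without changing the model interface.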