Bidirectional Encoder Representations from Transformers (BERT) is a transformer-based machine learning technique for natural language processing (NLP) pre-training developed by Google.
Jacob Devlin and his Google colleagues developed and released BERT in 2018. In 2019, Google announced that it had started using BERT in its search engine, and by the end of 2020 it was being used in nearly all English-language searches.
BERT is a transformer language model that was pre-trained in a self-supervised fashion on a large corpus of English text. This means it was pre-trained on raw text only, without any human labeling: an automatic procedure generates the inputs and labels directly from the texts themselves, which is why it can make use of the large amounts of data that are readily available to the public.
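As a rough illustration of what a pre-trained BERT checkpoint provides, the sketch below loads a model and extracts contextual token embeddings. It assumes the Hugging Face transformers library and the public bert-base-uncased checkpoint, neither of which the article itself prescribes.

```python
# Minimal sketch: load a pre-trained BERT checkpoint and get contextual
# embeddings for each token in a sentence. Assumes the Hugging Face
# `transformers` library and the public `bert-base-uncased` checkpoint.
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

# Tokenize a sentence and run it through the pre-trained encoder.
inputs = tokenizer("BERT produces contextual word representations.",
                   return_tensors="pt")
outputs = model(**inputs)

# One 768-dimensional vector per input token (for the base model):
# the shape is (batch_size, sequence_length, 768).
print(outputs.last_hidden_state.shape)
```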
BERT was pre-trained on two tasks:
1. Masked language modeling (MLM): given a sentence, the model randomly masks 15% of the words in the input, runs the entire masked sentence through the model, and has to predict the masked words. This lets the model learn a bidirectional representation of the sentence; BERT was the first deeply bidirectional, unsupervised language representation, pre-trained using only a plain-text corpus (see the sketch after this list).
2. Next sentence prediction (NSP): during pre-training, the model takes two masked sentences as input. Sometimes they are sentences that sat next to one another in the original text, and sometimes they are not. The model must then predict whether or not the two sentences followed one another (also sketched below).
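To illustrate the MLM objective, the sketch below asks a pre-trained BERT checkpoint to fill in a masked token. It assumes the Hugging Face transformers fill-mask pipeline and the bert-base-uncased checkpoint, which the article itself does not specify.

```python
# Minimal MLM sketch: BERT predicts the token hidden behind [MASK]
# using both the left and the right context of the sentence.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")

for prediction in unmasker("The capital of France is [MASK]."):
    # Each prediction carries a candidate token and its probability.
    print(prediction["token_str"], round(prediction["score"], 3))
```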
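The NSP objective can be exercised in a similar way. The sketch below scores a sentence pair with BertForNextSentencePrediction from the same assumed library; the example sentences are made up for illustration.

```python
# Minimal NSP sketch: score whether sentence B plausibly follows sentence A.
import torch
from transformers import BertTokenizer, BertForNextSentencePrediction

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForNextSentencePrediction.from_pretrained("bert-base-uncased")

sentence_a = "The cat sat on the mat."
sentence_b = "It fell asleep in the sun."  # a plausible continuation

inputs = tokenizer(sentence_a, sentence_b, return_tensors="pt")
logits = model(**inputs).logits

# Index 0 = "B is the next sentence", index 1 = "B is a random sentence".
print(torch.softmax(logits, dim=-1))
```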
Applications of BERT
Sequence-to-sequence language generation tasks such as:
Question answering
Summarization
Response generation
NLP Tasks
Sentiment classification (a short sketch follows this list)
Word sense disambiguation
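In practice, sentiment classification with BERT means fine-tuning a pre-trained checkpoint on labeled sentiment data and then classifying new text. The sketch below assumes the Hugging Face text-classification pipeline and a publicly shared fine-tuned checkpoint (nlptown/bert-base-multilingual-uncased-sentiment), chosen here only for illustration.

```python
# Minimal sketch: classify sentiment with a BERT model that has already
# been fine-tuned on sentiment data (the checkpoint name is illustrative).
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="nlptown/bert-base-multilingual-uncased-sentiment",
)

# Returns a label (here, a 1-5 star rating) with a confidence score.
print(classifier("The course material was clear and well organized."))
```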
This article was written by Chepkorir Diana from the AI Class of 2022. You can learn more about AI models by registering for the AI Picodegree course at HURU School