BERT: Google’s Newest Advancement in Natural Language Processing

If you are like me, ever since you started using the internet you have learned how to properly formulate a search query. Early on it was vital to focus solely on keywords and avoid extra words that might throw off your search results.

The difference between typing “best Chicago daycares” and “Which daycare in Chicago is the best for my child to attend?” could produce drastically different results. Today a website with good SEO tactics can appear in both, but there are still big differences in results depending on how you formulate your search query.

With Google’s newest search algorithm announcement there might be a new way to merge these search tactics and improve overall search results.

What is BERT?

You might have heard that Google recently announced an update to their search engine capabilities — in fact, their biggest search engine update in five years. So, what is this update and what makes it so important? Back in November 2018, Google announced their new open-source AI language model: BERT. Bidirectional Encoder Representations from Transformers (BERT) is a new Natural Language Processing (NLP) technique with which Google will interpret search queries and generate data in order to improve search results.

With BERT, Google plans on evaluating conversational search queries with more accuracy than before. This means that searching “Which daycare in Chicago is the best for my child to attend?” might actually start giving you the precise results you expected. Google has been refining BERT over the last year, and as of October 2019 has started rolling it out in the US.

How does BERT Work?

To understand the mechanics behind BERT, it’s important to know what bidirectional means. Of all the elements that make up the BERT acronym, “bidirectional” is the one that best defines the process.

Bidirectional, meaning to go in two directions, refers to how the algorithm reads a search query. Google explains in a diagram how earlier methods read a search query and how they differ from the BERT method. With a bidirectional approach, each word or element of the query is read and compared with all of the other elements, regardless of whether they come before or after that specific element, or how far away from it they are in the sequence. If we were to use the initial example of “Which daycare in Chicago is the best for my child to attend?”, it would mean that the word “best” would be compared to all of the other words in this query, whether they come before or after.
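The all-pairs comparison described above can be sketched as a toy self-attention step, where every word’s vector is updated using every other word in the query regardless of position. This is an illustrative sketch with random stand-in word vectors, not Google’s production BERT.

```python
import numpy as np

# Toy bidirectional self-attention: every word attends to every other
# word in the query, before or after it, with no directional restriction.
# (Illustrative sketch only; real BERT uses learned, multi-head attention.)

def self_attention(embeddings):
    """Scaled dot-product attention over the full sequence."""
    d = embeddings.shape[1]
    scores = embeddings @ embeddings.T / np.sqrt(d)  # all-pairs comparison
    weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
    return weights @ embeddings  # each word's new vector mixes in all others

rng = np.random.default_rng(0)
query = "Which daycare in Chicago is the best for my child to attend".split()
embeddings = rng.normal(size=(len(query), 8))  # stand-in word vectors

contextual = self_attention(embeddings)
# "best" (index 6) now carries information from every other word,
# including words that appear after it, such as "child" and "attend".
print(contextual.shape)  # (12, 8)
```

The key point is in the `scores` line: nothing restricts a word to looking only leftward or only rightward, which is what separates a bidirectional model from the one-directional readers that preceded it.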

[Diagram: how BERT’s bidirectional method reads a search query compared with previous methods]

The bidirectional method also includes small function words such as the prepositions “of” and “to”, something that has been largely overlooked in previous query analysis algorithms. Major conjunctions such as “and” and “or” have long been accounted for in search analytics, seeing as swapping one for the other can give you drastically different results. For example, searching “Chicago AND Philadelphia” vs “Chicago OR Philadelphia” will give you either the next Bears vs Eagles game or a list of pros and cons of moving to either city.

Google has realized that words like “of” or “to” can make similar differentiations and need to be accounted for as well. In Google’s announcement they use the example “2019 Brazil traveler to USA need a visa” and emphasize how important “to” is to the search query. Without acknowledging “to”, results would be less particular to the actual question being asked. Including “to” in the search process gives the context that this search is being made by a Brazilian traveling to America and not an American traveling to Brazil. Accounting for this difference helps return more accurate results than simply omitting “to” from the search query.

If we were to take the initial example of “Which daycare in Chicago is the best for my child to attend?”, we can see the importance of including “for” and “to” in order to get more accurate results. By including “for” and “to” in this search under the BERT system, the results would be more focused on what kinds of children these daycares specialize in serving and when they might have availability. If you owned a daycare with a survey page asking about potential attendees’ needs, you would be more likely to show up in the search results. Likewise, if you had an upcoming open house or a calendar of when new students were being accepted, you would rank higher than in a generic search about Chicago daycares.

Accounting for these small words is just one example of what the BERT system takes into account when determining what a user is searching for. The BERT system’s precision means that every element of the search query will hold value in determining the results, because the system better understands the context of the search without asking for further information. Overall, the BERT update hopes to specialize search results in a way that is more accurate than Google’s previous efforts.

BERT’s Performance: A Year in Review

Google’s BERT (Bidirectional Encoder Representations from Transformers) system has shown impressive progress in its first year of testing. To evaluate BERT’s capabilities, Google has utilized several benchmark tests, comparing its performance to human baselines and other AI systems.

Key Performance Metrics:

  1. SQuAD (Stanford Question Answering Dataset):
    • Initial results (Nov 2018, SQuAD 1.1): BERT: 87.4% EM, 93.2% F1; Human: 82.3% EM, 91.2% F1
    • Recent results (Sept 2019, SQuAD 2.0): BERT (ALBERT): 89.7% EM, 92.2% F1; Human: 86.3% EM, 89.5% F1
  2. GLUE (General Language Understanding Evaluation):
    • Feb 2019: BERT scored 80.5 (Human baseline: 87.1)
    • Sept 2019: ALBERT (compact BERT) scored 89.4
    • Current leader: T5 Team Google at 89.7 (Nov 2019)
  3. MultiNLI (Multi-Genre Natural Language Inference):
    • Baseline: 72.4% matched, 71.9% mismatched
    • BERT: 86.7% matched, 85.9% mismatched

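The EM and F1 figures in the SQuAD results above can be understood from how those metrics are typically computed: Exact Match checks whether the predicted answer string matches the gold answer, and F1 measures token-level overlap. The sketch below is a simplified version; SQuAD’s official evaluation script also strips punctuation and articles before comparing, which is omitted here for brevity.

```python
from collections import Counter

# Simplified SQuAD-style Exact Match (EM) and token-level F1.
# (The official script also removes punctuation and articles.)

def exact_match(prediction, truth):
    return prediction.lower() == truth.lower()

def f1_score(prediction, truth):
    pred_tokens = prediction.lower().split()
    true_tokens = truth.lower().split()
    common = Counter(pred_tokens) & Counter(true_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("the Transformer", "The Transformer"))  # True
print(round(f1_score("the Transformer model", "The Transformer"), 2))  # 0.8
```

F1 gives partial credit: a prediction that contains the right answer plus an extra word scores below 1.0 but well above 0, which is why the F1 numbers in the table run higher than the stricter EM numbers.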
BERT’s performance has consistently improved across these metrics, often surpassing human baselines. Google has also developed ALBERT, a more efficient version of BERT, to address memory constraints during testing.

The continuous enhancement of both BERT and the testing models suggests ongoing improvement in accuracy and comprehension. While it’s uncertain if BERT has reached its peak performance, its consistent lead in various tests indicates its significant potential in natural language processing.

As BERT continues to evolve, possibly in combination with other Google technologies, it’s expected to further refine its capabilities in understanding and responding to complex language queries.

What is this update?

While Google has been working on BERT for a year already, this is only the beginning. First off, BERT is only being implemented in the US and will only influence about 1 in 10 English-language search queries. In the grand scheme of Google’s reach this is a small starting point, but they plan to extend BERT to other regions and languages as the system progresses. This announcement marks the start of a new era of search query methods for Google, but don’t think you have to overhaul your entire SEO strategy because of it.

How does the BERT update affect me?

Like any algorithm update, big changes can arrive with seemingly nothing to notice. It’s important to keep in mind that BERT is being rolled out to a small sample at the moment and that it is a machine learning process that won’t make major changes overnight. BERT is open source and, like all major algorithms, an ongoing, self-learning process, which means that improvements will be made incrementally.

If you are someone who has consistently focused on SEO and created quality content over the last few years, there really isn’t much to worry about when it comes to adapting to the BERT update. As long as you have been making thoughtful content that serves your or your clients’ customers’ needs, you shouldn’t have much to worry about.

That being said, if you have a website that hasn’t been updated in a few years, you definitely need to start updating now more than ever. The goal of the BERT update is to provide users with better search results, and in order to do so it needs more specific content. If you haven’t been making dedicated pages addressing the top searches associated with your company, you’ve already fallen behind competitors, and the BERT update will only continue to widen the gap. The best thing to do is to make sure consumer inquiries are covered on your site or related sources with detail and close consideration.

BERT’s Impact on Search

While much has been discussed about the BERT update, it’s important to recognize that we’re still in the early stages of understanding its full impact on search results. Google has high expectations for BERT’s potential to revolutionize how search queries are interpreted and answered. If this bidirectional approach proves to be significantly more effective than Google’s current methods, it could not only represent a major advancement for Google but also create pressure for competitors to innovate to maintain relevance.

However, regardless of whether BERT ultimately proves to be revolutionary or merely incremental in the long term, its gradual integration into Google’s search algorithm means that its impact will be felt progressively rather than as a sudden, disruptive change. While many in the industry are eager to stay ahead of the curve and prepare for future developments, it’s crucial to remember that these changes represent just one aspect of the broader SEO landscape.

Whether BERT becomes a lasting influence or a passing trend, the fundamental focus should remain on meeting user needs, creating valuable content, and developing websites that enhance the overall user experience. By paying close attention to these core principles, you can create compelling reasons for visitors to return to your site and engage with your content for longer periods. Ultimately, a user-centric approach will continue to be the cornerstone of effective SEO strategy, regardless of algorithmic changes.
