In other words, moving to the new YATI algorithm was a complex engineering task. Many accelerators were combined into clusters and networked together, and a powerful cooling system was developed for the resulting servers. Even with that capacity, training the model takes about a month.
The classic technique for training transformers involves showing them unstructured texts: a text is taken, a certain percentage of its words are masked, and the transformer is tasked with guessing those words. For YATI, the task was more complicated. It was shown not just the text of a single document, but actual queries and the texts of documents that users had seen, and it had to guess which of those documents users liked and which they did not. For this, markup from expert assessors was used; they rated the relevance of each document to the query on a complex scale.
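To make the masking step concrete, here is a minimal sketch in Python. It is not Yandex's code: the `[MASK]` placeholder, the function name `mask_words` and the default ratio of 15% are assumptions for illustration (the post only says "a certain percentage").

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token, chosen for this sketch


def mask_words(text, mask_ratio=0.15, seed=0):
    """Hide a share of the words; return the masked word list and the hidden targets."""
    words = text.split()
    if not words:
        return [], {}
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Always mask at least one word, otherwise there is nothing to guess.
    n_mask = max(1, round(len(words) * mask_ratio))
    positions = rng.sample(range(len(words)), n_mask)
    # position -> original word: this is what the model would have to predict
    targets = {i: words[i] for i in positions}
    masked = [MASK if i in targets else w for i, w in enumerate(words)]
    return masked, targets


masked, targets = mask_words(
    "cheap plane tickets from Yekaterinburg to Moscow", mask_ratio=0.3
)
print(" ".join(masked))
print(targets)
```

During training the model would see only the masked word list and be scored on how well it recovers the words stored in `targets`.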
After that, Yandex took the resulting dataset and trained the transformer to predict the expert assessment, which is how it learned to rank. As a result, the search algorithm was significantly improved and Yandex reached a record level of search quality.
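The idea of "learning to rank by guessing the expert assessment" can be sketched with a toy model. This is only an illustration under loud assumptions: a small linear model stands in for the transformer, the three features (bias, query overlap, title match) are invented, and the grades are put on a made-up 0 to 1 scale rather than the real assessor scale.

```python
def predict(weights, features):
    """Score a document: a plain dot product stands in for the transformer."""
    return sum(w * x for w, x in zip(weights, features))


def train_pointwise(examples, lr=0.1, epochs=2000):
    """Fit weights so that predict() approximates the assessor grade (squared error)."""
    weights = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for features, grade in examples:
            err = predict(weights, features) - grade
            # Gradient step for the squared error of this single example.
            weights = [w - lr * err * x for w, x in zip(weights, features)]
    return weights


def rank(weights, docs):
    """Order document names by predicted grade, best first."""
    return sorted(docs, key=lambda name: predict(weights, docs[name]), reverse=True)


# Hypothetical training data: ([bias, query overlap, title match], assessor grade)
examples = [
    ([1.0, 0.9, 1.0], 1.0),
    ([1.0, 0.7, 1.0], 0.75),
    ([1.0, 0.5, 0.0], 0.5),
    ([1.0, 0.1, 0.0], 0.0),
]
weights = train_pointwise(examples)
docs = {"doc_a": [1.0, 0.8, 1.0], "doc_b": [1.0, 0.2, 0.0]}
print(rank(weights, docs))
```

The point of the sketch is the training target: the model is fitted to the grade an assessor gave, and ranking is then just sorting by the predicted grade.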
Advantages of YATI and transformers
Unlike Yandex's previous neural network algorithms, Palekh and Korolev, YATI predicts an expert assessment rather than a user's click, which is a fundamental difference.
In addition, transformers have the following advantages:
the search works not only with queries and titles, but can also evaluate long texts;
there is an “attention mechanism” that highlights the most significant fragments in the text;
word order and context, that is, the influence of words on each other, are taken into account.
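For readers curious how an attention mechanism "highlights" fragments, here is a minimal sketch of scaled dot-product attention for a single query, the standard building block of transformers. The two-dimensional vectors are invented for illustration, and nothing here reflects YATI's actual architecture or sizes.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # The output is the weighted average of the value vectors.
    output = [
        sum(w * value[j] for w, value in zip(weights, values))
        for j in range(len(values[0]))
    ]
    return weights, output


# Hypothetical vectors for three text fragments.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
values = keys
weights, output = attention(query, keys, values)
print(weights)
print(output)
```

The fragment whose key is most similar to the query gets the largest weight, so it contributes most to the output. Word order is not handled by this step on its own; transformers usually add positional information to the input vectors for that.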
Now, for example, when you search for plane tickets from Yekaterinburg to Moscow, the search engine understands that you need tickets from Yekaterinburg to Moscow, and not the other way around. In addition, Yandex has become better at recognizing typos.
YATI guessed which of the documents users liked