A new artificial intelligence is trained with data from the dark web

The dark part of the internet is full of hackers and cybercriminals. This artificial intelligence is trained on their data.


Researchers have created a new artificial intelligence trained almost entirely on data from the dark web, the unindexed part of cyberspace.

The AI has been dubbed DarkBERT and will be used for one purpose: to improve cybersecurity in this huge and mysterious part of the internet.

How it works

While the large language models (LLMs) that power ChatGPT and Google Bard were trained on data from the conventional web, DarkBERT was trained exclusively on data from the dark web, the home of hackers and cybercriminals.

The AI was trained for 16 days on two data sets, one raw and one processed.

The team says its new LLM was much better at making sense of the dark web than other models trained to complete similar tasks, including RoBERTa, which Facebook researchers designed in 2019 to “predict intentionally hidden sections of text within examples of language without annotations,” according to an official description.
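The idea of predicting intentionally hidden text can be illustrated with a toy sketch. The Python snippet below is not the researchers' code: the function name `mask_tokens`, the `[MASK]` placeholder and the masking probability are assumptions chosen for illustration. It only shows how words in a sentence might be hidden so that a model can later be asked to guess them; real systems use more elaborate masking schemes and operate on sub-word tokens rather than whole words.

```python
import random

MASK = "[MASK]"  # placeholder symbol; an illustrative assumption


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Hide a random subset of tokens.

    Returns (masked_tokens, labels), where labels[i] holds the original
    token wherever a mask was applied and None everywhere else.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same masks
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)   # the model would see this placeholder
            labels.append(tok)    # and be asked to recover this word
        else:
            masked.append(tok)
            labels.append(None)   # nothing to predict at this position
    return masked, labels


sentence = "the model learns to predict words that were hidden on purpose".split()
masked, labels = mask_tokens(sentence, mask_prob=0.3, seed=1)
print(" ".join(masked))
print([tok for tok in labels if tok is not None])
```

During training, a model sees only the masked sentence and is scored on how well it recovers the hidden words, which is why no human annotation of the text is required.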

“The results of our evaluation show that the DarkBERT-based classification model outperforms that of known pretrained language models,” the researchers wrote in their paper.