In a recent study, researchers from the Université de Montréal and Flare Systems report that large language models (LLMs) can extract critical cyber threat intelligence (CTI) from cybercrime forums with an average accuracy of 98%. The findings, published in a white paper, highlight the potential of AI in bolstering cybersecurity efforts.
The research team, led by Vanessa Clairoux-Trépanier and Isa-May Beauchamp from the School of Criminology at the Université de Montréal, in collaboration with Flare Systems, developed an LLM system powered by OpenAI’s GPT-3.5-turbo model.
This system analyzed conversations from three prominent cybercrime forums: XSS, Exploit.in, and RAMP.
“Our goal was to assess the accuracy and efficiency of LLMs in extracting key CTI information from these forums, which are known to contain extensive discussions about emerging cyber threats,” explained Clairoux-Trépanier.
Collecting Threat Intelligence Using Large Language Models
The LLM system was instructed to summarize the conversations and code ten critical CTI variables, such as identifying targeted organizations, critical infrastructure, and exploitable vulnerabilities.
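To make the task concrete, here is a minimal sketch of how such an instruction could be structured: one prompt that asks for a summary plus a true/false code per variable, and a parser that validates the reply. This is an illustration only, not the researchers' actual prompt. The variable names, the JSON reply format, and the functions `build_prompt` and `parse_response` are all assumptions, and only three of the ten variables are named in this article.

```python
import json

# Hypothetical variable names; the article names only these three of the ten.
CTI_VARIABLES = [
    "targets_organization",
    "targets_critical_infrastructure",
    "mentions_exploitable_vulnerability",
]


def build_prompt(conversation: str) -> str:
    """Ask for a summary plus one true/false code per variable, as JSON."""
    keys = ", ".join(f'"{name}"' for name in CTI_VARIABLES)
    return (
        "Summarize the forum conversation below, then code each variable "
        "as true or false.\n"
        f'Reply with JSON containing the keys "summary", {keys}.\n\n'
        f"Conversation:\n{conversation}"
    )


def parse_response(raw: str) -> dict:
    """Validate the model's JSON reply and keep only the expected fields."""
    data = json.loads(raw)
    coded = {"summary": str(data["summary"])}
    for name in CTI_VARIABLES:
        value = data[name]
        if not isinstance(value, bool):
            raise ValueError(f"{name} must be true or false, got {value!r}")
        coded[name] = value
    return coded
```

The prompt text would be sent to whichever model is in use. Keeping the reply in a fixed, validated shape is what makes it straightforward for human reviewers to check each coded variable afterwards.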
Two human coders then meticulously reviewed each conversation to evaluate the accuracy of the LLM’s output.
The LLM system achieved an average accuracy of 98%, with per-variable scores ranging from 95% to 100% across the ten variables. This performance exceeded the researchers’ expectations and points to a practical role for LLMs in cyber threat intelligence.
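As a rough illustration of how a per-variable accuracy and its average can be computed, the sketch below compares LLM codes with a human reference for each conversation. It assumes one agreed reference code per conversation and variable. The article does not say how the two coders' reviews were combined, and the function names and sample data here are hypothetical.

```python
def accuracy_by_variable(llm_codes, human_codes):
    """Share of conversations where the LLM's code matches the human reference."""
    variables = human_codes[0].keys()
    total = len(human_codes)
    return {
        var: sum(llm[var] == ref[var] for llm, ref in zip(llm_codes, human_codes)) / total
        for var in variables
    }


def average_accuracy(per_variable):
    """Unweighted mean of the per-variable accuracy scores."""
    return sum(per_variable.values()) / len(per_variable)


# Made-up example: four conversations, two variables.
human = [
    {"org": True, "vuln": False},
    {"org": False, "vuln": True},
    {"org": True, "vuln": True},
    {"org": False, "vuln": False},
]
llm = [
    {"org": True, "vuln": False},
    {"org": False, "vuln": True},
    {"org": True, "vuln": False},  # disagrees with the reference on "vuln"
    {"org": False, "vuln": False},
]

scores = accuracy_by_variable(llm, human)
print(scores)                    # {'org': 1.0, 'vuln': 0.75}
print(average_accuracy(scores))  # 0.875
```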
“Our findings demonstrate that LLMs can effectively replace first-level threat analysts in extracting relevant information from cybercrime forums,” stated Beauchamp. “This technology can significantly enhance the efficiency and scalability of CTI efforts, allowing organizations to stay one step ahead of cyber threats.”
The study also identified areas for further improvement, such as refining the LLM’s ability to distinguish between historical narratives and current events, as well as optimizing prompts and data chunking techniques. Despite these minor limitations, the researchers emphasize that the LLM system’s performance is comparable to that of human analysts.
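The article does not describe the chunking technique the researchers used. As one simple possibility, the sketch below groups whole forum posts into chunks that stay under a character budget, so that no post is cut in half. The function name `chunk_posts` and the `max_chars` default are assumptions for illustration.

```python
def chunk_posts(posts, max_chars=2000):
    """Group posts into chunks of at most max_chars characters, never splitting a post.

    A single post longer than max_chars is placed in a chunk of its own.
    """
    chunks, current, size = [], [], 0
    for post in posts:
        # Start a new chunk when adding this post would exceed the budget.
        if current and size + len(post) > max_chars:
            chunks.append(current)
            current, size = [], 0
        current.append(post)
        size += len(post)
    if current:
        chunks.append(current)
    return chunks
```

Splitting on post boundaries keeps each author's message intact, though a real system would more likely budget by model tokens than by characters.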
Masarah Paquet-Clouston, a co-author from the Complexity Science Hub in Vienna, Austria, commented on the broader implications of the study: “By leveraging the power of AI, we can revolutionize how we approach cyber threat intelligence. This technology has the potential to provide organizations with real-time, actionable insights to proactively defend against cyber attacks.”
The researchers plan to continue refining the LLM system and exploring its applications in various cybersecurity domains. They also call for further research into using state-of-the-art models, such as Claude 3.5 Sonnet and GPT-4o, to push the boundaries of AI-driven cyber threat intelligence.
As cyber threats continue to evolve and become more sophisticated, integrating AI and large language models into cybersecurity strategies is set to become a game-changer. This groundbreaking study by the Université de Montréal and Flare Systems paves the way for a new era of proactive, intelligence-driven cybersecurity.