Studies based on Reddit communities

What's new on the market? Combining internet traces and pretrained language models to recognize emerging drug names.

Forensic Sci Int. 2026;385:112958

Abstract

Posts and comments published by users in online forum discussions provide valuable insights and might contain the earliest traces of new substances emerging on the market. However, the systematic recognition of emerging new psychoactive substances (NPS) remains an important challenge for both public health agencies and law enforcement authorities. The large volume of messages published by users, combined with the unstructured nature of text, complicates the retrieval of relevant information such as drug name mentions. Common approaches based on keyword matching (e.g., regular expressions) limit current monitoring systems, as they can only detect known terms. Consequently, new or previously unseen drug names may remain undetected, so novel NPS under active discussion can be overlooked. To address this challenge, we introduce DrugRecon, a RoBERTa-based pretrained language model fine-tuned specifically for drug name recognition. The model was trained and evaluated on a manually annotated corpus of posts and comments collected from drug-related sections of three online forums (Drugs-Forum, Dread, and Reddit). A data augmentation strategy was applied during fine-tuning to improve generalization to previously unseen drug names. To demonstrate its applicability in real-world settings, DrugRecon was applied to posts and comments published between April and June 2025 across the three forums. The model successfully recognized drug names absent from existing lexicons, highlighting its capacity to detect emerging terminology. By combining automatic recognition with expert validation, 12 names were classified as denoting potential novel NPS. This proactive monitoring approach not only guides further investigations but also strengthens preparedness for when these substances eventually appear in drug-checking services, police seizures, or toxicological reports.
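
The limitation of keyword matching described in the abstract can be illustrated with a minimal sketch. This is not the DrugRecon implementation: the lexicon, the example post, the made-up name "zorbaline", and both function names are assumptions chosen for illustration, and the list of recognized names stands in for whatever a name recognition model would return.

```python
import re

# Hypothetical lexicon of already-known drug names (illustrative only).
KNOWN_LEXICON = {"mdma", "ketamine", "fentanyl"}


def keyword_matches(text, lexicon):
    """Return the lexicon terms that occur in the text as whole words."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in sorted(lexicon)) + r")\b",
        re.IGNORECASE,
    )
    return {match.group(1).lower() for match in pattern.finditer(text)}


def candidate_new_names(recognized, lexicon):
    """Keep recognized names that are absent from the lexicon, for expert review."""
    return sorted({name.lower() for name in recognized} - lexicon)


# "zorbaline" is an invented placeholder, not a real substance name.
post = "Tried ketamine last week, has anyone heard of zorbaline?"

# Placeholder for the names a recognition model might extract from the post.
recognized = ["ketamine", "zorbaline"]

found_by_keywords = keyword_matches(post, KNOWN_LEXICON)
to_review = candidate_new_names(recognized, KNOWN_LEXICON)
```

Here the keyword matcher can only ever return terms already in the lexicon, so the unseen name is missed, whereas comparing model output against the lexicon leaves exactly the unseen name as a candidate for expert validation.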
