Sugi Predict: An Open System for Target Prediction across Patent Chemical Space
Authors/Creators
Description
Two questions recur in drug discovery and chemical biology: what a given compound is likely to do, and what chemistry has already been explored around a given target. Much of the chemistry that bears on both questions sits in the patent record, which discloses tens of millions of compounds and is often where new chemical matter appears first. We present Sugi Predict, an open, reproducible system that predicts human protein targets across patent chemical space. Every drug-like compound in SureChEMBL (roughly 30 million) is assigned its predicted human protein targets by chemical k-nearest-neighbour transfer from an uncapped reference of 1.25 million ChEMBL ligand–target pairs. The transfer follows the established similarity principle, but each prediction is reported as two separate signals rather than one score: a confidence calibrated from chemical similarity, and a count of supporting reference neighbours. A known-compound flag accompanies both, so that a single close analogue is not read like a broad consensus of weaker matches. Sugi Predict answers two questions from one index: for a molecule, its predicted targets; and, in the reverse direction, for a target, the patented chemistry predicted against it, joined to patent number, assignee, date, and claim status. We validate the prediction across four test conditions of increasing difficulty: a leave-one-out interpolation upper bound (83.0 % recall@1), a scaffold split (76.5 %), a 1,556-drug named-target panel (54 % within the top 5), and a temporal split on chemistry disclosed after the reference was frozen (40.8 %, a prospective test). Accuracy throughout is proportional to chemical similarity to known chemistry, and we drop predictive routes that fail the same test. Sugi Predict is browsable as a web application at https://sugi.bio/predict.
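The k-nearest-neighbour transfer described above can be illustrated with a minimal sketch. It is not the Sugi Predict implementation: fingerprints are assumed to be plain sets of integer feature IDs, the raw Tanimoto similarity stands in for the calibrated confidence (the calibration itself is not specified here), an exact fingerprint match is used as a stand-in for the known-compound flag, and the names `tanimoto`, `predict_targets`, `k`, `min_sim`, and the `TARGET_*` labels are all hypothetical.

```python
from collections import defaultdict


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint features."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def predict_targets(query, reference, k=5, min_sim=0.3):
    """Transfer targets from the k most similar reference ligands.

    reference is a list of (fingerprint, target) pairs. Each returned
    prediction keeps the two signals apart: the best neighbour similarity
    (an uncalibrated stand-in for confidence) and the neighbour count.
    """
    scored = sorted(
        ((tanimoto(query, fp), target) for fp, target in reference),
        key=lambda pair: pair[0],
        reverse=True,
    )
    # Keep the top k neighbours, dropping any below the similarity floor.
    neighbours = [(sim, target) for sim, target in scored[:k] if sim >= min_sim]

    best = defaultdict(float)   # highest similarity seen per target
    support = defaultdict(int)  # number of supporting neighbours per target
    for sim, target in neighbours:
        best[target] = max(best[target], sim)
        support[target] += 1

    results = [
        {
            "target": target,
            "similarity": best[target],
            "support": support[target],
            # Assumption: an identical fingerprint marks a known compound.
            "known": best[target] == 1.0,
        }
        for target in best
    ]
    results.sort(key=lambda r: (r["similarity"], r["support"]), reverse=True)
    return results


# Toy reference set with hypothetical targets.
reference = [
    (frozenset({1, 2, 3, 4}), "TARGET_A"),   # identical to the query
    (frozenset({1, 2, 3, 9}), "TARGET_B"),
    (frozenset({1, 2, 4, 10}), "TARGET_B"),
    (frozenset({20, 21}), "TARGET_C"),       # unrelated chemistry
]
results = predict_targets(frozenset({1, 2, 3, 4}), reference)
```

In this toy case one target is supported by a single identical neighbour and another by two weaker neighbours. Reporting similarity and support as separate fields keeps those two situations distinguishable, which a single merged score would not.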
Files
| Name | Size |
|---|---|
| main.pdf (md5:f89f0be9972df75fbfb2ea25695f74ff) | 675.6 kB |
Additional details
Software
- Repository URL: https://github.com/tamerh/sugi-predict