I don't often enter the chaos realm that is Twitter anymore, but I love the monthly updates. As someone who loves med. chem. sci. computing, that AI post really resonates with me. My first question anytime we're talking with a software company that pitches AI is "What's your training set?" Most of the time it's going to be ChEMBL, and I haven't seen that set be very representative of modern drug discovery space.


All of this! Public datasets are great, but with the possible exception of the PDB powering AlphaFold, they are woefully inadequate to meet the bigger needs in drug discovery.
