Validating Biotech AI

One of my more aspirational career goals has been efficient validation testing. It’s an odd goal, but one to which anyone who has signed a thousand test forms can relate. Automated testing is an absolute necessity in a biopharma setting: it increases quality for patients and holds stakeholders and vendors accountable. In practice, however, many validation efforts still sink into a quagmire of paperwork exercises and lose sight of their original intent.

Put on your seat belts, this is going to get technical fast.

The current generation of tools we’ve developed is based on automated software testing techniques established in the early 1980s using test ‘oracles’. An oracle is simply the authority used to determine whether a test passes or fails: it knows the right output for a given input.

The problem our industry faces is that automation control software often falls into the non-testable class of software. The input domains are large, results depend on environmental preconditions, and outputs are easily misinterpreted.

So we cheat and make a human the oracle… and if you’ve read this far, the oracle is probably you. That’s where the oracle problem starts. We typically arm the human oracle with a series of weak oracles. They are heuristic in nature, but they go a long way toward delivering consistent, repeatable tests.

I’m guessing all of you have written, executed, or reviewed these types of tests for a project FAT, SAT, OQ, and/or PQ. The items on the left are straightforward, but as you move to the right, a little more sophistication is required to generate a good test. Explicit answers are available, but the tester needs clever control of the inputs to get the right output for a pass/fail result. Once in the field, controlling a complex set of real-world inputs borders on the impossible.

An example we can all relate to is testing a PID control loop. While the mathematical model is known, for all practical purposes the loop is tested as a black box that takes a set of inputs over time, applies a mathematical model, and sets an output. It is very common to field-test PIDs during commissioning.

While most engineers and operators can quickly tell when it isn’t right (either badly tuned or outright broken), it is difficult to explicitly test such a dynamic system. We end up testing it with a series of overlapping weak oracles that implicitly support correct functionality.
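A sketch of what those overlapping weak oracles might look like in code, using an invented step-response trace and made-up acceptance criteria. No single check proves the loop is correct; together they implicitly support correct behavior.

```python
def no_excess_overshoot(trace, setpoint, limit=0.10):
    """Weak oracle 1: peak value stays within an overshoot limit (here 10%)."""
    return max(trace) <= setpoint * (1 + limit)

def settles(trace, setpoint, band=0.02, tail=5):
    """Weak oracle 2: the last few samples sit inside a tolerance band."""
    return all(abs(v - setpoint) <= setpoint * band for v in trace[-tail:])

def weak_oracle_verdict(trace, setpoint):
    """Combine the heuristics into one pass/fail with a breakdown."""
    checks = {
        "overshoot": no_excess_overshoot(trace, setpoint),
        "settling": settles(trace, setpoint),
    }
    return all(checks.values()), checks

# An illustrative recorded response to a step change to setpoint 50.0:
trace = [0, 20, 38, 48, 53, 51, 49.5, 50.2, 49.9, 50.1, 50.0, 50.0]
ok, detail = weak_oracle_verdict(trace, 50.0)
print(ok, detail)
```

The thresholds (10% overshoot, 2% settling band) are placeholders; in practice they would come from the functional specification for the loop.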

The point of all this is that the PID example above illustrates the coming problem of validating AI machine learning algorithms.

Right now, large automation vendors like Rockwell are selling machine learning models used for predictive maintenance. Supply chain vendors like SAP are selling models for inventory control and predictive demand. These are great first passes at the low-hanging AI fruit, but they do not yet directly apply to manufacturing processes.

The models we started experimenting with involve supervised learning of “normal” batches for common biotech processes, such as chromatography and tangential flow filtration. A simulated model of the “ideal” equipment is used to generate the initial learning set. Then we test it with real-life historical data and evaluate the output. The hope is that we can create a model that will predict a possible deviation and alert the operator before it happens, or evaluate a process against a “golden batch” for quick release.
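The “normal batch” idea can be reduced to a toy sketch. Everything below is invented for illustration (the data, the 3-sigma threshold, the per-timepoint envelope approach); real implementations would use far richer models, but the shape of the workflow is the same: learn from simulated ideal batches, then flag historical data that strays outside what “normal” looks like.

```python
import statistics

def build_envelope(simulated_batches, k=3.0):
    """Learn a mean +/- k*stdev envelope at each timepoint across the
    simulated 'ideal' training batches."""
    envelope = []
    for point in zip(*simulated_batches):
        mu = statistics.mean(point)
        sigma = statistics.stdev(point)
        envelope.append((mu - k * sigma, mu + k * sigma))
    return envelope

def flag_deviations(batch, envelope):
    """Return the timepoint indices where a batch leaves the envelope."""
    return [i for i, (v, (lo, hi)) in enumerate(zip(batch, envelope))
            if not lo <= v <= hi]

# Three simulated "ideal" traces (e.g. a pressure profile) and one
# historical batch with a suspicious spike at the second timepoint:
ideal = [[1.0, 1.2, 1.1], [1.1, 1.3, 1.0], [0.9, 1.1, 1.2]]
env = build_envelope(ideal)
print(flag_deviations([1.0, 2.5, 1.1], env))  # → [1]
```

The alert-before-deviation goal from the paragraph above would sit on top of something like `flag_deviations`, running against live data instead of a completed batch.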

Other companies are doing similar work. These models are currently being deployed in other big data industries and network security systems. Like virtualization, IT/OT, and big data analysis, it is just a matter of time until these algorithms are implemented on the plant floor. Developing a solid validation strategy should begin now.

A human oracle will still be needed for the foreseeable future. Human involvement is the answer to the inevitable validation question of “Who watches the Watchers?” But we can better equip that human oracle by bringing the same advanced tools used in business applications into your tool set:

  •  Metamorphic Relational Testing
  •  Cross-Validation with Independent Models

Metamorphic testing was developed to help mitigate the oracle problem in application software testing. Grossly simplified, the methodology uses known relationships between a system’s inputs and outputs to detect possible malfunctions. A possible application: when controlling pH, we know there are ???? liters of material in the vessel, the solution is running a little acidic, and the caustic we’re dosing should raise the pH about 0.1 per mL injected. If the system starts dumping 10 mL/s, it’s probably safe to assume something is wrong and we’re just making salt. Metamorphic testing, in this case, helps constrain testable values that cannot be precisely pre-defined in stochastic systems.
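Here is what a metamorphic relation looks like in code, using a deliberately crude, invented dosing model (the linear pH-response constant is illustrative, not real chemistry). The key feature: the test never needs the exact correct dose, only the relation between two runs, e.g. doubling the volume should roughly double the required dose.

```python
def caustic_dose_ml(volume_l: float, delta_ph: float) -> float:
    """Hypothetical dosing model: mL of caustic needed to raise pH by
    delta_ph in volume_l liters, assuming a linear response of
    +0.1 pH per mL per 100 L (an invented constant for illustration)."""
    PH_RISE_PER_ML_PER_100L = 0.1
    return delta_ph * (volume_l / 100.0) / PH_RISE_PER_ML_PER_100L

def check_metamorphic_scaling(volume_l, delta_ph, rel_tol=1e-6):
    """Metamorphic relation: dose(2*V) should equal 2*dose(V).
    No oracle for the 'right' dose is needed -- only the relation
    between the two outputs is checked."""
    base = caustic_dose_ml(volume_l, delta_ph)
    doubled = caustic_dose_ml(2 * volume_l, delta_ph)
    return abs(doubled - 2 * base) <= rel_tol * max(1.0, abs(doubled))

print(check_metamorphic_scaling(500.0, 0.3))  # → True
```

A runaway 10 mL/s dosing rate would violate exactly this kind of relation long before a human noticed the trend on a screen.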

Cross-validation with an independent model is another possible tool. Again, an overly simplified description: evaluate the outputs from two independent mathematical models of a system for agreement. The major drawback is that it doubles the modeling work. But as our industry moves toward process modeling before equipment is built, that initial model may be reused to validate the real-world process.
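A minimal sketch of the idea, with an invented example process (a first-order temperature response, the kind of dynamics a jacketed tank might show; all constants are illustrative). One model is the closed-form solution, the other an independent numerical integration; the oracle is simply their agreement within a tolerance.

```python
import math

def analytic_response(t, t0, t_set, tau):
    """Model A: closed-form solution of dT/dt = (t_set - T)/tau."""
    return t_set + (t0 - t_set) * math.exp(-t / tau)

def simulated_response(t, t0, t_set, tau, dt=0.001):
    """Model B: independent numerical (Euler) integration of the same ODE."""
    temp, elapsed = t0, 0.0
    while elapsed < t:
        temp += (t_set - temp) / tau * dt
        elapsed += dt
    return temp

def models_agree(t, t0=20.0, t_set=37.0, tau=5.0, tol=0.05):
    """Cross-validation oracle: flag any disagreement between the models."""
    return abs(analytic_response(t, t0, t_set, tau)
               - simulated_response(t, t0, t_set, tau)) <= tol

print(models_agree(10.0))  # → True
```

In practice, Model B would be the process model built before the equipment existed, and the comparison would run against the real plant rather than a second simulation.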

These aren’t solutions to the oracle problem. But, properly applied, they can help mitigate errors and make the human oracle much more efficient. Combined with automated test execution, we may be able to achieve a better engineering-to-testing effort ratio on our projects, with higher-quality results and more options for regression testing.

 -Bill Mueller, Lucid Automation & Security

https://www.lucidautomationsecurity.com

If you want more information, I found these links very helpful when researching this article:
[1] Test your Machine Learning Algorithm with Metamorphic Testing
[2] On Testing Nontestable Programs
