Measuring accuracy presents a significant challenge. A lot can happen between the time a prediction is made and the moment, 60 minutes later, when it can be validated: a meal, exercise, insulin delivery, or a stressful meeting. At Diabits, we want to provide our users with the highest level of accuracy and predictive reliability possible.
Achieving this kind of accuracy doesn’t happen overnight. This is why we continually conduct validation testing, both internally and externally with other research organizations. We’ve worked hard to refine our methodology and to validate our predictions. We’ve performed validation testing in four rigorous studies, two of which used state-of-the-art blood sugar simulation to produce large data sets in silico. Last year we concluded an in-silico study using the Padova T1D Simulator.
Using the Padova T1D Simulator for accuracy testing has substantial benefits. With the simulator, you can evaluate not only what blood sugar was predicted and what would have happened 60 minutes later if no events had occurred, but also how much food, insulin, and exercise directly affected the subject and their prediction during the evaluation period.
Diabits uses data from continuous glucose monitors (CGM) to build a personalized model of each user’s metabolism. The machine learning algorithms behind Diabits predict future blood glucose values up to an hour in advance. Diabits achieves unparalleled accuracy with these blood sugar predictions because every user’s model is unique to them.
We validate the accuracy of our results in these studies using the Parkes Error Grid, an iteration of the Clarke Error Grid that was developed during a 2000 study by Joan L Parkes and collaborators. This grid system allows us to quickly compare a large dataset of predicted or measured points to a reference value.
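To make the comparison step concrete, here is a minimal Python sketch of how each prediction could be lined up with the reference value recorded 60 minutes later. This is an illustration only, not the Diabits pipeline: the function name `pair_predictions`, the dictionary layout, and the glucose values are all hypothetical.

```python
from datetime import datetime, timedelta

HORIZON = timedelta(minutes=60)  # prediction horizon described in this post


def pair_predictions(predictions, readings, horizon=HORIZON):
    """Pair each prediction with the reference reading taken `horizon` later.

    predictions: dict mapping the time a prediction was made -> predicted glucose
    readings:    dict mapping a reading time -> reference glucose
    Predictions with no reading at exactly made_at + horizon are skipped.
    """
    pairs = []
    for made_at, predicted in sorted(predictions.items()):
        target = made_at + horizon
        if target in readings:
            pairs.append((readings[target], predicted))
    return pairs


# Hypothetical values, for illustration only.
start = datetime(2020, 1, 1, 8, 0)
predictions = {start: 140.0, start + timedelta(minutes=5): 150.0}
readings = {start + timedelta(minutes=60): 132.0}

pairs = pair_predictions(predictions, readings)
print(pairs)  # [(132.0, 140.0)]
```

Each (reference, predicted) pair is one point that could then be placed on an error grid. The second prediction is dropped here because no reference reading exists yet for its target time.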
The Parkes Error Grid specifies five risk levels, labeled A through E. The A region is classified as accurate, and the B region is clinically acceptable, meaning values estimated in this region will have little to no effect on clinical outcome and will not negatively impact a user’s treatment decisions. As such, the results reported in this post are the proportion of predictions that fell in the A and B regions of the Parkes Error Grid for any given assessment.
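As a rough illustration of the metric, the sketch below computes the share of points in regions A and B from a list of region labels. It assumes each prediction has already been assigned a Parkes region by some other step; the region boundaries themselves are not reproduced here, and the function name and example labels are hypothetical.

```python
from collections import Counter

ACCEPTABLE_ZONES = {"A", "B"}  # accurate (A) and clinically acceptable (B)


def share_in_a_and_b(zones):
    """Return the percentage of points whose region label is A or B."""
    zones = list(zones)
    if not zones:
        raise ValueError("no region labels supplied")
    counts = Counter(zones)
    acceptable = sum(counts[zone] for zone in ACCEPTABLE_ZONES)
    return 100.0 * acceptable / len(zones)


# Hypothetical labels, for illustration only (not study data).
example_zones = ["A"] * 90 + ["B"] * 8 + ["C"] * 2
print(share_in_a_and_b(example_zones))  # 98.0
```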
Our first in-silico study was conducted in-house using the FDA-approved Padova simulator provided by the Epsilon Group. The study included 30 virtual patients and measured the predictive accuracy of the Diabits algorithm. Each virtual patient is entirely unique and represents a possible profile of a real Diabits user’s glucose metabolism.
The subjects were studied in four cohorts representing four distinct populations: adults with mixed hypoglycemia awareness, adults with impaired hypoglycemia awareness, pediatric patients with mixed hypoglycemia awareness and pediatric patients with impaired hypoglycemia awareness. Patients also had simulated behaviours related to diabetes management such as a randomized meal and insulin schedule based on the patient’s weight, age, and amount of insulin necessary based on carbs consumed. For each patient, a total of 360 days of blood glucose behaviour was simulated and insulin-on-board and carbs-on-board information were recorded.
Two different blood sugar prediction models were trained and tested for each patient. The first model, Production, is the algorithm that is used in Diabits today. The second model, ICE, is a more advanced version of Production that puts more weight on food, insulin, and exercise information. The results of the simulation showed that, while the Production model is highly accurate, the ICE model slightly outperforms it in most cases.
As a result of this study, the ICE model was implemented into Diabits. Today, if a user inputs food or insulin information, the ICE model will be automatically selected and used for that user’s predictions. If a user does not input this additional information, the Production model is automatically selected.
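The selection rule described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the actual Diabits code: the `UserInputs` class, its field names, and `select_model` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UserInputs:
    """Hypothetical summary of what a user has logged."""
    has_food_entries: bool = False
    has_insulin_entries: bool = False


def select_model(inputs):
    """Pick ICE when food or insulin information is available, else Production."""
    if inputs.has_food_entries or inputs.has_insulin_entries:
        return "ICE"
    return "Production"


print(select_model(UserInputs(has_food_entries=True)))  # ICE
print(select_model(UserInputs()))                       # Production
```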
Table comparing accuracy of models without simulated error
Because of the nature of simulated data, estimates made during in-silico testing will always be more idealized than estimates made on live subjects. To account for this, a measurement error was also simulated. Without the simulated error, 100% of values were in regions A and B of the Parkes Error Grid for the 60-minute prediction of both the ICE and Production models. With the simulated error taken into account, 97.72% of values remained in the A and B regions for the 60-minute prediction of the Production model, and 99.52% for the ICE model.
Table comparing accuracy of models with simulated error
In total, 3,110,400 data points were evaluated during this study; each was compared against the value Diabits predicted and assessed for accuracy.
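The 3,110,400 figure is consistent with the study design if we assume one glucose value every 5 minutes, a sampling interval common among CGMs. The interval is our assumption for this check and is not stated in the study description; the patient and day counts come from the text.

```python
PATIENTS = 30              # virtual patients in the study
DAYS_PER_PATIENT = 360     # simulated days per patient
MINUTES_PER_DAY = 24 * 60
SAMPLE_INTERVAL_MIN = 5    # assumption: one value every 5 minutes

readings_per_day = MINUTES_PER_DAY // SAMPLE_INTERVAL_MIN  # 288
total = PATIENTS * DAYS_PER_PATIENT * readings_per_day

print(readings_per_day)  # 288
print(total)             # 3110400
```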
Diabits predictions fall in the A and B regions
While Diabits was initially validated on live subjects in 2017 with BC Children’s Hospital, the results of this study further demonstrate Diabits’ predictive capabilities and validate the ICE model.