AI Can Grade IVF Embryos with Same Accuracy as Experts

A well-trained AI algorithm might be able to improve the success rates of IVF treatment. In vitro fertilization, or IVF, has been helping people improve their reproductive odds since its first successful case in 1977.

While many advances in technology have improved the process, some aspects of IVF treatment remain time-consuming and relatively inaccurate. One of these is a process known as “grading”.

Slow and inaccurate

The task requires an embryologist to examine embryos under a microscope, checking their morphological features and assigning a quality score. Embryos with round cells in even numbers score highly, while those with fractured or fragmented cells score poorly.

The highest-scoring embryos are implanted first. The process requires experience and can be inaccurate, as it relies purely on visual attributes. Accuracy at this stage can be improved if a cell is removed from the embryo and tested for abnormalities, a procedure known as preimplantation genetic screening.

Off-the-shelf algorithm

However, this additional step makes the IVF process even more expensive and time-consuming. So until now, visual grading of embryos has been the best option.

That could be about to change, though, thanks to an algorithm that has learned to grade embryos about as accurately as its human counterparts. Researchers trained a Google deep learning algorithm to classify IVF embryos as good, fair, or poor based on the likelihood that each one would successfully implant.

STORK stacks up against experts

Training the algorithm has been a long-term project. It began back in 2011, when the embryology lab at Weill Cornell Medicine, where the research took place, installed a time-lapse imaging system inside its embryo incubators. This meant that technicians could watch and record the embryos as they developed.

The resulting 10,000 videos of anonymized embryos could then be freeze-framed and fed into a neural network. Lab director Nikica Zaninovic teamed up with Olivier Elemento, director of Cornell’s Englander Institute for Precision Medicine, to take the project to the next step.

The two researchers thought they could use AI to automate a process that was notoriously time-consuming and inaccurate. To test their trained network, which they nicknamed STORK, they recruited five embryologists from clinics on three continents to grade 394 embryos based on images taken from different labs.


Surprisingly, the five experts reached the same conclusion on only 89 embryos, less than a quarter of the total. To get around this lack of agreement, the embryologists were then asked to use a majority voting procedure: at least three of the five needed to agree to classify an embryo as good, fair, or poor.
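The majority voting rule described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: the function names, the sample votes, and the choice to return None when no grade reaches three votes are all assumptions made for the example.

```python
from collections import Counter
from typing import Optional, Sequence

def majority_grade(votes: Sequence[str], threshold: int = 3) -> Optional[str]:
    """Return the grade picked by at least `threshold` graders, else None.

    Hypothetical helper: the article only says three of five embryologists
    had to agree; what happened without a majority is not described.
    """
    if not votes:
        return None
    grade, count = Counter(votes).most_common(1)[0]
    return grade if count >= threshold else None

def is_unanimous(votes: Sequence[str]) -> bool:
    """True when every grader assigned the same grade."""
    return len(votes) > 0 and len(set(votes)) == 1

# Made-up votes from five graders for three embryos (not study data).
example_votes = [
    ["good", "good", "fair", "good", "poor"],  # three votes for "good"
    ["good", "good", "fair", "fair", "poor"],  # 2-2-1 split, no majority
    ["poor", "poor", "poor", "poor", "poor"],  # unanimous
]

for votes in example_votes:
    print(votes, majority_grade(votes), is_unanimous(votes))
```

With five graders and three possible grades, a 2-2-1 split is the only outcome in which no grade reaches three votes, which is why the sketch needs a "no majority" case at all.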

STORK looked at the same images graded by the humans and predicted the majority voting decision with 95.7 percent accuracy. More research is needed before STORK is rolled out in clinics around the world, but the initial results look promising, and the system may eventually help improve IVF success rates.
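For readers curious how an accuracy figure of this kind is typically calculated, here is a minimal Python sketch. It assumes accuracy simply means the share of embryos where the model's grade matches the experts' majority grade. The function name and the sample grades are hypothetical and are not taken from the study.

```python
from typing import Sequence

def agreement_rate(predicted: Sequence[str], consensus: Sequence[str]) -> float:
    """Fraction of embryos where the model's grade equals the majority grade."""
    if len(predicted) != len(consensus):
        raise ValueError("predicted and consensus must have the same length")
    if not predicted:
        raise ValueError("at least one embryo is required")
    matches = sum(p == c for p, c in zip(predicted, consensus))
    return matches / len(predicted)

# Made-up grades for four embryos (not study data).
model_grades = ["good", "fair", "poor", "good"]
majority_grades = ["good", "fair", "poor", "fair"]

rate = agreement_rate(model_grades, majority_grades)
print(f"{rate:.1%}")
```

In this toy example the model matches the majority on three of four embryos.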

