Introduction to Artificial Intelligence in Ophthalmology

From EyeWiki


Artificial Intelligence (AI), a term introduced in the 1950s, refers to software that can mimic cognitive functions such as learning and problem solving.[1] It makes it possible for machines to learn from experience and adjust to new inputs. AI machines can be trained to accomplish such tasks by processing and recognizing patterns in large amounts of data.[1] It has numerous applications in several fields of ophthalmology.[2]

Types of Artificial Intelligence

Simple Automated Detectors

A simple automated detector is a system programmed with an explicit algorithm that identifies the presence or absence of features (e.g. the location, dimensions, or contour of a lesion) based on objective criteria. The input into the system is a set of step-wise rules generated by the individuals engineering the model.[3] This rule-based algorithm assesses features and ultimately yields an outcome (e.g. a diagnosis) based on the patterns identified.[3] The rules are generally written by programmers or content experts using input features that they have previously identified as important.

Machine Learning

A more advanced form of AI is machine learning. Unlike simple automated detectors, the input into machine learning is a training dataset (e.g. a set of images) rather than predefined algorithms (i.e. rules written by the program designers).[4] Machine learning can be further classified into supervised, semi-supervised, and unsupervised learning.[4] In unsupervised learning, the machine must “learn” on its own the different labels or classes of inputs based on the dataset presented. Providing the machine with objective guidance through a labeled dataset is the basis for supervised learning.[4] A labeled dataset, for example, is a collection of images that have each been preassigned a “ground truth” diagnosis by experts using standard diagnostic methods. A consensus among experts on a reference standard diagnosis for each item (e.g. image) in the set may strengthen the quality of the dataset presented to the machine during the initial learning phase, when the algorithms are built.[5] Presenting a machine learning system with a hybrid of labeled and unlabeled data is termed semi-supervised learning.[4] See Figure 1.

Figure 1. Machine Learning
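The distinction above can be made concrete with a toy sketch (not drawn from the article; the features, values, and diagnoses are hypothetical). A supervised learner is handed expert-labeled examples and predicts the label of a new input — here with a minimal 1-nearest-neighbor rule:

```python
# Toy sketch of supervised learning: "training" here is simply storing
# expert-labeled examples; prediction copies the label of the nearest one
# (a 1-nearest-neighbor rule). Feature names and values are hypothetical.

def nearest_neighbor_predict(labeled_data, features):
    """Return the label of the training example closest to `features`."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    closest = min(labeled_data, key=lambda item: distance(item[0], features))
    return closest[1]

# Hypothetical labeled dataset: (cup-to-disc ratio, intraocular pressure)
# paired with a "ground truth" diagnosis preassigned by experts.
labeled_data = [
    ((0.3, 14.0), "normal"),
    ((0.4, 16.0), "normal"),
    ((0.7, 24.0), "glaucoma"),
    ((0.8, 26.0), "glaucoma"),
]

prediction = nearest_neighbor_predict(labeled_data, (0.75, 25.0))  # "glaucoma"
```

An unsupervised learner would receive only the feature tuples, without the diagnosis strings, and would have to group similar inputs on its own.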

Machine Learning Techniques

Machine learning models can employ various techniques to predict an output. Classification is one technique that is built upon supervised and semi-supervised learning.[4] This type of system allows for concrete categorization of outputs (e.g. presence versus absence of disease; mild versus severe category of disease).[4] When classes or labels for the training data are not provided, unsupervised machine learning is still able to cluster similar inputs, even if it is not able to definitively classify the individual clusters.[4] When ordinal or continuous outcomes are desired rather than a simple classification, regression techniques can employ supervised learning to determine a continuous score (e.g. numerical value) based on an input (e.g. image).[4]
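As an illustrative sketch of the regression technique described above (the feature and score are hypothetical, not from the article), an ordinary least-squares line fit maps an input feature to a continuous severity score rather than a discrete class:

```python
# Toy regression sketch: supervised learning that outputs a continuous
# score rather than a discrete class. Fits y = slope * x + intercept by
# ordinary least squares.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training pairs: an image-derived feature vs. an
# expert-assigned severity score.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
slope, intercept = fit_line(xs, ys)
predicted_score = slope * 2.5 + intercept  # continuous output for a new input
```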

The architecture of machine learning models can employ neural networks (to be discussed next) in addition to other techniques such as genetic programming, support vector machines, statistical regression, tree-based classification & regression, or random forests.

Utilizing Neural Networks in Machine Learning

An AI system designed using artificial neural networks (ANNs) is considered a subset of machine learning and can be used in supervised, semi-supervised, and unsupervised learning. The architecture of an ANN is composed of layers of neurons. An ANN consists of an input layer that receives (e.g. diagnostic) features determined in advance by programmers and subject matter experts, intermediate neural network layers, and a final output layer that receives the analysis from the intermediate layers and determines the outcome (e.g. diagnosis).[3][4] A layer refers to the level at which analysis of one set of distinct (e.g. diagnostic) features is conducted. Each layer is composed of multiple neurons, or nodes, that perform the higher-level analytical processing, initially described as emulating the neuronal functions of the brain.

Each node can be considered a basic calculation over its multiple inputs: multiplication of each individual input by an assigned “weight”, addition of a bias, and a (non-linear) mathematical activation function that enables more complex analysis. The result of each node in one layer is then transmitted to multiple nodes in the next layer, where further computations occur. At the output layer of the neural network, the resulting number approximates a continuous measurement (i.e. a regression prediction) or the probability of an outcome or classification. Each item (e.g. image) in the dataset is usually analyzed multiple times during training, using back-propagation to update the network's weights and biases and maximize the performance of the model's predictions. See Figure 2 for a visualization of this process (and a comparison to deep learning, described below).

Figure 2. Machine learning vs deep learning neural networks
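The single-node calculation described above can be sketched in a few lines of Python. The numbers are illustrative only, and a sigmoid stands in for the activation function (one common choice among several):

```python
import math

def node_output(inputs, weights, bias):
    """One node's computation: multiply each input by its weight, add a
    bias, then apply a non-linear activation (here a sigmoid, which
    squashes the weighted sum into the range (0, 1))."""
    weighted_sum = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-weighted_sum))

# Illustrative numbers only: three input features feeding a single node.
output = node_output(inputs=[0.5, -1.2, 0.3],
                     weights=[0.8, 0.4, -0.6],
                     bias=0.1)
# `output` would be passed on to every node this one connects to in the
# next layer; back-propagation adjusts `weights` and `bias` during training.
```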

Deep Learning

In a simple neural network, there are only a few layers: as few as just an input and output layer. Deep learning algorithms, however, are neural networks with an expanded number of communicating layers between the input and output layers: from dozens to hundreds. This intricate type of machine learning commonly involves supervised learning with labeled datasets (e.g. images with preassigned expert diagnoses). See Figure 3 for a flow diagram of this relationship.

Figure 3. Flow diagram of Artificial Intelligence

One popular form of deep learning neural network is the convolutional neural network (CNN). Other deep learning models include recurrent neural networks, long short-term memory networks (LSTMs), and fully connected deep neural networks.

The CNN begins with a large matrix of inputs into the first layer (often a 2D image), which then yields an output that serves as the input for the next layer in the series. The connections between layers describe a convolution propagating local information; the weights and biases of the convolution are shared among all nodes of a layer, which dramatically reduces the parameter space of the model and thus the complexity of training it. The analysis from each layer is transmitted through the network until the final layer produces the outcome.[4] CNNs designed for image-based diagnosis, for example, analyze the pixels in correlation with the features seen in the depicted disease state.[3] If the predicted outcome does not match the expert outcome determined using standard methods, the errors in the prediction are propagated backwards through the CNN to adjust the weights and reduce the error. After training, the model is presented with a new dataset (e.g. unlabeled images) and produces an outcome (e.g. diagnosis) based on what the machine “learned” under supervision. See Figure 2 for a visualization of this process in comparison to machine learning as described above.
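The shared-weight convolution described above can be sketched in pure Python (a toy image and kernel, not a real CNN layer): the same small kernel is applied at every position, so the layer's trainable parameters are just the kernel entries rather than one weight per pixel pair as in a fully connected layer.

```python
# Toy sketch of a shared-weight convolution. The image and kernel values
# are made up; real CNN kernels are learned during training.

def convolve2d(image, kernel):
    """Valid (no-padding) 2D convolution with a single shared kernel,
    computed as a sliding dot product at every position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

image = [  # a tiny "image" with a vertical edge down the middle
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [[-1, 1],  # responds strongly where intensity rises left-to-right
          [-1, 1]]

feature_map = convolve2d(image, kernel)  # [[0, 2, 0], [0, 2, 0]]
```

The output "feature map" then becomes the input to the next layer, exactly as the text describes.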

While the training dataset can provide an initial analysis, it is important to ensure that the algorithm is generalizable to new datasets. This is accomplished with appropriate control during training and verified using validation datasets, which minimize “overfitting” of the algorithm to the training dataset. Techniques to minimize overfitting include dataset expansion, augmentation, dropout, and regularization. Augmentation can be utilized where the original dataset cannot be expanded; for example, images can be altered (resized, cropped, rotated, etc.) in such a way that they still represent realistic examples. Dropout trains the algorithm while ignoring a randomly chosen subset of nodes, forcing the remaining nodes to perform the work of those that were dropped. Regularization penalizes large weights, for instance by adding the sum of the squared weights in the system to the training loss. The final step is to evaluate the model’s performance on a new dataset to which it has not been exposed.[4]
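Two of these techniques, dropout and regularization, can be sketched as follows. This is a minimal illustration rather than a full training loop; the inverted-dropout scaling and L2 (squared-weight) penalty shown here are common conventions, not prescriptions from the article:

```python
import random

def dropout(activations, rate, rng):
    """Inverted dropout: zero each node's output with probability `rate`
    during training, scaling survivors by 1/(1-rate) so the expected
    total activation is unchanged."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

def l2_penalty(weights, lam):
    """L2 regularization term added to the training loss; it grows with
    the squared magnitude of the weights, discouraging large weights."""
    return lam * sum(w * w for w in weights)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
dropped = dropout([1.0, 1.0, 1.0, 1.0], rate=0.5, rng=rng)
penalty = l2_penalty([0.5, -0.3], lam=0.01)
```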

Limitations of Artificial Intelligence

As a quickly evolving field, AI comes with some challenges. For one, the accuracy of outcomes is heavily dependent on the quality of inputs.[6] This has been described as the “garbage in, garbage out” phenomenon; if the initial dataset presented to the machine is inadequate, then the predictions generated by the AI tool will be inaccurate. In some situations, output recommendations by AI tools may be simply incorrect. For example, IBM Watson Health’s AI algorithm, which predicts treatments for patients with cancer, recommended the use of bevacizumab in patients with severe bleeding.[6] However, bevacizumab carries a black box warning for hemorrhage. This example highlights the importance of training and validating AI algorithms.[5]

Erroneous predictions by AI algorithms can bring up the issue of liability for physicians. Current law protects physicians from liability, as long as they follow the standard of care.[7] Thus, physicians are incentivized to use AI predictions only if they confirm existing decision-making processes, instead of as a resource to improve and build upon patient care.[8] In the future, further medicolegal implications must be considered if AI becomes integrated into the standard of care.

Additionally, neural network algorithms are not entirely designed by programmers, but by rules inferred from the training dataset. They arrive at conclusions opaquely, as programmers are unaware of the reasoning behind these self-generated rules. This is called the black box dilemma, as people may be hesitant to trust predictions that stem from a process that, by definition, lacks transparency.[9] Additionally, biases and stereotypes inherent in the training dataset become integrated into the model's performance.[10]

There is a fear that AI reduces the need for physicians, as numerous studies have shown that certain algorithms have higher success rates in diagnosing diseases than clinicians.[11][12] This has been a particular worry in image-based fields such as radiology and pathology, with concerns that AI may limit physicians by narrowing the scope of their clinical judgment, reducing reliance on broad differential diagnoses, and automating the process of patient care in ways that may affect the patient-physician relationship. By predicting diagnoses in a purely algorithmic and objective manner, AI dismisses any subjective facets of a disease that may be unique to a patient, potentially overlooking crucial information.[13]

However, it can be argued that AI merely augments the work of physicians by serving as a diagnostic tool generating predictions that can positively affect patient management.[3] For example, an AI-integrated telemedicine platform designed to screen and refer patients with cataracts exhibited diagnostic performance of over 90%. More importantly, the platform improved physician efficiency by allowing them to evaluate ten times as many patients a year.[14] By complementing the role of physicians, AI has the potential to significantly improve patient care by increasing efficiency and outcomes as it becomes incorporated into clinical practice in the near future.

Method of Approaching Artificial Intelligence Studies

AI is rapidly evolving in the field of ophthalmology, and new experimental algorithms describing the AI system and methodology are emerging in the literature. Articles regarding AI studies can be challenging to understand, especially when there is no standardized way of presenting data, statistics, and clinical value. Nevertheless, there are fundamental characteristics that readers can look for in order to critically appraise such studies. In the introduction, articles will typically emphasize the clinical gap that AI may fulfill and the research question it seeks to answer. Additionally, the introduction usually summarizes a thorough literature search that explores existing technologies pertaining to the disease and discusses the potential for AI to build on these technologies to provide further insight.[5]

In the methods section, articles will elaborate on the basic framework of an AI system, which consists of two phases: (1) training and validation and (2) testing. The training and validation phase can be further broken down into two parts: (1) selection of a machine learning model; and (2) a training dataset entailing data and/or images.[5] Datasets may vary in sample size and may be limited in diversity or generalizability. As mentioned previously, the ease with which machine learning models can enshrine biases implicit in the data, and the opaqueness of the reasoning behind their conclusions, make the selection of training data of fundamental importance. An article's methods section should describe the source of the training datasets and the data's characteristics (diversity, gender, disease severity range, ages, etc.). All training steps should be described, including the machine learning model upon which the study is based (including the source of any pre-trained weights if only transfer learning is performed), the versions of software used, and where source code for the model training can be obtained.
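The two-phase framework above implies partitioning the data before any training begins, so that the test set is never seen during training or validation. A minimal sketch (the 70/15/15 fractions are illustrative, not a standard from the article):

```python
import random

def split_dataset(items, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle once, then partition into training, validation, and
    held-out test subsets; the test subset stays untouched until the
    final evaluation of the model."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

# Example: 100 items (stand-ins for images or records) split 70/15/15.
train_set, val_set, test_set = split_dataset(list(range(100)))
```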

Moreover, AI studies should clearly describe the workflow for use of the AI system. For example, the workflow may consist of an input, such as an image, which is then analyzed by the AI system to detect specific features and ultimately produce an outcome, a diagnosis. The resulting diagnosis may complement the work of physicians, demonstrating the potential for reported AI systems to be ultimately incorporated into clinical practice.

As described by Ting et al., AI articles should generally include limitations of the AI systems discussed. This gives readers a clearer understanding of the potential for integration of AI into clinical practice and of the associated shortcomings of such systems.[5]

Potential Applications of Artificial Intelligence in Ophthalmology[2]

The incorporation of AI systems into clinical practice can potentially enhance productivity in the workplace, as well as aid in clinical decision-making or patient communication processes.[15][16] Applying AI to medical diagnostic assessments allows for the automatic analysis of imaging, for example, and the subsequent generation of a diagnosis or prediction of a disease course.[3][15] In ophthalmology, many AI platforms are being explored for potential use in the detection, surveillance, and treatment of various ocular diseases. However, many are in the experimental phase, and further evaluation must be done to assess whether these algorithms are appropriate for clinical practice.[17] AI algorithms have been described in the literature in several fields of ophthalmology, such as diabetic retinopathy, glaucoma, age-related macular degeneration, retinopathy of prematurity, retinal vascular occlusions, keratoconus, cataract, refractive errors, retinal detachment, squint, and ocular cancers. AI is also useful for intraocular lens power calculation, planning squint surgeries, and planning intravitreal anti-vascular endothelial growth factor injections. In addition, AI can detect cognitive impairment, dementia, Alzheimer's disease, stroke risk, and more from fundus photographs and optical coherence tomography.[2]


Glaucoma

Multiple deep learning programs have demonstrated high sensitivity and specificity in recognizing glaucomatous optic nerve changes.[1][17][18] These AI readings are based on diagnostic features otherwise typically assessed by a human expert, including OCT and color fundus photography findings, visual field testing results, and intraocular pressure and corneal thickness measurements. In addition to these AI systems that screen for the presence or absence of glaucoma, Muhammad et al. developed a deep learning algorithm that accurately identifies glaucoma suspects, allowing for more timely management.[19] An expanded discussion of the applications of AI in the field of glaucoma can be found in the article “Artificial Intelligence in Glaucoma.”

Ocular Oncology

Emulating a decision tree model, a machine learning algorithm was developed to anticipate the course of periocular reconstruction during surgical treatment of basal cell carcinoma.[18] Moreover, several machine learning systems, through the use of artificial neural networks, have been designed to predict disease outcomes for choroidal melanoma by analyzing demographic data and oncologic history.[1] Habibalahi et al. implemented machine learning techniques in the development of a multispectral imaging system for the non-invasive detection of ocular surface squamous neoplasia.[20] The system demarcates the region of neoplastic change, providing a visual representation of disease margins to the clinician or surgeon in minimal time.[20]


Cataract

Machine learning programs have been developed to detect and grade cataracts.[18][21][22] Wu et al. recently validated a model using an AI algorithm called ResNet to identify referable cataracts.[23] Deep learning algorithms for the assessment of congenital cataracts in particular have also been reported.[1][24] A validated system by Liu et al., known as the CC-Cruiser, demonstrated high accuracy for identifying the region, density, and degree of congenital cataract formation based on slit-lamp photographs.[24] Machine learning systems for cataracts have also been shown to adequately guide plans for surgical intervention, as well as anticipate the likelihood of posterior capsular opacification developing post-operatively.[1][18][25] Intraocular lens power calculation can also be conducted through machine learning methods.[18] One notable example is the Hill-RBF formula, which analyzes the following input data: axial length, central corneal thickness, anterior chamber depth, lens thickness, corneal diameter, and keratometry measurements.[18]

Pediatric Ophthalmology

In the pediatric population, timely ocular management is critical for the preservation of vision. Incorporation of AI into screening and treatment practices may aid in achieving optimal ophthalmic care. A deep learning algorithm for the assessment of strabismus from external images has been developed, with potential for implementation in tele-ophthalmology. Other systems to detect strabismus are based on eye-tracking deviations or retinal birefringence scanning.[26][27] Other machine learning systems have the potential to facilitate screening for high myopia among other refractive errors, as well as classify children susceptible to reading disabilities.[28] Van Eenwyk et al. describe a system using machine learning that incorporates Brückner pupil red reflex imaging and eccentric photorefraction to detect amblyogenic features of strabismus or high refractive errors.[29] AI systems for congenital cataracts are discussed above in the “Cataract” section.


Retina

AI has been utilized in diabetic retinopathy (DR), retinopathy of prematurity (ROP), and age-related macular degeneration (AMD). The IDx-DR system was recently FDA approved for DR.[30] Other applications exist that can identify the severity of DR and clinically significant macular edema. In ROP, AI tools such as the i-ROP DL system can distinguish features such as plus disease and are comparable or superior to expert diagnosis.[31] In AMD, AI can be used to distinguish non-exudative from exudative AMD.[32] An expanded discussion of the applications of AI in the field of retina can be found in the article “Artificial Intelligence in Retina.”


Oculoplastics

In oculoplastics, AI has the potential to aid in automated processing and measurement of patients’ facial dimensions for pre- and post-operative evaluations, as well as clinical decision support for oculoplastic referral by non-ophthalmologists. Semantic segmentation networks have been trained to automate periorbital measurements with performance comparable to human graders.[33] Hung et al. applied similar technology to train a model to aid general practitioners in decision-making regarding referral of patients with blepharoptosis, which was shown to outperform the sensitivity and specificity of referral decisions made by non-ophthalmic physicians.[33] Simsek et al. developed an AI algorithm to objectively assess patients’ progress after eyelid surgery in a standardized and repeatable manner.[34]

Future directions

The advent of AI may reshape the field of medicine. There is unprecedented potential for AI to expand scientific inquiry by using neural networks to generate hypotheses and make new discoveries. With rapid analysis of vast amounts of data, AI can explore associations between disease features that may not be readily apparent to humans. There is potential for AI to enhance physicians’ abilities to diagnose conditions earlier and more accurately. Ultimately, AI has the potential to assist physicians by individualizing medical management, exposing patients to therapy only when the clinical judgement of the physician is supported by the results of deep learning.[3]

Looking beyond solely utilizing AI in clinical practice, machine learning methods may play a role in guiding research investigations that aim to identify disease features newly discovered through automated techniques.[3] On a global level, the application of AI to existing tele-ophthalmology programs may facilitate outreach to underserved regions, addressing the shortage of specialists available to provide their expertise.[1]


  1. Kapoor R, Walters SP, Al-Aswad LA. The current state of artificial intelligence in ophthalmology. Surv Ophthalmol. 2019;64(2):233-240. doi:10.1016/j.survophthal.2018.09.002
  2. Akkara JD, Kuriakose A. Role of artificial intelligence and machine learning in ophthalmology. Kerala J Ophthalmol. 2019;31(2):150-160. doi:10.4103/kjo.kjo_54_19
  3. Roach L. “Artificial Intelligence.” EyeNet Magazine, Nov. 2017.
  4. The Ultimate Guide to AI in Radiology. Accessed November 13, 2019.
  5. Ting DSW, Lee AY, Wong TY. An Ophthalmologist’s Guide to Deciphering Studies in Artificial Intelligence. Ophthalmology. 2019;126(11):1475-1479. doi:10.1016/j.ophtha.2019.09.014
  6. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444. doi:10.1038/nature14539
  7. Price WN, Gerke S, Cohen IG. Potential Liability for Physicians Using Artificial Intelligence. JAMA. October 2019. doi:10.1001/jama.2019.15064
  8. Froomkin AM, Kerr IR, Pineau J. When AIs Outperform Doctors: Confronting the Challenges of a Tort-Induced Over-Reliance on Machine Learning. SSRN. Accessed December 1, 2019.
  9. Castelvecchi D. Can we open the black box of AI? Nature. Published October 5, 2016. Accessed November 12, 2019.
  10. Bolukbasi T, Chang K-W, Zou JY, Saligrama V, Kalai AT. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. In: Lee DD, Sugiyama M, Luxburg UV, Guyon I, Garnett R, editors. Advances in Neural Information Processing Systems 29. Curran Associates, Inc.; 2016. pp. 4349-4357.
  11. Ardila D, Kiraly AP, Bharadwaj S, et al. End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography. Nat Med. 2019;25(6):954-961. doi:10.1038/s41591-019-0447-x
  12. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115-118. doi:10.1038/nature21056
  13. Johnston SC. Anticipating and Training the Physician of the Future: The Importance of Caring in an Age of Artificial Intelligence. Acad Med. 2018;93(8):1105-1106. doi:10.1097/ACM.0000000000002175
  14. Ting DSJ, Ang M, Mehta JS, Ting DSW. Artificial intelligence-assisted telemedicine platform for cataract screening and management: a potential model of care for global eye health. Br J Ophthalmol. 2019;103(11):1537-1538. doi:10.1136/bjophthalmol-2019-315025
  15. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25(1):44-56. doi:10.1038/s41591-018-0300-7
  16. Tsui JC, Wong MB, Kim BJ, et al. Appropriateness of ophthalmic symptoms triage by a popular online artificial intelligence chatbot. Eye. 2023.
  17. Ting DSW, Peng L, Varadarajan AV, et al. Deep learning in ophthalmology: The technical and clinical considerations. Prog Retin Eye Res. April 2019. doi:10.1016/j.preteyeres.2019.04.003
  18. Akkara JD, Kuriakose A. Role of artificial intelligence and machine learning in ophthalmology. Kerala J Ophthalmol. 2019;31(2):150-160. Accessed December 1, 2019.
  19. Muhammad H, Fuchs TJ, De Cuir N, et al. Hybrid Deep Learning on Single Wide-field Optical Coherence Tomography Scans Accurately Classifies Glaucoma Suspects. J Glaucoma. 2017;26(12):1086-1094. doi:10.1097/IJG.0000000000000765
  20. Habibalahi A, Bala C, Allende A, Anwer AG, Goldys EM. Novel automated non invasive detection of ocular surface squamous neoplasia using multispectral autofluorescence imaging. Ocul Surf. 2019;17(3):540-550. doi:10.1016/j.jtos.2019.03.003
  21. Yang J-J, Li J, Shen R, et al. Exploiting ensemble learning for automatic cataract detection and grading. Comput Methods Programs Biomed. 2016;124:45-57. doi:10.1016/j.cmpb.2015.10.007
  22. Zhang L, Li J, Zhang I, Han H, Liu B, Yang J, et al. Automatic cataract detection and grading using Deep Convolutional Neural Network. In: 2017 IEEE 14th International Conference on Networking, Sensing and Control (ICNSC); 2017. pp. 60-65.
  23. Wu X, Huang Y, Liu Z, et al. Universal artificial intelligence platform for collaborative management of cataracts. Br J Ophthalmol. 2019;103(11):1553-1560. doi:10.1136/bjophthalmol-2019-314729
  24. Liu X, Jiang J, Zhang K, et al. Localization and diagnosis framework for pediatric cataracts based on slit-lamp images using deep features of a convolutional neural network. PLoS ONE. 2017;12(3):e0168606. doi:10.1371/journal.pone.0168606
  25. Mohammadi S-F, Sabbaghi M, Z-Mehrjardi H, et al. Using artificial intelligence to predict the risk for posterior capsule opacification after phacoemulsification. J Cataract Refract Surg. 2012;38(3):403-408. doi:10.1016/j.jcrs.2011.09.036
  26. Chen Z, Fu H, Lo W-L, Chi Z. Strabismus Recognition Using Eye-Tracking Data and Convolutional Neural Networks. J Healthc Eng. 2018;2018:7692198. doi:10.1155/2018/7692198
  27. Gramatikov BI. Detecting central fixation by means of artificial neural networks in a pediatric vision screener using retinal birefringence scanning. Biomed Eng Online. 2017;16(1):52. doi:10.1186/s12938-017-0339-6
  28. Reid JE, Eaton E. Artificial intelligence for pediatric ophthalmology. Curr Opin Ophthalmol. 2019;30(5):337-346. doi:10.1097/ICU.0000000000000593
  29. Van Eenwyk J, Agah A, Giangiacomo J, Cibis G. Artificial intelligence techniques for automatic screening of amblyogenic factors. Trans Am Ophthalmol Soc. 2008;106:64-73.
  30. Abràmoff MD, Lavin PT, Birch M, Shah N, Folk JC. Pivotal trial of an autonomous AI-based diagnostic system for detection of diabetic retinopathy in primary care offices. NPJ Digit Med. 2018;1:39. doi:10.1038/s41746-018-0040-6
  31. Brown JM, Campbell JP, Beers A, et al. Automated Diagnosis of Plus Disease in Retinopathy of Prematurity Using Deep Convolutional Neural Networks. JAMA Ophthalmol. 2018;136(7):803-810. doi:10.1001/jamaophthalmol.2018.1934
  32. Russakoff DB, Lamin A, Oakley JD, Dubis AM, Sivaprasad S. Deep Learning for Prediction of AMD Progression: A Pilot Study. Invest Ophthalmol Vis Sci. 2019;60(2):712-722. doi:10.1167/iovs.18-25325
  33. Brummen A, Owen J, et al. Artificial intelligence automation of eyelid and periorbital measurements. Investigative Ophthalmology & Visual Science. 2021;62:2149. doi:10.1016/j.ajo.2021.05.007
  34. Simsek I, Sirolu C. Analysis of surgical outcome after upper eyelid surgery by computer vision algorithm using face and facial landmark detection. Graefes Arch Clin Exp Ophthalmol. 2021;259(10):3119-3125. doi:10.1007/s00417-021-05219-8