Artificial neural networks have been known for over 60 years, since Frank Rosenblatt invented the perceptron in 1957. Although their range of applications was limited until the 21st century, they have since revolutionized fields such as natural language processing, computer vision, reinforcement learning, and pattern recognition. While much of this breakthrough is attributed to the era of big data and the rise of smartphones, which made it possible to train increasingly large models, image-related machine learning algorithms still suffer from the problem of datasets that are too small.
During this seminar, I would like to provide a brief introduction to the concept of neural networks, with an emphasis on convolutional neural networks and related object detection/segmentation algorithms, such as R-CNN, Mask R-CNN, and YOLO. Drawing on experience gained in industry machine learning projects, I will describe common challenges encountered when training computer vision models (e.g. the aforementioned dataset size) and steps to overcome them. Data science elements essential for successful project delivery, such as exploratory data analysis (EDA), data augmentation, and error analysis, will also be covered. Last but not least, I will describe transformer models, a recent approach to computer vision, as well as a basis for questions such as "how do images relate to language?" and "in what direction should we search for artificial general intelligence?".