CNS*2020 Online has ended
Monday, July 20 • 7:00pm - 8:00pm
P155: A model for unsupervised object categorization in infants


Sunho Lee, Youngjin Park, Se-Bum Paik

Both the brain and recent deep neural networks (DNNs) can perform visual object recognition at similar levels. However, to acquire this function, DNNs generally require extensive training on a large amount of labeled data, whereas the brain does not appear to need such artificially labeled images to learn. Moreover, human infants, who have never received such training, can still classify unfamiliar object categories [1]. The mechanism by which the immature brain categorizes visual objects without any supervisory feedback remains elusive. Here, we propose a biologically plausible circuit model that can correctly categorize natural images without any supervision. Instead of relying on supervised signals, which are believed to be essential for training such a system, we focus on the temporal continuity of natural scenes. The natural visual stimuli to which infants are repeatedly exposed are temporally continuous [2], unlike the image datasets used to train artificial DNNs. To detect discontinuities in a natural scene, which potentially correspond to the borders between clusters of images of the same object, we designed a “differential unit” (DU; Fig. 1). The DU estimates the difference between the current input and the input from seconds earlier, and can thereby detect temporal changes in the visual input in real time. In addition to the DU, to memorize the representations of visual objects, we designed a “readout network” (Fig. 1, k-Winners-Take-All network and readout), which is linked to the filtered pool5 units of a randomized AlexNet. The randomized AlexNet, whose weights are randomly initialized and then fixed, corresponds to the early visual pathway of infants and functions as an image abstractor. The connection weights between the readout and the pool5 units can be updated by Hebbian plasticity, but because the DU continuously inhibits the readout, plasticity is initially blocked.
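
The gating role of the differential unit can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the class name `DifferentialUnit`, the delay measured in frames (`delay_steps`), the mean absolute difference as the distance measure, and the `threshold` value. The inputs are hand-made feature vectors standing in for pool5 responses.

```python
from collections import deque


class DifferentialUnit:
    """Toy differential unit: compares the current input with a delayed copy.

    delay_steps and threshold are illustrative values, not parameters
    reported in the abstract.
    """

    def __init__(self, delay_steps=3, threshold=0.1):
        self.delay_steps = delay_steps
        self.threshold = threshold
        # Holds the last delay_steps inputs; index 0 is the oldest one.
        self.buffer = deque(maxlen=delay_steps)

    def step(self, features):
        """Return True while the unit inhibits the readout."""
        if len(self.buffer) < self.delay_steps:
            # No delayed input is available yet, so keep inhibiting.
            self.buffer.append(list(features))
            return True
        delayed = self.buffer[0]  # input seen delay_steps frames ago
        diff = sum(abs(c - d) for c, d in zip(features, delayed)) / len(features)
        self.buffer.append(list(features))
        # Inhibit while the input is still changing over the delay window.
        return diff >= self.threshold


# Five frames of one object followed by five frames of another.
frames = [[1.0, 0.0]] * 5 + [[0.0, 1.0]] * 5
du = DifferentialUnit(delay_steps=3, threshold=0.1)
gates = [du.step(frame) for frame in frames]
print(gates)
```

In this toy sequence the unit inhibits during the first three frames (no delayed input yet) and again for three frames after the object changes, and it releases the inhibition only once the current and delayed inputs show the same object.
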
However, when the temporal difference of the response falls below a certain threshold (meaning that the same object has been detected consistently), the DU stops inhibiting the readout, and the connections between the readout and the ensemble of pool5 units that are highly activated by that object are strengthened. During the test session, the category of a given test image is identified by simply choosing the readout that shows the highest response. To validate the model, we constructed an image sequence by sorting the CIFAR-10 dataset by category, which mimics the temporal continuity of natural scenes. The model was trained on this image sequence and tested on a separate validation set. It achieved 35% classification accuracy, which is significantly higher than the chance level of 10%. Based on these findings, we suggest a biologically plausible mechanism of object categorization without supervision, and we believe that our model can explain how visual function arises in the early stages of brain development without supervised learning.
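
The training and test procedure can be sketched on toy data as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the feature vectors are hand-made stand-ins for pool5 responses, the gate compares each frame only with the previous one, a single winner is chosen (k = 1), and the learning rate, threshold, initial weights, and weight normalization are all hypothetical choices.

```python
def mean_abs_diff(a, b):
    """Mean absolute difference between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def responses(weights, features):
    """Response of each readout unit: dot product of its weights and the input."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def classify(weights, features):
    """Index of the readout unit with the highest response."""
    r = responses(weights, features)
    return max(range(len(r)), key=r.__getitem__)


def train(frames, weights, threshold=0.3, lr=0.5):
    """Gated Hebbian learning over a temporally ordered sequence of frames.

    Returns, per frame, the index of the updated readout unit, or None when
    plasticity was blocked. threshold and lr are illustrative values.
    """
    winners = []
    previous = None
    for features in frames:
        # Inhibit on the first frame and whenever the input changed too much.
        inhibited = previous is None or mean_abs_diff(features, previous) >= threshold
        previous = features
        if inhibited:
            winners.append(None)
            continue
        winner = classify(weights, features)  # winner-take-all with k = 1
        # Hebbian update: strengthen weights from the active input units.
        updated = [w + lr * f for w, f in zip(weights[winner], features)]
        # Normalization keeps the weights bounded (an added assumption).
        norm = sum(w * w for w in updated) ** 0.5
        weights[winner] = [w / norm for w in updated]
        winners.append(winner)
    return winners


# Category-sorted toy sequence: three frames of object A, then three of object B.
object_a = [[1.0, 0.8, 0.0, 0.1], [0.9, 0.9, 0.1, 0.0], [1.0, 0.9, 0.0, 0.0]]
object_b = [[0.0, 0.1, 1.0, 0.9], [0.1, 0.0, 0.9, 1.0], [0.0, 0.0, 1.0, 0.9]]
weights = [[0.3, 0.2, 0.2, 0.3], [0.2, 0.3, 0.3, 0.2]]  # two readout units

winners = train(object_a + object_b, weights)
label_a = classify(weights, [0.95, 0.85, 0.05, 0.05])  # unseen view of object A
label_b = classify(weights, [0.05, 0.05, 0.95, 0.85])  # unseen view of object B
print(winners, label_a, label_b)
```

The readout indices act as arbitrary cluster labels: no category names enter training, and the sketch only checks that the two objects end up on different readout units. How readout units are matched to the CIFAR-10 category names for scoring is not described in the abstract.
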


This work was supported by the National Research Foundation of Korea (Nos. 2019M3E5D2A01058328 and 2019R1A2C4069863).


[1] D. H. Rakison and Y. Yermolayeva, “Infant categorization,” _Wiley Interdiscip. Rev. Cogn. Sci._, vol. 1, no. 6, pp. 894–905, 2010.
[2] L. Wiskott and T. J. Sejnowski, “Slow feature analysis: Unsupervised learning of invariances,” _Neural Comput._, vol. 14, no. 4, pp. 715–770, 2002.


Youngjin Park

Department of Bio and Brain Engineering, KAIST

Monday July 20, 2020 7:00pm - 8:00pm CEST
Slot 09